Gene Ontology Part 1: What is the GO? - AgBase

Harold J Drabkin, Senior Scientific Curator
The Jackson Laboratory, Mouse Genome Informatics, Bar Harbor, ME
http://www.geneontology.org

What is the GO
The scope of the GO
The GO relationships
Using the GO for annotation
Anatomy of an annotation
Evidence codes
Qualifiers
Gene association files

What IS the GO
The Gene Ontology is a dictionary of concepts used to describe the normal properties of a gene product.
  It has concepts describing molecular functions.
  It has concepts describing biological processes.
  It has concepts describing the cellular locations in which gene products are found.

Gene Ontology
  Built for a very specific purpose: annotation of genes and proteins in genomic and protein databases
  Built to be applicable to any organism
  Formed to develop a shared language adequate for the annotation of molecular characteristics across organisms; a common language to share knowledge

The GO is NOT a list of genes or proteins
  although you might find a synonym that is also a gene or protein name

The GO does NOT track diseases
  although certain disease phenotypes might suggest the function of a gene product or a process that it may participate in
  you will not find "tumor suppressor activity" or "tumor suppression" as GO terms

The Gene Ontology Consortium Started Small

  Original GO created in 2000
  Three databases involved: FlyBase (Drosophila), MGI (mouse), SGD (S. cerevisiae)
  Used immediately
  www.geneontology.org

More quickly joined...
Later databases:
  TAIR (Arabidopsis)
  TIGR (microbes, including prokaryotes)
  SWISS-PROT (several thousand species, including human)
  PSU (P. falciparum)
  ZFIN (zebrafish)
  PAMGO (plant pathogens)

Gene Ontology widely adopted

Why do we need this?

  Tactition, taction, tactile sense: perception of touch ; GO:0050975
  Often the same concept is referred to by different names.

  Bud initiation?
  Often the same term is used by different communities to mean different things...

More specifically
The GO is not just a flat list of terms

  transcription factor activity
  DNA binding
  transcription regulator activity
  membrane
  mitochondrial membrane
  glycolysis
  nucleus
  cytoplasm
  ion transport
  .....

There are also relationships between them.
  Nucleic acid binding is a type of binding (is_a).
  DNA binding is a type of nucleic acid binding (is_a).
And the terms can have more than one parent!

Ontology Structure
  The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG).
  Terms can have more than one parent and zero, one or more children.
  Terms are linked by three relationships:
    is_a
    part_of
    regulates (new), with subtypes positively regulates and negatively regulates

Ontology Structure
  [Figure: graph relating cell, membrane, mitochondrial membrane, chloroplast and chloroplast membrane through is_a and part_of links]
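The DAG structure described above can be sketched in a few lines of Python. This is only a toy illustration: the term names come from the slide, but the exact edges are an assumption reconstructed from the figure labels, and the helper names (build_parents, ancestors) are hypothetical.

```python
from collections import defaultdict

# Toy edges as (child, relationship, parent). The edges are assumed
# from the figure labels and are not read from the real GO file.
EDGES = [
    ("membrane", "part_of", "cell"),
    ("chloroplast", "part_of", "cell"),
    ("mitochondrial membrane", "is_a", "membrane"),
    ("chloroplast membrane", "is_a", "membrane"),
    ("chloroplast membrane", "part_of", "chloroplast"),
]


def build_parents(edges):
    """Map each term to its list of (relationship, parent) pairs."""
    parents = defaultdict(list)
    for child, rel, parent in edges:
        parents[child].append((rel, parent))
    return parents


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _rel, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
# "chloroplast membrane" has two parents, which a strict tree could not express.
print(PARENTS["chloroplast membrane"])
print(sorted(ancestors("chloroplast membrane", PARENTS)))
```

Because a term may have several parents, the traversal keeps a set of visited terms so that shared ancestors such as "cell" are reported only once.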

It gets complicated quickly
  http://www.ebi.ac.uk/ego

The 3 Gene Ontologies
  Molecular Function = elemental activity/task
    the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
  Biological Process = biological goal or objective
    broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
  Cellular Component = location or complex
    subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

Cellular Component

  where a gene product acts

Molecular Function
  activities or jobs of a gene product
  A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product.
  Sets of functions make up a biological process.
  Examples: glucose-6-phosphate isomerase activity, insulin binding, insulin receptor activity

Biological Process
  a commonly recognized series of events
  Examples: limb development, gluconeogenesis, cell division

An example: Mitochondrial P450 (CC24 PR01238; MITP450CC24)

Anatomy of a GO term
  A GO term stanza in OBO format begins with [Term] and minimally has:
    id:
    name:
    namespace:
    def:
    one or more relationships

More GO Term Stanzas

The Regulates Relationship

In the Beginning There Were Two Relationships
  is_a: denotes a subtype of its parent.
  part_of: denotes a portion of a parent.
    is_part: if it exists, it is always a part of its parent (this is the relationship we use).
    has_part: if there is a parent, then it has this as a part of it.
  We made the regulation of something a part_of the something.
  But it's not really part_of.

So, what's the issue with regulates?
  Regulation is not always an inherent part of the process that it regulates.
  A speed bump regulates the velocity of my car (from 50 mph to 5 mph).

We needed a better way to express regulates
  We defined regulation as any process that modulates the frequency, rate or extent of something.
  Something can be:
    a Biological Process
    a Molecular Function
    a Biological Quality

A decomposed Term
    [Term]
    id: GO:0000019
    name: regulation of mitotic recombination
    namespace: biological_process
    def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
    synonym: "regulation of recombination within rDNA repeats" NARROW []
    is_a: GO:0000018 ! regulation of DNA recombination
    intersection_of: GO:0065007 ! biological regulation
    intersection_of: regulates GO:0006312 ! mitotic recombination
    relationship: regulates GO:0006312 ! mitotic recombination
  The intersection_of tags make up the logical definition. This places the regulation term in the context of mitotic recombination.
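To make the role of the intersection_of tags concrete, here is a minimal Python sketch that reads the stanza shown above and pulls out its logical definition. The function names (parse_term, logical_definition) are hypothetical, and the parsing is deliberately simplistic: it assumes one "tag: value" pair per line and treats everything after " ! " as a comment, which a full OBO parser would handle more carefully.

```python
STANZA = """[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
synonym: "regulation of recombination within rDNA repeats" NARROW []
is_a: GO:0000018 ! regulation of DNA recombination
intersection_of: GO:0065007 ! biological regulation
intersection_of: regulates GO:0006312 ! mitotic recombination
relationship: regulates GO:0006312 ! mitotic recombination
"""


def parse_term(stanza):
    """Return a dict mapping each tag to the list of its values."""
    lines = [line.strip() for line in stanza.strip().splitlines()]
    if not lines or lines[0] != "[Term]":
        raise ValueError("a term stanza must begin with [Term]")
    tags = {}
    for line in lines[1:]:
        if not line:
            continue
        tag, _, value = line.partition(": ")
        # Drop the trailing "! human readable name" comment, if any.
        value = value.split(" ! ")[0].strip()
        tags.setdefault(tag, []).append(value)
    return tags


def logical_definition(tags):
    """Split intersection_of values into a genus and (relation, target) pairs."""
    genus = None
    differentia = []
    for value in tags.get("intersection_of", []):
        parts = value.split()
        if len(parts) == 1:
            genus = parts[0]
        else:
            differentia.append((parts[0], parts[1]))
    return genus, differentia


tags = parse_term(STANZA)
print(tags["id"], tags["name"])
print(logical_definition(tags))
```

Read this way, the term is "a biological regulation (GO:0065007) that regulates mitotic recombination (GO:0006312)".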

The context of mitotic recombination
  Old: regulation of mitotic recombination is part of the graph on top of mitotic recombination
  Now: it is linked to mitotic recombination by regulates

What does this buy us?
  The new relationship portrays the biology more accurately than part_of:
    regulates
    positively regulates
    negatively regulates
  The new logical definitions allow automated consistency checks as the ontology is developed.
  The first implementation of cross-products in GO
  Sets the stage for:
    Molecular function -> biological process
    Cell type -> biological process
    ChEBI -> biological process
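One kind of automated consistency check that logical definitions make possible can be sketched as follows. This is a hypothetical toy check, not the tooling the GO Consortium actually runs: if a regulation term is_a another regulation term, then the process it regulates should be the same as, or an is_a descendant of, the process its parent regulates. The is_a link from GO:0006312 to GO:0006310 (DNA recombination) and the regulates target of GO:0000018 are assumed here for illustration.

```python
# Toy data. Term -> is_a parents.
IS_A = {
    "GO:0000019": ["GO:0000018"],  # regulation of mitotic recombination
    "GO:0006312": ["GO:0006310"],  # mitotic recombination (link assumed)
}
# Term -> target of the "regulates" part of its logical definition.
REGULATES = {
    "GO:0000019": "GO:0006312",
    "GO:0000018": "GO:0006310",    # assumed for illustration
}


def is_a_ancestors(term, is_a):
    """All terms reachable from `term` by following is_a links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inconsistent_terms(is_a, regulates):
    """Return (term, parent) pairs whose regulated targets do not line up."""
    problems = []
    for term, target in regulates.items():
        for parent in is_a.get(term, []):
            parent_target = regulates.get(parent)
            if parent_target is None:
                continue  # the parent has no logical definition to compare with
            if parent_target != target and parent_target not in is_a_ancestors(target, is_a):
                problems.append((term, parent))
    return problems


print(inconsistent_terms(IS_A, REGULATES))

# Removing the is_a link between the two regulated processes exposes a mismatch.
broken = {"GO:0000019": ["GO:0000018"]}
print(inconsistent_terms(broken, REGULATES))
```

With part_of alone, the two halves of the graph (the regulation terms and the regulated processes) could not be compared mechanically in this way.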

On March 18th, 2008
  Old stanza (part_of):
    [Term]
    id: GO:0000019
    name: regulation of mitotic recombination
    namespace: biological_process
    def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
    narrow_synonym: "regulation of recombination within rDNA repeats" []
    is_a: GO:0000018 ! regulation of DNA recombination
    relationship: part_of GO:0006312 ! mitotic recombination
  New stanza (regulates):
    [Term]
    id: GO:0000019
    name: regulation of mitotic recombination
    namespace: biological_process
    def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
    synonym: "regulation of recombination within rDNA repeats" NARROW []
    is_a: GO:0000018 ! regulation of DNA recombination
    intersection_of: GO:0065007 ! biological regulation
    intersection_of: regulates GO:0006312 ! mitotic recombination
    relationship: regulates GO:0006312 ! mitotic recombination

Evolution of GO

  GO term development was annotation-driven
  Development directed by use:
    Terms added as new species annotated
    Terms added on an as-needed basis
  Developed by an international consortium of biologists and computer scientists
    members from individual databases
    central office at EBI
  Development involves collaboration with domain experts from different biological fields, and also with formal ontologists

Important Consideration for Users
  The GO changes daily:
    new terms added
    additional relationships added
    terms removed (made obsolete)

GO Slims

What is a GO Slim
  A GO Slim is a smaller slice of the GO that can be used to bin data into categories relevant to the user's experiment.
  Why use this?
    you want to group several sections of the GO into a single broader category
    you want to remove sections that are totally irrelevant for your assay (e.g., photosynthetic processes are irrelevant for birds)
  Several GO Slims are referenced in the gene_ontology.obo file
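The binning idea behind a GO Slim can be sketched in Python as below. Everything here is illustrative: the parent links are simplified (they ignore relationship types), the gene names are made up, and the function names are hypothetical. Each annotation is mapped to every slim term found among its ancestors, and distinct gene products are then counted per bin and written as tab-delimited text; real slimming tools may instead map to the nearest slim ancestor only.

```python
import csv
import io

# Simplified child -> parents links (assumed, for illustration only).
PARENTS = {
    "translational initiation": ["translation"],
    "translational elongation": ["translation"],
    "translation": ["biological_process"],
    "gluconeogenesis": ["carbohydrate metabolic process"],
    "carbohydrate metabolic process": ["biological_process"],
}
# A user-defined slim: the broad categories we want to bin into.
SLIM = {"translation", "carbohydrate metabolic process"}
# (gene product, annotated term) pairs with made-up gene names.
ANNOTATIONS = [
    ("geneA", "translational initiation"),
    ("geneB", "translational elongation"),
    ("geneC", "gluconeogenesis"),
    ("geneA", "translation"),
]


def slim_bins(term, parents, slim):
    """Return the slim terms that are the term itself or one of its ancestors."""
    bins, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            bins.add(current)
        stack.extend(parents.get(current, []))
    return bins


def count_gene_products(annotations, parents, slim):
    """Count distinct gene products per slim bin."""
    genes_per_bin = {}
    for gene, term in annotations:
        for slim_term in slim_bins(term, parents, slim):
            genes_per_bin.setdefault(slim_term, set()).add(gene)
    return {slim_term: len(genes) for slim_term, genes in genes_per_bin.items()}


def to_tsv(counts):
    """Render the counts as tab-delimited text, ready for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["slim_term", "gene_products"])
    for slim_term in sorted(counts):
        writer.writerow([slim_term, counts[slim_term]])
    return buffer.getvalue()


counts = count_gene_products(ANNOTATIONS, PARENTS, SLIM)
print(to_tsv(counts))
```

Note that geneA is counted once in the translation bin even though it carries two annotations that fall there, because the counts are over gene products rather than annotations.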

  [Figure: section of OBO-Edit showing GO slims built into the ontology]

But you can build your own
  In OBO-Edit, select the Category Manager (under Metadata).
  Use "add" to add a new one; I am adding one for translation.
  After saving in the Category Manager, the new slim appears in the category list.
  Now I browse through the GO, selecting terms and checking them in the categories box.
  Make sure you commit (save) each selected term.
  Note: the children of a term are not automatically selected. You need to decide.
  Checking the "filter terms" box during save will allow you to save just your slim to a new file.
  Now you can use THIS obo file in various binning tools, such as GO Term Finder, VLAD, and GO Slimmer, rather than the entire GO.

GO Slimmer tool is part of AmiGO
  You can specify your genes in a number of ways.
  You can filter on species and evidence code.
  You can input or choose a GO slim.
  You can also select various output options.
  The gene product counts and a tab-delimited file are great for making pie or bar charts in Excel!

Visit http://www.geneontology.org and http://www.godatabase.org for more GO Slim help
