Gene Ontology GO - PowerPoint PPT Presentation

Provided by: cise8

1
Gene Ontology (GO)
  • An Introduction
  • Dev H. W. Oliver

2
References
  • GO For Newbies: Introduction to the Gene Ontology.
    Suparna Mundodi, PAG 2006.
    http://www.geneontology.org/GO.teaching.resources.shtml#present
  • GO, the Gene Ontology: Introduction to GO, its
    purpose, structure, annotation and applications,
    plus other biomedical ontologies. Pascale Gaudet,
    presentation to bioinformatics graduate students.
    http://www.geneontology.org/GO.teaching.resources.shtml#present
  • Building Biomedical Ontologies. Jennifer Clark,
    EBI internal seminar.
    http://www.geneontology.org/GO.teaching.resources.shtml#present

3
Outline
  • Introduction and Motivation
  • Building an Ontology
  • GO Annotations
  • Editing The Gene Ontology
  • Applications of Gene Ontology (GO)

4
Introduction and Motivation
  • Gene annotation system
  • Controlled vocabulary that can be applied to all
    organisms
  • Used to describe gene products

5
What is in a name?
  • Example: What is a cell?
  • (Mundodi et al.)

6
The Importance of a Name
  • A cell is

7
The Importance of a Name
  • A cell is

8
The Importance of a Name
  • A cell is

9
The Importance of a Name
  • A cell is

10
The Importance of a Name
  • A cell is

11
The Importance of a Name
  • Different names can be used to describe the same
    concepts
  • Different concepts can be described by the same
    name
  • Example: glucose synthesis, glucose biosynthesis,
    glucose formation, glucose anabolism, and
    gluconeogenesis all refer to the process of
    making glucose from simpler components
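
The naming problem on this slide can be sketched in a few lines of Python: a controlled vocabulary maps every accepted synonym onto one term. This is only an illustration. The `SYNONYMS` table and the `normalize` helper are hypothetical names, not part of any GO tool, and the table holds nothing beyond the names listed on the slide.

```python
# Illustrative synonym table built only from the names on this slide.
# SYNONYMS and normalize() are hypothetical names, not part of any GO tool.
CANONICAL = "gluconeogenesis"
SYNONYMS = {
    "glucose synthesis": CANONICAL,
    "glucose biosynthesis": CANONICAL,
    "glucose formation": CANONICAL,
    "glucose anabolism": CANONICAL,
    "gluconeogenesis": CANONICAL,
}


def normalize(name: str) -> str:
    """Map a free-text name to its controlled term; raise KeyError if unknown."""
    key = " ".join(name.lower().split())  # ignore case and extra whitespace
    return SYNONYMS[key]


labels = ["Glucose synthesis", "glucose  anabolism", "Gluconeogenesis"]
unique_terms = {normalize(label) for label in labels}
print(unique_terms)
```

Three differently written labels collapse to a single term, which is what lets annotations from different databases be compared.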

12
What is the Gene Ontology?
  • A controlled vocabulary that can be applied to
    all organisms
  • Used to describe gene products - proteins and RNA
    - in any organism
  • Gaudet, et al

13
Ontology
  • Study of being or existence
  • Seeks to describe the basic categories and
    relationships of being or existence
  • Defines entities and types of entities within its
    framework
  • Studies the conceptions of reality
  • Wikipedia

14
Ontology
  • An ontology includes:
  • A vocabulary of terms (names for concepts)
  • Definitions
  • Defined logical relationships between the terms

15
Ontology Structure
  • Ontologies can be represented as graphs, where
    the nodes are connected by edges
  • Nodes: concepts in the ontology
  • Edges: relationships between the concepts

16
Ontology Structure
  • The Gene Ontology is structured as a hierarchical
    directed acyclic graph (DAG)
  • Terms can have more than one parent and zero, one
    or more children
  • Terms are linked by two relationships:
  • is-a
  • part-of
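
A minimal Python sketch of this structure, assuming each edge is stored as a (child, relationship, parent) triple. The edge list is an assumed reading of the DAG figure in this deck, and `ancestors` is a hypothetical helper name.

```python
from collections import defaultdict

# Assumed edges (child, relationship, parent), read from the deck's DAG example.
EDGES = [
    ("mitochondrion", "is_a", "organelle"),
    ("fatty acid beta-oxidation multienzyme complex", "is_a", "protein complex"),
    ("fatty acid beta-oxidation multienzyme complex", "part_of", "mitochondrion"),
]

parents = defaultdict(list)
for child, rel, parent in EDGES:
    parents[child].append((rel, parent))


def ancestors(term):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        for _rel, parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


fao_complex = "fatty acid beta-oxidation multienzyme complex"
print(len(parents[fao_complex]))       # more than one parent, so not a tree
print(sorted(ancestors(fao_complex)))
```

Storing the relationship type on each edge keeps is-a and part-of links distinguishable while still allowing one upward traversal over both.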

17
Simple hierarchies (trees) vs. directed acyclic graphs
  • Simple hierarchies (trees): single parent
  • Directed acyclic graphs: one or more parents
18
Directed Acyclic Graphs (DAG)
[Figure: example DAG with the nodes organelle, mitochondrion, other organelles, protein complex, fatty acid beta-oxidation multienzyme complex, and other protein complexes, connected by is-a and part-of edges]
19
Parent-Child Relationships
The cell component term Nucleus has 5 children
20
True Path Rule
  • The path from a child term all the way up to its
    top-level parent(s) must always be true
  • cell
  • cytoplasm
  • chromosome
  • nuclear chromosome
  • nucleus
  • nuclear chromosome

is-a ? part-of ?
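
One way to read the true path rule in code: an annotation to a term implies an annotation to every ancestor, so each of those implied statements must also be true. The sketch below is Python with an assumed parent layout. The exact edges of the slide's figure are not recoverable from the transcript, so `PARENTS` is hypothetical, as is the helper name `implied_terms`.

```python
# Assumed parent links (child -> parents). The figure's exact edges are not
# recoverable from the transcript, so this layout is hypothetical.
PARENTS = {
    "nuclear chromosome": ["chromosome", "nucleus"],
    "chromosome": ["cell"],
    "nucleus": ["cell"],
    "cell": [],
}


def implied_terms(term):
    """Return the term plus all of its ancestors. Under the true path rule,
    an annotation to `term` must also hold for every term returned here."""
    result = {term}
    for parent in PARENTS[term]:
        result |= implied_terms(parent)
    return result


print(sorted(implied_terms("nuclear chromosome")))
```

If any term in the returned set would be false for some gene product annotated to the child, the hierarchy violates the rule and needs restructuring.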
21
How does GO work?
  • What information might we want to capture about a
    gene product? (A gene product is the biochemical
    material, either RNA or protein, resulting from
    expression of a gene. The amount of gene product
    is used to measure how active a gene is.)
  • What does the gene product do?
  • Why does it perform these activities?
  • Where does it act?

22
The Three Gene Ontologies
  • Molecular Function
  • Biological Process
  • Cellular Component
23
Example: Gene Product = hammer
  • Function (what) | Process (why)
  • Drive nail (into wood) | Carpentry
  • Drive stake (into soil) | Gardening
  • Smash roach | Pest Control
  • Clown's juggling object | Entertainment

24
Molecular Function
  • A single reaction or activity, not a gene product
  • A gene product may have several functions
  • Sets of functions make up a biological process

25
Biological Process
26
Cellular Component
  • Where a gene product acts

27
Mitochondrial membrane
28
What is in a GO Term?
  • term: gluconeogenesis
  • id: GO:0006094
  • definition: The formation of glucose from
    noncarbohydrate precursors, such as pyruvate,
    amino acids and glycerol.
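
The three fields of a GO term fit naturally into a small record type. The Python sketch below is an illustration, not a GO library API: `GOTerm` is a hypothetical class name, and the check relies on GO identifiers having the form `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical record holding the three fields shown on this slide."""
    id: str
    name: str
    definition: str

    def __post_init__(self):
        if not GO_ID.fullmatch(self.id):
            raise ValueError(f"not a GO identifier: {self.id!r}")


term = GOTerm(
    id="GO:0006094",
    name="gluconeogenesis",
    definition=(
        "The formation of glucose from noncarbohydrate precursors, "
        "such as pyruvate, amino acids and glycerol."
    ),
)
print(term.id, term.name)
```

Validating the identifier at construction time catches malformed IDs (for example a missing colon) before they reach an annotation file.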

29
Areas Not Covered By GO
  • No pathological processes
  • No experimental conditions
  • No evolutionary relationships
  • No gene products
  • Not a system of nomenclature

31
Content Of GO
  • Current term counts as of September 17, 2006 at
    2:00 Pacific time
  • 20623 terms, 95.7% with definitions
  • 11360 biological_process
  • 1806 cellular_component
  • 7457 molecular_function
  • There are 1007 obsolete terms not included in the
    above statistics
  • http://www.geneontology.org/GO.downloads.shtml#ont

32
Outline
  • Introduction and Motivation
  • Building an Ontology
  • GO Annotations
  • Editing The Gene Ontology
  • Applications of Gene Ontology (GO)

33
Clark et al., 2005
34
Clark et al., 2005
35
Synonyms
  • classes = GO terms, types, kinds, universals
  • instances = annotated gene product attributes,
    tokens, individuals, particulars

36
A Deeper Look at Relationships
  • Recall
  • is_a
  • part_of

37
part_of
  • Represents how objects combine together to form
    complex objects
  • E.g. Steering wheel is a part of Ford Explorer.

38
How to define A is_a B
  • A and B are names of universals (natural kinds,
    types) in reality
  • All instances of A are, as a matter of biological
    science, also instances of B
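
Read extensionally, this definition says the instances of A form a subset of the instances of B. A minimal Python sketch follows, with made-up instance sets. `INSTANCES` and `is_a_holds` are hypothetical names, and a finite check like this can only refute an is_a claim, never prove it as a matter of biology.

```python
# Hypothetical instance sets, for illustration only.
INSTANCES = {
    "neuron differentiation": {"event_1", "event_2"},
    "cell differentiation": {"event_1", "event_2", "event_3"},
}


def is_a_holds(a: str, b: str) -> bool:
    """True when every recorded instance of class a is also an instance of b."""
    return INSTANCES[a] <= INSTANCES[b]


print(is_a_holds("neuron differentiation", "cell differentiation"))
print(is_a_holds("cell differentiation", "neuron differentiation"))
```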

39
Easy term request
  • Please add
  • leucophore differentiation
  • erythrophore differentiation
  • cyanophore differentiation
  • neuron differentiation

40
part_of
41
Examples Of Cell Differentiation
is_a
42
Circular Definitions: BAD!
  • What is X-Cell Differentiation?
  • Differentiation of an x cell.

43
Non-Circular Definitions: Good
  • What is X-Cell Differentiation?
  • The process whereby a relatively unspecialized
    cell acquires specialized features of an x cell.
  • Here we list the characteristics of the x cell

44
This is a term in the GO
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron." [GO:mah]
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
45
This is the related term in the cell type ontology
[Term]
id: CL:0000540
name: neuron
def: "The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663]
xref_analog: FBbt:00005106
xref_analog: FBbt:00005146
is_a: CL:0000393 ! electrically responsive cell
is_a: CL:0000404 ! electrically signaling cell
relationship: develops_from CL:0000031 ! neuroblast
46
Once we put the information together, the term in GO
is much better and its definition is far less circular
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron. The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663, GO:mah]
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
intersection_of: is_a GO:0030154 ! cell differentiation
intersection_of: has_participant CL:0000540 ! neuron
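
The term records in this deck use a tag-value stanza layout, which is straightforward to read programmatically. The Python sketch below is a minimal illustration, not a full OBO parser: `parse_stanza` is a hypothetical helper, the sample stanza is written with `tag: value` punctuation, and quoted `def:` strings (which may themselves contain `!`) are deliberately left out.

```python
def parse_stanza(text):
    """Parse one OBO-style stanza into a dict mapping tag -> list of values."""
    record = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop the trailing "! name" comment
        record.setdefault(tag.strip(), []).append(value)
    return record


STANZA = """
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
"""

parsed = parse_stanza(STANZA)
print(parsed["id"], parsed["is_a"], parsed["relationship"])
```

Values are kept in lists because tags such as `is_a` can repeat within one stanza.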
47
Basis in Reality
  • When building or maintaining an ontology, always
    think carefully about how classes relate to
    instances in reality

49
Ontology
[Figure: cartoon character super power ontology. Classes linked by is_a: super senses, super physical powers, x-ray vision, cat senses, super leaping, super strength]
  • Every genus (parent class) has an instantiated
    species (differentia + genus)
  • Every genus (parent class) has at least two children
50
No instance should be annotated to two leaf terms
or two terms on the same level, but they are
here, so what is wrong?
51
Cartoon Character Super Power Ontology
  • Actually, it is the superpowers that are
    annotated, not the superheroes. Once we fix that,
    the rule is obeyed, and the ontology, now used
    correctly, is more powerful for reasoning

52
Ontology
[Figure: cartoon character super power ontology. Classes linked by is_a: super senses, super physical powers, x-ray vision, cat senses, super leaping, super strength]
Annotated instances:
  • Superman's X-ray vision
  • Catwoman's cat senses
  • Catwoman's super strength
  • Superman's super leaping
53
Ontology
[Figure: cartoon character super power ontology. Classes linked by is_a: super senses, super physical powers]
Annotated instances:
  • Superman's X-ray vision
  • Catwoman's cat senses
  • Catwoman's super strength
  • Superman's super leaping
54
Outline
  • Introduction and Motivation
  • Building an Ontology
  • GO Annotations
  • Editing The Gene Ontology
  • Applications of Gene Ontology (GO)

55
[Figure labels: substrate, O2, CO2, H2O, product]
56
Types of GO Annotations
  • Electronic Annotation
  • Manual Annotation
  • All annotations must:
  • be attributed to a source
  • indicate what evidence was found to support the
    GO term-gene/protein association
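
The two requirements on this slide can be enforced by the data structure itself. The Python sketch below is one possible shape, not the GO annotation file format: `Annotation` is a hypothetical class, and the gene product name and reference are placeholders. `IDA` (Inferred from Direct Assay) is a real GO evidence code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record that refuses to exist without
    a source and supporting evidence."""
    gene_product: str
    go_id: str
    source: str    # reference the annotation is attributed to
    evidence: str  # evidence supporting the term-gene product association

    def __post_init__(self):
        if not self.source.strip():
            raise ValueError("annotation must be attributed to a source")
        if not self.evidence.strip():
            raise ValueError("annotation must state its supporting evidence")


# Placeholder gene product and reference; IDA = Inferred from Direct Assay.
ann = Annotation("example_gene_1", "GO:0006094", "PMID:0000000", "IDA")
print(ann)
```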

57
Manual Annotations
  • High-quality, specific gene/gene product
    associations made using:
  • Peer-reviewed papers
  • Evidence codes to grade evidence
  • Very time consuming and requires trained
    biologists

58
Manual Annotations Methods
  • Extract information from published literature
  • Curators perform manual sequence similarity
    analyses to transfer annotations between highly
    similar gene products (BLAST, protein domain
    analysis)

59
Electronic Annotations
  • Provide large coverage
  • High-quality
  • Annotations tend to use high-level GO terms and
    provide little detail

60
Electronic Annotations Methods
  • Database entries
  • Manual mapping of GO terms to concepts external
    to GO (translation tables)
  • Proteins then electronically annotated with the
    relevant GO term(s)
  • Automatic sequence similarity analyses to
    transfer annotations between highly similar gene
    products
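
The translation-table method above amounts to a lookup: external concepts are mapped to GO terms once, by hand, and proteins then inherit the GO terms of whatever concepts their database entries carry. A minimal Python sketch follows. The `EXT:` concept identifiers, protein names, and the `annotate` helper are all hypothetical; the GO IDs are ones that appear elsewhere in this deck.

```python
# Hypothetical translation table: external concept -> GO term IDs.
TRANSLATION_TABLE = {
    "EXT:0001": ["GO:0006094"],
    "EXT:0002": ["GO:0030182", "GO:0030154"],
}

# Hypothetical database entries: protein -> external concepts attached to it.
PROTEIN_CONCEPTS = {
    "protein_A": ["EXT:0001"],
    "protein_B": ["EXT:0002", "EXT:9999"],  # EXT:9999 has no mapping
    "protein_C": [],
}


def annotate(protein_concepts, table):
    """Assign GO terms to each protein via its external concepts."""
    annotations = {}
    for protein, concepts in protein_concepts.items():
        go_ids = set()
        for concept in concepts:
            go_ids.update(table.get(concept, []))
        annotations[protein] = sorted(go_ids)
    return annotations


electronic = annotate(PROTEIN_CONCEPTS, TRANSLATION_TABLE)
print(electronic)
```

Unmapped concepts are skipped silently here; a production pipeline would more likely log them so the table can be extended.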

61
Additionally
  • A gene product can have several functions,
    cellular locations and be involved in many
    processes
  • Annotation of a gene product to one ontology is
    independent from its annotation to other
    ontologies
  • Annotations are only to terms reflecting a normal
    activity or location
  • Usage of unknown GO terms

62
Unknown vs. Unannotated
  • Unknown is used when the curator has determined
    that there is no existing literature to support
    an annotation
  • Biological process unknown: GO:0000004
  • Molecular function unknown: GO:0005554
  • Cellular component unknown: GO:0008372
  • NOT the same as having no annotation at all
  • No annotation means that no one has looked yet

63
Outline
  • Introduction and Motivation
  • Building an Ontology
  • GO Annotations
  • Editing The Gene Ontology
  • Applications of Gene Ontology (GO)

64
How is GO maintained?
  • Several full-time editors
  • Requests from community
  • database curators, researchers, software
    developers
  • SourceForge tracker
  • GO Consortium meetings for large changes
  • Mailing lists

66
Ensuring Stability in a Dynamic Ontology
  • Terms become obsolete when they are removed or
    redefined
  • GO IDs are never deleted
  • For each obsolete term, a comment is added to
    explain why the term is now obsolete
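
The policy above, flag rather than delete, might look like this in code. It is a Python sketch over a hypothetical in-memory store: the ID `GO:9999999`, the field names, and `make_obsolete` are placeholders chosen for illustration.

```python
# Hypothetical in-memory term store; the ID and name are placeholders.
terms = {
    "GO:9999999": {"name": "example term", "is_obsolete": False, "comment": ""},
}


def make_obsolete(store, go_id, reason):
    """Flag a term as obsolete and record why; the ID stays in the store."""
    if not reason.strip():
        raise ValueError("an obsolete term needs an explanatory comment")
    entry = store[go_id]
    entry["is_obsolete"] = True
    entry["comment"] = reason


make_obsolete(terms, "GO:9999999", "Replaced by more specific terms.")
print(terms["GO:9999999"])
```

Because the ID is never removed, old annotations that reference it can still be resolved and redirected using the comment.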

67
Why modify the GO?
  • GO reflects current knowledge of biology
  • Adding new organisms can make existing term
    arrangements incorrect
  • Not everything was perfect from the outset

68
Example - parasites
  • Original GO

69
Example - parasites
  • Annotation of P. falciparum
  • protozoan cellular parasite
  • intracellular infection (erythrocytes)
  • Parasite proteins located in host nucleus
  • What cellular component term to annotate to?
  • nucleus refers to parasite nucleus when
    annotating parasite

70
Example - parasites
  • Added new term: host

71
Example - parasites
72
Requesting changes to GO: curator requests tracker
  • Common changes suggested
  • new term requests
  • reporting errors (typos, etc)
  • obsoletion/merge requests
  • add synonym
  • queries
  • term move (change parents)

73
Outline
  • Introduction and Motivation
  • Building an Ontology
  • GO Annotations
  • Editing The Gene Ontology
  • Applications of Gene Ontology (GO)

74
What can scientists do with GO?
  • Access gene product functional information
  • Find how much of a proteome is involved in a
    process/function/component in the cell
  • Map GO terms and incorporate manual annotations
    into own databases
  • Provide a link between biological knowledge and:
  • gene expression profiles
  • proteomics data
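
As an example of the second use above, the share of a proteome involved in a process can be computed directly from an annotation table. The Python sketch below uses made-up proteins and a hypothetical `fraction_annotated` helper. It counts only direct annotations; a fuller analysis would also count annotations to descendant terms.

```python
# Hypothetical annotation table: gene product -> GO IDs annotated to it.
ANNOTATIONS = {
    "protein_A": ["GO:0006094"],
    "protein_B": ["GO:0030182", "GO:0030154"],
    "protein_C": ["GO:0030154"],
    "protein_D": [],
}


def fraction_annotated(annotations, term_ids):
    """Share of gene products annotated to at least one of term_ids."""
    if not annotations:
        return 0.0
    hits = sum(1 for ids in annotations.values() if term_ids & set(ids))
    return hits / len(annotations)


share = fraction_annotated(ANNOTATIONS, {"GO:0030154"})
print(f"{share:.0%}")
```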

75
What can scientists do with GO?
  • Microarray analysis / gene expression
  • See http://www.geneontology.org/GO.tools for
    current microarray analysis tools

76
End of Talk
  • Thanks