Title: Outline
1 Gene Ontology Annotation (GOA)
Implementation of GO in SWISS-PROT, TrEMBL and InterPro
2 Aims of GOA project
- To associate GO terms with all proteins in
- - SWISS-PROT & TrEMBL
- - InterPro families and domains
- - (CluSTr groupings, CDSs in EMBL)
- Fast-tracking GO annotation of completed proteomes, e.g. human
3 Q10561
Large-scale GO annotation
- SWISS-PROT keyword <-> GO term mapping
- EC number <-> GO term mapping
- InterPro entry <-> GO term mapping
SP_KW:Fatty acid biosynthesis > GO:fatty acid biosynthesis ; GO:0006633
EC:6.4.1.2 > GO:acetyl-CoA carboxylase ; GO:0003989
InterPro:IPR000438 Acetyl-CoA carboxylase carboxyl transferase beta subunit > GO:acetyl-CoA carboxylase ; GO:0003989
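The mapping-based annotation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the GOA pipeline itself: it assumes each mapping line has the form `source:key > GO:term name ; GO:id`, and the function names and the example protein's cross-references are hypothetical.

```python
# Mapping lines as shown on the slide (punctuation assumed, see lead-in).
MAPPING_LINES = [
    "SP_KW:Fatty acid biosynthesis > GO:fatty acid biosynthesis ; GO:0006633",
    "EC:6.4.1.2 > GO:acetyl-CoA carboxylase ; GO:0003989",
    "InterPro:IPR000438 Acetyl-CoA carboxylase carboxyl transferase "
    "beta subunit > GO:acetyl-CoA carboxylase ; GO:0003989",
]


def parse_mapping(line):
    """Split one mapping line into (external key, GO id, GO term name)."""
    left, right = line.split(" > ", 1)
    name, go_id = right.rsplit(" ; ", 1)
    if left.startswith("InterPro:"):
        # Keep only the accession; the rest of the left side is a description.
        left = left.split()[0]
    if name.startswith("GO:"):
        name = name[len("GO:"):]
    return left, go_id.strip(), name


def build_table(lines):
    """Build a lookup from external key to the set of mapped GO ids."""
    table = {}
    for line in lines:
        key, go_id, _name = parse_mapping(line)
        table.setdefault(key, set()).add(go_id)
    return table


def annotate(xrefs, table):
    """Return the sorted GO ids implied by a protein's cross-references."""
    go_ids = set()
    for xref in xrefs:
        go_ids |= table.get(xref, set())
    return sorted(go_ids)


# Hypothetical protein entry carrying a keyword, an EC number and an InterPro hit.
example_xrefs = ["SP_KW:Fatty acid biosynthesis", "EC:6.4.1.2", "InterPro:IPR000438"]
print(annotate(example_xrefs, build_table(MAPPING_LINES)))
```

Because the EC and InterPro mappings point at the same GO id, the protein receives that term once; a real pipeline would presumably also record which mapping produced each annotation.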
4 Electronic GO Annotation
5 Electronic Annotation Status
Evidence  Source  Annotations  Proteins
IEA       IPRO    1 244 525    413 379
IEA       SPKW    1 012 768    368 328
IEA       SPEC      164 413     72 521
IEA       Total   2 415 659    517 152
IEA       Human      44 254     16 180
Nov 2002
6 Enter SPTR Accession No.
Enter Evidence Code
Enter Reference Type
Enter Reference Accession No.
Existing mappings (IPRO, SPKW, SPEC, PINC)
7
8
9 Converting SWISS-PROT free text to GO terms
10 Manual Annotation Status
Evidence   Source  Annotations  Proteins
Non-IEA    Total   > 31 000     > 10 000
Non-IEA    Human   30 458       9 075
All codes  Total   2 464 507    522 844
All codes  Human   74 712       18 563
Nov 2002
11 How did we perform?
- Electronic vs. manual
- 57% vs. 1.3%
- 45 668 species vs. 151 species
- 2.46 million associations, 522 844 proteins
- InterPro, Keyword, Manual
12 GOA Dataflow
13 Annotation File Format
SPTR  O00505  IMA3_HUMAN  GO:0006886  GOA:interpro    IEA  P  Importin alpha-3 subunit  IPI00012092  protein  taxon:9606  20020920
SPTR  O00505  IMA3_HUMAN  GO:0005634  GOA:spkw        IEA  C  Importin alpha-3 subunit  IPI00012092  protein  taxon:9606  20011011
SPTR  O00505  IMA3_HUMAN  GO:0005643  PUBMED:9154134  TAS  C  Importin alpha-3 subunit  IPI00012092  protein  taxon:9606  20020630
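Reading one line of the annotation file can be sketched as below. This is only an illustration under stated assumptions: the fields are taken to be tab-separated, only the twelve columns visible on the slide are modelled (the real file may carry additional, possibly empty, columns), and the field names are labels chosen here rather than official column names.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    # Illustrative labels for the columns visible on the slide.
    db: str
    accession: str
    symbol: str
    go_id: str
    reference: str
    evidence: str
    aspect: str
    name: str
    synonym: str
    object_type: str
    taxon: str
    date: str


def parse_line(line: str) -> Annotation:
    """Split one tab-separated annotation line into named fields."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(Annotation._fields):
        raise ValueError(
            f"expected {len(Annotation._fields)} fields, got {len(parts)}"
        )
    return Annotation(*parts)


# Third example row from the slide, joined with tabs (the separator is an assumption).
SAMPLE = "\t".join([
    "SPTR", "O00505", "IMA3_HUMAN", "GO:0005643", "PUBMED:9154134", "TAS",
    "C", "Importin alpha-3 subunit", "IPI00012092", "protein",
    "taxon:9606", "20020630",
])

record = parse_line(SAMPLE)
print(record.accession, record.go_id, record.evidence, record.aspect)
```

Splitting on tabs rather than on whitespace matters here because the protein name ("Importin alpha-3 subunit") itself contains spaces.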
14 Data Searching & Retrieval
- Gene Association File
- ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/gene_association.goa
- ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.goa
- QuickGO or AmiGO Browser
- SRS (GO, GOA)
- InterPro
- Proteome Analysis Pages
- GO MySQL at BDGP
- http://www.fruitfly.org/cgi-bin/wiki/view.pl/GoWeb/GoExampleQueries
- Ensembl GOview
- AltExtron DB
- GOA home page
- http://www.ebi.ac.uk/GOA
15 Data Searching & Retrieval
16 GOBO contribution
[Diagram labels: SP Tisslist; Oxf. Dict., MGED; GO]
17 Find the location of human genes mapped to a particular GO term using Ensembl GO-View.
18 SRS Functional Queries Across DB
19 InterPro
20 Proteome Analysis Pages
21 Distribution of Molecular Function
Sept 2002
22 What can I do with GOA?
- Structured vocabularies support semantic integration within the EBI system and promote broader integration of knowledge from the SWISS-PROT database group.
- Discover the function of new sequences by determining functional similarities among proteins more quickly.
- Perform functional queries across our data repositories
- Overview proteomes with GO-slim
- Incorporate our manual annotations into your database
- - to enhance your dataset or use it to validate an automated way of deriving information about protein function
- Use our mappings to map GO terms to your own dataset
- - (microarray/mass spec.)
23 Using GO for functional genomics
- Allows cluster analysis, e.g. of expression data, microarrays etc.
[Diagram labels: component, process, function; gene; experimental condition; genes]
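One way GO annotation could support cluster analysis is to tally the GO terms attached to the genes in each expression cluster. The sketch below is a toy illustration only: the gene names and cluster assignments are hypothetical, the GO ids are simply those that appear elsewhere in these slides, and `go_term_counts` is a helper name chosen here.

```python
from collections import Counter

# Toy data: gene names and cluster membership are hypothetical.
annotations = {
    "geneA": {"GO:0006886", "GO:0005634"},
    "geneB": {"GO:0005634", "GO:0005643"},
    "geneC": {"GO:0006633", "GO:0003989"},
}
clusters = {
    "cluster1": ["geneA", "geneB"],
    "cluster2": ["geneC"],
}


def go_term_counts(cluster_genes, annotations):
    """Count how many genes in a cluster carry each GO term."""
    counts = Counter()
    for gene in cluster_genes:
        # Genes without any annotation contribute nothing.
        counts.update(annotations.get(gene, ()))
    return counts


for name, genes in clusters.items():
    print(name, dict(go_term_counts(genes, annotations)))
```

Terms shared by many genes in a cluster hint at a common component, process or function; a real analysis would also need to test whether such counts exceed what is expected by chance.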
24 Future of GOA
- GO is dynamic, GOA needs regular updating
- Regular release of Human (v5) and SPTR (v3) GOA files
- Incorporate manual annotations from other groups
- Display of GOA in TrEMBL XML and other EBI databases, e.g. EMBL
- GO curation tool and QuickGO browser being updated
- Accessing GOA via SRS needs updating.
- By 2004 -> 70% of SWISS-PROT & TrEMBL with GO?
25 Acknowledgements
Daniel Barrell - GOA file updates
David Binns - QuickGO
Wolfgang Fleischmann - Automation Coordinator
John Maslen - Talisman
Paul Kersey - Xref file data set generation
Michele Magrane, all curators - GO Annotation
Nicola Mulder - Co - InterPro
Midori Harris, Jane Lomax, Amelia Ireland, Cath Brooksbank - GO Curators
Evelyn Camon - GOA Coordinator
BioBabel QLRT-2000-00981; TEMBLOR QLRI-2001-00015; NIH 1R01HG2273-01
26 Standards and Ontologies for Functional Genomics (SOFG), 17-20 Nov 2002
Towards unified ontologies for describing biology and biomedicine
www.wellcome.ac.uk/hinxton/sofg