Title: UniProt
1UniProt
- Eric Jain
- Swiss Institute of Bioinformatics, Geneva
- W3C Workshop on Semantic Web for Life Sciences, October 2004
5
ID   ATPB_CANFA     STANDARD;      PRT;    19 AA.
AC   P99504;
DT   15-JUL-1998 (Rel. 36, Created)
DT   15-JUL-1998 (Rel. 36, Last sequence update)
DT   05-JUL-2004 (Rel. 44, Last annotation update)
DE   ATP synthase beta chain, mitochondrial (EC 3.6.3.14) (Fragment).
GN   Name=ATP5B;
OS   Canis familiaris (Dog).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis.
OX   NCBI_TaxID=9615;
RN   [1]
RP   SEQUENCE.
RC   TISSUE=Heart;
RX   MEDLINE=98163340; PubMed=9504812;
RA   Dunn M.J., Corbett J.M., Wheeler C.H.;
RT   "HSC-2DPAGE and the two-dimensional gel electrophoresis database of
RT   dog heart proteins.";
RL   Electrophoresis 18:2795-2802(1997).
CC   -!- FUNCTION: Produces ATP from ADP in the presence of a proton
CC       gradient across the membrane. The beta chain is the catalytic
CC       subunit.
CC   -!- CATALYTIC ACTIVITY: ATP + H(2)O + H(+)(In) = ADP + phosphate +
CC       H(+)(Out).
CC   -!- SUBUNIT: F-type ATPases have 2 components, CF(1) - the catalytic
CC       core - and CF(0) - the membrane proton channel. CF(1) has five
CC       subunits: alpha(3), beta(3), gamma(1), delta(1), epsilon(1). CF(0)
CC       has three main subunits: a, b and c.
CC   -!- SUBCELLULAR LOCATION: Mitochondrial.
CC   -!- SIMILARITY: Belongs to the ATPase alpha/beta chains family.
DR   HSC-2DPAGE; P99504; DOG.
DR   InterPro; IPR000194; ATPase_a/bcentre.
DR   PROSITE; PS00152; ATPASE_ALPHA_BETA; PARTIAL.
KW   ATP synthesis; ATP-binding; CF(1); Direct protein sequencing;
KW   Hydrogen ion transport; Hydrolase; Mitochondrion.
FT   UNSURE        8      8
FT   UNSURE       17     19
FT   NON_TER      19     19
SQ   SEQUENCE   19 AA;  1871 MW;  BB9C163FDC60BB42 CRC64;
     ATQTSPSPKG AAAXXXRVV
//
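The entry above is line-oriented: each line carries a two-letter code (ID, AC, DE, ...) followed by data, and the entry ends with `//`. As a rough illustration of what reading this format involves, here is a minimal Python sketch. It assumes the usual flat-file layout (code in the first two columns, data from the sixth column on); the function name `parse_flat_entry` and the `"seq"` key are made up for this example and are not part of any UniProt tool.

```python
from collections import defaultdict

def parse_flat_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):          # entry terminator
            break
        code, data = line[:2], line[5:].strip()
        if code.strip():                   # ordinary line: ID, AC, DE, ...
            fields[code].append(data)
        elif data:                         # sequence lines have a blank code
            fields["seq"].append(data.replace(" ", ""))
    return dict(fields)

# Abbreviated copy of the entry shown on the slide.
sample = """\
ID   ATPB_CANFA     STANDARD;      PRT;    19 AA.
AC   P99504;
OS   Canis familiaris (Dog).
OX   NCBI_TaxID=9615;
SQ   SEQUENCE   19 AA;  1871 MW;  BB9C163FDC60BB42 CRC64;
     ATQTSPSPKG AAAXXXRVV
//
"""

entry = parse_flat_entry(sample)
accession = entry["AC"][0].rstrip(";")
sequence = "".join(entry["seq"])
print(accession, len(sequence))            # P99504 19
```

Even this small sketch has to know about trailing semicolons and blank-coded sequence lines, and every line type would need further parsing rules of its own (cross-references, comment topics, feature coordinates).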
7- What have we done so far?
8
DIR  Parent Directory          19-Jul-2004 13:02      -
     cellular-components.rdf   11-Oct-2004 19:15     5k
     databases.rdf             11-Oct-2004 19:15    45k
     databases.rdf.gz          13-Sep-2004 11:34     6k
     datasets.rdf              19-Oct-2004 16:32     4k
     enzymes.rdf.gz            11-Oct-2004 19:15   309k
     go.rdf.gz                 11-Oct-2004 19:15   839k
     intact.rdf.gz             11-Oct-2004 19:15   636k
     keywords.rdf.gz           11-Oct-2004 19:15    96k
     ontology.owl              19-Oct-2004 18:27    77k
     taxonomy.rdf.gz           11-Oct-2004 19:15   4.0M
     uniparc.rdf.gz            13-Oct-2004 10:54   762M
     uniprot.rdf.gz            11-Oct-2004 19:39   768M
     uniref.rdf.gz             01-Oct-2004 12:56  52.2M
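Several of the files listed here are hundreds of megabytes even when compressed, so loading one into memory as a whole is unlikely to be practical. The following Python sketch shows one way to stream such a file using only the standard library. It assumes that each resource is a direct child of the `rdf:RDF` root and carries an `rdf:about` attribute; the function name `iter_resource_uris` and the tiny in-memory test document are illustrative stand-ins, not the real data.

```python
import gzip
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def iter_resource_uris(fileobj):
    """Yield the rdf:about URI of each top-level element, one at a time."""
    depth = 0
    for event, elem in ET.iterparse(fileobj, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 1:                     # a direct child of the root has ended
            about = elem.get(f"{{{RDF_NS}}}about")
            if about is not None:
                yield about
            elem.clear()                   # drop the children and text just handled

# Small stand-in document, compressed in memory for the demonstration.
doc = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="urn:lsid:uniprot.org:ontology:">
<rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9606">
  <mnemonic>HUMAN</mnemonic>
</rdf:Description>
<rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9615"/>
</rdf:RDF>
"""

compressed = io.BytesIO(gzip.compress(doc))
with gzip.open(compressed, "rb") as stream:
    uris = list(iter_resource_uris(stream))
print(uris)
```

With a real download, the `io.BytesIO` object would be replaced by the path of the `.gz` file; the generator itself stays the same.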
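13
use Expasy::RDF;

# Open one entry and iterate over the proteins it contains.
my $parser = Expasy::RDF::Parser->new('P12345.rdf');
while (my $protein = $parser->next) {
    my $id   = $protein->id;
    my $mass = $protein->sequence->mass;
    print "Mass of $id is $mass.\n";
    # List the type and comment of each annotation.
    print $_->type, ' ', $_->comment, "\n"
        foreach ($protein->annotation);
}
$parser->close;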
16 XML Syntax
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="urn:lsid:uniprot.org:ontology:"
  xmlns:owl="http://www.w3.org/2002/07/owl#">
<rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9606">
  <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Taxon"/>
  <mnemonic>HUMAN</mnemonic>
  <scientificName>Homo sapiens</scientificName>
  <commonName>Human</commonName>
  <rdfs:subClassOf rdf:resource="urn:lsid:uniprot.org:taxonomy:9605"/>
</rdf:Description>
</rdf:RDF>
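To make the link between this XML syntax and the underlying triples concrete, the Python sketch below reads the taxon description and lists its (subject, predicate, object) statements. It only handles the flat pattern used in this example (an `rdf:Description` with `rdf:about`, whose children hold either an `rdf:resource` attribute or literal text) and is not a general RDF/XML parser; the function name `simple_triples` is made up for the illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

TAXON_XML = """\
<rdf:RDF
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="urn:lsid:uniprot.org:ontology:">
<rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9606">
  <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Taxon"/>
  <mnemonic>HUMAN</mnemonic>
  <scientificName>Homo sapiens</scientificName>
  <commonName>Human</commonName>
  <rdfs:subClassOf rdf:resource="urn:lsid:uniprot.org:taxonomy:9605"/>
</rdf:Description>
</rdf:RDF>
"""

def simple_triples(xml_text):
    """Return (subject, predicate, object) tuples for flat rdf:Description nodes."""
    triples = []
    root = ET.fromstring(xml_text)
    for node in root.findall(f"{{{RDF}}}Description"):
        subject = node.get(f"{{{RDF}}}about")
        for prop in node:
            # ElementTree stores qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            predicate = namespace + local
            resource = prop.get(f"{{{RDF}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples

triples = simple_triples(TAXON_XML)
for triple in triples:
    print(triple)
```

Each child element of the description becomes one triple, so the five child elements here yield five statements about taxon 9606.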
17- Triples, Quads and Quints
- What is the source of a triple?
- Compact reification.
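The slide does not spell out a representation, so the following Python sketch is only one possible reading of the problem. It contrasts a quad (a triple with a fourth slot, assumed here to name the source of the statement) with standard RDF reification, which describes the same statement through an `rdf:Statement` node. The `Quad` type, the `reify` helper, the `urn:example:` identifiers and the choice of a file name as source are all hypothetical.

```python
from typing import NamedTuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    source: str        # assumed meaning of the fourth slot: where the triple came from

def reify(quad, statement_id, source_property="urn:example:source"):
    """Express the same information with standard RDF reification triples."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", quad.subject),
        (statement_id, RDF + "predicate", quad.predicate),
        (statement_id, RDF + "object", quad.object),
        (statement_id, source_property, quad.source),
    ]

quad = Quad(
    "urn:lsid:uniprot.org:taxonomy:9606",
    "urn:lsid:uniprot.org:ontology:mnemonic",
    "HUMAN",
    "taxonomy.rdf.gz",
)
statements = reify(quad, "urn:example:statement-1")
print(len(statements))   # 5
```

One quad carries what reification needs five triples to say, which is presumably the motivation for looking at more compact forms.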
18- Web Services
- Overkill for providing programmatic access to resources.
- Often impractical for performance reasons.
19- Life Science Identifiers
- Need special resolver.
- Resolution tied to retrieval.
- Explicit version numbers.
- Not widely used.
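For reference, an LSID is a URN of the form `urn:lsid:authority:namespace:object`, optionally followed by `:revision`, which is where the explicit version number goes. The Python sketch below splits such an identifier into its parts. It is a simplified reading of the syntax with no resolution step, and the `LSID` type, the `parse_lsid` function and the revision value `1` in the second call are invented for the example.

```python
from typing import NamedTuple, Optional

class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]

def parse_lsid(urn):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = urn.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {urn!r}")
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)

taxon = parse_lsid("urn:lsid:uniprot.org:taxonomy:9606")
print(taxon.authority, taxon.namespace, taxon.object_id, taxon.revision)
# uniprot.org taxonomy 9606 None

# Hypothetical revision, only to show where a version number would appear.
versioned = parse_lsid("urn:lsid:uniprot.org:taxonomy:9606:1")
print(versioned.revision)
```

Parsing the identifier is the easy part; turning it into data still requires the special resolver mentioned in the list above.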
20- Embedded References
- uniprot.rdf
<rdf:Description rdf:about="_2F9A">
  <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Caution_Annotation"/>
  <rdfs:comment>In mouse, 5 genes homologous to human CD209/DC-SIGN and CD209L/DC-SIGNR have been identified. Mouse CD209A product was named DC-SIGN by citation 1 because of its similar expression pattern and chromosomal location in juxtaposition to CD23, but despite of the low sequence similarity.</rdfs:comment>
  <citation rdf:resource="_2F8A"/>
</rdf:Description>
- cyc.rdf
<owl:Class rdf:ID="Antigen">
  <rdfs:comment>The collection of substances that can stimulate immune response. For example, bacteria Bacterium, Viruses, proteins ProteinMolecule can serve as Antigens.</rdfs:comment>
</owl:Class>
21 22- People will adopt the technology if it provides
immediate benefits and is simple to use.
23 24
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
>
<foaf:Project>
  <foaf:name>UniProt</foaf:name>
  <foaf:homepage rdf:resource="http://uniprot.org/"/>
  <foaf:fundedBy>
    <foaf:Organization>
      <foaf:name>National Institutes of Health</foaf:name>
      <foaf:homepage rdf:resource="http://www.nih.gov/"/>
    </foaf:Organization>
  </foaf:fundedBy>
</foaf:Project>

<foaf:Organization>