Title: UniProt


1
UniProt
  • Eric Jain
  • Swiss Institute of Bioinformatics, Geneva
  • W3C Workshop on Semantic Web for Life Sciences,
    October 2004

2
  • What is it?

5
ID   ATPB_CANFA     STANDARD;      PRT;    19 AA.
AC   P99504;
DT   15-JUL-1998 (Rel. 36, Created)
DT   15-JUL-1998 (Rel. 36, Last sequence update)
DT   05-JUL-2004 (Rel. 44, Last annotation update)
DE   ATP synthase beta chain, mitochondrial (EC 3.6.3.14) (Fragment).
GN   Name=ATP5B;
OS   Canis familiaris (Dog).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis.
OX   NCBI_TaxID=9615;
RN   [1]
RP   SEQUENCE.
RC   TISSUE=Heart;
RX   MEDLINE=98163340; PubMed=9504812;
RA   Dunn M.J., Corbett J.M., Wheeler C.H.;
RT   "HSC-2DPAGE and the two-dimensional gel electrophoresis database of
RT   dog heart proteins.";
RL   Electrophoresis 18:2795-2802(1997).
CC   -!- FUNCTION: Produces ATP from ADP in the presence of a proton
CC       gradient across the membrane. The beta chain is the catalytic
CC       subunit.
CC   -!- CATALYTIC ACTIVITY: ATP + H(2)O + H(+)(In) = ADP + phosphate +
CC       H(+)(Out).
CC   -!- SUBUNIT: F-type ATPases have 2 components, CF(1) - the catalytic
CC       core - and CF(0) - the membrane proton channel. CF(1) has five
CC       subunits: alpha(3), beta(3), gamma(1), delta(1), epsilon(1). CF(0)
CC       has three main subunits: a, b and c.
CC   -!- SUBCELLULAR LOCATION: Mitochondrial.
CC   -!- SIMILARITY: Belongs to the ATPase alpha/beta chains family.
DR   HSC-2DPAGE; P99504; DOG.
DR   InterPro; IPR000194; ATPase_a/bcentre.
DR   PROSITE; PS00152; ATPASE_ALPHA_BETA; PARTIAL.
KW   ATP synthesis; ATP-binding; CF(1); Direct protein sequencing;
KW   Hydrogen ion transport; Hydrolase; Mitochondrion.
FT   UNSURE        8      8
FT   UNSURE       17     19
FT   NON_TER      19     19
SQ   SEQUENCE   19 AA;  1871 MW;  BB9C163FDC60BB42 CRC64;
     ATQTSPSPKG AAAXXXRVV
//
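
The entry above is line-oriented: each line starts with a two-letter code, and "//" ends the entry. As a rough illustration of what reading this format involves, here is a minimal Python sketch. It is not the parser used by the project; the function name `parse_entry`, the `seq` key, and the shortened sample entry are assumptions made for the example.

```python
from collections import defaultdict

# A shortened copy of the entry above. Each two-letter line code is followed
# by three spaces and then the content of the line.
ENTRY = """\
ID   ATPB_CANFA     STANDARD;      PRT;    19 AA.
AC   P99504;
DE   ATP synthase beta chain, mitochondrial (EC 3.6.3.14) (Fragment).
OS   Canis familiaris (Dog).
OX   NCBI_TaxID=9615;
KW   ATP synthesis; ATP-binding; CF(1); Direct protein sequencing;
KW   Hydrogen ion transport; Hydrolase; Mitochondrion.
SQ   SEQUENCE   19 AA;  1871 MW;  BB9C163FDC60BB42 CRC64;
     ATQTSPSPKG AAAXXXRVV
//
"""


def parse_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):  # entry terminator
            break
        code, content = line[:2], line[5:]
        if code == "  ":  # sequence lines carry no line code
            code = "seq"  # hypothetical key chosen for this sketch
        fields[code].append(content)
    return fields


fields = parse_entry(ENTRY)
accession = fields["AC"][0].rstrip(";")
# Keywords span several KW lines, are separated by ";" and end with ".".
keyword_text = " ".join(fields["KW"]).rstrip(".")
keywords = [k.strip() for k in keyword_text.split(";") if k.strip()]
# Sequence lines are split into blocks of ten residues; drop the spaces.
sequence = "".join("".join(fields["seq"]).split())

print(accession, len(sequence), keywords)
```

Even this small case needs format-specific rules for continuation lines, separators, and the unlabeled sequence block, which is the kind of effort a generic data model is meant to avoid.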
7
  • What have we done so far?

8
  Name                       Last modified        Size
  [DIR] Parent Directory     19-Jul-2004 13:02    -
  cellular-components.rdf    11-Oct-2004 19:15    5k
  databases.rdf              11-Oct-2004 19:15    45k
  databases.rdf.gz           13-Sep-2004 11:34    6k
  datasets.rdf               19-Oct-2004 16:32    4k
  enzymes.rdf.gz             11-Oct-2004 19:15    309k
  go.rdf.gz                  11-Oct-2004 19:15    839k
  intact.rdf.gz              11-Oct-2004 19:15    636k
  keywords.rdf.gz            11-Oct-2004 19:15    96k
  ontology.owl               19-Oct-2004 18:27    77k
  taxonomy.rdf.gz            11-Oct-2004 19:15    4.0M
  uniparc.rdf.gz             13-Oct-2004 10:54    762M
  uniprot.rdf.gz             11-Oct-2004 19:39    768M
  uniref.rdf.gz              01-Oct-2004 12:56    52.2M

13
use Expasy::RDF;

# Open a parser on the RDF file of a single entry.
my $parser = Expasy::RDF::Parser->new('P12345.rdf');
while (my $protein = $parser->next) {
    my $id = $protein->id;
    my $mass = $protein->sequence->mass;
    print "Mass of $id is $mass.\n";
    # Print the type and comment of every annotation.
    print $_->type, ' ', $_->comment, "\n" foreach ($protein->annotation);
}
$parser->close;
14
  • Issues

16
XML Syntax
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="urn:lsid:uniprot.org:ontology:"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9606">
    <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Taxon"/>
    <mnemonic>HUMAN</mnemonic>
    <scientificName>Homo sapiens</scientificName>
    <commonName>Human</commonName>
    <rdfs:subClassOf rdf:resource="urn:lsid:uniprot.org:taxonomy:9605"/>
  </rdf:Description>
</rdf:RDF>
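
To make the data model behind this syntax visible, the following Python sketch flattens the taxon description above into subject, predicate, object triples using only the standard library. It is an illustration under simplifying assumptions: it handles only flat rdf:Description blocks whose properties are either plain text or an rdf:resource reference, so it is not a general RDF/XML parser. The function name `triples` is made up for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The taxon description from the slide (XML declaration omitted).
DOC = """
<rdf:RDF
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="urn:lsid:uniprot.org:ontology:"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="urn:lsid:uniprot.org:taxonomy:9606">
    <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Taxon"/>
    <mnemonic>HUMAN</mnemonic>
    <scientificName>Homo sapiens</scientificName>
    <commonName>Human</commonName>
    <rdfs:subClassOf rdf:resource="urn:lsid:uniprot.org:taxonomy:9605"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples(xml_text):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    result = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; concatenating the
            # two parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            # The object is either a resource reference or literal text.
            obj = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            result.append((subject, namespace + local, obj))
    return result


taxon_triples = triples(DOC)
for triple in taxon_triples:
    print(triple)
```

The five child elements become five triples that share one subject, which is all the XML nesting expresses here.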
17
  • Triples, Quads and Quints
  • What is the source of a triple?
  • Compact reification.
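
The question of where a triple comes from can be illustrated with a small Python sketch that contrasts two ways of attaching a source to a statement: the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object) and a quad that carries the source in a fourth slot. This shows the general trade-off only, not the compact form the slide refers to. The `urn:example:` identifiers, the `_:s1` statement label, and the function name `reify` are invented for the example.

```python
from typing import NamedTuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # fourth slot: where the statement came from


def reify(triple, statement_id, source_predicate, source):
    """Describe one triple, plus its source, with the RDF reification vocabulary."""
    return [
        Triple(statement_id, RDF + "type", RDF + "Statement"),
        Triple(statement_id, RDF + "subject", triple.subject),
        Triple(statement_id, RDF + "predicate", triple.predicate),
        Triple(statement_id, RDF + "object", triple.obj),
        # Hypothetical property linking the statement to its source.
        Triple(statement_id, source_predicate, source),
    ]


fact = Triple(
    "urn:lsid:uniprot.org:taxonomy:9606",
    "urn:lsid:uniprot.org:ontology:mnemonic",
    "HUMAN",
)
reified = reify(fact, "_:s1", "urn:example:source", "urn:example:taxonomy.rdf")
quad = Quad(*fact, source="urn:example:taxonomy.rdf")

print(len(reified), "triples when reified, or one quad:", quad)
```

Reifying a statement this way multiplies the number of triples, which matters for data sets of the size listed earlier; a quad keeps one tuple per statement at the cost of stepping outside the plain triple model.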

18
  • Web Services
  • Overkill for providing programmatic access to
    resources.
  • Often impractical for performance reasons.

19
  • Life Science Identifiers
  • Need special resolver.
  • Resolution tied to retrieval.
  • Explicit version numbers.
  • Not widely used.
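
For reference, an LSID is a URN of the form urn:lsid:authority:namespace:object, with an optional revision as a sixth component; the revision is where the explicit version number lives. The Python sketch below splits an identifier into those parts. The class and function names are made up for the example, the revision value "1" is hypothetical, and the sketch does not perform any resolution.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(urn):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = urn.split(":")
    if (
        len(parts) not in (5, 6)
        or parts[0].lower() != "urn"
        or parts[1].lower() != "lsid"
    ):
        raise ValueError(f"not an LSID: {urn!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


# Identifier taken from the taxonomy example in this presentation.
taxon = parse_lsid("urn:lsid:uniprot.org:taxonomy:9606")
# Same identifier with a hypothetical revision appended.
versioned = parse_lsid("urn:lsid:uniprot.org:taxonomy:9606:1")

print(taxon)
print(versioned)
```

Nothing in the identifier itself says where to fetch the data, which is why a special resolver is needed.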

20
  • Embedded References
  • uniprot.rdf
      <rdf:Description rdf:about="_2F9A">
        <rdf:type rdf:resource="urn:lsid:uniprot.org:ontology:Caution_Annotation"/>
        <rdfs:comment>In mouse, 5 genes homologous to human CD209/DC-SIGN
          and CD209L/DC-SIGNR have been identified. Mouse CD209A product was
          named DC-SIGN by citation 1 because of its similar expression
          pattern and chromosomal location in juxtaposition to CD23, but
          despite of the low sequence similarity.</rdfs:comment>
        <citation rdf:resource="_2F8A"/>
      </rdf:Description>
  • cyc.rdf
      <owl:Class rdf:ID="Antigen">
        <rdfs:comment>The collection of substances that can stimulate
          immune response. For example, bacteria Bacterium, Viruses, proteins
          ProteinMolecule can serve as Antigens.</rdfs:comment>
      </owl:Class>

21
  • Summary

22
  • People will adopt the technology if it provides
    immediate benefits and is simple to use.

23
  • Credits

24
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Project>
    <foaf:name>UniProt</foaf:name>
    <foaf:homepage rdf:resource="http://uniprot.org/"/>
    <foaf:fundedBy>
      <foaf:Organization>
        <foaf:name>National Institutes of Health</foaf:name>
        <foaf:homepage rdf:resource="http://www.nih.gov/"/>
      </foaf:Organization>
    </foaf:fundedBy>
  </foaf:Project>
  <foaf:Organization>