An Ontological Approach for Describing Phospho-proteins in Rhodococcus - PowerPoint PPT Presentation

Slides: 22
Provided by: denni98
Transcript and Presenter's Notes

1
An Ontological Approach for Describing Phospho-proteins in Rhodococcus
  • Dept. of Computer Science, University of British Columbia
  • Dennis Wang, Gavin Ha, Jennifer Chen, Nancy Wang
  • CPSC 445, April 5th, 2007

2
What is an ontology?
  • Purpose
    • Knowledge representation and reasoning
    • Facilitates knowledge sharing and reuse
  • Definition
    • A data model that represents a set of concepts
      within a domain and the relationships between
      those concepts.
    • It is used to reason about the objects within
      that domain.
    • Describes individuals (instances), classes
      (concepts), attributes, relations and axioms
  • Uses
    • AI, information architecture, semantic web,
      software engineering

3
Problems in biology
  • Biology is knowledge-based
    • Uses prior knowledge to infer new knowledge
  • Biology is data-rich
    • Biologists need extensive prior knowledge to
      analyze the data obtained
    • The pace of data production exceeds one's
      ability to acquire knowledge
  • Need an automated system to apply domain experts'
    knowledge to biological data

4
Solution: ontology and bioinformatics
  • Joint effort of biologists and computer scientists
  • Build ontologies using domain knowledge
  • Rapid classification of large datasets
  • Allows queries to find instances of a class
  • Create controlled vocabularies for shared use
    across different biological and medical domains.
  • In bioinformatics, ontologies can make knowledge
    available to the community and its applications.

5
Example: Gene Ontology (GO)
  • Provides structured, controlled vocabularies
    and classifications that cover several domains of
    molecular biology
  • Uses
    • Annotation of large data sets
    • The ability to group gene products under a
      high-level term
    • Computational (putative) assignments of molecular
      function based on sequence similarity to
      annotated genes or sequences.
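
To illustrate the "group gene products under a high-level term" use, here is a minimal Python sketch. The is_a hierarchy and the gene names (geneA, geneB, geneC) are hand-written assumptions for illustration only; they are not an extract of the real GO graph or of any annotation file.

```python
# Toy is_a hierarchy (child -> parents). Hand-written for illustration;
# it is NOT an extract of the real Gene Ontology graph.
IS_A = {
    "protein tyrosine phosphatase activity": ["phosphoprotein phosphatase activity"],
    "phosphoprotein phosphatase activity": ["phosphatase activity"],
    "alkaline phosphatase activity": ["phosphatase activity"],
    "phosphatase activity": ["catalytic activity"],
    "protein tyrosine kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "catalytic activity": [],
}

# Hypothetical gene products, each annotated with one term from the toy hierarchy.
ANNOTATIONS = {
    "geneA": "protein tyrosine phosphatase activity",
    "geneB": "alkaline phosphatase activity",
    "geneC": "protein tyrosine kinase activity",
}


def ancestors(term):
    """Return the term itself plus every term reachable through is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def group_by(high_level_term, annotations):
    """Return, sorted, the gene products whose annotation falls under the term."""
    return sorted(
        gene
        for gene, term in annotations.items()
        if high_level_term in ancestors(term)
    )


print(group_by("phosphatase activity", ANNOTATIONS))  # ['geneA', 'geneB']
```

Walking up the is_a links is what lets a specific annotation count toward a broader term without that broader term being stated explicitly for each gene product.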

6
How are ontologies built?
  • There is no standardized methodology
    • But there are efforts to develop more
      comprehensive guidelines
  • In general
    • Informal stage: natural language
    • Formal stage: formal knowledge representation
      language
7
Ontology-building life cycle
  • Inspired by software engineering.
  • User Model (Biologist)
    • 1) Identification of the purpose and scope
      of the ontology
    • 2) Acquisition of domain knowledge

[Life-cycle diagram, stages shown: Identify purpose and scope; Knowledge Acquisition]
8
Ontology-building life cycle
  • Conceptualization Model (Bioinformatician/Biologist)
    • 3) Identifying key concepts in the domain.
    • 4) Integration by using and incorporating other
      existing ontologies

[Life-cycle diagram, stages shown: Identify purpose and scope; Knowledge Acquisition; Building; Conceptualization; Integrating existing ontologies]
9
Ontology-building life cycle
  • Implementation Model (Bioinformatician)
    • 5) Representing concepts with a formal language
    • 6) Documenting informal and formal definitions
    • 7) Evaluation of the appropriateness of the
      ontology for its intended application

[Life-cycle diagram, stages shown: Identify purpose and scope; Available Development Tools; Knowledge Acquisition; Language Representation; Building; Conceptualization; Integrating existing ontologies; Encoding; Evaluation]
10
Describing Phospho-Proteins using the Phosphabase Ontology
  • Can we use the Phosphabase ontology to describe
    phospho-proteins discovered by the Rhodococcus
    Genome Project?

11
Web Ontology Language (OWL)
  • XML syntax
  • OWL-DL (Description Logic): certain restrictions
    guarantee decidability, based on description
    logic
  • OWL uses the Resource Description Framework (RDF)
    • Subject, Predicate, Object
  • Basic components in OWL
    • Classes
    • Individuals
    • Properties

[Figure labels: Individual: Anne Condon; Individual: Jennifer Chen]
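
The subject, predicate, object structure of RDF can be sketched in plain Python as follows. The two individuals come from the slide; the class name "Person" and the `match` helper are assumptions made for this example, and no RDF library is used.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# rdf:type is the standard RDF predicate linking an individual to its class.
# The class name "Person" is an assumption made for this example.
TRIPLES = [
    Triple("AnneCondon", "rdf:type", "Person"),
    Triple("JenniferChen", "rdf:type", "Person"),
    Triple("Person", "rdf:type", "owl:Class"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Find all individuals of class Person.
people = [t.subject for t in match(TRIPLES, predicate="rdf:type", obj="Person")]
print(people)  # ['AnneCondon', 'JenniferChen']
```

Classes, individuals and properties all end up as statements of this one shape, which is why a single pattern-matching function is enough to ask "which individuals belong to this class?".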
12
Phosphabase Ontology
Wolstencroft et al., 2006
  • Biological Motivation
    • Driven by protein domain architecture to describe
      signalling protein families
  • Background knowledge required for construction
    • Signal protein domains
    • Presence of protein domains within signal
      proteins
  • OWL Ontology
    • Ontology uses OWL-DL
    • Description logic can be applied to classify
      proteins using reasoners
    • Many different ways to represent this knowledge
      in OWL
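
The idea of classifying a protein from its domain architecture can be sketched as a set-inclusion check. This is a deliberately simplified stand-in: the class names and required-domain sets below are hypothetical and are not the Phosphabase axioms, and a real OWL-DL reasoner works under open-world semantics with much richer class definitions.

```python
# Hypothetical class definitions: each class lists the domains a protein must
# contain. These are simplified stand-ins, NOT the real Phosphabase axioms.
CLASS_DEFINITIONS = {
    "ExampleLMWPhosphatase": {"LMWPc"},
    "ExampleTwoDomainPhosphatase": {"LMWPc", "ExampleDomainB"},
}


def classify(domains):
    """Return, sorted, every class whose required domains are all present."""
    present = set(domains)
    return sorted(
        name
        for name, required in CLASS_DEFINITIONS.items()
        if required <= present  # subset test: all required domains found
    )


print(classify(["LMWPc"]))                    # ['ExampleLMWPhosphatase']
print(classify(["LMWPc", "ExampleDomainB"]))  # both example classes
print(classify(["Wzz"]))                      # [] (no class matches)
```

An empty list here plays the role of a query that returns no result: the protein's domains do not satisfy any class definition the ontology knows about.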

13
Phosphabase.owl
14
OWL DL Reasoners: Pellet
  • Input
    • Ontology: OWL-DL format
      • Axioms about classes go into the TBox
      • Type and property assertions (individuals) go
        into the ABox
    • Query: RDQL (SPARQL) format
    • Instance data (individuals)
  • Tableau Reasoner
    • Checks satisfiability of an ABox with respect to
      a TBox
    • Tests for knowledge base consistency

Parsia and Sirin, ISWC 2004
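
The TBox/ABox split and the notion of a consistency check can be illustrated with a small Python sketch. This is not a tableau algorithm and not how Pellet is implemented: it handles only atomic subclass and disjointness axioms, and every class and individual name below is hypothetical.

```python
# TBox: axioms about classes (hypothetical names).
SUBCLASS_OF = {
    "TyrosinePhosphatase": {"Phosphatase"},
    "TyrosineKinase": {"Kinase"},
    "Phosphatase": {"Enzyme"},
    "Kinase": {"Enzyme"},
}
DISJOINT = [("Phosphatase", "Kinase")]

# ABox: type assertions about individuals (hypothetical names).
ABOX = {
    "protein_1": {"TyrosinePhosphatase"},
    "protein_2": {"TyrosinePhosphatase", "TyrosineKinase"},
}


def superclasses(cls):
    """Return cls plus all classes it is transitively a subclass of."""
    result, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return result


def inferred_types(individual):
    """Return every class the individual belongs to, asserted or inherited."""
    types = set()
    for asserted in ABOX[individual]:
        types |= superclasses(asserted)
    return types


def inconsistent_individuals():
    """Return, sorted, individuals whose inferred types include a disjoint pair."""
    return sorted(
        individual
        for individual in ABOX
        if any(
            a in inferred_types(individual) and b in inferred_types(individual)
            for a, b in DISJOINT
        )
    )


print(inconsistent_individuals())  # ['protein_2']
```

Here protein_2 makes the knowledge base inconsistent because its asserted types imply membership in two classes that the TBox declares disjoint; a full reasoner performs the same kind of test over far more expressive axioms.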
15
Instance Data
  • Locus ID: RHA1_ro01186
  • Strain: Rhodococcus sp. RHA1
  • Replicon: Chromosome
  • Refseq: NC_008268
  • Start: 1260414
  • Stop: 1260866
  • Gene Name: (none listed)
  • Alternate gene name(s): (none listed)
  • Protein / Product Name: protein-tyrosine-phosphatase
  • Alternate product name(s): (none listed)
  • Refseq GI Number: 111018199
  • Category: Protein
  • Localization: Cytoplasmic (Class 3)
  • Transposon Mutant Available?: No transposon mutant available yet
  • COG predictions: COG0394; Wzb, Protein-tyrosine-phosphatase; Signal transduction mechanisms
  • PseudoCAP: (none listed)
  • EC Number: 3.1.3.48
  • Comments: (none listed)
  • PFAM predictions: PF01451 LMWPc, Low molecular weight phosphotyrosine protein phosphatase
  • go_function: protein tyrosine phosphatase activity (goid 0004725)
16
Query Result
17
Instance Data
  • Locus ID: RHA1_ro05453
  • Strain: Rhodococcus sp. RHA1
  • Replicon: Chromosome
  • Refseq: NC_008268
  • Start: 5845588
  • Stop: 5847288
  • Gene Name: (none listed)
  • Alternate gene name(s): (none listed)
  • Protein / Product Name: probable protein-tyrosine kinase
  • Alternate product name(s): (none listed)
  • Refseq GI Number: 111022419
  • Category: Protein
  • Localization: Cytoplasmic Membrane (Class 3)
  • Transposon Mutant Available?: No transposon mutant available yet
  • COG predictions: COG0489; Mrp, ATPases involved in chromosome partitioning; Cell division and chromosome partitioning
  • PseudoCAP: (none listed)
  • EC Number: 2.7.10.1
  • TIGRFAM predictions: TIGR01007, eps_fam - capsular exopolysaccharide family (6.7e-46); Role: Transport and binding proteins; Sub Role: Carbohydrates, organic alcohols, and acids
  • Comments: (none listed)
  • PFAM predictions: PF02706 Wzz, Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases.
  • go_component: signal recognition particle (sensu Eukaryota) (goid 0005786)
18
Query Result
19
Instance Data
  • Locus ID: RHA1_ro05554
  • Strain: Rhodococcus sp. RHA1
  • Replicon: Chromosome
  • Refseq: NC_008268
  • Start: 5971327
  • Stop: 5972865
  • Gene Name: (none listed)
  • Alternate gene name(s): (none listed)
  • Protein / Product Name: probable alkaline phosphatase
  • Alternate product name(s): (none listed)
  • Refseq GI Number: 111022520
  • Category: Protein
  • Localization: Unknown (This protein may have multiple localization sites) (Class 3)
  • Transposon Mutant Available?: No transposon mutant available yet
  • COG predictions: COG3540; PhoD, Phosphodiesterase/alkaline phosphatase D; Inorganic ion transport and metabolism
  • TIGRFAM predictions: (none listed)
  • Comments: (none listed)
  • PFAM predictions: PF00245 Alk_phosphatase, Alkaline phosphatase
  • go_component: organelle inner membrane (goid 0019866)

No Result
20
Conclusions
  • Ontologies can be used as a standard model for
    the exchange of biological information
  • Building ontologies can get very complicated
    • Biologists with little description logic training
    • Computer scientists with little knowledge of
      biology
    • Need more bioinformaticians
  • Ontologies can facilitate automated annotation of
    genes / gene products
  • Difficult to read and infer from ontologies
    • Ontologies can get very big (Phosphabase is only
      a small example)
    • Reasoners are sometimes slow and inaccurate

www.quicklybored.com
21
Acknowledgements
  • Rhodococcus sp. RHA1 data
    • Eltis Lab: Dr. Lindsay Eltis, Dept. Microbiology &
      Biochemistry
  • Phosphabase Ontology
    • Wolstencroft Lab, University of Manchester, UK
    • Bioinformatics paper: Wolstencroft et al., 2006
  • Phosphabase Ontology processing
    • Benjamin Good, iCAPTURE Centre, Vancouver