Title: BioPAX: A Data Exchange Format for Biological Pathways
1. BioPAX: A Data Exchange Format for Biological Pathways
- Michael Cary, Joanne Luciano
- BioPAX Workgroup
- www.biopax.org
- SOFG October 2004
2. Introduction
- BioPAX: Biological Pathway Exchange
- An open source community effort to build a data exchange format for biological pathways
- Began at ISMB 2002
- First draft (Level 0.5) released Sept 2003
- Level 1 released July 2004
- Focuses on representing metabolic pathways
- Later levels will expand this scope
3. The domain: Biological pathways
Main categories:
- Metabolic Pathways
- Molecular Interaction Networks
- Signalling Pathways
4. The Problem
- So many pathway databases, all with their own
data models, formats, and data access methods.
Source: Pathway Resource List (http://cbio.mskcc.org/prl/)
5. BioPAX Motivation
>150 DBs and tools
[Figure: two panels, "Before BioPAX" and "With BioPAX"; node labels: Application, Database, User]
Common format will make data more accessible,
promoting data sharing and distributed curation
efforts
6. Exchange Formats in the Pathway Data Space
[Figure: map of the pathway data space. Groupings: Database Exchange Formats; Simulation Model Exchange Formats. Formats shown: SBML, CellML; PSI-MI. Regions: Biochemical Reactions; Protein Interaction Networks; Rate Formulas; Metabolic Pathways (low detail to high detail); Regulatory Pathways (low detail to high detail)]
7. Exchange Formats in the Pathway Data Space
[Figure: same map with BioPAX added. Groupings: Database Exchange Formats; Simulation Model Exchange Formats. Formats shown: BioPAX; SBML, CellML; PSI-MI 2. Regions: Genetic Interactions; Rate Formulas; Biochemical Reactions]
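8. Exchange Formats in the Pathway Data Space
[Figure: same map showing the scope of BioPAX Level 1. Groupings: Database Exchange Formats; Simulation Model Exchange Formats. Formats shown: BioPAX Level 1; SBML, CellML; PSI-MI 2. Regions: Genetic Interactions; Rate Formulas; Biochemical Reactions]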
9. BioPAX Ontology
- Conceptual framework based upon existing DB schemas
- aMAZE, BIND, EcoCyc, WIT, KEGG, Reactome, etc.
- Allows wide range of detail, multiple levels of abstraction
- BioPAX ontology in OWL (XML)
- Ontology built using GKB Editor and Protégé
- Semantic mapping still a manual process
- Level 1 represents metabolic pathway data
- Large body of well understood data
- Stable representations (old data)
10. BioPAX Ontology: Overview
Level 1 v1.0 (Released July 7th, 2004)
11. BioPAX Ontology: Top Level
- Pathway
- A set of interactions
- E.g. Glycolysis, MAPK, Apoptosis
- Interaction
- A set of entities and some relationship between them
- E.g. Reaction, Molecular Association, Catalysis
- Physical Entity
- A building block of simple interactions
- E.g. Small molecule, Protein, DNA, RNA
12. BioPAX Ontology: Interactions
13. BioPAX Ontology: Physical Entities
14. How it works
A typical pathway would be decomposed into a single pathway instance, which would contain several pathway steps. Each step would contain one or more interactions occurring between physical entity participants, and each participant points to one physical entity.
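The decomposition described above can be sketched as nested data structures. This is a minimal Python illustration, not the BioPAX OWL schema itself: the class and field names (`Pathway`, `PathwayStep`, `Interaction`, `PhysicalEntityParticipant`, `PhysicalEntity`, `role`, `kind`) paraphrase the terms on this slide, and the glycolysis step used as sample data is an assumption chosen for illustration (cofactors such as ATP are omitted).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PhysicalEntity:
    """A building block such as a small molecule or a protein."""
    name: str
    kind: str  # illustrative label, e.g. "small molecule" or "protein"


@dataclass
class PhysicalEntityParticipant:
    """One role in an interaction; points to exactly one physical entity."""
    entity: PhysicalEntity
    role: str  # hypothetical field: "left", "right", "controller"


@dataclass
class Interaction:
    name: str
    participants: List[PhysicalEntityParticipant] = field(default_factory=list)


@dataclass
class PathwayStep:
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    steps: List[PathwayStep] = field(default_factory=list)

    def entities(self) -> List[PhysicalEntity]:
        """Walk pathway -> steps -> interactions -> participants -> entity."""
        seen: List[PhysicalEntity] = []
        for step in self.steps:
            for interaction in step.interactions:
                for participant in interaction.participants:
                    if participant.entity not in seen:
                        seen.append(participant.entity)
        return seen


# Sample data (assumed for illustration): the first step of glycolysis.
glucose = PhysicalEntity("glucose", "small molecule")
g6p = PhysicalEntity("glucose 6-phosphate", "small molecule")
hexokinase = PhysicalEntity("hexokinase", "protein")

reaction = Interaction("glucose phosphorylation", [
    PhysicalEntityParticipant(glucose, "left"),
    PhysicalEntityParticipant(g6p, "right"),
])
catalysis = Interaction("catalysis by hexokinase", [
    PhysicalEntityParticipant(hexokinase, "controller"),
])
glycolysis = Pathway("Glycolysis", [PathwayStep([reaction, catalysis])])

print([e.name for e in glycolysis.entities()])
# → ['glucose', 'glucose 6-phosphate', 'hexokinase']
```

Keeping the participant separate from the entity lets the same physical entity appear in several interactions with different roles, which is the point of the indirection described on the slide.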
15. Using other ontologies
- Use pointers to existing ontologies to provide supplemental annotation where appropriate
- Cellular location → GO Component
- Cell type → Cell.obo
- Organism → NCBI taxon DB
- Incorporate other standards where appropriate
- Chemical structure → SMILES, CML, InChI
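The pointer idea above can be sketched as a small lookup from annotation field to external resource. This is a minimal Python sketch under stated assumptions: the `ExternalReference` class, the `ANNOTATION_SOURCES` table, and the `annotate` helper are hypothetical names, not part of BioPAX. The identifiers (GO:0005829 for cytosol, NCBI taxon 562 for Escherichia coli, and the SMILES string for ethanol) are sample data chosen for illustration.

```python
from dataclasses import dataclass

# Annotation field -> external resource named on the slide.
ANNOTATION_SOURCES = {
    "cellular_location": "GO Component",
    "cell_type": "Cell.obo",
    "organism": "NCBI taxon DB",
}


@dataclass(frozen=True)
class ExternalReference:
    """A pointer into an external ontology or database (hypothetical class)."""
    db: str
    id: str


def annotate(field_name: str, external_id: str) -> ExternalReference:
    """Build a pointer, rejecting fields with no registered external resource."""
    if field_name not in ANNOTATION_SOURCES:
        raise KeyError(f"no external resource registered for {field_name!r}")
    return ExternalReference(ANNOTATION_SOURCES[field_name], external_id)


location = annotate("cellular_location", "GO:0005829")  # cytosol
organism = annotate("organism", "562")                  # Escherichia coli

# Chemical structure is carried in an existing standard notation rather than
# as a pointer; here, a SMILES string for ethanol.
ethanol_structure = {"format": "SMILES", "data": "CCO"}

print(location)
print(organism)
```

Storing only the resource name and identifier keeps the pathway record small and leaves the definition of each term with the ontology that maintains it.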
16. BioPAX Workgroup: Organizational Structure
- Small core group advancing the standard
- Increased representation from mailing lists and subgroups
- Cost paid by participants/DOE workshop grant
- Special topics have subgroups
- Core group member plus outside experts
- Tackle specific challenges
- E.g. States, small molecules, examples
17. Current status
- Converting data into Level 1
- EcoCyc and other BioCyc DBs done
- KEGG, Reactome, WIT, aMAZE in progress
- Finishing Level 2
- Adding support for binding interactions
- Draft release coming soon
- Organizing Level 3
- Focusing on signaling pathways
18. BioPAX Supporting Groups
- Groups
- Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, C. Sander
- SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
- University of Colorado Health Sciences Center: I. Shah
- BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
- Argonne National Laboratory: N. Maltsev, E. Marland
- Samuel Lunenfeld Research Institute: C. Hogue
- Harvard Medical School: E. Brauner, D. Marks, A. Regev
- NIST: R. Goldberg
- Stanford: T. Klein
- Columbia: A. Rzhetsky
- Dana Farber Cancer Institute: J. Zucker
- Collaborating Organizations
- Proteomics Standards Initiative (PSI)
- Systems Biology Markup Language (SBML)
- CellML
- Chemical Markup Language (CML)
- Databases
- BioCyc (www.biocyc.org)
- BIND (www.bind.ca)
- WIT (wit.mcs.anl.gov/WIT2)
- PharmGKB (www.pharmgkb.org)
- Grants
- Department of Energy (Workshop)
The BioPAX Community