Title: NPACI ppt template
1 http://www.rcsb.org/pdb
Work on Integration
Phil Bourne bourne_at_sdsc.edu
2 The Protein Data Bank http://www.rcsb.org
- The next step in biology
- Approx. 12,000 structures
- 50 GB of primary information
- 75,000 queries/day
- 6 mirror sites worldwide
- Complex data conforming to a well-defined ontology
- Large-scale computation, terabytes of secondary information
Bacteriophage phiX174 - The 10,000th PDB Structure
3 Integration - Short to Long Range Plans
Now - Published URL - return single or multiple structures
http://www.rcsb.org/pdb/linking.html
Short - Molecular Information Agent
http://mia.sdsc.edu
Mike Gribskov, Justin Calbanero (1 month)
Medium - BioXML
http://www.sdsc.edu/bioxml
Peng Yang, David Goodsell (2 months - 1 year)
Long - CORBA
Doug Greer (1-2 years)
4 Short Term - Molecular Information Agent http://beta.rcsb.org
Periodic Updates
MIA
60 Internet resources queried simultaneously from a PDB ID
PDB
Sybase
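The fan-out idea on this slide (one PDB ID, many resources queried at once) might be sketched as follows. This is a minimal illustration and not the MIA implementation: the three lookup functions, the resource names, and the ID "1abc" are hypothetical stand-ins, and no real network calls are made.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote resources. A real agent would issue
# network requests here; names and return values are illustrative only.
def lookup_sequence_db(pdb_id):
    return f"sequence record for {pdb_id}"

def lookup_literature_db(pdb_id):
    return f"literature hits for {pdb_id}"

def lookup_classification_db(pdb_id):
    return f"fold classification for {pdb_id}"

RESOURCES = {
    "sequence": lookup_sequence_db,
    "literature": lookup_literature_db,
    "classification": lookup_classification_db,
}

def query_all(pdb_id, resources=RESOURCES, max_workers=8):
    """Submit one lookup per resource and collect results keyed by resource name."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # All lookups are submitted before any result is awaited,
        # so they run concurrently rather than one after another.
        futures = {name: pool.submit(fn, pdb_id) for name, fn in resources.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=10)
            except Exception as exc:
                # One failing resource should not sink the others.
                results[name] = f"error: {exc}"
    return results
```

Calling query_all("1abc") returns a dictionary with one entry per resource. Recording failures per resource, rather than raising, is one plausible choice when some of the remote sites may be unavailable.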
5 Common Architecture
External Level: WWW Integrated Query Interface, Visualization Tools, BioSTAR Editor
Storage Resource Broker, POM Objects
Physical Level: FTP Archive, Relational DB Tables
Data Exchange Level: XML
Conceptual Level: Conceptual Schema (STAR/DDL v2.0)
6 Conceptual Schema - 7 Years of Work
Westbrook and Bourne (2000) Bioinformatics, next issue
http://www.sdsc.edu/pb/papers/bioinfo99.pdf
7 BioXML - ejournal in 2001
Queriable Integrated Intuitive Interface
DTD Molecular Scene
DTD Structure
DTD Function
Bioeditor
PDB
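As a rough illustration of what "queriable" could mean for XML-encoded structure and function data, the sketch below parses a small document and pulls out a summary. The element and attribute names (entry, structure, chain, function, annotation) and the ID "1abc" are hypothetical and are not taken from the BioXML DTDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only
# and do not reflect the actual BioXML DTDs.
DOC = """
<entry id="1abc">
  <structure>
    <chain id="A" residues="120"/>
    <chain id="B" residues="98"/>
  </structure>
  <function>
    <annotation>hypothetical annotation text</annotation>
  </function>
</entry>
"""

def summarize(xml_text):
    """Return the entry ID, chain lengths, and annotation strings."""
    root = ET.fromstring(xml_text.strip())
    chains = {c.get("id"): int(c.get("residues")) for c in root.iter("chain")}
    notes = [a.text for a in root.iter("annotation")]
    return {"id": root.get("id"), "chains": chains, "annotations": notes}
```

Calling summarize(DOC) gathers the structural and functional parts of one entry into a single dictionary, which hints at how literature-style annotation and database content could be queried together.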
8 A Goal - Merge the Literature and the Database
http://www.sdsc.edu/bioxml
9 CORBA
- Experience to date - efficient and programmable
- PDB team member chairs the OMG Working Group
- Response to RFP made today
- Application demonstrating the API - 3-6 months
- OMG approval - 1 year
- Publish specification - 1-2 years