Title: Macromolecular Structure Database group
1MSD Search and Visualization tools
Jawahar Swaminathan
2Issues
- The raw database is large and complex
- 27,190 PDB entries
- 120 tables in the warehouse, many very large
- Cross-referenced against UniProt, PubMed...
- Need to expose as much of the data as possible,
without making the interface too complex
- We want to cater for three categories of user
- "Novice" user
- Experienced user
- Expert user
3biobar
A toolbar search application for Mozilla/Netscape
or Firefox browsers
4Biobar (http://biobar.mozdev.org)
5biobar
- All major bioinformatics databases covered.
- Search genomic, proteomic, structural, literature
and functional databases.
- Links to deposition and analysis tools for
sequence and structural data.
6MSDlite
A simple form-based query system to search the
MSD Databases
7MSDlite
8MSDlite
9The Atlas Pages
10The Atlas Ligands
11The Atlas Sequence
12AstexViewer_at_MSD-EBI
- View structures as wireframe, backbone or ribbons
- Built-in sequence viewer
- Calculate and display surfaces
- Various display options
- Ramachandran plots
- Distance matrix
- B-factors
Based on the AstexViewer from Astex Technology
Limited and modified under licence by the MSD
group
13Simple search interface
- Strengths
- simple, easy to use form
- allows multiple search fields to be combined
- relatively fast, despite performing quite complex
SQL queries
- Weaknesses
- not exposing the power of a relational database
- user can't specify the relationship between
search fields
- "name" AND "title" AND "keyword"
- "name" OR "title" OR "keyword"
- ( "name" OR "title" ) AND NOT "keyword"
- the search form is defined by the authors of the
search system, not the author of a query
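To make the three relationships listed above concrete, here is a minimal Python sketch that evaluates each one over a few in-memory records. The records, the search terms and the helper names (`has`, `matching_ids`) are invented for illustration; this is not how the MSD search form is implemented.

```python
# Hypothetical records; field names mirror the slide, values are made up.
records = [
    {"id": "r1", "name": "lysozyme", "title": "lysozyme structure", "keyword": "hydrolase"},
    {"id": "r2", "name": "lysozyme", "title": "crystal form II", "keyword": "mutant"},
    {"id": "r3", "name": "kinase", "title": "lysozyme comparison", "keyword": "hydrolase"},
]


def has(record, field, term):
    """True if the term occurs in the given field of the record."""
    return term in record[field]


def all_and(r):
    # "name" AND "title" AND "keyword"
    return (has(r, "name", "lysozyme")
            and has(r, "title", "lysozyme")
            and has(r, "keyword", "hydrolase"))


def any_or(r):
    # "name" OR "title" OR "keyword"
    return (has(r, "name", "lysozyme")
            or has(r, "title", "lysozyme")
            or has(r, "keyword", "hydrolase"))


def mixed(r):
    # ( "name" OR "title" ) AND NOT "keyword"
    return ((has(r, "name", "lysozyme") or has(r, "title", "lysozyme"))
            and not has(r, "keyword", "hydrolase"))


def matching_ids(predicate):
    """Return the ids of all records accepted by the predicate."""
    return [r["id"] for r in records if predicate(r)]


print(matching_ids(all_and))  # ['r1']
print(matching_ids(any_or))   # ['r1', 'r2', 'r3']
print(matching_ids(mixed))    # ['r2']
```

The same three search terms select three different result sets, which is the distinction a fixed form cannot express.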
14Describing complex searches
- We want to allow the user to entirely control
their query
- Since HTML forms are inherently static, we'll use
an applet to provide a dynamic "form" that will
let the user
- choose the fields to be searched
- specify the relationships between search fields
- choose the result fields and how results are
presented
- perform "complex" sub-queries e.g. SSM, FASTA
15A graphical database search system
- MSDpro uses an applet for constructing queries
and a server to execute them
- Avoids the need for the user to understand a
complex database schema or know SQL
- The user describes their query entirely
graphically, including logical operations such as
AND, OR and NOT
- The applet generates an XML description of the user's
query, which is sent to the MSD query server and
converted to SQL automatically
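The slides do not show the XML format itself, so the following Python sketch only illustrates what a nested AND/OR/NOT query description could look like. The element names (`query`, `and`, `or`, `not`, `field`) and the search values are assumptions, not the actual MSDpro schema.

```python
import xml.etree.ElementTree as ET


def field(name, value):
    """One search condition on a named field (hypothetical element layout)."""
    el = ET.Element("field", {"name": name})
    el.text = value
    return el


def group(op, *children):
    """A logical operator node ("and", "or" or "not") wrapping its operands."""
    el = ET.Element(op)
    el.extend(children)
    return el


def build_query():
    """Describe ( name OR title ) AND NOT keyword as an XML tree."""
    query = ET.Element("query")
    query.append(
        group("and",
              group("or",
                    field("name", "lysozyme"),
                    field("title", "lysozyme")),
              group("not", field("keyword", "mutant")))
    )
    return ET.tostring(query, encoding="unicode")


query_xml = build_query()
print(query_xml)
```

Because the logical structure is carried by the nesting of elements, a graphical editor can produce such a description without the user writing any query syntax.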
16MSDpro
A flexible graphical search interface for
advanced searching
19Automatic SQL query generation
- The query server is a Java servlet
- accepts a query description as XML
- converts the user's query description into a true
SQL query, which is then submitted to the search
database
- Searches can include components that are executed
outside of the database, e.g. sequence
similarity, determined using FASTA or structural
similarity, determined using SSM
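As a rough illustration of the XML-to-SQL step, the Python sketch below walks a small query description recursively and emits a parameterised WHERE clause. The real query server is a Java servlet whose XML format is not shown in the slides. The element names, the `COLUMNS` mapping and the `entry` table are hypothetical, and externally executed components such as FASTA or SSM are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from search-field names to database columns.
COLUMNS = {"name": "entry.name", "title": "entry.title", "keyword": "entry.keyword"}

# Hypothetical query description: ( name OR title ) AND NOT keyword.
EXAMPLE_XML = (
    '<query><and>'
    '<or><field name="name">lysozyme</field><field name="title">lysozyme</field></or>'
    '<not><field name="keyword">mutant</field></not>'
    '</and></query>'
)


def to_where(node, params):
    """Convert one XML node into a SQL condition, collecting bind values."""
    if node.tag == "field":
        column = COLUMNS[node.get("name")]  # unknown field names raise KeyError
        params.append(f"%{node.text}%")
        return f"{column} LIKE ?"
    parts = [to_where(child, params) for child in node]
    if node.tag == "not":
        if len(parts) != 1:
            raise ValueError("'not' takes exactly one operand")
        return f"NOT ({parts[0]})"
    if node.tag in ("and", "or"):
        return "(" + f" {node.tag.upper()} ".join(parts) + ")"
    raise ValueError(f"unsupported element: {node.tag}")


def xml_to_sql(query_xml):
    """Return (sql, params) for a query description with one root operator."""
    root = ET.fromstring(query_xml)
    params = []
    where = to_where(root[0], params)
    return f"SELECT entry.id FROM entry WHERE {where}", params


sql, params = xml_to_sql(EXAMPLE_XML)
print(sql)
print(params)
```

User-supplied values are passed as bind parameters instead of being spliced into the SQL text, which is the usual way to keep generated queries safe.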
20Search system is generic
- The search system is designed to be entirely
database-independent
- All information about the architecture of the
search database is stored in XML dictionaries
- Similarly, the search and result fields which the
applet presents to the user are controlled by a
dictionary
- The entire system could move to a completely
different database simply by modifying the
dictionaries
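The dictionary idea can be sketched in a few lines of Python: the same logical query is rendered against two differently structured databases simply by swapping the dictionary. The XML layout, table names and column names below are invented for illustration and do not reflect the actual MSD dictionaries.

```python
import xml.etree.ElementTree as ET

# Two hypothetical dictionaries describing the same logical fields.
DICT_A = """<dictionary table="entry">
  <field name="name" column="entry_name"/>
  <field name="title" column="entry_title"/>
</dictionary>"""

DICT_B = """<dictionary table="structures">
  <field name="name" column="label"/>
  <field name="title" column="description"/>
</dictionary>"""


def load_dictionary(xml_text):
    """Return (table, {logical field name: column}) from a dictionary document."""
    root = ET.fromstring(xml_text)
    columns = {f.get("name"): f.get("column") for f in root.findall("field")}
    return root.get("table"), columns


def build_select(dictionary, result_fields, search_field):
    """Render one logical query using whatever schema the dictionary describes."""
    table, columns = dictionary
    selected = ", ".join(columns[name] for name in result_fields)
    return f"SELECT {selected} FROM {table} WHERE {columns[search_field]} = ?"


for text in (DICT_A, DICT_B):
    print(build_select(load_dictionary(text), ["name", "title"], "name"))
```

Only the dictionary changes between the two runs. The query-building code never mentions a table or column name.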
21Java server
22Java server architecture
[Diagram labels: Methods; DB and external object ontology; User interface; DB; Methods; Interface Ontology]
23Web-services
- Some of the new services from MSD are designed as
web-services
- web-services are network-based services with
published method signatures
- can be accessed via the SOAP protocol from any
language with a SOAP library, via HTTP
- The same services used within MSDpro will be
accessible to any SOAP client
- The MSD query engine will also be available as a
web-service, allowing users to submit queries
programmatically
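For readers unfamiliar with SOAP, the Python sketch below builds the kind of request envelope a client would POST over HTTP. The envelope namespace is the standard SOAP 1.1 one. The method name `submitQuery`, its `queryXml` argument and the `urn:example:msd-query` namespace are placeholders, since the slides do not give the published method signatures. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:msd-query"  # hypothetical service namespace


def build_envelope(method, arguments):
    """Wrap a method call and its string arguments in a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: submit an (empty) query description.
envelope = build_envelope("submitQuery", {"queryXml": "<query/>"})
print(envelope)
```

In practice a SOAP library would generate this envelope from the service's published description, so client code would normally call the method directly.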
24http://www.ebi.ac.uk/msd/
25Query generation
26Query generation
SQL> select <A> from <B> where <C> <D>
A = selection, B = DB objects, C = query, D = table joins,
E = plugin description
[Diagram labels: B,C,E; A,B; B,D; Fragments of C; B; C - external]
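The template select <A> from <B> where <C> <D> can be illustrated with a short Python sketch that assembles a statement from its four parts. The table and column names are hypothetical, and the plugin description (E) and externally executed conditions are not modelled.

```python
def assemble_sql(selection, objects, conditions, joins):
    """Fill the template: select <A> from <B> where <C> <D>.

    selection  -- A: result columns
    objects    -- B: database objects (tables)
    conditions -- C: the user's query conditions
    joins      -- D: join conditions linking the tables in B
    """
    sql = f"SELECT {', '.join(selection)} FROM {', '.join(objects)}"
    where_parts = list(conditions) + list(joins)
    if where_parts:
        sql += " WHERE " + " AND ".join(where_parts)
    return sql


# Hypothetical two-table example.
statement = assemble_sql(
    selection=["entry.id", "ligand.name"],
    objects=["entry", "ligand"],
    conditions=["entry.title LIKE ?"],
    joins=["entry.id = ligand.entry_id"],
)
print(statement)
# SELECT entry.id, ligand.name FROM entry, ligand WHERE entry.title LIKE ? AND entry.id = ligand.entry_id
```

Keeping the join conditions (D) separate from the user's conditions (C) means they can be derived from the schema description rather than supplied by the user.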