Macromolecular Structure Middleware - PowerPoint PPT Presentation

Description:

for Structural Bioinformatics. OpenMMS. http://openmms.sdsc.edu ... What Do We Mean by an ...

Provided by: Brad1178
Learn more at: https://users.sdsc.edu

Title: Macromolecular Structure Middleware


1
Macromolecular Structure Middleware
  • OpenMMS
  • An Ontology Driven Architecture

2
Overview
  • The mmCIF Ontology
  • OpenMMS Toolkit
  • Macromolecular Structure (MMS) Metamodel
  • Parser, XML
  • SQL / Corba Servers and Clients
  • Corba
  • UML and the future...

3
How do we Enable Science?
  • Promote well-defined Macromolecular Structure
    (MMS) Specifications
  • Distribution: Open Interfaces
  • Now: flat files, W3 browsing and searching
  • Future: XML, SQL, CORBA

4
Why OpenMMS?
  • Allow programmers to more easily create
    efficient, high performance and robust
    applications.
  • A Java-only toolkit that creates XML, CORBA
    and Relational DB representations of the mmCIF
    Macromolecular Structure Data.
  • Source code is publicly available so users can
    easily modify the metamodel or create an entirely
    new one.

5
What Do We Mean by an Ontology Driven
Architecture?
What do we mean by an Ontology?
A bridge between Our World of Natural
Language and the World of Machines.
6
mmCIF Dictionary and Data Files
  • Based on Ontology for Macromolecular Structure
    defined by the International Union of
    Crystallography
  • Replaces the older 80-Column PDB files
  • mmCIF Dictionary contains over 140 Category and
    1600 Item definitions
  • Open, Extensible
  • Provides a well-defined reference standard for
    data distribution

7
OpenMMS Toolkit Data Flow
8
Metamodel Information Flow
mmCIF Dictionary
mmCIF Ontology Metamodel
Metamodel Framework
Corba IDL, SQL Schema, XML DTD, Java Data
Loaders, JDBC Loaders
9
What can OpenMMS do?
  • The PDBase program loads any or all PDB files
    into any SQL-92 compatible database (Oracle,
    MySQL, Sybase...)
  • Translates any PDB file into an XML file.
  • Contains two Corba servers:
    • The Reference server caches and serves data read
      from PDB flat files.
    • The DB server caches and serves data read from a
      SQL database (very quickly...)
  • All Source code written in Java and publicly
    available.

10
Some Advantages of Using an Ontology Driven
Architecture
  • Scales to very large Ontologies
  • More reliable and maintainable code
  • Transfer between representations
  • Scientific Correctness of representation
  • Help in maintaining backward compatibility

11
How does one actually represent an
ontology? (OpenMMS Internal Metamodel Overview)
[Diagram: a metamodel tree of Root, Module, Interface, Struct and Field nodes, traversed by a Visitor abstract class and its Visitor subclasses]
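The node and visitor labels on this slide can be illustrated with a small sketch. It is not the OpenMMS source: the class names (`ModuleNode`, `StructNode`, `FieldNode`, `IdlVisitor`, `SqlVisitor`), the sample `atom_site` struct and the SQL type mapping are all assumptions made for illustration. The point it shows is that one metamodel tree can be rendered into several representations by swapping the visitor subclass.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds a one-struct metamodel (hypothetical content) and renders it through two visitors. */
final class MetamodelDemo {
    static ModuleNode sample() {
        return new ModuleNode("mms").add(
            new StructNode("atom_site").add("string", "id").add("float", "occupancy"));
    }

    public static void main(String[] args) {
        IdlVisitor idl = new IdlVisitor();
        SqlVisitor sql = new SqlVisitor();
        sample().accept(idl);
        sample().accept(sql);
        System.out.print(idl.out);
        System.out.print(sql.out);
    }
}

/** Visitor abstract class: the default methods only walk down the tree. */
abstract class Visitor {
    void visitModule(ModuleNode m) { for (StructNode s : m.structs) s.accept(this); }
    void visitStruct(StructNode s) { for (FieldNode f : s.fields) f.accept(this); }
    void visitField(FieldNode f) { }
}

final class ModuleNode {
    final String name;
    final List<StructNode> structs = new ArrayList<>();
    ModuleNode(String name) { this.name = name; }
    ModuleNode add(StructNode s) { structs.add(s); return this; }
    void accept(Visitor v) { v.visitModule(this); }
}

final class StructNode {
    final String name;
    final List<FieldNode> fields = new ArrayList<>();
    StructNode(String name) { this.name = name; }
    StructNode add(String type, String field) { fields.add(new FieldNode(type, field)); return this; }
    void accept(Visitor v) { v.visitStruct(this); }
}

final class FieldNode {
    final String type, name;
    FieldNode(String type, String name) { this.type = type; this.name = name; }
    void accept(Visitor v) { v.visitField(this); }
}

/** Visitor subclass: emits IDL-like text for the tree. */
final class IdlVisitor extends Visitor {
    final StringBuilder out = new StringBuilder();
    @Override void visitModule(ModuleNode m) {
        out.append("module ").append(m.name).append(" {\n");
        super.visitModule(m);
        out.append("};\n");
    }
    @Override void visitStruct(StructNode s) {
        out.append("  struct ").append(s.name).append(" {\n");
        super.visitStruct(s);
        out.append("  };\n");
    }
    @Override void visitField(FieldNode f) {
        out.append("    ").append(f.type).append(' ').append(f.name).append(";\n");
    }
}

/** Visitor subclass: emits one SQL table per struct (illustrative type mapping only). */
final class SqlVisitor extends Visitor {
    final StringBuilder out = new StringBuilder();
    private boolean first;
    @Override void visitStruct(StructNode s) {
        out.append("CREATE TABLE ").append(s.name).append(" (");
        first = true;
        super.visitStruct(s);
        out.append(");\n");
    }
    @Override void visitField(FieldNode f) {
        if (!first) out.append(", ");
        first = false;
        out.append(f.name).append(' ').append(f.type.equals("float") ? "REAL" : "VARCHAR(80)");
    }
}
```

Adding another output format in this sketch means adding another `Visitor` subclass; the node classes stay untouched.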
12
mmCIF Parsers
  • General Purpose, Low-level access to data
  • Parsers available in many languages
  • OpenMMS toolkit includes Java Parser
  • Uses Builder Design Pattern
  • An application subclasses Abstract Builder class
    and stores data into its data structures
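A minimal sketch of the Builder idea described above, not the OpenMMS parser API: `CifBuilder`, `TinyCifParser` and `ElementCounter` are hypothetical names, and the parser handles only a `loop_` block with unquoted, whitespace-separated values on one line per row. Real mmCIF files also contain data blocks, quoted strings and multi-line text fields, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BuilderDemo {
    // Hypothetical input: atom ids and element symbols in a simplified loop_ block.
    static final String SAMPLE =
        "# simplified excerpt\n" +
        "loop_\n" +
        "_atom_site.id\n" +
        "_atom_site.type_symbol\n" +
        "1 N\n2 C\n3 C\n4 O\n5 C\n";

    public static void main(String[] args) {
        ElementCounter counter = new ElementCounter();
        TinyCifParser.parse(SAMPLE, counter);
        System.out.println(counter.counts);
    }
}

/** Abstract builder: the parser calls these hooks, the application decides what to store. */
abstract class CifBuilder {
    void beginLoop(List<String> itemNames) { }
    void row(List<String> values) { }
    void endLoop() { }
}

/** Parser for the simplified subset described in the lead-in. */
final class TinyCifParser {
    static void parse(String text, CifBuilder builder) {
        List<String> items = new ArrayList<>();
        boolean inLoop = false, inRows = false;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;   // blank line or comment
            if (line.equals("loop_")) {
                if (inRows) builder.endLoop();                      // close the previous loop
                items = new ArrayList<>();
                inLoop = true;
                inRows = false;
            } else if (inLoop && !inRows && line.startsWith("_")) {
                items.add(line);                                    // item name in the loop header
            } else if (inLoop) {
                if (!inRows) { builder.beginLoop(items); inRows = true; }
                builder.row(Arrays.asList(line.split("\\s+")));
            }
        }
        if (inRows) builder.endLoop();
    }
}

/** Application-side builder: keeps only a count per element symbol. */
final class ElementCounter extends CifBuilder {
    final Map<String, Integer> counts = new TreeMap<>();
    private int column = -1;
    @Override void beginLoop(List<String> itemNames) {
        column = itemNames.indexOf("_atom_site.type_symbol");
    }
    @Override void row(List<String> values) {
        if (column >= 0 && column < values.size()) counts.merge(values.get(column), 1, Integer::sum);
    }
    @Override void endLoop() { column = -1; }
}
```

The parser never builds its own in-memory model; each application keeps only what it needs, which is the main attraction of this pattern for large structure files.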

13
MMS in XML
  • Large Flat Files (open and close tags)
  • Tables can be grouped by rows or columns
  • XML from SQL Query
  • Many requests from Web browsers don't really need
    or want all the data
  • SW available from DB Vendors and ISVs for
    creating XML files from SQL result sets
  • Smaller files load faster
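To make "grouped by rows or columns" concrete, here is a rough sketch that renders the same small table both ways. The element names (`row`, and tags named after the columns) are assumptions, not the OpenMMS DTD, and values are written without XML escaping for brevity.

```java
import java.util.Arrays;
import java.util.List;

final class XmlGroupingDemo {
    // Hypothetical two-column table.
    static final List<String> COLS = Arrays.asList("id", "type_symbol");
    static final List<List<String>> ROWS = Arrays.asList(
        Arrays.asList("1", "N"), Arrays.asList("2", "C"));

    /** One element per row, with one child element per value. */
    static String byRows(String table, List<String> cols, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("<" + table + ">\n");
        for (List<String> r : rows) {
            sb.append("  <row>");
            for (int c = 0; c < cols.size(); c++) {
                sb.append('<').append(cols.get(c)).append('>').append(r.get(c))
                  .append("</").append(cols.get(c)).append('>');
            }
            sb.append("</row>\n");
        }
        return sb.append("</").append(table).append(">\n").toString();
    }

    /** One element per column, holding all of that column's values separated by spaces. */
    static String byColumns(String table, List<String> cols, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("<" + table + ">\n");
        for (int c = 0; c < cols.size(); c++) {
            sb.append("  <").append(cols.get(c)).append('>');
            for (int r = 0; r < rows.size(); r++) {
                if (r > 0) sb.append(' ');
                sb.append(rows.get(r).get(c));
            }
            sb.append("</").append(cols.get(c)).append(">\n");
        }
        return sb.append("</").append(table).append(">\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(byRows("atom_site", COLS, ROWS));
        System.out.print(byColumns("atom_site", COLS, ROWS));
    }
}
```

In this sketch the column-grouped form pays for each tag pair once per column rather than once per value, which is one way to shrink the "large flat files" mentioned above.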

14
Relational DB Expression
  • SQL-92 Compatible
  • Schemas for all the standard DB vendors
  • Fast and Flexible Keyword searches
  • PDBase loader allows structures to be selectively
    loaded
  • Oracle instance tested: 14,556 structures;
    16 GB, 88 million atom records

15
A very high-level (and very rough)
classification of communication
  • Person-to-Person communication: email
  • Person-to-Machine communication: HTTP/HTML
  • Machine-to-Machine communication: CORBA, SQL, .NET, SOAP
  • Not Communications -> Data Formats: XML, mmCIF (STAR), many more

16
What is CORBA?
  • Common Object Request Broker Architecture
  • Defines a family of open software interface
    specifications for distributed object computing.
  • http://www.omg.org

17
What is an Object? A Data Structure with an
Attitude
  • Programs = Algorithms + Data Structures
  • Object Oriented Programming Principle:
    partition the parts of algorithms with the
    data structures they use

18
Side View of a Distributed Application
[Diagram: a client (e.g. a Java applet) and a server (e.g. a mainframe computer), each with a middleware layer and an IDL interface, connected over the network (Internet, TCP/IP)]
19
The Hourglass view of the Internet
Applications
  • OO High-Level Interface

HTTP, Corba, .NET
  (Reliable Bitstream)
TCP, RTP, ...
IP
  (Unreliable Datagrams)
Copper, Glass, Radio Spectrum
(ATM, Ethernet, V.90, SONET...)
20
Where is Corba?
  • Inside every Java Runtime Environment.
  • Commonly used in middle tier and backend (e.g.
    database) connections.
  • Open Source and Commercial Implementations
    Available
  • Usually buried deep inside the software
  • Difficult or impossible to tell when it is being
    used

21
What is Distributed Object Computing?
  • Extends the benefits of object-oriented
    technology across process and machine boundaries
    to encompass entire networks.
  • Attempts to make remote objects appear to
    programmers as if they were local objects in the
    same process. This is called location
    transparency.
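Location transparency can be imitated in plain Java with a dynamic proxy. This is only a stand-in for a real ORB stub: `EntryService`, `LocalEntryService` and the "wire log" are hypothetical, and the "remote" call is forwarded in-process instead of being marshalled over a network. What it shows is that the client method `describe` cannot tell which kind of object it was handed.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class LocationTransparencyDemo {
    /** Client code: written once against the interface, unaware of where the object lives. */
    static String describe(EntryService service) {
        return service.title("4hhb");
    }

    /** Builds a stub that records each "request" and then forwards it to the target. */
    static EntryService remoteStub(EntryService target, List<String> wireLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(target, args);          // toString, hashCode, equals
            }
            wireLog.add("request " + method.getName() + "(" + args[0] + ")");
            return method.invoke(target, args);              // a real ORB would marshal and send here
        };
        return (EntryService) Proxy.newProxyInstance(
            EntryService.class.getClassLoader(), new Class<?>[] { EntryService.class }, handler);
    }

    public static void main(String[] args) {
        List<String> wireLog = new ArrayList<>();
        EntryService local = new LocalEntryService();
        EntryService remote = remoteStub(local, wireLog);
        System.out.println(describe(local));
        System.out.println(describe(remote));
        System.out.println(wireLog);
    }
}

/** Hypothetical service interface; in CORBA this contract would be written in IDL. */
interface EntryService {
    String title(String entryId);
}

/** In-process implementation with placeholder behaviour. */
final class LocalEntryService implements EntryService {
    @Override public String title(String entryId) { return "entry " + entryId; }
}
```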

22
Advantages of Distributed Object Computing
  • Easier (and faster) for programmers to create
    distributed applications
  • Increases Reliability
  • Increases Maintainability
  • Increases Portability
  • Increases Extensibility

23
The Alphabet Soup
  • OMG: Object Management Group. Consortium of 800
    companies founded in 1989.
  • IDL: Interface Definition Language

24
Boundaries, Interfaces
  • The key is to focus on boundaries, interfaces,
    how things fit together
  • Not on the internal details of how they're built;
    assume those will be diverse and changing

25
Boundaries, Interfaces
  • The Interface to an object can be distributed
    over a network

Shape of boundary is defined in IDL
26
Corba Independence
  • Open Standard for Distributed Object Oriented
    Design
  • Independent of Hardware Platform
  • Independent of Operating System
  • Independent of Programming Language
  • Independent of Object Location

27
Object Request Broker
  • ORBs mediate between objects and things that use
    them (clients)

Object Request Broker
28
Terminology
  • IIOP
  • The Internet Inter-ORB Protocol, defined in the
    Spec as a vendor-independent, wire-level network
    protocol on top of TCP/IP. This allows ORB
    implementations of different vendors to
    interoperate.

29
ORBs: Medium for Integration
[Diagram: three ORBs]
30
Corba Facilities: Industry Standards in Vertical
Markets
  • Manufacturing
  • Finance
  • Life Sciences Research
  • C4I
  • Many others...

31
Using Corba to access Macromolecular Structure
Data
  • No Parsing of Flat Files
  • Direct Access to Binary Data Structures
  • Strongly Typed Data
  • Granularity of Access
  • Indices and Presence Flags Pre-computed
  • Highest Performance

32
OMG/LSR Macromolecular Structure Adoption Process
  • August 1999: RFP issued
  • March 2000: Initial Submission
  • September 2000: Revised Submission
  • February 2001: Adopted Spec by the OMG
  • 4Q 2001: OpenMMS LSR/MMS1.0 compliant
    implementation source code publicly available
  • February 2002: Approved as a Formal
    OMG Available Specification.

33
Using the CORBA MMS Server
An excerpt from a legacy PDB-formatted file:
ATOM records (4hhb.ent)
...
ATOM    6  CG1  VAL A  1   7.009  20.127   5.418  6.00  61.79 ...
ATOM    7  CG2  VAL A  1   5.246  18.533   5.681  6.00  80.12 ...
ATOM    8  N    LEU A  2   9.096  18.040   3.857  7.00  26.44 ...
ATOM    9  CA   LEU A  2  10.600  17.889   4.283  6.00  26.32 ...
ATOM   10  C    LEU A  2  11.265  19.184   5.297  6.00  32.96 ...
ATOM   11  O    LEU A  2  10.813  20.177   4.647  8.00  31.90 ...
ATOM   12  CB   LEU A  2  11.099  18.007   2.815  6.00  29.23 ...
ATOM   13  CG   LEU A  2  11.322  16.956   1.934  6.00  37.71 ...
ATOM   14  CD1  LEU A  2  11.468  15.596   2.337  6.00  39.10 ...
ATOM   15  CD2  LEU A  2  11.423  17.268    .300  6.00  37.47 ...
...
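For contrast with the CORBA access shown on the following slides, this is roughly what a client has to do with the flat file itself. The sketch is an assumption-laden simplification: it splits on whitespace because the column positions are not preserved in this transcript, whereas the real PDB format is fixed-column, so this approach breaks when fields run together or the chain identifier is blank. The class name `PdbAtom` and its field names are hypothetical.

```java
/** One ATOM record, parsed by whitespace (a simplification; see the note above). */
final class PdbAtom {
    final int serial, resSeq;
    final String name, resName, chain;
    final double x, y, z, occupancy, tempFactor;

    private PdbAtom(String[] t) {
        serial = Integer.parseInt(t[1]);
        name = t[2];
        resName = t[3];
        chain = t[4];
        resSeq = Integer.parseInt(t[5]);
        x = Double.parseDouble(t[6]);
        y = Double.parseDouble(t[7]);
        z = Double.parseDouble(t[8]);
        occupancy = Double.parseDouble(t[9]);     // next-to-last numeric column
        tempFactor = Double.parseDouble(t[10]);   // last numeric column
    }

    /** Parses one line; rejects anything that is not an 11-token ATOM record. */
    static PdbAtom parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 11 || !tokens[0].equals("ATOM")) {
            throw new IllegalArgumentException("not a simple ATOM record: " + line);
        }
        return new PdbAtom(tokens);
    }

    public static void main(String[] args) {
        PdbAtom a = parse("ATOM 15 CD2 LEU A 2 11.423 17.268 .300 6.00 37.47");
        System.out.println(a.serial + " " + a.name + " " + a.resName + a.resSeq
            + " (" + a.x + ", " + a.y + ", " + a.z + ")");
    }
}
```

Every client of the flat file repeats this string handling and number conversion, which is the cost that "No Parsing of Flat Files" and "Strongly Typed Data" on the earlier slide refer to.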
34
LSR/MMS ATOM Record
DsLSRMacromolecularStructure.idl excerpt
struct AtomSite {
  string id;
  IndexId type_symbol;
  AtomIndex label;
  IndexId label_entity;
  VectorXYZ cartn;
  float occupancy;
  float b_iso_or_equiv;
};
35
Example Code and Resulting Output
Entry e = entryFactory.get_entry_from_id("4hhb");
AtomSite[] a = e.get_atom_site_list();
for (int i = 0; i < a.length; i++)
  System.out.println(a[i].id + " "
    + a[i].type_symbol.id + " ("
    + a[i].cartn.x + ", " + a[i].cartn.y + ", "
    + a[i].cartn.z + ")");
produces
1 N (11.065, 7.352, 9.598)
2 C (12.436, 7.764, 9.902)
3 C (12.883, 7.09, 11.208)
4 O (12.088, 7.0, 12.147)
5 C (12.611, 9.264, 10.06)
...
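The loop on this slide needs a running CORBA server, so here is a stand-alone approximation that can be tried without one. The plain Java classes below stand in for the types an IDL compiler would generate from the struct on the previous slide (an IDL struct typically becomes a class with public fields). Their exact shape is an assumption: only `IndexId.id` and `VectorXYZ.x/y/z` are visible in the slide's code, the `AtomIndex label` field is left out because its layout is not shown, and the atoms are hard-coded from the first lines of the slide's output instead of being fetched from an `EntryFactory`.

```java
final class AtomSiteDemo {
    /** Same formatting expression as the loop body on the slide. */
    static String format(AtomSite a) {
        return a.id + " " + a.type_symbol.id + " ("
            + a.cartn.x + ", " + a.cartn.y + ", " + a.cartn.z + ")";
    }

    /** Convenience factory for the hard-coded sample data. */
    static AtomSite atom(String id, String symbol, float x, float y, float z) {
        AtomSite a = new AtomSite();
        a.id = id;
        a.type_symbol = new IndexId();
        a.type_symbol.id = symbol;
        a.cartn = new VectorXYZ();
        a.cartn.x = x;
        a.cartn.y = y;
        a.cartn.z = z;
        return a;
    }

    public static void main(String[] args) {
        AtomSite[] a = {
            atom("1", "N", 11.065f, 7.352f, 9.598f),
            atom("2", "C", 12.436f, 7.764f, 9.902f),
        };
        for (int i = 0; i < a.length; i++) {
            System.out.println(format(a[i]));
        }
    }
}

/** Stand-in for the IDL type IndexId; only the `id` member is known from the slide. */
final class IndexId {
    public String id;
}

/** Stand-in for the IDL type VectorXYZ. */
final class VectorXYZ {
    public float x, y, z;
}

/** Stand-in for the IDL struct AtomSite, minus the `label` field. */
final class AtomSite {
    public String id;
    public IndexId type_symbol;
    public IndexId label_entity;
    public VectorXYZ cartn;
    public float occupancy;
    public float b_iso_or_equiv;
}
```

The client code works with typed fields such as `cartn.x` directly; no string parsing or column counting is involved.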
36
What are the alternatives to Corba?
  • TCP/IP Sockets - Byte stream
  • DCOM, COM, OLE, .NET (Microsoft Only)
  • DCOM <-> Corba Bridges are available from several
    vendors
  • SOAP (Simple Object Access Protocol): XML Based
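To show what the raw byte-stream alternative asks of the programmer, the sketch below hand-marshals one atom into bytes and back. It uses in-memory streams in place of a real socket, and the wire layout (two length-prefixed strings followed by three floats) is invented for illustration; `WireAtom` and `ByteStreamDemo` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ByteStreamDemo {
    /** Writes the fields in a fixed order that the reader must know in advance. */
    static byte[] encode(WireAtom a) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(a.id);
            out.writeUTF(a.symbol);
            out.writeFloat(a.x);
            out.writeFloat(a.y);
            out.writeFloat(a.z);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the fields back in exactly the order encode wrote them. */
    static WireAtom decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String id = in.readUTF();
            String symbol = in.readUTF();
            float x = in.readFloat();
            float y = in.readFloat();
            float z = in.readFloat();
            return new WireAtom(id, symbol, x, y, z);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = encode(new WireAtom("1", "N", 11.065f, 7.352f, 9.598f));
        WireAtom back = decode(wire);
        System.out.println(wire.length + " bytes, symbol " + back.symbol);
    }
}

/** Hypothetical record sent over the wire. */
final class WireAtom {
    final String id, symbol;
    final float x, y, z;
    WireAtom(String id, String symbol, float x, float y, float z) {
        this.id = id;
        this.symbol = symbol;
        this.x = x;
        this.y = y;
        this.z = z;
    }
}
```

Both sides must agree on field order and encoding by hand. With an interface definition language, that agreement is written once and the marshalling code is generated from it.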

37
Unified Modeling Language (UML): What do all those
arrows and boxes mean?
  • Schematic Language for Defining SW
  • Graphics Representations
  • UML Things, Relations and Diagrams
  • 9 types of Diagrams
  • The most commonly used diagram is the Class
    Diagram

38
UML Class Diagram Example
EntryFactory
  get_version()
  get_entry_id_list()
  get_entry_modification_dates()
  native_formats_supported()
  get_native_entry_representation()

ModificationDate
  Entry_id: EntryId
  date: TimeBase::TimeT
39
UML Class Diagram Basics
Class_Name (underlined for class instances, italics
for abstract classes)
  • Variables: var1: Type, var2: Type
  • Methods: method1(), method2(), method3()

Details may be omitted if not important
40
UML Relationships
[Diagram: line and arrow notations for Dependency, Association (with a 0..1 multiplicity label), Generalization (Inheritance) and Aggregation]

41
UML Example
EntryFactory
  get_version()
  get_entry_id_list()
  get_entry_modification_dates()
  native_formats_supported()
  get_native_entry_representation()

ModificationDate
  Entry_id: EntryId
  Date: TimeBase::TimeT
42
XMI: XML Metadata Interchange
  • UML is a graphical representation, so we need some way
    to exchange UML models between applications
  • XMI is used to store and transmit UML models
  • XML based
  • Defines XML tags for classes, relationships
    between classes etc.

43
OMG MDA
  • Platform Independent Models (PIMs) that define
    the interface are defined in UML
  • The PIMs are translated to Platform Specific
    Models (PSMs) such as Corba, SOAP, .NET or XML
    Schemas
  • The Corba servers and clients may be the same,
    but now the interface is defined in UML and the
    IDL is then generated from the UML

44
MDA: Platform Independent to Platform Dependent
Translation
[Diagram: a UML model translated to .NET, Corba, SOAP and XML]
Thanks and Acknowledgments
  • Phil Bourne
  • John Westbrook
  • David Benton
  • Karl Konnerth
  • Lynn TenEyck