Title: XMDR Prototype Progress Report
1. XMDR Prototype Progress Report
- John McCarthy and Kevin D. Keck
- XMDR Project Quarterly Meeting
- 19 October, 2005
- UC Berkeley Faculty Club
2. Presentation Outline
- Recap prototype evolution during past year
- Review recent prototype activities
  - Revised OWL ontology (esp. for relationships)
  - Connected terminologies to data elements
  - Loaded some NASA SWEET ontologies
  - Worked on inference module (Jena, SPARQL, Kowari)
  - Drafted and tested recipe to import XMDR prototype
- Next steps and major challenges
3. Brief chronology of XMDR Prototype Evolution
- 2004 July: XMDR Kickoff Meeting
- 2004 Oct: Requirements and use cases discussed
- 2005 Jan: Architecture alternatives and recommendations
- 2005 March: Initial implementation: OWL ontology for 11179; Subversion; XMDR data spec combines OWL, RDF and XML; data extracted from EDR web site
- 2005 July: OWL ontology refinements; text search interface; terminologies loaded via Lexgrid; Jena inference
- 2005 Oct: OWL ontology refinements for XMDR and 11179, ed. 3; NASA SWEET ontology loaded; concepts linked
4. XMDR Prototype Architecture: Initial Implemented Modules
[Figure: architecture diagram]
- Modules: External Interface; Registry; RegistryStore; WritableRegistryStore; RetrievalIndex; LogicBasedIndex; FullTextIndex; Authentication Service (defer); MetadataValidator (defer), a schema-driven syntax checker; MappingEngine (defer); Ontology Editor; 11179 OWL Ontology
- Technologies: Java; Subversion; Jena, Xerces; Jena, OWI KS Racer, Kowari; Lucene; Protege
- Legend: Composition (tight ownership); Generalization; Aggregation (loose ownership)
5. OWL, RDF, and XML Schema are used to specify XMDR, as UML is used for the 11179 metamodel
[Figure: specification diagram]
- 11179 Relational Schema; Relational Metadata
- UML: 11179 Metamodel
- OWL: XMDR Ontology and annotations
- XMDR XML Schema
- Types and Cardinalities
- Trang
- XMDR's Relax NG Schema
- Triples: binary labeled relationships
- RDF Spec
- XML Schema Language spec
- XML Objects
- What things go in own files? Which property direction is stored? Sequential ordering of properties
6. XMDR-P example content has been loaded from diverse sources via Lexgrid and XSLT
Concept System A
XSLT script
Harold Solbrig (Mayo Clinic)
A Concepts
Original Source A
Lexgrid Source A
A Relationships
7. Other XMDR-P example content has been loaded from EDR and SWEET
SWEET (OWL)
EDR
screen scrape (perl)
java
XMDR files (includes ontologies)
XMDR files (ontologies)
concepts and relationships
XMDR ontology
8. XMDR Prototype bridges different realms of metadata standards
Information artifacts: data elements, schemas, UML models, ...
Conceptual models of the real world
Ontology Standards: OWL, KIF, CL, XTM, ...
XMDR Prototype MMF 11179 ed. 3 Metadata
Registry Standards
OMG Standards: MOF, UML, CWM
Terminology Standards
Like the 11179 Metadata Registry Standard ed. 3,
the XMDR Prototype bridges the realms of
conceptual models and information artifacts.
9. Enterprise Vocabulary Services (EVS) concepts unite NCI MDR
- Conceptual Domain: Agent
- Object Class: Chemopreventive Agent
- Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, Ursodiol
- Data Element Concept: Chemopreventive Agent NSC Number
- Value Domain: NSC Code
- Classification Schemes: caDSRTraining
- Property: NSCNumber
- Representation: Code
- Data Element: Chemopreventive Agent Name
- Context: caCORE
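The way these registry items fit together can be sketched as a small data structure. This is only an illustration, assuming the usual 11179 arrangement in which a data element pairs a data element concept (object class plus property) with a value domain. The class and field names are hypothetical, and attaching the representation term to the value domain is an assumption, not the caDSR schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElementConcept:
    # Hypothetical, simplified fields: what is being described.
    object_class: str
    property: str
    conceptual_domain: str


@dataclass
class ValueDomain:
    # Hypothetical, simplified fields: how values are represented.
    name: str
    representation: str
    valid_values: List[str] = field(default_factory=list)


@dataclass
class DataElement:
    # A data element ties a concept to a value domain within a context.
    name: str
    context: str
    concept: DataElementConcept
    value_domain: ValueDomain


# Values taken from the slide; the grouping into objects is assumed.
agent_name = DataElement(
    name="Chemopreventive Agent Name",
    context="caCORE",
    concept=DataElementConcept(
        object_class="Chemopreventive Agent",
        property="NSCNumber",
        conceptual_domain="Agent",
    ),
    value_domain=ValueDomain(
        name="NSC Code",
        representation="Code",
        valid_values=[
            "Cyclooxygenase Inhibitor",
            "Doxercalciferol",
            "Eflornithine",
            "Ursodiol",
        ],
    ),
)
```

Each string here could equally be a link to an EVS concept, which is the point of the slide.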
10. XMDR Prototype reflects revised focus on Concepts for 11179, ed. 3
Note: Denise Warzel came to essentially the same viewpoint for NCI's Metadata Registry
11. Proposed 11179 revisions for ontologies, concepts, and relationships
New Proposed Objects
Current 11179 Objects
12. XMDR-P is upward compatible and facilitates automated migration
XMDR EDR Objects
Java transformation script
Current EDR Objects
13. DEMO AND DISCUSS: LOADING AND LINKING CONCEPTS
- EDR Countries of the World as concepts, etc.
14. Initial XML and OWL sub-classing helped organize 11179 objects
Note that the tree is relatively shallow
15. Refined XMDR subclasses improve organization and inference
Note deeper sub-classing. We'll discuss details Thursday
16. NASA-JPL Semantic Web for Earth and Environmental Terminology
- SWEET written in OWL ontology language (W3C)
  - Can view with Internet Explorer 5, Netscape 7, etc.
  - Can also use OWL-specific tools (e.g., SWOOP, Protégé)
- Terms in other taxonomies can be mapped to SWEET using
  - Global Change Master Directory (GCMD)
  - CF Standard Names
- http://sweet.jpl.nasa.gov/ontology/
17. SWEET units ontology can be used to create new data elements
This diagram shows only classes, not the concepts themselves, because object properties and instances are also concepts
18. DEMO AND DISCUSS: Connect terminology/ontology to EDR items
- Search for wetlands, go to SWEET
19. Relationships are implemented as LINKS to other xml files
<DataElement ... xml:base="http://erdos.lbl.gov/xmdr2/data/DEALL.1.5394.1.xml">
  ...
  <sign>Country Name</sign>
  ...
  <type rdf:resource="RCDIS.1.12116.1.xml"/>       <!-- RepresentationClass link -->
  <domain rdf:resource="VDALL.1.15147.1.xml"/>     <!-- ValueDomain link -->
  <meaning rdf:resource="DCDIS.1.12800.1.xml"/>    <!-- DataElementConcept link -->
  <example rdf:datatype="xsd:string">United States</example>
</DataElement>
Metadata schema includes relationships that specify which attributes can or must link to other entity-types
Kevin: show HTML rendition
Index contains names of entities and links
<ValueDomain ... xml:base="http://erdos.lbl.gov/xmdr2/data/VDALL.1.15147.1.xml">
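A minimal sketch of how such links could be followed: each rdf:resource value is a relative reference that resolves against the file's xml:base. The sample below is a cut-down version of the DataElement fragment with the elided parts left out. The function name resolve_links is hypothetical, and this is not the prototype's own code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Cut-down sample based on the slide; elided attributes and elements omitted.
SAMPLE = """<DataElement
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xml:base="http://erdos.lbl.gov/xmdr2/data/DEALL.1.5394.1.xml">
  <sign>Country Name</sign>
  <type rdf:resource="RCDIS.1.12116.1.xml"/>
  <domain rdf:resource="VDALL.1.15147.1.xml"/>
  <meaning rdf:resource="DCDIS.1.12800.1.xml"/>
</DataElement>"""


def resolve_links(xml_text):
    """Map each linking child element to the absolute URL it points at."""
    root = ET.fromstring(xml_text)
    base = root.get("{%s}base" % XML_NS, "")
    links = {}
    for child in root:
        target = child.get("{%s}resource" % RDF_NS)
        if target is not None:
            # Relative references resolve against xml:base.
            links[child.tag] = urljoin(base, target)
    return links


links = resolve_links(SAMPLE)
for name, url in links.items():
    print(name, "->", url)
```

Because the links are plain relative file references, the same file can be read by an ordinary XML tool or by an RDF parser.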
20. XMDR RDF graph query facilities complement text query capabilities
- SQL-like queries
  - e.g., names of ontologies in a registry
- Span items that are only indirectly connected
  - e.g., data elements associated with a conceptual domain
- Expand queries to subsumed classes in hierarchy
  - e.g., ConceptualDomain includes EnumeratedConc..
- Transitivity
  - e.g., all subclasses subsumed by a higher order class
  - e.g., all superclasses (ancestors) of a particular class
- Least common ancestor
  - e.g., closest subsuming concept for 2 concepts (minimal generalization)
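The transitivity and least-common-ancestor queries can be sketched over a toy class hierarchy. The hierarchy and class names below are invented for illustration and are not taken from the XMDR ontology. A real registry would run such queries through an RDF store rather than a Python dict.

```python
# Hypothetical toy hierarchy: class -> list of direct superclasses.
PARENTS = {
    "Thing": [],
    "Concept": ["Thing"],
    "Agent": ["Concept"],
    "ChemopreventiveAgent": ["Agent"],
    "Place": ["Concept"],
    "Country": ["Place"],
}


def ancestors(cls):
    """All superclasses of cls (transitive closure of the parent relation)."""
    seen = set()
    stack = list(PARENTS.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def descendants(cls):
    """All classes subsumed by cls."""
    return {c for c in PARENTS if cls in ancestors(c)}


def least_common_ancestors(a, b):
    """Most specific classes that subsume both a and b."""
    common = (ancestors(a) | {a}) & (ancestors(b) | {b})
    # Drop any common ancestor that has a more specific common ancestor below it.
    return {c for c in common if not (descendants(c) & common)}


print(sorted(ancestors("Country")))
print(sorted(descendants("Concept")))
print(least_common_ancestors("ChemopreventiveAgent", "Country"))
```

The result is a set because a class with several parents can have more than one minimal generalization.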
21. Reasoners use OWL ontologies to augment RDF graph queries
RDF Query (rdql/nrdql/SPARQL)
Reasoner: Jena or Racer (memory)
result set includes subclasses, inverses, etc.
OWL 11179 Ontology
OWL built-in rules
11179 metadata (xml/rdf files)
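A rough sketch of the idea, not of how Jena or Racer work internally: the ontology supplies rules (here only subclass and inverse-property rules), the rules add inferred triples to the metadata, and the query then runs over the enlarged graph. All class, property, and item names below are hypothetical.

```python
def infer(triples, subclass_of, inverse_of):
    """Forward-chain two simple rules until no new triples appear."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in facts:
            # Rule 1: an instance of a class is an instance of its superclasses.
            if p == "type":
                for sup in subclass_of.get(o, ()):
                    new.add((s, "type", sup))
            # Rule 2: a property with a declared inverse implies the reverse link.
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


def query(facts, predicate, obj):
    """Subjects s such that (s, predicate, obj) is in the graph."""
    return sorted(s for s, p, o in facts if p == predicate and o == obj)


# Hypothetical ontology fragment and metadata.
subclass_of = {"DataElement": ["AdministeredItem"], "AdministeredItem": ["Item"]}
inverse_of = {"meaning": "meaningOf"}
metadata = [("de1", "type", "DataElement"), ("de1", "meaning", "dec1")]

plain = set(metadata)
augmented = infer(metadata, subclass_of, inverse_of)

print(query(plain, "type", "Item"))        # nothing stated directly
print(query(augmented, "type", "Item"))    # found via the subclass rule
print(query(augmented, "meaningOf", "de1"))
```

Materializing every inference in memory like this is one reason scalability becomes a concern as the registry grows.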
22. Reasoner software is still somewhat bleeding edge
- OWL Web Ontology Language
  - W3C Recommendation February 2004
- W3C Reasoners investigation (March 2004)
  - http://www.w3.org/2003/08/owl-systems/test-results-out
  - No pre-OWL reasoners passed all of the tests
  - Pre-OWL reasoners took some time to modify
  - New tools are just starting to be available
- Jena has had some serious limitations for XMDR
  - Scalability: can't load large datasets
  - Performance: queries take forever to complete
- We've begun to investigate alternatives (e.g., Kowari)
23. Kowari may be a promising alternative/supplement
- Open Source
- OWL, Java, RDF, SPARQL
- Design objectives: scalability, speed, stability
- Resolvers to operate on external data
- Datatype model operators
- RDF graph traversal, transitive closure, inverse
- Closely associated with MINDSWAP
  - Maryland Information and Network Dynamics Lab Semantic Web Agents Project
  - Mindswap is also heavily involved with SPARQL
  - See extract from Mindswap's XTech2005 presentation
24. SPARQL: SPARQL Protocol And RDF Query Language
- W3C RDF Data Access Working Group
- Last call working drafts just out for
  - SPARQL Query Language for RDF (July)
  - SPARQL Query Results XML Format (August)
  - SPARQL Protocol for RDF (September)
- Tools are tracking the SPARQL standards
  - e.g., Kowari, Jena, ...
- http://www.w3.org/2001/sw/DataAccess/
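A sketch of what a SPARQL request might look like for the "names of ontologies in a registry" example. The endpoint URL, the x: namespace, and the x:Ontology and x:name terms are all placeholders, not the XMDR vocabulary. The only protocol detail relied on is that the HTTP binding passes the query text in a parameter named query.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; the prototype's actual query service is not shown here.
ENDPOINT = "http://example.org/sparql"

# Hypothetical vocabulary (x:Ontology, x:name) used only for illustration.
QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX x: <http://example.org/xmdr#>
SELECT ?name
WHERE {
  ?ont rdf:type x:Ontology .
  ?ont x:name ?name .
}
"""


def sparql_get_url(endpoint, query):
    """Build the GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL recovers the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
```

The triple patterns in the WHERE clause are what make the language feel SQL-like while still matching against a graph.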
25. XMDR Prototype Package: demonstrate recipe on XMDR wiki
- Works on Unix and Windows XP
  - http://erdos.lbl.gov/mediawiki/index.php/Installation_on_Windows
- Sandor Dornbush has installed prototype
  - Student with Joel Sachs at U. MD Baltimore
  - Drafted recipe for importing XMDR prototype to XP
- John McCarthy now testing recipe for XP import
- Next goal is to have download site for prototype system, data, and documentation
26. Challenges and Future Goals for XMDR Prototype
- Complexity
  - Representation of complex content
  - Can mere mortals create XMDR files (XML, RDF, OWL)?
- Scalability and performance
  - Prototype currently includes only 60,000 objects
- References to externally maintained sources
  - data, ontologies, terminologies
- Building out other modules for original architecture
  - Mapping
  - Validation
- Evaluate alternative technologies
  - for different modules
27. Challenges and Future Goals (cont.)
- Harmonization with ODM and MMF
- Tools
  - RDF tools won't create XMDR files (add wrapper constraints?)
  - User-friendly interface for RDF queries
    - Something like EDR UI with link labels and inverse references
  - Form interface for registration and uploading metadata?
- Incorporate Common Logic, Web Services, etc.
- Ontology Lifecycle Management (OLM)
- Link concepts to data
- Generate schemas from axiomatized ontologies
28. Other Topics? Extra Slides
29. XMDR Prototype example dual purpose rdf/xml file (extract) for one GEMET term
<Concept rdf:about="" xml:base="http://erdos.lbl.gov/xmdr2/data/CS-GEMET_2001.0/13198.xml">
  <source rdf:resource="../CS-GEMET_2001.0.xml"/>
  <identifier rdf:parseType="Resource">
    <string rdf:datatype="http://www.w3.org/2001/XMLSchema#string">13198</string>
  </identifier>
  <terminologicalEntry rdf:parseType="Resource">
    <entryContext rdf:resource="CXT-default.xml"/>
    <section rdf:parseType="Resource">
      <language rdf:datatype="http://www.w3.org/2001/XMLSchema#string">en</language>
      <designation rdf:parseType="Resource">
        <name xml:lang="en">protein product</name>
      </designation>
      <definition rdf:parseType="Resource">
        <source rdf:resource="lgConsource"/>
        <text xml:lang="en">No definition needed.</text>
      </definition>
    </section>
Kevin: show new version. Note parts that illustrate RDF and OWL
30. OWL XML fragment for SWEET measurement units
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE rdf:RDF (View Source for full doctype...)>
<Concept rdf:about=""
    xmlns="http://hpcrd.lbl.gov/SDM/XMDR/ont/xmdr_2005-10-13.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xml:base="http://erdos.lbl.gov/xmdr4/data/sweet-units/meter.xml">
  <designation rdf:parseType="Resource">
    <context rdf:resource="default-cxt.xml" />
    <sign xml:lang="en">meter</sign>
  </designation>
  <designation rdf:parseType="Resource">
    <context rdf:resource="units-symbol-cxt.xml" />
    <sign xml:lang="en">m</sign>
  </designation>
  <container rdf:resource="../sweet-units.xml" />
</Concept>
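Read as plain XML, the fragment above gives one sign per context: the unit's name in the default context and its symbol in the units-symbol context. A minimal sketch of pulling that out follows. The function names are hypothetical, the trailing # on the XMDR namespace is assumed, and element names are matched by local name so the sketch does not depend on that assumption.

```python
import xml.etree.ElementTree as ET

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Sample based on the slide; the trailing '#' on the default namespace is assumed.
METER = """<Concept rdf:about=""
    xmlns="http://hpcrd.lbl.gov/SDM/XMDR/ont/xmdr_2005-10-13.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xml:base="http://erdos.lbl.gov/xmdr4/data/sweet-units/meter.xml">
  <designation rdf:parseType="Resource">
    <context rdf:resource="default-cxt.xml"/>
    <sign xml:lang="en">meter</sign>
  </designation>
  <designation rdf:parseType="Resource">
    <context rdf:resource="units-symbol-cxt.xml"/>
    <sign xml:lang="en">m</sign>
  </designation>
  <container rdf:resource="../sweet-units.xml"/>
</Concept>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def signs_by_context(xml_text):
    """Return {context file: sign} for every designation of the concept."""
    root = ET.fromstring(xml_text)
    result = {}
    for designation in root:
        if local(designation.tag) != "designation":
            continue
        context = sign = None
        for part in designation:
            if local(part.tag) == "context":
                context = part.get(RDF_RESOURCE)
            elif local(part.tag) == "sign":
                sign = part.text
        if context is not None:
            result[context] = sign
    return result


signs = signs_by_context(METER)
print(signs)
```

A data element that needs a unit of measure could then link to meter.xml and pick whichever designation suits its context.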