1
Relational database integration with RDF/OWL
  • Bob DuCharme
  • December 7, 2006
  • XML 2006

2
About me
  • Senior Consultant, Innodata Isogen
  • weblog
  • http://www.snee.com/bobdc.blog
  • other writing
  • See http://www.snee.com/bob

3
What is an RDF/OWL ontology?
  • Ontology: "Computational formalization of a
    subject matter" (Bijan Parsia et al.)
  • Describe metadata about resource classes and
    their relationships
  • Web Ontology Language: a W3C update of DAML+OIL
  • Good fit with Knowledge Representation and other
    AI work
  • Ontologies vs. traditional schemas

4
Ontologies for the sake of ontologies
  • If metadata is data about data, what data is your
    metadata about?
  • Field of Dreams attitude of many ontology
    developers

5
RDF in one slide
  • A data model, not a syntax.
  • Three-part statement called a triple
  • (Subject, Predicate, Object)
  • For example:
  • (urn:isbn:0553213113,
    http://purl.org/dc/elements/1.1/creator,
    "Herman Melville")
  • Great for loosely structured data, but
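
As a rough illustration of the triple model on this slide, a graph can be held as a set of plain Python tuples and queried with wildcard patterns. The `match` helper and the second (dc:title) statement are assumptions made for this sketch; they are not part of the presentation.

```python
# Each RDF statement is a (subject, predicate, object) triple.
DC = "http://purl.org/dc/elements/1.1/"

triples = {
    ("urn:isbn:0553213113", DC + "creator", "Herman Melville"),
    # Hypothetical second statement, added only so the graph has two triples.
    ("urn:isbn:0553213113", DC + "title", "Moby Dick"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Ask "who created anything?" by fixing only the predicate.
creators = [o for _, _, o in match(triples, p=DC + "creator")]
print(creators)
```

Because the model is just a set of statements, adding a new kind of fact needs no schema change, which is the "loosely structured data" strength the slide mentions.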

6
RDBMS integration with RDF/OWL
  • This presentation: background and demo
  • Paper accompanying presentation

7
Use Cases
  • Two address book databases that use different
    names (e.g. workState, businessState)
  • Find useful queries across the two that are
    easier in SPARQL than in SQL, thanks to RDF/OWL
  • Who works in NY state?
  • List any phone numbers (home, mobile, business,
    etc.) that I have for Alfred Adams.
  • Find all info for Bobby Fischer at 2304 Eighth
    Lane, even if the other database lists him as
    Robert L. Fischer of 2304 8th Ln.

8
Basic Steps
  • Generate data
  • Load into MySQL
  • Let D2RQ (RDBMS/RDF interface server) know about
    those databases
  • Get a dump of representative RDF data
  • Create ontology for that data
  • Issue ontology-aware SPARQL queries against that
    data

9
Generate Data
  • Fill out every field in a Eudora address book
    entry, export to CSV, see what's there
  • Repeat for Outlook
  • Write Python script to generate data, e.g.:
  • "Miguel","miguel802@hotmail.com","Miguel Porter","Miguel","Porter",
    "1462 Oak St.","Kitchener","TN","US","67117-2620","(364) 769-1070",
    "(431) 985-7923","(850) 998-7790","http://www.radioshack.com/Miguel",
    "RadioShack","","2109 Green Ave.","Boston","MP","US","48379-6760",
    "(824) 959-5268","(354) 384-8517","(992) 963-9772",
    "http://www.radioshack.com","miguel.porter@radioshack.com",
    "(748) 965-6871","","Here is a sample note.\n\nThat was two carriage returns."
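
The presentation does not show the generator script itself, so the following is only a minimal sketch of how such a script might look. The value pools, the reduced set of eight fields, and the `example.com` addresses are all assumptions; the real export had many more columns.

```python
import csv
import io
import random

# Hypothetical value pools; the presentation does not show the real script.
FIRST_NAMES = ["Miguel", "Alfred", "Jill", "Sarah"]
LAST_NAMES = ["Porter", "Adams", "Jones", "Richardson"]
STREETS = ["Oak St.", "Green Ave.", "Eighth Lane"]
STATES = ["NY", "TN", "NE"]


def random_phone(rng):
    """Build a phone number shaped like the sample, e.g. (364) 769-1070."""
    return "({}) {}-{}".format(
        rng.randint(200, 999), rng.randint(200, 999), rng.randint(1000, 9999)
    )


def make_entry(rng):
    """Return one address book row as a list of eight strings."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return [
        first,                                                  # nickname
        "{}{}@example.com".format(first.lower(), rng.randint(100, 999)),
        "{} {}".format(first, last),                            # full name
        first,
        last,
        "{} {}".format(rng.randint(100, 9999), rng.choice(STREETS)),
        rng.choice(STATES),
        random_phone(rng),
    ]


def generate_csv(count, seed=0):
    """Return `count` CSV rows with every field quoted, as in the sample."""
    rng = random.Random(seed)
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for _ in range(count):
        writer.writerow(make_entry(rng))
    return out.getvalue()


print(generate_csv(3), end="")
```

Seeding the random generator keeps the output repeatable, which helps when the same data has to be reloaded into both databases.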

10
Load into MySQL
  • CREATE DATABASE eudora;
  • USE eudora;
  • CREATE TABLE entries (
  • nickname VARCHAR(20),
  • email1 VARCHAR(50),
  • fullName VARCHAR(30),
  • firstName VARCHAR(15),
  • lastName VARCHAR(20),
  • address VARCHAR(60),
  • -- etc. (remaining columns omitted)
  • PRIMARY KEY (lastName,firstName)
  • );
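
To show the loading step end to end without a MySQL server, here is a small sketch that swaps in Python's built-in `sqlite3` module as a stand-in. It uses only the six columns shown on the slide and the first six fields of the sample record; the `load_entries` helper is a name invented for this example.

```python
import csv
import io
import sqlite3

# First six fields of the sample record from the "Generate Data" slide.
SAMPLE_CSV = (
    '"Miguel","miguel802@hotmail.com","Miguel Porter",'
    '"Miguel","Porter","1462 Oak St."\n'
)


def load_entries(conn, csv_text):
    """Create the entries table and insert one row per CSV record."""
    conn.execute(
        """CREATE TABLE entries (
               nickname  VARCHAR(20),
               email1    VARCHAR(50),
               fullName  VARCHAR(30),
               firstName VARCHAR(15),
               lastName  VARCHAR(20),
               address   VARCHAR(60),
               PRIMARY KEY (lastName, firstName)
           )"""
    )
    reader = csv.reader(io.StringIO(csv_text))
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?)", reader)
    conn.commit()


# SQLite in memory stands in for the MySQL database used in the talk.
conn = sqlite3.connect(":memory:")
load_entries(conn, SAMPLE_CSV)
result = conn.execute("SELECT fullName, address FROM entries").fetchall()
print(result)
```

Note that the composite primary key on (lastName, firstName) rejects a second person with the same name, so generated data has to avoid duplicate name pairs.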

11
Tell D2RQ about databases
  • Generate mapping files (command lines split)
  • generate-mapping -o eudoraMapping.ttl -u root -p
    mypw jdbc:mysql://localhost/eudora
  • generate-mapping -o outlookMapping.ttl -u root -p
    if27 jdbc:mysql://localhost/outlook
  • Combine two mapping files
  • Start server with combined mapping file
  • d2r-server comboMapping.ttl

12
Get some data to use for ontology creation
  • SPARQL Query
  • CONSTRUCT { ?s ?p ?o }
  • WHERE { ?s ?p ?o }
  • URL version
  • http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D
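
The "URL version" is just the SPARQL query percent-encoded into a `query` parameter. A minimal sketch of building it with the standard library follows; the endpoint is the one on the slide, while the `sparql_url` helper name is an assumption for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://localhost:2020/sparql"
QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }"


def sparql_url(endpoint, query):
    """Append the query as a percent-encoded `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_url(ENDPOINT, QUERY)
print(url)

# Decoding the parameter again should give back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

`urlencode` writes spaces as `+` and the braces and question marks as `%7B`, `%7D`, and `%3F`, which is why the URL form looks so different from the query it carries.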

13
rdfcat.xsl
  • XSLT 1.0 stylesheet to create a single RDF file
    from a source file like this
  • <rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
  • <xi:include href="myfile1.rdf"/>
  • <xi:include href="myfile2.rdf"/>
  • <xi:include href="myfile3.rdf"/>
  • </rdfcat>

14
List of files to concatenate together (rdfcat.rdf)
  • <rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
  • <xi:include href="http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D"/>
  • <!--xi:include href="properties.owl"/-->
  • </rdfcat>
  • Short XSLT stylesheet reads listed resources,
    concatenates them together. Now we have RDF of
    sample data.
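
The talk does this concatenation with an XSLT 1.0 stylesheet that is not reproduced in the transcript. As a stand-in, the following Python sketch shows the same idea: take several RDF/XML documents and move the children of each `rdf:RDF` root under one new root. The `concat_rdf` helper and the two tiny `example.org` documents are assumptions, and the sketch ignores `xml:base` and fetching over HTTP.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ROOT = "{%s}RDF" % RDF_NS


def concat_rdf(documents):
    """Merge the children of several rdf:RDF roots under one new root."""
    ET.register_namespace("rdf", RDF_NS)
    merged = ET.Element(RDF_ROOT)
    for text in documents:
        root = ET.fromstring(text)
        if root.tag != RDF_ROOT:
            raise ValueError("expected an rdf:RDF root element")
        merged.extend(list(root))
    return merged


# Hypothetical inputs standing in for the SPARQL dump and properties.owl.
DOC_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ns#">
  <rdf:Description rdf:about="http://example.org/a">
    <ex:name>A</ex:name>
  </rdf:Description>
</rdf:RDF>"""

DOC_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ns#">
  <rdf:Description rdf:about="http://example.org/b">
    <ex:name>B</ex:name>
  </rdf:Description>
</rdf:RDF>"""

merged = concat_rdf([DOC_A, DOC_B])
abouts = [d.get("{%s}about" % RDF_NS) for d in merged]
print(abouts)
```

Simple concatenation works here because an RDF graph is a set of statements, so merging two files never requires reconciling their structure.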

15
Generate ontology
  • Tell Swoop to load an ontology, then just load a
    regular RDF file!
  • Save it right away, see what you have.
  • Add That Value
  • Define more relationships between properties with
    Swoop
  • Save it
  • Look at the resulting ontology

16
New ontology rules
  • Define equivalent fields in the two databases
  • Declare phone property, name its subproperties
    (home, mobile, cell, work, business, fax)
  • email as inverse functional property
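
To make the effect of these three kinds of rule concrete, here is a toy forward-chaining sketch in Python. It is not how Pellet or Swoop work internally. The property names `eudora:phone`, `entries_workPhone`, `entries_mobile`, `entries_businessPhone`, and `entries_workState` follow names that appear elsewhere in the slides; `outlook:entries_businessState`, `outlook:entries_email`, the subject identifiers, and all data values are assumptions for illustration.

```python
# sub-property -> super-property (every value of the first is also a phone)
SUB_PROPERTY_OF = {
    "eudora:entries_workPhone": "eudora:phone",
    "eudora:entries_mobile": "eudora:phone",
    "outlook:entries_businessPhone": "eudora:phone",
}

# Pairs of fields declared equivalent across the two databases.
EQUIVALENT = [
    ("eudora:entries_workState", "outlook:entries_businessState"),
    ("eudora:entries_email1", "outlook:entries_email"),
]


def infer(triples):
    """Apply sub-property and equivalence rules until nothing new appears."""
    rules = set(SUB_PROPERTY_OF.items())
    for a, b in EQUIVALENT:
        rules.add((a, b))
        rules.add((b, a))  # equivalence holds in both directions
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(facts):
            for narrower, broader in rules:
                if p == narrower and (s, broader, o) not in facts:
                    facts.add((s, broader, o))
                    changed = True
    return facts


def same_individuals(facts, inverse_functional):
    """Pair distinct subjects that share a value of an inverse functional property."""
    owners = {}
    pairs = set()
    for s, p, o in sorted(facts):
        if p in inverse_functional:
            seen = owners.setdefault((p, o), set())
            for other in seen:
                pairs.add((other, s))
            seen.add(s)
    return pairs


data = {
    ("eudora:bobbyFischer", "eudora:entries_workState", "NY"),
    ("eudora:bobbyFischer", "eudora:entries_email1", "bf@example.com"),
    ("eudora:bobbyFischer", "eudora:entries_mobile", "(555) 010-0001"),
    ("outlook:robertFischer", "outlook:entries_businessState", "NY"),
    ("outlook:robertFischer", "outlook:entries_email", "bf@example.com"),
    ("outlook:robertFischer", "outlook:entries_businessPhone", "(555) 010-0002"),
}

facts = infer(data)
ny = sorted(s for s, p, o in facts
            if p == "eudora:entries_workState" and o == "NY")
phones = sorted(o for s, p, o in facts if p == "eudora:phone")
pairs = same_individuals(facts, {"eudora:entries_email1"})
print(ny, phones, pairs)
```

After inference, one query on `eudora:entries_workState` finds people from both databases, one query on `eudora:phone` finds every kind of phone number, and the shared email address identifies the two Fischer records as the same person.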

17
Separate new rules into separate file
  • <rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
  • <xi:include href="http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D"/>
  • <xi:include href="properties.owl"/>
  • </rdfcat>

18
Issue Queries
  • Who works in NY state?
  • List any phone numbers (home, mobile, business,
    etc.) that I have for Alfred Adams.
  • Find all info for Bobby Fischer at 2304 Eighth
    Lane, even if other database lists him as Robert
    L. Fischer of 2304 8th Ln.
  • Sample running of pellet query (split onto two
    lines)
  • pellet -if file:///dat/xml/rdf/databaseint/sampleout.rdf
    -ifmt RDF/XML -qf atest1.spq

19
Who works in NY state?
  • PREFIX e: <http://localhost:2020/resource/eudora/>
  • PREFIX o: <http://localhost:2020/resource/outlook/>
  • SELECT * WHERE {
  • ?s e:entries_workState "NY" }
  • --------------------------------------------------------------
  • Query Results (9 answers)
  • s
  • jillJones
  • sarahRichardson
  • victorHernandez
  • elaineSanchez
  • annieButler
  • rodneyJones
  • jesusWells
  • curtisBarnes
  • crystalMartin

20
Alfred Adams phone numbers
  • PREFIX e: <http://localhost:2020/resource/entries/>
  • SELECT ?phoneType ?phone WHERE {
  • ?s ?phoneType ?phone.
  • ?s e:phone ?phone.
  • ?s eud:entries_lastName "Adams".
  • ?s eud:entries_firstName "Alfred". }
  • -------------------------------------------------------
  • Query Results (13 answers)
  • phoneType phone
  • outlook:entries_businessPhone "(768) 629-3639"
  • eudora:entries_workPhone "(768) 629-3639"
  • eudora:entries_workFax "(865) 937-1192"
  • eudora:entries_workMobile "(262) 851-6276"
  • eudora:entries_otherPhone "(840) 290-6143"
  • eudora:entries_mobile "(257) 372-7719"
  • et cetera

21
Bobby Fischer info
  • SELECT * WHERE {
  • <http://localhost:2020/resource/entries/Bobby/Fisher> ?p ?o }
  • --------------------------------------------------------------------------------
  • Query Results (41 answers)
  • p o
  • eudora:entries_mobile "(989) 402-5141"
  • eudora:entries_workWebAddress "http://www.atmosenergy.com"
  • outlook:entries_lastName "Fisher"
  • eudora:entries_firstName "Bobby"
  • eudora:entries_state "NE"
  • eudora:entries_zip "29565-9670"
  • outlook:entries_businessPhone "(167) 559-3177"
  • eudora:entries_lastName "Fisher"
  • eudora:entries_workCity "El Paso"
  • eudora:phone "(974) 270-6457"
  • et cetera...

22
Caveats
  • Querying disk file of full dump
  • Scalable?

23
Relational database integration with RDF/OWL
  • Bob DuCharme
  • December 7, 2006
  • XML 2006