Title: TOBB UNIVERSITY OF ECONOMICS AND TECHNOLOGY BIL 546 Semantic Web
- Semantic Web and Databases
Hüseyin ÇOTUK 08110122
2Related Papers
- M. Krishna, Retaining Semantics in Relational Databases by Mapping Them to RDF, 2006 IEEE/WIC/ACM International Conference
- J. Petrini, T. Risch, SWARD: Semantic Web Abridged Relational Databases, 18th International Workshop on Database and Expert Systems Applications, 2007
- W. Teswanich, S. Chittayasothorn, A Transformation from RDF Documents and Schemas to Relational Databases, 2007 IEEE Pacific Rim Conference on Communications
- J. Xu, W. Li, Using Relational Database to Build OWL Ontology from XML Data Sources, 2007 International Conference on Computational Intelligence and Security Workshops
3Related Work
- D2R
- squirrelRDF
- SPASQL
- Oracle applications
- Reading Relational Databases on the Semantic Web by Tim Berners-Lee
4CONTENTS
- Transformation Philosophy
- Overview of papers
- Overview of related work
- Project Subject
- Questions
5Transformation Philosophy
- Real-world applications store data in relational databases
- RDBs are faster and more reliable
- The Semantic Web aims to make data machine-processable
- With the growth of the Semantic Web, relational databases need to be expressed in a form and language that is machine-processable and that makes the semantics expressed by the database more explicit
6Retaining Semantics in Relational Databases by
Mapping Them to RDF
- Purpose of using relational DBs
- storing
- managing
- retrieving data
- Translation of data modelling schemes such as entity-relationship diagrams into RDBs is accompanied by a certain loss of semantics
- The paper presents a methodology for expressing relational databases in the Resource Description Framework (RDF) language, which has been referred to as the language of the Semantic Web
7Retaining Semantics in Relational Databases by
Mapping Them to RDF
- In order to express a description, RDF uses a number of triples
- <Subject> <Predicate> <Object>
- <Resource> <Property> <Property Value>
- A resource can remain constant even when its content (the entities to which it currently corresponds) changes over time, provided that the conceptual mapping is not changed in the process.
- What makes RDF distinctive in representing relationships between resources is that it is designed for use on the Web: it uses Uniform Resource Identifiers (URIs) to represent resources and properties and, in some cases, property values as well.
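As a minimal illustration of the triple structure described above, the sketch below stores triples as plain tuples. The `http://example.org/` namespace, the `vocab/` property names and the sample values are assumptions made up for this example; they do not come from the paper.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # always a URI
    predicate: str  # always a URI
    obj: str        # either a URI or a literal value


EX = "http://example.org/"  # hypothetical namespace, for illustration only

triples = [
    # Property value is a literal
    Triple(EX + "book/BK-001", EX + "vocab/title", "An XML primer"),
    # Property value is itself a resource, identified by a URI
    Triple(EX + "book/BK-001", EX + "vocab/printer", EX + "printer/GB2000"),
]


def is_uri(value: str) -> bool:
    """Crude check: treat anything starting with an HTTP scheme as a URI."""
    return value.startswith(("http://", "https://"))


def describe(resource: str, statements: list) -> dict:
    """Collect all property/value pairs stated about one resource."""
    return {t.predicate: t.obj for t in statements if t.subject == resource}
```

Calling `describe(EX + "book/BK-001", triples)` gathers both statements about the book into one dictionary keyed by property URI.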
8Retaining Semantics in Relational Databases by
Mapping Them to RDF
- Word → URIs (vocabulary)
- Define vocabulary → URIrefs (RDF Schema or OWL needed)
- Much of the related research attempting to express data in a machine-readable form, while utilizing terms defined ontologically, has concentrated on the use of UML
- UML has many disadvantages when it comes to dealing with the problem at hand (lacking semantics)
9Retaining Semantics in Relational Databases by
Mapping Them to RDF
- A relational database consists of tables, which consist of tuples or records.
- Each tuple consists of a set of attribute values. A tuple, therefore, is composed of the contents of its attributes
- Relationship
- tuple (a row in a table) → RDF subject
- attribute → RDF predicate
- attribute value → RDF object
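The tuple/attribute/value correspondence above can be sketched in a few lines of Python against an in-memory SQLite database. The `BASE` URI, the URI layout and the `employee` table are hypothetical choices for this example, not the scheme used in the paper.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base URI for generated resources


def table_to_triples(conn, table, key_column):
    """Yield one (subject, predicate, object) triple per non-NULL attribute value.

    tuple (row)      -> subject URI built from the table name and key value
    attribute        -> predicate URI built from the table and column name
    attribute value  -> object (kept as a plain literal here)
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if value is not None:  # NULLs simply produce no triple
                yield (subject, f"{BASE}{table}#{column}", value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Alice', 'R&D')")
triples = list(table_to_triples(conn, "employee", "id"))
```

Each row yields as many triples as it has non-NULL columns, all sharing the same subject.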
11Semantic Web Abridged Relational Databases
- A system that can process queries to RDF views of large relational databases
- This provides very flexible views of wrapped databases that can be queried using either RDQL or SQL
- It is critical to optimize not only data access time but also the time to perform the query optimization itself
- RDBs → RDF views
- RDF views include schema data and table content data
12Semantic Web Abridged Relational Databases
- Queries can mix metadata and table access
- SWARD presents RDF triples derived from a relational database as a single relation of triples, called the universal property view (UPV).
- The UPV is internally defined as a union of a content view that represents relational table contents and a schema view that represents the relational schema.
- A content view represents one exported column in the relational database
13Semantic Web Abridged Relational Databases
- Related work
- RDF repository systems often use relational databases internally. Such a relational database is fully managed by the repository system, and its schema is internal. Making RDF queries to an existing relational database through such a repository requires downloading the database into the repository, which clearly does not scale
- Rather than storing RDF data in dedicated RDF repositories, SWARD wraps an existing relational database so that it can be used in RDF queries without downloading database tables to a repository. Instead, the data necessary for answering a particular query are represented as RDF triples streamed through SWARD
- The UPV U of a relational database for a given ontology is defined as the union of two subviews: the schema view S, representing the schema of the relational database, and the content view C, representing its contents, i.e. U = S ∪ C. U is generated by ExportRDB and has the definition
- U(s,p,v) :- S(s,p,v) OR C(s,p,v)
- S(s,p,v) :- Classes(s,p,v) OR Domains(s,p,v) OR Ranges(s,p,v)
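The union structure of the UPV can be sketched with Python generators. This is only an illustration of U = S ∪ C, not SWARD's implementation: the `SCHEMA` and `ROWS` dictionaries stand in for a wrapped database, and the use of `rdf:type`, `rdfs:domain` and `rdfs:range` for the Classes, Domains and Ranges subviews is an assumption.

```python
from itertools import chain

# Hypothetical description of one exported table and its rows
SCHEMA = {"employee": {"name": "string", "salary": "integer"}}
ROWS = {"employee": [{"id": 1, "name": "Alice", "salary": 100}]}


def schema_view():
    """S(s,p,v): triples describing the relational schema."""
    for table, columns in SCHEMA.items():
        yield (table, "rdf:type", "rdfs:Class")            # Classes
        for column, column_type in columns.items():
            prop = f"{table}.{column}"
            yield (prop, "rdfs:domain", table)              # Domains
            yield (prop, "rdfs:range", column_type)         # Ranges


def content_view():
    """C(s,p,v): triples describing the table contents, column by column."""
    for table, rows in ROWS.items():
        for row in rows:
            subject = f"{table}/{row['id']}"
            for column in SCHEMA[table]:
                yield (subject, f"{table}.{column}", row[column])


def upv(s=None, p=None, v=None):
    """U(s,p,v) = S OR C, lazily filtered; None acts as a wildcard."""
    for triple in chain(schema_view(), content_view()):
        if all(want is None or want == got for want, got in zip((s, p, v), triple)):
            yield triple
```

Because both subviews are exposed through the same triple relation, one query such as `upv(p="rdfs:domain")` reads metadata while `upv(s="employee/1")` reads table contents.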
18Semantic Web Abridged Relational Databases
- After normalization, the SQL generator translates each simplified conjunctive subquery into an algebra expression. The algebra expression contains calls to SQL statements sent via JDBC to the relational database for cost-based optimization and execution.
- The only SQL statement submitted to the back-end relational database in our example is actually
19A Transformation from RDF Documents and Schemas
to RDBs
- Resource Description Framework (RDF) documents and schemas (RDFS) are used to describe information in the Semantic Web.
- Many research works regard RDF/RDFS documents as databases and propose data manipulations for them.
- This paper takes a different approach. To make the database easy to manipulate, RDF/RDFS documents are transformed into relational database format so that readily available relational languages, data management and business intelligence facilities can be exploited.
20A Transformation from RDF Documents and Schemas
to RDBs
- RDF Schema (RDFS) helps RDF define the properties (attributes), kinds, and relationships of resources in RDF documents
- Processing RDF/RDFS documents as databases is not efficient, because database management systems (DBMSs) do not support RDFS as a database model
- Querying RDF/RDFS documents is based on tree traversal and simple pattern matching.
- From the productivity point of view, SQL requests made on relational databases are considered simpler and quicker to formulate than requests in an RDF-based language such as SPARQL
21A Transformation from RDF Documents and Schemas
to RDBs
- From the availability point of view, relational DBMSs are widely available, and the business intelligence software tools that support them are mostly based on relational databases
- In order to transform RDF/RDFS documents to a relational database (RDB), the documents are loaded into the RDF Transformation Engine, which provides three levels of engines
- At the data segregation level, the loaded documents are separated into two parts: RDF and its schema (RDFS)
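One possible reading of the data segregation step is sketched below: statements that use RDFS vocabulary go to the schema part, everything else to the data part. The classification rule and the `ex:` sample statements are assumptions for illustration; the paper's engine may separate the documents differently.

```python
# Predicates that, in this sketch, mark a statement as schema-level
RDFS_SCHEMA_PREDICATES = {
    "rdfs:subClassOf", "rdfs:subPropertyOf", "rdfs:domain", "rdfs:range",
}
# rdf:type statements count as schema only when they declare a class or property
SCHEMA_TYPES = {"rdfs:Class", "rdf:Property"}


def is_schema_statement(triple):
    _, predicate, obj = triple
    return predicate in RDFS_SCHEMA_PREDICATES or (
        predicate == "rdf:type" and obj in SCHEMA_TYPES
    )


def segregate(triples):
    """Split a list of triples into (schema part, data part)."""
    schema, data = [], []
    for triple in triples:
        (schema if is_schema_statement(triple) else data).append(triple)
    return schema, data


# Hypothetical input document, already parsed into triples
document = [
    ("ex:Book", "rdf:type", "rdfs:Class"),
    ("ex:title", "rdfs:domain", "ex:Book"),
    ("ex:bk1", "rdf:type", "ex:Book"),
    ("ex:bk1", "ex:title", "An XML primer"),
]
schema_part, data_part = segregate(document)
```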
23A Transformation from RDF Documents and Schemas
to RDBs
- NIAM Conceptual Meta Schema of RDFS
27A Transformation from RDF Documents and Schemas
to RDBs
- SPARQL is currently a working draft under development by the W3C's RDF Data Access Working Group (DAWG).
- Since SPARQL is not a stateful protocol, cursors with a specific isolation and locking level on the relational database cannot be applied.
- SPARQL does not support data modification operations like INSERT, UPDATE, or DELETE in SQL.
- In its current version, a SPARQL query on an RDF document still does not support aggregate functions like COUNT, MAX, MIN, or AVG in SQL
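To illustrate what becomes available once RDF data sits in a relational database, the sketch below loads a few statements into SQLite and runs an aggregate and an UPDATE over them. The single generic `triples` table and the sample values are simplifying assumptions; the paper derives its relational schema from the RDFS rather than using one three-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("emp/1", "salary", "100"),
        ("emp/2", "salary", "300"),
        ("emp/1", "name", "Alice"),
    ],
)

# Aggregate functions are plain SQL once the data is relational
count, average = conn.execute(
    "SELECT COUNT(*), AVG(CAST(object AS REAL)) FROM triples WHERE predicate = ?",
    ("salary",),
).fetchone()

# Data modification is available as well
conn.execute(
    "UPDATE triples SET object = ? WHERE subject = ? AND predicate = ?",
    ("150", "emp/1", "salary"),
)
(highest,) = conn.execute(
    "SELECT MAX(CAST(object AS REAL)) FROM triples WHERE predicate = ?",
    ("salary",),
).fetchone()
```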
30Using Relational Database to Build OWL Ontology
from XML Data Sources
- The Semantic Web and web services use ontologies to describe important concepts and the relations among them. But constructing an ontology from scratch is costly and difficult.
- In this paper, an approach is proposed to construct an OWL ontology from an XML document with the help of the entity-relation model; this approach alleviates the difficulties in ontology construction.
31Using Relational Database to Build OWL Ontology
from XML Data Sources
- XML itself provides only syntax and conveys little of the meaning of XML document content. The tags in XML documents are meaningful to humans but meaningless to machines
- The Semantic Web takes ontologies as the way to express the semantics of data
- Unfortunately, in the real world, knowledge doesn't exist in an ontology style, so how to construct a domain ontology is an interesting problem.
- Ontology construction is very expensive, time-consuming and laborious.
32Using Relational Database to Build OWL Ontology
from XML Data Sources
- In this paper, an approach is proposed to build an OWL ontology from an XML document. We first map an XML document to an entity-relation model, and then extract the metadata information and structural restrictions of the entity-relation model to build an OWL ontology
- Two basic approaches can be adopted on the topic. One is the top-down approach, in which an ontology is defined beforehand and then associated with the local schema or XML document instance.
33Using Relational Database to Build OWL Ontology
from XML Data Sources
- The other is the bottom-up approach, in which an ontology is constructed from the conceptual schema of the local data source and all semantics of the ontology are derived from the local data sources.
- Some related work analyzes the structure of an XML document to access the semantics of its content in this second way. Some of it focuses on a general mapping between XML and RDF, while other work mainly aims at mapping from DTD or XML Schema to OWL without sufficiently considering XML instance data.
34Using Relational Database to Build OWL Ontology
from XML Data Sources
- The XTR-RTO Mapping
- An XTR (XML Transform to Relational database) mapping approach is proposed to map an XML document to an entity-relation model, and then an RTO (Relational database Transform to Ontology) mapping approach is proposed to map the entity-relation model to an OWL ontology.
- Each SimpleType element and attribute is mapped to a scalar type, which will be used as a column of the table, and their values cannot be changed.
- Each ComplexType element is mapped to a class, which will be used as a table. Its attributes and subelements are mapped to the properties of the class.
35Using Relational Database to Build OWL Ontology
from XML Data Sources
- Each class will be transformed into a table, and its properties will be transformed into columns of the table. Concretely:
- The name of the table is the same as that of the class, and the name of each scalar property is used as the name of a column. A primary key is added to each table.
- RTO Description
- The entity-relation model is currently the most popular style for organizing databases, and it expresses the relationships between data clearly. So we can extract metadata information from a relational database to construct OWL ontologies.
36Using Relational Database to Build OWL Ontology
from XML Data Sources
- The OWL ontology contains
- vocabularies for describing relational database systems, such as rdb:DBName, rdb:Relation, rdb:RelationList, rdb:Table, rdb:Attribute, rdb:PrimaryKeyAttribute, rdb:ForeignKeyAttribute and so on.
- semantic relationships between vocabularies, such as rdb:hasRelation, rdb:hasAttribute, rdb:primaryKey, rdb:hasType, rdb:isNullable and so on.
- restrictions on the vocabularies and their semantic relationships, such as each relation has zero or more attributes, each attribute has exactly one type, etc.
37Using Relational Database to Build OWL Ontology
from XML Data Sources
- The RTO mapping approach is described below
- Each table is mapped to an instance of type rdb:Relation and then added to type rdb:RelationList.
- Each attribute is mapped to an instance of type rdb:Attribute, and an instance of type rdb:hasType is generated simultaneously. If the attribute is a foreign key, an instance of type rdb:ReferenceAttribute and an instance of type rdb:ReferenceRelation are generated to represent this information.
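A rough sketch of the RTO idea, reading table metadata and emitting statements in the rdb vocabulary, is given below using SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list`. The triple shapes (for example using `rdb:ReferenceRelation` as a predicate) and the two sample tables are assumptions for illustration; the paper generates OWL instances, whose exact form is not reproduced here.

```python
import sqlite3


def rto_triples(conn, table):
    """Describe one table with rdb-style statements derived from its metadata."""
    triples = [(table, "rdf:type", "rdb:Relation")]
    # foreign_key_list rows: (id, seq, referenced table, from column, to column, ...)
    foreign_keys = {
        row[3]: (row[2], row[4])
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
    }
    # table_info rows: (cid, name, declared type, notnull, default value, pk)
    for _, name, column_type, _, _, pk in conn.execute(f'PRAGMA table_info("{table}")'):
        attribute = f"{table}.{name}"
        triples.append((attribute, "rdf:type", "rdb:Attribute"))
        triples.append((table, "rdb:hasAttribute", attribute))
        triples.append((attribute, "rdb:hasType", column_type))
        if pk:
            triples.append((table, "rdb:primaryKey", attribute))
        if name in foreign_keys:
            ref_table, ref_column = foreign_keys[name]
            triples.append((attribute, "rdb:ReferenceRelation", ref_table))
            triples.append((attribute, "rdb:ReferenceAttribute", f"{ref_table}.{ref_column}"))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRINTER (PRINTER_ID TEXT PRIMARY KEY, PRINTER_NAME TEXT, CITY TEXT)")
conn.execute(
    "CREATE TABLE BOOK (BOOK_ID TEXT PRIMARY KEY, TITLE TEXT, AUTHOR TEXT, "
    "PRINTER_ID TEXT REFERENCES PRINTER(PRINTER_ID))"
)
book_triples = rto_triples(conn, "BOOK")
```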
38Using Relational Database to Build OWL Ontology
from XML Data Sources
<BOOK>
  <BOOK_ID>BK-001</BOOK_ID>
  <TITLE>An XML primer</TITLE>
  <AUTHOR>John</AUTHOR>
  <PRINTER>
    <PRINTER_ID>GB2000</PRINTER_ID>
    <PRINTER_NAME>XianDai</PRINTER_NAME>
    <CITY>Beijing</CITY>
  </PRINTER>
</BOOK>
39Using Relational Database to Build OWL Ontology
from XML Data Sources
- According to our XTR mapping algorithm, we will get the entity-relation model as below
- Class BOOK
- BOOK_ID string
- TITLE string
- AUTHOR string
- PRINTER_ID string
- Class PRINTER
- PRINTER_ID string
- PRINTER_NAME string
- CITY string
from XML Data Sources
- According to the XTR mapping, there will be two
tables in this database
41Using Relational Database to Build OWL Ontology
from XML Data Sources
- The OWL description of table BOOK
42Using Relational Database to Build OWL Ontology
from XML Data Sources
- BOOK_ID is the primary key of table BOOK, so it must be unique, and its cardinality must be 1. The number of authors, however, is not necessarily 1, since a book may have several authors. So its minCardinality is 1, but in principle it can be any count.
- A foreign key has the same meaning as the attribute it refers to, so we can use owl:sameAs to describe this information. For example, the cardinality restriction and foreign key restriction in BOOK.owl are described separately
44D2R
- D2R Server is a tool for publishing relational databases on the Semantic Web
- Data on the Semantic Web is modelled and represented in RDF.
- D2R Server uses a customizable D2RQ mapping to map database content into this format, and allows the RDF data to be browsed and searched, the two main access paradigms to the Semantic Web.
45D2R
- D2R Server's Linked Data interface makes RDF descriptions of individual resources available over the HTTP protocol. An RDF description can be retrieved simply by accessing the resource's URI over the Web. Using a Semantic Web browser like Tabulator, the OpenLink RDF Browser, or Disco, you can follow links from one resource to the next, surfing the Web of Data
- The SPARQL interface enables applications to search and query the database using the SPARQL query language over the SPARQL protocol.
46D2R
- Requests from the Web are rewritten into SQL
queries via the mapping. This on-the-fly
translation allows publishing of RDF from large
live databases and eliminates the need for
replicating the data into a dedicated RDF triple
store.
47D2R
- Public D2R servers
- dbpedia.org: Structured data extracted from Wikipedia
- Gene Ontology annotations (Chris Mungall)
- Annotated images of gene expression in fruitfly embryogenesis (Chris Mungall)
- DBLP Bibliography Database on the Semantic Web
- Web-based Systems Group @ Freie Universität Berlin: Information about staff, projects and publications
- Roller blog server demo (Henry Story)
- D2R Server Live Demo: publishing an example conference database
48squirrelRDF
- SquirrelRDF is a tool which allows non-RDF data stores (or, perhaps, not explicitly RDF) to be queried using SPARQL. In its current form this includes relational databases (via JDBC) and LDAP servers (via JNDI). It provides an ARQ QueryEngine (for Java access), a command line tool, and a servlet for SPARQL HTTP access. As a result the information now looks like RDF, and is always current.
- SquirrelRDF exposes the mapped store in a rather 'raw' form. It makes no attempt, for example, to reveal implicit relations between objects (suggested by foreign keys), or normalise denormalised data. This simplifies Squirrel's task, focusing it on mapping to RDF and ignoring the complex task of transforming between vocabularies or ontologies, which are better left to pure RDF tools.
49squirrelRDF
- Here are some approaches to transforming the mapped data
- Map using the configuration files, which gives you a simple property or class mapping. Very limited.
- Use CONSTRUCT, which is a more powerful means to change the shape of the results.
- Use N3 or Jena rules, which is often ideal.
- Transform the incoming query. A poor man's backward rule engine, but it is quite useful.
- You will need
- Java 5
- Jena 2.4
- The relevant JDBC driver (if you want to use the RDB mapper)
- HSQLDB (if you want to run the RDB tests)
50SPASQL
- SPASQL is simply an extension of the SQL standard, allowing execution of SPARQL queries within SQL statements, typically by treating them as subquery or function clauses
- Adding native SPARQL support to the database can deliver the same performance as for well-tailored SQL queries
- Several gateways between RDF and conventional relational stores have also been developed to take advantage of federated query capabilities. Examples of implementations that can rewrite SPARQL queries to SQL include OpenLink Virtuoso, D2RQ, and SquirrelRDF.
51SPASQL
- Mapping
- Semantic mapping is (in this context) a mapping from SPARQL queries expressed in terms of RDF graphs to relational queries (SQL) expressed in terms of tables and attributes. The advantages of mapping include
- portability: a query about a person with foaf:name "Bob" can work on multiple relational databases
- intuitiveness: using common terms allows multiple databases to express their data in the shape of a single, well-thought-out schema developed and understood by the community
- migration: no need to convert relational databases into RDF for storage in a triple store, which can add latency and increase storage requirements
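The portability point can be made concrete with a toy rewriter: one triple pattern about foaf:name "Bob" is translated into SQL for two differently shaped databases. The mapping format, table names and column names below are all hypothetical, and the sketch handles a single pattern only; it is not how SPASQL or any other gateway is implemented.

```python
import sqlite3

# Hypothetical per-database mappings: property -> (table, key column, value column)
MAPPINGS = {
    "hr_db": {"foaf:name": ("employees", "emp_id", "full_name")},
    "crm_db": {"foaf:name": ("contacts", "contact_id", "display_name")},
}


def pattern_to_sql(mapping, predicate):
    """Rewrite the pattern (?x predicate value) into a parameterized SQL query."""
    table, key_column, value_column = mapping[predicate]
    return f'SELECT "{key_column}" FROM "{table}" WHERE "{value_column}" = ?'


def find_subjects(conn, mapping, predicate, value):
    """Return the keys of all rows matching (?x predicate value)."""
    sql = pattern_to_sql(mapping, predicate)
    return [row[0] for row in conn.execute(sql, (value,))]


# Two databases with different schemas holding the same kind of fact
hr = sqlite3.connect(":memory:")
hr.execute("CREATE TABLE employees (emp_id INTEGER, full_name TEXT)")
hr.execute("INSERT INTO employees VALUES (7, 'Bob')")

crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE contacts (contact_id INTEGER, display_name TEXT)")
crm.execute("INSERT INTO contacts VALUES (42, 'Bob')")

hr_hits = find_subjects(hr, MAPPINGS["hr_db"], "foaf:name", "Bob")
crm_hits = find_subjects(crm, MAPPINGS["crm_db"], "foaf:name", "Bob")
```

The query term `foaf:name` stays the same in both calls; only the mapping differs per database.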
52SPASQL
- SPASQL is a modified MySQL server which is able to parse both SQL and SPARQL queries. This allows one to embed SPARQL queries in any MySQL client. For instance, a PHP page may include SPARQL queries where it used SQL before
  <?php
  // Connect exactly as for an ordinary MySQL query
  mysql_connect("localhost", $username, $password);
  @mysql_select_db($database) or die("Unable to select database");
  // The SPARQL keyword tells the modified server to parse the rest as SPARQL
  mysql_query("SPARQL SELECT ?address ?apt WHERE { ?o <Orders.shippingAddress> ?address . ?address <Addresses.apt> ?apt }");
  ?>
53RDF Support in Oracle RDBMS
- Three types of database objects
- Model - RDF graph consisting of a set of triples
- Rulebase - Set of (user-defined) rules
- Rule Index - Entailed RDF graph
54RDF Support in Oracle RDBMS
- Family
- (John brotherOf Mary)
- (John age "16"^^xsd:integer)
- (Mary parentOf Matt)
- (John name "John")
- (Mary name "Mary")
- Reification
- (John thinks _:S1)
- (_:S1 rdf:subject Sue)
- (_:S1 rdf:predicate livesIn)
- (_:S1 rdf:object NYC)
55RDF Support in Oracle RDBMS
- Example RDF Query
- Find salary and hiredate of all the uncles
  SELECT emp.name, emp.salary, emp.hiredate
  FROM emp,
    TABLE(SDO_RDF_MATCH(
      '(?x brotherOf ?y)
       (?y parentOf ?z)
       (?x name ?name)',
      SDO_RDF_Models('family'),
      ...)) t
  WHERE emp.name = t.name
- Use of SDO_RDF_MATCH allows embedding a graph query in a SQL query
56RDF Support in Oracle RDBMS
- Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer
  SELECT t3.x name1, t3.y name2
  FROM AddrTable t1, AddrTable t2,
    TABLE(SDO_RDF_MATCH(
      '(?x rents ?a) (?a rdf:type Truck)
       (?y buys ?b) (?b rdf:type Fertilizer)',
      SDO_RDF_Models('Activities'),
      ...)) t3
  WHERE t1.name = t3.x AND t2.name = t3.y AND
        t1.addr = t2.addr
57RDF Support in Oracle RDBMS
- Each RDF rulebase consists of a set of rules
- Each rule consists of
- antecedent graph-pattern
- filter condition (optional)
- Consequent graph-pattern
- One or more rulebases may be used with relevant
RDF models (graphs) to obtain entailed graphs
58RDF Support in Oracle RDBMS
- Rules in a rulebase family_rb
- Antecedent: (?x brotherOf ?y) (?y parentOf ?z)
- Filter: NULL
- Consequent: (?x uncleOf ?z)
- Antecedent: (?x age ?a)
- Filter: a > 65
- Consequent: (?x ageGroup Senior)
- Antecedent: (?x parentOf ?y) (?y parentOf ?z)
- Filter: NULL
- Consequent: (?x grandParentOf ?z)
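How such rules produce an entailed graph can be sketched with a naive forward-chaining loop over the family triples shown earlier. This is a toy matcher written for illustration, not Oracle's inference engine; the tuple encoding of rules and the Python filter function are assumptions of the sketch.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def solve(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = match(patterns[0], triple, binding)
        if extended is not None:
            yield from solve(patterns[1:], triples, extended)


def entail(triples, rules):
    """Apply the rules until nothing new is inferred; return only the new triples."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for antecedent, rule_filter, consequent in rules:
            # list() finishes matching before `known` is modified
            for binding in list(solve(antecedent, known)):
                if rule_filter is not None and not rule_filter(binding):
                    continue
                inferred = tuple(binding.get(term, term) for term in consequent)
                if inferred not in known:
                    known.add(inferred)
                    changed = True
    return known - set(triples)


family = [
    ("John", "brotherOf", "Mary"),
    ("Mary", "parentOf", "Matt"),
    ("John", "age", "16"),
]
family_rb = [
    ([("?x", "brotherOf", "?y"), ("?y", "parentOf", "?z")], None, ("?x", "uncleOf", "?z")),
    ([("?x", "age", "?a")], lambda b: int(b["?a"]) > 65, ("?x", "ageGroup", "Senior")),
    ([("?x", "parentOf", "?y"), ("?y", "parentOf", "?z")], None, ("?x", "grandParentOf", "?z")),
]
inferred = entail(family, family_rb)
```

With only these three input triples, the uncleOf rule fires once; the ageGroup rule is blocked by its filter and the grandParentOf rule finds no two-step parentOf chain.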
59RDF Support in Oracle RDBMS
- A rule index represents an entailed graph
- A rule index is created on an RDF dataset (consisting of a set of RDF models and a set of RDF rulebases)
- A rule index may be created on a dataset consisting of
- family RDF data, and
- family_rb rulebase (shown earlier)
- The rule index will contain inferred triples showing uncleOf and ageGroup information
60RDF Support in Oracle RDBMS
- RDF Query w/ Inference Example
- Find salary and hiredate of all the uncles
  SELECT emp.name, emp.salary, emp.hiredate
  FROM emp,
    TABLE(SDO_RDF_MATCH(
      '(?x uncleOf ?y) (?x name ?name)',
      SDO_RDF_Models('family'),
      SDO_RDF_Rulebases('rdfs', 'family_rb'),
      ...)) t
  WHERE emp.name = t.name
61RDF Support in Oracle RDBMS
- Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer
  SELECT t3.x name1, t3.y name2
  FROM AddrTable t1, AddrTable t2,
    TABLE(SDO_RDF_MATCH(
      '(?x rents ?a) (?a rdf:type Truck)
       (?y buys ?b) (?b rdf:type Fertilizer)',
      SDO_RDF_Models('Activities'),
      SDO_RDF_Rulebases('rdfs'),
      ...)) t3
  WHERE t1.name = t3.x AND t2.name = t3.y AND
        t1.addr = t2.addr
63RDF Support in Oracle RDBMS
  select m from TABLE(SDO_RDF_MATCH(
    '(?m rdf:type :Male)',
    SDO_RDF_Models('family'), null,
    SDO_RDF_Aliases(
      SDO_RDF_Alias('', 'http://www.example.org/family/')), null))

  M
  --------------------------------------------------------------------------------
  http://www.example.org/family/Jack
  http://www.example.org/family/Tom
64RDF Support in Oracle RDBMS
  select m from TABLE(SDO_RDF_MATCH(
    '(?m rdf:type :Male)',
    SDO_RDF_Models('family'),
    SDO_RDF_Rulebases('RDFS'),
    SDO_RDF_Aliases(
      SDO_RDF_Alias('', 'http://www.example.org/family/')), null))

  M
  --------------------------------------------------------------------------------
  http://www.example.org/family/Jack
  http://www.example.org/family/Tom
  http://www.example.org/family/John
  http://www.example.org/family/Matt
  http://www.example.org/family/Sammy
65RDF Support in Oracle RDBMS
  select x, y from TABLE(SDO_RDF_MATCH(
    '(?x :grandParentOf ?y) (?x rdf:type :Male)',
    SDO_RDF_Models('family'),
    SDO_RDF_Rulebases('RDFS', 'family_rb'),
    SDO_RDF_Aliases(
      SDO_RDF_Alias('', 'http://www.example.org/family/')), null))

  X                                     Y
  ------------------------------------  ------------------------------------
  http://www.example.org/family/John    http://www.example.org/family/Cindy
  http://www.example.org/family/John    http://www.example.org/family/Tom
  http://www.example.org/family/John    http://www.example.org/family/Jack
  http://www.example.org/family/John    http://www.example.org/family/Cathy
66Project Subject
- Generic RDB to SW transformation
- JDBC metadata
- Jena
- SPARQL
67Questions