Title: RAL: an RDF Algebra
1 RAL: an RDF Algebra
- Flavius Frasincar
- Geert-Jan Houben
- Richard Vdovjak
- Peter Barna
2 Contents
- Introduction
- RAL Goals
- RAL Data Model
- RAL Operators
- Conclusion
3 1. Introduction
- Metadata is machine-understandable information about web resources or other things (source: Tim Berners-Lee, Metadata Architecture)
- RDF (Resource Description Framework) is the metadata language for the Web
- RDF extends the syntactic interoperability of XML to semantic interoperability, being the foundation for the Semantic Web
4 Semantic Web Architecture Layer Cake
Source: Tim Berners-Lee (Director, W3C), keynote speech at XML 2000, "RDF and the Semantic Web" (Washington DC, 6 Dec. 2000)
5 Hera
- Hera research project: Web Information Systems (WIS) and web (hypermedia) generation in WIS
- WIS use RDF to represent and query application data for:
  - Semantic integration of data coming from heterogeneous sources
  - Semantic information presentation
  - Semantic querying
- Huge quantities of data and metadata need to be processed in real time: optimization is crucial
6 Hera Methodology/Suite
7 RDF Representations
- Primitive semantics: Subject, Predicate, Object
- Three alternative notations:
  - Triple: (http://example.com/sb.jpg, painted_by, Rembrandt)
  - RDF/XML:
      <rdf:Description rdf:ID="http://example.com/sb.jpg">
        <painted_by>Rembrandt</painted_by>
      </rdf:Description>
  - Graph
8 RDF Query Languages
- Triple-based
  - Triple, the successor of SiLRI (Horn logic)
  - Metalog (Datalog)
- XML-based
  - RDF Query
  - RQuery (XQuery)
- Graph-based (but not graphical)
  - RQL (OQL)
9 2. RAL Goals
- Support the formal specification of RDF query languages
- Provide a reference framework to compare different RDF query languages
- Consider the result construction phase
  - presently neglected by RDF query languages, which focus only on extraction
- Enable algebraic query optimization
10 RAL
- RAL Data Model: specifies what information is accessible (for RAL operators) in an RDF graph
  - Nodes: Resources and Literals
  - Edges: Properties
- RAL Operators: define operators working on collections of nodes from the RAL Data Model
  - Extraction Operators
  - Loop Operators
  - Construction Operators
11 3. RAL Data Model
- R is the set of resources: $R = U \cup B$
- U is the set of URI references, $\text{rdf:Property} \in U$
- B is the set of blank nodes
- L is the set of literals; U, B, L are disjoint
- P is the set of properties: $P \subset R$, $\text{rdf:type} \in P$
[Figure: diagram of the sets R, L, U, B, P]
12
- An RDF model M is a finite set of triples (statements)
  - $M \subset R \times U \times (R \cup L)$
- The set of properties of an RDF model M
  - $P_M = \{\, p \mid (s, p, o) \in M \lor (p, \text{rdf:type}, \text{rdf:Property}) \in M \,\}$
- The RDF graph model is similar to a directed labeled graph (DLG)
  - It is not a DLG since it allows for multiple edges between two nodes
  - It is not a general multigraph because different edges between two nodes cannot share the same label
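The definition of the property set of a model can be illustrated with a small Python sketch. It assumes a model is stored as a Python set of (subject, predicate, object) string tuples; the `ex:` names and the helper `properties_of` are hypothetical and not part of RAL.

```python
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"

# Hypothetical RDF model M: a finite set of (subject, predicate, object) triples.
M = {
    ("ex:r4", "ex:paints", "ex:r5"),
    ("ex:r4", RDF_TYPE, "ex:Painter"),
    ("ex:tname", RDF_TYPE, RDF_PROPERTY),  # declared, but not used as a predicate
}

def properties_of(model):
    """Return P_M: predicates used in a triple, or resources typed rdf:Property."""
    used = {p for (_, p, _) in model}
    declared = {s for (s, p, o) in model if p == RDF_TYPE and o == RDF_PROPERTY}
    return used | declared

P_M = properties_of(M)
```

Note that `ex:tname` belongs to P_M although no triple uses it as a predicate, because the model declares it to be of type rdf:Property.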
13
- The RDF graph model corresponding to an RDF model M is defined by
  - $G_M = (N, E, l_N, l_E)$, $l_N : N \to R \cup L$, $l_E : E \to P$
- using the following construction mechanism: for each $(s, p, o) \in M$
  - add nodes $n_s$, $n_o$ to N (different only if $s \neq o$)
  - assign $l_N(n_s) = s$, $l_N(n_o) = o$
  - add $e_p$ to E as a directed edge between $n_s$ and $n_o$
  - assign $l_E(e_p) = p$
- Observations
  - $l_N(\cdot)$ is an injective partial function
  - $l_E(\cdot)$ is a total function
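The construction mechanism above can be sketched in Python as follows. This is only an illustration under stated assumptions: nodes are integers, edges are (source, target) pairs stored in a list, and every node receives a label, so the node labeling is total here rather than partial. The function name `build_graph` and the sample triples are hypothetical.

```python
def build_graph(model):
    """Build (N, E, l_N, l_E) from a set of (s, p, o) triples."""
    node_of = {}  # label -> node, so that equal labels share a single node
    l_N = {}      # node -> label (a resource or a literal)
    E = []        # edges as (source node, target node); the list index identifies the edge
    l_E = {}      # edge index -> property label

    def node_for(label):
        if label not in node_of:
            node_of[label] = len(node_of)
            l_N[node_of[label]] = label
        return node_of[label]

    for s, p, o in sorted(model):
        n_s, n_o = node_for(s), node_for(o)  # the same node when s == o
        E.append((n_s, n_o))
        l_E[len(E) - 1] = p
    return set(l_N), E, l_N, l_E

# Hypothetical model: two parallel edges with different labels, and one loop.
M = {
    ("ex:r4", "ex:paints", "ex:r5"),
    ("ex:r4", "ex:owns", "ex:r5"),
    ("ex:r5", "ex:sameAs", "ex:r5"),
}
N, E, l_N, l_E = build_graph(M)
```

In this sketch the two edges between `ex:r4` and `ex:r5` coexist because their labels differ, which matches the remark that the RDF graph is neither a plain DLG nor a general multigraph.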
14 Basic Properties
[Figure: basic properties of edges and nodes]
- Two non-blank nodes are equal if they have the same id
- Two blank nodes are equal if they have the same properties and the corresponding property values are equal
15 RDF(S)-Closure
- RDF Model Theory defines the RDF-closure and RDFS-closure of an RDF model M by proposing a set of rules for generating new triples
  - Extensional data: the original model M triples
  - Intensional data: the new triples generated by the RDF(S)-closure
- RAL operators work on extensional and intensional data
- Variants of the operators can be defined to neglect the intensional data (similar to the RQL strict interpretation)
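The split between extensional and intensional data can be illustrated with a small Python sketch. It applies only two RDFS entailment rules (transitivity of rdfs:subClassOf and propagation of rdf:type along rdfs:subClassOf), not the full rule set of the RDF Model Theory; the function `closure` and the sample class names are assumptions made for the example.

```python
SUB, TYPE = "rdfs:subClassOf", "rdf:type"

def closure(model):
    """Close a set of triples under two RDFS rules, iterating to a fixpoint."""
    closed = set(model)
    while True:
        new = set()
        for (a, p, b) in closed:
            if p != SUB:
                continue
            for (x, q, c) in closed:
                if q == SUB and x == b:    # a ⊑ b and b ⊑ c gives a ⊑ c
                    new.add((a, SUB, c))
                if q == TYPE and c == a:   # x : a and a ⊑ b gives x : b
                    new.add((x, TYPE, b))
        if new <= closed:
            return closed
        closed |= new

# Hypothetical extensional data.
M = {
    ("Painter", SUB, "Artist"),
    ("Artist", SUB, "Person"),
    ("r4", TYPE, "Painter"),
}
intensional = closure(M) - M
```

An operator variant that neglects the intensional data would simply evaluate over `M` instead of `closure(M)`.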
16 4. RAL Operators
- All operators have the following form
  - $o[f](x_1, x_2, \ldots, x_n : \text{expression})$
  - where an expression is a collection of nodes and f is a function having as input/output a collection of nodes
- Extraction Operators retrieve the needed information from an RDF graph
- Loop Operators control the repetitive application of certain operators
- Construction Operators build new RDF graphs from the extracted data
18 4.1 Extraction Operators
- Projection
  - $\pi[\text{re\_name}](e : \text{expression})$
  - computes the values of the properties whose name matches the regular expression re_name (over strings) on the input collection given by e
- Example
  - $\pi[\text{(P|p)aints}](r4)$
  - returns the resources painted by r4
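A minimal Python sketch of projection, assuming the model is a set of (subject, predicate, object) string tuples, collections are Python sets, and the regular expression in the example is read as `(P|p)aints` and must match the whole property name. The function `project` and the sample triples are hypothetical.

```python
import re

def project(model, re_name, nodes):
    """pi[re_name](nodes): values of the properties whose name fully matches re_name."""
    pattern = re.compile(re_name)
    return {o for (s, p, o) in model if s in nodes and pattern.fullmatch(p)}

# Hypothetical data: r4 paints two resources, spelled with different capitalisation.
M = {
    ("r4", "paints", "r5"),
    ("r4", "Paints", "r6"),
    ("r4", "name", "Rembrandt"),
    ("r3", "paints", "r7"),
}
painted_by_r4 = project(M, "(P|p)aints", {"r4"})
```

Using a set drops duplicate values; an implementation that needs bag or list semantics would return a list instead.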
20
- Selection
  - $\sigma[\text{condition}](e : \text{expression})$
  - selects input collection nodes fulfilling the given condition
- Example
  - $\sigma[\pi[\text{tname}] = \text{Chiaroscuro}](c)$
  - where c is the collection of input resources r1, r2, r3, and r4, returns the resources representing the painting technique with the name Chiaroscuro
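Selection can be sketched in Python as a filter over the input collection. The sketch assumes that the condition comparing the tname property with Chiaroscuro holds when Chiaroscuro is among the tname values of a node; the helpers `values` and `select` and the sample triples are hypothetical.

```python
def values(model, name, node):
    """Values of the property `name` on a single node."""
    return {o for (s, p, o) in model if s == node and p == name}

def select(condition, nodes):
    """sigma[condition](nodes): keep the nodes for which the condition holds."""
    return {n for n in nodes if condition(n)}

# Hypothetical data for the input collection c = {r1, r2, r3, r4}.
M = {
    ("r1", "tname", "Chiaroscuro"),
    ("r2", "tname", "Sfumato"),
    ("r3", "name", "Leonardo"),
    ("r4", "name", "Rembrandt"),
}
c = {"r1", "r2", "r3", "r4"}
chiaroscuro = select(lambda n: "Chiaroscuro" in values(M, "tname", n), c)
```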
22
- Cartesian Product
  - $(x : \text{expression}) \times (y : \text{expression})$
  - for each element in the Cartesian product of the input collections, a blank node that has all properties of both originating nodes is added to the result
- Example
  - $\sigma[\pi[\text{rdf:type}] = \text{Technique}](c) \times \sigma[\pi[\text{rdf:type}] = \text{Painter}](c)$
  - returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product (the new nodes have both types Technique and Painter)
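The Cartesian product can be sketched in Python as follows, assuming a model is a set of string triples and blank nodes are named `_:b1`, `_:b2`, and so on. The function `cartesian`, the naming scheme, and the sample data are assumptions made for illustration.

```python
from itertools import count, product

def cartesian(model, xs, ys):
    """Return (extended model, blank nodes): one blank node per pair (x, y),
    carrying every property of x and every property of y."""
    fresh = count(1)
    extended, blanks = set(model), []
    for x, y in product(sorted(xs), sorted(ys)):
        b = f"_:b{next(fresh)}"
        extended |= {(b, p, o) for (s, p, o) in model if s in (x, y)}
        blanks.append(b)
    return extended, blanks

# Hypothetical data: one technique and two painters.
M = {
    ("t1", "rdf:type", "Technique"),
    ("t1", "tname", "Chiaroscuro"),
    ("p1", "rdf:type", "Painter"),
    ("p2", "rdf:type", "Painter"),
}
extended, blanks = cartesian(M, {"t1"}, {"p1", "p2"})
types_of_first = {o for (s, p, o) in extended if s == blanks[0] and p == "rdf:type"}
```

Each resulting blank node has both types, as noted in the example on the slide.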
24
- Join
  - $(x : \text{expression}) \bowtie[\text{condition}] (y : \text{expression}) = \sigma[\text{condition}](x \times y)$
  - is a Cartesian product followed by a selection
- Example
  - $(x : \sigma[\pi[\text{rdf:type}] = \text{Technique}](c)) \bowtie[\pi[\text{exemplified\_by}](x) = \pi[\text{paints}](y)] (y : \sigma[\pi[\text{rdf:type}] = \text{Painter}](c))$
  - returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product that satisfies the given condition
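A hedged Python sketch of the join: it evaluates the condition on each originating pair and creates a blank node only for the pairs that satisfy it, which gives the same result as building the full product and selecting afterwards. The sketch assumes the join condition holds when the two property value collections share at least one value; `values`, `join`, and the sample data are hypothetical.

```python
from itertools import count, product

def values(model, name, node):
    """Values of the property `name` on a single node."""
    return {o for (s, p, o) in model if s == node and p == name}

def join(model, xs, ys, condition):
    """Return (extended model, blank nodes) for the pairs satisfying condition(x, y)."""
    fresh = count(1)
    extended, blanks = set(model), []
    for x, y in product(sorted(xs), sorted(ys)):
        if condition(x, y):
            b = f"_:j{next(fresh)}"
            extended |= {(b, p, o) for (s, p, o) in model if s in (x, y)}
            blanks.append(b)
    return extended, blanks

# Hypothetical data: technique t1 is exemplified by a painting made by p1.
M = {
    ("t1", "rdf:type", "Technique"),
    ("t1", "exemplified_by", "pic1"),
    ("p1", "rdf:type", "Painter"),
    ("p1", "paints", "pic1"),
    ("p2", "rdf:type", "Painter"),
    ("p2", "paints", "pic2"),
}
shares_painting = lambda x, y: bool(values(M, "exemplified_by", x) & values(M, "paints", y))
extended, blanks = join(M, {"t1"}, {"p1", "p2"}, shares_painting)
```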
26
- Union, Difference, Intersection
  - $(x : \text{expression})\ \theta\ (y : \text{expression})$, where $\theta \in \{\cup, -, \cap\}$
  - defined as in set theory
- Example
  - $\sigma[\pi[\text{rdf:type}] = \text{Technique}](c) \cup \sigma[\pi[\text{rdf:type}] = \text{Painter}](c)$
  - returns the collection of resources obtained by combining the two collections (these two collections are obtained using two selections)
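With collections represented as Python sets (an assumption of this sketch), the three operators map directly onto the built-in set operators. The helper `select_by_type` and the sample triples are hypothetical.

```python
def select_by_type(model, type_name, nodes):
    """The nodes of the collection that have the given rdf:type."""
    return {n for n in nodes if (n, "rdf:type", type_name) in model}

# Hypothetical data for the input collection c.
M = {
    ("r1", "rdf:type", "Technique"),
    ("r2", "rdf:type", "Technique"),
    ("r3", "rdf:type", "Painter"),
    ("r4", "rdf:type", "Painter"),
}
c = {"r1", "r2", "r3", "r4"}

techniques = select_by_type(M, "Technique", c)
painters = select_by_type(M, "Painter", c)

union = techniques | painters         # all techniques and all painters
difference = techniques - painters    # techniques that are not painters
intersection = techniques & painters  # resources that are both
```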
28 4.2 Loop Operators
- Map
  - $map[f](e : \text{expression})$
  - applies the function f to each element of the input collection; the function results are added to the output collection
- Example
  - $map[\pi[\text{rdfs:subClassOf}]](\text{Painting}, \text{Painter})$
  - computes the parent classes using the property rdfs:subClassOf for the collection consisting of Painting and Painter
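A minimal Python sketch of the map operator, assuming collections are sets and that f takes and returns a collection of nodes. The names `project` and `ral_map` and the class hierarchy are hypothetical.

```python
SUB = "rdfs:subClassOf"

def project(model, name, nodes):
    """Values of the property `name` on the given nodes."""
    return {o for (s, p, o) in model if s in nodes and p == name}

def ral_map(f, nodes):
    """map[f](nodes): apply f to each element and collect all the results."""
    out = set()
    for n in sorted(nodes):
        out |= f({n})  # f receives a one-element collection
    return out

# Hypothetical class hierarchy.
M = {
    ("Painting", SUB, "Artifact"),
    ("Painter", SUB, "Artist"),
    ("Artist", SUB, "Person"),
}
parents = ral_map(lambda e: project(M, SUB, e), {"Painting", "Painter"})
```

Only the direct parent classes are returned; `Person` is not reached because f is applied exactly once per element.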
31
- Kleene Star
  - $*[f](e : \text{expression})$
  - repeats the function f a possibly infinite number of times, starting with the given input collection; at each iteration the results of the function are added to the next function input
- Example
  - $*[\pi[\text{rdfs:subClassOf}]](\text{Painting})$
  - computes the transitive closure of the property rdfs:subClassOf starting from Painting, i.e. Painting and all its superclasses
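The Kleene star can be sketched in Python as a fixpoint iteration. The sketch assumes set-valued collections and applies f only to the nodes discovered in the previous round, which yields the same result as re-applying f to the whole accumulated input when f works element by element (as projection does). On a finite graph the loop stops once no new node appears, even if the graph contains cycles. The names `project` and `kleene_star` and the class hierarchy are hypothetical.

```python
SUB = "rdfs:subClassOf"

def project(model, name, nodes):
    """Values of the property `name` on the given nodes."""
    return {o for (s, p, o) in model if s in nodes and p == name}

def kleene_star(f, nodes):
    """*[f](nodes): the input nodes plus everything reachable by repeating f."""
    result = set(nodes)
    frontier = set(nodes)
    while frontier:
        frontier = f(frontier) - result  # keep only nodes not seen before
        result |= frontier
    return result

# Hypothetical class hierarchy.
M = {
    ("Painting", SUB, "Artifact"),
    ("Artifact", SUB, "Thing"),
    ("Painter", SUB, "Artist"),
}
ancestors = kleene_star(lambda e: project(M, SUB, e), {"Painting"})
```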
33 4.3 Construction Operators
- Create Node
  - $node[\text{type}, \text{id}]()$
  - adds a new node to the graph with the given type and id (id is missing for blank nodes) and returns this node; if a resource is created, an rdf:type edge is added between the resource and the node representing rdfs:Resource
- The Create Node operator assigns a unique (in the resulting RDF graph) internal identifier to each created node
34
- Example
  - $node[\text{Resource}]()$ and $node[\text{Literal}, \text{"Caravagio"}]()$ create a Resource representing a blank node and a Literal representing the string "Caravagio"
35
- Create Edge
  - $edge[\text{name}, \text{subject}](\text{object} : \text{expression})$
  - adds edges between the subject node and each of the nodes in the object collection, and returns the subject node; the label of the edges is given by name, which is the id of a property resource
- The Create Node and Create Edge operators abort if the well-formed RDF(S) graph conditions (e.g. rdf:type cannot refer to a literal, literals cannot have properties, etc.) are not met after construction
36
- Example
  - $edge[\text{name}, node[\text{Resource}]()](node[\text{Literal}, \text{"Caravagio"}]())$ creates an edge labeled with name between the nodes defined in the previous example
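The two construction operators can be sketched together in Python. This is an illustration under several assumptions: the class `Graph`, its attribute names, and the internal identifiers `n1`, `n2`, ... are hypothetical; only the two well-formedness conditions named on the slides are checked; and the checks run before the edge is added rather than after construction.

```python
from itertools import count

class Graph:
    def __init__(self):
        self.triples = set()
        self.kind = {}    # internal id -> "Resource" or "Literal"
        self.value = {}   # internal id -> URI or literal string (absent for blank nodes)
        self._ids = count(1)

    def node(self, kind, ident=None):
        """node[type, id](): create a node and return its unique internal id."""
        if kind not in ("Resource", "Literal"):
            raise ValueError("unknown node type")
        if kind == "Literal" and ident is None:
            raise ValueError("a literal needs a value")
        n = f"n{next(self._ids)}"
        self.kind[n] = kind
        if ident is not None:
            self.value[n] = ident
        if kind == "Resource":
            self.triples.add((n, "rdf:type", "rdfs:Resource"))
        return n

    def edge(self, name, subject, objects):
        """edge[name, subject](objects): add one edge per object, return the subject."""
        if self.kind[subject] == "Literal":
            raise ValueError("literals cannot have properties")
        for o in objects:
            if name == "rdf:type" and self.kind.get(o) == "Literal":
                raise ValueError("rdf:type cannot refer to a literal")
            self.triples.add((subject, name, o))
        return subject

g = Graph()
blank = g.node("Resource")                # blank node: no id given
literal = g.node("Literal", "Caravagio")
returned = g.edge("name", blank, [literal])
```

Raising an exception stands in for the abort behaviour described for ill-formed graphs.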
37 5. Conclusion
- The RAL algebra is developed from a DB perspective and proposes a set of operators similar to their relational algebra counterparts
  - Extraction Operators: Projection, Selection, Cartesian Product, Join, Union, Difference, Intersection
- Similar to the existing semi-structured query languages, RAL considers powerful repetition operators
  - Loop Operators: Map, Kleene Star
- As opposed to present RDF query languages, RAL supports result construction
  - Construction Operators: Create Node, Create Edge
38 Future Work
- Analyze the expressive power of RAL compared to RQL, currently a popular RDF query language (build a translation scheme from RQL to RAL)
- Formally specify the semantics of other RDF query languages in terms of RAL
- Compare the expressive power of different RDF query languages using RAL as the reference language
- Explore equivalence rules for RAL expressions to be used in query optimization
- Develop an RDF query optimization algorithm on RAL