Title: By: Boanerges Aleman-Meza
Paper Review: RAL: An Algebra for Querying RDF
Citation: Flavius Frasincar, Geert-Jan Houben, Richard Vdovjak, Peter Barna. RAL: An Algebra for Querying RDF. World Wide Web: Internet and Web Information Systems, 7(1), pp. 83-109, 2004.
- By Boanerges Aleman-Meza
- CSCI-8380 Advanced Topics in Information Systems
- Computer Science, University of Georgia
- February 19, 2004
RAL: An Algebra for Querying RDF
Alternative version of the paper (9 pages)
Flavius Frasincar, Geert-Jan Houben, Richard Vdovjak, Peter Barna. RAL: An Algebra for Querying RDF. Proceedings of the Third International Conference on Web Information Systems Engineering (WISE'02), Singapore, December 12-14, 2002, pp. 173-181.
2. RAL Goals
- Support the formal specification of RDF query languages
- Provide a reference framework to compare different RDF query languages
- Consider the result construction phase, presently neglected by RDF query languages, which focus only on extraction
- Enable algebraic query optimization
Slide copied from http://wwwis.win.tue.nl/hera/presentations/ral.ppt
RAL
- RAL Data Model: specifies what information is accessible (for RAL operators) in an RDF graph
- Nodes: Resources and Literals
- Edges: Properties
- RAL Operators: define operators working on collections of nodes from the RAL Data Model
- Extraction Operators
- Loop Operators
- Construction Operators
3. RAL Data Model
- R is the set of resources: $R = U \cup B$
- U is the set of URI references: rdf:Property $\in U$
- B is the set of blank nodes
- L is the set of literals; U, B, L are disjoint
- P is the set of properties: $P \subset R$, rdf:type $\in P$
[Figure: diagram of the sets R, L, U, B, and P]
- An RDF model M is a finite set of triples (statements)
- $M \subset R \times U \times (R \cup L)$
- The set of properties of an RDF model M:
- $P_M = \{\, p \mid (s, p, o) \in M \lor (p, \text{rdf:type}, \text{rdf:Property}) \in M \,\}$
- The RDF graph model is similar to a directed labeled graph (DLG)
- It is not a DLG since it allows for multiple edges between two nodes
- It is not a general multigraph because different edges between two nodes cannot share the same label
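The definitions above can be tried out on a toy model. The following is a minimal sketch, assuming triples are plain Python tuples; the `ex:` names and the helper `properties_of` are hypothetical and do not come from the paper.

```python
# A tiny RDF model M as a set of (subject, predicate, object) triples.
# All "ex:" names are hypothetical illustration data.
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"

M = {
    ("ex:r4", "ex:paints", "ex:r5"),
    ("ex:r4", "ex:admires", "ex:r5"),           # second edge between the same two nodes
    ("ex:paints", RDF_TYPE, RDF_PROPERTY),
    ("ex:created_by", RDF_TYPE, RDF_PROPERTY),  # declared but never used as a predicate
}

def properties_of(model):
    """P_M: predicates used in some triple, plus resources typed rdf:Property."""
    used = {p for (_, p, _) in model}
    declared = {s for (s, p, o) in model if p == RDF_TYPE and o == RDF_PROPERTY}
    return used | declared

P_M = properties_of(M)

# Because M is a set, a second edge with the same label between the same
# two nodes is simply the same triple and cannot be added twice.
M_again = M | {("ex:r4", "ex:paints", "ex:r5")}
```

Two differently labeled edges between `ex:r4` and `ex:r5` coexist (so the graph is not a plain DLG), while a repeated label between the same pair collapses into one triple (so it is not a general multigraph).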
- The RDF graph model corresponding to an RDF model M is defined by
- $G_M = (N, E, l_N, l_E)$, with $l_N : N \to R \cup L$ and $l_E : E \to P$
- using the following construction mechanism:
- for each $(s, p, o) \in M$
- add nodes $n_s$, $n_o$ to N (different only if $s \neq o$)
- assign $l_N(n_s) = s$, $l_N(n_o) = o$
- add $e_p$ to E as a directed edge between $n_s$ and $n_o$
- assign $l_E(e_p) = p$
- Observations
- $l_N(\cdot)$ is an injective partial function
- $l_E(\cdot)$ is a total function
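The construction mechanism above can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: node and edge identifiers are integers, the function name `build_graph` is hypothetical, and every node is given a label (so the sketch does not model unlabeled blank nodes, which is where the partiality of l_N would come from).

```python
def build_graph(model):
    """Build G_M = (N, E, l_N, l_E) from a set of (s, p, o) triples."""
    node_of = {}  # label -> node id; one node per distinct label keeps l_N injective
    l_N = {}      # node id -> resource or literal label
    E = []        # directed edges as (source node id, target node id)
    l_E = {}      # edge index -> property label, defined for every edge

    def node_for(label):
        if label not in node_of:
            node_of[label] = len(node_of)
            l_N[node_of[label]] = label
        return node_of[label]

    for s, p, o in sorted(model):
        n_s, n_o = node_for(s), node_for(o)  # the same node when s == o
        E.append((n_s, n_o))
        l_E[len(E) - 1] = p
    return set(l_N), E, l_N, l_E

# Hypothetical example data.
M = {
    ("ex:r4", "ex:paints", "ex:r5"),
    ("ex:r4", "ex:admires", "ex:r5"),
    ("ex:r5", "ex:similar_to", "ex:r5"),
}
N, E, l_N, l_E = build_graph(M)
```

The three triples yield two nodes and three edges: two parallel edges with different labels, and one self-loop where subject and object coincide.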
Basic Properties
[Figure: basic properties of edges and nodes]
- Two non-blank nodes are equal if they have the same id
- Two blank nodes are equal if they have the same properties and the corresponding property values are equal
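The two equality rules can be written down as a short sketch. Assumptions made here and not stated on the slide: blank nodes are marked with a `_:` prefix (as in N-Triples), a blank node is never equal to a non-blank node, and property values are compared by identifier rather than recursively. The data and helper names are hypothetical.

```python
def properties(model, node):
    """Set of (property, value) pairs attached to the given node."""
    return {(p, o) for (s, p, o) in model if s == node}

def is_blank(node):
    # Assumption: blank nodes carry a "_:" prefix.
    return node.startswith("_:")

def nodes_equal(model, a, b):
    if is_blank(a) and is_blank(b):
        # Blank nodes: same properties with equal values.
        return properties(model, a) == properties(model, b)
    if is_blank(a) or is_blank(b):
        return False  # assumption: mixed pairs are never equal
    return a == b     # non-blank nodes: same id

# Hypothetical example data.
M = {
    ("_:b1", "ex:name", "Rembrandt"),
    ("_:b2", "ex:name", "Rembrandt"),
    ("_:b3", "ex:name", "Caravagio"),
    ("ex:r4", "ex:name", "Rembrandt"),
}
```

Under this rule `_:b1` and `_:b2` are treated as the same node purely because their property sets coincide.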
RAL data model
Review of RAL: An Algebra for RDF
- Where's the contribution in the data model?
- What's the value of defining sets U (URI references), B (blank nodes), L (literals), R (resources)?
- Are two blank nodes really equal if they have the same properties and the corresponding property values are equal?
RDF(S)-Closure
- RDF Model Theory defines the RDF-closure and RDFS-closure of an RDF model M by proposing a set of rules for generating new triples
- Extensional data: the original model M triples
- Intensional data: the new triples generated by the RDF(S)-closure
- RAL operators work on extensional and intensional data
- Variants of the operators can be defined to neglect the intensional data (similar to the RQL strict interpretation)
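To make the extensional/intensional split concrete, here is a minimal sketch that applies only two of the RDFS entailment rules (transitivity of rdfs:subClassOf and propagation of rdf:type along rdfs:subClassOf) until no new triple appears. The full RDFS-closure has more rules; the class names and the function `rdfs_closure` are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def rdfs_closure(model):
    """Fixed point of two RDFS rules: subclass transitivity and type propagation."""
    closure = set(model)
    while True:
        new = set()
        for (a, p, b) in closure:
            if p != SUBCLASS:
                continue
            for (c, q, d) in closure:
                if q == SUBCLASS and c == b:
                    new.add((a, SUBCLASS, d))  # a < b and b < d  =>  a < d
                if q == TYPE and d == a:
                    new.add((c, TYPE, b))      # c : a and a < b  =>  c : b
        if new <= closure:
            return closure
        closure |= new

# Extensional data: the original triples (hypothetical).
M = {
    ("ex:Painter", SUBCLASS, "ex:Artist"),
    ("ex:Artist", SUBCLASS, "ex:Person"),
    ("ex:r4", TYPE, "ex:Painter"),
}
closure = rdfs_closure(M)
intensional = closure - M  # triples generated by the rules
```

An operator variant that neglects intensional data would simply be evaluated over `M` instead of `closure`.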
RDF(S) Closure Example
[Figure: RDF(S) closure example with a schema/vocabulary layer and an instances layer, showing rdfs:subClassOf, domain, and range edges and the property assistant_professor_at]
4. RAL Operators
- All operators have the following form:
- o[f](x1, x2, ..., xn : expression)
- where an expression is a collection of nodes and f is a function having as input/output collections of nodes
- Extraction Operators: retrieve the needed information from an RDF graph
- Loop Operators: control the repetitive application of certain operators
- Construction Operators: build new RDF graphs from the extracted data
4.1 Extraction Operators
- Projection
- $\pi$[re_name](e : expression)
- computes the values of the properties with a name given by the regular expression re_name over strings on the input collection given by e
- Example
- $\pi$[(P|p)aints](r4)
- returns the resources painted by r4
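A projection along these lines might be sketched as follows. Assumptions: collections are Python lists, the regular expression must match the whole property name (`re.fullmatch`), and the triples are invented for illustration; only the pattern `(P|p)aints` and the node `r4` come from the slide.

```python
import re

# Hypothetical triples.
M = {
    ("r4", "paints", "r5"),
    ("r4", "Paints", "r6"),
    ("r4", "name", "Rembrandt"),
    ("r1", "paints", "r7"),
}

def project(model, re_name, nodes):
    """pi[re_name](nodes): values of the properties whose name matches re_name."""
    pattern = re.compile(re_name)
    return [
        o
        for n in nodes
        for (s, p, o) in sorted(model)
        if s == n and pattern.fullmatch(p)
    ]

painted_by_r4 = project(M, r"(P|p)aints", ["r4"])
```

Here `painted_by_r4` holds `r5` and `r6`; the value reached through `name` and the painting of `r1` are left out.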
- Selection
- $\sigma$[condition](e : expression)
- selects input collection nodes fulfilling the given condition
- Example
- $\sigma$[$\pi$[tname] = Chiaroscuro](c)
- where c is the collection of input resources r1, r2, r3, and r4, returns the resources representing the painting technique with the name Chiaroscuro
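The selection example can be mimicked with a filter over the input collection. In this sketch the condition "$\pi$[tname] = Chiaroscuro" is read as "Chiaroscuro is among the tname values of the node", which is an interpretation rather than something the slide spells out; the triples and helper names are hypothetical.

```python
# Hypothetical triples for the four input resources.
M = {
    ("r1", "tname", "Chiaroscuro"),
    ("r2", "tname", "Sfumato"),
    ("r3", "name", "The Night Watch"),
    ("r4", "name", "Rembrandt"),
}

def values(model, node, name):
    """Values of the property called name on the given node."""
    return {o for (s, p, o) in model if s == node and p == name}

def select(condition, nodes):
    """sigma[condition](nodes): keep the nodes that fulfil the condition."""
    return [n for n in nodes if condition(n)]

c = ["r1", "r2", "r3", "r4"]
chiaroscuro = select(lambda n: "Chiaroscuro" in values(M, n, "tname"), c)
```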
- Join
- (x : expression) $\bowtie$[condition] (y : expression) $\equiv$ $\sigma$[condition](x $\times$ y)
- is a Cartesian product followed by a selection
- Example
- (x : $\sigma$[$\pi$[rdf:type] = Technique](c)) $\bowtie$[$\pi$[exemplified_by](x) = $\pi$[paints](y)] (y : $\sigma$[$\pi$[rdf:type] = Painter](c))
- returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product that satisfies the given condition
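A join as "Cartesian product followed by a selection" can be sketched as below. Each result is modeled as a pair of a fresh blank-node id and the union of both inputs' properties; the `_:j` naming, the triples, and the helper names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical triples.
M = {
    ("r1", "rdf:type", "Technique"), ("r1", "exemplified_by", "r5"),
    ("r2", "rdf:type", "Technique"), ("r2", "exemplified_by", "r6"),
    ("r4", "rdf:type", "Painter"),   ("r4", "paints", "r5"),
}

def props(model, node):
    return {(p, o) for (s, p, o) in model if s == node}

def values(model, node, name):
    return {o for (s, p, o) in model if s == node and p == name}

def join(model, xs, ys, condition):
    """Cartesian product of xs and ys, keeping the pairs that satisfy condition."""
    out = []
    for i, (x, y) in enumerate(product(xs, ys)):
        if condition(x, y):
            # New blank node carrying all properties of both members of the pair.
            out.append((f"_:j{i}", props(model, x) | props(model, y)))
    return out

techniques = [s for (s, p, o) in sorted(M) if p == "rdf:type" and o == "Technique"]
painters = [s for (s, p, o) in sorted(M) if p == "rdf:type" and o == "Painter"]
result = join(
    M, techniques, painters,
    lambda x, y: values(M, x, "exemplified_by") == values(M, y, "paints"),
)
```

Only the pair (r1, r4) survives, because the painting that exemplifies r1 is the one painted by r4.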
- Union, Difference, Intersection
- (x : expression) $\theta$ (y : expression), where $\theta \in \{\cup, -, \cap\}$
- defined as in set theory
- Example
- $\sigma$[$\pi$[rdf:type] = Technique](c) $\cup$ $\sigma$[$\pi$[rdf:type] = Painter](c)
- returns the collection of resources obtained by combining the two collections (these two collections are obtained using two selections)
4.2 Loop Operators
- Map
- map[f](e : expression)
- applies the function f to each element of the input collection; the function results are added in the output collection
- Example
- map[$\pi$[rdfs:subClassOf]](Painting, Painter)
- computes the parent classes using the property rdfs:subClassOf for the collection consisting of Painting and Painter
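The map operator can be sketched as a loop that concatenates the results of f over the input collection. The parent classes `Artifact` and `Artist` and the helper names are hypothetical; only `Painting`, `Painter`, and rdfs:subClassOf come from the slide.

```python
# Hypothetical class hierarchy.
M = {
    ("Painting", "rdfs:subClassOf", "Artifact"),
    ("Painter", "rdfs:subClassOf", "Artist"),
}

def parents(node):
    """pi[rdfs:subClassOf] applied to a single node."""
    return sorted(o for (s, p, o) in M if s == node and p == "rdfs:subClassOf")

def map_op(f, nodes):
    """map[f](nodes): apply f to each element and collect all results."""
    out = []
    for n in nodes:
        out.extend(f(n))
    return out

parent_classes = map_op(parents, ["Painting", "Painter"])
```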
- Kleene Star
- $\ast$[f](e : expression)
- repeats the function f possibly infinite times starting with the given input collection; at each iteration the results of the function are added to the next function input
- Example
- $\ast$[$\pi$[rdfs:subClassOf]](Painting)
- computes the transitive closure of the property rdfs:subClassOf starting from Painting, i.e. Painting and all its superclasses
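The Kleene star can be sketched as a fixed-point iteration. One assumption differs from the "possibly infinite" wording: this version stops as soon as an iteration contributes no new node, which always happens on a finite graph, even a cyclic one. The superclasses and helper names are hypothetical.

```python
# Hypothetical class hierarchy.
M = {
    ("Painting", "rdfs:subClassOf", "Artifact"),
    ("Artifact", "rdfs:subClassOf", "rdfs:Resource"),
    ("Painter", "rdfs:subClassOf", "Artist"),
}

def parents(node):
    return {o for (s, p, o) in M if s == node and p == "rdfs:subClassOf"}

def kleene_star(f, nodes):
    """*[f](nodes): keep applying f, feeding each round's results into the next."""
    result = set(nodes)
    frontier = set(nodes)
    while frontier:
        # Only nodes not seen before are expanded in the next round.
        frontier = {m for n in frontier for m in f(n)} - result
        result |= frontier
    return result

superclasses = kleene_star(parents, ["Painting"])
```

The result contains `Painting` itself together with all of its superclasses, matching the "Painting and all its superclasses" reading of the example.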
Equivalents in RQL?
- Selection, Projection
- Blank nodes in cartesian product, join
- Union, Difference, Intersection
- Map, Kleene Star
4.3 Construction Operators
- Create Node
- node[type, id]()
- adds a new node to the graph with the given type and id (id is missing for blank nodes) and returns this node; if a resource is created, an rdf:type edge is added between the resource and the node representing rdfs:Resource
- The Create Node operator assigns a unique (in the resulting RDF graph) internal identifier to each created node
- Example
- node[Resource]() and node[Literal, "Caravagio"]() create a Resource representing a blank node and a Literal representing the string "Caravagio"
- Create Edge
- edge[name, subject](object : expression)
- adds edges between the subject node and each of the nodes in the object collection, and returns the subject node; the label of the edges is given by name, which is the id of a property resource
- The Create Node and Create Edge operators abort if the well-formed RDF(S) graph conditions (e.g. rdf:type cannot refer to a literal, literals cannot have properties, etc.) are not met after construction
- Example
- edge[name, node[Resource]()](node[Literal, "Caravagio"]()) creates an edge labeled with name between the nodes defined in the previous example
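The two construction operators can be sketched as methods of a small graph class. Everything below is illustrative: the class `Graph`, the integer internal identifiers, the property name `ex:name`, and the choice to check the well-formedness conditions before adding edges (the slides say the operators abort if the conditions are not met after construction) are assumptions.

```python
import itertools

class Graph:
    def __init__(self):
        self._ids = itertools.count()
        self.nodes = {}     # internal id -> (type, id or None for blank nodes)
        self.edges = set()  # (subject internal id, property name, object internal id)
        # Node standing for rdfs:Resource, the target of the automatic rdf:type edges.
        self.rdfs_resource = next(self._ids)
        self.nodes[self.rdfs_resource] = ("Resource", "rdfs:Resource")

    def node(self, type_, id_=None):
        """node[type, id](): create a node with a unique internal identifier."""
        n = next(self._ids)
        self.nodes[n] = (type_, id_)
        if type_ == "Resource":
            self.edges.add((n, "rdf:type", self.rdfs_resource))
        return n

    def edge(self, name, subject, objects):
        """edge[name, subject](objects): link subject to each object, return subject."""
        if self.nodes[subject][0] == "Literal":
            raise ValueError("literals cannot have properties")
        if name == "rdf:type" and any(self.nodes[o][0] == "Literal" for o in objects):
            raise ValueError("rdf:type cannot refer to a literal")
        for o in objects:
            self.edges.add((subject, name, o))
        return subject

g = Graph()
blank = g.node("Resource")                # node[Resource]()
literal = g.node("Literal", "Caravagio")  # node[Literal, "Caravagio"]()
returned = g.edge("ex:name", blank, [literal])
```

Trying `g.edge("ex:name", literal, [blank])` raises `ValueError`, mirroring the rule that literals cannot have properties.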
RAL optimization options
- Push selections down
- Law 9: commutativity of $\sigma$ with $\times$
- $\sigma$[cond](e1 $\times$ e2) = $\sigma$[cond](e1) $\times$ e2
[Figure: expression tree over (x, y) with the selections rdf:type = Painter and rdf:type = Technique]
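The effect of pushing a selection below a Cartesian product can be checked on a toy example. The sketch assumes the condition inspects only elements of e1 (otherwise the rewrite would not be sound; this precondition is an assumption added here, not a quote from the paper). The data and the call counter are hypothetical.

```python
from itertools import product

# Hypothetical data.
types = {"r1": "Technique", "r2": "Technique", "r4": "Painter"}
e1 = ["r1", "r2", "r4"]
e2 = ["r5", "r6"]
calls = {"n": 0}

def is_painter(x):
    calls["n"] += 1  # count how often the condition is evaluated
    return types.get(x) == "Painter"

# sigma[cond](e1 x e2): filter after building the full product.
calls["n"] = 0
late = [(x, y) for (x, y) in product(e1, e2) if is_painter(x)]
late_calls = calls["n"]

# sigma[cond](e1) x e2: filter e1 first, then build a smaller product.
calls["n"] = 0
early = list(product([x for x in e1 if is_painter(x)], e2))
early_calls = calls["n"]
```

Both plans return the same pairs, but the pushed-down plan evaluates the condition once per element of e1 instead of once per pair.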
RAL optimization options
- Laws 15-19? Will consider ...?
- "sets" of nodes, with their neighboring nodes
- classes/properties hierarchy
- changing focus from datatypes to relations
Conclusions (my review)
- Will the RAL algebra be used?
- Compare with RQL queries
- Compare with relational algebra
- W3C endorsed? Mentioned? Linked to?
- Remaining work
- develop a query optimization algorithm
- compare RAL expressive power with RQL and other query languages
Conclusions (my review)
- (Possibly) misleading treatment/mention of "domain" and "range" restrictions
- RDF data can have zero to many schemas, but does not have to comply with them!
- Why do they give examples using literals?
Comments, Questions