On Explicit Provenance Management in RDFS Graphs

1

On Explicit Provenance Management in RDF/S Graphs
Panagiotis Pediaditis, Giorgos Flouris, Irini Fundulaki, Vassilis Christophides
{pped, fgeo, fundul, christop}@ics.forth.gr
Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
2
Provenance Management in RDF/S
  • Provenance management problem
    • Mostly addressed in the database context
  • We are dealing with why-provenance in RDF/S graphs
    • Why-provenance: identifying the source data that had some
      influence on the existence of the target data
  • Three main characteristics (peculiarities of RDF/S)
    • Triple-based representation: use quadruples to talk about the
      provenance of triples
    • Inference: assign provenance information to implicit data
    • Coherence semantics (in updates): implicit data is a first-class
      citizen and should be retained during change, along with its
      provenance information

3
Characteristic 1: Triple-based Representation
4
RDF Graphs
Define classes:
  Paper rdf:type rdfs:Class
  PaperTAPP rdf:type rdfs:Class
  Person rdf:type rdfs:Class
  Author rdf:type rdfs:Class
Define properties:
  writes rdf:type rdf:Property
  writes rdfs:domain Author
  writes rdfs:range Paper
Instantiate (and define) individuals:
  Paper10 rdf:type PaperTAPP
  Giorgos rdf:type Author
  Giorgos writes Paper10
Define hierarchies:
  PaperTAPP rdfs:subClassOf Paper
  Author rdfs:subClassOf Person
And other stuff
RDF graph: a set of RDF triples
[Figure: RDF graph with nodes Paper, Person, PaperTAPP, Author, Paper10, Giorgos and the property writes]
5
Provenance in RDF Graphs
PUB:  Paper rdf:type rdfs:Class
TAPP: PaperTAPP rdf:type rdfs:Class
PUB:  Person rdf:type rdfs:Class
PUB:  Author rdf:type rdfs:Class
PUB:  writes rdf:type rdf:Property
PUB:  writes rdfs:domain Author
PUB:  writes rdfs:range Paper
TAPP: Paper10 rdf:type PaperTAPP
TAPP: Giorgos rdf:type Author
TAPP: Giorgos writes Paper10
TAPP: PaperTAPP rdfs:subClassOf Paper
PUB:  Author rdfs:subClassOf Person
6
Named Graphs and Provenance
  • Create two named graphs and assign an ID (URI) to
    each
  • Publications graph (URI: PUB)
  • TAPP graph (URI: TAPP)
  • Each named graph corresponds to a different
    source
  • Need some method to associate named graphs with
    triples
  • Triples become quadruples
  • Fourth element is the URI of the named graph
    (origin)

7
Quadruples for Provenance
Paper rdf:type rdfs:Class PUB
PaperTAPP rdf:type rdfs:Class TAPP
Person rdf:type rdfs:Class PUB
Author rdf:type rdfs:Class PUB
writes rdf:type rdf:Property PUB
writes rdfs:domain Author PUB
writes rdfs:range Paper PUB
Paper10 rdf:type PaperTAPP TAPP
Giorgos rdf:type Author TAPP
Giorgos writes Paper10 TAPP
PaperTAPP rdfs:subClassOf Paper TAPP
Author rdfs:subClassOf Person PUB
All quadruples of the form s p o PUB originate from named graph PUB (Publications graph)
All quadruples of the form s p o TAPP originate from named graph TAPP (TAPP graph)
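
The quadruples above can be held as plain 4-tuples. The following is a minimal sketch, not the authors' implementation: the tuple layout and the helper names triples_by_graph and origins are assumptions made for illustration, while the data is copied from the slide.

```python
from collections import defaultdict

# Quadruples (subject, predicate, object, named-graph URI), copied from the slide.
QUADS = [
    ("Paper", "rdf:type", "rdfs:Class", "PUB"),
    ("PaperTAPP", "rdf:type", "rdfs:Class", "TAPP"),
    ("Person", "rdf:type", "rdfs:Class", "PUB"),
    ("Author", "rdf:type", "rdfs:Class", "PUB"),
    ("writes", "rdf:type", "rdf:Property", "PUB"),
    ("writes", "rdfs:domain", "Author", "PUB"),
    ("writes", "rdfs:range", "Paper", "PUB"),
    ("Paper10", "rdf:type", "PaperTAPP", "TAPP"),
    ("Giorgos", "rdf:type", "Author", "TAPP"),
    ("Giorgos", "writes", "Paper10", "TAPP"),
    ("PaperTAPP", "rdfs:subClassOf", "Paper", "TAPP"),
    ("Author", "rdfs:subClassOf", "Person", "PUB"),
]


def triples_by_graph(quads):
    """Group triples by the named graph (fourth element) they originate from."""
    graphs = defaultdict(set)
    for s, p, o, g in quads:
        graphs[g].add((s, p, o))
    return dict(graphs)


def origins(quads, triple):
    """Return the URIs of all named graphs that contain the given triple."""
    return {g for s, p, o, g in quads if (s, p, o) == triple}
```

Because the origin is just a fourth element, a triple may appear with zero, one, or many named graphs, and the same lookup works in every case.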
8
Properties of Named Graphs
  • The named graph URI can be used to refer to the
    named graph
  • Can be used for assignment of metadata: TAPP hasAuthor
    JamesCheney G
  • Granularity of provenance
  • A triple is the smallest bit of information
  • The granularity of provenance achieved by named
    graphs is at the triple level
  • Flexible
  • A named graph can contain 0, 1, or many triples
  • A triple can belong to 0, 1, or many named graphs

9
Characteristic 2: Inference
10
RDF/S Graphs
  • RDF Schema: add-on to RDF
  • RDFS adds inference semantics
    • Transitivity of subclass/subproperty
    • Implicit instantiations
  • Example
    • Giorgos rdf:type Author
    • Author rdfs:subClassOf Person
    • Inference: Giorgos rdf:type Person
  • Inferred knowledge is implicit
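
The example inference can be reproduced with a naive fixpoint computation. This is a sketch under stated assumptions: it covers only subclass transitivity and implicit instantiation (subproperty transitivity is left out), and the function name rdfs_closure is hypothetical.

```python
def rdfs_closure(triples):
    """Naive fixpoint over two RDFS rules: subclass transitivity and type inheritance."""
    closure = set(triples)
    while True:
        sub = {(s, o) for s, p, o in closure if p == "rdfs:subClassOf"}
        new = set()
        for c, d in sub:
            # Transitivity: c subClassOf d and d subClassOf e imply c subClassOf e.
            for d2, e in sub:
                if d == d2:
                    new.add((c, "rdfs:subClassOf", e))
            # Implicit instantiation: x type c and c subClassOf d imply x type d.
            for s, p, o in closure:
                if p == "rdf:type" and o == c:
                    new.add((s, "rdf:type", d))
        if new <= closure:
            return closure
        closure |= new


explicit = {
    ("Giorgos", "rdf:type", "Author"),
    ("Author", "rdfs:subClassOf", "Person"),
}
# Implicit knowledge is whatever the closure adds to the explicit triples.
implicit = rdfs_closure(explicit) - explicit
```

Here implicit holds the single triple Giorgos rdf:type Person, which is never stored but follows from the two explicit triples.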

11
Provenance and Inference
  • Quadruples
    • Giorgos rdf:type Author PUB
    • Author rdfs:subClassOf Person TAPP
    • Giorgos rdf:type Person ???
  • Needs
    • Shared ownership
    • A more sophisticated, compound structure
    • Keeping the connection with the components
  • Composition operator (PT = PUB ? TAPP)
    • Giorgos rdf:type Person PT
    • Ok, but see characteristic 3

12
Characteristic 3: Coherence Semantics (in Updates)
13
Foundational Semantics
  • Foundational viewpoint (pyramid)
  • Knowledge consists of the explicitly represented
    knowledge
  • Only explicit knowledge can be changed
  • Implicit knowledge is affected indirectly,
    through the changes in the explicit knowledge (so
    that the resulting pyramid is stable)
  • Explicit knowledge is more important than
    implicit knowledge

[Figure: pyramid; labels: Supported Knowledge, Implicit Knowledge, Explicit Knowledge, Basic Knowledge]
14
Coherence Semantics
  • Coherence viewpoint (raft)
  • No discrimination between explicit and implicit
    knowledge
  • Both explicit and implicit knowledge can be
    changed
  • Changes should be made coherently in order for
    the resulting knowledge to make sense (so that
    the raft is stable)
  • Explicit and implicit knowledge are of the same
    value


[Figure: raft; label: Knowledge (includes both implicit and explicit knowledge)]
15
Deletes
  • Under coherence semantics
  • Inferred knowledge needs to be made explicit
    (when in danger of being lost)
  • Explicit assignment of shared origin to triples
  • Explicit shared origin assignment
  • Cannot use any composition operator
  • Must be a first-class construct (autonomous)
  • Retain the connection with its constituents
  • A need, but also a useful feature

16
RDF/S Graphsets
  • Graphsets are like named graphs
    • Have IDs (URIs)
    • Used in quadruples
    • Association of triples with graphsets: Giorgos rdf:type Person PT
    • Can be referred to (metadata): PT rdf:type Confidential G
  • Encode origin or shared origin
    • Giorgos rdf:type Person PT
  • URI association (via skolem function)
    • PT is the URI of {PUB, TAPP}
    • PUB is the URI of {PUB}
  • A named graph is a graphset
    • PUB corresponds to {PUB}
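
One way to picture the skolem function is as a deterministic mapping from a set of named graphs to a URI. The sketch below is an illustration only: the "gs:" naming scheme and both function names are assumptions (the slides simply call the graphset of PUB and TAPP "PT"), and it handles only premises whose origins are plain named graphs.

```python
def graphset_uri(named_graphs):
    """Hypothetical skolem function: a deterministic URI for a set of named graphs."""
    members = sorted(set(named_graphs))
    if not members:
        raise ValueError("a graphset needs at least one named graph")
    # A single named graph is its own graphset, so it keeps its URI.
    if len(members) == 1:
        return members[0]
    # Assumed naming scheme; any injective, order-independent scheme would do.
    return "gs:" + "+".join(members)


def infer_type_with_origin(type_quad, subclass_quad):
    """Derive 'x type D' from 'x type C' and 'C subClassOf D', with a shared origin."""
    x, _, c, g1 = type_quad
    c2, _, d, g2 = subclass_quad
    if c != c2:
        raise ValueError("premises do not chain")
    return (x, "rdf:type", d, graphset_uri({g1, g2}))


inferred = infer_type_with_origin(
    ("Giorgos", "rdf:type", "Author", "TAPP"),
    ("Author", "rdfs:subClassOf", "Person", "PUB"),
)
```

The inferred quadruple carries a URI of its own, so it can be stored and referred to like any other quadruple while the URI still identifies its constituent named graphs.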

17
Querying With RDF/S Graphsets
  • Standard queries (original RQL)
    • Give me the Persons: Giorgos
  • Provenance queries (extended RQL)
    • Give me the Persons per PUB
    • Give me the Persons per TAPP, PUB: Giorgos
    • Give me the sources per which Author is a subclass of Person: PUB
    • Give me all the individual sources: TAPP, PUB
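
The queries above can be mimicked in a few lines. This is not RQL and not the authors' query engine. It is a sketch that assumes each origin is stored as a frozenset of named-graph URIs, that the inferred triple has been materialized, and that "per X" means "whose origin is exactly X". All function names are hypothetical.

```python
PT = frozenset({"PUB", "TAPP"})

# A small materialized excerpt of the running example; origins are graphsets.
QUADS = [
    ("Giorgos", "rdf:type", "Author", frozenset({"TAPP"})),
    ("Author", "rdfs:subClassOf", "Person", frozenset({"PUB"})),
    ("Giorgos", "rdf:type", "Person", PT),
]


def instances_of(quads, cls, per=None):
    """Instances of cls; with `per`, only those whose origin is exactly that graphset."""
    per = None if per is None else frozenset(per)
    return {
        s for s, p, o, g in quads
        if p == "rdf:type" and o == cls and (per is None or g == per)
    }


def sources_of(quads, triple):
    """Graphsets per which the given triple holds."""
    return {g for s, p, o, g in quads if (s, p, o) == triple}


def individual_sources(quads):
    """All named graphs mentioned in any graphset."""
    return set().union(*(g for *_, g in quads))
```

With this reading, instances_of(QUADS, "Person") and instances_of(QUADS, "Person", per={"TAPP", "PUB"}) both contain Giorgos, while the same query per PUB alone comes back empty.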

18
Validity and Redundancy Elimination
  • Two invariants for RDF/S graphs
  • Valid (per some validity rules)
  • Redundancy-free (space considerations)
  • The invariants allow optimized execution of
    queries
  • These invariants are imposed during change
  • Improve query speed, but make updates more
    difficult
  • Trade-off between having query overhead or update
    overhead

19
Updating With RDF/S Graphsets
  • Updates supported through an extended version of
    RUL
  • INSERT and DELETE
  • Only for data (class and property instances)
  • Implicit or explicit knowledge
  • Take into account and update graphset
    (provenance) information
  • Main considerations
  • Apply the change (INSERT or DELETE)
  • Respect invariants
  • Non-redundancy (INSERT) and validity (DELETE)
  • Make minimal changes (under coherence viewpoint)
  • No unnecessary loss of information
  • Take into account and preserve graphset
    (provenance) information
  • Applicable upon quadruples

20
Conclusion
  • Objective: assign provenance information to RDF/S
    graphs to capture why-provenance
  • Triple-based representation
  • Turned triples into quadruples and used named
    graphs to record the origin
  • Inference (per RDFS)
  • Composed named graphs
  • Coherence semantics in updates (deletes)
  • Used graphsets for composed named graphs (cannot
    use an operator)
  • Proposed query and update languages for graphsets
  • Based on RQL, RUL
  • Can be used to query/update provenance
    information
  • Provided syntax and semantics, as well as an
    implementation
  • Demo at http://139.91.183.30:3026/RULdemo/named_graph_demo/

21
Thank You
22
EXTRA SLIDES
23
RDF/S Graphset Properties
  • Three types of triples in a graphset
  • Explicitly assigned triples
  • Implicitly assigned triples (from the constituent
    named graphs)
  • Implications of the above (per RDFS)

24
Inserts and Deletes General Process
  • INSERT
    • Validity respected
    • Must verify non-redundancy
    • Process
      • If INSERT is redundant, ignore it
      • Remove all redundant information (after insert)
  • DELETE
    • Must verify validity
    • Non-redundancy respected
    • Issues with inference and the coherence viewpoint
    • Process
      • If DELETE is void, ignore it
      • Make explicit all originally redundant information that will
        be lost otherwise
      • Restore validity by removing property instances if necessary
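
The first two DELETE steps can be sketched as follows. This is a simplified illustration, not the extended RUL semantics. It assumes the deleted triple is explicitly stored and not derivable in any other way, uses only the type-inheritance rule, keeps one origin per triple, and omits validity restoration and redundancy elimination. All names are hypothetical.

```python
def closure_with_origins(explicit):
    """explicit maps triple -> frozenset of named graphs; adds inferred type triples."""
    known = dict(explicit)
    changed = True
    while changed:
        changed = False
        for (x, p1, c), g1 in list(known.items()):
            if p1 != "rdf:type":
                continue
            for (c2, p2, d), g2 in list(known.items()):
                if p2 == "rdfs:subClassOf" and c2 == c:
                    t = (x, "rdf:type", d)
                    if t not in known:
                        # Shared origin: union of the origins of both premises.
                        known[t] = g1 | g2
                        changed = True
    return known


def delete_triple(explicit, triple):
    """Delete an explicit triple while retaining implicit knowledge it supported."""
    before = closure_with_origins(explicit)
    if triple not in before:
        return dict(explicit)  # void DELETE: ignore it
    after = {t: g for t, g in explicit.items() if t != triple}
    still_known = closure_with_origins(after)
    for t, g in before.items():
        # Make explicit what would otherwise be lost, keeping its shared origin.
        if t != triple and t not in still_known:
            after[t] = g
    return after


explicit = {
    ("Giorgos", "rdf:type", "Author"): frozenset({"TAPP"}),
    ("Author", "rdfs:subClassOf", "Person"): frozenset({"PUB"}),
}
result = delete_triple(explicit, ("Giorgos", "rdf:type", "Author"))
```

After the delete, result still states that Giorgos is a Person, now as an explicit triple whose origin is the graphset of PUB and TAPP. This is the case in which a composition operator over the (now missing) premises would no longer be enough.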