Title: SomeWhere in the Semantic Web
1. SomeWhere in the Semantic Web
- Marie-Christine Rousset
- Joint work with Philippe Adjiman, Philippe Chatalic, François Goasdoué, Laurent Simon
2. The Semantic Web today
- Ontology centered
- Methodologies, formal languages, platforms and standards for building (domain) ontologies
- Very few query languages for searching data on the Web
- Expressivity is favoured over efficiency or even feasibility of machine processing
- OWL Full is undecidable
- OWL Lite is ExpTime-complete
- Cannot scale up to the Web
- A semantic Google does not exist
3. A more data-centric vision of the SW
Focus of this talk
- Semantic Web viewed as a huge semantic and distributed data management system
- SomeWhere
- a peer-to-peer infrastructure
- based on simple personalized ontologies and mappings distributed at Web scale
4. P2P Data Management Systems
- Logical network of peers (≠ physical network)
- each peer is characterized by
- its physical address (IP)
- a description of the stored resources
- its neighbors in the network
- the peers to which it can transmit messages (queries, ...)
- Some structures of logical networks
- fixed (Chord, Hypercube)
- guided by the semantics
- SON, Edutella, Piazza, DRAGO, SomeWhere
5. SomeWhere logical networks
- The topology is not fixed
- Guided by mappings
- A peer
- joins by declaring mappings between its ontology and the ontologies of some peers that it knows
- leaves by removing the mappings with its acquaintances in the network
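The join and leave behaviour described above can be sketched as a small data structure in which the topology is nothing more than the set of declared mappings. This is a minimal illustration, not the SomeWhere implementation: the class name `LogicalNetwork`, its methods, and the choice to treat a mapping as a symmetric link between two peers are all assumptions made for the example.

```python
from collections import defaultdict

class LogicalNetwork:
    """Hypothetical sketch: the neighbours of a peer are induced by its mappings."""

    def __init__(self):
        # peer -> set of (other peer, mapping text); assumed symmetric for simplicity
        self.mappings = defaultdict(set)

    def join(self, peer, mappings):
        """A peer joins by declaring mappings with peers it already knows."""
        for other, mapping in mappings:
            self.mappings[peer].add((other, mapping))
            self.mappings[other].add((peer, mapping))

    def leave(self, peer):
        """A peer leaves by removing the mappings shared with its acquaintances."""
        for other, mapping in self.mappings.pop(peer, set()):
            self.mappings[other].discard((peer, mapping))

    def neighbours(self, peer):
        """Peers to which queries can be transmitted."""
        return {other for other, _ in self.mappings.get(peer, set())}


net = LogicalNetwork()
net.join("P2", [("P1", "P2.Rock is a subclass of P1.Musique")])  # hypothetical mapping
print(net.neighbours("P1"))
net.leave("P2")
print(net.neighbours("P1"))
```

No routing table is maintained separately: removing the mappings is enough to remove the peer from the network.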
6. SomeWhere in a nutshell
- Simple data model based on a propositional language of classes
- for defining ontologies, mappings, and queries
- a sublanguage of OWL DL (W3C)
- Scales up to one thousand peers
- logical network: small world
7. SomeWhere Data Model
[Figure labels: Schema, Data]
8. SomeWhere Data Model
9. Semantics
- Standard logical semantics
- one single domain of interpretation
- a distributed set of formulas interpreted in the same way as if they were not distributed
- Distributed semantics of DDL or DFOL
- looser
- based on distributed interpretations defined w.r.t. a collection of domains of interpretation
- Our assumption
- the objects have a unique URI
- objects stored at different peers and having the same URI are interpreted as being the same
10. Data model example
[Figure: ontologies of peers P1 and P2. Class labels: Musique, Rock, Pop, Classique, Mouv, Français, US, Tchaikovsky. Extensional classes: St_pop, St_Français, St_US, St_Tchaikovsky]
11. Query answering illustration
[Figure: the P1/P2 example of the previous slide, extended with the classes Music, Pop_Rock, Classical, Ru, It and Tchai. Rewritings shown: St_Mouv, St_Français, St_US, St_Pop, St_Pop_Rock, St_Ru, St_Tchai]
12. Query answering in SomeWhere
- Decomposition of queries/recombination of answers
- only atomic queries are transmitted to peers
- a complex query is split into atomic queries
- each solicited peer processes a given atomic query q and incrementally sends back intensional answers for it
- (conjunctions of) extensional classes that are rewritings of q
- intensional answers of different atomic queries resulting from the split of a complex query must be recombined
- intensional answers can combine extensional classes of different peers
- Computation of proper prime implicates in distributed clausal propositional theories
- Ontologies and mappings are encoded as clauses
- Proper prime implicates of the negation of a conjunctive query Q are the negations of the conjunctive rewritings of Q
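The link between rewritings and proper prime implicates can be illustrated on a single machine. The sketch below is a centralized simplification under stated assumptions, not the distributed SomeWhere algorithm: it saturates a clause set by resolution, keeps the prime implicates of the theory extended with the negated atomic query that do not already follow from the theory alone, and reads the ones made only of negated extensional classes as rewritings. The function names, the `~` notation for negation, and the example clauses (including the conjunctive mapping) are hypothetical.

```python
from itertools import combinations

def neg(lit):
    """Negate a literal written as 'A' or '~A'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All non-tautological resolvents of two clauses (frozensets of literals)."""
    out = set()
    for lit in c1:
        if neg(lit) in c2:
            r = (c1 - {lit}) | (c2 - {neg(lit)})
            if not any(neg(x) in r for x in r):
                out.add(frozenset(r))
    return out

def prime_implicates(clauses):
    """Saturate by resolution, keeping only subsumption-minimal clauses."""
    known = {frozenset(c) for c in clauses}
    changed = True
    while changed:
        changed = False
        for c1, c2 in combinations(list(known), 2):
            for r in resolvents(c1, c2):
                if not any(k <= r for k in known):
                    known = {k for k in known if not r <= k}  # drop subsumed clauses
                    known.add(r)
                    changed = True
    return {c for c in known if not any(k < c for k in known)}

def rewritings(theory, query, extensional):
    """Conjunctive rewritings of an atomic query, as sets of extensional classes."""
    base = prime_implicates(theory)
    with_query = prime_implicates(list(theory) + [{neg(query)}])
    # "Proper": not already an implicate of the theory without the negated query.
    proper = [c for c in with_query if not any(b <= c for b in base)]
    return {
        frozenset(l[1:] for l in c)
        for c in proper
        if all(l.startswith("~") and l[1:] in extensional for l in c)
    }

# Hypothetical clausal encoding: "A is a subclass of B" becomes the clause {~A, B}.
theory = [
    {"~Rock", "Musique"},
    {"~Pop", "Musique"},
    {"~St_pop", "Pop"},
    {"~St_Mouv", "~St_US", "Rock"},  # assumed mapping with a conjunction on the left
]
print(rewritings(theory, "Musique", {"St_pop", "St_Mouv", "St_US"}))
```

On this example the query Musique has two rewritings: the extensional class St_pop alone, and the conjunction of St_Mouv and St_US. A conjunctive query would be handled by adding the negation of the whole conjunction as one clause, which this sketch does not do.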
13. Query answering algorithm
- Message-based algorithm
- same algorithm on each peer
- query, answer, and termination messages
- Properties
- soundness
- completeness
- termination (even for cyclic networks)
14. http://www.lri.fr/adjiman/somewhere/
Flash demo of the SomeWhere extension
16. Classes extensions
[Figure labels: Friends, Humor, BenStiller, Comedy]
17. Action, Suspense, Thriller
[Figure labels: Friends, Humor, BenStiller, Comedy]
18. Action, Suspense, Thriller
[Figure labels: Friends, Humor, BenStiller, Comedy]
19. Rewritings of Thriller: evaluation
Local
P3.Thriller, P1.Action, P1.Suspense, P5.Drama
P6.DramaComedy, P2.BruceWillis, P1.Suspense
Integration
20. SomeWhere infrastructure
21. SomeWhere infrastructure
Zoom on one machine
100% Java 1.5; somewhere.jar: 250 KB
22. Scalability experiments
- on randomly generated networks
- 1000 peers
- small-world topology
- close to the topology of the Web
- peers
- ontologies
- random clauses of length 2
- mappings
- random clauses of length 2 or 3
23. Varying topologies
P: probability of redirecting an edge
Model of Watts and Strogatz
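For readers unfamiliar with the Watts and Strogatz model, the construction can be sketched as follows: start from a ring in which every node is linked to its k nearest neighbours, then redirect each edge with probability p. This is a generic illustration of the model and not the generator used in the experiments; the function name, the parameters `n`, `k`, `seed`, and the values in the usage line are assumptions.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Return a set of edges (u, v) of a small-world graph.

    n: number of nodes, k: even number of ring neighbours per node (k < n),
    p: probability of redirecting each edge of the initial ring lattice.
    """
    rng = random.Random(seed)
    lattice = set()
    for i in range(n):
        for j in range(1, k // 2 + 1):
            lattice.add((i, (i + j) % n))
    edges = set(lattice)
    for u, v in sorted(lattice):
        if rng.random() < p:
            # New endpoint: not u itself and not already linked to u.
            candidates = [
                w for w in range(n)
                if w != u and (u, w) not in edges and (w, u) not in edges
            ]
            if candidates:
                edges.remove((u, v))
                edges.add((u, rng.choice(candidates)))
    return edges

# Hypothetical parameters: 1000 peers, 4 ring neighbours, 10% of edges redirected.
graph = watts_strogatz(1000, 4, 0.1, seed=42)
print(len(graph))
```

With p = 0 the result is the regular ring; as p grows the graph moves toward a random graph, and small values of p already produce the short paths typical of small-world networks. The number of edges stays at n * k / 2 for every p.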
24. Scalability results
- Varying parameters
- number of mappings between peers
- complexity of mappings
- ratio of clauses of length 3 (0%, 20%, 100%)
- timeout: 30 s/query
- Depth of query processing
- small depth (less than 7) even on the hard cases
- Time to produce a number of answers
- in 90% of cases, the first answer is produced within 2 seconds
- Easy cases (simple mappings)
- few answers per query (5 on average)
- very fast (less than 0.1 s) to compute all the answers without timeouts
- Hard cases (complex and more mappings per edge)
- around 1000 answers per query (but > 30% of queries not complete: timeouts)
- quite fast to obtain them (less than 20 s)
25. Ongoing work (1)
- Extending the data model to RDF(S)
- W3C recommendation for describing web resources
- Classes and (binary) relations between objects
- each object is identified by a URI
- Propositional encoding
- of the schema
- of the (atomic) queries
- Query answering
- Propositional rewriting of each atom based on the encoding (variables are removed)
- Composition of the rewritings by adding corresponding variables to each rewritten atom
26. RDF data model
- Triple: <resource, property, value>
- Relational: property(resource, value)
- Graphical:
[Figure: RDF graph with nodes http://www.louvre.fr and http://www.paris.fr, edge labels MuseumName, Located and CityName, and literals "Le Louvre" and "Paris"]
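The three views of an RDF statement listed on this slide carry the same information. The short sketch below shows the triple view, the relational view derived from it, and a graph view as an adjacency list. The type `Triple`, the helper names, and the way the figure's labels are paired into triples are assumptions for illustration; the slide text does not state which edge connects which nodes.

```python
from collections import defaultdict
from typing import NamedTuple

class Triple(NamedTuple):
    resource: str
    prop: str
    value: str

def relational(t):
    """Relational notation: property(resource, value)."""
    return f"{t.prop}({t.resource}, {t.value})"

def as_graph(triples):
    """Graph notation: resource node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.resource].append((t.prop, t.value))
    return dict(graph)

# Assumed reading of the figure's labels.
triples = [
    Triple("http://www.louvre.fr", "MuseumName", '"Le Louvre"'),
    Triple("http://www.louvre.fr", "Located", "http://www.paris.fr"),
    Triple("http://www.paris.fr", "CityName", '"Paris"'),
]
for t in triples:
    print(relational(t))
print(as_graph(triples))
```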
27. RDFS
[Figure: RDFS schema graph. Classes: CulturalPlace, Museum, ArcheologyMuseum, ModernMuseum, Work, Artist, City, Literal. Properties: Contains, MadeBy, Located, MuseumName, WorkName, ArtistName, CityName. Is-a edges link subclasses to their superclasses]
28. SomeRDFS data model
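29. Query rewriting
- Propositionalisation of RDFS statements
- Query rewriting using SomeWhere
$C_1^{dom} \Rightarrow C_2^{dom}$, $C_1^{range} \Rightarrow C_2^{range}$, $P_1^{rel} \Rightarrow P_2^{rel}$, $P^{rel} \Rightarrow C^{dom}$, $P^{rel} \Rightarrow C^{range}$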
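The propositionalisation step can be sketched as a function from RDFS schema statements to propositional implications over the variables C^dom, C^range and P^rel. The sketch assumes that the connective in the formulas of this slide is an implication, and that they correspond respectively to class inclusion, property inclusion, domain typing and range typing; the statement tags, the function name `encode`, and the example schema are hypothetical.

```python
def encode(statement):
    """Map one RDFS statement to propositional implications (premise, conclusion).

    Assumed statement shapes:
      ("subclass", C1, C2), ("subproperty", P1, P2),
      ("domain", P, C), ("range", P, C)
    """
    kind, left, right = statement
    if kind == "subclass":
        return [(f"{left}^dom", f"{right}^dom"),
                (f"{left}^range", f"{right}^range")]
    if kind == "subproperty":
        return [(f"{left}^rel", f"{right}^rel")]
    if kind == "domain":
        return [(f"{left}^rel", f"{right}^dom")]
    if kind == "range":
        return [(f"{left}^rel", f"{right}^range")]
    raise ValueError(f"unknown statement kind: {kind}")

# Hypothetical schema built from names that appear on the RDFS slide.
schema = [
    ("subclass", "ArcheologyMuseum", "Museum"),
    ("domain", "Located", "Museum"),
    ("range", "Located", "City"),
]
for statement in schema:
    for premise, conclusion in encode(statement):
        print(f"{premise} => {conclusion}")
```

Each implication is a clause of length 2 once written as a disjunction, which is the form the propositional query rewriting operates on.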
30. Illustration
$Q(X,Y) \equiv P2.Work(X) \wedge P2.refersTo(X,Y)$
31. Illustration
$Q(X,Y) \equiv P2.Work(X) \wedge P2.refersTo(X,Y)$
32. Ongoing work (2)
- Handling inconsistencies
- how to define them?
- unsatisfiability (no model) => inconsistency
- not a necessary condition
- how to check consistency?
- at each join of a new peer
- how to deal with inconsistency?
- correct it or reason with it?
For each class A, there exists a model in which A is non-empty: $S \not\models \neg A$
33. Illustration
[Figure: peers P1 and P2, with classes Article, Theory and Expe]
Path m1: AIPubli is a subclass of Conf.
Path m0 -> m2: AIPubli is a subclass of Journal.
Conf and Journal are disjoint, therefore AIPubli is necessarily empty.
Inconsistencies are caused by mappings.
34. P2P detection of inconsistencies
$\neg AIPubli \vee 2005$, $\neg BDPubli \vee 2005$
$\neg Theory \vee Article$, $\neg Expe \vee Article$
$\neg AIPubli \vee Theory$
$\neg Conf \vee Publi$, $\neg Journal \vee Publi$, $\neg Journal \vee \neg Conf$
- Propagation of m2 ($\neg Theory \vee Journal$): $\neg AIPubli \vee Journal$, ..., $\neg AIPubli$, $\neg AIPubli \vee Conf$. Production of a unit clause: inconsistency. {m1, m2} is a NoGood stored at P3.
- Propagation of m1 ($\neg AIPubli \vee Conf$): $\neg AIPubli \vee Publi$, $\neg AIPubli \vee \neg Journal$, $\neg BDPubli \vee Conf$, $\neg BDPubli \vee Publi$, $\neg BDPubli \vee \neg Journal$. No production of a unit clause: no inconsistency.
35. Distributed storage of the NoGoods
36. P2P well-founded reasoning
- Principle
- avoid the inconsistencies when constructing answers
- Semantics of well-founded answer
- obtained from a consistent subset of formulas
- Algorithm
- for each answer,
- build its set of mapping supports and return the set of NoGoods encountered during the reasoning,
- discard the mapping supports including a NoGood
- return the answers having a non-empty set of mapping supports
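The filtering part of this algorithm can be sketched as follows, assuming the mapping supports of each answer and the NoGoods have already been collected during the reasoning. The function name, the representation of supports and NoGoods as sets of mapping identifiers, and the example data are assumptions for illustration only.

```python
def well_founded_answers(answers, nogoods):
    """Keep the answers that have at least one support containing no NoGood.

    answers: dict mapping an answer to a list of mapping supports (sets of mapping ids)
    nogoods: iterable of sets of mapping ids known to be jointly inconsistent
    """
    nogoods = [frozenset(ng) for ng in nogoods]
    kept = {}
    for answer, supports in answers.items():
        # Discard every support that includes a whole NoGood.
        safe = [s for s in supports if not any(ng <= set(s) for ng in nogoods)]
        if safe:
            kept[answer] = safe
    return kept

# Hypothetical data: {m1, m2} is the NoGood of the illustration.
answers = {
    "answer_1": [{"m0", "m1", "m2"}, {"m3"}],  # one unsafe support, one safe
    "answer_2": [{"m1", "m2"}],                # only derivable through the NoGood
}
print(well_founded_answers(answers, [{"m1", "m2"}]))
```

Here answer_1 is returned with its single safe support and answer_2 is dropped, since every derivation of it relies on the inconsistent pair of mappings.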
37. Perspectives
- Modeling and handling trust in P2P semantic overlay networks
- based on a logical approach
- P2P discovery and composition of smart devices
- based on a semantic description of the functionality, inputs and outputs of devices