ISDSI 2009 June 25th, 2009
Combining Semantic and Multimedia Query Routing
Techniques for Unified Data Retrieval in a PDMS
Claudio Gennaro (1), Federica Mandreoli (2,4), Riccardo
Martoglia (2), Matteo Mordacchini (1), Wilma Penzo (3,4),
and Simona Sassatelli (2)
(1) ISTI PI/CNR, Pisa
(2) DII - Università degli Studi di Modena e Reggio Emilia
(3) DEIS - Università degli Studi di Bologna
(4) IEIIT BO/CNR, Bologna
This work is partially supported by the Italian
co-funded Project NeP4B
  • ICTs over the Web have become a strategic asset
  • Internet-based global market place where
    automatic cooperation and competition are allowed
    and enhanced

NeP4B Project (Networked Peers for Business)
  • The aim: development of an advanced technological
    infrastructure for SMEs to allow them to search
    for partners, exchange data and negotiate without
    limitations and constraints
  • The architecture: inspired by Peer Data
    Management Systems (PDMSs)

The Reference Scenario
  • A peer: a single SME or a mediator
  • It keeps its data in an OWL ontology
  • Multimedia objects → multimedia attributes in the
    ontology (e.g. image)
  • It is queried by exploiting a SPARQL-like query
    language → similarity predicates (FILTER
    function LIKE)

The Reference Scenario (cont.)
  • Peers are connected by means of semantic mappings
    (with scores)
  • Peers collaborate in solving users' queries
  • Queries are formulated on the peer's ontology
  • Answers can come from any peer that is connected
    through a semantic path of mappings

How to effectively and efficiently answer a query?
  • By adopting effective and efficient query routing
  • Flooding is not adequate:
  • Overloads the network (traffic and computational
    costs)
  • Overwhelms the querying peer (irrelevant results)

Combining Semantic and Multimedia Query Routing
  • We leverage our distinct experiences on semantic
    [Mandreoli et al. WIDM 2006, Mandreoli et al.
    WISE 2007] and multimedia [Gennaro et al.
    DBISP2P 2007, Gennaro et al. SAC 2008] query
    routing
  • We propose to combine our approaches in order to
    design an innovative mechanism for unified data
    retrieval
  • Two main aspects characterize our scenario:
  • the semantic heterogeneity of the peers
  • the execution of multimedia predicates
  • We pursue:
  • Effectiveness, by selecting the semantically best
    suited subnetworks
  • Efficiency, by promoting the network zones where
    potentially matching objects are more likely to
    be found, while pruning the others

  • Motivation
  • Query Answering Semantics
  • Query Routing
  • Semantic Query Routing
  • Multimedia Query Routing
  • Combined Query Routing
  • Routing Strategies
  • Experimental Evaluation
  • Conclusions and Future Works

Query Answering Semantics
  • Peer p_i has an ontology O_i = {C_i1, ..., C_im}
  • A semantic mapping: a fuzzy relation M(O_i, O_j) ⊆
    O_i × O_j where each instance (C, C') has a
    membership grade µ(C, C') ∈ [0, 1]
  • Query formula:
  • f ::= <triple_pattern> <filter_pattern>
  • <triple_pattern> ::= triple | <triple_pattern> ∧
    <triple_pattern>
  • <filter_pattern> ::= φ | <filter_pattern> ∧
    <filter_pattern> | <filter_pattern> ∨
    <filter_pattern>
  • φ is a relational (=, <, >, ≤, ≥) or similarity
    (∼t) predicate

Local query execution
  • evaluation of f on a local data instance i:
    s(f, i) ∈ [0, 1]
  • s(f, i) = s(f(φ_1, ..., φ_n), i) =
    sfun_f(s(φ_1, i), ..., s(φ_n, i)), where
    φ_1, ..., φ_n are the predicates in f
  • Boolean semantics for relational predicates
  • non-Boolean semantics for similarity predicates
  • t-norm (resp. t-conorm) for scoring conjunctions
    (resp. disjunctions)
  • Local query answers: Ans(f, p_i) = {(i, s(f, i))}
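
The local scoring rules above can be sketched in Python. This is only an illustration: the slides do not say which t-norm and t-conorm are used, so min and max are assumed here, and the names Pred, And, Or and score are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union

# Hypothetical formula representation; not taken from the slides.
@dataclass
class Pred:
    # Returns a score in [0, 1]: 0 or 1 for relational predicates,
    # a graded value for similarity predicates.
    fn: Callable[[dict], float]

@dataclass
class And:
    left: "Formula"
    right: "Formula"

@dataclass
class Or:
    left: "Formula"
    right: "Formula"

Formula = Union[Pred, And, Or]

def score(f: Formula, instance: dict) -> float:
    """s(f, i): score of formula f on a local data instance i."""
    if isinstance(f, Pred):
        return f.fn(instance)
    if isinstance(f, And):
        # Assumed t-norm for conjunctions: min
        return min(score(f.left, instance), score(f.right, instance))
    # Assumed t-conorm for disjunctions: max
    return max(score(f.left, instance), score(f.right, instance))

# Illustrative predicates and data (made up for this sketch).
cheap = Pred(lambda i: 1.0 if i["price"] < 30 else 0.0)  # relational, Boolean
similar = Pred(lambda i: i["image_similarity"])          # similarity, graded
tile = {"price": 25, "image_similarity": 0.8}
print(score(And(cheap, similar), tile))
```

With a Boolean relational predicate, a conjunction either keeps the similarity score (when the relational part holds) or drops to 0.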

Query Answering Semantics (cont.)
One-step query reformulation (p_i → p_j)
  • according to the mapping M(O_i, O_j)
  • s(f, p_j) = sfun_c(µ(C_1, C_1'), ..., µ(C_n, C_n'))
  • sfun_c is a t-norm

Multi-step query reformulation (p → p_1 → ... → p_m,
along the path P_{p...p_m})
  • f undergoes a chain of reformulations
    f → f_1 → ... → f_m
  • s(f, P_{p...p_m}) = sfun_r(s(f, p_1), s(f_1, p_2),
    ..., s(f_{m-1}, p_m))
  • sfun_r is a t-norm
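
As a small worked sketch of the path score, the per-step reformulation scores can be folded with a t-norm. The slides only state that the combining function is a t-norm, so both min and product are shown as assumed choices; path_score and the step values are hypothetical.

```python
from functools import reduce

def path_score(step_scores, t_norm=min):
    """s(f, P): combine the per-step reformulation scores along a path."""
    return reduce(t_norm, step_scores)

def product(a, b):
    """Product t-norm."""
    return a * b

# Illustrative scores s(f, p_1), s(f_1, p_2), s(f_2, p_3).
steps = [0.9, 0.8, 0.5]
print(path_score(steps))           # min t-norm: the weakest step dominates
print(path_score(steps, product))  # product t-norm: every step lowers the score
```

With either choice the score can only decrease as the path gets longer, which matches the intuition that each reformulation step may lose meaning.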

Query answering semantics
  • f submitted to p
  • P = {p_1, ..., p_m}: set of accessed peers
  • P_{p...p_i}: path used to reformulate f over each
    p_i in P
  • Ans(f, p ∪ P) = Ans(f, p) ∪ Ans(f, P_{p...p_1}) ∪
    ... ∪ Ans(f, P_{p...p_m})
  • where Ans(f, P_{p...p_i}) = (Ans(f, p_i),
    s(f, P_{p...p_i}))

Semantic Query Routing
  • Whenever p_i forwards a query to one of its
    neighbors p_j, the query might follow any of the
    semantic paths originating at p_j, i.e. in p_j^s
    (the subnetwork rooted at p_j)
  • Main idea: introduce a ranking approach for
    query routing which promotes the neighbors of p_i
    whose subnetworks are the most semantically
    related to the query.

  • Preliminaries
  • p_j^s: the set of peers in the subnetwork rooted
    at p_j
  • O_j^s: the set of schemas {O_jk | p_jk ∈ p_j^s}
  • P_{p_i...p_j^s}: the set of paths from p_i to any
    peer in p_j^s
  • A Generalized Semantic Mapping relates each
    concept C in O_i to the set of concepts C^s in
    O_j^s taken from the mappings in P_{p_i...p_j^s},
    according to an aggregated score which expresses
    the semantic similarity between C and C^s.

Semantic Query Routing (cont.)
  • Each peer p maintains a matrix named Semantic
    Routing Index (SRI) containing the membership
    grades given by the generalized semantic mappings
    between itself and its neighborhood Nb(p)
  • SRI[i, j] represents how the j-th concept is
    semantically approximated by the subnetwork
    rooted at the i-th neighbor

SRI-based Query Processing
  • When a peer receives a query formula f, it
    exploits its SRI scores to determine a ranking
    for its neighborhood
  • R_p^sem(f)[p_i] = sfun_c(µ(C_1, C_1^s), ...,
    µ(C_n, C_n^s))
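
The SRI-based ranking can be illustrated as follows. This is a sketch under assumptions: the SRI is modelled as a nested dictionary (neighbor, then concept, then grade), min is assumed as the t-norm, and the peer names, concepts and grades are made up.

```python
def rank_neighbors(sri, query_concepts, t_norm=min):
    """Rank neighbors by aggregating the SRI grades of the query concepts."""
    ranking = {}
    for neighbor, grades in sri.items():
        score = grades[query_concepts[0]]
        for concept in query_concepts[1:]:
            score = t_norm(score, grades[concept])
        ranking[neighbor] = score
    # Best subnetwork first.
    return sorted(ranking.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical SRI of a peer with two neighbors (one row per neighbor).
sri = {
    "Peer2": {"Tile": 0.6, "price": 0.9, "image": 0.7},
    "Peer3": {"Tile": 0.8, "price": 0.8, "image": 0.9},
}
print(rank_neighbors(sri, ["Tile", "price", "image"]))
```

Each row is reduced to one score per neighbor, so the ranking depends only on the concepts the query actually mentions.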

Multimedia Query Routing
  • The execution of multimedia predicates is
    inherently costly (CPU and I/O)
  • Main idea: introduce a ranking approach for
    query routing which promotes the neighbors of p_i
    whose subnetworks contain the highest number of
    potentially matching objects.
  • Preliminaries
  • Each peer's object is classified w.r.t. its
    distance (dissimilarity) to some reference
    objects (pivots)
  • E.g.
  • object O, pivot P, with d(O, P) = 42
  • Bit-vector
  • Each peer maintains a condensed description of
    such a classification of its objects by using
    histograms (Peer Indices)
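
One way to picture the classification step, as a sketch only: the slides do not give the interval layout, so this assumes a single pivot and equal-width distance intervals, with each object setting the bit of the interval that contains its distance from the pivot. The Peer Index is then the histogram obtained by adding up those bit-vectors. The function names and the interval parameters are hypothetical.

```python
def bit_vector(distance, max_distance, n_intervals):
    """One bit per distance interval; the object's interval is set to 1."""
    width = max_distance / n_intervals
    idx = min(int(distance // width), n_intervals - 1)
    return [1 if k == idx else 0 for k in range(n_intervals)]

def peer_index(distances, max_distance, n_intervals):
    """Histogram of a peer's objects: element-wise sum of their bit-vectors."""
    hist = [0] * n_intervals
    for d in distances:
        for k, bit in enumerate(bit_vector(d, max_distance, n_intervals)):
            hist[k] += bit
    return hist

# The slide's example distance d(O, P) = 42, with an assumed range 0..100
# split into 10 intervals.
print(bit_vector(42, 100, 10))
# A peer holding four objects at assumed distances from the pivot.
print(peer_index([42, 45, 7, 99], 100, 10))
```

The histogram keeps only counts per interval, which is what makes it a condensed description of the peer's content.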

Multimedia Query Routing (cont.)
  • Each peer p also maintains a Multimedia Routing
    Index (MRI) containing the aggregated description
    of the resources available in its neighbors
  • Any MRI row MRI(p, p_i^s) is built by summing up
    the Peer Indices in the i-th neighbor's subnetwork

MRI-based Query Processing
  • Similarity-based Range Queries over metrics
  • For each query Q, a vector representation
    QueryIdx(Q) is built by setting to 1 all the
    intervals covered by the requested range
  • When a peer p receives Q, it computes the
    percentage of potentially matching objects in each
    neighbor's subnetwork w.r.t. the total objects in
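
The MRI-based scoring can be sketched as below. Several points are assumptions rather than statements from the slides: a single pivot, equal-width intervals, a query range expressed as an interval [lo, hi] of distances from the pivot, and a percentage computed over the objects of the same subnetwork (the slide text is cut off before naming the denominator). All function names and numbers are hypothetical.

```python
def sum_indices(indices):
    """Build one MRI row by summing Peer Indices element-wise."""
    return [sum(column) for column in zip(*indices)]

def query_idx(lo, hi, max_distance, n_intervals):
    """Set to 1 every interval that overlaps the requested range [lo, hi]."""
    width = max_distance / n_intervals
    return [
        1 if (k * width <= hi and (k + 1) * width > lo) else 0
        for k in range(n_intervals)
    ]

def mm_score(mri_row, q_idx):
    """Fraction of a subnetwork's objects that fall in the query intervals."""
    total = sum(mri_row)
    if total == 0:
        return 0.0
    matching = sum(count for count, bit in zip(mri_row, q_idx) if bit)
    return matching / total

# Two assumed Peer Indices in the same neighbor's subnetwork.
row = sum_indices([
    [1, 0, 0, 0, 2, 0, 0, 0, 0, 1],
    [0, 0, 0, 3, 1, 0, 0, 0, 0, 0],
])
q = query_idx(35, 50, 100, 10)
print(row, q, mm_score(row, q))
```

The score is an upper bound on the true match rate: an object in a selected interval may still fail the actual similarity test, which is why the objects are called potentially matching.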

Combined Query Routing
  • Both the semantic and multimedia scores induce a
    total order
  • They can be combined by means of a meaningful
    aggregate function in order to obtain a global
    ranking

R_p^comb(f) = α R_p^sem(f) ⊕ β R_p^mm(f)
  • We take inspiration from
  • Fagin, Lotem, Naor: Optimal Aggregation
    Algorithms for Middleware. Journal of Computer
    and System Sciences, 66: 47-58, 2003.
  • Optimal aggregation algorithms can only work with
    monotone aggregation functions
  • E.g. min, mean, sum
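
A minimal sketch of the combination step, assuming a weighted sum as the monotone aggregation function (with α = β = 0.5 this is the mean). The per-neighbor scores are made-up values and the function name combine is hypothetical.

```python
def combine(sem, mm, alpha=0.5, beta=0.5):
    """Merge semantic and multimedia scores into one ranking of neighbors."""
    combined = {peer: alpha * sem[peer] + beta * mm[peer] for peer in sem}
    # Best neighbor first.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative scores for two neighbors.
sem = {"Peer2": 0.6, "Peer3": 0.8}
mm = {"Peer2": 0.75, "Peer3": 0.7}
for peer, value in combine(sem, mm):
    print(peer, round(value, 3))
```

A weighted sum with non-negative weights is monotone: raising either input score of a neighbor can never lower its combined score, which is the property the aggregation algorithms cited above rely on.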

Combined Query Routing (cont.)
SELECT ?company
WHERE { ?tile a Tile .
        ?tile company ?company .
        ?tile price ?price .
        ?tile origin ?origin .
        ?tile image ?image .
        FILTER ((?price < 30) && (?origin = "Italy") &&
                LIKE(?image, "myimage.jpg", 0.3)) }
LIMIT 100
  • Final ranking: Peer3-Peer2
  • Peer3 roots the most promising subnetwork!

Routing Strategies
  • The adopted routing strategy determines the set
    of visited peers and induces an order on it
  • When a peer p receives Q, it computes the ranking
    R_p^comb(f) on its neighbors
  • Different routing strategies relying on such
    ranking and having different performance
    priorities are possible
  • Efficiency (Depth First Model) → the best peer in
    one hop!
  • it exploits only the information provided by
    R_p^comb(f)
  • Effectiveness (Global Model) → the best known
    peer
  • it exploits the information provided by
    ∪_{p is visited} R_p^comb(f)
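
The difference between the two models can be illustrated with a small simulation. This is one possible reading of the slide, not the authors' implementation: the Depth First model always moves to the best unvisited neighbor of the current peer (backtracking when there is none), while the Global model picks the best unvisited peer among the rankings of all peers visited so far. The rankings dictionary, peer names and scores are hypothetical.

```python
import heapq

def route(start, rankings, strategy="df", max_peers=10):
    """Return the order in which peers are visited.

    rankings[p] maps each neighbor of p to its combined score at p.
    """
    visited, seen = [start], {start}
    if strategy == "df":
        stack = [start]
        while stack and len(visited) < max_peers:
            current = stack[-1]
            candidates = [(s, n) for n, s in rankings.get(current, {}).items()
                          if n not in seen]
            if not candidates:
                stack.pop()  # dead end: backtrack
                continue
            _, best = max(candidates)  # uses only the current peer's ranking
            seen.add(best)
            visited.append(best)
            stack.append(best)
    else:
        frontier = []  # max-heap over the rankings of all visited peers

        def push(peer):
            for n, s in rankings.get(peer, {}).items():
                heapq.heappush(frontier, (-s, n))

        push(start)
        while frontier and len(visited) < max_peers:
            _, best = heapq.heappop(frontier)
            if best in seen:
                continue
            seen.add(best)
            visited.append(best)
            push(best)
    return visited

rankings = {"A": {"B": 0.9, "C": 0.8}, "B": {"D": 0.3}, "C": {"E": 0.7}}
print(route("A", rankings, "df"))
print(route("A", rankings, "global"))
```

In this toy network the Depth First model follows B down to the weak peer D before returning to C, whereas the Global model reaches C and E first because their scores are better than D's.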

Experimental Evaluation
Experimental setting
  • Simulation environment for SRI and MRI
  • Peers belong to different semantic categories
  • Ontologies consisting of a small number of
  • Multimedia contents: hundreds of images taken
    from the Web, characterized by two MPEG-7 standard
    features (scalable color and edge histogram)
  • Network topology:
  • randomly generated with the BRITE tool
  • with a few dozen nodes
  • Queries: on randomly selected peers
  • Routing strategy: DF search
  • Aggregation function: mean
  • Stopping condition: number of retrieved results

  • Effectiveness evaluation
  • We measure the quality of the results (combined
  • Efficiency evaluation
  • We measure the number of performed hops

Effectiveness Evaluation
  • We measure the semantic quality of the obtained
    results (satisfaction)

Efficiency Evaluation
  • We measure the number of hops performed by the

Conclusions and Future Works
  • We presented a novel approach for processing
    queries effectively and efficiently in a
    distributed and heterogeneous environment, like
    the one of the NeP4B Project
  • We proposed an innovative query routing approach
    which exploits the semantics of the concepts in
    the peers' ontologies and the multimedia contents
    in the peers' repositories
  • We experimentally showed the effectiveness of our
    techniques with a series of exploratory tests
  • (In the future) we will
  • perform new tests on larger and more complex
  • integrate our techniques in a more general
    framework for query routing (including latency,
    costs, etc.)

