Title: Super-Peer-Based Routing and Clustering Strategies for RDF-Based Peer-To-Peer Networks
1. Super-Peer-Based Routing and Clustering Strategies for RDF-Based Peer-To-Peer Networks
Alexander Löser, Technische Universität Berlin, Germany
Wolfgang Nejdl, Martin Wolpers, Wolf Siberski, Christoph Schmitz, Mario Schlosser, Ingo Brunkhorst, Learning Lab Lower Saxony, Hannover/Karlsruhe, Germany
2. Overview
- Introduction to Edutella
- Schema-based P2P systems
- Super-Peer networks
- The HyperCuP topology
- Indexing
- Routing
- Clustering
- Schema Mapping
- Further Work
3. PADLR: Personalized Access to Digital Learning Resources
- Heterogeneous
- Applications
- Repositories
- Platforms
4. Edutella Introduction
- Main goal: achieve interoperability between heterogeneous metadata-driven (e-learning) systems
- Provides metadata only, not the resources
- Resources are fetched via HTTP
- Foundations
- Semantic Web
- Peer-to-Peer
- Federated Databases
- Open source project (http://edutella.jxta.org)
- Uses other OSS: JXTA Platform, Jena, JUnit, Ant
- Uses Xerces, Jetty, ICU4J, XIndice, ...
5. Query Service
- Provides standardized query/retrieval of RDF metadata stored in distributed RDF repositories
- Query Exchange Language (QEL)
- Based on Datalog (allows expression of rules)
- RDF syntax
- For exchange only
- Adapters enable QEL query processing on several backends: file, RDBMS, rule database, ...
6. Schema-Based Peer-to-Peer Networks
- User-definable schemas
- Structured schemas
- Query language
- No central control
- Node autonomy
- Self organization
(system list not complete)
7. Problem and Approach
- Broadcasting all queries to all information sources does not scale
- Problem: how to distribute queries in a scalable fashion?
- Optimal solution: distribute a query only to peers that have results for it
- Approach:
- Use a super-peer network
- Introduce query routing indices
8. Super-Peer Networks
- Observation: peers vary significantly in availability, bandwidth, processing power, etc.
- Create a network backbone from highly available and powerful peers to distribute load better.
9. Super-Peer Topology
- Super-peers are arranged as a HyperCuP (hypercube) topology
- A broadcast among n super-peers needs n-1 messages and at most log2(n) hops
- High connectivity, resilient against node failures
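The message and hop counts on this slide can be checked with a small simulation. The sketch below assumes a complete binary hypercube of 2^d super-peers labeled 0 to 2^d - 1, in which a node that receives the broadcast on dimension i forwards it only on higher dimensions. The function name and the labeling are illustrative assumptions, not the Edutella implementation.

```python
from collections import deque


def hypercup_broadcast(dimensions, origin=0):
    """Simulate a broadcast in a complete hypercube with 2**dimensions nodes.

    Returns (messages_sent, max_hops, nodes_reached).
    """
    hops = {origin: 0}
    messages = 0
    # Each queue entry is (node, dimension the message arrived on).
    # The origin has not received anything, so it forwards on all dimensions.
    queue = deque([(origin, -1)])
    while queue:
        node, arrived_on = queue.popleft()
        # Forward only on dimensions higher than the incoming one,
        # so that no node receives the message twice.
        for dim in range(arrived_on + 1, dimensions):
            neighbor = node ^ (1 << dim)  # flip one bit = cross one dimension
            if neighbor in hops:
                raise RuntimeError("duplicate delivery, forwarding rule broken")
            messages += 1
            hops[neighbor] = hops[node] + 1
            queue.append((neighbor, dim))
    return messages, max(hops.values()), len(hops)


if __name__ == "__main__":
    messages, max_hops, reached = hypercup_broadcast(3)
    print(messages, max_hops, reached)
```

For d = 3 (n = 8 super-peers) the simulation reaches all 8 nodes with 7 messages and a longest path of 3 hops, matching n-1 and log2(n).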
10. Routing Indices
- On joining the network, each peer provides a self-description
- Based on this information, super-peers maintain indexes of the schemas/schema elements used at each peer
- Super-peer/peer indices
- Super-peer/super-peer indices
- Index granularity:
- Schema
- Property
- Property value range
- Property individual values
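One way to picture a super-peer/peer index at several granularities is the sketch below. It assumes a peer's self-description is a mapping from qualified property names (such as `dc:subject`) to the values the peer uses; the class name, method names, peer names, and the value `ccs:ai` are hypothetical. The value-range granularity is left out to keep the sketch short.

```python
from collections import defaultdict


class RoutingIndex:
    """Illustrative super-peer/peer index (not the Edutella data structure)."""

    def __init__(self):
        self.by_schema = defaultdict(set)    # schema prefix -> peers
        self.by_property = defaultdict(set)  # property -> peers
        self.by_value = defaultdict(set)     # (property, value) -> peers

    def register(self, peer, description):
        """Index a joining peer from its self-description.

        description: dict mapping "prefix:property" to an iterable of values.
        """
        for prop, values in description.items():
            schema = prop.split(":", 1)[0]
            self.by_schema[schema].add(peer)
            self.by_property[prop].add(peer)
            for value in values:
                self.by_value[(prop, value)].add(peer)

    def peers_for(self, prop, value=None):
        """Peers using a property, optionally restricted to one value."""
        if value is not None:
            return set(self.by_value.get((prop, value), set()))
        return set(self.by_property.get(prop, set()))


if __name__ == "__main__":
    index = RoutingIndex()
    index.register("peer-a", {"dc:subject": ["ccs:sw-eng"],
                              "lom:context": ["undergrad"]})
    index.register("peer-b", {"dc:subject": ["ccs:ai"]})
    print(index.peers_for("dc:subject", "ccs:sw-eng"))
```

Coarser granularities keep the index small but route more queries to peers without results; finer ones do the opposite.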
11. Index Sample
12. Query Routing Sample
Find any resource with dc:subject = ccs:sw-eng and lom:context = undergrad
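A minimal sketch of how such a query could be matched against a value-level index: the query is forwarded only to peers listed for every constraint. The index layout, the function name `route_query`, and the peer names are assumptions made for illustration.

```python
def route_query(index, constraints):
    """Return the peers that are listed for every (property, value) constraint.

    index: dict mapping (property, value) to a set of peer names.
    constraints: list of (property, value) pairs taken from the query.
    """
    candidate_sets = [index.get(constraint, set()) for constraint in constraints]
    if not candidate_sets:
        return set()
    # A peer can only answer the whole conjunction if it appears
    # in the index entry of each individual constraint.
    return set.intersection(*candidate_sets)


if __name__ == "__main__":
    # Hypothetical index content at a super-peer.
    index = {
        ("dc:subject", "ccs:sw-eng"): {"peer-1", "peer-3"},
        ("lom:context", "undergrad"): {"peer-1", "peer-2"},
    }
    query = [("dc:subject", "ccs:sw-eng"), ("lom:context", "undergrad")]
    print(route_query(index, query))
```

With this hypothetical index, only peer-1 receives the query; peer-2 and peer-3 each match just one constraint and are skipped.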
13. Clustering
- If peers are randomly assigned to super-peers, we often still have to broadcast queries within the super-peer network
- Two approaches:
- Static: super-peer administrators define constraints that peers have to fulfill to be accepted
- Dynamic: based on query statistics, peers are continually reassigned to optimize query distribution
- Work in progress
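The static approach could look roughly like the sketch below. The slides do not say how constraints are expressed, so this assumes each super-peer lists, per property, the values a joining peer must offer; all names and values here are hypothetical.

```python
def accepts(constraints, self_description):
    """True if the peer's self-description satisfies every constraint.

    constraints: dict mapping a property to the set of required values.
    self_description: dict mapping a property to the values the peer uses.
    """
    return all(
        required <= set(self_description.get(prop, ()))
        for prop, required in constraints.items()
    )


def choose_super_peer(super_peers, self_description):
    """Return the first super-peer whose constraints the peer fulfills."""
    for name, constraints in super_peers.items():
        if accepts(constraints, self_description):
            return name
    return None  # no super-peer accepts this peer


if __name__ == "__main__":
    super_peers = {
        "sp-software": {"dc:subject": {"ccs:sw-eng"}},
        "sp-undergrad": {"lom:context": {"undergrad"}},
    }
    peer = {"lom:context": ["undergrad"]}
    print(choose_super_peer(super_peers, peer))
```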
14. Schema Mapping
- Peers may use different schemas to annotate their resources
- Use federated database techniques for mapping
- Super-peers act as mediators
- Mapping rules have to be specified manually
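As a deliberately simple illustration of mediation, the sketch below rewrites the properties of a query using a manually specified rule table before the query is passed on. Real federated-database mappings are richer than one-to-one property renaming; the function name and the `ex:audience` property are hypothetical.

```python
def rewrite_query(constraints, mapping_rules):
    """Translate query properties into a peer's local schema.

    constraints: list of (property, value) pairs in the query's schema.
    mapping_rules: manually written dict from query property to local property.
    Properties without a rule are passed through unchanged.
    """
    return [(mapping_rules.get(prop, prop), value) for prop, value in constraints]


if __name__ == "__main__":
    # Hypothetical rule: the target peer stores lom:context as ex:audience.
    rules = {"lom:context": "ex:audience"}
    query = [("dc:subject", "ccs:sw-eng"), ("lom:context", "undergrad")]
    print(rewrite_query(query, rules))
```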
15. Super-peer/Peer implementation
16. Further Work
- Quantitative evaluation (by simulation)
- Exploration of clustering approaches
- Integration of other mediation techniques
17. The End