1
Survey of Peer-to-Peer Database
  • Ming Zhang, Yang Li

2
Outline
  • Introduction
  • Review of Current PDBMS Research
  • Issues of PDBMS
  • Discussion
  • Existing PDBMS projects: Piazza, Hyperion, PeerDB

3
Introduction
  • Peer-to-Peer Database Management Systems (PDBMS)
  • P2P systems have become popular
  • Most P2P systems lack a data management system
  • Adding database capabilities to P2P systems is
    attractive
  • Mobile users can share the same functionality
  • Research is still at an early stage

4
Review and Discussion of PDBMS
  • P2P System vs. Distributed Database System
  • A Model of PDBMS

5
PDBMS vs. Distributed Database System
  • PDBMS: nodes can join or leave the network at any
    time.
  • Distributed Database System: nodes join or leave
    the network in a controlled manner, e.g., nodes
    are added when needed.

6
PDBMS vs. Distributed Database System
  • PDBMS: there is no global schema.
  • Distributed Database System: nodes are usually
    stable and standardized, and have some knowledge
    of a shared schema.

7
PDBMS vs. Distributed Database System
  • PDBMS: nodes may not contain the complete data,
    and nodes may not be connected.
  • Distributed Database System: each server cluster
    contains a complete set of data.

8
PDBMS vs. Distributed Database System
  • PDBMS: queries must be routed to many nodes in
    order to return an accurate result set.
  • Distributed Database System: a query can be
    routed to a relatively small set of nodes.

9
A Model of PDBMS
  • Requirements
  • Coordinating databases
  • Communicate freely in a decentralized environment
  • Autonomous peers: data, schema, choice of peers

10
A Model of PDBMS
  • Local Relational Model (LRM)
  • Assumptions
  • The set of all data consists of local
    (relational) databases
  • Each database has a set of acquaintances (peers)
  • Peers are fully autonomous in choosing their
    acquaintances
  • Peers may join or leave the network at any time

11
A Model of PDBMS
  • Local Relational Model (LRM)
  • Main Goals
  • Allow for inconsistent databases
  • Support semantic interoperability
  • Drawbacks
  • Needs a protocol to establish acquaintances
    dynamically
  • Metadata management?
  • Needs a query propagation mechanism
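
The slide names the need for a query propagation mechanism but does not give one. The following is a minimal sketch of one possible mechanism, assuming a hop-limited flood over acquaintance links. The `Peer` class, the `propagate` function, the `ttl` parameter, and the sample rows are all illustrative assumptions and are not part of LRM.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer: local rows plus a self-chosen acquaintance list."""
    name: str
    rows: list = field(default_factory=list)
    acquaintances: list = field(default_factory=list)


def propagate(origin, predicate, ttl):
    """Breadth-first flood: apply `predicate` at every peer within `ttl` hops."""
    seen = {origin.name}
    frontier = deque([(origin, 0)])
    results = []
    while frontier:
        peer, hops = frontier.popleft()
        results.extend(row for row in peer.rows if predicate(row))
        if hops == ttl:
            continue  # hop budget spent: do not forward any further
        for neighbour in peer.acquaintances:
            if neighbour.name not in seen:  # skip peers already reached
                seen.add(neighbour.name)
                frontier.append((neighbour, hops + 1))
    return results


# Chain a -> b -> c; each peer holds one made-up row.
c = Peer("c", rows=[{"item": "z", "qty": 9}])
b = Peer("b", rows=[{"item": "y", "qty": 7}], acquaintances=[c])
a = Peer("a", rows=[{"item": "x", "qty": 1}], acquaintances=[b])

nearby = propagate(a, lambda row: row["qty"] > 5, ttl=1)  # reaches a and b only
```

A small `ttl` limits network traffic at the cost of possibly missing data held by distant peers, which is the trade-off the later slides on query routing describe.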

12
A Model of PDBMS
[Figure: Architecture of an LRM node]

13
PDBMS Strengths
  • No single point of failure
  • Data is highly distributed through local caching
  • If a peer fails, queries can still continue
  • Minimal administration
  • Distributed DBs require extensive administration
  • P2P users create their own databases and perform
    their own database manipulation

14
PDBMS Strengths
  • Vast amount of data
  • Users can access and search a very large amount
    of data of many different types
  • Replicated data for fast retrieval
  • Data is replicated over many nodes as queries are
    performed
  • The results can be cached locally
  • Fast access to local copies if the original nodes
    are busy

15
PDBMS Weaknesses
  • Discovery of peers
  • Peer nodes join and leave freely
  • Query routing
  • Queries must be routed to all nodes or to large
    subnets of nodes
  • Consumption of network resources
  • Caused by intensive routing
  • Mobility of users
  • Nodes are assigned dynamic IP addresses, so a
    mapping table must be maintained and consulted
    before queries are routed.
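
The mapping table mentioned in the last bullet can be pictured as a lookup from a stable peer identifier to that peer's current address. The sketch below is only an illustration under that assumption: `PeerDirectory`, `resolve_targets`, and the peer names are hypothetical, and the addresses come from the documentation-only range 192.0.2.0/24.

```python
class PeerDirectory:
    """Hypothetical mapping table: stable peer id -> current network address."""

    def __init__(self):
        self._addresses = {}

    def register(self, peer_id, address):
        # Called whenever a peer joins or comes back with a new dynamic IP.
        self._addresses[peer_id] = address

    def unregister(self, peer_id):
        # Called when a peer leaves; unknown ids are ignored.
        self._addresses.pop(peer_id, None)

    def resolve(self, peer_id):
        # Returns None when the peer is not currently known.
        return self._addresses.get(peer_id)


def resolve_targets(directory, peer_ids):
    """Read the table before routing: split ids into reachable and unknown."""
    reachable, unknown = {}, []
    for peer_id in peer_ids:
        address = directory.resolve(peer_id)
        if address is None:
            unknown.append(peer_id)
        else:
            reachable[peer_id] = address
    return reachable, unknown


directory = PeerDirectory()
directory.register("laptop-1", "192.0.2.10")
directory.register("phone-7", "192.0.2.44")
directory.register("laptop-1", "192.0.2.99")  # same peer, new address
```

Every join, leave, or address change has to update the table, which is part of the maintenance overhead this slide lists as a weakness.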

16
Discussion
  • Key Issues
  • Scalability
  • Scale in a less than linear fashion
  • Availability
  • A has, B caches, C accesses from A and/or B
  • Performance
  • Data Authenticity
  • Distinguish correct from incorrect query results
  • Security
  • Authorize users to access privileged data
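
The availability bullet ("A has, B caches, C accesses from A and/or B") can be made concrete with a small sketch. It assumes that a node serves both its own data and whatever it has cached, and that a requester tries its sources in order. The `Node` class and its methods are illustrative names, not taken from any of the surveyed systems.

```python
class Node:
    """Hypothetical node that serves its own data and cached copies."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})  # records this node originally holds
        self.cache = {}               # copies obtained from other nodes
        self.online = True

    def lookup(self, key):
        """Serve a record from local data or cache; fail if offline."""
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        if key in self.data:
            return self.data[key]
        return self.cache[key]  # raises KeyError if this node has no copy

    def fetch(self, key, sources):
        """Try each source in order and cache the first answer received."""
        for source in sources:
            try:
                value = source.lookup(key)
            except (ConnectionError, KeyError):
                continue  # source is down or has no copy: try the next one
            self.cache[key] = value
            return value
        raise LookupError(f"no available source holds {key!r}")


node_a = Node("A", data={"r1": "record one"})
node_b = Node("B")
node_c = Node("C")

node_b.fetch("r1", [node_a])                    # B now caches A's record
node_a.online = False                           # A leaves the network
answer = node_c.fetch("r1", [node_a, node_b])   # C is served from B's cache
```

The sketch ignores staleness: once A changes the record, B's copy is out of date, which is where the data authenticity issue on the same slide comes in.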

17
Case Study
  • 1. The Piazza Peer Data Management Project
  • 2. The Hyperion Project: From Data Integration to
    Data Coordination
  • 3. PeerDB: A P2P-based System for Distributed
    Data Sharing

18
The Piazza Peer Data Management Project
  • University of Washington
  • Piazza focuses on the problem of sharing
    semantically heterogeneous data in a distributed
    and scalable way.

19
Piazza Cont.
  • Schema Mediation
  • Querying
  • Constructing Mappings

20
Piazza Cont.
  • Schema Mediation

21
Piazza Cont.
  • Querying
  • Unfolding: replaces a subgoal with a set of
    subgoals [1].
  • Rewriting: replaces a set of subgoals with a
    single subgoal [2].
  • Constructing Mappings
  • Schema Matching
  • Precise Mapping

22
The Hyperion Project: From Data Integration to
Data Coordination
  • University of Toronto
  • Hyperion focuses on the specification and
    management of the logical metadata that enables
    data sharing and coordination between
    independent, autonomous peers.

23
Hyperion Cont.
  • Infrastructure
  • Querying
  • Coordination

24
Hyperion Cont.
  • Infrastructure

25
Hyperion Cont.
  • Querying
  • Hyperion assumes each query is defined with
    respect to the schema of a single peer.
  • A local query is executed using only the data in
    the local peer, while a global query uses data in
    other peers.
  • Uses any available mapping expressions and
    mapping tables to translate or rewrite the query.
  • Coordination
  • Uses event-condition-action (ECA) rules to
    coordinate between peers
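
An event-condition-action rule says: when an event occurs and a condition holds, run an action. The sketch below shows that pattern in isolation. It is not Hyperion's rule language or interface: `EcaRule`, `RuleEngine`, the event name, and the booking data are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EcaRule:
    event: str                         # name of the triggering event
    condition: Callable[[dict], bool]  # checked against the event payload
    action: Callable[[dict], None]     # runs only if the condition holds


class RuleEngine:
    """Hypothetical per-peer rule engine."""

    def __init__(self):
        self.rules = []

    def add(self, rule):
        self.rules.append(rule)

    def notify(self, event, payload):
        """Run every rule registered for `event` whose condition holds."""
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(payload):
                rule.action(payload)
                fired += 1
        return fired


# Peer B mirrors bookings that peer A inserts, but only for carrier "XY".
peer_b_bookings = []
engine = RuleEngine()
engine.add(EcaRule(
    event="A.booking_inserted",
    condition=lambda row: row["carrier"] == "XY",
    action=lambda row: peer_b_bookings.append(dict(row)),
))

engine.notify("A.booking_inserted", {"carrier": "XY", "seat": "12A"})
engine.notify("A.booking_inserted", {"carrier": "ZZ", "seat": "3C"})
```

After both notifications only the first booking has been copied to peer B, because the second one fails the rule's condition.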

26
PeerDB: A P2P-based System for Distributed Data
Sharing
  • National University of Singapore
  • PeerDB is a prototype P2P distributed object
    management system whose features include
    full-fledged object management, operation without
    a shared global schema, and mobile agents.

27
PeerDB Cont.
  • Architecture of a PeerDB Node
  • Agent Assisted Query Processing
  • Cache Management

28
PeerDB Cont.
  • Architecture of a PeerDB Node

29
PeerDB Cont.
  • Agent Assisted Query Processing
  • In the first phase, PeerDB applies a relation
    matching strategy to locate potential relations.
  • In the second phase, PeerDB sends the queries
    directly to the nodes containing the selected
    relations.
  • Finally, the answers are returned to the query
    node.
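
The two phases above can be sketched as two functions. This is a rough illustration only: it assumes that relation matching is done by keyword overlap, it does not model the agents themselves, and `match_relations`, `run_query`, the `catalog` layout, and the sample relations are hypothetical rather than PeerDB's actual interfaces.

```python
def match_relations(query_keywords, catalog, min_overlap=1):
    """Phase 1: rank (peer, relation) pairs by keywords shared with the query."""
    wanted = {k.lower() for k in query_keywords}
    scored = []
    for (peer, relation), keywords in catalog.items():
        overlap = len(wanted & {k.lower() for k in keywords})
        if overlap >= min_overlap:
            scored.append((overlap, peer, relation))
    # Best match first; ties are broken by peer and relation name.
    scored.sort(key=lambda s: (-s[0], s[1], s[2]))
    return [(peer, relation) for _, peer, relation in scored]


def run_query(selected, tables, predicate):
    """Phase 2: evaluate the query directly at each selected relation."""
    answers = []
    for peer, relation in selected:
        for row in tables[(peer, relation)]:
            if predicate(row):
                answers.append((peer, row))
    return answers  # returned to the querying node


# Keywords each peer publishes for its relations (made-up data).
catalog = {
    ("p1", "Kinases"): {"protein", "kinase", "sequence"},
    ("p2", "Songs"): {"music", "title"},
    ("p3", "ProteinSeq"): {"protein", "length"},
}
tables = {
    ("p1", "Kinases"): [{"name": "k1", "length": 310}],
    ("p2", "Songs"): [{"name": "s1", "length": 4}],
    ("p3", "ProteinSeq"): [{"name": "q9", "length": 120}],
}

selected = match_relations(["protein", "sequence"], catalog)
answers = run_query(selected, tables, lambda row: row["length"] > 200)
```

Because there is no shared global schema, the first phase can only suggest candidate relations; which of them actually match the user's intent still has to be decided before the second phase runs.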

30
PeerDB Cont.
  • Cache Management
  • Caches answers returned from remote nodes
  • Periodically invalidates cached answers to keep
    them up to date
  • LRU (least recently used) replacement policy
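
The three bullets can be combined in a small cache sketch. It assumes that "periodically invalidates" means dropping entries older than a maximum age; the slide does not say how PeerDB decides this. The `AnswerCache` class and its parameters are illustrative names.

```python
import time
from collections import OrderedDict


class AnswerCache:
    """Hypothetical cache of remote answers with LRU eviction and age-based invalidation."""

    def __init__(self, capacity, max_age, clock=time.monotonic):
        self.capacity = capacity        # maximum number of cached answers
        self.max_age = max_age          # seconds before an answer counts as stale
        self.clock = clock              # injectable so tests can control time
        self._entries = OrderedDict()   # query -> (answer, stored_at), oldest use first

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        self._entries.move_to_end(query)  # mark as most recently used
        return entry[0]

    def put(self, query, answer):
        self._entries[query] = (answer, self.clock())
        self._entries.move_to_end(query)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def invalidate_stale(self):
        """Meant to be called periodically; drops answers older than max_age."""
        now = self.clock()
        stale = [q for q, (_, stored_at) in self._entries.items()
                 if now - stored_at > self.max_age]
        for query in stale:
            del self._entries[query]
        return len(stale)
```

A caller would invoke `invalidate_stale()` from a timer or before serving a cached answer; which of the two fits better depends on how fresh the remote data needs to be.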

31
References
  • [1] J. Ullman. Database and Knowledge-Base
    Systems, volume 2. Addison-Wesley, 1989.
  • [2] A. Halevy. Answering queries using views: a
    survey. VLDB Journal, 10(4), 2001.
  • [3] Wee Siong Ng, Beng Chin Ooi. PeerDB: A
    P2P-based System for Distributed Data Sharing.
    http://www.cs.cornell.edu/courses/cs732/2003sp/papers/Ng2003.pdf

32
References
  • [4] W. Anthony Young. Evaluation of Peer-to-Peer
    Database Solutions.
    http://www.tonyyoung.ca/cs654paper.pdf
  • [5] Albena Roshelova. A Peer-to-Peer Database
    Management System.
    http://eprints.biblio.unitn.it/archive/00000585/01/albena.pdf
  • [6] I. Tatarinov, Z. Ives, J. Madhavan, A. Halevy,
    D. Suciu, N. Dalvi, X. Dong, Y. Kadiyska, G.
    Miklau, and P. Mork. The Piazza Peer Data
    Management Project. SIGMOD Record, ACM, September
    2003.

33
References
  • [7] Z. Ives, A. Halevy, and D. Weld. Integrating
    Network-Bound XML Data. IEEE Data Engineering
    Bulletin, 24(2), 2001.
  • [8] M. Arenas, V. Kantere, A. Kementsietsidis, I.
    Kiringa, R. J. Miller, J. Mylopoulos. The
    Hyperion Project: From Data Integration to Data
    Coordination. University of Toronto, Canada,
    2003.
  • [9] W. Ng, B. Ooi, K. Tan, and A. Zhou. PeerDB: A
    P2P-based System for Distributed Data Sharing.
    International Conference on Data Engineering
    (ICDE), 2003.