Title: P2P Apps
- Chandrasekar Ramachandran and Rahul Malik
- Papers
- 1. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
- 2. Colyseus: A distributed architecture for online multiplayer games
- 3. OverCite: A Distributed, Cooperative CiteSeer
CS525
02/19/2008
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility
- Antony Rowstron (Microsoft Research) and Peter Druschel (Rice University)
Contents
- Introduction
- Background
- An Overview of PAST
- Pastry
- Operations
- Improvements
- Storage Management
- Caching
- Experimental Evaluation
- Setup
- Results
- Conclusions
Introduction - Focus and Common Themes
- Recent Focus
- Decentralized Control
- Self-Organization
- Adaptability/Scalability
- P2P Utility Systems
- Large-Scale
- Common Themes in P2P Systems
- Symmetric Communication
- Nearly-Identical Capabilities
Source1
Background
- Characteristic Features of the Internet
- Geography
- Ownership
- Administration
- Jurisdiction
- Need for Strong Persistence and High Availability
- Obviates
- Physical Transport of Storage Media
- Mirroring
- Sharing of Storage and Bandwidth
Source2
An Overview of PAST
- Any host connected to the Internet can be a PAST node
- Overlay Network
- PAST Node = Access Point for a User
- Operations Exported to Clients
- Insert
- Lookup
- Reclaim
- Terms: NodeId, FileId
- NodeId: 128-bit SHA-1 Hash of the Node's Public Key
Source3
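To make the identifier scheme concrete, here is a minimal Python sketch of how a nodeId and fileId could be derived (helper names are hypothetical; PAST hashes the node's public key, and the file name plus the owner's public key and a random salt, with SHA-1; the 128-bit truncation here is a simplification of how identifiers are compared):

```python
import hashlib
import os

ID_BITS = 128  # PAST compares nodeIds and fileIds on 128 bits

def node_id(public_key: bytes) -> int:
    """nodeId: SHA-1 hash of the node's public key (truncated to 128 bits here)."""
    digest = hashlib.sha1(public_key).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")

def file_id(file_name: str, owner_public_key: bytes, salt: bytes) -> int:
    """fileId: SHA-1 hash of the file name, the owner's public key and a salt."""
    digest = hashlib.sha1(file_name.encode() + owner_public_key + salt).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")

print(hex(file_id("paper.pdf", b"owner-public-key", salt=os.urandom(8))))
```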
Pastry - Overview and Routing Table
- P2P Routing Substrate
- Given a Message and a FileId
- Routes to the Node with NodeId Numerically Closest to the 128 msb of the FileId
- In Fewer than ⌈log_{2^b} N⌉ Steps
- Eventual Delivery Guaranteed
- Routing Table
- ⌈log_{2^b} N⌉ Rows with 2^b − 1 Entries Each
- Each Entry
- A NodeId Sharing the Appropriate Prefix with the Local Node
- Leaf Set and Neighborhood Set
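The prefix-routing step can be sketched as follows; this is a simplification assuming b = 4 (hex digits) and ids as 32-character hex strings, with the leaf-set fall-back omitted:

```python
B = 4                        # Pastry digit size in bits; ids are hex strings here
DIGITS = 128 // B            # 32 hex digits in a 128-bit id

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits the two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(local_id: str, key: str, routing_table):
    """Row r of the table holds nodeIds sharing r digits with local_id; the
    column is the key's next digit.  A miss (None) would fall back to the
    leaf set, which is not shown."""
    r = shared_prefix_len(local_id, key)
    if r == DIGITS:
        return local_id                  # this node is responsible for the key
    return routing_table[r][int(key[r], 16)]

# toy example: the only filled entry routes on the key's first digit
table = [[None] * 16 for _ in range(DIGITS)]
table[0][0xd] = "d" + "0" * 31
print(next_hop("a" + "3" * 31, "d" + "7" * 31, table))
```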
Basic PAST Operations
- Insert
- Store File on the k PAST Nodes with NodeIds Closest to the 128 msb of the FileId
- Balance Storage Utilization
- Uniform Distribution of the Sets of NodeIds and FileIds
- Storage Quota → Debited
- Store Receipts
- Routing via Pastry
- Lookup
- Nodes Respond with Content and the Stored File Certificate
- Data Usually Found Near Client. Why?
- Proximity Metric
- Reclaim
- Reclaim Certificate
- Reclaim Receipt
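A toy sketch of the insert path under these assumptions: quota accounting, certificates, and the circular id space are omitted, and `stores` is a hypothetical map from nodeId to that node's local store.

```python
def k_closest_nodes(fid: int, node_ids, k: int):
    """The k nodes whose nodeIds are numerically closest to the fileId;
    in PAST these hold the file's replicas."""
    return sorted(node_ids, key=lambda n: abs(n - fid))[:k]

def insert(fid: int, contents: bytes, stores: dict, k: int):
    """Place one replica on each of the k closest nodes; the (node, fileId)
    pairs stand in for the signed store receipts returned to the client."""
    receipts = []
    for node in k_closest_nodes(fid, stores, k):
        stores[node][fid] = contents
        receipts.append((node, fid))
    return receipts

# toy network: nodeId -> that node's local store
network = {0x10: {}, 0x42: {}, 0x77: {}, 0xd3: {}}
print(insert(0x40, b"data", network, k=2))   # replicas land on 0x42 and 0x10
```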
PAST - Security
- Smartcards
- Private/Public Key Pair
- Certificates
- Storage Quotas
- Assumptions
- Computationally Infeasible to Break the Cryptographic Functions
- Most Nodes Well Behaved
- Attackers Cannot Compromise the Smartcards
- Features
- Integrity Maintained
- Store Receipts
- Randomized Pastry Routing Scheme
- Routing Information Redundant
Source4
Ingenious?
Storage Management - Overview
- Aims
- High Global Storage Utilization
- Graceful Degradation as Maximum Utilization Is Approached
- Rely on Local Coordination
- Why is Storage Not Always uniform?
- Statistical Variations
- Size Distribution of Files
- Different Storage Capacities
- How much can a node store?
- Capacities Differ by No More than Two Orders of Magnitude
- Compare Advertised Storage Capacity with the Leaf Set
- Use Cheap Hardware (60 GB Average)
- Node Too Large?
- Split It into Multiple Virtual Nodes
Storage Management - Replication
- Replica Diversion
- Purpose? Balance remaining free storage
- Store Success?
- Forward to k-1 nodes
- Store Receipt
- Store Fail?
- Choose a Node B from the Leaf Set (Not Among the k Closest)
- B Stores the Replica; A Keeps a Pointer to B
- Replacement Replicas
- Policies
- Acceptance of Replicas Locally
- Selection of Replica Nodes
- Decisions, Decisions
- File Diversion
- Balance Free Storage in NodeId Space
- Retry Insert Operation
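The replica- and file-diversion policies boil down to a local acceptance test plus a fall-back. A minimal sketch, assuming the thresholds t_pri and t_div from the paper and hypothetical free-space bookkeeping:

```python
T_PRI = 0.1    # threshold used by the k numerically closest nodes
T_DIV = 0.05   # stricter threshold used for diverted replicas

def accepts(file_size: int, free_space: int, diverted: bool) -> bool:
    """PAST's local acceptance test: reject if SD/FN exceeds the threshold."""
    t = T_DIV if diverted else T_PRI
    return free_space > 0 and file_size / free_space <= t

def choose_store(file_size: int, free_space_a: int, leaf_set_free: dict):
    """Replica diversion sketch: node A stores the replica itself if it can;
    otherwise it diverts to a leaf-set node B (not among the k closest) and
    keeps only a pointer to B.  Returning None stands for file diversion,
    i.e. the client retries the insert with a new fileId."""
    if accepts(file_size, free_space_a, diverted=False):
        return "A"
    for node_b, free in leaf_set_free.items():
        if accepts(file_size, free, diverted=True):
            return node_b
    return None

print(choose_store(file_size=10, free_space_a=50,
                   leaf_set_free={"B1": 40, "B2": 900}))
# A rejects (10/50 > 0.1), B1 rejects (10/40 > 0.05), B2 accepts -> "B2"
```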
Storage Management - Maintenance
- Maintenance
- K Copies of Inserted File
- Leaf Set
- Failures?
- Keep-Alive Messages
- Adjustments in Leaf-sets
- Nodes Please Give Me Replicas of All Files!
- Not Possible
- Time-Consuming and Inefficient
- Solutions
- Use Pointers to FileIds
- Assumption
- Total Amount of Storage in the System Never Decreases
Caching
- Goal
- Minimize Client Access Latencies
- Balance Query Load
- Maximize Query Throughput
- Creating Additional Replicas
- Where do you Cache?
- Use unused disk space
- Evict Cached Copies when necessary
- Insert into Cache If
- Size Is Less than a Fraction c of the Node's Storage Capacity
- GreedyDual-Size (GD-S) Eviction Policy
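A minimal sketch of a GreedyDual-Size cache as it might be used here; the cost term is fixed at 1 and the per-node fraction-c admission test is reduced to a single capacity check, both simplifications:

```python
class GreedyDualSizeCache:
    """GreedyDual-Size eviction sketch: each object gets a credit
    H = L + cost/size; the object with the smallest H is evicted and the
    global inflation value L rises to that H, ageing older entries."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.L = 0.0
        self.entries = {}        # file_id -> (H, size)

    def insert(self, file_id, size, cost=1.0):
        if size > self.capacity:          # oversized files are simply not cached
            return
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda f: self.entries[f][0])
            self.L = self.entries[victim][0]
            self.used -= self.entries.pop(victim)[1]
        self.entries[file_id] = (self.L + cost / size, size)
        self.used += size

cache = GreedyDualSizeCache(capacity=100)
cache.insert("f1", size=60)
cache.insert("f2", size=60)     # evicts f1 to make room; L rises to f1's H
print(list(cache.entries))      # ['f2']
```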
Performance Evaluation
- Implemented in Java
- Configured to Run in Single Java VM
- Hardware
- Compaq AlphaServer ES40
- Tru64 UNIX
- 6 GB Main Memory
- Data
- 8 Web Proxy Logs from NLANR
- 4 Million Entries
- 18.7 GB Content
- Institutional File Systems
- 2 Million files
- 167 GB
Results
- A node rejects a replica when SD/FN > t (t_pri for primary replicas, t_div for diverted replicas)
- Storage
- Number of files stored increases with lower t_pri
- Storage utilization drops
- Higher rate of insertion failure
- Number of diverted replicas small at high utilization
- Caching
- Global cache hit ratio decreases as storage utilization increases
- Results shown for t_pri = 0.1 and t_div = 0.05
References
- Images
- bahaiviews.blogspot.com/2006_02_01_archive.html
- http://images.jupiterimages.com/common/detail/21/05/22480521.jpg
- http://www.masternewmedia.org/news/images/p2p_swarming.jpg
- http://www.theage.com.au/news/national/smart-card-back-on-the-agenda/2006/03/26/1143330931688.html
Discussion
- Comparison with CFS and Ivy
- How can external factors, such as globally known information, help in local coordination?
Colyseus: A Distributed Architecture for Online Multiplayer Games
Ashwin Bharambe, Jeffrey Pang, Srini Seshan
ACM/USENIX NSDI 2006
Networked Games Are Rapidly Evolving
[Chart: MMOG growth, from www.mmogchart.com]
Centralized Scheme
Slow-paced games with less interaction between server and client may scale well
- Not true of FPS games (e.g. Quake)
- Demand high interactivity
- Need a single game world
- High outgoing traffic at server
- Common shared state between clients
Game Model
- Game state is a set of objects (player, ammo, monsters, game status) with mutable state
- Each object has a think function that runs its game logic
[Figure: screenshot of Serious Sam annotated with object types]
Distributed Architecture
- Create the replicas
- Discovery of objects
[Figure: objects and their replicas spread across nodes]
Replication
- Each object has a primary copy that resides on exactly one node
- The primary executes the think function for the object
- Replicas are read-only
- Updates are serialized at the primary
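A small sketch of the primary/replica split; the state and update shapes are hypothetical and far simpler than Colyseus's real object interface:

```python
class GameObject:
    """One node holds the primary copy and runs the think function;
    replicas elsewhere are read-only and apply the updates it produces."""

    def __init__(self, state: dict, primary: bool):
        self.state = state
        self.primary = primary

    def think(self):
        assert self.primary, "replicas are read-only"
        self.state["ticks"] = self.state.get("ticks", 0) + 1   # toy game logic
        return dict(self.state)            # update shipped to the replicas

    def apply_update(self, update: dict):
        assert not self.primary
        self.state = update

primary = GameObject({"hp": 100}, primary=True)
replica = GameObject({"hp": 100}, primary=False)
replica.apply_update(primary.think())
print(replica.state)   # {'hp': 100, 'ticks': 1}
```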
Object Location
- Subscription: find objects in the range [x1, x2] × [y1, y2] × [z1, z2]
- Publication: my location is (x, y, z)
- Challenge: overcome the delay between a subscription and the reception of a matching publication
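A sketch of how a rendezvous point might match area-of-interest subscriptions against position publications; the box/point representation is an assumption for illustration:

```python
def matches(subscription, publication) -> bool:
    """A subscription is an axis-aligned box ((x1,x2),(y1,y2),(z1,z2));
    a publication carries an object's position (x, y, z).  The rendezvous
    node delivers the publication to every overlapping subscription."""
    return all(lo <= coord <= hi
               for (lo, hi), coord in zip(subscription, publication))

# a player subscribes to the area it can see and others publish their locations:
area_of_interest = ((0, 10), (0, 10), (0, 5))
print(matches(area_of_interest, (3, 7, 1)))    # True
print(matches(area_of_interest, (30, 7, 1)))   # False
```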
Distributed Hash Tables (DHT)
[Figure: Chord-style ring of nodes 0x00-0xf0 with finger pointers; lookups take O(log n) hops]
Using DHTs for Range Queries
- No cryptographic hashing for the key → identifier mapping
- Example query: 6 ≤ x ≤ 13
- With hashing the keys scatter (key 6 → 0xab, key 7 → 0xd3, key 13 → 0x12); without it the range maps to a contiguous arc of the ring
[Figure: ring of nodes 0x00-0xf0 with the query range highlighted]
Using DHTs for Range Queries
- Nodes in popular regions can be overloaded
- Load imbalance!
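The load imbalance comes from placing keys on the ring without hashing. A toy comparison of hashed versus order-preserving placement, with an 8-bit identifier space and made-up key bounds:

```python
import hashlib

ID_SPACE = 2 ** 8     # toy 8-bit identifier space, like the 0x00..0xf0 ring above

def hashed_id(key: int) -> int:
    """Conventional DHT placement: hash the key onto the ring."""
    return hashlib.sha1(str(key).encode()).digest()[0]   # first byte of SHA-1

def order_preserving_id(key: int, key_min=0, key_max=100) -> int:
    """Map keys to identifiers without hashing so consecutive keys stay adjacent."""
    return (key - key_min) * (ID_SPACE - 1) // (key_max - key_min)

print([hashed_id(k) for k in range(6, 14)])            # scattered around the ring
print([order_preserving_id(k) for k in range(6, 14)])  # one contiguous arc
```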
DHTs with Load Balancing
- Load balancing strategy
- Re-adjust responsibilities
- Range ownerships are skewed!
DHTs with Load Balancing
[Figure: ring with many nodes crowded into the popular region]
- Finger pointers get skewed!
- Each routing hop may not reduce the node space by half!
- ⇒ no log(n) hop guarantee
Ideal Link Structure
[Figure: the same ring, with fingers spaced by node count rather than identifier distance, so each hop still halves the remaining nodes]
- Need to establish links based on node distance
[Figure: values (e.g. v4, v8) mapped to the 4th and 8th nodes]
- If we had the above information
- For finger i
- Estimate the value v for which the 2^i-th node is responsible
Histogram Maintenance
- Measure node density locally
- Gossip about it!
[Figure: nodes on the ring exchanging (range, density) samples on request]
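Given such a (range, node-density) histogram, a node can estimate where the 2^i-th node away lies in value space. A sketch with a made-up histogram:

```python
def value_at_node_distance(histogram, start_value: float, hops: int) -> float:
    """Walk a (range_start, range_end, nodes_per_unit_value) histogram and
    return the value reached after crossing `hops` nodes, starting from
    `start_value`.  This is the value a finger to the 2^i-th node would target."""
    remaining = hops
    for lo, hi, density in histogram:
        if hi <= start_value or density <= 0:
            continue
        lo = max(lo, start_value)
        nodes_in_bucket = (hi - lo) * density
        if remaining <= nodes_in_bucket:
            return lo + remaining / density
        remaining -= nodes_in_bucket
    return histogram[-1][1]      # ran off the end of the sampled value space

# dense region between 40 and 60, sparse elsewhere:
hist = [(0, 40, 0.05), (40, 60, 0.5), (60, 100, 0.05)]
print(value_at_node_distance(hist, start_value=0.0, hops=4))   # e.g. finger i = 2
```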
Load Balancing
[Figure: load histogram across the nodes' ranges]
- Basic idea: leave-join
- Light nodes leave
- Re-join near heavy nodes, splitting the range of the heavier node
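A toy model of one leave-join step; loads stand in for the ranges the nodes own, whereas the real protocol moves range boundaries rather than abstract load numbers:

```python
def leave_join(loads: dict) -> dict:
    """The most lightly loaded node leaves (its small range is absorbed by a
    neighbour, ignored here) and re-joins next to the most heavily loaded
    node, splitting that node's range and load in half."""
    light = min(loads, key=loads.get)
    heavy = max(loads, key=loads.get)
    if loads[heavy] <= 2 * loads[light]:
        return loads                       # already balanced enough
    new = dict(loads)
    new.pop(light)                         # light node gives up its old range
    half = loads[heavy] / 2
    new[heavy] = half                      # heavy node keeps half its range...
    new[light] = half                      # ...the re-joining node takes the rest
    return new

print(leave_join({"a": 1, "b": 3, "c": 20}))   # {'b': 3, 'c': 10.0, 'a': 10.0}
```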
Prefetching
- On-demand object discovery can cause stalls or render an incorrect view
- So, use game physics for prediction
- Predict which areas the player will move to and subscribe to objects from those areas
Proactive Replication
- Standard object discovery and replica instantiation are slow for short-lived objects
- Uses the observation that most objects originate close to their creator
- Piggyback object-creation messages on updates of other objects
Soft State Storage
- Objects need to tailor their publication rate to their speed
- Ammo or health packs don't move much
- Add TTLs to subscriptions and publications
- Both are stored at the rendezvous node(s); stored publications act like triggers for incoming subscriptions
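A minimal soft-state rendezvous sketch with TTLs on both publications and subscriptions; predicates stand in for the range-matching logic and the names are illustrative:

```python
import time

class RendezvousStore:
    """Publications and subscriptions are stored with a TTL; an arriving
    message is matched against the still-live entries of the other kind,
    and both sides expire on their own without explicit teardown."""

    def __init__(self):
        self.subs = []    # (expiry_time, subscriber, predicate)
        self.pubs = []    # (expiry_time, publication)

    def _prune(self):
        now = time.time()
        self.subs = [s for s in self.subs if s[0] > now]
        self.pubs = [p for p in self.pubs if p[0] > now]

    def subscribe(self, subscriber, predicate, ttl):
        self._prune()
        self.subs.append((time.time() + ttl, subscriber, predicate))
        # a new subscription also sees still-live publications
        return [pub for _, pub in self.pubs if predicate(pub)]

    def publish(self, publication, ttl):
        self._prune()
        self.pubs.append((time.time() + ttl, publication))
        return [sub for _, sub, pred in self.subs if pred(publication)]

store = RendezvousStore()
store.subscribe("player1", lambda pos: pos[0] < 10, ttl=5)
print(store.publish((3, 4, 0), ttl=2))   # ['player1']
```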
Experimental Setup
- Emulab-based evaluation
- Synthetic game
- Workload based on Quake III traces
- P2P scenario
- 1 player per server
- Unlimited bandwidth
- Modeled end-to-end latencies
- More results, including a Quake II evaluation, in the paper
Evaluation
[Graph: per-node bandwidth scaling]
View Inconsistency
Discussion
- Bandwidth costs scale well with the number of nodes
- Compared to the single-server model, more feasible for P2P deployment
- However, overall bandwidth costs are 4-5x higher, so there is overhead
- View inconsistency is small and gets repaired quickly
Discussion Questions
- Avenues for cheating
- Nodes can modify objects in local storage
- Nodes can withhold publications
- Nodes can subscribe to regions of the world they should not see
- How scalable is the architecture?
- Feasibility in the real world
OverCite: A Distributed, Cooperative CiteSeer
- Jeremy Stribling, Jinyang Li, Isaac G. Councill, M. Frans Kaashoek, and Robert Morris
Contents
- Introduction
- Characteristics of CiteSeer
- Problems and Possible Solutions
- Structure of OverCite
- Experimental Evaluation
Introduction
- What is CiteSeer?
- Online Repository of Papers
- Crawls, Indexes, Links, Ranks Papers
- Periodically Updates Its Index with Newly Discovered Documents
- Stores Several Metadata Tables to
- Identify Documents
- Filter Out Duplicates
- OverCite
- CiteSeer Like System
- Provides
- Scalable and Load-Balanced Storage
- Automatic Data Management
- Efficient Query Processing
Characteristics of CiteSeer - Problems
- 35 GB Network Traffic Per Day
- 1 TB of Disk Storage
- Significant Human Maintenance
- Coordinating Crawling Activities Across All Sites
- Reducing Inter-Site Communication
- Parallelizing Storage to Minimize Per-Site Burden
- Tolerating Network and Site Failures
- Adding New Resources Difficult
Possible Solutions
- Mentioned Solutions
- Donate Resources
- Run your Own Mirrors
- Partitioning Network
- Use Content Distribution Networks
Structure of OverCite
- 3-Tier, DHT-Backed Design
- Web-based Front End
- Application Server
- DHT Back-End
- Multi-Site Deployment of CiteSeer
- Indexed Keyword Search
- Parallelized Similarly to Cluster-Based Search Engines
Features of Search and Crawl
- Crawling
- Coordinate via DHT
- Searching
- Divide Docs into Partitions, hosts into Groups
- Less Search Work per Host
OverCite - DHT Storage and Partitioning
- Stores Papers for Durability
- Metadata Tables, e.g.
- Document ID → Title, etc.
- Partitioning
- By Document
- Divide the Index into k Partitions
- Each Query
- Sent to k Nodes (One per Partition)
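A toy sketch of partition-by-document with a k-way query fan-out; two partitions and dictionary "inverted indexes" stand in for the real index, which merges ranked results:

```python
K = 2   # number of index partitions

def partition_of(doc_id: int) -> int:
    """Documents are divided among the k index partitions (here: by doc id)."""
    return doc_id % K

def search(query: str, partition_indexes: list) -> list:
    """Fan the query out to k nodes, one per partition, and merge the partial
    results; each host only searches its own slice of the document set."""
    results = []
    for index in partition_indexes:          # one lookup per partition
        results.extend(index.get(query, []))
    return sorted(set(results))

# index four toy documents (ids 0..3) under the term "dht":
parts = [dict() for _ in range(K)]
for doc in range(4):
    parts[partition_of(doc)].setdefault("dht", []).append(doc)
print(search("dht", parts))    # [0, 1, 2, 3]
```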
OverCite - Implementation and Deployment
- Storage: Chord/DHash DHT
- Index/Search Engine
- Web Server: OKWS
- Deployment
- 27 Nodes Across North America
- 9 RON/IRIS Nodes and Private Machines
- 47 Physical Disks, 3 DHash Nodes per Disk
Evaluation and Results
- Clients
- 1 at MIT
- 1,000 Queries from a CiteSeer Trace
- 11,000 Lines of C++ Code
- 9 Web Front-End, 18 Index, and 27 DHT Servers
[Table: system-wide storage overhead]
Thank You