Sigcomm 2002, Aug 22, 2002.

1
Replication Strategies in Unstructured
Peer-to-Peer Networks
  • Edith Cohen, AT&T Labs-Research
  • Scott Shenker, ICIR
2
Peer-to-peer Networks
  • Peers are connected by an overlay network.
  • Users cooperate to share files (e.g., music,
    videos, etc.)

3
(Search in) Basic P2P Architectures
  • Centralized: central directory server (Napster).
  • Supports versatile queries and scope, but has
    legal troubles.
  • Decentralized: search is performed by probing
    peers.
  • Structured (DHTs) (Freenet, CAN, Chord, ...):
    location is coupled with topology, and search is
    routed by the query. Offers scope, but only
    exact-match queries and a tightly controlled
    overlay.
  • Unstructured (Gnutella, FastTrack): search is
    blind; probed peers are unrelated to the query.
    Resilient to transient peers and supports
    versatile queries, but with a harsh
    scope/scalability tradeoff.

4
(Replication in) P2P Architectures
  • No proactive replication (Gnutella)
  • Hosts store and serve only what they requested.
  • A copy can be found only by probing a host with a
    copy.
  • Proactive replication of keys (metadata plus
    pointer) for search efficiency (FastTrack, DHTs)
  • Proactive replication of copies for search
    and download efficiency and anonymity (Freenet)

5
Question: how can replication be used to improve
search efficiency in unstructured networks with a
proactive replication mechanism?
6
Search and replication model

Unstructured networks with replication of keys or
copies. Peers probed (in the search and
replication process) are unrelated to the query/item,
so the probe success likelihood cannot be better, on
average, than that of random probes.
  • Search: probe hosts, uniformly at random, until
    the query is satisfied (or the maximum search size
    is exceeded).
  • Replication: each host can store up to r copies
    (or keys, i.e. metadata plus pointer) of items.

Goal: minimize the average search size (number of
probes until the query is satisfied).
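
The search model above can be sketched as a small simulation. This is only an illustration under stated assumptions: probes are drawn with replacement, the hosts numbered below `num_holders` are assumed to hold the item, and the function names, host counts, and seed are hypothetical choices, not part of the talk.

```python
import random


def search_size(num_hosts, num_holders, max_probes, rng):
    """Probe hosts uniformly at random (with replacement) until a holder is hit.

    Hosts 0 .. num_holders-1 are assumed to hold the item. Returns the number
    of probes used, or max_probes if the search gives up (insoluble query).
    """
    for probes in range(1, max_probes + 1):
        if rng.randrange(num_hosts) < num_holders:
            return probes
    return max_probes


def average_search_size(num_hosts, num_holders, max_probes, trials, seed=0):
    """Average the search size over many independent queries for one item."""
    rng = random.Random(seed)
    total = sum(search_size(num_hosts, num_holders, max_probes, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # 10% of hosts hold the item, so the average should be near 1 / 0.1 = 10.
    print(average_search_size(10_000, 1_000, 10_000, 20_000))
    # No host holds the item: every search runs to the maximum search size.
    print(average_search_size(10_000, 0, 50, 100))
```

With a cap far above the typical search size, the average approaches the reciprocal of the fraction of hosts holding the item; with no holders, every search costs exactly the cap.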
7
Search size
  • A query is soluble if there are sufficiently many
    copies of the item.
  • A query is insoluble if the item is rare or
    nonexistent.
  • What is the search size of a query?
  • Insoluble queries: the maximum search size.
  • Soluble queries: the number of probes until an
    answer is found.
  • We look at the Expected Search Size (ESS) of
    each item. The ESS is inversely proportional to
    the fraction of peers with a copy of the item.

8
Search Example
  • [Figure: two example searches, one satisfied after 2 probes and one after 4 probes.]
9
Expected Search Size (ESS)
  • m items with relative query rates
  • $q_1 > q_2 > q_3 > \dots > q_m$, with $\sum_i q_i = 1$
  • Allocation $p_1, p_2, p_3, \dots, p_m$, with $\sum_i p_i = 1$
  • The ith item is allocated a $p_i$ fraction of
    storage (keys placed in a $p_i r$ fraction of hosts).
  • Search size for the ith item is a Geometric r.v. with
    mean $A_i = 1/(r p_i)$.
  • ESS is $\sum_i q_i A_i = \left(\sum_i q_i / p_i\right)/r$
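
The ESS formula above translates directly into a few lines of code. The sketch below is illustrative only; the function names and the sample numbers are assumptions, not taken from the talk.

```python
def mean_search_size(p_i, r):
    """Mean of the geometric search size for one item: A_i = 1 / (r * p_i)."""
    return 1.0 / (r * p_i)


def expected_search_size(q, p, r):
    """ESS = sum_i q_i * A_i = (sum_i q_i / p_i) / r.

    q: relative query rates (sum to 1), p: allocation fractions (sum to 1),
    r: number of copies (or keys) each host can store.
    """
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    return sum(qi * mean_search_size(pi, r) for qi, pi in zip(q, p))


if __name__ == "__main__":
    q = [0.5, 0.3, 0.2]      # hypothetical query rates
    p = [0.4, 0.35, 0.25]    # hypothetical allocation
    print(expected_search_size(q, p, r=2))
```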

10
Uniform and Proportional Replication
  • Two natural strategies
  • Uniform Allocation: $p_i = 1/m$
  • Simple; resources are divided equally.
  • Proportional Allocation: $p_i = q_i$
  • Fair; resources per item are proportional to demand.
  • Reflects current P2P practices.

11
Basic Questions
  • How do Uniform and Proportional allocations
    perform/compare?
  • Which strategy minimizes the Expected Search Size
    (ESS)?
  • Is there a simple protocol that achieves optimal
    replication in decentralized unstructured
    networks?

12
Insoluble queries
  • Search always extends to the maximum allowed
    search size.
  • If we fix the available storage for copies, the
    query rate distribution, and the number of items
    that we wish to be locatable, then:
  • The maximum required search size depends on the
    smallest allocation of an item. Thus,
  • Uniform allocation minimizes this maximum and
    thus the cost induced by insoluble queries.
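
One way to see why the smallest allocation governs the maximum search size is the following sketch. It assumes the geometric search model from the earlier slides and introduces a failure tolerance $\varepsilon$ that is not in the talk.

```latex
% Probability that L random probes all miss item i, which sits on an
% r p_i fraction of hosts:
\[
  \Pr[\text{item } i \text{ not found in } L \text{ probes}] = (1 - r p_i)^L .
\]
% Requiring this to be at most \varepsilon for every locatable item gives
\[
  L \;\ge\; \frac{\ln(1/\varepsilon)}{-\ln\bigl(1 - r \min_i p_i\bigr)}
    \;\approx\; \frac{\ln(1/\varepsilon)}{r \min_i p_i}
  \quad \text{(for small } r \min_i p_i\text{)} .
\]
% Since \sum_i p_i = 1, we have \min_i p_i \le 1/m, with equality only for
% the Uniform allocation, so Uniform minimizes the required L.
```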

What about the cost of soluble queries? The answer
is more surprising.
13
ESS under Uniform and Proportional Allocations
(soluble queries)
  • Lemma: The ESS under either Uniform or
    Proportional allocation is m/r.
  • Independent of query rates (!!!)
  • Same ESS for Proportional and Uniform (!!!)
  • Proof:

Proportional: ESS is $\left(\sum_i q_i / p_i\right)/r = \left(\sum_i q_i / q_i\right)/r = m/r$
Uniform: ESS is $\left(\sum_i q_i / p_i\right)/r = \left(\sum_i m\, q_i\right)/r = (m/r) \sum_i q_i = m/r$
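
The lemma can also be checked numerically. The sketch below is illustrative; the helper names and the sample query rates are assumptions.

```python
import math


def expected_search_size(q, p, r):
    """ESS = (sum_i q_i / p_i) / r."""
    return sum(qi / pi for qi, pi in zip(q, p)) / r


def uniform_allocation(q):
    """p_i = 1/m for every item."""
    m = len(q)
    return [1.0 / m] * m


def proportional_allocation(q):
    """p_i = q_i (normalized defensively in case q does not sum to 1)."""
    total = sum(q)
    return [qi / total for qi in q]


q = [0.6, 0.3, 0.1]   # hypothetical query rates, m = 3
r = 2                 # hypothetical per-host storage

ess_uniform = expected_search_size(q, uniform_allocation(q), r)
ess_proportional = expected_search_size(q, proportional_allocation(q), r)

if __name__ == "__main__":
    # Both values should equal m / r = 1.5 up to floating-point rounding.
    print(ess_uniform, ess_proportional, len(q) / r)
```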
14
Space of Possible Allocations
  • Definition: Allocation $p_1, p_2, p_3, \dots, p_m$ is
    in-between Uniform and Proportional if,
    for $1 \le i < m$,
    $q_{i+1}/q_i < p_{i+1}/p_i < 1$
  • Theorem 1: All (strictly) in-between strategies
    are (strictly) better than Uniform and
    Proportional.

Theorem 2: p is worse than Uniform/Proportional if
for all i, $p_{i+1}/p_i > 1$ (more popular gets
less) OR for all i, $q_{i+1}/q_i > p_{i+1}/p_i$ (less
popular gets less than fair share).
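
A small numeric illustration of the two theorems. It assumes a hypothetical family of allocations $p_i \propto q_i^{a}$ (any exponent strictly between 0 and 1 is in-between); the function names and sample rates are also assumptions made for this sketch.

```python
def ess_times_r(q, p):
    """sum_i q_i / p_i, i.e. the ESS multiplied by r."""
    return sum(qi / pi for qi, pi in zip(q, p))


def is_strictly_in_between(q, p):
    """Check q_{i+1}/q_i < p_{i+1}/p_i < 1 for every consecutive pair."""
    return all(q[i + 1] / q[i] < p[i + 1] / p[i] < 1
               for i in range(len(q) - 1))


def power_allocation(q, exponent):
    """Hypothetical family: p_i proportional to q_i ** exponent."""
    weights = [qi ** exponent for qi in q]
    total = sum(weights)
    return [w / total for w in weights]


q = [0.6, 0.3, 0.1]                    # sorted by decreasing popularity
in_between = power_allocation(q, 0.75)
reversed_alloc = [0.1, 0.3, 0.6]       # more popular items get less

if __name__ == "__main__":
    m = len(q)
    print(is_strictly_in_between(q, in_between), ess_times_r(q, in_between) < m)
    print(is_strictly_in_between(q, reversed_alloc), ess_times_r(q, reversed_alloc) > m)
```

Uniform and Proportional both give `ess_times_r` equal to m, so values below m are better and values above m are worse.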
15
Space of allocations on 2 items
[Figure: the space of allocations on 2 items, with labels Uniform, Proportional, p2/p1, and q2/q1.]
16
So, what is the best strategy for soluble queries?
17
Square-Root Allocation
  • $p_i$ is proportional to $\sqrt{q_i}$
  • Lies in-between Uniform and Proportional
  • Theorem: Square-Root allocation minimizes the ESS
    (on soluble queries)
  • Minimize $\sum_i q_i / p_i$ such that $\sum_i p_i = 1$
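
The minimization above can be worked out with a Lagrange multiplier. This is a standard derivation sketched here for convenience; the multiplier $\lambda$ is notation introduced for this sketch.

```latex
\[
  \mathcal{L}(p, \lambda) = \sum_i \frac{q_i}{p_i}
      + \lambda \Bigl( \sum_i p_i - 1 \Bigr),
  \qquad
  \frac{\partial \mathcal{L}}{\partial p_i}
      = -\frac{q_i}{p_i^2} + \lambda = 0
  \;\Longrightarrow\;
  p_i = \sqrt{q_i / \lambda} \;\propto\; \sqrt{q_i} .
\]
% Normalizing so that \sum_i p_i = 1:
\[
  p_i = \frac{\sqrt{q_i}}{\sum_j \sqrt{q_j}},
  \qquad
  \sum_i \frac{q_i}{p_i} = \Bigl( \sum_i \sqrt{q_i} \Bigr)^2,
  \qquad
  \mathrm{ESS} = \frac{1}{r} \Bigl( \sum_i \sqrt{q_i} \Bigr)^2 .
\]
% Each term q_i / p_i is convex in p_i > 0, so this stationary point is the
% minimum.
```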

18
How much can we gain by using SR?
Zipf-like query rates
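
The gain can be estimated numerically under an assumed Zipf-like form $q_i \propto 1/i^w$. The sketch below compares the ESS of Uniform (equivalently Proportional) with that of SR; the function names and the sample values of m and w are illustrative assumptions, and no numbers from the original plot are reproduced here.

```python
import math


def zipf_like_rates(m, w):
    """Query rates q_i proportional to 1 / i**w, normalized to sum to 1."""
    weights = [1.0 / (i ** w) for i in range(1, m + 1)]
    total = sum(weights)
    return [x / total for x in weights]


def sr_gain(q):
    """ESS(Uniform) / ESS(SR) = m / (sum_i sqrt(q_i))**2.

    Uniform and Proportional both have ESS m/r, while Square-Root has
    ESS (sum_i sqrt(q_i))**2 / r, so r cancels in the ratio.
    """
    return len(q) / sum(math.sqrt(qi) for qi in q) ** 2


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(w, sr_gain(zipf_like_rates(10_000, w)))
```

A ratio of 1 means no gain (all items equally popular); the ratio grows as the query distribution becomes more skewed.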
19
  • OK
  • SR is best for soluble queries
  • Uniform minimizes cost of insoluble queries

What is the optimal strategy?
20
[Figure: $10^4$ items, Zipf-like query rates with w = 1.5; panels: All Soluble, 85% Soluble, All Insoluble; marked allocations: Uniform, SR.]
21
We now know what we need.
How do we get there?
22
Replication Algorithms
  • Uniform and Proportional are easy:
  • Uniform: when an item is created, replicate its key
    in a fixed number of hosts.
  • Proportional: for each query, replicate the key
    in a fixed number of hosts.

Desired properties of an algorithm:
  • Fully distributed, where peers communicate through
    random probes; minimal bookkeeping; and no more
    communication than what is needed for search.
  • Converges to/obtains the SR allocation when query
    rates remain steady.

23
Model for Copy Creation/Deletion
  • Creation: after a successful search, C(s) new
    copies are created at random hosts.
  • Deletion: independent of the identity of the
    item; copy survival chances are non-decreasing
    with creation time (i.e., FIFO at each node).
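
A rough balance argument shows what this model implies in steady state. It is a sketch only: $\langle C_i \rangle$ denotes the mean number of copies created per query for item i, and $\delta$ is a per-copy deletion rate introduced here for illustration.

```latex
% Copies of item i are created at rate q_i <C_i>. Because deletion does not
% depend on the item's identity, copies of item i are deleted at a rate
% proportional to their share of storage, \delta p_i.
\[
  \text{steady state:}\quad q_i \langle C_i \rangle \;\propto\; \delta\, p_i
  \quad\Longrightarrow\quad
  \frac{p_i}{p_j} = \frac{q_i \langle C_i \rangle}{q_j \langle C_j \rangle} .
\]
% Example: a fixed number of copies per query (\langle C_i \rangle constant)
% gives p_i \propto q_i, the Proportional allocation.
```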

24
Creation/Deletion Process
  • Corollary: If [condition lost in extraction], then [conclusion lost in extraction].

25
SR Replication Algorithms
  • Path replication: the number of new copies C(s) is
    proportional to the size of the search (Freenet).
  • Converges to SR allocation (under reasonable
    conditions).
  • Convergence is unstable with delayed creations.
  • Sibling memory: each copy remembers the number of
    sibling copies.
  • Quickly on target.
  • For good estimates, need to find several
    copies.
  • Probe memory: each peer records the number and
    combined search size of probes it sees for each
    item. C(s) is determined by collecting this info
    from a number of peers proportional to search size.
  • Immediately on target.
  • Extra communication (proportional to that needed
    for search).

26
Alg. 1: Path Replication
  • Number of new copies produced per query, $\langle C_i \rangle$, is
    proportional to search size, $1/p_i$.
  • Creation rate is proportional to $q_i \langle C_i \rangle$.
  • Steady state: creation rate is proportional to
    allocation $p_i$, thus $q_i / p_i \propto p_i$, i.e. $p_i \propto \sqrt{q_i}$.
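
The steady-state argument can be illustrated with a deterministic mean-field iteration. This is not the simulation from the talk; it is a simplified sketch that assumes no creation delay, replaces a fixed fraction `mix` of storage per step, and uses hypothetical names and sample rates.

```python
import math


def steady_state_allocation(q, copies_per_query, steps=200, mix=0.5):
    """Iterate a mean-field model of copy creation and deletion.

    copies_per_query(p_i) is the mean number of copies created per query for
    an item with allocation p_i. In each step a `mix` fraction of all stored
    copies is deleted regardless of item identity and replaced by newly
    created copies, split across items in proportion to q_i * C_i.
    """
    m = len(q)
    p = [1.0 / m] * m  # start from the Uniform allocation
    for _ in range(steps):
        created = [qi * copies_per_query(pi) for qi, pi in zip(q, p)]
        total = sum(created)
        p = [(1 - mix) * pi + mix * ci / total for pi, ci in zip(p, created)]
    return p


def normalized(weights):
    total = sum(weights)
    return [w / total for w in weights]


q = [0.6, 0.3, 0.1]  # hypothetical query rates

# Path replication: copies per query proportional to search size 1 / p_i.
path = steady_state_allocation(q, lambda p_i: 1.0 / p_i)
# Fixed number of copies per query, as in the Proportional strategy.
fixed = steady_state_allocation(q, lambda p_i: 1.0)

if __name__ == "__main__":
    print("path replication:", path)
    print("square-root target:", normalized([math.sqrt(qi) for qi in q]))
    print("fixed copies:", fixed)
```

In this idealized setting, path replication settles on the Square-Root allocation and fixed-count replication settles on the Proportional one. The instability with delayed creations mentioned in the talk is not modeled.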

27
Simulation
[Figure: simulation with 10000 hosts and a delay of 0.25 copy lifetime; panels: Path replication, Sibling number; axes: hosts with copy vs. time.]
28
Summary
  • Random search/replication model: probes to
    random hosts
  • Proportional allocation: current practice
  • Uniform allocation: best for insoluble queries
  • Soluble queries:
  • Proportional and Uniform allocations are two
    extremes with the same average performance
  • Square-Root allocation minimizes Average Search
    Size
  • OPT (all queries) lies between SR and Uniform
  • SR/OPT allocation can be realized by simple
    algorithms.