Title: Evolution of P2P Content Distribution
1Evolution of P2P Content Distribution
2Outline
- History of P2P Content Distribution Architectures
- Techniques to Improve Gnutella
- Brief Overview of DHT
- Techniques to Improve BitTorrent
3History of P2P
- Napster
- Gnutella
- KaZaa
- Distributed Hash Tables
- BitTorrent
4Napster
- Centralized directory
- A central website holds the directory of contents of all peers
- Queries are performed at the central directory
- File transfer occurs between peers
- Supports arbitrary queries
- Con: single point of failure
5Gnutella
- Decentralized, homogeneous peers
- No central directory
- Queries are performed in a distributed manner on peers via flooding
- Supports arbitrary queries
- Very resilient against failure
- Problem: doesn't scale
6FastTrack/KaZaa
- Distributed two-tier architecture
- Supernodes keep the content directory for regular nodes
- Regular nodes do not participate in query processing
- Queries are performed by supernodes only
- Supports arbitrary queries
- Con: supernode stability affects system performance
7Distributed Hash Tables
- Structured distributed system
- Structured: all nodes participate in a precise scheme to maintain certain invariants
- Provides a directory service
- Directory service
- Routing
- Extra work when nodes join and leave
- Supports key-based lookups only
8BitTorrent
- Distribution of very large files
- Tracker connects peers to each other
- Peers exchange file blocks with each other
- Uses tit-for-tat to discourage freeloading
9Improving Gnutella
10Gnutella-Style Systems
- Advantages of Gnutella
- Supports more flexible queries
- Typically, precise name search is only a small portion of all queries
- Simplicity
- High resilience against node failures
- Problem of Gnutella: scalability
- Flooding: # of messages is O(N·E)
11Flooding-Based Searches
[Figure: flooding example on a small eight-node topology]
- Duplication increases as the TTL increases in flooding
- Worst case: a node A is interrupted by N · q · degree(A) messages (see the sketch below)
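Below is a minimal sketch (not from the slides) of TTL-limited flooding on a toy adjacency-list graph, counting how many of the forwarded messages are duplicates; the graph and the message-counting convention are illustrative assumptions.

```python
# Minimal sketch of TTL-limited flooding on an undirected graph,
# counting total and duplicate messages. Graph format is hypothetical.
from collections import deque

def flood(graph, origin, ttl):
    """Flood a query from `origin`; return (total_messages, duplicates)."""
    seen = {origin}
    total = duplicates = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue
        for neighbor in graph[node]:
            total += 1                      # every forwarded copy costs a message
            if neighbor in seen:
                duplicates += 1             # neighbor already saw this query
            else:
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return total, duplicates

# Example: a small topology with a few redundant links
graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(flood(graph, 1, ttl=3))               # -> (9, 5): 9 messages, 5 duplicates
```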
12Load on Individual Nodes
- Why is a node interrupted?
- To process a query
- To route the query to other nodes
- To process duplicated queries sent to it
13Communication Complexity
- Communication complexity determined by
- Network topology
- Distribution of object popularity
- Distribution of replication density of objects
14Network Topologies
- Uniform Random Graph (Random)
- Average and median node degree is 4
- Power-Law Random Graph (PLRG)
- max node degree 1746, median 1, average 4.46
- Gnutella network snapshot (Gnutella)
- Oct 2000 snapshot
- max degree 136, median 2, average 5.5
- Two-dimensional grid (Grid)
15Modeling Methods
- Object popularity distribution pi
- Uniform
- Zipf-like
- Object replication density distribution ri
- Uniform
- Proportional: ri ∝ pi
- Square-root: ri ∝ √pi
16Evaluation Metrics
- Overhead: average # of messages per node per query
- Probability of search success: Pr(success)
- Delay: # of hops till success
17Duplications in Various Network Topologies
18Relationship between TTL and Search Successes
19Problems with Simple TTL-Based Flooding
- Hard to choose TTL
- For objects that are widely present in the network, small TTLs suffice
- For objects that are rare in the network, large TTLs are necessary
- The number of query messages grows exponentially as the TTL grows
20Idea 1: Adaptively Adjust TTL
- Expanding ring (see the sketch below)
- Multiple floods: start with TTL = 1 and increment the TTL by 2 each time until the search succeeds
- Success varies by network topology
- For Random graphs, a 30- to 70-fold reduction in message traffic
- For Power-law and Gnutella graphs, only a 3- to 9-fold reduction
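A small sketch of the expanding-ring idea, under the same toy graph assumptions as above; `flood_search` and its signature are hypothetical helpers, not the paper's code.

```python
# Sketch of expanding-ring search: flood with TTL = 1 and, if the object is
# not found, re-flood with the TTL incremented by 2.
from collections import deque

def flood_search(graph, origin, ttl, has_object):
    """Return (# messages sent, True if some visited node has the object)."""
    seen = {origin}
    frontier = deque([(origin, ttl)])
    msgs, found = 0, has_object(origin)
    while frontier:
        node, hops = frontier.popleft()
        if hops == 0:
            continue
        for nbr in graph[node]:
            msgs += 1
            if nbr not in seen:
                seen.add(nbr)
                found = found or has_object(nbr)
                frontier.append((nbr, hops - 1))
    return msgs, found

def expanding_ring(graph, origin, has_object, start_ttl=1, step=2, max_ttl=9):
    total = 0
    for ttl in range(start_ttl, max_ttl + 1, step):
        msgs, found = flood_search(graph, origin, ttl, has_object)
        total += msgs
        if found:
            return total, ttl      # messages spent, TTL at which the search succeeded
    return total, None             # object not found within max_ttl
```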
21Limitations of Expanding Ring
22Idea 2: Random Walk
- Simple random walk
- takes too long to find anything!
- Multiple-walker random walk
- N agents, each walking T steps, visit as many nodes as 1 agent walking N·T steps
- When to terminate the search: check back with the query originator once every C steps (see the sketch below)
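A sketch of a multiple-walker random walk with periodic check-back; the parameter names (`k`, `check_every`, `max_steps`) and the termination bookkeeping are illustrative assumptions.

```python
# Sketch of a k-walker random walk: each walker takes one random step per
# round; every `check_every` steps the walkers poll the originator to see
# whether the search can stop early.
import random

def k_walker_search(graph, origin, has_object, k=16, check_every=4, max_steps=1024):
    if has_object(origin):
        return 0, 0
    walkers = [origin] * k
    messages = 0
    for step in range(1, max_steps + 1):
        for i, node in enumerate(walkers):
            nxt = random.choice(graph[node])   # assumes every node has a neighbor
            messages += 1
            walkers[i] = nxt
            if has_object(nxt):
                return messages, step          # found
        if step % check_every == 0:
            messages += k                      # walkers check back with originator
    return messages, None
```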
23Search Traffic Comparison
24Search Delay Comparison
25Flexible Replication
- In unstructured systems, search success is essentially about coverage: visiting enough nodes to probabilistically find the object, so replication density matters
- With limited node storage, what's the optimal replication density distribution?
- In Gnutella, only nodes that query an object store it: ri ∝ pi
- What if we use different replication strategies?
26Optimal ri Distribution
- Goal: minimize Σ(pi / ri), where Σ ri = R
- Calculation (written out below)
- Introduce a Lagrange multiplier λ; find ri and λ that minimize Σ(pi / ri) + λ(Σ ri − R)
- λ − pi / ri² = 0 for all i
- ri ∝ √pi
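The same derivation written out in LaTeX for readability (this restates the slide's calculation, with the Lagrangian made explicit):

```latex
% Minimize total expected search cost \sum_i p_i / r_i
% subject to the replication budget \sum_i r_i = R.
\begin{align*}
  L(r,\lambda) &= \sum_i \frac{p_i}{r_i} + \lambda\Big(\sum_i r_i - R\Big) \\
  \frac{\partial L}{\partial r_i} &= -\frac{p_i}{r_i^{2}} + \lambda = 0
    \quad\Longrightarrow\quad
    r_i = \sqrt{p_i/\lambda} \;\propto\; \sqrt{p_i}
\end{align*}
```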
27Square-Root Distribution
- General principle: to minimize Σ(pi / ri) under the constraint Σ ri = R, make ri proportional to the square root of pi
- Other application examples
- Bandwidth allocation to minimize expected download times
- Server load balancing to minimize expected request latency
28Achieving Square-Root Distribution
- Suggestions from some heuristics
- Store an object at a number of nodes proportional to the number of nodes visited in order to find the object
- Each node uses random replacement
- Two implementations (see the sketch below)
- Path replication: store the object along the path of a successful walk
- Random replication: store the object randomly among the nodes visited by the agents
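A sketch of the two implementations, assuming the search returns the list of visited nodes and a hypothetical `store(node, object_id)` callback:

```python
# Sketch of the two replication strategies described above.
import random

def path_replicate(object_id, visited_path, store):
    # Path replication: copy the object to every node on the successful walk.
    for node in visited_path:
        store(node, object_id)

def random_replicate(object_id, visited_nodes, store, copies):
    # Random replication: copy the object to `copies` nodes chosen uniformly
    # from all nodes visited by the walkers (copies ~ number of nodes visited).
    for node in random.sample(visited_nodes, min(copies, len(visited_nodes))):
        store(node, object_id)
```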
29Evaluation of Replication Methods
- Metrics
- Overall message traffic
- Search delay
- Dynamic simulation
- Assume Zipf-like object query probability
- 5 queries/sec, Poisson arrival
- Results are measured during 5000–9000 sec
30Distribution of ri
31Total Search Message Comparison
- Observation: path replication is slightly inferior to random replication
32Search Delay Comparison
33Summary
- Multi-walker random walk scales much better than flooding
- It won't scale as perfectly as a structured network, but current unstructured networks can be improved significantly
- Square-root replication distribution is desirable and can be achieved via path replication
34KaZaa
- Uses supernodes
- Regular nodes : supernodes ≈ 100 : 1
- Simple way to scale the system by a factor of 100
35DHTs: A Brief Overview (Slides by Brad Karp)
36What Is a DHT?
- Single-node hash table (sketched below)
- key = Hash(name)
- put(key, value)
- get(key) → value
- How do I do this across millions of hosts on the Internet?
- Distributed Hash Table
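A minimal sketch of the single-node version of this interface; a DHT offers the same `put`/`get` API while spreading the table over many hosts. The SHA-1 keying matches the later slides; everything else is illustrative.

```python
# Single-node version of the DHT interface: key = Hash(name), put, get.
import hashlib

table = {}

def make_key(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def put(key: int, value: bytes) -> None:
    table[key] = value

def get(key: int) -> bytes:
    return table[key]

key = make_key("song.mp3")
put(key, b"file data")
assert get(key) == b"file data"
```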
37Distributed Hash Tables
- Chord
- CAN
- Pastry
- Tapestry
- etc. etc.
38The Problem
[Figure: a publisher calls Put(key=title, value=file data) and a client elsewhere calls Get(key=title); the key must be located among nodes N1–N6 across the Internet]
- Key Placement
- Routing to find key
39Key Placement
- Traditional hashing
- Nodes numbered from 1 to N
- A key is placed at node (hash(key) mod N)
- Why does traditional hashing have problems? (see the sketch below)
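A quick sketch of the problem with `hash(key) mod N` placement: when N changes, almost every key maps to a different node (the key names and node counts are made up for illustration).

```python
# Why hash(key) mod N placement breaks when nodes join or leave:
# nearly every key moves to a different node when N changes.
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return digest % n_nodes

keys = [f"file-{i}" for i in range(10_000)]
moved = sum(node_for(k, 10) != node_for(k, 11) for k in keys)
print(f"{moved / len(keys):.0%} of keys move when going from 10 to 11 nodes")
# Typically ~90% of keys move; consistent hashing moves only about 1/N of them.
```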
40Consistent Hashing IDs
- Key identifier = SHA-1(key)
- Node identifier = SHA-1(IP address)
- SHA-1 distributes both uniformly
- How to map key IDs to node IDs?
41Consistent Hashing Placement
A key is stored at its successor: the node with the next higher ID (see the sketch below)
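A sketch of successor-based placement on the ring; the truncated ID width and the `sha1_id` helper are illustrative assumptions, not Chord's parameters.

```python
# A key is stored at its successor: the first node whose ID is >= the key ID,
# wrapping around the identifier ring.
import bisect, hashlib

def sha1_id(text: str, bits: int = 16) -> int:      # small ID space for readability
    return int(hashlib.sha1(text.encode()).hexdigest(), 16) % (2 ** bits)

node_ids = sorted(sha1_id(ip) for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3"])

def successor(key_id: int) -> int:
    idx = bisect.bisect_left(node_ids, key_id)
    return node_ids[idx % len(node_ids)]             # wrap past the largest node ID

print(successor(sha1_id("song.mp3")))                # node ID that stores the key
```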
42Basic Lookup
43Finger Table Allows log(N)-time Lookups
[Figure: node N80's fingers cover 1/2, 1/4, 1/8, ..., 1/128 of the ring]
44Finger i Points to Successor of n+2^i
[Figure: node N80's finger for target ID 112 (= 80 + 2^5) points to N120, the successor of 112]
45Lookups Take O(log(N)) Hops
[Figure: Lookup(K19) issued at N80 hops around a ring of nodes N5, N10, N20, N32, N60, N80, N99, N110 and resolves to N20, the successor of K19; a toy version of the finger-based lookup follows below]
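A toy, self-contained sketch of the finger table and finger-based lookup from the last few slides; the node IDs come from the figure, while the ID-space size and helper names are assumptions rather than Chord's actual code.

```python
M = 7                                              # toy ID space: 2^7 = 128 IDs
node_ids = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def successor(key_id):
    """First node ID >= key_id, wrapping around the ring."""
    for nid in node_ids:
        if nid >= key_id:
            return nid
    return node_ids[0]

def finger_table(n):
    """Finger i of node n points to successor(n + 2^i)."""
    return [successor((n + 2 ** i) % 2 ** M) for i in range(M)]

def between(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def lookup(start, key_id):
    """Route greedily via fingers; returns (owner node, hop count)."""
    n, hops = start, 0
    while not between(key_id, n, successor((n + 1) % 2 ** M)):
        for f in reversed(finger_table(n)):        # farthest finger not past the key
            if between(f, n, key_id):
                n, hops = f, hops + 1
                break
        else:
            break                                  # no finger helps; stop here
    return successor(key_id), hops

print(lookup(80, 19))                              # K19 is owned by N20
```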
46Joining: Linked-List Insert
[Figure: N36 joins between N25 and N40; N40 stores keys K30 and K38]
1. Lookup(36) finds N36's successor, N40
47Join (2)
2. N36 sets its own successor pointer (to N40)
48Join (3)
3. Copy keys 26..36 (here, K30) from N40 to N36
49Join (4)
4. Set N25's successor pointer (to N36)
- The predecessor pointer allows linking in the new host
- Finger pointers are updated in the background
- Correct successors produce correct lookups
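The four join steps above, modeled as a sorted-ring insert on toy node objects; wrap-around key intervals and concurrent joins are ignored, and the predecessor (normally found via lookup) is assumed to be known.

```python
# Toy model of the join procedure: 1) find successor, 2) set successor pointer,
# 3) copy keys in (predecessor, new] from the successor, 4) relink predecessor.
class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self
        self.keys = {}

def join(new_node, predecessor):
    old_succ = predecessor.successor              # step 1: found via Lookup(new.id)
    new_node.successor = old_succ                 # step 2
    moved = {k: v for k, v in old_succ.keys.items()
             if predecessor.id < k <= new_node.id}
    for k in moved:                               # step 3: copy keys 26..36
        new_node.keys[k] = old_succ.keys.pop(k)
    predecessor.successor = new_node              # step 4

n25, n40 = Node(25), Node(40)
n25.successor, n40.successor = n40, n25
n40.keys = {30: "K30", 38: "K38"}
n36 = Node(36)
join(n36, n25)
assert n36.keys == {30: "K30"} and n40.keys == {38: "K38"}
```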
50Chord Lookup Algorithm Properties
- Interface: lookup(key) → IP address
- Efficient: O(log N) messages per lookup
- N is the total number of servers
- Scalable: O(log N) state per node
- Robust: survives massive failures
- Simple to analyze
51Many Many Variations of The Same Theme
- Different ways to choose the fingers
- Ways to make it more robust
- Ways to make it more network efficient
- etc. etc.
52Improving BitTorrent
53BitTorrent File Sharing Network
- Goal: replicate K chunks of data among N nodes
- Form neighbor connection graph
- Neighbors exchange data
54BitTorrent Neighbor Selection
[Figure: new peer A learns about peers 1–5 from the tracker listed in file.torrent; the seed holds the whole file]
55BitTorrent Piece Replication
[Figure: peer A exchanges pieces with neighbors 1–3; the seed holds the whole file]
56BitTorrent Piece Replication Algorithms
- Tit-for-tat (choking/unchoking)
- Each peer only uploads to 7 other peers at a time
- 6 of these are chosen based on the amount of data received from the neighbor in the last 20 seconds
- The last one is chosen randomly, with a 75% bias toward newcomers
- (Local) rarest-first replication
- When peer 3 unchokes peer A, A selects which piece to download (see the sketch below)
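A sketch of the two policies on this slide; the 7-slot, 6+1 split follows the slide, while the 20-second accounting and the 75% newcomer bias are simplified away (noted in the comments). All names and signatures are illustrative.

```python
import random
from collections import Counter

def choose_unchoked(recv_bytes_20s, interested_peers, slots=7):
    """recv_bytes_20s: peer -> bytes received from that peer in the last 20 s."""
    by_contribution = sorted(interested_peers,
                             key=lambda p: recv_bytes_20s.get(p, 0), reverse=True)
    unchoked = by_contribution[:slots - 1]          # 6 reciprocation slots
    rest = [p for p in interested_peers if p not in unchoked]
    if rest:
        unchoked.append(random.choice(rest))        # 1 optimistic unchoke
    return unchoked                                 # (newcomer bias omitted)

def rarest_first(my_pieces, neighbor_pieces, uploader_pieces):
    """Pick the piece, available at the unchoking peer, that is rarest
    among all neighbors and that we do not have yet."""
    counts = Counter(p for pieces in neighbor_pieces for p in pieces)
    candidates = uploader_pieces - my_pieces
    return min(candidates, key=lambda p: counts[p]) if candidates else None
```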
57Performance of BitTorrent
- Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks
- Demonstrated by simulation studies
- Confirmed by theoretical modeling studies
- Intuition: in a random graph,
- Prob(Peer A's content is a subset of Peer B's) ≈ 50%
58Lessons from BitTorrent
- Often, simple randomized algorithms perform better than elaborately designed deterministic algorithms
59Problems of BitTorrent
- ISPs are unhappy
- BitTorrent is notoriously difficult to traffic-engineer
- ISPs' different links have different monetary costs
- BitTorrent
- Peers are all equal
- Choices are made based on measured performance
- No regard for the underlying ISP topology or preferences
60BitTorrent and ISPs Play Together?
- Current state of affairs: a clumsy co-existence
- ISPs throttle BitTorrent traffic along high-cost links
- Users suffer
- Can they be partners?
- ISPs inform BitTorrent of their preferences
- BitTorrent schedules traffic in ways that benefit both users and ISPs
61Random Neighbor Selection
- Existing studies all assume random neighbor selection
- BitTorrent is no longer optimal if nodes in the same ISP only connect to each other
- Random neighbor selection → high cross-ISP traffic
- Q: Can we modify the neighbor selection scheme without affecting performance?
62Biased Neighbor Selection
- Idea: of N neighbors, choose N−k from peers in the same ISP, and choose k randomly from peers outside the ISP (see the sketch below)
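A sketch of how biased neighbor selection might pick a peer list; `peer_isp` is a hypothetical peer-to-ISP map, and `n`/`k` correspond to the slide's N and k.

```python
# Of n neighbors, pick n-k from the requester's own ISP and k from outside it.
import random

def biased_neighbors(requester, peers, peer_isp, n, k):
    same_isp = [p for p in peers
                if peer_isp[p] == peer_isp[requester] and p != requester]
    other_isp = [p for p in peers if peer_isp[p] != peer_isp[requester]]
    chosen = random.sample(same_isp, min(n - k, len(same_isp)))
    chosen += random.sample(other_isp, min(k, len(other_isp)))
    return chosen
```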
63Implementing Biased Neighbor Selection
- By Tracker
- Need ISP affiliations of peers
- Peer to AS maps
- Public IP address ranges from ISPs
- Special X- HTTP header
- By traffic shaping devices
- Intercept peer → tracker messages and manipulate responses
- No need to change the tracker or the client
64Evaluation Methodology
- Event-driven simulator
- Uses actual client and tracker code as much as possible
- Calculates bandwidth contention, assuming perfect fair-sharing from TCP
- Network settings
- 14 ISPs, each with 50 peers, 100 Kb/s upload, 1 Mb/s download
- Seed node with 400 Kb/s upload
- Optional university nodes (1 Mb/s upload)
- Optional ISP bottleneck to other ISPs
65Limitation of Throttling
66Throttling Cross-ISP Traffic
Redundancy: average # of times a data chunk enters the ISP
67Biased Neighbor Selection Download Times
68Biased Neighbor Selection Cross-ISP Traffic
69Importance of Rarest-First Replication
- Random piece replication performs badly
- Increases download time by 84%–150%
- Increases traffic redundancy from 3 to 14
- Biased neighbors + rarest-first → more uniform progress of peers
70Biased Neighbor Selection Single-ISP Deployment
71Presence of External High-Bandwidth Peers
- Biased neighbor selection alone
- Average download time same as regular BitTorrent
- Cross-ISP traffic increases as the # of university peers increases
- A result of tit-for-tat
- Biased neighbor selection + throttling
- Download time only increases by 12%
- Most neighbors do not cross the bottleneck
- Traffic redundancy (i.e., cross-ISP traffic) is the same as in the scenario without university peers
72Comparison with Alternatives
- Gateway peer: only one peer connects to peers outside the ISP
- The gateway peer must have high bandwidth
- It is the seed for this ISP
- Ends up benefiting peers in other ISPs
- Caching
- Can be combined with biased neighbor selection
- Biased neighbor selection reduces the bandwidth
needed from the cache by an order of magnitude
73Summary
- By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP cost
- Biased neighbor selection: choose the initial set of neighbors well
- Can be combined with throttling and caching
- ⇒ P2P and ISPs can collaborate!