Title: A Self-repairing Peer-to-Peer System Resilient to Dynamic Adversarial Churn
1 A Self-repairing Peer-to-Peer System Resilient
to Dynamic Adversarial Churn
Fabian Kuhn, Microsoft Research, Silicon Valley
Stefan Schmid, ETH Zurich
Roger Wattenhofer, ETH Zurich
Some slides taken from Stefan Schmid's
presentation of his Master's thesis
2 Churn
- Unlike servers, peers are transient!
- Machines are under the control of individual users
- e.g., just connecting to download one file
- Membership changes are called churn (peers join and leave)
Successful P2P systems have to cope with
churn (i.e., guarantee correctness, efficiency,
etc.)!
3 Churn characteristics
- Depends on application (Skype vs. eMule vs. ...)
- But there may be dozens of membership changes per second!
- Peers may crash without notice!
- How can peers collaborate in spite of churn?
4 Churn threatens the advantages of P2P
a lot of churn
What can we guarantee in presence of churn? We
have to actively maintain P2P systems!
5 Goal of the paper
Only a small number of P2P systems have been
analyzed under churn!
This paper presents techniques to
- provably maintain P2P systems with desirable properties
(peer degree, network diameter, ...)
- in spite of ongoing worst-case membership changes
(the adversary continuously attacks the weakest part).
(The system is never fully repaired, but always
fully functional.)
6 How does Churn affect P2P systems?
- Objects may be lost when the host crashes
- Queries may not make it to the destination
7 Think about this
What is the big deal about churn? Does not every
P2P system define Join and Leave protocols?
Well, the system eventually recovers, but during
recovery, services may be affected, and objects
that are not replicated are lost.
Observe the difference between non-masking and
masking fault tolerance. What we need is some form
of masking tolerance.
8 Model for Dynamics
- We assume a worst-case perspective: an adversary
A(J, L) induces J joins and L leaves every round
anywhere in the system.
- We assume a synchronous model: time is divided
into rounds.
- Further refinement: adversary A(J, L, r) implies
J joins and L leaves every r rounds.
- The topology is assumed to be a hypercube that
has O(log n) degree and O(log n) diameter.
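The degree and diameter claims for the hypercube can be checked with a minimal sketch. The function names `hypercube_neighbors` and `hop_distance` are illustrative and not from the paper; nodes are assumed to be labeled by d-bit integers.

```python
def hypercube_neighbors(node: int, d: int) -> list[int]:
    """Neighbors of `node` in a d-dimensional hypercube: flip one of its d bits."""
    return [node ^ (1 << i) for i in range(d)]


def hop_distance(u: int, v: int) -> int:
    """Shortest-path length between u and v: the number of differing bits."""
    return bin(u ^ v).count("1")


d = 4                      # dimension (assumed example value)
n = 2 ** d                 # number of hypercube nodes
degree = len(hypercube_neighbors(0, d))
diameter = max(hop_distance(0, v) for v in range(n))
print(degree, diameter)    # both equal d = log2(n)
```

With n = 2^d nodes, both quantities equal d, which is where the O(log n) bounds come from.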
9 Topology Maintenance
- Challenges in maintaining the hypercube!
- How does peer 1 (p1) know that it should replace
peer 2 (p2)?
- How does it get there when there are concurrent
joins and leaves?
10 The Proposed Approach
Several peers per node
11 General Recipe for Robust Topologies
- 1. Take a graph with desirable properties
- Low diameter, low peer degree, etc.
- 2. Replace vertices by a set of peers
- 3. Maintain it
- a. Permanently run a peer distribution algorithm
which ensures that all vertices have roughly
the same number of peers (token distribution
algorithm).
- b. Estimate the total number of peers in
the system and change the dimension of the
topology accordingly (information aggregation
algorithm and scaling algorithm).
12 Dynamic Token Distribution
[Figure: hypercube nodes U = 11010, V = 11011 and
W = 10010; one of U and V holds a peers, the
other b peers]
After one step of recovery, both U and V will
contain (a+b)/2 peers. Try this once for each
dimension of the hypercube (dimension exchange
method).
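The dimension exchange method can be sketched as follows. This is a static illustration with no joins or leaves between steps, not the authors' full algorithm. The names `dimension_exchange` and `discrepancy` are hypothetical, and the rule that the lower-numbered node receives the rounded-down half is an assumption made here for integer peer counts.

```python
def dimension_exchange(tokens: list[int], d: int) -> list[int]:
    """One pass of dimension exchange on a d-dimensional hypercube.

    tokens[u] is the number of peers at node u, for u in 0 .. 2**d - 1.
    For each dimension i, every node balances with its neighbor across bit i.
    """
    tokens = list(tokens)
    for i in range(d):
        for u in range(2 ** d):
            v = u ^ (1 << i)          # neighbor of u along dimension i
            if u < v:                 # handle each pair once
                total = tokens[u] + tokens[v]
                tokens[u] = total // 2            # assumed: lower id gets the floor
                tokens[v] = total - total // 2    # the other node gets the rest
    return tokens


def discrepancy(tokens: list[int]) -> int:
    """Maximum difference between the token counts of any two nodes."""
    return max(tokens) - min(tokens)


balanced = dimension_exchange([16, 0, 0, 0, 0, 0, 0, 0], d=3)
print(balanced, discrepancy(balanced))   # [2, 2, 2, 2, 2, 2, 2, 2] 0

uneven = dimension_exchange([7, 0, 0, 0], d=2)
print(uneven, discrepancy(uneven))       # [1, 2, 2, 2] 1
```

The second call shows why integer tokens leave a residual imbalance: 7 peers cannot be split evenly over 4 nodes.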
13 Theorem
The discrepancy φ is the maximum difference between
the token counts of any pair of nodes. The goal is
to reduce the discrepancy φ to 0. The previous
step reduces φ to 0 for fractional tokens, but
for a d-dimensional hypercube, using integer
tokens, φ ≤ d in the worst case.
In the presence of an A(J, L, 1) adversary, the
proposed algorithm maintains the invariant
φ ≤ 2J + 2L + d.
14 Information aggregation
When the total number of peers N exceeds an
upper bound, each node splits into two, and the
dimension of the hypercube has to increase by 1.
Similarly, when the total number of peers N falls
below a lower bound, pairs of nodes in dimension
(d-1) merge into one, and the dimension of the
hypercube has to decrease by 1. Thus, the system
needs a mechanism to keep track of N.
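One way to keep track of N on a hypercube is to sum the per-node peer counts along one dimension per round, so that after d rounds every node knows the total. The sketch below illustrates that idea only. It ignores churn during aggregation, and the names `aggregate_total` and `scaling_decision` as well as the `lower` and `upper` thresholds are assumptions, not values from the paper.

```python
def aggregate_total(local_counts: list[int], d: int) -> list[int]:
    """Return, for every node, the total peer count it knows after d rounds.

    local_counts[u] is the number of peers at node u (u in 0 .. 2**d - 1).
    In round i all nodes simultaneously add the partial sum of their
    neighbor along dimension i to their own partial sum.
    """
    known = list(local_counts)
    for i in range(d):
        known = [known[u] + known[u ^ (1 << i)] for u in range(2 ** d)]
    return known


def scaling_decision(total: int, lower: int, upper: int) -> int:
    """+1: grow the dimension, -1: shrink it, 0: keep it (thresholds are hypothetical)."""
    if total > upper:
        return 1
    if total < lower:
        return -1
    return 0


totals = aggregate_total([3, 5, 2, 4], d=2)
print(totals)                                           # [14, 14, 14, 14]
print(scaling_decision(totals[0], lower=8, upper=12))   # 1
```

Because every node ends up with the same total, all nodes reach the same split or merge decision without a coordinator.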
15 Simulated hypercube
- Given an adversary A(d+1, d+1, 6),
- the outdegree of every peer is bounded by
Θ(log² N), and
- the diameter is bounded by Θ(log N).
The adversary inserts and deletes at most (d+1)
peers during any time interval of 6 rounds.
16 Topology
Only the core peers store data items. Despite
churn, at least one peer in each core has
to survive.
Example topology for d = 2, showing core and
periphery. Peers in each core are connected to
one another and to the peers of the core of the
neighboring nodes.
Q. What does the periphery node do?
17 6-round maintenance algorithm
The authors use six rounds for one dimension
in each phase:
Round 1. Each node takes a snapshot of the active
peers within itself.
Round 2. Exchange snapshots.
Round 3. Preparation for peer migration.
Round 4. Core sends IDs of new peers to periphery.
Reduce dimension if necessary.
Round 5. Dimension growth; building new core (2d+3).
Round 6. Exchange information about the new core.
18 Further improvement: Pancake Graph (1)
- A robust system with degree and diameter
O(log n / loglog n): the pancake graph (most
papers refer to Papadimitriou and Gates'
contribution here)!
- Pancake of dimension d
- d! nodes, each represented by a unique
permutation l1, ..., ld where li ∈ {1, ..., d}
- Two nodes u and v are adjacent iff u is a
prefix-inversion of v
Example (4-dimensional pancake): node 1234 is
adjacent to 2134, 3214 and 4321.
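The adjacency rule of the pancake graph (prefix inversion) can be illustrated with a short sketch. The function name `pancake_neighbors` is illustrative, and nodes are assumed to be represented as tuples.

```python
from itertools import permutations


def pancake_neighbors(perm: tuple[int, ...]) -> list[tuple[int, ...]]:
    """Neighbors of a pancake-graph node: reverse each prefix of length 2 .. d."""
    d = len(perm)
    return [perm[:k][::-1] + perm[k:] for k in range(2, d + 1)]


node = (1, 2, 3, 4)
print(pancake_neighbors(node))
# [(2, 1, 3, 4), (3, 2, 1, 4), (4, 3, 2, 1)]

# The 4-dimensional pancake has 4! = 24 nodes, each of degree d - 1 = 3.
all_nodes = list(permutations(range(1, 5)))
print(len(all_nodes), {len(pancake_neighbors(p)) for p in all_nodes})
```

Reversing a prefix twice restores the original permutation, so adjacency is symmetric and the graph is undirected.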
19 The Pancake Graph (2)
- Properties
- Node degree O(log n / log log n)
- Diameter O(log n / log log n)
- where n is the total number of nodes
- A factor log log n better than hypercube!
- But difficult graph (diameter unknown!)
20 Contributions
- Using peer distribution and information
aggregation algorithms on the simulated pancake
topology, the authors proposed a DHT-based
peer-to-peer system with
- Peer degree and lookup / network diameter in
O(log n / log log n)
- Robustness to ADV(O(log n / log log n),
O(log n / log log n))
- No data is ever lost!
Asymptotically optimal!
21 The Pancake System
22 Conclusion
A nice model for understanding the effect of
churn and dealing with it. But it is too
simplistic.