Title: Towards Efficient Load Balancing in Structured P2P Systems
1. Towards Efficient Load Balancing in Structured P2P Systems
- Yingwu Zhu, Yiming Hu
- University of Cincinnati
2. Outline
- Motivation and Preliminaries
- Load balancing scheme
- Evaluation
3. Why Load Balancing?
- Structured P2P systems, e.g., Chord and Pastry:
  - Object IDs and node IDs are produced by a uniform hash function.
  - This results in an O(log N) imbalance in the number of objects stored at each node (illustrated in the sketch below).
- Skewed distribution of node capacity:
  - Nodes should carry loads proportional to their capacities.
- Other problems: different object sizes and a non-uniform distribution of object IDs.
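The imbalance is easy to reproduce. Below is a minimal simulation (mine, not from the paper) that hashes node and object IDs onto a Chord-like ring with SHA-1 and counts how many objects land in each node's successor region; the busiest node typically stores several times the average.

```python
# A minimal simulation: hash 1,024 node IDs and 100,000 object IDs onto a
# Chord-like ring and count how many objects each node ends up storing.
import bisect
import hashlib

def chord_id(key: str) -> int:
    # Chord derives IDs with SHA-1, giving a 160-bit identifier space
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

nodes = sorted(chord_id(f"node-{i}") for i in range(1024))
counts = [0] * len(nodes)

for j in range(100_000):
    oid = chord_id(f"object-{j}")
    # an object is stored at its successor: the first node whose ID is
    # >= the object ID, wrapping around the ring
    idx = bisect.bisect_left(nodes, oid) % len(nodes)
    counts[idx] += 1

avg = sum(counts) / len(counts)
print(f"avg: {avg:.1f} objects/node, max: {max(counts)}, "
      f"imbalance (max/avg): {max(counts) / avg:.1f}x")
```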
4. Virtual Servers (VS)
- First introduced in Chord/CFS.
- A VS is responsible for a contiguous region of the ID space.
- A node can host multiple VSs (see the sketch below).
[Figure: Chord ring with virtual servers hosted by Node A, Node B, and Node C]
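A minimal sketch of these definitions (my own illustrative classes, not the paper's code): each virtual server owns one contiguous arc of the ring, and a physical node's load is the sum of the loads of the virtual servers it hosts.

```python
# Illustrative classes for the virtual-server abstraction.
from dataclasses import dataclass, field

@dataclass
class VirtualServer:
    start: int          # the VS owns the contiguous arc (start, end] of the ring
    end: int
    load: float = 0.0   # e.g., amount of data stored under the arc

@dataclass
class Node:
    name: str
    capacity: float
    vservers: list = field(default_factory=list)

    @property
    def load(self) -> float:
        # a node's load is the sum of its virtual servers' loads
        return sum(vs.load for vs in self.vservers)

# Node A hosts three VSs; each joined the overlay independently, so A
# appears at three points on the ring
a = Node("A", capacity=100)
a.vservers = [VirtualServer(0, 50, load=12.0),
              VirtualServer(200, 260, load=30.0),
              VirtualServer(700, 705, load=3.5)]
print(a.load)   # 45.5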
5. Virtual Server Reassignment
- The virtual server is the basic unit of load movement, allowing load to be transferred between nodes.
- L = load, T = target load.
[Figure: Chord ring with Node A (T = 50), Node B (T = 35, heavy), and Node C (T = 15)]
7. Virtual Server Reassignment (cont.)
[Figure: the same Chord ring after reassignment: Node A (T = 50), Node B (T = 35), Node C (T = 15)]
8. Advantages of Virtual Servers
- Flexible: load is moved in units of a virtual server.
- Simple: VS movement is supported by all structured P2P systems, since it can be simulated by a leave operation followed by a join operation (sketched below).
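Building on the Node/VirtualServer classes in the sketch above (the transfer function and its signature are mine, not the paper's API), the reassignment itself is just re-homing the VS; the comments note what a real DHT does in addition.

```python
def transfer(vs, heavy, light):
    """Move one virtual server from a heavy node to a light node."""
    heavy.vservers.remove(vs)   # "leave": the heavy node stops serving vs's arc
    light.vservers.append(vs)   # "join": the light node takes over the same arc
    # in a real DHT, the objects stored under vs's arc are shipped to the
    # light node, and the overlay repairs routing state (successors, fingers)

# example: move Node A's heaviest VS to a lightly loaded node B
b = Node("B", capacity=40)
transfer(max(a.vservers, key=lambda v: v.load), a, b)
print(a.load, b.load)   # 15.5 30.0
```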
9. Current Load Balancing Solutions
- Some use the concept of virtual servers.
- However, they:
  - either ignore the heterogeneity of node capacities,
  - or transfer loads without considering proximity relationships between nodes,
  - or both.
10. Goals
- Maintain each node's load below its target load (the maximum load a node is willing to take); high-capacity nodes take more load.
- Perform load balancing in a proximity-aware manner, minimizing the overhead of load movement (bandwidth usage) and allowing more efficient and faster load balancing.
- What counts as "load" depends on the particular P2P system, e.g., storage, network bandwidth, or CPU cycles.
11. Assumptions
- Nodes in the system are cooperative.
- There is only one bottlenecked resource, e.g., storage or network bandwidth.
- The load of each virtual server is stable over the timescale at which load balancing is performed.
12. Overview of Design
- Step 1: Load-balancing information (LBI) aggregation, e.g., load and capacity info.
- Step 2: Node classification into heavy, light, and neutral nodes.
- Step 3: Virtual server assignment (VSA).
- Step 4: Virtual server transferring (VST).
- The scheme is proximity-aware: the VSA step takes network proximity into account.
13. LBI Aggregation and Node Classification
- Rely on a fully decentralized, self-repairing, and fault-tolerant k-ary tree built on top of a DHT (distributed hash table).
- Each k-ary tree node is planted in a DHT node.
- <L, C, Lmin> represents the load, the capacity, and the minimum load among the virtual servers, respectively (see the aggregation sketch below).
[Figure: aggregation tree; the root holds the system-wide totals, e.g., <62, 48, 2>]
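A sketch of the aggregation step, with the merge rule (sum, sum, min) inferred from the meaning of the triple: each tree node combines its children's <L, C, Lmin> values and forwards the result upward, so the root learns system-wide totals in O(log N) steps.

```python
# LBI aggregation: merge children's triples and pass the result up the tree.
def merge(triples):
    total_load = sum(t[0] for t in triples)       # L: sum of subtree loads
    total_capacity = sum(t[1] for t in triples)   # C: sum of subtree capacities
    min_vs_load = min(t[2] for t in triples)      # Lmin: lightest virtual server
    return (total_load, total_capacity, min_vs_load)

# leaves report their own <L, C, Lmin>; here the root obtains <62, 48, 2>
leaves = [(20, 16, 4), (30, 20, 2), (12, 12, 5)]
print(merge(leaves))   # (62, 48, 2)
```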
14. LBI Aggregation and Node Classification (cont.)
- The root's aggregate <L, C, Lmin> (e.g., <62, 48, 2>) gives the total load L and total capacity C of the system.
- Each node i then computes its target load Ti = (L / C) * Ci (see the sketch below).
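Once the root's totals are disseminated back down, each node can classify itself locally. A minimal sketch of Step 2 using the slide's formula Ti = (L/C) * Ci; the eps slack band separating heavy/light/neutral is my assumption, since the slides only name the three classes.

```python
# Node classification against the capacity-proportional target load.
def classify(load, capacity, total_load, total_capacity, eps=0.05):
    target = (total_load / total_capacity) * capacity   # Ti = (L / C) * Ci
    if load > (1 + eps) * target:
        return "heavy"    # exceeds its target: must shed virtual servers
    if load < (1 - eps) * target:
        return "light"    # has spare room: can accept virtual servers
    return "neutral"

# with system totals L = 62 and C = 48, a node of capacity 16 targets ~20.7
print(classify(load=28, capacity=16, total_load=62, total_capacity=48))  # heavy
```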
15. Virtual Server Assignment
VSA happens earlier between logically closer nodes.
16. Virtual Server Assignment
- DHT identifier-space-based VSA:
  - VSA happens earlier between logically closer nodes.
  - This is proximity-ignorant, because nodes that are logically close in the DHT are NOT necessarily physically close.
[Figure legend: (1) nodes in the same colors are physically close to each other; (2) H = heavy nodes, L = light nodes; (3) Vi = virtual servers]
17. Proximity-Aware VSA
- Nodes in the same colors are physically close to each other.
- H = heavy node, L = light node, Vi = virtual server.
- VSs are assigned between physically close nodes.
18. Proximity-Aware VSA
- Use landmark clustering to generate proximity information, e.g., landmark vectors.
- Use space-filling curves (e.g., the Hilbert curve) to map landmark vectors to Hilbert numbers, which serve as DHT keys (see the sketch below).
- Heavy nodes and light nodes each put their VSA info into the underlying DHT under the resulting keys, so physical closeness is aligned with logical closeness.
- Each virtual server independently reports the VSA info that is mapped into its responsible region, rather than its own node's VSA info.
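A sketch of the landmark-to-key mapping under two assumptions of mine: only two landmarks (so a 2-D Hilbert curve suffices; more landmarks need a higher-dimensional curve) and a particular quantization grid. The xy2d routine is the standard Hilbert-curve index computation; nodes with similar RTT vectors tend to obtain nearby keys, so their VSA reports meet early in the DHT.

```python
def xy2d(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid to its Hilbert-curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate the quadrant so lower-order bits are interpreted correctly
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

GRID = 256           # quantization resolution (assumed)
MAX_RTT_MS = 512.0   # RTTs above this are clamped (assumed)

def dht_key(rtt_a: float, rtt_b: float) -> int:
    """Quantize a 2-landmark RTT vector and map it to a Hilbert number."""
    x = min(int(rtt_a / MAX_RTT_MS * GRID), GRID - 1)
    y = min(int(rtt_b / MAX_RTT_MS * GRID), GRID - 1)
    return xy2d(GRID, x, y)

# two physically close nodes have similar RTT vectors, hence similar keys
print(dht_key(30.0, 120.0), dht_key(33.0, 118.0))
```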
19. Proximity-Aware Virtual Server Assignment
VSA happens earlier between physically closer nodes.
[Figure: physically close nodes are paired first; remaining VSA info meets at the final rendezvous point]
20. Experimental Setup
- A k-ary tree built on top of a DHT (Chord), with k = 2 and k = 8, respectively.
- Two node-capacity distributions (illustrative generators below):
  - Gnutella-like capacity profile, with 5 capacity levels.
  - Zipf-like capacity profile.
- Two load distributions of virtual servers:
  - Gaussian distribution and Pareto distribution.
- Two transit-stub topologies (5,000 nodes each): ts5k-large and ts5k-small.
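For concreteness, illustrative generators for the two capacity profiles; all level values, probabilities, and exponents below are placeholders of mine, not the paper's actual parameters.

```python
import random

def gnutella_like_capacity(rng):
    # 5 capacity levels, skewed toward low-capacity nodes (assumed split)
    levels = [1, 10, 100, 1000, 10000]
    probs = [0.2, 0.45, 0.3, 0.049, 0.001]
    return rng.choices(levels, weights=probs)[0]

def zipf_like_capacity(rng, n_levels=100, a=1.2):
    # capacity rank r is drawn with probability proportional to r**(-a)
    ranks = list(range(1, n_levels + 1))
    weights = [r ** -a for r in ranks]
    return rng.choices(ranks, weights=weights)[0]

rng = random.Random(1)
caps = [gnutella_like_capacity(rng) for _ in range(10_000)]
print(sum(caps) / len(caps))   # mean capacity dominated by the rare big nodes
```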
21. High-Capacity Nodes Carry More Loads
[Figure: Gaussian load distribution, Gnutella-like capacity profile]
22. High-Capacity Nodes Carry More Loads
[Figure: Pareto load distribution, Zipf-like capacity profile]
23. Proximity-Aware Load Balancing
- More load is moved over shorter distances by proximity-aware load balancing.
[Figure: CDF of the moved-load distribution in ts5k-large, under (a) a Gaussian load distribution with the Gnutella-like capacity profile and (b) a Pareto load distribution with the Zipf-like capacity profile]
24. Benefit of the Proximity-Aware Scheme
- LM(d) denotes the load moved over a distance of d hops (see below).
- Results:
  - For ts5k-large: B = 37%-65%.
  - For ts5k-small: B = 11%-20%.
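The slides do not spell out how B is derived from LM(d). A natural reading, consistent with calling B the "benefit" of the proximity-aware scheme, is the relative reduction in total movement cost, where each unit of moved load is weighted by the distance it travels; this formalization is an assumption, not the paper's stated definition.

```latex
% assumed formalization: movement cost weights moved load by hop distance d
\mathrm{cost} = \sum_{d} d \cdot LM(d),
\qquad
B = \frac{\mathrm{cost}_{\text{oblivious}} - \mathrm{cost}_{\text{aware}}}
         {\mathrm{cost}_{\text{oblivious}}}
```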
25. Other Results
- Quantified the overhead of k-ary tree construction: link stress and node stress.
- The latencies of LBI aggregation and VSA are bounded by O(log N) time.
- The effect of the pairing threshold at rendezvous points.
26. Conclusions
- Current load balancing approaches using virtual servers have limitations:
  - They either ignore node capacity heterogeneity,
  - or transfer loads without considering proximity relationships between nodes,
  - or both.
- Our solution:
  - A fully decentralized, self-repairing, and fault-tolerant k-ary tree is built on top of DHTs to perform load balancing.
  - Nodes carry loads in proportion to their capacities.
  - To our knowledge, this is the first work to address load balancing in a proximity-aware manner, thereby minimizing the overhead of load movement and allowing more efficient load balancing.
27. Questions?