Title: Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
1 Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh
- by Dejan Kostic, Adolfo Rodriguez, Jeannie Albrecht, and Amin Vahdat
- presented by Jon Turner
2 Introduction
- Problem: large-scale data dissemination.
  - focus on high-bandwidth streaming media
- Current solutions
  - IP multicast: not a complete solution and not deployed
  - overlay multicast: too dependent on quality of multicast trees
- Proposed approach.
  - partial distribution using multicast tree
    - all data distributed, but each node gets only a fraction from its parent
  - peer-to-peer distribution of remainder
    - periodic distribution of information about data present at random subsets of entire group
    - nodes request missing data from peers
  - system combines a number of elements developed earlier
    - erasure encoding (redundant tornado codes)
    - random subset distribution
    - informed content delivery
    - TCP-friendly rate control
3 Overview of Bullet Operation
- Distribute data over tree
  - limited replication
  - uses bandwidth feedback
- Nodes retrieve missing data from peers.
- Periodic distribution of content availability info.
  - random subset of nodes
  - summary of their content
- Limited set of peering relationships.
  - each node receives from limited set of senders
  - each node limits receivers
  - sets evolve over time to improve utility of peers
- Data sent using TCP-friendly rate control.
4 Data Distribution
- System operates in series of epochs (typ. 5 seconds).
  - at start of epoch, nodes learn how many descendants their children have
  - child i is assigned a sending factor sf_i equal to its share of descendants (child with 20% of descendants has sf_i = 0.2)
- Nodes forward data received from parent to children.
  - packets are assigned to children according to their sf values
  - additional copies forwarded to children with spare bandwidth
  - use limiting factors lf_i to determine which children have spare bw
  - lf values are adjusted dynamically between 0 and 1 based on bw
- Algorithm sketch, executed for each input packet p
  - Find child t that has been assigned fewer than its share of packets.
    - attempt to send p to t (transport protocol blocks send if sending rate too high)
    - if attempt succeeds, assign p to t
  - For each child c
    - attempt to send to c if no successful attempt yet or if c has spare bw
    - if attempt succeeds and this is first successful attempt, assign p to c
    - if attempt succeeds, but not first success, increase lf_c
    - if attempt fails, decrease lf_c
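The per-packet forwarding steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Child` class, the `try_send` callback (standing in for a transport send that refuses when the rate is too high), the step size `LF_STEP`, the floor `LF_MIN`, and the rule "a child has spare bw when a random draw falls below its lf" are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

LF_STEP = 0.1   # assumed adjustment step for limiting factors
LF_MIN = 0.05   # assumed floor so a child is never starved forever


@dataclass
class Child:
    name: str
    sf: float                         # sending factor: share of descendants
    try_send: Callable[[int], bool]   # hypothetical send; False if transport refuses
    lf: float = 1.0                   # limiting factor
    assigned: int = 0                 # packets this child has been assigned


def sending_factors(descendants: Dict[str, int]) -> Dict[str, float]:
    """Each child's share of the total descendant count."""
    total = sum(descendants.values())
    return {name: n / total for name, n in descendants.items()}


def forward(packet: int, children: List[Child],
            rng: Callable[[], float] = random.random) -> List[str]:
    """Forward one packet; return names of children that received it."""
    received: List[str] = []
    assigned_to = None
    total = sum(c.assigned for c in children)

    # Step 1: find a child that is below its share and try it first.
    target = next((c for c in children
                   if c.assigned < c.sf * (total + 1)), None)
    if target is not None and target.try_send(packet):
        target.assigned += 1
        assigned_to = target
        received.append(target.name)

    # Step 2: offer the packet to the other children.
    for c in children:
        if c is target:
            continue
        has_spare = rng() < c.lf          # assumed test for "spare bw"
        if assigned_to is not None and not has_spare:
            continue
        if c.try_send(packet):
            received.append(c.name)
            if assigned_to is None:       # first success: assign packet here
                c.assigned += 1
                assigned_to = c
            else:                         # extra copy got through: raise lf
                c.lf = min(1.0, c.lf + LF_STEP)
        else:                             # transport refused: lower lf
            c.lf = max(LF_MIN, c.lf - LF_STEP)
    return received
```

With two children of equal sf and no spare bandwidth, packets alternate between them, so each child's subtree ends up with a disjoint half of the stream.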
5 Finding Prospective Data Sources
- At start of each epoch, each node receives information about data present at a random subset of peers.
  - summary ticket describing data present at each peer from current working set
  - by comparing peer summary tickets to its own, node determines similarity of peers' data to its data
  - select new peers with dissimilar data sets
  - limit on number of concurrent senders
    - discard senders that have been supplying too few new packets
    - creates space in sender list for new sender
- Computing summary ticket
  - if node has packets $i_1, i_2, \ldots, i_n$
  - let $t_j = \min\{f_j(i_1), f_j(i_2), \ldots, f_j(i_n)\}$ for $1 \le j \le k$, where $f_j(i) = (a_j i + b_j) \bmod U$
  - summary ticket is $(t_1, t_2, \ldots, t_k)$
  - each $t_j$ depends on the entire sequence
  - similarity of two tickets is fraction of values in common
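The summary ticket computation can be sketched in a few lines. This is an illustrative sketch only: the helper names, the way the (a_j, b_j) pairs are drawn, and the choice of U as the prime 2^31 - 1 are assumptions for the example, not details taken from the paper.

```python
import random
from typing import Iterable, List, Sequence, Tuple

U = 2**31 - 1   # assumed modulus for the example


def make_hash_params(k: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Draw k (a_j, b_j) pairs; all nodes must share the same pairs."""
    rng = random.Random(seed)
    return [(rng.randrange(1, U), rng.randrange(U)) for _ in range(k)]


def summary_ticket(packets: Iterable[int],
                   params: Sequence[Tuple[int, int]]) -> Tuple[int, ...]:
    """t_j = min over held packets i of (a_j * i + b_j) mod U."""
    held = list(packets)
    if not held:
        raise ValueError("node holds no packets")
    return tuple(min((a * i + b) % U for i in held) for a, b in params)


def similarity(t1: Sequence[int], t2: Sequence[int]) -> float:
    """Fraction of ticket positions on which the two tickets agree."""
    if len(t1) != len(t2):
        raise ValueError("tickets must have the same length")
    return sum(x == y for x, y in zip(t1, t2)) / len(t1)


params = make_hash_params(k=10)
mine = summary_ticket(range(0, 1000), params)
peer = summary_ticket(range(500, 1500), params)
print(similarity(mine, peer))   # low values mark peers worth adding as senders
```

Because each t_j is a minimum over the whole set, a ticket stays k values long no matter how many packets a node holds.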
6 Recovering Data From Peers
- Node periodically supplies its senders with a representation of the set of packets it currently has.
  - limited to current working set (range of sequence numbers)
  - set is represented by a Bloom filter
  - each sender is also assigned a fraction of the working set
    - packets with sequence numbers equal to i mod s, where s is number of senders
- Senders transmit missing packets using available bw.
  - packets checked against Bloom filter
  - don't send if packet's sequence number is in Bloom filter
  - because senders selected for dissimilarity, most packets should pass check
- Example of numerical parameters.
  - 5 second epoch, 30 second working set, 500 Kb/s = 50 p/s, 10 senders
  - so asking each sender for at most 150 packets
  - simpler and possibly more efficient to use bit vector
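The sender-side check can be sketched as below. The `BloomFilter` class (its size, number of hashes, and use of SHA-256) and the `packets_to_send` helper are assumptions chosen for illustration; the slides do not specify how the filter is built.

```python
import hashlib
from typing import Iterable, Iterator, List


class BloomFilter:
    """Small Bloom filter over sequence numbers (illustrative parameters)."""

    def __init__(self, num_bits: int = 4096, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, seq: int) -> Iterator[int]:
        for j in range(self.num_hashes):
            digest = hashlib.sha256(f"{j}:{seq}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, seq: int) -> None:
        for pos in self._positions(seq):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, seq: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(seq))


def packets_to_send(sender_has: Iterable[int], receiver_filter: BloomFilter,
                    sender_index: int, num_senders: int) -> List[int]:
    """Packets in this sender's share (seq mod s == i) that the receiver lacks."""
    return sorted(seq for seq in sender_has
                  if seq % num_senders == sender_index
                  and seq not in receiver_filter)


# Numbers from the slide: 30 s working set at 50 p/s, split over 10 senders.
working_set_packets = 30 * 50
per_sender = working_set_packets // 10
print(working_set_packets, per_sender)   # 1500 150
```

A Bloom filter has no false negatives, so a sender never retransmits a packet the receiver reported; a false positive only means an occasional missing packet is skipped. A plain bit vector for the same working set would need 1500 bits, which is the basis for the bit-vector remark above.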
7 Distributing Random Subsets
- Objective: give each node a random sample of other nodes in the tree, excluding its descendants.
- Two phases
  - collect phase: propagate random subsets up the tree
  - distribute phase: propagate random subsets back down tree
    - mix subset received from parent with child subsets
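A simplified sketch of the two phases follows. It is not the paper's protocol: the `compact` helper, the weighting of each group by the size of the population it was sampled from, and the dictionary representation of the tree are assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[List[str], int]   # (sampled node names, population sampled from)


def compact(groups: List[Sample], size: int, rng: random.Random) -> Sample:
    """Merge samples into one of at most `size` names, weighted by population."""
    pools = [(list(names), pop) for names, pop in groups if names]
    total = sum(pop for _, pop in groups)
    merged: List[str] = []
    while pools and len(merged) < size:
        weights = [pop for _, pop in pools]
        idx = rng.choices(range(len(pools)), weights=weights)[0]
        names, _ = pools[idx]
        merged.append(names.pop(rng.randrange(len(names))))
        if not names:
            pools.pop(idx)
    return merged, total


def ransub_epoch(tree: Dict[str, List[str]], root: str, size: int,
                 seed: int = 0) -> Dict[str, List[str]]:
    """Run one collect/distribute round; return the subset each node receives."""
    rng = random.Random(seed)
    collected: Dict[str, Sample] = {}
    received: Dict[str, List[str]] = {}

    def collect(node: str) -> Sample:
        # Collect phase: sample of this node's subtree, passed up to the parent.
        groups = [([node], 1)] + [collect(c) for c in tree.get(node, [])]
        collected[node] = compact(groups, size, rng)
        return collected[node]

    def distribute(node: str, from_parent: Sample) -> None:
        # Distribute phase: each child gets the parent's set mixed with the
        # node itself and the collect sets of the child's siblings.
        received[node] = from_parent[0]
        children = tree.get(node, [])
        for child in children:
            groups = [from_parent, ([node], 1)]
            groups += [collected[s] for s in children if s != child]
            distribute(child, compact(groups, size, rng))

    collect(root)
    distribute(root, ([], 0))
    return received
```

Because a child's set is built only from its parent's set, the parent itself, and its siblings' collect sets, it can never contain the child or any of the child's descendants.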
8 Other Features
- Formation of overlay tree.
  - not a focus of this work
  - paper cites variety of previous tree construction algorithms
  - argues that use of mesh makes tree quality less important
  - most results use random tree
- Data encoding
  - not a focus of this work
  - suggests use of erasure codes or multiple-description codes
  - reported results neglect encoding
- Transport protocol
  - unreliable version of TFRC
  - adjusts sending rate to match fair-share bandwidth based on detected loss probability
  - feedback from transport sender used by Bullet to adjust rates at which it attempts to send to children
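For reference, the way TFRC maps a detected loss probability to a fair-share rate can be sketched with the TCP throughput equation from the TFRC specification (RFC 5348). The function name and the example packet size are assumptions; b = 1 and t_RTO = 4R are the simplifications the RFC recommends. Whether Bullet's modified transport uses exactly this form is not stated in the slides.

```python
from math import sqrt


def tfrc_rate(s: float, rtt: float, p: float, b: int = 1) -> float:
    """Allowed sending rate in bytes/second from the RFC 5348 equation.

    s   -- packet size in bytes
    rtt -- round-trip time in seconds
    p   -- loss event rate, 0 < p <= 1
    b   -- packets acknowledged per ACK (RFC recommends 1)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    t_rto = 4 * rtt   # RFC 5348 simplification
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
    return s / denom


# Example: 1250-byte packets, 100 ms RTT; the rate falls as loss rises.
for loss in (0.001, 0.01, 0.05):
    print(loss, round(tfrc_rate(1250, 0.1, loss)))
```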
9 Performance Evaluation
- Most results emulated in ModelNet (Duke network emulation system, similar to Emulab)
  - uses 50 machines (2 GHz P4s running Linux) to emulate 1,000 node overlay
  - 20,000 node network topology generated using INET
  - link delays determined by geographic separation
  - random subset of network client nodes selected for overlay
  - random node in overlay selected as root
- Link bandwidths
  - four classes of links
  - three scenarios
10 Tree Quality for Streaming Data
- bottleneck bandwidth tree constructed heuristically to have no small-capacity links (off-line construction)
- point is to show that bottleneck tree is much better than random tree used by Bullet
11 Baseline Bullet Performance
- average better than bottleneck tree for streaming case
- not too much excess traffic
- 3σ point suggests a few percent get less than half average
- most data from mesh
12 Cum. Dist. of Received Bandwidth
- median about 525 Kb/s
- about 100 nodes get
- about 50 nodes get
13 Comparing Bullet to Streaming
- when plenty of bandwidth, both do well
- Bullet does better when bandwidth limited
- much better when bandwidth scarce
14 Effect of Oblivious Data Distribution
- tree nodes attempt to forward all received packets to all children
- average throughput drops by 25%
15 Bullet vs Epidemic Routing
- Push gossiping
  - nodes forward non-duplicate packets to randomly chosen set of peers in their local view
  - packets forwarded as they arrive
  - no tree required
- Streaming with anti-entropy
  - data streamed over multicast tree
  - gossip with random peers to retrieve missing data
  - anti-entropy (?) used to locate missing data
    - apparently, this means periodically select a random peer and send it list of missing packets; peer supplies missing packets it has
- Experiments done on 5000 node topology.
  - no physical losses
  - link bandwidths chosen from medium range
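The anti-entropy step, as interpreted above, can be sketched as one round of "pick a random peer, send the list of missing packets, receive whatever that peer holds". The function name and the dictionary of per-node packet sets are assumptions made for the example.

```python
import random
from typing import Dict, Set


def anti_entropy_round(me: str, have: Dict[str, Set[int]],
                       working_set: Set[int], rng: random.Random) -> Set[int]:
    """One round for node `me`; returns the packets the chosen peer supplied."""
    missing = working_set - have[me]
    peers = sorted(n for n in have if n != me)
    if not missing or not peers:
        return set()
    peer = rng.choice(peers)            # random peer, no knowledge of its data
    supplied = missing & have[peer]     # peer returns the missing packets it has
    have[me] |= supplied
    return supplied


have = {"a": {1, 2}, "b": {2, 3, 4}, "c": {5}}
got = anti_entropy_round("a", have, set(range(1, 7)), random.Random(1))
print(sorted(got), sorted(have["a"]))
```

Unlike Bullet's summary tickets, the peer here is chosen blindly, so a round can return few or no useful packets.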
16 Bullet vs Epidemic Routing
- gossiping has high overhead; Bullet and streaming w/AE are comparable
- Bullet outperforms epidemic routing
17 Performance on Lossy Links
- non-transit links lose 0 to 0.3% of packets
- transit links lose 0 to 0.1%
- 5% of links lose 5 to 10%
- Bullet dramatically better than streaming over bottleneck tree topology
18 Performance with Failing Node
- Child of root with 110 descendants fails at time 250.
- Assume no underlying tree recovery.
- Evaluate effect of failure detection and establishment of new peer relationships
  - adding new peers eliminates degradation
  - degraded throughput using existing peers
19 Bullet Performance on PlanetLab
- 47 nodes with 10 in Europe, including root.
- Compare to streaming over hand-crafted trees.
  - European nodes all near tree root
  - select children of root to have poor bandwidth from root, recurse
20 Related Work
- Peer-to-peer data dissemination
  - Snoeren et al. 2001
  - FastReplica, Cherkasova and Lee 2003
  - Kazaa and BitTorrent
- Epidemic data propagation
  - pbcast, Birman et al. 1999
  - lpbcast, Eugster et al. 2001
- Multicast
  - Scalable Reliable Multicast, Floyd et al. 1997
  - Narada, Chu et al. 2000
  - Overcast, Jannotti et al. 2000
- Content streaming
  - SplitStream, Castro et al. 2003
  - CoopNet, Padmanabhan et al. 2003
21 Conclusions
- Overlay mesh is superior to streaming over overlay multicast tree.
- Bullet demonstrates this and includes some novel features.
  - method of distributing data to subtrees to equalize availability
  - scalable method of finding peers who can supply missing data
- Large-scale performance evaluation
  - supports claims for Bullet's performance advantages
  - explores performance under variety of conditions
22 Discussion Questions
- What are the contributions?
  - what's novel? what's borrowed?
  - does it represent an improvement? how much?
  - are the authors' claims justified?
  - for what applications might system be useful?
- Are the authors' design choices adequately justified?
  - method for distributing data over tree?
  - acquiring information about peer data?
  - use of summary tickets? use of Bloom filters? use of TFRC?
- Is performance evaluation satisfactory?
  - what about other network topologies, link bandwidths?
  - what about different group sizes? tree characteristics?
  - why no detailed examination of specific design choices?
  - why isn't cross traffic emulated?
  - what about processing requirements? what's the average number of cycles executed per packet received?
  - is it repeatable by others?