Title: Epidemics
1Epidemics
- by Charles Yang
- Ted Pongthawornkamol
- 9/16/20
2Prelude: Multicasting
- Many protocols
- MBONE, 6BONE, XTP, etc.
- Principally designed for scalability
- Fault tolerance isn't really addressed
3Multicasting (cont)
- Scalable Reliable Multicast
- But as the graph shows, not that scalable
4So now what? Epidemics!
- Recap from Indy's 1st lecture
- Definitions
- Infective: a node with an update it wants to share
- Susceptible: a node which has not yet received the update
- Removed: a previously infective node which is no longer sharing
5Recap (cont)
- An infective node n receives a msg and forwards it with probability p to a susceptible node
- Can be shown that it spreads quickly with high probability
- Lightweight
- Highly fault-tolerant
6Outline of Presentation
- Epidemic Algorithms for Replicated Database Maintenance
- Bimodal Multicast
- Gossip-Based Ad Hoc Routing
7Epidemic Algorithms for Replicated Database
Maintenance
- Xerox's Corporate Internet (CIN), Clearinghouse Servers, about 1986-1987
- Name resolution service
- Several hundred Ethernets, connected by gateways and phone lines
- DB replication was filling up the bandwidth
8The Problem
- Inject an update at one server, and have it propagate to all other servers
- How to make it robust and scale well?
- Important factors
- Convergence time: time required for an update to propagate to all sites
- Network traffic: traffic required to propagate a single update (want to minimize!)
9Three Methods for Spreading Updates
- direct mail (basically multicast or flooding)
- anti-entropy (epidemic)
- rumor mongering/gossiping (epidemic)
10CIN's Initial Configuration
- Direct mail to send updates
- Anti-entropy to bring DBs into sync
- Re-mailing if the previous anti-entropy found a disagreement
- Anti-entropy run once a day, between 12am and 6am
- Eventually, anti-entropy couldn't complete in the allowed time due to traffic
- For instance, for a domain stored at 300 sites, 90,000 messages might be introduced in one night
11Direct Mail
[Figure, slides 11-13: animation frames showing site s]
14Direct Mail Issues
- A lot of bandwidth: n messages per update
- Not quite reliable: messages can be lost (crashes, buffer overflows)
- s may also not have current knowledge of S (the set of all sites)
15Anti Entropy
- Runs in the background to recover from errors
- Initially errors from direct mail, later from rumor mongering
- Executed periodically at each site s
- FOR SOME s' ∈ S DO
- ResolveDifference[s, s']
- ENDLOOP
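A minimal Python sketch of the anti-entropy loop above, assuming each database maps a key to a (value, timestamp) pair and that the newer timestamp wins. The names `resolve_difference` and `anti_entropy_round` and the data layout are illustrative assumptions, not taken from the Clearinghouse implementation.

```python
import random

def resolve_difference(db_a, db_b):
    """Push-pull exchange: afterwards both replicas hold the newest entry per key.

    Entries are (value, timestamp) pairs; this layout is an assumption.
    """
    for key in set(db_a) | set(db_b):
        candidates = [e for e in (db_a.get(key), db_b.get(key)) if e is not None]
        newest = max(candidates, key=lambda entry: entry[1])
        db_a[key] = newest
        db_b[key] = newest

def anti_entropy_round(replicas, rng):
    """Every site s picks one partner s' uniformly at random and resolves differences."""
    names = list(replicas)
    for s in names:
        partner = rng.choice([name for name in names if name != s])
        resolve_difference(replicas[s], replicas[partner])

replicas = {
    "a": {"x": ("new", 2)},
    "b": {"x": ("old", 1)},
    "c": {},
}
anti_entropy_round(replicas, random.Random(0))
print(replicas)
```

Here the exchange is push-pull; a pull-only or push-only variant would update just one of the two replicas.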
16Anti-Entropy (after direct mail)
17Anti-Entropy (Cycle 1, start)
18Anti-Entropy (Cycle 1, end)
19Anti-Entropy (Cycle 2, start)
20Anti-Entropy (Cycle 2, end)
21Anti-Entropy (Cycle 3, start)
22Anti-Entropy (Cycle 3, end)
[Figures, slides 16-22: one animation frame per slide, each showing site s]
23Anti Entropy (cont)
- Assume the partner site is chosen uniformly (we talk about spatial distributions later)
- Slow and expensive, but reliable
- Since it is usually used as a backup, the number of susceptible sites is small
- Three variants: pull, push-pull, push
24Pull
- $p_i$ is the probability that a site remains susceptible after the $i$th cycle
- A site remains susceptible after the $(i+1)$st cycle if
- it was susceptible after the $i$th cycle
- and it contacted a susceptible site in the $(i+1)$st cycle
- So $p_{i+1} = (p_i)^2$
- Converges rapidly to 0 when $p_i$ is small
- In other words, it is very unlikely that susceptible sites will remain after a while
25Push
- A site remains susceptible after the $(i+1)$st cycle if
- it was susceptible after the $i$th cycle
- and no infectious site contacted it in the $(i+1)$st cycle
- $p_{i+1} = p_i (1 - 1/n)^{n(1 - p_i)}$
- Approximately $p_{i+1} = p_i e^{-1}$ (for small $p_i$ and large $n$)
- Converges too, but not nearly as quickly as pull
- Hence pull or push-pull is preferred to just push
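To make the difference between the pull and push recurrences concrete, the sketch below iterates both and counts the cycles needed to drive the susceptible probability below a target. The starting value 0.1, the target 1e-6 and n = 1000 are arbitrary assumptions chosen for illustration.

```python
import math

def pull_step(p):
    """Pull recurrence: p_{i+1} = p_i^2."""
    return p * p

def push_step(p, n):
    """Push recurrence: p_{i+1} = p_i * (1 - 1/n)^(n * (1 - p_i))."""
    return p * (1.0 - 1.0 / n) ** (n * (1.0 - p))

def cycles_until(step, p0, target):
    """Count the cycles needed for the susceptible probability to drop to target."""
    p, cycles = p0, 0
    while p > target:
        p = step(p)
        cycles += 1
    return cycles

n = 1000  # assumed number of sites
pull_cycles = cycles_until(pull_step, 0.1, 1e-6)
push_cycles = cycles_until(lambda p: push_step(p, n), 0.1, 1e-6)
print("pull:", pull_cycles, "push:", push_cycles)
```

Pull squares the probability each cycle, while push only shrinks it by a factor of roughly 1/e, so push needs several times as many cycles.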
26Some Anti-Entropy Optimizations
- Comparing DBs is expensive, but most DBs are pretty similar
- Could maintain a checksum of the DB
- Compare checksums
- If they don't match, then start comparing the DBs
- Naïve!
27Optimizations (cont)
- Define a time window $\tau$ (the time by which updates should have spread)
- Keep a checksum of the database AND a recent update list (entries with age < $\tau$)
- Two sites first exchange their recent update lists
- then apply them, compute new checksums, and compare
- $\tau$ must be chosen well
- If n grows too much
- expected time for a msg to spread > $\tau$
- recent update lists are likely to differ
- Another variation: an inverted index of the DB by timestamp
- Sites exchange updates in reverse timestamp order until the checksums match
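A rough sketch of the checksum plus recent-update-list exchange, assuming each entry is a (value, timestamp) pair and using SHA-256 as a stand-in checksum. The function names, the parameter `tau` for the time window, and the fall back to a full comparison are illustrative assumptions.

```python
import hashlib

def checksum(db):
    """Order-independent checksum over all (key, value, timestamp) entries."""
    digest = hashlib.sha256()
    for key in sorted(db):
        value, timestamp = db[key]
        digest.update(repr((key, value, timestamp)).encode())
    return digest.hexdigest()

def recent_updates(db, now, tau):
    """Entries younger than the time window tau."""
    return {k: e for k, e in db.items() if now - e[1] < tau}

def merge(db, updates):
    """Apply updates to db, keeping the newer timestamp for each key."""
    for key, entry in updates.items():
        if key not in db or db[key][1] < entry[1]:
            db[key] = entry

def sync(db_a, db_b, now, tau):
    """Return True if exchanging recent updates sufficed, False if a full compare ran."""
    recent_a = recent_updates(db_a, now, tau)
    recent_b = recent_updates(db_b, now, tau)
    merge(db_a, recent_b)
    merge(db_b, recent_a)
    if checksum(db_a) == checksum(db_b):
        return True
    # Checksums still differ: fall back to the expensive full comparison.
    merge(db_a, db_b)
    merge(db_b, db_a)
    return False
```

When updates spread within the window, the cheap path is the common case; the full comparison only runs when a difference is older than `tau`.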
28Complex Epidemics / Rumor Mongering / Gossip
- Replaces multicasting
- At the expense of a slightly larger convergence time
- And a distinct, though very small, probability of failure
- Called complex just to distinguish them from simple epidemics like anti-entropy
29Basic (Complex) Epidemic
- A susceptible site receives a hot rumor and becomes infective
- It repeatedly shares the rumor with another site
- chosen uniformly at random
- When it contacts a site that already knows the rumor
- with probability 1/k it loses interest in sharing the rumor (and becomes removed)
- After a while, high probability that everyone knows
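The basic complex epidemic can be simulated in a few lines. This is a sketch assuming synchronous cycles, push only, and the feedback/coin rule described above; the function name and the cycle structure are assumptions made for illustration.

```python
import random

def rumor_mongering(n, k, rng):
    """Simulate one rumor among n >= 2 sites.

    Returns the fraction of sites that never heard the rumor.
    """
    knows = [False] * n
    knows[0] = True          # the update is injected at site 0
    infective = {0}
    while infective:
        for site in list(infective):
            # Pick a partner uniformly among all sites other than `site`.
            other = rng.randrange(n - 1)
            if other >= site:
                other += 1
            if knows[other]:
                # Unnecessary contact: lose interest with probability 1/k.
                if rng.random() < 1.0 / k:
                    infective.discard(site)
            else:
                knows[other] = True
                infective.add(other)
    return knows.count(False) / n

for k in (1, 2, 3):
    print(k, rumor_mongering(2000, k, random.Random(42)))
```

The returned value is the residue: the fraction of sites still susceptible once no site is infective. It shrinks as k grows.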
30Can model with differential equations (fun!)
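31c is determined by $i(1 - \epsilon) = \epsilon$
- For large n, $\epsilon$ goes to zero
- Giving the solution $i(s) = \frac{k+1}{k}(1 - s) + \frac{1}{k}\log s$, where s and i are the fractions of susceptible and infective sites
- $i(s)$ is zero when $s = e^{-(k+1)(1-s)}$
- Yeah, yeah, so what does it mean?
- This is an implicit equation for s
- s decreases exponentially with k (1/k is the probability that a site becomes removed)
- k = 1: 20% will miss
- k = 2: 6% will miss
- So with each consecutive round, there is a high probability that no susceptibles will be left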
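The implicit equation $s = e^{-(k+1)(1-s)}$ has no closed form, but it can be solved numerically. The sketch below uses plain fixed-point iteration starting from s = 0; the choice of method and the iteration count are assumptions, and any root finder would do.

```python
import math

def residue(k, iterations=200):
    """Solve s = exp(-(k + 1) * (1 - s)) for the root below 1 by fixed-point iteration."""
    s = 0.0
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1.0 - s))
    return s

for k in (1, 2, 3, 4):
    print(k, round(residue(k), 4))
```

For k = 1 and k = 2 this gives roughly 0.20 and 0.06, matching the 20% and 6% figures above.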
32Can vary complex epidemics
- Concerned with
- Residue: when i is zero, what's s? (sites that never heard the rumor)
- Traffic
- Delay
- $t_{avg}$: time for a random node to receive the msg
- $t_{last}$: time for the last node that will receive the msg to receive it
33Variations (cont)
- Blind vs Feedback
- Blind: loses interest with probability 1/k regardless of whether the contacted node already knew the msg
- Counter vs Coin
- With a counter, a site loses interest after k unnecessary contacts
- Push vs Pull
- Basic version used push, but can use pull
- Pull works well if there is a high number of independent updates
- but when the DB is quiescent, pull has more useless overhead than push
34Variations (cont)
- Minimization
- Use push and pull together; if both sides know the update, only the site with the smaller counter is incremented (on equality, both are incremented)
- Connection limit
- If there are a lot of updates, a connection limit is needed
- Pull gets worse but push gets better!
- Hunting
- If one connection is rejected, try another
35So instead of mailing + anti-entropy
- Use rumor mongering
- And back up with anti-entropy
36Death Certificates
- With anti-entropy, deletion doesn't really work
- The absence of an entry will be repaired with an old version from another site
- Death certificates
- carry timestamps
- when compared with an older entry, the older entry is deleted
- they take up space
- but if you delete them, you risk seeing old data resurrected
- Enter dormant death certificates
37Dormant Death Certificates
- Two thresholds, $\tau_1$ and $\tau_2$
- Each server retains a DC for $\tau_1$
- After $\tau_1$, most sites delete the DC, while a few keep it as a dormant DC
- If old data meets a dormant DC, propagate the DC again
- After $\tau_1 + \tau_2$, delete the dormant DC
38Dormant DCs (cont)
- Does not scale indefinitely
- If n grows too much, the time to propagate DCs exceeds $\tau_1$
- Dormant DCs are then more likely to be activated and propagated, adding to overhead
- The ultimate result is catastrophic failure
39Dormant DCs (cont)
- Don't spread dormant DCs
- If one is reactivated, could reset its timestamp
- But this is wrong (might cancel a legitimate update)
- So use a second timestamp, called the activation timestamp, which is set when the DC is reactivated
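A small sketch of how the two timestamps and the two thresholds could interact. The class and function names, the string states, and the `is_retention_site` flag (marking the few sites that keep dormant copies) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class DeathCertificate:
    key: str
    timestamp: float    # ordinary timestamp: decides which data items it cancels
    activation: float   # activation timestamp: decides ageing and dormancy

def certificate_state(dc, now, tau1, tau2, is_retention_site):
    """Classify a certificate by the age of its activation timestamp."""
    age = now - dc.activation
    if age < tau1:
        return "active"
    if is_retention_site and age < tau1 + tau2:
        return "dormant"
    return "deleted"

def meet(item_timestamp, dc, now):
    """A data item for dc.key meets the certificate; True if the item is cancelled."""
    if item_timestamp >= dc.timestamp:
        return False        # a legitimate later update survives
    dc.activation = now     # reactivate; the ordinary timestamp stays untouched
    return True

dc = DeathCertificate("x", timestamp=10, activation=10)
print(certificate_state(dc, now=50, tau1=30, tau2=100, is_retention_site=True))
```

Because reactivation only changes `activation`, an update written after the original deletion is never cancelled by a reawakened certificate.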
40Spatial Distributions
- Networks aren't homogeneous
- Some links are slower than others
- Can be broken up into different types of zones
- We want to favor locality as we spread updates, to minimize traffic
41Spatial Distributions (cont)
- Probability of connecting to a site at distance d is proportional to $d^{-a}$, where a is to be determined
- Intuitively, a indicates how local your connections are going to be
- So an increase in a → an increase in locality
- With increased locality, need to compensate in order to break out of the local neighborhood
- more connections
- more rounds
- Also generalizes to more dimensions: $d^{-2D}$ in D dimensions
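Choosing a partner with probability proportional to $d^{-a}$ can be sketched with a weighted random choice. The `distances` mapping (site name to distance, with every distance at least 1) and the function names are assumptions for illustration.

```python
import random

def partner_weights(distances, a):
    """Unnormalized weight d^(-a) for each candidate site."""
    return {site: d ** -a for site, d in distances.items()}

def choose_partner(distances, a, rng):
    """Pick one site with probability proportional to d^(-a)."""
    weights = partner_weights(distances, a)
    sites = list(weights)
    return rng.choices(sites, weights=[weights[s] for s in sites], k=1)[0]

distances = {"near": 1, "mid": 3, "far": 10}
rng = random.Random(0)
picks = [choose_partner(distances, 2, rng) for _ in range(1000)]
print({site: picks.count(site) for site in distances})
```

With a = 0 every site is equally likely (the uniform case); larger values of a concentrate the choices on nearby sites.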
42Spatial Distribution
- Anti-Entropy
- Notice the traffic on the Bushey (trans-Atlantic) link
- uniform (75.74) vs a = 2 (2.38)
- For gossiping
- Since rumors eventually become inactive, a rumor needs to spread a lot in the beginning
- Hence, pump up k
43Summary for Demers et al
- Direct Mailing
- Rumor Mongering
- Anti-Entropy
- Issues
- Research into the effect of, and optimizing for, topology
- Need to know S
- Scalability with n
- Churn
- Bimodal Multicast will address
- What about throughput stability?
- What about a higher rate of msgs?
44Bimodal multicast
- A technique that applies the epidemic concept to achieve scalable and reliable multicast
- Uses epidemics in the form of anti-entropy
- Randomly choose members in the group
- Synchronize state
45Two classes of multicast
- strong reliability
- atomicity
- delivery ordering
- virtual synchrony
- security
- real-time
- more overhead, unpredictable behavior in some situations
- Best-effort reliability
- scalable
- provides no end-to-end delivery guarantee
- no strong membership view
- a certain level of failure discovery
- SRM, MUSE, RMTP, etc.
46Multicast Examples
- Virtual synchrony
- Strongly reliable
- Significant degradation with even just a few node failures
- Suitable for small groups, limited to short bursts of multicasts
- SRM
- Best-effort reliable
- Vulnerable to stochastic failures
- Meltdown can occur in large networks
- Neither of them addresses the stability problem under failures
47Fault-tolerance problem
- Virtual synchrony performs badly under failures
48Bimodal multicast
- Also called probabilistic broadcast (pbcast)
- Fills the gap between the two approaches
- scalable
- predictably reliable even under bad conditions
- Complements existing mechanisms, such as virtual synchrony
- Atomic
- Provides stability
- Throughput stability
- Multicast stability
49Pbcast protocol
- Consists of two concurrent subprotocols
- Optimistic dissemination protocol, such as IP multicast
- Two-phase anti-entropy protocol to deal with the synchronization problem
- first phase detects message loss
- second phase corrects losses
50Optimistic dissemination protocol
- Each node must possess the list of all members
- Generate a set of spanning trees
- Simple algorithm
- Randomly choose a spanning tree
- Every node uses the same spanning tree to forward the message
- The set of spanning trees must be recalculated each time nodes join or leave
51Two-Phase Anti-Entropy Protocol
- Detects and corrects any inconsistencies by gossiping
- First, nodes randomly choose members and send them their message histories
- Such a history summary is also called a digest
- Second, recipient nodes may ask the sender nodes for missing messages
- Emphasizes recent histories over old ones
- Prevents system degradation caused by faulty nodes trying to get all messages in the history
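The two phases can be sketched as a digest exchange followed by a solicitation. The sketch assumes integer message ids that grow over time, a dictionary from id to payload as the message history, and a `window` parameter bounding how far back a digest reaches; all names are illustrative, not from the pbcast implementation.

```python
def make_digest(history, window):
    """Phase 1: ids of the most recent `window` messages the sender holds."""
    ids = sorted(history)
    return ids[max(0, len(ids) - window):]

def missing_from(digest, history):
    """Ids advertised in the digest that the receiver does not have."""
    return [msg_id for msg_id in digest if msg_id not in history]

def gossip_round(sender, receiver, window):
    """One gossip exchange; returns the ids retransmitted, most recent first."""
    digest = make_digest(sender, window)
    wanted = sorted(missing_from(digest, receiver), reverse=True)
    for msg_id in wanted:
        # Phase 2: the receiver solicits and the sender retransmits.
        receiver[msg_id] = sender[msg_id]
    return wanted

p = {1: "M1"}   # process P missed M0
q = {0: "M0"}   # process Q missed M1
gossip_round(q, p, window=8)
gossip_round(p, q, window=8)
print(p, q)
```

Limiting the digest to a recent window is what keeps a lagging node from requesting the entire history.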
52Example
- Some processes do not get a message from the unreliable multicast
- Process P misses message M0, Q misses M1
- P gets M0 in the first round of anti-entropy, Q gets M1 in the next round
53Optimizations
- Some optimizations are used with bimodal multicast to gain better performance
- Soft-Failure Detection
- Round Retransmission Limit
- Cyclic Retransmissions
- Most-Recent-First Retransmission
- Independent Numbering of Rounds
- Random Graphs for Scalability
- Multicast for some retransmission
54Computational Result
55Throughput Stability
Figure: Number of susceptible processes versus number of gossip rounds when the initial multicast fails (left) and when it reaches 90% of processes (right; note scale). Both runs assume 1000 processes.
56Latency
- Expected number of rounds increases as a function of log(n)
- Variance of latency increases as a function of sqrt(n)
- Scalable
57Performance comparison
- Compare bimodal multicast with two multicast protocols
- Virtual Synchrony
- SRM
- Bimodal multicast beats both protocols under failures
58Pbcast VS Virtual Synchrony
- Pbcast reacts to failures better
59Bandwidth comparison
60Stability
61Pbcast VS SRM
- Pbcast incurs much less overhead under failures
62Optimizations
63Conclusion
- Bimodal Multicast uses anti-entropy to achieve both reliability and scalability
- Performs well under failure
- Predictable traffic overhead
- Can be used with strong-reliability multicast, such as virtual synchrony
- Suggestion
- Incurs constant overhead (even when the network is in good condition)
- Trade-off
- Lack of a membership management mechanism
- Nodes join, nodes leave
- Cannot handle churn
64Gossip-based Ad Hoc Routing
- The epidemic concept can be adopted for ad hoc routing
- Shown to be more efficient than the traditional flooding method [Haas et al.]
- Exhibits bimodal behavior (from percolation theory [Grimmett et al.])
- Implemented and tested with the AODV routing protocol
65Problem
- In a mobile ad hoc network, which has no fixed infrastructure
- Routes to other nodes constantly change
- Each node can communicate only by broadcasting
- How to find routes to other nodes in the network?
- GPS
- expensive
- Flooding
- more overhead
- Gossip-based
66Gossiping concept
- When an arbitrary node receives a message, with probability p it forwards the message to all of its neighbors by broadcasting
- Otherwise, with probability 1-p, it discards the message
67Flooding VS Gossip
68Percolation Theory
- Gossiping exhibits bimodal behavior
- Given the gossip probability p
- $\theta_S(p)$ is the probability that the gossip does not die out
- If the gossip does not die out, a node gets the message with probability $\theta_R(p)$
- In most cases, $\theta_R(p) \approx 1$
- How to find the lowest such p?
- Also called the percolation threshold ($p_c$)
- What is the maximum number of nodes that can be picked such that the graph is still connected? Answer: $(1 - p_c)n$
69How to choose p ?
- If p is too small (p → 0)
- Little traffic overhead
- The communication probably dies out and many nodes will not get the message
- If p is too big (p → 1)
- More reliable (almost all nodes get the message)
- More traffic overhead (flooding if p = 1)
70GOSSIP1
- In some cases, the message dies out at the source because the source has few neighbors
- To deal with such cases, the message is flooded at the beginning and gossiped after that
- GOSSIP1(p,k) forwards with probability 1 for the first k hops and with probability p after that
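A small simulation sketch of GOSSIP1(p,k) on a grid. It assumes each node decides once, on first receipt, whether to rebroadcast, and that the source counts as hop 0; the helper names `grid` and `gossip1` and the breadth-first processing order are illustrative assumptions.

```python
import random
from collections import deque

def grid(width, height):
    """Adjacency lists for a width x height grid with 4-neighbor links."""
    adj = {}
    for x in range(width):
        for y in range(height):
            adj[(x, y)] = [(x + dx, y + dy)
                           for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if 0 <= x + dx < width and 0 <= y + dy < height]
    return adj

def gossip1(adj, source, p, k, rng):
    """Return the set of nodes that receive a message sent from source."""
    received = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        # Within the first k hops always rebroadcast; afterwards only with probability p.
        if hops >= k and rng.random() >= p:
            continue
        for neighbor in adj[node]:
            if neighbor not in received:
                received.add(neighbor)
                queue.append((neighbor, hops + 1))
    return received

g = grid(30, 30)
for p in (0.4, 0.6, 0.8):
    reached = gossip1(g, (15, 15), p, 3, random.Random(7))
    print(p, len(reached) / len(g))
```

Running it over many seeds for a fixed p is one way to observe the bimodal behavior: most runs reach either very few nodes or most of them.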
71GOSSIP1 Example
3-hop flooding
73Simulation on 1000x1000 grid
74Dropoff
- Most real graphs are finite
- Some nodes are close to the boundary
- In the results, such nodes have a lower probability of getting the message
- The reasons are
- Boundary nodes have few neighbors
- Back-propagation is not possible for boundary nodes
75Optimizations
- Some techniques are adopted to boost the performance of gossip routing
- Two-threshold scheme
- Preventing premature gossip death
- Retries
- Zones
76Two-threshold scheme
- Nodes with few neighbors should gossip with a higher probability
- GOSSIP2(p1,k,p2,n)
- GOSSIP1(p1,k) if the sender node has n or more neighbors
- GOSSIP1(p2,k) otherwise (with p2 > p1)
- Useful for sparse graphs
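The two-threshold rule reduces to picking a forwarding probability per received message. A sketch, assuming the receiver learns the sender's neighbor count (for example, carried in the message) and that the source is hop 0; the function and parameter names are illustrative.

```python
def gossip2_probability(p1, k, p2, n, hops, sender_degree):
    """Forwarding probability for a message received from a sender.

    hops: hops travelled so far; sender_degree: the sender's neighbor count.
    """
    if hops < k:
        return 1.0              # flood for the first k hops
    if sender_degree >= n:
        return p1               # well-connected sender: normal gossip probability
    return p2                   # sparse sender: neighbors gossip with the higher p2

print(gossip2_probability(0.6, 2, 1.0, 6, hops=5, sender_degree=3))
```

The printed case is a sender with only 3 neighbors (below n = 6), so its neighbors forward with p2.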
77GOSSIP2 Example
3-hop flooding
78GOSSIP1 VS GOSSIP2
79Preventing premature gossip death
- A node receives a message and decides not to forward it
- But then, after a period, it notices that it has received very few copies of the gossip from its neighbors
- Probably because the broadcast died out
- So that node finally decides to broadcast
- GOSSIP3(p,k,m)
- Behaves like GOSSIP1(p,k)
- In the 1-p case (message not forwarded), if fewer than m copies are received from neighbors (a sign of gossip death), broadcast with p = 1
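GOSSIP3 adds a second, delayed decision to GOSSIP1. The sketch below separates the two decisions; how long a node waits before the second one is not specified here, so the timeout is left to the caller, and both function names are assumptions.

```python
import random

def gossip3_first_decision(p, k, hops, rng):
    """On first receipt: behave like GOSSIP1(p, k)."""
    return hops < k or rng.random() < p

def gossip3_late_decision(forwarded, copies_heard, m):
    """After the timeout: broadcast after all if too few copies were overheard."""
    return (not forwarded) and copies_heard < m

rng = random.Random(3)
forwarded = gossip3_first_decision(0.65, 1, hops=4, rng=rng)
print(forwarded, gossip3_late_decision(forwarded, copies_heard=0, m=1))
```

A node that already forwarded never broadcasts a second time, so the extra traffic is limited to nodes that sense the gossip dying out.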
80GOSSIP3 Example
3-hop flooding
81GOSSIP1 VS GOSSIP3
- For the same performance, GOSSIP3 incurs less overhead
82Retries
- With a gossiping protocol, there is always a chance that an existing route will not be found
- Retries suit the bimodal behavior
- Most transmissions succeed
- every node gets the message
- no retries
- A few transmissions die out
- almost no node gets the message (almost no overhead)
- use retries
83Zones
- Each node maintains a list of members within its k-hop zone
- A route to a member in its zone can be found without broadcasting
- Suitable for small networks
- Solves the boundary node problem
- Solves intermediate cases (non-bimodal effect)
- Requires each node to maintain a list of members
84Performance analysis
- AODV
- A well-known ad hoc routing protocol
- Node u requests a route to node v
- Flooding with a small radius
- If a route to v is not found, try again with a larger and larger radius
- If that fails, finally flood throughout the network
- AODV+G
- Instead of the final flood, gossiping is used
85AODV VS AODV+G
86Conclusion
- The epidemic concept can be adopted for ad hoc routing
- Scalable
- Fault-tolerant
- Less overhead than flooding
- Offers a good level of reliability
- Suggestion
- How to find $p = p_c$?
- Can we use feedback?
87Epidemics Summary
- Epidemic algorithms in DB replication
- Gossip + anti-entropy
- Spatial distribution
- Bimodal Multicast
- Multicast + anti-entropy
- Uniformly weighted random choice
- Bimodal behavior
- Gossip-Based Ad Hoc Routing
- Percolating effect
- Bimodal behavior
- How to find the threshold?
88Additional Papers
- Efficient Epidemic-style Protocols for Reliable and Scalable Multicast [Gupta, Kermarrec & Ganesh]
- Topologically sensitive
- Reducing overhead in more quiet systems
89References
- I. Gupta. CS598IG, Fall 2004, First Lecture.
- A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart & D. Terry. Epidemic Algorithms for Replicated Database Maintenance.
- K. Birman, M. Hayden, O. Ozkasap, Z. Xiao, M. Budiu & Y. Minsky. Bimodal Multicast.
- Z. Haas, J. Halpern & L. Li. Gossip-Based Ad Hoc Routing.
- I. Gupta, A. Kermarrec & A. Ganesh. Efficient Epidemic-style Protocols for Reliable and Scalable Multicast.