Title: Lectures on Randomised Algorithms
1. Lectures on Randomised Algorithms
- COMP 523: Advanced Algorithmic Techniques
- Lecturer: Dariusz Kowalski
2. Overview
- Previous lectures
- NP-hard problems
- Approximation algorithms
- These lectures
- Basic theory: probability, random variables, expected value
- Randomised algorithms
3. Probabilistic theory
- Consider flipping two symmetric coins with sides 1 and 0
- Event: a situation which depends on the random generator
- Example event: the sum of the results on the two flipped coins is 1
- Random variable: a function which attaches a real value to every elementary event
- X: the sum of the results on the two flipped coins
- Probability of an event: the proportion of the event within the set of all outcomes (sometimes weighted)
- Pr[X = 1] = 2/4 = 1/2, since
- X = 1 is the event containing two elementary events:
- 0 on the first coin and 1 on the second coin
- 1 on the first coin and 0 on the second coin
4. Probabilistic theory cont.
- Consider flipping two symmetric coins with sides 1 and 0
- Expected value (of a random variable): the sum of all possible values of the random variable, weighted by the probabilities of these values
- E[X] = 0·(1/4) + 1·(1/2) + 2·(1/4) = 1
- Independence: two events are independent if the probability of their intersection is equal to the product of their probabilities
- Event 1: 1 on the first coin; Event 2: 0 on the second coin
- Pr[Event1 ∩ Event2] = 1/4 = Pr[Event1]·Pr[Event2] = 1/2 · 1/2
- Event 3: the sum on the two coins is 2
- Pr[Event1 ∩ Event3] = 1/4 ≠ Pr[Event1]·Pr[Event3] = 1/2 · 1/4, so these two events are not independent (a brute-force check follows below)
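Aside (an illustration, not part of the original slides): the calculations above can be verified by brute-force enumeration of the four equally likely outcomes. A minimal Python sketch:

    from itertools import product

    # The four equally likely outcomes of two fair 0/1 coins.
    outcomes = list(product([0, 1], repeat=2))

    # Pr[X = 1], where X is the sum of the two results.
    print(sum(1 for a, b in outcomes if a + b == 1) / len(outcomes))  # 0.5

    # E[X]: each outcome has probability 1/4.
    print(sum((a + b) / len(outcomes) for a, b in outcomes))  # 1.0

    # Independence check for Event1 ("1 on the first coin") and
    # Event3 ("sum is 2"): compare Pr[E1 and E3] with Pr[E1] * Pr[E3].
    pr_e1 = sum(1 for a, b in outcomes if a == 1) / 4               # 1/2
    pr_e3 = sum(1 for a, b in outcomes if a + b == 2) / 4           # 1/4
    pr_both = sum(1 for a, b in outcomes if a == 1 and b == 1) / 4  # 1/4
    print(pr_both == pr_e1 * pr_e3)  # False: not independent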
5. Randomised algorithms
- Any kind of algorithm using a (pseudo-)random generator
- Main kinds of algorithms
- Monte Carlo algorithm: computes a proper solution with high probability (in practice at least constant)
- Algorithm MC always stops
- Las Vegas algorithm: always computes a proper solution
- Sometimes the algorithm can run very long, but only with very small probability
6. Quick sort: algorithmic scheme
- Generic Quick Sort
- Select one element x from the input
- Partition the input into the part containing elements not greater than x and the part containing all bigger elements
- Sort each part separately
- Concatenate the sorted parts
- Problem: how to choose the element x so as to balance the sizes of the two parts? (to get recursive equations similar to those for MergeSort)
7. Why should the parts be balanced?
- Suppose we do not balance, but always choose the last element
- T(n) ≥ T(n-1) + T(1) + c·n
- T(1) ≥ c
- Solution: T(n) ≥ d·n², for some constant 0 < d ≤ c/2
- Proof by induction.
- For n = 1: straightforward
- Suppose T(n-1) ≥ d·(n-1)²; then
- T(n) ≥ T(n-1) + c + c·n ≥ d·(n-1)² + c·(n+1)
- ≥ d·(n-1)² + 2d·n ≥ d·n²  (using 2d·n ≤ c·n ≤ c·(n+1), since d ≤ c/2)
8. Randomised approach
- Randomised approach
- Select the element x uniformly at random
- Expected time: O(n log n)
- Additional memory: O(n)
- Uniform selection: each element has the same probability of being selected (a code sketch follows below)
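Aside (my illustration, not part of the slides): a minimal Python sketch of randomised Quick Sort following the scheme above; it uses a three-way partition so that duplicates of the pivot are handled safely, and O(n) additional memory for the partitions.

    import random

    def quick_sort(a):
        # Randomised Quick Sort: select the pivot uniformly at random.
        if len(a) <= 1:
            return a
        x = random.choice(a)                # uniform random pivot
        smaller = [y for y in a if y < x]   # elements smaller than x
        equal = [y for y in a if y == x]    # the pivot and its duplicates
        bigger = [y for y in a if y > x]    # elements greater than x
        return quick_sort(smaller) + equal + quick_sort(bigger)

    print(quick_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]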
9. Randomised approach: analysis
- Let T(n) denote the expected time: the sum of all possible values of the running time, weighted by the probabilities of these values
- T(n) ≤ (1/n)·((T(n-1)+T(1)) + (T(n-2)+T(2)) + … + (T(0)+T(n))) + c·n
- T(0) = T(1) = 1, T(2) ≤ c
- Solution: T(n) ≤ d·n·log n, for some constant d ≥ 8c
- Proof by induction.
- For n = 2: straightforward
- Suppose T(m) ≤ d·m·log m for every m < n; then
- (1 - 1/n)·T(n) ≤ (2/n)·(T(0) + … + T(n-1)) + c·n
- ≤ (2d/n)·(1·log 1 + … + (n-1)·log(n-1)) + c·n
- ≤ d·n·log n - d·n/2 + c·n  (using Σ_{m<n} m·log m ≤ n²·log n / 2 - n²/4)
- ≤ d·n·log n - 3d·n/8  (using c ≤ d/8)
- T(n) ≤ (n/(n-1))·(d·n·log n - 3d·n/8) ≤ d·n·log n  (since log n ≤ 3n/8 for n ≥ 2)
10. Tree structure of a random execution
[Figure: the recursion tree of one random execution of Quick Sort on 8 elements, with the chosen pivots at the internal nodes; this particular tree has height 5 and 8 leaves.]
11. Minimum Cut in a graph
- Minimum cut in an undirected multi-graph G (there may be many edges between a pair of nodes):
- a partition of the nodes with the minimum number of crossing edges
- Deterministic approach
- Transform the graph into an s-t network, for every pair of nodes s, t
- Replace each undirected edge by two directed edges in opposite directions, of capacity 1 each
- Replace all multiple directed edges by one edge with capacity equal to the multiplicity of this edge
- Run Ford-Fulkerson (or another network-flow algorithm) to compute the max-flow, which equals the min-cut
12. Minimum Cut in a graph
- Randomised approach (a code sketch follows below)
- Select a random edge:
- contract its end nodes into one supernode,
- remove the edges between these two nodes,
- keep the other edges adjacent to the obtained supernode
- Repeat the above procedure until two supernodes remain
- Count the number of edges between the two remaining supernodes and return the result
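Aside (an illustrative sketch, not from the slides): one run of the contraction procedure in Python, for a connected multigraph given as a list of undirected edges over nodes 0..n-1. A union-find structure tracks the supernodes; picking uniformly among the original edges and rejecting self-loops is equivalent to picking uniformly among the surviving edges.

    import random

    def contract_min_cut(edges, n):
        # One run of randomised contraction; returns the size of the cut found.
        parent = list(range(n))            # union-find over supernodes

        def find(u):                       # root of u's supernode
            while parent[u] != u:
                parent[u] = parent[parent[u]]   # path halving
                u = parent[u]
            return u

        remaining = n
        while remaining > 2:
            u, v = random.choice(edges)    # uniformly random original edge
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                   # self-loop inside a supernode: skip
            parent[ru] = rv                # contract the two supernodes
            remaining -= 1
        # Count the edges crossing between the two remaining supernodes.
        return sum(1 for u, v in edges if find(u) != find(v))

    # Example: a 4-cycle with one chord; its minimum cut has 2 edges.
    print(contract_min_cut([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 4))

A single run may return a larger cut; the next slides compute how often it succeeds and how to amplify the success probability by repetition.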
13. Minimum Cut: analysis
- Let K be a smallest cut (as a set of edges) and let k be its size.
- Compute the probability that in step j an edge of K is selected, given that no edge from K has been selected before:
- Each supernode has at least k adjacent edges (otherwise the cut between a supernode with a smaller number of adjacent edges and the remaining supernodes would be smaller than K)
- The total number of remaining supernodes at the beginning of step j is n - j + 1
- The total number of edges at the beginning of step j is thus at least k·(n - j + 1)/2 (each edge is counted in the degrees of two supernodes)
- The probability of selecting (and so contracting) an edge of K in step j is at most k / (k·(n - j + 1)/2) = 2/(n - j + 1)
14. Minimum Cut: analysis cont.
- Event Bj: in step j of the algorithm an edge not in K is selected
- Conditional probability (of event A given event B):
- Pr[A | B] = Pr[A ∩ B] / Pr[B]
- From the previous slide:
- Pr[Bj | Bj-1 ∩ … ∩ B1] ≥ 1 - 2/(n - j + 1)
- The following holds:
- Pr[Bj ∩ Bj-1 ∩ … ∩ B1] = Pr[B1] · Pr[B2 | B1] · Pr[B3 | B2 ∩ B1] · … · Pr[Bj | Bj-1 ∩ … ∩ B1]
- The probability of the sought event Bn-2 ∩ Bn-3 ∩ … ∩ B1 (i.e., that in all n-2 steps of the algorithm only edges outside K are selected) is therefore at least
- (1 - 2/n)·(1 - 2/(n-1))·…·(1 - 2/3)
- = (n-2)/n · (n-3)/(n-1) · (n-4)/(n-2) · … · 2/4 · 1/3
- = 2/(n·(n-1))
15. Minimum Cut: analysis cont.
- If we iterate this algorithm independently n(n-1)/2 times, always recording the minimum output obtained so far, then the probability of success (i.e., of finding a min-cut) is at least
- 1 - (1 - 2/(n·(n-1)))^(n(n-1)/2) ≥ 1 - 1/e
- To obtain a bigger probability we iterate the process more times (a minimal iteration wrapper is sketched below)
- The total time is O(n³) contractions
- Question: how can contraction be implemented efficiently?
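Aside (my illustration, continuing the earlier sketch and reusing its contract_min_cut): the amplification by repetition described above.

    def min_cut(edges, n, runs=None):
        # Repeat the single-run contraction and keep the smallest cut seen.
        if runs is None:
            runs = n * (n - 1) // 2   # gives success probability >= 1 - 1/e
        return min(contract_min_cut(edges, n) for _ in range(runs))

    # On the example graph, 6 runs find the cut of size 2
    # with probability at least 1 - 1/e.
    print(min_cut([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 4))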
16. Conclusions
- Probabilistic theory
- Events, random variables, expected values
- Basic algorithms
- LV: Randomised Quick Sort (randomised recurrence)
- MC: Minimum Cut (iterating to boost the success probability)
17. Textbook and Exercises
- READING
- Chapter 13, Sections 13.2, 13.3, 13.5 and 13.12
- EXERCISE
- How many iterations of the randomised min-cut algorithm should we perform to obtain a probability of success of at least 1 - 1/n?
- For volunteers
- Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find some min-cut?
18. Overview
- Previous lectures
- Randomised algorithms
- Basic theory: probability, random variables, expected value
- Algorithms: LV (sorting) and MC (min-cut)
- This lecture
- Basic random processes
19. Expected number of successes
- A sequence (possibly infinite) of independent random trials, each with probability p of success
- Expected number of successes in m trials:
- the probability of success in one trial is p, so let Xj be such that Pr[Xj = 1] = p and Pr[Xj = 0] = 1 - p, for 0 < j ≤ m
- E[Σ_{0<j≤m} Xj] = Σ_{0<j≤m} E[Xj] = m·p
- Memoryless guessing: n cards, you guess one, turn over one card, check whether you succeeded, shuffle the cards and repeat; how many rounds do you need in order to expect one correct guess?
- Pr[Xj = 1] = 1/n and Pr[Xj = 0] = 1 - 1/n
- E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = n · (1/n) = 1, so n rounds suffice
20. Guessing with memory
- n cards, you guess one, turn over one card, remove that card, shuffle the rest of them and repeat; how many successful guesses can you expect?
- Pr[Xj = 1] = 1/(n-j+1) and Pr[Xj = 0] = 1 - 1/(n-j+1)
- E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = Σ_{0<j≤n} 1/(n-j+1)
- = Σ_{0<j≤n} 1/j = H_n ≈ ln n + const. (both guessing games are simulated below)
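Aside (an illustration, not from the slides): a small Monte Carlo simulation of both guessing games; the empirical averages should be close to the expected values 1 and H_n computed above.

    import random

    def memoryless_guesses(n, trials=100_000):
        # Each round: guess a card and reveal a card, both uniform over n,
        # so each round succeeds with probability 1/n. Expected total: 1.
        total = 0
        for _ in range(trials):
            total += sum(1 for _ in range(n)
                         if random.randrange(n) == random.randrange(n))
        return total / trials

    def guesses_with_memory(n, trials=100_000):
        # Seen cards are removed: with `remaining` cards left, the guess is
        # correct with probability 1/remaining. Expected total: H_n.
        total = 0
        for _ in range(trials):
            total += sum(1 for remaining in range(n, 0, -1)
                         if random.randrange(remaining) == 0)
        return total / trials

    print(memoryless_guesses(10))   # close to 1
    print(guesses_with_memory(10))  # close to H_10, about 2.93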
21. Waiting for the first success
- A sequence (possibly infinite) of independent random trials, each with probability p of success
- The expected waiting time for the first success is
- Σ_{j>0} j·(1-p)^(j-1)·p = p·Σ_{j≥1} j·(1-p)^(j-1)
- = p · 1/(1-(1-p))²  (using Σ_{j≥1} j·q^(j-1) = 1/(1-q)² for 0 ≤ q < 1)
- = p · 1/p² = 1/p
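Aside (an illustration, not from the slides): a quick simulation of the expected waiting time 1/p.

    import random

    def time_to_first_success(p):
        # Count independent trials until the first success.
        t = 1
        while random.random() >= p:   # each trial succeeds with probability p
            t += 1
        return t

    trials, p = 100_000, 0.2
    print(sum(time_to_first_success(p) for _ in range(trials)) / trials)
    # close to 1/p = 5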
22. Collecting coupons
- n types of coupons are hidden randomly in a large number of boxes, each box containing one coupon. You choose a box and take the coupon from it. How many boxes can you expect to open in order to collect all kinds of coupons?
- Stage j: the time between selecting the (j-1)-th different coupon and the j-th different coupon
- Independent trials Yi for each step i of stage j, satisfying Pr[Yi = 0] = (j-1)/n and Pr[Yi = 1] = (n-j+1)/n
- Let Xj be the length of stage j; by the expected waiting time for the first success, E[Xj] = 1/Pr[Yi = 1] = n/(n-j+1)
- Finally, E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = Σ_{0<j≤n} n/(n-j+1)
- = n·Σ_{0<j≤n} 1/j = n·H_n ≈ n·ln n + n·const. (a simulation sketch follows below)
23. Conclusions
- Probabilistic theory
- Events, random variables, expected values, etc.
- Basic random processes
- Number of successes
- Guessing with or without memory
- Waiting for the first success
- Collecting coupons
24. Textbook and Exercises
- READING
- Section 13.3
- EXERCISES
- How many iterations of the randomised min-cut algorithm should we perform to obtain a probability of success of at least 1 - 1/n?
- A more general question: suppose that a Monte Carlo algorithm answers correctly with probability 1/2. How can it be modified to answer correctly with probability at least 1 - 1/n?
- For volunteers
- Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find a specific min-cut?
25. Overview
- Previous lectures
- Randomised algorithms
- Basic theory: probability, random variables, expected value
- Algorithms: LV sorting, MC min-cut
- Basic random processes
- This lecture
- Randomised caching
26. Randomised algorithms
- Any kind of algorithm using a (pseudo-)random generator
- Main kinds of algorithms
- Monte Carlo algorithm: computes the proper solution with large probability (at least constant)
- Algorithm MC always stops
- We want a high probability of success
- Las Vegas algorithm: always computes the proper solution
- Sometimes the algorithm can run very long, but only with very small probability
- We want a small expected running time (or other complexity measure)
27. On-line vs. off-line
- Dynamic data
- Arrives during the execution
- Algorithms
- On-line: does not know the future, makes decisions on-line
- Off-line: knows the future, makes decisions off-line
- Complexity measure: competitive ratio
- The maximum ratio, taken over all data, between the performance of the given on-line algorithm and the optimal off-line solution for the same data
28. Analysing the caching process
- Two kinds of memory
- Fast memory: a cache of size k
- Slow memory: a disc of size n
- Examples
- hard disc versus processor cache
- network resources versus local memory
- Problem
- In each step a request for a value arrives
- If the value is in the cache then answering costs nothing; otherwise it costs one unit (one access to the slow memory)
- Performance measure
- Count the number of accesses to the slow memory
- Compute the competitive ratio
29. Marking algorithm(s)
- The algorithm proceeds in phases
- Each item in the cache is either marked or unmarked
- At the beginning of each phase all items are unmarked
- Upon a request to item s:
- If s is in the cache, then mark s (if it is not already marked)
- Else:
- If all items in the cache are marked, then
- finish the current phase and start a new one,
- unmark all items in the cache
- Remove a uniformly random unmarked item from the cache and put s in its place; mark s
- (A runnable sketch of this algorithm follows below.)
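Aside (my illustration, not from the slides; the class name and counters are my own choices): a compact Python sketch of the randomised marking algorithm.

    import random

    class MarkingCache:
        # Randomised marking for a cache of size k.
        def __init__(self, k):
            self.k = k
            self.cache = set()      # items currently in the cache
            self.marked = set()     # marked items (a subset of the cache)
            self.misses = 0         # accesses to the slow memory

        def request(self, s):
            if s in self.cache:
                self.marked.add(s)  # hit: mark s
                return
            self.misses += 1        # miss: one access to the slow memory
            if len(self.cache) < self.k:
                self.cache.add(s)   # cache not yet full
            else:
                if self.marked == self.cache:
                    self.marked.clear()        # all marked: start a new phase
                victim = random.choice(list(self.cache - self.marked))
                self.cache.remove(victim)      # evict a random unmarked item
                self.cache.add(s)
            self.marked.add(s)      # mark the newly fetched item

    c = MarkingCache(3)
    for s in [1, 2, 3, 4, 1, 2, 3, 4]:
        c.request(s)
    print(c.misses)  # varies with the random evictions; 7 in the next slide's run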
30. Example of processing by marking
- Stream: 1, 2, 3, 4, 1, 2, 3, 4
- Cache (for k = 3 items):
- Phase 1: 1 → {1}; 2 → {1,2}; 3 → {1,2,3}
- Phase 2: starts from {1,2,3}; 4 → {1,3,4}; 1 → {1,3,4}; 2 → {1,2,4}
- Phase 3: starts from {1,2,4}; 3 → {1,2,3}; 4 → {2,3,4}
- Number of accesses to slow memory: 7
- Optimal algorithm: 5
- (In the original slide, colours distinguish already-marked elements, e.g. 1, from newly marked elements, e.g. 4.)
31. Analysis
- Let r denote the number of phases of the algorithm
- An item can be
- marked
- unmarked
- fresh: it was not marked during the previous phase
- stale: it was marked during the previous phase
- Let
- σ denote the stream of requests,
- cost(σ) denote the number of accesses to slow memory made by the algorithm,
- opt(σ) denote the minimum possible cost on stream σ,
- opt_j(σ) be the number of misses of the optimal algorithm in phase j.
- Let c_j denote the number of requests in the data stream to fresh items in phase j
32. Analysis: fresh items in the optimal solution
- (*) After one phase, only items which have been requested in that phase can be stored in the cache
- Properties
- opt_j(σ) + opt_{j+1}(σ) ≥ c_{j+1}
- Indeed, in phases j and j+1 together there are at least c_{j+1} misses of the optimal algorithm, since, by (*), the fresh items requested in phase j+1 were not requested in phase j and so could not be present in the cache.
- 2·opt(σ) ≥ Σ_{0≤j<r} (opt_j(σ) + opt_{j+1}(σ)) ≥ Σ_{0≤j<r} c_{j+1}
- opt(σ) ≥ 0.5·Σ_{0<j≤r} c_j
33. Analysis: stale items
- Let Xj be the number of misses of the marking algorithm in phase j
- No misses on marked items: they remain in the cache
- c_j misses on fresh items in phase j
- At the beginning of phase j all items in the cache are stale: they were marked by requests in the previous phase and unmarked when the new phase started
- Consider the i-th request to an unmarked stale item, say item s:
- each of the remaining k-i+1 stale items is equally likely to be among those no longer in the cache; at most c_j items have been replaced by fresh items, so s is not in the cache with probability at most c_j/(k-i+1)
- E[Xj] ≤ c_j + Σ_{0<i≤k} c_j/(k-i+1) = c_j·(1 + Σ_{0<i≤k} 1/(k-i+1))
- = c_j·(1 + H_k)
34. Analysis: conclusions
- Let Xj be the number of misses of the marking algorithm in phase j
- cost(σ) denotes the number of accesses to the slow memory made by the algorithm: a random variable
- opt(σ) denotes the minimum possible cost on stream σ: a deterministic value
- E[cost(σ)] ≤ Σ_{0<j≤r} E[Xj] ≤ (1 + H_k)·Σ_{0<j≤r} c_j
- ≤ (2·H_k + 2)·opt(σ)
35. Conclusions
- The randomised algorithm for caching is O(ln k)-competitive
- Lower bound of k on the competitiveness of any deterministic caching algorithm: for every deterministic algorithm there is a stream of requests which it processes at least k times slower than the optimal processing
36. Textbook and Exercises
- READING
- Section 13.8
- EXERCISES (for volunteers)
- Modify the Randomised Marking algorithm to obtain a k-competitive deterministic algorithm.
- Prove that the Randomised Marking algorithm is at least H_k-competitive.
- Prove that every deterministic caching algorithm is at least k-competitive.
37. Overview
- Previous lectures
- Randomised algorithms
- Basic theory: probability, random variables, expected value
- Algorithms: LV sorting, MC min-cut
- Basic random processes
- Randomised caching
- This lecture
- Multi-access channel protocols
38. Ethernet
- The dominant LAN technology
- cheap: $20 for 1000 Mbps!
- the first widely used LAN technology
- simpler and cheaper than token LANs and ATM
- kept up with the speed race: 10, 100, 1000 Mbps
[Figure: Metcalfe's original Ethernet sketch]
39. Ethernet Frame Structure
- The sending adapter encapsulates an IP datagram (or other network-layer protocol packet) in an Ethernet frame
- Preamble:
- 7 bytes with the pattern 10101010, followed by one byte with the pattern 10101011
- used to synchronise the receiver and sender clock rates
40. Ethernet Frame Structure (more)
- Addresses: 6 bytes
- if an adapter receives a frame with a matching destination address, or with the broadcast address (e.g., an ARP packet), it passes the data in the frame to the network-layer protocol
- otherwise, the adapter discards the frame
- Type: indicates the higher-layer protocol (mostly IP, but others may be supported, such as Novell IPX and AppleTalk)
- CRC: checked at the receiver; if an error is detected, the frame is simply dropped
41. Unreliable, connectionless service
- Connectionless: no handshaking between the sending and receiving adapters.
- Unreliable: the receiving adapter doesn't send acks or nacks to the sending adapter
- the stream of datagrams passed to the network layer can have gaps
- the gaps will be filled if the application is using TCP
- otherwise, the application will see the gaps
42. Random Access Protocols
- When a node has a packet to send
- it transmits at the full channel data rate R
- there is no a priori coordination among nodes
- Multiple-access channel
- one transmitting node at a time → successful access/transmission
- two or more transmitting nodes at a time → collision (no success)
- A random access MAC protocol specifies
- how to detect collisions
- how to recover from collisions (e.g., via delayed retransmissions)
- Examples of random access MAC protocols
- ALOHA (slotted, unslotted)
- CSMA (CSMA/CD, CSMA/CA)
43. Slotted ALOHA
- Assumptions
- all frames have the same size
- time is divided into equal-size slots (the time to transmit 1 frame)
- nodes start to transmit frames only at the beginnings of slots
- nodes are synchronised
- if 2 or more nodes transmit in a slot, all nodes detect the collision
- Operation
- when a node obtains a fresh frame, it transmits in the next slot
- if there is no collision, the node can send a new frame in the next slot
- if there is a collision, the node retransmits the frame in each subsequent slot with probability p until success
44. Slotted ALOHA
- Pros
- a single active node can continuously transmit at the full rate of the channel
- highly decentralised: only the slots in the nodes need to be in sync
- simple
- Cons
- collisions, wasting slots
- idle slots
- nodes may be able to detect a collision in less than the time to transmit a packet
45. Slotted ALOHA: analysis
- Suppose that k stations want to transmit in the same slot.
- The probability that exactly one station transmits in the next slot is k·p·(1-p)^(k-1)
- If k ≤ 1/(2p), then k·p·(1-p)^(k-1) = Θ(k·p), and applying an analysis similar to the coupon collector problem we get that the average number of slots until all stations have transmitted successfully is
- Θ(1/p + 1/(2p) + … + 1/(kp)) = Θ((1/p)·H_k) = Θ((1/p)·ln k)
- If k > 1/(2p), then k·p·(1-p)^(k-1) = Θ(k·p/e^(kp)), hence the expected time even for the first successful transmission is Θ((1/p)·e^(kp)/k)
- Conclusion: the choice of the probability matters! (a small simulation follows below)
46. CSMA (Carrier Sense Multiple Access)
- CSMA: listen before transmitting
- if the channel is sensed idle: transmit the entire frame
- if the channel is sensed busy: defer the transmission
- Human analogy: don't interrupt others!
47. CSMA/CD (Collision Detection)
- CSMA/CD: carrier sensing and deferral as in CSMA
- collisions are detected within a short time
- colliding transmissions are aborted, reducing channel wastage
- Collision detection:
- easy in wired LANs: measure signal strengths, compare the transmitted and received signals
- difficult in wireless LANs: the receiver is shut off while transmitting
- Human analogy: the polite conversationalist
48. Ethernet uses CSMA/CD
- No slots
- an adapter doesn't transmit if it senses that some other adapter is transmitting, that is, carrier sense
- a transmitting adapter aborts when it senses that another adapter is transmitting, that is, collision detection
- before attempting a retransmission, an adapter waits a random time, that is, random access
49. Ethernet CSMA/CD algorithm
- 1. The adapter gets a datagram and creates a frame
- 2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
- 3. If the adapter transmits the entire frame without detecting another transmission, the adapter is done with the frame!
- 4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
- 5. After aborting, the adapter enters exponential backoff: after the m-th collision, if m < M, the adapter chooses K at random from {0, 1, 2, …, 2^m - 1}, waits K·512 bit times, and returns to Step 2 (a small sketch of this waiting rule follows below)
50. Ethernet's CSMA/CD (more)
- Jam signal: makes sure all other transmitters are aware of the collision; 48 bits
- Bit time: 0.1 microseconds for 10 Mbps Ethernet; for K = 1023, the wait time is about 50 msec
- Exponential backoff
- Goal: adapt the retransmission attempts to the estimated current load
- heavy load: the random wait will be longer
- first collision: choose K from {0, 1}; the delay is K · 512 bit transmission times
- after the second collision: choose K from {0, 1, 2, 3}
- after ten collisions: choose K from {0, 1, 2, 3, 4, …, 1023}
- See/interact with the Java applet on the AWL Web site: highly recommended!
51. Ethernet CSMA/CD: modified algorithm
- 1. The adapter gets a datagram and creates a frame; K = 0
- 2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
- 3. If the adapter transmits the entire frame without detecting another transmission, the adapter is done with the frame!
- 4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
- 5. After aborting, the adapter enters a modified exponential backoff: after the m-th collision, if m < M, the adapter
- waits (2^(m-1) - K)·512 bit times,
- chooses a new K at random from {0, 1, 2, …, 2^m - 1}, waits K·512 bit times, and returns to Step 2
52. Modified Exponential Backoff: analysis
- Suppose some k stations start the protocol at the same time.
- The time for a given packet, out of the k packets, to be successfully transmitted is O(k) with probability at least 1/4:
- consider the value of window such that 0.5·window ≤ k < window; the time required to reach this window size is O(k)
- the probability that the given packet is transmitted successfully during the run of the loop for this value of window is at least
- (1 - 1/window)^(k-1) ≥ (1 - 1/k)^k ≥ 1/4
53. Modified Exponential Backoff: analysis cont.
- Suppose some k stations start the protocol at the same time.
- The time for all k packets to be successfully transmitted is O(k²) with probability at least 1/2:
- consider the value of window such that 0.5·window ≤ k² < window; the time required to reach this window size is O(k²)
- the probability that there is any collision during the run of the loop for this value x = window is at most
- (k(k-1)/2)·(1/x²)·x = (k(k-1)/2)·(1/x) < (k(k-1)/2)·(1/k²) < 1/2
- where k is the number of packets that have not been successfully transmitted before, k(k-1)/2 is the number of pairs of stations that may collide, each pair collides in a given slot with probability 1/x², and there are x slots in which a collision can occur
54. Ethernet Technologies: 10Base2
- 10: 10 Mbps; 2: under 200 metres maximum cable length
- thin coaxial cable in a bus topology
- repeaters are used to connect multiple segments
- each segment can hold up to 30 nodes and be up to 185 metres long
- maximum of 5 segments
- a repeater repeats the bits it hears on one interface to its other interfaces: a physical-layer device only!
- has become a legacy technology
55. 10BaseT and 100BaseT
- 10/100 Mbps rate; the latter is called "fast Ethernet"
- T stands for Twisted Pair
- nodes connect to a hub ("star topology"); 100 m maximum distance between a node and the hub
- Hubs are essentially physical-layer repeaters
- bits coming in on one link go out on all other links
- no frame buffering
- adapters detect collisions
- the hub provides network-management functionality
- e.g., disconnection of malfunctioning adapters/hosts
56. Gbit Ethernet
- uses the standard Ethernet frame format
- allows for point-to-point links and shared broadcast channels
- in shared mode, CSMA/CD is used; short distances between nodes are required for efficiency
- uses hubs, here called "Buffered Distributors"
- Full-Duplex at 1 Gbps for point-to-point links
- 10 Gbps now!
57. Textbook and Exercises
- READING
- Section 13.1
- EXERCISES (for volunteers)
- Exercise 3 from Chapter 13