Title: Optimal Space Lower Bounds for all Frequency Moments
1. Optimal Space Lower Bounds for all Frequency Moments
Based on the SODA '04 paper
2. The Streaming Model [AMS96]
- Stream of elements $a_1, \ldots, a_q$, each in $\{1, \ldots, m\}$
- Want to compute statistics on the stream
- Elements arranged in adversarial order
- Algorithms are given one pass over the stream
- Goal: minimum-space algorithm
3. Frequency Moments
- Notation
  - $q$ = stream size, $m$ = universe size
  - $f_i$ = number of occurrences of item $i$
- $k$-th moment: $F_k = \sum_{i \in [m]} f_i^k$
- $F_0$ = number of distinct elements
- $F_1 = q$
- $F_2$ = repeat rate
Why are frequency moments important?
4. Applications
- Estimating number of distinct elts. w/ low space
  - Estimate selectivity of queries to DB w/o expensive sort
  - Routers gather number of distinct destinations w/ limited memory
- Estimating $F_2$ estimates size of self-joins
5. The Best Deterministic Algorithm
- Trivial algorithm for $F_k$
  - Store/update $f_i$ for each item $i$, sum $f_i^k$ at end
  - Space = $O(m \log q)$: $m$ items $i$, $\log q$ bits to count $f_i$
- Negative Results [AMS96]
  - Compute $F_k$ exactly $\Rightarrow$ $\Omega(m)$ space
  - Any deterministic alg. that outputs $x$ with $|F_k - x| < \varepsilon F_k$ must use $\Omega(m)$ space
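The trivial algorithm above can be sketched in a few lines of Python. This is a minimal illustration, assuming the stream fits in a list; the function name `frequency_moment` and the sample stream are hypothetical.

```python
from collections import Counter

def frequency_moment(stream, k):
    """Exact F_k = sum_i f_i^k, keeping one counter f_i per distinct item."""
    counts = Counter(stream)  # only items with f_i > 0 are stored
    # For k = 0 every stored item contributes f_i^0 = 1, so this counts distinct items.
    return sum(f ** k for f in counts.values())

stream = [3, 1, 3, 2, 3, 1]          # hypothetical stream, q = 6
print(frequency_moment(stream, 0))   # 3 distinct items
print(frequency_moment(stream, 1))   # 6 = q
print(frequency_moment(stream, 2))   # 9 + 4 + 1 = 14
```

The dictionary holds up to m counters of log q bits each, which is the O(m log q) space the slide refers to.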
What about randomized algorithms?
6. Randomized Approx. Algs for $F_k$
- A randomized alg. $\varepsilon$-approximates $F_k$ if it outputs $x$ s.t. $\Pr[|F_k - x| < \varepsilon F_k] > 2/3$
- Can $\varepsilon$-approximate $F_0$ [BJKST02], $F_2$ [AMS96], and $F_k$ for $k > 2$ [CK04] in sublinear space
  - (big-Oh notation suppresses $\mathrm{polylog}(1/\varepsilon, m, q)$ factors)
- Ideas
  - Hashing: $O(1)$-wise independence
  - Sampling
7. Example: $F_0$ [BJKST02]
- Idea: for a random function $h : [m] \to [0,1]$ and distinct elts. $b_1, b_2, \ldots, b_{F_0}$, expect $\min_i h(b_i) \approx 1/F_0$
- Algorithm
  - Choose 2-wise indep. hash function $h : [m] \to [m^3]$
  - Maintain the $t = \Theta(1/\varepsilon^2)$ distinct smallest values $h(b_i)$
  - Let $v$ be the $t$-th smallest value
  - Output $t m^3 / v$ as estimate for $F_0$
- Success prob. up to $1 - \delta$ $\Rightarrow$ take median of $O(\log 1/\delta)$ copies
- Space: $O((\log 1/\delta)/\varepsilon^2)$
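The algorithm above can be sketched as follows. This is an illustrative sketch rather than the [BJKST02] implementation: it assumes items are integers in $[0, P)$, uses the Mersenne prime $P = 2^{61} - 1$ as the hash range in place of $m^3$, and takes $a \neq 0$ so that the linear hash is injective (a slight departure from exact pairwise independence). The names `F0Sketch` and `estimate_f0` are hypothetical.

```python
import heapq
import random
import statistics

P = (1 << 61) - 1  # Mersenne prime; stands in for the range m^3 (assumption)

class F0Sketch:
    def __init__(self, t, rng):
        assert t >= 2
        self.t = t
        self.a = rng.randrange(1, P)   # a != 0 keeps h injective on [0, P)
        self.b = rng.randrange(P)
        self.heap = []                 # negated values: max-heap of the t smallest hashes
        self.kept = set()              # hash values currently in the heap

    def update(self, item):
        v = (self.a * item + self.b) % P
        if v in self.kept:
            return                     # repeated item, nothing to do
        if len(self.heap) < self.t:
            heapq.heappush(self.heap, -v)
            self.kept.add(v)
        elif v < -self.heap[0]:        # smaller than the current t-th smallest
            evicted = -heapq.heapreplace(self.heap, -v)
            self.kept.discard(evicted)
            self.kept.add(v)

    def estimate(self):
        if len(self.heap) < self.t:    # fewer than t distinct items: count is exact
            return float(len(self.heap))
        return self.t * P / -self.heap[0]   # t * range / (t-th smallest value)

def estimate_f0(stream, t, copies=9, seed=0):
    """Median of independent copies, as on the slide."""
    rng = random.Random(seed)
    sketches = [F0Sketch(t, rng) for _ in range(copies)]
    for item in stream:
        for s in sketches:
            s.update(item)
    return statistics.median(s.estimate() for s in sketches)

print(estimate_f0([5, 7, 5, 9, 7], t=16))   # 3.0 (fewer than t distinct items)
```

Each copy stores t hash values, so the total space scales as (number of copies) x t, matching the $O((\log 1/\delta)/\varepsilon^2)$ bound up to the word size.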
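8. Example: $F_2$ [AMS99]
- Algorithm
  - Choose 4-wise indep. hash function $h : [m] \to \{-1, 1\}$
  - Maintain $Z = \sum_{i \in [m]} f_i h(i)$
  - Output $Y = Z^2$ as estimate for $F_2$
- Correctness: $E[Y] = F_2$ and $\mathrm{Var}[Y] \le 2 F_2^2$
- Chebyshev's inequality $\Rightarrow$ $O(1/\varepsilon^2)$ space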
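A minimal sketch of the estimator above, averaging independent copies to reduce the variance. The 4-wise independent signs are derived here from a random degree-3 polynomial over the prime field of size $P = 2^{61} - 1$, taking the low bit as the sign; that construction (and its negligible bias of order $1/P$) is an assumption of this sketch, as are the names `F2Sketch` and `estimate_f2`.

```python
import random

P = (1 << 61) - 1  # Mersenne prime used for the polynomial hash (assumption)

class F2Sketch:
    def __init__(self, rng):
        # Random degree-3 polynomial over Z_P gives 4-wise independent values.
        self.coeffs = [rng.randrange(P) for _ in range(4)]
        self.z = 0                      # Z = sum_i f_i * h(i)

    def _sign(self, i):
        acc = 0
        for c in self.coeffs:           # Horner evaluation mod P
            acc = (acc * i + c) % P
        return 1 if acc & 1 else -1     # low bit -> sign in {-1, +1}

    def update(self, i):
        self.z += self._sign(i)

    def estimate(self):
        return self.z ** 2              # Y = Z^2

def estimate_f2(stream, copies, seed=0):
    """Average of `copies` independent estimates Y."""
    rng = random.Random(seed)
    sketches = [F2Sketch(rng) for _ in range(copies)]
    for i in stream:
        for s in sketches:
            s.update(i)
    return sum(s.estimate() for s in sketches) / copies

print(estimate_f2([4] * 7, copies=5))   # 49.0: one item with f = 7, so Z = +-7
```

Averaging $O(1/\varepsilon^2)$ copies brings the standard deviation down to about $\varepsilon F_2$, which is where Chebyshev's inequality enters.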
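9. Previous Lower Bounds
- [AMS96] $\forall k$, $\varepsilon$-approximating $F_k$ $\Rightarrow$ $\Omega(\log m)$ space
- [Bar-Yossef] $\varepsilon$-approximating $F_0$ $\Rightarrow$ $\Omega(1/\varepsilon)$ space
- [IW03] $\varepsilon$-approximating $F_0$ $\Rightarrow$ $\Omega(1/\varepsilon^2)$ space, if $\varepsilon$ is not too small
- Questions
  - Does the bound hold for $k \neq 0$?
  - Does it hold for $F_0$ for smaller $\varepsilon$?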
10. Our First Result
- Optimal Lower Bound: $\forall k \neq 1$ and any $\varepsilon = \Omega(m^{-.5})$, $\varepsilon$-approximating $F_k$ $\Rightarrow$ $\Omega(\varepsilon^{-2})$ bits of space.
  - $F_1 = q$ is trivial in $\log q$ space
  - $F_k$ is trivial in $O(m \log q)$ space, so need $\varepsilon = \Omega(m^{-.5})$
- Technique: reduction from a 2-party protocol for computing Hamming distance $\Delta(x,y)$
  - Use tools from communication complexity
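11. Lower Bound Idea
[Figure: Alice holds $x \in \{0,1\}^m$ and runs the $(1 \pm \varepsilon)$ $F_k$ algorithm $A$ on stream $s(x)$; she sends $S$, the internal state of $A$, to Bob, who holds $y \in \{0,1\}^m$ and continues running $A$ on stream $s(y)$.]
- Compute $(1 \pm \varepsilon) F_k(s(x) \circ s(y))$ w.p. $> 2/3$
- Idea: if this lets Bob decide $f(x,y)$ w.p. $> 2/3$, the space used by $A$ is at least the randomized 1-way comm. complexity of $f$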
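The state hand-off in this reduction can be sketched as below. The streaming algorithm here is the exact (space-hungry) counter algorithm, used only as a stand-in for $A$; the function names `alice` and `bob` and the serialization via `pickle` are assumptions of the sketch.

```python
import pickle
from collections import Counter

def alice(stream_x, k):
    """Run the streaming algorithm on s(x) and return its state as the one message."""
    counts = Counter()
    for item in stream_x:
        counts[item] += 1
    return pickle.dumps({"k": k, "counts": dict(counts)})

def bob(message, stream_y):
    """Resume from Alice's state, process s(y), and output F_k of the whole stream."""
    state = pickle.loads(message)
    counts = Counter(state["counts"])
    for item in stream_y:
        counts[item] += 1
    return sum(f ** state["k"] for f in counts.values())

msg = alice([1, 2, 2], k=2)
print(bob(msg, [2, 3]))   # F_2 of 1,2,2,2,3 = 1 + 9 + 1 = 11
print(len(msg) * 8)       # message length in bits = space of the stand-in algorithm
```

The message is exactly the algorithm's memory contents, so any lower bound on the one-way message length is a lower bound on the algorithm's space.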
12. Randomized 1-way comm. complexity
- Boolean function $f : X \times Y \to \{0,1\}$
- Alice has $x \in X$, Bob has $y \in Y$. Bob wants $f(x,y)$
- Only 1 message $m$ is sent, and it must be from Alice to Bob
- Communication cost = $\max_{x,y} E_{\text{coins}}[|m|]$
- $\delta$-error randomized 1-way communication complexity $R_\delta(f)$ is the cost of the optimal protocol computing $f$ with probability $\ge 1 - \delta$
Ok, but how do we lower bound $R_\delta(f)$?
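13. Shatter Coefficients [KNR]
- $F = \{f : X \to \{0,1\}\}$ function family; each $f \in F$ is a length-$|X|$ bitstring
- For $S \subseteq X$, the shatter coefficient $SC(f_S)$ of $S$ is $|\{f|_S : f \in F\}|$ = number of distinct bitstrings when $F$ is restricted to $S$
- $SC(F, p) = \max_{S \subseteq X, |S| = p} SC(f_S)$. If $SC(f_S) = 2^{|S|}$, $S$ is shattered
- Treat $f : X \times Y \to \{0,1\}$ as function family $f_X$:
  - $f_X = \{ f_x(y) : Y \to \{0,1\} \mid x \in X \}$, where $f_x(y) = f(x,y)$
- Theorem [BJKS]: For every $f : X \times Y \to \{0,1\}$ and every integer $p$, $R_{1/3}(f) = \Omega(\log(SC(f_X, p)))$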
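For small domains the shatter coefficient can be computed by brute force. The sketch below is only an illustration of the definition; the helper names and the two example functions (INDEX and EQUALITY) are not from the slides.

```python
from itertools import combinations, product

def shatter_coefficient(f, X, S):
    """Number of distinct bitstrings (f(x, y))_{y in S} as x ranges over X."""
    return len({tuple(f(x, y) for y in S) for x in X})

def sc(f, X, Y, p):
    """SC(f_X, p): maximum shatter coefficient over all S in Y with |S| = p."""
    return max(shatter_coefficient(f, X, S) for S in combinations(Y, p))

# INDEX: Alice holds a 3-bit string x, Bob an index y, f(x, y) = x[y].
index_X = list(product((0, 1), repeat=3))
print(sc(lambda x, y: x[y], index_X, range(3), 3))              # 8 = 2^3, shattered

# EQUALITY on {0,1,2,3}: f(x, y) = 1 iff x == y.
print(sc(lambda x, y: int(x == y), range(4), range(4), 2))      # 3
```

By the [BJKS] theorem, the INDEX example yields an $\Omega(\log 8) = \Omega(3)$ bound, i.e. linear in the length of Alice's string.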
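14. Warmup: $\Omega(1/\varepsilon)$ Lower Bound [Bar-Yossef]
- Alice input $x \in_R \{0,1\}^m$, $\mathrm{wt}(x) = m/2$
- Bob input $y \in_R \{0,1\}^m$, $\mathrm{wt}(y) = \varepsilon m$
- $s(x)$, $s(y)$: any streams w/ char. vectors $x$, $y$
- PROMISE: either
  - (1) $\mathrm{wt}(x \wedge y) = 0$: then $f(x,y) = 0$ and $F_0(s(x) \circ s(y)) = m/2 + \varepsilon m$, OR
  - (2) $\mathrm{wt}(x \wedge y) = \varepsilon m$: then $f(x,y) = 1$ and $F_0(s(x) \circ s(y)) = m/2$
- $R_{1/3}(f) = \Omega(1/\varepsilon)$ [Bar-Yossef] (uses shatter coeffs)
- $(1 + \varepsilon')m/2 < (1 - \varepsilon')(m/2 + \varepsilon m)$ for $\varepsilon' = \Theta(\varepsilon)$
- Hence, can decide $f$ $\Rightarrow$ $F_0$ alg. uses $\Omega(1/\varepsilon)$ space
- Too easy! Can replace the $F_0$ alg. with a Sampler!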
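15. Our Reduction: Hamming Distance Decision Problem (HDDP)
Set $t = \Theta(1/\varepsilon^2)$
Alice: $x \in \{0,1\}^t$; Bob: $y \in \{0,1\}^t$
Promise Problem: either
- $\Delta(x,y) \le t/2 - \Theta(t^{1/2})$, in which case $f(x,y) = 0$, OR
- $\Delta(x,y) > t/2$, in which case $f(x,y) = 1$
- Lower bound $R_{1/3}(f)$ via $SC(f_X, t)$, but need a lemma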
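16. Main Lemma
[Figure: a set $S \subseteq \{0,1\}^n$ split into $T$ and $S - T$, with a point $y$ close to $T$.]
- $\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there exist $2^{\Omega(n)}$ "good" sets $T \subseteq S$ s.t.
  - $\exists y \in \{0,1\}^n$ s.t.
    - $\forall t \in T$, $\Delta(y,t) \le n/2 - c n^{1/2}$ for some $c > 0$
    - $\forall t \in S - T$, $\Delta(y,t) > n/2$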
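17. Lemma Resolves HDDP Complexity
- Theorem: $R_{1/3}(f) = \Omega(t) = \Omega(\varepsilon^{-2})$.
- Proof:
  - Alice gets $y_T$ for a random good set $T$, applying the main lemma with $n = t$.
  - Bob gets a random $s \in S$
  - Let $f : \{y_T\}_T \times S \to \{0,1\}$.
  - Main Lemma $\Rightarrow$ $SC(f) = 2^{\Omega(t)}$
  - [BJKS] $\Rightarrow$ $R_{1/3}(f) = \Omega(t) = \Omega(\varepsilon^{-2})$
- Corollary: $\Omega(1/\varepsilon^2)$ space for a randomized 2-party protocol to approximate $\Delta(x,y)$ between inputs
  - First known lower bound in terms of $\varepsilon$!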
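18. Back to Frequency Moments
Use an $\varepsilon$-approximator for $F_k$ to solve HDDP
- Alice: $y \in \{0,1\}^t$; Bob: $s \in S \subseteq \{0,1\}^t$
- $i$-th universe element is included exactly once in stream $a_y$ iff $y_i = 1$ ($a_s$ same)
[Figure: Alice runs the $F_k$ alg. on $a_y$ and passes its state to Bob, who continues the $F_k$ alg. on $a_s$.]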
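19. Solving HDDP with $F_k$
- Alice/Bob compute an $\varepsilon$-approx. to $F_k(a_y \circ a_s)$
- $F_k(a_y \circ a_s) = 2^k \, \mathrm{wt}(y \wedge s) + 1^k \, \Delta(y,s)$
- For $k \neq 1$, substituting $\mathrm{wt}(y \wedge s) = (\mathrm{wt}(y) + \mathrm{wt}(s) - \Delta(y,s))/2$ gives $F_k(a_y \circ a_s) = 2^{k-1}(\mathrm{wt}(y) + \mathrm{wt}(s)) + (1 - 2^{k-1}) \Delta(y,s)$, so $\Delta(y,s)$ can be recovered from $F_k$
- Alice also transmits $\mathrm{wt}(y)$ in $\log m$ space.
Conclusion: $\varepsilon$-approximating $F_k(a_y \circ a_s)$ decides HDDP, so space for $F_k$ is $\Omega(t) = \Omega(\varepsilon^{-2})$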
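The identity relating $F_k$ of the combined stream to the Hamming distance can be checked numerically. The sketch below uses an exact $F_k$ computation in place of the approximator, and solves $F_k = 2^{k-1}(\mathrm{wt}(y) + \mathrm{wt}(s)) + (1 - 2^{k-1})\Delta$ for $\Delta$; the helper names are hypothetical.

```python
import random
from collections import Counter

def stream_of(bits):
    """Stream containing index i exactly once iff bits[i] == 1."""
    return [i for i, b in enumerate(bits) if b]

def fk(stream, k):
    return sum(f ** k for f in Counter(stream).values())

def hamming_from_fk(fk_value, k, wt_y, wt_s):
    """Recover Delta(y, s) from F_k, wt(y), wt(s); requires k != 1."""
    half = 2 ** (k - 1)
    return (fk_value - half * (wt_y + wt_s)) / (1 - half)

rng = random.Random(0)
t = 32
y = [rng.randrange(2) for _ in range(t)]
s = [rng.randrange(2) for _ in range(t)]
delta = sum(a != b for a, b in zip(y, s))
for k in (0, 2, 3):
    value = fk(stream_of(y) + stream_of(s), k)
    print(k, hamming_from_fk(value, k, sum(y), sum(s)) == delta)   # True for each k
```

For $k = 1$ the coefficient $1 - 2^{k-1}$ vanishes, which is why $F_1$ carries no information about $\Delta(y,s)$.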
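20. Back to the Main Lemma
- Recall: show $\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there are $2^{\Omega(n)}$ "good" sets $T \subseteq S$ s.t.
  - $\exists y \in \{0,1\}^n$ s.t.
    - 1. $\forall t \in T$, $\Delta(y,t) \le n/2 - c n^{1/2}$ for some $c > 0$
    - 2. $\forall t \in S - T$, $\Delta(y,t) > n/2$
- Probabilistic Method
  - Choose $n$ random elts. in $\{0,1\}^n$ for $S$
  - Show an arbitrary $T \subseteq S$ of size $n/2$ is good with probability $> 2^{-zn}$ for constant $z < 1$.
  - Expected number of good $T$ is $2^{\Omega(n)}$
  - So there exists $S$ with $2^{\Omega(n)}$ good $T$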
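21. Proving the Main Lemma
- $T = \{t_1, \ldots, t_{n/2}\} \subseteq S$ arbitrary
- Let $y$ be the majority codeword of $T$
- What is the probability $p$ that both
  - 1. $\forall t \in T$, $\Delta(y,t) \le n/2 - c n^{1/2}$ for some $c > 0$
  - 2. $\forall t \in S - T$, $\Delta(y,t) > n/2$
- Put $x = \Pr[\forall t \in T, \Delta(y,t) \le n/2 - c n^{1/2}]$
- Put $y = \Pr[\forall t \in S - T, \Delta(y,t) > n/2] = 2^{-n/2}$
- Independence $\Rightarrow$ $p = xy = x 2^{-n/2}$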
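The setup above can be simulated directly: draw a random $S$, take half of it as $T$, form the majority codeword, and inspect the two groups of distances. This is only an illustration of the objects in the proof; the function names and the choice of the first $n/2$ elements as $T$ are assumptions of the sketch.

```python
import random

def majority_word(rows):
    """Bitwise majority of the rows (ties broken towards 0)."""
    n = len(rows[0])
    return [1 if 2 * sum(r[j] for r in rows) > len(rows) else 0 for j in range(n)]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def trial(n, rng):
    """Return distances from the majority word y of T to T and to S - T."""
    S = [[rng.randrange(2) for _ in range(n)] for _ in range(n)]
    T, rest = S[: n // 2], S[n // 2:]
    y = majority_word(T)
    return [hamming(y, t) for t in T], [hamming(y, t) for t in rest]

d_in, d_out = trial(16, random.Random(1))
print(d_in)    # distances to elements of T, biased below n/2
print(d_out)   # distances to elements of S - T, centred on n/2
```

Because $y$ agrees with at least half of $T$ in every coordinate, the distances to $T$ average at most $n/2$; the lemma asks for the stronger event that every one of them is at most $n/2 - c n^{1/2}$.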
22. The Matrix Problem
- Wlog, assume $y = 1^n$ (recall $y$ is the majority word)
- Want to lower bound $\Pr[\forall t \in T, \Delta(y,t) \le n/2 - c n^{1/2}]$
- Equivalent to a matrix problem:
  $t_1$ -> 101001000101111001
  $t_2$ -> 100101011100011110
           001110111101010101
  $t_{n/2}$ -> 101010111011100011
For a random $n/2 \times n$ binary matrix $M$ in which each column is majority 1, what is the probability that each row has $\ge n/2 + c n^{1/2}$ 1s?
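For tiny $n$ the probability in question can be estimated by simulation. The sketch below relies on the columns of a uniform matrix being independent, so conditioning on "each column is majority 1" can be done column by column with rejection sampling. Treating majority as a strict majority, and all function names, are assumptions of the sketch.

```python
import math
import random

def sample_column(rows, rng):
    """Uniform 0/1 column conditioned on a strict majority of 1s (rejection sampling)."""
    while True:
        col = [rng.randrange(2) for _ in range(rows)]
        if 2 * sum(col) > rows:
            return col

def sample_matrix_given_C(n, rng):
    """n/2 x n matrix, uniform conditioned on every column being majority 1."""
    cols = [sample_column(n // 2, rng) for _ in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n // 2)]

def estimate_R_given_C(n, c, trials, seed=0):
    """Monte Carlo estimate of Pr[every row has >= n/2 + c*sqrt(n) ones | C]."""
    rng = random.Random(seed)
    need = n / 2 + c * math.sqrt(n)
    hits = 0
    for _ in range(trials):
        M = sample_matrix_given_C(n, rng)
        hits += all(sum(row) >= need for row in M)
    return hits / trials

print(estimate_R_given_C(n=10, c=0.1, trials=2000))
```

Simulation only gives intuition for small $n$; the slides need a bound of the form $2^{-zn/2}$ that holds for all large $n$.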
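23. A First Attempt
- Set family $A \subseteq 2^{\{0,1\}^n}$ is monotone increasing if
  - $S_1 \in A$, $S_1 \subseteq S_2$ $\Rightarrow$ $S_2 \in A$
- For the uniform distribution on $S \subseteq \{0,1\}^n$, and $A$, $B$ monotone increasing families, [Kleitman]:
  - $\Pr[A \cap B] \ge \Pr[A] \Pr[B]$
- First try:
  - Let $R$ be the event that $M$ has $\ge n/2 + c n^{1/2}$ 1s in each row, $C$ the event that $M$ is majority 1 in each column
  - $\Pr[\forall t \in T, \Delta(y,t) \le n/2 - c n^{1/2}] = \Pr[R \mid C] = \Pr[R \cap C]/\Pr[C]$
  - $M$ is the characteristic vector of a subset of $[.5n^2]$ $\Rightarrow$ $R$, $C$ monotone increasing
  - $\Rightarrow$ $\Pr[R \cap C]/\Pr[C] \ge \Pr[R]\Pr[C]/\Pr[C] = \Pr[R] < 2^{-n/2}$
  - But we need $> 2^{-zn/2}$ for constant $z < 1$, so this fails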
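Kleitman's inequality can be checked by brute force on a small ground set. The sketch below works with families of subsets of $\{0, \ldots, n-1\}$ under the uniform distribution; the two example families and the helper names are chosen here for illustration.

```python
from itertools import combinations

def all_subsets(n):
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

def is_monotone_increasing(family, n):
    """Closed under adding any single element, hence under taking supersets."""
    return all((s | {e}) in family for s in family for e in range(n))

def prob(family, n):
    """Probability of the family under a uniformly random subset of an n-element set."""
    return len(family) / 2 ** n

n = 4
A = {s for s in all_subsets(n) if 0 in s}        # subsets containing element 0
B = {s for s in all_subsets(n) if len(s) >= 2}   # subsets of size at least 2
print(is_monotone_increasing(A, n), is_monotone_increasing(B, n))   # True True
print(prob(A & B, n), prob(A, n) * prob(B, n))   # 0.4375 0.34375
```

Here $\Pr[A \cap B] = 7/16$ exceeds $\Pr[A]\Pr[B] = 11/32$, as the inequality predicts for two monotone increasing families.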
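24. A Second Attempt
- Second try:
  - $R_1$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the first $m$ rows
  - $R_2$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the remaining $n/2 - m$ rows
  - $C$: $M$ is majority 1 in each column
  - $\Pr[\forall t \in T, \Delta(y,t) \le n/2 - c n^{1/2}] = \Pr[R_1 \cap R_2 \mid C] = \Pr[R_1 \cap R_2 \cap C]/\Pr[C]$
  - $R_1$, $R_2$, $C$ monotone increasing
  - $\Rightarrow$ $\Pr[R_1 \cap R_2 \cap C]/\Pr[C] \ge \Pr[R_1 \cap C]\Pr[R_2]/\Pr[C] = \Pr[R_1 \mid C] \Pr[R_2]$
  - Want this to be at least $2^{-zn/2}$ for $z < 1$
  - $\Pr[\sum X_i > n/2 + c n^{1/2}] > 1/2 - c (2/\pi)^{1/2}$ [Stirling]
  - Independence $\Rightarrow$ $\Pr[R_2] > (1/2 - c(2/\pi)^{1/2})^{n/2 - m}$
  - Remains to show $\Pr[R_1 \mid C]$ is large.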
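The single-row tail bound can be compared against an exact binomial computation. The sketch below evaluates one parameter setting only and reads the row event as "at least $n/2 + c n^{1/2}$ ones", as in the definition of $R_2$; that reading, the parameters, and the function names are assumptions, and the comparison is a numerical illustration rather than a proof.

```python
import math

def upper_tail(n, k):
    """Exact Pr[X >= k] for X ~ Binomial(n, 1/2)."""
    k = max(k, 0)
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2 ** n

def stirling_bound(c):
    """The lower bound 1/2 - c * sqrt(2/pi) quoted on the slide."""
    return 0.5 - c * math.sqrt(2 / math.pi)

n, c = 400, 0.1
k = math.ceil(n / 2 + c * math.sqrt(n))   # 202 ones needed in a row
print(k, upper_tail(n, k), stirling_bound(c))

# Probability that n/2 - m independent rows all meet the threshold, here with m = 0.
print(upper_tail(n, k) ** (n // 2))
```

The intuition is that each point of the binomial distribution has mass about $(2/(\pi n))^{1/2}$ at most, and roughly $c n^{1/2}$ points lie between $n/2$ and the threshold.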
25. Computing $\Pr[R_1 \mid C]$
- $\Pr[R_1 \mid C] = \Pr[M \text{ has } \ge n/2 + c n^{1/2} \text{ 1s in each of the 1st } m \text{ rows} \mid C]$
- Show $\Pr[R_1 \mid C] > 2^{-zm}$ for a certain constant $z < 1$
- Ingredients:
  - Expect to get $n/2 + \Theta(n^{1/2})$ 1s in each of the 1st $m$ rows, given $C$
  - Use negative correlation of entries in a given row $\Rightarrow$ show $n/2 + \Theta(n^{1/2})$ 1s in a given row w/ good probability for small enough $c$
  - A simple worst-case conditioning argument on these 1st $m$ rows shows they all have $\ge n/2 + c n^{1/2}$ 1s
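26. Completing the Proof
- Recall: what is the probability $p = xy$, where
  - 1. $x = \Pr[\forall t \in T, \Delta(y,t) \le n/2 - c n^{1/2}]$
  - 2. $y = \Pr[\forall t \in S - T, \Delta(y,t) > n/2] = 2^{-n/2}$
  - $R_1$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the first $m$ rows
  - $R_2$: $M$ has $\ge n/2 + c n^{1/2}$ 1s in each of the remaining $n/2 - m$ rows
  - $C$: $M$ is majority 1 in each column
- $x \ge \Pr[R_1 \mid C] \Pr[R_2] \ge 2^{-zm} (1/2 - c(2/\pi)^{1/2})^{n/2 - m}$
- Analysis shows $z$ is small, so this is $\ge 2^{-zn/2}$, $z < 1$
- Hence $p = xy \ge 2^{-(z+1)n/2}$
- Hence expected number of good sets $\ge 2^{n - O(\log n)} p = 2^{\Omega(n)}$
- So there exists $S$ with $2^{\Omega(n)}$ good $T$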
27. Bipartite Graphs
- Matrix Problem $\Leftrightarrow$ Bipartite Graph Counting Problem
- How many bipartite graphs exist on $n/2$ by $n$ vertices s.t. each left vertex has degree $> n/2 + c n^{1/2}$ and each right vertex has degree $> n/4$ (a majority of the $n/2$ left vertices)?
28. Our Result on Number of Bipartite Graphs
- Bipartite graph count:
  - Argument shows at least $2^{n^2/2 - zn/2 - n}$ such bipartite graphs for constant $z < 1$.
  - Main lemma shows the number of bipartite graphs on $n + n$ vertices w/ each vertex of degree $> n/2$ is $> 2^{n^2 - zn - n}$
    - Can replace $>$ with $<$
- Previous known count: $2^{n^2 - 2n}$
  - [MW personal comm.]
  - Follows easily from the Kleitman inequality
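For very small $n$ the quantity being bounded can be counted exhaustively. The sketch below enumerates all $n \times n$ 0/1 biadjacency matrices and keeps those in which every row and column sum exceeds $n/2$; the function name is hypothetical, and the enumeration is feasible only for $n \le 4$ or so.

```python
from itertools import product

def count_high_degree_bipartite(n):
    """Count bipartite graphs on n + n vertices with every vertex degree > n/2."""
    count = 0
    for bits in product((0, 1), repeat=n * n):
        rows = [bits[i * n:(i + 1) * n] for i in range(n)]
        rows_ok = all(2 * sum(r) > n for r in rows)
        cols_ok = all(2 * sum(r[j] for r in rows) > n for j in range(n))
        if rows_ok and cols_ok:
            count += 1
    return count

n = 3
print(count_high_degree_bipartite(n))   # 34
print(2 ** (n * n - 2 * n))             # 8, the Kleitman-style bound for n = 3
```

For $n = 3$ the valid matrices are exactly those whose zeros form a partial permutation matrix, which gives $1 + 9 + 18 + 6 = 34$.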
29. Summary
- Results:
  - Optimal $F_k$ Lower Bound: $\forall k \neq 1$ and any $\varepsilon = \Omega(m^{-1/2})$, any $\varepsilon$-approximator for $F_k$ must use $\Omega(\varepsilon^{-2})$ bits of space.
  - Communication Lower Bound of $\Omega(\varepsilon^{-2})$ for the one-way communication complexity of $(\varepsilon, \delta)$-approximating $\Delta(x, y)$
  - Bipartite Graph Count: the number of bipartite graphs on $n + n$ vertices w/ each vertex of degree $> n/2$ is at least $2^{n^2 - zn - n}$ for constant $z < 1$.