Title: Computational Applications of Noise Sensitivity
1 Computational Applications of Noise Sensitivity
2 Includes joint work with Elchanan Mossel, Rocco Servedio, Adam Klivans, Nader Bshouty, Oded Regev, and Benny Sudakov
3 Intro to Noise Sensitivity
6 Election schemes
- Suppose there is an election between two parties, called 0 and 1.
- Assume (unrealistically) that the n voters cast their votes independently and uniformly at random.
- An election scheme is a boolean function f : {0,1}^n → {0,1} mapping the votes to the winner.
- What if there are errors in the recording of votes? Suppose each vote is misrecorded independently with prob. ε.
- What is the prob. this affects the election's outcome?
7 Definition
- Let f : {0,1}^n → {0,1} be any boolean function.
- Let 0 < ε < ½ be the noise rate.
- Let x be a uniformly random string in {0,1}^n, and let y be an ε-noisy copy of x (each bit of x is flipped independently with probability ε).
- Then the noise sensitivity of f at ε is
- NS_ε(f) = Pr_{x,y}[ f(x) ≠ f(y) ].
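The definition translates directly into a Monte Carlo estimator. Below is a minimal Python sketch (an illustration, not part of the thesis; the helper names are invented) that estimates NS_ε(f) for any f given as a callable on bit lists:

```python
import random

def noisy_copy(x, eps):
    """Flip each bit of x independently with probability eps."""
    return [b ^ (random.random() < eps) for b in x]

def estimate_ns(f, n, eps, trials=100_000):
    """Monte Carlo estimate of NS_eps(f) for f : {0,1}^n -> {0,1}."""
    disagreements = 0
    for _ in range(trials):
        x = [random.randrange(2) for _ in range(n)]
        y = noisy_copy(x, eps)
        disagreements += (f(x) != f(y))
    return disagreements / trials

if __name__ == "__main__":
    majority = lambda x: int(sum(x) > len(x) / 2)   # MAJ on an odd number of bits
    print(estimate_ns(majority, n=15, eps=0.1))
```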
8 Examples
- Suppose f is the constant function f(x) = 1. Then NS_ε(f) = 0.
- Suppose f is the dictator function f(x) = x_1. Then NS_ε(f) = ε.
- In general, for fixed f, NS_ε(f) is a function of ε.
9 Examples: parity
- The parity (xor) function on n bits is 1 iff there is an odd number of 1s in the input.
- In calculating Pr[f(x) ≠ f(y)], it doesn't matter what x is, just how many bits are flipped.
- NS_ε(PARITY_n) = Pr[odd number of heads in n ε-biased coin flips] = ½ - ½(1-2ε)^n (checked numerically below).
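As a quick sanity check (an addition, not from the slides), the closed form above agrees with directly summing the probability of an odd number of flips:

```python
from math import comb

def ns_parity_closed_form(n, eps):
    """NS_eps(PARITY_n) = 1/2 - 1/2 * (1 - 2*eps)**n."""
    return 0.5 - 0.5 * (1 - 2 * eps) ** n

def ns_parity_binomial(n, eps):
    """Probability that an odd number of the n bits are flipped."""
    return sum(comb(n, j) * eps ** j * (1 - eps) ** (n - j)
               for j in range(1, n + 1, 2))

for eps in (0.05, 0.1, 0.25):
    print(ns_parity_closed_form(10, eps), ns_parity_binomial(10, eps))
```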
10 Plot: NS_ε(PARITY_10) = ½ - ½(1-2ε)^10
11 Basic facts about NS
- NS_ε(f) is an increasing, (log-)concave function of ε which is 0 at ε = 0 and 2p(1-p) at ε = ½ (where p = Pr[f = 1]).
- This follows from a formula for NS_ε(f) in terms of Fourier coefficients:
- NS_ε(f) = 2f̂(∅) - 2 Σ_{S ⊆ [n]} (1-2ε)^|S| f̂(S)².
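A brute-force sketch (an illustration, not from the thesis) that checks this Fourier formula against the definition on a small cube. Here f is {0,1}-valued and f̂(S) = E_x[f(x)·(-1)^{Σ_{i∈S} x_i}], the convention matching the 2f̂(∅) term above:

```python
from itertools import combinations, product

def fourier_coefficients(f, n):
    """Fourier coefficients of f : {0,1}^n -> {0,1} in the 0/1 convention:
    f_hat(S) = E_x[ f(x) * (-1)^(sum of x_i for i in S) ]."""
    xs = list(product((0, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeffs[S] = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in xs) / 2 ** n
    return coeffs

def ns_exact(f, n, eps):
    """NS_eps(f) computed directly from the definition (exact brute force)."""
    xs = list(product((0, 1), repeat=n))
    total = 0.0
    for x in xs:
        for y in xs:
            d = sum(a != b for a, b in zip(x, y))
            total += eps ** d * (1 - eps) ** (n - d) * (f(x) != f(y))
    return total / 2 ** n

def ns_from_fourier(f, n, eps):
    """NS_eps(f) = 2*f_hat(empty) - 2*sum_S (1-2*eps)^|S| * f_hat(S)^2."""
    coeffs = fourier_coefficients(f, n)
    return 2 * coeffs[()] - 2 * sum((1 - 2 * eps) ** len(S) * c ** 2
                                    for S, c in coeffs.items())

maj3 = lambda x: int(sum(x) >= 2)          # majority on 3 bits
print(ns_exact(maj3, 3, 0.1), ns_from_fourier(maj3, 3, 0.1))
```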
12 Plot: PARITY, MAJORITY, dictator, and AND on 5 bits
13 Plot: PARITY, MAJORITY, dictator, and AND on 15 bits
14 Plot: PARITY, MAJORITY, dictator, and AND on 45 bits
15 History of Noise Sensitivity (in computer science)
16 History of Noise Sensitivity
- Kahn-Kalai-Linial '88
- "The Influence of Variables on Boolean Functions"
17 Kahn-Kalai-Linial '88
- implicitly studied noise sensitivity
- motivation: the study of random walks on the hypercube where the initial distribution is uniform over a subset
- the question "What is the prob. that a random walk of length εn, starting uniformly in f⁻¹(1), ends up outside f⁻¹(1)?" is essentially asking about NS_ε(f)
- famous for using Fourier analysis and the Bonami-Beckner inequality in TCS
18 History of Noise Sensitivity
- Håstad '97
- "Some Optimal Inapproximability Results"
19 Håstad '97
- breakthrough hardness of approximation results
- decoding the Long Code: given access to the truth-table of a function, want to test that it is significantly determined by a junta (a very small number of variables)
- roughly, does a noise sensitivity test: pick x and y as in the definition of noise sensitivity, test whether f(x) = f(y)
20 History of Noise Sensitivity
- Benjamini-Kalai-Schramm '98
- "Noise Sensitivity of Boolean Functions and Applications to Percolation"
21 Benjamini-Kalai-Schramm '98
- intensive study of the noise sensitivity of boolean functions
- introduced asymptotic notions of noise sensitivity/stability and related them to Fourier coefficients
- studied noise sensitivity of percolation functions and threshold functions
- made conjectures connecting noise sensitivity to circuit complexity
- and more
22 This thesis
- New noise sensitivity results and applications:
- tight noise sensitivity estimates for boolean halfspaces and monotone functions
- hardness amplification thms. (for NP)
- learning algorithms for halfspaces, DNF (from random walks), and juntas
- a new coin-flipping problem, and use of the reverse Bonami-Beckner inequality
23 Hardness Amplification
24 Hardness on average
- def: We say f : {0,1}^n → {0,1} is (1-ε)-hard for circuits of size s if there is no circuit of size s which computes f correctly on more than (1-ε)2^n inputs.
- def: A complexity class is (1-ε)-hard for polynomial circuits if there is a function family (f_n) in the class such that, for sufficiently large n, f_n is (1-ε)-hard for circuits of size poly(n).
25 Hardness of EXP, NP
- Of course we can't show NP is even (1-2^{-n})-hard for poly ckts, since this is just NP ⊄ P/poly.
- But let's assume EXP, NP ⊄ P/poly. Then just how hard are these classes for poly circuits?
- For EXP, extremely strong results are known [BFNW93, Imp95, IW97, KvM99, STV99]: if EXP is (1-2^{-n})-hard for poly circuits, then it is (½ + 1/poly(n))-hard for poly circuits.
- What about NP?
26 Yao's XOR Lemma
- Some of the hardness amplification results for EXP use Yao's XOR Lemma:
- Thm: If f is (1-ε)-hard for poly circuits, then PARITY_k ∘ f is (½ + ½(1-2ε)^k)-hard for poly circuits.
- Here, if f is a boolean fcn on n inputs and g is a boolean fcn on k inputs, g ∘ f is the function on kn inputs given by g(f(x_1), ..., f(x_k)).
- It is no coincidence that the hardness bound for PARITY_k ∘ f is 1 - NS_ε(PARITY_k).
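Indeed, plugging in the parity formula from slide 9: 1 - NS_ε(PARITY_k) = 1 - (½ - ½(1-2ε)^k) = ½ + ½(1-2ε)^k.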
27 A general direct product thm.
- Yao doesn't help for NP: if you have a hard function f_n in NP, PARITY_k ∘ f_n probably isn't in NP.
- We generalize Yao and determine the hardness of g ∘ f_n for any g, in terms of the noise sensitivity of g:
- Thm: If f (balanced) is (1-ε)-hard for poly circuits, then g ∘ f is roughly (1 - NS_ε(g))-hard for poly circuits.
28 Why noise sensitivity?
- Suppose f is balanced and (1-ε)-hard for poly circuits. x_1, ..., x_k are chosen uniformly at random, and you, a poly circuit, have to guess g(f(x_1), ..., f(x_k)).
- The natural strategy is to try to compute each y_i = f(x_i) and then guess g(y_1, ..., y_k).
- But f is (1-ε)-hard for you! So Pr[f(x_i) ≠ y_i] ≈ ε.
- Success prob. ≈ Pr[g(f(x_1), ..., f(x_k)) = g(y_1, ..., y_k)] = 1 - NS_ε(g) (simulated below).
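A quick simulation of this heuristic argument (a sketch of my own, not from the thesis): model the circuit's guesses y_i as independently wrong with probability ε and compare the agreement rate with 1 - NS_ε(g) for g = PARITY_k.

```python
import random

def simulate_agreement(g, k, eps, trials=200_000):
    """Fraction of trials where g on the true bits equals g on eps-corrupted guesses."""
    agree = 0
    for _ in range(trials):
        truth = [random.randrange(2) for _ in range(k)]        # the values f(x_1), ..., f(x_k)
        guess = [b ^ (random.random() < eps) for b in truth]    # each guess wrong w.p. eps
        agree += (g(truth) == g(guess))
    return agree / trials

parity = lambda bits: sum(bits) % 2
k, eps = 7, 0.1
print(simulate_agreement(parity, k, eps))      # empirical success probability
print(0.5 + 0.5 * (1 - 2 * eps) ** k)          # 1 - NS_eps(PARITY_k)
```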
29 Hardness of NP
- If (f_n) is a (hard) function family in NP and (g_k) is a monotone function family, then (g_k ∘ f_n) is in NP.
- We give constructions and prove tight bounds for the problem of finding monotone g such that NS_ε(g) is very large (close to ½) for very small ε.
- Thm: If NP is (1 - 1/poly(n))-hard for poly ckts, then NP is (½ + 1/√n)-hard for poly ckts.
30 Learning algorithms
31 Learning theory
- Learning theory (Valiant '84) deals with the following scenario:
- someone holds an n-bit boolean function f
- you know f belongs to some class of fcns (e.g., parities of subsets, poly size DNF)
- you are given a bunch of uniformly random labeled examples (x, f(x))
- you must efficiently come up with a hypothesis function h that predicts f well
32 Learning noise-stable functions
- We introduce a new idea for showing function classes are learnable:
- Noise-stable classes are efficiently learnable.
- Thm: Suppose C is a class of boolean fcns on n bits, and for all f ∈ C, NS_ε(f) ≤ β(ε). Then there is an alg. for learning C to within accuracy ε in time n^{O(1)/β^{-1}(ε)}.
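For example (the worked-out case on the next slide): if β(ε) = O(√ε), then β^{-1}(ε) = Θ(ε²), so the running time is n^{O(1)/ε²} = n^{O(1/ε²)}.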
33 Example: halfspaces
- E.g., using Peres '98, every boolean function f which is the intersection of two halfspaces has NS_ε(f) = O(√ε).
- Cor: The class of intersections of two halfspaces can be learned in time n^{O(1/ε²)}.
- No previously known subexponential alg.
- We also analyze the noise sensitivity of some more complicated classes based on halfspaces and get learning algs. for them.
34 Why noise stability?
- Suppose a function is fairly noise stable. In some sense this means that if you know f(x), you have a good guess for f(y) for y's which are somewhat close to x in Hamming distance.
- Idea: Draw a net of examples (x_1, f(x_1)), ..., (x_M, f(x_M)). To hypothesize about y, compute a weighted average of the known labels, based on distance to y; the hypothesis is
- sgn[ w(Δ(y,x_1)) f(x_1) + ... + w(Δ(y,x_M)) f(x_M) ], where Δ is Hamming distance and w is a weight function (sketch below).
35 Learning from random walks
- Holy grail of learning: learn poly size DNF formulas in polynomial time.
- Consider a natural weakening of learning: examples are not i.i.d. but come from a random walk.
- We show DNF is poly-time learnable in this model. Indeed, also in a harder model, the NS-model: examples are (x, f(x), y, f(y)).
- Proof: estimate NS on subsets of the input bits → find large Fourier coefficients.
36 Learning juntas
- The essential blocking issue for learning poly size DNF formulas is that they can be O(log n)-juntas.
- Previously, no algorithm was known for learning k-juntas in time better than the trivial n^k.
- We give the first improvement: an algorithm running in time n^{.704k}.
- Can the strong relationship between juntas and noise sensitivity improve this?
37 Coin flipping
38 The T_{1-2ε} operator
- T_{1-2ε} operates on the space of functions {0,1}^n → ℝ:
- T_{1-2ε}(f)(x) = E_{y ← noise_ε(x)}[f(y)]  (= Pr[f(y) = 1] for boolean f).
- Notable fact about T_{1-2ε}: the Bonami-Beckner [Bon68] hypercontractive inequality, ‖T_ρ(f)‖_2 ≤ ‖f‖_{1+ρ²} (checked numerically below).
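A small numerical sketch (my own, not from the thesis) of the operator and the inequality: compute T_ρ f exactly on a small cube by averaging over the noise distribution, then compare ‖T_ρ f‖_2 with ‖f‖_{1+ρ²} (norms taken with respect to the uniform measure).

```python
from itertools import product

def t_rho(f_vals, n, rho):
    """Apply T_rho (noise rate eps = (1-rho)/2) to f, given as a dict x -> value."""
    eps = (1 - rho) / 2
    xs = list(product((0, 1), repeat=n))
    out = {}
    for x in xs:
        acc = 0.0
        for y in xs:
            d = sum(a != b for a, b in zip(x, y))
            acc += eps ** d * (1 - eps) ** (n - d) * f_vals[y]
        out[x] = acc
    return out

def norm(f_vals, p):
    """p-norm with respect to the uniform measure on {0,1}^n."""
    vals = list(f_vals.values())
    return (sum(abs(v) ** p for v in vals) / len(vals)) ** (1 / p)

n, rho = 4, 0.6
xs = list(product((0, 1), repeat=n))
f = {x: float(sum(x) >= 3) for x in xs}           # an arbitrary boolean function
print(norm(t_rho(f, n, rho), 2), "<=", norm(f, 1 + rho ** 2))
```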
39 Bonami, Beckner
40 The T_{1-2ε} operator
- It follows easily that (for ±1-valued f)
- NS_ε(f) = ½ - ½ ‖T_{√(1-2ε)}(f)‖_2².
- Thus studying noise sensitivity is equivalent to studying the 2-norm of the T_{1-2ε} operator.
- We consider studying higher norms of the T_{1-2ε} operator. The problem can be phrased combinatorially, in terms of a natural coin flipping problem.
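Filling in the "follows easily" step (a derivation of my own, in the ±1 convention): NS_ε(f) = ½ - ½ E[f(x)f(y)] = ½ - ½ Σ_S (1-2ε)^|S| f̂(S)² = ½ - ½ ‖T_{√(1-2ε)}(f)‖_2², using E[χ_S(x)χ_S(y)] = (1-2ε)^|S| and Parseval.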
41 Cosmic coin flipping
- n random votes are cast in an election
- we use a balanced election scheme f
- k different auditors get copies of the votes; however, each gets an independently ε-noisy copy
- what is the probability that all k auditors agree on the winner of the election? (simulation sketch below)
- Equivalently, k distributed parties want to flip a shared random coin given noisy access to a cosmic random string.
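A simulation sketch (an illustration of my own) of this setup with f = MAJ_n: each auditor applies majority to an independent ε-noisy copy of the same random vote string, and we estimate the probability that all k outputs agree.

```python
import random

def all_agree_probability(n, k, eps, trials=20_000):
    """Estimate Pr[all k auditors, each applying MAJ_n to its own eps-noisy copy, agree]."""
    agree = 0
    for _ in range(trials):
        votes = [random.randrange(2) for _ in range(n)]
        outcomes = set()
        for _ in range(k):
            noisy = [b ^ (random.random() < eps) for b in votes]
            outcomes.add(int(sum(noisy) > n / 2))
        agree += (len(outcomes) == 1)
    return agree / trials

for k in (2, 4, 8, 16):
    print(k, all_agree_probability(n=101, k=k, eps=0.1))
```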
42 Relevance of the problem
- Application of this scenario: the "everlasting security" protocol of [Ding-Rabin '01], a cryptographic protocol assuming that many distributed parties have access to a satellite broadcasting a stream of random bits.
- Also a natural error-correction problem: without encoding, can the parties attain some shared entropy?
43 Success as a function of k
- Most interesting asymptotic case: ε a small constant, n unbounded, k → ∞. What is the maximum success probability?
- Surprisingly, it goes to 0 only polynomially:
- Thm: The best success probability of k players is Õ(1/k^{4ε}), with the majority function being essentially optimal.
44 Reverse Bonami-Beckner
- To prove that no protocol can do better than k^{-O(1)}, we need to use a reverse Bonami-Beckner inequality [Bor82]: for f ≥ 0 and t ≥ 0,
- ‖T_ρ(f)‖_{1-t/ρ} ≥ ‖f‖_{1-tρ}.
- Concentration of measure interpretation: Let A be a reasonably large subset of the cube. Then almost all x have Pr[y ∈ A] somewhat large (where y is an ε-noisy copy of x).
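A numerical sanity check (my own sketch, assuming the exponents reconstructed above) on a small cube, for a positive function f and parameters making both norm exponents 1 - t/ρ and 1 - tρ nonzero:

```python
import random
from itertools import product

def t_rho(f_vals, n, rho):
    """Apply T_rho (noise rate eps = (1-rho)/2) to f, given as a dict x -> value."""
    eps = (1 - rho) / 2
    xs = list(product((0, 1), repeat=n))
    out = {}
    for x in xs:
        acc = 0.0
        for y in xs:
            d = sum(a != b for a, b in zip(x, y))
            acc += eps ** d * (1 - eps) ** (n - d) * f_vals[y]
        out[x] = acc
    return out

def norm(f_vals, p):
    """(E[f^p])^(1/p) under the uniform measure; p may be < 1 (f must be positive)."""
    vals = list(f_vals.values())
    return (sum(v ** p for v in vals) / len(vals)) ** (1 / p)

n, rho, t = 3, 0.5, 0.3
xs = list(product((0, 1), repeat=n))
f = {x: random.uniform(0.1, 1.0) for x in xs}     # an arbitrary positive function
print(norm(t_rho(f, n, rho), 1 - t / rho), ">=", norm(f, 1 - t * rho))
```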
45 Conclusions
46 Open directions
- estimate the noise sensitivity of various classes of functions: general intersections of threshold functions, percolation functions, ...
- new hardness of approx. results using the NS-junta connection [DS02, Kho02, DF03]?
- find a substantially better algorithm for learning juntas
- explore applications of the reverse Bonami-Beckner inequality: coding theory, e.g.?