1How NP got a new definition
- Probabilistically Checkable Proofs (PCPs)
Approximation Properties of NP-hard problems
SANJEEV ARORA PRINCETON UNIVERSITY
2Talk Overview
- Recap of NP-completeness and its philosophical importance.
- Definition of approximation.
- How to prove that approximation is NP-hard (new definition of NP = PCP Theorem).
- Survey of approximation algorithms.
3A central theme in modern TCS: Computational Complexity
How much time (i.e., how many basic operations) is needed to solve an instance of the problem?
Example: Traveling Salesperson Problem on n cities (here n = 49).
Number of possible salesman tours: n!  (> number of atoms in the universe for n = 49).
One key distinction: polynomial time (n^3, n^7, etc.) versus exponential time (2^n, n!, etc.).
4Is there an inherent difference between being
creative / brilliant and being able to
appreciate creativity / brilliance?
- Writing the Moonlight Sonata
- Proving Fermat's Last Theorem
- Coming up with a low-cost salesman tour
- Appreciating/verifying any of the above
When formulated in terms of computational effort, this is just the P vs NP question.
5P vs. NP
NP: a YES answer has a certificate of O(n^c) size, verifiable in O(n^c) time.
P: solvable in O(n^c) time.
NP-complete (NPC): every NP problem is reducible to it in O(n^c) time. (The hardest problems in NP.)
e.g., 3SAT: decide satisfiability of a boolean formula like (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ x4).
6Practical Importance of P vs NP 1000s of
optimization problems are NP-complete/NP-hard.
(Traveling Salesman, CLIQUE, COLORING, Scheduling, etc.)
Pragmatic researcher: "Why the fuss? I am perfectly content with approximately optimal solutions (e.g., cost within 10% of optimum)."
Good news Possible for quite a few problems.
Bad News NP-hard for many problems.
7Approximation Algorithms
MAX-3SAT: Given a 3-CNF formula φ, find an assignment maximizing the number of satisfied clauses.
An α-approximation algorithm is one that, for every formula, produces in polynomial time an assignment satisfying at least OPT/α clauses (α ≥ 1).
Good news [KZ 97]: an 8/7-approximation algorithm exists.
Bad news [Håstad 97]: if P ≠ NP then for every ε > 0, an (8/7 − ε)-approximation algorithm does not exist.
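For intuition only (this is not the [KZ 97] algorithm, which uses semidefinite programming): a uniformly random assignment satisfies each clause with probability 7/8 when its three literals involve distinct variables, and the method of conditional expectations derandomizes this into a simple 8/7-approximation. A minimal Python sketch, with illustrative helper names:

    def satisfied(clause, assignment):
        # literal +i means x_i, literal -i means NOT x_i
        return any((lit > 0) == assignment[abs(lit)] for lit in clause)

    def expected_sat(clauses, partial):
        """Expected #satisfied clauses if unset variables are set uniformly at random."""
        total = 0.0
        for clause in clauses:
            p_unsat = 1.0
            for lit in clause:
                v = abs(lit)
                if v in partial:
                    if (lit > 0) == partial[v]:
                        p_unsat = 0.0
                        break
                else:
                    p_unsat *= 0.5
            total += 1.0 - p_unsat
        return total

    def max3sat_seven_eighths(clauses, n_vars):
        """Satisfies >= 7/8 of the clauses when each clause has 3 distinct variables."""
        assignment = {}
        for v in range(1, n_vars + 1):
            assignment[v] = max([True, False],
                                key=lambda b: expected_sat(clauses, {**assignment, v: b}))
        return assignment

    clauses = [[1, -2, 3], [-1, 2, 4], [2, 3, -4]]
    a = max3sat_seven_eighths(clauses, 4)
    print(sum(satisfied(c, a) for c in clauses), "of", len(clauses), "clauses satisfied")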
8Observation (1960s thru 1990)
NP-hard problems differ with respect to
approximability
[Johnson 74]: Can we provide an explanation? A classification?
Last 15 years: an avalanche of good and bad news.
9Next few slides: how to rule out the existence of good approximation algorithms (the new definition of NP via PCP Theorems, and why it was needed).
10Recall Reduction
- "If you give me a place to stand, I will move the earth." --- Archimedes (~250 BC)
- "If you give me a polynomial-time algorithm for 3SAT, I will give you a polynomial-time algorithm for every NP problem." --- Cook, Levin (1971)
  (Every instance of an NP problem can be disguised as an instance of 3SAT.)
- "If you give me a 1.01-approximation algorithm for MAX-3SAT, I will give you a polynomial-time algorithm for every NP problem." --- A., Safra; A., Lund, Motwani, Sudan, Szegedy (1992)
11Desired
Way to disguise instances of any NP problem as
instances of MAX-3SAT s.t.
- YES instances turn into satisfiable formulae.
- NO instances turn into formulae in which < 0.99 fraction of the clauses can be simultaneously satisfied.
(The difference between the two cases is the gap.)
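Why such a gap rules out good approximation (a one-line sketch using the slide's 0.99 constant): on a formula with m clauses,

\[
\text{YES: } \mathrm{OPT} = m, \qquad \text{NO: } \mathrm{OPT} < 0.99\,m,
\]

so any ρ-approximation with ρ < 1/0.99 ≈ 1.01 outputs an assignment satisfying ≥ OPT/ρ > 0.99 m clauses on YES instances but necessarily < 0.99 m on NO instances, and hence decides the original NP problem.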
12 Cook-Levin reduction doesn't produce instances where approximation is hard.
Main point: express the local checks on a computation transcript as a boolean formula.
But there always exists a transcript that satisfies almost all local constraints! (No gap.)
13New definition of NP.
14Recall Usual definition of NP
Verifier M:
x is a YES input ⇒ there is a π s.t. M accepts (x, π).
x is a NO input ⇒ M rejects (x, π) for every π.
15NP = PCP(log n, 1)   [AS 92, ALMSS 92], inspired by [BFL 90, BFLS 91, FGLSS 91]
Verifier M:
- reads a fixed number of bits of the proof π, chosen in a randomized fashion (only 3 bits suffice! [Håstad 97]);
- uses O(log n) random bits.
Many other PCP theorems are known now.
x is a YES input ⇒ there is a π s.t. Pr[M accepts (x, π)] = 1.
x is a NO input ⇒ for every π, Pr[M rejects (x, π)] > 1/2.
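For reference, the class can be stated formally as follows (a standard formulation consistent with the slide; the soundness constant 1/2 is conventional):

\[
L \in \mathrm{PCP}(r(n), q(n)) \;\Longleftrightarrow\; \exists \text{ poly-time verifier } M \text{ using } O(r(n)) \text{ random bits and reading } O(q(n)) \text{ bits of } \pi \text{ such that}
\]
\[
x \in L \Rightarrow \exists \pi:\ \Pr[M^{\pi}(x)\ \text{accepts}] = 1,
\qquad
x \notin L \Rightarrow \forall \pi:\ \Pr[M^{\pi}(x)\ \text{accepts}] \le \tfrac{1}{2}.
\]

The PCP Theorem says NP = PCP(log n, 1).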
16Disguising an NP problem as MAX-3SAT
INPUT x; proof π; verifier M using O(log n) random bits.
Note: 2^{O(log n)} = n^{O(1)}, so M's checks form n^{O(1)} constraints, each on O(1) bits of π.
x is a YES instance ⇒ all constraints are satisfiable.
x is a NO instance ⇒ at most a 1/2 fraction are satisfiable.
(gap)
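To finish the disguise as MAX-3SAT (a sketch; the constants are illustrative): each constraint reads k = O(1) bits, so it can be written as at most 2^k clauses of width k, and then as O(k·2^k) clauses of width 3 using auxiliary variables. A violated constraint leaves at least one of its clauses unsatisfied, so the 1/2 gap on constraints survives as a constant gap on clauses:

\[
\frac{\#\text{unsatisfied clauses}}{\#\text{clauses}}
\;\ge\; \frac{1}{O(k\,2^k)} \cdot \frac{\#\text{unsatisfied constraints}}{\#\text{constraints}}
\;\ge\; \frac{1}{O(k\,2^k)} \cdot \frac{1}{2}
\quad \text{on NO instances.}
\]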
17Of related interest.
Do you need to read a math proof completely to
check it?
Recall: math can be axiomatized (e.g., Peano Arithmetic); a proof is a formal sequence of derivations from the axioms.
18 Verification of math proofs
PCP Theorem (spot-checking): the proof is written down as n bits; the verifier M reads only O(1) of them and runs in poly(n) time.
- Theorem correct ⇒ there is a proof that M accepts with probability 1.
- Theorem incorrect ⇒ M rejects every claimed proof with probability ≥ 1/2.
19Known Inapproximability Results: the tree of reductions [AL 96]
All reductions start from MAX-3SAT, via [PY 88], [FGLSS 91], [BS 89], [LY 93], [ABSS 93], and others.
- Class I (hard to approximate within 1 + ε): MAX-3SAT, MAX-3SAT(3), Metric TSP, VERTEX COVER, MAX-CUT, STEINER TREE, ...
- Class II (hard within O(log n)): SET COVER, HITTING SET, DOMINATING SET, HYPERGRAPH TRAVERSAL, ...
- Class III (hard within 2^{(log n)^{1-ε}}): LABEL COVER, NEAREST VECTOR, MIN-UNSATISFY, QUADRATIC PROGRAMMING, LONGEST PATH, ...
- Class IV (hard within n^ε): CLIQUE, COLORING, INDEPENDENT SET, BICLIQUE COVER, FRACTIONAL COLORING, MAX-PLANAR SUBGRAPH, MAX-SET PACKING, MAX-SATISFY, ...
20Proof of PCP Theorems (some ideas)
21Need for robust representation
The verifier uses O(log n) random bits and reads only 3 bits of the proof π.
Randomly corrupt 1% of π: the (corrupted) correct proof is still accepted with probability ≈ 0.97!
So the proof must be encoded in a robust form.
Original proof of the PCP Theorem used polynomial representations, local testing algorithms for polynomials, etc. (30-40 pages).
22New Proof (Dinur 06): 15-20 pages
Repeated applications of two operations on the set of clauses (constraints):
- Globalize: create new constraints using walks in the adjacency graph of the old constraints.
- Domain reduction: change constraints so that variables take values in a smaller domain (e.g., {0,1}); uses ideas from the old proof.
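Roughly how the iteration yields a gap (a sketch of the parameter flow, not Dinur's exact constants): each round of globalize + domain reduction multiplies the fraction of violated constraints by about 2, up to a constant ceiling ε₀, while increasing the instance size by only a constant factor, so

\[
\mathrm{gap}_{t+1} \ge \min(2\,\mathrm{gap}_t,\ \epsilon_0), \qquad |\varphi_{t+1}| \le C\,|\varphi_t|
\;\Longrightarrow\;
\text{after } O(\log m) \text{ rounds, } \mathrm{gap} \ge \epsilon_0 \text{ and size} \le m^{O(1)},
\]

starting from the trivial gap of 1/m on an unsatisfiable formula with m clauses.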
23Unique game conjecture and why it is useful
Problem: given a system of equations modulo p (p prime), with 2 variables per equation, e.g.
7x2 + 2x4 = 6,   5x1 + 3x5 = 2,   ...,   7x5 + x2 = 21   (mod p).
- UGC [Khot 03]: it is computationally intractable to distinguish between the cases:
  - at least a 0.99 fraction of the equations are simultaneously satisfiable;
  - no more than a 0.001 fraction of the equations are simultaneously satisfiable.
Implies hardness of approximating VERTEX COVER, MAX-CUT, etc. [K 04], [KR 05], [KKMO 05]
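Why these games are called "unique" (a one-line observation, using the first equation above and assuming its coefficients are nonzero mod p): each equation determines either variable uniquely from the other, e.g.

\[
7x_2 + 2x_4 \equiv 6 \pmod{p} \;\Longleftrightarrow\; x_2 \equiv 7^{-1}\,(6 - 2x_4) \pmod{p}.
\]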
24Applications of PCP Techniques: Tour d'Horizon
- Locally checkable / decodable codes.
- List decoding of error-correcting codes.
- Private Info Retrieval
- Zero Knowledge arguments / CS proofs
- Amplification of hardness / derandomization
- Constructions of Extractors.
- Property testing
Sudan 96, Guruswami-Sudan
Katz, Trevisan 2000
Kilian 94 Micali
Lipton 88 A., Sudan 97
Sudan, Trevisan, Vadhan
Safra, Ta-shma, Zuckermann Shaltiel, Umans
Goldreich, Goldwasser, Ron 97
25Approximation algorithms Some major ideas
How can you prove that the solution you found has cost at most 1.5 times (say) the optimum cost?
- Relax, solve, and round: represent the problem as a linear or semidefinite program, solve it to get a fractional solution, and round to get an integer solution (e.g., MAX-CUT, MAX-3SAT, SPARSEST CUT); a minimal sketch appears after this list.
- Primal-dual: grow a solution edge by edge; prove its near-optimality using LP duality (usually gives faster algorithms), e.g., NETWORK DESIGN, SET COVER.
- Show the existence of easy-to-find near-optimal solutions, e.g., Euclidean TSP and Steiner Tree.
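As promised above, a minimal runnable sketch of the "relax, solve, and round" template. It uses VERTEX COVER rather than the problems named on the slide, because its LP relaxation and threshold rounding fit in a few lines and give a 2-approximation; the function and variable names are illustrative (requires numpy and scipy).

    import numpy as np
    from scipy.optimize import linprog

    def vertex_cover_lp_round(n, edges):
        """'Relax, solve, and round' on VERTEX COVER (illustrative example).
        LP: minimize sum x_v  s.t.  x_u + x_v >= 1 for every edge, 0 <= x_v <= 1.
        Rounding x_v >= 1/2 gives a cover of size <= 2 * OPT."""
        c = np.ones(n)                               # objective: sum of x_v
        A_ub = np.zeros((len(edges), n))             # -x_u - x_v <= -1 per edge
        for i, (u, v) in enumerate(edges):
            A_ub[i, u] = A_ub[i, v] = -1.0
        b_ub = -np.ones(len(edges))
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
        return [v for v in range(n) if res.x[v] >= 0.5]   # threshold rounding

    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(vertex_cover_lp_round(4, edges))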
26Next few slides The semidefinite programming
approach
What is semidefinite programming?
Ans. A generalization of linear programming: the variables are vectors instead of fractions. Nonlinear optimization [Grötschel, Lovász, Schrijver 81]; first used in approximation algorithms by Goemans-Williamson 94.
27Main Idea
G = (V, E) → write a vector (semidefinite) program → solve → round.
Examples:
- 1.13.. ratio for MAX-CUT, MAX-2SAT [GW 93]
- O(log n) ratio for min-multicut and sparsest cut [LLR 94, AR 94]
- n^{1/4}-coloring of 3-colorable graphs [KMS 94]
- (log n)^{O(1)} ratio for min-bandwidth and related problems [F 98, BKRV 98]
- 8/7 ratio for MAX-3SAT [KZ 97]
- √(log n)-approximation for graph partitioning problems [ARV 04]
How do you analyze these vector programs?
Ans. Geometric arguments, sometimes very complicated.
28Ratio 1.13.. for MAX-CUT
GW 93
Semidefinite Relaxation DP 91, GW 93
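The relaxation itself did not survive extraction; the standard MAX-CUT relaxation used in [GW 93] is

\[
\max \sum_{(i,j)\in E} \frac{1 - v_i\cdot v_j}{2}
\qquad \text{subject to} \quad v_i \in \mathbb{R}^n,\ \|v_i\| = 1 \ \ \forall i \in V,
\]

which upper-bounds the true MAX-CUT value because any ±1 cut is a feasible (one-dimensional) solution.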
29Randomized Rounding
GW 93
The solution vectors v1, v2, ..., vn are unit vectors in R^n.
Form a cut by partitioning v1, v2, ..., vn around a random hyperplane.
SDP value ≥ OPT, and the expected cut is at least 0.878.. times the SDP value.
Old math rides to the rescue...
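A minimal runnable sketch of this solve-and-round pipeline (a toy illustration, not production code; the graph, trial count, and helper names are illustrative; requires numpy and cvxpy):

    import numpy as np
    import cvxpy as cp

    def goemans_williamson_maxcut(n, edges, trials=50, seed=0):
        """Solve the MAX-CUT SDP and round with random hyperplanes (GW-style sketch)."""
        X = cp.Variable((n, n), symmetric=True)          # Gram matrix X_ij = v_i . v_j
        constraints = [X >> 0, cp.diag(X) == 1]          # PSD, unit vectors
        objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
        cp.Problem(objective, constraints).solve()

        # Recover vectors v_i from the Gram matrix via eigendecomposition.
        w, U = np.linalg.eigh(X.value)
        V = U * np.sqrt(np.clip(w, 0, None))             # rows are the vectors v_i

        rng = np.random.default_rng(seed)
        best = 0
        for _ in range(trials):                          # random hyperplane rounding
            side = V @ rng.standard_normal(n) > 0
            best = max(best, sum(side[i] != side[j] for i, j in edges))
        return best

    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]     # 5-cycle: optimum cut = 4
    print(goemans_williamson_maxcut(5, edges))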
30Sparsest cut / edge expansion
Input: a graph G = (V, E).
For a cut (S, S̄), let E(S, S̄) denote the set of edges crossing the cut.
The sparsity (edge expansion) of S is Φ(S) = |E(S, S̄)| / min(|S|, |S̄|).
The SPARSEST CUT problem is to find the cut that minimizes Φ(S).
SDPs were used to give a √(log n)-approximation; this involves proving a nontrivial fact about high-dimensional geometry [ARV 04].
31ARV structure theorem
Arora, Rao, and Vazirani (2004) showed how the SDP can be rounded to obtain an O(√(log n))-approximation to SPARSEST CUT.
ARV structure theorem: if the points x_u ∈ R^n are well-spread, e.g. Σ_{u,v} ‖x_u − x_v‖² ≥ 0.1 (suitably normalized) and ‖x_u‖² ≤ 10 for all u ∈ V, and d(u,v) = ‖x_u − x_v‖² is a metric, then there exist two large, well-separated sets A, B ⊆ {x_1, x_2, ..., x_n} with |A|, |B| ≥ 0.1 n and d(A, B) ≥ Δ, where Δ = Ω(1/√(log n)).
After we have such A and B, it is easy to extend them to a good sparse cut in G.
32Unexpected progress in other disciplines
The ARV structure theorem led to a new understanding of the interrelationship between the l1 and l2 norms (resolving an open question in math):
l1 distances among n points can be realized as l2 distances among some other set of n points, and the distortion incurred is only √(log n).
[A., Lee, Naor 05], building upon [Chawla, Gupta, Räcke 05]
33Theory of Approximability?
- Desired ingredients:
  - A definition of approximation-preserving reduction.
  - Use reductions and algorithms to classify problems by their approximability.
- Partial progress:
  - MAX-SNP: problems similar to MAX-3SAT [PY 88]
  - RMAX(2): problems similar to CLIQUE [PR 90]
  - F+Π2(1): problems similar to SET COVER [KT 91]
  - MAX-ONES CSP, MIN-CSP, etc. [KST 97, KSM 96]
34Further Directions
1. Investigate alternatives to approximation:
   - average-case analysis;
   - slightly subexponential algorithms (e.g., a 2^{o(n)} algorithm for CLIQUE??).
2. Resolve the approximability of graph partitioning problems (BISECTION, SPARSEST CUT: √(log n) vs. log log n??) and of Graph Coloring.
3. Complete the classification of problems w.r.t. approximability.
4. Simplify proofs of PCP Theorems even further.
5. Resolve the unique games conjecture.
6. Fast solutions to SDPs? Limitations of SDPs?
35Attributions
FGLSS 91; ALMSS 92; Arora, Safra 92
(building on: Lund, Fortnow, Karloff, Nisan 90; Shamir 90; Babai, Fortnow 90; Babai, Fortnow, Levin, Szegedy 91)
Håstad 96, 97
36Constraint Satisfaction Problems
Schaefer 78
Let F be a finite family of boolean constraints. An instance of CSP(F) consists of variables x1, x2, ..., xn and constraints g1, g2, ..., gm, where each gj is a function from F applied to some of the variables; decide whether all constraints can be simultaneously satisfied.
Dichotomy Theorem (Schaefer): for every F, CSP(F) is either in P or NP-complete.
37MAX-CSP
[Creignou 96; Khanna, Sudan, Williamson 97]
MAX-CSP(F) is in P iff F is 0-valid, 1-valid, or 2-monotone; otherwise it is MAX-SNP-hard, i.e., some (1 + ε) ratio is NP-hard to achieve. (Supersedes the MAX-SNP work.)
MAX-ONES-CSP [KSW 97]: depending on F, the problem is in P, approximable only within 1 + ε or within n^ε, feasibility is NP-hard, or (for the example shown) feasibility is even undecidable.
MIN-ONES-CSP [KST 97]: depending on F, the problem is in P, approximable only within 1 + ε or within n^ε, MIN-HORN-DELETION-complete, NEAREST-CODEWORD-complete, or feasibility is NP-hard.
38Geometric Embeddings of Graphs
39Example Low Degree Test
F = GF(q),   f : F^m → F.
Is f a degree d polynomial? Does f agree with a degree d polynomial on 90% of the points?
Easy: f is a degree d polynomial iff its restriction to every line is a univariate degree d polynomial. (A line is a 1-dimensional affine subspace; it contains q points.)
Theorem: f agrees with a degree d polynomial on ≈ 90% of the points iff on ≈ 90% of the lines, f has ≈ 90% agreement with a univariate degree d polynomial.
Weaker results: Babai, Fortnow, Lund 90; Rubinfeld, Sudan 92; Feige, Goldwasser, Lovász, Szegedy 91.   Stronger results: A., Sudan 96; Raz, Safra 96.
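To make "restrict to a line and test" concrete, here is a small Python sketch. It is only an illustration of the idea, not the actual testers from the papers above (which query far fewer points and tolerate noise): it samples random lines, restricts f to each, and checks the restriction against a degree-d Lagrange interpolation. GF(q) arithmetic assumes q prime; all names are illustrative.

    import random

    def random_line(m, q, rng):
        """A random line {a + t*b : t in GF(q)} in GF(q)^m, with direction b != 0."""
        a = [rng.randrange(q) for _ in range(m)]
        b = [rng.randrange(q) for _ in range(m)]
        while all(x == 0 for x in b):
            b = [rng.randrange(q) for _ in range(m)]
        return a, b

    def is_low_degree_univariate(values, q, d):
        """True iff t -> values[t] (t = 0..q-1) is a degree-<=d polynomial over GF(q),
        q prime: interpolate through t = 0..d, then verify the remaining points."""
        def interp(t):
            total = 0
            for i in range(d + 1):
                num, den = 1, 1
                for j in range(d + 1):
                    if j != i:
                        num = num * (t - j) % q
                        den = den * (i - j) % q
                total = (total + values[i] * num * pow(den, q - 2, q)) % q
            return total
        return all(interp(t) == values[t] % q for t in range(q))

    def line_test(f, m, q, d, trials=100):
        """Accept if f's restriction to >= 90% of sampled lines has degree <= d."""
        rng = random.Random(0)
        good = 0
        for _ in range(trials):
            a, b = random_line(m, q, rng)
            restriction = [f([(a[k] + t * b[k]) % q for k in range(m)]) for t in range(q)]
            good += is_low_degree_univariate(restriction, q, d)
        return good / trials >= 0.9

    q, m, d = 13, 2, 2
    f = lambda x: (x[0] * x[0] + 3 * x[0] * x[1]) % q     # total degree 2
    print(line_test(f, m, q, d))                           # True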
41The results described in this paper indicate a
possible classification of optimization problems
as to the behavior of their approximation
algorithms. Such a classification must remain
tentative, at least until the existence of
polynomial-time algorithms for finding optimal
solutions has been proved or disproved. Are there
indeed O(log n) coloring algorithms? Are there
any clique finding algorithms better than O(n^ε)
for all ε > 0? Where do other optimization problems
fit into the scheme of things? What is it that
makes algorithms for different problems behave
the same way? Is there some stronger kind of
reducibility than the simple polynomial
reducibility that will explain these results, or
are they due to some structural similarity
between the problems as we define them? And what
other types of behavior and ways of analyzing and
measuring it are possible?
David Johnson, 1974
42NP-hard Optimization Problems
MAX-3SAT: Given a 3-CNF formula φ, find an assignment maximizing the number of satisfied clauses.
MAX-LIN(3): Given a linear system over GF(2) in which each equation has the form x_i + x_j + x_k = 0 or 1, find its largest feasible subsystem.
43Approximation Algorithms
Defn: An α-approximation algorithm for MAX-LIN(3) is a polynomial-time algorithm that computes, for each system, a feasible subsystem of size at least OPT/α (α ≥ 1).
Easy fact: a 2-approximation exists (a uniformly random assignment satisfies each equation with probability 1/2, hence half of them in expectation).
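A minimal sketch of that easy fact (helper names are illustrative): a random assignment satisfies half of the equations in expectation, and fixing the variables one at a time by the method of conditional expectations makes this deterministic, giving at least m/2 ≥ OPT/2 satisfied equations.

    def satisfied_count(equations, assignment):
        """equations: list of (vars, rhs), vars a 3-tuple of indices, rhs in {0, 1}."""
        return sum((sum(assignment[v] for v in vs) % 2) == rhs for vs, rhs in equations)

    def maxlin3_half(equations, n_vars):
        """Derandomized 2-approximation for MAX-LIN(3) over GF(2)."""
        def expected(partial):
            # Conditional expectation of #satisfied equations under a random completion.
            exp = 0.0
            for vs, rhs in equations:
                if any(v not in partial for v in vs):
                    exp += 0.5          # an unset variable makes the parity uniform
                else:
                    exp += (sum(partial[v] for v in vs) % 2) == rhs
            return exp
        assignment = {}
        for v in range(n_vars):
            assignment[v] = max((0, 1), key=lambda b: expected({**assignment, v: b}))
        return assignment

    eqs = [((0, 1, 2), 1), ((1, 2, 3), 0), ((0, 2, 3), 1)]
    a = maxlin3_half(eqs, 4)
    print(satisfied_count(eqs, a), "of", len(eqs), "equations satisfied")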
44Common Approx. Ratios
45Early History
- Graham's algorithm for multiprocessor scheduling: approx. ratio 2
- 1971-72: NP-completeness
- Sahni and Gonzalez: approximating TSP is NP-hard
- 1975: FPTAS for Knapsack [IK]
- 1976: Christofides' heuristic for metric TSP
- 1977: Karp's probabilistic analysis of Euclidean TSP
- 1980: PTAS for Bin Packing [FL, KK]
- 1980-82: PTASs for planar graph problems [LT, B]
46Subsequent Developments
- 1988: MAX-SNP; MAX-3SAT is a complete problem [PY]
- 1990: IP = PSPACE, MIP = NEXPTIME
- 1991: First results on PCPs [BFLS, FGLSS]
- 1992: NP = PCP(log n, 1) [AS, ALMSS]
- 1992-95: Better algorithms for scheduling, MAX-CUT [GW], MAX-3SAT, ...
- 1995-98: Tight lower bounds [H 97]; (1 + ε)-approximation for Euclidean TSP, Steiner Tree, ...
- 1999-now: Many new algorithms and hardness results.
- 2005: New, simpler proof of NP = PCP(log n, 1) (Dinur)
47SOME NP-COMPLETE PROBLEMS
3SAT: Given a 3-CNF formula, like (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ x4), decide if it has a satisfying assignment.
THEOREMS: Given a statement T and a bound n, decide if T has a proof of length n in axiomatic mathematics.
Philosophical meaning of P vs NP: Is there an inherent difference between being creative / brilliant and being able to appreciate creativity / brilliance?
48Feasible computations = those that run in polynomial (i.e., O(n^c)) time; this is a central tenet of theoretical computer science. E.g., 2^n time is infeasible.
49NP = PCP(log n, 1)   [A., Safra 92; A., Lund, Motwani, Sudan, Szegedy 92]
Håstad's 3-bit PCP Theorem (1997): the verifier M uses O(log n) random bits, reads 3 bits of π (testing their parity), and accepts/rejects so that
x is a YES input ⇒ there is a π s.t. Pr[M accepts] > 1 − ε.
x is a NO input ⇒ for every π, Pr[M rejects] > 1/2 − ε.
50(2 − ε)-approx. to MAX-LIN(3) ⇒ P = NP
INPUT x; proof π; verifier M using O(log n) random bits.
Note: 2^{O(log n)} = n^{O(1)}, so M's parity checks form n^{O(1)} linear constraints over GF(2).
x is a YES instance ⇒ more than a (1 − ε) fraction are satisfiable.
x is a NO instance ⇒ at most a (1/2 + ε) fraction are satisfiable.
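The gap calculation behind the claim (a one-line sketch):

\[
\frac{\text{YES optimum}}{\text{NO optimum}} \;>\; \frac{(1-\epsilon)\,m}{(\tfrac{1}{2}+\epsilon)\,m} \;\xrightarrow{\ \epsilon \to 0\ }\; 2,
\]

so for any fixed ε' > 0, choosing ε small enough, a (2 − ε')-approximation algorithm returns more than (1/2 + ε)m satisfied equations exactly on the YES instances, thereby deciding the NP problem in polynomial time.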
51Polynomial Encoding
Idea 1 [LFKN 90, BFL 90]: take a sequence of bits / numbers, e.g. 2, 4, 5, 7, and represent it as an m-variate degree d polynomial, e.g.
2·x1·x2 + 4·x1·(1 − x2) + 5·x2·(1 − x1) + 7·(1 − x1)·(1 − x2),
then evaluate it at all points of GF(q)^m.
Note: two different degree d polynomials differ on at least a (1 − d/q) fraction of the points.
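A small Python sketch of this encoding for the slide's 4-entry example (an illustrative helper, not taken from the papers; GF(q) arithmetic assumes q prime): the multilinear extension agrees with the table on {0,1}^m and can then be evaluated at every point of GF(q)^m.

    def multilinear_extension(values, q):
        """values[b] is the table entry at the boolean point whose i-th bit is (b >> i) & 1.
        Returns f with f(x) = sum_b values[b] * prod_i (x_i if bit_i(b) else 1 - x_i) mod q."""
        m = (len(values) - 1).bit_length()
        assert len(values) == 1 << m
        def f(x):                                   # x is a point in GF(q)^m
            total = 0
            for b in range(1 << m):
                term = values[b]
                for i in range(m):
                    term = term * (x[i] if (b >> i) & 1 else 1 - x[i]) % q
                total = (total + term) % q
            return total
        return f

    q = 13
    # Slide's example: f(1,1) = 2, f(1,0) = 4, f(0,1) = 5, f(0,0) = 7.
    f = multilinear_extension([7, 4, 5, 2], q)
    print(f([1, 1]), f([1, 0]), f([0, 1]), f([0, 0]))   # -> 2 4 5 7
    print(f([3, 7]))                 # a value at a non-boolean point of GF(13)^2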
522nd Idea: Many properties of polynomials are locally checkable.
Program checking [Blum, Kannan 88]; program testing / correcting [Blum, Luby, Rubinfeld 90]; MIP = NEXPTIME [Babai, Fortnow, Lund 90]  →  the 1st PCP Theorem.
Dinur 05's proof uses random walks on expander graphs instead of polynomials.
53Håstad's 3-bit Theorem (and the Fourier method)
Start from NP = PCP(log n, 1), recast as a 2-prover protocol: a verifier V1 reads 1 bit from proof table T1 and c bits from proof table T2.
YES instances ⇒ ∃ T1, T2 with Pr[Accept] = 1.   NO instances ⇒ ∀ T1, T2, Pr[Accept] < 1 − ε.
After repetition (k rounds): Pr[Accept] = 1 vs. Pr[Accept] < 2^{-k/10}.
(A few pages of Fourier analysis:) Suppose the final 3-bit verifier accepts with Pr > 1/2 + ε; then ∃ tables S1, S2 which V1 accepts with probability ≥ 2^{-k/10}, so x is a YES instance.
54Sparsest Cut / Edge Expansion
G = (V, E): the sparsest-cut objective and the c-balanced-separator objective. Both are NP-hard.
55c-balanced separator
Integer formulation (cut semimetric): assign +1 or −1 to v1, v2, ..., vn (vertices in S get +1, the rest −1), so that ‖vi − vj‖²/4 is 1 across the cut and 0 within a side, and
minimize Σ_{(i,j) ∈ E} ‖vi − vj‖²/4   subject to   Σ_{i < j} ‖vi − vj‖²/4 ≥ c(1 − c)n².
Semidefinite relaxation: find unit vectors v1, ..., vn in R^n minimizing the same objective, subject to the same spread constraint and the triangle inequality
‖vi − vj‖² + ‖vj − vk‖² ≥ ‖vi − vk‖²   ∀ i, j, k.