Title: CoNP problems on random inputs
1 Co-NP problems on random inputs
- Paul Beame
- University of Washington
2 Basic idea
- NP is characterized by a simple property: having short certificates of membership
- Show that co-NP doesn't have this property
- This would separate P from NP, so it is probably quite hard
- Lots of nice, useful baby steps towards answering this question
3 Certifying language membership
- Certificate of satisfiability
- Satisfying truth assignment
- Always short, so $\mathrm{SAT} \in \mathrm{NP}$
- Certificate of unsatisfiability
- ?????
- Transcript of a failed search for a satisfying truth assignment
- Frege-Hilbert proofs, resolution
- Can they always be short? If so then $\mathrm{NP} = \text{co-NP}$.
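Checking a certificate of satisfiability takes time linear in the size of the formula. A minimal sketch, assuming a DIMACS-style encoding that is not part of the slides (a clause is a list of nonzero integers, `v` for a variable and `-v` for its negation); `check_assignment` is a hypothetical helper name:

```python
def check_assignment(clauses, assignment):
    """Return True iff `assignment` (dict: variable -> bool) satisfies every clause.

    Assumed encoding (not from the slides): each clause is a list of nonzero
    ints, where literal v means "variable v is true" and -v means "variable v
    is false".
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# (x1 or not x2 or x3) and (not x1 or x2)
formula = [[1, -2, 3], [-1, 2]]
good_certificate = {1: True, 2: True, 3: False}
bad_certificate = {1: True, 2: False, 3: False}
```

No comparably short object is known that certifies unsatisfiability, which is the gap the rest of the talk is about.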
4 Proof systems
- A proof system for L is a polynomial-time algorithm A s.t. for all inputs x
- x is in L iff there exists a certificate P s.t. A accepts input (P, x)
- Complexity of a proof system
- How big P has to be in terms of x
- $\mathrm{NP} = \{L : L \text{ has polynomial-size proofs}\}$
5 Propositional proof systems
- A propositional proof system is a polynomial-time algorithm A s.t. for all formulas F
- F is unsatisfiable iff there exists a certificate P s.t. A accepts input (P, F)
6 Sample propositional proof systems
- Truth tables
- Axiom/Inference systems, e.g.
- modus ponens: $A,\ (A \to B) \vdash B$
- excluded middle: $(A \lor \lnot A)$
- Tableaux/Model Elimination systems
- search through sub-formulas of the input formula that might be true simultaneously
- e.g. if $\lnot(A \to B)$ is true, A must be true and B must be false
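Truth tables are the simplest of these systems: the certificate is the full table, so it is always valid but has size exponential in the number of variables. A minimal sketch, assuming a DIMACS-style clause encoding (lists of nonzero integers, `-v` for a negated variable) that is not part of the slides; `truth_table_refutes` is a hypothetical name:

```python
from itertools import product


def truth_table_refutes(clauses, num_vars):
    """Return True iff no assignment to variables 1..num_vars satisfies all clauses.

    Walks all 2**num_vars rows of the truth table, so the running time (and
    the size of the implied certificate) is exponential in num_vars.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return False  # found a satisfying row, so there is no refutation
    return True
```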
7 Frege Systems
- Finite number of axioms/inference rules
- Proof of unsatisfiability of F: sequence $F_1, \ldots, F_r$ of formulas s.t.
- $F_1 = F$
- each $F_j$ is an axiom or follows from previous ones via an inference rule
- $F_r = \Lambda$, a trivial falsehood
- All Frege systems have equivalent complexity up to a polynomial
8 Resolution
- Frege-like system using CNF clauses only
- Start with original input clauses of CNF F
- Resolution rule
- $(A \lor x),\ (B \lor \lnot x) \vdash (A \lor B)$
- Goal: derive the empty clause $\Lambda$
- Most popular systems for practical theorem-proving
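The resolution rule and a refutation checker can be sketched in a few lines. This is an illustration under assumptions not in the slides: clauses are sets of nonzero integers (`-v` for a negated variable), and a proof is given as a list of hypothetical `(i, j, x)` steps meaning "resolve clause number i with clause number j on variable x".

```python
def resolve(c1, c2, x):
    """Resolve c1 (which must contain x) with c2 (which must contain -x) on x."""
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    return frozenset((c1 - {x}) | (c2 - {-x}))


def check_refutation(input_clauses, steps):
    """Replay the steps and report whether the last derived clause is empty.

    Clauses are numbered from 0: first the input clauses in order, then each
    derived clause in the order it is produced.
    """
    derived = [frozenset(c) for c in input_clauses]
    for i, j, x in steps:
        derived.append(resolve(derived[i], derived[j], x))
    return len(derived[-1]) == 0


# (x1 or x2), (not x1 or x2), (not x2)
cnf = [{1, 2}, {-1, 2}, {-2}]
proof = [(0, 1, 1),   # derives (x2) as clause 3
         (3, 2, 2)]   # derives the empty clause
```

The checker runs in time polynomial in the proof length, which is what makes resolution a proof system in the sense defined earlier.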
9 Davis-Putnam (DLL) Procedure
- Both
- a proof system
- a collection of algorithms for finding proofs
- As a proof system
- a special case of resolution where the pattern of inferences forms a tree
- The most widely used family of complete algorithms for satisfiability
10 Simple Davis-Putnam Algorithm
- Refute(F)
- While (F contains a clause of size 1)
- set variable to make that clause true
- simplify all clauses using this assignment
- If F has no clauses then
- output "F is satisfiable" and HALT
- If F does not contain an empty clause then
- Choose smallest-numbered unset variable x
- Run Refute($F|_{x \leftarrow 0}$) (splitting rule)
- Run Refute($F|_{x \leftarrow 1}$) (splitting rule)
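The procedure above can be sketched as a short recursive function. This is one possible reading, not the author's code. It assumes a DIMACS-style clause encoding (lists of nonzero integers, `-v` for a negated variable), returns a flag instead of halting, and splits on the smallest-numbered variable that still occurs in some clause. The names `simplify` and `refute` are introduced here for illustration.

```python
def simplify(clauses, lit):
    """Make literal `lit` true: drop satisfied clauses, delete -lit elsewhere."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]


def refute(clauses):
    """Return (satisfiable, calls), where calls counts invocations of refute,
    including this one (a rough proxy for the size of the search tree)."""
    clauses = [list(c) for c in clauses]
    # Unit propagation: repeatedly satisfy clauses of size 1.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
    if not clauses:
        return True, 1            # no clauses left: satisfiable
    if any(len(c) == 0 for c in clauses):
        return False, 1           # empty clause: this branch is refuted
    # Splitting rule on the smallest-numbered variable still present.
    x = min(abs(l) for c in clauses for l in c)
    sat0, n0 = refute(simplify(clauses, -x))   # branch x = 0
    if sat0:
        return True, 1 + n0
    sat1, n1 = refute(simplify(clauses, x))    # branch x = 1
    return sat1, 1 + n0 + n1
```

On an unsatisfiable input both branches of every split are explored, so the recursion tree is a tree-like resolution refutation in disguise.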
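11 Hilbert's Nullstellensatz
- System of polynomial equations $Q_1(x_1,\ldots,x_n) = 0, \ldots, Q_m(x_1,\ldots,x_n) = 0$ over a field K has no solution in any extension field of K
iff
there exist polynomials $P_1(x_1,\ldots,x_n), \ldots, P_m(x_1,\ldots,x_n)$ in $K[x_1,\ldots,x_n]$ s.t. $\sum_{i=1}^{m} P_i Q_i \equiv 1$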
12 Nullstellensatz proof system
- Clause $(x_1 \lor \lnot x_2 \lor x_3)$ becomes equation $(1 - x_1)\,x_2\,(1 - x_3) = 0$
- Add equations $x_i^2 - x_i = 0$ for each variable
- Proof: polynomials $P_1, \ldots, P_{m+n}$ proving unsatisfiability
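The clause-to-polynomial translation can be checked on 0/1 points: the polynomial vanishes exactly when the clause is satisfied. A small sketch, assuming a DIMACS-style clause encoding (nonzero integers, `-v` for a negated variable) that is not part of the slides; `clause_polynomial` is a hypothetical name:

```python
from itertools import product


def clause_polynomial(clause, point):
    """Evaluate the clause's polynomial at a 0/1 point.

    A positive literal v contributes the factor (1 - x_v); a negative
    literal -v contributes the factor x_v.
    """
    value = 1
    for lit in clause:
        xv = point[abs(lit) - 1]
        value *= (1 - xv) if lit > 0 else xv
    return value


def vanishes_iff_satisfied(clause, num_vars):
    """Check over all 0/1 points that polynomial == 0 iff clause is satisfied."""
    for point in product([0, 1], repeat=num_vars):
        satisfied = any((point[abs(l) - 1] == 1) == (l > 0) for l in clause)
        if (clause_polynomial(clause, point) == 0) != satisfied:
            return False
    return True
```

For the clause $(x_1 \lor \lnot x_2 \lor x_3)$ the only 0/1 point where the polynomial is nonzero is $(0, 1, 0)$, the single falsifying assignment.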
13 Polynomial Calculus
- Similar to Nullstellensatz except
- Begin with $Q_1, \ldots, Q_{m+n}$ as before
- Given polynomials R and S can infer
- $aR + bS$ for any a, b in K
- $x_i R$
- Derive the constant polynomial 1
- Degree = maximum degree of a polynomial appearing in the proof
- Can find a proof of degree d in time $n^{O(d)}$ using a Groebner basis-like algorithm
14 Cutting Planes
- Introduced to relate integer and linear programming
- Clause $(x_1 \lor \lnot x_2 \lor x_3)$ becomes inequality $x_1 + 1 - x_2 + x_3 \ge 1$
- Add $x_i \ge 0$ and $1 - x_i \ge 0$
- Derive $0 \ge 1$ using rules for adding inequalities and the Division Rule
- $acx + bcy \ge d$ implies $ax + by \ge \lceil d/c \rceil$
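The Division Rule is the one step that uses integrality: the right-hand side is rounded up after dividing. A minimal sketch, with an inequality represented (as an assumption of this example) by a dict of integer coefficients and an integer bound; `divide` is a hypothetical name:

```python
def divide(coeffs, bound, c):
    """Apply the division rule to  sum(coeffs[v] * v) >= bound.

    Requires c > 0 to divide every coefficient. The bound is divided by c
    and rounded up, which is sound because the variables take integer values.
    """
    if c <= 0 or any(a % c != 0 for a in coeffs.values()):
        raise ValueError("c must be a positive common divisor of all coefficients")
    ceil_bound = -(-bound // c)  # integer ceiling of bound / c
    return {v: a // c for v, a in coeffs.items()}, ceil_bound
```

For instance, from $2x + 2y \ge 1$ the rule yields $x + y \ge 1$, which does not follow over the rationals.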
15 Some Proof System Relationships
[Figure: diagram relating the proof systems ZFC, P/poly-Frege, Frege, Cutting Planes, $AC^0$-Frege, Polynomial Calculus, Resolution, Nullstellensatz, Davis-Putnam, and Truth Tables]
16 Random k-CNF formulas
- Make m independent choices of one of the $2^k \binom{n}{k}$ clauses of length k
- $\Delta = m/n$ is the clause-density of the formula
- Distribution
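Sampling from this distribution is straightforward. A sketch, assuming each clause uses k distinct variables with independent random signs, and a DIMACS-style encoding (nonzero integers, `-v` for a negated variable); `random_k_cnf` is a hypothetical name:

```python
import random


def random_k_cnf(n, m, k, rng=None):
    """Draw m clauses independently and uniformly from all k-clauses on n variables.

    Each clause picks k distinct variables from 1..n and negates each one
    independently with probability 1/2. Clauses may repeat.
    """
    if rng is None:
        rng = random.Random()
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula


# n = 20 variables, m = 85 clauses: clause density 85 / 20 = 4.25
example = random_k_cnf(20, 85, 3, random.Random(0))
```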
17 Threshold behavior of random k-SAT
18 Contrast with ...
- Theorem [CS]: For every constant $\Delta$, random k-CNF formulas almost certainly require resolution proofs of size $2^{\Omega(n)}$
- What is the dependence on $\Delta$?
19 Width of resolution proofs
- If P is a resolution proof, width(P) = length of the longest clause in P
- Theorem [BW]: Every Davis-Putnam (DLL) proof of size S can be converted to one of width $\log_2 S$
- Theorem [BW]: Every resolution proof of size S can be converted to one of width $O(\sqrt{n \log S})$
20 Sub-critical Expansion
- F: a set of clauses
- s(F): minimum size of a subset of F that is unsatisfiable
- $\partial F$: boundary of F, the set of variables appearing in exactly one clause of F
- e(F): sub-critical expansion of F
- $e(F) = \max_{s \le s(F)} \ \min \{\, |\partial G| : G \subseteq F,\ s/2 < |G| \le s \,\}$
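The boundary of a set of clauses is simple to compute. A small sketch, assuming a DIMACS-style clause encoding (nonzero integers, `-v` for a negated variable) that is not part of the slides; `boundary` is a hypothetical name:

```python
from collections import Counter


def boundary(clauses):
    """Return the set of variables that appear in exactly one clause."""
    counts = Counter(v for c in clauses for v in {abs(l) for l in c})
    return {v for v, count in counts.items() if count == 1}


# Variables 2, 3 and 4 are shared between clauses; 1, 5 and 6 are not.
G = [[1, -2, 3], [2, 3, -4], [4, 5, 6]]
```

Computing e(F) itself means minimising the boundary size over all subsets in a size range, which this sketch does not attempt.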
21 Width and expansion
- Lemma [CS]: If P is a resolution proof of F then $\mathrm{width}(P) \ge e(F)$.
[Figure: proof sketch with labels s(F), "s/2 to s", G, and "contains $\partial G$"]
22 Consequences
- Corollaries
- Any Davis-Putnam (DLL) proof of F requires size at least $2^{e(F)}$
- Any resolution proof of F requires size at least $2^{\Omega(e(F)^2/n)}$
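The Davis-Putnam corollary follows by chaining the [CS] lemma with the [BW] width conversion. A sketch of that step, where $P'$ (a name introduced here) is the proof obtained by converting a Davis-Putnam proof of size $S$:

```latex
\begin{align*}
e(F) \;\le\; \mathrm{width}(P') \;\le\; \log_2 S
\quad\Longrightarrow\quad
S \;\ge\; 2^{e(F)} .
\end{align*}
```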
23 s(F) and e(F) for random formulas
- If F is a random k-CNF formula with clause density $\Delta$ then
- s(F) is $\Omega(n/\Delta^{1/(k-2)})$ almost certainly
- e(F) is $\Omega(n/\Delta^{2/(k-2)+\epsilon})$ almost certainly
- Proved for hypergraph expansion
24 Hypergraph Expansion
- F: hypergraph
- $\partial F$: boundary of F, the set of degree-1 vertices of F
- $s_H(F)$: minimum size of a subset of F that does not have a System of Distinct Representatives
- $e_H(F)$: sub-critical expansion of F
- $e_H(F) = \max_{s \le s_H(F)} \ \min \{\, |\partial G| : G \subseteq F,\ s/2 < |G| \le s \,\}$
25 System of Distinct Representatives
[Figure: clauses/edges each matched to a distinct one of their variables/nodes]
- $s_H(F) \le s(F)$ so $e_H(F) \le e(F)$
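The inequality $s_H(F) \le s(F)$ holds because a set of clauses with a system of distinct representatives is always satisfiable: set each representative variable so that it satisfies its own clause. The sketch below finds an SDR by standard augmenting-path bipartite matching and then builds that assignment. The encoding (lists of nonzero integers, `-v` for a negated variable) and the names `find_sdr` and `assignment_from_sdr` are assumptions of this example.

```python
def find_sdr(clauses):
    """Return {clause index: representative variable}, or None if no SDR exists."""
    owner = {}  # variable -> index of the clause it currently represents

    def assign(i, seen):
        # Try to give clause i a representative, re-routing earlier clauses
        # along an augmenting path if necessary.
        for v in {abs(l) for l in clauses[i]}:
            if v in seen:
                continue
            seen.add(v)
            if v not in owner or assign(owner[v], seen):
                owner[v] = i
                return True
        return False

    for i in range(len(clauses)):
        if not assign(i, set()):
            return None
    return {i: v for v, i in owner.items()}


def assignment_from_sdr(clauses, sdr):
    """Set each representative variable so that it satisfies its own clause."""
    assignment = {}
    for i, v in sdr.items():
        lit = next(l for l in clauses[i] if abs(l) == v)
        assignment[v] = lit > 0
    return assignment
```

The converse fails: three clauses on two variables can be satisfiable yet have no SDR, which is why $s_H(F)$ can be strictly smaller than $s(F)$.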
26 Density and SDRs
- The density of a hypergraph is (#edges)/(#vertices)
- Hall's Theorem: A hypergraph F has a system of distinct representatives iff every subgraph has density at most 1.
27 Density and Boundary
- A k-uniform hypergraph of density bounded below 2/k, say $2/k - \epsilon$, has average degree bounded below 2
- A constant fraction of nodes are in the boundary
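The step from density to boundary can be spelled out as a short calculation. This is a sketch under an assumption not stated on the slide: the vertex set $V$ consists of the vertices that appear in at least one edge, so every vertex has degree at least 1. Here $E$ is the edge set and $b$ (a name introduced for illustration) is the number of boundary vertices.

```latex
\begin{align*}
\text{average degree} &= \frac{k\,|E|}{|V|} = k \cdot \text{density}
   \le k\left(\frac{2}{k} - \epsilon\right) = 2 - k\epsilon, \\
k\,|E| = \sum_{v \in V} \deg(v) &\ge b + 2\,(|V| - b) = 2\,|V| - b, \\
\text{so}\quad b &\ge 2\,|V| - k\,|E| \ge k\epsilon\,|V| .
\end{align*}
```

Under that assumption at least a $k\epsilon$ fraction of the vertices lie in the boundary.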
28 Density of random formulas
- Fix a set S of vertices/variables of size r
- Probability p that a single edge/clause lands in S is at most $(r/n)^k$
- Probability that S contains at least q edges is at most $\binom{m}{q} p^q$
29 s(F) for random formulas
- Apply for q = r + 1 for all r up to s using the union bound
- for $s = O(n/\Delta^{1/(k-2)})$
30 e(F) for random formulas
- Apply for q = 2r/k for all r between s/2 and s using the union bound
- for $s = \Theta(n/\Delta^{2/(k-2)})$
31 Hypergraph Expansion and Polynomial Calculus
- Theorem [BI]: The degree of any polynomial calculus or Nullstellensatz proof of unsatisfiability of F is at least $e_H(F)/2$ if the characteristic is not 2.
- Groebner basis algorithm bound is only $n^{O(e_H(F))}$
32 k-CNF and parity equations
- Clause $(x_1 \lor \lnot x_2 \lor x_3)$ is implied by $x_1 + (x_2 + 1) + x_3 \equiv 1 \pmod 2$, i.e. $x_1 + x_2 + x_3 \equiv 0 \pmod 2$
- Derive contradiction $0 \equiv 1 \pmod 2$ by adding collections of equations
- Number of variables in the longest line is at least $e_H(F)$
33 Parity equations and polynomial calculus
- Given equations of form
- $x_1 + x_2 + x_3 \equiv 0 \pmod 2$
- Polynomial equation $y_i^2 - 1 = 0$ for each variable
- $y_i = 1 - 2x_i$
- Polynomial equation $y_1 y_2 y_3 - 1 = 0$
- would be $y_1 y_2 y_3 + 1 = 0$ if RHS were 1
- Imply the old Nullstellensatz equations if char(K) is not 2
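The change of variables can be checked exhaustively on three variables. The sketch assumes the substitution $y_i = 1 - 2x_i$, the sign convention under which even parity corresponds to a product of $+1$; `parity_encoding_holds` is a hypothetical name.

```python
from itertools import product


def parity_encoding_holds():
    """Check the +1/-1 encoding of parity equations on all of {0,1}^3."""
    for x in product([0, 1], repeat=3):
        y = [1 - 2 * xi for xi in x]          # 0 -> +1, 1 -> -1
        if any(yi * yi - 1 != 0 for yi in y):
            return False                      # y_i^2 - 1 = 0 must always hold
        even = sum(x) % 2 == 0
        prod = y[0] * y[1] * y[2]
        # prod - 1 = 0 encodes parity 0; prod + 1 = 0 encodes parity 1.
        if (prod - 1 == 0) != even or (prod + 1 == 0) != (not even):
            return False
    return True
```

Recovering $x_i = (1 - y_i)/2$ requires dividing by 2, which is where the condition that char(K) is not 2 enters.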
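34 Lower bounds
- For random k-CNF formulas with clause density $\Delta$, almost certainly, for any $\epsilon > 0$
- Any Davis-Putnam proof requires size $2^{\Omega(n/\Delta^{2/(k-2)+\epsilon})}$
- Any resolution proof requires size $2^{\Omega(n/\Delta^{4/(k-2)+\epsilon})}$
- Any polynomial calculus proof requires degree $\Omega(n/\Delta^{2/(k-2)+\epsilon})$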
35 Upper Bound
- Theorem [BKPS]: For a random k-CNF formula F with clause density $\Delta$ above the threshold, the simple Davis-Putnam (DLL) algorithm almost certainly finds a refutation of size $2^{O(n/\Delta^{1/(k-2)})}\, n^{O(1)}$
- and this is a tight bound...
36 Idea of proof
- 2-clause digraph
- $(x \lor y)$ gives edges $\lnot x \to y$ and $\lnot y \to x$
- Contradictory cycle contains both $x$ and $\lnot x$
- After setting $O(n/\Delta^{1/(k-2)})$ variables, more than 1/2 of the variables are almost certainly in contradictory cycles of the 2-clause digraph
- a few splitting steps will pick one almost certainly
- setting clauses of size 1 will finish things off
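The 2-clause digraph and the contradictory-cycle test can be sketched as follows. This assumes the usual implication-graph construction (each 2-clause $(a \lor b)$ contributes edges $\lnot a \to b$ and $\lnot b \to a$) and a DIMACS-style literal encoding (`-v` for a negated variable). The function names are introduced here for illustration.

```python
def two_clause_digraph(clauses):
    """Build the implication digraph from the clauses of size exactly 2."""
    edges = {}
    for c in clauses:
        if len(c) == 2:
            a, b = c
            edges.setdefault(-a, set()).add(b)
            edges.setdefault(-b, set()).add(a)
    return edges


def reachable(edges, start):
    """Return all literals reachable from `start` by one or more edges."""
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for w in edges.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def in_contradictory_cycle(edges, x):
    """True iff x and -x lie on a common directed cycle."""
    return -x in reachable(edges, x) and x in reachable(edges, -x)
```

If variable x is on a contradictory cycle, setting x either way forces its own negation through clauses of size 1, so both branches of a split on x are refuted by unit propagation alone.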
37 Implications
- Random k-CNF formulas are provably hard for the most common proof search procedures.
- This hardness extends well beyond the phase transition.
- Even at clause ratio $\Delta = n^{1/3}$, current algorithms on random 3-CNF formulas have asymptotically the same running time as the best factoring algorithms.
38 Random graph k-colourability
- Random graph G(n,p) where each edge occurs independently with probability p
- Sharp threshold for whether or not the graph is k-colourable, e.g. $p \approx 4.6/n$ for k = 3
- What about proofs that the graph is not k-colourable?
39 Lower Bound
- Theorem [BCM 99]: Non-k-colourability requires exponentially large resolution proofs
- Basic proof idea
- same outline as before
- notion of boundary of a sub-graph
- set of vertices of degree < k
- s(G) = smallest non-k-colourable sub-graph
40 Challenges
- Better bound for e(F) for random F
- Can it be $\Theta(s(F))$?
- If so, the simple Davis-Putnam algorithm has asymptotically best possible exponent of any DP algorithm.
- Extend lower bounds to other proof systems
- must be based on something other than expansion since certain formulas with high expansion have small Cutting Planes proofs.
41 Challenges
- Conjecture: Random k-CNF formulas are hard for Frege proofs
- Extend to other random co-NP problems
- Independent Set?
- Best algorithms only get within a factor of 2 of the largest independent set in a random graph
42 Sources
- Cook, Reckhow 79
- Chvatal, Szemeredi 89
- Mitchell, Selman, Levesque 93
- Beame, Pitassi 97
- Beame, Karp, Pitassi, Saks 98
- Beame, Pitassi 98
- Ben-Sasson, Wigderson 99
- Ben-Sasson, Impagliazzo 99
- Beame, Culberson, Mitchell 99
43 Circuit Complexity
- P/poly: polysize circuits
- $NC^1$: polysize formulas
- CNF: polysize CNF formulas
- $AC^0$: constant-depth polysize circuits using and/or/not
- $AC^0[m]$: also "= 0 mod m" tests
- $TC^0$: threshold instead
44 C-Frege Proofs
- Given a circuit complexity class C, can define C-Frege proofs to be Frege-like proofs that manipulate circuits in C rather than formulas
- Frege = $NC^1$-Frege
- Resolution = CNF-Frege
- Extended-Frege = P/poly-Frege
- $AC^0$-Frege
- $AC^0[m]$-Frege
- $TC^0$-Frege