Title: Complexity: Intractable Problems
1Complexity: Intractable Problems
2Complexity
- A decidable problem is computationally solvable in principle, but not necessarily in practice
- The problem is resource consumption
- Time
- Space
3Tractability
- Some problems are undecidable: no computer can solve them
- E.g., Turing's Halting Problem
- Other problems are decidable but intractable: as they grow large, we are unable to solve them in reasonable time
- What constitutes reasonable time?
4Example
- Consider A = { 0^n 1^n : n ≥ 0 }
- Clearly this language is decidable.
- Question: How much time does a single-tape TM need to decide it?
5Example
- M1 = "On input w, where w is a string:
- 1. Scan across the tape and reject if a 0 is found to the right of a 1
- 2. Repeat the following while both 0s and 1s appear on the tape: scan across the tape, crossing off a single 0 and a single 1
- 3. If 0s still remain after all the 1s have been crossed out, or vice versa, reject. Otherwise, if the tape is empty, accept"
6Question
- So how much time does M1 need?
- Number of steps may depend on several parameters
- Example
- if input is a graph, could depend on number of
- Nodes
- Edges
- Maximum degree
- All, some, or none of the above!
7Our Gordian knot solution
- Definition: complexity is measured as a function of the input string length
- Worst case: the longest running time on any input of a given length
- Average case: the average running time over inputs of a given length
- Here we consider only the worst case.
8Definition
- Let M be a deterministic TM that halts on all inputs
- The running time of M is a function T: N → N, where T(n) is the maximum running time of M on any input of length n
- I.e., when M is given an input of length n, M halts after at most T(n) moves, whether or not it accepts
- Terminology
- M runs in time T(n)
- M is a T(n)-time TM
9Running Time
- The exact running time of most algorithms is quite complex
- Better to estimate it
- Informally, we want to focus on the important parts only
- Example
- 6n^3 + 2n^2 + 20n + 45 has four terms
- 6n^3 is the more important term
- n^3 is the most important part
10Big O
- Given functions f and g, we say that f is O(g) provided g eventually beats out f, up to a constant factor
- Definition: f(n) is O(g(n)) provided there exist a positive integer n0 and a positive number c such that f(n) ≤ c g(n) for all n ≥ n0
11Example: (log n)^2 is O(n)
f(n) = (log n)^2, g(n) = n
(log n)^2 ≤ n for all n ≥ 16 (logarithms base 2), so (log n)^2 is O(n)
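The bound can be spot-checked numerically. This is a minimal sketch, assuming base-2 logarithms (the slide does not state the base); the helper name `holds` and the cutoff 100000 are arbitrary choices, not part of the original.

```python
import math

def holds(n: int) -> bool:
    """True when (log2 n)^2 <= n."""
    return math.log2(n) ** 2 <= n

# The claimed range n >= 16, checked up to an arbitrary cutoff.
all_ok = all(holds(n) for n in range(16, 100_000))

# Smaller values where the inequality fails, which is why n0 = 1 would not do.
failures = [n for n in range(1, 16) if not holds(n)]
```

A finite check like this is only evidence; the claim for all n ≥ 16 still needs the usual argument that x^2 ≤ 2^x for x ≥ 4.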
12Asymptotic Notation
- Consider functions f, g: N → R
- We say that f(n) = O(g(n)) if there exist positive integers c and n0 such that f(n) ≤ c g(n) for all n ≥ n0
13Confused?
- Basic idea: ignore constant factor differences
- 617n^3 + 277n^2 + 720n + 7 = O(n^3)
- 2 = O(1)
- sin(x) = O(1)
14Reality Check
- Consider
- f1(n) = 5n^3 + 2n^2 + 22n + 6
- We claim that
- f1(n) = O(n^3)
- Let c = 6 and n0 = 10. Then
- 5n^3 + 2n^2 + 22n + 6 ≤ 6n^3
- for every n ≥ 10
15Reality Check (Part Two)
- Recall
- f1(n) = 5n^3 + 2n^2 + 22n + 6
- We have seen that f1(n) = O(n^3)
- But f1(n) is not O(n^2), because no values of c and n0 work!
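Both claims can be illustrated with a few lines of Python. This is a sketch only: it assumes the polynomial is 5n^3 + 2n^2 + 22n + 6 with all plus signs, and `first_violation` is a hypothetical helper introduced here.

```python
def f1(n: int) -> int:
    return 5 * n**3 + 2 * n**2 + 22 * n + 6

# O(n^3): the constants c = 6 and n0 = 10 work (checked up to an arbitrary cutoff).
cubic_ok = all(f1(n) <= 6 * n**3 for n in range(10, 10_000))

def first_violation(c: int, n0: int) -> int:
    """Smallest n >= n0 with f1(n) > c * n^2.

    The loop always ends because f1(n) > 5n^3, which exceeds c * n^2
    as soon as n > c / 5.
    """
    n = n0
    while f1(n) <= c * n**2:
        n += 1
    return n
```

For example, `first_violation(1000, 1)` returns 200: even a constant as large as c = 1000 is overtaken, and the same happens for any other choice of c and n0.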
16Logarithms
- Big-O interacts with logarithms in a particular way
- High-school identity: log_b n = log_2 n / log_2 b
- Changing b changes only a constant factor
- When we say f(n) = O(log n), the base is unimportant
17Important Notation
- Sometimes we will see
- f(n) = O(n^2) + O(n)
- Each occurrence of the O symbol hides a distinct constant.
- But the O(n^2) term dominates the O(n) term, so this is equivalent to f(n) = O(n^2)
18Important Notions
- A bound of n^c, where c > 0, is called polynomial.
- A bound of 2^(n^δ), where δ > 0, is called exponential.
19Back to Business
- M1 = "On input w, where w is a string:
- 1. Scan across the tape and reject if a 0 is found to the right of a 1
- 2. Repeat the following while both 0s and 1s appear on the tape: scan across the tape, crossing off a single 0 and a single 1
- 3. If 0s still remain after all the 1s have been crossed out, or vice versa, reject. Otherwise, if the tape is empty, accept"
20Analysis
- Consider the stages separately
- 1. Scan across the tape and reject if a 0 is found to the right of a 1.
- Scanning requires n steps.
- Repositioning the head requires n steps.
- Total is 2n = O(n) steps.
21Analysis
- 2. Repeat the following while both 0s and 1s appear on the tape: scan across the tape, crossing off a single 0 and a single 1
- Each scan requires O(n) steps
- Because each scan crosses off two symbols, at most n/2 scans can occur.
- Total is (n/2) O(n) = O(n^2) steps.
22Analysis
- 3. If 0s still remain after all 1s have been crossed out, or vice versa, reject. Otherwise, if the tape is empty, accept.
- A single scan requires O(n) steps.
- Total is O(n) steps.
23Analysis
- Total cost for the stages
- O(n) + O(n^2) + O(n)
- which is O(n^2)
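The stage-by-stage count can be mirrored in a rough simulation. This is a sketch, not a faithful Turing machine: it assumes the input is over {0, 1}, charges 2n head moves per sweep (across and back), and the function name `m1_decide` is hypothetical.

```python
def m1_decide(w: str) -> tuple[bool, int]:
    """Simulate M1 on w; return (accepted, approximate number of head moves)."""
    tape = list(w)
    n = len(tape)
    steps = 0

    # Stage 1: scan right and reposition; reject if a 0 follows a 1.
    steps += 2 * n
    if "10" in w:
        return False, steps

    # Stage 2: each pass crosses off one 0 and one 1 and costs one sweep
    # across the tape and back.
    while "0" in tape and "1" in tape:
        tape[tape.index("0")] = "x"
        tape[tape.index("1")] = "x"
        steps += 2 * n

    # Stage 3: one final scan looking for leftover symbols.
    steps += n
    accepted = "0" not in tape and "1" not in tape
    return accepted, steps
```

On 0^k 1^k the count is 3n + n^2 with n = 2k, so `m1_decide("000111")` gives `(True, 54)`; the n^2 term comes entirely from stage 2.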
24Definition
- Let t: N → N be a function.
- Definition: TIME(t(n)) = { L : L is a language decided by an O(t(n))-time TM }
25Question
- We have seen that
- A = { 0^n 1^n : n ≥ 0 }
- A ∈ TIME(n^2).
- Can we do better?
26Home Improvement
- M2 = "On input string w:
- 1. Scan across the tape and reject if a 0 is found to the right of a 1.
- 2. Repeat the following while both 0s and 1s appear on the tape:
- 2.1 Scan across the tape, checking whether the total number of 0s and 1s remaining is even or odd. If odd, reject.
- 2.2 Scan across the tape, crossing off every other 0 (starting with the first), and every other 1 (starting with the first).
- 3. If no 0s or 1s remain, accept; otherwise reject."
27Analysis
- Check that M2 halts
- On each scan in step 2.2, the total number of 0s is cut in half, and the remainder discarded.
- Same for 1s.
- Example: start with 13 0s
- after the first pass: 6
- then 3
- then 1
- then 0
28Analysis
- Check that M2 is correct
- Consider parity of 0s and 1s in 2.1
- Example: start with 13 0s and 13 1s
- odd, odd (13)
- even, even (6)
- odd, odd (3)
- odd, odd (1)
- Result is 1011, the reverse of the binary representation of 13
- Each pass checks one binary digit
29Running Time
- M2 = "On input string w:
- 1. Scan across the tape and reject if a 0 is found to the right of a 1.
- 2. Repeat the following while both 0s and 1s appear on the tape:
- 2.1 Scan across the tape, checking whether the total number of 0s and 1s remaining is even or odd. If odd, reject.
- 2.2 Scan across the tape, crossing off every other 0 (starting with the first), and every other 1 (starting with the first).
- 3. If no 0s or 1s remain, accept; otherwise reject."
- Each stage takes O(n) time.
- Stages 1 and 3 run once
- Stage 2.2 eliminates half of the 0s and 1s each time, so stage 2 repeats at most 1 + log2 n times
- Total for stage 2 is (1 + log2 n) O(n) = O(n log n)
- Grand total: O(n) + O(n log n) = O(n log n)
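The halving argument is easy to watch in a small model of M2. This sketch tracks only the counts of remaining 0s and 1s instead of a real tape, assumes the input is over {0, 1}, and uses the hypothetical name `m2_decide`; it reports how many times stage 2 runs.

```python
def m2_decide(w: str) -> tuple[bool, int]:
    """Model M2 on w; return (accepted, number of stage-2 passes)."""
    if "10" in w:                      # stage 1: a 0 to the right of a 1
        return False, 0
    zeros, ones = w.count("0"), w.count("1")
    passes = 0
    while zeros > 0 and ones > 0:      # stage 2
        passes += 1
        if (zeros + ones) % 2 == 1:    # stage 2.1: parity check
            return False, passes
        zeros //= 2                    # stage 2.2: crossing off every other 0,
        ones //= 2                     # starting with the first, keeps floor(k/2)
    return zeros == 0 and ones == 0, passes   # stage 3
```

With 13 0s and 13 1s the counts go 13, 6, 3, 1, 0, so the loop runs 4 = 1 + floor(log2 13) times, each costing O(n) on the real machine.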
30A Two Tape TM
- M3 = "On input string w:
- 1. Scan across the tape and reject if a 0 is found to the right of a 1.
- 2. Scan across the 0s on tape 1 to the first 1, copying the 0s onto tape 2.
- 3. Scan across the 1s on tape 1 until the end. For each 1, cross off a 0 on tape 2. If no 0s are left, reject.
- 4. If any 0s are left, reject; otherwise accept."
- Question: What is the running time?
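One way to approach the question is to count cell visits in a small model of M3. This is a sketch under stated assumptions: the input is over {0, 1}, tape 2 is modelled as a Python list, head repositioning is not charged, and `m3_decide` is a hypothetical name.

```python
def m3_decide(w: str) -> tuple[bool, int]:
    """Model the two-tape machine M3; return (accepted, cells visited on tape 1)."""
    n = len(w)
    steps = n                          # stage 1: one scan of tape 1
    if "10" in w:
        return False, steps

    tape2 = []
    i = 0
    while i < n and w[i] == "0":       # stage 2: copy the 0s onto tape 2
        tape2.append("0")
        i += 1
        steps += 1

    while i < n:                       # stage 3: each 1 crosses off one 0
        if not tape2:
            return False, steps
        tape2.pop()
        i += 1
        steps += 1

    return not tape2, steps            # stage 4: leftover 0s mean reject
```

Every stage touches each input cell at most once, so the count never exceeds 2n, which suggests a linear running time.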
31Complexity
- Deciding { 0^n 1^n : n ≥ 0 }
- single-tape M1: O(n^2).
- single-tape M2: O(n log n) (fastest possible on a single tape!)
- two-tape M3: O(n).
- Complexity and computability differ in one important way.
- Computability: all reasonable models are equivalent (Church-Turing)
- Complexity: the choice of model affects time complexity.
- Question: How does the model affect complexity?
32Models and Complexity
- Let t(n) be a function where t(n) ≥ n.
- Any t(n)-time multitape TM has an equivalent O(t^2(n))-time single-tape TM.
33Two-tape operation
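- On input w = w1 ... wn, S
- puts on its tape the contents of M's k tapes, separated by a delimiter symbol #, starting with # w1 w2 ... wn #
- scans its tape from the first # to the (k+1)-st # to read the symbols under the virtual heads
- rescans to write new symbols and move the virtual heads
- if S tries to move a virtual head onto a #, then S takes a tape fault and rearranges the tape, shifting the contents one cell to the right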
34Complexity
- For each step of M, S performs
- two scans
- up to k rightward shifts
- for a total of O(t(n)) time, since the used portion of S's tape has length O(t(n))
- In total
- initial tape arrangement: O(n) time
- S simulates each of the t(n) steps of M using O(t(n)) steps, for a total of t(n) × O(t(n)) = O(t^2(n)) steps
- grand total: O(n) + O(t^2(n)) = O(t^2(n)) steps
- The assumption that t(n) ≥ n is reasonable, because M could not even read the entire input in less time
35Non-Deterministic TMs
- The running time of a non-deterministic TM N is a function f: N → N, where f(n) is the maximum number of steps that N uses
- on any branch of the computation
- on any input of size n
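36Deterministic vs. Non-deterministic
[Figure: running time f(n) for a deterministic computation (a single path) and for a non-deterministic computation (the longest branch of the tree)]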
37Models and Complexity
- Let t(n) be a function where t(n) ≥ n.
- Any t(n)-time non-deterministic TM has an equivalent 2^O(t(n))-time deterministic single-tape TM.
- Note the contrast with the multi-tape result!
38Simulation
- Let N be a non-deterministic TM running in t(n) time.
- Simulation basic idea: D tries all possible branches
- If D finds any accepting state, it accepts.
- If all branches reject, D rejects.
- If all branches reject or loop, D loops.
39Simulation
- N's computation is a tree.
- root is the starting configuration
- each node has finite fanout at most b
- each branch has length at most t(n)
- total number of leaves is at most b^t(n)
- total number of nodes is O(b^t(n))
- time to travel from the root to a node is O(t(n))
- Time to visit every node: O(t(n) b^t(n)) = 2^O(t(n))
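The last equality deserves one more step. A short derivation, assuming b ≥ 2 is a constant that depends only on N and using base-2 logarithms:

```latex
\begin{align*}
t(n)\, b^{t(n)} &= 2^{\log_2 t(n)} \cdot 2^{t(n)\log_2 b} \\
                &= 2^{\log_2 t(n) + t(n)\log_2 b} \\
                &\le 2^{t(n)\,(1 + \log_2 b)} && \text{since } \log_2 t(n) \le t(n) \\
                &= 2^{O(t(n))} && \text{since } 1 + \log_2 b \text{ is a constant.}
\end{align*}
```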
40Important Distinction
- Polynomial difference between deterministic single- and multi-tape TMs, vs.
- Exponential difference between deterministic and non-deterministic TMs
41Polynomial Good, Exponential Bad
- Complexity differences
- Polynomial: small
- Exponential: large
42Polynomial Good, Exponential Bad
- Claim: All reasonable deterministic models of computation are polynomially equivalent. Any one can simulate another with only a polynomial increase in running time.
- Question: is a problem solvable in
- linear time? model-specific
- polynomial time? model-independent
- We are interested in computation, not models per se!
43Polynomial boundedness
- A function f(n) is polynomially bounded provided that f(n) is O(n^k) for some positive integer k
- Examples
- n^3 - 2n^2 + 100, n^(3/2) log n, and (log n)^2 are polynomially bounded
- 2^n and n^(log n) are not
44It's about time
- Time is our most important commodity
- Our gold standard for an algorithm is that its running time is a polynomially bounded function of the input size
- Such an algorithm is called a polynomial-time algorithm
45Justification
- Closure: polynomials are closed under sum, difference, product, and composition
- Practicality: problems for which polynomial-time algorithms exist can usually be solved quickly in practice
- Robustness: if a problem can be solved in polynomial time using one reasonable model of computation, then it can be solved in polynomial time using any other such model
46The Class P
- Some problems are provably decidable in polynomial time on an ordinary computer
- Polynomial time: O(n^k) for some constant k
- Vs. exponential time: O(2^(cn)) for some constant c > 0
- We say such problems belong to the set P
- A language L is in class P if there is some polynomial T(n) such that L = L(M) for some deterministic TM M of time complexity T(n)
47The Class P
- Invariant for all models of computation polynomially equivalent to the deterministic single-tape TM
- not affected by particulars of the model . . .
- go ahead, have another tape, they're pretty small . . .
48The Class P
- Roughly corresponds to realistically solvable (tractable) problems.
- Going from an exponential to a polynomial algorithm usually requires major insight
- Whereas if you find an inefficient polynomial algorithm, you can often find an efficient one
49The class NP
- Some problems are provably decidable in polynomial time on a nondeterministic computer
- We say such problems belong to the set NP
- Can think of a nondeterministic computer as a parallel machine that can freely spawn an unbounded number of processes
- A language L is in class NP if there is some polynomial T(n) such that L = L(M) for some nondeterministic TM M of time complexity T(n); that is, when M is given an input of length n, there are no sequences of more than T(n) moves of M
50P and NP
- P: set of problems that can be solved in polynomial time
- NP: set of problems for which a solution can be verified in polynomial time
- P ⊆ NP
- The big question: Does P = NP?
51Does P = NP?
- Does NP contain some problems not in P?
- Intuitively, the answer is yes
- A nondeterministic TM running in polynomial time has the ability to guess among an exponential number of possible solutions to a problem and check each one in polynomial time
- However
- No one knows for sure
52Two important problems
- MINIMUM SPANNING TREE (MST) (Class P)
- Given a complete graph on n vertices and an integer edge length d_e for each edge e, find a spanning tree of minimum total length.
- TRAVELING SALESMAN (TSP) (Class NP)
- Given a complete graph on n vertices and an integer edge length d_e for each edge e, find a spanning cycle of minimum total length.
53Minimum spanning tree
- Network nodes, possible network links, node-node distances
- Goal: choose links to connect the network as cheaply as possible
[Figure: complete graph on four nodes a, b, c, d with edge lengths 2, 2, 3, 4, 5, 7]
54Brute force: 16 trees
[Figure: the 16 spanning trees of the four-node graph, with total lengths 10, 12, 16, 8, 10, 12, 14, 11, 12, 14, 14, 9, 9, 9, 13, 11]
55Traveling salesman
- Cities, roads, city-city distances
- Goal: find a shortest possible tour that visits all four cities
[Figure: complete graph on four cities a, b, c, d with road lengths 2, 2, 3, 4, 5, 7]
56Brute force: 3 tours
[Figure: the three possible tours, with total lengths 14, 16, 16; the tour of length 14 is the winner]
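Both brute-force searches fit in a few lines for a four-node instance. This is a sketch, and the assignment of lengths to particular edges is an assumption: the slides give only the six lengths and the resulting totals, and the labeling below was picked because it reproduces those totals.

```python
from itertools import combinations, permutations

NODES = "abcd"
# Hypothetical edge labeling (an assumption); only the six lengths are from the slides.
EDGES = {("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 3,
         ("a", "d"): 4, ("b", "d"): 5, ("c", "d"): 7}

def connects_all(edges) -> bool:
    """True if the given edges reach every node starting from the first one."""
    reached = {NODES[0]}
    grew = True
    while grew:
        grew = False
        for u, v in edges:
            if (u in reached) != (v in reached):
                reached.update((u, v))
                grew = True
    return len(reached) == len(NODES)

# A set of n - 1 edges is a spanning tree exactly when it connects all n nodes.
tree_lengths = sorted(
    sum(EDGES[e] for e in subset)
    for subset in combinations(EDGES, len(NODES) - 1)
    if connects_all(subset)
)

def dist(u: str, v: str) -> int:
    return EDGES[(u, v)] if (u, v) in EDGES else EDGES[(v, u)]

# Fix the start at "a"; keep one direction of each tour (p[0] < p[-1]).
tour_lengths = sorted(
    sum(dist(x, y) for x, y in zip(("a",) + p, p + ("a",)))
    for p in permutations("bcd")
    if p[0] < p[-1]
)
```

Under this labeling the search finds 16 trees with a minimum of 8 and 3 tours with a minimum of 14. The approach is only feasible because the instance is tiny.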
57Some numbers
58How long?
- For a 100-vertex instance, we need to check 10^196 trees
- A computer that could check 10^100 trees per second would take 10^88 years to finish
- There are about 10^80 atoms in the observable universe
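These magnitudes can be reproduced with exact integer arithmetic. The sketch assumes the tree count comes from Cayley's formula n^(n-2) for labeled spanning trees of a complete graph, and counts tours as (n-1)!/2; the function names are hypothetical.

```python
import math

def spanning_trees(n: int) -> int:
    """Cayley's formula: labeled spanning trees of the complete graph on n vertices."""
    return n ** (n - 2)

def tours(n: int) -> int:
    """Distinct tours of n cities, ignoring starting point and direction."""
    return math.factorial(n - 1) // 2

SECONDS_PER_YEAR = 365 * 24 * 3600

# 100 vertices, checking 10^100 trees per second.
seconds = spanning_trees(100) // 10**100
years = seconds // SECONDS_PER_YEAR
```

Here `spanning_trees(100)` is exactly 10^196, and `years` comes out at roughly 3 × 10^88, matching the order of magnitude quoted above.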
59MST is easy
- There are several well-known procedures for solving the MST problem
- Kruskal's algorithm
- Prim's algorithm
- Solving the problem means that they always return an optimal solution
- Better yet, they are guaranteed to run quickly
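A minimal sketch of Kruskal's algorithm illustrates the "quick and optimal" claim. The example graph is hypothetical (four nodes with lengths 2, 2, 3, 4, 5, 7, assigned to edges by assumption), and the simple union-find used here is one common way to detect cycles, not necessarily the one the lecture had in mind.

```python
def kruskal(nodes, weighted_edges):
    """Return a minimum spanning tree as a list of (u, v, weight) triples."""
    parent = {v: v for v in nodes}

    def find(v):
        # Follow parent links to the component root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # Consider edges from shortest to longest; keep an edge only if it
    # joins two different components (so it cannot close a cycle).
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree

# Hypothetical instance: edge labeling is an assumption.
example_edges = [(2, "a", "b"), (2, "a", "c"), (3, "b", "c"),
                 (4, "a", "d"), (5, "b", "d"), (7, "c", "d")]
mst = kruskal("abcd", example_edges)
```

Sorting dominates the cost, so the running time is polynomial in the number of edges, in contrast to enumerating every spanning tree.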
60TSP appears to be hard
- It appears that we can have one but not both of the following
- a procedure that always runs quickly
- a procedure that always returns an optimal solution
- We don't know of any procedure that always returns an optimal solution in polynomial time; the known exact methods take exponential time in the worst case
61Verifiers
- Definition: A verifier for a language A is an algorithm V, where
- A = { w : V accepts <w,c> for some string c }
- A verifier uses additional information, represented by c, to determine that w is a member of A
- c is called a certificate, or proof, of membership in A
- For a polynomial time verifier, the certificate has polynomial length
- A language A is polynomially verifiable if it has a polynomial time verifier.
- Definition: NP is the class of languages that have polynomial time verifiers.
62The Class NP
- A language is in NP iff it is decided by some nondeterministic polynomial time Turing machine.
- Show this by converting a polynomial time verifier to an equivalent polynomial time NTM and vice versa
- The NTM simulates the verifier by guessing the certificate
- The verifier simulates the NTM by using the accepting branch as the certificate
63Proof
- For the forward direction of this theorem, let A ∈ NP and show that A is decided by a polynomial time NTM N. Let V be the polynomial time verifier for A that exists by the definition of NP. Assume that V is a TM that runs in time n^k and construct N as follows.
- On input w of length n
- Nondeterministically select a string c of length at most n^k
- Run V on input <w,c>
- If V accepts, accept; otherwise, reject
- To prove the other direction of the theorem, assume that A is decided by a polynomial time NTM N and construct a polynomial time verifier V as follows.
- On input <w,c>, where w and c are strings
- Simulate N on input w, treating each symbol of c as a description of the nondeterministic choice to make at each step
- If this branch of N's computation accepts, accept; otherwise, reject
64NTIME
- Definition: NTIME(t(n)) = { L : L is a language decided by an O(t(n)) time nondeterministic Turing machine }.
65Example of an NP Problem
- CLIQUE
- A clique in an undirected graph is a subgraph wherein every two nodes are connected by an edge.
- A k-clique is a clique that contains k nodes
[Figure: a graph with a 5-clique]
66CLIQUE
- Clique problem: determine whether a graph contains a clique of a specified size
- Let CLIQUE = { <G,k> : G is an undirected graph with a k-clique }
- Theorem: CLIQUE is in NP
67Proof
- Proof idea: The clique is the certificate
- Proof
- The following is a verifier V for CLIQUE
- V = "On input <<G,k>,c>:
- 1. Test whether c is a set of k nodes in G
- 2. Test whether G contains all edges connecting nodes in c
- 3. If both pass, accept; otherwise, reject"
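The verifier V translates almost directly into code. This is a sketch under assumptions: the graph is given as a symmetric adjacency mapping from node names to neighbor sets, the certificate is a list of node names, and `verify_clique` is a hypothetical name.

```python
from itertools import combinations

def verify_clique(graph: dict[str, set[str]], k: int, c: list[str]) -> bool:
    """Accept iff c names k distinct nodes of graph that are pairwise adjacent."""
    nodes = set(c)
    # Test 1: c is a set of k nodes in G (no repeats, no unknown nodes).
    if len(c) != k or len(nodes) != k or not nodes.issubset(graph):
        return False
    # Test 2: G contains every edge connecting two nodes of c.
    return all(v in graph[u] for u, v in combinations(nodes, 2))

# Hypothetical example: a triangle a, b, c with an extra node d attached to c.
example = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

The second test examines at most k(k-1)/2 pairs, so the whole check runs in time polynomial in the size of the input, which is what membership in NP requires.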
68P vs. NP
- P: the class of languages for which membership can be decided quickly.
- NP: the class of languages for which membership can be verified quickly.
69The P vs. NP conjecture
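- P ⊆ NP, because a polynomial time decision algorithm can serve as the verifier: it can ignore the certificate and decide membership itself
The P vs. NP conjecture: P ≠ NP
[Figure: the two possibilities, P properly contained in NP, or P = NP]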
70NP-Complete Problems
- The NP-Complete problems are problems whose individual complexity is related to that of the entire class
- If a polynomial time algorithm exists for any of these problems, all problems in NP would be polynomial-time solvable
- No polynomial-time algorithm has been discovered for any NP-Complete problem
71Why is NP-Completeness Important?
- Theoretical implications
- A researcher trying to resolve P vs. NP may focus on an NP-complete problem
- To show P = NP, one need only find a polynomial-time algorithm for an NP-complete problem
- Practical implications
- NP-completeness may prevent wasting time searching for a non-existent polynomial-time algorithm for a particular problem
- Proving a problem is NP-complete is strong evidence of its non-polynomiality, since we believe P is unequal to NP
72Polynomial-time Reducibility
- Similar to reducing to prove undecidability
- If problem A reduces to problem B, a solution to B can be used to solve A
- Here, we take efficiency of the computation into account
- Is there an efficiently computable function that maps instances of A to instances of B?
- Definition: A language B is NP-Complete if it satisfies two conditions
- B is in NP, and
- every A in NP is polynomial time reducible to B
- Theorem: If B is NP-Complete and B ∈ P, then P = NP
- Theorem: If B is NP-Complete and B is polynomial time reducible to C, for some C in NP, then C is NP-Complete
73Proving a Problem is NP-Complete
- Once we have one NP-Complete problem, we can obtain others by polynomial time reduction from it
- Establishing the first NP-Complete problem is difficult
- In 1971, Stephen Cook did this
- In 1972, Karp proved NP-completeness results for 21 problems
74An NP-Complete Problem
- Satisfiability
- Boolean variables
- values TRUE (1) and FALSE (0)
- Boolean operators
- AND (∧)
- OR (∨)
- NOT (¬ or an overbar)
- Boolean formula
75SAT
- A Boolean formula is satisfiable if some assignment of 0s and 1s to the variables makes the formula evaluate to 1
- The example formula is satisfiable because the assignment x = 0, y = 1, and z = 0 makes it evaluate to 1
- The satisfiability problem tests whether a Boolean formula is satisfiable. Let
- SAT = { <φ> : φ is a satisfiable Boolean formula }
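Satisfiability can always be tested by trying every assignment. The sketch below does exactly that; the formula `phi` is an assumed example, (NOT x AND y) OR (x AND NOT z), chosen because it is consistent with the assignment x = 0, y = 1, z = 0, and is not taken from the slides.

```python
from itertools import product

def phi(x: bool, y: bool, z: bool) -> bool:
    # Assumed example formula: (NOT x AND y) OR (x AND NOT z)
    return (not x and y) or (x and not z)

def satisfying_assignments(formula, num_vars: int) -> list[tuple[bool, ...]]:
    """Try all 2^num_vars assignments and keep those that make formula true."""
    return [
        bits
        for bits in product([False, True], repeat=num_vars)
        if formula(*bits)
    ]

solutions = satisfying_assignments(phi, 3)
```

Checking a single assignment is fast, but the search visits 2^n assignments for n variables, which is the gap between verifying and deciding that the P vs. NP question is about.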
76Cook-Levin Theorem
- Links the complexity of SAT to that of all problems in NP
- Cook-Levin Theorem
- SAT ∈ P iff P = NP
77Proof Idea
- Construct a polynomial time reduction from each language A in NP to SAT
- The reduction for A takes a string w and produces a Boolean formula φ that simulates the NP machine for A on w
- If the machine accepts, φ has a satisfying assignment that corresponds to the accepting computation
- If not, no assignment satisfies φ
- Therefore w is in A if and only if φ is satisfiable
78SAT
- The proof involves many details; no time to go through it all
- You have already seen one other NP-Complete problem: CLIQUE