Title: NP-complete and NP-hard problems
1. NP-complete and NP-hard problems
2. Decision problems vs. optimization problems
The problems we are trying to solve are basically of two kinds. In decision problems we are trying to decide whether a statement is true or false. In optimization problems we are trying to find the solution with the best possible score according to some scoring scheme.
Optimization problems can be either maximization problems, where we are trying to maximize a certain score, or minimization problems, where we are trying to minimize a cost function.
3. Example 1: Hamiltonian cycles
Given a directed graph, we want to decide whether or not there is a Hamiltonian cycle in this graph. This is a decision problem.
4. Example 2: TSP - The Traveling Salesman Problem
Given a complete graph and an assignment of weights to the edges, find a Hamiltonian cycle of minimum weight. This is the optimization version of the problem. In the decision version, we are given a weighted complete graph and a real number c, and we want to know whether or not there exists a Hamiltonian cycle whose combined weight of edges does not exceed c.
5. Example 3: The Minimum Spanning Tree Problem
Given a connected graph and an assignment of weights to the edges, find a spanning tree of minimum weight. This is the optimization version of the problem. In the decision version, we are given a weighted connected graph and a real number c, and we want to know whether or not there exists a spanning tree whose combined weight of edges does not exceed c.
6. Important Observation
Each optimization problem has a corresponding decision problem.
7. Homework 12
The vertex cover problem is defined over an undirected graph G = (V, E) and asks for a set of vertices W such that for each edge e in E at least one of its endpoints belongs to W, and W is as small as possible. Write out the decision version of this problem.
Worth 1 point.
8. Inputs
Each of the problems discussed above has its characteristic input. For example, for the optimization version of the TSP, the input consists of a weighted complete graph; for the decision version of the TSP, the input consists of a weighted complete graph and a real number. The input data of a problem are often called the instance of the problem. Each instance has a characteristic size, which is the amount of computer memory needed to describe the instance. If the instance is a graph of n vertices, then the size of this instance would typically be about n(n-1)/2.
9. The class P
A decision problem D is solvable in polynomial time, or in the class P, if there exists an algorithm A such that
- A takes instances of D as inputs.
- A always outputs the correct answer, Yes or No.
- There exists a polynomial p such that the execution of A on inputs of size n always terminates in p(n) or fewer steps.
10. The class P
EXAMPLE: The Minimum Spanning Tree Problem is in the class P.
The class P is often considered as synonymous with the class of computationally feasible problems, although in practice this is somewhat unrealistic.
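To make this concrete, here is a minimal sketch in Python of Kruskal's algorithm (one standard polynomial-time method, not named in the slides; the edge-list representation is an assumption made for the example). It sorts the edges and grows a forest, which takes O(E log E) time, so both the optimization and the decision version of the MST problem are solvable in polynomial time.

# Minimal sketch: Kruskal's algorithm, assuming the graph is given as a list
# of (weight, u, v) triples on vertices 0..n-1 and is connected.

def mst_weight(n, edges):
    """Weight of a minimum spanning tree."""
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):     # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                  # edge joins two components: keep it
            parent[ru] = rv
            total += w
    return total

def mst_decision(n, edges, c):
    """Decision version: is there a spanning tree of weight at most c?"""
    return mst_weight(n, edges) <= c

# Example on a 4-vertex graph
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
print(mst_weight(4, edges))        # 6
print(mst_decision(4, edges, 7))   # True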
11. Witnesses for decision problems
Note that if the answer to a decision problem is yes, then there is usually some witness to this. For example, in the Hamiltonian cycle problem, any permutation v1, v2, ..., vn of the vertices of the input graph is a potential witness. This potential witness is a true witness if v1 is adjacent to v2, v2 is adjacent to v3, and so on, and vn is adjacent to v1.
12. Witnesses for decision problems
In the TSP, a potential witness would be a Hamiltonian cycle. This potential witness is a true witness if its cost is c or less.
13. The class NP
A decision problem D is nondeterministically polynomial-time solvable, or in the class NP, if there exists an algorithm A such that
- A takes as inputs potential witnesses for yes answers to instances of problem D.
- A correctly distinguishes true witnesses from false witnesses.
- There exists a polynomial p such that for each potential witness of each instance of size n of D, the execution of the algorithm A takes at most p(n) steps.
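As an illustration of such an algorithm A, here is a minimal sketch in Python of a polynomial-time verifier for the Hamiltonian cycle problem; the adjacency-set representation and the function name are assumptions made for the example. It checks whether a proposed permutation of the vertices is a true witness.

# Minimal sketch of an NP verifier: given a directed graph (adjacency sets)
# and a potential witness (a permutation of the vertices), decide in
# polynomial time whether the witness is a true witness.

def is_true_witness(graph, perm):
    """graph: dict mapping each vertex to the set of vertices it points to.
    perm: a proposed ordering v1, v2, ..., vn of the vertices."""
    if sorted(perm) != sorted(graph):          # must list every vertex exactly once
        return False
    n = len(perm)
    # each consecutive pair, and the wrap-around pair (vn, v1), must be an edge
    return all(perm[(i + 1) % n] in graph[perm[i]] for i in range(n))

# Example: a directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0
graph = {0: {1}, 1: {2}, 2: {3}, 3: {0}}
print(is_true_witness(graph, [0, 1, 2, 3]))   # True
print(is_true_witness(graph, [0, 2, 1, 3]))   # False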
14. The class NP
Note that if a problem is in the class NP, then we are able to verify yes-answers in polynomial time, provided that we are lucky enough to guess true witnesses.
15. The P = NP Problem
Are the classes P and NP identical? This is an open problem; it may well be the biggest open problem of mathematics at the beginning of the 21st century. It is not hard to show that every problem in P is also in NP, but it is unclear whether every problem in NP is also in P.
16. The P = NP Problem
The best we can say is that thousands of computer scientists have been unsuccessful for decades in designing polynomial-time algorithms for some problems in the class NP. This constitutes overwhelming empirical evidence that the classes P and NP are indeed distinct, but no formal mathematical proof of this fact is known.
17. Polynomial-time reducibility
Let E and D be two decision problems. We say that D is polynomial-time reducible to E if there exists an algorithm A such that
- A takes instances of D as inputs and always outputs the correct answer, Yes or No, for each instance of D.
- A uses as a subroutine a hypothetical algorithm B for solving E.
- There exists a polynomial p such that for every instance of D of size n the algorithm A terminates in at most p(n) steps, if each call of the subroutine B is counted as only m steps, where m is the size of the actual input of B.
18. An example of polynomial-time reducibility
Theorem: The Hamiltonian cycle problem is polynomial-time reducible to the decision version of the TSP.
Proof: Given an instance G with vertices v1, ..., vn of the Hamiltonian cycle problem, let H be the weighted complete graph on v1, ..., vn such that the weight of an edge (vi, vj) in H is 1 if (vi, vj) is an edge in G, and is 2 otherwise. Now the correct answer for the instance G of the Hamiltonian cycle problem can be obtained by running an algorithm on the instance (H, n) of the TSP: G has a Hamiltonian cycle if and only if H has a Hamiltonian cycle of weight at most n.
19. NP-complete problems
A decision problem E is NP-complete if E is itself in the class NP and every problem in the class NP is polynomial-time reducible to E. The Hamiltonian cycle problem, the decision versions of the TSP and the graph coloring problem, as well as literally hundreds of other problems, are known to be NP-complete.
It is not hard to show that if a problem D is polynomial-time reducible to a problem E and E is in the class P, then D is also in the class P. It follows that if there exists a polynomial-time algorithm for the solution of any of the NP-complete problems, then there exist polynomial-time algorithms for all of them, and P = NP.
20. Homework 13
Given that the Hamiltonian cycle problem is NP-complete, show that the Hamiltonian path problem is also NP-complete.
Worth 2 points.
21. NP-hard problems
Optimization problems whose decision versions are NP-complete are called NP-hard.
Theorem: If there exists a polynomial-time algorithm for finding the optimum in any NP-hard problem, then P = NP.
22. NP-hard problems
Proof: Let E be an NP-hard optimization (let us say minimization) problem, and let A be a polynomial-time algorithm for solving it. Now an instance J of the corresponding decision problem D is of the form (I, c), where I is an instance of E and c is a number. Then the answer to D for instance J can be obtained by running A on I and checking whether the cost of the optimal solution exceeds c. Thus there exists a polynomial-time algorithm for D, and NP-completeness of the latter implies P = NP.
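The argument can be summarized in a few lines of Python; solve_optimum below stands for the hypothetical polynomial-time algorithm A and is an assumption made for the example.

# Minimal sketch: a polynomial-time optimizer for a minimization problem
# immediately yields a polynomial-time algorithm for the decision problem.

def decision_from_optimizer(instance, c, solve_optimum):
    """Answer "is there a solution of cost at most c?" with one call to the
    optimization algorithm."""
    best_cost = solve_optimum(instance)   # cost of an optimal solution
    return best_cost <= c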
23. Consequences for bioinformatics
In view of the overwhelming empirical evidence against the equality P = NP, it seems that no NP-hard optimization problem is solvable by an algorithm that is guaranteed to
- run in polynomial time, and
- always produce a solution with optimal score/cost.
Unfortunately, many, perhaps most, of the important optimization problems in bioinformatics are NP-hard. To make matters worse, the instances of interest in bioinformatics are typically of large size.
What can we do about these problems?
24. Towards alternative performance measures
So far we have been talking about algorithms that
- run in polynomial time on all instances, and
- always find the solution with the best score/cost.
As we have seen, such algorithms may be too much to ask for. We will now briefly discuss how one can meaningfully relax the above requirements.
25. Worst case vs. average performance
So far, we have been insisting that there exists a polynomial p such that the running time of an algorithm is bounded by p(n) for all instances of size n. However, for many practical purposes, it may be sufficient to have an algorithm whose average running time for instances of size n is bounded by a polynomial. Such an algorithm may still be unacceptably slow for some particularly bad instances, but such bad instances will necessarily be very rare and may be of little practical relevance.
26. Approximation algorithms
While optimal solutions to optimization problems are clearly best, reasonably good solutions are also of value. Let us say that an algorithm for a minimization problem D has a performance guarantee of 1 + e if for each instance I of the problem it finds a solution whose cost is at most (1 + e) times the cost of the optimal solution for instance I. While D may be NP-hard, it may still be possible to find, for some e > 0, polynomial-time algorithms for D with performance guarantee 1 + e. Such algorithms are called approximation algorithms. For maximization problems, the notion of an approximation algorithm is defined similarly.
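As a concrete example (not from the slides), the vertex cover problem of Homework 12 has a classical polynomial-time approximation algorithm with performance guarantee 1 + e for e = 1: repeatedly pick an edge that is not yet covered and take both of its endpoints. A minimal sketch in Python:

# Minimal sketch: the classical 2-approximation for vertex cover.
# The chosen edges form a matching, and any cover must contain at least one
# endpoint of each of them, so the result is at most twice the optimum.

def approx_vertex_cover(edges):
    """edges: iterable of pairs (u, v) of an undirected graph.
    Returns a vertex cover of size at most 2 * optimum."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge not yet covered
            cover.add(u)
            cover.add(v)
    return cover

# Example: the path 0-1-2-3; an optimal cover has size 2
print(approx_vertex_cover([(0, 1), (1, 2), (2, 3)]))   # {0, 1, 2, 3}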
27. Polynomial-time approximation schemes
We say that a minimization problem D has a polynomial-time approximation scheme (PTAS) if for every e > 0 there exists a polynomial-time algorithm for D with performance guarantee 1 + e. While D may be NP-hard, it may still have a PTAS.
28. Heuristic algorithms
Quite often, bioinformaticians rely on heuristic algorithms for solving NP-hard optimization problems. These are algorithms that
- appear to run reasonably fast on the average instance, and
- appear to find, most of the time, solutions within (1 + e) of optimum for reasonably small e.
However, it is not always easy or possible to mathematically analyze the performance of a heuristic algorithm.