Title: Exhaustive Search
1. Exhaustive Search
Some problems involve searching a vast number of potential solutions to find an answer, and do not seem to be amenable to solution by efficient algorithms. What is an efficient algorithm? We are used to thinking in terms of O(N), O(N log N), and O(N^(3/2)), for example, and to seeing O(N^3) labelled as inefficient. For the kind of problem we are about to discuss, O(N^50) would be delightful (at least from a theoretical standpoint)!
2. Exhaustive Search
Suppose we have an algorithm that runs in O(2^N) time. Let's adopt the naive assumption that it would take 2^N CPU cycles to solve a problem of size N (an overly optimistic and simplistic assumption). Further, assume that we have a super-fast 3 GHz machine capable of 3E9 cycles per second. Thus, for a problem of size N = 50, this would take 2^50 cycles, about 1.13E15 cycles. Our CPU would then require 1.13E15 / 3E9, roughly 375,000 sec. That's about 104 hours of computing.
3. Exhaustive Search
So our problem of size N = 50 takes about 104 hours of computing. But now consider the problem for N = 51. This will take about 208 hours. So we've approximately doubled the computing effort but have solved the problem for only one more element in the problem set (50 → 51). To compute the problem for N = 59 would take six years using our super-fast CPU! This is definitely not to our advantage. Even if we came up with a computer a million times the speed of the current one, it would take 13 million years to finish the computation for N = 100.
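This arithmetic is easy to reproduce. The sketch below (in the spirit of the slides, though the program itself is not from them) assumes one CPU cycle per step and the 3 GHz clock used above, and prints the running times for several problem sizes:

    #include <cmath>
    #include <cstdio>

    int main() {
        const double cyclesPerSec = 3e9;    // the 3 GHz machine assumed above
        for (int n : {50, 51, 59, 100}) {
            double secs = std::pow(2.0, n) / cyclesPerSec;
            std::printf("N = %3d: %.3g sec = %.3g hours = %.3g years\n",
                        n, secs, secs / 3600.0, secs / (3600.0 * 24 * 365));
        }
        // The 13-million-year figure for N = 100 further assumes a machine
        // a million times faster, i.e. divide the printed years by 1e6.
        return 0;
    }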
4. Exhaustive Search Example: The Traveling Salesman Problem
The most famous problem of this type is the Traveling Salesman Problem: given N cities, find the shortest route connecting them all, with no city visited twice. It is unthinkable to solve an arbitrary instance of this problem for N = 1000, for example. The problem is difficult because there does not seem to be a way to avoid checking the lengths of a very large number of possible tours: for N cities there are (N - 1)!/2 distinct tours. Checking each possible tour is exhaustive search.
5. Exhaustive Search in Graphs
We can model the Traveling Salesman Problem with graphs: given a weighted (and possibly directed) graph, we want to find the shortest simple cycle that connects all the nodes.
[Figure: an instance of the Traveling Salesman Problem]
6. Exhaustive Search in Graphs
This brings up another problem that would seem to be easier: given an undirected graph, is there any way to connect all the nodes with a simple cycle? That is, starting at some node, can we visit all the other nodes and return to the original node, visiting every node in the graph exactly once? This is known as the Hamilton Cycle Problem. As it turns out, the Hamilton Cycle Problem is technically equivalent to the Traveling Salesman Problem.
7. Exhaustive Search in Graphs
DFS visits the nodes in the graph below in this order: A B C E F D G H I K J L M, assuming an adjacency-matrix or sorted adjacency-list representation. This is not a simple cycle, so to find a Hamilton cycle we have to try another way to visit the nodes. As it turns out, we can try all possibilities systematically with a simple modification of the DFS visit function.
8. Exhaustive Search in Graphs (adjacency matrix version)
    void visit(int k)               // a[][]: adjacency matrix; val[], id: shared state
    {
        int t;
        cout << "visit: visiting node " << k << endl;
        id++; val[k] = id;          // mark k with its position on the current path
        for (t = 0; t < N; t++)
            if (a[k][t])            // there is an edge from k to t
                if (val[t] == 0)    // t is unmarked, so the path stays simple
                    visit(t);
        id--; val[k] = 0;           // self-cleanup on return: unmark k
    }
The only marked nodes are those for which visit
hasn't completed, and they correspond to a simple
path of length id in the graph, from the initial
node to the one currently being visited. To visit
a node, we simply visit all unmarked adjacent
nodes (marked ones would not correspond to a
simple path). The recursive procedure checks all
simple paths in the graph that start at the
initial node.
Self-cleanup code on return.
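For concreteness, here is a minimal self-contained version of the routine with the shared state declared explicitly; the 4-node graph built in main is a made-up example, not one from the slides:

    #include <iostream>
    using namespace std;

    const int N = 4;
    int a[N][N];        // adjacency matrix
    int val[N];         // val[k] = position of node k on the current path (0 = unmarked)
    int id = 0;         // length of the current path

    void visit(int k) {
        cout << "visit: visiting node " << k << endl;
        id++; val[k] = id;
        for (int t = 0; t < N; t++)
            if (a[k][t] && val[t] == 0)
                visit(t);
        id--; val[k] = 0;   // self-cleanup: unmark k on the way back up
    }

    int main() {
        int edges[][2] = {{0, 1}, {1, 2}, {2, 3}, {3, 0}, {0, 2}};  // a 4-cycle plus a chord
        for (auto& e : edges) a[e[0]][e[1]] = a[e[1]][e[0]] = 1;
        visit(0);           // systematically explores every simple path from node 0
        return 0;
    }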
9. Exhaustive Search in Graphs
Each node in the tree corresponds to a call to
visit.
10. Exhaustive Search in Graphs
Each path is traversed twice (i.e., in both directions). For this example, there is only one Hamilton cycle. There are other paths of length V that do not make up a Hamilton cycle.
11. Exhaustive Search in Graphs
Note carefully that in regular DFS, nodes are marked as visited and remain marked after they are visited, whereas in exhaustive search nodes are visited many times. The unmarking of the nodes makes exhaustive search different from DFS in an essential way. Our DFS modification, so far, explores all possible paths. But we need to detect a Hamilton cycle if we come across one! Recall that id is the length of the path being tried, and val[k] is the position of node k on that path. Thus we can make the visit function test for the existence of a Hamilton cycle by testing whether there is an edge from k back to the start node whenever val[k] = V (with V the number of nodes). We can use the same technique described above to solve the Traveling Salesman Problem, by keeping track of the length (total distance) of the current path, then keeping track of the minimum of the lengths of the Hamilton cycles found.
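As a sketch, the Hamilton cycle test just described might look like this, reusing the globals from the driver above and taking node 0 as the start (the slide's "edge from k to 1" reflects 1-based indexing):

    void visit(int k) {
        id++; val[k] = id;
        if (id == N && a[k][0])     // every node is on the path and there is an
            cout << "Hamilton cycle found" << endl;   // edge back to the start node
        for (int t = 0; t < N; t++)
            if (a[k][t] && val[t] == 0)
                visit(t);
        id--; val[k] = 0;
    }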
12. Backtracking
The time taken to do exhaustive search is proportional to the number of calls to visit, which is the number of nodes in the exhaustive search tree. For large graphs, this will be very large. (For example, if this is a complete graph, with every node connected to every other node, then there are V! simple cycles, one corresponding to each arrangement of the nodes.) We need to add tests to visit to discover that recursive calls should not be made for certain nodes. This corresponds to pruning the exhaustive search tree: cutting certain branches and deleting everything connected to them.
13. Backtracking
One important technique is the removal of symmetries. In our exhaustive search, each cycle is traversed in both directions. We can try to ensure that we find each cycle just once by insisting that three particular nodes appear in a particular order. For example, if we insist on the order "C after A but before B", then we do not call visit for B unless C is already on the path, as sketched below.
This technique is not always possible. Suppose we're trying to find a path (not necessarily a cycle) connecting all vertices. Now this scheme cannot be used, since we don't know in advance whether a path will lead to a cycle or not.
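A sketch of the symmetry test (B and C here are hypothetical node indices, with the search started at A): the loop inside visit refuses to extend the path to B until C has been marked.

    // Inside visit: extend the path to B only if C is already on it,
    // so each cycle is generated once rather than once per direction.
    for (int t = 0; t < N; t++)
        if (a[k][t] && val[t] == 0) {
            if (t == B && val[C] == 0) continue;    // prune the mirror-image tour
            visit(t);
        }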
14. Backtracking
Each time we cut off the search at a node, we avoid searching the entire subtree below that node. For very large trees, this is a very substantial saving. The savings are so significant that it is worthwhile to do as much work as possible within visit to avoid making recursive calls.
15. Backtracking
Example: some paths might divide the graph in such a way that the unmarked nodes aren't connected, so no cycle can be found. Take for instance the path starting with A B C E. Applying this rule to the tree already pruned by the symmetry heuristic leads to the reduced tree above. (This costs us one DFS to discover the heuristic, but saves us many otherwise wasted calls to visit.)
16. Backtracking
When searching for the best path (for instance, in the Traveling Salesman Problem), another important pruning technique is available. Suppose a path of cost x through the graph has been found. Then it's useless to continue along any path for which the cost so far is greater than x. This can be implemented by making no recursive calls to visit if the cost of the current partial path is greater than the cost of the best path found so far, as in the sketch below. We clearly won't miss the minimum-cost path by adopting such a policy.
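A sketch of this test for the Traveling Salesman case, with hypothetical names (tspVisit, cost, best) and the globals from earlier:

    double cost[N][N];      // edge weights
    double best = 1e18;     // cost of the cheapest complete tour found so far

    void tspVisit(int k, double pathCost) {
        id++; val[k] = id;
        if (id == N && a[k][0] && pathCost + cost[k][0] < best)
            best = pathCost + cost[k][0];           // a cheaper Hamilton cycle
        for (int t = 0; t < N; t++)
            if (a[k][t] && val[t] == 0
                && pathCost + cost[k][t] < best)    // prune paths already too costly
                tspVisit(t, pathCost + cost[k][t]);
        id--; val[k] = 0;
    }

The initial call would be tspVisit(0, 0.0); when the search finishes, best holds the cost of the shortest tour.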
17. Backtracking
The pruning is more effective if a low-cost path is found early in the search. One way to make this likely is to visit the nodes adjacent to the current node in order of increasing cost. But we can do better: often, we can compute a bound on the cost of all full paths that begin with a given partial path by adding the cost of the minimum spanning tree of the unmarked nodes. This general technique of calculating bounds on partial solutions in order to limit the number of full solutions needing to be examined is sometimes called branch-and-bound. It applies whenever costs are attached to paths.
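A sketch of the stronger test, slotted into the top of tspVisit above; mstCostOfUnmarked is a hypothetical helper (for example, Prim's algorithm restricted to nodes with val[t] == 0) returning the MST cost of the unmarked nodes:

    // At the top of tspVisit: no completion of this partial path can
    // cost less than pathCost plus an MST of the remaining nodes.
    if (pathCost + mstCostOfUnmarked() >= best)
        return;             // prune: even the lower bound cannot beat best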
18. Backtracking
The general procedure just described, of solving a problem by systematically generating all possible solutions, is called backtracking. Whenever partial solutions to a problem can be successively augmented in many ways to produce a complete solution, a recursive implementation may be appropriate. The process can be described by an exhaustive search tree whose nodes correspond to partial solutions. Going down the tree corresponds to progress towards a more complete solution. Going up the tree corresponds to backtracking to some previously generated partial solution, from which point it might be worthwhile to proceed forward again.
19. Backtracking
Backtracking and branch-and-bound are widely
applicable as general problem-solving techniques.
For example, they form the basis for many
programs that play chess or checkers. In this
case, a partial solution is some legal
positioning of all the pieces on the board, and
the descendant of a node in the exhaustive search
tree is a position that can be the result of some
legal move. Ideally, we want an exhaustive
search. Realistically, we need to prune (in quite
a sophisticated way).
20. A word on performance
However sophisticated the pruning criteria, it is generally true that the running time of backtracking algorithms remains exponential. If each node in the search tree has a children on average, and the length of the solution path is N, then we expect the number of nodes in the tree to be proportional to a^N. Different backtracking rules correspond to reducing the value of a, the number of choices to try at each node. This is worthwhile because it increases the size of the problem that can be solved. For example, an algorithm that runs in O(1.1^N) time can solve a problem perhaps seven or eight times as large as one that runs in O(2^N) time (since 1.1^(7.3 N) ≈ 2^N). On the other hand, neither will do well with very large problems.
21. Subtle differences between easy and hard problems
Sometimes the line between easy and hard problems is a fine one. Recall our solution to the problem "find the shortest path from vertex x to vertex y". But if we ask "what is the longest path (without cycles) from x to y?", we have a real problem on our hands. Sometimes the fine line is even more striking when you consider similar problems that ask only yes/no questions. Easy: is there a path from x to y with weight < M? Hard: is there a path from x to y with weight > M? BFS gives a solution to the first question in linear time. All known solutions to the second can take exponential time (1).
(1) Without loss of generality, exponential time here means 2^N. The 2 can be replaced with anything > 1 and the argument still holds.
22. Deterministic / Nondeterministic Polynomial-Time Algorithms
Let's develop a simple formal model to distinguish between the efficient algorithms we've been studying and the brute-force exponential algorithms we've discussed recently. In this model, the efficiency of an algorithm is a function of the number of bits used to encode the input, using a reasonable encoding scheme. For example, you would expect a number M to be represented with log M bits, not M bits. In any case, we are merely interested in identifying algorithms guaranteed to run in time proportional to some polynomial in the number of bits of input. P: the set of all problems that can be solved by deterministic algorithms in polynomial time. Deterministic: at any point in time, the next step of the program can be predicted.
23. Deterministic / Nondeterministic Polynomial-Time Algorithms
Let's now empower our computer with nondeterminism: when an algorithm is faced with a choice of several options, it has the power to guess the right one. NP: the set of all problems that can be solved by nondeterministic algorithms in polynomial time. Obviously, any problem in P is also in NP. But it seems that there should be many more problems in NP.
24. Deterministic / Nondeterministic Polynomial-Time Algorithms
Why bother considering an imaginary tool that makes difficult problems seem trivial? No one has been able to find a single problem that can be proven to be in NP but not in P (or even prove that one exists); i.e., we do not know whether NP = P or not. This is quite frustrating, since if we could determine that a particular problem is in NP but not in P, we could abandon the search for an efficient solution to it. In the absence of such a proof, there is the possibility that some efficient algorithms have gone undiscovered. In fact, given what we know so far, there could be an efficient algorithm for every problem in NP, which would mean that many efficient algorithms have gone undiscovered.
25. Deterministic / Nondeterministic Polynomial-Time Algorithms
Virtually no one believes that P = NP, but this remains an outstanding research question.
26. Satisfiability
Another example of a problem in NP is the satisfiability problem. Consider a logical formula of the form
    (x1 + x3 + x5) * (x1 + !x2 + x4) * (!x3 + x4 + x5) * (x2 + !x3 + x5)
where each xi represents a boolean variable, + represents OR, * represents AND, and ! represents NOT. The satisfiability problem is to determine whether there is an assignment of truth values to the variables that makes the formula true ("satisfies" it). Hold that thought; we're going to need satisfiability in a little bit.
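Checking a given assignment is fast; finding one is what is hard. A brute-force check of the example formula above is itself a small exhaustive search over all 2^5 assignments (a sketch, not from the slides):

    #include <iostream>

    int main() {
        for (int m = 0; m < 32; m++) {      // each m encodes one assignment of x1..x5
            bool x1 = m & 1, x2 = m & 2, x3 = m & 4, x4 = m & 8, x5 = m & 16;
            bool f = (x1 || x3 || x5) && (x1 || !x2 || x4)
                  && (!x3 || x4 || x5) && (x2 || !x3 || x5);
            if (f) {
                std::cout << "satisfiable, e.g. x1..x5 = " << x1 << x2
                          << x3 << x4 << x5 << std::endl;
                return 0;
            }
        }
        std::cout << "unsatisfiable" << std::endl;
        return 0;
    }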
27. NP-Completeness
Let's look at some problems that are known to belong to NP but might or might not belong to P. That is, they are easy to solve on a nondeterministic machine, but no one has been able to find an efficient algorithm for them on a conventional machine. These problems have an added property: if any one of them can be solved in polynomial time on a conventional machine, then so can all of them (i.e., P = NP). Such problems are said to be NP-Complete. The primary tool used to prove that problems are NP-Complete employs the idea of polynomial reducibility: transform any instance of the known NP-Complete problem to an instance of the new problem, solve the problem using the given algorithm, then transform the solution back to a solution of the NP-Complete problem. Polynomially reducible means that the transformation can be done in polynomial time.
28. NP-Completeness
For example, to prove that a problem in NP is NP-Complete, we need only show that some known NP-Complete problem is polynomially reducible to it: that is, that a polynomial-time algorithm for the new problem can be used to solve the NP-Complete problem, and then can, in turn, be used to solve all problems in NP. Example: Traveling Salesman Problem: given a set of cities and the distances between all pairs, find a tour of all cities of distance less than M. Hamilton Cycle: given a graph, find a simple cycle that includes all the vertices. The Hamilton Cycle problem reduces to the Traveling Salesman Problem (assuming we know the Hamilton Cycle problem is NP-Complete).
29. NP-Completeness
The Hamilton Cycle problem reduces to the Traveling Salesman Problem (assuming we know the Hamilton Cycle problem is NP-Complete): give every edge of the graph distance 1 and every missing edge distance 2; then the graph has a Hamilton cycle exactly when the TSP instance has a tour of distance less than M = V + 1. That was relatively easy to illustrate. Reducibility can sometimes be very hard indeed to prove! There are literally thousands of problems that have been shown to be NP-Complete.
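The construction itself is tiny. A sketch, where MAXV and hamiltonToTsp are hypothetical names:

    const int MAXV = 100;

    // g: adjacency matrix of the graph; d: TSP distance matrix to fill in.
    // g has a Hamilton cycle exactly when the TSP instance built here
    // has a tour of total distance at most V.
    void hamiltonToTsp(int V, const int g[][MAXV], int d[][MAXV]) {
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++)
                d[i][j] = (i == j) ? 0 : (g[i][j] ? 1 : 2);
    }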
30. NP-Completeness
Reduction uses one NP-Complete problem to imply another. But how was the first such problem found? S. A. Cook gave a direct proof in 1971 that satisfiability is NP-Complete: i.e., that if there is a polynomial-time algorithm for satisfiability, then all problems in NP can be solved in polynomial time. The proof is extremely complicated and involves laying down the specification of a Turing Machine with the added property of nondeterminism. The constructed satisfiability instance essentially simulates the machine running the given program on the given input, so that solving it produces a solution to an instance of the given problem. Fortunately, this proof had to be done only once; we can use reducibility to show that other problems are NP-Complete.
31. Some NP-Complete Problems
Partition: given a set of integers, can they be divided into two sets whose sums are equal?
Integer Linear Programming: given a linear program, is there a solution in integers?
Multiprocessor Scheduling: given a deadline and a set of tasks of varying lengths to be performed on two identical processors, can the tasks be arranged so that the deadline is met?
Vertex Cover: given a graph and an integer N, is there a set of fewer than N vertices that touches all the edges?
Molecular Energy: given a molecule with a set of rotatable bonds, what is the minimum energy of the molecule if it is allowed to rotate its bonds freely?