Title: Heuristic Search
1 Heuristic Search
- Foundations of Artificial Intelligence
2 Announcement
- Mirror Site Now Available
- http://facweb.cs.depaul.edu/mobasher/classes/CS480/
- The old site on maya.cs.depaul.edu will soon be decommissioned.
3 Topics
- Heuristic Search
- What is a heuristic?
- Best-First Search and Hill-Climbing
- A* Search
4 Heuristic Search
- Problem with uniform-cost search
- We only consider the cost so far, not the expected cost of getting to the goal node
- But we don't know beforehand the cost of getting to the goal from a given state
- Solution
- We need to estimate, for each state, the cost of getting from there to a goal state
- Use heuristic information to guess which nodes to expand next
- A heuristic takes the form of an evaluation function based on domain-specific information related to the problem
- It gives us a way to evaluate a node locally, based on an estimate of the cost to get from the node to a goal node (the idea is to find the least-cost path to a goal node)
5 Evaluation Functions
- h(n): the heuristic function
- g(n): the cost of the best path found so far between the initial node and n
- f(n) = h(n) → greedy best-first search
- f(n) = g(n) + h(n) → A* search
6 Best-First Search
- Basic idea: always expand the node that minimizes (or maximizes) the evaluation function f(n)
- Greedy strategy: f(n) = h(n), where h(n) estimates the cost of getting from node n to the goal
- If we keep nodes in memory (on the queue) for backtracking, this is called (Greedy) Best-First search; if there is no queue and we stop as soon as f(n) is worse for the children than for the parent, this is called Hill-Climbing (both are sketched in code below)
- What happens if we always try, at each step, to move closer to the goal node? Best-first search will in this case find the longer solution path, since it begins by moving forward and is then committed to this choice. What about hill-climbing?
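To make the greedy strategy concrete, here is a minimal Python sketch; the interface (a successors(state) function yielding (cost, next_state) pairs, a heuristic h(state), and a goal_test) is illustrative and not fixed by the slides.

```python
import heapq
from itertools import count

def greedy_best_first_search(start, goal_test, successors, h):
    """Greedy best-first: always expand the frontier node with the smallest h.

    successors(state) -> iterable of (step_cost, next_state) pairs
    h(state)          -> estimated cost from state to a goal
    States must be hashable (they are stored in a visited set).
    """
    tie = count()                       # tie-breaker so the heap never compares states
    frontier = [(h(start), next(tie), start, [start])]
    visited = set()                     # simple repeated-state checking
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return path
        if state in visited:
            continue
        visited.add(state)
        for _cost, nxt in successors(state):
            if nxt not in visited:
                heapq.heappush(frontier, (h(nxt), next(tie), nxt, path + [nxt]))
    return None                         # frontier exhausted, no solution found
```

Replacing h with any evaluation function f turns this loop into the general best-first scheme described on the following slides.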
7 Hill-Climbing
Hill-climbing is simply a loop that continually moves in the direction of the best value; no search tree is maintained. One important refinement: when there is more than one best successor to choose from, the algorithm can select among them at random.
This simple policy has three well-known drawbacks:
1. Local maxima: the search can get stuck at a local maximum, as opposed to the global maximum.
2. Plateaus: an area of the search space where the evaluation function is flat, forcing a random walk.
3. Ridges: the slopes are steep, but the search direction points toward the side rather than toward the top.
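For contrast with the queued version above, a minimal hill-climbing sketch; the names successors and value are illustrative. Here value is maximized and ties among equally good successors are broken at random, as the slide suggests.

```python
import random

def hill_climbing(start, successors, value):
    """Keep moving to the best-valued neighbor; stop when no neighbor improves.

    successors(state) -> iterable of neighboring states
    value(state)      -> evaluation to be maximized
    Subject to local maxima, plateaus, and ridges, as noted on the slide.
    """
    current = start
    while True:
        neighbors = list(successors(current))
        if not neighbors:
            return current
        best_value = max(value(n) for n in neighbors)
        if best_value <= value(current):       # no uphill move left
            return current                     # possibly only a local maximum
        best = [n for n in neighbors if value(n) == best_value]
        current = random.choice(best)          # break ties at random
```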
8 Best-First Search
- The evaluation function f maps each search node n to a positive real number f(n)
- Traditionally, the smaller f(n), the more promising n
- Best-first search sorts the search queue at each step in increasing order of f
- A random order is assumed among nodes with equal values of f
9 Best-First Search
- The evaluation function f maps each search node n to a positive real number f(n)
- Traditionally, the smaller f(n), the more promising n
- Best-first search sorts the search queue at each step in increasing order of f
- A random order is assumed among nodes with equal values of f
"Best" only refers to the value of f, not to the quality of the actual path. Best-first search does not generate optimal paths in general.
10 Best-First Search Example (Romania)
- Suppose we don't know the actual distances beforehand, but can figure out the straight-line distances from a map
11 Best-First Search Example (Romania)
- Suppose we don't know the actual distances beforehand, but can figure out the straight-line distances from a map
- Heuristic evaluation function
- h(n) = straight-line distance between n and Bucharest
- h(n) is a heuristic because it is an estimate of the actual cost of getting from n to the goal
- Note that h(goal) = 0 always
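A small sketch of this heuristic as a lookup table in Python, using the straight-line-distance values labeled on the next few slides (only the cities shown there are included); the names SLD_TO_BUCHAREST and h are illustrative.

```python
# Straight-line distances to Bucharest, as labeled on the following slides
SLD_TO_BUCHAREST = {
    "Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
    "Fagaras": 178, "Oradea": 380, "Rimnicu": 193, "Bucharest": 0,
}

def h(city):
    """Heuristic: straight-line distance from city to the goal (Bucharest)."""
    return SLD_TO_BUCHAREST[city]

assert h("Bucharest") == 0   # h(goal) = 0 always
# Greedy best-first starting at Arad first expands its lowest-h neighbor, Sibiu:
assert min(["Sibiu", "Timisoara", "Zerind"], key=h) == "Sibiu"
```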
12 Greedy Best-First Search
[Search tree: the root is Arad, h(n) = 366]
Queue: < Arad <
13 Greedy Best-First Search
[Search tree: Arad (h = 366) expanded; its children are Sibiu (h = 253), Timisoara (h = 329), Zerind (h = 374)]
Queue: < Sibiu, Timisoara, Zerind <
14 Greedy Best-First Search
[Search tree: Sibiu (h = 253) is expanded next; its children are Fagaras (h = 178), Rimnicu (h = 193), Arad (h = 366), Oradea (h = 380)]
Queue: < Fagaras, Rimnicu, Timisoara, Zerind, Oradea <
15 Greedy Best-First Search
[Search tree: Fagaras (h = 178) is expanded next; its children are Sibiu (h = 253) and Bucharest (h = 0)]
Queue: < Bucharest, Rimnicu, Timisoara, Zerind, Oradea <
16 Greedy Best-First Search
[Search tree: Bucharest (h = 0) is at the front of the queue, so the search stops]
Actual cost of the solution Arad → Sibiu → Fagaras → Bucharest: 140 + 99 + 211 = 450. But consider the path Arad → Sibiu → Rimnicu → Pitesti → Bucharest, with cost 418. So we got a suboptimal solution.
17 Heuristics for 8-Puzzle Problem
- In total, there are 9! = 362,880 possible states.
- However, with a good heuristic function, it is possible to reduce the number of states examined to fewer than 50.
- Some possible heuristics for the 8-puzzle (see the sketch after this slide):
- h1(n) = number of misplaced tiles
- may have many plateaus (indistinguishable states); doesn't capture how far each tile is from its correct place
- h2(n) = sum of Manhattan distances (i.e., the number of squares each tile is from its desired location)
- doesn't capture the importance of sequencing tiles (putting them in the right order)
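A minimal sketch of h1 and h2 in Python, assuming states are 9-tuples in row-major order with 0 for the blank; the goal layout and the sample start state below are illustrative, not the boards pictured on the next slide.

```python
GOAL = (1, 2, 3, 8, 0, 4, 7, 6, 5)   # one common 8-puzzle goal layout (assumed)

def h1(state, goal=GOAL):
    """Number of misplaced tiles (the blank, 0, is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal=GOAL):
    """Sum of Manhattan distances of every tile from its goal position."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

# Hypothetical start state, just to exercise the two functions:
start = (2, 8, 3, 1, 6, 4, 7, 0, 5)
print(h1(start), h2(start))   # misplaced-tile count and summed tile distances
```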
18 Heuristics for 8-Puzzle Problem
[Figure: a start state s and a goal state g of the 8-puzzle]
- s = start state, g = goal state
- h1(s) = 7
- h2(s) = 4 + 2 + 2 + 3 + 3 + 0 + 2 + 2 = 18
19 Part of the search tree generated by Best-First search using h2 (sum of Manhattan distances)
20 Heuristic Search in 8-Puzzle
[Figure: part of the search tree generated by Best-First search using h2 (sum of Manhattan distances), from the initial node to the goal node]
What will happen with hill-climbing?
21 Properties of Best-First (Greedy) Search
- Complete?
- No: it can get stuck in loops, e.g., Iasi → Neamt → Iasi → Neamt → ...
- It is complete in finite spaces with repeated-state checking
- Time complexity
- In the worst case O(b^m), but a good heuristic can give dramatic improvement
- Space complexity
- In the worst case O(b^m): keeps all nodes in memory
- Optimal? No
22 A* Search (the most popular algorithm in AI)
- Basic idea: avoid expanding paths that are already expensive
- Evaluation function: f(n) = g(n) + h(n) (a short sketch in code follows this slide)
- g(n) = cost so far to reach n
- h(n) = estimated cost from n to the goal
- f(n) = estimated total cost of the path through n to the goal
- Admissible heuristics: h(n) ≤ h*(n) for all n, where h*(n) is the true cost from n
- Example: straight-line distance never overestimates the actual road distance
- Example: h1 and h2 in the 8-puzzle never overestimate the actual number of moves
- A* search is optimal (finds the lowest-cost solution) if h(n) is admissible
- However, the number of nodes expanded depends on how good the heuristic is
- Best case: h(n) = h*(n) for all n → A* finds the best solution with no search
- If h(n) > h*(n) for some n, A* might still find a solution, but it is no longer guaranteed to be optimal
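A minimal A* sketch in Python, using the same illustrative interface as the greedy sketch earlier (successors yields (cost, next_state) pairs, h is the heuristic). The frontier is ordered by f(n) = g(n) + h(n), and the cheapest known g per state is tracked so that costlier duplicate entries are skipped.

```python
import heapq
from itertools import count

def a_star_search(start, goal_test, successors, h):
    """A*: expand the frontier node with the smallest f(n) = g(n) + h(n).

    Returns (path, cost) for the goal found, or None if the frontier empties.
    """
    tie = count()                                   # tie-breaker for the heap
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}                             # cheapest known cost to each state
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return path, g
        if g > best_g.get(state, float("inf")):     # stale queue entry, skip it
            continue
        for step_cost, nxt in successors(state):
            g2 = g + step_cost
            if g2 < best_g.get(nxt, float("inf")):  # found a cheaper path to nxt
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), next(tie), g2, nxt, path + [nxt]))
    return None
```

With an admissible heuristic, the first goal node popped from the frontier is an optimal solution, which is exactly the behavior traced on the Romania slides that follow.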
23 A* Search
[Search tree: Arad (f = 366) expanded; its children are Sibiu (g = 140, h = 253, f = 393), Timisoara (g = 118, h = 329, f = 447), Zerind (g = 75, h = 374, f = 449). Sibiu is expanded next; its children are Rimnicu (f = 413), Fagaras (f = 417), Arad (f = 646), Oradea (f = 671). Then Rimnicu is expanded; its children are Pitesti (f = 415), Craiova (f = 526), Sibiu (f = 553)]
24 A* Search
[Search tree continued: Pitesti (f = 415) is expanded next; its children are Bucharest (f = 418), Rimnicu (f = 607), Craiova (f = 615)]
25 A* Search
[Search tree continued: Fagaras (f = 417) is expanded next; its children are Bucharest (f = 450) and Sibiu (f = 591). The cheapest node on the frontier is now Bucharest reached via Pitesti (f = 418)]
26 A* Search
[Search tree: Bucharest reached via Pitesti (f = 418) is expanded next; it is the goal, so A* stops with the optimal path Arad → Sibiu → Rimnicu → Pitesti → Bucharest of cost 418]
27 A* search for an instance of the 8-puzzle with h1 (number of misplaced tiles). g(n) assumes each move has a cost of 1. Here we assume repeated-state checking.
[Figure: the search tree, with f(n) = g(n) + h(n) labels on each node]
28 A* search for an instance of the 8-puzzle with h1 (number of misplaced tiles). g(n) assumes each move has a cost of 1. Here we assume repeated-state checking.
[Figure: the search tree with the order of expansion marked; f(n) = g(n) + h(n)]
29-33 A* search for an instance of the 8-puzzle with h1 (number of misplaced tiles), continued.
[These slides build up the same search tree step by step; f(n) = g(n) + h(n)]
34 A* search for an instance of the 8-puzzle with h1 (number of misplaced tiles). g(n) assumes each move has a cost of 1. Here we assume repeated-state checking.
[Figure: the completed search tree; f(n) = g(n) + h(n)]
Note: at level 2 there are two nodes with f(n) = 5. Depending on which of them we put in front of the queue, the algorithm will expand either 6 or 7 nodes. Here we have assumed the worst case, so the tree shows 7 nodes expanded.
35 A* and Repeated States
- The heuristic h is clearly admissible
36 A* and Repeated States
[Figure: the search generates a new node that revisits a previously reached state]
If we discard this new node, then the search algorithm expands the goal node next and returns a non-optimal solution.
37 A* and Repeated States
[Figure: the same example, keeping the node that revisits the state]
Instead, if we do not discard nodes revisiting states, the search terminates with an optimal solution.
38 A* and Repeated States
- It is not harmful to discard a node revisiting a state if the new path to this state has a higher cost than the previous one
- A* remains optimal, but the size of the search tree can still be exponential in the worst case
- Fortunately, for a large family of admissible heuristics (consistent heuristics), there is a much easier way of dealing with revisited states
39 A* and Consistency (Monotonicity)
- A heuristic h(n) is consistent if, for every node n and every successor succ(n) of n:
- h(n) ≤ h(succ(n)) + cost(n → succ(n))
- i.e., the decrease in heuristic value due to an action is never more than the cost of the action (a small check is sketched below)
- All consistent heuristics are admissible
- For a consistent heuristic, the values of f(n) along any path are non-decreasing (monotonicity)
- A* expands nodes in non-decreasing order of f(n)
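A tiny sketch of how the consistency inequality can be checked over an explicit set of edges; the graph, heuristic values, and names here are purely illustrative.

```python
def is_consistent(h, edges):
    """Check h(n) <= h(succ) + cost(n -> succ) for every edge.

    h     : dict mapping state -> heuristic value
    edges : iterable of (n, succ, cost) triples
    """
    return all(h[n] <= h[succ] + cost for n, succ, cost in edges)

# Tiny illustrative example (hypothetical numbers):
h = {"A": 5, "B": 3, "C": 0}
edges = [("A", "B", 2), ("B", "C", 4), ("A", "C", 6)]
print(is_consistent(h, edges))   # True: no edge violates the inequality
```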
40 A* and Consistency (Monotonicity)
[Figure: an 8-puzzle state N and its goal state]
- h1(N) = number of misplaced tiles
- h2(N) = sum of the (Manhattan) distances of every tile to its goal position
- Both heuristics are consistent
41 A* and Repeated States
Theorem: If h is consistent, then whenever A* expands a node, it has already found an optimal path to this node's state.
- Dealing with repeated states (sketched in code below):
- If a newly generated state was previously expanded, discard the new state
- If multiple (unexpanded) instances of a state end up on the queue, keep only the instance with the smallest f value
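A sketch of this repeated-state policy grafted onto an A* loop, assuming a consistent heuristic and the same illustrative interface as the earlier sketches; with an admissible but inconsistent heuristic, never re-expanding a state can return a suboptimal path, which is exactly the pitfall of slides 36-37.

```python
import heapq
from itertools import count

def a_star_consistent(start, goal_test, successors, h):
    """A* with the repeated-state policy from this slide (h assumed consistent)."""
    tie = count()
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_f = {start: h(start)}      # smallest f seen per state (queue filter)
    expanded = set()                # states already expanded: discard revisits
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if state in expanded:
            continue                # previously expanded, so discard the new node
        if goal_test(state):
            return path, g
        expanded.add(state)
        for step_cost, nxt in successors(state):
            if nxt in expanded:
                continue
            f2 = g + step_cost + h(nxt)
            if f2 < best_f.get(nxt, float("inf")):   # keep only the smallest-f copy
                best_f[nxt] = f2
                heapq.heappush(frontier, (f2, next(tie), g + step_cost, nxt, path + [nxt]))
    return None
```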
42 A* and Informedness
- Finding good heuristics for a problem
- Relax restrictions on the operators: in general, the cost of an exact solution to a relaxed problem is a good heuristic for the original problem
- E.g., the sum of Manhattan distances in the 8-puzzle gives the exact solution cost for the relaxed version of the problem in which a tile can move in any direction, even onto occupied squares
- Use statistical information from training examples to predict the correct heuristic value for the nodes (this may result in an inadmissible heuristic function)
43 A* and Informedness (Cont.)
- Multiple admissible heuristics?
- Given admissible heuristics h1 and h2, if h1(n) ≥ h2(n) for all n, then h1 dominates h2
- The dominating admissible heuristic usually expands fewer nodes (it is more informed)
- If there are several admissible heuristics, none of which dominates the others, we can take the composite heuristic h(n) = max(h1(n), h2(n), ..., hk(n)) (sketched below)
- A* efficiency and informedness
- A* expands every node n for which f(n) < f*
- i.e., every node with h(n) < f* - g(n) will be expanded
- So, if h1(n) ≥ h2(n) for all n, every node expanded with h1 must also be expanded with h2
Moral of the story: heuristic functions with higher values work better, so long as they are admissible (in other words, we want our heuristic to be as close as possible to h*).
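A one-function sketch of the composite heuristic; composite_heuristic and the component heuristics are illustrative names.

```python
def composite_heuristic(*heuristics):
    """Combine several admissible heuristics by taking their pointwise max.

    The max of admissible heuristics is itself admissible and dominates
    each component, so it is at least as informed as any of them.
    """
    return lambda n: max(h(n) for h in heuristics)

# Usage with the 8-puzzle heuristics sketched earlier (illustrative):
# h = composite_heuristic(h1, h2)
```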
44 Efficiency of A*
Comparison of search costs and effective branching factors for Iterative Deepening search and for the A* algorithm with h1 and h2 on the 8-puzzle. d is the average depth of the search tree. Data are averaged over 100 instances of the problem, for various solution lengths.
45 IDA* Algorithm
- A potential problem with A* is memory
- Since it reduces to breadth-first (uniform-cost) search when h = 0, it will potentially use memory that is exponential in the depth of the optimal goal node
- Iterative deepening can again help, but now we prune the nodes for which the nearest goal node can be shown to lie below the cutoff depth
- Note that individual iterations perform a depth-first search: the heuristic function is used to prune nodes, but not to determine the order of node expansion
46 IDA* Algorithm
- Sketch of the IDA* algorithm (a Python rendering follows):
- 1. Set c = 1; this is the current cutoff value.
- 2. Set L to be the list of initial nodes.
- 3. If L is empty, increment c and return to step 2.
- 4. Let n be the first node on L.
- 5. If n is a goal node, stop and return the path from the initial node to n.
- 6. Otherwise, remove n from L. Add to the front of L every child node n' of n for which f(n') ≤ c. Return to step 3.
47 When to Use Search Techniques?
- The search space is small, and
- no other technique is available, or
- developing a more efficient technique is not worth the effort
- The search space is large, and
- no other technique is available, and
- good heuristics exist
48 Exercise
- Consider the problem of solving a crossword puzzle
- The initial state is an empty board with some cells possibly blocked
- A goal state is a board configuration filled in with legal English words
- How can this problem be viewed as a search problem? What are the operators? How can we measure path costs? Etc.
- Assuming we have a dictionary of 100,000 words, what would be a good (uninformed) search strategy to use? Why?
- What might be some good heuristics to use for this problem?
- How well might hill-climbing strategies work in solving this problem? How can we handle the local-minima problem? Propose a solution and discuss its effectiveness.