Title: NMS PI meeting, September 27-29, 2000
1. Heuristic Search Planning: Progression and Regression
Alan Fern
- A heuristic for STRIPS problems
- Forward search (HSP, HSP2.0)
- Regression
- Regression search (HSPr)
- Based in part on slides by Daniel Weld and Dana Nau
2. Planning as heuristic search
- Use standard search techniques, e.g. A*, best-first, hill-climbing, etc.
- Attempt to extract a heuristic state evaluator automatically from the STRIPS encoding of the domain
- Here, the heuristic is based on a relaxed problem obtained by assuming preconditions are independent and ignoring delete effects
3. Review: Heuristic Search
- A* search is a best-first search using the node evaluation f(s) = g(s) + h(s)
- where
- g(s) = accumulated cost (number of actions so far)
- h(s) = estimate of future cost
- h(s) is admissible if it does not overestimate the cost to the goal
- For admissible h(s), A* returns optimal solutions
4. Heuristic from a Relaxed Problem
- The relaxed problem ignores delete lists on actions
- The length of an optimal solution for the relaxed problem is an admissible heuristic for the original problem. Why?
- BUT finding an optimal relaxed solution is still NP-hard
- So we will approximate it
- One way is to explicitly search for a relaxed plan
- Finding a relaxed plan can be done in polynomial time
- Take the relaxed-plan length to be the heuristic value
- FF (for FastForward) is one such well-known planner
5. FF Planner: finding relaxed plans
- Consider running Graphplan while ignoring the delete lists
- No mutexes
- Implies no backtracking during the solution extraction search!
- So we can find a relaxed solution efficiently
- After running the no-delete-list Graphplan, take the number of actions in the layered plan to be the heuristic (see the sketch after this list)
- Different choices in solution extraction can lead to different heuristic values
- The planner FastForward (FF) uses this heuristic in forward state-space best-first search
- Actually uses several improvements over this
- Took first place in the AIPS-2000 planning competition
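To make this concrete, here is a minimal Python sketch of the no-delete-list plan graph plus backward extraction. It is an illustration, not FF's actual code: the (name, pre, add) action triples, set-valued states, and the greedy choice of supporting action are all assumptions.

    def relaxed_plan_length(state, goal, actions):
        """No-delete-list Graphplan: grow proposition layers, then extract.
        state, goal: sets of propositions; actions: (name, pre, add) triples."""
        layers = [set(state)]
        action_layers = []
        while not goal <= layers[-1]:
            applicable = [a for a in actions if a[1] <= layers[-1]]
            new_props = layers[-1] | {p for a in applicable for p in a[2]}
            if new_props == layers[-1]:
                return None                    # goal unreachable even when relaxed
            action_layers.append(applicable)
            layers.append(new_props)

        # Backward extraction: no deletes means no mutexes, hence no backtracking.
        chosen = set()
        needed = set(goal)
        for i in range(len(action_layers) - 1, -1, -1):
            next_needed = set()
            for p in needed:
                if p in layers[i]:             # achievable earlier; defer it
                    next_needed.add(p)
                else:
                    # any adder works; different picks give different h values
                    name, pre, add = next(a for a in action_layers[i] if p in a[2])
                    chosen.add((i, name))
                    next_needed |= pre
            needed = next_needed
        return len(chosen)                     # heuristic = number of actions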
6. Example: Finding Relaxed Plans
[Figure: relaxed plan graph (no mutexes). The value returned depends on the particular choices made in the backward extraction.]
7. HSP: Indirect Relaxed Plan Length
- HSP preceded FF and was one of the first successful state-space heuristic search planners
- HSP does not compute a relaxed plan explicitly
- It uses recursive equations to compute bounds on the relaxed plan length:
- Δ(s,p) = minimum distance from state s to a state containing proposition p
- Δ(s,g) = minimum distance from state s to a state containing every proposition p in the goal set g
- Since these are NP-hard to compute, we will instead compute Δ0(s,p) and Δ0(s,g), estimates of Δ(s,p) and Δ(s,g)
8. Heuristic Functions for Planning
- Δ0(s,p) and Δ0(s,g): estimates of Δ(s,p) and Δ(s,g)
- Δ0(s,p) = 0 if p ∈ s
- h(s) = Δ0(s,g), where g is the goal
9. Admissibility
- Is h admissible?
- No. It assumes subgoals are independent, but they may not be (a small worked case follows this list).
- I.e., it assumes that the cost of achieving a set of subgoals is the sum of the costs of achieving them independently
- In reality, achieving one subgoal can help achieve another subgoal
- Consider an alternative heuristic hmax(s) = Δ0(s,g), where we redefine Δ0(s,g) to be
- Δ0(s,g) = max_{p ∈ g} Δ0(s,p)
- Is hmax admissible?
- Yes.
- Why would we use h instead of hmax then?
- h is more informative, which tends to lead us to the goal more quickly
- But the solutions found may not be optimal
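As an illustrative case (not on the original slide): let g = {p, q} and suppose a single action with no preconditions adds both p and q. Then the true relaxed distance is Δ(s,g) = 1, but the additive estimate is Δ0(s,p) + Δ0(s,q) = 1 + 1 = 2, an overestimate, while hmax returns max(1, 1) = 1 and remains admissible.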
10. Computing the Heuristic
- Given the current state s, we can compute Δ0(s,p) for every proposition p in polynomial time (a runnable sketch follows this list):
- 1) Set Δ0(s,p) = 0 if p ∈ s, otherwise Δ0(s,p) = ∞
- 2) R = s, the reachable set of propositions
- 3) Repeat until no change to any Δ0(s,p):
    for each action a such that PRE(a) ⊆ R do
      for each p ∈ ADD(a) do
        add p to R
        Δ0(s,p) = min{ Δ0(s,p), 1 + Σ_{q ∈ PRE(a)} Δ0(s,q) }
- From this, compute h(s) = Δ0(s,g) = Σ_{p ∈ g} Δ0(s,p)
- Can be viewed as a plan graph expansion that ignores delete effects (R stores the propositions in the most recent level)
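Below is a runnable Python sketch of this fixed point, under the same illustrative assumptions as before (propositions are hashable atoms, actions are (name, pre, add) triples); inf plays the role of Δ0 = ∞, and both the additive h and the hmax of slide 9 fall out of the same table.

    from math import inf

    def delta0(state, actions):
        """Fixed-point computation of Δ0(s,p) for every proposition p."""
        d = {p: 0 for p in state}              # Δ0(s,p) = 0 if p ∈ s, else ∞
        reachable = set(state)                 # R = s
        changed = True
        while changed:                         # repeat until no change
            changed = False
            for name, pre, add in actions:
                if pre <= reachable:           # PRE(a) ⊆ R
                    cost = 1 + sum(d.get(q, inf) for q in pre)
                    for p in add:              # each p ∈ ADD(a)
                        reachable.add(p)
                        if cost < d.get(p, inf):
                            d[p] = cost        # Δ0(s,p) = min(old, 1 + Σ ...)
                            changed = True
        return d

    def h_add(d, goal):                        # HSP's h: Σ_{p ∈ g} Δ0(s,p)
        return sum(d.get(p, inf) for p in goal)

    def h_max(d, goal):                        # admissible variant from slide 9
        return max((d.get(p, inf) for p in goal), default=0)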
11. HSP algorithm overview
- Hill-climbing search based on h(s)
- randomly breaks ties
- restarts if no progress is made for a given number of steps
- Some ad hoc choices for the planning competition
- Hill-climbing search is not complete and is not guaranteed optimal
12. HSP2 overview
- Based on weighted A* (WA*) search (a sketch of the loop follows this list)
- f(n) = g(n) + W·h(n)
- If W = 1, it's A* (with admissible h).
- If W > 1, it's a little greedy: generally finds solutions faster, but not optimal (within a factor of W of optimal).
- In HSP2, W = 5
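A compact Python sketch of the WA* loop with f(n) = g(n) + W·h(n). The goal_test/successors interface, unit action costs, and hashable states are assumptions for illustration, not HSP2's actual interface.

    import heapq
    from itertools import count
    from math import inf

    def wastar(start, goal_test, successors, h, W=5):   # W = 5 as in HSP2
        tie = count()                          # tiebreaker: never compare states
        frontier = [(W * h(start), next(tie), 0, start, [])]
        best_g = {start: 0}
        while frontier:
            f, _, g, s, plan = heapq.heappop(frontier)
            if goal_test(s):
                return plan                    # cost within factor W of optimal
            for a, s2 in successors(s):        # yields (action, next_state)
                g2 = g + 1                     # unit action costs
                if g2 < best_g.get(s2, inf):
                    best_g[s2] = g2
                    heapq.heappush(frontier,
                                   (g2 + W * h(s2), next(tie), g2, s2, plan + [a]))
        return None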
13. Experiments
- Does OK compared with IPP (a Graphplan derivative) and Blackbox.
14. Regression search
- Motivation for HSPr:
- HSP and HSP2 spend up to 80% of their time computing the evaluation function.
- Search backwards from the goal. This will allow reuse of the heuristic computation.
[Figure: search runs backward from the goal (a partial state) toward the initial state.]
- Problem: many possible goal states are equally acceptable, since the goal is only a partial specification. From which one should we search?
15. Regression
- Let G be a goal (a set of facts)
- The regression of a goal G through an action A, REG(G,A), yields the weakest precondition G' (the least constraining G')
- such that if G' is true before A is executed, then G is guaranteed to be true afterwards
[Figure: G' = REG(G,A) holds before action A executes; G holds after its effects. Both G and G' represent sets of world states.]
16. Regressing STRIPS Actions
- An action A is relevant for G if
- G ∩ ADD(A) ≠ ∅
- G ∩ DEL(A) = ∅
- The result of regressing G through A is
- REG(G,A) = (G − ADD(A)) ∪ PRE(A)
- (both rules are translated into code below)
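A hedged Python sketch of the relevance test and the regression rule, assuming (name, pre, add, del) action tuples and frozenset goals; the usage lines reproduce the pickup(C) example on the next slide.

    def relevant(goal, action):
        """A is relevant for G: adds something in G, deletes nothing in G."""
        name, pre, add, dele = action
        return bool(goal & add) and not (goal & dele)

    def regress(goal, action):
        """REG(G,A) = (G - ADD(A)) ∪ PRE(A)."""
        name, pre, add, dele = action
        return (goal - add) | pre

    pickup_c = ("pickup(C)",
                frozenset({"clear(C)", "ontable(C)", "handempty"}),  # PRE
                frozenset({"holding(C)"}),                           # ADD
                frozenset({"clear(C)", "handempty", "ontable(C)"}))  # DEL
    G = frozenset({"holding(C)", "on(A,B)"})
    assert relevant(G, pickup_c)
    print(regress(G, pickup_c))   # {clear(C), ontable(C), handempty, on(A,B)}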
17. Regression Example
- G = { holding(C), on(A,B) }
- pickup(C): PRE = { clear(C), ontable(C), handempty }, ADD = { holding(C) }, DEL = { clear(C), handempty, ontable(C) }
- G' = REG(G, pickup(C)) = { clear(C), ontable(C), handempty, on(A,B) }
18. HSPr search space
- Search nodes are sets of atoms (corresponding to sets of states in the original space)
- The initial search node n0 is the goal G
- Goal nodes are those that are true in the initial state s0
- The heuristic value for a search node g is h(g) = Δ0(s0,g) = Σ_{p ∈ g} Δ0(s0,p)
- Note that we can compute Δ0(s0,p) before search begins and reuse the values during search, avoiding significant computation! (sketched below)
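In code terms (reusing the delta0 sketch from slide 10), this amounts to building the table once from the initial state and scoring each regression node with a cheap sum; initial_state and actions are illustrative names.

    from math import inf

    d0 = delta0(initial_state, actions)        # computed once, before search

    def h_regression(g):
        """h(g) = Σ_{p ∈ g} Δ0(s0, p), reused for every regression node g."""
        return sum(d0.get(p, inf) for p in g)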
19. Mutexes in HSPr
- Problem: many of the regressed goal states are impossible; prune them with mutexes
- E.g., in blocksworld, {on(c,d), on(a,d), ...} is probably unreachable.
- How can we detect and prune this set during regression?
- Compute a set of mutex propositions and only consider regression results that do not include mutexed pairs.
20. Mutexes in HSPr
- First definition:
- A set M of pairs R = {p, q} is a mutex set if
- (1) R is not true in s0, and
- (2) every action A that adds p deletes q (and vice versa, swapping the roles of p and q)
- Sound, but too weak: will not recognize many mutexed propositions
21. Mutexes in HSPr, take 2
- Better definition:
- A set M of pairs R = {p, q} is a mutex set if
- (1) R is not true in s0, and
- (2) for every action A that adds p:
    either A deletes q,
    or A does not add q, and for some precondition r of A, {r, q} is in M
    (and vice versa, swapping the roles of p and q)
- The recursive definition allows for some interaction among the operators
22. Computing mutex sets
- Start with some set of potential mutex pairs
- Delete any that don't satisfy (1) and (2) above
- Keep going until you don't delete any more (a sketch of this loop follows this list)
- Initial set? Could be all pairs (usually too expensive)
- The paper gives one suggestion for a smaller initial set.
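A Python sketch of this fixed-point pruning, assuming (name, pre, add, del) action tuples and an explicit candidate set; the all-pairs seed shown at the end is the expensive option the slide mentions, not HSPr's actual seed.

    from itertools import combinations

    def mutex_pairs(candidates, s0, actions):
        """Prune candidates until conditions (1) and (2) (take 2) hold."""
        M = {frozenset(c) for c in candidates
             if not frozenset(c) <= s0}            # (1): not true in s0
        changed = True
        while changed:
            changed = False
            for pair in list(M):
                p, q = tuple(pair)
                if not (holds(p, q, M, actions) and holds(q, p, M, actions)):
                    M.discard(pair)                # (2) failed; may unsettle others
                    changed = True
        return M

    def holds(p, q, M, actions):
        """Every action adding p deletes q, or does not add q and has some
        precondition r with {r, q} already in M."""
        for name, pre, add, dele in actions:
            if p in add:
                if q in dele:
                    continue
                if q not in add and any(frozenset({r, q}) in M for r in pre):
                    continue
                return False
        return True

    # e.g. the expensive all-pairs seed: candidates = combinations(props, 2)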
23. HSPr algorithm
- Compute the mutex set M
- Compute the heuristic value Δ0(s0,p) for each proposition p
- WA* search using h(g) as the heuristic, pruning states that violate M
- W = 5, as before
24. Experiments comparing HSP2 and HSPr
- Sometimes HSPr does better, sometimes HSP2 does better. Why?
- Two reasons (per Bonet and Geffner):
- HSPr saves significant time per search node
- But regression still yields spurious states
- Also, since HSP2 recomputes the estimate in each state, it actually has more information