Title: NMS PI meeting, September 27-29, 2000
1. Heuristic Search Planning: Progression and Regression
Alan Fern
- A heuristic for STRIPS problems
- Forward search (HSP, HSP2.0)
- Regression
- Regression search (HSPr)
- Based in part on slides by Daniel Weld and Dana Nau
2. Planning as heuristic search
- Use standard search techniques, e.g. A*, best-first, hill-climbing, etc.
- Attempt to extract a heuristic state evaluator automatically from the STRIPS encoding of the domain
- Here, the heuristic is based on a relaxed problem obtained by assuming preconditions are independent and ignoring delete effects
3. Review: Heuristic Search
- A* search is a best-first search using the node evaluation f(s) = g(s) + h(s)
- where
- g(s) = accumulated cost (number of actions so far)
- h(s) = estimate of future cost
- h(s) is admissible if it does not overestimate the cost to the goal
- For admissible h(s), A* returns optimal solutions
4. Heuristic from a Relaxed Problem
- The relaxed problem ignores delete lists on actions
- The length of an optimal solution for the relaxed problem is an admissible heuristic for the original problem. Why?
- BUT finding an optimal relaxed solution is still NP-hard
- So we will approximate it
- One way is to explicitly search for a relaxed plan
- Finding a relaxed plan can be done in polynomial time
- Take the relaxed-plan length to be the heuristic value
- FF (for FastForward) is one such well-known planner
5. FF Planner: finding relaxed plans
- Consider running Graphplan while ignoring the delete lists
- No mutexes
- Implies no backtracking during the solution extraction search!
- So we can find a relaxed solution efficiently
- After running the no-delete-list Graphplan, take the number of actions in the layered plan to be the heuristic (see the sketch after this list)
- Different choices in solution extraction can lead to different heuristic values
- The planner FastForward (FF) uses this heuristic in forward state-space best-first search
- Actually uses several improvements over this
- Took first place in the AIPS-2000 planning competition
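To make this concrete, here is a minimal Python sketch of the no-delete-list plan graph plus backward extraction. It is an illustration, not FF's actual code: the (name, pre, add) action triples, set-valued states, and the greedy choice of supporting action are all assumptions.

    def relaxed_plan_length(state, goal, actions):
        """No-delete-list Graphplan: grow proposition layers, then extract.
        state, goal: sets of propositions; actions: (name, pre, add) triples."""
        layers = [set(state)]
        action_layers = []
        while not goal <= layers[-1]:
            applicable = [a for a in actions if a[1] <= layers[-1]]
            new_props = layers[-1] | {p for a in applicable for p in a[2]}
            if new_props == layers[-1]:
                return None                    # goal unreachable even when relaxed
            action_layers.append(applicable)
            layers.append(new_props)

        # Backward extraction: no deletes means no mutexes, hence no backtracking.
        chosen = set()
        needed = set(goal)
        for i in range(len(action_layers) - 1, -1, -1):
            next_needed = set()
            for p in needed:
                if p in layers[i]:             # achievable earlier; defer it
                    next_needed.add(p)
                else:
                    # any adder works; different picks give different h values
                    name, pre, add = next(a for a in action_layers[i] if p in a[2])
                    chosen.add((i, name))
                    next_needed |= pre
            needed = next_needed
        return len(chosen)                     # heuristic = number of actions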
6. Example: Finding Relaxed Plans
[Figure: relaxed plan graph (no mutexes). The value returned depends on the particular choices made in the backward extraction.]
7. HSP: Indirect Relaxed Plan Length
- HSP preceded FF and was one of the first successful state-space heuristic search planners
- HSP does not compute a relaxed plan explicitly
- It uses recursive equations to compute bounds on the relaxed plan length:
- Δ(s,p) = minimum distance from state s to a state containing proposition p
- Δ(s,g) = minimum distance from state s to a state containing every proposition p in the goal set g
- Since these are NP-hard to compute, we will instead compute Δ0(s,p) and Δ0(s,g), estimates of Δ(s,p) and Δ(s,g)
8. Heuristic Functions for Planning
- Δ0(s,p) and Δ0(s,g): estimates of Δ(s,p) and Δ(s,g)
- Δ0(s,p) = 0 if p ∈ s
- h(s) = Δ0(s,g), where g is the goal
9. Admissibility
- Is h admissible?
- No. It assumes subgoals are independent, but they may not be (a small worked case follows this list).
- I.e., it assumes that the cost of achieving a set of subgoals is the sum of the costs of achieving them independently
- In reality, achieving one subgoal can help achieve another subgoal
- Consider an alternative heuristic hmax(s) = Δ0(s,g), where we redefine Δ0(s,g) to be
- Δ0(s,g) = max_{p ∈ g} Δ0(s,p)
- Is hmax admissible?
- Yes.
- Why would we use h instead of hmax then?
- h is more informative, which tends to lead us to the goal more quickly
- But the solutions found may not be optimal
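As an illustrative case (not on the original slide): let g = {p, q} and suppose a single action with no preconditions adds both p and q. Then the true relaxed distance is Δ(s,g) = 1, but the additive estimate is Δ0(s,p) + Δ0(s,q) = 1 + 1 = 2, an overestimate, while hmax returns max(1, 1) = 1 and remains admissible.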
10. Computing the Heuristic
- Given the current state s, we can compute Δ0(s,p) for every proposition p in polynomial time (a runnable sketch follows this list):
- 1) Set Δ0(s,p) = 0 if p ∈ s, otherwise Δ0(s,p) = ∞
- 2) R = s, the reachable set of propositions
- 3) Repeat until no change to any Δ0(s,p):
    for each action a such that PRE(a) ⊆ R do
      for each p ∈ ADD(a) do
        add p to R
        Δ0(s,p) = min{ Δ0(s,p), 1 + Σ_{q ∈ PRE(a)} Δ0(s,q) }
- From this, compute h(s) = Δ0(s,g) = Σ_{p ∈ g} Δ0(s,p)
- Can be viewed as a plan graph expansion that ignores delete effects (R stores the propositions in the most recent level)
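Below is a runnable Python sketch of this fixed point, under the same illustrative assumptions as before (propositions are hashable atoms, actions are (name, pre, add) triples); inf plays the role of Δ0 = ∞, and both the additive h and the hmax of slide 9 fall out of the same table.

    from math import inf

    def delta0(state, actions):
        """Fixed-point computation of Δ0(s,p) for every proposition p."""
        d = {p: 0 for p in state}              # Δ0(s,p) = 0 if p ∈ s, else ∞
        reachable = set(state)                 # R = s
        changed = True
        while changed:                         # repeat until no change
            changed = False
            for name, pre, add in actions:
                if pre <= reachable:           # PRE(a) ⊆ R
                    cost = 1 + sum(d.get(q, inf) for q in pre)
                    for p in add:              # each p ∈ ADD(a)
                        reachable.add(p)
                        if cost < d.get(p, inf):
                            d[p] = cost        # Δ0(s,p) = min(old, 1 + Σ ...)
                            changed = True
        return d

    def h_add(d, goal):                        # HSP's h: Σ_{p ∈ g} Δ0(s,p)
        return sum(d.get(p, inf) for p in goal)

    def h_max(d, goal):                        # admissible variant from slide 9
        return max((d.get(p, inf) for p in goal), default=0)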
11. HSP algorithm overview
- Hill-climbing search based on h(s)
- randomly breaks ties
- restarts if no progress is made for a given number of steps
- Some ad hoc choices for the planning competition
- Hill-climbing search is not complete and is not guaranteed optimal
12. HSP2 overview
- Based on weighted A* (WA*) search (a sketch of the loop follows this list)
- f(n) = g(n) + W·h(n)
- If W = 1, it's A* (with admissible h).
- If W > 1, it's a little greedy: generally finds solutions faster, but not optimal (within a factor of W of optimal).
- In HSP2, W = 5
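A compact Python sketch of the WA* loop with f(n) = g(n) + W·h(n). The goal_test/successors interface, unit action costs, and hashable states are assumptions for illustration, not HSP2's actual interface.

    import heapq
    from itertools import count
    from math import inf

    def wastar(start, goal_test, successors, h, W=5):   # W = 5 as in HSP2
        tie = count()                          # tiebreaker: never compare states
        frontier = [(W * h(start), next(tie), 0, start, [])]
        best_g = {start: 0}
        while frontier:
            f, _, g, s, plan = heapq.heappop(frontier)
            if goal_test(s):
                return plan                    # cost within factor W of optimal
            for a, s2 in successors(s):        # yields (action, next_state)
                g2 = g + 1                     # unit action costs
                if g2 < best_g.get(s2, inf):
                    best_g[s2] = g2
                    heapq.heappush(frontier,
                                   (g2 + W * h(s2), next(tie), g2, s2, plan + [a]))
        return None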
13. Experiments
- Does OK compared with IPP (a Graphplan derivative) and Blackbox.
14. Regression search
- Motivation for HSPr:
- HSP and HSP2 spend up to 80% of their time computing the evaluation function.
- Search backwards from the goal. This will allow reuse of the heuristic computation.
[Figure: search runs backward from the goal (a partial state) toward the initial state.]
- Problem: many possible goal states are equally acceptable, since the goal is only a partial specification. From which one should we search?
15. Regression
- Let G be a goal (a set of facts)
- The regression of a goal G through an action A, REG(G,A), yields the weakest precondition G' (the least constraining G')
- such that if G' is true before A is executed, then G is guaranteed to be true afterwards
[Figure: G' = REG(G,A) holds before action A executes; G holds after its effects. Both G and G' represent sets of world states.]
16. Regressing STRIPS Actions
- An action A is relevant for G if
- G ∩ ADD(A) ≠ ∅
- G ∩ DEL(A) = ∅
- The result of regressing G through A is
- REG(G,A) = (G − ADD(A)) ∪ PRE(A)
- (both rules are translated into code below)
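A hedged Python sketch of the relevance test and the regression rule, assuming (name, pre, add, del) action tuples and frozenset goals; the usage lines reproduce the pickup(C) example on the next slide.

    def relevant(goal, action):
        """A is relevant for G: adds something in G, deletes nothing in G."""
        name, pre, add, dele = action
        return bool(goal & add) and not (goal & dele)

    def regress(goal, action):
        """REG(G,A) = (G - ADD(A)) ∪ PRE(A)."""
        name, pre, add, dele = action
        return (goal - add) | pre

    pickup_c = ("pickup(C)",
                frozenset({"clear(C)", "ontable(C)", "handempty"}),  # PRE
                frozenset({"holding(C)"}),                           # ADD
                frozenset({"clear(C)", "handempty", "ontable(C)"}))  # DEL
    G = frozenset({"holding(C)", "on(A,B)"})
    assert relevant(G, pickup_c)
    print(regress(G, pickup_c))   # {clear(C), ontable(C), handempty, on(A,B)}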
17. Regression Example
- G = { holding(C), on(A,B) }
- pickup(C): PRE = { clear(C), ontable(C), handempty }, ADD = { holding(C) }, DEL = { clear(C), handempty, ontable(C) }
- G' = REG(G, pickup(C)) = { clear(C), ontable(C), handempty, on(A,B) }
18. HSPr search space
- Search nodes are sets of atoms (corresponding to sets of states in the original space)
- The initial search node n0 is the goal G
- Goal nodes are those that are true in the initial state s0
- The heuristic value for a search node g is h(g) = Δ0(s0,g) = Σ_{p ∈ g} Δ0(s0,p)
- Note that we can compute Δ0(s0,p) before search begins and reuse the values during search, avoiding significant computation! (sketched below)
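In code terms (reusing the delta0 sketch from slide 10), this amounts to building the table once from the initial state and scoring each regression node with a cheap sum; initial_state and actions are illustrative names.

    from math import inf

    d0 = delta0(initial_state, actions)        # computed once, before search

    def h_regression(g):
        """h(g) = Σ_{p ∈ g} Δ0(s0, p), reused for every regression node g."""
        return sum(d0.get(p, inf) for p in g)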
19. Mutexes in HSPr
- Problem: many of the regressed goal states are impossible; prune them with mutexes
- E.g., in blocksworld, {on(c,d), on(a,d), ...} is probably unreachable.
- How can we detect and prune this set during regression?
- Compute a set of mutex propositions and only consider regression results that do not include mutexed pairs.
20. Mutexes in HSPr
- First definition:
- A set M of pairs R = {p, q} is a mutex set if
- (1) R is not true in s0, and
- (2) every action A that adds p deletes q (and vice versa, swapping the roles of p and q)
- Sound, but too weak: will not recognize many mutexed propositions
21. Mutexes in HSPr, take 2
- Better definition:
- A set M of pairs R = {p, q} is a mutex set if
- (1) R is not true in s0, and
- (2) for every action A that adds p:
    either A deletes q,
    or A does not add q, and for some precondition r of A, {r, q} is in M
    (and vice versa, swapping the roles of p and q)
- The recursive definition allows for some interaction among the operators
22. Computing mutex sets
- Start with some set of potential mutex pairs
- Delete any that don't satisfy (1) and (2) above
- Keep going until you don't delete any more (a sketch of this loop follows this list)
- Initial set? Could be all pairs (usually too expensive)
- The paper gives one suggestion for a smaller initial set.
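A Python sketch of this fixed-point pruning, assuming (name, pre, add, del) action tuples and an explicit candidate set; the all-pairs seed shown at the end is the expensive option the slide mentions, not HSPr's actual seed.

    from itertools import combinations

    def mutex_pairs(candidates, s0, actions):
        """Prune candidates until conditions (1) and (2) (take 2) hold."""
        M = {frozenset(c) for c in candidates
             if not frozenset(c) <= s0}            # (1): not true in s0
        changed = True
        while changed:
            changed = False
            for pair in list(M):
                p, q = tuple(pair)
                if not (holds(p, q, M, actions) and holds(q, p, M, actions)):
                    M.discard(pair)                # (2) failed; may unsettle others
                    changed = True
        return M

    def holds(p, q, M, actions):
        """Every action adding p deletes q, or does not add q and has some
        precondition r with {r, q} already in M."""
        for name, pre, add, dele in actions:
            if p in add:
                if q in dele:
                    continue
                if q not in add and any(frozenset({r, q}) in M for r in pre):
                    continue
                return False
        return True

    # e.g. the expensive all-pairs seed: candidates = combinations(props, 2)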
23. HSPr algorithm
- Compute the mutex set M
- Compute the heuristic value Δ0(s0,p) for each proposition p
- WA* search using h(g) as the heuristic, pruning states that violate M
- W = 5, as before
24. Experiments comparing HSP2 and HSPr
- Sometimes HSPr does better, sometimes HSP2 does better. Why?
- Two reasons (per Bonet and Geffner):
- HSPr saves significant time per search node
- But regression still yields spurious states
- Also, since HSP2 recomputes the estimate in each state, it actually has more information