Title: Finding Admissible Bounds for Over-subscribed Planning Problems
Slide 1: Finding Admissible Bounds for Over-subscribed Planning Problems
Menkes van den Briel
Subbarao Kambhampati
Arizona State University
Slide 2: Is this plan good?
Slide 3: How good is a given plan?
How to drive a planner to find a good plan?
The two questions are related
Especially important when plan quality may vary widely, e.g., when we have many soft goals
Admissible heuristics help with both:
- per-node use
- one-shot use
Need a heuristic schema that admits degrees of relaxation
Slide 5: Challenges
1. Build a strong admissible heuristic
2. Provide a way to add relaxation for varied use
An integer programming (IP) based heuristic
Use the linear programming (LP) relaxation
Slide 6: PSP-UD: Partial Satisfaction Planning with Utility Dependency
Actions have cost
Goal sets have utility
Example: truck t and package p1, locations loc1 and loc2

State  Facts                      Reached by                 Sum cost  Utility            Net benefit
S0     (at t loc1) (in p1 t)      (initial state)            0         10                 10 - 0 = 10
S1     (at t loc2) (in p1 t)      (move t loc2), cost 20     20        0                  0 - 20 = -20
S2     (at t loc2) (at p1 loc2)   (unload p1 loc2), cost 5   25        10                 10 - 25 = -15
S3     (at t loc1) (at p1 loc2)   (move t loc1), cost 20     45        10 + 10 + 60 = 80  80 - 45 = 35

Goal utilities
utility((at t loc1)) = 10
utility((at p1 loc2)) = 10
utility((at t loc1) (at p1 loc2)) = 60
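The net benefit figures of the example can be reproduced with a few lines of Python. This is only a sketch: the names `GOAL_UTILITY`, `STATES`, `PLAN`, `utility`, and `net_benefits` are illustrative, and the 60-utility dependency is assumed to be over (at t loc1) and (at p1 loc2), which is what util(S3) = 80 implies.

```python
# Goal sets and their utilities (assumed reading of the example).
GOAL_UTILITY = {
    frozenset({"(at t loc1)"}): 10,
    frozenset({"(at p1 loc2)"}): 10,
    frozenset({"(at t loc1)", "(at p1 loc2)"}): 60,
}

# Facts that hold in each state of the example.
STATES = {
    "S0": {"(at t loc1)", "(in p1 t)"},
    "S1": {"(at t loc2)", "(in p1 t)"},
    "S2": {"(at t loc2)", "(at p1 loc2)"},
    "S3": {"(at t loc1)", "(at p1 loc2)"},
}

# Actions in plan order, with their costs; action i leads from S(i) to S(i+1).
PLAN = [("(move t loc2)", 20), ("(unload p1 loc2)", 5), ("(move t loc1)", 20)]


def utility(facts):
    """Sum the utility of every goal set fully contained in `facts`."""
    return sum(u for goal_set, u in GOAL_UTILITY.items() if goal_set <= facts)


def net_benefits():
    """Return utility minus accumulated action cost for S0..S3."""
    result = {}
    spent = 0
    for i, name in enumerate(["S0", "S1", "S2", "S3"]):
        if i > 0:
            spent += PLAN[i - 1][1]  # cost of the action that reached this state
        result[name] = utility(STATES[name]) - spent
    return result


if __name__ == "__main__":
    for state, value in net_benefits().items():
        print(state, value)
```

Stopping at S0 already yields a positive net benefit, but only the full three-action plan collects the dependency utility, which is why quality can vary widely across plans.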
Slide 7: Building a Heuristic
A network flow model on variable transitions
Capture relevant transitions with multi-valued fluents
add initial states
add prevail constraints
add goal states
add cost on actions
add utility on goals
[Figure: transition networks for the package and truck state variables over loc1 and loc2, with action costs (5 and 20) on the transitions and utilities (10, 10, and 60) on the goals]
Slide 8: Building a Heuristic
Constraints of this model
1. If an action executes, then all of its effects and prevail conditions must also.
2. If a fact is deleted, then it must be added to re-achieve a value.
3. If a prevail condition is required, then it must be achieved.
4. A goal utility dependency is achieved if its goals are achieved.
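Constraint 2 can be checked by hand on the truck and package example from the PSP-UD slide. The sketch below assumes a multi-valued encoding in which the truck takes values loc1 or loc2 and the package takes values t (in the truck) or loc2; the names `INITIAL`, `EFFECTS`, `end_values`, and `is_flow_feasible` are illustrative only.

```python
from collections import Counter

# Initial value of each state variable (assumed encoding of the example).
INITIAL = {"truck": "loc1", "package": "t"}

# One entry per executed effect: (action, state variable, value deleted, value added).
EFFECTS = [
    ("(move t loc2)", "truck", "loc1", "loc2"),
    ("(unload p1 loc2)", "package", "t", "loc2"),
    ("(move t loc1)", "truck", "loc2", "loc1"),
]


def end_values(initial, effects):
    """Flow balance per value: initial + times added - times deleted."""
    balance = Counter()
    for var, value in initial.items():
        balance[(var, value)] += 1
    for _action, var, deleted, added in effects:
        balance[(var, added)] += 1
        balance[(var, deleted)] -= 1
    return dict(balance)


def is_flow_feasible(balance):
    """The balance is a valid end value only if it is 0 or 1 for every value."""
    return all(v in (0, 1) for v in balance.values())


if __name__ == "__main__":
    result = end_values(INITIAL, EFFECTS)
    print(result, is_flow_feasible(result))
```

A balance of 1 marks the value a variable ends in; a negative balance would mean a value was deleted more often than it was available, which the constraint rules out.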
Slide 9: Formulation
Variables
action(a) ∈ Z: the number of times action a ∈ A is executed
effect(a,v,e) ∈ Z: the number of times transition e in state variable v is caused by action a
prevail(a,v,f) ∈ Z: the number of times prevail condition f in state variable v is required by action a
endvalue(v,f) ∈ {0,1}: equal to 1 if value f is the end value of state variable v
goaldep(k) ∈ {0,1}: equal to 1 if goal dependency G_k is achieved
Parameters
cost(a): the cost of executing action a ∈ A
utility(v,f): the utility of achieving value f in state variable v
utility(k): the utility of achieving goal dependency G_k
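1. If an action executes, then all of its effects and prevail conditions must also.
\[ \mathrm{action}(a) = \sum_{e \,\in\, \text{effects of } a \text{ in } v} \mathrm{effect}(a,v,e) + \sum_{f \,\in\, \text{prevails of } a \text{ in } v} \mathrm{prevail}(a,v,f) \]
2. If a fact is deleted, then it must be added to re-achieve a value.
\[ 1\{\text{if } f \in s_0[v]\} + \sum_{\text{effects that add } f} \mathrm{effect}(a,v,e) = \sum_{\text{effects that delete } f} \mathrm{effect}(a,v,e) + \mathrm{endvalue}(v,f) \]
3. If a prevail condition is required, then it must be achieved.
\[ 1\{\text{if } f \in s_0[v]\} + \sum_{\text{effects that add } f} \mathrm{effect}(a,v,e) \ge \mathrm{prevail}(a,v,f) / M \]
4. A goal utility dependency is achieved if its goals are achieved.
\[ \mathrm{goaldep}(k) \ge \sum_{f \text{ in dependency } k} \mathrm{endvalue}(v,f) - (|G_k| - 1) \]
\[ \mathrm{goaldep}(k) \le \mathrm{endvalue}(v,f) \quad \forall f \text{ in dependency } k \]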
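Constraint 4 links goaldep(k) to the end values of its goals. Reading it as the usual pair of linking inequalities, goaldep(k) >= sum of endvalue(v,f) - (|Gk| - 1) and goaldep(k) <= endvalue(v,f) for every f in the dependency (an assumed reading of the slide), a short enumeration shows that for binary end values the pair leaves exactly one feasible value: the logical AND. The function names are illustrative.

```python
from itertools import product


def goaldep_range(end_values):
    """Return the (lower, upper) bounds the linking inequalities place on goaldep(k).

    `end_values` is a tuple of 0/1 values, one per goal in the dependency.
    The bounds are clipped to [0, 1] because goaldep(k) is binary.
    """
    n = len(end_values)
    lower = max(0, sum(end_values) - (n - 1))  # goaldep >= sum - (|Gk| - 1)
    upper = min(1, min(end_values))            # goaldep <= each endvalue
    return lower, upper


def linking_matches_and(max_goals=4):
    """Check every binary assignment for dependencies of up to `max_goals` goals."""
    for n in range(1, max_goals + 1):
        for end_values in product((0, 1), repeat=n):
            expected = int(all(end_values))
            if goaldep_range(end_values) != (expected, expected):
                return False
    return True


if __name__ == "__main__":
    print(goaldep_range((1, 1)), goaldep_range((1, 0)), linking_matches_and())
```

Under the LP relaxation the end values may be fractional, so the two bounds can differ and goaldep(k) is no longer pinned to 0 or 1.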
Slide 10: Formulation
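Objective Function
Maximize Net Benefit
\[ \max \; \sum_{v \in V,\, f \in D_v} \mathrm{utility}(v,f)\, \mathrm{endvalue}(v,f) + \sum_{k \in K} \mathrm{utility}(k)\, \mathrm{goaldep}(k) - \sum_{a \in A} \mathrm{cost}(a)\, \mathrm{action}(a) \]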
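As a sanity check, the objective can be evaluated on the plan that reaches S3 in the truck and package example. The worked line below assumes each of the three actions executes once, both single goals hold at the end, and the 60-utility dependency is over (at t loc1) and (at p1 loc2):

```latex
\[
\underbrace{10 \cdot 1 + 10 \cdot 1}_{\text{goal utilities}}
+ \underbrace{60 \cdot 1}_{\text{goal dependency}}
- \underbrace{(20 \cdot 1 + 5 \cdot 1 + 20 \cdot 1)}_{\text{action costs}}
= 80 - 45 = 35
\]
```

This matches net benefit(S3) = 35 from the example.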
Slide 11: Experimental Setup
Three modified IPC 3 domains: zenotravel, satellite, rovers (maximize net benefit)
One IPC 5 domain: Rovers, simple preferences (minimize (goal achievement violations + action cost))
Compared with
, a cost propagation-based heuristic
Heuristic value at the initial state versus the optimal plan (found using a branch and bound search)
LP > IP > OPTIMAL (maximizing)
LP < IP < OPTIMAL (minimizing)
Slide 12: Results
Slide 13: Results
Slide 14: Results
[Figure: results plots, series labeled IP and LP]
Slide 15: Summary
- IP gives a bound on the quality of a plan
- Doubly relaxed (LP) to provide a heuristic for search (Search I session, Monday at 4:10 pm)
Slide 16: Future Work
- Improve encoding (to give better LP values)
- Use fluent merging