4/8: Cost Propagation - PowerPoint PPT Presentation


Provided by: min63

Learn more at: https://rakaposhi.eas.asu.edu


1
4/8: Cost Propagation
Partialization
Next class: LPG (ICAPS 2003 paper).
READ it before coming.
Homework on SAPA coming from Vietnam

Today's lesson:
Beware of solicitous suggestions from juvenile cosmetologists
Exhibit A: Abe Lincoln
Exhibit B: Rao

2
Multi-objective search

Multi-dimensional nature of plan quality in
metric temporal planning
Temporal quality (e.g. makespan, slack)
Plan cost (e.g. cumulative action cost, resource
consumption)
Necessitates multi-objective optimization
Modeling objective functions
Tracking different quality metrics and heuristic
estimation
→ Challenge: There may be interdependent relations between the different quality metrics

3
Example

Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
Less time: 3 hours; more expensive: 200
Option 2: Tempe → Los Angeles (Car)
More time: 12 hours; less expensive: 50
Given a deadline constraint (6 hours) → Only option 1 is viable
Given a money constraint (100) → Only option 2 is viable
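
The two constraint checks above can be sketched in a few lines of Python. This is only an illustration of the example: the tuple layout and the function name `viable` are assumptions, and the numbers are the ones on the slide.

```python
# (name, hours, cost) as given on the slide; the layout is an assumption.
OPTIONS = [
    ("Tempe -> Phoenix (Bus) -> Los Angeles (Airplane)", 3, 200),
    ("Tempe -> Los Angeles (Car)", 12, 50),
]

def viable(options, deadline=None, budget=None):
    """Return the names of the options that satisfy the given constraints."""
    return [
        name
        for name, hours, cost in options
        if (deadline is None or hours <= deadline)
        and (budget is None or cost <= budget)
    ]

print(viable(OPTIONS, deadline=6))   # deadline only: option 1 survives
print(viable(OPTIONS, budget=100))   # budget only: option 2 survives
```

With both constraints at once, neither option survives, which is the interdependence between time and cost that the rest of the lecture has to deal with.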

4
Solution Quality in the presence of multiple
objectives

When we have multiple objectives, it is not clear
how to define global optimum
E.g. How does a <cost=5, makespan=7> plan compare to a <cost=4, makespan=9> plan?
Problem: We don't know what the user's utility metric is as a function of cost and makespan.

5
Solution 1: Pareto Sets

Present Pareto sets/curves to the user
A Pareto set is a set of non-dominated solutions
A solution S1 is dominated by another solution S2 if S1 is worse than S2 in at least one objective and equal or worse in all other objectives.
E.g. <C=5, M=9> is dominated by <C=4, M=9>
A travel agent shouldn't bother asking whether I would like a flight that leaves at 6pm, arrives at 9pm, and costs 100, or another one that also leaves at 6 and arrives at 9 but costs 200.
A Pareto set is exhaustive if it contains all non-dominated solutions
Presenting the Pareto set allows users to state their preferences implicitly, by choosing what they like, rather than by stating them explicitly.
Problem: Exhaustive Pareto sets can be large (non-finite in many cases).
In practice, travel agents give you non-exhaustive Pareto sets, just so you have the illusion of choice :-)
Optimizing with Pareto sets changes the nature of the problem: you are looking for multiple solutions rather than a single one.
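
As a minimal sketch of the dominance test, assume each plan is a tuple of objective values where lower is better in every position (here `(cost, makespan)`). The function names and the sample plans are illustrative, not from the slides.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_set(plans):
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# (cost, makespan) pairs; (200, 3) and (100, 3) mimic the two flights that
# differ only in price.
plans = [(5, 7), (4, 9), (5, 9), (200, 3), (100, 3)]
print(pareto_set(plans))
```

Here `(5, 9)` is dropped because `(5, 7)` dominates it, and the 200 flight is dropped in favor of the 100 flight, while `(5, 7)` and `(4, 9)` both stay because neither dominates the other. The quadratic scan is fine for a handful of plans; it does nothing about the "exhaustive sets can be large" problem.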

6
Solution 2: Aggregate Utility Metrics

Combine the various objectives into a single utility measure
E.g. w1*cost + w2*makespan
Could model a grad student's preferences with w1 = ∞, w2 = 0
Log(cost) + 5*(makespan)^25
Could model Bill Gates' preferences.
How do we assess the form of the utility measure (linear? nonlinear?)
and how will we get the weights?
Utility elicitation process
Learning problem: Ask tons of questions to the users and learn a utility function that fits their preferences
Can be cast as a sort of learning task (e.g. learn a neural net that is consistent with the examples)
Of course, if you want to learn a true nonlinear preference function, you will need many, many more examples, and the training takes much longer.
With aggregate utility metrics, the multi-objective optimization, in theory, reduces to a single-objective optimization problem
However, if you are trying to find good heuristics to direct the search, then, since estimators are likely to be available for naturally occurring factors of the solution quality rather than random combinations thereof, we still have to follow a two-step process:
Find estimators for each of the factors
Combine the estimates using the utility measure
THIS IS WHAT WE WILL DO IN THE NEXT FEW SLIDES
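
A small sketch of the linear form w1*cost + w2*makespan applied to the two plans from the earlier comparison. The weights and helper names are assumptions chosen for illustration, and the measure is treated as something to minimize.

```python
def linear_utility(cost, makespan, w1=1.0, w2=1.0):
    """Aggregate measure w1*cost + w2*makespan (lower is better here)."""
    return w1 * cost + w2 * makespan

def best_plan(plans, utility):
    """Pick the plan name whose aggregate value is smallest."""
    return min(plans, key=lambda name: utility(*plans[name]))

# (cost, makespan) for the two plans compared earlier in the lecture.
plans = {"P1": (5, 7), "P2": (4, 9)}

print(best_plan(plans, linear_utility))                                   # equal weights
print(best_plan(plans, lambda c, m: linear_utility(c, m, w1=10, w2=1)))   # cost-sensitive user
```

With equal weights P1 wins (12 against 13); with w1 = 10 the cheaper P2 wins (49 against 57). The ranking depends entirely on the weights, which is why the elicitation question matters.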

7
Our approach

Using the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function
Estimation of the earliest time (makespan) to
achieve all goals.
Estimation of the lowest cost to achieve goals
Estimation of the cost to achieve goals given the
specific makespan value.
Using this information to calculate the heuristic
value for the objective function involving both
time and cost
New issue: How to propagate cost over planning graphs?

8
The (Relaxed) Temporal PG
9
Time-sensitive Cost Function
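[Figure: relaxed temporal planning graph for the travel example over time points t = 0, 0.5, 1, 1.5, 10, with actions Shuttle(T,P), Heli(T,P), Airplane(P,LA), Drive-car(Tempe,LA); inset plot of the cost function (axes: time, cost; time ticks 1.5, 2, 10; cost levels ∞, 300, 220, 100, 0)]

Shuttle(Tempe,Phx): Cost 20; Time 1.0 hour
Helicopter(Tempe,Phx): Cost 100; Time 0.5 hour
Car(Tempe,LA): Cost 100; Time 10 hours
Airplane(Phx,LA): Cost 200; Time 1.0 hour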

The standard temporal planning graph (TPG) shows the time-related estimates, e.g. the earliest time to achieve a fact or to execute an action
The TPG does not show the cost estimates to achieve facts or execute actions

10
Estimating the Cost Function
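Shuttle(Tempe,Phx): Cost 20; Time 1.0 hour
Helicopter(Tempe,Phx): Cost 100; Time 0.5 hour
Car(Tempe,LA): Cost 100; Time 10 hours
Airplane(Phx,LA): Cost 200; Time 1.0 hour

[Figure: step cost functions over time (axes: time, cost; time ticks 0, 1, 1.5, 2, 10; cost levels ∞, 300, 220, 100, 20), with curves labeled Cost(At(LA)) and Cost(At(Phx)) = Cost(Flight(Phx,LA))]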
11
Cost Propagation

Issues
At a given time point, each fact is supported by multiple actions
Each action has more than one precondition
Propagation rules
$Cost(f,t) = \min \{ Cost(A,t) : f \in Effect(A) \}$
$Cost(A,t) = Aggregate(\{ Cost(f,t) : f \in Pre(A) \})$
Sum-propagation: $\sum_{f \in Pre(A)} Cost(f,t)$
The plans for individual preconds may be interacting
Max-propagation: $\max_{f \in Pre(A)} Cost(f,t)$
Combination: $0.5 \sum_{f \in Pre(A)} Cost(f,t) + 0.5 \max_{f \in Pre(A)} Cost(f,t)$

Can't use something like the set-level idea here, because that would entail tracking the costs of subsets of literals
Probably other, better ideas could be tried
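
The propagation rules can be tried out on the travel example. What follows is a rough sketch, not the planner's actual implementation: the event queue, the function name `propagate`, and the choice to add each action's own execution cost and to delay its effects by its duration are assumptions made so that the output lines up with the plotted cost levels (300, 220, 100).

```python
import heapq
import math

# (name, preconditions, effects, execution cost, duration in hours)
ACTIONS = [
    ("Shuttle(Tempe,Phx)", ["At(Tempe)"], ["At(Phx)"], 20, 1.0),
    ("Helicopter(Tempe,Phx)", ["At(Tempe)"], ["At(Phx)"], 100, 0.5),
    ("Car(Tempe,LA)", ["At(Tempe)"], ["At(LA)"], 100, 10.0),
    ("Airplane(Phx,LA)", ["At(Phx)"], ["At(LA)"], 200, 1.0),
]

def propagate(init_facts, actions, aggregate=sum):
    """Return fact -> list of (time, cost) points where the fact gets cheaper."""
    best = {}     # cheapest known cost per fact
    steps = {}    # breakpoints of each fact's step cost function
    queue = [(0.0, i, f, 0.0) for i, f in enumerate(init_facts)]
    counter = len(queue)          # tie-breaker so heap entries stay comparable
    heapq.heapify(queue)
    while queue:
        t, _, fact, cost = heapq.heappop(queue)
        if cost >= best.get(fact, math.inf):
            continue              # no improvement, nothing to propagate
        best[fact] = cost
        steps.setdefault(fact, []).append((t, cost))
        for _name, pre, eff, exec_cost, dur in actions:
            if fact in pre and all(p in best for p in pre):
                # Cost(A,t) = Aggregate of precondition costs, plus A's own cost
                a_cost = aggregate(best[p] for p in pre) + exec_cost
                for e in eff:
                    heapq.heappush(queue, (t + dur, counter, e, a_cost))
                    counter += 1
    return steps

steps = propagate(["At(Tempe)"], ACTIONS)
print(steps["At(LA)"])
```

Passing `aggregate=max` switches from sum-propagation to max-propagation; with single-precondition actions, as here, the two coincide. The loop stops when the queue is empty, i.e. when no cost can be improved any further.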
12
Termination Criteria
Deadline Termination: Terminate at time point t if
$\forall$ goal G: $Deadline(G) \le t$, or
$\exists$ goal G: $(Deadline(G) < t) \land (Cost(G,t) = \infty)$
Fix-point Termination: Terminate at time point t where we cannot improve the cost of any proposition.
K-lookahead approximation: At t where $Cost(g,t) < \infty$, repeat the process of applying the (set of) actions that can improve the cost functions k times.

[Figure: step cost function for the travel example (axes: time, cost; time ticks 1.5, 2, 10; cost levels ∞, 300, 220, 100, 0), annotated with "Earliest time point" and "Cheapest cost", above the planning graph with Shuttle(T,P), H(T,P), Plane(P,LA), Drive-car(Tempe,LA) at t = 0, 0.5, 1, 1.5, 10]
13
Heuristic estimation using the cost functions
The cost functions have the information needed to track both the temporal and the cost metrics of the plan, and their interdependent relations!

If the objective function is to minimize time: $h = t_0$
If the objective function is to minimize cost: $h = CostAggregate(G, t_\infty)$
If the objective function is a function of both time and cost, $O = f(time, cost)$, then
$h = \min f(t, Cost(G,t))$ s.t. $t_0 \le t \le t_\infty$
E.g. $f(time, cost) = 100 \cdot makespan + Cost$; then $h = 100 \times 2 + 220$ at $t_0 \le t = 2 \le t_\infty$

[Figure: step plot of Cost(At(LA)) over time (cost levels ∞, 300, 220, 100, 0; time ticks $t_0 = 1.5$, 2, $t_\infty = 10$)]
Earliest achieve time: $t_0 = 1.5$; lowest cost time: $t_\infty = 10$
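
The worked example can be checked numerically. The sketch below assumes the cost function is the step function read off the plot, with breakpoints at t = 1.5, 2, and 10, and that f does not decrease with time, so the minimum is reached at one of the breakpoints. The names `COST_AT_LA` and `heuristic` are illustrative.

```python
# Breakpoints (time, cost) of Cost(At(LA)) as read off the slide's plot.
COST_AT_LA = [(1.5, 300), (2.0, 220), (10.0, 100)]

def heuristic(steps, f):
    """Minimize f(t, Cost(G,t)) over the breakpoints between t0 and t_inf.

    Returns (best value, time at which it is reached).
    """
    return min((f(t, c), t) for t, c in steps)

h, t_best = heuristic(COST_AT_LA, lambda t, c: 100 * t + c)
print(h, t_best)
```

The three candidates evaluate to 450, 420, and 1100, so h = 420 at t = 2, matching 100 × 2 + 220. Using `lambda t, c: t` recovers h = t0 = 1.5, and `lambda t, c: c` recovers the lowest cost, 100, reached at t = 10.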
14
Heuristic estimation by extracting the relaxed
plan

Relaxed plan satisfies all the goals ignoring the
negative interaction
Take into account positive interaction
Base set of actions for possible adjustment
according to neglected (relaxed) information
(e.g. negative interaction, resource usage etc.)
→ Need to find a good relaxed plan (among multiple ones) according to the objective function

15
Heuristic estimation by extracting the relaxed
plan
Initially supported facts: SF = Init state
Initial goals: G = Init goals \ SF
Traverse backward, searching for actions supporting all the goals. When A is added to the relaxed plan RP, then
$SF = SF \cup Effects(A)$
$G = (G \cup Precond(A)) \setminus Effects(A)$
If the objective function is f(time, cost), then A is selected such that
$f(t(RP + A), C(RP + A)) + f(t(G_{new}), C(G_{new}))$
is minimal, where $G_{new} = (G \cup Precond(A)) \setminus Effects(A)$
When A is added, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated
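
The SF/G bookkeeping of the backward traversal can be sketched as below. This is a simplification, not the selection rule on the slide: each candidate action is scored on its own with 100 × duration + cost (the hypothetical `local_score`), mutex-based ordering is skipped, and the dictionary layout is an assumption.

```python
# name -> (preconditions, effects, cost, duration); values from the travel example
ACTIONS = {
    "Shuttle(Tempe,Phx)":    ({"At(Tempe)"}, {"At(Phx)"}, 20, 1.0),
    "Helicopter(Tempe,Phx)": ({"At(Tempe)"}, {"At(Phx)"}, 100, 0.5),
    "Car(Tempe,LA)":         ({"At(Tempe)"}, {"At(LA)"}, 100, 10.0),
    "Airplane(Phx,LA)":      ({"At(Phx)"}, {"At(LA)"}, 200, 1.0),
}

def local_score(name):
    # Simplifying assumption: judge each action in isolation with
    # f(t, c) = 100 * duration + cost, instead of scoring RP + A and G_new.
    _, _, cost, dur = ACTIONS[name]
    return 100 * dur + cost

def extract_relaxed_plan(init, goals, actions, score):
    supported = set(init)                 # SF = Init state
    open_goals = set(goals) - supported   # G = Init goals \ SF
    plan = []
    while open_goals:
        goal = min(open_goals)            # deterministic choice of the next goal
        candidates = [a for a in actions
                      if goal in actions[a][1] and a not in plan]
        if not candidates:
            return None                   # goal unreachable in this sketch
        best = min(candidates, key=score)
        pre, eff = actions[best][0], actions[best][1]
        plan.append(best)
        supported |= eff                              # SF = SF ∪ Effects(A)
        open_goals = (open_goals | pre) - supported   # drops Effects(A) as well
    return plan[::-1]                     # actions were collected goal-first

print(extract_relaxed_plan({"At(Tempe)"}, {"At(LA)"}, ACTIONS, local_score))
```

On this example the sketch picks the airplane over the car and the shuttle over the helicopter, i.e. the plan with makespan 2 and cost 220 that also minimizes 100·makespan + Cost.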

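[Figure: step cost function (cost levels ∞, 300, 220, 100, 0; time ticks $t_0 = 1.5$, 2, $t_\infty = 10$) with the route Tempe, Phoenix, L.A.; objective $f(t,c) = 100 \cdot makespan + Cost$]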
16
Heuristic estimation by extracting the relaxed
plan
General Alg.: Traverse backward, searching for actions supporting all the goals. When A is added to the relaxed plan RP, then
Supported facts: $SF = SF \cup Effects(A)$
Goals: $G = (G \cup Precond(A)) \setminus SF$
Temporal Planning with Cost: If the objective function is f(time, cost), then A is selected such that
$f(t(RP + A), C(RP + A)) + f(t(G_{new}), C(G_{new}))$
is minimal, where $G_{new} = (G \cup Precond(A)) \setminus Effects(A)$
Finally, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated

[Figure: step cost function (cost levels ∞, 300, 220, 100, 0; time ticks $t_0 = 1.5$, 2, $t_\infty = 10$) with the route Tempe, Phoenix, L.A.; objective $f(t,c) = 100 \cdot makespan + Cost$]
17
Adjusting the Heuristic Values
Ignored resource-related information can be used to improve the heuristic values (much like +ve and −ve interactions in classical planning)
Adjusted cost: $C = C + \sum_R \lceil (Con(R) - (Init(R) + Pro(R))) / \Delta_R \rceil \cdot C(A_R)$
→ Cannot be applied to admissible heuristics
18
4/10
19
Partialization Example
A position-constrained plan with makespan 22
A1(10) gives g1 but deletes p
A3(8) gives g2 but requires p at start
A2(4) gives p at end
We want g1, g2
[Figure: the p.c. plan over A1, A2, A3 with the link for p; the order-constrained plan with A2 and A3 leading to g2, A1 leading to g1, and goal G; and the best-makespan dispatch of the order-constrained plan, labeled 14+ε]
There could be multiple o.c. plans because of multiple possible causal sources. Optimization will involve going through them all.
[et(A1) < et(A2) or st(A1) > st(A3)], et(A2) < st(A3).
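
To see why the choice of disjunct matters, the two alternatives can be dispatched as early as possible and compared. This is a sketch under assumptions: strict "<" is approximated by a small gap `EPS`, every constraint is rewritten as a lower bound on a start time, and the relaxation loop assumes the constraints are acyclic (it does not detect inconsistency).

```python
EPS = 0.001                          # assumed stand-in for a strict "<"
DUR = {"A1": 10, "A2": 4, "A3": 8}   # durations from the slide

# Each constraint (x, y, gap) means st(y) >= st(x) + gap.
SUPPORT = ("A2", "A3", DUR["A2"] + EPS)                  # et(A2) < st(A3)
A1_BEFORE = ("A1", "A2", DUR["A1"] - DUR["A2"] + EPS)    # et(A1) < et(A2)
A1_AFTER = ("A3", "A1", EPS)                             # st(A1) > st(A3)

def earliest_dispatch(constraints):
    """Makespan when every action starts as early as the constraints allow."""
    st = {a: 0.0 for a in DUR}
    for _ in range(len(DUR)):        # enough passes for an acyclic constraint set
        for x, y, gap in constraints:
            st[y] = max(st[y], st[x] + gap)
    return max(st[a] + DUR[a] for a in DUR)

print(earliest_dispatch([SUPPORT, A1_BEFORE]))   # A1 finishes before A2 does
print(earliest_dispatch([SUPPORT, A1_AFTER]))    # A1 starts after A3 starts
```

Under these assumptions the first alternative gives a makespan of about 18 and the second about 14, both better than the 22 of the position-constrained plan.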
20
Problem Definitions

Position-constrained (p.c.) plan: The execution time of each action is fixed to a specific time point
Can be generated more efficiently by state-space planners
Order-constrained (o.c.) plan: Only the relative orderings between actions are specified
More flexible solutions, causal relations between actions
Partialization: Constructing an o.c. plan from a p.c. plan

[Figure: a p.c. plan with actions fixed at time points t1, t2, t3, next to the corresponding o.c. plan; both involve facts Q, R, ¬R, and goal G]
21
Validity Requirements for a partialization

An o.c. plan Poc is a valid partialization of a valid p.c. plan Ppc if
Poc contains the same actions as Ppc
Poc is executable
Poc satisfies all the top-level goals
(Optional) Ppc is a legal dispatch (execution) of Poc
(Optional) Poc contains no redundant ordering relations

[Figure: example of a redundant ordering (marked X) between actions linked by facts P and Q]
22
Greedy Approximations

Solving the optimization problem for makespan and number of orderings is NP-hard (Bäckström, 1998)
Greedy approaches have been considered in classical planning (e.g. Kambhampati & Kedar, 1993; Veloso et al., 1990)
Find a causal explanation of correctness for the p.c. plan
Introduce just the orderings needed for the explanation to hold
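
The two greedy steps can be sketched on the A1/A2/A3 example from the partialization slide. This is an illustrative simplification, not the published algorithm: the p.c. plan is treated as a sequence sorted by start time, top-level goals are not modeled, and the encoding `OPS` and the function name are assumptions.

```python
# name -> (preconditions, add effects, delete effects); assumed encoding of
# "A1 gives g1 but deletes p, A2 gives p, A3 requires p and gives g2".
OPS = {
    "A1": (set(), {"g1"}, {"p"}),
    "A2": (set(), {"p"}, set()),
    "A3": ({"p"}, {"g2"}, set()),
}
PC_PLAN = ["A1", "A2", "A3"]   # actions sorted by their fixed start times

def greedy_partialize(plan, ops):
    """Return the set of orderings (before, after) kept from the p.c. plan."""
    orderings = set()
    for i, act in enumerate(plan):
        for p in ops[act][0]:
            # Causal explanation: the latest earlier action that adds p
            # (None means p comes from the initial state).
            supporter = next((j for j in range(i - 1, -1, -1)
                              if p in ops[plan[j]][1]), None)
            if supporter is not None:
                orderings.add((plan[supporter], act))
            # Protect the link: every deleter of p stays on the side it
            # already occupies in the p.c. plan.
            for k, other in enumerate(plan):
                if k == i or p not in ops[other][2]:
                    continue
                if k > i:
                    orderings.add((act, other))
                elif supporter is not None and k < supporter:
                    orderings.add((other, plan[supporter]))
    return orderings

print(sorted(greedy_partialize(PC_PLAN, OPS)))
```

The result keeps A1 before A2 and A2 before A3. It is a valid partialization, but it commits to the disjunct that the p.c. plan happened to use, so it does not consider moving A1 after A3; that is the price of being greedy.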

23
Partialization A simple example
[Figure: the p.c. plan Pickup(A), Stack(A,B), Pickup(C), Stack(C,D) and its partialization, with fact labels On(A,B), Holding(C), On(C,D), Hand-empty, Holding(B)]
24
Modeling greedy approaches as value ordering
strategies
Key insight: We can capture many of the greedy approaches as specific value ordering strategies on the CSOP encoding

Variation of the Kambhampati & Kedar (1993) greedy algorithm for temporal planning as value ordering
Supporting variables: $S^p_A = A'$ such that
$et^p_{A'} < st^p_A$ in the p.c. plan Ppc
$\nexists B$ s.t. $et^p_{A'} < et^{\neg p}_B < st^p_A$
$\nexists C$ s.t. $et^p_C < et^p_{A'}$ and C satisfies the two conditions above
Ordering and interference variables:
$\rho^p_{AB} = <$ if $et^{\neg p}_B < st^p_A$; $\rho^p_{AB} = >$ if $st^{\neg p}_B > st^p_A$
$\Theta^r_{AA'} = <$ if $et^r_A < st^r_{A'}$ in Ppc; $\Theta^r_{AA'} = >$ if $st^r_A > et^r_{A'}$ in Ppc; $\Theta^r_{AA'} = \bot$ otherwise.

25
CSOP Variables and values

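Continuous variables
Temporal: $st_A$; $D(st_A) = [0, +\infty)$, $D(st_{init}) = \{0\}$, $D(st_{Goals}) = Dl(G)$.
Resource level: $V^r_A$
Discrete variables
Resource ordering: $\Theta^r_{AA'}$; $Dom(\Theta^r_{AA'}) = \{<, >\}$ or $Dom(\Theta^r_{AA'}) = \{<, >, \bot\}$
Causal effect: $S^p_A$; $Dom(S^p_A) = \{B_1, B_2, \ldots, B_n\}$, $p \in E(B_i)$
Mutex: $\rho^p_{AA'}$; $Dom(\rho^p_{AA'}) = \{<, >\}$; $p \in E(A)$, $\neg p \in E(A') \cup P(A')$

Exp: $Dom(S^Q_{A_2}) = \{A_{init}, A_1\}$; $Dom(S^R_{A_3}) = \{A_2\}$; $Dom(S^G_{A_g}) = \{A_3\}$; $\rho^R_{A_1 A_2}$, $\rho^R_{A_1 A_3}$
[Figure: example plan with actions A1, A2, A3 and facts Q, R, ¬R, G]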
26
Constraints

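Causal-link protection
$S^p_A = B \Rightarrow \forall A', \neg p \in E(A'): (\rho^p_{A'B} = <) \lor (\rho^p_{A'A} = >)$
Ordering and temporal variables
$S^p_A = B \Rightarrow et^p_B < st^p_A$
$\rho^p_{AB} = < \Rightarrow et^{\neg p}_A < st^p_B$; $\rho^p_{AB} = > \Rightarrow et^{\neg p}_A > st^p_B$
$\Theta^r_{AA'} = < \Rightarrow et^r_A < st^r_{A'}$; $\Theta^r_{AA'} = > \Rightarrow st^r_A > et^r_{A'}$
Optional temporal constraints
Goal deadline: $st_{A_g} \le t_g$
Time constraints on individual actions: $L \le st_A \le U$
Resource precondition constraints
For each precondition $V^r_A \otimes K$, $\otimes \in \{>, <, \ge, \le\}$, set up one constraint involving all $\Theta^r_{AA'}$, such as
Exp: $Init_r + \sum_{A' < A} U^r_{A'} + \sum_{A' \bot A,\ U^r_{A'} < 0} U^r_{A'} > K$ if $\otimes$ is $>$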

27
Modeling Different Objective Functions

Temporal quality
Minimum makespan: Minimize $\max_A (st_A + dur_A)$
Maximize summation of slacks:
Maximize $\sum (st^g_{A_g} - et^g_A)$ over $S^g_{A_g} = A$
Maximize average flexibility:
Maximize $Avg(Dom(st_A))$
Fewest orderings
Minimize $\#(st_A < st_{A'})$

28
Empirical evaluation

Objective
Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs.
Testing problems
Randomly generated logistics problems from TP4 (Haslum & Geffner)

Load/unload(package, location): Cost 1; Duration 1
Drive-inter-city(location1, location2): Cost 4.0; Duration 12.0
Flight(airport1, airport2): Cost 15.0; Duration 3.0
Drive-intra-city(location1, location2, city): Cost 2.0; Duration 2.0
29
LPG Discussion: Look at the notes of Week 12 (as they are more up to date)
