Title: Chapter 17 Planning Based on Model Checking
1 Chapter 17: Planning Based on Model Checking
Lecture slides for Automated Planning: Theory and Practice
- Dana S. Nau
- University of Maryland
- 11:23 PM June 16, 2015
2 Motivation
- Actions with multiple possible outcomes
  - Action failures
    - e.g., gripper drops its load
  - Exogenous events
    - e.g., road closed
- Nondeterministic systems are like Markov Decision Processes (MDPs), but without probabilities attached to the outcomes
  - Useful if accurate probabilities aren't available, or if probability calculations would introduce inaccuracies
[Figure: grasp(c) applied to blocks a, b, c, showing the intended outcome and an unintended outcome]
3 Nondeterministic Systems
- Nondeterministic system: a triple Σ = (S, A, γ)
  - S = finite set of states
  - A = finite set of actions
  - γ: S × A → 2^S
- Like in the previous chapter, the book doesn't commit to any particular representation
  - It only deals with the underlying semantics
  - Draw the state-transition graph explicitly
- Like in the previous chapter, a policy is a function from states into actions
  - π: S → A
  - Notation: Sπ = {s | (s,a) ∈ π}
- In some algorithms, we'll temporarily have nondeterministic policies
  - Ambiguous: multiple actions for some states
  - π: S → 2^A, or equivalently, π ⊆ S × A
  - We'll always make these policies deterministic before the algorithm terminates
4 Example
- Robot r1 starts at location l1
- Objective is to get r1 to location l4
- π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4))}
- π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
- π3 = {(s1, move(r1,l1,l4))}
7 Execution Structures
- Execution structure for a policy π
  - The graph of all of π's execution paths
- Notation: Σπ = (Q, T)
  - Q ⊆ S
  - T ⊆ S × S
- π1 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4))}
- π2 = {(s1, move(r1,l1,l2)), (s2, move(r1,l2,l3)), (s3, move(r1,l3,l4)), (s5, move(r1,l5,l4))}
- π3 = {(s1, move(r1,l1,l4))}
[Figure: state-transition graph; labeled states: s1, s2, s3, s4, s5]
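The execution structure can be computed by a forward reachability search from the initial states, following only the action the policy picks in each state. The sketch below assumes a deterministic policy and a hypothetical transition table modeled on the robot example; the function name and the table are illustrative, not taken from the book.

```python
# Hypothetical transition table: (state, action) -> set of possible successors.
GAMMA = {
    ("s1", "move(r1,l1,l2)"): {"s2"},
    ("s1", "move(r1,l1,l4)"): {"s1", "s4"},
    ("s2", "move(r1,l2,l3)"): {"s3", "s5"},
    ("s3", "move(r1,l3,l4)"): {"s4"},
    ("s5", "move(r1,l5,l4)"): {"s4"},
}


def execution_structure(gamma, policy, initial_states):
    """Return (Q, T): the states and transitions reachable under the policy."""
    act = dict(policy)                 # deterministic policy: state -> action
    Q = set(initial_states)
    T = set()
    frontier = list(initial_states)
    while frontier:
        s = frontier.pop()
        if s not in act:               # the policy says nothing here: stop
            continue
        for t in gamma[(s, act[s])]:
            T.add((s, t))
            if t not in Q:
                Q.add(t)
                frontier.append(t)
    return Q, T


PI_1 = {
    ("s1", "move(r1,l1,l2)"),
    ("s2", "move(r1,l2,l3)"),
    ("s3", "move(r1,l3,l4)"),
}
PI_3 = {("s1", "move(r1,l1,l4)")}

Q1, T1 = execution_structure(GAMMA, PI_1, {"s1"})
Q3, T3 = execution_structure(GAMMA, PI_3, {"s1"})
```

Under these assumed transitions, π1 can end up in s5, where it specifies no action, while π3 only ever visits s1 and s4 but may loop on s1.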
13 Types of Solutions
- Weak solution: at least one execution path reaches a goal
- Strong solution: every execution path reaches a goal
- Strong-cyclic solution: every fair execution path reaches a goal
  - Don't stay in a cycle forever if there's a state-transition out of it
[Figures: example execution structures over states s0 to s3, with actions a0 to a3 and goal states marked]
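The three solution types can be told apart by inspecting the execution structure. The sketch below is one way to do that, under two assumptions: execution stops once a goal state is reached, and the transition table is a hypothetical one modeled on the robot example. The function name `classify` is illustrative.

```python
# Hypothetical transition table: (state, action) -> set of possible successors.
GAMMA = {
    ("s1", "move(r1,l1,l2)"): {"s2"},
    ("s1", "move(r1,l1,l4)"): {"s1", "s4"},
    ("s2", "move(r1,l2,l3)"): {"s3", "s5"},
    ("s3", "move(r1,l3,l4)"): {"s4"},
    ("s5", "move(r1,l5,l4)"): {"s4"},
}


def classify(gamma, policy, initial_states, goals):
    act = dict(policy)
    # Forward search: build the execution structure (Q, T), stopping at goals.
    Q, T, frontier = set(initial_states), set(), list(initial_states)
    while frontier:
        s = frontier.pop()
        if s in goals or s not in act:
            continue
        for t in gamma[(s, act[s])]:
            T.add((s, t))
            if t not in Q:
                Q.add(t)
                frontier.append(t)
    # Backward fixpoint: states of Q from which some path reaches a goal.
    ok = Q & goals
    changed = True
    while changed:
        changed = False
        for s, t in T:
            if t in ok and s not in ok:
                ok.add(s)
                changed = True
    if not set(initial_states) <= ok:
        return "not a solution"
    if Q != ok:
        return "weak"            # some reachable state can never reach a goal
    # Acyclicity test: peel off states whose successors are all peeled already.
    done = set()
    changed = True
    while changed:
        changed = False
        for s in Q - done:
            if all(t in done for (u, t) in T if u == s):
                done.add(s)
                changed = True
    return "strong" if done == Q else "strong-cyclic"


PI_1 = {("s1", "move(r1,l1,l2)"), ("s2", "move(r1,l2,l3)"),
        ("s3", "move(r1,l3,l4)")}
PI_2 = PI_1 | {("s5", "move(r1,l5,l4)")}
PI_3 = {("s1", "move(r1,l1,l4)")}
```

With start state s1 and goal s4, this sketch labels π1 weak (it can get stuck in s5), π2 strong, and π3 strong-cyclic (it may loop on s1 but can always still reach s4).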
14 Finding Strong Solutions
- Backward breadth-first search
- StrongPreImg(S) = {(s,a) | γ(s,a) ≠ ∅, γ(s,a) ⊆ S}
  - all state-action pairs for which all of the successors are in S
- PruneStates(π,S) = {(s,a) ∈ π | s ∉ S}
  - S is the set of states we've already solved
  - keep only the state-action pairs for other states
- MkDet(π')
  - π' is a policy that may be nondeterministic
  - remove some state-action pairs if necessary, to get a deterministic policy
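The three helpers above combine into a backward search. The sketch below reconstructs the main loop from the definitions on this slide and the step-by-step example that follows; the loop structure, the function names, and the transition table are assumptions rather than the book's exact pseudocode. `None` plays the role of "failure".

```python
# Hypothetical transition table: (state, action) -> set of possible successors.
GAMMA = {
    ("s1", "move(r1,l1,l2)"): {"s2"},
    ("s1", "move(r1,l1,l4)"): {"s1", "s4"},
    ("s2", "move(r1,l2,l3)"): {"s3", "s5"},
    ("s3", "move(r1,l3,l4)"): {"s4"},
    ("s5", "move(r1,l5,l4)"): {"s4"},
    ("s4", "move(r1,l4,l3)"): {"s3"},
    ("s4", "move(r1,l4,l5)"): {"s5"},
}


def states_of(policy):
    return {s for (s, _a) in policy}


def strong_pre_img(gamma, S):
    """Pairs whose successors are nonempty and all lie in S."""
    return {(s, a) for (s, a), succ in gamma.items() if succ and succ <= S}


def prune_states(policy, S):
    """Keep only pairs for states that are not already solved."""
    return {(s, a) for (s, a) in policy if s not in S}


def mk_det(policy):
    """Keep one action per state (here: the alphabetically first one)."""
    chosen = {}
    for s, a in sorted(policy):
        chosen.setdefault(s, a)
    return set(chosen.items())


def strong_plan(gamma, initial_states, goals):
    old, pi = None, set()
    while pi != old and not initial_states <= goals | states_of(pi):
        solved = goals | states_of(pi)
        new_pairs = prune_states(strong_pre_img(gamma, solved), solved)
        old, pi = pi, pi | new_pairs
    if initial_states <= goals | states_of(pi):
        return mk_det(pi)
    return None                       # failure


PLAN = strong_plan(GAMMA, {"s1"}, {"s4"})
```

Each pass through the loop adds one "layer" of states, all of whose outcomes land in already-solved states, which is why the search is breadth-first and backward.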
15 Example
- π = failure
- π' = ∅
- Sπ' = ∅
- Sg ∪ Sπ' = {s4}
[Figure: state-transition graph with Start and Goal marked; labeled states: s4]
16 Example
- π = failure
- π' = ∅
- Sπ' = ∅
- Sg ∪ Sπ' = {s4}
- π'' ← PreImage = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal marked; labeled states: s3, s4, s5]
17 Example
- π = failure
- π' = ∅
- Sπ' = ∅
- Sg ∪ Sπ' = {s4}
- π'' ← PreImage = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π ← π' = ∅
- π' ← π' ∪ π'' = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal marked; labeled states: s3, s4, s5]
18 Example
- π = ∅
- π' = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s3,s5}
- Sg ∪ Sπ' = {s3,s4,s5}
[Figure: state-transition graph with Start and Goal marked; labeled states: s3, s4, s5]
19 Example
- π = ∅
- π' = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s3,s5}
- Sg ∪ Sπ' = {s3,s4,s5}
- PreImage ← {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4)), (s4,move(r1,l4,l3)), (s4,move(r1,l4,l5))}
- π'' ← {(s2,move(r1,l2,l3))}
- π ← π' = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' ← {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal marked; labeled states: s2, s3, s4, s5]
20 Example
- π = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' = {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s2,s3,s5}
- Sg ∪ Sπ' = {s2,s3,s4,s5}
[Figure: state-transition graph with Start and Goal marked; labeled states: s2, s3, s4, s5]
21 Example
- π = {(s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' = {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s2,s3,s5}
- Sg ∪ Sπ' = {s2,s3,s4,s5}
- π'' ← {(s1,move(r1,l1,l2))}
- π ← π' = {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' ← {(s1,move(r1,l1,l2)), (s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal marked; labeled states: s1, s2, s3, s4, s5]
22 Example
- π = {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' = {(s1,move(r1,l1,l2)), (s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s1,s2,s3,s5}
- Sg ∪ Sπ' = {s1,s2,s3,s4,s5}
[Figure: state-transition graph with Start and Goal marked; labeled states: s1, s2, s3, s4, s5]
23 Example
- π = {(s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π' = {(s1,move(r1,l1,l2)), (s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s1,s2,s3,s5}
- Sg ∪ Sπ' = {s1,s2,s3,s4,s5}
- S0 ⊆ Sg ∪ Sπ'
- MkDet(π') = π'
[Figure: state-transition graph with Start and Goal marked; labeled states: s1, s2, s3, s4, s5]
24 Finding Weak Solutions
- Weak-Plan is just like Strong-Plan except for this:
  - WeakPreImg(S) = {(s,a) | γ(s,a) ∩ S ≠ ∅}
  - at least one successor is in S
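Since only the preimage changes, a weak planner can reuse the same backward loop with WeakPreImg swapped in. The sketch below assumes the loop structure reconstructed from the example slides and a hypothetical transition table modeled on the robot example; names are illustrative.

```python
# Hypothetical transition table: (state, action) -> set of possible successors.
GAMMA = {
    ("s1", "move(r1,l1,l2)"): {"s2"},
    ("s1", "move(r1,l1,l4)"): {"s1", "s4"},
    ("s2", "move(r1,l2,l3)"): {"s3", "s5"},
    ("s3", "move(r1,l3,l4)"): {"s4"},
    ("s5", "move(r1,l5,l4)"): {"s4"},
    ("s4", "move(r1,l4,l3)"): {"s3"},
    ("s4", "move(r1,l4,l5)"): {"s5"},
}


def states_of(policy):
    return {s for (s, _a) in policy}


def weak_pre_img(gamma, S):
    """Pairs with at least one successor in S."""
    return {(s, a) for (s, a), succ in gamma.items() if succ & S}


def prune_states(policy, S):
    return {(s, a) for (s, a) in policy if s not in S}


def mk_det(policy):
    """Keep one action per state (here: the alphabetically first one)."""
    chosen = {}
    for s, a in sorted(policy):
        chosen.setdefault(s, a)
    return set(chosen.items())


def weak_plan(gamma, initial_states, goals):
    old, pi = None, set()             # None stands for "failure"
    while pi != old and not initial_states <= goals | states_of(pi):
        solved = goals | states_of(pi)
        new_pairs = prune_states(weak_pre_img(gamma, solved), solved)
        old, pi = pi, pi | new_pairs
    if initial_states <= goals | states_of(pi):
        return mk_det(pi)
    return None


PLAN = weak_plan(GAMMA, {"s1"}, {"s4"})
```

Because move(r1,l1,l4) has at least one outcome in the goal, s1 is covered after a single iteration, so the weak plan is found sooner than the strong one.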
25 Example
- π = failure
- π' = ∅
- Sπ' = ∅
- Sg ∪ Sπ' = {s4}
[Figure: state-transition graph with Start and Goal marked; labeled states: s4]
26 Example
- π = failure
- π' = ∅
- Sπ' = ∅
- Sg ∪ Sπ' = {s4}
- π'' = PreImage = {(s1,move(r1,l1,l4)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- π ← π' = ∅
- π' ← π' ∪ π'' = {(s1,move(r1,l1,l4)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal marked; labeled states: s1, s3, s4, s5]
27 Example
- π = ∅
- π' = {(s1,move(r1,l1,l4)), (s3,move(r1,l3,l4)), (s5,move(r1,l5,l4))}
- Sπ' = {s1,s3,s5}
- Sg ∪ Sπ' = {s1,s3,s4,s5}
- S0 ⊆ Sg ∪ Sπ'
- MkDet(π') = π'
[Figure: state-transition graph with Start and Goal marked; labeled states: s1, s3, s4, s5]
28 Finding Strong-Cyclic Solutions
- Begin with a universal policy π' that contains all state-action pairs
- Repeatedly, eliminate state-action pairs that take us to bad states
  - PruneOutgoing removes state-action pairs that go to states not in Sg ∪ Sπ
    - PruneOutgoing(π,S) = π − {(s,a) ∈ π | γ(s,a) ⊄ S ∪ Sπ}
  - PruneUnconnected removes states from which it is impossible to get to Sg
    - Start with π' = ∅, compute fixpoint of π' ← π ∩ WeakPreImg(Sg ∪ Sπ')
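The two pruning steps can be sketched as follows. The toy domain is hypothetical and deliberately not the robot example (whose figure is not reproduced in the text); function names and the dictionary encoding of γ are assumptions.

```python
# Hypothetical toy domain: (state, action) -> set of possible successors.
GAMMA = {
    ("s0", "a"): {"s0", "s1"},   # may loop on s0 or progress to s1
    ("s1", "b"): {"g"},          # reaches the goal
    ("s1", "c"): {"dead"},       # leads to a state with no applicable actions
    ("s2", "d"): {"s2"},         # can never get to the goal
}
GOALS = {"g"}


def states_of(policy):
    return {s for (s, _a) in policy}


def weak_pre_img(gamma, S):
    return {(s, a) for (s, a), succ in gamma.items() if succ & S}


def prune_outgoing(gamma, pi, goals):
    """Drop pairs with some outcome outside goals | S_pi."""
    allowed = goals | states_of(pi)
    return {(s, a) for (s, a) in pi if gamma[(s, a)] <= allowed}


def prune_unconnected(gamma, pi, goals):
    """Keep only pairs from which the goal is reachable inside pi."""
    old, new = None, set()
    while old != new:
        old = new
        new = pi & weak_pre_img(gamma, goals | states_of(new))
    return new


UNIVERSAL = set(GAMMA)                       # all applicable (state, action) pairs
AFTER_OUTGOING = prune_outgoing(GAMMA, UNIVERSAL, GOALS)
AFTER_UNCONNECTED = prune_unconnected(GAMMA, AFTER_OUTGOING, GOALS)
```

In this toy domain, PruneOutgoing drops (s1, c) because it can land in a state the policy does not cover, and PruneUnconnected then drops (s2, d) because s2 has no path to the goal.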
29 Finding Strong-Cyclic Solutions
- Once the policy stops changing,
  - If it's not a solution, return failure
  - RemoveNonProgress removes state-action pairs that don't go toward the goal
    - implement as backward search from the goal
  - MkDet makes sure there's only one action for each state
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
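One way to read "backward search from the goal" is to rebuild the policy layer by layer, adding a state-action pair only the first time its state becomes able to reach the already-covered states. The sketch below follows that reading; it is an assumption about how RemoveNonProgress could be implemented, on a hypothetical toy domain, and not the book's pseudocode.

```python
# Hypothetical toy domain: (state, action) -> set of possible successors.
GAMMA = {
    ("s0", "fwd"): {"s0", "s1"},   # may loop, may progress
    ("s1", "fwd"): {"g"},          # reaches the goal
    ("s1", "back"): {"s0"},        # moves away from the goal
}
GOALS = {"g"}


def states_of(policy):
    return {s for (s, _a) in policy}


def weak_pre_img(gamma, S):
    return {(s, a) for (s, a), succ in gamma.items() if succ & S}


def prune_states(policy, S):
    return {(s, a) for (s, a) in policy if s not in S}


def remove_non_progress(gamma, pi, goals):
    """Backward search from the goal; keep a pair only in its state's first layer."""
    old, new = None, set()
    while old != new:
        solved = goals | states_of(new)
        layer = pi & weak_pre_img(gamma, solved)
        old = new
        new = new | prune_states(layer, solved)
    return new


def mk_det(policy):
    """Keep one action per state (here: the alphabetically first one)."""
    chosen = {}
    for s, a in sorted(policy):
        chosen.setdefault(s, a)
    return set(chosen.items())


PROGRESS = remove_non_progress(GAMMA, set(GAMMA), GOALS)
FINAL = mk_det(PROGRESS)
```

Here (s1, back) is discarded: by the time it qualifies, s1 is already covered by (s1, fwd), which gets closer to the goal.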
30 Example 1
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
31 Example 1
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg) = π'
- RemoveNonProgress(π') = ?
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
32 Example 1
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg) = π'
- RemoveNonProgress(π'): as shown
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
33 Example 1
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg) = π'
- RemoveNonProgress(π'): as shown
- MkDet(...) = either {(s1,move(r1,l1,l4)), (s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s4,move(r1,l4,l6)), (s5,move(r1,l5,l4))}
  - or {(s1,move(r1,l1,l2)), (s2,move(r1,l2,l3)), (s3,move(r1,l3,l4)), (s4,move(r1,l4,l6)), (s5,move(r1,l5,l4))}
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
34 Example 2: no applicable actions at s5
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
35 Example 2: no applicable actions at s5
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg)
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
36 Example 2: no applicable actions at s5
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s5, s6]
37 Example 2: no applicable actions at s5
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg): as shown
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s6]
38 Example 2: no applicable actions at s5
- π ← ∅
- π' ← {(s,a) | a is applicable to s}
- π ← {(s,a) | a is applicable to s}
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg): as shown
- π' ← as shown
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s6]
39 Example 2: no applicable actions at s5
- π' ← as shown
- π ← π'
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg) = π'
- so π = π'
- RemoveNonProgress(π')
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s6]
40 Example 2: no applicable actions at s5
- π' ← as shown
- π ← π'
- PruneOutgoing(π',Sg) = π'
- PruneUnconnected(π',Sg) = π'
- so π = π'
- RemoveNonProgress(π'): as shown
- MkDet(shown): no change
[Figure: state-transition graph with Start and Goal (at(r1,l6)) marked; labeled states: s1, s2, s3, s4, s6]
41 Planning for Extended Goals
- Here, "extended" means temporally extended
  - Constraints that apply to some sequence of states
- Examples
  - want to move to l3, and then to l5
  - want to keep going back and forth between l3 and l5
42 Planning for Extended Goals
- Context: the internal state of the controller
- Plan: (C, c0, act, ctxt)
  - C = a set of execution contexts
  - c0 is the initial context
  - act: S × C → A
  - ctxt: S × C × S → C
- Section 17.3 extends the ideas in Sections 17.1 and 17.2 to deal with extended goals
  - We'll skip the details
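To illustrate why a context is needed, consider the "back and forth between l3 and l5" goal: at l4 the right action depends on which way the robot is currently heading, and the state alone cannot tell. The sketch below is a hypothetical plan (C, c0, act, ctxt) for that goal. The context names, the tables, and the simplification to deterministic transitions are all assumptions made for illustration.

```python
# Simplifying assumption: deterministic transitions, (state, action) -> next state.
GAMMA = {
    ("s3", "move(r1,l3,l4)"): "s4",
    ("s4", "move(r1,l4,l5)"): "s5",
    ("s5", "move(r1,l5,l4)"): "s4",
    ("s4", "move(r1,l4,l3)"): "s3",
}

# C: the set of execution contexts; c0: the initial context (names are hypothetical).
C = {"goto_l5", "goto_l3"}
C0 = "goto_l5"

# act: S x C -> A. Note that s4 maps to different actions in different contexts.
ACT = {
    ("s3", "goto_l5"): "move(r1,l3,l4)",
    ("s4", "goto_l5"): "move(r1,l4,l5)",
    ("s5", "goto_l3"): "move(r1,l5,l4)",
    ("s4", "goto_l3"): "move(r1,l4,l3)",
}


def ctxt(s, c, s_next):
    """ctxt: S x C x S -> C. Flip the heading when an endpoint is reached."""
    if s_next == "s5":
        return "goto_l3"
    if s_next == "s3":
        return "goto_l5"
    return c


def run(s, c, steps):
    """Execute the plan for a fixed number of steps; return the visited states."""
    trace = [s]
    for _ in range(steps):
        a = ACT[(s, c)]
        s_next = GAMMA[(s, a)]
        c = ctxt(s, c, s_next)
        s = s_next
        trace.append(s)
    return trace


TRACE = run("s3", C0, 8)
```

A memoryless policy π: S → A could assign only one action to s4, so it could not alternate between the two endpoints; the context supplies the missing bit of controller state.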