Title: Model-Based Reflection and Self-Adaptation
1Model-Based Reflection and Self-Adaptation
Ashok K. Goel
Artificial Intelligence Laboratory, College of Computing
Georgia Institute of Technology, Atlanta, GA 30332-0280
goel@cc.gatech.edu
http://www.cc.gatech.edu/ai/faculty/goel/
Indian International Conference on Artificial Intelligence, Hyderabad, India, December 2003
2Model-Based Reflection and Self-Adaptation
Agent
- Self-Adaptation
- Incremental adaptation of self in response to demands of environment
- Reflection
- Reasoning about self
- Use of knowledge of self
- Model
- Self-knowledge
- Teleological and compositional knowledge
New Goal
Evolved Agent
3Model-Based Self-Adaptation
- Given
- A model of an agent that can accomplish some tasks
- A specification of a new task
- A set of input values
- Makes
- A revised model of the agent which is capable of
performing the new task
4Model-Based Adaptation: Trivial Example
Agent
New Task
Revised Agent
?
Install Light Bulb
Install Light Bulb
Remove Light Bulb
Traditional Installation
Traditional Installation
Insert Bulb
Rotate Bulb
Insert Bulb
Rotate Bulb
Remove Light Bulb
Newly Created Method
Retract Bulb
Rotate Bulb
5Constraints
- Automation
- transfer all the way down to the level of code
- Uniformity
- meta-level and base-level in the same language
- Generality
- variety of tasks, domains
- should allow execution with failures leading to repair
- also allow evolution without prior execution
6Issues
- Purely model-based evolution
- Model-based evolution with traces
- Additional mechanisms
- Knowledge requirements
- Level of decomposition
- Computational cost
7Issues (1)
- Purely model-based evolution
- For what classes of problems is evolution based only on models applicable?
- Model-based evolution with traces
- What additional classes of problems may be solved by the addition of traces of specific reasoning episodes?
- Additional mechanisms
- For what classes of problems does model-based evolution provide incomplete solutions?
- What sorts of additional mechanisms can complete these problems?
8Issues (2)
- Knowledge Requirements
- Are the knowledge requirements of this technique reasonable?
- Level of decomposition
- How should the correct level of decomposition for a model be determined?
- Computational cost
- For what problems is the computational cost of
this approach competitive with other approaches?
9Varieties of Adaptation
- Types of redesign processes
- credit assignment
- repair
- analysis of a single episode of processing
- analysis of potential paths of processing
- Types of redesign knowledge
- Components, connections
- Control of processing
- Function of processing elements
- Domain knowledge used in processing
10Why is it hard?
- Purely model-based reasoning
- Many elements and connections
- Local change can have non-local effects
- Many paths through these elements and connections
- Model-based reasoning with traces
- Addresses third issue and may address the second,
but still leaves you with the first
11Projects
- Failure-driven agent evolution
- Autognostic (1992-96), SIRRINE (1997-2001)
- Software architecture
- MORALE (1998-2001)
- Robotics
- Reflecs (1993-96)
- Model-based evolution in response to both failures and novel tasks
- REM (1998-present)
12Task-Method-Knowledge (TMK)
- TMK models provide the agent with knowledge of its own design.
- TMK encodes
- Tasks (including Subtasks and Primitive Tasks)
- Requirements and results
- Methods
- Composition and control
- Knowledge
- Domain concepts and relations
13Evolutionary Reasoning Shell
- REM (Reflective Evolutionary Mind)
- Uses TMKL (TMK Language)
- A new, powerful formalism of TMK
- REM uses reasoning processes encoded in TMKL. It
executes these processes and it adapts them as
needed.
14Tasks in TMKL
- All tasks can have input/output parameter lists and given/makes conditions.
- A non-primitive task must have one or more methods which accomplish it.
- A primitive task must include one or more of the following: source code, a logical assertion, a specified output value.
- Unimplemented tasks have none of these.
15TMKL Task
- (define-task communicate-with-www-server
-   input (input-url)
-   output (server-reply)
-   makes (and
-     (document-at-location (value server-reply) (value input-url))
-     (document-at-location (value server-reply) local-host))
-   by-mmethod (communicate-with-server-method))
16Methods in TMKL
- Methods have provided and additional-result conditions which specify incidental requirements and results.
- In addition, a method specifies a start transition for its processing control.
- Each transition specifies requirements for using it and a new state that it goes to.
- Each state has a task and a set of outgoing transitions.
17Different Kinds of Similarity
- Some problems have very different solutions which can be produced by very similar reasoning mechanisms.
- Because solutions are very different, traditional CBR over solutions is not appropriate.
- Because reasoning is very similar, maybe we can do CBR over reasoning.
- If the reasoning is case-based, this process is Meta-Case-Based Reasoning.
18Simple TMKL Method
- (define-mmethod external-display
-   provided (not (internal-display-tag (value server-tag)))
-   series (select-display-command
-           compile-display-command
-           execute-display-command))
19Complex TMKL Method
- (define-mmethod make-plan-node-children-mmethod
-   series (select-child-plan-node
-           make-subplan-hierarchy
-           add-plan-mappings
-           set-plan-node-children))
- (tell (transition->links make-plan-node-children-mmethod-t3
-         equivalent-plan-nodes
-         child-equivalent-plan-nodes)
-       (transition->next make-plan-node-children-mmethod-t5
-         make-plan-node-children-mmethod-s1)
-       (create make-plan-node-children-terminate transition)
-       (reasoning-state->transition make-plan-node-children-mmethod-s1
-         make-plan-node-children-terminate)
-       (about make-plan-node-children-terminate
-         (transition->provided
-           '(terminal-addam-value (value child-plan-node)))))
20Knowledge in TMKL
- Foundation: LOOM
- Concepts, instances, relations
- Concepts and relations are instances and can have facts about them.
- Knowledge representation in TMKL involves LOOM plus some TMKL-specific reflective concepts and relations.
21Some TMKL Knowledge Modeling
- (defconcept location)
- (defconcept computer
-   :is-primitive location)
- (defconcept url
-   :is-primitive location
-   :roles (text))
- (defrelation text
-   :range string
-   :characteristics :single-valued)
- (defrelation document-at-location
-   :domain reply
-   :range location)
- (tell (external-state-relation document-at-location))
22Sample Meta-Knowledge in TMKL
- generic relations
- same-as
- instance-of
- isa
- inverse-of
- relation characteristics
- single-valued/multiple-valued
- symmetric, commutative
- many more
- relations over relations
- external/internal
- state/definitional
- concepts of relations
- binary-relation
- unary-relation
- isa
- inverse-of
- concepts relating to concepts
- thing
- meta-concept
- concept
23Flexibility
- Case-based reasoning is fundamentally flexible
- A CBR system can address a new problem whenever it has a sufficiently similar case.
- Traditional CBR systems cannot solve new kinds of problems.
- even when the reasoning for those new kinds of problems is very similar to known reasoning
24Traditional Example: Hierarchical Case-Based Disassembly Planning
Remove Board-2-1
Remove Board-2-2
Unscrew Screw-2-2
Remove Board-3-3
Remove Board-3-1
Remove Board-3-2
Unscrew Screw-3-2
25Alternate Example: Using the Disassembly Planner for Assembly
Remove Board-2-1
Remove Board-2-2
Unscrew Screw-2-2
Place Board-2-1
Place Board-2-2
Screw Screw-2-2
26Different Kinds of Similarity
- Some problems have very different solutions which can be produced by very similar reasoning mechanisms.
- Because solutions are very different, traditional CBR over solutions is not appropriate.
- Because reasoning is very similar, maybe we can do CBR over reasoning.
- If the reasoning is case-based, this process is Meta-Case-Based Reasoning.
27REM Reasoning Process
...
Implemented Task
Execution
...
A Method
...
ADAPTED Implemented Task
Trace
...
Set of Input Values
ADAPTED Method
Set of Output Values
Unimplemented Task
Adaptation
Set of Input Values
28Adaptation Process
Generative Planning
...
Task
ADAPTED Implemented Task
Situated Learning
...
Set of Input Values
ADAPTED Method
Proactive Model Transfer
...
...
Existing Method
Similar Implemented Task
Failure-Driven Model Transfer
...
Trace
A Method
29Execution Process
...
Implemented Task
Select Method
...
A Method
Trace
Select Next Task Within Method
Set of Input Values
Set of Output Values
Execute Primitive Task
30Selection: Q-Learning
- Popular, simple form of reinforcement learning.
- In each state, each possible decision is assigned an estimate of its potential value (Q).
- For each decision, preference is given to higher Q values.
- Each decision is reinforced, i.e., its Q value is altered based on the results of the actions.
- These results include actual success or failure and the Q values of next available decisions.
31Physical Device Disassembly
- ADDAM: Legacy software agent for hierarchical case-based disassembly planning and (simulated) execution
- Interactive: Agent connects to a user specifying goals and to a complex physical environment
- Dynamic: New designs and demands
- Knowledge-Intensive: Designs, plans, etc.
32Disassembly → Assembly
- A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly.
- ADDAM has no assembly method, thus must adapt first.
- Since assembly is similar to disassembly, REM selects Proactive Model Transfer.
33Pieces of ADDAM which are key to the Disassembly → Assembly Problem
Disassemble
Plan Then Execute Disassembly
Adapt Disassembly Plan
Execute Plan
Hierarchical Plan Execution
Topology Based Plan Adaptation
Make Plan Hierarchy
Map Dependencies
Select Next Action
Execute Action
Select Dependency
Assert Dependency
Make Equivalent Plan Nodes Method
Make Equivalent Plan Node
Add Equivalent Plan Node
34Process for Addressing the Assemble Task by REM
using ADDAM
- First the agent tries to find a method for the Assemble task. It doesn't have one.
- Next it tries to find a similar task which does have a method. It finds Disassemble.
- The index is the input and output information provided in the task.
- Similarity is determined by a combination of general rules plus domain-specific rules and assertions.
- Next it searches for a relation which links the effects of the two tasks. It finds Inverse-of.
- Finally, it uses this relation to modify components of the existing process to address the new process.
- Some of these changes may be uncertain; Q-learning resolves this uncertainty.
35New Adapted Assembly Task
Assemble
COPIED Plan Then Execute Disassembly
COPIED Adapt Disassembly Plan
COPIED Execute Plan
COPIED Hierarchical Plan Execution
COPIED Topology Based Plan Adaptation
COPIED Make Plan Hierarchy
COPIED Map Dependencies
Select Next Action
INSERTED Inversion Task 2
Execute Action
COPIED Select Dependency
INVERTED Assert Dependency
COPIED Make Equivalent Plan Nodes Method
COPIED Add Equivalent Plan Node
INSERTED Inversion Task 1
COPIED Make Equivalent Plan Node
36Changes to ADDAM
- After the task which produces plan nodes, an optional task is inserted which imposes the inverse-of relation on the type of the node.
- e.g., Unscrew → Screw
- The (simple) task which asserts ordering dependencies is changed to assert the inverse-of ordering dependencies.
- After the task which extracts plan nodes from a plan, an optional task is inserted which imposes the inverse-of relation on the type of the node.
- e.g., Screw → Unscrew
37ADDAM Example: Layered Roof
38Alternative Approach: Situated Learning
- Another way to address this problem would be to simply try out actions (in the simulator) and see what happens.
- Since REM selects among alternatives using Q-learning, trying arbitrary actions in REM is pure Q-learning.
- Contrast with the Meta-Case-Based Reasoning example, which only makes two decisions by Q-learning in the entire process.
39Alternative Approach: Generative Planning
- Since the assembly problem is a planning problem, one can address it by traditional generative planning.
- It is well known that Case-Based Reasoning can provide enormous speed-up versus Generative Planning.
- Meta-Case-Based Reasoning can enable the speed advantages of CBR to be transferred to new problems that the existing cases don't directly address (such as assembly in the disassembly planner).
40Modified Roof Assembly: No Conflicting Goals
41Roof Assembly
42Solve Problem
Evolve then Execute
Execute then Evolve
Execute
Elicit Feedback
Evolve
Planning Evolution
Proactive Model Evolution
Situator Evolution
Failure Driven Model Evolution
Evolve using Situator
Evolve Known Task
Retrieve Known Task
Select Failure
Analyze Failures
Make Repair
Evolve using Generative Planning
43Task: Solve Problem
Input: problem state, main task
Output: result state, result trace
Given: -
Makes: the given condition for the main task holds
By Method: Execute then Evolve, Evolve then Execute
44Method: Execute then Evolve
Provided: the main task is implemented
Additional Result: -
- Control
- REPEAT
- Execute
- WHILE (reinforcement learning is occurring AND task has not succeeded)
- Elicit Feedback
- UNLESS (task has succeeded AND there is no feedback)
- Evolve
- UNLESS no repair has been made
- Solve Problem
45Method: Evolve then Execute
Provided: -
Additional Result: -
- Control
- Evolve
- Solve Problem
46Execute
- Execution of a non-primitive task involves selection of a method.
- Selection of methods = decision making
- Execution of a method involves following a state-transition diagram encoded in that method, invoking subtasks.
- Selection of transitions = decision making
- Primitive tasks are simply run.
- Execution involves building a trace and checking that tasks behave as specified.
47Execute
- IF check given condition for main task THEN
- IF main task is primitive THEN
- do main task
- ELSE
- chosen-method ← DECIDE on one applicable method for the main task
- current-transition ← start transition of chosen-method
- WHILE current-transition is not terminal
- current-state ← next state of current-transition
- Execute current-state's subtask
- current-transition ← DECIDE on one available transition from current-state
- check makes condition for main task
- build trace element for main task
48Decision Making: Q-Learning
- In each state, each possible decision is assigned an estimate of its potential value (Q).
- For each decision, preference is given to higher Q values.
- Each decision is reinforced, i.e., its Q value is altered based on the results of the actions.
- These results include actual success or failure and the Q values of next available decisions.
49Q-Learning in REM
- Decisions are made for method selection and for selecting new transitions within a method.
- A decision state is a point in the reasoning (i.e., task, method) plus a set of all decisions which have been made in the past.
- Initial Q values are set to 0.
- Decides on option with highest Q value or randomly selects option with probabilities weighted by Q value (configurable).
- A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
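The two configurable selection policies described above (greedy highest-Q versus random weighted by Q value) might be sketched as follows. Shifting the weights to keep them positive is my own choice for the sketch, not necessarily how REM handles it.

```python
import random

def select(q, options, greedy=True):
    """Pick an option by Q value: greedy argmax, or random weighted by Q.

    `q` maps options to Q values; unseen options default to 0 (as in REM,
    where initial Q values are 0).
    """
    if greedy:
        best = max(q.get(o, 0.0) for o in options)
        # Break ties randomly among the highest-Q options.
        return random.choice([o for o in options if q.get(o, 0.0) == best])
    # Weighted random: shift so all weights are positive (illustrative choice).
    weights = [q.get(o, 0.0) for o in options]
    shift = min(weights)
    weights = [w - shift + 1e-6 for w in weights]
    return random.choices(options, weights=weights)[0]

print(select({"a": 1.0, "b": 0.0}, ["a", "b"]))  # → a
```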
50Evolution Methods
- Planning Evolution
- Invokes Graphplan to combine primitive tasks into a single method
- Situator Evolution
- Creates a separate method for each possible action
- During execution, the decision-making (Q-learning) module selects actions
- Proactive Model Evolution
- Adapts a similar known task to address a new task
- Failure-Driven Model Evolution
- Identifies and fixes problems with existing task / method
- Requires a trace
51Evolve Using Generative Planning
- Invokes Graphplan
- Operators: Those primitive tasks known to the agent which can be translated into Graphplan's operator language
- Facts: Known assertions which involve relations referred to by the operators
- Goal: Makes condition of main task
- Translates plan into more general method by turning specific objects into parameters
- Stores method for later reuse
52Evolve Using Generative Planning
- actions ← all primitive tasks which can be translated into Graphplan's operator language
- FOR action IN actions DO
- relevant-relations ← relations which are referenced in the preconditions or postconditions of action
- facts ← all known assertions regarding the relevant-relations in the current knowledge state or the given state of main-task
- goal ← makes condition of main-task
- plan ← run Graphplan on actions, facts, goal
- replace specific values in plan with parameters
- method for main-task ← new method whose subtasks are the steps of plan
53Evolve Using Situator
- Creates one method for each possible combination of inputs for each known action.
- e.g., place board-1 on board-2, place board-1 on board-3, etc.
- Selection of applicable methods is left unspecified.
- When the task is then executed, the choice of methods is handled by the normal REM decision-making process (i.e., Q-Learning).
54Evolve Using Situator
- create a new method which loops until the makes condition of the main-task is met and invokes a new task, new-task
- FOR action IN all primitive tasks DO
- input-value-combinations ← all possible bindings of inputs to values
- FOR input-value-combination IN input-value-combinations DO
- new-method ← new method for new-task
- new-subtask ← task which sets the values of the parameters to match the input-value-combination
- set the subtasks of new-method to be new-subtask, action
- set the provided condition of new-method to the given condition of main-task
55Method: Failure-Driven Model Evolution
Provided: there is a trace of execution for the main task
Additional Result: -
- Control
- REPEAT
- Analyze Failures
- Select Failure
- Make Repair
- WHILE (there is no failure which has been repaired AND task has not succeeded)
56Select Failure
- Types of failure
- directly-contradicts-feedback
- may-contradict-feedback
- missing-inputs
- given-fails
- makes-fails
- no-applicable-mmethod
- no-applicable-transition
57Select Failure
- Heuristics for selection
- Failures of directly-contradicts-feedback type are prioritized ahead of may-contradict-feedback, which are prioritized ahead of other failure types.
- Among feedback contradiction failures, ones which affect primitive tasks are preferred.
58Make Repair
- Repair strategies for failure-driven model
evolution - Knowledge assimilation
- Knowledge re-organization
- Task/method parameter modification
- Task generalization/specialization
- Fixed-value method development
- Task insertion
59Method: Proactive Model Evolution
Provided: there is some implemented task similar to the main task and proactive model evolution has not been attempted
Additional Result: -
- Control
- Retrieve Known Task
- Evolve Known Task
60Disassembly → Assembly
- A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly.
- Since ADDAM has no assembly method, REM selects
- Since assembly is similar to disassembly, REM selects
Evolve then Execute
Proactive Model Evolution
61Disassemble
Plan Then Execute Disassembly
Adapt Disassembly Plan
Execute Plan
Hierarchical Plan Execution
Topology Based Plan Adaptation
Match Topologies
Sequentialize Plan
Execute Sequential Plan
Make Plan Hierarchy
Map Dependencies
Make Plan Hierarchy Method
Sequential Plan Execution
Serialized Dependency Mapping
Encapsulate Target Plan
Select Base Plan Top
Make Subplan Hierarchy
Select Next Action
Execute Action
Make Subplan Hierarchy Method
List Target Dependencies
Select Dependency
Assert Dependency
Make Plan Node Mappings
Find Equivalent Topology Nodes
Select Equivalent Plan Node
Make Equivalent Plan Nodes
Find Base Plan Node Children
Make Plan Node Children
Make Equivalent Plan Nodes Method
Make Plan Node Children Method
Select Equivalent Topology Node
Make Equivalent Plan Node
Add Equivalent Plan Node
Make Subplan Hierarchy
Add Plan Mappings
Set Plan Node Children
Select Child Plan Node
62Pieces of ADDAM which are key to the Disassembly → Assembly Problem
Disassemble
Plan Then Execute Disassembly
Adapt Disassembly Plan
Execute Plan
Hierarchical Plan Execution
Topology Based Plan Adaptation
Make Plan Hierarchy
Map Dependencies
Select Next Action
Execute Action
Select Dependency
Assert Dependency
Make Equivalent Plan Nodes Method
Make Equivalent Plan Node
Add Equivalent Plan Node
63Adaptation Strategy: Inversion
- Copy methods for known task to main task
- invertible-relations ← all relations for which inverse-of holds with some other relation
- invertible-concepts ← all concepts for which inverse-of holds with some other concept
- relevant-relations ← invertible-relations + all relations over invertible-concepts
- relevant-manipulable-relations ← relevant-relations which are internal state relations
- candidate-tasks ← all tasks which affect relevant-manipulable-relations
- FOR candidate-task IN candidate-tasks DO
- IF candidate-task directly asserts a relevant-manipulable-relation THEN
- invert the assertion for that candidate task
- ELSE IF candidate-task produces an invertible output THEN
- insert an inversion task after candidate-task
64New Adapted Task in the Disassembly → Assembly Problem
Assemble
COPIED Plan Then Execute Disassembly
COPIED Adapt Disassembly Plan
COPIED Execute Plan
COPIED Hierarchical Plan Execution
COPIED Topology Based Plan Adaptation
COPIED Make Plan Hierarchy
COPIED Map Dependencies
Select Next Action
INSERTED Inversion Task 2
Execute Action
COPIED Select Dependency
INVERTED Assert Dependency
COPIED Make Equivalent Plan Nodes Method
COPIED Add Equivalent Plan Node
INSERTED Inversion Task 1
COPIED Make Equivalent Plan Node
65Task: Assert Dependency
- Before:
- (define-task Assert-Dependency
-   input (target-before-node target-after-node)
-   asserts (node-precedes (value target-before-node)
-                          (value target-after-node)))
- After:
- (define-task Inverted-Assert-Dependency
-   input (target-before-node target-after-node)
-   asserts (node-follows (value target-before-node)
-                         (value target-after-node)))
66Task: Make Equivalent Plan Node
- (define-task make-equivalent-plan-node
-   input (base-plan-node parent-plan-node equivalent-topology-node)
-   output (equivalent-plan-node)
-   makes (and
-     (plan-node-parent (value equivalent-plan-node)
-                       (value parent-plan-node))
-     (plan-node-object (value equivalent-plan-node)
-                       (value equivalent-topology-node))
-     (implies (plan-action (value base-plan-node))
-              (type-of-action (value equivalent-plan-node)
-                              (type-of-action (value base-plan-node)))))
-   by procedure ...)
67Task: Inverted Reversal Task 1
- (define-task Inverted-Reversal-Task-1
-   input (equivalent-plan-node)
-   asserts (type-of-action
-             (value equivalent-plan-node)
-             (inverse-of
-               (type-of-action
-                 (value equivalent-plan-node)))))
68Illustrative Shipping Domain
- Structured shipping domain
- Loading crates onto a truck, driving them to a destination, delivering documentation
- Loading subproblem is isomorphic to Tower-of-Hanoi
- Solution cost grows rapidly with the number of crates
- Driving and delivering documentation: 1 action each
Destination
Pallet
Warehouse
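Since the loading subproblem is isomorphic to Tower-of-Hanoi, its minimum solution length follows the well-known recurrence T(n) = 2·T(n-1) + 1 = 2^n - 1 for n crates. A quick check of that growth (the crate-to-disc mapping is assumed exact for this sketch):

```python
# Minimum moves for n-disc Tower-of-Hanoi: T(n) = 2*T(n-1) + 1 = 2**n - 1.
# This is why the crate-loading subproblem's cost grows rapidly with n.
def hanoi_moves(n):
    return 2**n - 1

print([hanoi_moves(n) for n in range(1, 6)])  # → [1, 3, 7, 15, 31]
```

This exponential growth is what makes knowledge reuse pay off for the larger problems contrasted on later slides.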
69REM Functional Architecture
70Shipping Model
Deliver with Documentation
Move Stack
Select Object
Drive
Deliver Object
Recipient
Documentation
Agent
Select Move
Move Crate
Paper
Manifest
Truck
Object
Warehouse
In Truck
Destination
Crate
Place
71Shipping Model
Deliver with Documentation
Move Stack
Select Object
Drive
Deliver Object
Select Move
Move Crate
72Model Execution
- If the given task is non-primitive
- Choose a method whose applicability conditions are met
- While the state-transition machine encoded in the method is not at an end state
- Execute the subtask for the current state
- Choose a transition from the current state whose applicability conditions are met
- Else execute the primitive task
- Choices are made through weighted random selection; reinforcement learning influences these weights.
- In most models, usually only one choice at each decision point
- Throughout execution, REM records a trace of the tasks and methods performed and the knowledge used.
73Ablated Shipping Model
Deliver with Documentation
Move Stack
Select Object
Drive
Deliver Object
Select Move
Move Crate
74Adaptation Using Generative Planning
- Requires operators and a set of facts (initial state)
- Invokes external planner (i.e., Graphplan)
- Operators: Those primitive tasks known to the agent which can be translated into the planner's operator language
- Facts: Known assertions which involve relations referred to by the operators
- Goal: Makes condition of main task
- Translates plan into more general method by turning instances into parameters and propagating bindings
- Stores method for later reuse
75Ablated model + generative planning
Deliver with Documentation
Move Stack
Select Object
Drive
Deliver Object
Select Move
Move Crate
New method (from planning)
76Generative planning only
Deliver with Documentation
Deliver Object
Drive
Move Crate
New method (from planning)
77Shipping Example Results
78Contrast: Small Problems (overheads dominate)
- 1 crate
- Complete and ablated both require no adaptation (very fast)
- None requires planning (slower)
- 2 and 3 crates
- Complete still requires no adaptation (very fast)
- None requires planning (slower)
- Ablated requires model-based credit assignment and planning (slowest)
79Contrast: Larger Problems (knowledge power)
- Complete model much faster than planning with 4 or more crates.
- The ablated model took 3.5 hours less than no model.
- The plan produced with no model only had 2 extra steps! (Drive, Deliver Object)
- All of the Move Crate actions are the same in both conditions.
80Constraints (1)
- Automation
- REM's TMKL model of itself is implemented within REM.
- Transfer occurs all the way down to the level of code.
- Uniformity
- TMKL enables uniform representation of tasks, methods, and knowledge at both base-level and meta-level.
81Constraints (2)
- Generality
- SIRRINE: Failure-driven evolution
- Kritik: Representation of design processing
- Employee Time Card Architecture: Interoperation with Architectural Description Languages
- Meeting Scheduler: Adapting domain knowledge / problem constraints
- REM: Evolution in response to failures and for novel tasks
- ADDAM: roof, book, stool, camera, computer (approximately 15 components)
- Web browser
82Applicability of Purely Model-Based Evolution
- Knowledge about the concepts and relations in the domain
- Knowledge about how the tasks and methods affect these concepts and relations
- Differences between the old task and the new map onto knowledge of the concepts and relations in the domain.
83Applicability of Model-Based Evolution with Traces
- May need less knowledge about the domain itself since the evolution is grounded in a specific incident.
- e.g., feedback about PDF for an example instead of advance knowledge of all document types.
- Still requires knowledge about how the tasks and methods interact with the domain.
84Additional Mechanisms
- Model-based adaptation may leave some design decisions unresolved.
- By treating computations as actions, these decisions may be resolved by traditional decision-making mechanisms, such as reinforcement learning.
- Models may be unavailable or irrelevant for some tasks or subtasks
- Generative planning can combine primitive actions.
- May be too slow for some problems
85Knowledge Requirements
- MORALE
- Involved techniques for architectural extraction
- TMK effective for architectural descriptions of software
- Evidence that TMK is an attainable knowledge condition for software agents
- In contrast, purely trace-based reasoning methods require feedback in the form of traces.
- Unreasonable requirement
- Unlikely that a user will know every step that an agent should have executed.
86Level of Decomposition
- Level of decomposition may be dictated by the nature of the agent.
- Some tasks simply cannot be decomposed
- In other situations, level of decomposition may be guided by the nature of adaptation to be done.
- Can be brittle if unpredicted demands arise.
- REM enables autonomous decomposition of primitives, which addresses this problem.
87Computational Costs
- Meta-reasoning incurs some costs.
- For very easy problems, this overhead may not be justified.
- For other problems, the benefits enormously outweigh these costs.
88Results (1)
- Automation: Automates transfer of executable agent (Autognostic, Reflecs, SIRRINE, REM)
- Uniformity: Interoperation of reasoning and meta-reasoning (TMKL)
- Generality: Use of approach for a variety of tasks and domains (Autognostic, Reflecs, SIRRINE, MORALE, REM)
- Applicability with Models: Dependent on domain knowledge and task definitions (Autognostic, Reflecs, SIRRINE, REM)
- Applicability with Models and Traces: Dependent on task definitions but less dependent on domain knowledge (Autognostic, Reflecs, SIRRINE, REM)
89Results (2)
- Additional Mechanisms: Reinforcement learning for unresolved constraints and generative planning for reasoning without relevant models (REM)
- Knowledge Requirements: Architectural extraction provides a foundation for inferring tasks and methods of software agents (MORALE)
- Level of Decomposition: Enhanced by automated method generation (Autognostic, SIRRINE, REM)
- Computational Cost: Drastically outperforms competitors on some hard problems (REM)
90Conclusions
- Reflection based on teleological/compositional
self-models enables self-adaptation!
91All of the above has been joint work
- J. William Murdock
- Ph.D., Georgia Tech, 2001
- Now with IBM T.J. Watson Research Center
- Primary developer of
- REM, SIRRINE, MORALE, TMKL
- (and also these slides)
- Eleni Stroulia
- Ph.D., Georgia Tech, 1994
- Now on CS faculty at the University of Alberta (Edmonton)
- Primary developer of
- Autognostic, Reflecs, TMK
92Partial History of TMK
Key Influences
Functional Representation (Sembugamoorthy &amp; Chandrasekaran 1986)
Generic Tasks (Chandrasekaran 1986)
OSU
SBF Models (Goel, Bhatta, Stroulia 1997)
TMK Projects
Autognostic: Failure-driven learning (Stroulia &amp; Goel 1995)
...
Interactive Kritik: Self-Explanation (Goel &amp; Murdock 1996)
GT
SIRRINE: Retrospective Adaptation (Murdock &amp; Goel 2001)
REM: Proactive Adaptation (Murdock 2001)
Reflecs: Robotics (Goel et al. 1998)
Game Playing
MORALE: Software Design
93Meta-CBR vs. Case-Based Adaptation
- Meta-CBR reasons about and adapts an entire reasoning process, such as a CBR process.
- Case-based adaptation restricts adaptation to one portion of a case-based process: adaptation.
- Being more focused is a substantial advantage for case-based adaptation.
- However, for problems which require adaptation of different sorts of reasoning processes, it is useful to have models of these processes, as in meta-CBR.
94Meta-CBR vs. Derivational Analogy
- REM's meta-CBR adapts models of tasks and methods.
- Derivational analogy generally assumes some sort of universal process (e.g., generative planning) and only needs to represent and reason about key decision points.
- Advantage of derivational analogy: Models not needed; traces alone enable reuse.
- Advantage of meta-CBR: Applicable to problems for which a universal process is not appropriate (e.g., the 6-board roof example takes days using planning + Q-learning).
- Meta-CBR demands more knowledge but makes effective use of that additional knowledge.
95Comparing TMK with HTNs
- Similarities
- Both are hierarchical decompositions of tasks and methods
- Both include primitive actions at the lowest level
- Differences
- TMK includes functions of tasks
- Not needed for planning
- Useful for creating/modifying methods
- Also useful for explaining reasoning, etc.
- HTNs allow more ambiguity (for backtracking)
- TMK uses a very expressive state representation
- TMK interleaves reasoning and acting
- Many subtle technical differences
- Mostly because HTNs are optimized for planning, while TMK is also intended to support modification, etc.
96Q-Learning Example
Install Lightbulb
Traditional Installation
Rotate Bulb
Insert Bulb
Insert Roughly
Insert Gently
Rotate Quickly
Rotate Slowly
...
...
...
...
- Two decision points: Method for Insert Bulb and Method for Rotate Bulb
- Three possible decision states: (Insert Bulb), (Rotate Bulb | Insert Roughly), (Rotate Bulb | Insert Gently)
97Q-Learning Example: Beginning
- Q(Insert Bulb, Insert Gently) = 0
- Q(Insert Bulb, Insert Roughly) = 0
- Q(Rotate Bulb | Insert Roughly, Rotate Quickly) = 0
- Q(Rotate Bulb | Insert Roughly, Rotate Slowly) = 0
- Q(Rotate Bulb | Insert Gently, Rotate Quickly) = 0
- Q(Rotate Bulb | Insert Gently, Rotate Slowly) = 0
- First decision: In Insert Bulb, choose an action (arbitrary, since values are 0). Say Insert Gently.
- When you get to the next decision, update the previous decision.
98Q-Learning Example: First Update
- Q(s,a) ← Q(s,a) + α(r + γ·max_a' Q(s',a') - Q(s,a))
- In REM, α = 0.1 and γ = 0.93
- Q(Insert Bulb, Insert Gently) += 0.1(0 + 0.93·0 - 0)
- Q(Insert Bulb, Insert Gently) = 0, no change
- Next decision: In Rotate Bulb | Insert Gently, choose an action (arbitrary, since values are 0). Say Rotate Slowly.
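The first update above can be checked numerically with the standard Q-learning rule, using the α and γ values the slides give for REM. The function and state/action naming are illustrative, not REM's implementation.

```python
# Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',a') - Q(s,a)).
ALPHA, GAMMA = 0.1, 0.93  # values the slides report for REM

def q_update(Q, s, a, r, next_qs):
    """Update Q[(s, a)] given reward r and the next state's available Q values."""
    best_next = max(next_qs) if next_qs else 0.0
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)

Q = {}
# First update: no reward yet, and all next-state Q values are still 0.
q_update(Q, "Insert Bulb", "Insert Gently", r=0.0, next_qs=[0.0, 0.0])
print(Q[("Insert Bulb", "Insert Gently")])  # → 0.0 (no change, as on the slide)
```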