Title: Computational Discovery of Communicable Knowledge
1 Cumulative Learning of Relational and Hierarchical Skills from Problem Solving
Pat Langley
Institute for the Study of Learning and Expertise
Palo Alto, CA
http://www.isle.org
This research was funded by Grant
HR0011-04-1-0008 from the DARPA Information
Processing Technology Office, which may not agree
with the points made in this talk.
2 Research Objectives
- We are designing and implementing new learning methods that
  - operate over relational, hierarchical knowledge structures
  - support reasoning, reactive control, and problem solving
  - are embedded within a broader architectural framework
  - utilize existing knowledge to increase learning rates
  - acquire this knowledge in an incremental, cumulative manner
  - are applicable to a variety of challenging domains
- We hope to develop learning mechanisms that support horizontal and vertical transfer both within and across domains.
3 The ICARUS Architecture
[Figure: ICARUS architecture diagram (without learning), with components: Perceptual Buffer, Perception, Environment, Short-Term Conceptual Memory, Long-Term Conceptual Memory, Categorization and Inference, Skill Retrieval, Long-Term Skill Memory, Goal/Skill Stack, Skill Execution, Means-Ends Analysis, Motor Buffer.]
4 Organization of Long-Term Memory
ICARUS organizes both concepts and skills in a hierarchical manner.
[Figure: concept and skill hierarchies.]
Each concept is defined in terms of other concepts and/or percepts. Each skill is defined in terms of other skills, concepts, and percepts.
5 Concepts from In-City Driving Domain
(in-segment (?self ?sg)
  :percepts  ((self ?self segment ?sg)
              (segment ?sg)))

(aligned-with-lane (?self ?lane)
  :percepts  ((self ?self)
              (lane-line ?lane angle ?angle))
  :positives ((in-lane ?self ?lane))
  :tests     ((gt ?angle -0.05)
              (lt ?angle 0.05)))

(on-street (?self ?packet)
  :percepts  ((self ?self)
              (packet ?packet street ?street)
              (segment ?sg street ?street))
  :positives ((not-delivered ?packet)
              (current-segment ?self ?sg)))

(increasing-direction (?self)
  :percepts  ((self ?self))
  :positives ((increasing ?b1 ?b2))
  :negatives ((decreasing ?b3 ?b4)))
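To make the concept format more concrete, here is a minimal Python sketch of how a definition like aligned-with-lane could be evaluated: bind the percept patterns, check the positive condition against current beliefs, then apply the numeric tests. This is not the ICARUS matcher. The helpers `unify` and `match_all`, the tuple encoding of percepts, and the sample values are all assumptions made for illustration, and the sketch assumes the tests bound the angle between -0.05 and 0.05.

```python
def unify(pattern, fact, bindings):
    """Match one pattern against one ground fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    out = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):   # "?x" marks a variable
            if p in out and out[p] != f:
                return None
            out[p] = f
        elif p != f:                                   # constants must be equal
            return None
    return out


def match_all(patterns, facts, bindings=None):
    """Yield every set of bindings that satisfies all patterns at once."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = unify(patterns[0], fact, bindings)
        if extended is not None:
            yield from match_all(patterns[1:], facts, extended)


def aligned_with_lane(percepts, beliefs):
    """Hypothetical evaluator mirroring the aligned-with-lane definition."""
    patterns = [("self", "?self"), ("lane-line", "?lane", "angle", "?angle")]
    found = []
    for b in match_all(patterns, percepts):
        in_lane = ("in-lane", b["?self"], b["?lane"]) in beliefs   # positives
        if in_lane and -0.05 < b["?angle"] < 0.05:                 # tests
            found.append(("aligned-with-lane", b["?self"], b["?lane"]))
    return found


# Made-up percepts: the agent sees two lane lines at different angles.
percepts = [("self", "me"),
            ("lane-line", "lane1", "angle", 0.02),
            ("lane-line", "lane2", "angle", 0.4)]
beliefs = {("in-lane", "me", "lane1")}
print(aligned_with_lane(percepts, beliefs))
```

Only lane1 yields a belief here, because it is the only lane the agent is believed to be in and its angle falls inside the tested range.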
6 Organization of Long-Term Memory
ICARUS interleaves its long-term memories for
concepts and skills.
For example, the skill highlighted here refers
directly to the highlighted concepts.
7 Skills from In-City Driving Domain
(turn-around-on-street (?self ?packet)
  :percepts ((self ?self segment ?segment direction ?dir)
             (building ?landmark))
  :start    ((on-street-wrong-direction ?packet))
  :effects  ((on-street-right-direction ?packet))
  :ordered  ((get-in-U-turn-lane ?self)
             (prepare-for-U-turn ?self)
             (steer-for-U-turn ?self ?landmark)))

(get-aligned-in-segment (?self ?sg)
  :percepts ((lane-line ?lane angle ?angle))
  :requires ((in-lane ?self ?lane))
  :effects  ((aligned-with-lane ?self ?lane))
  :actions  ((?steer (?times ?angle 2))))

(steer-for-right-turn (?self ?int ?endsg)
  :percepts ((self ?self speed ?speed)
             (intersection ?int cross ?cross)
             (segment ?endsg street ?cross angle ?angle))
  :start    ((ready-for-right-turn ?self ?int))
  :effects  ((in-segment ?self ?endsg))
  :actions  ((?times steer 2)))
8 Basic ICARUS Processes
ICARUS matches patterns to recognize concepts and select skills.
[Figure: concept and skill hierarchies.]
Concepts are matched bottom up, starting from percepts. Skill paths are matched top down, starting from intentions.
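The two matching directions can be illustrated with a small Python sketch. It is a propositional simplification, not the ICARUS implementation: concepts and skills are ground strings rather than relational patterns, negated conditions are omitted, and the function names, the dictionary layout, and the start conditions, effects, and action names given to the three U-turn subskills are all assumptions.

```python
def infer_beliefs(percepts, concepts):
    """Bottom up: add every concept whose components hold, until nothing changes."""
    beliefs = set(percepts)
    changed = True
    while changed:
        changed = False
        for name, parts in concepts.items():
            if name not in beliefs and all(p in beliefs for p in parts):
                beliefs.add(name)
                changed = True
    return beliefs


def select_path(name, skills, beliefs):
    """Top down: descend from skill `name` to the first executable primitive skill."""
    skill = skills[name]
    if all(e in beliefs for e in skill["effects"]):
        return None                      # effects already hold, nothing to do
    if not all(c in beliefs for c in skill["start"]):
        return None                      # start condition is not satisfied
    if "actions" in skill:
        return [name]                    # primitive skill, can execute now
    for sub in skill["ordered"]:
        if all(e in beliefs for e in skills[sub]["effects"]):
            continue                     # this subskill is already accomplished
        path = select_path(sub, skills, beliefs)
        return [name] + path if path else None
    return None


# Hypothetical ground concepts; the higher-level one is listed first on purpose,
# so inference needs a second pass to derive it.
concepts = {"aligned-with-lane": ["in-lane", "small-lane-angle"],
            "in-lane": ["self-in-lane"]}
driving_beliefs = infer_beliefs({"self-in-lane", "small-lane-angle"}, concepts)

# Hypothetical ground skills loosely based on turn-around-on-street.
skills = {
    "turn-around-on-street": {
        "start": ["on-street-wrong-direction"],
        "effects": ["on-street-right-direction"],
        "ordered": ["get-in-U-turn-lane", "prepare-for-U-turn", "steer-for-U-turn"]},
    "get-in-U-turn-lane": {
        "start": [], "effects": ["in-U-turn-lane"], "actions": ["steer-left"]},
    "prepare-for-U-turn": {
        "start": ["in-U-turn-lane"], "effects": ["ready-for-U-turn"],
        "actions": ["slow-down"]},
    "steer-for-U-turn": {
        "start": ["ready-for-U-turn"], "effects": ["on-street-right-direction"],
        "actions": ["steer-hard-left"]},
}
turn_beliefs = {"on-street-wrong-direction", "in-U-turn-lane"}
path = select_path("turn-around-on-street", skills, turn_beliefs)
print(sorted(driving_beliefs))
print(path)
```

In this example the agent is already in the U-turn lane, so the selected path skips the first subskill and ends at prepare-for-U-turn.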
9 A Trace of Means-Ends Problem Solving
An impasse causes ICARUS to invoke a means-ends
problem solver.
The resulting traces provide the material for
learning new relational skills and concepts in
terms of simpler components.
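As a rough illustration of the kind of trace such a problem solver produces, the Python sketch below performs a simple backward-chaining search: to reach a goal it picks a skill whose effects include the goal, recursively achieves that skill's unmet conditions, and records the skills it applies. It is a sketch under strong assumptions: it chains only over skills (not over concept definitions), treats states as sets of ground strings, ignores delete effects, and uses hypothetical conditions and effects for the three U-turn skills.

```python
def means_ends(goal, state, skills, depth=10):
    """Return the list of skill names applied to reach `goal`, or None.

    `state` is a set of ground facts and is updated in place on success.
    """
    if goal in state:
        return []
    if depth == 0:
        return None
    for name, (conditions, effects) in skills.items():
        if goal not in effects:
            continue
        saved = set(state)
        trace = []
        solved = True
        for cond in conditions:          # subgoal on each unmet condition
            sub = means_ends(cond, state, skills, depth - 1)
            if sub is None:
                solved = False
                break
            trace += sub
        if solved and all(c in state for c in conditions):
            state |= effects             # apply the skill
            return trace + [name]
        state.clear()                    # undo partial progress, try another skill
        state |= saved
    return None


# Hypothetical skills: name -> (conditions, effects).
skills = {
    "get-in-U-turn-lane": ((), {"in-U-turn-lane"}),
    "prepare-for-U-turn": (("in-U-turn-lane",), {"ready-for-U-turn"}),
    "steer-for-U-turn": (("ready-for-U-turn",), {"on-street-right-direction"}),
}
state = {"on-street-wrong-direction"}
plan = means_ends("on-street-right-direction", state, skills)
print(plan)
```

The returned list is the raw material a learner could later package into a reusable higher-level skill.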
10 Learning Skills from Means-Ends Traces
ICARUS learns skills for ordering subgoals from
concept chaining.
11 Learning Skills from Means-Ends Traces
ICARUS learns skills for ordering subskills from
skill chaining.
12 Learning Skills from Means-Ends Traces
Each level of skill learning builds upon results
from prior levels.
13 Learning Skills from Means-Ends Traces
This leads ICARUS to extend its skill hierarchy
in a cumulative way.
14 Learning Skills from Means-Ends Traces
This in turn supports transfer both within and
across problems.
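The cumulative character of this learning can be sketched in Python as follows. The sketch assumes one simple composition rule, which may differ from what ICARUS actually does: a solved subgoal becomes a new skill whose ordered subskills are the steps of the trace, whose effect is the achieved goal, and whose start condition is copied from the first step. The names `learn_skill` and `expand`, the learned skill names, and the primitive skill definitions are hypothetical.

```python
def learn_skill(name, goal, trace, skills):
    """Store a new composite skill that orders the skills used in `trace`."""
    first = skills[trace[0]]
    skills[name] = {"start": list(first["start"]),   # assumed rule: reuse first step's start
                    "effects": [goal],
                    "ordered": list(trace)}
    return name


def expand(name, skills):
    """Flatten a skill into the primitive skills it would eventually execute."""
    skill = skills[name]
    if "actions" in skill:
        return [name]
    steps = []
    for sub in skill["ordered"]:
        steps += expand(sub, skills)
    return steps


# Hypothetical primitive skills.
skills = {
    "get-in-U-turn-lane": {
        "start": [], "effects": ["in-U-turn-lane"], "actions": ["steer-left"]},
    "prepare-for-U-turn": {
        "start": ["in-U-turn-lane"], "effects": ["ready-for-U-turn"],
        "actions": ["slow-down"]},
    "steer-for-U-turn": {
        "start": ["ready-for-U-turn"], "effects": ["on-street-right-direction"],
        "actions": ["steer-hard-left"]},
}

# First level: package the steps that achieved an intermediate subgoal.
learn_skill("get-ready-for-U-turn", "ready-for-U-turn",
            ["get-in-U-turn-lane", "prepare-for-U-turn"], skills)

# Second level: the new skill is itself a component of a higher-level skill.
learn_skill("turn-around", "on-street-right-direction",
            ["get-ready-for-U-turn", "steer-for-U-turn"], skills)

print(skills["turn-around"]["ordered"])
print(expand("turn-around", skills))
```

Because the second learned skill refers to the first by name, any later problem that needs the intermediate subgoal can reuse it directly, which is one way to picture transfer within and across problems.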
15 Transfer Results in FreeCell
FreeCell is a complex solitaire game in which all
cards are visible.
We let ICARUS practice on versions with a small
set of cards, then examined its transfer to
problems with more cards.
16 Transfer Results in FreeCell
Experiments revealed substantial transfer to the
harder problems.
This held both for the percentage of problems
solved and for the effort required on successful
attempts.
17 Directions for Future Research
Our initial results suggest ICARUS can transfer knowledge learned on simple problems to complex ones from the same domain. In future work, we intend to examine the additional issues of
- vertical transfer to domains that utilize others as components
- horizontal transfer to domains that share knowledge elements
- horizontal transfer to tasks that require representation mapping.
The final problem is a key challenge in developing robust methods for reusing learned knowledge. We hope to evaluate our ideas on both action-oriented domains like strategy games and inferential tasks like physics problems.
18 The General Game-Playing Testbed
Genesereth and Love (2005) have developed a framework that
- supports a wide variety of N-person games
- describes each game setting in a standard logical formalism
- specifies the rules of each game in a related formalism
- manages matches between players and records activities
- provides sample games for debugging candidate systems.
They have designed this framework to encourage research on general approaches to intelligent behavior. However, it also provides an excellent testbed for evaluating the ability of learning systems to transfer within and across domains. See http://games.stanford.edu for more details and examples.