Learning through Interactive Behavior Specifications

Provided by: tolga2
Learn more at: http://lac.gmu.edu

Transcript and Presenter's Notes


1
Learning through Interactive Behavior
Specifications
  • Tolga Konik
  • CSLI, Stanford University
  • Douglas Pearson
  • Three Penny Software
  • John Laird
  • University of Michigan

2
Goal
  • Automatically generate cognitive agents
  • Reduce the cost of agent development
  • Reduce the expertise required to develop agents.

3
Domains
  • Autonomous Cognitive agents
  • Dynamic Virtual Worlds
  • Real-time decisions based on knowledge and sensed
    data
  • Soar agent architecture

4
Learning by Observation
  • Approach
  • Observe expert behavior
  • Learn to replicate it
  • Why?
  • We may want human-like agents
  • In complex domains, imitating humans may be easier
    than learning from scratch

5
Bottleneck in pure Learning by Observation
  • PROBLEM
  • You cannot observe the internal reasoning of the
    expert
  • SOLUTION
  • Ask the expert for additional information
  • Goal annotations
  • Use additional knowledge sources
  • Task domain knowledge

6
Learning by Observation
Environment
Goal annotations
Actions
Percepts
Additional Task Knowledge
Learner
7
Learning by Observation
Environment
8
Learning by Observation: Critic Mode
Environment
critic
Learner
9
One Body, Two Minds
?
?
Environment
  • How and when to switch control
  • How the expert and the agent program communicate

10
Diagrammatic Behavior Specification
Learner
11
Redux
  • Visual rule editing
  • Diagrammatic Behavior Specification

12
Goal Hierarchy
Get-item(Item)
Get-item-different-room(Item)
  • Task-Performance knowledge is represented with a
    hierarchy of durative goals.
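A goal hierarchy of this kind can be sketched as a small tree structure. This is only an illustration, not the representation used in Soar or Redux: the `Goal` class and its `add_subgoal` and `path` methods are hypothetical names, and the goal names are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Goal:
    """A durative goal: it stays active until it is terminated (hypothetical class)."""
    name: str                               # e.g. "get-item"
    args: Tuple[str, ...] = ()              # e.g. ("i3",)
    parent: Optional["Goal"] = field(default=None, repr=False, compare=False)
    subgoals: List["Goal"] = field(default_factory=list)

    def add_subgoal(self, name: str, *args: str) -> "Goal":
        """Attach a child goal that is pursued while this goal is active."""
        child = Goal(name, args, parent=self)
        self.subgoals.append(child)
        return child

    def path(self) -> List[str]:
        """Active goals from the root of the hierarchy down to this goal."""
        chain, node = [], self
        while node is not None:
            chain.append(f"{node.name}({', '.join(node.args)})")
            node = node.parent
        return list(reversed(chain))


root = Goal("get-item", ("i3",))
other_room = root.add_subgoal("get-item-different-room", "i3")
go_to = other_room.add_subgoal("go-to", "d1")
print(go_to.path())
# ['get-item(i3)', 'get-item-different-room(i3)', 'go-to(d1)']
```

Keeping the parent link makes it easy to read off the whole stack of active goals at any decision point.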

13
Goal Hierarchy
Get-item(i3)
Item = i3
Get-item-in-room(Item)
Goto-next-room
Get-item-different-room(Item)
Get-item-in-room(i3)
Go-to(Door)
14
Goal Hierarchy
Get-item(i3)
Item = i3
Get-item-different-room(Item)
Get-item-different-room(i3)
Get-item-in-room(Item)
Go-to(Door)
Go-to(d1)
Door = d1
15
Goal Hierarchy
i3
Get-item(i3)
Get-item-in-room(Item)
Get-item-different-room(i3)
Door = d1
16
Goal Hierarchy
i3
Get-item(i3)
Get-item-in-room(Item)
Get-item-different-room(i3)
Door = d3
17
Behavior Specification
  • Expert draws initial abstract situation
  • Creates a scenario by selecting actions

18
Goal Specification
  • Goals are explicitly selected
  • The agent contributes based on the current
    situation, current goal and its knowledge

19
Switching Roles
  • Expert generates behavior if the agent doesn't
    know how to pursue the current goal
  • Agent may propose goals, subgoals and actions
  • If the agent is correct, the expert observes and
    validates
  • Otherwise rejects, corrects, or takes over
  • Key to the interaction: shared goals and shared
    assumptions about the current situation
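The role switching described above can be sketched as a single decision step. This is a minimal sketch under stated assumptions, not the Redux implementation: `Decision`, `next_decision`, and the three callback parameters are hypothetical names, and the expert's verdict is assumed to be a pair of a string ("accept" or anything else) and an optional correction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    step: str                       # goal, subgoal, or action that is carried out
    source: str                     # "agent" or "expert"
    rejected: Optional[str] = None  # agent proposal the expert turned down, if any


def next_decision(situation, goal, agent_propose, expert_review, expert_act) -> Decision:
    """One step of the shared-control loop (hypothetical protocol)."""
    proposal = agent_propose(situation, goal)
    if proposal is None:
        # The agent does not know how to pursue the goal: the expert generates behavior.
        return Decision(expert_act(situation, goal), "expert")
    verdict, correction = expert_review(situation, goal, proposal)
    if verdict == "accept":
        # The agent is correct: the expert only observes and validates.
        return Decision(proposal, "agent")
    # Otherwise the expert corrects the proposal or takes over entirely.
    step = correction if correction is not None else expert_act(situation, goal)
    return Decision(step, "expert", rejected=proposal)


decision = next_decision(
    "s0", "get-item(i3)",
    agent_propose=lambda s, g: "go-to(d2)",
    expert_review=lambda s, g, p: ("reject", "go-to(d1)"),
    expert_act=lambda s, g: "go-to(d1)",
)
print(decision)
```

Recording the rejected proposal alongside the chosen step is a design choice made here so that both kinds of evidence remain available to a learner.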

20
Goal Hierarchy
  • Learning by Observation perspective
  • Unobservable mental reasoning of the expert
  • Learning Perspective
  • Bias hypothesis space
  • The problem of learning an agent is reduced to
    learning goal selection and termination
  • Mixed-initiative (MI) perspective
  • information exchange between the expert and the
    agent

21
Relevant Knowledge Specification
Prepare food
  • Expert can mark important objects in a decision

22
Rich Behavior Trace
  • Expert-specified undesired actions and goals
  • Expert-rejected actions and goals of the
    approximately learned agent program

Watch TV
23
Rich Behavior Trace
  • Hypothetical Actions and Goals
  • Situation history: a tree structure of possible
    behaviors

24
Relational Learning by Observation
  • Input
  • Relational Situations
  • Goal and action selections and rejections
  • Additional annotations (e.g. important objects)
  • Background knowledge
  • Output
  • Rule based agent program
  • Learn goal/action selection/termination
  • generalizing over multiple examples
  • Inductive Logic Programming to combine rich
    knowledge structures

25
Relational Learning by Observation
26
Relational Learning by Observation
Find the common structures in the decision
examples
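One simple way to illustrate finding common structure is a Plotkin-style least general generalization over function-free ground atoms. This is a swapped-in textbook technique used only for illustration; the slides do not say which generalization procedure the learner uses. The function name `common_structure` and all relation names in the two examples are assumptions.

```python
def common_structure(example_a, example_b):
    """Generalize two sets of ground atoms into one set of atoms with variables.

    Atoms are tuples such as ("door", "d1", "r1", "r2"). Each distinct pair of
    differing constants is replaced consistently by the same variable.
    """
    variables = {}  # (constant in A, constant in B) -> variable name

    def generalize(x, y):
        if x == y:
            return x  # identical constants are kept as they are
        if (x, y) not in variables:
            variables[(x, y)] = f"V{len(variables)}"
        return variables[(x, y)]

    result = set()
    for atom_a in sorted(example_a):
        for atom_b in sorted(example_b):
            # Only atoms with the same relation name and arity can be paired.
            if atom_a[0] == atom_b[0] and len(atom_a) == len(atom_b):
                args = tuple(generalize(x, y) for x, y in zip(atom_a[1:], atom_b[1:]))
                result.add((atom_a[0],) + args)
    return result


first = {("current-room", "r1"), ("door", "d1", "r1", "r2"),
         ("goal-item", "i3"), ("in-room", "i3", "r2"), ("selected-door", "d1")}
second = {("current-room", "r3"), ("door", "d4", "r3", "r5"),
          ("goal-item", "i7"), ("in-room", "i7", "r5"), ("selected-door", "d4")}
for atom in sorted(common_structure(first, second)):
    print(atom)
```

In the result, the same variable links the selected door, the current room, and the room that holds the wanted item, which is the shared pattern behind both decisions.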
27
Relational Learning by Observation

Learn relations between what the agent wants,
perceives and knows.

Select a door in the current room, which leads
to a room that contains the item the agent wants
to get
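The door-selection rule above can be written as a query over relational facts. This is a sketch of what such a learned rule expresses, not the rule format produced by the system: the function `select_doors` and the relations `current-room(Room)`, `in-room(Item, Room)`, and `door(Door, FromRoom, ToRoom)` are assumed for illustration.

```python
def select_doors(facts, wanted_item):
    """Doors in the current room that lead to a room containing the wanted item."""
    current_rooms = {f[1] for f in facts if f[0] == "current-room"}
    item_rooms = {f[2] for f in facts if f[0] == "in-room" and f[1] == wanted_item}
    return sorted(
        f[1] for f in facts
        if f[0] == "door" and f[2] in current_rooms and f[3] in item_rooms
    )


facts = {
    ("current-room", "r1"),
    ("door", "d1", "r1", "r2"),  # d1 leads from r1 to r2
    ("door", "d2", "r1", "r3"),  # d2 leads from r1 to r3
    ("in-room", "i3", "r2"),     # item i3 is in room r2
}
print(select_doors(facts, "i3"))
# ['d1']
```

The rule ties together what the agent wants (the item), what it perceives (the current room and its doors), and what it knows (where doors lead).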
28
Comparing Redux to LBO: Advantages of Redux
  • No real time constraints on behavior
  • e.g. no waiting for a 2-hour-long goal
  • Can be used to describe unlikely but critical
    situations
  • e.g. "Let's assume that there is a nuclear
    meltdown."
  • Richer annotation opportunities
  • Increase learning speed and quality
  • Faster focus on where knowledge is most lacking
  • Immediate expert feedback on how rules behave

29
Comparing Redux to LBO: Disadvantages of Redux
  • Can't learn low-level behavior.
  • Contains domain specific components
  • Although most of Redux is domain independent
  • Generating behavior may be slower.
  • Additional annotations improve learning but
    require extra expert effort

30
Relational Behavior Trace
Behavior Trace: the set of
situations in the execution history
  • A Situation
  • a symbolic snapshot of the observed environment
    at a point in time

31
Annotated Behavior Traces



  • Behavior is annotated with actions and goals:
    goto-room(r1), etc.
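An annotated behavior trace can be sketched as a list of situations, each carrying its goal and action annotations. This is a minimal sketch with assumed names: the `Situation` fields and the helper `goal_intervals` are hypothetical, and the facts shown are made up for the example. Because goals are durative, the helper recovers the time intervals during which a goal was annotated as active, which is the kind of evidence needed to learn goal selection and termination.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Situation:
    """Symbolic snapshot of the observed environment at one point in time."""
    time: int
    facts: FrozenSet[tuple]        # ground relations, e.g. ("in-room", "agent", "r2")
    goals: Tuple[str, ...] = ()    # goal annotations active in this situation
    action: Optional[str] = None   # action annotation, if any


def goal_intervals(trace: List[Situation], goal: str) -> List[Tuple[int, int]]:
    """(start, end) times of each stretch in which `goal` is annotated as active."""
    intervals, start, previous = [], None, None
    for situation in trace:
        active = goal in situation.goals
        if active and start is None:
            start = situation.time                # the goal was just selected
        elif not active and start is not None:
            intervals.append((start, previous))   # the goal ended at the previous step
            start = None
        previous = situation.time
    if start is not None:
        intervals.append((start, previous))       # still active at the end of the trace
    return intervals


active_at = {1, 2, 4}
trace = [
    Situation(t, frozenset({("in-room", "agent", "r2")}),
              goals=("goto-room(r1)",) if t in active_at else ())
    for t in range(5)
]
print(goal_intervals(trace, "goto-room(r1)"))
# [(1, 2), (4, 4)]
```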

32
Summary
  • Diagrammatic behavior specification approach
  • To extract rich behavior knowledge
  • Interactive behavior specification
  • Communication medium between the agents (explicit
    goals and assumed situation)
  • Relational learning by observation approach to
    combine multiple complex knowledge sources

33
Future Work
  • Improve mixed initiative interaction of the
    interface
  • Explore domain independent diagrammatic interface
    features
  • Allow the expert to enter context sensitive
    knowledge

34
Mixed initiative perspective
  • Interactive behavior specification
  • Diagrammatic representation of behavior
  • communication medium between the agents
  • Explicit goals and desired behavior
  • Facilitates interaction between the agents