Title: Human Directability of Agents
1. Human Directability of Agents
- Karen Myers, David Morley
- myers, morley_at_ai.sri.com
- AI Center
- SRI International
2. True Confessions
I am not a Machine Learning Person
- Why am I here?
- Directing agents: learning by being told
- Critical need for learning technology to develop real-world agent applications
3. Agents Everywhere!
SoftBots
4. Current Practice
Interaction Spectrum
Teleoperation: human makes all decisions (Ex: internet agents, UCAVs)
- ✓ Acts according to human preferences
- ✓ Little knowledge modeling needed
- X Human bears cognitive load
Fully Autonomous: agent makes all decisions (Ex: mobile robots)
- X Little human influence
- X Must encode all expertise
- ✓ Low human cognitive load
- Objective: mixed-initiative directability of agents by a human supervisor
- Delegation without loss of control
5. Supervised Autonomy
- Scope of applicability
- Agents capable of fully autonomous operation
- Want agents to be mostly autonomous
- Human influence would improve performance
- Humans want to customize agent operations
- Approach
- Dynamic guidance for management of agents
- Strategy Preference
- Adjustable Autonomy
6. Disaster Relief Intel Management
TRAC
MAPLESIM
7. BDI Agent Model (a la PRS)
[Figure: BDI agent architecture with components User, Plan Library, Tasks, Executor, Intentions, Beliefs, World]
8. Strategy Preference
- Strategy: how to make decisions
- Assumption: agents have a library of parameterized plans
- Approach: guidance defines policies on plan selection and parameter instantiation
Example: Only use helicopters for survey tasks in sectors more than 200 miles from base.
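The helicopter example can be read as a policy on parameter instantiation. The sketch below is a minimal illustration under one assumed reading (helicopters are reserved for distant sectors); `SurveyTask`, `allowed_vehicles`, and the vehicle names are hypothetical and do not come from the TRAC system.

```python
from dataclasses import dataclass


@dataclass
class SurveyTask:
    sector: str
    miles_from_base: float


def allowed_vehicles(task, vehicles, threshold_miles=200.0):
    """Apply the helicopter guidance to the vehicle parameter of a survey plan.

    Assumed reading: helicopters are reserved for sectors farther than the
    threshold, so they are removed from the options for nearer sectors.
    """
    if task.miles_from_base > threshold_miles:
        return list(vehicles)
    return [v for v in vehicles if v != "helicopter"]


fleet = ["truck", "helicopter"]
near = allowed_vehicles(SurveyTask("sector-A", 50.0), fleet)   # helicopter filtered out
far = allowed_vehicles(SurveyTask("sector-B", 250.0), fleet)   # all vehicles remain
```

The guidance only narrows the values the agent may bind to the parameter; the agent still makes the final choice among what remains.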
9. Adjustable Autonomy
- Autonomy: degree to which agent makes its own decisions
- Assumption: agents capable of full autonomy
- Approach: guidance restricts space of agent decisions
Permission Requirements: gating conditions on actions
"Obtain permission before abandoning survey tasks with Priority > 3"
Consultation Requirements: deferred choice
"Consult me when selecting locations for evacuation sites."
10. Guidance Foundations
- Language for expressing guidance
- Belief-Desire-Intention (BDI) Model of Agency
- FOL
- Domain Metatheory
- Formal Semantics
- Guidance-compatible execution
- Enforcement Methods
- Operationalization within BDI interpreter loop
11. Domain Metatheory
- Base-level Agent Theory
- Individuals
- Relations modeling the world, internal agent state
- Tasks
- Plans
- Domain Metatheory
- Captures high-level, distinguishing attributes of plans, tasks
- Features, Roles
12. Example Domain Metatheory
- Feature: distinguishing attribute of a plan/task
- Plans for task MOVE(Obj1 Place1 Place2)
- Move-by-Land-Opr: LAND
- Move-by-Sea-Opr: SEA
- Move-by-Air-Opr: AIR
- Role: capacity in which a variable is used
- Origin: Place.1, Destination: Place.2
- Key idea: abstraction over individual plans, tasks
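One way to picture the metatheory is as annotations attached to each plan in the library. The sketch below is an illustration only: the `Plan` class and `plans_with_feature` helper are hypothetical names, and only the operator names, features, and roles are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """A base-level plan annotated with metatheory attributes (hypothetical structure)."""
    name: str
    task: str
    features: set = field(default_factory=set)   # distinguishing attributes, e.g. {"LAND"}
    roles: dict = field(default_factory=dict)    # role name -> plan variable


MOVE_ROLES = {"Origin": "Place.1", "Destination": "Place.2"}

PLAN_LIBRARY = [
    Plan("Move-by-Land-Opr", "MOVE", {"LAND"}, dict(MOVE_ROLES)),
    Plan("Move-by-Sea-Opr", "MOVE", {"SEA"}, dict(MOVE_ROLES)),
    Plan("Move-by-Air-Opr", "MOVE", {"AIR"}, dict(MOVE_ROLES)),
]


def plans_with_feature(library, feature):
    """Names of all plans carrying the feature, regardless of plan internals."""
    return [p.name for p in library if feature in p.features]


air_plans = plans_with_feature(PLAN_LIBRARY, "AIR")
```

Guidance can then refer to "AIR plans" or "the Destination of a move" without naming any individual plan or variable.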
13. Guidance Components
- Use domain metatheory to define abstract classes of plans, goals, and agent state
- Activity specification
- Desire specification
- Agent context
14. Activity Specification
- Abstract characterization of a class of activities
- Defined in terms of
- Features required/prohibited
- Constraints on role values
Example: Abandon a survey task
Features: Abandon
Roles: Current-Task
Role Constraints: (= (TASK-TYPE Current-Task) SURVEY)
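Matching a plan against an activity specification can be sketched as a feature check followed by role-constraint checks. This is a minimal illustration, assuming plans expose a feature set and role bindings; the `ActivitySpec` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ActivitySpec:
    required: frozenset = frozenset()      # features a plan must have
    prohibited: frozenset = frozenset()    # features a plan must not have
    roles: tuple = ()                      # roles the plan must bind
    constraints: list = field(default_factory=list)  # predicates over role bindings

    def matches(self, features, bindings):
        features = set(features)
        if not self.required <= features:
            return False
        if self.prohibited & features:
            return False
        if any(role not in bindings for role in self.roles):
            return False
        return all(check(bindings) for check in self.constraints)


# "Abandon a survey task" from the slide, with tasks modeled as plain dicts.
abandon_survey = ActivitySpec(
    required=frozenset({"ABANDON"}),
    roles=("Current-Task",),
    constraints=[lambda b: b["Current-Task"]["type"] == "SURVEY"],
)

hit = abandon_survey.matches({"ABANDON"}, {"Current-Task": {"type": "SURVEY"}})
miss = abandon_survey.matches({"ABANDON"}, {"Current-Task": {"type": "RESCUE"}})
```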
15. Desire Specification
- Abstract characterization of a class of desires
- Defined/used similarly to Activity Specification
16. Agent Context
- Describes an operational state of agent
Example: Performing a communication plan for a Survey task within 10 miles of the Base
Beliefs: (< (Distance (Current-Position) Base) 10)
Desires: Features: Survey
Intentions: Features: Communication
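An agent context can be checked against the three parts of BDI state in turn. The sketch below assumes a simplified state (beliefs as a dict, desires and intentions as lists of feature sets); `AgentState`, `AgentContext`, and the `distance_to_base` key are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    beliefs: dict        # e.g. {"distance_to_base": 7.5}
    desires: list        # each desire is a set of features
    intentions: list     # each intention is a set of features


@dataclass
class AgentContext:
    belief_tests: list = field(default_factory=list)   # predicates over beliefs
    desire_features: frozenset = frozenset()
    intention_features: frozenset = frozenset()

    def holds(self, state):
        beliefs_ok = all(test(state.beliefs) for test in self.belief_tests)
        desires_ok = (not self.desire_features
                      or any(self.desire_features <= set(d) for d in state.desires))
        intentions_ok = (not self.intention_features
                         or any(self.intention_features <= set(i) for i in state.intentions))
        return beliefs_ok and desires_ok and intentions_ok


# The slide's example: communicating for a survey task within 10 miles of base.
near_base_comm = AgentContext(
    belief_tests=[lambda b: b["distance_to_base"] < 10],
    desire_features=frozenset({"SURVEY"}),
    intention_features=frozenset({"COMMUNICATION"}),
)

close = AgentState({"distance_to_base": 7.5}, [{"SURVEY"}], [{"COMMUNICATION"}])
distant = AgentState({"distance_to_base": 12.0}, [{"SURVEY"}], [{"COMMUNICATION"}])
```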
17. Permission Requirement
- Definition: <agent-context, activity-specification>
- Semantics: when in the context, permission is required to adopt plans that match the activity specification
Ex: Seek permission to abandon survey tasks with priority > 5
Agent Context
  Intentions: Features: SURVEY-TASK
Activity-Spec
  Features: ABANDON
  Roles: Current-Task
  Role Constraints: (> (Task-Priority Current-Task) 5)
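The permission semantics can be sketched as a gate in front of plan adoption. This is an illustrative sketch only: the context and activity specification are collapsed into plain predicates, and `PermissionRequirement`, `may_adopt`, and the dict keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PermissionRequirement:
    in_context: Callable[[dict], bool]         # agent state -> does the context hold?
    matches_activity: Callable[[dict], bool]   # candidate plan -> matches the spec?


def may_adopt(plan, state, requirements, ask_supervisor):
    """Return True if the plan may be adopted, asking the supervisor when required."""
    for req in requirements:
        if req.in_context(state) and req.matches_activity(plan):
            if not ask_supervisor(plan):
                return False
    return True


# "Seek permission to abandon survey tasks with priority > 5"
abandon_rule = PermissionRequirement(
    in_context=lambda s: "SURVEY-TASK" in s["intention_features"],
    matches_activity=lambda p: "ABANDON" in p["features"] and p["current_task_priority"] > 5,
)

state = {"intention_features": {"SURVEY-TASK"}}
asked = []


def deny_all(plan):
    """Stand-in supervisor that records each request and refuses it."""
    asked.append(plan["name"])
    return False


high = {"name": "abandon-high", "features": {"ABANDON"}, "current_task_priority": 7}
low = {"name": "abandon-low", "features": {"ABANDON"}, "current_task_priority": 3}
high_ok = may_adopt(high, state, [abandon_rule], deny_all)
low_ok = may_adopt(low, state, [abandon_rule], deny_all)
```

Only the high-priority abandonment reaches the supervisor; the low-priority one stays within the agent's own authority.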
18. Consultation Requirement
- Definition: <agent-context, role>
- Semantics: when in the context, consult the supervisor when there are options for the designated role
Ex: When responding to medical emergencies, consult when selecting MedEvac facilities.
Agent Context
  Intentions: Features: Medical-Emergency, Response
Role: MedEvac-Facility
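A consultation requirement defers one role binding to the supervisor. The sketch below is a minimal illustration; `ConsultationRequirement`, `choose_role_value`, the facility names, and the "first option" default are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConsultationRequirement:
    in_context: Callable[[dict], bool]   # agent state -> does the context hold?
    role: str                            # role whose value is deferred


def choose_role_value(role, options, state, requirements, consult):
    """Bind a role, deferring to the supervisor when a requirement applies."""
    must_consult = any(r.role == role and r.in_context(state) for r in requirements)
    if must_consult and len(options) > 1:
        return consult(role, options)
    return options[0]   # placeholder default: the agent takes its first option


medevac_rule = ConsultationRequirement(
    in_context=lambda s: {"MEDICAL-EMERGENCY", "RESPONSE"} <= s["intention_features"],
    role="MedEvac-Facility",
)

state = {"intention_features": {"MEDICAL-EMERGENCY", "RESPONSE"}}
supervisor = lambda role, options: options[-1]   # stand-in for a real user prompt

facility = choose_role_value(
    "MedEvac-Facility", ["clinic-1", "hospital-2"], state, [medevac_rule], supervisor)
route = choose_role_value(
    "Route", ["road-a", "road-b"], state, [medevac_rule], supervisor)
```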
19. Strategy Preference
- Definition: <agent-context, activity-specification>
- Semantics: when in the context, plans matching activity specification should be preferred
Ex: Respond to rescue emergencies involving more than 10 people when the severity exceeds the current task priority.
Agent Context
  Features: Emergency, Response
  Roles: Current-Task, Severity, Number
  Role Constraints: (AND (> Number 10) (> Severity (TASK-PRIORITY Current-Task)))
Activity Specification
  Features: ADOPT
  Roles: New-Task
  Constraints: (= (TASK-PRIORITY New-Task) ESEVERITY.1)
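A single strategy preference can be sketched as "prefer matching plans when the context holds, otherwise leave the options alone". The function and key names below are hypothetical, and falling back to all applicable plans when nothing matches is an assumption of this sketch rather than something the slide states.

```python
def preferred_plans(applicable, state, in_context, matches_activity):
    """Narrow the applicable plans to those the preference favors, if any."""
    if not in_context(state):
        return list(applicable)
    preferred = [p for p in applicable if matches_activity(p, state)]
    return preferred or list(applicable)   # assumed fallback: keep all plans


def emergency_context(s):
    # Emergency response involving more than 10 people, severity above current priority.
    return ({"EMERGENCY", "RESPONSE"} <= s["features"]
            and s["number"] > 10
            and s["severity"] > s["current_task_priority"])


def adopt_at_severity(p, s):
    # Plans that adopt a new task whose priority equals the emergency severity.
    return "ADOPT" in p["features"] and p.get("new_task_priority") == s["severity"]


plans = [
    {"name": "continue-current-task", "features": {"CONTINUE"}},
    {"name": "adopt-rescue", "features": {"ADOPT"}, "new_task_priority": 8},
]
big = {"features": {"EMERGENCY", "RESPONSE"}, "number": 12,
       "severity": 8, "current_task_priority": 5}
small = dict(big, number=4)

big_choice = [p["name"] for p in preferred_plans(plans, big, emergency_context, adopt_at_severity)]
small_choice = [p["name"] for p in preferred_plans(plans, small, emergency_context, adopt_at_severity)]
```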
20. Guidance Interface Tools
21. Guidance Enforcement
- Simple semantics: guidance as filters on applicable plans
- Enforcement
- Simple extension to BDI executor
- Modify plan selection step to incorporate
- Filtering of plans with respect to guidance constraints
- User consultation
[Figure: Filter-based Semantics; applicable plans P1 to P5 partitioned into Good and Bad]
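The filter-based semantics amounts to one extra step inside plan selection. The sketch below is a minimal illustration, not the PRS or TRAC implementation; `select_plan`, the plan dicts, and the example filters are hypothetical.

```python
def select_plan(applicable, filters, choose=lambda plans: plans[0]):
    """Plan selection step extended with guidance filtering.

    `filters` are predicates derived from guidance; a plan is "good" only if
    every filter accepts it. `choose` stands in for the executor's usual
    choice (or a user consultation) among the surviving plans.
    """
    good = [p for p in applicable if all(accepts(p) for accepts in filters)]
    return choose(good) if good else None


plans = [
    {"name": "P1", "features": {"AIR"}},
    {"name": "P2", "features": {"LAND"}},
    {"name": "P3", "features": {"SEA"}},
]
no_air = lambda p: "AIR" not in p["features"]
only_air = lambda p: "AIR" in p["features"]

picked = select_plan(plans, [no_air])
stuck = select_plan(plans, [no_air, only_air])   # contradictory guidance leaves nothing
```

The second call shows the weakness of pure filtering: contradictory guidance rejects every plan.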
22. Guidance Conflicts (1)
- A. Plan Selection: guidance yields contradictory suggestions
- Execute Plan P / Don't execute Plan P
- Solution
- Rank applicable plans according to guidance satisfaction
- Select higher-ranked plan(s) when there is a conflict
[Figure: Filter-based Semantics (plans P1 to P5 partitioned into Good and Bad) compared with Prioritized Semantics (the same plans ordered by a ranking)]
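Ranking by guidance satisfaction can be sketched by scoring each plan instead of filtering it. Counting satisfied guidance items is only one possible scoring, assumed here for illustration; `rank_plans` and the example guidance are hypothetical.

```python
def rank_plans(applicable, guidance):
    """Order plans by how many guidance items they satisfy (highest first).

    Python's sort is stable, so plans with equal scores keep their
    original relative order.
    """
    def score(plan):
        return sum(1 for satisfied in guidance if satisfied(plan))
    return sorted(applicable, key=score, reverse=True)


plans = [
    {"name": "P2", "features": {"LAND"}},
    {"name": "P3", "features": {"AIR"}},
    {"name": "P1", "features": {"AIR", "FAST"}},
]
guidance = [
    lambda p: "AIR" in p["features"],       # "use air transport"
    lambda p: "AIR" not in p["features"],   # "do not use air transport" (conflicts)
    lambda p: "FAST" in p["features"],      # "prefer fast plans"
]

ranking = [p["name"] for p in rank_plans(plans, guidance)]
```

No plan satisfies all three items, so a filter would reject everything; the ranking still yields a best candidate.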
23. Guidance Conflicts (2)
- B. Situated Conflict: prior activities block guidance application
- Guidance would recommend a response to an emergency but required resources are unavailable
- Solution
- Expand the set of candidate plans proactively
- Resolution plans: delay current task to obtain required resource
[Figure: Filter-based Semantics compared with Prioritized Expansion Semantics; the ranking over plans P1 to P5 is extended with additional plans P6 to P8]
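Proactive expansion can be sketched as adding, for each blocked plan, a candidate that first runs a resolution plan. This is an illustration under simplifying assumptions: candidates are tuples of plan names, each blocked plan is missing a single resource, and all names are hypothetical.

```python
def expand_candidates(applicable, blocked, resolution_for):
    """Add candidates that pair a resolution plan with each blocked plan.

    applicable: plan names that can run now.
    blocked: (plan name, missing resource) pairs that guidance would recommend.
    resolution_for: resource -> name of a plan that frees that resource.
    """
    candidates = [(name,) for name in applicable]
    for plan, missing in blocked:
        fix = resolution_for.get(missing)
        if fix is not None:
            candidates.append((fix, plan))
    return candidates


def rank(candidates, score):
    """Order candidates by score, highest first (stable for ties)."""
    return sorted(candidates, key=score, reverse=True)


candidates = expand_candidates(
    applicable=["continue-survey"],
    blocked=[("respond-to-emergency", "helicopter")],
    resolution_for={"helicopter": "delay-current-task"},
)
# Assumed guidance score: favor candidates that respond to the emergency.
best = rank(candidates, lambda c: 1 if "respond-to-emergency" in c else 0)[0]
```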
24. Related Work
- Deontic logics
- Obligation, permission, authority modalities
- Mostly formal rather than practical
- Policy-based systems management
- Incorporating deontic concepts for runtime definition of behaviors
- Sets authority parameters for components
- Adjustable Autonomy
- Electric Elves: MDP-based approach for consultation
25. Summary
- Technical Contributions
- Language, semantics, enforcement techniques for agent guidance
- Form of learning by being told (limited to control rather than core knowledge)
- Benefits
- Combines capabilities of humans and agents
- Adapts to dynamic user preferences
- Reduced knowledge modeling effort
- Status
- TRAC implementation on top of PRS; reimplementation in SPARK
26. CALO: Cognitive Assistant that Learns and Organizes
- Develop an intelligent personal assistant for a high-level knowledge worker
- Large project encompassing 20 different research organizations in the US, led by SRI
- Integrated Learning as a key theme
27. CALO Task Manager
[Figure: Task Manager on a timeline centered on Now, with functions Introspect, Interact, Plan, Act, Notice, Anticipate]
- Capabilities
- Perform tasks on behalf of the user (reactively, proactively)
- Manage user commitments (time, workload)
- Keep the user informed
- Coordinate interactions with other CALOs
28. The Need for Integrated Learning
- Capabilities
- User customization
- Extending/modifying procedural knowledge
- Performance improvement
- Setting
- Learning unobtrusively
- Learning from small number of cases (for some things)
- Mixed-initiative setting
29. Learning in the Task Manager (Current)
- Learning by Being Told
- Human Guidance for Agents (Myers, Morley)
- Interactive Acquisition/Modification of Procedures (Blythe)
- Preference Learning for Email Management (Gervasio)
- Folder and priority prediction
- Preference Learning for Calendar Management (Gervasio)
- Schedule evaluation functions
- Reinforcement Learning for Reminder Customization (Pollack)
- Query Relaxation via online data mining (Muslea)
- Mine small subset of solution space for rules that relate domain attributes; use the rules to relax query constraints
30. Learning Procedural Knowledge
- Programming by demonstration
- Calendar Manager: how to arrange meetings of different types
- Observe sequence of actions from meeting initiation to actual meeting
- Failure-driven learning: procedure adaptation (automated, mixed-initiative)
- Adapt/extend predefined core of procedures to handle a broader set of tasks, improve robustness
- User and agent explore high-dimensional traces of failed tasks