Human Directability of Agents - PowerPoint PPT Presentation

About This Presentation

Title:

Human Directability of Agents

Slides: 31

Provided by: davidm103

Learn more at: http://www.isle.org

Transcript and Presenter's Notes

Title: Human Directability of Agents

1
Human Directability of Agents

Karen Myers, David Morley
{myers, morley}@ai.sri.com
AI Center
SRI International

2
True Confessions
I am not a Machine Learning Person

Why am I here?
Directing Agents: learning by being told
Critical need for learning technology to develop real-world agent applications

3
Agents Everywhere!
SoftBots
4
Current Practice
Interaction Spectrum
Teleoperation: Human makes all decisions. Ex: internet agents, UCAVs
✓ Acts according to human preferences
✓ Little knowledge modeling needed
X Human bears cognitive load

Fully Autonomous: Agent makes all decisions. Ex: mobile robots
X Little human influence
X Must encode all expertise
✓ Low human cognitive load

Objective: mixed-initiative directability of agents by a human supervisor
Delegation without loss of control

5
Supervised Autonomy

Scope of applicability
Agents capable of fully autonomous operation
Want agents to be mostly autonomous
Human influence would improve performance
Humans want to customize agent operations
Approach
Dynamic guidance for management of agents
Strategy Preference
Adjustable Autonomy

6
Disaster Relief Intel Management
TRAC
MAPLESIM
7
BDI Agent Model (a la PRS)
User
Plan Library
Tasks
Executor
Intentions
Beliefs
World
8
Strategy Preference

Strategy: how to make decisions
Assumption: agents have a library of parameterized plans
Approach: guidance defines policies on plan selection, parameter instantiation

Example: Only use helicopters for survey tasks in sectors more than 200 miles from base.
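The helicopter example can be read as a policy over candidate parameter instantiations. The following is a minimal sketch, assuming a hypothetical `Candidate` record with a vehicle, a task type, and a sector distance; none of these names come from the presentation or from any agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical way to instantiate a plan's parameters."""
    vehicle: str            # e.g. "helicopter", "truck"
    task_type: str          # e.g. "survey", "rescue"
    miles_from_base: float  # distance of the target sector from base


def helicopter_policy(c: Candidate) -> bool:
    """Only use helicopters for survey tasks in sectors > 200 miles from base."""
    if c.vehicle != "helicopter":
        return True  # the policy only constrains helicopter use
    return c.task_type == "survey" and c.miles_from_base > 200


def allowed(candidates, policy=helicopter_policy):
    """Keep the candidate instantiations that the policy permits."""
    return [c for c in candidates if policy(c)]
```

Here the guidance never picks a plan itself; it only narrows the instantiations the agent may choose from.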
9
Adjustable Autonomy

Autonomy: degree to which agent makes its own decisions
Assumption: agents capable of full autonomy
Approach: guidance restricts space of agent decisions

Permission Requirements: gating conditions on actions
Ex: Obtain permission before abandoning survey tasks with Priority > 3
Consultation Requirements: deferred choice
Ex: Consult me when selecting locations for evacuation sites.
10
Guidance Foundations

Language for expressing guidance
Belief-Desire-Intention (BDI) Model of Agency
FOL
Domain Metatheory
Formal Semantics
Guidance-compatible execution
Enforcement Methods
Operationalization within BDI interpreter loop

11
Domain Metatheory

Base-level Agent Theory
Individuals
Relations modeling the world, internal agent
state
Tasks
Plans
Domain Metatheory
Captures high-level, distinguishing attributes of
plans, tasks
Features, Roles

12
Example Domain Metatheory

Feature: distinguishing attribute of a plan/task
Plans for Task MOVE(Obj1 Place1 Place2):
Move-by-Land-Opr: LAND
Move-by-Sea-Opr: SEA
Move-by-Air-Opr: AIR
Role: capacity in which a variable is used
Origin: Place.1, Destination: Place.2
Key Idea: abstraction over individual plans, tasks
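One way to picture the metatheory is as annotations attached to each plan in the library. This is a minimal sketch, assuming a hypothetical `Plan` record; the plan names, features, and roles are the ones on the slide, while the data layout is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    name: str
    task: str
    features: set = field(default_factory=set)  # metatheory features, e.g. {"AIR"}
    roles: dict = field(default_factory=dict)   # role name -> plan variable


# Roles name the capacity in which each MOVE variable is used.
MOVE_ROLES = {"Origin": "Place.1", "Destination": "Place.2"}

PLAN_LIBRARY = [
    Plan("Move-by-Land-Opr", "MOVE", {"LAND"}, MOVE_ROLES),
    Plan("Move-by-Sea-Opr", "MOVE", {"SEA"}, MOVE_ROLES),
    Plan("Move-by-Air-Opr", "MOVE", {"AIR"}, MOVE_ROLES),
]


def plans_with_feature(library, feature):
    """Select plans by abstract feature rather than by name."""
    return [p.name for p in library if feature in p.features]
```

Guidance written against features and roles keeps working when plans are added or renamed, which is the abstraction the slide points to.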

13
Guidance Components

Use domain metatheory to define abstract classes
of plans, goals, and agent state
Activity specification
Desire specification
Agent context

14
Activity Specification

Abstract characterization of a class of
activities
Defined in terms of
Features required/prohibited
Constraints on role values

Example: Abandon a survey task
Features: Abandon
Roles: Current-Task
Role Constraints: (= (TASK-TYPE Current-Task) SURVEY)
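An activity specification can be sketched as a predicate over a plan's features and role bindings. The `ActivitySpec` class and the dictionary layout of a task below are assumptions made for illustration, not the representation used by the authors.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ActivitySpec:
    required: set = field(default_factory=set)    # features that must be present
    prohibited: set = field(default_factory=set)  # features that must be absent
    roles: tuple = ()                             # roles that must be bound
    constraint: Callable[[dict], bool] = lambda bindings: True

    def matches(self, features: set, bindings: dict) -> bool:
        """True when a plan with these features and role bindings is in the class."""
        return (self.required <= features
                and not (self.prohibited & features)
                and all(r in bindings for r in self.roles)
                and self.constraint(bindings))


# "Abandon a survey task": feature Abandon, role Current-Task of type SURVEY.
abandon_survey = ActivitySpec(
    required={"Abandon"},
    roles=("Current-Task",),
    constraint=lambda b: b["Current-Task"]["type"] == "SURVEY",
)
```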
15
Desire Specification

Abstract characterization of a class of desires
Defined/used similarly to Activity Specification

16
Agent Context

Describes an operational state of agent

Example: Performing a communication plan for a Survey task within 10 miles of the Base
Beliefs: (< (Distance (Current-Position) Base) 10)
Desires: Features: Survey
Intentions: Features: Communication
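The agent context in this example combines a test on beliefs with feature requirements on desires and intentions. A minimal sketch follows, assuming hypothetical `AgentState` and `AgentContext` records and a `distance_to_base` belief key that stands in for the Distance expression.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    beliefs: dict            # e.g. {"distance_to_base": 4.0}
    desire_features: set     # features of the agent's current desires (tasks)
    intention_features: set  # features of the plans being executed


@dataclass
class AgentContext:
    belief_test: Callable[[dict], bool]
    desire_features: set = field(default_factory=set)
    intention_features: set = field(default_factory=set)

    def holds(self, s: AgentState) -> bool:
        """True when the agent's operational state matches this context."""
        return (self.belief_test(s.beliefs)
                and self.desire_features <= s.desire_features
                and self.intention_features <= s.intention_features)


# Communication plan for a Survey task within 10 miles of the Base.
near_base_comms = AgentContext(
    belief_test=lambda b: b["distance_to_base"] < 10,
    desire_features={"Survey"},
    intention_features={"Communication"},
)
```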
17
Permission Requirement

Definition: <agent-context, activity-specification>
Semantics: when in the context, permission is required to adopt plans that match the activity specification

Ex: Seek permission to abandon survey tasks with priority > 5
Agent Context: Intentions: Feature: SURVEY-TASK
Activity-Spec: Features: ABANDON
Roles: Current-Task
Role Constraints: (> (Task-Priority Current-Task) 5)
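The permission semantics can be sketched as a gate in front of plan adoption. In this minimal sketch the context and activity specification are plain predicates, and `ask_supervisor` is a hypothetical callback standing in for the user interaction; the dictionary layout of states and plans is likewise assumed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PermissionRequirement:
    context: Callable[[dict], bool]   # agent-context test over the agent state
    activity: Callable[[dict], bool]  # activity-specification test over a plan

    def needs_permission(self, state: dict, plan: dict) -> bool:
        return self.context(state) and self.activity(plan)


# Seek permission to abandon survey tasks with priority > 5.
abandon_rule = PermissionRequirement(
    context=lambda s: "SURVEY-TASK" in s["intention_features"],
    activity=lambda p: ("ABANDON" in p["features"]
                        and p["roles"]["Current-Task"]["priority"] > 5),
)


def may_adopt(plan, state, rules, ask_supervisor):
    """Adopt freely unless some rule applies; then defer to the supervisor."""
    if any(r.needs_permission(state, plan) for r in rules):
        return ask_supervisor(plan)
    return True
```

Plans outside the specified class are adopted without interrupting the supervisor, so the human is consulted only where the guidance asks for it.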
18
Consultation Requirement

Definition: <agent-context, role>
Semantics: when in the context, consult the supervisor when there are options for the designated role

Ex: When responding to medical emergencies, consult when selecting MedEvac facilities.
Agent Context: Intention: Features: Medical-Emergency, Response
Role: MedEvac-Facility
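A consultation requirement defers the choice of a role value to the supervisor, but only when the context holds and there is a real choice to make. The sketch below assumes a hypothetical `consult_roles` table mapping each role to the intention features of its context, and an `ask_supervisor` callback; taking the first option is a placeholder for the agent's own choice.

```python
def choose_role_value(role, options, state_features, consult_roles, ask_supervisor):
    """Pick a value for `role` from a non-empty list of options.

    consult_roles maps a role name to the set of intention features that
    defines the agent context of its consultation requirement.
    """
    context = consult_roles.get(role)
    in_context = context is not None and context <= state_features
    if in_context and len(options) > 1:
        return ask_supervisor(role, options)  # deferred choice
    return options[0]  # placeholder for the agent's own choice
```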
19
Strategy Preference

Definition: <agent-context, activity-specification>
Semantics: when in the context, plans matching the activity specification should be preferred

Ex: Respond to rescue emergencies involving more than 10 people when the severity exceeds the current task priority.
Agent Context: Features: Emergency, Response
Roles: Current-Task, Severity, Number
Role Constraints: (AND (> Number 10) (> Severity (TASK-PRIORITY Current-Task)))
Activity Specification: Features: ADOPT
Roles: New-Task
Constraints: (= (TASK-PRIORITY New-Task) ESEVERITY.1)
20
Guidance Interface Tools
21
Guidance Enforcement

Simple Semantics: guidance as filters on applicable plans
Enforcement
Simple extension to BDI executor
Modify plan selection step to incorporate
Filtering of plans with respect to guidance constraints
User consultation

[Figure: Filter-based Semantics. Plans P1 to P5 partitioned into Good and Bad.]
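Under the filter-based reading, the plan selection step keeps only the applicable plans that pass every guidance constraint. This is a minimal sketch of that step, assuming plans are opaque values and each guidance constraint is a predicate; the default `choose` function is a placeholder for the executor's usual selection rule.

```python
def select_plan(applicable, filters, choose=lambda plans: plans[0]):
    """Plan selection extended with guidance filtering.

    applicable: plans the executor considers applicable to the current task
    filters:    guidance constraints, each a predicate over a plan
    Returns None when guidance rules out every applicable plan.
    """
    good = [p for p in applicable if all(f(p) for f in filters)]
    return choose(good) if good else None
```

The `None` case is where pure filtering breaks down: contradictory guidance can leave no plan at all.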
22
Guidance Conflicts (1)

A. Plan Selection: guidance yields contradictory suggestions
Execute Plan P / Don't execute Plan P

Solution
Rank applicable plans according to guidance satisfaction
Select higher-ranked plan(s) when there is a conflict

[Figure: Filter-based Semantics compared with Prioritized Semantics. Plans P1 to P5 ranked from Good to Bad.]
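Ranking replaces the good/bad partition with an ordering. The slide does not say how guidance satisfaction is measured, so the sketch below assumes the simplest measure, the number of guidance rules a plan satisfies; this scoring is an assumption and not the authors' definition.

```python
def rank_plans(applicable, guidance):
    """Order plans by how many guidance rules they satisfy, best first.

    Ties keep their original order because Python's sort is stable.
    """
    def score(plan):
        return sum(1 for rule in guidance if rule(plan))

    return sorted(applicable, key=score, reverse=True)
```

With a ranking, conflicting guidance no longer empties the candidate set; the executor takes the highest-ranked plan instead.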
23
Guidance Conflicts (2)

B. Situated Conflict: prior activities block guidance application
Ex: Guidance would recommend a response to an emergency, but required resources are unavailable

Solution
Expand the set of candidate plans proactively
Resolution Plans: Delay current task to obtain required resource

[Figure: Filter-based Semantics compared with Prioritized Expansion Semantics. Plans P1 to P8 ranked from Good to Bad.]
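Expansion adds resolution plans for candidates that are blocked only because an earlier activity holds a resource they need. The following is a minimal sketch under strong simplifying assumptions: each plan needs exactly one resource, and a resolution plan is represented by a descriptive string. All names are hypothetical.

```python
def expand_candidates(candidates, available, holders):
    """Add resolution plans for candidates blocked on a held resource.

    candidates: list of (plan_name, required_resource) pairs
    available:  set of resources that are free now
    holders:    dict mapping a busy resource to the task currently using it
    """
    expanded = []
    for name, resource in candidates:
        if resource in available:
            expanded.append(name)
        elif resource in holders:
            # Resolution plan: delay the holding task, then run the blocked plan.
            expanded.append(f"delay({holders[resource]}) then {name}")
        # Plans whose resource is neither free nor held by a task stay excluded.
    return expanded
```

The expanded set can then be ranked like any other set of candidate plans.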
24
Related Work

Deontic logics
Obligation, permission, authority modalities
Mostly formal rather than practical
Policy-based systems management
Incorporating deontic concepts for runtime
definition of behaviors
Sets authority parameters for components
Adjustable Autonomy
Electric Elves: MDP-based approach for consultation

25
Summary

Technical Contributions
Language, semantics, enforcement techniques for
agent guidance
Form of learning by being told, limited to control rather than core knowledge
Benefits
Combines capabilities of humans and agents
Adapts to dynamic user preferences
Reduced knowledge modeling effort
Status
TRAC implementation on top of PRS
reimplementation in SPARK

26
CALO: Cognitive Assistant that Learns and Organizes

Develop an intelligent personal assistant for a
high-level knowledge worker
Large project encompassing 20 different research
organizations in the US led by SRI
Integrated Learning as a key theme

27
CALO Task Manager
[Figure: Task Manager with components Introspect, Interact, Plan, Act, Notice, Anticipate, shown against a timeline marked Now.]

Capabilities
Perform tasks on behalf of the user (reactively,
proactively)
Manage user commitments (time, workload)
Keep the user informed
Coordinate interactions with other CALOs

28
The Need for Integrated Learning

Capabilities
User customization
Extending/modifying procedural knowledge
Performance improvement
Setting
Learning unobtrusively
Learning from small number of cases (for some
things)
Mixed-initiative setting

29
Learning in the Task Manager (Current)

Learning by Being Told
Human Guidance for Agents (Myers, Morley)
Interactive Acquisition/Modification of
Procedures (Blythe)
Preference Learning for Email Management
(Gervasio)
folder and priority prediction
Preference Learning for Calendar Management
(Gervasio)
Schedule evaluation functions
Reinforcement Learning for Reminder Customization
(Pollack)
Query Relaxation via online data mining (Muslea)
mine a small subset of the solution space for rules that relate domain attributes; use the rules to relax query constraints

30
Learning Procedural Knowledge

Programming by demonstration
Calendar Manager: how to arrange meetings of different types
Observe sequence of actions from meeting initiation to actual meeting
Failure-driven learning: procedure adaptation (automated, mixed-initiative)
Adapt/extend predefined core of procedures to handle a broader set of tasks, improve robustness
User and agent explore high-dimensional traces of failed tasks