Title: Adaptive Agents
1 Adaptive Agents
- Diana Gordon
- Navy Center for Applied Research in AI
- Naval Research Laboratory
- May 9, 2000
2 Outline
- Agents that are adaptive, predictable and timely.
- Adaptive supervisory control of multi-agent systems.
- Epidemiological model of computer virus spread.
- Potential transitions.
3 APT Agents: Agents That Are Adaptive, Predictable, and Timely
Diana Gordon, Naval Research Laboratory
4 Agents Embedded in an Environment
automaton strategy
multi-agent automaton strategy
(simulated) environment
5 Re-Verification of APT Agents' Strategies
OFFLINE: strategy, Verification
ONLINE: strategy, Adaptation, NEW SITUATION, Rapid Re-verification, revised strategy
6 Adaptation
7 (Co-)Evolving Agent Strategies
select parents from population
population of strategies
parents
perturbation (learning) operators applied
return offspring to population
offspring
fitness
evaluation
environment
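The loop in this diagram can be sketched as follows. This is a minimal, generic evolutionary loop and not the presenters' system: the bit-list strategy representation, the `sum` fitness function (standing in for evaluation in the environment), the tournament selection, and the `flip_one_bit` perturbation operator are all assumptions made for illustration.

```python
import random

def evolve(population, fitness, perturb, generations, rng):
    """Generic loop: select parents, perturb them, return offspring to the population."""
    for _ in range(generations):
        # Select parents from the population (binary tournament on fitness).
        parents = [max(rng.sample(population, 2), key=fitness) for _ in population]
        # Apply a perturbation (learning) operator to each parent.
        offspring = [perturb(parent, rng) for parent in parents]
        # Return offspring to the population, keeping the fittest individuals.
        merged = sorted(population + offspring, key=fitness, reverse=True)
        population = merged[:len(population)]
    return population

def flip_one_bit(strategy, rng):
    """Hypothetical perturbation operator: copy the strategy and flip one bit."""
    child = list(strategy)
    index = rng.randrange(len(child))
    child[index] = 1 - child[index]
    return child

rng = random.Random(0)
initial = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]
final = evolve(initial, fitness=sum, perturb=flip_one_bit, generations=30, rng=rng)
best_before = max(sum(s) for s in initial)
best_after = max(sum(s) for s in final)
```

Because survivors are chosen from parents and offspring together, the best fitness in the population can never decrease from one generation to the next.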
8 Learning (Perturbation) Operators
- Not addressed at this time:
- Abstraction/concretion of the Boolean algebra for transition conditions (see ICML98 paper)
- Add/delete state
- Edge operators are currently being addressed:
- Add/delete edge
- Generalize/specialize transition condition
- Move transition condition
- Preserves determinism and completeness (relevant for agents)
F-deliver I-receive
F-collect I-deliver
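As an illustration of why moving a transition condition can preserve determinism and completeness, here is a minimal sketch. It assumes a simplified representation that is not taken from the slides: each state maps target states to the set of input actions that trigger the edge, and the `ACTIONS` alphabet, state names, and function names are hypothetical.

```python
ACTIONS = {"I-receive", "I-deliver", "I-pause"}  # hypothetical input alphabet

def is_deterministic(fsa):
    """True if no action labels two different outgoing edges of the same state."""
    for edges in fsa.values():
        seen = set()
        for condition in edges.values():
            if seen & condition:
                return False
            seen |= condition
    return True

def is_complete(fsa, actions):
    """True if every state has an outgoing edge for every action."""
    return all(set().union(*edges.values()) == actions for edges in fsa.values())

def move_condition(fsa, state, action, new_target):
    """Move `action` from its current edge out of `state` to the edge state -> new_target."""
    new_fsa = {s: {t: set(c) for t, c in edges.items()} for s, edges in fsa.items()}
    edges = new_fsa[state]
    for condition in edges.values():
        condition.discard(action)          # remove the action from its old edge
    edges.setdefault(new_target, set()).add(action)
    # Drop edges whose condition became empty.
    new_fsa[state] = {t: c for t, c in edges.items() if c}
    return new_fsa

fsa = {
    "COLLECT": {"COLLECT": {"I-pause"}, "DELIVER": {"I-receive", "I-deliver"}},
    "DELIVER": {"COLLECT": {"I-receive"}, "DELIVER": {"I-deliver", "I-pause"}},
}
moved = move_condition(fsa, "COLLECT", "I-deliver", "COLLECT")
```

In this representation the outgoing conditions of a state partition the action set, and a move only reassigns one action from one block of the partition to another, so both properties carry over.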
9 Predictability
10 The Need for Formal Verification
- Testing alone has the following drawbacks:
- The process of evolution with testing is slow.
- It doesn't provide strict behavioral guarantees.
- Properties that are more critical require formal verification.
- Model checking is applied after each learning occurrence.
- (Current approach: verify global Invariance and Response properties with a model checker that does a depth-first search.)
- If model checking outputs failure, either a new learning operator is selected or the strategy is repaired.
- To model check global multi-agent properties, the product automaton (multi-agent strategy) is formed.
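The two ideas on this slide, forming the product automaton and checking an Invariance property by depth-first search, can be sketched together. This is a simplified illustration under stated assumptions, not the presenters' model checker: each agent is a dict from state to a list of possible next states, agents move synchronously, and the rover names and the "never both delivering" property are hypothetical.

```python
from itertools import product

def product_successors(agents, joint_state):
    """Successors of a joint state: every combination of the agents' individual moves."""
    options = [agent[state] for agent, state in zip(agents, joint_state)]
    return [tuple(combo) for combo in product(*options)]

def check_invariance(agents, initial, holds):
    """Depth-first search over reachable product states.

    Returns a reachable state that violates `holds`, or None if the invariant holds.
    """
    stack, visited = [initial], {initial}
    while stack:
        state = stack.pop()
        if not holds(state):
            return state
        for nxt in product_successors(agents, state):
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return None

# Hypothetical property: the two rovers are never both delivering.
not_both_delivering = lambda joint: joint != ("deliver", "deliver")

rover = {"collect": ["deliver"], "deliver": ["collect"]}
learned_rover = {"collect": ["collect", "deliver"], "deliver": ["collect"]}  # after learning

before = check_invariance([rover, rover], ("collect", "deliver"), not_both_delivering)
after = check_invariance([rover, learned_rover], ("collect", "deliver"), not_both_delivering)
```

Here the original pair of strategies satisfies the invariant, while the added edge in `learned_rover` makes the violating joint state reachable, which is the kind of failure that would trigger selecting another operator or repairing the strategy.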
11 Timeliness
12 Objective: Rapid Re-verification
- Re-verification from scratch.
- Time-inefficient. If there are m actions for each of n agents, time complexity is $O(m^n)$.
- Restrict learning using a priori results.
- Safe machine learning operator (SMLO):
- $S \models P \Rightarrow o(S) \models P$
- Safety guarantee with no run-time re-verification cost!
- Incremental re-verification.
- Useful when general a priori results are negative or difficult to obtain.
- Time efficiency gained by localizing.
13 A Priori Results
- New results for Invariance and Response properties.
- For operators with negative a priori results, we need incremental re-verification.
14 Three Types of Incremental Algorithms
15 Streamline Formation of Product Automaton Knowing That Learning Occurred
Re-form product transitions affected by learning
product state
product state
Agent 1 state3
Agent 3 state1
Agent 2 state5
Re-use product transitions not affected by
learning
product state
Agent 1 state3
Agent 3 state7
Agent 2 state5
16 Incremental Re-verification That Capitalizes on Knowing That Learning Occurred
Learning applied here
Only need to re-verify downstream from the learning site
Advantage: Has been demonstrated to be more time-efficient than total re-verification on some practical problems.
Limitation: Worst-case time complexity is the same as for total re-verification.
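A minimal sketch of the "re-verify downstream only" idea for an Invariance property follows. It rests on assumptions not stated on the slide: the automaton was fully verified before learning, learning changed only the edges leaving one reachable state (the learning site), and the automaton is given as a dict from state to a list of successors. The function names and the example graph are hypothetical.

```python
def downstream(graph, start):
    """All states reachable from `start` (inclusive), found by depth-first search."""
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        for nxt in graph.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverify_invariance(graph, learning_site, holds):
    """Check the invariant only on states downstream of the learning site.

    Any path that uses a changed edge must pass through the learning site,
    so newly reachable states can only appear downstream of it.
    """
    for state in downstream(graph, learning_site):
        if not holds(state):
            return state
    return None

verified = {0: [1], 1: [2], 2: [3], 3: [4], 4: [4]}   # verified before learning
learned = dict(verified)
learned[3] = [4, 5]                                   # learning adds edge 3 -> 5
learned[5] = [5]
holds = lambda state: state != 5                      # hypothetical invariant

violation = reverify_invariance(learned, 3, holds)
checked = len(downstream(learned, 3))   # states examined incrementally
total = len(downstream(learned, 0))     # states a from-scratch check would examine
```

If the learning site is the initial state, the downstream set is the whole reachable automaton, which is why the worst case matches total re-verification.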
17 Incremental Re-verification Algorithms Specific for Generalization and Particular Property Classes (Invariance, Response)
Generalization applied here
Works because generalization has a local effect on accessibility
Only need to re-verify locally
Advantage: Typically they're extremely time-efficient.
Limitation: Overly cautious; may find false errors.
18 Evaluation of Incremental Re-verification Algorithms
- Theoretical worst-case time complexity analysis, e.g.,
- The second type of incremental algorithm saves no time over total re-verification from scratch in the worst case.
- The incremental algorithms for generalization are unaffected by the number of automaton states.
- Empirical CPU time comparisons in a natural setting:
- All incremental algorithms were faster than total re-verification, though the improvement for the second type was not statistically significant.
- The incremental algorithm for generalization and Response properties showed a 1/2-billion-fold speedup (on average) over total re-verification on 274,000-state product automata.
19 Applications
- Coordinating planetary rovers.
- Competition for resources simulation.
20 Adaptive Supervisory Control of Multi-Agent Systems
- Diana Gordon, Naval Research Laboratory
- Kiriakos Kiriakidis, U.S. Naval Academy
21 The Challenge
- Repairing errors detected by verification in a multi-agent system is a highly challenging credit assignment problem, e.g.,
- What is the source of the inter-agent conflict?
- What is the best way to resolve this conflict?
- Our solution:
- Recast the multi-agent paradigm as discrete supervisory control (see Ramadge & Wonham).
22 Our Solution: Adaptive Supervisory Control
learning
(re-)verify
AGENT 1 STRATEGY
DESIRED MULTI-AGENT BEHAVIOR
AGENT 2 STRATEGY
SUPERVISOR
AGENT 5 STRATEGY
repair if (re-)verification fails
23 Last MURI Board Meeting: Evolving Better Strategies for Resource Competition
- Diana Gordon and William Spears, Naval Research Laboratory
- Insup Lee and Oleg Sokolsky, University of Pennsylvania
24 Last MURI Board Meeting: Future Directions
- Integrate virus simulation with MaCS
- Co-evolving virus and anti-virus
- More realistic communication topologies
- Epidemiological model of virus/anti-virus spread
through a network
25 Current Progress on Virus Modeling
- Kephart & White (1991, 1993):
- Differential equations describing the time evolution of the fraction of infected nodes in a network.
- $di/dt = \beta i(1 - i) - \delta i$
- Advantage: Closed-form solutions.
- Limitation: They had difficulty solving the differential equations for complex topologies and realistic assumptions.
- Spears & Gordon:
- Extending a (computational) Markov model approach, used to analyze evolutionary algorithms, to model virus and anti-virus spread through a network.
- Advantage and limitation are the opposite of Kephart & White's.
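To make the differential-equation approach concrete, the sketch below assumes the equation has the standard logistic form di/dt = beta*i*(1 - i) - delta*i, where i is the infected fraction, beta an infection rate, and delta a cure rate. The symbol names and parameter values are assumptions, since the rate symbols were lost on the slide. It compares the closed-form solution with a simple Euler integration.

```python
import math

def infected_fraction(t, i0, beta, delta):
    """Closed-form (logistic) solution of di/dt = beta*i*(1-i) - delta*i, for beta != delta."""
    rate = beta - delta              # net growth rate
    level = 1.0 - delta / beta       # equilibrium infected fraction when beta > delta
    return level * i0 / (i0 + (level - i0) * math.exp(-rate * t))

def euler_infected_fraction(t_end, i0, beta, delta, dt=1e-3):
    """Numerical check: forward-Euler integration of the same equation."""
    i = i0
    for _ in range(round(t_end / dt)):
        i += dt * (beta * i * (1.0 - i) - delta * i)
    return i

beta, delta, i0 = 0.5, 0.1, 0.01     # illustrative values only
closed = infected_fraction(10.0, i0, beta, delta)
numeric = euler_infected_fraction(10.0, i0, beta, delta)
long_run = infected_fraction(50.0, i0, beta, delta)
```

With these values the infected fraction settles near 1 - delta/beta = 0.8, which illustrates the advantage noted on the slide: the long-run behavior can be read directly off the closed form.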
26 Questions We'll Be Able to Answer with the Markov Model
- What is the
- expected number of infected/inoculated nodes at some future time?
- probability of virus extinction at a certain time?
- expected waiting time until extinction?
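These three questions can be illustrated on a toy Markov chain. The model below is an assumption made for illustration and is not the Spears & Gordon model: the state is the number k of infected nodes out of n in a fully mixed network, k = 0 is absorbing (extinction), inoculation is ignored, and `beta` and `delta` are hypothetical per-step infection and cure parameters.

```python
def transition_matrix(n, beta, delta):
    """Row-stochastic matrix over k = 0..n infected nodes; k = 0 is absorbing."""
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        up = beta * (k / n) * ((n - k) / n)    # one more node becomes infected
        down = delta * (k / n)                 # one infected node is cured
        if k < n:
            matrix[k][k + 1] = up
        if k > 0:
            matrix[k][k - 1] = down
        matrix[k][k] = 1.0 - up - down
    return matrix

def step(dist, matrix):
    """Advance a probability distribution over states by one time step."""
    size = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]

def analyze(n, beta, delta, start, horizon):
    """Return (expected number infected, extinction probability) after `horizon` steps."""
    matrix = transition_matrix(n, beta, delta)
    dist = [0.0] * (n + 1)
    dist[start] = 1.0
    for _ in range(horizon):
        dist = step(dist, matrix)
    return sum(k * p for k, p in enumerate(dist)), dist[0]

def expected_extinction_time(n, beta, delta, start, tol=1e-12, max_steps=1_000_000):
    """E[T] as the sum over t >= 0 of P(T > t), truncated once survival drops below tol."""
    matrix = transition_matrix(n, beta, delta)
    dist = [0.0] * (n + 1)
    dist[start] = 1.0
    total = 0.0
    for _ in range(max_steps):
        survival = sum(dist[1:])
        if survival < tol:
            break
        total += survival
        dist = step(dist, matrix)
    return total

expected, p_extinct = analyze(n=2, beta=0.4, delta=0.5, start=1, horizon=1)
waiting = expected_extinction_time(n=2, beta=0.4, delta=0.5, start=1)
```

The truncated sum is only practical when extinction is fast; for parameters where the infection persists for a long time, solving the linear system for the absorption times would be the more suitable choice.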
27 Potential Transitions
Global monitoring UAV
Using Artificial Physics, MAVs form a hexagonal lattice sensing grid
- Ongoing discussions with Jill Dahlburg and Rick Foch in NRL TEW about transitioning the Artificial Physics with global monitoring (MaC) to real MAVs. Other NRL vehicles are also being considered. Need to extend AP from 2D to 3D.
- Kiriakos Kiriakidis has a student who wants to transition our work to mini-robots.