The different architectural approaches to constructing robots


Slides: 32

Provided by: phar75


Transcript and Presenter's Notes

1
Introduction

The different architectural approaches to
constructing robots.
The constraints of behavior-based approaches to
control.
Three architectures implementing this approach
used for navigation and path finding, group
behaviors, and learning of behavior selection.

2
Architecture

What is Architecture?
-provides a set of principles for organizing
control systems
-imposes constraints on the way control problems
can be solved.

3
Four Basic Types

Deliberate / Planner Based
Purely Reactive
Hybrid
Behavior Based

4
Deliberate Approach

Traditional, Top-down planner-based /
deliberative strategies.
Rely on a centralized world model for verifying sensory
information and generating actions in the world.
Information in the world is used by the planner
to produce the most appropriate actions for the
agent.
Changes in the environment and uncertainty in
sensing require frequent re-planning.
The high cost of planning does not allow for very
complex systems.
Scales poorly with complexity of real world
problems.
Cannot react in real time to sudden changes in
the world.

5
Purely Reactive

An approach to achieve real-time performance in
autonomous agents.
Bottom up approach
The agent's control strategy is embedded into a
collection of preprogrammed condition-action pairs.
Maintain no internal models and perform no
searches.
Simple functional mapping between stimuli and
appropriate responses.
Mappings rely on a direct relationship between
sensing and action and fast feedback from the
environment.
Effective for problems completely specified at
design time.
Cannot store information dynamically, so this
strategy is inflexible at run time.
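
A minimal sketch of such a stimulus-response mapping, written as a fixed lookup table. The two obstacle flags and the action names are assumptions made for illustration; the slides do not name specific stimuli or responses.

```python
# Hypothetical stimuli and responses (not taken from the slides).
# Key: (obstacle_left, obstacle_right) -> action
REACTIVE_RULES = {
    (False, False): "forward",
    (True, False): "turn_right",
    (False, True): "turn_left",
    (True, True): "reverse",
}


def react(obstacle_left: bool, obstacle_right: bool) -> str:
    """Map the current stimulus straight to a response: no state, no search."""
    return REACTIVE_RULES[(obstacle_left, obstacle_right)]


print(react(True, False))
```

The table is fixed at design time, which is why nothing learned at run time can change the robot's response.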

6
Purely Reactive (Continued)

Amount of computation performed at run-time
demonstrates the division between reactive and
deliberate strategies.
Reactive run-time strategies can be derived from a
planner by computing all possible plans offline.
Entire control system can be precompiled as a
decision graph into a collection of reactive
rules.
Scale poorly with complexity of environment and
control system.
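
One way to picture precompiling plans into reactive rules is sketched below. The 3x3 grid world, the wall position, and the action names are hypothetical; the offline step is a breadth-first search outward from the goal, which stands in for whatever planner is actually used.

```python
from collections import deque

# Hypothetical grid world: '.' free cell, '#' wall, 'G' goal.
GRID = ["...",
        ".#.",
        "..G"]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def compile_rules(grid):
    """Offline: search backward from the goal and store one action per cell."""
    rows, cols = len(grid), len(grid[0])
    goal = next((r, c) for r in range(rows) for c in range(cols)
                if grid[r][c] == "G")
    rules = {goal: "stop"}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for action, (dr, dc) in MOVES.items():
            # Cell (pr, pc) reaches (r, c) by taking `action`.
            pr, pc = r - dr, c - dc
            inside = 0 <= pr < rows and 0 <= pc < cols
            if inside and grid[pr][pc] != "#" and (pr, pc) not in rules:
                rules[(pr, pc)] = action
                queue.append((pr, pc))
    return rules


# Run time: a pure lookup, with no planning at all.
RULES = compile_rules(GRID)
print(RULES[(1, 2)])
```

The rule table has one entry per reachable cell, so it grows with the size of the environment, which is the scaling problem noted above.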

7
Hybrid

Compromise between purely reactive and deliberate
approaches.
Usually has a reactive system for low level
control and a planner for higher-level decision
making.
Separated into two or more communicating but
otherwise independent parts.
The low-level reactive process takes care of the
immediate safety of the agent.
Higher level uses planner to select action
sequences.
Examples of Hybrid Systems
Reactive planning in Reactive Action Packages
(RAPs)
Procedural Reasoning Systems
Internalized Plans
Contingency Plans
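
The two-layer split can be sketched as follows. This is only an illustration under assumptions: the one-dimensional corridor, the trivial planner, and the single "stop" reflex are invented here and are not taken from any of the systems listed above.

```python
from typing import List, Optional


def plan(start: int, goal: int) -> List[str]:
    """Higher level: a deliberately trivial planner on a 1-D corridor."""
    step = "forward" if goal > start else "backward"
    return [step] * abs(goal - start)


def safety_reflex(obstacle_close: bool) -> Optional[str]:
    """Low level: returns an overriding action only when safety demands it."""
    return "stop" if obstacle_close else None


def hybrid_step(planned_action: str, obstacle_close: bool) -> str:
    """The reactive layer wins whenever it has something to say."""
    reflex = safety_reflex(obstacle_close)
    return reflex if reflex is not None else planned_action


actions = plan(0, 3)
print(hybrid_step(actions[0], obstacle_close=False))
print(hybrid_step(actions[0], obstacle_close=True))
```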

8
Behavior-Based

Extension of reactive architectures
Falls between reactive and planner-based
extremes.
Behavior-based systems have some of the properties of
reactive systems and contain reactive components,
but the computation is not limited to simple
functional mapping.
Can store various forms of state and implement
various forms of representation.
There is much freedom of interpretation as to
what a behavior-based system actually is, which
has therefore promoted much research in the field.

9
Behavior-Based (Continued)

General Definition of Behavior Based Architecture
Does not employ centralized representations
operated by a reasoning engine.
Relies on forms of distributed representations
and performs distributed computations on them.
Behaviors are typically more time-extended than
actions of reactive systems.
Reactive Systems produce coherent externally
measurable output behaviors from the interaction
of their rules in a particular environment.
Behavior-based systems often internally specify
such behaviors. The emergent properties of this
system result from interaction of behaviors and
the world and are therefore typically higher
level.

10
Behavior-Based (Continued)

In most systems, the upper level design is a
built-in, fixed control hierarchy imposing a
priority ordering on the behaviors.
Constraints on Behavior Based Systems
Behaviors must be relatively simple.
Must be incrementally added to the system.
Execution must not be serialized.
Must be more time-extended than simple atomic
actions of the particular agent.
Must interact with other behaviors through the
world rather than internally through the system.
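
A fixed priority ordering over behaviors can be sketched like this. The three behaviors, the single range reading, and the thresholds are hypothetical. The loop only models the arbitration; in a real behavior-based system the behaviors would run concurrently rather than being called one after another.

```python
from typing import Callable, Dict, List, Optional

Sensors = Dict[str, float]
Behavior = Callable[[Sensors], Optional[str]]


def avoid(sensors: Sensors) -> Optional[str]:
    # Hypothetical threshold: react when something is very close.
    return "turn_away" if sensors["range"] < 0.3 else None


def follow_wall(sensors: Sensors) -> Optional[str]:
    return "follow_wall" if sensors["range"] < 1.0 else None


def wander(sensors: Sensors) -> Optional[str]:
    return "wander"  # always has an opinion, so it sits at the bottom


# Built-in, fixed hierarchy: earlier entries override later ones.
PRIORITY: List[Behavior] = [avoid, follow_wall, wander]


def arbitrate(sensors: Sensors) -> str:
    """Return the command of the highest-priority behavior that fires."""
    for behavior in PRIORITY:
        command = behavior(sensors)
        if command is not None:
            return command
    return "idle"


print(arbitrate({"range": 0.5}))
```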

11
Tradeoffs between Architectures

Purely reactive systems are usually assumed to be
less powerful than behavior-based and
planner-based systems.
With a well defined task that has a well known
environment and a sufficiently equipped robot,
purely reactive solutions can be used for
rather complex problems.
Purely reactive systems achieve exceptional run
time efficiency because of small computational
overhead.
Their limited representational power results in a
lack of run-time flexibility.

12
Tradeoffs (continued)

The reactive vs. deliberative tradeoff is best
exemplified in the tradeoff between the amount of
built-in information and the amount of run time
computation.
In some domains it may be easier to hardwire the
action rules; in others it may be easier to
maintain an internal world model.

13
Navigation and Path Finding

It is not possible to construct and update
internal representations of the world at run time
with reactive systems.
The same was thought to be true for
behavior-based systems until the arrival of Toto,
a robot equipped with a ring of sonars and a
compass.
Toto's goal was to demonstrate both higher level
reasoning and real time reaction in a non-hybrid
system.
Its capabilities are typical of a deliberate
system, but are implemented with a behavior based
system.

14
Toto

Capabilities
Real-time obstacle avoidance
Boundary following
Landmark detection
Map construction
Path finding

15
Toto

Control system implements both reactive rules and
behaviors.
Navigation is accomplished through reaction.
Input is received from sonar, the compass, and
motor current sensors
Output is sent directly to the motors.
Landmark detection is accomplished through the
implementation of behaviors.
Each is a perceptual filter that monitors the
external world (sonar and compass) and movement
(motor current sensors).

16
Toto

The behaviors do not have a direct effect on the
motion of the robot, but rather change the
activation levels related to a particular
landmark.
For example, continuous updates are received from
lateral sensory readings, which allow the robot
to update its confidence level if it is moving
straight down a corridor.
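
The confidence update for a corridor landmark might look roughly like the sketch below. The range and heading thresholds, the reset-to-zero rule, and the class name are assumptions; the slides only say that lateral readings update a confidence level and that the detector does not drive the motors.

```python
class CorridorDetector:
    """Perceptual filter: watches readings, never commands the motors."""

    def __init__(self, wall_range: float = 1.0, threshold: int = 5):
        self.wall_range = wall_range  # hypothetical: max distance to a wall
        self.threshold = threshold    # hypothetical: readings needed to fire
        self.confidence = 0

    def update(self, left_range: float, right_range: float,
               heading_change: float) -> bool:
        """Feed one set of readings; return True once a corridor is detected."""
        walls_both_sides = (left_range < self.wall_range
                            and right_range < self.wall_range)
        moving_straight = abs(heading_change) < 5.0  # degrees, assumed
        if walls_both_sides and moving_straight:
            self.confidence += 1
        else:
            self.confidence = 0
        return self.confidence >= self.threshold


detector = CorridorDetector()
for _ in range(5):
    detected = detector.update(0.5, 0.6, 1.0)
print(detected)
```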

17
Toto

The Map behaviors are initially a collection of
empty behavior shells.
As the properties of specific landmarks are
discovered, they are assigned to these shells.
Map behaviors are connected to each other,
creating communication and topological links.
The landmark detector behaviors send their
outputs to each of the existing map behaviors.
The map behavior that most closely matches the
broadcast landmark becomes active, which
localizes the robot in the existing map.
If there is not a match to the received landmark,
a new one is added to the map by placing the
properties of the landmark in an empty shell.
The best path for the robot is found through the
topological connections in the map, which are
combined with the physical attributes of the
landmarks.
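
The shell-filling and linking steps can be sketched as a small data structure. This is a simplification under stated assumptions: a landmark is a plain tuple, "most closely matches" is reduced to exact equality, and the class and method names are invented for illustration.

```python
class TopologicalMap:
    def __init__(self, n_shells: int):
        self.shells = [None] * n_shells               # empty behavior shells
        self.links = {i: set() for i in range(n_shells)}
        self.current = None                           # shell the robot is at

    def observe(self, landmark):
        """Broadcast a detected landmark; localize or fill a new shell."""
        if landmark in self.shells:
            idx = self.shells.index(landmark)         # match: localize
        else:
            idx = self.shells.index(None)             # ValueError if map full
            self.shells[idx] = landmark
        if self.current is not None and self.current != idx:
            # Topological link between consecutively visited landmarks.
            self.links[self.current].add(idx)
            self.links[idx].add(self.current)
        self.current = idx
        return idx


m = TopologicalMap(4)
m.observe(("corridor", "N"))
m.observe(("left_wall", "E"))
print(m.links)
```

Because each shell holds one landmark and its own links, the map stays distributed across behaviors rather than living in one central structure.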

18
Toto

When the shell representing the current position
of the robot is found, a motion command is sent
to the wheels of the robot, moving Toto in the
direction of the next landmark on the shortest
path towards the goal.
This activation continues until Toto has arrived
at its final destination.
If the shortest path to the goal is blocked, Toto
will try another path and remove the path that is
blocked.
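
A hedged sketch of the path-following step: distances are spread outward from the goal through the topological links, the robot heads for the neighbor closest to the goal, and a blocked link is removed before trying again. Hop count is used as the distance here, which is an assumption; the slides say the physical attributes of landmarks are also taken into account. The five-landmark map and function names are hypothetical.

```python
from collections import deque


def spread_from_goal(links, goal):
    """Hop distance from every reachable landmark to the goal."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def next_landmark(links, current, goal):
    """Neighbor to head for, or None if at the goal or no path exists."""
    if current == goal:
        return None
    dist = spread_from_goal(links, goal)
    candidates = [n for n in links[current] if n in dist]
    return min(candidates, key=dist.get) if candidates else None


def remove_blocked(links, a, b):
    """Drop a link that turned out to be blocked."""
    links[a].discard(b)
    links[b].discard(a)


# Hypothetical map: A-B-D is the short route, A-C-E-D the detour.
links = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "E"},
         "D": {"B", "E"}, "E": {"C", "D"}}
print(next_landmark(links, "A", "D"))
remove_blocked(links, "A", "B")
print(next_landmark(links, "A", "D"))
```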

19
Toto

A demonstration of a traditional planning task
implemented with a behavior-based system using a
representation that is procedural and
distributed.
Toto is a departure from both the planner based
and hybrid approaches that are usually used for
similar navigation problems.
It uses reactive rules and behaviors throughout
the system, from the low-level navigation and
control to the depiction of the map.

20
Multi Agent Control

Extending planner-based architectures from
single-agent to multi-agent domains requires the
expansion of the global state space, which must
now include the state of all agents.
Global state space = size of the state space of
each agent raised to the number of agents:
$G = s^a$
This exponential factor makes online planning all
but impossible for larger group sizes.
The Bandwidth needed for communication grows with
the number of agents.
Uncertainty in recognizing the entire environment
increases with a growth in the number of agents.
These problems illustrate that a planner based
approach is inappropriate for problems that
involve many agents in a dynamic environment.
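
A quick worked example of the exponential growth, where the global state space is the per-agent state count raised to the number of agents. The figure of 10 states per agent is an arbitrary assumption chosen to keep the numbers readable.

```python
def global_state_space(states_per_agent: int, n_agents: int) -> int:
    """Size of the joint state space: states_per_agent ** n_agents."""
    return states_per_agent ** n_agents


# Assumed: each agent has only 10 distinguishable states.
for n_agents in (1, 2, 5, 20):
    print(n_agents, global_state_space(10, n_agents))
```

Even with this tiny per-agent state count, 20 agents give 10^20 joint states, which is far beyond what an online planner can search.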

21
Multi Agent Control

Using behavior based architecture for multi agent
control results in completely distributed systems
with no centralized controller.
The systems are identical at the local and global
levels.
The local distribution of control does not
require global communication, scales well with
the number of agents, and is more robust to
sensor errors.

22
The Nerd Herd

An approach to structuring local reactive rules
and behaviors into a set to be used as a basis
for programming a collection of robots in a
coherent, scalable fashion.
Defines a behavior as a control law that
satisfies a set of constraints to achieve and
maintain a particular goal.
The Nerd Herd uses a set of these behaviors,
known as basis behaviors, which can be combined
to create many higher-level behaviors.
The process of choosing behaviors is influenced
from the bottom-up by the dynamics of the agent
and the environment, and from the top down by the
goals of the robot.
This combination allows for an efficient basis
set.

23
The Nerd Herd

20 ISX mobile robots with IRs, contact sensors,
grippers, position sensors, and radio
communication.
The behavior set includes safe wandering,
following, aggregation, dispersion, and homing.
Basis behaviors are intended as building blocks
for generating higher level behaviors for
performing various tasks.
The architecture allows for two types of
combination operators
Summation (⊕) - flocking as a result of summing
the outputs of safe-wandering, aggregation and
dispersion.
Switching (⊗) - only one behavior has complete
control. Foraging is a result of activating
safe-wandering when the robot needs a puck,
dispersion when it is crowded with other robots,
homing when it has the puck, and following when
it is near a robot with the same state.
Experiments with basis behaviors demonstrate an
approach toward a principled, cheap development
of basis modules for behavior-based systems.
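
The two combination operators can be sketched as below. Behavior outputs are modeled as 2-D velocity vectors, which is an assumption. The foraging conditions follow the slide, but the order in which they are checked is also an assumption, since the slide does not say which condition wins when several hold.

```python
def summation(*outputs):
    """Summation: add the velocity vectors of all active behaviors."""
    return (sum(v[0] for v in outputs), sum(v[1] for v in outputs))


def forage_switch(has_puck: bool, crowded: bool,
                  near_same_state_robot: bool) -> str:
    """Switching: exactly one basis behavior gets complete control."""
    if crowded:
        return "dispersion"
    if near_same_state_robot:
        return "following"
    if has_puck:
        return "homing"
    return "safe_wandering"  # the robot still needs a puck


# Flocking: hypothetical outputs of safe-wandering, aggregation, dispersion.
print(summation((1.0, 0.0), (0.0, 2.0), (-0.5, 0.5)))
print(forage_switch(has_puck=True, crowded=False,
                    near_same_state_robot=False))
```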

24
Learning Behavior Selection

In addition to being appropriate units for
control, basis behaviors can serve as an efficient
method for allowing learning of behavior
selection.
Robots can learn what behaviors to activate in
order to forage together in a group.
The foraging controller is learned from the
information, or reinforcement, that is received
through interaction with other robots and the
environment.

25
Learning Behavior Selection

The traditional formulation of reinforcement
learning uses states, actions, and reinforcement.
The robot is in one of a finite number of
possible states.
With time, the agent learns to correlate states
and actions in order to maximize reinforcement.
The Don Group uses basis behaviors instead of
actions as the basic representational units for
reinforcement learning.
Using basis behaviors allows for replacing the
complete state space with a much smaller set of
conditions.
Since there are fewer conditions than states, the
agent's learning space is greatly diminished,
which increases the speed of the reinforcement
learning.
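
A minimal sketch of learning over condition-behavior pairs rather than state-action pairs. The update rule (simply accumulating reinforcement per pair) and the two boolean condition predicates are assumptions made for illustration, not the exact algorithm used in the experiments.

```python
class BehaviorSelector:
    """Table over (condition, behavior) pairs instead of (state, action)."""

    def __init__(self, behaviors):
        self.behaviors = list(behaviors)
        self.value = {}  # (condition, behavior) -> accumulated reinforcement

    def reinforce(self, condition, behavior, reinforcement: float) -> None:
        key = (condition, behavior)
        self.value[key] = self.value.get(key, 0.0) + reinforcement

    def best(self, condition):
        """Behavior with the highest accumulated reinforcement; ties go to
        the earliest behavior in the list."""
        return max(self.behaviors,
                   key=lambda b: self.value.get((condition, b), 0.0))


# Hypothetical condition: (has_puck, crowded). Two predicates give only
# 4 conditions, so the table has 4 x 4 = 16 entries.
selector = BehaviorSelector(
    ["safe_wandering", "dispersion", "homing", "following"])
has_puck_not_crowded = (True, False)
selector.reinforce(has_puck_not_crowded, "homing", 1.0)
selector.reinforce(has_puck_not_crowded, "dispersion", -0.5)
print(selector.best(has_puck_not_crowded))
```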

26
The Don Group

Four IS Robotics R2 mobile robots
Use the same sensors that were contained on the
Nerd Herd robots.
This group will attempt to choose the best
behavior for each condition.

27
The Don Group

Uses shaped reinforcement in two forms
Feedback after the completion of a time-extended
behavior.
Helps the robot correlate conditions and
behaviors thus learning when to execute any given
behavior.
Feedback during the execution of a time-extended
behavior.
Helps the robot explore the space more
effectively by allowing it to know when to
continue and when to end its particular behavior.
Behaviors are triggered and terminated by events,
either external or internal.
Event-driven behavior termination is more natural
than the use of time periods.
As the situation changes dynamically with the
movement of the entire group, it is not realistic
to use arbitrary behavior termination.
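
The two forms of feedback and event-driven termination can be illustrated with a toy homing behavior. The reinforcement values, the stall limit, and the use of distance-to-home as the progress measure are all assumptions chosen for the sketch.

```python
def run_homing(distances_to_home, max_no_progress: int = 3):
    """Replay a hypothetical sequence of distance readings.

    Returns (total_reinforcement, reason_for_termination).
    """
    reinforcement = 0.0
    stalled = 0
    pairs = zip(distances_to_home, distances_to_home[1:])
    for previous, current in pairs:
        # Feedback during execution: reward progress, punish the lack of it.
        if current < previous:
            reinforcement += 0.1
            stalled = 0
        else:
            reinforcement -= 0.1
            stalled += 1
        if current == 0:
            # External event ends the behavior; feedback after completion.
            return reinforcement + 1.0, "arrived"
        if stalled >= max_no_progress:
            # Internal event: no progress, so stop instead of timing out.
            return reinforcement, "gave_up"
    return reinforcement, "still_running"


print(run_homing([3, 2, 1, 0])[1])
print(run_homing([3, 3, 4, 4, 5])[1])
```

Neither run ends after a fixed time period; the behavior stops because something happened, either in the world or in the robot's own progress estimate.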

28
The Don Group

As in the hand-coded foraging case, only one of
the basis behaviors is active at a time, and the
robots use the reinforcement to learn the
switching circuit.
The reinforcement helps shape the robot's
behavior toward the desired foraging without
having to pre-plan the solution.
Reinforcement was generated through the robots'
own reward and punishment systems, which are
implemented in the form of behaviors.
Multi-modal feedback behaviors are perceptual
filters that monitor the environment, detect
particular events, and deliver appropriate
reinforcement.

29
The Don Group

The success of the Don Group using multi-modal
feedback and progress estimators was compared to
two alternatives: multi-modal feedback without
progress estimators, and Q-learning.
The multi-modal feedback with progress estimators
was successful in 95% of the trials, while the
other approaches attained success rates of only
60% and 30%, respectively.
The complex domain required shaped reinforcement
in order to both enable learning and make it
efficient.
The experiment demonstrates basis behaviors as an
effective substrate for automated synthesis of
new higher-level behaviors.

30
Conclusion

Behavior-based systems are able to demonstrate
that distributed approaches to autonomous agent
control are feasible, efficient, and robust.
In the three examples, centralized behavior
coordination was shown to be unnecessary.
The real-time capabilities of well-designed
behavior-based systems should allow for quick
solutions of all goals.
The design of behaviors determines the
effectiveness of the control systems and is
therefore the most challenging aspect of the
behavior-based approach.

31
Resources