Title: Learning Momentum: Integration and Experimentation
1. Learning Momentum: Integration and Experimentation
- Brian Lee and Ronald C. Arkin
- Mobile Robot Laboratory
- Georgia Tech
- Atlanta, GA
2. Motivation
- It's hard to manually derive controller parameters.
- The parameter space grows exponentially with the number of parameters.
- You don't always have a priori knowledge of the environment.
- Without prior knowledge, a user can't confidently derive appropriate parameter values, so it becomes necessary for the robot to adapt on its own to what it finds.
- Obstacle densities and layout in the environment may be heterogeneous.
- Parameters that work well for one type of environment may not work well in another.
3. Adaptation and Learning Methods (DARPA MARS)
- Investigate robot shaping at five distinct levels in a hybrid robot software architecture
- Implement algorithms within the MissionLab mission specification system
- Conduct experiments to evaluate the performance of each technique
- Combine techniques where possible
- Integrate on a platform more suitable for realistic missions and continue development
4. Overview of Techniques
- CBR Wizardry
  - Guide the operator
- Probabilistic Planning
  - Manage complexity for the operator
- RL for Behavioral Assemblage Selection
  - Learn what works for the robot
- CBR for Behavior Transitions
  - Adapt to situations the robot can recognize
- Learning Momentum
  - Vary robot parameters in real time
THE LEARNING CONTINUUM: Deliberative (premission) ... Behavioral switching ... Reactive (online adaptation)
5. Basic Concepts of LM
- Provides adaptability to behavior-based systems
- A crude form of reinforcement learning: if the robot is doing well, keep doing what it's doing; otherwise try something different (sketched below).
- Behavior parameters are changed in response to progress and obstacles.
- The system is still fully reactive: although the robot changes its behavior, there is no deliberation.
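As a rough Python sketch of that rule, assuming a simple dictionary of gains and an invented 0.1 step size (not the actual values used by the system):

```python
# Minimal sketch of the LM idea; parameter names and deltas are
# illustrative assumptions, not the system's real adjustment values.

def lm_step(params, making_progress):
    """Reinforce what works; otherwise try something different."""
    if making_progress:
        # Doing well: strengthen the goal pull, damp the random wander.
        params["goal_gain"] += 0.1
        params["wander_gain"] = max(0.0, params["wander_gain"] - 0.1)
    else:
        # Doing poorly: weaken the goal pull, add wander to escape.
        params["goal_gain"] = max(0.1, params["goal_gain"] - 0.1)
        params["wander_gain"] += 0.1
    return params
```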
6. Currently Used Behaviors
- Move to Goal
  - Always returns a vector pointing toward the goal position.
- Avoid Obstacles
  - Returns a sum of weighted vectors pointing away from obstacles.
- Wander
  - Returns vectors pointing in random directions.
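A hedged sketch of these three behaviors follows; the exact vector forms, including the linear distance weighting in avoid-obstacles, are assumptions, since the slides do not give the formulas.

```python
import math
import random

def move_to_goal(pos, goal, gain):
    """Unit vector from pos toward goal, scaled by the behavior gain."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    d = math.hypot(dx, dy) or 1.0
    return (gain * dx / d, gain * dy / d)

def avoid_obstacles(pos, obstacles, gain, sphere):
    """Sum of weighted vectors pointing away from obstacles inside the sphere."""
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < sphere:
            w = gain * (sphere - d) / sphere  # closer obstacles push harder
            vx += w * dx / d
            vy += w * dy / d
    return (vx, vy)

class Wander:
    """Random-direction vector, held for `persistence` consecutive steps."""
    def __init__(self):
        self.heading, self.steps_left = 0.0, 0
    def __call__(self, gain, persistence):
        if self.steps_left <= 0:
            self.heading = random.uniform(0.0, 2.0 * math.pi)
            self.steps_left = persistence
        self.steps_left -= 1
        return (gain * math.cos(self.heading), gain * math.sin(self.heading))
```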
7. Adjustable Parameters
- Move-to-goal vector gain
- Avoid-obstacle vector gain
- Avoid-obstacle sphere of influence
  - Radius around the robot inside of which obstacles are perceived
- Wander vector gain
- Wander persistence
  - The number of consecutive steps the wander vector points in the same direction
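These five values are the state LM adjusts online. A simple container, with purely illustrative defaults, might look like this (the Gm/Go/S/Gw/P labels match the integration diagrams later in the deck):

```python
from dataclasses import dataclass

@dataclass
class LMParams:
    goal_gain: float = 1.0      # move-to-goal vector gain (Gm)
    obstacle_gain: float = 1.0  # avoid-obstacle vector gain (Go)
    sphere: float = 2.0         # avoid-obstacle sphere of influence, meters (S)
    wander_gain: float = 0.5    # wander vector gain (Gw)
    persistence: int = 10       # wander persistence, in steps (P)
```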
8. Four Predefined Situations
- No movement: M < T_movement
- Progress toward the goal: M > T_movement and P > T_progress
- No progress with obstacles: M > T_movement, P < T_progress, and O_count > T_obstacles
- No progress without obstacles: M > T_movement, P < T_progress, and O_count < T_obstacles
Definitions:
- M: average movement
- M_goal: average movement toward the goal
- P = M_goal / M
- O_count: obstacles encountered
- T_movement: movement threshold
- T_progress: progress threshold
- T_obstacles: obstacles threshold
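The situation test reduces to a few threshold comparisons. This sketch follows the definitions above directly; the threshold values themselves would be tuned per system:

```python
def classify(m_avg, m_goal, o_count, t_movement, t_progress, t_obstacles):
    """Map averaged readings onto one of the four predefined situations."""
    if m_avg < t_movement:
        return "no movement"
    p = m_goal / m_avg                      # P = M_goal / M
    if p > t_progress:
        return "progress toward the goal"
    if o_count > t_obstacles:
        return "no progress with obstacles"
    return "no progress without obstacles"
```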
9. Parameter Adjustments
[Table: sample adjustment parameters for ballooning.]
10Two Possible Strategies
- Ballooning - Sphere of influence is increased
when obstacles impede progress. The robot moves
around large objects. - Squeezing - Sphere of influence is decreased when
obstacles impede progress. The robot moves
between closely spaced objects.
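One way to picture the difference: both strategies use the same situation table, and the key distinction is the sign of the sphere-of-influence delta when obstacles impede progress. The numbers below are invented for illustration only; the actual adjustment values are those in the slide 9 table.

```python
# Per-situation deltas applied to (goal_gain, obstacle_gain, sphere,
# wander_gain). Values are illustrative; the sign of the sphere delta in
# the "no progress with obstacles" row is what separates the strategies.
BALLOONING = {
    "no movement":                   (-0.1, +0.1, +0.5, +0.1),
    "progress toward the goal":      (+0.1, -0.1,  0.0, -0.1),
    "no progress with obstacles":    (-0.1, +0.1, +0.5, +0.1),  # grow the sphere
    "no progress without obstacles": (+0.1, -0.1, -0.5, +0.1),
}
SQUEEZING = dict(BALLOONING)
SQUEEZING["no progress with obstacles"] = (-0.1, +0.1, -0.5, +0.1)  # shrink it
```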
11. Integration
[Diagram: base system. The controller routes position and goal information from the sensors to Move To Goal(Gm) and obstacle information to Avoid Obstacles(Go, S); their vectors, together with Wander(Gw, P), are summed into the output direction. S = obstacle sphere of influence.]
12. Integration
[Diagram: integrated system. The same base system as above, with an LM module added that feeds new Gm, Go, S, Gw, and P parameters back to the behaviors. S = obstacle sphere of influence.]
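Putting the pieces together, one iteration of the integrated system might look like the sketch below; the function signatures are assumptions made for illustration, not MissionLab's actual interfaces.

```python
def control_step(pos, goal, obstacles, params, behaviors, lm_update):
    """One iteration: sum the behavior vectors, then let LM revise parameters."""
    vx = vy = 0.0
    for behavior in behaviors:               # move-to-goal, avoid-obstacles, wander
        bx, by = behavior(pos, goal, obstacles, params)
        vx, vy = vx + bx, vy + by
    lm_update(params, pos, goal, obstacles)  # adjust Gm, Go, S, Gw, P in place
    return (vx, vy)                          # output direction
```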
13. Experiments in Simulation
- 150m x 150m area
- Robot moves from (10m, 10m) to (140m, 90m)
- Obstacle densities of 15% and 20% were used.
- Obstacle radii varied between 0.38m and 1.43m.
14. Ballooning
15. Observations on Ballooning
- Covers a lot of area
- Not as easily trapped in box-canyon situations
- May settle in locally clear areas
- May require a high wander gain to carry the robot through closely spaced obstacles
16. Squeezing
17. Observations on Squeezing
- Results in a straighter path
- Moves easily through closely spaced obstacles
- May get trapped in small box-canyon situations for long periods of time
18. Simulations of the Real World
[Figure: simulated setup of the real-world environment, a 24m x 10m area with marked start and end places.]
19. Completion Rates for Simulation
[Charts: completion rates for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m-1.43m radii).]
20. Average Steps to Completion
[Charts: average steps to completion for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m-1.43m radii).]
21. Results from Simulated Real Environment
[Charts: completion rate and steps to completion.]
- As before, there is an increase in completion rates with an accompanying increase in steps to completion.
22. Simulation Results
- Completion rates can be drastically improved.
- Completion-rate improvements come at a cost of time.
- Ballooning and squeezing strategies are geared toward different situations.
23. Physical Robot Experiments
- Nomad 150 robot
- Sonar ring for obstacle avoidance
- Traverses the length of a 24m x 10m room while
negotiating obstacles
24. Outdoor Run (adaptive)
25. Outdoor Run (non-adaptive)
26. Physical Experiment Results
- Non-learning robots became stuck.
- Learning robots successfully negotiated the obstacles.
- Squeezing was faster than ballooning in this case.
[Table: average steps to goal.]
27. Conclusions
- Improved success comes at a price in time.
- One strategy's performance is very poor in situations better suited to another strategy.
- The ballooning strategy is generally faster: ballooning robots can move through closely spaced objects faster than squeezing robots can move out of box-canyon situations.
28. Conclusions (cont'd)
- If some general knowledge of the terrain is known a priori, an appropriate strategy can be chosen.
- If the terrain is totally unknown, ballooning is probably the better choice.
- A way to dynamically switch strategies should improve performance.