Title: Learning Momentum: Integration and Experimentation
1. Learning Momentum: Integration and Experimentation
- Brian Lee and Ronald C. Arkin
- Mobile Robot Laboratory
- Georgia Tech
- Atlanta, GA
2. Motivation
- It's hard to manually derive controller parameters.
- The parameter space grows exponentially with the number of parameters.
- You don't always have a priori knowledge of the environment.
- Without prior knowledge, a user can't confidently derive appropriate parameter values, so it becomes necessary for the robot to adapt on its own to what it finds.
- Obstacle densities and layout in the environment may be heterogeneous.
- Parameters that work well for one type of environment may not work well in another.
3. Adaptation and Learning Methods (DARPA MARS)
- Investigate robot shaping at five distinct levels in a hybrid robot software architecture
- Implement algorithms within the MissionLab mission specification system
- Conduct experiments to evaluate the performance of each technique
- Combine techniques where possible
- Integrate on a platform more suitable for realistic missions and continue development
4. Overview of Techniques
- CBR Wizardry
  - Guide the operator
- Probabilistic Planning
  - Manage complexity for the operator
- RL for Behavioral Assemblage Selection
  - Learn what works for the robot
- CBR for Behavior Transitions
  - Adapt to situations the robot can recognize
- Learning Momentum
  - Vary robot parameters in real time
THE LEARNING CONTINUUM: Deliberative (premission) ... Behavioral switching ... Reactive (online adaptation)
5. Basic Concepts of LM
- Provides adaptability to behavior-based systems
- A crude form of reinforcement learning: if the robot is doing well, keep doing what it's doing; otherwise try something different (sketched below).
- Behavior parameters are changed in response to progress and obstacles.
- The system is still fully reactive: although the robot changes its behavior, there is no deliberation.
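As a rough Python sketch of that rule, assuming a simple dictionary of gains and an invented 0.1 step size (not the actual values used by the system):

```python
# Minimal sketch of the LM idea; parameter names and deltas are
# illustrative assumptions, not the system's real adjustment values.

def lm_step(params, making_progress):
    """Reinforce what works; otherwise try something different."""
    if making_progress:
        # Doing well: strengthen the goal pull, damp the random wander.
        params["goal_gain"] += 0.1
        params["wander_gain"] = max(0.0, params["wander_gain"] - 0.1)
    else:
        # Doing poorly: weaken the goal pull, add wander to escape.
        params["goal_gain"] = max(0.1, params["goal_gain"] - 0.1)
        params["wander_gain"] += 0.1
    return params
```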
6. Currently Used Behaviors
- Move to Goal
  - Always returns a vector pointing toward the goal position.
- Avoid Obstacles
  - Returns a sum of weighted vectors pointing away from obstacles.
- Wander
  - Returns vectors pointing in random directions.
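A hedged sketch of these three behaviors follows; the exact vector forms, including the linear distance weighting in avoid-obstacles, are assumptions, since the slides do not give the formulas.

```python
import math
import random

def move_to_goal(pos, goal, gain):
    """Unit vector from pos toward goal, scaled by the behavior gain."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    d = math.hypot(dx, dy) or 1.0
    return (gain * dx / d, gain * dy / d)

def avoid_obstacles(pos, obstacles, gain, sphere):
    """Sum of weighted vectors pointing away from obstacles inside the sphere."""
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < sphere:
            w = gain * (sphere - d) / sphere  # closer obstacles push harder
            vx += w * dx / d
            vy += w * dy / d
    return (vx, vy)

class Wander:
    """Random-direction vector, held for `persistence` consecutive steps."""
    def __init__(self):
        self.heading, self.steps_left = 0.0, 0
    def __call__(self, gain, persistence):
        if self.steps_left <= 0:
            self.heading = random.uniform(0.0, 2.0 * math.pi)
            self.steps_left = persistence
        self.steps_left -= 1
        return (gain * math.cos(self.heading), gain * math.sin(self.heading))
```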
7. Adjustable Parameters
- Move-to-goal vector gain
- Avoid-obstacle vector gain
- Avoid-obstacle sphere of influence
  - Radius around the robot inside of which obstacles are perceived
- Wander vector gain
- Wander persistence
  - The number of consecutive steps the wander vector points in the same direction
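These five values are the state LM adjusts online. A simple container, with purely illustrative defaults, might look like this (the Gm/Go/S/Gw/P labels match the integration diagrams later in the deck):

```python
from dataclasses import dataclass

@dataclass
class LMParams:
    goal_gain: float = 1.0      # move-to-goal vector gain (Gm)
    obstacle_gain: float = 1.0  # avoid-obstacle vector gain (Go)
    sphere: float = 2.0         # avoid-obstacle sphere of influence, meters (S)
    wander_gain: float = 0.5    # wander vector gain (Gw)
    persistence: int = 10       # wander persistence, in steps (P)
```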
8. Four Predefined Situations
- No movement: M < T_movement
- Progress toward the goal: M > T_movement and P > T_progress
- No progress with obstacles: M > T_movement, P < T_progress, and O_count > T_obstacles
- No progress without obstacles: M > T_movement, P < T_progress, and O_count < T_obstacles
Definitions:
- M: average movement
- M_goal: average movement toward the goal
- P = M_goal / M
- O_count: obstacles encountered
- T_movement: movement threshold
- T_progress: progress threshold
- T_obstacles: obstacles threshold
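The situation test reduces to a few threshold comparisons. This sketch follows the definitions above directly; the threshold values themselves would be tuned per system:

```python
def classify(m_avg, m_goal, o_count, t_movement, t_progress, t_obstacles):
    """Map averaged readings onto one of the four predefined situations."""
    if m_avg < t_movement:
        return "no movement"
    p = m_goal / m_avg                      # P = M_goal / M
    if p > t_progress:
        return "progress toward the goal"
    if o_count > t_obstacles:
        return "no progress with obstacles"
    return "no progress without obstacles"
```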
9. Parameter Adjustments
[Table: sample adjustment parameters for ballooning.]
10Two Possible Strategies
- Ballooning - Sphere of influence is increased
when obstacles impede progress. The robot moves
around large objects. - Squeezing - Sphere of influence is decreased when
obstacles impede progress. The robot moves
between closely spaced objects.
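One way to picture the difference: both strategies use the same situation table, and the key distinction is the sign of the sphere-of-influence delta when obstacles impede progress. The numbers below are invented for illustration only; the actual adjustment values are those in the slide 9 table.

```python
# Per-situation deltas applied to (goal_gain, obstacle_gain, sphere,
# wander_gain). Values are illustrative; the sign of the sphere delta in
# the "no progress with obstacles" row is what separates the strategies.
BALLOONING = {
    "no movement":                   (-0.1, +0.1, +0.5, +0.1),
    "progress toward the goal":      (+0.1, -0.1,  0.0, -0.1),
    "no progress with obstacles":    (-0.1, +0.1, +0.5, +0.1),  # grow the sphere
    "no progress without obstacles": (+0.1, -0.1, -0.5, +0.1),
}
SQUEEZING = dict(BALLOONING)
SQUEEZING["no progress with obstacles"] = (-0.1, +0.1, -0.5, +0.1)  # shrink it
```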
11. Integration
[Diagram: base system. The controller routes position and goal information from the sensors to Move To Goal(Gm) and obstacle information to Avoid Obstacles(Go, S); their vectors, together with Wander(Gw, P), are summed into the output direction. S = obstacle sphere of influence.]
12. Integration
[Diagram: integrated system. The same base system as above, with an LM module added that feeds new Gm, Go, S, Gw, and P parameters back to the behaviors. S = obstacle sphere of influence.]
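Putting the pieces together, one iteration of the integrated system might look like the sketch below; the function signatures are assumptions made for illustration, not MissionLab's actual interfaces.

```python
def control_step(pos, goal, obstacles, params, behaviors, lm_update):
    """One iteration: sum the behavior vectors, then let LM revise parameters."""
    vx = vy = 0.0
    for behavior in behaviors:               # move-to-goal, avoid-obstacles, wander
        bx, by = behavior(pos, goal, obstacles, params)
        vx, vy = vx + bx, vy + by
    lm_update(params, pos, goal, obstacles)  # adjust Gm, Go, S, Gw, P in place
    return (vx, vy)                          # output direction
```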
13. Experiments in Simulation
- 150m x 150m area
- Robot moves from (10m, 10m) to (140m, 90m)
- Obstacle densities of 15% and 20% were used.
- Obstacle radii varied between 0.38m and 1.43m.
14. Ballooning
15. Observations on Ballooning
- Covers a lot of area
- Not as easily trapped in box-canyon situations
- May settle in locally clear areas
- May require a high wander gain to carry the robot through closely spaced obstacles
16. Squeezing
17. Observations on Squeezing
- Results in a straighter path
- Moves easily through closely spaced obstacles
- May get trapped in small box-canyon situations for long periods of time
18. Simulations of the Real World
[Figure: simulated setup of the real-world environment, a 24m x 10m area with marked start and end places.]
19. Completion Rates for Simulation
[Charts: completion rates for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m-1.43m radii).]
20. Average Steps to Completion
[Charts: average steps to completion for uniform obstacle size (1m radii) and for varying obstacle sizes (0.38m-1.43m radii).]
21. Results from Simulated Real Environment
[Charts: completion rate and steps to completion.]
- As before, there is an increase in completion rates with an accompanying increase in steps to completion.
22. Simulation Results
- Completion rates can be drastically improved.
- Completion-rate improvements come at a cost of time.
- Ballooning and squeezing strategies are geared toward different situations.
23. Physical Robot Experiments
- Nomad 150 robot
- Sonar ring for obstacle avoidance
- Traverses the length of a 24m x 10m room while
negotiating obstacles
24. Outdoor Run (adaptive)
25. Outdoor Run (non-adaptive)
26. Physical Experiment Results
- Non-learning robots became stuck.
- Learning robots successfully negotiated the obstacles.
- Squeezing was faster than ballooning in this case.
[Table: average steps to goal.]
27. Conclusions
- Improved success comes at a price in time.
- One strategy's performance is very poor in situations better suited to another strategy.
- The ballooning strategy is generally faster: ballooning robots can move through closely spaced objects faster than squeezing robots can move out of box-canyon situations.
28. Conclusions (cont'd)
- If some general knowledge of the terrain is known a priori, an appropriate strategy can be chosen.
- If the terrain is totally unknown, ballooning is probably the better choice.
- A way to dynamically switch strategies should improve performance.