Learning Momentum: Integration and Experimentation
1
Learning Momentum: Integration and Experimentation
  • Brian Lee and Ronald C. Arkin
  • Mobile Robot Laboratory
  • Georgia Tech
  • Atlanta, GA

2
Motivation
  • It's hard to derive controller parameters
    manually.
  • The parameter space increases exponentially with
    the number of parameters.
  • You don't always have a priori knowledge of the
    environment.
  • Without prior knowledge, a user can't confidently
    derive appropriate parameter values, so the robot
    must adapt on its own to what it finds.
  • Obstacle densities and layout in the environment
    may be heterogeneous.
  • Parameters that work well for one type of
    environment may not work well with another type.

3
Adaptation and Learning Methods DARPA MARS
  • Investigate robot shaping at five distinct levels
    in a hybrid robot software architecture
  • Implement algorithms within MissionLab mission
    specification system
  • Conduct experiments to evaluate performance of
    each technique
  • Combine techniques where possible
  • Integrate on a platform more suitable for
    realistic missions and continue development

4
Overview of techniques
  • CBR Wizardry: guide the operator
  • Probabilistic Planning: manage complexity for the
    operator
  • RL for Behavioral Assemblage Selection: learn what
    works for the robot
  • CBR for Behavior Transitions: adapt to situations
    the robot can recognize
  • Learning Momentum: vary robot parameters in real
    time

THE LEARNING CONTINUUM:
Deliberative (premission) . . . Behavioral switching . . . Reactive (online adaptation)
5
Basic Concepts of LM
  • Provides adaptability to behavior-based systems
  • A crude form of reinforcement learning.
  • If the robot is doing well, keep doing what it's
    doing; otherwise, try something different.
  • Behavior parameters are changed in response to
    progress and obstacles.
  • The system is still fully reactive.
  • Although the robot changes its behavior, there is
    no deliberation.
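A minimal Python sketch of this rule (the parameter names, step sizes, and the flip-on-failure heuristic are illustrative assumptions, not the paper's actual update scheme):

    def lm_update(params, deltas, making_progress):
        """Crude reinforcement: keep nudging parameters in the current
        direction while progress is good; when it stalls, reverse the
        adjustment ("try something different")."""
        if not making_progress:
            deltas = {name: -d for name, d in deltas.items()}
        params = {name: params[name] + deltas[name] for name in params}
        return params, deltas

    # Example usage with made-up values:
    params = {"goal_gain": 1.0, "wander_gain": 0.5}
    deltas = {"goal_gain": 0.1, "wander_gain": -0.05}
    params, deltas = lm_update(params, deltas, making_progress=False)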

6
Currently Used Behaviors
  • Move to Goal: always returns a vector pointing
    toward the goal position.
  • Avoid Obstacles: returns a sum of weighted vectors
    pointing away from obstacles.
  • Wander: returns vectors pointing in random
    directions.
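These behaviors can be sketched as simple 2-D vector fields. In the Python sketch below, the linear falloff inside the sphere of influence is an assumption; the slides say only that the avoid vectors are weighted:

    import math
    import random

    def move_to_goal(pos, goal):
        # Unit vector from the robot toward the goal.
        dx, dy = goal[0] - pos[0], goal[1] - pos[1]
        mag = math.hypot(dx, dy) or 1.0
        return (dx / mag, dy / mag)

    def avoid_obstacles(pos, obstacles, sphere):
        # Sum of vectors pointing away from each obstacle inside the
        # sphere of influence, weighted so closer obstacles push harder.
        vx = vy = 0.0
        for ox, oy in obstacles:
            dx, dy = pos[0] - ox, pos[1] - oy
            d = math.hypot(dx, dy)
            if 0.0 < d < sphere:
                w = (sphere - d) / sphere
                vx += w * dx / d
                vy += w * dy / d
        return (vx, vy)

    def wander(state):
        # Random unit vector, held for `persistence` consecutive steps.
        # Example initial state:
        # state = {"persistence": 5, "steps_left": 0, "heading": (1.0, 0.0)}
        if state["steps_left"] <= 0:
            angle = random.uniform(0.0, 2.0 * math.pi)
            state["heading"] = (math.cos(angle), math.sin(angle))
            state["steps_left"] = state["persistence"]
        state["steps_left"] -= 1
        return state["heading"]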

7
Adjustable Parameters
  • Move to goal vector gain
  • Avoid obstacle vector gain
  • Avoid obstacle sphere of influence: the radius
    around the robot inside of which obstacles are
    perceived
  • Wander vector gain
  • Wander persistence: the number of consecutive
    steps the wander vector points in the same
    direction
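Collected as a dictionary for the sketches that follow (the starting values are placeholders, not the paper's defaults):

    DEFAULT_PARAMS = {
        "goal_gain": 1.0,      # Gm: move-to-goal vector gain
        "obstacle_gain": 1.0,  # Go: avoid-obstacle vector gain
        "sphere": 2.0,         # S: sphere of influence, in meters
        "wander_gain": 0.5,    # Gw: wander vector gain
        "persistence": 5,      # P: wander persistence, in steps
    }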

8
Four Predefined Situations
  • no movement: M < T_movement
  • progress toward the goal: M > T_movement,
    P > T_progress
  • no progress with obstacles: M > T_movement,
    P < T_progress, O_count > T_obstacles
  • no progress without obstacles: M > T_movement,
    P < T_progress, O_count < T_obstacles

  where M = average movement, M_goal = average
  movement toward the goal, P = M_goal / M,
  O_count = obstacles encountered, and T_movement,
  T_progress, and T_obstacles are the movement,
  progress, and obstacle thresholds.
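As a Python sketch (the threshold values are illustrative; the slides define the thresholds but not their settings):

    def classify_situation(m_avg, m_goal_avg, o_count,
                           t_movement=0.1, t_progress=0.5, t_obstacles=3):
        p = m_goal_avg / m_avg if m_avg > 0 else 0.0  # P = M_goal / M
        if m_avg < t_movement:
            return "no movement"
        if p > t_progress:
            return "progress toward the goal"
        if o_count > t_obstacles:
            return "no progress with obstacles"
        return "no progress without obstacles"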

9
Parameter adjustments
[Table: sample parameter adjustments for ballooning]
10
Two Possible Strategies
  • Ballooning - Sphere of influence is increased
    when obstacles impede progress. The robot moves
    around large objects.
  • Squeezing - Sphere of influence is decreased when
    obstacles impede progress. The robot moves
    between closely spaced objects.
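Slide 9's actual adjustment table is not reproduced here; the sketch below only captures the shape of the idea, with invented delta values. The key difference between the strategies is the sign of the sphere-of-influence change when obstacles impede progress:

    # Per-situation parameter deltas (values invented for illustration).
    BALLOONING = {
        "no progress with obstacles": {"sphere": +0.3, "wander_gain": +0.1},
        "progress toward the goal": {"goal_gain": +0.1, "wander_gain": -0.1},
    }
    SQUEEZING = {
        "no progress with obstacles": {"sphere": -0.3, "obstacle_gain": -0.1},
        "progress toward the goal": {"goal_gain": +0.1, "wander_gain": -0.1},
    }

    def apply_deltas(params, table, situation):
        # Add each delta to the matching parameter, in place.
        for name, delta in table.get(situation, {}).items():
            params[name] += delta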

11
Integration
Base System
[Diagram: position/goal and obstacle information from the sensors feed
three behaviors, Move To Goal(Gm), Avoid Obstacles(Go, S), and
Wander(Gw, P); the controller sums their vectors into the output
direction.]
  • Gm: goal gain
  • Go: obstacle gain
  • S: obstacle sphere of influence
  • Gw: wander gain
  • P: wander persistence

12
Integration
Integrated System
[Diagram: the same base system with an LM Module added. On every step
the LM Module observes the robot's progress and feeds new Gm, Go, S,
Gw, and P parameters back to the behaviors.]
  • Gm: goal gain
  • Go: obstacle gain
  • S: obstacle sphere of influence
  • Gw: wander gain
  • P: wander persistence
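One step of the integrated system, as a Python sketch reusing the behavior functions, parameter dictionary, classifier, and delta tables sketched above (the composition is an assumption; the sum node from the diagram becomes a gain-weighted vector sum):

    def control_step(pos, goal, obstacles, params, wander_state,
                     m_avg, m_goal_avg, o_count, table):
        # LM module: classify the situation and adjust parameters.
        situation = classify_situation(m_avg, m_goal_avg, o_count)
        apply_deltas(params, table, situation)
        wander_state["persistence"] = params["persistence"]
        # Behavioral sum: weight each behavior's vector by its gain.
        gx, gy = move_to_goal(pos, goal)
        ax, ay = avoid_obstacles(pos, obstacles, params["sphere"])
        wx, wy = wander(wander_state)
        return (params["goal_gain"] * gx + params["obstacle_gain"] * ax
                + params["wander_gain"] * wx,
                params["goal_gain"] * gy + params["obstacle_gain"] * ay
                + params["wander_gain"] * wy)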

13
Experiments in Simulation
  • 150m x 150m area
  • robot moves from (10m, 10m) to (140m, 90m)
  • Obstacle densities of 15% and 20% were used.
  • Obstacle radii varied between 0.38m and 1.43m.

14
Ballooning
15
Observations on Ballooning
  • Covers a lot of area
  • Not as easily trapped in box canyon situations
  • May settle in locally clear areas
  • May require a high wander gain to carry the robot
    through closely spaced obstacles

16
Squeezing
17
Observations on Squeezing
  • Results in a straighter path
  • Moves easily through closely spaced obstacles
  • May get trapped in small box canyon situations
    for long periods of time

18
Simulations of the Real World
[Diagram: simulated setup of the real-world environment, a 24m x 10m
area with marked start and end places.]
19
Completion Rates For Simulation
[Charts: completion rates with uniform obstacle size (1m radii) and
with varying obstacle sizes (0.38m - 1.43m radii)]
20
Average Steps to Completion
[Charts: average steps to completion with uniform obstacle size (1m
radii) and with varying obstacle sizes (0.38m - 1.43m radii)]
21
Results From Simulated Real Environment
[Charts: completion rates and steps to completion]
  • As before, there is an increase in completion
    rates with an accompanying increase in steps to
    completion.

22
Simulation Results
  • Completion rates can be drastically improved.
  • Completion rate improvements come at a cost of
    time.
  • Ballooning and squeezing strategies are geared
    toward different situations.

23
Physical Robot Experiments
  • Nomad 150 robot
  • Sonar ring for obstacle avoidance
  • Traverses the length of a 24m x 10m room while
    negotiating obstacles

24
Outdoor Run (adaptive)
25
Outdoor Run (non-adaptive)
26
Physical Experiment Results
  • Non-learning robots became stuck.
  • Learning robots successfully negotiated the
    obstacles.
  • Squeezing was faster than ballooning in this case.

[Chart: average steps to goal]
27
Conclusions
  • Improved success comes at a cost of time.
  • Performance of one strategy is very poor in
    situations better suited for another strategy.
  • The ballooning strategy is generally faster.
  • Ballooning robots can move through closely spaced
    objects faster than squeezing robots can move out
    of box canyon situations.

28
Conclusions (cont'd)
  • If some general knowledge of the terrain is known
    a priori, an appropriate strategy can be chosen.
  • If terrain is totally unknown, ballooning is
    probably the better choice.
  • A way to dynamically switch strategies should
    improve performance.