Learning Behavioral Parameterization Using Spatio-Temporal Case-Based Reasoning

1
Learning Behavioral Parameterization Using
Spatio-Temporal Case-Based Reasoning
  • Maxim Likhachev, Michael Kaess, and Ronald C.
    Arkin
  • Mobile Robot Laboratory
  • Georgia Tech

This research was funded under the DARPA MARS
program.
2
Motivation
  • Constant parameterization of robotic behavior
    results in inefficient robot performance
  • Manual selection of the right parameters is
    difficult and tedious

3
Motivation (cont'd)
  • Use of Case-Based Reasoning (CBR) methodology
      • automatic selection of optimal parameters at
        run-time (ICRA '01)
      • each case is a set of behavioral parameters
        indexed by environmental features

[Figure: example cases: a front-obstructed case and a clear-to-goal case]
4
Motivation for the Current Research
  • The CBR module
      • improves robot performance (in simulations and on
        real robots)
      • avoids the manual configuration of behavioral
        parameters
  • The CBR module still required the creation of a
    case library, which
      • is dependent on the robot architecture
      • needs extensive experimentation to optimize cases
      • requires a good understanding of how CBR works
  • Solution: extend the CBR module to learn
      • new cases from scratch or optimize existing cases
      • in a separate training process or during missions

5
Related Work
  • Use of Case-Based Reasoning in the selection of
    behavioral parameters
      • ACBARR [Georgia Tech 92], SINS [Georgia Tech 93]
      • KINS [Chagas and Hallam]
  • Automatic optimization of behavioral parameters
      • genetic programming (e.g., GA-ROBOT [Ram et al.])
      • reinforcement learning (e.g., Learning Momentum
        [Lee et al.])

6
Behavioral Control and CBR Module
  • The CBR module controls the case output parameters:
      • weights for each behavior
      • BiasMove vector
      • Noise persistence
      • Obstacle sphere

7
Case Indices: Environmental Features
  • Spatial features: traversability vector
      • split the environment into K = 4 angular regions
      • compute the obstacle density within each region
      • transform the density into traversability
  • Temporal features
      • short-term velocity towards the goal
      • long-term velocity towards the goal
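
The spatial-feature steps above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name, the sensing range, the saturation count used to normalize density, and the linear density-to-traversability mapping are all assumptions made for the example.

```python
import math

def traversability_vector(obstacles, k=4, max_range=5.0, saturation=10):
    """Return k traversability values in [0, 1], one per angular region.

    obstacles: list of (x, y) obstacle readings in the robot frame.
    max_range and saturation are hypothetical tuning constants.
    """
    counts = [0] * k
    sector = 2 * math.pi / k
    for x, y in obstacles:
        if math.hypot(x, y) > max_range:
            continue  # ignore readings beyond the assumed sensing range
        angle = math.atan2(y, x) % (2 * math.pi)  # map to [0, 2*pi)
        counts[min(int(angle / sector), k - 1)] += 1
    # Density: fraction of the assumed saturation count, clipped to 1.
    densities = [min(c / saturation, 1.0) for c in counts]
    # Assumed mapping: traversability falls linearly as density rises.
    return [1.0 - d for d in densities]
```

A region with no readings gets traversability 1.0, and a region with at least `saturation` readings gets 0.0; any monotone decreasing mapping would serve the same purpose.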

8
Overview of non-learning CBR Module
9
Making the CBR Module Learn
[Figure: block diagram of the learning CBR module.
Processing blocks: Feature Identification; Spatial Features
Vector Matching (1st stage of Case Selection); Temporal
Features Vector Matching (2nd stage of Case Selection);
Random Selection Biased by Case Success and Spatial and
Temporal Similarities; Case Switching Decision Tree; Old
Case Performance Evaluation; New Case Creation (if
necessary); Case Adaptation; Case Application; Case Library.
Data passed between blocks: current environment; spatial and
temporal feature vectors; all the cases in the library; set
of spatially matching cases; set of spatially and temporally
matching cases; best matching case; best matching or
currently used case; new or existing best matching case;
last K cases; last K cases with adjusted performance
history; case ready for application; case output parameters
(behavioral assemblage parameters).]
10
Extensive Exploration of Cases: Modified Case
Selection Process
  • Random selection of cases, with the probability of
    selection proportional to
      • spatial similarity with the environment (1st
        step)
      • temporal similarity with the environment (2nd
        step)
      • a weighted sum of the case's past performance and
        its spatial and temporal similarities (3rd step)
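
One possible reading of the three selection steps is sketched below. The slides do not give the exact probabilities, so several things here are assumptions: in steps 1 and 2 each case is kept with probability equal to its similarity divided by the best similarity in the set, step 3 is a roulette-wheel draw, and the weights, the dictionary layout of a case, and all function names are hypothetical.

```python
import random

def keep_by_similarity(cases, similarity, rng):
    """Keep each case with probability proportional to its similarity.

    Assumption: the probability is the similarity divided by the best
    similarity in the set, so the best match is always kept.
    """
    best = max(similarity(c) for c in cases)
    if best <= 0:
        return list(cases)
    return [c for c in cases if rng.random() < similarity(c) / best]

def select_case(library, spatial_sim, temporal_sim, rng=random,
                w_success=0.5, w_spatial=0.25, w_temporal=0.25):
    """Three-step biased random selection (weights are hypothetical)."""
    stage1 = keep_by_similarity(library, spatial_sim, rng)   # 1st step
    stage2 = keep_by_similarity(stage1, temporal_sim, rng)   # 2nd step
    # 3rd step: weighted sum of past performance and both similarities.
    scores = [w_success * c["success"] + w_spatial * spatial_sim(c)
              + w_temporal * temporal_sim(c) for c in stage2]
    if sum(scores) <= 0:
        return rng.choice(stage2)
    return rng.choices(stage2, weights=scores, k=1)[0]
```

Because selection is random rather than greedy, a case that is not the current best match still gets applied occasionally, which is what allows its parameters to be explored and its performance to be measured.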

11
Positive and Negative Reinforcement: Case
Performance Evaluation
  • Criterion for the evaluation of case performance
      • the average velocity with which the robot
        approaches its goal during the application of the
        case
      • provides opportunities for intermediate case
        performance evaluations
      • may not always be the right criterion: some
        cases exhibit no positive velocity towards the
        goal
      • the evaluation of the performance is delayed by
        K (= 2) cases
  • case_success (represents case performance) is
      • increased if the average velocity is increased
        or sustained high
      • decreased otherwise
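
A minimal sketch of the delayed reinforcement rule follows. The slide fixes only the direction of the update and the delay of K = 2 cases; the class name, the step size, the "high velocity" threshold, the clamping of case_success to [0, 1], and the choice to compare against the previous case's average velocity are all assumptions.

```python
from collections import deque

class SuccessTracker:
    """Delayed reinforcement of case_success (all constants hypothetical)."""

    def __init__(self, delay=2, high_velocity=0.8, step=0.1):
        self.delay = delay                  # K: evaluate a case K cases later
        self.high_velocity = high_velocity  # assumed "sustained high" level
        self.step = step                    # assumed reinforcement step
        self.pending = deque()              # cases awaiting evaluation
        self.prev_velocity = 0.0

    def case_finished(self, case, avg_velocity):
        """Record a finished case; reinforce the one applied K cases ago."""
        self.pending.append(case)
        if len(self.pending) > self.delay:
            old = self.pending.popleft()
            improved = avg_velocity > self.prev_velocity
            sustained = avg_velocity >= self.high_velocity
            delta = self.step if (improved or sustained) else -self.step
            old["success"] = min(1.0, max(0.0, old["success"] + delta))
        self.prev_velocity = avg_velocity
```

Delaying the update means a case that makes no progress itself, but sets up later progress, can still be credited once the following cases have run.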

12
Maximization of Reinforcement: Case Adaptation
  • Maximize case_success as a noisy function of the case
    output parameters (behavioral assemblage
    parameters)
  • Maintain the adaptation vector A(C) for each case
    C
      • if the last series of adaptations resulted in an
        increase of case_success, then continue the
        adaptation:
        O(C) = O(C) + A(C)
      • otherwise switch the direction of the adaptation,
        add a random component R, and scale proportionally
        to case_success:
        A(C) = -λ·A(C) + ν·R
        O(C) = O(C) + A(C)
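
The adaptation rule can be sketched as a single update step. This is an illustration under assumptions: the two scale factors are passed in as `lam` and `nu` because the slide does not say exactly how they depend on case_success, the random component is drawn uniformly from [-1, 1] per parameter, and the dictionary keys are hypothetical.

```python
import random

def adapt_case(case, improved, lam, nu, rng=random):
    """Apply one adaptation step to a case, in place.

    case: dict with "output" (the vector O(C)) and "adaptation" (A(C)).
    improved: True if the last adaptations increased case_success.
    lam, nu: scale factors for the reversed and random terms; how they
        are derived from case_success is left to the caller.
    """
    if not improved:
        # Reverse direction and add a random component:
        # A(C) = -lam * A(C) + nu * R
        case["adaptation"] = [
            -lam * a + nu * rng.uniform(-1.0, 1.0)
            for a in case["adaptation"]
        ]
    # In both branches: O(C) = O(C) + A(C)
    case["output"] = [o + a
                      for o, a in zip(case["output"], case["adaptation"])]
    return case
```

For example, with `improved=True` a case with output `[1.0, 2.0]` and adaptation `[0.5, -0.5]` moves to `[1.5, 1.5]` and keeps its adaptation vector for the next step.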

13
Maximization of Reinforcement: Case Adaptation
(cont'd)
  • Incorporate prior knowledge into the search
      • fixed adaptation of the Noise_Gain and
        Noise_Persistence parameters based on the short-
        and long-term velocities of the robot
  • Constrain the search
      • limit Obstacle_Gain to be higher than the sum of
        the other schema gains (to avoid collisions)
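
The Obstacle_Gain constraint could be enforced after each adaptation step along these lines. This is a sketch only: the slide states the constraint but not how it is applied, so raising the gain to just above the sum of the others, the `margin` value, and the gain names other than Obstacle_Gain and Noise_Gain are assumptions.

```python
def constrain_obstacle_gain(gains, margin=0.01):
    """Return a copy of gains with Obstacle_Gain above the sum of the rest.

    gains: dict mapping schema gain names to values.
    margin: hypothetical amount by which Obstacle_Gain must exceed the sum.
    """
    others = sum(v for name, v in gains.items() if name != "Obstacle_Gain")
    fixed = dict(gains)
    if fixed["Obstacle_Gain"] <= others:
        fixed["Obstacle_Gain"] = others + margin
    return fixed
```

Projecting back onto the feasible region after every step keeps the random search from ever producing a parameter set in which the goal-directed and noise behaviors together overpower obstacle avoidance.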

14
The Growth of the Case Library: Case Creation
Decision
  • To avoid divergence, a new case is created
    whenever
      • case_success of the selected case is high and
        spatial and temporal similarities with the
        environment are low to moderate
      • case_success of the selected case is low to
        moderate and spatial and temporal similarities
        are low
  • Limit the maximum size of the library (10 in this
    work)
  • A new case is initialized with
      • the spatial and temporal features of the
        environment
      • the output parameter values of the selected case
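
The case creation rule above can be written as a small decision function. The numeric thresholds for "high", "moderate", and "low", the requirement that both similarities fall in the stated range, the initial case_success of a new case, and the dictionary layout are assumptions for illustration; only the library limit of 10 comes from the slide.

```python
def should_create_case(success, spatial_sim, temporal_sim, library_size,
                       max_cases=10, high_success=0.7,
                       low_sim=0.3, moderate_sim=0.7):
    """Decide whether to add a new case (thresholds are hypothetical)."""
    if library_size >= max_cases:
        return False  # library is full
    if success >= high_success:
        # Successful case: create only if similarities are low to moderate.
        return spatial_sim <= moderate_sim and temporal_sim <= moderate_sim
    # Low-to-moderate success: create only if similarities are low.
    return spatial_sim <= low_sim and temporal_sim <= low_sim

def new_case(spatial_features, temporal_features, selected_case,
             initial_success=0.0):
    """Build a new case from the environment and the selected case."""
    return {
        "spatial": list(spatial_features),         # from the environment
        "temporal": list(temporal_features),       # from the environment
        "output": list(selected_case["output"]),   # copied parameters
        "success": initial_success,                # assumed starting value
    }
```

The asymmetry in the two conditions protects a well-performing case from being adapted towards an environment it only partly matches, while a poorly performing case is reused and adapted unless the match is clearly bad.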

15
Experimental Analysis: Example
Learning CBR: first run (starting with an empty
library)
16
Experimental Analysis: Example
  • Learning CBR: a run after 54 training runs on
    various environments
      • a library of ten cases was learned
      • 36 percent shorter travel distance

[Figure annotations: a case of a clear-to-goal strategy is
learned for such environments; a case of a squeezing
strategy is learned for such environments]
17
Experiments: Statistical Results
  • Simulation results (after 250 training runs for
    the learning CBR system)

[Figure: charts of mission completion rate and average
number of steps for the non-adaptive, CBR, and learning CBR
systems, in heterogeneous and homogeneous environments]
18
Real Robot Experiments: In Progress
  • RWI ATRV-Jr
  • Sensors
      • SICK laser scanners in front and back
      • compass
      • gyroscope
  • Experiments in progress, no statistical results
    yet

19
Conclusions
  • New and existing cases are learned and optimized
    during a training process or as part of mission
    execution
  • Performance
      • substantially better than that of a non-adaptive
        system
      • comparable to that of a non-learning CBR system
  • Neither manual selection of behavioral parameters
    nor careful creation and optimization of a case
    library is required from the user
  • Future Work
      • real robot experiments
      • case forgetting component
      • integration with other adaptation and learning
        methods (e.g., Learning Momentum, RL for
        Behavioral Assemblage Selection)