Adaptive Intelligent Mobile Robotics - PowerPoint PPT Presentation

About This Presentation
Title: Adaptive Intelligent Mobile Robotics

Slides: 25
Provided by: lesliepack

Transcript and Presenter's Notes



1
  • Adaptive Intelligent Mobile Robotics
  • Leslie Pack Kaelbling
  • Artificial Intelligence Laboratory
  • MIT

2
Pyramid
  • Addressing problem at multiple levels

3
Built-in Behaviors
  • Goal: general-purpose, robust, visually guided
    local navigation
  • optical flow for depth information
  • finding the floor
  • optical flow information
  • Horswill's ground-plane method
  • build local occupancy grids
  • navigate given the grid
  • reactive methods
  • dynamic programming
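The last two bullets (build a local occupancy grid, then navigate it with dynamic programming) can be sketched as value iteration over a small grid. The 4-connected motion model, unit step costs, and function names are illustrative assumptions, not the lab's implementation:

```python
import numpy as np

def grid_costs(occ, goal, iters=None):
    """Value iteration over a local occupancy grid.
    occ: 2D bool array, True = occupied.  goal: (row, col).
    Returns cost-to-go (steps to goal) for every free cell."""
    rows, cols = occ.shape
    cost = np.full((rows, cols), np.inf)
    cost[goal] = 0.0
    for _ in range(iters or rows * cols):
        changed = False
        for r in range(rows):
            for c in range(cols):
                if occ[r, c]:
                    continue  # occupied cells stay unreachable
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        if cost[nr, nc] + 1 < cost[r, c]:
                            cost[r, c] = cost[nr, nc] + 1
                            changed = True
        if not changed:  # converged early
            break
    return cost

def next_step(cost, pos):
    """Navigate by greedy descent on the cost-to-go field."""
    r, c = pos
    moves = [(r + dr, c + dc)
             for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
             if 0 <= r + dr < cost.shape[0] and 0 <= c + dc < cost.shape[1]]
    return min(moves, key=lambda p: cost[p])
```

Once the costs are computed, the robot simply repeats `next_step` until it reaches the goal; obstacles are avoided because occupied cells keep infinite cost.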

4
Reactive Obstacle Avoidance
  • Standard method in mobile robotics is to use
    potential fields
  • attractive force toward goal
  • repulsive forces away from obstacles
  • robot moves in direction given by resultant force
  • New method for non-holonomic robots: move the
    center of the robot so that the front point is
    holonomic
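The standard potential-field scheme from the first bullets can be sketched in a few lines; the gains `k_att`, `k_rep`, and the influence radius are illustrative values, not experimentally tuned ones, and this covers only the standard method, not the front-point trick for non-holonomic robots:

```python
import math

def potential_field_heading(pos, goal, obstacles,
                            k_att=1.0, k_rep=0.5, influence=2.0):
    """Direction of the resultant force in potential-field avoidance.
    pos, goal, and each obstacle are (x, y) tuples.
    Returns the heading (radians) the robot should move along."""
    # Attractive force: pulls linearly toward the goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    # Repulsive forces: push away from each obstacle inside its
    # influence radius, growing sharply as the distance shrinks.
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            mag = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    return math.atan2(fy, fx)
```

With no obstacles the heading points straight at the goal; an obstacle near the path bends the heading away from it.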

5
Human Obstacle Avoidance
  • Control law based on visual angle and distance to
    goal and obstacles
  • Parameters set based on experiments with humans
    in large free-walking VR environment

6
Humans are Smooth!
7
Behavior Learning
  • Typical RL methods require far too much data to
    be practical in an online setting. We address the
    problem with:
  • strong generalization techniques
  • locally weighted regression
  • skeptical Q-Learning
  • bootstrapping from human-supplied policy
  • need not be optimal and might be very wrong
  • shows learner interesting parts of the space
  • bad initial policies might be more effective
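The bootstrapping idea above can be sketched as tabular Q-learning that defers to a human-supplied policy early on; this is an illustrative sketch of the idea, not the lab's actual learner, and the decay schedule, episode cap, and interface names are assumptions:

```python
import random

def bootstrapped_q(env, supplied_policy, actions,
                   episodes=200, alpha=0.1, gamma=0.95):
    """Tabular Q-learning seeded by a human-supplied policy.
    The supplied policy need not be optimal (it may be very wrong);
    it just steers early exploration toward interesting states.
    env must provide reset() -> state and step(a) -> (state, reward, done)."""
    Q = {}
    for ep in range(episodes):
        s = env.reset()
        # Probability of deferring to the supplied policy decays to 5%.
        follow = max(0.05, 1.0 - 2.0 * ep / episodes)
        for _ in range(100):                      # cap episode length
            if random.random() < follow:
                a = supplied_policy(s)            # bootstrap action
            else:
                a = max(actions, key=lambda b: Q.get((s, b), 0.0))
            s2, r, done = env.step(a)
            best_next = max(Q.get((s2, b), 0.0) for b in actions)
            target = r + (0.0 if done else gamma * best_next)
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return Q
```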

8
Two Learning Phases
Phase One
A
R
O
Learning System
9
Two Learning Phases
Phase Two
A
R
O
Learning System
10
New Results
  • Drive to goal, avoiding obstacles in visual field
  • Inputs (6 dimensions)
  • heading and distance to goal
  • image coordinates of two obstacles
  • Output
  • steering angle
  • Reward
  • +10 for getting to the goal, -5 for running over
    an obstacle
  • Training simple policy that avoids one obstacle
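The slide's input and reward specification can be written out directly; the function names and the pose convention are illustrative, while the dimensions and reward magnitudes come from the slide:

```python
import math

def observation(pose, goal, obs1_px, obs2_px):
    """Pack the 6-D input: heading and distance to the goal, plus
    the image coordinates of two obstacles (obs*_px come straight
    from the vision system).  pose is (x, y, theta)."""
    x, y, theta = pose
    heading = math.atan2(goal[1] - y, goal[0] - x) - theta
    dist = math.hypot(goal[0] - x, goal[1] - y)
    return [heading, dist, obs1_px[0], obs1_px[1], obs2_px[0], obs2_px[1]]

def reward(reached_goal, hit_obstacle):
    """Reward from the slide: +10 for the goal, -5 for an obstacle."""
    return (10.0 if reached_goal else 0.0) - (5.0 if hit_obstacle else 0.0)
```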

11
Robot's View
12
Local Navigation
13
Map Learning
  • Robot learns high-level structure of environment
  • topological maps appropriate for large-scale
    structure
  • low-level behaviors induce topology
  • based on previous work using sonar
  • vision changes problem dramatically
  • no more problems with many states looking the
    same
  • now the same state always looks different!

14
Sonar-Based Map Learning
Data
True Model
15
Current Issues in Map Learning
  • segmenting space into rooms
  • detecting doors and corridor openings
  • representation of places
  • stored images
  • gross 3D structure
  • features for image and structure matching

16
Large Simulation Domain
  • Use for learning and large-scale experimentation
    that is impractical on a real robot
  • built using video-game engine
  • large multi-story building
  • packages to deliver
  • battery power management
  • other agents (to survey)
  • dynamically appearing items to collect
  • general Bayes-net specification so it can be used
    widely as a test bed

17
Hierarchical MDP Planning
  • Large simulated domain has unspeakably many
    primitive states
  • Use hierarchical representation for planning
  • logarithmic improvement in planning times
  • some loss of optimality of plans
  • Existing work on planning and learning given a
    hierarchy
  • temporal abstraction: macro actions
  • spatial abstraction: aggregated states
  • Where does the hierarchy come from?
  • combined spatial and temporal abstraction
  • top-down splitting approach

18
Region-Based Hierarchies
  • Divide state space into regions
  • each region is a single abstract state at next
    level
  • policies for moving through regions are abstract
    actions at the next level
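The lifting step described above can be sketched as a small data transformation; the dictionary layout for regions and macros is an illustrative assumption:

```python
def lift(regions, macros):
    """Build the next level of the hierarchy: each region becomes a
    single abstract state, and each macro (a policy for crossing a
    region toward one neighboring region) becomes an abstract action.
    regions: {region: set of low-level states}
    macros:  {(region, neighbor): policy}"""
    abstract_states = set(regions)
    abstract_actions = {r: sorted(nbr for (src, nbr) in macros if src == r)
                        for r in regions}
    return abstract_states, abstract_actions
```

Applying `lift` repeatedly yields the multi-level hierarchy: the abstract states and actions at one level become the inputs to planning at the next.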

19
Choosing Macros
  • Given a choice of a region, what is a good set of
    macro actions for traversing it?
  • existing approaches guarantee optimality with a
    number of macros exponential in the number of
    exit states
  • our method is approximate, but works well when
    there are no large rewards inside the region

20
Point-Source Rewards
  • Compute a value function for each possible exit
    state, offline
  • Given a new valuation of all exit states online
  • Quickly combine value functions to determine
    near-optimal action
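The online combination step can be sketched as follows. Each offline value function `per_exit_values[e]` assumes a unit reward at exit `e`; given the new exit valuations, the values are scaled and combined. The max-combination below is exact for deterministic shortest-path regions with positive exit rewards and is otherwise an illustrative approximation:

```python
def combine_values(per_exit_values, exit_values):
    """Combine offline per-exit value functions with a fresh online
    valuation of the exits.
    per_exit_values: {exit: {state: value under unit reward at exit}}
    exit_values:     {exit: value assigned to that exit online}
    Returns the combined value for every state in the region."""
    states = next(iter(per_exit_values.values())).keys()
    return {s: max(per_exit_values[e][s] * exit_values[e]
                   for e in per_exit_values)
            for s in states}
```

Because the expensive per-exit solves happen offline, the online step is just a max over a handful of scaled lookups per state.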

21
Approximation is Good
22
How to Use the Hierarchy
  • Off-line:
  • Decompose environment into abstract states
  • Compute macro operators
  • On-line:
  • Given new goal, assign values to exits at highest
    level
  • Propagate values at each level
  • In current low-level region, choose action

23
What Makes a Decomposition Good?
  • Trade-off:
  • decrease in off-line planning time
  • decrease in on-line planning time
  • decrease in value of actions
  • We can articulate this criterion formally but
  • we can't solve it
  • Current research on reasonable approximations

24
Next Steps
  • Low-level
  • apply JAQL to tune obstacle avoidance behaviors
  • Map learning
  • landmark selection and representation
  • visual detection of openings
  • Hierarchy
  • algorithm for constructing decomposition
  • test hierarchical planning on huge simulated
    domain