Title: Machine Learning and Robotics
1. Machine Learning and Robotics
2. Outline
- Machine Learning Basics and Terminology
- An Example: DARPA Grand/Urban Challenge
- Multi-Agent Systems
- Netflix Challenge (if time permits)
3. Introduction
- Machine learning is commonly associated with robotics
- When some think of robots, they think of machines like WALL-E (right): human-looking, has feelings, capable of complex tasks
- Goals for machine learning in robotics aren't usually this advanced, but some think we're getting there
- The next three slides outline some goals that motivate researchers to continue work in this area
4. Household Robot to Assist the Handicapped
- Could come preprogrammed with general procedures and behaviors
- Needs to be able to learn to recognize objects and obstacles, and maybe even its owner (face recognition?)
- Also needs to be able to manipulate objects without breaking them
- May not always have complete information about its environment (poor lighting, obscured objects)
5. Flexible Manufacturing Robot
- A configurable robot that could manufacture multiple items
- Must learn to manipulate new types of parts without damaging them
6. Learning Spoken Dialog System for Repairs
- Given some initial information about a system, a robot could converse with a human and help to repair it
- Speech understanding is a very hard problem in itself
7. Machine Learning Basics and Terminology
- With applications and examples in robotics
8. Learning Associations
- Association rule: the probability that an event will happen given that another event already has, P(Y|X)
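A toy sketch of this idea in Python: the rule's confidence P(Y|X) is estimated by counting over a set of market baskets. The items and data below are invented purely for illustration.

```python
# Hypothetical example: estimating an association rule P(Y | X) from
# transaction data by simple counting (item names and baskets are made up).
from fractions import Fraction

def association_confidence(transactions, x, y):
    """P(Y | X): fraction of transactions containing x that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return Fraction(0)
    return Fraction(sum(1 for t in with_x if y in t), len(with_x))

baskets = [
    {"chips", "soda"},
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"bread", "milk"},
]
print(association_confidence(baskets, "chips", "soda"))  # 2/3
```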
9. Classification
- Classification: a model where input is assigned to a class based on some data
- Prediction: assuming a future scenario is similar to a past one, using past data to decide what this scenario will look like
- Pattern Recognition: a method used to make predictions
  - Face Recognition
  - Speech Recognition
- Knowledge Extraction: learning a rule from data
- Outlier Detection: finding exceptions to the rules
10. Regression
- Linear regression is an example
- Both Classification and Regression are Supervised Learning strategies, where the goal is to find a mapping from input to output
- Example: navigation of an autonomous car
  - Training data: actions of human drivers in various situations
  - Input: data from sensors (like GPS or video)
  - Output: angle to turn the steering wheel
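The input-to-output mapping can be sketched with a 1-D least-squares fit. The "lateral offset to steering angle" feature and all numbers below are invented for illustration; they are not Stanley's actual controller.

```python
# A minimal sketch of supervised learning via 1-D linear regression
# (illustrative data: one sensor reading mapped to a steering angle).

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# Training data: lateral offset from road center (m) -> steering angle (deg)
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
angles  = [10.0,  5.0, 0.0, -5.0, -10.0]
a, b = fit_line(offsets, angles)
print(a, b)  # -5.0 0.0
```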
11. Unsupervised Learning
- Only have input
- Want to find regularities in the input
- Density Estimation: finding patterns in the input space
- Clustering: finding groupings in the input
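Clustering can be illustrated with a minimal k-means pass over 1-D data. The data points and initial centers are made up for the example.

```python
# A minimal k-means clustering sketch on 1-D data (pure Python).
def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest center
        groups = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(p - c))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group
        centers = [sum(g) / len(g) if g else c for c, g in groups.items()]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print(kmeans_1d(data, centers=[0.0, 10.0]))  # [1.0, 9.0]
```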
12. Reinforcement Learning
- Policy: generating correct actions to reach the goal
- Learn from past good policies
- Example: a robot navigating an unknown environment in search of a goal
- Some data may be missing
- There may be multiple agents in the system
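A toy tabular Q-learning sketch of the navigation idea: an agent on a short corridor learns a policy that reaches a goal. The environment, reward, and hyperparameters are all invented for illustration; this is not any particular robot's algorithm.

```python
# Toy Q-learning: an agent on a 1-D corridor learns to walk right
# toward a goal state (environment and rewards are made up).
import random

N_STATES, GOAL = 5, 4          # states 0..4, goal at the right end
ACTIONS = [-1, +1]             # step left / step right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
alpha, gamma, eps = 0.5, 0.9, 0.2
for _ in range(500):           # training episodes
    s = 0
    while s != GOAL:
        if random.random() < eps:                       # explore
            a = random.choice(ACTIONS)
        else:                                           # exploit
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS)
                              - q[(s, a)])
        s = s2

# The learned policy steps right in every non-goal state
policy = [max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```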
13. Possible Applications
- Exploring a world
- Learning object properties
- Learning to interact with the world and with objects
- Optimizing actions
- Recognizing states in a world model
- Monitoring actions to ensure correctness
- Recognizing and repairing errors
- Planning
- Learning action rules
- Deciding actions based on tasks
14. What We Expect Robots to Do
- React promptly and correctly to changes in the environment or internal state
- Work in situations where information about the environment is imperfect or incomplete
- Learn through experience and human guidance
- Respond quickly to human interaction
- Unfortunately, these are very high expectations, which don't always correlate well with machine learning techniques
15. Differences Between Other Types of Machine Learning and Robotics
- Other machine learning:
  - Planning can frequently be done offline
  - Actions are usually deterministic
  - No major time constraints
- Robotics:
  - Often requires simultaneous planning and execution (online)
  - Actions can be nondeterministic, depending on data (or lack thereof)
  - Real-time performance is often required
16. An Example: DARPA Grand/Urban Challenge
17. The Challenge
- Defense Advanced Research Projects Agency (DARPA)
- Goal: build a vehicle capable of traversing unrehearsed off-road terrain
- Started in 2003
- 142-mile course through the Mojave Desert
- No one made it through more than 5% of the course in the 2004 race
- In 2005, 195 teams registered, 23 teams raced, and 5 teams finished
18. The Rules
- Must traverse a desert course up to 175 miles long in under 10 hours
- Course kept secret until 2 hours before the race
- Must follow speed limits for specific areas of the course to protect infrastructure and ecology
- If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don't have to handle dynamic passing
- Teams are given data on the course 2 hours before the race, so no global path planning is required
19. A DARPA Grand Challenge Vehicle Crashing
20. A DARPA Grand Challenge Vehicle that Did Not Crash
- Namely Stanley, the winner of the 2005 challenge
21. Terrain Mapping and Obstacle Detection
- Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what's in front of the car
- Classification problem, with three classes:
  - Drivable
  - Occupied
  - Unknown
- The area in front of the vehicle is represented as a grid
- Stanley's system finds the probability that Δh > δ, where Δh is the observed height of the terrain in a certain cell
- If this probability is higher than some threshold α, the system labels the cell as occupied
22. Terrain Mapping and Obstacle Detection (cont.)
- A discriminative learning algorithm is used to tune the parameters
- Data is taken as a human driver drives through a mapped terrain avoiding obstacles (supervised learning)
- The algorithm uses coordinate ascent to determine δ and α
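Coordinate ascent in general looks like the sketch below: tune two parameters (called delta and alpha here, after the slide) one at a time, each pass holding the other fixed. The objective function is a made-up stand-in, not Stanley's actual training objective.

```python
# Generic coordinate-ascent sketch: alternate between optimizing delta
# with alpha fixed and alpha with delta fixed, over small candidate grids.
def coordinate_ascent(score, deltas, alphas, iters=5):
    d, a = deltas[0], alphas[0]
    for _ in range(iters):
        d = max(deltas, key=lambda x: score(x, a))  # best delta, alpha fixed
        a = max(alphas, key=lambda x: score(d, x))  # best alpha, delta fixed
    return d, a

# Toy objective peaked at delta=0.15, alpha=0.9 (invented numbers)
score = lambda d, a: -((d - 0.15) ** 2) - (a - 0.9) ** 2
grid_d = [0.05, 0.10, 0.15, 0.20]
grid_a = [0.5, 0.7, 0.9, 0.99]
print(coordinate_ascent(score, grid_d, grid_a))  # (0.15, 0.9)
```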
23. Computer Vision Aspect
- Lasers only make it safe for the car to drive at under 25 mph
- It needs to go faster to satisfy the time constraint
- A color camera is used for long-range obstacle detection
- Still the same classification problem
- Now there are more factors to consider: lighting, material, dust on the lens
- Stanley takes an adaptive approach
24. Vision Algorithm
- Take out the sky
- Map a quadrilateral onto the camera video corresponding to the laser sensor boundaries
- As long as this region is deemed drivable, use the pixels in the quad as a training set for the concept of "drivable surface"
- Maintain Gaussians that model the color of drivable terrain
- Adapt by adjusting previous Gaussians and/or throwing them out and adding new ones
  - Adjustment allows for slow adaptation to lighting conditions
  - Replacement allows for rapid changes in the color of the road
- Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to the laser quadrilateral
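A simplified sketch of the "near a Gaussian" test above: keep per-channel means and variances for drivable pixels and accept a new pixel if every channel lies within a few standard deviations. The diagonal covariance, the 3-sigma cutoff, and all the color numbers are simplifying assumptions for the example.

```python
# Simplified Gaussian color-model membership test (diagonal covariance,
# invented RGB numbers; not Stanley's actual mixture model).
def near_gaussian(pixel, mean, var, k=3.0):
    """True if every channel is within k standard deviations of the mean."""
    return all(abs(p - m) <= k * v ** 0.5
               for p, m, v in zip(pixel, mean, var))

# One stored Gaussian for a sandy road color (RGB)
road = {"mean": (180.0, 160.0, 120.0), "var": (100.0, 100.0, 100.0)}

print(near_gaussian((185, 158, 125), road["mean"], road["var"]))  # True
print(near_gaussian((40, 90, 200), road["mean"], road["var"]))    # False
```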
26. Road Boundaries
- The best way to avoid obstacles on a desert road is to find the road boundaries and drive down the middle
- Uses low-pass, one-dimensional Kalman filters to determine the road boundary on both sides of the vehicle
- Small obstacles don't really affect the boundary found
- Large obstacles, over time, have a stronger effect
27. Slope and Ruggedness
- If the terrain becomes too rugged or steep, the vehicle must slow down to maintain control
- Slope is found from the vehicle's pitch estimate
- Ruggedness is determined by taking data from the vehicle's z-axis accelerometer, with gravity and vehicle vibration filtered out
28. Path Planning
- No global planning necessary
- The coordinate system used is base trajectory plus lateral offset
- The base trajectory is a smoothed version of the driving corridor on the map given to contestants before the race
29. Path Smoothing
- The base trajectory is computed in 4 steps:
  1. Points are added to the map in proportion to local curvature
  2. Least-squares optimization is used to adjust trajectories for smoothing
  3. Cubic spline interpolation is used to find a path that can be resampled efficiently
  4. The speed limit is calculated
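The least-squares smoothing step can be sketched as minimizing distance to the original waypoints plus a smoothness penalty between neighbors. Here it is done on 1-D lateral offsets by simple gradient descent; the weight, step size, and waypoints are invented for the example and this is only a caricature of Stanley's smoother.

```python
# Least-squares path smoothing sketch: gradient descent on
# 0.5*sum (x_i - z_i)^2 + 0.5*weight*sum (x_{i+1} - x_i)^2.
def smooth(path, weight=1.0, rate=0.1, iters=200):
    x = list(path)
    for _ in range(iters):
        for i in range(1, len(x) - 1):       # endpoints stay fixed
            grad = (x[i] - path[i]) + weight * (2 * x[i] - x[i-1] - x[i+1])
            x[i] -= rate * grad
    return x

jagged = [0.0, 2.0, 0.0, 2.0, 0.0]
print([round(v, 2) for v in smooth(jagged)])  # [0.0, 0.86, 0.57, 0.86, 0.0]
```

The interior points are pulled toward their neighbors, trading fidelity to the original zig-zag for a smoother curve.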
30. Online Path Planning
- Determines the actual trajectory of the vehicle during the race
- A search algorithm minimizes a linear combination of continuous cost functions
- Subject to dynamic and kinematic constraints:
  - Max lateral acceleration
  - Max steering angle
  - Max steering rate
  - Max acceleration
- Penalizes hitting obstacles, leaving the corridor, and leaving the center of the road
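The "linear combination of cost functions" idea can be sketched as choosing, from a small set of candidate lateral offsets, the one that minimizes a weighted sum of penalty terms. The terms, weights, and candidates below are illustrative stand-ins, not Stanley's planner, and the kinematic constraints are omitted.

```python
# Cost-based selection sketch: weighted penalties for collisions,
# leaving the corridor, and straying from the road center.
def plan_offset(candidates, obstacles, corridor_half_width,
                w_obs=100.0, w_corridor=10.0, w_center=1.0):
    def cost(o):
        c = w_center * abs(o)                          # stay near center
        if abs(o) > corridor_half_width:               # leaving the corridor
            c += w_corridor * (abs(o) - corridor_half_width)
        c += sum(w_obs for ob in obstacles if abs(o - ob) < 0.5)  # collisions
        return c
    return min(candidates, key=cost)

candidates = [-2.0, -1.0, 0.0, 1.0, 2.0]
obstacles = [0.0]                   # obstacle on the centerline
print(plan_offset(candidates, obstacles, corridor_half_width=1.5))  # -1.0
```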
32. Multi-Agent Systems
33. Recursive Modeling Method (RMM)
- Agents model the belief states of other agents
- Bayesian methods implemented
- Useful in homogeneous, non-communicating Multi-Agent Systems (MAS)
- The recursion has to be cut off at some point (you don't want a situation where agent A thinks that agent B thinks that agent A thinks that...)
- Agents can affect other agents by affecting the environment to produce a desired reaction
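The need for a cutoff can be seen in a toy depth-limited recursion: agent A predicts B's action by modeling B modeling A, down to a fixed depth. The matching-pennies-style rules and the base-case assumption are invented for illustration; note how the prediction depends on where the recursion stops.

```python
# Depth-limited recursive modeling: A tries to match its model of B,
# B tries to mismatch its model of A (toy game, invented base case).
def best_action(agent, depth):
    if depth == 0:
        return "heads"                       # base-case assumption about behavior
    other = best_action("B" if agent == "A" else "A", depth - 1)
    if agent == "A":
        return other                         # A matches its model of B
    return "tails" if other == "heads" else "heads"  # B mismatches its model of A

# A's predicted best action shifts as the modeling depth changes
print([best_action("A", d) for d in range(1, 5)])
# ['heads', 'tails', 'tails', 'heads']
```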
34. Heterogeneous Non-Communicating MAS
- Competitive and cooperative learning possible
- Competitive learning is more difficult because agents may end up in an "arms race"
- Credit-assignment problem:
  - Can't tell if an agent benefitted because its actions were good or because its opponent's actions were bad
- Experts and observers have proven useful
- Different agents may be given different roles to reach the goal
  - Supervised learning can teach each agent how to do its part
35. Communication
- Allowing agents to communicate can lead to deeper levels of planning, since agents know (or think they know) the beliefs of others
- Could allow one agent to "train" another to follow its actions using reinforcement learning
- Negotiations
- Commitment
- Autonomous robots could understand their position in an environment by querying other robots for their believed positions and making a guess based on that (Markov localization, SLAM)
36. Netflix Challenge
37. References
- Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004.
- Kreuziger, J. "Application of Machine Learning to Robotics: An Analysis." In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92), 1992.
- Mitchell et al. "Machine Learning." Annu. Rev. Comput. Sci. 1990. 4:417-33.
- Stone, P. and Veloso, M. "Multiagent Systems: A Survey from a Machine Learning Perspective." Autonomous Robots 8, 345-383, 2000.
- Thrun et al. "Stanley: The Robot that Won the DARPA Grand Challenge." Journal of Field Robotics 23(9), 661-692, 2006.