Title: Advanced Topics in Robotics CS493790 X
1Advanced Topics in Robotics CS493/790 (X)
- Lecture 2
- Instructor Monica Nicolescu
2Robot Components
- Sensors
- Effectors and actuators
- Used for locomotion and manipulation
- Controllers for the above systems
- Coordinating information from sensors
- with commands for the robots actuators
- Robot an autonomous system which exists in the
physical world, can sense its environment and can
act on it to achieve some goals
3Sensors
- Sensor physical device that provides
- information about the world
- Process is called sensing or perception
- Robot sensing depends on the task
- Sensor (perceptual) space
- All possible values of sensor readings
- One needs to see the world through the robots
eyes - Grows quickly as you add more sensors
4State Space
- State A description of the robot (of a system in
general) - State space All possible states a robot could be
in - E.g. light switch has two states, ON, OFF light
switch with dimmer has continuous state (possibly
infinitely many states) - Different than the sensor/perceptual space!!
- Internal state may be used to store information
about the world (maps, location of food, etc.) - How intelligent a robot appears is strongly
dependent on how much and how fast it can sense
its environment and about itself
5Representation
- Internal state that stores information about the
world is called a representation or internal
model - Self stored proprioception, goals, intentions,
plans - Environment maps
- Objects, people, other robots
- Task what needs to be done, when, in what order
- Representations and models influence determine
the complexity of a robots brain
6Action
- Effectors devices of the robot that have impact
on the environment (legs, wings ? robotic legs,
propeller) - Actuators mechanisms that allow the effectors to
do their work (muscles ? motors) - Robotic actuators are used for
- locomotion (moving around, going places)
- manipulation (handling objects)
7Autonomy
- Autonomy is the ability to make ones own
decisions and act on them. - For robots take the appropriate action on a
given situation - Autonomy can be complete (R2D2) or partial
(teleoperated robots)
8Control Architectures
- Robot control is the means by which the sensing
and action of a robot are coordinated - Controllers enable robots to be autonomous
- Play the role of the brain and nervous system
in animals - Controllers need not (should not) be a single
program - Typically more than one controller, each process
information from sensors and decide what actions
to take - Should control modules be centralized?
- Challenge how do all these controllers
coordinate with each other?
9Spectrum of robot control
From Behavior-Based Robotics by R. Arkin, MIT
Press, 1998
10Robot control approaches
- Reactive Control
- Dont think, (re)act.
- Deliberative (Planner-based) Control
- Think hard, act later.
- Hybrid Control
- Think and act separately concurrently.
- Behavior-Based Control (BBC)
- Think the way you act.
11Thinking vs. Acting
- Thinking/Deliberating
- slow, speed decreases with complexity
- involves planning (looking into the future) to
avoid bad solutions - thinking too long may be dangerous
- requires (a lot of) accurate information
- flexible for increasing complexity
- Acting/Reaction
- fast, regardless of complexity
- innate/built-in or learned (from looking into the
past) - limited flexibility for increasing complexity
12Reactive Control Dont think, react!
- Technique for tightly coupling perception and
action to provide fast responses to changing,
unstructured environments - Collection of stimulus-response rules
- Limitations
- No/minimal state
- No memory
- No internal representations
- of the world
- Unable to plan ahead
- Unable to learn
- Advantages
- Very fast and reactive
- Powerful method animals are largely reactive
13Deliberative Control Think hard, then act!
- In DC the robot uses all the available sensory
information and stored internal knowledge to
create a plan of action sense ? plan ? act (SPA)
paradigm - Limitations
- Planning requires search through potentially all
possible plans ? these take a long time - Requires a world model, which may become outdated
- Too slow for real-time response
- Advantages
- Capable of learning and prediction
- Finds strategic solutions
14Hybrid Control Think and act independently
concurrently!
- Combination of reactive and deliberative control
- Reactive layer (bottom) deals with immediate
reaction - Deliberative layer (top) creates plans
- Middle layer connects the two layers
- Usually called three-layer systems
- Major challenge design of the middle layer
- Reactive and deliberative layers operate on very
different time-scales and representations
(signals vs. symbols) - These layers must operate concurrently
- Currently one of the two dominant control
paradigms in robotics
15Behavior-Based Control Think the way you act!
- Behaviors concurrent processes that take inputs
from sensors and other behaviors and send outputs
to a robots actuators or other behaviors to
achieve some goals - An alternative to hybrid control, inspired from
biology - Has the same capabilities as hybrid control
- Act reactively and deliberatively
- Also built from layers
- However, there is no intermediate layer
- Components have a uniform representation and
time-scale
16Fundamental Differences of Control
- Time-scale How fast do things happen?
- how quickly the robot has to respond to the
environment, compared to how quickly it can sense
and think - Modularity What are the components of the
control system? - Refers to the way the control system is broken up
into modules and how they interact with each
other - Representation What does the robot keep in its
brain? - The form in which information is stored or
encoded in the robot
17Behavior Coordination
- Behavior-based systems require consistent
coordination between the component behaviors for
conflict resolution - Coordination of behaviors can be
- Competitive one behaviors output is selected
from multiple candidates - Cooperative blend the output of multiple
behaviors - Combination of the above two
18Competitive Coordination
- Arbitration winner-take-all strategy ? only one
response chosen - Behavioral prioritization
- Subsumption Architecture
- Action selection/activation spreading (Pattie
Maes) - Behaviors actively compete with each other
- Each behavior has an activation level driven by
the robots goals and sensory information - Voting strategies
- Behaviors cast votes on potential responses
19Cooperative Coordination
- Fusion concurrently use the output of multiple
behaviors - Major difficulty in finding a uniform command
representation amenable to fusion - Fuzzy methods
- Formal methods
- Potential fields
- Motor schemas
- Dynamical systems
20Example of Behavior Coordination
Fusion ?flocking (formations)
Arbitration ? foraging (search, coverage)
21How to Choose a Control Architecture?
- For any robot, task, or environment consider
- Is there a lot of sensor noise?
- Does the environment change or is static?
- Can the robot sense all that it needs?
- How quickly should the robot sense or act?
- Should the robot remember the past to get the job
done? - Should the robot look ahead to get the job done?
- Does the robot need to improve its behavior and
be able to learn new things?
22Learning Adaptive Behavior
- Learning produces changes within an agent that
over time enable it to perform more effectively
within its environment - Adaptation refers to an agents learning by
making adjustments in order to be more attuned to
its environment - Phenotypic (within an individual agent) or
genotypic (evolutionary) - Acclimatization (slow) or homeostasis (rapid)
23Learning
- Learning can improve performance in additional
ways - Introduce new knowledge (facts, behaviors, rules)
- Generalize concepts
- Specialize concepts for specific situations
- Reorganize information
- Create or discover new concepts
- Create explanations
- Reuse past experiences
24Learning Methods
- Reinforcement learning
- Neural network (connectionist) learning
- Evolutionary learning
- Learning from experience
- Memory-based
- Case-based
- Learning from demonstration
- Inductive learning
- Explanation-based learning
- Multistrategy learning
25Reinforcement Learning (RL)
- Motivated by psychology (the Law of Effect,
Thorndike 1991) - Applying a reward immediately after the
occurrence of a response increases its
probability of reoccurring, while providing
punishment after the response will decrease the
probability - One of the most widely used methods for
adaptation in robotics
26Reinforcement Learning
- Goal learn an optimal policy that chooses the
- best action for every set of possible inputs
- Policy state/action mapping that determines
- which actions to take
- Desirable outcomes are strengthened and
undesirable outcomes are weakened - Critic evaluates the systems response and
applies reinforcement - external the user provides the reinforcement
- internal the system itself provides the
reinforcement (reward function)
27Learning to Walk
- Maes, Brooks (1990)
- Genghis hexapod robot
- Learned stable tripod
- stance and tripod gait
- Rule-based subsumption
- controller
- Two sensor modalities for feedback
- Two touch sensors to detect hitting the floor -
feedback - Trailing wheel to measure progress feedback
28Learning to Walk
- Nate Kohl Peter Stone (2004)
29Supervised Learning
- Supervised learning requires the user to give the
exact solution to the robot in the form of the
error direction and magnitude - The user must know the exact desired behavior for
each situation - Supervised learning involves training, which can
be very slow the user must supervise the system
with numerous examples
30Neural Networks
- One of the most used supervised learning methods
- Used for approximating real-valued and
vector-valued target functions - Inspired from biology learning systems are built
from complex networks of interconnecting neurons - The goal is to minimize the error between the
network output and the desired output - This is achieved by adjusting the weights on the
network connections
31ALVINN
- ALVINN (Autonomous Land Vehicle in a Neural
Network) - Dean Pomerleau (1991)
- Pittsburg to San Diego 98.2 autonomous
32Learning from Demonstration RL
- S. Schaal (97)
- Pole balancing, pendulum-swing-up
33Learning from Demonstration
- Inspiration
- Human-like teaching by demonstration
Demonstration
Robot performance
34Learning from Robot Teachers
- Transfer of task knowledge from humans to robots
Human demonstration
Robot performance
35Classical Conditioning
- Pavlov 1927
- Assumes that unconditioned stimuli (e.g. food)
automatically generate an unconditioned response
(e.g., salivation) - Conditioned stimulus (e.g., ringing a bell) can,
over time, become associated with the
unconditioned response
36Darvins Perceptual Categorization
Early training
After the 10th stimulus
- Two types of stimulus blocks
- 6cm metallic cubes
- Blobs low conductivity (bad taste)
- Stripes high conductivity (good taste)
- Instead of hard-wiring stimulus-response rules,
develop these associations over time
37Genetic Algorithms
- Inspired from evolutionary biology
- Individuals in a populations have a particular
fitness with respect to a task - Individuals with the highest fitness are kept as
survivors - Individuals with poor performance are discarded
the process of natural selection - Evolutionary process search through the space of
solutions to find the one with the highest
fitness
38Genetic Operators
- Knowledge is encoded as bit strings chromozome
- Each bit represents a gene
- Biologically inspired operators are applied to
yield better generations
39Evolving Structure and Control
- Karl Sims 1994
- Evolved morphology and control
- for virtual creatures performing
- swimming, walking, jumping,
- and following
- Genotypes encoded as directed graphs are used to
produce 3D kinematic structures - Genotype encode points of attachment
- Sensors used contact, joint angle and
photosensors
40Evolving Structure and Control
- Jordan Pollak
- Real structures
41Readings
- F. Martin Sections 1.1, 1.2.3, 5
- M. Mataric Chapters 1, 3, 10