Dr Will Browne - PowerPoint PPT Presentation

Slides: 27

Provided by: Willb46

Transcript and Presenter's Notes

Title: Dr Will Browne

1
Dr Will Browne
Co-author: Dr Victor Becerra
Cybernetic Intelligence: How Feedback Can Enhance the Behaviour of Mobile Robotics
w.n.browne_at_reading.ac.uk
Department of Cybernetics, The University of Reading, Whiteknights, Reading, UK
+44 (0)118 378-6705
2
Acknowledgements: Cybernetic Intelligence Research Group
http://www.cirg.reading.ac.uk/
Cybernetic intelligence is the study of intelligence and its application, considering theoretical, mathematical and philosophical aspects of consciousness and intelligence and their application to the design of intelligent machines and the control of complex systems.
Becerra, Dr Victor
Gasson, Mr Mark
Goodhew, Mr Iain
Hong, Dr Xia
Hutt, Mr Ben
Lang, Mr Robert
Minchinton, Mr Paul
Mitchell, Dr Richard
Nasuto, Dr Slawomir
Warwick, Prof Kevin
Wyatt, Mr Jim
3
Contents

Cybernetics
What is Cybernetics?
Robotic feedback
Robotic examples
Learning and intelligence
Steps towards intelligence

4
Cybernetics

Norbert Wiener's definition (1948):
Control and Communication in the Animal and the Machine
Combines information theory (Shannon), biological
modelling (McCulloch), artificial intelligence
(Von Neumann) and systems (Wiener).

5
Open loop

The output signal is not fed back to the input signal.
Inputs -> System -> Outputs

6
Closed loop

The output signal is fed back to the input
signal.
Inputs -> System -> Outputs, with the output returned to the input

7
Feedback Loop

Open loop: Input -> System -> Output
Closed loop, feedback system: Input -> Error -> System -> Output, with Feedback from the Output subtracted (-) from the Input to form the Error
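
The difference between the two loops can be sketched numerically. The following is a minimal illustration, not taken from the slides: the plant (a simple accumulator), the proportional gain, the disturbance size and the function name `simulate` are all assumptions chosen for clarity.

```python
def simulate(closed_loop, target=1.0, gain=0.5, disturbance=-0.2, steps=50):
    """Drive a simple accumulating system toward `target`.

    Open loop: the command is planned in advance and ignores the output.
    Closed loop: the command is proportional to the error (target - output).
    All numeric values are illustrative assumptions.
    """
    output = 0.0
    for _ in range(steps):
        if closed_loop:
            error = target - output      # feedback subtracted from the input
            command = gain * error
        else:
            command = target / steps     # pre-planned command, no feedback
        # An unmodelled disturbance acts on the system at every step.
        output += command + disturbance / steps
    return output


open_result = simulate(closed_loop=False)
closed_result = simulate(closed_loop=True)
```

With these assumed values the open-loop run ends near 0.8, because nothing corrects for the disturbance, while the closed-loop run settles near 0.992. The small remaining offset is typical of purely proportional feedback.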
8
Prosthetics

Dr Peter Kyberd, formerly of Southampton University
and now in New Brunswick, Canada.
Sound sensors for slip detection

9
Cyborg

Professor Kevin Warwick, Micro array
Human in the loop

10
Mobile Robotics

Dr Susan Calvin obtained her bachelor's degree at
Columbia in 2003 and began graduate work in
cybernetics.
Asimov (1940)

11
7 Dwarf Robots

Several generations of small mobile robots

12
7 Dwarf Robots

Several generations of small mobile robots

13
7 Dwarf Robots

Several generations of small mobile robots

14
Rogerr

Marathon-running robot designed to follow an
infrared beacon on the back of a lead runner
Too much feedback!

15
Science Museum Robots

Millennium wing of Science Museum, London.
Four programmed activities.
Follow
Pursuit and evasion
Flock
Simon says

16
Science Museum

Tested in laboratory conditions that mimicked the
exhibition
Follow

17
Flock

Avoid objects (most basic behaviour, with the highest priority).
If no other robots are visible, become a leader and wander.
If in a flock, try to maintain position.
If a flock can be seen in the distance, speed up and head towards it, with more priority given to following the closest visible leader.
Must use a dynamic leader!
Communicate and feed back who is the leader.
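
The flocking rules above read as a fixed-priority arbitration scheme, which can be sketched as follows. This is an assumed illustration only: the `Percept` fields, the behaviour names and the use of a distance table for visible leaders are hypothetical and not taken from the museum robots' software. As a simplification, "no other robots visible" is treated as "not in a flock and no leader in view".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Percept:
    """Hypothetical summary of one robot's sensor readings."""
    obstacle_near: bool = False
    in_flock: bool = False
    # Distance (mm) to each visible leader, keyed by a robot identifier.
    visible_leaders: Dict[str, float] = field(default_factory=dict)


def choose_behaviour(p: Percept) -> Tuple[str, Optional[str]]:
    """Return (behaviour, target leader), checking rules in priority order."""
    if p.obstacle_near:                          # highest priority
        return ("avoid", None)
    if not p.in_flock and not p.visible_leaders:
        return ("lead_and_wander", None)         # nobody in view: become leader
    if p.in_flock:
        return ("hold_position", None)
    # A flock is visible in the distance: head for the closest leader.
    closest = min(p.visible_leaders, key=p.visible_leaders.get)
    return ("catch_up", closest)
```

For example, `choose_behaviour(Percept(visible_leaders={"a": 900.0, "b": 400.0}))` picks leader `"b"`, the closer of the two.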

18
Real Robots

Cybot from Seven Dwarfs
Eaglemoss partwork magazine
4 million copies worldwide

19
Interactive R2-D2

Co-designed by Dr Dave Keating
Researched the original seven dwarfs
200,000 robots worldwide
Uses motor feedback control in head and wheels

20
Morgui

Humanoid sensor fusion
Ultrasonic
Sound sensors
Vision
Infrared

21
Learning

Robots can learn from interaction with their
environment
Performance feedback can be provided (colliding
with other objects is not rewarded highly!)
Reinforcement learning

22
Q-Learning

Look-up table of conditions (sensor readings) to
desired actions (motor movements)
Single step example has 2^24 states, which is a
very large look-up table
Fuzzy sets used to map the input space to five
states
no object near robot,
obstacle in distance (> 500 mm) to the right,
obstacle in distance (> 500 mm) to the left,
obstacle relatively near (< 230 mm) to the right,
obstacle relatively near (< 230 mm) to the left.
A weighted roulette wheel technique selects an
action at random, weighted by the current
probabilities for the situation
The probability of a successful action is increased.
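
The selection and update steps on this slide can be sketched as below. This is a minimal sketch that follows the slide's description (a table of action probabilities per fuzzy state) rather than the textbook Q-value update. The action names, the update rate and the function names are assumptions.

```python
import random

# The five fuzzy states listed on the slide.
STATES = ["none_near", "far_right", "far_left", "near_right", "near_left"]
# Hypothetical action set; the slides only say "motor movements".
ACTIONS = ["forward", "turn_left", "turn_right"]


def make_table():
    """Start with every action equally likely in every state."""
    return {s: {a: 1.0 / len(ACTIONS) for a in ACTIONS} for s in STATES}


def roulette_select(probabilities, rng):
    """Pick an action with chance proportional to its current probability."""
    spin = rng.random() * sum(probabilities.values())
    cumulative = 0.0
    for action, p in probabilities.items():
        cumulative += p
        if spin < cumulative:
            return action
    return action  # guard against rounding at the top of the wheel


def reinforce(probabilities, action, rate=0.1):
    """Raise the probability of a successful action, then renormalise."""
    probabilities[action] *= 1.0 + rate
    total = sum(probabilities.values())
    for a in probabilities:
        probabilities[a] /= total


rng = random.Random(0)
table = make_table()
chosen = roulette_select(table["near_left"], rng)
reinforce(table["near_left"], "turn_right")  # assume turning right succeeded
```

Because selection is probabilistic, less favoured actions are still tried occasionally, so the robot keeps exploring while it learns.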

23
Difficult Learning

Latent learning
Rat maze experiments (Blodgett 1929 and Seward
1949)
No immediate feedback of utility of the action.
Reinforcement learning algorithms must be adapted

24
Latent Learning

Latent learning has three stages
Robot (or Rat) enters the maze and explores it
without reward.
Robot (or Rat) is then placed in one of the end
zones (E,F) and given a reward
Robot (or Rat) is then placed at start (S) of
maze and must navigate in the shortest path back
to the reward state.
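
The three stages can be illustrated with a small simulation. This is an assumed sketch, not the method used on the robots: the maze layout (a T-maze with a junction `J`), the random-walk exploration and the breadth-first planner are all hypothetical choices made for illustration.

```python
import random
from collections import deque

# Hypothetical T-maze: S is the start, E and F are the end zones.
MAZE = {
    "S": ["J"],
    "J": ["S", "E", "F"],
    "E": ["J"],
    "F": ["J"],
}


def explore(maze, start, steps, rng):
    """Stage 1: wander without reward, recording which moves were possible."""
    learned = {}
    position = start
    for _ in range(steps):
        nxt = rng.choice(maze[position])
        learned.setdefault(position, set()).add(nxt)
        position = nxt
    return learned


def shortest_path(learned, start, goal):
    """Stage 3: breadth-first search using only the learned map."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(learned.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal not reachable with what was learned


learned_map = explore(MAZE, "S", steps=200, rng=random.Random(1))
reward_zone = "F"                      # Stage 2: reward shown in an end zone
route = shortest_path(learned_map, "S", reward_zone)
```

The point of the sketch is that no reward is seen during exploration, yet the map built in stage 1 is what makes the direct route in stage 3 possible.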

25
Latent Learning

Anticipatory Classifier Systems (Stolzmann 1999)
Showed latent learning in simulation
Difficulties in size of domain in real robots
Inaccuracies in feedback can cause problems for
the fuzzy sets over time.

26
Latent Learning

Robots are not very good at consistently turning
through 90°
Latent learning environment simplified to the
N, E, W, S compass points.
After a five-minute run a robot will start
getting very close to one wall and will
eventually get stuck against it!

27
Improved Learning

Humans will take actions in order to improve the
quality of their feedback, not just the reward
itself
Robots will need to learn to take actions that do
not lead to a reward, but improve the certainty
of the action to take.
Cybernetic principles (second order Cybernetics)
will need to be applied.

28
Balancing Act
29
Learning Classifier Systems?

Evolutionary Computation
Rule form is Transparent,
(If...Then...)
Includes statistics about the rule
(Rule + Statistics = Classifier)
Gain knowledge by experience or direct transfer
Draw correct conclusions from their own
hypothesised knowledge
"LCS are a quagmire - a glorious, wondrous and
inventing quagmire, but a quagmire nonetheless"
(Goldberg et al., 1992)

30
Learning Classifier Systems
[Diagram: input known mill data (past) -> learning classifier system -> IF... THEN... (strength) rules (future)]
31
[Diagram: inside the learning classifier system. Labels: input known mill data, encoding, input, initial rule base, match, select, training rule base, effect, output, credit, final rule base, decoding, IF... THEN... (strength) rules]
32
[Diagram: as on the previous slide, with rule discovery added (plausibly better rules generated). Labels: input known mill data, encoding, input, initial rule base, match, select, training rule base, rule discovery, effect, output, credit, final rule base, decoding, IF... THEN... (strength) rules]
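
The match, select, credit and rule discovery stages named in the diagrams can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the system described in the talk: the binary message encoding with `#` as a wildcard, the strength update rule, the single-point mutation and all names (`Classifier`, `step`, `discover`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Classifier:
    condition: str          # string over '0', '1' and '#' (don't care)
    action: str
    strength: float = 10.0  # the rule statistic used for selection


def matches(condition, message):
    """MATCH: a rule fires if every non-wildcard position agrees."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )


def select(match_set, rng):
    """SELECT: roulette-wheel choice weighted by strength."""
    return rng.choices(match_set, weights=[c.strength for c in match_set], k=1)[0]


def credit(classifier, reward, rate=0.2):
    """CREDIT: move the rule's strength toward the reward it earned."""
    classifier.strength += rate * (reward - classifier.strength)


def discover(parent, rng):
    """RULE DISCOVERY: copy a rule and mutate one condition position."""
    i = rng.randrange(len(parent.condition))
    symbol = rng.choice([s for s in "01#" if s != parent.condition[i]])
    condition = parent.condition[:i] + symbol + parent.condition[i + 1:]
    return Classifier(condition, parent.action, parent.strength / 2)


def step(rules, message, reward_for, rng):
    """One cycle: match the encoded input, select a rule, act, assign credit."""
    match_set = [r for r in rules if matches(r.condition, message)]
    if not match_set:
        return None
    chosen = select(match_set, rng)
    credit(chosen, reward_for(message, chosen.action))
    return chosen.action
```

A short usage example: with `rules = [Classifier("1#", "stop"), Classifier("0#", "go")]`, calling `step(rules, "10", lambda m, a: 100.0, random.Random(0))` returns `"stop"` and raises that rule's strength, after which `discover(rules[0], rng)` proposes a mutated variant for the rule base.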
33
Summary