Title: Mechatronics
1- Mechatronics / Robotics
Lecture 11: External sensors and machine vision
- Dr Vesna Brujic-Okretic
- Room 11cBC03 or MSRR Lab.
- Ext. 9676 or 9681
- Email mes1vb@surrey.ac.uk
2- Week 11: External sensors and intelligent robots
1) Sensors for robotics
2) Machine vision
3) Image processing
4) Applications of machine vision
5) Implementation issues
Tutorial questions
3- External sensors for robotics
- Robots are highly flexible in the sense that most controllers are fairly easily reprogrammed to cope with deliberate variations in a task, but minimally adaptable in the sense of coping with accidental variations in an imperfect world.
- We attempt to get around this by using external sensors:
- Vision: probably the most common
- Touch: force sensors, strain gauges
- Sound: proximity sensors
- Smell: not usually
- Taste: not likely
4- Machine vision - giving robots sight
- Potentially the most powerful sensor is artificial vision.
- Also called computer vision and machine vision.
- Machine vision is already used for both robotic and non-robotic applications.
- Machine vision is a complex subject encompassing a number of different high-tech fields, including optical engineering, analogue video processing, digital image processing, pattern recognition, artificial intelligence and computer graphics.
- The basic process is modelled on the mammalian eye/brain interaction.
5- The mammalian visual process
- Vision is the process of converting sensory information into knowledge of the shape, identity or configuration of objects.
- Other sensors besides light sensors can also provide similar information, including bat sonar and touch.
- Previous input and its interpretation can greatly affect current processing of sensory data.
- Seeing is the physical recording of the pattern of light energy received from the environment. It consists of:
- selective gathering in of light
- projection or focusing of light onto a photoreceptive surface
- conversion of light energy into chemical or electrical activity
- Information from sensors is usually not just ON or OFF, but also includes "how much".
6- The human vision system
- Incident light falls on the receptive part of the retina.
- This is then converted into tiny electrical signals which are sent via the visual cortex to the brain.
- There is some evidence to suggest that the visual data is compressed by neurones in the visual cortex before it is processed, since there are far fewer dendrites (connections) in the early part of the visual cortex than there are receptive elements in the retina.
- Humans have two eyes, which gives them the ability to perceive depth, as first discussed by Descartes in the 17th century.
7- Components of a machine vision system
- Main elements: camera, digitiser, framestore, processor.
8- Lighting
- Good illumination of the subject is essential for successful application of computer vision.
- Poor image quality cannot be rectified by later stages of image processing.
- Controlled lighting ensures consistent data from the image.
- There are several methods of illumination:
- front lighting is used to determine surface characteristics of the object,
- back lighting is used for simple outline analysis,
- structured lighting is used to recover the 3D geometry of a subject, and
- strobe lighting may be used to freeze a scene, such as objects on a moving conveyor belt.
9- Camera types
- The most common input devices for computer vision systems are either vidicon or solid state (CCD) cameras.
- (1) TV or vidicon camera
- The operating principle of the vidicon camera is based upon the image pick-up tube often found in television cameras:
- the image is focused onto a photoconductive target,
- the target is scanned line by line horizontally by an electron beam,
- an electric current is produced as the beam passes over the target,
- the current is proportional to the intensity of light at each point,
- the current is tapped to give a video signal.
- Limited resolution: a finite number of scan lines (about 625) and frame rate (25 or 30 frames per second).
- Unwanted persistence between one frame and the next.
- Non-linear video output with respect to light intensity.
- Vidicons suffer from blooming of the image around very bright light spots, geometrical distortion of the image across the photoconductor, and sensitivity to environmental conditions.
10- Camera types
- (2) CCD camera
- CCD cameras are made up from many photodetectors, which may be arranged either in a line or in an array.
- The most common type of camera used in machine vision.
- In its basic form it is a single IC device consisting of an array of photosensitive cells.
- Each cell produces an electric current dependent on the incident light falling on it.
- These cells are polled to produce a video signal output. In the UK, standard colour (PAL) and b/w (CCIR) video is 25 Hz.
- CCD cameras have less geometric distortion and a more linear video output when compared to vidicons.
- Since these devices incorporate discrete elements, they are much better suited to digital processing.
- The large surveillance market means they are cheap, robust, and easy to use and integrate with other systems.
11- CCD camera types
- Older systems were very chunky, heavy and power hungry.
- Advances in IC technology have resulted in smaller and smaller systems that use less power.
- Miniaturisation first resulted in remote-head "lipstick"-sized cameras.
- The latest technologies include very small single-board systems.
[Images: remote head camera; typical camera; single board camera]
12- Image digitiser and display module
- Digitisation and display are achieved by A/D and D/A converters respectively.
- In older systems both functions are normally implemented on a single computer interface card - typical architectures include ISA and VME.
- Newer PCI cards usually send video to the graphics adapter - either via the PCI bus or by a direct digital video bus.
[Block diagram: ADC with input LUT; DAC with output LUT]
13- Image digitisation
- Digitisers consist of an analogue amplifier and signal conditioner, which give gain and offset control on the incoming image data and also provide synchronisation for the signal.
- Most are capable of converting input analogue signals into digital values at full video rates - around 1/30th to 1/60th of a second per image.
- Most digitisers have very limited spatial resolution (since standard video is also limited in this fashion) - typical output image sizes are 640x480 and 512x512.
- The digitiser is typically provided with lookup tables (LUTs) to quickly change pixel values: for each possible pixel value the corresponding output value is 'looked up' in the table (a minimal sketch follows below).
- Most b/w digitisers are 8 bit; colour ones are usually 24 bit.
- The output images from the A/D are usually fed to an onboard FRAMESTORE.
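A minimal sketch of the LUT idea in Python with NumPy; the inverting LUT here is just an example point operation, not a specific digitiser's table:

```python
import numpy as np

def apply_lut(image, lut):
    """Map every 8-bit pixel through a 256-entry lookup table."""
    return lut[image]  # NumPy fancy indexing performs the per-pixel lookup

# Example LUT: invert the image (output = 255 - input), the kind of
# point operation a digitiser can apply in hardware at frame rate.
invert_lut = (255 - np.arange(256)).astype(np.uint8)

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # dummy frame
inverted = apply_lut(frame, invert_lut)
assert inverted[0, 0] == 255 - frame[0, 0]
```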
14- [image]
15- Framestore module
- A framestore is simply RAM.
- In older computers, system RAM was very limited and could not store the amounts of data required for imaging applications.
- Nowadays systems frequently have only one or two times as much RAM as the size of the image they are digitising - unless they have an onboard co-processor.
- A 640x480 RGB colour picture obviously requires (640x480x3) bytes of storage space. Colour images can be stored as three channels of red, green and blue, or in other formats.
- It should be made clear that computer vision requires a considerable amount of computer memory, even at minimum resolutions (the arithmetic is sketched below).
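The storage arithmetic from the slide, as a small Python sketch:

```python
def frame_bytes(width, height, channels=3, bytes_per_channel=1):
    """Raw storage needed for one uncompressed frame."""
    return width * height * channels * bytes_per_channel

print(frame_bytes(640, 480))      # 921600 bytes (~900 KB) for 24-bit RGB
print(frame_bytes(512, 512, 1))   # 262144 bytes (256 KB) for 8-bit b/w
```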
16- Image processing systems
- Once an image is held in a framestore it can be frozen and processed.
- A processor module must have access to the framestore memory.
- Pipeline image processors have a separate hardware device connected to the framestore via a dedicated image bus - thus no transfers of image data are necessary over the host bus.
- Co-processor systems have a chip onboard the digitiser/framestore module which has direct hardware access to the framestore - again, no transfers of image data are necessary over the host bus.
- Some systems have daughter-board co-processors which may be linked via an image bus - normally a mere ribbon cable.
- Many newer, cheaper systems have no facility for pipeline processing or co-processors and expect the host system to do the image processing - this requires image transfers over the host bus, and a very fast CPU.
17- Pipeline image processing
- Separate hardware modules for each function.
- Each module is interconnected via a digital video bus.
- The host computer is only used for programming.
- Very high rates (frame rates) of processing are possible (RISC processor = Reduced Instruction Set Computer).
- Expensive and difficult to program.
[Block diagram: Vin -> digitiser (A/D + LUT) -> framestore (video RAM) -> processor (RISC chip) -> display (LUT + D/A) -> Vout, linked by a digital video bus; all modules also sit on the system bus]
18- Co-processor systems
- Single-board solution - very flexible.
- The co-processor may be mounted on a daughter-card.
- Extra co-processors may be added for a scaleable solution.
- Almost real-time performance can be achieved.
- Programming is easier than for pipelined solutions, though use of dedicated black-box libraries is normally necessary.
[Block diagram: combined digitiser/display/framestore/co-processor board (A/D, LUTs, video RAM, RISC chip, D/A) with Vin and Vout, sitting on the system bus]
19- Image processing with the system processor
- Inexpensive - the only extra item is the digitiser/framestore.
- Software development is all targeted at the host - relatively simple, and can be done using libraries, C, or even Java.
- Speed is dictated by the system - Windows operating systems can slow the system down unpredictably.
- Display can be a problem - dictated by the performance of the graphics card.
20- Image processing stages
- Once an image has been captured we need to make some sense of it.
- Image processing involves three main actions:
- image enhancement,
- image analysis, and
- image understanding, as shown in the figure (and sketched in code below).
- Image processing can be a complex and difficult operation and is still very much a research field.
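A toy Python sketch of the three-stage chain; the contrast stretch, the one-number description and the decision rule are illustrative stand-ins, not a prescribed pipeline:

```python
import numpy as np

def enhance(raw):
    """Image enhancement: here, a simple contrast stretch to full range."""
    lo, hi = raw.min(), raw.max()
    return ((raw - lo) * (255.0 / max(hi - lo, 1))).astype(np.uint8)

def analyse(image, threshold=128):
    """Image analysis: reduce the image to an objective description
    (here, just the fraction of pixels above a threshold)."""
    return {"bright_fraction": float((image > threshold).mean())}

def understand(description):
    """Image understanding: a logical decision based on the description."""
    return "object present" if description["bright_fraction"] > 0.05 else "empty scene"

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # dummy frame
print(understand(analyse(enhance(frame))))
```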
21- Image enhancement
- GOAL
- to remove unwanted artefacts such as noise, distortion and non-uniform illumination from the image
- EXAMPLE OPERATIONS
- local area averaging (mean, mode, median filtering)
- image warping, image subtraction
- background flattening, contrast enhancement, histogram equalisation
- NOTES
- operations are typically applied from a library and are not generally very application specific
- many low-level operations such as filtering can be done in real time (i.e. at frame rates) and in hardware (a minimal filtering sketch follows below)
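A minimal local-area-averaging sketch in Python, assuming NumPy and SciPy's ndimage module; the noise model and 3x3 window size are illustrative choices:

```python
import numpy as np
from scipy import ndimage

frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image

# Simulate salt-and-pepper noise on roughly 1% of pixels
noisy = frame.copy()
hits = np.random.rand(*noisy.shape) < 0.01
noisy[hits] = np.random.choice(np.array([0, 255], dtype=np.uint8), hits.sum())

mean_filtered = ndimage.uniform_filter(noisy, size=3)    # local mean (3x3)
median_filtered = ndimage.median_filter(noisy, size=3)   # local median (3x3)
```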
22- The local area analysis of an image
23- Image analysis
- Converts the input image data into a description of the scene.
- GOAL
- to identify and describe, in objective terms, the objects in a given image
- TECHNIQUES: image segmentation, region labelling, further processing
- EXAMPLE OPERATIONS
- thresholding, edge detection, object labelling
- blob and shape analysis, perimeter coding
- extraction of syntactic or contextual descriptions, point matching
- NOTES
- operations are typically very application specific
- most operations are based on a generic theory but require large amounts of programming to work with a specific application (see the sketch after this list)
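A minimal segmentation-and-labelling sketch in Python, assuming SciPy's ndimage module; the threshold of 180 follows the example on slide 28:

```python
import numpy as np
from scipy import ndimage

grey = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image

binary = grey > 180                 # global threshold -> binary image

# Connected-component labelling: each blob of touching foreground
# pixels receives its own integer label (1, 2, 3, ...)
labels, n_objects = ndimage.label(binary)
print(n_objects, "objects detected")
```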
24- Image understanding
- Input: a description of the image.
- GOAL
- classify each object and attempt to generate a logical decision based on the content of the image (e.g. "the red object is at location x,y,z", or "reject the component", or "this is not a sheep", or "there is an intruder")
- EXAMPLE OPERATIONS
- pattern recognition, pattern matching
- use of knowledge-based systems and neural networks
- Possible outcomes: further info required, objects not recognised, objects successfully recognised.
- NOTES
- operations are completely application specific
25- Example of image processing: robotic fishing
- Our goal is to detect, classify and intercept cardboard "fish" as they are presented to the robot on a pallet.
- The fish can be one of three species and can be presented in any orientation.
- The camera and robot coordinate systems must first be calibrated.
- The calibration requires at least a translation, a rotation and a scaling.
- How do we do it? (One possible sketch follows below.)
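One possible answer, sketched in Python with NumPy: estimate a 2D similarity transform (scaling s, rotation R, translation t, so that robot = s*R*camera + t) from two matched reference points. The marker positions used here are hypothetical:

```python
import numpy as np

def similarity_from_two_points(cam_pts, robot_pts):
    """Solve robot = s * R(theta) * cam + t from two matched points."""
    c0, c1 = np.asarray(cam_pts, float)
    r0, r1 = np.asarray(robot_pts, float)
    dc, dr = c1 - c0, r1 - r0
    s = np.linalg.norm(dr) / np.linalg.norm(dc)                   # scaling
    theta = np.arctan2(dr[1], dr[0]) - np.arctan2(dc[1], dc[0])   # rotation
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    t = r0 - s * R @ c0                                           # translation
    return s, R, t

def cam_to_robot(p, s, R, t):
    return s * R @ np.asarray(p, float) + t

# Hypothetical calibration: two markers seen in the image (pixels)
# at known robot positions (metres)
s, R, t = similarity_from_two_points([(100, 200), (400, 200)],
                                     [(0.10, 0.50), (0.40, 0.50)])
print(cam_to_robot((250, 200), s, R, t))   # pixel -> robot coordinates
```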
26- Raw image pre-processing (inversion)
[Images: raw image; inverted image]
27- Image segmentation (global thresholding)
[Images: threshold too low (128); threshold too high (220)]
28- Image analysis/object labelling
[Images: threshold applied at 180; objects detected]
29- Object measurements
Object  Co-ordinates  Transverse length (px)  Longitudinal length (px)  Extent (px)  Mean grey level  Elongation ratio
(001)   (409,223)      83                     401                       13407        34               91.8998
(002)   (235,123)     197                     175                       19365        37               49.6959
(003)   (137,275)      98                     399                       12758        37               97.6216
(004)   (286,369)     154                     248                       24786        33               41.0748
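A sketch of how measurements like these could be computed from a label image with SciPy's ndimage; which axis counts as "transverse" is an assumption, and the elongation ratio is omitted because its definition is not given in the slides:

```python
import numpy as np
from scipy import ndimage

def object_report(grey, labels):
    """Print per-object measurements like those tabulated above."""
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        rows, cols = sl                 # bounding-box slices of object i
        mask = labels[sl] == i
        cy, cx = ndimage.center_of_mass(grey, labels, i)
        print(f"Object ({i:03d})")
        print(f"  Co-ordinates                 ({cx:.0f},{cy:.0f})")
        print(f"  Transverse length (pixels)   {cols.stop - cols.start}")
        print(f"  Longitudinal length (pixels) {rows.stop - rows.start}")
        print(f"  Extent of object (pixels)    {mask.sum()}")
        print(f"  Mean grey level              {ndimage.mean(grey, labels, i):.0f}")

grey = np.random.randint(0, 256, (480, 640), dtype=np.uint8)  # stand-in image
labels, _ = ndimage.label(grey > 180)
object_report(grey, labels)
```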
30- Example of pattern recognition: how could we use a neural network to classify our fish?
- A neural network generally requires training - that is, it is shown a great number of example inputs and, for each case, its required output.
- During the training stage it constantly restructures its internal state (the weightings between its interconnections).
- Once trained, it should be able to generalise about new fish shapes that it is shown (a toy example follows below).
- The main problem (or advantage!) with neural networks is that they are a "black box" solution to often difficult problems.
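A toy single-layer (softmax) network in Python/NumPy illustrating the train-then-generalise idea; the two features and the three synthetic species clusters are invented for the example and are not the lecture's classifier:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: 2 features per fish (say, scaled extent
# and elongation ratio), three species labelled 0, 1, 2.
centres = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
y = rng.integers(0, 3, 300)
X = centres[y] + rng.normal(scale=0.5, size=(300, 2))

# "Training" repeatedly adjusts the weights W, b (the interconnection
# weightings) by gradient descent to reduce the classification error.
W = np.zeros((2, 3)); b = np.zeros(3)
for _ in range(500):
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = p.copy(); grad[np.arange(len(y)), y] -= 1   # dLoss/dlogits
    W -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean(axis=0)

# Once trained, the network generalises to a new, unseen fish.
new_fish = np.array([2.8, 0.2])
print("predicted species:", np.argmax(new_fish @ W + b))
```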
31- Example of machine vision in manufacturing
- Seam tracking in robotic welding:
- when arc welding sheet metal plates it is not always possible to accurately program the desired weld path, due to variations in the sheet etc.
- an on-line method must be used to accurately feed back the required trajectory to the robot controller
- a laser stripe is used to illuminate the joint profile (usually a V)
- it is then possible to detect the centre of the joint using image processing (a minimal sketch follows below)
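A minimal Python/NumPy sketch of one way this could be done: take the brightest pixel in each column as the stripe position, then take the deepest point of the profile as the joint centre. The synthetic V-shaped stripe is illustrative:

```python
import numpy as np

def stripe_profile(image):
    """Row index of the brightest pixel in each column - i.e. where
    the laser stripe sits across the joint."""
    return image.argmax(axis=0)

def joint_centre(profile):
    """For a V-groove (rows grow downwards), the joint centre is the
    column where the stripe is deepest."""
    return int(np.argmax(profile))

# Synthetic frame: a V-shaped laser stripe on a dark background
img = np.zeros((100, 200), dtype=np.uint8)
cols = np.arange(200)
rows = 70 - (np.abs(cols - 100) + 1) // 2
img[rows, cols] = 255
print(joint_centre(stripe_profile(img)))   # -> 100, the tip of the V
```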
32- Example of machine vision in manufacturing
- Seam tracking in robotic welding (continued):
- only a single starting point needs to be taught to the robot for it to accurately weld the desired path
- the torch travel speed along the seam, as well as the torch stand-off and orientation with respect to the seam, can all be controlled in real time
- this enables highly sophisticated weld process control techniques to be implemented, including feed-forward (adaptive) fill and real-time closed-loop feedback control of the welding parameters
- fixturing and setup are thus greatly simplified, which facilitates low-volume industrial applications
- indeed, since path programming is also eliminated, this system can effectively be used for one-off production
33- Other successful examples of machine vision in traditional robotics
- pick and place
- parts location
- parts recognition
- parts inspection
34- Some non-robotic applications of machine vision
skin lesion classification
hand gesture recognition
defect detection in welds
facial recognition
handwriting recognition
35- More applications of machine vision
traffic surveillance and speed checks
lane and neighbouring vehicle detection
36- Advanced topics in machine vision
- colour vision
- three-dimensional vision
- analysis of dynamic scenes
- analysis of complex scenes
- robust analysis of any scene!
- robust analysis of colour, 3D, dynamic, complex scenes!
37- Implementation issues: why isn't the use of machine vision more widespread?
- A number of factors have delayed the practical development of intelligent, sensor-based robotics.
- These can be most easily divided into the inadequacies of today's robots and the limited performance of current vision systems.
- These problems are compounded by the analytical and computational complexity of both manipulator control and sensory data interpretation.
- Robot control systems are difficult to analyse and design, and tend to employ very simple robot models so that trajectories can be computed rapidly enough not to degrade the robot arm's dynamic performance.
- In addition, the robot's geometry may be slightly different from the model, and therefore the actual end-effector position may differ from the desired position.
38- Implementation issues: why isn't the use of machine vision more widespread?
- The performance and effectiveness of current vision systems remain too limited, in many cases, for real-time sensory feedback applications.
- The major requirements for a real-time sensory feedback system are the development of three-dimensional sensing and the ability to carry out dynamic scene analysis.
- The current limitations on such a system are the data processing rates required, the data volume per image, and the extremely complex data extraction.
- These limitations drastically affect real-time dynamic manipulator control: until very recently, vision systems have been orders of magnitude slower than manipulator dynamics.
- The primary technical difficulties to overcome are the development of high-speed integrated circuits, image processing architectures and optimised image data flow.
39- Week 11: External sensors and intelligent robots
1) Sensors for robotics
2) Machine vision: overview of processes and components
3) Image processing: image enhancement, image analysis, image understanding
4) Implementation issues
5) Applications
6) Advanced applications
Tutorial questions
40- Tutorial questions
9.1 Why is sensory development important to robotic design?
9.2 Consider a simple task and discuss how the human intelligently co-ordinates perception and action. Contrast this with the manner in which a robot would carry out the task.
9.3 Discuss the advantages that vision systems have over conventional sensor systems.
9.4 Describe the main functional elements in a machine vision system with the aid of diagrams. Go on to describe the major processes in an image processing system.
9.5 In a robot-based production engineering scenario, what are the main limitations that have prevented the full implementation of vision-based robotics?
9.6 Describe briefly sensors for use with robots and, in particular, the development of machine vision.
9.7 Write a short essay on the use of machine vision in robot applications, using sketches where appropriate.