Making robust computer vision in games - PowerPoint PPT Presentation

About This Presentation
Title:

Making robust computer vision in games

Description:

Making robust computer vision in games Presented by: Diarmid Campbell – PowerPoint PPT presentation

Slides: 103

Transcript and Presenter's Notes

Title: Making robust computer vision in games


1
Making robust computer vision in games
Presented by Diarmid Campbell
2
Introduction
  • Who I am: Diarmid Campbell
  • What I do: Run the Vision R&D group
  • Where we do it: Sony's London development studio
  • What we do: Research computer vision for
    camera-based games
  • This talk: Making robust computer vision in games

3
Contents
  • What we do and why
  • The development process
  • Testing and videos
  • Computer Vision Concepts
  • A robust head tracker
  • Marker based Augmented Reality
  • The problems we faced
  • A demo of EyePet

4
Camera based games
  • Camera mounted on the TV
  • You see yourself on the TV
  • Game is overlaid on top of you

5
Past games on PS2
6
Computer Vision is hard
"Computer vision makes you want to kill
yourself" - Dr Nick Lord, 2009
7
Why is it hard?
  • Humans manage it effortlessly
  • Image is a 2D array of numbers
  • Take 5 images and plot them as a height map

8
Pick the odd one out
9
Pick the odd one out
Odd one out
10
Pick the odd one out
Odd one out
11
Pick the odd one out
Odd one out
12
Factors affecting the pixels
  • Background objects in scene
  • Orientation/position of objects
  • Lighting/Shadows
  • Occlusion

13
George is in the pixels
  • Not interested in those
  • George was hidden in the pixels
  • Here is an image, what is it of?
  • The general computer vision problem is hard
  • If we constrain the problem, it is much easier
    (but still hard)

14
Robust Inputs
  • We can use computer vision as an input mechanism
  • Motion detection in EyeToy games
  • Robustness is how consistently an input mechanism
    does what the player is expecting
  • An input mechanism must be robust

15
Importance of robustness
  • If your fire button only worked 9 times out of
    10, you would chuck your controller out.

16
Importance of robustness
  • There are ways around it

17
Importance of robustness
  • Imagine your gun is a champagne bottle

18
Importance of robustness
  • Each button click shakes it
  • Eventually the top blows off
  • The lack of robustness is hidden

19
Importance of robustness
  • Perhaps you need to now fight tortoises instead
    of warriors

20
Importance of robustness
  • The mechanic is now robust
  • But it is laggy and unresponsive
  • Cannot rely on split-second timing

21
Importance of robustness
  • Illustrates a general point
  • If the game copes well with non-robust inputs
  • It will also cope well with someone not playing
    it well
  • It creates a skill ceiling
  • Manifests itself as lack of game-play depth

22
Importance of robustness
  • If you want deep, skill-based game mechanics
  • Robust input is essential

23
The Development Process
Computer Vision Researcher: "Tell me what the game
mechanic is and I'll make you a state-of-the-art
solution"
Game designer: "Give me something that works and
I'll see what we can make that's fun"
24
The chicken and the egg
  • You cannot do one before the other
  • Both development timelines happen in parallel
  • We are still figuring it out
  • Here are some guidelines

25
Research timeline
Something up and running
Convinced we can create the technology
Vision tech beta before game reaches alpha
26
Required infrastructure
  • Prototyping environment
  • Matlab
  • Octave
  • Be able to capture videos
  • Runtime algorithms
  • OpenCV
  • VXL

27
Videos and testing
28
Videos and testing
  • Computer vision is hard because many variables
    affect the images
  • The lighting
  • The player's clothes
  • The wallpaper
  • Spectators
  • 3D cameras have their own pros and cons

29
Representative videos
  • Videos allow us to capture these variables and
    test
  • Videos MUST be representative
  • Works in 99% of cases
  • Useless if that 1% appears in 50% of living rooms
  • Make videos early in development
  • Demo head tracker capturing

30
Head detection videos
  • We run it through different algorithms
  • Cell SDK face detector
  • Show failure modes
  • When it fails we can find the frame it failed in
    and debug

31
Regression testing
  • Automated testing
  • Run through a load of videos
  • Compare with expected results
  • Expected results could be "is the head visible?"
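
As a rough illustration of the automated testing described above, here is a minimal sketch of a regression harness. Everything in it is hypothetical: the names run_regression and Mismatch, the idea of a video as a list of frames, and a detector that returns True or False for "is the head visible?" are assumptions made for the example, not the studio's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: a "video" is a list of frames, and a detector maps a
# frame to True/False ("is the head visible?").
Frame = object
Detector = Callable[[Frame], bool]


@dataclass
class Mismatch:
    video: str
    frame_index: int
    expected: bool
    actual: bool


def run_regression(videos: Dict[str, List[Frame]],
                   expected: Dict[str, List[bool]],
                   detector: Detector) -> List[Mismatch]:
    """Run the detector over every frame and collect disagreements."""
    mismatches = []
    for name, frames in videos.items():
        for i, frame in enumerate(frames):
            actual = detector(frame)
            if actual != expected[name][i]:
                mismatches.append(Mismatch(name, i, expected[name][i], actual))
    return mismatches


# Toy usage: frames are just strings and the stand-in detector looks for a tag.
videos = {"living_room_01": ["head", "head", "empty", "head"]}
expected = {"living_room_01": [True, True, False, True]}
failures = run_regression(videos, expected, lambda f: f == "head")
broken = run_regression(videos, expected, lambda f: True)
```

Each Mismatch records the video and frame index, which matches the earlier point that a failure can be traced to the exact frame and debugged.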

32
When videos aren't enough
  • SCEA R&D labs invented the forthcoming
    PlayStation Move controller
  • Uses a camera and other sensors to track the
    controller
  • Videos were good early on
  • But cannot change a video
  • Lighting
  • Backgrounds
  • Camera settings

33
Solution
  • Video1
  • Video2

34
Reasons to buy a robot arm (as if you really need
persuading)
  • Can test the same motion under many different
    conditions
  • Can try special hard cases

35
Computer Vision Concepts
36
Computer Vision Concepts
  • Videos tell us when it fails
  • How do we fix it?
  • This is the field of computer vision
  • I cannot go into details of techniques
  • Instead I will explain
  • The common concepts
  • How they link together
  • This should help if you
  • Read papers
  • Talk to experts

37
Feature extraction
  • Images contain a lot of information
  • This one is 900K

38
Feature extraction
  • Instead of using pixels directly, extract
    high-level properties of groups of pixels
  • Results in less data, which is more relevant to
    the problem at hand

39
Feature extraction
  • PS3 Demo: Basic image
  • PS3 Demo: Canny edge detector
  • Invariant to lighting changes
  • Store additional gradient info
  • PS3 Demo: Motion
  • Used in all our camera games
  • PS3 Demo: Feature points
  • Store image patch for each one
  • Can match them frame to frame
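
The slides do not say how the motion feature is computed. One simple possibility is frame differencing, sketched below with NumPy; the function name motion_mask and the threshold of 25 are assumptions chosen for illustration.

```python
import numpy as np


def motion_mask(prev_frame, frame, threshold=25):
    """Return a boolean mask of pixels whose brightness changed by more than threshold."""
    # Cast to a signed type first so the subtraction cannot wrap around on uint8 input.
    diff = np.abs(frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > threshold


# Toy usage: a 2x2 bright block appears in an otherwise static 4x4 greyscale image.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
frame = prev_frame.copy()
frame[1:3, 1:3] = 200
mask = motion_mask(prev_frame, frame)
```

The mask is a feature in the sense used above: far less data than the raw frames, and directly relevant to "did the player move here?".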

40
Likelihood functions
  • Given that we have observed these features, what
    is the probability that we are observing what we
    modelled
  • Conditional probability
  • Bayesian statistics underpins most vision
    algorithms
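
For reference, the conditional probabilities mentioned above are related by Bayes' rule. The labels "model" and "features" follow the wording of the bullets; the formula itself is standard and is not taken from the slides.

```latex
P(\text{model} \mid \text{features})
  = \frac{P(\text{features} \mid \text{model}) \, P(\text{model})}{P(\text{features})}
```

In standard usage the term P(features | model) is the likelihood, and the left-hand side is the posterior probability of the model given what was observed.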

41
Cost functions Likelihood functions
  • Some terminology
  • Sometimes you will hear about cost functions
  • They are the same concept
  • Likelihood goes up with a good match
  • Cost goes down
  • One is (conceptually) the inverse of the other
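
One common way to make the "inverse" relationship concrete is to map a cost to a likelihood with a decaying exponential. This is a convention assumed here for illustration (it corresponds to assuming Gaussian noise); the slides do not state which mapping, if any, was used. The name likelihood_from_cost and the scale parameter are hypothetical.

```python
import math


def likelihood_from_cost(cost, scale=1.0):
    """Map a non-negative cost to a value in (0, 1]: zero cost gives 1, high cost tends to 0."""
    return math.exp(-cost / scale)


# A low cost (good match) yields a higher likelihood than a high cost (bad match).
good = likelihood_from_cost(12, scale=100.0)
bad = likelihood_from_cost(1532, scale=100.0)
```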

42
Cost functions
  • Sum of Squared Differences (SSD)

SSD = 1532: high cost, bad match
SSD = 12: low cost, good match
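
A minimal NumPy sketch of the Sum of Squared Differences cost follows. The tiny 2x2 patches are made-up values chosen so the arithmetic is easy to check by hand; they are not the images from the slide.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared per-pixel differences between two equally sized patches."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum((a - b) ** 2))


template = np.array([[10, 20], [30, 40]])
good = np.array([[11, 19], [30, 42]])   # nearly the same patch
bad = np.array([[40, 30], [20, 10]])    # same values, wrong places

good_cost = ssd(template, good)   # 1 + 1 + 0 + 4
bad_cost = ssd(template, bad)     # 900 + 100 + 100 + 900
```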
43
Cost functions
  • Sum of Squared Differences (SSD)

44
Classifiers
  • Compares observed features to a number of models
  • Tells you which model fits the features best

Which model fits best
45
Classifiers Face example
  • Is this a face?

46
Classifiers Face example
Classifier
  • Classic detector (Viola-Jones)
  • Models are trained on example images

47
Classifiers Face example
  • PS3 demo

48
Detectors
  • We have a model (with associated state)
  • Given some observed features
  • Detector returns
  • Is the object present? What is its state?
  • Its state (X,Y position/rotation/Human pose)

49
Detectors Faces again
  • Viola-Jones face detector
  • Scans a box over the image
  • Different positions and sizes
  • Runs the classifier and returns any positives
  • Recall face detection demo
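
The scanning loop described above can be sketched as follows. This shows only the box-scanning structure; it is not the Viola-Jones classifier itself, which uses a trained cascade. The function names and the stand-in classifier all_bright are hypothetical.

```python
import numpy as np


def sliding_window_detect(image, classifier, sizes=(2, 3), step=1):
    """Scan square boxes over the image and return (x, y, size) for every positive."""
    height, width = image.shape
    hits = []
    for size in sizes:
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                if classifier(image[y:y + size, x:x + size]):
                    hits.append((x, y, size))
    return hits


# Stand-in classifier: "positive" if every pixel in the box is bright.
def all_bright(patch):
    return bool(np.all(patch > 128))


image = np.zeros((5, 5), dtype=np.uint8)
image[1:3, 2:4] = 255          # a 2x2 bright block at x=2, y=1
hits = sliding_window_detect(image, all_bright)
```

A real detector would replace all_bright with a trained classifier and would typically merge overlapping positives.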

50
Trackers
  • We have a model, some observed features and the
    previous state
  • Tracker returns the next state

51
Trackers Face example
  • PS3 Demo SSD tracker
  • PS3 Demo Wand game
  • If we move quickly the tracker gets stuck in a
    local minimum
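
A minimal sketch of an SSD tracker, assuming it searches a small window around the previous position for the lowest cost. The search radius, function name and test pattern are assumptions for illustration. The second call shows the failure described above: when the target moves further than the search window in one frame, the tracker settles on the best nearby position instead.

```python
import numpy as np


def track_ssd(frame, template, prev_xy, radius=2):
    """Search a small window around prev_xy for the lowest-SSD position."""
    th, tw = template.shape
    fh, fw = frame.shape
    px, py = prev_xy
    best_xy, best_cost = prev_xy, float("inf")
    for y in range(max(0, py - radius), min(fh - th, py + radius) + 1):
        for x in range(max(0, px - radius), min(fw - tw, px + radius) + 1):
            patch = frame[y:y + th, x:x + tw].astype(np.float64)
            cost = float(np.sum((patch - template) ** 2))
            if cost < best_cost:
                best_xy, best_cost = (x, y), cost
    return best_xy, best_cost


template = np.array([[50.0, 200.0], [200.0, 50.0]])

# Slow movement: the target is within the search radius and is found exactly.
frame = np.zeros((12, 12))
frame[2:4, 3:5] = template                     # target at x=3, y=2
near_xy, near_cost = track_ssd(frame, template, prev_xy=(2, 2))

# Fast movement: the target jumped outside the search window.
fast_frame = np.zeros((12, 12))
fast_frame[8:10, 9:11] = template              # target at x=9, y=8
far_xy, far_cost = track_ssd(fast_frame, template, prev_xy=(3, 2))
```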

52
Learning more
  • Computer Vision Conferences
  • ICCV
  • CVPR
  • ECCV
  • Read papers accepted by conferences
  • Get friendly with an academic
  • Or hire one!

53
Robust Head Tracking
54
Track rotation and scale
  • The SSD based tracker did not track rotation and
    scale
  • Next iteration of tracker does
  • X, Y position
  • Scale
  • θ (in-plane rotation)
  • PS3 Demo: Hager tracker
  • (swap demo)

55
Track rotation and scale
  • Tracked more types of movement
  • But very fragile
  • Problem
  • A 2D image patch is not a good model of a head

56
Track rotation and scale
  • Does not deal with out-of-plane rotation

57
Track rotation and scale
  • Even in-plane rotation is not right

58
Colour histograms
  • Let's move away from comparing pixels and think
    about features
  • Consider these images of the same objects

59
Colour histograms
  • If we compared them pixel for pixel they would
    seem very different
  • But look at a histogram of the colours that
    appear in them and they look the same

60
Colour histograms
  • Histograms are a feature that throws away all
    spatial information
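
To illustrate the point, the sketch below builds a coarse RGB histogram and shows that shuffling the pixels of an image leaves it unchanged. The bin count, the function names and the use of histogram intersection as the comparison are assumptions; the slides do not specify them.

```python
import numpy as np


def colour_histogram(image, bins=4):
    """Normalised joint RGB histogram with bins**3 cells; pixel positions are ignored."""
    pixels = np.asarray(image).reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()


def histogram_intersection(h1, h2):
    """Close to 1.0 for identical distributions, 0.0 for no overlap."""
    return float(np.minimum(h1, h2).sum())


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3))
# Same pixels, scrambled positions: very different pixel for pixel.
shuffled = rng.permutation(image.reshape(-1, 3)).reshape(8, 8, 3)

h_original = colour_histogram(image)
h_shuffled = colour_histogram(shuffled)
similarity = histogram_intersection(h_original, h_shuffled)
```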

61
Where we are now
  • Current system uses
  • Colour histograms
  • Keeps approximate spatial information

62
Where we are now
  • It has a foreground and a background model each
    with its own histograms

63
Where we are now
  • PS3 Demo

64
Marker based Augmented Reality (AR)
65
Marker based AR
  • Marker based AR is in a published game: EyePet

66
Camera setup
Topics to discuss Camera based games What is
EyePet? Improving the tech Future research
67
What the player sees on the TV
Real
Virtual
68
Marker based AR
  • We shipped a magic card with the game
  • Allows the players to manipulate virtual objects
    in 3D

69
Finding the marker
  • Input image

70
Finding the marker
  • Threshold

71
Finding the marker
  • Trace outlines

72
Finding the marker
  • Test for quad shapes

73
Finding the marker
  • Actually, just keep pairs of quads

74
Finding the marker
  • Take corner positions
  • Calculate a 2D transform
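
The slides only say "a 2D transform". Assuming this means a planar homography (the usual choice for mapping a flat marker's four corners to their image positions), it can be estimated from the corner correspondences with the direct linear transform, sketched below. The function names and corner coordinates are made up for the example.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H mapping four src corners to four dst corners."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the null vector of this 8x9 system: the last right-singular vector.
    _, _, vt = np.linalg.svd(np.array(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)
    # Normalise; assumes h[2, 2] is non-zero, which holds for this example.
    return h / h[2, 2]


def apply_homography(h, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)


# Marker corners in marker space (unit square) and where they appear in the image.
marker_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
image_corners = [(10, 10), (50, 12), (48, 40), (8, 38)]
H = homography_from_corners(marker_corners, image_corners)
centre_in_image = apply_homography(H, (0.5, 0.5))
```

With H known, any point on the marker (such as a cell of the centre pattern) can be looked up in the image, which is what pattern matching needs.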

75
Finding the marker
  • Match the pattern

76
Finding the marker
  • Match the pattern

77
Finding the marker
  • Match the pattern

78
Finding the marker
  • Match the pattern

79
Finding the marker
  • Match the pattern

80
Finding the marker
  • Match the pattern (Yes!)

81
Finding the marker
  • Decompose the 2D transform
  • Camera projection
  • Model view matrix
  • Use a Kalman filter
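
The slides do not describe the filter's state model. As a rough illustration of what a Kalman filter does to a noisy measurement stream, here is a scalar sketch that assumes a random-walk model for a single pose coordinate; the variances and the function name kalman_1d are assumptions, and a real pose filter would track a full state vector.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=0.25):
    """Smooth a sequence of noisy scalar measurements with a random-walk Kalman filter."""
    estimate, variance = measurements[0], 1.0
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed constant, so only the uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement according to their uncertainties.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


# Toy usage: jittery readings of a marker coordinate that is really about 5.0.
measurements = [5.4, 4.7, 5.2, 4.9, 5.3, 4.8]
smoothed = kalman_1d(measurements)
```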

82
Finding the marker
  • And we're done.

83
Problems we faced
84
Picking the right threshold
  • Threshold to find black and white regions
  • But which one?
  • Many clever solutions didn't work
  • Brute force approach
  • Try lots of thresholds (around 60)
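
A minimal sketch of the brute-force idea: binarise the image at many evenly spaced thresholds and run the marker search on each. Whether the real system stops at the first success or combines results across thresholds is not stated; stopping at the first success is an assumption here, as are the function names and the stand-in detector.

```python
import numpy as np


def find_with_thresholds(gray, detect, count=60):
    """Binarise at `count` evenly spaced thresholds; return the first (threshold, result) that works."""
    for threshold in np.linspace(0, 255, count + 2)[1:-1]:
        binary = gray > threshold
        result = detect(binary)
        if result is not None:
            return float(threshold), result
    return None


# Stand-in for the marker search: succeeds only when exactly four pixels are white.
def four_white_pixels(binary):
    n = int(binary.sum())
    return n if n == 4 else None


# Toy usage: a dim scene where the "marker" is only slightly brighter than the background.
gray = np.full((4, 4), 30, dtype=np.uint8)
gray[1:3, 1:3] = 40
found = find_with_thresholds(gray, four_white_pixels)
```

A single fixed threshold of 128 would miss the marker in this dim image, while the sweep finds a threshold between the two brightness levels.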

85
Picking the right threshold
  • PS3 Demo Thresholds
  • PS3 Demo AR Thresholds

86
Light sensitive matching
  • Pattern matching used Sum of Squared Differences
    (SSD)

SSD = 14
SSD = 874
SSD = 2242
  • Brightness of image affected the score

87
Light sensitive matching
  • Use Normalised Cross Correlation (NCC) instead

NCC = 0.9
NCC = 0.9
NCC = 0.8
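
The sketch below compares SSD and NCC on a pattern and a uniformly brightened copy of it. It uses the form of NCC without mean subtraction, which is an assumption; many implementations subtract each patch's mean first. The pixel values are made up for the example.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared per-pixel differences."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum((a - b) ** 2))


def ncc(a, b):
    """Normalised cross correlation: dot product divided by both vector lengths."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


pattern = np.array([[10, 100], [100, 10]])
brighter = 2 * pattern                      # same pattern under stronger lighting
different = np.array([[100, 10], [10, 100]])

ssd_brighter = ssd(pattern, brighter)       # large, despite being the same pattern
ncc_brighter = ncc(pattern, brighter)       # stays at (about) 1
ncc_different = ncc(pattern, different)     # low for a genuinely different pattern
```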
88
Light sensitive matching
  • New way to look at images
  • An image is an array of numbers
  • We can list out every number and it becomes a
    vector

[Figure: a 100 x 100 image listed out as a vector of 10,000 numbers]
89
Light sensitive matching
  • This is a co-ordinate vector in image space
  • Every 100X100 image corresponds to a single
    unique point in image space

90
Light sensitive matching
  • This is a co-ordinate vector in image space
  • Every 100X100 image corresponds to a single
    unique point in image space
  • Brightening an image corresponds to scaling the
    position vector

91
Light sensitive matching
  • When comparing two images
  • SSD corresponds to the distance between them in
    image space
  • NCC corresponds to their angle

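[Figure: two image vectors in image space, with SSD marked as the distance between them and the angle between them marked]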
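
Written out for two images listed as vectors a and b, the geometric reading above becomes the following. The symbols a, b, k and θ are introduced here for illustration, and the NCC shown is the form without mean subtraction.

```latex
\mathrm{SSD}(a, b) = \sum_i (a_i - b_i)^2 = \lVert a - b \rVert^2
\qquad
\mathrm{NCC}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} = \cos\theta

% Brightening scales the vector: for any k > 0,
\mathrm{NCC}(k a, b) = \frac{k \, (a \cdot b)}{k \, \lVert a \rVert \, \lVert b \rVert} = \mathrm{NCC}(a, b)
```

Scaling a changes its distance to b, so SSD changes, but it does not change the angle between them, so NCC is unaffected.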
92
Light sensitive matching
  • Linear algebra is the other pillar of computer
    vision
  • Feature extracting is just a transformation from
    one space to another
  • Image space -> Feature space
  • Classifiers are often just planes which divide up
    the space (e.g. into a region that contains faces
    and a region that doesn't)
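
A classifier that is "just a plane" can be sketched in a few lines: a feature vector is labelled by which side of the plane it falls on. The plane parameters below are made-up values for a two-dimensional toy feature space; in practice they would be learned from training examples.

```python
import numpy as np


def linear_classify(features, normal, offset):
    """True if the feature vector lies on the positive side of the plane normal . x + offset = 0."""
    return float(np.dot(normal, features) + offset) > 0.0


# Hypothetical plane x + y = 1 dividing a 2D feature space.
normal = np.array([1.0, 1.0])
offset = -1.0
is_face = linear_classify(np.array([1.0, 1.0]), normal, offset)
is_not_face = linear_classify(np.array([0.0, 0.0]), normal, offset)
```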

93
Occlusion
  • It is easy to occlude the marker with your fingers

94
Occlusion
  • Put a big red handle on and instruct the player
    to hold it
  • Also put a handle on the back

95
Occlusion: Another approach (still in research
phase)
  • Edge based tracking
  • Uses AR Marker to initialise
  • Then tracks using edge features
  • PS3 Demo
  • (load EyePet)

96
False positives
  • When not occluded, we find the marker (almost)
    all the time
  • Our home videos showed this
  • False positives were a problem
  • Not represented in our videos
  • Added some Hollywood films to the video tests
  • We knew that no markers were present

97
False positives
  • Saved out all spurious frames

98
False positives
  • Made a number of tweaks to algorithm
  • E.g. Pattern matching whole marker, not just the
    centre pattern
  • 20 times fewer false detections

99
EyePet Demo
100
EyePet Demo
  • Use motion detection for normal interaction
  • Call
  • Jump
  • Stroke
  • Use AR card for health monitor
  • Screen-facing case
  • Needs stimulation
  • Trampoline
  • Finally
  • Give him a shower

101
Summary
  • What we do and why
  • The development process
  • Testing and videos
  • Computer Vision Concepts
  • A robust head tracker
  • Marker based Augmented Reality
  • The problems we faced
  • A demo of EyePet

102
The End (please fill out your questionnaires)