Title: 3D Face Animation
- Trevor Gerbrand
- Michael Leggio
- Jordan Nielson
- Chester Szeto
Topics of Discussion
- Overview, History, and Challenges
- 3D Face Matching
- 3D Face Morphing
- Getting Your Face To Talk
- Techniques For Adding Realism To 3D Face Objects
- Conclusion/Questions
Overview
- Computer facial animation is primarily an area of
computer graphics that encapsulates models and
techniques for generating and animating images of
the human head and face.
History
- Human facial expression has been the subject of
scientific investigation for more than one
hundred years.
- Charles Darwin's book The Expression of the
Emotions in Man and Animals (1872) can be
considered a major starting point for modern
research in behavioural biology.
History (cont'd)
- The earliest work with computer-based facial
representation was done in the early 1970s.
- The first three-dimensional facial animation was
created by Frederick Parke in 1972.
- A University of Utah graduate, Parke currently
teaches at Texas A&M University.
History (cont'd)
- Since then, we have seen significant advances in
3D Face animation in the areas of science
(algorithms), technology (biometrics, missing
persons), and arts and entertainment (movies).
Primary Challenges
- Believability
- Subtleties of human expression
3D Face Matching - One Idea
- Model-Based Tracking
- Track face positions and expressions from each
frame in a video sequence depicting a human
face.
- Create and fit a generic 3D face model to each
frame using a continuous optimization technique.
- Alter the newly matched 3D face object to change
the dynamic of the video sequence.
3D Face Matching
[Figure: video frames 1-4 and the corresponding model matches]
The Model
- A linear combination of 3D texture-mapped models,
each corresponding to a basic facial expression.
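As a concrete illustration of the linear combination above, here is a minimal Python sketch; it assumes the basis meshes are stored as NumPy vertex arrays sharing the same topology, and the mesh names are hypothetical.

```python
import numpy as np

def blend_expressions(basis_vertices, weights):
    """Blend basic-expression meshes into one face mesh.

    basis_vertices: (n_expressions, n_vertices, 3) array, one vertex
    set per basic expression (all meshes share the same topology).
    weights: n_expressions blending weights (typically summing to 1).
    """
    w = np.asarray(weights, dtype=float)
    return np.tensordot(w, basis_vertices, axes=1)   # -> (n_vertices, 3)

# Example: 70% neutral, 30% joy (tiny hypothetical 4-vertex meshes).
neutral = np.zeros((4, 3))
joy = np.ones((4, 3))
blended = blend_expressions(np.stack([neutral, joy]), [0.7, 0.3])
```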
The Model
- Created by fitting a generic face model to a set
of photos (taken simultaneously) of a person's
face
The Model
- The texture map is extracted from the different
photos and combined into a single cylindrical
texture map
The Model
- Facial area is parameterized by a vector of
parameters, forming two subsets
- Position Parameters
- Include a translation t which indicates the
position of the centre of the face, and a
rotation R which indicates its orientation.
- Expression Parameters
- A set of blending weights, w1, w2, ..., wn, one for
each basic expression (e.g. anger, sadness, joy,
surprise)
- Intensity parameters α1, α2, ..., αn-1 from which
the blending weights are derived (see the relation
on the next slide)
The Model
- The relationship between the blending weights and
expression parameters
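The equation on this slide did not survive extraction. One common parameterization (stated here as an assumption, not necessarily the authors' exact formula) derives the n blending weights from the n-1 intensity parameters so that the weights sum to one:

```latex
w_i = \alpha_i \quad (i = 1, \dots, n-1), \qquad
w_n = 1 - \sum_{i=1}^{n-1} \alpha_i
```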
The Model
- To span a wider range of expressions, the face
object is split into three regions (mouth area,
eyes area, and forehead) that can be
independently controlled (using a set of
expression parameters for each region).
Fitting The Model To The Video Frame
- Continuous Optimization Technique
- Basic idea
- Computes the model parameters p producing a
rendering of the model Î(p) which best resembles
the target image It.
- How?...
Fitting The Model To The Video Frame
- An error function E(p) is evaluated iteratively to
measure the discrepancy between Î(p) and It:

  E(p) = \frac{1}{2} \sum_j \left[ I_t(x_j, y_j) - \hat{I}(p)(x_j, y_j) \right]^2 + D(p)

- (xj, yj) is the location of a pixel on the image
plane, and D(p) is a penalty function that forces each
blending weight to remain close to the interval [0, 1]
- The error function is minimized using variants of the
Levenberg-Marquardt algorithm.
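A minimal sketch of this fitting step on a toy problem: SciPy's least_squares with method="lm" (Levenberg-Marquardt) minimizes the same kind of sum-of-squares residual. The renderer and images below are placeholders, not the actual pipeline from the paper.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, target_pixels, render):
    """Per-pixel residuals I_t(x_j, y_j) - Î(p)(x_j, y_j)."""
    return target_pixels - render(p)

# Toy stand-in renderer: a 1D "image" that depends linearly on p.
basis = np.random.rand(100, 3)
render = lambda p: basis @ p
target_pixels = render(np.array([0.2, 0.5, 0.3]))   # synthetic target image

fit = least_squares(residuals, x0=np.zeros(3),
                    args=(target_pixels, render), method="lm")
print(fit.x)   # recovered model parameters
```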
The Levenberg-Marquardt Algorithm In A Nutshell
- The Levenberg-Marquardt (LM) algorithm is an
iterative algorithm that locates the minimum of a
multivariate function that is expressed as the
sum of squares of non-linear real-valued
functions.
- Slow when the current solution is far from the
correct one
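For reference, each LM iteration solves a damped normal-equation system for the parameter update, where r is the residual vector (target minus model), J is the Jacobian of the model with respect to the parameters p, and λ is the damping factor that is raised when a step fails and lowered when it succeeds:

```latex
(J^\top J + \lambda I)\,\delta = J^\top r, \qquad p \leftarrow p + \delta
```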
Fitting The Model To The Video Frame
- Once the position and the geometry of the face at
each frame have been recovered, this information
can be used to generate novel animations such
as...
Playing With The Model
- Exaggeration of facial expressions
- Change in viewpoint
- Transposing animation
- Final demo (whole process)
3D Facial Expressions
- Linear Interpolation
- Expression painting
- Timelines
- Muscle Simulation
- Mathematical Approach
- Sets of Vertex Displacements
- Facial Expression Databases
Linear Interpolation
- Morphing from one facial expression to another by
linearly moving the corresponding vertices from
their initial positions to their final positions
(see the sketch after this slide).
- Textures are combined in proportion to the
current percentage of the transformation
completed.
- Much work done by researchers from the Hebrew
University, the University of Washington, and
Microsoft.
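A minimal sketch of the morph described above, assuming corresponding vertex arrays and textures of equal shape; the names are illustrative.

```python
import numpy as np

def morph(src_vertices, dst_vertices, src_tex, dst_tex, t):
    """Linear morph at fraction t in [0, 1] of the transition.

    Vertices move linearly between corresponding positions and the
    two textures are cross-faded in the same proportion.
    """
    vertices = (1.0 - t) * src_vertices + t * dst_vertices
    texture = (1.0 - t) * src_tex + t * dst_tex
    return vertices, texture
```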
Animation Timeline
Painting Interface
- Differences of point locations and textures from
an expression are painted onto a neutral face
or other expression.
- Depending on the brush, changes may only
partially occur.
Creation of Expressions
Example Video
Man reacting to a Dr. Seuss poem
Muscle Simulation
- Facial muscles are modeled with ends attached to
points in the 3D model.
- Expressions are created by muscle contractions.
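A much simplified sketch of the idea (not Waters' actual muscle model): contracting a linear muscle pulls nearby vertices toward its fixed attachment point, with influence falling off with distance. The names and the falloff are illustrative assumptions.

```python
import numpy as np

def contract_linear_muscle(vertices, head, tail, contraction, radius):
    """Pull vertices near the muscle toward its fixed end (head).

    vertices: (n, 3) array; head/tail: 3-vectors marking the muscle ends;
    contraction in [0, 1]; radius limits the region of influence.
    """
    direction = head - tail                               # pull direction
    dist = np.linalg.norm(vertices - tail, axis=1)        # distance to free end
    falloff = np.clip(1.0 - dist / radius, 0.0, 1.0)      # weaker farther away
    return vertices + contraction * falloff[:, None] * direction
```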
Resulting Expressions
A Mathematical Approach
- The neutral face as well as the expressions are
scanned using a stereoscopic camera.
- For expressions, differences (point displacements)
are stored instead of actual positions.
Experimental Results
- Good results can be obtained by storing only
about 50% of the point displacements.
- The displacement of the remaining points can be
calculated by a weighted average of the
surrounding points. (Smoothing)
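A rough sketch of the smoothing idea under stated assumptions: a distance-weighted average over the k nearest vertices that have stored displacements (the exact weighting used in the work is not specified here).

```python
import numpy as np

def fill_displacements(positions, displacements, known_mask, k=4):
    """Estimate missing per-vertex displacements from nearby stored ones.

    positions: (n, 3) rest positions; displacements: (n, 3), valid only
    where the boolean known_mask is True; k: neighbours to average.
    """
    known = np.flatnonzero(known_mask)
    out = displacements.copy()
    for i in np.flatnonzero(~known_mask):
        d = np.linalg.norm(positions[known] - positions[i], axis=1)
        nearest = known[np.argsort(d)[:k]]
        w = 1.0 / (np.linalg.norm(positions[nearest] - positions[i], axis=1) + 1e-9)
        out[i] = (w[:, None] * displacements[nearest]).sum(axis=0) / w.sum()
    return out
```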
3D Facial Expression Databases
- Databases of scanned and preprocessed facial
expressions are becoming more available.
- One database, consisting of six subjects each
performing six basic expressions (2,581 video
frames in total), was created by Ya Chang and
Matthew Turk of UC Santa Barbara and Marcelo
Vieira and Luiz Velho of Rio de Janeiro, Brazil,
with the intention of making it publicly available.
Another Database
Getting Your Face to Talk
How do we turn this model...
Getting Your Face to Talk
...into this?
Recorded from The Elder Scrolls IV: Oblivion
Phonemes
- The smallest unit of speech sound that can
distinguish one word from another
- The English language has roughly 40 phonemes.
Visemes
- Visual representation of phonemes.
- Many phonemes are similar visually and can be
reduced to 20 visemes
- Motion captured to get key masks
- 12 consonant masks, 7 vowel masks and a silence
mask
[Image: example viseme mask]
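An illustrative sketch of the phoneme-to-viseme grouping; the table below covers only a few phonemes and uses made-up mask names, while a real mapping covers all ~40 phonemes and the 20 masks above.

```python
# Illustrative grouping of a few English phonemes into shared viseme masks.
PHONEME_TO_VISEME = {
    "p": "PBM", "b": "PBM", "m": "PBM",   # lips pressed together
    "f": "FV",  "v": "FV",                # lower lip against upper teeth
    "aa": "AA", "ae": "AA",               # open-mouth vowels
    "sil": "SIL",                         # silence mask
}

def to_visemes(phonemes):
    """Map a phoneme sequence to its viseme-mask sequence."""
    return [PHONEME_TO_VISEME.get(p, "SIL") for p in phonemes]

print(to_visemes(["b", "aa", "m", "sil"]))   # ['PBM', 'AA', 'PBM', 'SIL']
```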
Ways to Get a Talking Face
- Motion Capture
- Preprocessed
- Real Time
Motion Capture
- Done in two ways
- Markers
- Cameras capture the position of markers and
create the data.
- Markerless
- Cameras use features such as nose, eyes and
wrinkles to get data.
Motion Capture
- Advantages
- Simplest to implement
- Capture all positions of points
- Gives the best results
- Can give real-time results
- Disadvantages
- Takes more time
- Requires a lot of data to be stored
- Inefficient
Motion Capture
- Where would you see motion capture?
Motion Capture
- Movies
- Tom Hanks talking into the intercom in The Polar
Express
- Done in House of Moves Motion Capture Studios
The Polar Express by Warner Bros.
Motion Capture
- Games
- Motion capture of Boris Diaw talking while
playing basketball
- Done in EA's motion capture studio in Vancouver
NBA Live by EA Sports
Preprocessed Animation
- Can use audio or text to create 3D animation
- Audio or text is analyzed to create a sequence of
key frames
[Figure: sequence of key frames from the neutral position through the 'e' and 'r' visemes]
Preprocessed Animation
- Animation can be interpolated between visemes by
using linear interpolation, or by creating
inter-viseme frames to interpolate through (see the
sketch after this slide)
- The transitions between viseme pairs can also
be constructed using neural networks that examine
a database of recordings
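A minimal sketch of turning a viseme sequence into interpolated key frames; viseme_meshes, the frame count, and the viseme names are assumptions, and the viseme list could come from a phoneme-to-viseme table like the one on the Visemes slide.

```python
import numpy as np

def keyframes_from_visemes(visemes, viseme_meshes, frames_per_viseme=8):
    """Expand a viseme sequence into a sequence of interpolated frames.

    visemes: list of viseme names in playback order.
    viseme_meshes: dict mapping viseme name -> (n_vertices, 3) array.
    Consecutive viseme meshes are bridged with linear interpolation.
    """
    frames = []
    for a, b in zip(visemes, visemes[1:]):
        for t in np.linspace(0.0, 1.0, frames_per_viseme, endpoint=False):
            frames.append((1.0 - t) * viseme_meshes[a] + t * viseme_meshes[b])
    frames.append(viseme_meshes[visemes[-1]])
    return frames
```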
Preprocessed Animation
- Advantages
- More generalized for audio input
- Can create a similar animation to motion capture
- Disadvantages
- Has no application for real-time animations
Preprocessed Animation
- Where would you see preprocessed animation?
Preprocessed Animation
- Games
- Karaoke Revolution
- Done by OC3 Entertainment with their Impersonator
software
Karaoke Revolution American Idol by Konami
Real Time Animation
- Still uses visemes
- Requires the use of a Fast Fourier Transform
(FFT) to determine phonemes
- Probabilities can also be used to help determine
the viseme masks
- Probabilities are determined by a Gaussian Mixture
Model (GMM)
- Can use a Hidden Markov Model (HMM), as is common
in speech recognition
Real Time Animation
- The GMM is trained to map audio features to the
visual (viseme) set, creating a probability
distribution (see the sketch after this slide)
- The HMM is used to detect common audio events
such as the starting and stopping of words
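A rough sketch of the per-frame audio-to-viseme step under stated assumptions: FFT magnitudes as features and one scikit-learn GaussianMixture per viseme, choosing the viseme whose model gives the frame the highest likelihood. The HMM smoothing is omitted, and the feature choice is illustrative rather than the method of a specific system.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def spectral_features(frame, n_bins=32):
    """Coarse log-magnitude FFT spectrum of one windowed audio frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    return np.log1p(spectrum[:n_bins])

def train_viseme_models(labeled_frames, n_components=4):
    """Fit one GMM per viseme; labeled_frames: dict viseme -> list of frames."""
    return {v: GaussianMixture(n_components).fit(
                np.array([spectral_features(f) for f in frames]))
            for v, frames in labeled_frames.items()}

def predict_viseme(models, frame):
    """Pick the viseme whose GMM assigns the frame the highest likelihood."""
    feats = spectral_features(frame)[None, :]
    return max(models, key=lambda v: models[v].score_samples(feats)[0])
```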
Real Time Animation
- Advantages
- Creates a better illusion of a realistic model
- Decreases load times of the animation
- Disadvantages
- Can't always ensure an accurate animation
- Requires intensive computations
Real Time Animation
- Where would you use real time animation?
Real Time Animation
- Chat programs
- Have a 3D model talk to you in place of video of
the person
- Games
- Half-Life 2: Episode One
- Gears of War
Gears of War
Calligraphic Displays
- 1950s to 1980s
- Vector Displays or X-Y Displays
- Stroke Refresh versus Screen Refresh
- Solids were a problem
Calligraphic Displays
http://www.goriya.com/flash/asteroids/asteroids.shtml
Raster Display
- Modern Displays
- A 2D matrix representing the screen pixels
- 24-bit color depth is the norm
Rendering Images
- Eye coordinate system
- Viewport Clipping
- 3D to 2D representation
Rendering Images
Visible Surface Algorithms
- Z-Buffer
- 2D array storing z values
- High memory cost, but no sorting required
- Scan line
- Initial sort by y required
- Only polygons currently intersecting need to be
considered
- Spatial coherence greatly improves efficiency
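A minimal Z-buffer sketch; the fragment format, image dimensions, and the closer-is-smaller depth convention are assumptions for illustration.

```python
import numpy as np

def zbuffer_composite(width, height, fragments):
    """Keep the nearest fragment at each pixel; no sorting is needed.

    fragments: iterable of (x, y, z, color) with smaller z meaning closer.
    """
    depth = np.full((height, width), np.inf)     # the 2D array of z values
    image = np.zeros((height, width, 3))
    for x, y, z, color in fragments:
        if z < depth[y, x]:                      # closer than what is stored
            depth[y, x] = z
            image[y, x] = color
    return image
```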
Anti-aliasing
- Aliasing - distortion artifacts that appear when
high-frequency detail is represented at a lower
sampling frequency
- Direct sample
- Average area intensity per pixel
- Sinc filter
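A minimal sketch of the "average area intensity per pixel" approach: render at a higher resolution, then average each block of samples down to one output pixel. This is a simple box filter assumed for illustration, not the ideal sinc filter.

```python
import numpy as np

def box_downsample(image, factor):
    """Average each factor x factor block of samples into one pixel."""
    h = image.shape[0] - image.shape[0] % factor   # trim to a multiple
    w = image.shape[1] - image.shape[1] % factor   # of the factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor, -1)
    return blocks.mean(axis=(1, 3))
```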
Anti-aliasing
Anti-aliasing
Lighting and Shadows
- Realistic depiction requires a combination of
different techniques
- Specular
- Diffuse
- Ambient
Lighting and Shadows
- Specular Reflection
- Perfect mirror-like reflection
Lighting and Shadows
- Diffuse Reflection
- Light is reflected in all directions
- e.g. chalk, flat paints, non-glossy surfaces
- Lambert's Law
Lighting and Shadows
- Ambient Lighting
- Incoming light reflected from other surfaces
- Goral, Torrance, Greenberg
- Interreflection
- Many small, perfectly diffuse polygons
- Lacks specular component
- Expensive
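A generic textbook-style sketch combining the three terms above: ambient, Lambertian diffuse, and a specular highlight. This is a Phong-like model chosen for illustration; the coefficients and vectors are assumptions, not a specific renderer's implementation.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def shade(normal, light_dir, view_dir, base_color,
          ka=0.1, kd=0.7, ks=0.2, shininess=32):
    """Ambient + Lambert diffuse + specular for one surface point."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    diffuse = max(np.dot(n, l), 0.0)                 # Lambert's law: N . L
    r = 2.0 * np.dot(n, l) * n - l                   # mirror reflection of l
    specular = max(np.dot(r, v), 0.0) ** shininess if diffuse > 0 else 0.0
    return (ka + kd * diffuse) * np.asarray(base_color) + ks * specular
```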
Texture Mapping
- Illusion of complexity at a greatly reduced cost
- Wallpapering polygons
- Texture represented in a 2D array
- Each vertex references an entry in the array;
interpolated texture coordinates select the texel
used to color the polygon at each pixel (see the
sketch after this slide)
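A minimal texture-lookup sketch using nearest-neighbour sampling with normalized (u, v) coordinates; in practice the coordinates are interpolated across the polygon before the lookup.

```python
import numpy as np

def sample_texture(texture, u, v):
    """Return the texel at normalized coordinates u, v in [0, 1].

    texture: (rows, cols, 3) array holding the 2D texture image.
    """
    rows, cols = texture.shape[:2]
    row = int(np.clip(v * (rows - 1), 0, rows - 1))
    col = int(np.clip(u * (cols - 1), 0, cols - 1))
    return texture[row, col]
```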
Texture Mapping
- Light Mapping
- Bump Mapping
Comparisons
2007
1998
Comparisons
1998
2006
2006
Comparisons
Questions?
References
- http://www.journalofvision.org/5/10/4/article.aspx
- Lip animation based on observed 3D speech
dynamics - Kalberer, Van Gool (2001)
- udn.epicgames.com/Two/ImpersonatorHeadRigging
- Real-Time Lip-Synch Face Animation Driven by
Human Voice - Fu Jie Huang and Tsuhan Chen