Title: Models of Human Motion: Tracking, Learning, and Animating People
1 Models of Human Motion: Tracking, Learning, and Animating People
Christoph Bregler
Computer Science Division, University of California, Berkeley
Joint work with Jitendra Malik and Jerome A. Feldman (UC Berkeley); Stephen M. Omohundro and Yochai Konig (ICSI); Michele Covell and Malcolm Slaney (Interval)
2 Problem Domains
- Human Computer Interaction
  Gesture Recognition
  Speech Reading
  Personal Agent
3 Problem Domains
- Content-Based Video Database Annotation
  Content-Based Query
4 Problem Domains
- Motion Capture / Animation
  Body Suits, Markers
  Video Motion Capture
5 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
7 Visual Tracking
Standard Techniques
- Template Matching
- Edges / Shape / Color
- Background Subtraction
- Optical Flow
New Challenges
- Complex Variation
- Self Occlusion
- Noise (Folds, Low Contrast)
8 Optical Flow
local ambiguities
9 Motion Constraints
- Optical Flow: 2N parameters (N pixels)
- Layered Motion: 6L parameters (L areas)
- Articulated Chains: 6 + K parameters (K DOFs)
10 Motion Estimation
[Diagram: energy E(V) over the image velocities V = (v_1, ..., v_n), with a constraint applied to V]
11 Motion Estimation
[Diagram: energy E(V) with a constraint applied to V; additional labels D]
12 Motion Estimation
[Diagram: energy E(V) with V constrained by V = M(q); additional labels D]
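13 Twist Motion and Exponential Map
Murray, Li, Sastry
\[ q_{t+1} = G \, q_t, \qquad G = \begin{pmatrix} r_1 & r_2 & r_3 & d_x \\ r_4 & r_5 & r_6 & d_y \\ r_7 & r_8 & r_9 & d_z \\ 0 & 0 & 0 & 1 \end{pmatrix} = e^{\hat{\xi}} \]
\[ \hat{\xi} = \begin{pmatrix} 0 & -\omega_3 & \omega_2 & u_1 \\ \omega_3 & 0 & -\omega_1 & u_2 \\ -\omega_2 & \omega_1 & 0 & u_3 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]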
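The relation between the twist matrix and the rigid transform G can be checked numerically. The following is a minimal sketch, assuming the twist is ordered (u1, u2, u3, w1, w2, w3) as in the matrix on this slide and that G is the matrix exponential of that 4x4 matrix. The function names `hat` and `twist_exp` are illustrative, and the truncated power series stands in for a closed-form or library exponential.

```python
import numpy as np

def hat(xi):
    """Map a twist (u1, u2, u3, w1, w2, w3) to its 4x4 matrix form."""
    u1, u2, u3, w1, w2, w3 = xi
    return np.array([
        [0.0, -w3,  w2, u1],
        [ w3, 0.0, -w1, u2],
        [-w2,  w1, 0.0, u3],
        [0.0, 0.0, 0.0, 0.0],
    ])

def twist_exp(xi, terms=30):
    """Approximate exp(hat(xi)) with a truncated power series."""
    A = hat(xi)
    G = np.eye(4)
    term = np.eye(4)
    for k in range(1, terms):
        term = term @ A / k   # A^k / k!
        G = G + term
    return G

# Assumed example: pure rotation by 90 degrees about the z axis.
xi = np.array([0.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2])
G = twist_exp(xi)

q = np.array([1.0, 0.0, 0.0, 1.0])   # homogeneous point on the x axis
q_next = G @ q                        # lands on the y axis
print(np.round(q_next, 6))
```

The upper-left 3x3 block of G is a rotation matrix and the last column holds the translation, matching the r and d entries of G on the slide.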
14 Product of Exponential Map
\[ G = e^{\hat{\xi}_0} \, e^{\hat{\xi}_1 \alpha_1} \, e^{\hat{\xi}_2 \alpha_2} = G(\xi_0, \alpha_1, \alpha_2) \]
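A product of exponentials composes a body pose with one exponential per joint. The sketch below is an illustration under stated assumptions: a planar two-joint chain with unit-length links, joint axes along z, and a body twist that is a pure translation. The helper names (`revolute_twist`, `chain_pose`) and all numeric values are hypothetical, not taken from the slides.

```python
import numpy as np

def hat(xi):
    """4x4 matrix form of a twist (u1, u2, u3, w1, w2, w3)."""
    u1, u2, u3, w1, w2, w3 = xi
    return np.array([
        [0.0, -w3,  w2, u1],
        [ w3, 0.0, -w1, u2],
        [-w2,  w1, 0.0, u3],
        [0.0, 0.0, 0.0, 0.0],
    ])

def twist_exp(xi, angle=1.0, terms=40):
    """Truncated power series for exp(hat(xi) * angle)."""
    A = hat(xi) * angle
    G, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        G = G + term
    return G

def revolute_twist(axis, point):
    """Twist of a unit-rate rotation about `axis` through `point`."""
    axis = np.asarray(axis, dtype=float)
    point = np.asarray(point, dtype=float)
    return np.concatenate([-np.cross(axis, point), axis])

def chain_pose(xi0, joint_twists, angles):
    """G = exp(xi0) * exp(xi1 * a1) * ... * exp(xiK * aK)."""
    G = twist_exp(xi0)
    for xi_k, a_k in zip(joint_twists, angles):
        G = G @ twist_exp(xi_k, a_k)
    return G

xi0 = np.array([0.5, 0.0, 0.0, 0.0, 0.0, 0.0])   # body shifted 0.5 along x
xi1 = revolute_twist([0, 0, 1], [0, 0, 0])       # first joint at the origin
xi2 = revolute_twist([0, 0, 1], [1, 0, 0])       # second joint one link out

G = chain_pose(xi0, [xi1, xi2], [np.pi / 2, np.pi / 2])
tip = G @ np.array([2.0, 0.0, 0.0, 1.0])         # end of the second link
print(np.round(tip, 6))
```

With both joints at 90 degrees the tip folds back to (-1, 1, 0) in body coordinates, and the body translation moves it to (-0.5, 1, 0).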
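15 Image Velocities
Orthographic projection from world to image:
\[ x_i = P \, e^{\hat{\xi}_0} \, e^{\hat{\xi}_1 \alpha_1} \, e^{\hat{\xi}_2 \alpha_2} \, q_i, \qquad P = \begin{pmatrix} S & 0 & 0 & 0 \\ 0 & S & 0 & 0 \end{pmatrix} \]
\[ v_i = \frac{d x_i}{d t} = M_i \, (\xi, \alpha_1, \alpha_2)^T \]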
16 Motion Estimation
[Diagram: energy E(V) with a constraint applied to V; additional labels D]
\[ V = M \, (\xi, \alpha_1, \ldots, \alpha_K)^T \]
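When the velocities V depend linearly on the motion parameters through a matrix M, the parameters can be recovered by linear least squares. The sketch below is a simplified illustration, not the tracker from the talk: it assumes a single rigid segment (no joint angles), orthographic projection with unit scale, and directly observed point velocities instead of an image-based energy E(V). The names `skew` and `velocity_matrix` and the synthetic data are hypothetical.

```python
import numpy as np

def skew(q):
    """Cross-product matrix: skew(q) @ w equals np.cross(q, w)."""
    x, y, z = q
    return np.array([
        [0.0, -z,   y],
        [ z,  0.0, -x],
        [-y,   x,  0.0],
    ])

def velocity_matrix(points, scale=1.0):
    """Stack the 2x6 blocks that map a twist to projected point velocities.

    For a twist (u, w), a 3D point q moves with velocity u + w x q;
    the orthographic projection keeps only the x and y rows.
    """
    P = scale * np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]])
    blocks = [P @ np.hstack([np.eye(3), -skew(q)]) for q in points]
    return np.vstack(blocks)   # shape (2N, 6)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20, 3))

# Depth translation u3 is invisible under orthographic projection,
# so the synthetic ground truth keeps it at zero.
xi_true = np.array([0.1, -0.2, 0.0, 0.03, -0.01, 0.05])

M = velocity_matrix(points)
V = M @ xi_true                                  # simulated image velocities
xi_est, *_ = np.linalg.lstsq(M, V, rcond=None)   # minimum-norm solution
print(np.round(xi_est, 6))
```

For an articulated chain, one extra column per joint angle would be appended to M, giving the 6 + K unknowns of the constrained model.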
17 Video
18 Comparison to Biometric Data
kinematic tracker
Murray, Drought, Kory, 1964
19 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
20 Learning Gait Dynamics
HMM
Dynamical Systems
Motion Measurements
21 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
22 Multiple Views
23 Eadweard Muybridge
24 Video Graphics by Charles Ying
25 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
26 Non-Rigid Measurement
[Diagram: energy E(S) over S = (p_1, ..., p_n), with a constraint applied to S]
27 Manifold Learning (with Steve Omohundro)
EM
Mixture of Patches
Training Data
28 Mixture of Projections
29 Constrained Tracking
[Diagram: energy E(S) over S = (p_1, ..., p_n), with a constraint applied to S]
30 Example Tracks
31 Video
32 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
33 Speech Reading (with Yochai Konig)
Acoustic features
Viseme Models
Noise             acoustic   visual
20 dB SNR         33.5       26.0
10 dB SNR         56.1       48.0
15 dB crosstalk   67.3       46.0
34 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
35 Video Rewrite (with Michele Covell, Malcolm Slaney)
New
Training
36 Goal: Photo-realistic Talking Face
Video Rewrite OR Handcoded 3D Model
37 Building Video Model
- Phonetic
- Head Pose
- Mouth Shape
/D/
/OH/
/N/
/AH/
38 Video
39 Video Model
Ellen Model: 8 min, 1,700 triphones
40 Synthesis: Overview
background face
41 Synthesis
- Transcribe
- Find Lip Clips
- Stitch Together
/J/
/EH/
/L/
/IY/
42 Matching Co-Articulation
/UW-T-UW/
?
43 Matching Co-Articulation
match
44 Co-Articulation: Tri-Phones
/UW-T-UW/
More than 20,000 Tri-Phones in English
/AA-T-AA/
/AA-S-AA/
.
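A triphone is a phoneme together with its left and right neighbors, which is how co-articulation context is captured. The following is a minimal sketch of splitting a transcribed phoneme sequence into triphones. The `SIL` padding symbol at the utterance boundaries and the function name `triphones` are assumptions made for illustration. The example sequence reuses the phoneme labels shown on the Synthesis slide.

```python
def triphones(phonemes, pad="SIL"):
    """Return one (left, center, right) triple per phoneme.

    The sequence is padded at both ends so that the first and last
    phonemes also get a full three-symbol context.
    """
    padded = [pad] + list(phonemes) + [pad]
    return [tuple(padded[i:i + 3]) for i in range(len(phonemes))]

sequence = ["J", "EH", "L", "IY"]
for left, center, right in triphones(sequence):
    print(f"/{left}-{center}-{right}/")
```

Each triple is a lookup key for a lip clip in the video model. When the exact key is missing, an approximate match is needed.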
45 Matching: Viseme-Distance
approximate match
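An approximate match can be sketched as a nearest-neighbor search over triphones, where phonemes that look alike on the lips count as closer than phonemes that do not. Everything specific below is an assumption for illustration: the viseme grouping in `VISEME_CLASS`, the 0 / 0.5 / 1 cost values, and the function names are hypothetical and are not the distance used in Video Rewrite.

```python
# Hypothetical grouping of phonemes by lip appearance.
VISEME_CLASS = {
    "P": "bilabial", "B": "bilabial", "M": "bilabial",
    "F": "labiodental", "V": "labiodental",
    "T": "alveolar", "D": "alveolar", "S": "alveolar", "N": "alveolar",
    "UW": "rounded", "OH": "rounded",
    "AA": "open", "AH": "open",
}

def phoneme_distance(a, b):
    """0 for the same phoneme, 0.5 for the same viseme class, else 1."""
    if a == b:
        return 0.0
    class_a = VISEME_CLASS.get(a)
    if class_a is not None and class_a == VISEME_CLASS.get(b):
        return 0.5
    return 1.0

def triphone_distance(t1, t2):
    """Sum of per-position phoneme distances between two triphones."""
    return sum(phoneme_distance(a, b) for a, b in zip(t1, t2))

def best_match(target, library):
    """Pick the library triphone with the smallest distance to target."""
    return min(library, key=lambda cand: triphone_distance(target, cand))

library = [("AA", "T", "AA"), ("AA", "S", "AA"), ("OH", "D", "OH")]
target = ("UW", "T", "UW")
match = best_match(target, library)
print(match, triphone_distance(target, match))
```

In this toy library the target /UW-T-UW/ is absent, so the search falls back to the clip whose three phonemes are all in the same assumed viseme classes.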
46 Video
47 Stitching
48 Stitching
Morphing
49 Video Rewrite Results
JFK - Video Model: 2 minutes of data
Ellen - Video Model: 8 minutes of data
50 Human Motion
articulated
run
non-rigid
read my lips
measurement
animation
recognition
51 Future: Virtual Actors
52 Challenges
Language
Graphics
Measurement
53 Key Technologies
Learning
Video-Based Rendering