Synthesizing Realistic Facial Expressions from Photographs

Slides: 26
Provided by: Compu508


Transcript and Presenter's Notes

Title: Synthesizing Realistic Facial Expressions from Photographs

Synthesizing Realistic Facial Expressions from
Photographs
  • Frédéric Pighin, Jamie Hecker, Dani Lischinski†,
    Richard Szeliski‡, David H. Salesin
  • University of Washington, †The Hebrew University,
    ‡Microsoft Research
  • SIGGRAPH 1998
  • Reported by Wu Xiangyang on Nov. 6, 2001
  • State Key Lab. of CAD&CG, Zhejiang Univ.

  • 1. The idea of this paper
  • (1) Present new techniques for creating
    photorealistic textured 3D facial models from
    photographs of a human subject, and for creating
    smooth transitions between different facial
    expressions by morphing between these models.
  • (2) Employ a user-assisted technique to recover
    the camera poses.
  • (3) Use a scattered data interpolation technique
    to deform a generic face mesh to fit the
    particular geometry of the subject's face.
  • (4) Produce transitions by 3D shape morphing
    between the corresponding face models, while at
    the same time blending the corresponding textures.

Introduction
  • 2. Main processing
  • With our approach, 2D morphing techniques can be
    combined with 3D transformations of a geometric
    model to automatically produce 3D facial
    expressions with a high degree of realism.
  • Our process consists of the following steps:
  • Capture multiple views of a human subject (with a
    given facial expression) using cameras at
    arbitrary locations.
  • Digitize these photographs and manually mark a
    small set of initial corresponding points on the
    face in the different views. These points are
    then used to automatically recover the camera
    parameters (position, focal length, etc.) and the
    3D positions of the marked points.
  • The 3D positions are then used to deform a
    generic 3D face mesh to fit the face of the
    particular human subject. At this stage,
    additional corresponding points may be marked to
    refine the fit.
  • Extract one or more texture maps for the 3D model
    from the photos. Either a single view-independent
    texture map can be extracted, or the original
    images can be used to perform view-dependent
    texture mapping.

Introduction
  • 3. Four advantages
  • It gives the user complete freedom in specifying
    the correspondences, and enables the user to
    refine the initial fit as needed.
  • It can handle fairly arbitrary camera positions
    and lenses.
  • Our system is useful not only for creating
    realistic face models, but also for performing
    realistic transitions between different
    expressions.
  • Develop a morphing technique that allows
    different regions of the face to have different
    percentages or mixing proportions of facial
    expressions.
  • Introduce a painting interface, which allows
    users to locally add in a little bit of an
    expression to an existing composite expression.

Model Fitting
  • The model fitting process consists of three
    stages:
  • Pose recovery: we apply computer vision
    techniques to estimate the viewing parameters
    (position, orientation, and focal length) for
    each of the input cameras. We simultaneously
    recover the 3D coordinates of a set of feature
    points on the face.
  • Scattered data interpolation: we use the
    estimated 3D coordinates of the feature points to
    compute the positions of the remaining face mesh
    vertices.
  • Shape refinement: we specify additional
    correspondences between facial vertices and image
    coordinates to improve the estimated shape of the
    face (while keeping the camera pose fixed).

Model Fitting
  • 1. (Camera) pose recovery
  • Starting with a rough knowledge of the camera
    positions (e.g., frontal view, side view, etc.)
    and of the 3D shape (given by the generic head
    model), we iteratively improve the pose and the
    3D shape estimates in order to minimize the
    difference between the predicted and observed
    feature point positions.
  • Our formulation is based on the non-linear least
    squares structure-from-motion algorithm of
    Szeliski and Kang [41].
  • In our implementation:
  • <1> Unlike that method, which uses the
    Levenberg-Marquardt algorithm to perform a
    complete iterative minimization over all of the
    unknowns simultaneously,
  • <2> we break the problem down into a series of
    linear least squares problems that can be solved
    using very simple techniques.
  • (3) Formulate the pose recovery problem

Model Fitting
  • Explanation
  • The equations above are linear in each of the
    unknowns.
  • We consider are not zero, the
    unknowns include
  • <3> For each parameter or set of parameters
    chosen, we solve for the unknowns using linear
    least squares. The simplicity of this approach is
    a result of solving for the unknowns in five
    separate stages, so that the parameters for a
    given camera or 3D point can be recovered
    independently of the other parameters.
  • The whole process is repeated iteratively.
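
The staged strategy can be illustrated with a deliberately simplified sketch. It assumes an orthographic camera with known rotations, which is not the projection model of the paper, and alternates between only two linear least-squares stages: the translation of each camera, then the 3D position of each point. All function and variable names are hypothetical.

```python
import numpy as np

def reprojection_error(R2, t, P, X):
    """Sum of squared differences between predicted and observed 2D points."""
    pred = np.einsum("kij,nj->kni", R2, P) + t[:, None, :]
    return float(np.sum((pred - X) ** 2))

def alternate_least_squares(R2, X, P0, iters=20):
    """R2: (K,2,3) first two rows of each camera rotation (assumed known).
    X: (K,N,2) observed feature positions. P0: (N,3) initial 3D guess."""
    K, N = X.shape[:2]
    P = P0.copy()
    t = np.zeros((K, 2))
    errors = []
    A = R2.reshape(2 * K, 3)  # stacked projection rows, shared by all points
    for _ in range(iters):
        # Stage 1: translation of each camera, independently of the others.
        proj = np.einsum("kij,nj->kni", R2, P)
        t = (X - proj).mean(axis=1)
        # Stage 2: position of each 3D point, independently of the others.
        for i in range(N):
            b = (X[:, i, :] - t).reshape(2 * K)
            P[i] = np.linalg.lstsq(A, b, rcond=None)[0]
        errors.append(reprojection_error(R2, t, P, X))
    return P, t, errors
```

Because each stage exactly minimizes the error over its own block of unknowns, the recorded error cannot increase from one iteration to the next, which is the property that makes this kind of decomposition safe to iterate.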

Model Fitting
  • 2. Scattered data interpolation
  • (1) Constructing an interpolation function that
    carries the recovered feature points over to the
    rest of the mesh is a standard problem in
    scattered data interpolation.
  • (2) We attempt to find a smooth vector-valued
    function f(p) fitted to the known data at the
    feature points,
  • from which we can compute the unknowns at the
    remaining mesh vertices.
  • (3) We use a method based on radial basis
    functions (RBF).
Model Fitting
  • 3. Correspondence-based shape refinement (idea)
  • We can further improve the shape by specifying
    additional correspondences.
  • We do not use the additional correspondences to
    update the camera pose estimates.
  • We simply solve for the values of the new feature
    points pi using a simple least-squares fit.
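
As an illustration of such a least-squares fit, the sketch below recovers one 3D point from its marked image positions while the cameras stay fixed. It assumes each camera is described by a generic 3x4 projection matrix, which is a stand-in for the camera parameterization of the paper; the function name is hypothetical.

```python
import numpy as np

def triangulate(cameras, observations):
    """cameras: list of 3x4 projection matrices (held fixed).
    observations: list of (x, y) image coordinates of the same feature."""
    rows, rhs = [], []
    for P, (x, y) in zip(cameras, observations):
        # x * (P[2] . [p, 1]) = P[0] . [p, 1], rearranged to be linear in p.
        rows.append(x * P[2, :3] - P[0, :3])
        rhs.append(P[0, 3] - x * P[2, 3])
        rows.append(y * P[2, :3] - P[1, :3])
        rhs.append(P[1, 3] - y * P[2, 3])
    p, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return p
```

Each view contributes two linear equations in the three unknown coordinates, so two or more views give an overdetermined system that `np.linalg.lstsq` solves directly.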

Texture extraction
  • Introduction
  • There are two principal ways to blend values from
    different photographs:
  • View-independent blending,
  • resulting in a texture map that can be used to
    render the face from any viewpoint.
  • View-dependent blending,
  • which adjusts the blending weights at each
    point based on the direction of the current
    viewpoint.
Texture extraction
  • 1. Weight maps
  • (1) Definition
  • The texture value at each point on the face model
    can be expressed as a convex combination of the
    corresponding colors in the photographs,
  • where T(p) is the texture value at each point p
    on the face model,
  • Ik is the image function (color at each pixel of
    the k-th photograph),
  • (xk, yk) are the image coordinates of the
    projection of p onto the k-th image,
  • mk(p) is the weight of the k-th photograph at p.
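
One way to write this convex combination that is consistent with the definitions listed, assuming the weights are normalized by their sum over all photographs:

```latex
T(p) = \frac{\sum_{k} m_k(p)\, I_k(x_k, y_k)}{\sum_{k} m_k(p)}
```

Dividing by the sum of the weights is what makes the combination convex: the effective coefficients are non-negative and add up to one at every point p.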

Texture extraction
  • (2) The construction of the weight maps
  • The construction of these weight maps is the most
    interesting component of texture extraction.
  • Four important considerations must be taken
    into account:
  • <1> Self-occlusion: the weight mk(p) should be
    zero unless p is front-facing with respect to the
    k-th image and visible in it.
  • <2> Smoothness: the weight map should vary
    smoothly, in order to ensure a seamless blend
    between different input images.
  • <3> Positional certainty: mk(p) should depend
    on the positional certainty of p with respect
    to the k-th image. The positional certainty is
    defined as the dot product between the surface
    normal at p and the k-th direction of projection.
  • <4> View similarity: for view-dependent texture
    mapping, the weight should also depend on the
    angle between the direction of projection of p
    onto the k-th image and its direction of
    projection in the new view.

Texture extraction
  • Note (explanation)
  • In order to support rapid display of the textured
    face model from any viewpoint, it is desirable to
    blend the individual photographs together into a
    single texture map.
  • This texture map is constructed on a virtual
    cylinder enclosing the face model. The mapping
    between the 3D coordinates on the face mesh and
    the 2D texture space is defined using a
    cylindrical projection.
  • 2. View-independent texture mapping
  • (1) We index the weight map mk by the (u, v)
    coordinates of the texture being created.
  • (2) The weight mk(u, v) is determined by the
    following steps:

Texture extraction
  • <1> Construct a feathered visibility map Fk for
    each image k. These maps are defined in the same
    cylindrical coordinates as the texture map. We
    initially set Fk(u, v) to 1 if the corresponding
    facial point p is visible in the k-th image, and
    to 0 otherwise. The result is a binary visibility
    map, which is then smoothly ramped (feathered)
    from 1 to 0 in the vicinity of the boundaries.
  • <2> Compute the 3D point p on the surface of the
    face mesh whose cylindrical projection is (u, v)
    (see Figure 2). This computation is performed by
    casting a ray from (u, v) on the cylinder towards
    the cylinder's axis. The first intersection
    between this ray and the face mesh is the point
    p. Let Pk(p) be the positional certainty of p
    with respect to the k-th image (the dot product
    between the surface normal at p and the k-th
    direction of projection).
  • <3> Set the weight mk(u, v) to the product
    Fk(u, v) · Pk(p).
  • For view-independent texture mapping,
    compute each pixel of the resulting texture
    T(u, v) as a weighted sum of all the original
    image functions, indexed by (u, v).
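
The weighting and blending steps above can be sketched as follows. The sketch assumes the photographs have already been resampled into the cylindrical (u, v) texture space and that the feathered visibility maps are given; the sign convention for the direction of projection, the clamping of back-facing points, and all names are assumptions.

```python
import numpy as np

def positional_certainty(normals, view_dir):
    """Dot product of unit surface normals (H,W,3) with the unit direction
    from the surface toward camera k, clamped so back-facing points get 0."""
    return np.clip(normals @ view_dir, 0.0, None)

def blend_textures(images, visibility, certainty, eps=1e-8):
    """images: (K,H,W,3) photographs resampled into (u,v) texture space.
    visibility: (K,H,W) feathered maps F_k in [0,1].
    certainty: (K,H,W) positional certainty P_k."""
    weights = visibility * certainty          # m_k(u,v) = F_k(u,v) * P_k(p)
    total = weights.sum(axis=0)
    blended = (weights[..., None] * images).sum(axis=0)
    # Normalize by the total weight; eps avoids division by zero where
    # no photograph sees the point.
    return blended / np.maximum(total, eps)[..., None]
```

Texels that no photograph sees end up with a zero color here; a real pipeline would need some form of hole filling, which this sketch leaves out.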

Texture extraction
  • Disadvantage
  • This approach blends together resampled
    versions of the original images of the face.
    Because of resampling and slight registration
    errors, the resulting texture is slightly blurry.
  • 3. View-dependent texture mapping
  • (1) Definition: render the model several times,
    each time using a different input photograph as a
    texture map, and blend the results.
  • (2) The term Vk(d), which depends on the
    new viewing direction
  • Given a viewing direction d, we first
    select the subset of photographs used for
    the rendering and then assign blending weights
    to each of these photographs.
  • (Pulli et al. [38] select three
    photographs based on a Delaunay triangulation of
    a sphere surrounding the object.)
  • Since our cameras were positioned roughly in
    the same plane (so the view directions can be
    treated as lying in a 2D plane), we select just
    the two photographs whose view directions dl and
    dl+1 are the closest to d and blend between the
    two.

Texture extraction
  • Let d be the given viewing direction
  • k = l, l+1 (indices of the two selected
    photographs)
  • Selected view directions: dl, dl+1
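
A small sketch of choosing two photographs and their blending weights for a new viewing direction d follows. The weighting function used here (the dot product with d minus the dot product between the two selected view directions, then normalized) is an assumed simple choice that gives a single photograph full weight when d matches its view direction; picking the two views by largest dot product, the fallback branch, and the names are assumptions as well.

```python
import numpy as np

def view_dependent_weights(d, view_dirs):
    """d: unit viewing direction. view_dirs: (K,3) unit view directions of
    the photographs. Returns K weights, nonzero for at most two views."""
    sims = view_dirs @ d
    l, l1 = np.argsort(sims)[-2:][::-1]        # two closest photographs
    base = view_dirs[l] @ view_dirs[l1]
    w = np.zeros(len(view_dirs))
    w[[l, l1]] = np.clip(sims[[l, l1]] - base, 0.0, None)
    total = w.sum()
    if total == 0.0:
        # d is not bracketed by the two views: fall back to the closest one.
        w[l] = 1.0
        return w
    return w / total
```

With cameras at -45, 0 and 45 degrees around the head, a view from 0 degrees uses only the frontal photograph, and a view from 22.5 degrees mixes the frontal and the 45 degree photographs equally.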

  • 4. Eyes, teeth, ears, and hair
  • (1) Their difficulty
  • (2) Select an image in which they are clearly
    visible
  • (3) Eyes and teeth are partially shadowed
  • 5. Expression morphing (clear)
  • (1) In general, the problem of morphing between
    arbitrary polygonal meshes is a difficult one,
    since it requires a set of correspondences
    between meshes with potentially different
    topology.
  • However, in our case the topology of all the
    face meshes is identical. Thus, there is
    already a natural correspondence between
    vertices.
  • (2) For 3D morphing, the associated textures
    must be blended together with the geometric
    interpolation.
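
Because every face mesh shares one topology, the geometric part of a morph can be sketched as a per-vertex linear interpolation. Linear interpolation and the names below are assumptions for illustration.

```python
import numpy as np

def morph_geometry(verts_a, verts_b, alpha):
    """Interpolate two (V,3) vertex arrays of meshes with identical topology.
    Vertex i of mesh A corresponds to vertex i of mesh B, so no explicit
    correspondence search is needed. alpha=0 gives A, alpha=1 gives B."""
    return (1.0 - alpha) * verts_a + alpha * verts_b
```

The face list of the mesh is left untouched; only the vertex positions change as alpha moves from 0 to 1.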

Expression morphing
  • Traditional methods: warp the two textures
    to form an intermediate one.
  • Our approach: the intermediate face model is
    rendered once with the first texture, and again
    with the second. The two resulting images are
    then blended together.
  • (a) Advantage
  • This approach is faster than warping the
    textures, and it avoids the resampling.
  • (b) Blend specification
  • Global blend. The blending weights are
    constant over all vertices.
  • Regional blend. According to studies in
    psychology, the face can be split into several
    regions that behave as coherent units. Vertices
    in the same region have the same weights.
  • Painterly interface
Future work
  • 1. Texture relighting
  • 2. Automatic modeling
  • 3. Modeling from video
  • 4. Audio and performance driven animation

  • The End