BILLY CHEN AND PRADEEP SEN - PowerPoint PPT Presentation

1
Video Carving
  • BILLY CHEN AND PRADEEP SEN
  • MICROSOFT, REDMOND, WA
  • DEPARTMENT OF ELECTRICAL AND COMPUTER
    ENGINEERING, UNIVERSITY OF NEW MEXICO,
    ALBUQUERQUE, NM
  • EUROGRAPHICS 2008

2
Outline
3
1. Introduction
  • Motivation
  • Why do we need video condensation or synopsis?
  • This is a particularly significant problem with
    video, since raw, unedited footage consists of
    long stretches where nothing important happens,
    with only a few short moments of interest in between.
  • Ex: extracting the key information from a video
    is also a particularly important problem in
    security and surveillance applications.

4
1. Introduction
  • Several techniques have been proposed to
    condense long video into a shorter and
    more useful synopsis.
  • Downsampling or fast-forwarding
  • The video is cut down in size by extracting
    only every nth frame.
  • Drawback
  • It fails to capture a rapidly moving object since
    the temporal samples might miss the actual
    object (an example of temporal aliasing).
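
The drawback above can be illustrated with a toy example. This is only a sketch: the "video" is a hypothetical list of booleans marking whether an object is visible in each frame, and the `downsample` helper is an illustrative name, not something from the paper.

```python
def downsample(frames, n):
    """Keep every n-th frame (fast-forwarding by a factor of n)."""
    return frames[::n]

# Hypothetical 30-frame clip: the object is visible only in frames 11..13.
frames = [11 <= i <= 13 for i in range(30)]

kept = downsample(frames, 10)   # keeps frames 0, 10 and 20
print(len(kept), any(kept))     # 3 False: the short event is missed entirely
```

Because the event fits entirely between two retained samples, the shortened clip contains no trace of it.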

5
1. Introduction
  • The authors of this paper present a novel
    scheme to take a long video stream with m
    frames and condense it into a short
    viewable clip with n frames (where n << m)
    that preserves the most important
    information.

6
1. Introduction
  • Idea
  • Most approaches prune down the video size by
    eliminating whole frames from the video stream.
    The authors observe that each deleted frame does
    not have to consist of pixels from a single time
    step.
  • They think of the frames to be deleted as
    sheets within the space-time volume, where each
    pixel on the sheet has one and only one time
    step, but different pixels can have different
    time steps.

7
2. Related Work
  • Several techniques have been proposed to create
    video summaries.
  • Frame-based approaches
  • simply play the video faster
  • Drawback: fast activities may be lost in the
    process.
  • To avoid this problem, techniques have been
    developed that identify activities and adaptively
    adjust the frame rate.
  • Object-based approaches
  • To represent activities as 3D objects in the
    space time domain (e.g. video cube) and seek a
    tighter packing of these objects in the time axis.

8
2. Related Work
  • This idea of incrementally removing regions
    is inspired by Avidan and Shamir's work on
    seam carving (SIGGRAPH 2007).
  • To resize an image, they incrementally remove
    seams, which are 8-connected paths through the
    image.
  • Complementary to summarizing video:
  • Video retargeting is the task of adapting video
    to different output resolutions.
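
To make the notion of a seam concrete, here is a minimal sketch of finding and removing one vertical 8-connected seam with dynamic programming. The energy values are made up for illustration, and the function names `min_vertical_seam` and `remove_seam` are assumptions of this sketch rather than names from the seam carving paper.

```python
def min_vertical_seam(energy):
    """Return, per row, the column of the cheapest 8-connected top-to-bottom seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest cost of reaching each pixel from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y][x] += min(cost[y - 1][lo:hi])
    # Backward pass: walk up from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(0, x - 1), min(w, x + 2)
        x = min(range(lo, hi), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam

def remove_seam(image, seam):
    """Delete one pixel per row, making the image one column narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]

# Hypothetical 3x3 energy map with a low-energy zigzag path.
energy = [[5, 1, 5],
          [5, 5, 1],
          [5, 1, 5]]
seam = min_vertical_seam(energy)
print(seam)                       # [1, 2, 1]
print(remove_seam(energy, seam))  # [[5, 5], [5, 5], [5, 5]]
```

Video carving applies the same idea one dimension up: instead of a 1D seam through a 2D image, it removes a 2D sheet through the 3D video cube.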

9
3. Video Carving
  • A long video can be summarized through video
    carving by incrementally removing 2D sheets from
    the video cube to reduce its total time.
  • The sheet must fully cut across the xy-plane of
    the video cube.
  • To compute this sheet, the authors use a min-cut
    formulation.

10
3. Video Carving
  • A min-cut will traverse through regions of low
    difference (i.e. high similarity). When the
    low-difference sheet has been found and removed,
    the resulting video will have few visual
    artifacts, since the removed pixels will be
    similar to their surroundings both spatially and
    temporally.
  • By creating an appropriate graph of video pixels
    and augmenting it with source and sink nodes,
    they can find the min-cut of this graph and
    therefore compute the corresponding sheet to
    remove from the video cube.

11
3. Video Carving
  • First, they define a node for each pixel of the
    video cube. Nodes have edges to their top,
    bottom, left, and right neighbors. They also have
    edges to nodes in the same pixel location in the
    next and previous frames.

Edge weights are computed using a measure of
spatio-temporal difference.
12
3. Video Carving
  • A source and sink node are connected to all the
    nodes in the first and last frames, respectively.

[Figure: video cube graph, with the first frame and the last frame labeled]
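
The graph described on the last two slides can be sketched as follows. This is an illustration under stated assumptions: the video is a nested list `video[t][y][x]` of grayscale intensities, the edge weight is the absolute intensity difference between the two pixels (the slides only say "a measure of spatio-temporal difference"), and the terminal edges get infinite capacity so the cut cannot pass through them. The name `build_video_graph` is hypothetical.

```python
def build_video_graph(video):
    """Return cap[u][v] for a 6-connected pixel graph plus source and sink."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    cap = {}

    def add_edge(u, v, w):
        # Undirected edge stored as two directed capacities.
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w

    for t in range(T):
        for y in range(H):
            for x in range(W):
                # Right, bottom and next-frame neighbors (the other three
                # directions are covered when those neighbors are visited).
                for dt, dy, dx in ((0, 0, 1), (0, 1, 0), (1, 0, 0)):
                    t2, y2, x2 = t + dt, y + dy, x + dx
                    if t2 < T and y2 < H and x2 < W:
                        # Assumed weight: absolute intensity difference.
                        w = abs(video[t][y][x] - video[t2][y2][x2])
                        add_edge((t, y, x), (t2, y2, x2), w)

    INF = float("inf")
    for y in range(H):
        for x in range(W):
            add_edge("source", (0, y, x), INF)    # first frame
            add_edge((T - 1, y, x), "sink", INF)  # last frame
    return cap

# Hypothetical 3-frame, 2x2 video with one bright pixel in the middle frame.
video = [[[0, 0], [0, 0]],
         [[0, 9], [0, 0]],
         [[0, 0], [0, 0]]]
cap = build_video_graph(video)
print(len(cap))                    # 14 nodes: 12 pixels + source + sink
print(cap[(0, 0, 1)][(1, 0, 1)])   # 9
```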
13
3.1 Min-Cut/Max-Flow Algorithm
  • To compute the min-cut on the graph,
    they use Boykov and Kolmogorov's min-cut/max-flow
    algorithm (IEEE Transactions on PAMI, 2004).
  • First, we have a directed graph G = <V, E>.
  • Terminology
  • Active node: active nodes represent the outer
    border in each tree, while the passive nodes are
    internal. Active nodes allow trees to grow by
    acquiring new children (along non-saturated
    edges) from a set of free nodes.
  • Passive node: passive nodes cannot grow, as they
    are completely blocked by other nodes from the
    same tree.
  • Free node: the nodes that are not in S or T are
    called free.
  • We have
  • S, T: search trees; s, t: source node and
    sink node
  • O: set of orphans; A: set of active nodes
  • S ⊂ V, s ∈ S, T ⊂ V, t ∈ T, S ∩ T = Ø

14
3.1 Min-Cut/Max-Flow Algorithm
  • It is convenient to store the content of search
    trees S and T via flags TREE(p) indicating the
    affiliation of each node p, so that
  • TREE(p) = S if p ∈ S
  • TREE(p) = T if p ∈ T
  • TREE(p) = Ø if p is a free node
  • If node p belongs to one of the search trees,
    then the information about its parent will be
    stored as PARENT(p).
  • Roots of the search trees (the source and the
    sink), orphans, and all free nodes have no
    parents, i.e. PARENT(p) = Ø.
  • We will also use the notation tree_cap(p → q) to
    describe the residual capacity of either edge (p, q)
    if TREE(p) = S or edge (q, p) if TREE(p) = T.

15
3.1 Min-Cut/Max-Flow Algorithm
  • The algorithm iteratively repeats the following
    three stages
  • Growth stage: search trees S and T grow until
    they touch, giving an s -> t path.
  • Augmentation stage: the found path is
    augmented; search tree(s) break into forest(s).
  • Adoption stage: trees S and T are restored.

16
3.1 Min-Cut/Max-Flow Algorithm
  • initialize S = {s}, T = {t}, A = {s, t}, O = Ø
  • while true
  • grow S or T to find an augmenting path
    P from s to t
  • if (P = Ø) terminate
  • augment on P
  • adopt orphans
  • end while
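
The outer loop above (find an augmenting path, augment, repeat until none is left) is shared by all augmenting-path max-flow methods. The sketch below is not Boykov and Kolmogorov's algorithm: it substitutes plain breadth-first search for the path search (the Edmonds-Karp scheme) and rebuilds the search from scratch each iteration instead of reusing search trees and adopting orphans. The graph and the name `max_flow_min_cut` are illustrative assumptions.

```python
from collections import deque

def max_flow_min_cut(cap, s, t):
    """Return (max flow value, set of nodes on the source side of a min cut)."""
    # Residual graph: copy capacities and add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def reachable_from_source():
        # BFS along non-saturated edges; parent doubles as the visited set.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            # No augmenting path: the reachable nodes form the source side.
            return flow, set(parent)
        # Recover the path s -> t from the parent pointers.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            residual[u][v] -= delta
            residual[v][u] += delta
        flow += delta

# Hypothetical 4-node graph.
cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}}
flow, source_side = max_flow_min_cut(cap, "s", "t")
print(flow)                  # 4
print(sorted(source_side))   # ['a', 's']
```

The saturated edges leaving the source side (here a -> t and s -> b, with total capacity 4) form the minimum cut.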

17
3.1.1 Growth stage
  • Growth stage
  • At this stage active nodes acquire new children
    from a set of free nodes. The growth stage
    terminates if an active node encounters a
    neighboring node that belongs to the opposite
    tree. In this case we detect a path from the
    source to the sink.
  • while A ≠ Ø
  • pick an active node p ∈ A
  • for every neighbor q such that tree_cap(p → q) >
    0
  • if TREE(q) = Ø then add q to search tree as an
    active node
  • TREE(q) := TREE(p), PARENT(q) := p, A := A ∪
    {q}
  • if TREE(q) ≠ Ø and TREE(q) ≠ TREE(p) return P =
    PATH(s → t)
  • end for
  • remove p from A
  • end while
  • return P = Ø
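
A minimal Python sketch of the growth stage, under a few assumptions: the residual graph is a dict of dicts that contains both directions of every edge, `tree[p]` is `"S"`, `"T"` or `None` (free), and the function returns the edge where the two trees touch rather than the full path, which can be recovered by following `parent` pointers from each endpoint. All names here are illustrative.

```python
from collections import deque

def tree_cap(residual, tree, p, q):
    """Residual capacity of (p, q) if p is in S, or of (q, p) if p is in T."""
    return residual[p][q] if tree[p] == "S" else residual[q][p]

def growth_stage(residual, tree, parent, active):
    """Grow both trees; return the edge (p, q) where they touch, or None."""
    while active:
        p = active[0]
        for q in residual[p]:
            if tree_cap(residual, tree, p, q) <= 0:
                continue
            if tree[q] is None:
                # Free node: adopt it as an active child of p.
                tree[q] = tree[p]
                parent[q] = p
                active.append(q)
            elif tree[q] != tree[p]:
                return (p, q)   # trees touch: an s -> t path exists
        active.popleft()        # p has no more room to grow: now passive
    return None

# Hypothetical path graph s -> a -> b -> t with unit capacities.
residual = {"s": {"a": 1}, "a": {"s": 0, "b": 1},
            "b": {"a": 0, "t": 1}, "t": {"b": 0}}
tree = {"s": "S", "a": None, "b": None, "t": "T"}
parent = {"s": None, "a": None, "b": None, "t": None}
touch = growth_stage(residual, tree, parent, deque(["s", "t"]))
print(touch)   # ('a', 'b')
```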

18
3.1.2 Augmentation stage
  • Augmentation stage
  • The augmentation phase may split the search
    trees S and T into forests. The source s and the
    sink t are still roots of two of the trees while
    orphans form roots of all other trees.
  • find the bottleneck capacity Δ on P
  • update the residual graph by pushing flow Δ
    through P
  • for each edge (p, q) in P that becomes saturated
  • if TREE(p) = TREE(q) = S
  • then set PARENT(q) := Ø and
  • O := O ∪ {q} (q is orphan)
  • if TREE(p) = TREE(q) = T
  • then set PARENT(p) := Ø and O := O ∪
    {p} (p is orphan)
  • end for
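
The augmentation stage can be sketched in a few lines. As before, this assumes a dict-of-dicts residual graph with both edge directions present and `tree[p]` in `{"S", "T", None}`; the path is given as a list of nodes from s to t, and the function name `augment` is an assumption of this sketch.

```python
def augment(residual, tree, parent, path):
    """Push the bottleneck flow along path; return (delta, list of new orphans)."""
    edges = list(zip(path, path[1:]))
    delta = min(residual[p][q] for p, q in edges)  # bottleneck capacity
    orphans = []
    for p, q in edges:
        residual[p][q] -= delta
        residual[q][p] += delta
        if residual[p][q] == 0:                    # edge became saturated
            if tree[p] == tree[q] == "S":
                parent[q] = None                   # q loses its parent in S
                orphans.append(q)
            elif tree[p] == tree[q] == "T":
                parent[p] = None                   # p loses its parent in T
                orphans.append(p)
    return delta, orphans

# Hypothetical path s -> a -> b -> t; a is in S, b is in T.
residual = {"s": {"a": 3}, "a": {"s": 0, "b": 1},
            "b": {"a": 0, "t": 1}, "t": {"b": 0}}
tree = {"s": "S", "a": "S", "b": "T", "t": "T"}
parent = {"s": None, "a": "s", "b": "t", "t": None}
delta, orphans = augment(residual, tree, parent, ["s", "a", "b", "t"])
print(delta, orphans)   # 1 ['b']
```

The edge a -> b also saturates, but it joins the two trees rather than lying inside one of them, so it creates no orphan.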

19
3.1.3 Adoption stage
  • Adoption stage
  • During this stage all orphan nodes in O are
    processed until O becomes empty. Each node p
    being processed tries to find a new valid parent
    within the same search tree. In case of success, p
    remains in the tree but with a new parent;
    otherwise it becomes a free node and all its
    children are added to O. The goal of the adoption
    stage is to restore the single-tree structure of sets
    S and T with roots in the source and the sink.
  • while O ≠ Ø
  • pick an orphan node p ∈ O and remove it from O
  • process p
  • end while

20
3.1.3 Adoption stage
  • Process p
  • Try to find a new valid parent for p among
    its neighbors.
  • If node p finds a new valid parent q then
  • set PARENT(p) := q.
  • (In this case p remains in its search tree and
    the active (or passive) status of p remains
    unchanged.)
  • If p does not find a valid parent then
  • scan all neighbors q of p such that TREE(q) =
    TREE(p)
  • if tree_cap(q → p) > 0
  • add q to the active set A
  • if PARENT(q) = p
  • add q to the set of orphans O and set
    PARENT(q) := Ø
  • TREE(p) := Ø, A := A − {p} (p becomes a
    free node)
  • (A valid parent q should satisfy TREE(q) =
    TREE(p), tree_cap(q → p) > 0, and the origin of
    q should be either the source or the sink.)

21
3.1 Min-Cut/Max-Flow Algorithm
  • Terminal Condition
  • The algorithm terminates when the search trees S
    and T cannot grow (no active nodes) and the
    trees are separated by saturated edges. This
    implies that a maximum flow is achieved. The
    corresponding minimum cut is determined by taking
    the nodes of tree S as the source side and the
    nodes of tree T as the sink side.

22
3. Video Carving
  • Finally, they find a min-cut on this graph and
    compute a corresponding sheet that has the
    property that it has only one temporal value at
    every projected pixel location.
  • To do this, they first find the set of nodes S
    that have edges that cross the min-cut. They then
    use a front-surface strategy to determine which
    nodes to remove.

23
3. Video Carving
  • For each pixel location, we project it along the
    time-axis of the video cube, from the first frame
    to the last frame. The first node n ∈ S we
    encounter will be the pixel we remove from the
    video cube.
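
The front-surface strategy can be sketched as a scan along the time axis for each pixel location. The input `cut_nodes` below is a hypothetical set of `(t, y, x)` nodes that have an edge crossing the min-cut; how that set is obtained from the cut is not shown here, and the function name is an assumption.

```python
def front_surface(cut_nodes, T, H, W):
    """For each (y, x), return the earliest t with (t, y, x) in cut_nodes."""
    sheet = [[None] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            for t in range(T):          # first frame to last frame
                if (t, y, x) in cut_nodes:
                    sheet[y][x] = t     # first hit wins
                    break
    return sheet

# Hypothetical cut through a 3-frame video with a single row of 2 pixels.
cut_nodes = {(1, 0, 0), (2, 0, 0), (0, 0, 1), (2, 0, 1)}
sheet = front_surface(cut_nodes, T=3, H=1, W=2)
print(sheet)   # [[1, 0]]
```

Taking only the first hit guarantees one time step per pixel location, even when the cut touches several nodes along the same time column.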

24
3. Video Carving
  • Once a sheet is removed from the video cube, the
    remaining pixels are packed to cover the empty
    space. Because every pixel location had one and
    only one frame removed, the total video cube is
    shortened by one frame.
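
A minimal sketch of the packing step, assuming the video is a nested list `video[t][y][x]` and the sheet is given as `sheet[y][x]`, the time step to delete at each pixel location. The function name `remove_sheet` is illustrative.

```python
def remove_sheet(video, sheet):
    """Delete one time step per pixel location and pack the rest in time."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    out = [[[None] * W for _ in range(H)] for _ in range(T - 1)]
    for y in range(H):
        for x in range(W):
            # Surviving samples of this pixel, kept in temporal order.
            column = [video[t][y][x] for t in range(T) if t != sheet[y][x]]
            for t, value in enumerate(column):
                out[t][y][x] = value
    return out

# Hypothetical 3-frame video with a single row of 2 pixels.
video = [[[10, 20]], [[11, 21]], [[12, 22]]]
sheet = [[1, 0]]   # drop t=1 at pixel (0, 0) and t=0 at pixel (0, 1)
carved = remove_sheet(video, sheet)
print(carved)      # [[[10, 21]], [[12, 22]]]
```

Note that the first output frame mixes a pixel from time 0 with a pixel from time 1, which is exactly what allows a sheet to differ from a whole frame.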

25
4. Implementation
  • Restriction
  • The memory requirements of storing the entire
    data structure can be significant.
  • The authors store the video stream as a 3D
    doubly-linked grid of pixels, with each pixel
    storing the color and gradient information as well as
    pointers to its neighbors, resulting in a
    structure 40 bytes in size per pixel.
  • This limits the maximum number of pixels in the
    graph to about 50 million (32-bit Windows gives
    applications only 2 GB of total memory).
  • Ex
  • For a 720x480 video at 30 frames per second,
    this yields only about 150 frames (5
    seconds), which is unacceptable.
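
The figures on this slide can be checked with a short calculation. It assumes the full 2 GB is available for pixel storage, which is an upper bound; the slide's rounded figures of 50 million pixels and 150 frames presumably leave room for everything else the application needs.

```python
BYTES_PER_PIXEL = 40              # structure size per pixel, from the slide
MEMORY_BUDGET = 2 * 1024**3       # 2 GB address space on 32-bit Windows

max_pixels = MEMORY_BUDGET // BYTES_PER_PIXEL
pixels_per_frame = 720 * 480
max_frames = max_pixels // pixels_per_frame
seconds = max_frames / 30         # at 30 frames per second

print(max_pixels)          # 53687091  (about 50 million)
print(max_frames)          # 155       (about 150 frames)
print(round(seconds, 1))   # 5.2       (about 5 seconds)
```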

26
4. Implementation
  • In order to process videos of larger sizes, they
    take the input video and break it up into smaller
    video subsets, each of which can fit entirely within
    memory. They then remove a single frame (one
    sheet) from each subset with the min-cut algorithm.
    Therefore, after the first pass through the entire
    video is finished, they have removed as many frames
    as there were video subsets.
  • They continue making passes through the video,
    removing frames until the video reaches the desired
    size.
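
The multi-pass scheme can be sketched as a frame-count simulation. This is a simplification: it only tracks how many frames remain, it assumes every subset (including a short trailing one) gives up exactly one frame per pass, and it stops mid-pass once the target is reached. The function name and parameters are hypothetical.

```python
def carve_in_passes(num_frames, target_frames, subset_frames):
    """Return (final frame count, number of passes) for the multi-pass scheme."""
    passes = 0
    while num_frames > target_frames:
        # Number of in-memory subsets in this pass (ceiling division).
        subsets = -(-num_frames // subset_frames)
        # One sheet is removed per subset, but never below the target.
        removed = min(subsets, num_frames - target_frames)
        num_frames -= removed
        passes += 1
    return num_frames, passes

# Tiny example: 10 frames in subsets of 5 lose 2 frames in one pass.
print(carve_in_passes(10, 8, 5))   # (8, 1)

# Larger hypothetical case: 900 frames condensed to 300, 150 frames per subset.
final_frames, passes = carve_in_passes(900, 300, 150)
print(final_frames)                # 300
```

Since each pass removes only as many frames as there are subsets, strong condensation of a long video takes many passes.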

27
5. Results
28
5. Results
  • video carving preserves important information
    that is not in the fast-forwarded version.

29
5. Results
  • However, our video carving technique has
    artifacts that show up as motion tails
    following rapidly moving objects. These are
    caused by video sheets that traverse the path of
    the object, placing it with a previous image of
    itself on the same frame.
  • These artifacts are a direct result of
    having to use a small subset of the video during
    processing, since each video subset that was
    processed was only a few seconds long and
    required the removal of a video sheet.

30
6. Conclusions and Future Work
  • First, they might reduce the motion tails in the
    condensed video by processing larger blocks of
    video at one time.
  • In addition, it would be of interest to be able
    to enforce temporal order in the final video.
  • Because we do not use any object information
    during processing, the carving of video sheets
    can cause discontinuities to appear as objects
    move.

31
6. Conclusions and Future Work
  • By carving out low-gradient video sheets from a
    long video, they are able to produce a much
    shorter version that preserves important
    information, even going as far as compositing
    objects that occur at different times together in
    the same frame.