ENEE 631 Project Video Codec and Shot Segmentation - PowerPoint PPT Presentation

1
ENEE 631 Project Video Codec and Shot
Segmentation
  • Aravind Sundaresan
  • Vikas Raykar

2
Main features/ functionality
  • Can encode monochrome video with frame dimensions
    (16M, 16N).
  • Codes the sequence as a series of I/P frames. The
    I/P decision is made according to the suitability
    of each method. (For example, when a scene change
    is detected, the subsequent frame is coded INTRA.)
  • The frames are periodically coded as INTRA
    according to the INTRA refresh rate parameter.
  • Temporal prediction is closed loop. Performs
    full-pel Motion Estimation in a window of
    dimensions 48 x 48. Macroblocks within a P-frame
    are coded as INTRA/INTER according to the
    compression achieved.
  • Has a resynchronization marker at frame level.
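
The full-pel motion estimation mentioned above can be illustrated with a minimal block-matching sketch. The slides do not state the matching criterion or the search strategy, so the sum of absolute differences (SAD), the exhaustive search, and the function names below are assumptions. A 48 x 48 window is read here as a 16 x 16 macroblock displaced by up to 16 pixels in each direction.

```python
import random

def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between the current macroblock and a
    displaced block in the reference frame."""
    total = 0
    for r in range(size):
        cur_row = cur[top + r]
        ref_row = ref[top + dy + r]
        for c in range(size):
            total += abs(cur_row[left + c] - ref_row[left + dx + c])
    return total

def full_search(cur, ref, top, left, size=16, search=16):
    """Exhaustive full-pel search; returns (dy, dx, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_dy, best_dx = 0, 0
    best_cost = sad(cur, ref, top, left, 0, 0, size)  # try the zero vector first
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would leave the frame
            cost = sad(cur, ref, top, left, dy, dx, size)
            if cost < best_cost:
                best_dy, best_dx, best_cost = dy, dx, cost
    return best_dy, best_dx, best_cost

# Synthetic 48 x 48 example: the current frame is the reference displaced
# by (3, -2), so the search should recover that vector for the centre block.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
cur = [[ref[(r + 3) % 48][(c - 2) % 48] for c in range(48)] for r in range(48)]
print(full_search(cur, ref, top=16, left=16))
```

A real encoder would also weigh the resulting SAD against the cost of INTRA coding the macroblock; that decision is left out of this sketch.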

3
Video Codec Structure
  • The Video Codec is split into two programs:
    Encoder and Decoder. Both of them have three
    layers.
  • Top Layer: takes care of interface and I/O.
  • Performs the necessary initializations.
  • Splits the input into frames and feeds them to
    the next layer sequentially.

4
Video Encoder - Block diagram

5
Video Codec Structure
  • Intermediate Layer: this layer performs the
    frame-level manipulations and also takes care of
    the frame-level and macroblock-level decision
    making in the encoder.
  • Performs Motion Estimation and Compensation
    (INTER) or subtracts 128 from the frame (INTRA)
    to get the Residue Frame.
  • Feeds the residue frame to the frame encode
    layer.
  • The reconstructed residue frame is used to
    reconstruct the current frame for future
    prediction.
  • Bottom Layer: this layer performs the actual
    coding.
  • A hybrid coding technique is used, employing both
    predictive coding to remove temporal redundancy
    and transform coding to remove spatial
    redundancy.
  • The frame is split into macroblocks and each
    macroblock is coded separately. The bits
    generated are put in the bitstream.

6
Top Layer (Interface Layer)
  • Performs the necessary initializations (such as
    Loading Huffman tables).
  • Serves as interface between user and the actual
    encoder.
  • Input sequence is read and passed as frames to
    the lower layer.
  • The very first frame is forced to be INTRA.
    Subsequent frames are by default directed to be
    coded as INTER. The lower layer may dynamically
    decide to code such a frame as INTRA according to
    various parameters (scene change / INTRA
    refresh).

7
Intermediate Layer (Control Layer)
  • This is an important layer in that most of the
    decision making is performed here. These
    decisions are aimed at selecting the best coding
    technique according to the input frame. The
    decisions made and the functions performed are
    listed below.
  • Intra / Inter Decision. The top layer has the
    ability to force the Intra Option. The
    Intermediate layer has the option to change the
    INTER option to INTRA. If the number of
    consecutive frames coded as P-frames equals a
    certain parameter (INTRA REFRESH RATE), the next
    frame is coded as an I-frame.
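
The Intra / Inter frame decision described above can be sketched as follows. The function names, the `requested` argument standing for the top layer's request, and the representation of scene changes as a set of frame indices are all hypothetical; the slides do not say how scene changes are detected.

```python
def choose_frame_type(requested, p_run, intra_refresh_rate, scene_change):
    """Return "I" or "P" for the next frame.

    requested:          "I" or "P", as asked for by the top layer
    p_run:              number of consecutive P-frames coded so far
    intra_refresh_rate: P-frame run length that triggers an I-frame
    scene_change:       True if a scene change was detected (assumed input)
    """
    if requested == "I":
        return "I"   # the top layer forces INTRA (e.g. the very first frame)
    if scene_change:
        return "I"   # the control layer overrides INTER with INTRA
    if p_run >= intra_refresh_rate:
        return "I"   # periodic INTRA refresh
    return "P"

def plan_sequence(num_frames, intra_refresh_rate, scene_changes=()):
    """Frame types for a whole sequence, as a string such as "IPPP"."""
    types, p_run = [], 0
    for i in range(num_frames):
        requested = "I" if i == 0 else "P"
        frame_type = choose_frame_type(
            requested, p_run, intra_refresh_rate, i in scene_changes)
        p_run = p_run + 1 if frame_type == "P" else 0
        types.append(frame_type)
    return "".join(types)

print(plan_sequence(30, 10))
print(plan_sequence(6, 10, scene_changes={3}))
```

With a refresh rate of 10 and no scene changes, each I-frame is followed by ten P-frames.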

8
Intermediate Layer (Control Layer)
  • Motion Estimation and Compensation. In the case
    of INTRA macroblocks, 128 is subtracted from the
    macroblock. Based on the output of the Motion
    Estimation, a decision is made whether to code
    the frame as INTER.
  • The frame to be coded is split into an MC frame
    and a Residue frame (in the case of an INTRA
    frame, the MC frame consists of pixels with value
    128).
  • The Residue frame is passed to the Encode frame
    layer. The layer returns the reconstructed
    residue frame, which is added to the MC frame to
    be used for future prediction. (Closed-loop
    prediction to avoid error accumulation.)
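
The closed-loop idea above can be shown with a deliberately simplified sketch. It assumes one-dimensional "frames", zero-motion prediction, and a plain scalar quantizer standing in for the DCT and quantization tables; none of these simplifications come from the slides, and no clipping to the 0..255 range is applied. The point is only that the encoder predicts from its own reconstruction, so the decoder can rebuild exactly the same reference and the error does not grow from frame to frame.

```python
def quantize(residue, step):
    return [round(r / step) for r in residue]

def dequantize(levels, step):
    return [level * step for level in levels]

def encode(frames, step):
    """Closed-loop encoder: predicts from its own reconstruction."""
    recon_prev = None
    stream = []
    for frame in frames:
        # INTRA (first frame): the MC frame is all 128; otherwise reuse
        # the previous reconstructed frame (zero-motion prediction).
        mc = [128] * len(frame) if recon_prev is None else recon_prev
        residue = [f - p for f, p in zip(frame, mc)]
        levels = quantize(residue, step)
        recon_residue = dequantize(levels, step)
        recon_prev = [p + r for p, r in zip(mc, recon_residue)]
        stream.append(levels)
    return stream

def decode(stream, step):
    """Mirrors the reconstruction path of the encoder."""
    recon_prev = None
    decoded = []
    for levels in stream:
        mc = [128] * len(levels) if recon_prev is None else recon_prev
        recon_prev = [p + r for p, r in zip(mc, dequantize(levels, step))]
        decoded.append(recon_prev)
    return decoded

frames = [[100 + 3 * t + i for i in range(8)] for t in range(20)]
step = 8
decoded = decode(encode(frames, step), step)
worst = max(abs(d - f)
            for dec, frm in zip(decoded, frames)
            for d, f in zip(dec, frm))
print(worst)  # stays within half a quantizer step for every frame
```

Had the encoder predicted from the original previous frame instead, the decoder's reference would differ from the encoder's and the mismatch could build up over successive P-frames.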

9
Bottom Layer (Encoder Layer)
  • The frame is split into Macroblocks, each of which
    comprises 4 blocks. The macroblocks are read
    in raster scan order and coded sequentially, and
    the blocks in the Macroblock are also similarly
    coded. Each macroblock consists of a header
    followed by the coefficient data. The header
    contains:
  • Coded Information (Coded/Not coded, INTRA/
    INTER, etc.)
  • Motion Vector (for INTER MBs)
  • Coded Block Pattern
  • Optionally, the Quantizer (or differential
    Quantizer).

10
Bottom Layer (Encoder Layer)
  • The coding procedure is described below.
  • Each block is transform coded using the DCT. The
    DCT coefficients are quantized. The Quantization
    tables for INTRA and INTER blocks are different.
  • The resulting matrix is split into DC and AC
    coefficients, which are scanned in a zigzag manner
    and coded using fixed Huffman tables and
    run-length coding techniques.
  • For run-length values not found in the table, the
    run-length and level values are coded using an
    ESCAPE code and fixed-length codes. If none of
    the blocks contain any coefficients, the MB is
    'not coded'.
  • If only some of the blocks are coded, the
    corresponding bit is set in the CBP.
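
The block coding steps above (DCT, quantization, zigzag scan, run-length pairs, CBP) can be sketched as below. This is an illustration under assumptions: a single uniform quantizer step replaces the separate INTRA/INTER tables, the Huffman and ESCAPE coding stage is omitted, and the CBP bit order and all function names are hypothetical.

```python
import math

N = 8

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized)."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, step):
    return [[round(c / step) for c in row] for row in coeffs]

def zigzag_order(n=N):
    """(row, col) pairs in zigzag scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def run_level_pairs(values):
    """(run of zeros, nonzero level) pairs; trailing zeros are dropped."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs

def code_block(block, step):
    """Return (dc_level, ac_run_level_pairs) for one residue block."""
    levels = quantize(dct2(block), step)
    scanned = [levels[r][c] for r, c in zigzag_order()]
    return scanned[0], run_level_pairs(scanned[1:])

def coded_block_pattern(coded_blocks):
    """One bit per block, first block in the most significant bit."""
    cbp = 0
    for dc, ac in coded_blocks:
        cbp = (cbp << 1) | (1 if dc != 0 or ac else 0)
    return cbp

flat = [[10] * N for _ in range(N)]   # constant residue: DC only
zero = [[0] * N for _ in range(N)]    # nothing to code
blocks = [code_block(b, 8) for b in (flat, zero, zero, flat)]
print(blocks, bin(coded_block_pattern(blocks)))
```

A macroblock whose pattern comes out as zero would be the 'not coded' case; otherwise each (run, level) pair would be looked up in the Huffman table, falling back to the ESCAPE code when the pair is absent.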

11
Video Decoder - Block diagram

12
Decoded Frames

13
Results
  • 30 frames encoded
  • Size of compressed stream: 434096 bytes
  • Size of 30 frames: 30 x 240 x 352 = 2534400 bytes
  • Compression: 17% (compressed size relative to
    the original)
  • Greater compression can be achieved by:
  • Increasing the quantizer step
  • Increasing the ratio of P frames to I frames
    (current ratio 10:1)
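
The figures above can be checked with a few lines of arithmetic. The sketch reads the raw size as 30 frames of 240 by 352 pixels at one byte per pixel, an assumption that matches the stated total of 2534400 bytes for monochrome video.

```python
num_frames, height, width = 30, 240, 352   # one byte per pixel assumed
compressed_bytes = 434096

raw_bytes = num_frames * height * width
percent_of_original = 100 * compressed_bytes / raw_bytes
compression_ratio = raw_bytes / compressed_bytes

print(raw_bytes)
print(f"{percent_of_original:.1f}% of original, {compression_ratio:.2f}:1")
```

Under that reading, the compressed stream is roughly 17% of the raw size, which is the same as saying the data shrinks by a factor of a little under six.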