Perceptual Video Coding: How H.264 can do better!
Koohyar Minoo (kminoo_at_ucsd.edu) and Truong Nguyen
(nguyen_at_ucsd.edu)
Video Processing Group, ECE Department,
University of California at San Diego
http://videoprocessing.ucsd.edu
  • Introduction
  • Today's state-of-the-art video and image coders
    optimize compression quality by minimizing either
  • some distortion measure, or
  • some joint rate-distortion measure.
  • Traditionally, mathematical distortion models
    such as MSE (Mean Squared Error) or MAD (Mean
    Absolute Difference) have been used for video and
    image coding optimization [1].
  • These models do not accurately represent the
    distortion perceived by the HVS (Human Visual
    System) [2].
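
As a minimal sketch of the two mathematical measures named above, the following plain-Python functions compute MSE and MAD over a block given as a flat list of samples. The sample values are made up for illustration. The two distorted blocks have the same MSE even though one spreads a small error over every sample and the other concentrates it in a single sample, which is the kind of difference a purely mathematical measure does not distinguish.

```python
def mse(original, decoded):
    """Mean squared error between two equal-length sample lists."""
    if len(original) != len(decoded):
        raise ValueError("blocks must have the same number of samples")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def mad(original, decoded):
    """Mean absolute difference between two equal-length sample lists."""
    if len(original) != len(decoded):
        raise ValueError("blocks must have the same number of samples")
    return sum(abs(o - d) for o, d in zip(original, decoded)) / len(original)


# Illustrative 4x4 block (16 samples) of constant intensity.
flat = [100] * 16
# Error of 1 on every sample.
spread = [101] * 16
# Error of 4 on a single sample, all others untouched.
concentrated = [104] + [100] * 15

print(mse(flat, spread), mse(flat, concentrated))
print(mad(flat, spread), mad(flat, concentrated))
```

Both distorted blocks score the same under MSE, while MAD differs; neither number says anything about where in the image the error would actually be seen.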
  • Perceptually Enhanced H.264
  • We have introduced perceptual models for the
    following tasks in H.264:
  • Mode Selection: Instead of MSE (in RDO mode
    selection) or MAD (in simple mode selection) as
    the measure of distortion, we propose a perceptual
    distortion model that allows us to
  • select a mode that consumes fewer coding bits
    when the distortion associated with that mode is
    not noticeable (saving bit budget), or
  • select a mode that preserves edges and the
    boundary integrity of scene objects despite an
    increase in consumed bits (maintaining quality).
  • Bit Allocation: This part applies to the
    following two scenarios:
  • Rate control, or coding under Hypothetical
    Reference Decoder (HRD) buffer constraints. In
    this scenario the bit budget is assigned based on
    the predicted perceptual severity of the coding
    loss, not just the predicted MAD (saving bits for
    where they are needed most).
  • Variable Bit Rate (VBR) coding for storage. In
    this scenario the quantization parameter is
    assigned based on the amount of noise that can
    be tolerated in each macroblock (MB), maintaining
    the same perceptual quality across frames.
  • Perceptually tuned quantization parameter
    assignment: In this experiment, the quantization
    parameter is assigned based on the perceptual
    characteristics of each MB.
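
The mode-selection idea above can be sketched as a Lagrangian cost J = D + lambda * R with a pluggable distortion term. This is only an illustration under stated assumptions: the candidate modes, their bit counts and MSE values, and the `visibility` factor (a hypothetical 0-to-1 weight standing in for the perceptual model) are invented here and are not the authors' model.

```python
def mse_distortion(candidate):
    """Conventional distortion term: the raw MSE of the candidate mode."""
    return candidate["mse"]


def perceptual_distortion(candidate):
    """Hypothetical perceptual term: MSE scaled by how visible the error is."""
    return candidate["mse"] * candidate["visibility"]


def select_mode(candidates, lam, distortion):
    """Return the name of the mode minimising J = D + lam * R."""
    best = min(candidates, key=lambda c: distortion(c) + lam * c["bits"])
    return best["name"]


# Invented numbers for one heavily textured macroblock, where most of
# the coding error is assumed to be masked (visibility = 0.2).
CANDIDATES = [
    {"name": "Inter16x16", "bits": 40, "mse": 30.0, "visibility": 0.2},
    {"name": "Inter8x8", "bits": 90, "mse": 10.0, "visibility": 0.2},
]

print(select_mode(CANDIDATES, 0.3, mse_distortion))
print(select_mode(CANDIDATES, 0.3, perceptual_distortion))
```

With these numbers the MSE-based cost prefers the more expensive mode, while the perceptually weighted cost picks the cheaper one because the extra error is assumed not to be noticeable.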

  • [Figure panels (a) and (b): images not included
    in this transcript]
  • Amongst the non-normative tools in the reference
    H.264 codec by HHI, the following items
    significantly influence the performance of the
    codec:
  • Rate-distortion optimized mode selection: R-D
    optimized mode selection uses MSE to compare
    distortion across different modes.
  • Motion estimation for inter modes: Motion
    estimation uses MAD or SA(T)D (Sum of Absolute
    (Transformed) Differences) to decide which
    translational motion vector best predicts the
    current block.
  • Rate control for VBR or CBR operation:
    Currently the reference H.264 implementation of
    the rate control algorithm uses a linear
    prediction of the MAD of the current block, based
    on the MAD of the co-located block in the previous
    frame, to estimate the quantization parameter for
    a given rate.
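
The linear MAD prediction mentioned in the last item can be sketched as follows. This is a simplified illustration, not the reference encoder's code: the coefficient names `a1` and `a2`, the least-squares update, and the sample values are assumptions made for the example.

```python
def fit_linear(prev_mads, actual_mads):
    """Least-squares fit of actual_mad ~ a1 * prev_mad + a2."""
    n = len(prev_mads)
    mean_x = sum(prev_mads) / n
    mean_y = sum(actual_mads) / n
    sxx = sum((x - mean_x) ** 2 for x in prev_mads)
    if sxx == 0:
        # No spread in the data: fall back to "same as previous frame".
        return 1.0, 0.0
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(prev_mads, actual_mads))
    a1 = sxy / sxx
    a2 = mean_y - a1 * mean_x
    return a1, a2


def predict_mad(prev_mad, a1, a2):
    """Predict the current block's MAD from the co-located block's MAD."""
    return a1 * prev_mad + a2


# Made-up history: MAD of co-located blocks vs. MAD actually observed.
history_prev = [1.0, 2.0, 3.0]
history_actual = [3.0, 5.0, 7.0]

a1, a2 = fit_linear(history_prev, history_actual)
print(predict_mad(4.0, a1, a2))
```

The predicted MAD would then feed a rate model to choose the quantization parameter; a perceptual variant would replace or weight this prediction by the expected visibility of the coding loss.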
  • Objectives
  • Devise perceptually suitable distortion measures
    to enhance the coding efficiency of the H.264
    video coder.
  • These measures will be used by the following
    units to make the coding perceptually more
    efficient:
  • R-D optimized mode selection
  • Motion estimation
  • Quantization parameter assignment to each MB
  • Rate control
  • The devised models need to be as general as
    possible and not application dependent (e.g. not
    tied to a particular viewing condition or display
    type). In future, application-based fine tuning
    can be applied to these general models.
  • Keep the computational complexity of the
    perceptual distortion measurements low, so that
    they do not affect the overall encoding time of
    the H.264 encoder.
  • Properties of HVS
  • Properties of the Human Visual System can be used
    to correct shortcomings of mathematical models
    such as MSE. Under certain conditions the HVS can
    tolerate more distortion than MSE predicts. On
    the other hand, there are some types of distortion
    that MSE weighs less heavily than they are
    perceived. Here are some of the HVS properties
    that form the basis of our model:
  • Texture Masking: The HVS is less sensitive to
    detail in areas with a high amount of texture
    activity. This means that more noise can be
    tolerated in MBs with a high amount of texture
    activity.
  • Example: In the following two images the noise
    energy added to both images is the same. In (a)
    the noise is added randomly, while in (b) the
    noise power is weighted based on the texture
    activity of each MB.
  • [Figure panels (a) and (b): Flower sequence, SIF
    size; white noise with MSE = 208 added to frame 1]
  • Intensity Contrast Masking: In low- and
    medium-contrast areas, more noise can be hidden in
    darker regions.
  • Example: In the following two images the noise
    energy added to both images is the same. In (a)
    the noise is added randomly, while in (b) the
    noise power is weighted based on the intensity
    contrast of each MB.
  • [Figure panels (a) and (b): images not included
    in this transcript]
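
The two examples above keep the total noise energy fixed and only change how it is distributed over macroblocks. A minimal sketch of that redistribution follows. The per-MB weights are invented values, and the proportional rule is an assumption for illustration; the weights could stand for texture activity or for any other masking measure.

```python
def weighted_noise_power(weights, mean_power):
    """Split a fixed noise budget over blocks in proportion to weights.

    The average of the returned per-block powers equals mean_power, so
    the total noise energy is the same as for uniform noise.
    """
    total = sum(weights)
    n = len(weights)
    if total == 0:
        # No masking information: fall back to uniform noise.
        return [mean_power] * n
    return [mean_power * n * w / total for w in weights]


# Hypothetical masking weights for four macroblocks (higher means the
# block can hide more noise) and the MSE = 208 budget from the example.
masking_weights = [10.0, 30.0, 60.0, 100.0]
powers = weighted_noise_power(masking_weights, 208.0)

print(powers)
print(sum(powers) / len(powers))
```

Blocks with larger weights receive more noise power while the mean over all blocks stays at the original budget.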
  • DCT Domain Implementation
  • To construct the proposed distortion models, two
    constraints were imposed:
  • Accuracy of the perceptual model: The proposed
    model needs to measure distortion objectively, as
    close as possible to the subjective judgment of an
    average human observer.
  • Low computational complexity of the model, so
    that it can be used for real-time video
    applications.
  • These two constraints resulted in a perceptual
    model in the transform domain, which uses
    transform coefficients to measure perceptual
    distortion by the following rules:
  • Texture activity is proportional to the
    variance of the AC coefficients of the DCT.
  • Average intensity of a block can be
    represented by the DC coefficient of the DCT.
  • Spatial frequency sensitivity can be accounted
    for by weighting the coefficients of the error
    signal in the DCT domain according to their
    position.
  • Edge detection: There are simple edge detection
    routines in the DCT domain for horizontal and
    vertical edges.
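
The first two rules above can be sketched with a plain floating-point 4x4 DCT. This is an illustration under assumptions: H.264 itself uses an integer approximation of the DCT, and the function names and test blocks here are invented for the example.

```python
import math

N = 4  # block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * s
    return coeffs


def average_intensity(coeffs):
    """With this normalisation the DC term is N times the block mean."""
    return coeffs[0][0] / N


def texture_activity(coeffs):
    """Variance of the AC coefficients, used as a texture measure."""
    ac = [coeffs[u][v] for u in range(N) for v in range(N) if (u, v) != (0, 0)]
    mean = sum(ac) / len(ac)
    return sum((c - mean) ** 2 for c in ac) / len(ac)


flat_block = [[100] * N for _ in range(N)]
checker_block = [[50 if (x + y) % 2 == 0 else 150 for y in range(N)]
                 for x in range(N)]

for name, blk in (("flat", flat_block), ("checker", checker_block)):
    c = dct_2d(blk)
    print(name, average_intensity(c), texture_activity(c))
```

Both blocks have the same average intensity, but only the checkerboard block shows significant AC variance, which is what a texture-masking rule would key on.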
  • Background on H.264
  • H.264 employs many normative and non-normative
    tools and features to achieve its superior video
    compression performance [3]. In this section we
    consider those tools and features that make
    H.264 a good candidate for benefiting from the
    perceptual properties of the HVS.
  • Amongst the normative tools, these new features
    make a considerable contribution to coding
    performance:
  • Variable block sizes for block prediction:
    Blocks of sizes 16x16 down to 4x4 capture the
    properties of regions of a video frame more
    efficiently (e.g. smoother areas will be encoded
    with bigger blocks).
  • Smaller (4x4) DCT transform: This feature
    makes the transform coefficients more localized
    in space, so it is easier to judge the visual
    properties of a region of a frame from its DCT
    coefficients.
  • Quarter-pixel motion estimation: This results
    in a more accurate prediction of translational
    motion. This feature also helps cancel noise
    added to the reference frame from different
    sources (e.g. quantization noise, or inaccurate
    motion estimation of reference frames).
  • In-loop deblocking filter: This helps to rid
    block-based video coders of blocking artifacts,
    which are the main perceptual artifacts,
    especially at low bit rates.
  • Flexible Macroblock Ordering: This feature
    facilitates grouping macroblocks into slices,
    which can be used either for error resiliency or
    for more efficient video coding. This grouping can
    be based on the perceptual importance of MBs (for
    error resiliency) or on similar coding properties
    (e.g. quantization parameter) of different MBs of
    the coded frame (for coding efficiency).
  • Future Work
  • Develop application-oriented distortion models,
    e.g. the distortion model can be fine-tuned or
    enhanced based on the video frame size, the
    display type, and the viewing distance (to account
    for contrast sensitivity).
  • Perform optimal rate-distortion mode selection
    for lossy channels based on the perceptual
    attributes of the coded macroblock. Currently
    this mode selection is based on the MSE of the
    error at a number of different virtual decoders
    on the encoder side.
  • Results: Quality Enhancement
  • We used our algorithm to encode many of the
    sequences found in the test databases of different
    video coding standards.
  • We found that the quality gain from the
    perceptual coding algorithm depends on:
  • Video content: How dominant the aforementioned
    perceptual features are in the coded sequence is
    a deciding factor in the quality enhancement.
  • Target bit rate: At low and mid-range bit rates
    the quality gain is more noticeable than at high
    bit rates.
  • Here we show the results for two scenarios:

References
[1] G.J. Sullivan and T. Wiegand, "Rate-distortion optimization for video compression," IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, Nov. 1998.
[2] B. Girod, "What's wrong with mean-squared error?" in A.B. Watson (ed.), Digital Images and Human Vision, MIT Press, Cambridge, MA, 1993, pp. 207-220.
[3] T. Wiegand, G.J. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, July 2003.

Acknowledgements
This work is supported by a grant from CalIT2 and the Navy.