1
The H.264/AVC Video Coding Standard
  • Peter Parnes, PhD
  • September 15 2004
  • Based on an earlier talk by Gidon Shavit
  • Based on material from
  • IEEE Transactions on Circuits and Systems for
    Video Technology, July 2003

2
Outline of This Talk
  • Background
  • The H.263 standard
  • New features in H.264

3
H.264/AVC History
  • In the early 1990s, the first video compression
    standards were introduced
  • H.261 (1990) and H.263 (1995) from ITU
  • MPEG-1 (1993) and MPEG-2 (1996) from ISO
  • Since then, the technology has advanced rapidly
  • H.263 was followed by H.263+, H.263++, and H.26L
  • MPEG-1/2 followed by MPEG-4 visual
  • But industry and research coders are still way
    ahead
  • H.264/AVC is a joint project of ITU and ISO, to
    create an up-to-date standard.

4
Scope and Context
  • Aimed at providing high-quality compression for
    various services
  • IP streaming media (50-1500 kbps)
  • SDTV and HDTV broadcast and video-on-demand
    (1-8 Mbps)
  • DVD
  • Conversational services (<1 Mbps, low latency)
  • Standard defines
  • Decoder functionality (but not encoder)
  • File and stream structure
  • Final results: 2-fold improvement in compression

5
  • 2-fold improvement in compression
  • Same fidelity, half the size
  • Compared to H.263 and MPEG-2

6
Outline of This Talk
  • Background
  • The H.263 standard
  • New features in H.264
  • Relevance to our work

7
Video Compression 101
  • Most video coders consist of three stages
  • Motion compensation / prediction
  • Describe the current frame based on the previous frame
  • Output: the description plus a residual image
  • Predicted frames are called inter-frames.
  • Some frames (intra-frames) are encoded without
    prediction, as natural images.
  • Image transform
  • Concentrate image energy in relatively few
    numeric coefficients
  • Lossy coding
  • Compress coefficient values in a lossy manner
  • Try to keep most important information

8
The H.263 Standard Coder
[Diagram: original video -> Motion Compensation -> Image Transform -> Lossy Coding -> compressed video]
9
The H.263 Standard Coder
  • H.263 Motion Compensation
  • Image is divided into 16x16 macroblocks,
  • Each macroblock is matched against nearby blocks
    in previous frame (called reference frame),
  • Nearby means within a 15-pixel horizontal/vertical
    range
  • Half-pixel accuracy (with bilinear pixel
    interpolation)
  • Best match is used to predict the macroblock,
  • The relative displacement, or motion vector, is
    encoded and transmitted to decoder
  • The prediction error for all blocks constitutes the
    residual.
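
The matching step described above can be sketched as a full search that minimizes the sum of absolute differences (SAD). This is only an illustration at integer-pixel accuracy: the function names, the SAD cost, and the exhaustive search are assumptions of this sketch, since the standard specifies the decoder and leaves the search strategy to the encoder. Frames are plain lists of rows of luma values.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of cur at (cx, cy)
    and the n x n block of ref at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def best_motion_vector(cur, ref, cx, cy, n=16, search=15):
    """Try every integer displacement within +/-search pixels and keep the
    one with the lowest SAD. Returns ((dx, dy), cost)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > w or ry + n > h:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    return (dx, dy), cost


def residual_block(cur, ref, cx, cy, mv, n=16):
    """Prediction error: current block minus the displaced reference block."""
    dx, dy = mv
    return [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(n)] for j in range(n)]
```

The encoder would transmit the motion vector and the residual block; the decoder repeats the displacement on its own copy of the reference frame and adds the residual back.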

10
Motion Compensation Example
[Figure: frame T1 (reference) and frame T2 (current)]
11
The H.263 Standard Coder
  • H.263 Image Transform
  • Residual is divided into 8x8 blocks,
  • 8x8 2-d Discrete Cosine Transform (DCT) is
    applied to each block independently
  • DCT coefficients describe spatial frequencies in
    the block
  • High frequencies correspond to small features
    and texture
  • Low frequencies correspond to larger features
  • Lowest frequency coefficient, called DC,
    corresponds to the average intensity of the block
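
To make the role of the DC coefficient concrete, here is a direct (slow) 2-D DCT-II in pure Python. The orthonormal scaling is an assumption of this sketch; with it, the DC coefficient of an 8x8 block equals 8 times the block's mean intensity. The function name dct_2d is hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for v in range(n):          # vertical frequency
        for u in range(n):      # horizontal frequency
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[v][u] = scale(u) * scale(v) * s
    return out


# A perfectly flat block has all of its energy in the DC coefficient.
flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

Real coders use fast factorizations rather than this quadruple loop; the direct form is used here only because it mirrors the definition.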

12
8x8 DCT Example
15
The H.263 Standard Coder
  • H.263 Lossy Coding
  • Transform coefficients are quantized
  • Some less-significant bits are dropped
  • Only the remaining bits are encoded
  • For inter-frames, all coefficients get the same
    number of bits, except for the DC which gets
    more.
  • For intra-frames, lower-frequency coefficients
    get more bits
  • To preserve larger features better
  • The actual number of bits used depends on a
    quantization parameter (QP), whose value depends
    on the bit-allocation policy
  • Finally, the bits are encoded using an entropy
    (lossless) code
  • Traditionally a Huffman-style code
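
A minimal sketch of how quantization discards precision, assuming a plain uniform quantizer with step size 2*QP. The real H.263 rounding and reconstruction rules are more detailed than this, and the function names are hypothetical.

```python
def quantize(coeffs, qp):
    """Map each coefficient to an integer level; int() truncates toward zero,
    so small coefficients collapse to 0."""
    step = 2 * qp
    return [[int(c / step) for c in row] for row in coeffs]


def dequantize(levels, qp):
    """Reconstruct approximate coefficients from the levels."""
    step = 2 * qp
    return [[level * step for level in row] for row in levels]


levels = quantize([[100, -7, 3]], qp=4)   # step size 8
approx = dequantize(levels, qp=4)
```

A larger QP means a larger step, fewer distinct levels, and therefore fewer bits for the entropy coder to encode, at the cost of a larger reconstruction error.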

16
Outline of This Talk
  • Background
  • The H.263 standard
  • New features in H.264
  • Motion compensation and intra-prediction
  • Image transform
  • Deblocking filters
  • Entropy coding
  • Frames and slices

17
Changes in Motion Compensation
  • Quarter-pixel accuracy
  • A gain of 1.5-2 dB across the board over ½-pixel accuracy
  • Variable block-size
  • Every 16x16 macroblock can be subdivided
  • Each sub-block gets predicted separately
  • Multiple and arbitrary reference frames
  • Vs. only previous (H.263) or previous and next
    (MPEG).
  • Anti-aliasing sub-pixel interpolation
  • Removes some common artifacts in residual
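
The difference between the two interpolation styles can be sketched in one dimension. H.263 averages the two neighbouring pixels (bilinear), while H.264 derives luma half-sample positions with a 6-tap filter with taps (1, -5, 20, 20, -5, 1). The helper names and the 1-D simplification are assumptions of this sketch; quarter-sample positions, which H.264 obtains by averaging neighbouring samples, are omitted.

```python
def clip8(v):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, v))


def half_pel_bilinear(a, b):
    """H.263-style half-pixel value: rounded average of two neighbours."""
    return (a + b + 1) >> 1


def half_pel_six_tap(s, i):
    """Half-sample value between s[i] and s[i + 1] using the 6-tap filter.
    Requires 2 <= i <= len(s) - 4 so that all six taps are in range."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * s[i - 2 + k] for k, t in enumerate(taps))
    return clip8((acc + 16) >> 5)   # taps sum to 32, so divide by 32 with rounding
```

The longer filter has a sharper frequency response than the two-tap average, which is what reduces aliasing in the prediction.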

18
Variable Block-Size MC
  • Motivation: the size of moving/stationary objects is
    variable
  • Many small blocks may take too many bits to
    encode
  • Few large blocks give lousy prediction
  • In H.264, each 16x16 macroblock may be
  • Kept whole,
  • Divided horizontally (vertically) into two
    sub-blocks of size 16x8 (8x16)
  • Divided into 4 sub-blocks
  • In the last case, the 4 sub-blocks may be divided
    once more into 2 or 4 smaller blocks.
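
The partitioning rules above can be enumerated directly. In this sketch, block sizes are written width x height, each block is an (x, y, width, height) tuple inside the macroblock, and the names MB_MODES, SUB_MODES, tile, and partition are hypothetical.

```python
MB_MODES = {"16x16": (16, 16), "16x8": (16, 8), "8x16": (8, 16), "8x8": (8, 8)}
SUB_MODES = {"8x8": (8, 8), "8x4": (8, 4), "4x8": (4, 8), "4x4": (4, 4)}


def tile(x0, y0, size, bw, bh):
    """Cover a size x size square at (x0, y0) with bw x bh blocks, raster order."""
    return [(x, y, bw, bh)
            for y in range(y0, y0 + size, bh)
            for x in range(x0, x0 + size, bw)]


def partition(mb_mode, sub_modes=None):
    """List the motion-compensation blocks of one macroblock.
    sub_modes gives one mode per 8x8 quadrant and is used only for '8x8'."""
    bw, bh = MB_MODES[mb_mode]
    if mb_mode != "8x8":
        return tile(0, 0, 16, bw, bh)
    sub_modes = sub_modes or ["8x8"] * 4
    blocks = []
    for (x, y, _, _), sub in zip(tile(0, 0, 16, 8, 8), sub_modes):
        blocks += tile(x, y, 8, *SUB_MODES[sub])
    return blocks
```

Each block returned carries its own motion vector, so partition("8x8", ["4x4"] * 4) costs 16 vectors while partition("16x16") costs one; that is the bit-cost versus prediction-quality trade-off in the motivation.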

19
H.264 Variable Block Sizes
20
Motion Scale Example
T1
T2
23
H.264 VBS Example
T1
T2
24
Arbitrary Reference Frames
  • In H.263, the reference frame for prediction is
    always the previous frame
  • In MPEG and H.26L, some frames are predicted from
    both the previous and the next frames
    (bi-prediction)
  • In H.264, any one frame may be used as reference
  • Encoder and decoder maintain synchronized buffers
    of available frames (previously decoded)
  • Reference frame is specified as index into this
    buffer
  • In bi-predictive mode, each macroblock may be
  • Predicted from one of the two references
  • Predicted from both, using weighted mean of
    predictors
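
A minimal sketch of reference selection by buffer index and of bi-prediction as a weighted mean. The buffer is modeled as a plain list of decoded frames, and fetch_block, bi_predict, and the default equal weights are assumptions of this sketch, not the standard's exact arithmetic.

```python
def fetch_block(ref_frames, ref_idx, x, y, mv, n):
    """Read the n x n predictor from the frame at ref_idx, displaced by mv."""
    frame = ref_frames[ref_idx]
    dx, dy = mv
    return [[frame[y + dy + j][x + dx + i] for i in range(n)]
            for j in range(n)]


def bi_predict(p0, p1, w0=0.5, w1=0.5):
    """Weighted mean of two predictors, rounded to the nearest integer
    (valid for non-negative sample values)."""
    return [[int(w0 * a + w1 * b + 0.5) for a, b in zip(r0, r1)]
            for r0, r1 in zip(p0, p1)]
```

Because encoder and decoder keep identical buffers, only the index and the motion vector need to be transmitted to identify a predictor.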

25
Intra Prediction
  • Motivation: intra-frames are natural images, so
    they exhibit strong spatial correlation
  • Implemented to some extent in H.263 and MPEG-4,
    but in transform domain
  • Macroblocks in intra-coded frames are predicted
    based on previously-coded ones
  • Above and/or to the left of the current block
  • The macroblock may be divided into 16 4x4
    sub-blocks which are predicted in cascading
    fashion
  • An encoded parameter specifies which neighbors
    should be used to predict, and how
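
Three of the 4x4 prediction modes (vertical, horizontal, and DC) can be sketched as follows. Here above holds the four reconstructed pixels directly above the block and left the four pixels directly to its left. The SAD-based mode choice and the function names are assumptions of this sketch, and the diagonal modes are omitted.

```python
def intra_4x4(mode, above, left):
    """Build a 4x4 prediction from neighbouring, already-decoded pixels."""
    if mode == "vertical":      # every column repeats the pixel above it
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # every row repeats the pixel to its left
        return [[left[j]] * 4 for j in range(4)]
    if mode == "dc":            # flat block at the rounded mean of the 8 neighbours
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)


def choose_mode(block, above, left):
    """Pick the mode whose prediction has the smallest sum of absolute differences."""
    def cost(mode):
        pred = intra_4x4(mode, above, left)
        return sum(abs(block[j][i] - pred[j][i])
                   for j in range(4) for i in range(4))
    return min(("vertical", "horizontal", "dc"), key=cost)
```

Only the chosen mode and the residual (block minus prediction) need to be coded; a block with vertical stripes, for example, is predicted exactly by the vertical mode.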

26
Intra-Prediction Example
27
Intra-Prediction Example: Vertical
28
Intra-Prediction Example: Horizontal
29
Intra-Prediction Example: Main Diagonal
30
Outline of This Talk
  • Background
  • The H.263 standard
  • New features in H.264
  • Motion compensation and intra-prediction
  • Image transform
  • Deblocking filters
  • Entropy coding
  • Frames and slices
  • Relevance to our work

31
H.264 Image Transform
  • Motivation
  • DCT requires real-number operations, which may
    cause inaccuracies in inversion
  • Better motion compensation means less spatial
    correlation in the residual, so there is no need
    for an 8x8 transform
  • H.264 uses a very simple integer 4x4 transform
  • A (pretty crude) approximation to 4x4 DCT
  • Transform matrix contains only ±1 and ±2
  • Can be computed with only additions,
    subtractions, and shifts
  • Results show negligible loss in quality (0.02dB)
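
The forward 4x4 core transform can be written as Y = C X C^T with the integer matrix C below, or equivalently as a butterfly that uses only additions, subtractions, and shifts. The sketch shows both forms; the function names are hypothetical, and the scaling that the codec folds into quantization is left out.

```python
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_matrix(x):
    """Reference form: C * X * C^T."""
    ct = [list(col) for col in zip(*CF)]
    return matmul(matmul(CF, x), ct)


def butterfly_1d(x0, x1, x2, x3):
    """One 4-point transform using only adds, subtracts, and shifts."""
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    return (s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1))


def forward_fast(x):
    """Apply the butterfly to every row, then to every column."""
    rows = [butterfly_1d(*r) for r in x]
    cols = [butterfly_1d(*c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

Because every operation is on integers, encoder and decoder compute bit-identical results, which avoids the inversion inaccuracies mentioned in the motivation.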

32
Deblocking Filters
  • Motivation: block-based MC and transforms
    generate blocking artifacts
  • Very visible to human eye at low bit-rates
  • Previous standards applied simple filters to
    smudge edges between blocks
  • H.264 adaptively chooses for each edge which one
    of 5 deblocking filters to apply.
  • For instance, if both blocks have the same motion
    vector, less filtering is needed.
  • Improves objective quality as well: about 7-9%
    reduction in bit-rate for the same PSNR.
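
The per-edge decision can be sketched as a function that returns one of five strengths, 0 (no filtering) to 4 (strongest). The rules below are a simplified reading of how such a decision is typically made. The dictionary fields, the function name, and the threshold of 4 quarter-pixel units are assumptions of this sketch and not a transcription of the standard.

```python
def boundary_strength(p, q, on_macroblock_edge):
    """p and q describe the two blocks on either side of an edge.
    Each is a dict with keys: intra (bool), has_coeffs (bool),
    ref (reference frame index), mv ((dx, dy) in quarter-pixel units)."""
    if p["intra"] or q["intra"]:
        return 4 if on_macroblock_edge else 3
    if p["has_coeffs"] or q["has_coeffs"]:
        return 2
    if p["ref"] != q["ref"]:
        return 1
    dx = abs(p["mv"][0] - q["mv"][0])
    dy = abs(p["mv"][1] - q["mv"][1])
    if dx >= 4 or dy >= 4:      # vectors differ by a full pixel or more
        return 1
    return 0                    # same motion, no residual: leave the edge alone
```

Two inter-coded blocks with the same reference, the same motion vector, and no coded residual get strength 0, which matches the example in the bullet above.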

34
Entropy Coding
  • Motivation: traditional coders use fixed,
    variable-length codes
  • Essentially Huffman-style codes
  • Non-adaptive
  • Can't encode symbols with probability > 0.5
    efficiently, since at least one bit is required
    per symbol
  • H.263 Annex E defines an arithmetic coder
  • Still non-adaptive
  • Uses multiple non-binary alphabets, which results
    in high computational complexity
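
The one-bit floor can be checked numerically. For a binary source whose more likely symbol has probability p, the entropy is the lower bound on bits per symbol, while a code that assigns a whole codeword to each symbol cannot go below 1 bit. The function name is hypothetical.

```python
import math


def entropy_bits(p):
    """Entropy in bits per symbol of a binary source with probabilities p and 1 - p
    (0 < p < 1)."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# With p = 0.95 the bound is well under a third of a bit per symbol,
# yet a per-symbol variable-length code still spends a full bit.
skewed = entropy_bits(0.95)
balanced = entropy_bits(0.5)
```

Arithmetic coding closes this gap because it spends a fractional number of bits per symbol.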

35
Entropy Coding: CABAC
  • Arithmetic coding framework designed specifically
    for H.264
  • Binarization: all syntax symbols are translated
    to bit-strings
  • 399 predefined context models, used in groups
  • E.g. models 14-20 used to code macroblock type
    for inter-frames
  • The model to use next is selected based on
    previously coded information (the context)
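
The two ideas above, binarization and context-dependent adaptive models, can be illustrated with a toy example. This is not the CABAC engine, which uses a table-driven probability state machine and a binary arithmetic coder. Here a simple count-based estimate stands in for the model, and the cost is the ideal arithmetic-coding cost of -log2(p) bits per bin. All names are hypothetical.

```python
import math


def unary_binarize(value):
    """Unary binarization: value ones followed by a terminating zero."""
    return [1] * value + [0]


class ContextModel:
    """Adaptive probability estimate for one context, based on counts."""

    def __init__(self):
        self.counts = [1, 1]    # start with no preference between 0 and 1

    def p(self, bit):
        return self.counts[bit] / sum(self.counts)

    def update(self, bit):
        self.counts[bit] += 1


def ideal_cost_bits(bins, contexts, select):
    """Total ideal cost of coding bins, where select(prev_bin) returns the
    index of the context model to use for the next bin."""
    total, prev = 0.0, 0
    for b in bins:
        model = contexts[select(prev)]
        total += -math.log2(model.p(b))
        model.update(b)         # the model adapts after every coded bin
        prev = b
    return total
```

For a heavily skewed bin sequence the adaptive model quickly learns the bias, so the cost falls far below one bit per bin, which a fixed variable-length code cannot achieve.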

37
Frames and Slices
  • In H.263 and MPEG, each frame is either inter
    (P-frame) or intra (I-frame).
  • Exception: some macroblocks in P-frames may be
    intra-coded, and are called I-blocks.
  • H.264 generalizes this: each frame consists of
    one or more slices
  • Contiguous groups of macroblocks
  • Processed in internal raster order
  • Each is independently encoded and decoded
  • I-slices, P-slices, B-slices (two reference
    frames)

38
Summary
  • Some very strong results
  • ¼-pixel prediction
  • CABAC
  • Deblocking filters
  • Very exciting! First major improvement since the
    mid-90s
  • Two-fold improvement in performance!