Title: Nokia MVC Video Codec
1Nokia MVC Video Codec
Nokia Research Center
Documents
- Q15-J-19 Decoder Description
- Q15-J-20 Source Code
- Q15-J-21 Results
- Q15-J-46 Encoder Description
2MVC Coder
- Consistent 1 dB - 1.5 dB performance gains over TML-2 (0.5 dB - 1.0 dB over Telenor proposal for TML-4).
- Superior coding performance achieved by
- Directional Intra prediction.
- Advanced motion model based on orthonormal affine basis functions.
- Variable size transforms used for prediction error coding.
- Adaptive loop filter.
3Encoder Overview
[Figure: encoder block diagram. Blocks: Prediction Error Coding, Prediction Error Decoding, Frame Memory, Motion Compensated Prediction, Multiplexing. Signals: In(x,y), En(x,y), Rn(x,y), motion information.]
4Motion Compensated Prediction
- Macroblock based segmentation
- Macroblocks can be split into independently coded 8x8 blocks
- All motion vectors of the segment are described by a function of a few parameters.
5Motion Model
- Current standards: translational motion model.
- MVC: affine motion model. In addition to translation, it models zooming, rotation and shearing.
6Motion Model
- The affine functions 1, y, x, orthonormalized with respect to an 8x8 or 16x16 block, are used.
- Coefficients corresponding to such functions are robust to quantisation.
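The orthonormalization mentioned above can be sketched with a Gram-Schmidt pass over the sampled functions. This is only an illustration: the slides do not state the inner product or the procedure, so the sum over block pixels and the function name `orthonormal_affine_basis` are assumptions.

```python
import math

def orthonormal_affine_basis(n):
    """Gram-Schmidt on the functions 1, y, x sampled on an n x n block.

    Assumption: the inner product is the plain sum over block pixels.
    """
    coords = [(x, y) for y in range(n) for x in range(n)]
    raw = [
        [1.0 for _ in coords],          # constant function
        [float(y) for _, y in coords],  # y
        [float(x) for x, _ in coords],  # x
    ]
    basis = []
    for f in raw:
        g = list(f)
        # Remove the components along the already orthonormal functions.
        for b in basis:
            proj = sum(fi * bi for fi, bi in zip(f, b))
            g = [gi - proj * bi for gi, bi in zip(g, b)]
        norm = math.sqrt(sum(gi * gi for gi in g))
        basis.append([gi / norm for gi in g])
    return basis

def inner(a, b):
    """Inner product of two sampled functions."""
    return sum(ai * bi for ai, bi in zip(a, b))

basis8 = orthonormal_affine_basis(8)
# Gram matrix of the result; it should be (numerically) the identity.
gram = [[inner(a, b) for b in basis8] for a in basis8]
```

Under this assumption each motion vector component would be a weighted sum of the three resulting functions, giving six coefficients per segment.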
7Motion Parameter Prediction
- Adjacent blocks often undergo similar motion.
- Causal motion prediction is used: motion can be predicted from the upper or left block.
- Prediction by extension of the motion vector field.
- The final motion vector field is the sum of the predicted field and a separately transmitted refinement field.
[Figure: motion vector field prediction, with blocks labeled C and L]
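A minimal sketch of prediction by field extension followed by refinement. It assumes plain (not orthonormalized) affine coefficients `v = c0 + c1*x + c2*y` in block-local coordinates for one motion vector component; the helper names `extend_field`, `reconstruct` and `evaluate` are hypothetical and not taken from the MVC software.

```python
def extend_field(coeffs, dx, dy):
    """Re-express a neighbouring block's affine field in the coordinates of a
    block whose origin is offset by (dx, dy) from the neighbour's origin."""
    c0, c1, c2 = coeffs
    return (c0 + c1 * dx + c2 * dy, c1, c2)

def reconstruct(predicted, refinement):
    """Final coefficients = predicted coefficients + transmitted refinement."""
    return tuple(p + r for p, r in zip(predicted, refinement))

def evaluate(coeffs, x, y):
    """Value of the affine field at local position (x, y)."""
    c0, c1, c2 = coeffs
    return c0 + c1 * x + c2 * y

# Left neighbour of a 16x16 block: the current block starts 16 pixels to the right.
left = (1.0, 0.25, -0.5)
predicted = extend_field(left, 16, 0)
final = reconstruct(predicted, (0.5, 0.0, 0.0))
```

The extended field agrees with the neighbour's field at every pixel, so only the deviation from it has to be sent as a refinement.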
8Motion Compensation
- Fixed point motion vectors are evaluated iteratively
- Evaluate the motion vector for the first pixel using the motion coefficients.
- Calculate the motion vector differences when moving a single pixel right (sxx, sxy) and down (syx, syy).
- Scan the whole segment, adding only the differences to the previously evaluated motion vector.
- Subpixel values are obtained by cubic spline interpolation.
- Separable 2D filtering in a 4x4 neighborhood.
- Filter coefficients are functions of the fractional part of the coordinate (tabulated and precalculated).
- Bilinear interpolation for chrominance frames
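The iterative evaluation described above can be sketched as follows. The names sxx, sxy, syx and syy follow the slide; the 16-bit fractional precision, the plain affine parameterization and the function names are assumptions made for the example.

```python
FRAC_BITS = 16  # assumed fixed-point precision; not specified in the slides

def to_fixed(v):
    """Convert a real value to fixed point with FRAC_BITS fractional bits."""
    return int(round(v * (1 << FRAC_BITS)))

def scan_motion_field(a, b, width, height):
    """Evaluate vx = a0 + a1*x + a2*y and vy = b0 + b1*x + b2*y over a segment
    using only additions inside the scan."""
    # Motion vector for the first pixel of the segment (x = 0, y = 0).
    row_vx, row_vy = to_fixed(a[0]), to_fixed(b[0])
    sxx, sxy = to_fixed(a[1]), to_fixed(b[1])  # change when moving one pixel right
    syx, syy = to_fixed(a[2]), to_fixed(b[2])  # change when moving one pixel down
    field = []
    for _ in range(height):
        vx, vy = row_vx, row_vy
        row = []
        for _ in range(width):
            row.append((vx, vy))
            vx += sxx
            vy += sxy
        field.append(row)
        row_vx += syx
        row_vy += syy
    return field

field = scan_motion_field((1.5, 0.25, -0.125), (-2.0, 0.0625, 0.5), 8, 8)
```

No multiplication is needed per pixel; the cost is two additions per pixel plus two per row.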
9Motion Estimation
- Gauss-Newton converges only towards a local minimum, unless the initial coefficients lie in the attraction domain of the global minimum.
- Robust initial motion search
- Block matching
- Subsampled images are used in order to keep the complexity down.
- Hierarchical motion estimation
- Utilization of the motion of neighboring segments
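To illustrate why the initial search matters, here is a deliberately simplified Gauss-Newton iteration. It estimates a single 1-D shift rather than six affine coefficients, and uses an analytic stand-in signal (`reference`) instead of image data; both simplifications are assumptions for the sake of a short example.

```python
import math

def reference(x):
    """Smooth 1-D stand-in for the reference frame (assumed for illustration)."""
    return math.sin(0.3 * x)

def reference_gradient(x):
    """Derivative of the stand-in reference signal."""
    return 0.3 * math.cos(0.3 * x)

def gauss_newton_shift(current, d0, iterations=10):
    """Estimate d minimising sum_i (reference(i + d) - current[i])**2."""
    d = d0
    for _ in range(iterations):
        jtj = 0.0
        jtr = 0.0
        for i, c in enumerate(current):
            r = reference(i + d) - c       # residual at sample i
            j = reference_gradient(i + d)  # derivative of the residual w.r.t. d
            jtj += j * j
            jtr += j * r
        if jtj == 0.0:
            break
        d -= jtr / jtj                     # Gauss-Newton update
    return d

TRUE_SHIFT = 0.7
current = [reference(i + TRUE_SHIFT) for i in range(16)]
estimate = gauss_newton_shift(current, d0=0.0)
```

Starting close to the true shift, the iteration converges to it. A start far from it may settle in a different local minimum, which is what a block matching or hierarchical initial search is meant to prevent.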
10Motion Estimation and Mode Selection
11Coefficient Removal
- Reducing the number of bits by selectively setting some of the motion coefficients to zero.
- Compensating for a removed coefficient by adjusting the others.
- Main difficulty: non-linear dependency of the prediction error on the values of the motion coefficients
- The importance of a particular coefficient cannot be judged by its magnitude.
- New representation of the optimization problem using linearization of the image brightness function (an operation similar to the one used in Gauss-Newton minimization).
- All operations are manipulations of small, at most 6x6, matrices.
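The slides do not give the formulas, so the following is only a sketch under an assumption: after linearization the prediction error is modelled, up to a constant, as the quadratic form E(a) = a^T A a - 2 b^T a, with A a small symmetric matrix. Removing coefficient k then means solving the system with row and column k deleted, which also adjusts the remaining coefficients. A 3x3 example is used for brevity; all function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def cost(A, b, a):
    """Linearised error up to a constant: a^T A a - 2 b^T a."""
    n = len(a)
    quad = sum(a[i] * A[i][j] * a[j] for i in range(n) for j in range(n))
    return quad - 2.0 * sum(b[i] * a[i] for i in range(n))

def remove_coefficient(A, b, k):
    """Force coefficient k to zero and re-optimise the remaining ones."""
    keep = [i for i in range(len(b)) if i != k]
    sub_a = solve([[A[i][j] for j in keep] for i in keep], [b[i] for i in keep])
    a = [0.0] * len(b)
    for i, v in zip(keep, sub_a):
        a[i] = v
    return a

def cheapest_removal(A, b):
    """Return the index whose removal increases the modelled error least."""
    full = solve(A, b)
    base = cost(A, b, full)
    increases = [cost(A, b, remove_coefficient(A, b, k)) - base for k in range(len(b))]
    return min(range(len(b)), key=lambda k: increases[k]), increases, full

A = [[4.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 0.0, 0.01]]
b = [2.0, -2.0, 0.05]
best, increases, full = cheapest_removal(A, b)
```

In this example the unconstrained solution is approximately (1, -1, 5), yet the largest coefficient (index 2) is the cheapest to remove, which illustrates why magnitude alone is a poor guide.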
12Coefficient Removal
13Complexity Scalability in Motion Estimation
- Good complexity/video quality trade-offs can be achieved by
- Limiting the maximum number of iterations in Gauss-Newton optimization
- Performing only one Coefficient Removal per segment (either with respect to predicted coefficients or zero coefficients)
- Increasing the threshold for splitting a macroblock
- Skipping the 32x32 motion estimation
- Reducing the number of RD-optimizations
- Example
- One Gauss-Newton iteration in motion estimation
- One Coefficient Removal
- Increased threshold for splitting macroblocks
- 1/3 complexity
- 0.3 dB maximum degradation
- Real-time on current Pentium processors (10 Hz QCIF)
14Example With Simplifications
15Intra Prediction
- 8x8 blocks are predicted (pixel prediction) from the neighboring, already coded blocks.
- There are 10 possible prediction methods: DC prediction, directional extrapolations and block matching.
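A minimal sketch of three of the simpler prediction types, written in the generic form familiar from block-based intra coding. The exact MVC predictors are not given in the slides, so the formulas, the SAD-based comparison and the function names are assumptions. `top` is the pixel row above the block and `left` the pixel column to its left.

```python
def predict_dc(top, left):
    """Fill the block with the rounded mean of the neighbouring pixels."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]

def predict_vertical(top, left):
    """Extrapolate the row above the block straight down."""
    return [list(top) for _ in range(len(top))]

def predict_horizontal(top, left):
    """Extrapolate the column left of the block straight across."""
    return [[p] * len(left) for p in left]

def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for ra, rb in zip(block, prediction) for a, b in zip(ra, rb))

def best_prediction(block, top, left):
    """Pick the candidate with the smallest SAD (assumed selection criterion)."""
    methods = {"dc": predict_dc, "vertical": predict_vertical, "horizontal": predict_horizontal}
    errors = {name: sad(block, fn(top, left)) for name, fn in methods.items()}
    return min(errors, key=errors.get), errors

top = [10 * (i + 1) for i in range(8)]
left = [45] * 8
block = [list(top) for _ in range(8)]  # vertical stripes continuing the row above
name, errors = best_prediction(block, top, left)
```

For a block of vertical stripes the vertical extrapolation reproduces the block exactly, so its prediction error is zero.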
16Intra Prediction
- Direction of the prediction is inferred using classification of the gradient directionality of the neighboring blocks.
- Blocks U and L are classified into 6 classes
- Each combination of U and L classes is assigned a subset of all possible prediction methods.
- Classification is based on simple pixel differences
- Intra prediction error is coded using a set of methods similar to that used for coding motion prediction error.
17Pixel Prediction Example
18Prediction Error Coding
- The prediction error signal has a statistically varying nature that is difficult for a single coding method to represent.
- MVC uses a set of coding methods
- Multishape DCT,
- Extrapolation,
- Diagonal KLTs.
- 8x8: 16 basis functions
- 4x4: 8 basis functions
[Figure: example of prediction error]
- Each of these methods works very well for certain types of prediction error patterns, greatly improving coding efficiency over plain 8x8 DCT coding.
19Prediction Error Coding
- Coding methods used in Inter frames
- 8x8 DCT involving directional scanning orders,
- 4x8 cluster(s) coded with DCT,
- 4x4 clusters coded with
- 4x4 DCT involving directional scanning orders,
- 4x4 KLT
- Extrapolation.
- Coding methods used in Intra frames
- Same as those used in Inter frames and, in addition,
- 8x8 KLTs.
20Inter Prediction Error Coding
- The list of coding methods for a certain block varies depending on
- Quantisation parameter
- Luminance or chrominance
- Block size (8x8 or 4x4)
- Example: most probable coding methods for an 8x8 luminance block with different QPs
- The best coding method for a given block is selected in the rate-distortion sense and its rank inside the list is signaled to the decoder by VLC.
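The selection and signaling step can be sketched as a Lagrangian cost minimisation followed by a variable-length code for the rank. The cost J = D + lambda * R is the usual rate-distortion form; the unary code used for the rank, the example numbers and the function names are hypothetical, since the slides do not specify the VLC table.

```python
def select_coding_method(candidates, lagrange_multiplier):
    """candidates: list of (name, distortion, rate_bits), ordered from most to
    least probable. Returns (rank, name) minimising distortion + lambda * rate."""
    best_rank = min(
        range(len(candidates)),
        key=lambda r: candidates[r][1] + lagrange_multiplier * candidates[r][2],
    )
    return best_rank, candidates[best_rank][0]

def rank_codeword(rank):
    """Hypothetical VLC: unary code, so earlier ranks get shorter codewords."""
    return "1" * rank + "0"

# Illustrative numbers only; they are not measurements from the codec.
candidates = [
    ("8x8 DCT", 120.0, 40),
    ("4x4 DCT", 90.0, 70),
    ("Extrapolation", 150.0, 12),
]
rank, method = select_coding_method(candidates, lagrange_multiplier=2.0)
codeword = rank_codeword(rank)
```

Ordering the list by probability for the current context keeps the most frequent choices at low ranks, where the codewords are shortest.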
21Intra Prediction Error Coding
- The list of coding methods for a certain block varies depending on
- Quantisation parameter
- Luminance or chrominance
- Block size (8x8 or 4x4)
- Intra prediction direction
- Example: most probable coding methods for an 8x8 luminance block with diagonal and horizontal Intra predictions (QPs 10-15)
- The best coding method for a given block is selected in the rate-distortion sense and its rank inside the list is signaled to the decoder by VLC.
22Status
- Long-term memory (Q15-J-42)
- Implemented in the current software
- 0.1 dB - 1 dB improvements with 5 reference frames
- Intra / Prediction Error Coding simplifications
- Dropped 4 TCOEFF tables
- Dropped 3 Intra prediction directions
- Simplified classifier (6 classes instead of 11)
- Added RD constrained quantization to Intra coding
- Added block matching as one of the prediction
methods
23MVC Results
24MVC Results
25MVC Results
26MVC Results
27MVC Results
28MVC Results
29MVC Results
30MVC Intra Results
31MVC Intra Results
32MVC Intra Results
33MVC Intra Results
34MVC Intra Results
35MVC Intra Results
36MVC Intra Results
37MVC Modules with Current TML
- MVC Intra frame coding with TML-2 (Q15-J-47)
- MVC affine motion compensation with TML-2 (Q15-J-43)
- Fairly modest improvements (0.1 dB - 0.5 dB)
- TML-2 prediction error coding suboptimal with affine motion