Flexible Media Compression - PowerPoint PPT Presentation

About This Presentation
Title:

Flexible Media Compression

Description:

Flexible Media Compression State of the Art – PowerPoint PPT presentation

Slides: 84
Provided by: jin53
Transcript and Presenter's Notes

Title: Flexible Media Compression


1
Flexible Media Compression: State of the Art
  • Jin Li
  • Microsoft Research

2
Outline
  • Introduction
  • Media compression: any work to be done?
  • Flexible media compression
  • Image compression: JPEG 2000 & delivery
  • Compression of the 3D environment
  • Audio compression (functionality demo)
  • Conclusion

3
Media Compression: So Many Standards
  • Audio/Speech: MP3, WMA, Real, G.723, G.722.1, MPEG-4 audio, etc.
  • Image: JPEG, BMP, GIF, JPEG 2000, etc.
  • Video: MPEG-1, MPEG-2, MPEG-4, H.261, H.263, WMV, Real, AVI, QuickTime

4
Media vs. File Compression
  • What's the difference between media and file
    compression?
  • File compression
  • Every bit is important, has to be compressed
    losslessly
  • Media compression
  • Exact bit/value is not important, distortion is
    tolerable
  • Amount of media is huge, high compression ratio
    is required
  • Often needs to be manipulated

5
Example Image
Lena image (512x512): a 2D array of data (sample pixel values: 167, 123, 84, 200)
6
Media Representation
Image (512x512)
Manipulation:
  • Subsample (128x128)
  • Reposition (256,256)-(384,384)
  • Compress (JPEG)

7
Flexible Media Compression
  • Media needs to be manipulated; how is this
    related to its compression?
  • Many current compression standards generate a
    bitstream that cannot be manipulated
  • The compressed media should be flexibly
    adjustable to match the requirements of the
  • Client device
  • Network channel
  • Storage device

8
Flexible Media Compression
  • A challenging task
  • Flexible
  • More functionality
  • Yet as efficient as possible

9
Examples
  • Work done
  • Flexible image compression
  • Flexible environment compression
  • Flexible audio compression
  • Flexible compression can be achieved without the
    loss of compression efficiency.

10
Flexible Image Compression: JPEG 2000 & Vmedia
11
How to Efficiently Browse a Large Image
  • The user specifies a region of interest (ROI)
  • The image is compressed into scalable units
  • Only the bitstream required for the current view
    is delivered
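The idea of delivering only what the current view needs can be sketched as a block lookup. This is a simplified illustration, not the JPEG 2000 procedure itself: it tiles the image domain into 64x64 blocks (the block size quoted later in the deck), whereas JPEG 2000 code blocks live in wavelet subbands. The function name `blocks_for_view` is hypothetical.

```python
def blocks_for_view(x0, y0, x1, y1, block=64):
    """Return (row, col) indices of block x block tiles that overlap the
    view rectangle [x0, x1) x [y0, y1), in raster order."""
    if x1 <= x0 or y1 <= y0:
        return []  # empty view: nothing to deliver
    cols = range(x0 // block, (x1 - 1) // block + 1)
    rows = range(y0 // block, (y1 - 1) // block + 1)
    return [(r, c) for r in rows for c in cols]

# View (256,256)-(384,384) on a 512x512 image
view = blocks_for_view(256, 256, 384, 384)
print(len(view), "of", (512 // 64) ** 2, "blocks needed")  # 4 of 64 blocks needed
```

Under these assumptions the client requests 4 of the 64 blocks for that view.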

12
Key Technologies
  • Compress an image into a set of scalable units
  • JPEG 2000
  • Deliver and manage bitstream segments
  • Vmedia

13
The benefit of JPEG 2000 over JPEG
  • Achieve better efficiency
  • Superior low bit-rate performance compared with
    JPEG
  • Better visual performance: visual tools
  • Handles more types of images
  • Provide many new useful functionalities

14
The benefit of JPEG 2000 over JPEG (2)
  • Provide many new useful functionalities
  • Lossless compression
  • Progressive transmission
  • By quality, visual and resolution
  • Progression to lossless
  • Region of interest (ROI)
  • Encoder: code a certain region with high quality
  • Decoder: access and processing
  • Progressive ROI access: arbitrarily access a
    certain area, decoding resolution, and quality
    level
  • Robustness to bit errors

15
JPEG 2000 Framework
Image → Transform → Quantization → Entropy Coding → Bitstream Assembler → Compressed bitstream
16
Transform
Original: 128, 129, 125, 64, 65, ...
Transform Coeff.: 4123, -12.4, -96.7, 4.5, ...
17
Quantization
Transform Coeff.: 4123, -12.4, -96.7, 4.5, ...
Quantized Coeff. (Q = 64): 64, 0, -1, 0, ...
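The numbers on this slide are consistent with dividing each coefficient by the step size and truncating toward zero. A minimal sketch under that assumption follows; the function names are hypothetical, and the midpoint reconstruction in `dequantize` is an illustrative choice that the slides do not state.

```python
def deadzone_quantize(coeffs, step):
    """Quantize by truncating coeff / step toward zero (deadzone around 0)."""
    return [int(c / step) for c in coeffs]

def dequantize(indices, step):
    """Reconstruct at the midpoint of each nonzero bin (an assumption);
    index 0 reconstructs to 0."""
    return [0.0 if q == 0 else (1 if q > 0 else -1) * (abs(q) + 0.5) * step
            for q in indices]

q = deadzone_quantize([4123, -12.4, -96.7, 4.5], 64)
print(q)  # [64, 0, -1, 0]
```

Truncation makes the zero bin twice as wide as the others, which is why small coefficients such as -12.4 and 4.5 both map to 0.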
18
Two Tier Entropy Coding
Res1
Res2
. . .
Encode each block separately and record the
bitstream of each block. Block size is 64x64.
19
Bitstream Assembler
[Figure: bitstream assembler diagram with labels D1-D4 and R1-R4]
20
Assemble the Bitstream
Encode each block separately and record a bitstream
for each block (Res1, Res2, . . .)
  • Bitstream
  • Rate-distortion optimized, for progressive by
    quality
  • May be reordered
  • Region with resolution access, progressive by
    quality

21
A Sample JPEG 2000 Bit Stream
22
Delivery of Flexibly Compressed Image
  • Technology for interactive browsing
  • Find content related to the current view
  • Deliver content efficiently

23
Problem
  • Need to deliver many segments of bitstream
  • Bitstream segments need to be delivered in a
    prioritized way
  • Need to cache the delivered bitstream segments
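The three requirements above (many segments, prioritized delivery, caching) can be sketched with a priority queue and a dictionary cache. This is an illustration only, not the Vmedia API: the class `SegmentFetcher`, its methods, and the segment names are all hypothetical, and a lower priority number is assumed to mean more urgent.

```python
import heapq

class SegmentFetcher:
    """Hypothetical client-side helper: queue segment requests by priority
    and skip segments that are already cached."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable: segment_id -> bytes
        self.cache = {}      # segment_id -> bytes
        self.queue = []      # heap of (priority, arrival order, segment_id)
        self.order = 0

    def request(self, segment_id, priority):
        if segment_id not in self.cache:
            heapq.heappush(self.queue, (priority, self.order, segment_id))
            self.order += 1

    def deliver_next(self):
        """Fetch the most urgent pending segment; return its id, or None."""
        while self.queue:
            _, _, segment_id = heapq.heappop(self.queue)
            if segment_id not in self.cache:  # may have been queued twice
                self.cache[segment_id] = self.fetch(segment_id)
                return segment_id
        return None

fetched = []
def fake_fetch(segment_id):
    fetched.append(segment_id)
    return b"bits-" + segment_id.encode()

fetcher = SegmentFetcher(fake_fetch)
fetcher.request("block7/layer1", priority=2)
fetcher.request("block3/layer0", priority=0)
fetcher.request("block7/layer1", priority=1)  # duplicate, more urgent
while fetcher.deliver_next() is not None:
    pass
print(fetched)  # ['block3/layer0', 'block7/layer1']
```

The duplicate request is fetched once, and a later request for a cached segment is dropped without touching the network.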

24
Vmedia
[Figure: Vmedia architecture: a media server running Vmedia, connected over the network to Client 1 through Client n, each running Vmedia and a media program]
Most work is done at the client end
25
Virtual Media Concept
26
Cache Management
27
JPEG 2000 Interactive Image Browser
28
Initial Stage
  • Read file header & media structure

29
Initial Stage
  • File header & packet headers marked by companion
    file
  • Read in synchronous mode

30
Entire Image Low Res
31
Zooming In
32
Panning Around
33
Demo
Application
Browser
34
Flexible Environment Compression
35
Concentric Mosaic
Camera
Beam
36
CM Data
37
CM Rendering Engine
Camera trajectory
Inner circle
38
CM Rendering Engine
Camera Path
Inner circle
Environment
39
Challenge for Concentric Mosaic Compression
  • Specialized data structure
  • Large amount of data, even for one scene
  • Random access
  • An image is displayed as a whole
  • Video is accessed frame by frame
  • A concentric mosaic data set is best kept in
    compressed form, then decoded and rendered
    just-in-time (JIT)

40
Principles of Good Concentric Mosaic Coder
  • High compression ratio
  • Just-in-time rendering & decoding
  • Access & decode only the content needed to render
    the current view
  • Fast decoding operation
  • Random bitstream access & delivery

41
Block Coding
  • Vector quantization
  • Color quantization, S3TC, etc
  • Block transform-based coding

42
Block Coding
  • Description
  • Encode each block at fixed length
  • Good candidate for image cache
  • Advantage
  • Simple system
  • Easy bitstream indexing & access
  • Fast encoding & decoding
  • Disadvantage
  • Low compression ratio (around 4:1-50:1)

43
Vector Quantization
  • Advantage
  • Fast & simple decoding
  • Disadvantage
  • Need to record the lookup table
  • Lookup table grows if subblock is large or
    required quality is high
  • Complex in encoding

[Figure: each subblock is replaced by an index into a lookup table with entries 0, 1, 2, 3, . . .]
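The "fast & simple decoding" of vector quantization comes down to a table lookup per subblock. A minimal sketch, assuming 2x2 subblocks and a four-entry codebook chosen purely for illustration (the function `vq_decode` and the codebook contents are hypothetical):

```python
def vq_decode(indices, codebook, blocks_per_row, sub=2):
    """Rebuild an image (list of pixel rows) from subblock indices.
    Each codebook entry is a flat list of sub*sub pixel values."""
    rows = len(indices) // blocks_per_row * sub
    image = [[0] * (blocks_per_row * sub) for _ in range(rows)]
    for n, idx in enumerate(indices):
        by, bx = divmod(n, blocks_per_row)   # subblock position
        entry = codebook[idx]                # the lookup is the whole decode
        for dy in range(sub):
            for dx in range(sub):
                image[by * sub + dy][bx * sub + dx] = entry[dy * sub + dx]
    return image

codebook = [[0, 0, 0, 0], [255, 255, 255, 255],
            [0, 255, 0, 255], [0, 0, 255, 255]]
image = vq_decode([0, 2, 3, 1], codebook, blocks_per_row=2)
for row in image:
    print(row)
```

The codebook itself has to be stored or transmitted, which is the lookup-table cost the slide lists as a disadvantage.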
44
S3TC (Used in DirectX)
  • Advantage
  • Fast & simple decoding
  • Easy encoding
  • Disadvantage
  • Limited compression ratio

Rule: 00 → C0, 01 → 2/3 C0 + 1/3 C1, 10 → 1/3 C0 + 2/3 C1, 11 → C1
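The rule on this slide can be sketched as a four-entry palette built from two endpoint colors, with one 2-bit code per pixel of a 4x4 block. This follows the slide's code assignment; actual S3TC/DXT1 data stores the endpoints as 16-bit RGB565 values and assigns the four codes in a different order. The function names are hypothetical.

```python
def s3tc_palette(c0, c1):
    """Four-entry palette per the slide's rule; colors are (r, g, b) tuples."""
    def mix(a, b):
        # 2/3 of a plus 1/3 of b, per channel, in integer arithmetic
        return tuple((2 * x + y) // 3 for x, y in zip(a, b))
    return {0b00: c0, 0b01: mix(c0, c1), 0b10: mix(c1, c0), 0b11: c1}

def s3tc_decode_block(c0, c1, codes):
    """codes: 16 two-bit values in raster order for one 4x4 block."""
    palette = s3tc_palette(c0, c1)
    return [[palette[codes[4 * y + x]] for x in range(4)] for y in range(4)]

black, white = (0, 0, 0), (255, 255, 255)
block = s3tc_decode_block(black, white, [0, 1, 2, 3] * 4)
print(block[0])
```

Sixteen pixels cost two endpoint colors plus 32 bits of codes, which is why the compression ratio is fixed and limited.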
45
Transform Based Coding
Quantization
  • Advantage
  • Higher compression ratio
  • Disadvantage
  • More complex

46
Reference Block Coding - Structure
P A P P P P P P P A P P P P P P P A P P
47
Reference Block Coding Rendering
Shot sequence: P P A P P P P P P P A P P P P P P P A P
Rendering engine
48
Reference Block Coding
  • Characteristics
  • Macroblock coding similar to MPEG
  • Modifications
  • P frame only refers to nearby A frames
  • Global panning & local motion improve
    efficiency
  • Index for A & P frame MBG for random data
    access
  • Advantage
  • High compression ratio (40:1-200:1)
  • Leverages existing hardware
  • Able to JIT-decode the compressed bitstream
  • Disadvantage
  • Relatively complex (compared to block coding)
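Because a P frame refers only to nearby A frames, random access needs very little decoding. The sketch below uses the A/P layout from the structure slide and assumes "nearby" means the nearest A frame on each side; that reading, and the function name `anchors_needed`, are assumptions rather than details given in the slides.

```python
def anchors_needed(structure, frame):
    """structure: string of 'A'/'P', one character per shot. Returns the
    indices of the A frames to decode before `frame` (nearest on each
    side, if present). An A frame needs no other frame."""
    if structure[frame] == "A":
        return []
    before = structure.rfind("A", 0, frame)
    after = structure.find("A", frame + 1)
    return [i for i in (before, after) if i != -1]

layout = "PAPPPPPPPAPPPPPPPAPP"
print(anchors_needed(layout, 5))   # [1, 9]
print(anchors_needed(layout, 0))   # [1]
```

With MPEG-style chains of P frames depending on earlier P frames, reaching frame 5 would require decoding every frame back to the previous anchor instead.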

49
High Dimensional Transform Coding (3D Wavelet)
  • Framework
  • Frame alignment: smart rebinning
  • Quick decoding: progressive inverse wavelet
    synthesis

50
Why 3D Wavelet?
  • Good decorrelation and energy compaction
  • Easily designed quantization and coding
    algorithms
  • Better flexibility
  • Error resilience

51
3D Wavelet Compression System
52
3D Wavelet Transform
Fn
F3
F2
F1
F0
53
Lifting Implementation
An example: the biorthogonal 9-7 filter
[Figure: lifting lattice mapping the original samples x0-x8 to low-pass outputs L0-L4 and high-pass outputs H0-H3 through lifting coefficients a, b, c, d]
  • No auxiliary memory is needed
  • Easy to do the inverse transform
  • Exactly the same result as convolution with half
    the computational complexity
  • Convolution: on average, every node requires
    4.5 multiplications (×) and 7 additions (+)
  • Lifting: on average, every node requires
    2 multiplications (×) and 4 additions (+)

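A lifting implementation of the 9-7 filter can be sketched as four in-place passes plus a scaling step. The coefficient values below are the commonly published 9-7 lifting constants; the symmetric boundary handling and the scaling convention (low-pass times K, high-pass divided by K) are assumptions, since the slide labels the coefficients only as a, b, c, d.

```python
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
K = 1.149604398  # scaling constant; conventions vary between codecs

def _lift(x, start, coeff):
    """In place: x[i] += coeff * (x[i-1] + x[i+1]) for i = start, start+2, ...
    A missing neighbour at either end is mirrored (symmetric extension)."""
    n = len(x)
    for i in range(start, n, 2):
        left = x[i - 1] if i > 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += coeff * (left + right)   # 1 multiply, 2 adds

def forward_97(signal):
    x = [float(v) for v in signal]       # requires len(signal) >= 2
    _lift(x, 1, ALPHA)   # odd samples from even neighbours
    _lift(x, 0, BETA)    # even samples from odd neighbours
    _lift(x, 1, GAMMA)
    _lift(x, 0, DELTA)
    low = [v * K for v in x[0::2]]
    high = [v / K for v in x[1::2]]
    return low, high

def inverse_97(low, high):
    x = [0.0] * (len(low) + len(high))
    x[0::2] = [v / K for v in low]
    x[1::2] = [v * K for v in high]
    _lift(x, 0, -DELTA)  # undo the passes in reverse order
    _lift(x, 1, -GAMMA)
    _lift(x, 0, -BETA)
    _lift(x, 1, -ALPHA)
    return x

signal = [128, 129, 125, 64, 65, 70, 90, 120, 121]   # x0..x8
low, high = forward_97(signal)                       # L0..L4, H0..H3
restored = inverse_97(low, high)
print(max(abs(a - b) for a, b in zip(signal, restored)))
```

Each pass updates half of the samples with one multiplication and two additions, so the four passes cost 2 multiplications and 4 additions per sample on average (ignoring the final scaling), in line with the count on the slide. The inverse reuses the same passes with negated coefficients, and everything runs in the one buffer `x`.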
54
Inverse Transform
[Figure: the lifting lattice run in reverse, recovering the original samples x0-x8 from low-pass L0-L4 and high-pass H0-H3 with coefficients a, b, c, d]
55
Wavelet Packet Structure
[Figure: 3D wavelet packet decompositions along the x, y, and z axes, with subband labels such as HLL, LHL, HHL]
  • Two-level Mallat
  • Two-level x decomp. + two-level (y,z) Mallat
  • Two-level z decomp. + two-level (x,y) Mallat

56
Quantizer
  • Scalar quantizer with a deadzone

[Figure: quantizer axis marked 2Δ and Δ; the output is a quantized magnitude plus a sign]
57
Block Entropy Coding
  • Segment each subband into blocks
  • Bitplane entropy encode each block
  • Split each coefficient into bits, group the bits
    of the same significance into a bitplane
  • Tree based encoder
  • Golomb-Rice coder
  • Context adaptive arithmetic coder
  • Assemble the embedded coded bitstream
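The bitplane split described above can be sketched for integer (already quantized) coefficients. The entropy coding stage (tree-based, Golomb-Rice, or arithmetic) is omitted, and the function names are hypothetical.

```python
def to_bitplanes(coeffs, num_planes):
    """Return (signs, planes); planes[0] is the most significant bitplane."""
    signs = [0 if c >= 0 else 1 for c in coeffs]
    mags = [abs(c) for c in coeffs]
    planes = [[(m >> p) & 1 for m in mags]
              for p in range(num_planes - 1, -1, -1)]
    return signs, planes

def from_bitplanes(signs, planes, num_planes):
    """Rebuild coefficients from however many leading planes have arrived;
    bits in planes not yet received are treated as 0."""
    mags = [0] * len(signs)
    for plane in planes:
        mags = [(m << 1) | b for m, b in zip(mags, plane)]
    shift = num_planes - len(planes)
    return [(-1 if s else 1) * (m << shift) for m, s in zip(mags, signs)]

signs, planes = to_bitplanes([5, -2, 0, 7], num_planes=3)
print(planes)                                   # [[1, 0, 0, 1], [0, 1, 0, 1], [1, 0, 0, 1]]
print(from_bitplanes(signs, planes[:1], 3))     # [4, 0, 0, 4]
print(from_bitplanes(signs, planes, 3))         # [5, -2, 0, 7]
```

Decoding only the leading planes yields a coarser version of every coefficient, which is what makes the coded bitstream embedded: it can be cut at any plane boundary.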

58
Smart-Rebinning
  • Problem
  • 3D wavelet < reference coder (in performance)?
  • Approaches
  • Pan compensation (Taubman and Zakhor)
  • Register-warp 3D ASWT (Wang et al.)
  • Block matching (Ohm)
  • Block MC without filling the holes (Tham et al.)
  • Our solution
  • Data rearrangement

59
Horizontal Shot Alignment
60
Smart-rebinned Data Volume
61
Smart-rebinning Process
62
Smart-rebinned Data Volume
Original data volume
Part of the rebinned data volume
63
Cross-Panorama Correlation
Pan. 0 Pan. 10 Pan. 20
Pan. 30 Pan. 40
  • Well aligned objects
  • Gradual parallax transition

64
Arbitrary Region of Support
  • Cause
  • Environmental depth variation
  • Solutions
  • Simple rebinning
  • Restrict all horizontal translation to be the
    same
  • Padding
  • Wavelet coding with arbitrary region of support

65
PSNR-Y Results
PSNR-Y (dB) by algorithm and test dataset:

Algorithm                                        LOBBY (0.2bpp)  LOBBY (0.12bpp)  KIDS (0.4bpp)  KIDS (0.24bpp)
MPEG-2                                           32.2            30.4             30.1           28.3
3D Wavelet                                       31.9            30.0             29.4           27.3
RBC                                              32.8            29.8             31.5           28.7
Simple rebinning                                 35.5            33.6             32.8           30.5
Smart-rebinning (padding)                        36.0            34.0             33.4           31.1
Smart-rebinning (arbitrary shape wavelet codec)  36.3            34.3             33.8           31.3
66
Just-in-time Rendering
  • Challenges
  • Decode & render wavelet-compressed concentric
    mosaics just-in-time (JIT)

67
Selective 3D Wavelet Decompression System
Bottleneck
. . .
. . .
Bitstream parsing
68
Why is Partial Synthesis Slow?
  • Full decompression
  • 2x 4

69
Progressive Inverse Wavelet Synthesis
  • Use caching to avoid duplication
  • Provide random data access
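The effect of caching during on-demand inverse synthesis can be illustrated with a much simpler transform. The sketch below swaps in a two-level unnormalized Haar wavelet (low = (a+b)/2, high = (a-b)/2) for the 9-7 filter and uses a plain dictionary cache; it is not the PIWS state machine from the slides, and the class name `LazyHaarSynthesis` is hypothetical.

```python
class LazyHaarSynthesis:
    """Reconstruct single samples on demand, caching every intermediate
    low-pass value so that no inverse step is computed twice."""

    def __init__(self, approx, details):
        self.approx = approx      # coarsest low-pass band
        self.details = details    # high-pass bands, coarsest first
        self.cache = {}           # (level, index) -> low-pass value
        self.computed = 0         # number of inverse steps performed

    def _low(self, level, i):
        if level == len(self.details):        # coarsest band is stored
            return self.approx[i]
        key = (level, i)
        if key not in self.cache:
            parent = self._low(level + 1, i // 2)
            d = self.details[len(self.details) - 1 - level][i // 2]
            self.cache[key] = parent + d if i % 2 == 0 else parent - d
            self.computed += 1
        return self.cache[key]

    def sample(self, i):
        return self._low(0, i)

# Two-level Haar coefficients of [1, 3, 5, 7, 2, 4, 6, 8]
synth = LazyHaarSynthesis(approx=[4, 5], details=[[-2, -2], [-1, -1, -1, -1]])
first = synth.sample(0)        # 2 inverse steps
second = synth.sample(1)       # only 1 more: the shared parent is cached
print(first, second, synth.computed)   # 1 3 3
```

Nearby accesses share most of their intermediate values, so the cost of each additional sample drops once its neighbours have been decoded.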

70
1D PIWS
  • State-transition machine

[Figure: state-transition diagram with states 0, 1, 2]
71
Data Access with 1D PIWS
  • Guarantee minimum calculation
  • Great saving if adjacent access requests are near

[Figure: data access example with states 0, 1, 2]
72
PIWS in Concentric Mosaics
High-pass and low-pass coefficients are
interleaved.
73
Multi-scale PIWS
Rendering Engine
PIWS Engine, Level 1
Selective Decoder
PIWS Engine, Level 2
Selective Decoder
Bitstream
74
3 Movement and Slits Access Pattern
75
Overall Rendering Speed: VQ, RBC vs. PIWS

Movement  Codec  PS (frames/sec)  BI (frames/sec)
RT        VQ     19.7             16.3
RT        RBC    16.8             13.9
RT        PIWS   17.6             14.3
FB        VQ     19.0             15.8
FB        RBC    16.4             13.9
FB        PIWS   15.8             13.5
ST        VQ     17.7             14.9
ST        RBC    12.5             10.9
ST        PIWS   7.9              7.3
76
Subjective Evaluation: VQ (12:1)
77
Subjective Evaluation: RBC (100:1)
78
Subjective Evaluation: 3D Wavelet (200:1)
79
Demo
80
Flexible Environment Compression
  • Developed a number of IBR coding approaches
  • All with JIT decoding & rendering capability
  • Block coding
  • Limited compression ratio (6:1-25:1)
  • Simple
  • Reference coding
  • Derived from MPEG
  • An order of magnitude more compression than the
    block coding approach
  • High dimensional transform (3D wavelet) coding
  • 2x to 4x higher compression ratio than reference
    coding
  • Still able to perform JIT rendering

81
Flexible Audio Compression
82
Features
  • Versatile
  • Lossless: as good as Monkey's Audio
  • Lossy: matches or exceeds the best audio codecs found

Original
MP4 TwinVQ (9.1kbps)
EAC (8kbps)
MP3 (17.8kbps)
83
Conclusion
  • Flexible media compression
  • Not only high compression ratio
  • But also manipulatable bitstream syntax
  • Flexible media compression is doable
  • Innovative technologies are yet to be invented
    (there is work to be done)