1
Context-based Data Compression
Xiaolin Wu, Polytechnic University, Brooklyn, NY
  • Part 3. Context modeling

2
Context model: estimated symbol probability
  • Variable-length coding schemes need an estimate of
    the probability of each symbol - the model
  • The model can be:
  • Static - a fixed global model for all inputs
    (e.g. English text)
  • Semi-adaptive - computed for the specific data being
    coded and transmitted as side information
    (e.g. C programs)
  • Adaptive - constructed on the fly
    (any source!)
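The adaptive case can be sketched as a model that encoder and decoder update symmetrically after each symbol. This is an illustrative sketch, not from the slides; the class name and the initial count of 1 per symbol (to avoid zero probabilities) are assumptions:

```python
from collections import Counter

class AdaptiveModel:
    """Order-0 adaptive model: symbol probabilities are learned on the fly."""

    def __init__(self, alphabet):
        # Start from a uniform model: one (pseudo-)count per symbol.
        self.counts = Counter({s: 1 for s in alphabet})
        self.total = len(alphabet)

    def prob(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # After coding each symbol, encoder and decoder update identical
        # counts, so no side information needs to be transmitted.
        self.counts[symbol] += 1
        self.total += 1

m = AdaptiveModel("ab")
p_before = m.prob("a")   # 1/2 under the uniform starting model
m.update("a")
p_after = m.prob("a")    # 2/3 after observing one 'a'
```

Because both sides perform the same update after every decoded symbol, the model stays synchronized in a single pass, which is exactly the adaptive advantage listed above.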

3
Adaptive vs. Semi-adaptive
  • Advantages of semi-adaptive
  • Simple decoder
  • Disadvantages of semi-adaptive
  • Overhead of specifying the model can be high
  • Two passes over the data are required
  • Advantages of adaptive
  • One pass → universal → as good, if not better
  • Disadvantages of adaptive
  • Decoder is as complex as the encoder
  • Errors propagate

4
Adaptation with Arithmetic and Huffman Coding
  • Huffman coding - the Huffman tree is manipulated on
    the fly; efficient algorithms are known, but they
    remain complex.
  • Arithmetic coding - only the cumulative probability
    distribution table is updated; efficient data
    structures and algorithms are known, and the rest of
    the coder is essentially unchanged.
  • Main advantage of arithmetic over Huffman is the
    ease by which the former can be used in
    conjunction with adaptive modeling techniques.
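One common realization of the "efficient data structure" for maintaining cumulative counts is a Fenwick (binary indexed) tree, giving logarithmic-time updates and prefix sums. The slides do not name a specific structure, so this is an illustrative sketch:

```python
class CumulativeCounts:
    """Fenwick tree over symbol counts, as used by adaptive arithmetic
    coders to answer 'total count of all symbols below s' quickly."""

    def __init__(self, n_symbols):
        self.n = n_symbols
        self.tree = [0] * (n_symbols + 1)   # 1-indexed internally
        for s in range(n_symbols):          # start with count 1 each
            self.add(s, 1)

    def add(self, symbol, delta):
        # O(log n) count update after coding a symbol.
        i = symbol + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def cum(self, symbol):
        # O(log n) total count of all symbols strictly below `symbol`.
        i, total = symbol, 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

c = CumulativeCounts(4)
low = c.cum(2)      # counts of symbols 0 and 1: 1 + 1 = 2
c.add(1, 3)         # symbol 1 observed more often
low2 = c.cum(2)     # now 1 + 4 = 5
```

The encoder maps a symbol to the interval [cum(s), cum(s + 1)) of the running total, then calls `add` - the per-symbol cost stays logarithmic even for large alphabets.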

5
Context models
  • If the source is not i.i.d., there are complex
    dependencies between symbols in the sequence.
  • In most practical situations, the pdf of a symbol
    depends on neighboring symbol values - i.e. its
    context.
  • Hence we condition the encoding of the current
    symbol on its context.
  • How to select contexts? - Rigorous answer beyond
    our scope.
  • Practical schemes use a fixed neighborhood.
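The simplest fixed neighborhood is the single previous symbol (an order-1 context). A minimal sketch of conditioning on that context via occurrence counts (the function name and the unsmoothed counting are illustrative choices):

```python
from collections import defaultdict, Counter

def order1_probs(seq):
    """Estimate P(symbol | previous symbol) from occurrence counts,
    i.e. a separate probability table per one-symbol context."""
    counts = defaultdict(Counter)
    for prev, cur in zip(seq, seq[1:]):
        counts[prev][cur] += 1
    # Normalize the counts within each context into probabilities.
    return {ctx: {s: n / sum(c.values()) for s, n in c.items()}
            for ctx, c in counts.items()}

p = order1_probs("abababab")
# In this sequence 'a' is always followed by 'b', so the conditional
# probability concentrates completely within the 'a' context.
```

An order-k model simply extends the context key to the previous k symbols, at the cost of exponentially more tables - which is where the context dilution problem of the next slide comes in.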

6
Context dilution problem
  • The minimum code length of a sequence x_1 x_2 … x_n
    achievable by arithmetic coding is
    -log2 P(x_1 x_2 … x_n) bits, if P(x_1 x_2 … x_n)
    is known.
  • The difficulty of estimating P(x_n | x_n-1, …, x_1)
    due to insufficient sample statistics prevents
    the use of high-order Markov models.
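Both points can be illustrated numerically. The first function assumes a known i.i.d. model purely for simplicity; the context count below shows how fast the number of contexts grows with model order:

```python
import math

def ideal_code_length(seq, probs):
    """-log2 P(sequence) in bits: the minimum code length arithmetic
    coding can approach when the model P is known (i.i.d. assumed
    here for illustration)."""
    return sum(-math.log2(probs[s]) for s in seq)

bits = ideal_code_length("aab", {"a": 0.5, "b": 0.5})  # 3 symbols x 1 bit

# Context dilution: an order-k Markov model over an alphabet of size m
# must estimate a distribution in m**k distinct contexts.
contexts = [256 ** k for k in range(4)]  # byte alphabet, orders 0..3
```

Already at order 2 a byte-alphabet model has 65,536 contexts to populate with reliable counts, which is why high-order models starve for samples.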

7
Estimating probabilities in different contexts
  • Two approaches:
  • Maintain symbol occurrence counts within each
    context
  • the number of contexts must be kept modest to avoid
    context dilution
  • Assume the pdf within each context has the same
    shape (e.g. Laplacian), with only the parameters
    (e.g. mean and variance) differing
  • estimation may be less accurate, but a much larger
    number of contexts can be used
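The second, parametric approach can be sketched by fitting only a Laplacian scale parameter per context. This sketch assumes zero-mean data (as is typical for prediction residuals); the function and context names are illustrative:

```python
from collections import defaultdict

def laplacian_scale_by_context(pairs):
    """For each context, estimate only the Laplacian scale
    b = mean(|x|) (zero mean assumed). One parameter per context
    instead of a full histogram needs far fewer samples, so many
    more contexts can be used."""
    residuals = defaultdict(list)
    for ctx, x in pairs:
        residuals[ctx].append(x)
    return {ctx: sum(abs(x) for x in xs) / len(xs)
            for ctx, xs in residuals.items()}

b = laplacian_scale_by_context([("smooth", 1), ("smooth", -1),
                                ("edge", 8), ("edge", -4)])
# Small scale in the "smooth" context, large scale near an "edge".
```

The coder then uses the fitted scale to generate the per-context distribution, trading some per-context accuracy for a much finer context partition.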

8
Entropy (Shannon 1948)
  • For a random variable X with alphabet A_X, the
    entropy is the average self-information:
    H(X) = -Σ P(x) log2 P(x), summed over x in A_X.
9
Conditional Entropy
  • Consider two random variables X and Y
  • Alphabet of X: A_X = {x_0, x_1, …, x_N-1}
  • Alphabet of Y: A_Y = {y_0, y_1, …, y_M-1}
  • The conditional self-information of x given y is
    i(x|y) = -log2 P(x|y)
  • Conditional entropy H(X|Y) is the average value of
    the conditional self-information:
    H(X|Y) = -Σ_x Σ_y P(x, y) log2 P(x|y)

10
Entropy and Conditional Entropy
  • The conditional entropy H(X|Y) can be interpreted
    as the amount of uncertainty remaining about X,
    given that we know the random variable Y.
  • The additional knowledge of Y should reduce the
    uncertainty about X: H(X|Y) ≤ H(X).
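Both quantities, and the inequality between them, can be checked numerically with plug-in estimates from sample frequencies; this is an illustrative sketch:

```python
import math
from collections import Counter

def entropy(xs):
    """H(X) = -sum p(x) log2 p(x), estimated from sample frequencies."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def conditional_entropy(pairs):
    """H(X|Y) = -sum p(x, y) log2 p(x|y) over observed (x, y) pairs."""
    n = len(pairs)
    pxy = Counter(pairs)
    py = Counter(y for _, y in pairs)
    # p(x|y) = count(x, y) / count(y)
    return -sum((c / n) * math.log2(c / py[y]) for (_, y), c in pxy.items())

# X is fully determined by Y here, so knowing Y removes all uncertainty.
pairs = [(0, "a"), (1, "b"), (0, "a"), (1, "b")]
hx = entropy([x for x, _ in pairs])    # 1.0 bit: X is fair 0/1
hxy = conditional_entropy(pairs)       # 0.0 bits given Y
```

This is precisely why context-based coders aim at the conditional entropy rather than the marginal one: the context Y can only lower the achievable rate.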

11
Context Based Entropy Coders
  • Consider a sequence of symbols x_1, x_2, …, x_n,
    each coded with a probability estimate conditioned
    on its context.

12
Decorrelation techniques to exploit sample
smoothness
  • Transforms
  • DCT, FFT
  • wavelets
  • Differential Pulse Code Modulation (DPCM)
  • predict current symbol with past observations
  • code prediction residual rather than the symbol
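The DPCM idea above can be sketched minimally; the previous-sample predictor used here is the simplest illustrative choice, not necessarily the one intended by the slides:

```python
def dpcm_encode(samples):
    """Predict each sample by the previous one and emit the residual.
    For smooth data the residuals cluster near zero, so they have
    lower entropy than the raw samples."""
    prev, residuals = 0, []
    for s in samples:
        residuals.append(s - prev)
        prev = s
    return residuals

def dpcm_decode(residuals):
    """Invert the prediction: accumulate residuals back into samples."""
    prev, out = 0, []
    for r in residuals:
        prev += r
        out.append(prev)
    return out

r = dpcm_encode([10, 12, 13, 13, 11])   # [10, 2, 1, 0, -2]
restored = dpcm_decode(r)               # original samples back
```

The residual stream, not the samples, is what gets entropy-coded; its concentrated distribution is what the Laplacian context models of slide 7 are fitted to.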

13
Benefits of prediction and transform
  • A priori knowledge exploited to reduce the
    self-entropy of the source symbols
  • Higher coding efficiency due to
  • Fewer parameters to be estimated adaptively
  • Faster convergence of adaptation

14
Further Reading
  • Text Compression - T. Bell, J. Cleary and I.
    Witten, Prentice Hall. Good coverage of statistical
    context modeling, though focused on text.
  • Articles in IEEE Transactions on Information
    Theory by Rissanen and Langdon.
  • Digital Coding of Waveforms: Principles and
    Applications to Speech and Video - Jayant and
    Noll. Good coverage of predictive coding.