Title: Visual Optimizations for Wavelet-Based Lossy Image Compression
1Visual Optimizations for Wavelet-Based Lossy Image Compression
- Damon Chandler
- Visual Communications Lab
- School of Electrical and Computer Engineering
2Lossy vs. Lossless Compression
- Compression: Represent a signal using fewer bits/sample.
- Lossless compression: Invertible; reconstructed signal = original signal. GIF, PNG, TIFF, PKZip, gzip, StuffIt.
- Lossy compression: Non-invertible; reconstructed signal ≠ original signal. MP3, JPEG, JPEG-2000.
3Lossy vs. Lossless Compression
- Focus only on lossy compression of images.
- Focus only on natural images.
- Focus only on grayscale natural images.
4Lossy Image Compression
- Pixel-value amplitude quantization
8 bits/pixel (256 shades of gray)
2 bits/pixel (4 shades of gray)
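A minimal sketch of pixel-value amplitude quantization, going from 8 bits/pixel to 2 bits/pixel. The function name `quantize_gray` and the choice to reconstruct at the bin centre are assumptions made for illustration; the slides do not say how the 4 gray levels were placed.

```python
def quantize_gray(pixels, bits):
    """Requantize 8-bit gray values to 2**bits levels (hypothetical helper)."""
    levels = 2 ** bits
    step = 256 // levels                      # width of each quantization bin
    out = []
    for p in pixels:
        index = p // step                     # bin index: the only value stored
        out.append(index * step + step // 2)  # reconstruct at the bin centre
    return out


row = [0, 37, 64, 100, 128, 200, 255]
print(quantize_gray(row, 2))  # [32, 32, 96, 96, 160, 224, 224]
```

With `bits=2` every pixel lands on one of 4 gray values, which is what produces the visible banding in the 2 bits/pixel image.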
6Lossy Image Compression
- Transmit every other row and column
8 bits/pixel (1/4 size)
2 bits/pixel required to reconstruct
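The rate arithmetic behind this slide can be sketched as follows: keeping every other row and column leaves 1/4 of the pixels, so 8 bits/pixel on the kept samples averages to 2 bits/pixel over the full image. The helper names are hypothetical, and pixel replication is used as the simplest stand-in for whatever interpolation the slides used.

```python
def subsample(image):
    """Keep every other row and every other column."""
    return [row[::2] for row in image[::2]]


def upsample_nearest(small, height, width):
    """Rebuild a full-size image by repeating each kept pixel (an assumption;
    the slides do not state which interpolation was used)."""
    return [[small[r // 2][c // 2] for c in range(width)] for r in range(height)]


image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
small = subsample(image)
kept = sum(len(row) for row in small)
total = sum(len(row) for row in image)
rate = 8 * kept / total          # bits per pixel of the full-size image

print(small)  # [[10, 30], [90, 110]]
print(rate)   # 2.0
print(upsample_nearest(small, 4, 4)[1])  # [10, 10, 30, 30]
```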
7Lossy Image Compression
2 bits/pixel (dithered)
2 bits/pixel (interpolated)
8Lossy Image Compression
Goal: Represent the image using building blocks such that
- The correlation between elements is reduced.
- Quantization of element amplitudes produces visually pleasing artifacts.
2 bits/pixel (dithered)
2 bits/pixel (interpolated)
9Discrete Wavelet Transform
- Invertible, linear transform.
- Represents an image as sum of little waves.
- Localized in space and frequency.
- Subband filtering implementation.
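To make the properties above concrete, here is a one-level, one-dimensional wavelet transform written as a two-channel subband filter bank. It uses the Haar wavelet purely because it fits in a few lines; that choice, and the function names, are assumptions for illustration and not the filters used in the talk.

```python
import math

S = math.sqrt(2.0)


def haar_forward(signal):
    """Split an even-length signal into low-pass and high-pass subbands."""
    low = [(signal[i] + signal[i + 1]) / S for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / S for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Exactly undo haar_forward (the transform is invertible)."""
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / S, (a - d) / S])
    return out


x = [4.0, 2.0, 5.0, 5.0]
low, high = haar_forward(x)
print(high)                     # the second pair is flat, so its detail is 0
print(haar_inverse(low, high))  # recovers x up to floating-point rounding
```

Each high-pass coefficient depends on only two neighbouring samples, which is the "localized in space" property; applying `haar_forward` again to `low` would give the next coarser scale.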
10Discrete Wavelet Transform
11Discrete Wavelet Transform
12Discrete Wavelet Transform
13Discrete Wavelet Transform
14Quantization of Subband Coefficient Amplitudes
15Quantization of Subband Coefficient Amplitudes
Goal: Quantize the subbands in a visually optimal manner.
- Most image compression methods minimize MSE (i.e., minimize the Euclidean distance between original and reconstructed images).
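A short sketch of why minimizing MSE is the same as minimizing Euclidean distance: MSE is the squared distance divided by the number of pixels. The pixel values below are made up for illustration; they also show that two very different error patterns can share one MSE.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equal-length pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def euclidean_distance(original, reconstructed):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))


original = [100, 100, 100, 100]
concentrated = [104, 100, 100, 100]   # one large error
spread = [102, 98, 102, 98]           # four small errors

for candidate in (concentrated, spread):
    d = euclidean_distance(original, candidate)
    # MSE equals distance squared over N, so both rank images identically.
    print(mse(original, candidate), d * d / len(original))  # 4.0 4.0
```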
16Quantization of Subband Coefficient Amplitudes
Original
Quantized finest scale (MSE = 118)
Quantized mid. scale (MSE = 118)
17Quantization of Subband Coefficient Amplitudes
To maximize visual quality:
- Need to understand human visual sensitivity to the distortions.
- Need to understand visual processing of the image.
18Outline
- Background: Visual detection, contrast sensitivity, visual summation.
- Experiments: Detection of wavelet quantization distortions, contrast matching.
- Application: Contrast-based quantization.
- Work in progress: State-space of natural images, medical-image compression.
19Contrast Metrics
- Simple contrast
- Michelson contrast
- RMS contrast
20Contrast Metrics
- Contrast detection threshold (CT): Minimum contrast required to visually detect a target.
- Contrast sensitivity = 1 / CT.
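The formulas on these slides did not survive in the transcript, so the sketch below uses common textbook definitions as an assumption: Michelson contrast as (Lmax - Lmin) / (Lmax + Lmin) and RMS contrast as the standard deviation of luminance divided by its mean. "Simple contrast" is left out because its definition varies between sources. Function names are hypothetical.

```python
import math


def michelson_contrast(luminances):
    """(Lmax - Lmin) / (Lmax + Lmin), assumed textbook definition."""
    l_max, l_min = max(luminances), min(luminances)
    return (l_max - l_min) / (l_max + l_min)


def rms_contrast(luminances):
    """Standard deviation of luminance over mean luminance (assumed definition)."""
    mean = sum(luminances) / len(luminances)
    variance = sum((l - mean) ** 2 for l in luminances) / len(luminances)
    return math.sqrt(variance) / mean


def contrast_sensitivity(threshold):
    """Sensitivity is the reciprocal of the contrast detection threshold."""
    return 1.0 / threshold


pattern = [5.0, 15.0, 5.0, 15.0]       # luminances in cd/m^2, made-up values
print(michelson_contrast(pattern))      # 0.5
print(rms_contrast(pattern))            # 0.5
print(contrast_sensitivity(0.01))       # a CT of 1% gives a sensitivity of 100
```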
21Contrast Sensitivity Function
From Peli 1996
23- Targets: Wavelet subband quantization distortions at 1.15, 2.3, 4.6, 9.2, 18.4 c/deg.
- Backgrounds (masks):
- Unmasked (10.1 cd/m² gray)
- Natural-image maskers
24Experiment 1: Detection of Simple Wavelet Distortions
- Apparatus
- HP4033A 21-inch monitor.
[Figure: display luminance (cd/m²) as a function of pixel value]
- Stimuli
- Subbands obtained using 9/7 Daubechies filters.
- Distortions via uniform scalar quantization of one subband.
- Two natural images: balloon and horse.
- Procedures
- 3AFC paradigm (choose the odd one out).
- Interleaved conditions (alternate images).
25DWT
Quantize one band
DWT⁻¹
Subtract orig. image
Add 128
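The stimulus-generation steps on this slide can be sketched in one dimension: transform, uniformly quantize one subband, inverse transform, subtract the original, and add 128 so the distortion sits on a mid-gray pedestal. A one-level Haar transform stands in for the 9/7 filters here, and the step size and pixel values are made up; both are assumptions for illustration only.

```python
import math

S = math.sqrt(2.0)


def haar_forward(x):
    low = [(x[i] + x[i + 1]) / S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / S for i in range(0, len(x), 2)]
    return low, high


def haar_inverse(low, high):
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / S, (a - d) / S])
    return out


def uniform_quantize(coeffs, step):
    """Uniform scalar quantizer: snap each coefficient to a multiple of step."""
    return [step * round(c / step) for c in coeffs]


def distortion_signal(pixels, step):
    low, high = haar_forward(pixels)                   # DWT
    high_q = uniform_quantize(high, step)              # quantize one band only
    distorted = haar_inverse(low, high_q)              # inverse DWT
    return [d - p + 128 for d, p in zip(distorted, pixels)]  # subtract, add 128


pixels = [100.0, 104.0, 50.0, 90.0]
print(distortion_signal(pixels, 10.0))
```

Because only the high-pass band is altered, the distortion averages to exactly 128 in this sketch; a larger `step` gives a higher-contrast distortion.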
28Experiment 1: Unmasked Detection Thresholds
29Experiment 1: Masked Detection Thresholds
30Experiment 1: Threshold Elevations
31Experiment 1: Observations
- CTs vary with spatial frequency.
- Unmasked CSF for wavelet distortions is much more low-pass than the CSF for gratings.
- Differences in bandwidth?
- Similar results for 1-octave Gabors (Peli 93) and for wavelet noise (Watson 97).
- Masked CSF shows the greatest threshold elevations at lower spatial frequencies.
- Attributable to 1/f spectrum?
32Compound Wavelet Distortions
33Quick's Vector Magnitude Summation
(Graham 1977, 1989)
Relative Contrast
Compound at threshold
Relative Contrast Threshold
With N = 2 components
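The equations on these slides are missing from the transcript. As a hedged sketch, the standard form of Quick's vector-magnitude (Minkowski) pooling says a compound is at threshold when the relative contrasts C_i / CT_i, each raised to an exponent and summed, reach 1. The exponent name `beta` and both function names are assumptions made here.

```python
def pooled_response(relative_contrasts, beta):
    """Minkowski sum of relative contrasts; the compound is at threshold at 1."""
    return sum(r ** beta for r in relative_contrasts) ** (1.0 / beta)


def equal_component_threshold(n, beta):
    """Relative contrast each of n equal components needs at compound threshold.

    Solves n * r**beta = 1 for r.
    """
    return n ** (-1.0 / beta)


for beta in (1.0, 2.0, 4.0):
    r = equal_component_threshold(2, beta)
    # beta = 1: linear summation, each component at half its own threshold.
    # Larger beta: less summation, r moves toward 1.
    print(beta, round(r, 4), round(pooled_response([r, r], beta), 4))
```

With N = 2 components, `beta = 1` gives r = 0.5, `beta = 2` (energy summation) gives about 0.7071, and `beta = 4` gives about 0.8409.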
41Relative Contrast Space
42Experiment 2: Detection of Compound Wavelet Distortions
- Apparatus
- HP4033A 21-inch monitor
- Stimuli
- Subbands obtained using 9/7 Daubechies filters.
- Distortions via uniform scalar quantization of two subbands.
- Two natural images: balloon and horse.
- Procedures
- 3AFC paradigm (choose the odd one out).
- Interleaved conditions (alternate images).
43DWT
Quantize two bands
DWT⁻¹
Subtract orig. image
Add 128
44H+V, 4.6 c/deg
H+V, 2.3 c/deg
H+V, 1.15 c/deg
H, 2.3 + 4.6 c/deg
H, 1.15 + 2.3 c/deg
V, 2.3 + 4.6 c/deg
V, 1.15 + 2.3 c/deg
46Experiment 2: Unmasked Summation, H+V
47Experiment 2: Unmasked Summation, SF
48Experiment 2: Masked Summation, H+V
49Experiment 2: Masked Summation, SF
50Experiment 2: Observations
- Unmasked summation more in line with probability summation.
- β ≈ 4-5
- Non-significant effect of spatial frequency or orientation.
- Masked summation closer to energy or linear summation.
- β ≈ 1.5
- Off-frequency looking (channel switching)?
- Natural-image backgrounds activate higher levels?
51Image with Compound Distortions at Threshold (β = ∞)
52Image with Compound Distortions at Threshold (β = 1.5)
53What about suprathreshold distortions?
- Can use CSF to generate an image with distortions at threshold.
- Detection thresholds reveal little about visual responses at suprathreshold contrasts.
- Contrast constancy (Georgeson et al. 1975, Brady et al. 1995).
54Contrast Constancy: Unmasked
55Contrast Constancy: Masked
56Scaled CSF proportions
Contrast-matching proportions
Total C = 0.18
57What about suprathreshold distortions?
- Must consider visual processing of image.
58Visual Scale-Space Integration
- Natural images have many low frequencies.
- Global-to-local analysis
- Global Precedence (Navon 1977)
- Global structure influences processing of local structure (Schyns & Oliva 1994, Hucka & Kaplan 1996).
- Edges tend to show power across scales
- Perception of edge-structure requires a continuum across scale-space (Witkin 1983).
- Global-to-local integration across scale-space (Hayes 1989).
- Edge detection algorithms
- Marr & Hildreth (no integration) vs. Canny (integration).
60Coarse
Intermediate
Fine
67What about suprathreshold distortions?
- Proportion the contrasts to preserve continuity across scale space.
CSNR = C_image / C_distortions
70Edge-Preserving Ratios
- Contrast Signal-to-Noise Ratio: CSNR(s, θ) = C_image(s, θ) / C_distortions(s, θ)
C_image(s, θ): RMS contrast of image at scale s, orientation θ
C_distortions(s, θ): RMS contrast of distortions at scale s, orientation θ
71Edge-Preserving Ratios
- Model best CSNR curve at a given visual distortion (VD)
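A small sketch of how the contrast signal-to-noise ratio might be used per subband: compute the ratio from the two RMS contrasts, or turn a target ratio into the distortion contrast a subband may carry. The subband labels, contrast values, and function names are all hypothetical.

```python
def csnr(c_image, c_distortions):
    """Contrast SNR: RMS contrast of the image over that of the distortions."""
    return c_image / c_distortions


def allowed_distortion_contrast(c_image, target_csnr):
    """Distortion contrast that yields target_csnr for a subband."""
    return c_image / target_csnr


# (scale, orientation) -> (image contrast, distortion contrast); made-up numbers
subbands = {
    ("coarse", "H"): (0.08, 0.02),
    ("fine", "H"): (0.02, 0.02),
}
for key, (c_img, c_dist) in subbands.items():
    print(key, csnr(c_img, c_dist))

# The same distortion contrast gives a much lower CSNR in the fine band,
# because the image itself has less contrast there.
print(allowed_distortion_contrast(0.08, 4.0))
```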
72Application to Compression
- Display model
- Relate MSE and RMS contrast in the image
T.S.A.
- MSE in subband at (s, θ)
80- Measure C_image(s, θ) or estimate using [equation]
- Compute CSNR(s, θ, VD) using [equation]
- Compute D(s, θ, VD) using [equation]
Visually tuned distortions for scalable coding
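The equations for these steps are not in the transcript, so the following is only one plausible reading, under stated assumptions: the display maps pixel value p to luminance L(p) = (b + k p)^gamma, the distortion is small enough for a first-order expansion of L around the mean pixel value, and D is the target MSE. The parameter values and function names are hypothetical, and filter gains between the subband and image domains are ignored.

```python
def luminance(pixel, b=0.0, k=0.03, gamma=2.2):
    """Assumed display model: luminance as a power function of pixel value."""
    return (b + k * pixel) ** gamma


def mse_for_contrast(c_distortions, mean_pixel, b=0.0, k=0.03, gamma=2.2):
    """MSE (pixel units squared) giving the requested RMS distortion contrast.

    First-order step: sigma_L ~= L'(mu) * sigma_pixel and C = sigma_L / L(mu),
    with L(mu) / L'(mu) = (b + k * mu) / (gamma * k) for this display model.
    """
    sigma_pixel = c_distortions * (b + k * mean_pixel) / (gamma * k)
    return sigma_pixel ** 2


def target_distortion(c_image, target_csnr, mean_pixel):
    """Target MSE for a subband from its image contrast and a target CSNR."""
    return mse_for_contrast(c_image / target_csnr, mean_pixel)


# Made-up example: image contrast 0.088 in a subband, target CSNR of 4.
print(target_distortion(0.088, 4.0, mean_pixel=128.0))
```

A lower target CSNR (more visible distortion allowed) yields a larger target MSE, which is how a contrast-domain rule could be handed to an MSE-driven coder.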
82JPEG-2000
JPEG-2000 Contrast-Based
at 0.4 bits/pixel
83JPEG-2000 WMSE
JPEG-2000 Contrast-Based
at 0.25 bits/pixel
84JPEG-2000 WMSE
JPEG-2000 Contrast-Based
at 0.1 bits/pixel
85JPEG-2000 WMSE
JPEG-2000 Contrast-Based
at 0.1 bits/pixel
89Conclusions and Future Work
- Can generate better-looking compressed images by accounting for properties of vision.
- Image and rate adaptive via CSNR(VD).
- Implemented using MSE.
- Integrates easily with embedded coders.
90Conclusions and Future Work
- Spatially local contrast-based quantization.
91Conclusions and Future Work
- Quantify VD using psychophysical scaling.
92Future Work: Visual Distortion Metric
Same MSE
Pixel-value quantized
Pixel-value quantized, dithered
Interpolated
96Conclusions and Future Work
- Compression of medical images.
97Future Work: Medical Image Compression
99Conclusions and Future Work
- Can generate better-looking compressed images by accounting for properties of vision.
- Image and rate adaptive via CSNR(VD).
- Display adaptive via contrast.
- Integrates easily with embedded coders.
- Spatially local contrast-based quantization.
- Compression of medical images.
- Quantify VD using psychophysical scaling.
- Better model of natural images.
100Future Work: Better Building Blocks?
- Ultimate coder
- Digital image compression: One index for every image that would be encountered.
- Efficient visual encoding: One cell for every image that would be encountered.
- Enormous memory requirements.
- Can we find a compromise between memory and efficiency?
- Provide an index (or cell) only for the most likely image structures.
101Future Work: Better Building Blocks?
- What are the most likely natural-image structures?
102Future Work: Better Building Blocks?
- What are the most likely 8x8 natural-image structures?
103Future Work: Better Building Blocks?
- Using 260K unique 8x8 puzzle pieces
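The memory-versus-efficiency trade-off can be put in numbers. Assuming each block is sent as a fixed-length index into a table of pieces (an assumption; the slides do not describe the coding), the rate is the index length divided by the pixels per block. Function names are hypothetical.

```python
import math


def index_bits(num_pieces):
    """Bits for a fixed-length index into a table of num_pieces entries."""
    return math.ceil(math.log2(num_pieces))


def bits_per_pixel(num_pieces, block_size):
    """Rate when every block_size x block_size block is sent as one index."""
    return index_bits(num_pieces) / (block_size * block_size)


# One index for every possible 8-bit 8x8 block: 256**64 entries.
all_blocks = 256 ** 64
print(all_blocks.bit_length() - 1)   # 512 bits per block, i.e. 8 bits/pixel

# Only the 260K most likely 8x8 pieces.
print(index_bits(260_000))           # 18
print(bits_per_pixel(260_000, 8))    # 0.28125
```

Restricting the table to likely structures cuts the index from 512 bits to 18 bits per block, at the cost of representing every other block only approximately.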
104Backup Slides
105Lossy Image Compression
- Transmit every other row and column
8 bits/pixel (original)
2 bits/pixel (reconstructed)
106Lossy Image Compression
- The pixel values of natural images are highly correlated.
- Scale-invariance model (Field)
- Amplitude(f) ∝ 1 / f
- Autocorrelation model (Girod et al.)
- R(d) ∝ exp(-d)
- Φ(ω) ∝ 1 / ω²
- ⇒ Amplitude(ω) ∝ 1 / ω
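The step from the exponential autocorrelation to the 1/ω amplitude spectrum can be written out as follows. This is a sketch under the assumption of a one-dimensional autocorrelation with decay constant a > 0 (the slide corresponds to a = 1); the symbol names are chosen here, not taken from the slides.

```latex
% Power spectrum = Fourier transform of the autocorrelation (Wiener-Khinchin)
R(d) = e^{-a|d|}
\quad\Longrightarrow\quad
\Phi(\omega) = \int_{-\infty}^{\infty} e^{-a|d|}\, e^{-i\omega d}\, \mathrm{d}d
             = \frac{2a}{a^{2} + \omega^{2}}

% For frequencies well above the decay constant (\omega \gg a):
\Phi(\omega) \approx \frac{2a}{\omega^{2}} \propto \frac{1}{\omega^{2}}
\quad\Longrightarrow\quad
\mathrm{Amplitude}(\omega) = \sqrt{\Phi(\omega)} \propto \frac{1}{\omega}
```

So the autocorrelation model and the scale-invariance model agree on a 1/f amplitude falloff away from the lowest frequencies.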
107Contrast Sensitivity Function
Do these results hold when
- Targets are wavelet subband quantization distortions?
- These distortions are presented against a natural-image background?
108Future Work: Better Building Blocks?
- Using 16K unique 32x32 puzzle pieces
109Future Work: Better Building Blocks?
- Using 65K unique 16x16 puzzle pieces