Title: PACKMAN Texture Compression for Mobile Phones
1PACKMAN Texture Compression for Mobile Phones
Jacob Ström, Tomas Akenine-Möller
Ericsson Research
2Outline
- Motivation and Previous work
- Design Goals
- Basic Idea
- Decompression of a Block
- Compression
- Results
3Why 3D Graphics on a Mobile Phone?
- Games
- Maps, Messaging, Browsing and more...
4Why is 3D Graphics Hard on a Mobile Phone?
- Limited resources
- Small amount of memory
- Little memory bandwidth
- Little chip area for special purpose
- Powered by batteries
5Texture Compression Helps
- Small amount of memory
  - More texture data can fit in the limited amount of memory
- Little memory bandwidth
  - More texturing possible for the same amount of bandwidth
- Little chip area for special purpose
  - A texture cache using compressed data can be made smaller
- Powered by batteries
  - Reduced bandwidth means lower energy consumption
6Previous Work
10Design Goals
- Major Design Goals
- Minimal decompression complexity
- Acceptable quality
- Minor Design Goals
- Low compression complexity
- Small block size
- Reasonable compression (around 4 bpp)
11Early Design Decisions
- Block size of 32 bits was chosen
  - Few bits -> small bit widths after texture cache
  - Fits 32-bit wide buses on systems without texture cache
- No indirect addressing of colors
  - LUT of colors increases latency and complicates hardware
- Implications
  - 2x4 was the only reasonable block size, 4 bpp
12Basic Idea
- The Human Visual System (HVS) is more sensitive to luminance than to chrominance
- In video and still image coding, chrominance information is most often subsampled in the x- and y-directions (MPEG, JPEG, H.263, H.264, etc.)
- Our scheme has basically only one color per 2x4 block. The rest is luminance information
15Basic Idea
- Use only 12 bits to specify a general color for a 2x4 block
- Modify the luminance for each pixel in the block
[Figure: 12-bit general color, per-pixel luminance, resulting image]
16Luminance modification
- Luminance is modified additively
- Example: the general block color is R = 17, G = 34, B = 51, and the luminance modifier for the pixel is 220. The new pixel value is
  R = 17 + 220 = 237
  G = 34 + 220 = 254
  B = 51 + 220 = 255 (after clamping)
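The additive modification above can be sketched in a few lines of Python. The function names `clamp8` and `modify_luminance` are illustrative assumptions and do not come from the slides; the sketch simply adds the same modifier to all three channels and clamps each result to the 8-bit range.

```python
def clamp8(value):
    """Clamp an integer to the 8-bit range [0, 255]."""
    return max(0, min(255, value))


def modify_luminance(color, modifier):
    """Add the same modifier to R, G and B, then clamp each channel."""
    return tuple(clamp8(channel + modifier) for channel in color)


# Example from the slide: general color (17, 34, 51), modifier 220.
print(modify_luminance((17, 34, 51), 220))  # (237, 254, 255)
```

Only the blue channel overflows here (51 + 220 = 271), so it is the only one affected by clamping.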
23How to specify luminance
- Two bits per pixel select the luminance modifier, which is one out of four values.
- Problem: small values (-8, -2, 2, 8)?
  - smooth transitions OK
  - sharp edges bad
- Big values (-255, -127, 127, 255)?
  - smooth transitions bad
  - sharp edges OK
- Solution: a codebook of tables, with one table selected per block.
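The trade-off between small and big modifier values can be illustrated with a short Python sketch. It uses only the two example tables from the slide; the block offsets and the helper names `table_error` and `best_table` are hypothetical, chosen to show why a per-block choice of table helps.

```python
SMALL_TABLE = (-8, -2, 2, 8)
BIG_TABLE = (-255, -127, 127, 255)


def table_error(offsets, table):
    """Sum of squared errors when each offset snaps to its nearest table entry."""
    return sum(min((offset - entry) ** 2 for entry in table) for offset in offsets)


def best_table(offsets, codebook):
    """Index of the codebook table with the lowest error for this block."""
    return min(range(len(codebook)), key=lambda i: table_error(offsets, codebook[i]))


# Hypothetical per-pixel luminance offsets relative to the block's general color.
smooth_block = (-7, -5, -3, -1, 1, 3, 5, 7)
sharp_block = (-120, -120, -120, -120, 120, 120, 120, 120)

codebook = (SMALL_TABLE, BIG_TABLE)
print(best_table(smooth_block, codebook))  # 0: the small table wins
print(best_table(sharp_block, codebook))   # 1: the big table wins
```

Neither table serves both blocks well, which is the motivation for storing a table selector in every block.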
29Modifier Codebook
- We started with random values and optimized by minimizing the error for a set of images
  - simulated annealing
  - modified version of the Generalized Lloyd algorithm (Linde, Buzo and Gray, 1980)
- Symmetry was added to reduce on-chip codebook size
30Decompressing a Block
- The first 12 bits are an RGB444 value, which gives the general color for the entire block. It is extended to 24 bits.
[Figure: 12-bit RGB444 value, extended to 24 bits: (153, 153, 85)]
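The slide does not say how the 12-bit color is extended to 24 bits. One common choice, consistent with the example values 153, 153 and 85, is to replicate each 4-bit nibble (0x9 becomes 0x99 = 153, 0x5 becomes 0x55 = 85). The sketch below assumes that method and assumes red occupies the top nibble; `extend_444_to_888` is a hypothetical name.

```python
def extend_444_to_888(rgb444):
    """Expand a 12-bit RGB444 value to three 8-bit channels by nibble replication."""
    r4 = (rgb444 >> 8) & 0xF
    g4 = (rgb444 >> 4) & 0xF
    b4 = rgb444 & 0xF
    # Replicating a nibble (x -> xx in hex) is the same as multiplying by 17.
    return (r4 * 17, g4 * 17, b4 * 17)


print(extend_444_to_888(0x995))  # (153, 153, 85)
```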
31Decompressing a Block
- The next 4 bits select a table from the codebook
[Figure: 12-bit RGB444 color followed by table index 5]
41Decompressing a Block
- The next 2 bits modify the first pixel according to the table, and so on for the remaining pixels
[Figure: 12-bit RGB444 color, table index 5, then the 2-bit pixel indices 10, 11, 01, 01, 10, 01, 11, 11]
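The block layout described on these slides adds up to 12 + 4 + 8 x 2 = 32 bits for eight pixels, that is, 4 bits per pixel. A minimal Python sketch of splitting a block into its fields follows. It assumes the fields are packed most significant bit first and that the first pixel's index sits directly after the table index; the slides do not state the exact bit order, and `unpack_block` and `pack_block` are hypothetical names.

```python
def unpack_block(block):
    """Split a 32-bit block into (rgb444, table_index, pixel_indices)."""
    rgb444 = (block >> 20) & 0xFFF      # first 12 bits: general color
    table_index = (block >> 16) & 0xF   # next 4 bits: codebook table
    # Remaining 16 bits: eight 2-bit pixel indices, first pixel in the top bits.
    pixel_indices = [(block >> (14 - 2 * i)) & 0x3 for i in range(8)]
    return rgb444, table_index, pixel_indices


def pack_block(rgb444, table_index, pixel_indices):
    """Inverse of unpack_block, used here only to build an example block."""
    block = (rgb444 << 20) | (table_index << 16)
    for i, index in enumerate(pixel_indices):
        block |= index << (14 - 2 * i)
    return block


# Field values from the slide example: table 5 and eight 2-bit indices.
example = pack_block(0x995, 5, [0b10, 0b11, 0b01, 0b01, 0b10, 0b01, 0b11, 0b11])
print(unpack_block(example))
```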
42Simple Decompression
- Only three adders and some muxes.
- Only twelve adders are needed for the four parallel units used for bilinear interpolation
  - S3TC: 60 adders
  - PVRTC: 60 adders
47Simple Decompression
- The correct texel is selected
- The modifier value is looked up
- The general color is extended to 24 bits
- The modifier value is added
- The result is clamped
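The five steps above can be sketched as a single Python function. The table used in the example is hypothetical, since the actual codebook entries are not listed on the slides, and the extension to 24 bits is assumed to be nibble replication. The three additions, one per color channel, mirror the three adders mentioned earlier.

```python
def decode_texel(rgb444, table, pixel_indices, texel):
    """Decode one texel of a block, following the five steps on the slide."""
    index = pixel_indices[texel]             # 1. select the texel's 2-bit index
    modifier = table[index]                  # 2. look up the modifier value
    base = [((rgb444 >> shift) & 0xF) * 17   # 3. extend 4-bit channels to 8 bits
            for shift in (8, 4, 0)]
    # 4. add the modifier (one addition per channel), 5. clamp to [0, 255]
    return tuple(max(0, min(255, channel + modifier)) for channel in base)


# Hypothetical table; the real codebook entries are not given on the slides.
table = (-8, -2, 2, 8)
indices = [2, 3, 1, 1, 2, 1, 3, 3]
print(decode_texel(0x995, table, indices, texel=0))  # (155, 155, 87)
```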
50Compression
- To compress a block, we need to find
  - general color
  - table
  - pixel indices
[Figure: block layout showing general color, table and pixel indices]
51Average Compression
- In average compression, the average color of the 2x4 block is used as the general color.
- Exhaustive search is used for the table and the pixel indices.
- 30 milliseconds for a 64x64 texture.
[Figure: block layout showing general color, table and pixel indices]
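A minimal Python sketch of average compression for one block is given below. It is not the authors' implementation: the two-table codebook is hypothetical (built from the example values on the earlier slides), plain rounding to the nearest 4-bit level is assumed for the general color, and an unweighted squared error is used.

```python
def clamp8(value):
    return max(0, min(255, value))


def compress_average(pixels, codebook):
    """Compress eight (R, G, B) pixels; returns (levels, table_index, pixel_indices)."""
    # General color: the block average, rounded to the nearest 4-bit level.
    average = [sum(p[ch] for p in pixels) // len(pixels) for ch in range(3)]
    levels = tuple((c + 8) // 17 for c in average)
    base = [17 * k for k in levels]  # the color the decoder will reconstruct

    best = None
    for table_index, table in enumerate(codebook):
        indices, total = [], 0
        for pixel in pixels:
            # Pixels are independent once the table is fixed, so picking the
            # best modifier per pixel is equivalent to an exhaustive search.
            errors = [sum((clamp8(b + m) - p) ** 2 for b, p in zip(base, pixel))
                      for m in table]
            choice = errors.index(min(errors))
            indices.append(choice)
            total += errors[choice]
        if best is None or total < best[0]:
            best = (total, table_index, indices)
    return levels, best[1], best[2]


# Hypothetical two-table codebook built from the values shown on the slides.
codebook = ((-8, -2, 2, 8), (-255, -127, 127, 255))
pixels = [(100, 100, 100)] * 4 + [(110, 110, 110)] * 4
print(compress_average(pixels, codebook))  # ((6, 6, 6), 0, [1, 1, 1, 1, 3, 3, 3, 3])
```

In this example the average (105) quantizes to level 6, which decodes to 102, and the small table reproduces both pixel values exactly with the modifiers -2 and 8.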
52Exhaustive Compression
- In exhaustive compression, exhaustive search is used for the general color as well (optimal compression).
- One minute for a 64x64 texture.
- On average, about 1.5 dB better than average compression.
53Exhaustive Compression
- Why is exhaustive compression better than average compression?
- Often the two represent the same 24-bit color, but it has been quantized differently.
57Quantization of General Color
- When quantizing to 12 bits, we go from many positions in RGB space to relatively few.
- Ordinary quantization just chooses the closest one.
[Figure: quantization points in the R-G plane]
62Quantization of General Color
- However, the per-pixel luminance modification can compensate for errors in the direction (1,1,1)
- We can reach (almost) all points on the dotted lines
- The best quantization point is therefore different.
[Figure: quantization points and dotted lines in the R-G plane]
63Combined Quantization Compression
- We call this type of quantization combined quantization, since the R, G and B values are now quantized together to the closest line.
[Figure: block layout showing general color, table and pixel indices]
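The idea of quantizing to the closest line rather than the closest point can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: it only tries the level below and the level above in each channel (eight candidates), ignores clamping and the limited modifier range, and the names `nearest_levels` and `combined_levels` are hypothetical. The squared distance from a color to the line through a candidate in the direction (1,1,1) is computed as the squared difference minus the squared sum of the difference divided by three, scaled by 3 to stay in integers.

```python
from itertools import product


def nearest_levels(color):
    """Ordinary quantization: nearest 4-bit level per channel (level k means 17 * k)."""
    return tuple((c + 8) // 17 for c in color)


def combined_levels(color):
    """Pick the candidate whose (1,1,1) line passes closest to the color."""
    # Candidates: the level below and the level above, per channel.
    options = [sorted({c // 17, min(15, c // 17 + 1)}) for c in color]

    def cost(levels):
        diff = [c - 17 * k for c, k in zip(color, levels)]
        sum_sq = sum(d * d for d in diff)
        # 3 * (squared distance to the line through the candidate along (1,1,1)).
        line_dist = 3 * sum_sq - sum(diff) ** 2
        # Ties (candidates on the same line) are broken by distance to the point.
        return (line_dist, sum_sq)

    return min(product(*options), key=cost)


color = (9, 7, 0)
print(nearest_levels(color))   # (1, 0, 0)
print(combined_levels(color))  # (0, 0, 0)
```

For this color, ordinary quantization rounds red up to 17, while the combined variant keeps (0, 0, 0) because the remaining error lies mostly along (1,1,1), where the per-pixel modifiers can compensate.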
66Combined Quantization Compression (cont.)
- On average, combined quantization compression yields 1.0 dB better PSNR than average compression
- Only 0.5 dB worse than exhaustive search
- Still only 30 milliseconds to compress a 64x64 texture.
[Figure: PSNR for average, exhaustive and combined compression; combined is 0.5 dB below exhaustive]
67Examples
68Error Metric
- When selecting one way to compress a block over another, an error metric is used to tell which is better.
- An obvious error metric is the sum of the squared error over the block (primes denote the decompressed values):
  $e^2 = (R - R')^2 + (G - G')^2 + (B - B')^2$
- However, since the eye is more sensitive to errors in green than in red and blue, it makes sense to add more weight to the green component:
  $e^2 = w_R (R - R')^2 + w_G (G - G')^2 + w_B (B - B')^2$
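Both error metrics can be written as one Python function with optional weights. The weight values below are an assumption for illustration only (the familiar luma coefficients); the slides do not state which weights were used, and `weighted_squared_error` is a hypothetical name.

```python
def weighted_squared_error(original, decoded, weights=(1.0, 1.0, 1.0)):
    """Sum of weighted squared channel errors over a block of (R, G, B) pixels."""
    w_r, w_g, w_b = weights
    return sum(
        w_r * (r - r2) ** 2 + w_g * (g - g2) ** 2 + w_b * (b - b2) ** 2
        for (r, g, b), (r2, g2, b2) in zip(original, decoded)
    )


# Hypothetical weights that favor green; the slides do not give the values used.
PERCEPTUAL = (0.299, 0.587, 0.114)

original = [(10, 10, 10)]
red_off = [(14, 10, 10)]    # error of 4 in red
green_off = [(10, 13, 10)]  # error of 3 in green

# Unweighted, the green error looks smaller; weighted, the red error does.
print(weighted_squared_error(original, red_off), weighted_squared_error(original, green_off))
print(weighted_squared_error(original, red_off, PERCEPTUAL),
      weighted_squared_error(original, green_off, PERCEPTUAL))
```

The example shows that the choice of metric can change which candidate encoding the compressor prefers.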
69Error Metric (cont.)
70Image Quality
- Peak Signal to Noise Ratio (PSNR) was measured for a number of images.
- Proposed scheme on average 1 dB worse than S3TC
- However, luminance component on average 0.5 dB better
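For reference, PSNR for 8-bit images is 10 log10(255^2 / MSE). The sketch below computes it over lists of (R, G, B) pixels; averaging the squared error over all three channels is an assumption, since the slides do not say how the RGB PSNR was pooled.

```python
import math


def psnr(original, decoded):
    """PSNR in dB between two equally sized lists of 8-bit (R, G, B) pixels."""
    squared = [(a - b) ** 2 for p, q in zip(original, decoded) for a, b in zip(p, q)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(255 ** 2 / mse)


# An error of exactly 1 in every channel gives an MSE of 1.
print(round(psnr([(0, 0, 0)], [(1, 1, 1)]), 2))  # 48.13
```

Because the scale is logarithmic, a 1 dB difference corresponds to a mean squared error that is about 26% higher (a factor of 10^0.1).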
71Game Example
Scene with original textures
72Game Example
Scene with PACKMAN compressed textures
73Game Example
Scene with original light maps
74Game Example
Scene with PACKMAN compressed light maps
75Game Example
Scene with original textures and light maps
combined
76Game Example
Scene with PACKMAN compressed textures and light
maps combined
77In the end, the rendered images should be
compared.
original textures
PACKMAN textures
78Weaknesses
- Block cannot contain two different hues
  - E.g., a block with red and blue of the same luminance.
79Conclusion
- Emphasis on low complexity
  - 20% of the adders of rivaling schemes
  - Quality is about 1 dB worse than S3TC
- A fast and near-optimal compression scheme has been developed
- Perceptual quality can be enhanced using weighted error metrics
- PACKMAN is implemented in Bitboys' latest graphics processors for mobile phones (announced yesterday).
80Thanks
- www.gametutorials.com for their Quake3 scene and BSP-culler.
- Timo Aila for feedback on the sketch.