Title: 2D/3D Graphics Acceleration on Low Power Devices
1 2D/3D Graphics Acceleration on Low Power Devices
- Iosif Antochi
- Computer Engineering Laboratory
- Electrical Engineering Department
- TU Delft, NL
2 Summary
- Introduction (Why do we need graphics accelerators?)
- The Software Interface
- Software implementation of a 3D graphics library (Mesa)
- What Needs to be Accelerated?
- Logical Organization of a Graphics Accelerator
- Advanced Features of a Graphics Accelerator
- Our Work
- Conclusions
3 Why Do We Need Graphics Accelerators?
- Human-machine interaction relies more and more on graphical interfaces (moving towards virtual 3D world representations), which are more natural than alphanumeric displays.
- Current consumer-level general-purpose processors cannot render current graphics applications in real time.
- An accelerator frees the central processing unit(s) from intensive and specialized computations.
4 Typical Applications for Computer Graphics
- Simulators
- Flight simulators
- Military training
- Hazardous conditions simulations
- Games
- Interactive Applications
- User friendly interfaces
- Video conferencing
5 Generic Graphics Application
- Behaviour
- InitSystem
- While (running)
- - Generate_a_scene
- - Display_the_generated_scene
- - Update_scene (automatic or on user input)
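The behaviour outlined above can be sketched as a small C loop. This is only an illustration: the `Scene` struct, its fields, and the `max_frames` stop condition are hypothetical stand-ins that do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scene state; a real application would hold objects,
   positions and shapes here. */
typedef struct {
    int   frame;      /* number of frames generated so far */
    float angle_deg;  /* example property that changes every frame */
    bool  running;
} Scene;

static void init_system(Scene *s)
{
    s->frame = 0;
    s->angle_deg = 0.0f;
    s->running = true;
}

static void generate_scene(Scene *s)
{
    s->frame++;                 /* stand-in for building the frame's geometry */
}

static void display_scene(const Scene *s)
{
    (void)s;                    /* stand-in for handing the scene to the renderer */
}

static void update_scene(Scene *s, int max_frames)
{
    s->angle_deg += 1.5f;       /* automatic update; user input would go here too */
    if (s->frame >= max_frames)
        s->running = false;
}

/* Runs the generate/display/update loop and returns the frame count. */
static int run_application(int max_frames)
{
    Scene s;
    assert(max_frames > 0);
    init_system(&s);
    while (s.running) {
        generate_scene(&s);
        display_scene(&s);
        update_scene(&s, max_frames);
    }
    return s.frame;
}
```

Calling `run_application(3)` goes through the loop three times before `update_scene` clears the `running` flag.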
6 Graphics Application Data Structures
- All properties of a scene have a discrete representation!
World
Object 1
Object 2
Object n
Position
Shape
7 Display Types
- Vectorial Displays
- Raster Displays
RGB component
8 Sample Real-Time Rendered Image
9 The Way You See a Scene
10 The Way It Was Painted
11 The Software Interface
- Software requirements for applications in order to use an accelerator
Portable Library Calls (API)
High level
Device Driver
Low level
Graphics Accelerator
12 A 3D Graphics Pipeline
Model View Transformation
Lighting
Normalization
Perspective Projection
Rasterization
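As a rough illustration of the transformation and projection stages listed above, the following C sketch applies a 4x4 matrix to a vertex and then performs the perspective divide and a viewport mapping. The row-major matrix layout, the [-1, 1] normalized range, and the function names are assumptions made for this example; lighting and rasterization are not covered.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;

/* Multiply a row-major 4x4 matrix by a column vector
   (usable for the model-view or the projection transform). */
static Vec4 mat4_mul_vec4(const float m[16], Vec4 v)
{
    Vec4 r;
    r.x = m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3]  * v.w;
    r.y = m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7]  * v.w;
    r.z = m[8]  * v.x + m[9]  * v.y + m[10] * v.z + m[11] * v.w;
    r.w = m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w;
    return r;
}

/* Perspective divide followed by a viewport mapping from [-1, 1]
   to window coordinates. */
static Vec4 clip_to_window(Vec4 clip, int width, int height)
{
    Vec4 win;
    float inv_w;
    assert(clip.w > 0.0f);      /* vertices behind the eye must be clipped first */
    inv_w = 1.0f / clip.w;
    win.x = (clip.x * inv_w * 0.5f + 0.5f) * (float)width;
    win.y = (clip.y * inv_w * 0.5f + 0.5f) * (float)height;
    win.z = clip.z * inv_w * 0.5f + 0.5f;
    win.w = inv_w;              /* kept for perspective-correct interpolation */
    return win;
}
```

Keeping 1/w in the output vertex is a design choice of this sketch: a later stage can reuse it for perspective-corrected texture coordinates.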
13 Mesa Structure
Mesa Core
Device Driver Entry Points
Triangle Acceleration
Line Acceleration
Point Acceleration
Software Rasterizer (OS Mesa)
Accelerated Functions
A Graphics Accelerator
14 What Needs to Be Accelerated?
- 2D graphics
- - BitBlt (Blitter operations)
- - Lines and Points
- 3D graphics
- - Triangle setup processing
- - Texture mapping
15 2D Operations
- Fixed BitBlt used for
- - copying rectangular areas from system memory or video memory to video memory, or from video memory to system memory.
- - double buffered animation.
- Variable BitBlt used for
- - stretching an image
- - expanding an image
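A software reference for the two BitBlt flavours might look like the sketch below. It assumes 32-bit pixels, pitches expressed in pixels, non-overlapping source and destination, and nearest-neighbour sampling for the stretch; none of these details are given in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed BitBlt: copy a w x h rectangle, one row at a time.
   Source and destination rectangles are assumed not to overlap. */
static void bitblt_copy(uint32_t *dst, int dst_pitch, int dx, int dy,
                        const uint32_t *src, int src_pitch, int sx, int sy,
                        int w, int h)
{
    int row;
    assert(w >= 0 && h >= 0);
    for (row = 0; row < h; row++) {
        memcpy(dst + (size_t)(dy + row) * (size_t)dst_pitch + (size_t)dx,
               src + (size_t)(sy + row) * (size_t)src_pitch + (size_t)sx,
               (size_t)w * sizeof(uint32_t));
    }
}

/* Variable BitBlt: stretch an sw x sh source over a dw x dh destination
   using nearest-neighbour sampling. */
static void bitblt_stretch(uint32_t *dst, int dst_pitch, int dw, int dh,
                           const uint32_t *src, int src_pitch, int sw, int sh)
{
    int x, y;
    assert(dw > 0 && dh > 0 && sw > 0 && sh > 0);
    for (y = 0; y < dh; y++) {
        int sy = y * sh / dh;               /* source row for this output row */
        for (x = 0; x < dw; x++) {
            int sx = x * sw / dw;           /* source column for this output column */
            dst[y * dst_pitch + x] = src[sy * src_pitch + sx];
        }
    }
}
```

Double-buffered animation would use `bitblt_copy` with a full-screen rectangle to move the finished back buffer into the visible buffer.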
16 Performance Bottlenecks in 3D Graphics Rendering Systems
- Geometry processing and lighting models.
- Triangle setup: calculating the triangle edge slopes and increments necessary for scan conversion.
- The rate at which the fixed-point iterators generate pixel values.
- The bandwidth to the frame buffer and texture memory.
17 Our Graphics Accelerator
AMBA BUS
AMBA BUS INTERFACE
Instruction Decoder
Triangle Setup Pipeline
Span Generator ( Iterator )
2D Engine
Texture Unit
Texture Cache
BitBlt
Lines
Points
3D Engine
Pixel Engine
Pixel and Z buffer Cache
Local Memory
18 Triangle Setup Pipeline
P1
Scanline
y
P4
P5
P3
P2
x
19 Triangle Setup Pipeline (cont)
- Vertices
- - (x0, y0, z0), (x1, y1, z1), (x2, y2, z2).
- Colors
- - (r0, g0, b0, a0), (r1, g1, b1, a1), (r2, g2, b2, a2) (specular color components).
- Texture Coordinates
- - (u0, v0), (u1, v1), (u2, v2).
- Interpolation increments
- - dr/dx, dr/dy, dg/dx, dg/dy, db/dx, db/dy, da/dx, da/dy,
- - du/dx, du/dy, dv/dx, dv/dy (perspective correction increments).
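One common way to obtain increments such as dr/dx and dr/dy is to solve the plane equation through the three vertices. The sketch below does this for a single attribute in floating point; it is an assumed formulation for illustration, not necessarily the one used in the accelerator's setup pipeline, and the `Gradient` type and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float ddx, ddy; } Gradient;

/* Compute the per-pixel increments of attribute a (for example r, g, b, a,
   u or v) over the triangle (x[i], y[i]), i = 0..2.
   Returns 0 for a degenerate (zero-area) triangle, 1 otherwise. */
static int attribute_gradient(const float x[3], const float y[3],
                              const float a[3], Gradient *out)
{
    float dx1 = x[1] - x[0], dy1 = y[1] - y[0];
    float dx2 = x[2] - x[0], dy2 = y[2] - y[0];
    float da1 = a[1] - a[0], da2 = a[2] - a[0];
    float area = dx1 * dy2 - dx2 * dy1;     /* twice the signed triangle area */
    float inv_area;

    assert(out != NULL);
    if (area == 0.0f)
        return 0;
    inv_area = 1.0f / area;                 /* one divide shared by all attributes */
    out->ddx = (da1 * dy2 - da2 * dy1) * inv_area;
    out->ddy = (da2 * dx1 - da1 * dx2) * inv_area;
    return 1;
}
```

For a triangle with vertices (0,0), (4,0), (0,4) and attribute values 1, 9, 13, this yields increments of 2 per pixel in x and 3 per pixel in y.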
20 Span Generator
- Linear interpolation
- - For x, y coordinates and, where applicable, color components (r, g, b, a)
- Hyperbolic interpolation (perspective corrected)
- - For texture coordinate sets (u0, v0), (u1, v1) and other values that need perspective correction.
- - Usually this type of interpolation uses the w coordinate from the clipping unit.
- For each parameter k on which we want to perform hyperbolic interpolation
- - We linearly interpolate k/w and 1/w along polygon edges and across scanlines.
- - At each pixel we divide k/w by 1/w to obtain the proper k value.
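The hyperbolic interpolation described above can be sketched for a single span as follows. Floating point is used for readability, whereas the hardware iterators are described as fixed point; the function name and the span representation are assumptions for this example.

```c
#include <assert.h>

/* Fill out[0..n-1] with perspective-corrected values of parameter k along a
   span whose endpoints carry (k0, w0) and (k1, w1). */
static void span_hyperbolic(float k0, float w0, float k1, float w1,
                            int n, float *out)
{
    float k_over_w, one_over_w, dk_over_w, d_one_over_w;
    int i;

    assert(n >= 2 && w0 > 0.0f && w1 > 0.0f);
    k_over_w     = k0 / w0;
    one_over_w   = 1.0f / w0;
    /* Both k/w and 1/w are linear in screen space, so they get constant steps. */
    dk_over_w    = (k1 / w1 - k_over_w) / (float)(n - 1);
    d_one_over_w = (1.0f / w1 - one_over_w) / (float)(n - 1);

    for (i = 0; i < n; i++) {
        out[i] = k_over_w / one_over_w;     /* the one divide needed per pixel */
        k_over_w   += dk_over_w;
        one_over_w += d_one_over_w;
    }
}
```

With k going from 0 to 1 and w going from 1 to 4, the middle of a three-pixel span evaluates to about 0.2 rather than the 0.5 that plain linear interpolation would give, which is the perspective effect this step corrects for.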
21 Texture Unit
- Compute physical texture coordinates and extract the required texels according to a specified sampling method.
- Texture sampling methods
- - Point sampling. (one texel needed)
- - Linear interpolation. (4 texels)
- - Bilinear interpolation. (4 texels)
- - Trilinear interpolation. (8 texels)
- Combine the filtered color obtained from the texture map with the primary color using a specified method such as
- - Replace
- - Modulate
- - Decal
- - Blend
Using Mip-mapping
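The four combining methods can be illustrated with the sketch below. It assumes the equations of the OpenGL 1.x texture environment for RGBA textures, with color components as floats in [0, 1]; the type and function names are hypothetical, and `env` is the constant environment color used only by the Blend mode.

```c
#include <assert.h>

typedef struct { float r, g, b, a; } Color;   /* components in [0, 1] */
typedef enum { TEX_REPLACE, TEX_MODULATE, TEX_DECAL, TEX_BLEND } TexMode;

/* Combine the primary (fragment) color f with the filtered texel t. */
static Color tex_combine(TexMode mode, Color f, Color t, Color env)
{
    Color out = f;
    switch (mode) {
    case TEX_REPLACE:                       /* texel replaces the fragment */
        out = t;
        break;
    case TEX_MODULATE:                      /* component-wise product */
        out.r = f.r * t.r;  out.g = f.g * t.g;
        out.b = f.b * t.b;  out.a = f.a * t.a;
        break;
    case TEX_DECAL:                         /* texel alpha selects texel over fragment */
        out.r = f.r * (1.0f - t.a) + t.r * t.a;
        out.g = f.g * (1.0f - t.a) + t.g * t.a;
        out.b = f.b * (1.0f - t.a) + t.b * t.a;
        out.a = f.a;
        break;
    case TEX_BLEND:                         /* texel color mixes fragment and env color */
        out.r = f.r * (1.0f - t.r) + env.r * t.r;
        out.g = f.g * (1.0f - t.g) + env.g * t.g;
        out.b = f.b * (1.0f - t.b) + env.b * t.b;
        out.a = f.a * t.a;
        break;
    default:
        assert(!"unknown texture mode");
        break;
    }
    return out;
}
```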
22 Bilinear Texture Filtering
23 Trilinear Filtering Using Mip-maps
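The two filtering methods named in these slides can be sketched in C as below. The example makes several simplifying assumptions: single-channel float texels, coordinates given in texel units with texel centres at integer positions, clamping at the edges, and a level-of-detail value supplied by the caller. The `MipLevel` type and the function names are hypothetical.

```c
#include <assert.h>

typedef struct {
    const float *texels;  /* single-channel texels, row-major */
    int w, h;
} MipLevel;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Bilinear filter: weighted average of the 4 texels around (u, v). */
static float sample_bilinear(const MipLevel *lvl, float u, float v)
{
    int x0, y0, x1, y1;
    float fx, fy, top, bottom;

    if (u < 0.0f) u = 0.0f;
    if (v < 0.0f) v = 0.0f;
    x0 = clamp_int((int)u, 0, lvl->w - 1);
    y0 = clamp_int((int)v, 0, lvl->h - 1);
    x1 = clamp_int(x0 + 1, 0, lvl->w - 1);
    y1 = clamp_int(y0 + 1, 0, lvl->h - 1);
    fx = u - (float)x0;
    fy = v - (float)y0;

    top    = lvl->texels[y0 * lvl->w + x0]
           + fx * (lvl->texels[y0 * lvl->w + x1] - lvl->texels[y0 * lvl->w + x0]);
    bottom = lvl->texels[y1 * lvl->w + x0]
           + fx * (lvl->texels[y1 * lvl->w + x1] - lvl->texels[y1 * lvl->w + x0]);
    return top + fy * (bottom - top);
}

/* Trilinear filter: bilinear samples from two adjacent mip levels
   (8 texels in total), blended by the fractional level of detail.
   u and v are given in level-0 texel units. */
static float sample_trilinear(const MipLevel *levels, int num_levels,
                              float u, float v, float lod)
{
    int l0, l1;
    float f, a, b;

    assert(num_levels >= 1);
    if (lod < 0.0f) lod = 0.0f;
    if (lod > (float)(num_levels - 1)) lod = (float)(num_levels - 1);
    l0 = (int)lod;
    l1 = (l0 + 1 < num_levels) ? l0 + 1 : l0;
    f  = lod - (float)l0;

    a = sample_bilinear(&levels[l0], u / (float)(1 << l0), v / (float)(1 << l0));
    b = sample_bilinear(&levels[l1], u / (float)(1 << l1), v / (float)(1 << l1));
    return a + f * (b - a);
}
```

Each trilinear sample touches up to eight texels across two mip levels, which is why the texel fetch rate and the texture cache matter for this mode.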
24 Tiny Texture Cache
- The conventional texture cache size for a PC graphics accelerator is usually at least 4KB.
- We have evaluated different cache sizes and block sizes.
- Even with a 256B cache we obtained reasonable performance at lower power consumption.
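To give a feel for how cache size and block size can be evaluated, the sketch below counts hits and misses for a stream of texel addresses. The direct-mapped organization and the 16-byte block size are assumptions chosen for the example; the slides do not state the organization that was actually evaluated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_BYTES 256u                      /* the tiny cache size from the slide */
#define BLOCK_BYTES 16u                       /* hypothetical block size */
#define NUM_LINES   (CACHE_BYTES / BLOCK_BYTES)

typedef struct {
    uint32_t tag[NUM_LINES];
    uint8_t  valid[NUM_LINES];
    unsigned hits, misses;
} TexCache;

static void cache_init(TexCache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));
}

/* Look up one texel address; returns 1 on a hit, 0 on a miss.
   On a miss the block replaces whatever occupied its line. */
static int cache_access(TexCache *c, uint32_t addr)
{
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t line  = block % NUM_LINES;
    uint32_t tag   = block / NUM_LINES;

    if (c->valid[line] && c->tag[line] == tag) {
        c->hits++;
        return 1;
    }
    c->valid[line] = 1;
    c->tag[line]   = tag;
    c->misses++;
    return 0;
}
```

Feeding the address trace of a rendered frame through `cache_access` and comparing `hits` against `misses` for several values of `CACHE_BYTES` and `BLOCK_BYTES` is one simple way to run such an evaluation.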
25 Tiny Texture Cache Statistics
[Figure: cache statistics charts labelled "Original" and "Realistic", for point sampling and trilinear interpolation]
26 Pixel Engine
- Fog Unit
- Color Sum
- Scissor Test
- Alpha Test
- Stencil Test
- Depth Buffer Test (Z Test)
- Blending Operations
- Logical Operations
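A few of the stages listed above can be illustrated per fragment as in the sketch below. It covers only the scissor test, alpha test, depth test and blending, in that order, and it hard-codes assumed settings: alpha passes when greater than a reference value, depth passes when less than the stored value, and blending uses source alpha. Fog, color sum, stencil and logical operations are left out.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } Pixel;
typedef struct { int x0, y0, x1, y1; } Rect;   /* x1 and y1 are exclusive */

/* src * alpha + dst * (1 - alpha) in 8-bit fixed point, with rounding. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Returns 1 if the fragment was written, 0 if a test rejected it. */
static int process_fragment(Pixel *color_buf, uint16_t *z_buf, int pitch,
                            Rect scissor, uint8_t alpha_ref,
                            int x, int y, uint16_t z, Pixel src)
{
    int idx;
    Pixel *dst;

    assert(pitch > 0);
    /* Scissor test: discard fragments outside the rectangle. */
    if (x < scissor.x0 || x >= scissor.x1 || y < scissor.y0 || y >= scissor.y1)
        return 0;
    /* Alpha test: keep only fragments more opaque than the reference. */
    if (src.a <= alpha_ref)
        return 0;
    /* Depth test: keep only fragments nearer than the stored depth. */
    idx = y * pitch + x;
    if (z >= z_buf[idx])
        return 0;
    z_buf[idx] = z;
    /* Blending with the color already in the frame buffer. */
    dst = &color_buf[idx];
    dst->r = blend_channel(src.r, dst->r, src.a);
    dst->g = blend_channel(src.g, dst->g, src.a);
    dst->b = blend_channel(src.b, dst->b, src.a);
    dst->a = blend_channel(src.a, dst->a, src.a);
    return 1;
}
```

Running the cheap rejection tests before touching the color buffer avoids frame buffer traffic for fragments that end up discarded.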
27 Advanced Features
- Tile Rendering
- Texture Processor
- Antialiasing
28 Tile Rendering
- In software.
- At driver level.
- In hardware
- Pros
- Smaller on-chip memory
- Extensibility
- Potential parallelism
- Cons
- Data overhead
Graphics Accelerator
Tile Memory
Graphics Memory
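When tiling is done in software or at driver level, one necessary step is deciding which tiles each triangle touches. The sketch below uses a conservative bounding-box test; the 16-pixel tile size, the flag array, and the function name are assumptions made for illustration.

```c
#include <assert.h>

#define TILE_SIZE 16   /* hypothetical tile edge in pixels */

static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Mark every tile overlapped by the triangle's bounding box in tile_flags
   (one byte per tile, row-major) and return how many tiles were marked. */
static int bin_triangle(const int x[3], const int y[3],
                        int screen_w, int screen_h, unsigned char *tile_flags)
{
    int tiles_x, min_x, max_x, min_y, max_y, tx, ty, count = 0;

    assert(screen_w > 0 && screen_h > 0);
    tiles_x = (screen_w + TILE_SIZE - 1) / TILE_SIZE;

    min_x = min3(x[0], x[1], x[2]);  max_x = max3(x[0], x[1], x[2]);
    min_y = min3(y[0], y[1], y[2]);  max_y = max3(y[0], y[1], y[2]);

    /* Triangles completely off screen touch no tile. */
    if (max_x < 0 || max_y < 0 || min_x >= screen_w || min_y >= screen_h)
        return 0;

    /* Clamp the bounding box to the screen. */
    if (min_x < 0) min_x = 0;
    if (min_y < 0) min_y = 0;
    if (max_x > screen_w - 1) max_x = screen_w - 1;
    if (max_y > screen_h - 1) max_y = screen_h - 1;

    for (ty = min_y / TILE_SIZE; ty <= max_y / TILE_SIZE; ty++) {
        for (tx = min_x / TILE_SIZE; tx <= max_x / TILE_SIZE; tx++) {
            tile_flags[ty * tiles_x + tx] = 1;
            count++;
        }
    }
    return count;
}
```

A triangle that spans several tiles is listed once per tile, which is one source of the data overhead mentioned under Cons.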
29 Texture Processor
- Traditional texture combining methods
- Replace
- Modulate
- Decal
- Blend
- Texture Processor
- Pros
- An increased number of combinations (user programmable)
- Cons
- More computational power
- More memory bandwidth required (bigger caches)
Fixed number of combinations
30 Antialiasing
- At primitive level: Points, Lines, Triangles
- At texture level (e.g. using Mip-maps)
- Full scene anti-aliasing using more frames
- Pros
- Might increase rendering quality
- Cons
- More processing power and memory are required
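One reading of "full scene anti-aliasing using more frames" is to render the scene several times with slightly jittered sample positions and average the results, in the style of an accumulation buffer. That reading is an assumption; under it, the averaging step could be sketched as follows, with a hypothetical function name and 8-bit color values.

```c
#include <assert.h>
#include <stdint.h>

/* Average num_frames renderings of the same scene, value by value.
   frames[f] points to num_values 8-bit color values of frame f. */
static void average_frames(const uint8_t *const *frames, int num_frames,
                           int num_values, uint8_t *out)
{
    int i, f;

    assert(num_frames > 0);
    for (i = 0; i < num_values; i++) {
        unsigned sum = 0;
        for (f = 0; f < num_frames; f++)
            sum += frames[f][i];
        /* Rounded integer division. */
        out[i] = (uint8_t)((sum + (unsigned)num_frames / 2u) / (unsigned)num_frames);
    }
}
```

Every extra frame costs a full rendering pass plus storage for the intermediate image, which matches the processing and memory cost listed under Cons.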
31 Conclusions
- Hardware graphics accelerators are needed for real-time graphics on consumer-level devices.
- Performance is not the only metric for evaluating a graphics accelerator.
- Low power graphics accelerators need different design strategies.
- Work in progress.