Title: GPU
1 GPU
2 What is it?
- Helps to create nice pictures?
- A very strong processor?
- A very fast processor?
- A way to make us pay more?
3 Terms to know
- CPU: Central Processing Unit
- GPU: Graphics Processing Unit
- GPGPU: General-Purpose computation on the GPU
- FLOPS: Floating-point Operations Per Second
- GFLOPS: Giga-FLOPS (10^9 FLOPS)
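To make the FLOPS terms concrete, peak throughput is often estimated as cores × clock × floating-point operations per cycle. The sketch below is only that back-of-the-envelope arithmetic; the function name and both sample chips are hypothetical and are not taken from the slides.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_cycle

# Hypothetical chips, not real products.
few_fast_cores = peak_gflops(cores=4, clock_ghz=3.0, flops_per_cycle=4)
many_slow_cores = peak_gflops(cores=128, clock_ghz=1.5, flops_per_cycle=2)
print(few_fast_cores, many_slow_cores)
```

Real workloads rarely reach the theoretical peak, so this is an upper bound rather than a measurement.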
4 Roadmap
- Intro
- Graphics pipeline
- Why is the CPU insufficient?
- What is the solution?
- History: evolution of the pipeline
- GPUs
- How they work
- How to use them
- GPU vs. CPU
- GPGPU
5 Graphics pipeline
Transformation / Lighting / Rasterization
6 Transformation
- Model space → World space
- World space → Camera space
- Camera space → Projection space
- Projection space → Screen space
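The chain of spaces above can be sketched with 4x4 matrices applied to homogeneous points. This is a minimal illustration, assuming a camera at the origin looking down +z and a simple pinhole projection; all function names and the sample values are made up for the example.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Apply a 4x4 matrix to a homogeneous point (x, y, z, w)."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def perspective(d):
    """Pinhole projection onto the plane z = d (assumed camera looks down +z)."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1 / d, 0]]

def to_screen(clip, width, height):
    """Perspective divide, then map [-1, 1] to pixel coordinates (y grows downward)."""
    x, y, _, w = clip
    ndc_x, ndc_y = x / w, y / w
    return ((ndc_x + 1) * 0.5 * width, (1 - ndc_y) * 0.5 * height)

model_to_world = translation(0, 0, 5)    # place the object 5 units ahead
world_to_camera = translation(0, 0, 0)   # camera at the origin (identity)
camera_to_proj = perspective(1.0)

# Compose the matrices once, then apply the result to every vertex.
mvp = mat_mul(camera_to_proj, mat_mul(world_to_camera, model_to_world))
vertex = [1.0, 1.0, 0.0, 1.0]            # a model-space point
pixel = to_screen(transform(mvp, vertex), 640, 480)
print(pixel)                             # approximately (384, 192)
```

Composing the matrices up front means each vertex costs a single matrix-vector product, which is the kind of uniform linear algebra the later slides argue is well suited to dedicated hardware.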
7 Lighting
- Given
  - Vertex properties
    - Color
    - Normal
    - Texture coordinates
  - Light sources
- Compute final color
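One common way to compute a final color from the inputs listed above is the Lambertian diffuse term. The slides do not say which lighting model is meant, so this is an assumed model chosen for illustration; `light_dir` is assumed to point from the surface toward the light, and texture coordinates are not used here.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambert(vertex_color, normal, light_dir, light_color):
    """Per-vertex diffuse term: color * light * max(0, N . L)."""
    n = normalize(normal)
    to_light = normalize(light_dir)
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, to_light)))
    return tuple(c * lc * n_dot_l for c, lc in zip(vertex_color, light_color))

# A surface facing the light is fully lit; one facing away receives nothing.
facing = lambert((1.0, 0.5, 0.0), (0, 0, 1), (0, 0, 1), (1.0, 1.0, 1.0))
away = lambert((1.0, 0.5, 0.0), (0, 0, 1), (0, 0, -1), (1.0, 1.0, 1.0))
print(facing, away)
```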
8 Rasterization
- Given some points describing a primitive in screen space, find the set of covered pixels
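For a triangle, the covered set can be found by testing each pixel center against the three edges. The sketch below uses the edge-function approach, which is one standard technique and not necessarily what any particular hardware does; the function names are hypothetical, and it scans the whole screen for simplicity.

```python
def edge(a, b, p):
    """Signed area of the triangle (a, b, p); the sign says which side of a -> b holds p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize_triangle(v0, v1, v2, width, height):
    """Return the set of pixels whose centers lie inside (or on) the triangle."""
    covered = set()
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)          # sample at the pixel center
            w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
            # Accept either winding order: all non-negative or all non-positive.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                covered.add((x, y))
    return covered

pixels = rasterize_triangle((0, 0), (4, 0), (0, 4), 4, 4)
print(len(pixels))
```

Every pixel is tested independently of every other, so the loop body is a natural candidate for parallel hardware.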
11 Why not CPU?
- Graphics applications are
  - Highly compute-intensive
  - High in arithmetic-to-data ratio
  - Massively parallel
  - Data-stream based
  - Straightforward
- CPUs are
  - Serial, one operation at a time
  - Control-flow based
  - Designed to handle huge data volumes
  - Flexible
  - Obviously not specially adapted to graphics
12 The mess of a modern CPU
- Out-of-order execution (72 in-flight instructions)
- Exceptions (addresses, floating point)
- Address translation
- Branch prediction
- Register renaming
- Cache management
- A lot of silicon!
13 Close-up of an AMD
Only a fraction of the chip does computations
15 How should graphics work?
- Don't let the CPU render the scene
- Pass the data (a description of the 3D scene) to the graphics card
- Have the GPU render the scene and store it in the frame buffer
- The screen is updated from the frame buffer
16 GPU vs. CPU
- GPUs devote hardware to doing certain computations extremely well (e.g. linear algebra).
- CPUs are versatile and general purpose. They do a lot of things rather well.
19 Evolution of the pipeline
- From
  - Full software
- Through
  - Hardware rasterization
  - Hardware T&L (transformation and lighting)
  - Programmable hardware
- To
  - Fully programmable hardware
- And beyond
Pushed by the entertainment market
20 Software pipeline
- All computations are performed on the CPU
Quake, 1996
21 Hardware rasterizer
- 1996: 3dfx releases the Voodoo
- T&L (transformation and lighting) by the CPU
- Hardware rasterization
Unreal, 1998
22 Hardware T&L
- 1999: NVIDIA releases the GeForce 256
- T&L and rasterization by the video card
- CPU performs animations and deformations
Quake III Arena, 1999
23 Programmable pipeline
- 2001: GeForce3, the first programmable GPU
- Vertex stage is programmable
- Animations, deformations, etc. move to the GPU
Unreal III, 2003
24 Fully programmable pipeline
- ATI and NVIDIA
- Vertex and pixel stages are programmable
- Cinematic-quality effects are computed by GPUs
Doom 3, 2004
25 What's next
- Real-time cinema
- Free CPU resources
- Physical simulations
Final Fantasy: The Spirits Within
26 What's next
- Real-time cinema
- Free CPU resources
- Physical simulations
Far Cry
28 What is a GPU?
- Hardware support for graphics operations: linear algebra, lighting, texture, rasterization, and others
- High level of parallelism: 256 processes
- No cache coherency (up to 16 KB)
29 Parallel Computing
- Moore's Law
  - Transistor count doubles every two years.
  - Increase in transistor count is also a rough measure of computer processing speed.
  - The name has been given to everything that changes exponentially.
- New Moore's Law
  - Microprocessors no longer get faster, just wider.
31 The Ox vs. Chicken Analogy
- Seymour Cray: "If you were plowing a field, which would you rather use: two strong oxen or 1024 chickens?"
- The chicken is winning these days
  - For many applications, you can run many cores at a lower frequency and come out ahead in the speed game.
  - For example: decrease frequency by 20% → 50% cut in power → can add one more dumb core (chicken) → power budget stays the same but with increased performance!
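The arithmetic behind that example can be checked with a rough model. The sketch assumes dynamic power scales with the cube of frequency (because supply voltage is lowered along with the clock) and that the workload parallelizes perfectly. Both are simplifying assumptions not stated on the slide, and the function names are hypothetical.

```python
def relative_power(freq_scale: float) -> float:
    """Dynamic power relative to baseline, assuming P ~ f^3 (voltage scales with f)."""
    return freq_scale ** 3

def relative_throughput(cores: int, freq_scale: float) -> float:
    """Ideal throughput relative to one baseline core (perfectly parallel work)."""
    return cores * freq_scale

one_ox = (relative_power(1.0), relative_throughput(1, 1.0))
two_chickens = (2 * relative_power(0.8), relative_throughput(2, 0.8))
print(one_ox)        # baseline: power 1.0, throughput 1.0
print(two_chickens)  # power about 1.02, throughput about 1.6
```

Under these assumptions, a 20% frequency cut roughly halves per-core power, so two slower cores fit in about the same budget while delivering about 1.6 times the throughput.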
32 GPU in the classic architecture
33 Schematic view
T&L
34 Vertex and Fragment Shaders
35 Vertex processor
- Vertex transformation
- Normal transformation and normalization
- Texture Coordinates Generation and transformation
- Lighting (per vertex)
- Color Material (per vertex)
36 Vertex processor
37 Vertex processor
OSG data
OSG modes and attributes
38 Fragment processor
- Scan conversion
- Texture computation
- Blending
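As an illustration of the blending step, the classic "source over" rule combines an incoming fragment with the color already in the frame buffer. The slide does not name a blend mode, so this particular rule is an assumption, and the function name is made up for the example.

```python
def blend_over(src_rgb, src_alpha, dst_rgb):
    """'Source over' blend: out = a * src + (1 - a) * dst, per channel."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))

# A half-transparent red fragment drawn over a blue background.
half_red_on_blue = blend_over((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0))
print(half_red_on_blue)
```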
39 Geometry shader
- Latest GPUs (2008)
- Input: an entire primitive (line, triangle)
- Output: more primitives
- Applications
  - Fur simulation
  - Geometry processing (for instance, widening lines)
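The line-widening application can be sketched as a function that takes one primitive and emits two. This is a plain-Python stand-in for what a geometry shader would do, working in 2D and assuming a segment of non-zero length; the function name is hypothetical.

```python
import math

def widen_line(p0, p1, width):
    """Turn one 2D line segment into two triangles forming a quad of the given width."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    # Unit normal to the segment, scaled to half the width.
    nx, ny = -dy / length * width / 2, dx / length * width / 2
    a = (p0[0] + nx, p0[1] + ny)
    b = (p0[0] - nx, p0[1] - ny)
    c = (p1[0] - nx, p1[1] - ny)
    d = (p1[0] + nx, p1[1] + ny)
    return [(a, b, c), (a, c, d)]   # one primitive in, two primitives out

triangles = widen_line((0.0, 0.0), (4.0, 0.0), 2.0)
print(triangles)
```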
40 Programmable processors
- Operate on a single vertex / pixel
- No vertex / pixel information can affect another vertex / pixel
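The independence rule above means a shader behaves like a pure function mapped over its inputs. The sketch below emulates that in plain Python; `run_shader` and `scale_vertex` are hypothetical names and not part of any graphics API.

```python
def run_shader(shader, elements):
    """Apply the same program to every element; no element sees any other."""
    return [shader(e) for e in elements]

def scale_vertex(v):
    # Hypothetical "vertex program": uniform scale by 2.
    return tuple(2.0 * c for c in v)

vertices = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
forward = run_shader(scale_vertex, vertices)
# Processing in the opposite order gives the same per-element results.
backward = run_shader(scale_vertex, vertices[::-1])[::-1]
print(forward == backward)
```

Because the result for each element does not depend on processing order, the hardware is free to run many elements at once.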
41 Programmable GPU pipeline
42 Hardware
43 Hardware
45 GPU Programming models
46 GPU Programming models
Linear algebra: BLAS, LAPACK, TAUCS, CHOLMOD
OSG
47 OpenGL / GLSL
- An abstraction
- Does not specify what happens
- Specifies what appears to happen
- Not a language, but an API or state machine
- The way to communicate with hardware
48 OpenGL / GLSL
- Specifies data XYZ type
- Sets modes, attributes, and parameters
- Allows rewriting the vertex / fragment / geometry shader
- Performs the calculations in software if no GPU is present
50 GPU vs. CPU
51 GPU vs. CPU
52 Why graphics cards?
- Pros
  - Fast
  - Cheap
  - Low power
- Cons
  - Specialized
  - Hard to program
  - Rapidly changing
  - One way
54 General Purpose GPU
- Motivation
  - Modern GPUs are massive computation platforms
  - They can reduce the load on the CPU by performing computation
  - Perfect for problems requiring parallel processing of data
- Key idea
  - Take a general problem and transform it to images / graphics
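The key idea can be illustrated by packing arrays into 2D "textures" and running a small per-texel program over them, the way a full-screen render pass would. This is a CPU emulation in plain Python for illustration only; `to_texture` and `render_pass` are hypothetical helpers and do not correspond to real API calls.

```python
def to_texture(values, width):
    """Pack a flat array into rows of a 2D 'texture'."""
    return [values[i:i + width] for i in range(0, len(values), width)]

def render_pass(kernel, *textures):
    """Emulate drawing a full-screen quad: run the kernel once per texel."""
    height, width = len(textures[0]), len(textures[0][0])
    return [[kernel(*(t[y][x] for t in textures)) for x in range(width)]
            for y in range(height)]

a = to_texture([1.0, 2.0, 3.0, 4.0], width=2)
b = to_texture([10.0, 20.0, 30.0, 40.0], width=2)
# The "fragment program" here computes 2 * a + b for each texel.
result = render_pass(lambda x, y: 2.0 * x + y, a, b)
print(result)
```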
55 GPU examples
- Solving PDEs
- Fluid dynamics
- Image / signal processing
- Financial forecasting
56 Problems
- GPUs are designed for independent data
- Temporary registers are zeroed
- No shared or static data
- No read-modify-write buffers
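A common workaround for the lack of read-modify-write buffers is to alternate between two buffers ("ping-pong"): each pass reads only from one and writes only to the other. The slides do not mention this technique, so it is offered here as an assumed illustration, using a 1D Jacobi relaxation as the workload; the function names are hypothetical.

```python
def jacobi_step(src):
    """One relaxation pass: read only from src, write only to a fresh buffer."""
    dst = list(src)                      # boundary values are copied unchanged
    for i in range(1, len(src) - 1):
        dst[i] = 0.5 * (src[i - 1] + src[i + 1])
    return dst

def solve(initial, passes):
    """Ping-pong between buffers, as a GPU would between two textures."""
    front = list(initial)
    for _ in range(passes):
        front = jacobi_step(front)       # the input buffer is never modified in place
    return front

# 1D Laplace equation with fixed ends 0 and 1: converges toward a straight line.
state = solve([0.0, 0.0, 0.0, 0.0, 1.0], passes=200)
print(state)
```

Within one pass every output element depends only on the previous buffer, so all elements can be computed independently, which matches the constraints listed above.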