Title: H' Quynh Dinh
1. Graphics Hardware and Shaders
2. Evolution of Shaders
Increasing complexity of instruction set:
- Pre-shaders: register combiners (GeForce 256, 1999)
  - 8 general combiners that can do basic math operations (dot product, MUX, multiply-add, interpolation)
- Early shaders (e.g., Nvidia's Cg)
  - Initially a simple instruction set with no branching or loops
  - Only operations like multiply, add, distance, min, max, and boolean ops were available
- Current shading languages (OpenGL, DirectX)
  - Compiled into shader assembly language
  - Converted into GPU-specific machine language by the graphics driver
- Complex architecture (e.g., stream processors in GeForce 8800)
  - Built-in functions such as sin and cos, plus branching, loops, etc.
3. Shaders
- Overcome limits of the fixed functionality of the classic graphics pipeline
- Do NOT replace the fixed functionality of the graphics pipeline, though those fixed functions can be disabled and implemented in a shader
- Pipeline stages, in order
  - Model data: vertex and texture coordinates
  - Vertex programs, alongside fixed functions (lighting, clipping, culling, projection, viewport mapping)
  - Rasterization
  - Fragment programs, alongside fixed functions (pixel ownership testing, scissoring, alpha testing, blending, masking)
  - Framebuffer
4. Issues in Graphics Architecture Design
- Number of vertex shader units vs. number of fragment shader units
  - Typically more fragment shaders than vertex shaders
- Processing data as vector or scalar
  - Vector composed of 4 scalars
  - Vectors good for representing colors and normals
  - Scalars would waste space when passed as vectors
- Branching and loops
  - Very inefficient to do in shaders
  - Typically a limit is placed on the number of loop iterations possible
  - For efficiency, separate shader programs had to be created, each for a particular number of loop iterations (may no longer be true)
- Early z-culling
  - Fragment shaders can be expensive, so would like to cull away as many fragments as possible before sending to fragment shader
- In general, graphics hardware is optimized to output to the framebuffer, not for read-back
- Optimized for gathering (via texture lookups), not scattering (cannot write to random framebuffer locations)
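The gathering bullet above can be sketched on the CPU in a few lines of Python. This is only an illustrative model, not GPU code: the name `gather_blur` and the 1D 3-tap box filter are assumptions chosen for the example. Each output location reads from several input locations but writes only to its own slot, which is the access pattern the hardware favors.

```python
def gather_blur(texture):
    """3-tap box filter written in gather style (illustrative CPU model).

    Each output element i reads input locations i-1, i, i+1 (clamped at
    the edges) and writes only to output location i. This mirrors a
    fragment shader that may read textures freely but can write only
    its own fragment.
    """
    n = len(texture)
    out = [0.0] * n
    for i in range(n):  # one "fragment" per output location
        left = texture[max(i - 1, 0)]       # clamped lookup
        center = texture[i]
        right = texture[min(i + 1, n - 1)]  # clamped lookup
        out[i] = (left + center + right) / 3.0
    return out


if __name__ == "__main__":
    print(gather_blur([0.0, 3.0, 0.0, 0.0]))  # [1.0, 1.0, 1.0, 0.0]
```

A scatter formulation would instead have input element 1 add its value into output slots 0, 1, and 2, which requires writes to locations other than the one being processed.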
5. GeForce 8800 Architecture
- Unified shader architecture: does NOT have separate vertex and fragment shader units
- 8 shader cores, each with
  - 16 stream processors (total of 128 SPs in the GPU)
  - 4 texture addressing units
  - 8 texture filter units
  - L1 cache
- Shader core sits in GPU chip
- Chips manufactured by various companies: Nvidia, ATI
- Chips put into graphics cards and sold by 3rd-party vendors
6. GeForce 8800 Architecture
- Scalar data stream
  - Vectors require conversion to scalar in stream processor (SP)
- Better parallelism: less idle time
  - Allow math operations to execute during texture fetches
- Early z-culling
  - Allow fragments to be culled before entering fragment shader
- Geometry shader: new shader!
  - Processes entire primitives, not just one vertex
  - Inputs and outputs primitives
  - Example primitives are point lists/strips, triangle lists/strips
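The early z-culling idea above can be illustrated with a small CPU model. This is a sketch under stated assumptions, not how the hardware is implemented: `shade_with_early_z`, the one-row depth buffer, and the convention that smaller depth means closer are all hypothetical choices, and the sketch assumes the shader does not change a fragment's depth.

```python
def shade_with_early_z(fragments, depth_buffer, shader):
    """Run `shader` only on fragments that pass the depth test.

    fragments: list of (x, depth) pairs; smaller depth means closer.
    depth_buffer: list of current depths, one per pixel x (updated in place).
    Returns (number of shader invocations, value written per pixel).
    """
    colors = [None] * len(depth_buffer)
    invocations = 0
    for x, depth in fragments:
        if depth >= depth_buffer[x]:
            continue  # culled before the (expensive) shader runs
        depth_buffer[x] = depth
        colors[x] = shader(x, depth)
        invocations += 1
    return invocations, colors


if __name__ == "__main__":
    depths = [1.0, 1.0]
    frags = [(0, 0.5), (0, 0.9), (1, 0.2), (0, 0.3)]
    count, _ = shade_with_early_z(frags, depths, lambda x, d: d)
    print(count, "of", len(frags), "fragments shaded")  # 3 of 4
```

The saving depends on submission order: fragments drawn front to back are culled more often than fragments drawn back to front.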
7. GeForce 8800 Architecture
- See Figures 12 (p. 20), 18 (p. 26), and 28 (p. 41) in the Nvidia GeForce 8800 Architecture Technical Brief
9. Vertex Shaders
- Programmability of geometry within the graphics pipeline
- Should be used for operations that need geometric information
- Examples
  - Vertex transformation
  - Normal transformation and normalization
  - Texture coordinate generation
  - Texture coordinate transformation
  - Complex lighting
  - Color material application
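The first two examples above (vertex transformation, then normal transformation and normalization) can be sketched per vertex in plain Python. The function names and the matrices passed in are hypothetical; in GLSL the corresponding built-ins would be gl_ModelViewProjectionMatrix and gl_NormalMatrix, but this sketch only mimics the arithmetic on the CPU.

```python
import math


def mat_vec(m, v):
    """Multiply a row-major square matrix (list of rows) by a vector."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def vertex_program(position, normal, modelview_projection, normal_matrix):
    """Transform one vertex: position to clip space, normal to eye space."""
    clip_position = mat_vec(modelview_projection, position)  # 4x4 times vec4
    eye_normal = normalize(mat_vec(normal_matrix, normal))   # 3x3 times vec3
    return clip_position, eye_normal


if __name__ == "__main__":
    translate = [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3], [0, 0, 0, 1]]
    scale = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
    print(vertex_program([1, 1, 1, 1], [0, 3, 4], translate, scale))
```

The normal is renormalized after transformation because a matrix containing scale changes its length, which would otherwise skew any lighting computed from it.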
10. Fragment (Pixel) Shaders
- Programmability of framebuffer fragments within the graphics pipeline
- Examples
  - Operations on interpolated values
  - Texture access
  - Texture application
  - Fog
  - Color sum (blending)
11. Fragment Shaders
- Cannot alter a fragment's u,v position
- Cannot require knowledge of several concurrent fragments simultaneously
  - Computation must be on a single fragment
  - Cannot access neighboring fragments
- Key advantage
  - Can access texture memory an arbitrary number of times (multi-texturing)
  - Can access random locations in texture memory
  - Can combine texture values arbitrarily
  - Results of one texture access can be used for another texture access (dependent texture reads)
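A dependent texture read, the last point above, can be mimicked with two lookups where the result of the first is the coordinate of the second. This is a minimal CPU sketch: `texel`, `fragment_program`, the 1D textures, and nearest-neighbor sampling are all assumptions made for illustration.

```python
def texel(texture, u):
    """Nearest-neighbor lookup in a 1D texture with u in [0, 1]."""
    index = min(int(u * len(texture)), len(texture) - 1)
    return texture[index]


def fragment_program(u, offset_texture, color_texture):
    """Dependent read: the first lookup yields the coordinate of the second."""
    u2 = texel(offset_texture, u)    # first texture access
    return texel(color_texture, u2)  # second access depends on the first


if __name__ == "__main__":
    offsets = [0.75, 0.0]
    colors = ["red", "green", "blue", "white"]
    print(fragment_program(0.1, offsets, colors))  # white
    print(fragment_program(0.9, offsets, colors))  # red
```

The fragment still writes only its own output; the indirection happens entirely on the read side.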
12. Shader Data Structures
- Common to both vertex and fragment shaders
- Uniform variables
  - Used to pass data from application program to shader
  - Used for values that change infrequently: typically do not change per vertex or fragment
  - Built-in: gl_ModelViewMatrix, gl_FrontMaterial, gl_LightSource, gl_Fog, ...
  - User-defined: ModelScaleFactor, EyePos, Epsilon, LightPosition, ...
- Varying variables
  - Define data passed from vertex processor to fragment processor
  - Potentially different value at each vertex and at each fragment via interpolation within polygon
  - Built-in variables include color and texture coordinates
  - User-defined include normal, model coordinates, refraction index, ...
  - User-defined varying variables in fragment shader must match those defined in vertex shader
13. Shader Data Structures
- Vertex shaders only
  - Per-vertex attributes
    - Built-in: gl_Color, gl_Normal, gl_Vertex for vertex shaders
    - User-defined attributes (defined by glVertexAttribARB): StartColor, Velocity, Elevation, Tangent, ...
  - Special output variables: gl_Position, gl_PointSize, gl_ClipVertex
- Fragment shaders only
  - Special input variables (from fixed functionality of graphics pipeline): gl_FragCoord, gl_FrontFacing
  - Special output variables: gl_FragColor, gl_FragDepth
14. Data Flow from Vertex to Fragment Shaders
- Vertex programs
- Fixed functions and rasterization
  - Varying variables generated via interpolation during rasterization
  - Special variables provided by fixed functionality
- Fragment programs
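The interpolation that produces per-fragment varying values can be sketched with barycentric weights inside one triangle. This is a simplified model under stated assumptions: `barycentric` and `interpolate_varying` are hypothetical names, the triangle is given in 2D screen space, and perspective correction is ignored.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    def edge(p0, p1, q):
        # Twice the signed area of triangle (p0, p1, q).
        return (p1[0] - p0[0]) * (q[1] - p0[1]) - (p1[1] - p0[1]) * (q[0] - p0[0])

    area = edge(a, b, c)
    return (edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area)


def interpolate_varying(weights, v0, v1, v2):
    """Blend three per-vertex values using weights that sum to 1."""
    w0, w1, w2 = weights
    return [w0 * x + w1 * y + w2 * z for x, y, z in zip(v0, v1, v2)]


if __name__ == "__main__":
    a, b, c = (0, 0), (4, 0), (0, 4)
    red, green, blue = [1, 0, 0], [0, 1, 0], [0, 0, 1]
    w = barycentric((1, 1), a, b, c)
    print(interpolate_varying(w, red, green, blue))  # [0.5, 0.25, 0.25]
```

Here the per-vertex colors play the role of a varying written by the vertex program, and the returned list is the value a fragment program would receive at pixel (1, 1).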
15. OpenGL Function Calls to Load Shaders
- Initialize shader
  - glCreateProgram returns a GLuint (let's call it program)
  - glCreateShader(type): type is GL_VERTEX_SHADER or GL_FRAGMENT_SHADER
  - glShaderSource specifies shader source code in a string
  - glCompileShader
  - glAttachShader
  - glLinkProgram
  - glGetUniformLocation returns the location of a named uniform, tying parameter names in shader source code to variables in the application program
- Use shader
  - glUseProgram(program)
- Specify input variables
  - glUniform{1,2,3}{f,i}v, glUniformMatrix3fv
  - For textures, specify texture unit, bind it, and specify texture type (1D, 2D, rectangle) using glActiveTexture, glBindTexture, glUniform
- Draw polygons between glBegin and glEnd
- Remove shader
  - glDeleteProgram(program)
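The call order above can be written out as one function. This is a hedged sketch rather than a complete loader: it assumes `gl` is an object exposing the OpenGL entry points (for example PyOpenGL's `OpenGL.GL` module, with a context already created elsewhere), it omits compile and link error checking, and `load_and_draw`, `draw`, and `RecordingGL` are hypothetical names. `RecordingGL` is only a stand-in that records call names so the sequence can be inspected without a GPU.

```python
def load_and_draw(gl, vertex_src, fragment_src, draw):
    """Follow the listed order: initialize, use, set inputs, draw, remove.

    gl: object exposing OpenGL entry points (assumed: PyOpenGL's OpenGL.GL).
    draw: callable that issues the geometry (e.g., glBegin ... glEnd).
    """
    program = gl.glCreateProgram()
    for kind, src in ((gl.GL_VERTEX_SHADER, vertex_src),
                      (gl.GL_FRAGMENT_SHADER, fragment_src)):
        shader = gl.glCreateShader(kind)
        gl.glShaderSource(shader, src)
        gl.glCompileShader(shader)
        gl.glAttachShader(program, shader)
    gl.glLinkProgram(program)
    # "ModelScaleFactor" is one of the user-defined uniform names in the slides.
    location = gl.glGetUniformLocation(program, "ModelScaleFactor")
    gl.glUseProgram(program)
    gl.glUniform1f(location, 2.0)  # example value, chosen arbitrarily
    draw()
    gl.glDeleteProgram(program)


class RecordingGL:
    """Hypothetical stand-in: records call names instead of calling OpenGL."""
    GL_VERTEX_SHADER = "vertex"
    GL_FRAGMENT_SHADER = "fragment"

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the gl* functions.
        def call(*args):
            self.calls.append(name)
            return len(self.calls)  # dummy handle or location
        return call


if __name__ == "__main__":
    fake = RecordingGL()
    load_and_draw(fake, "void main() {}", "void main() {}", draw=lambda: None)
    print(fake.calls)
```

Passing the GL object in as a parameter is a design choice for this sketch only; it keeps the function runnable against the recorder and lets a real binding be swapped in unchanged.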
16. References
- Rost, R.J., ed. OpenGL Shading Language, Addison-Wesley, 2004.
- Nvidia GeForce 8800 Architecture Technical Brief, http://www.nvidia.com/object/IO_37100.html
- Lindholm, E., M. Kilgard, and H. Moreton. A User-Programmable Vertex Engine, SIGGRAPH 2001, pp. 149-158.