Title: CIS736Lecture2420060313
1. CIS 736 Computer Graphics
Lecture 24 of 42: Exam Review and Hardware Shading
Monday, 13 March 2006
Reading: Hardware Rendering (shader_lecture)
Adapted with permission from slides by Andy van Dam and Kevin Egan
W. H. Hsu, http://www.kddresearch.org
avd, November 4, 2003, VSD 1/46
2. Texturing
- Nothing is more important than texture performance and quality. Textures are used for absolutely everything:
- Fake shading
- Fake detail
- Fake effects
- Fake geometry
- Geometry is expensive: you've got to store it, transform it, light it, clip it... bah!
- Use textures in ways they aren't supposed to be used
- An image is just an array, after all
- If it weren't for textures, we'd be stuck with big Gouraud-shaded polys!
- Quick hardware texture review
- Interpolation is linear in 1/z (see the sketch after this slide)
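Why linear in 1/z? An aside not from the original slides: an attribute divided by depth varies linearly in screen space, so hardware interpolates u/z and 1/z and divides per pixel. A minimal C++ sketch:
- // Perspective-correct interpolation of texture coordinate u between two
- // vertices at camera-space depths z0 and z1; t is the screen-space
- // interpolation parameter in [0, 1].
- float perspectiveCorrectU(float u0, float z0, float u1, float z1, float t)
- {
-     float uOverZ   = (1 - t) * (u0 / z0)   + t * (u1 / z1);   // linear in u/z
-     float oneOverZ = (1 - t) * (1.0f / z0) + t * (1.0f / z1); // linear in 1/z
-     return uOverZ / oneOverZ; // divide to recover the correct u
- }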
3. Multipass Rendering
- In 123, everything we've done has been in one pass, but in reality you won't get anywhere with that.
- Multipass rendering gives you flexibility and better realism
- An early version of Quake 3 did this (see the blending sketch after this slide):
- (1-4: Accumulate bump map)
- 5: Diffuse lighting
- 6: Base texture
- (7: Specular lighting)
- (8: Emissive lighting)
- (9: Volumetric effects)
- (10: Screen flashes)
- Multitexturing is the most important part of multipass rendering (remember all of those texture regs?)
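A hedged C++ sketch of how two such passes might be layered with standard OpenGL blending state; the draw functions are placeholders supplied by the application, and real engines vary the state per pass:
- #include <GL/gl.h>
- void drawSceneWithBaseTexture();      // placeholder draw calls supplied
- void drawSceneWithSpecularLighting(); // by the application
- void renderTwoPasses()
- {
-     // Pass 1: the base texture establishes color and depth.
-     glDepthFunc(GL_LESS);
-     glDisable(GL_BLEND);
-     drawSceneWithBaseTexture();
-     // Pass 2: add lighting on top of what the framebuffer already holds.
-     glDepthFunc(GL_EQUAL);       // only re-touch the pixels that survived
-     glEnable(GL_BLEND);
-     glBlendFunc(GL_ONE, GL_ONE); // additive blend: dst = dst + src
-     drawSceneWithSpecularLighting();
- }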
4. Billboards
- A billboard is a flat object that faces something
- There are lots of different billboarding methods, but we'll stick with the easiest, most-used one
- Take a quad and slap a texture on it. Now we want it to face the camera. How do we do that? (Hint: you just did it in Modeler. See the sketch after this slide.)
- Bread and butter of older 3D games, and still used extensively today:
- Monsters (think Doom)
- Items
- Impostors (LOD)
- Text
- HUDs (sometimes)
- Faked smoke, fire, explosions, particle effects, halos, etc.
- ...ing lens flares
- Bad news: little to no shading
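One common answer, sketched in C++ under an assumed convention (row-major world-to-camera matrix, column-vector math; not from the original slides): the first two rows of the view matrix's upper-left 3x3 are the camera's world-space right and up vectors, so a quad spanned by them always faces the camera.
- struct Vec3 { float x, y, z; };
- static Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
- static Vec3 mul(Vec3 v, float s) { return { v.x * s, v.y * s, v.z * s }; }
- // Four corners of a camera-facing quad with half-size s centered at 'center'.
- void billboardCorners(const float view[4][4], Vec3 center, float s, Vec3 out[4])
- {
-     Vec3 right = { view[0][0], view[0][1], view[0][2] };
-     Vec3 up    = { view[1][0], view[1][1], view[1][2] };
-     out[0] = add(center, add(mul(right, -s), mul(up, -s)));
-     out[1] = add(center, add(mul(right,  s), mul(up, -s)));
-     out[2] = add(center, add(mul(right,  s), mul(up,  s)));
-     out[3] = add(center, add(mul(right, -s), mul(up,  s)));
- }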
5. Aliasing when scaling up
- Bilinear filtering (a.k.a. bilinear interpolation)
- Interpolate horizontally by the fractional part of u, then vertically interpolate the horizontal components by the fractional part of v:
- x = floor(u), a = u - x
- y = floor(v), b = v - y
- T(u,v) = (1-a)(1-b) T(x,y) + a(1-b) T(x+1,y) + (1-a)b T(x,y+1) + ab T(x+1,y+1)
- This is essentially what you did in Filter when scaling up (see the sketch after this slide)
- Hardware can do this almost for free, and I can't think of a card that doesn't do it by default
- Not so free in a software engine
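The formula transcribes directly into code; a minimal CPU-side C++ sketch, where T is a placeholder for a clamped texel fetch:
- #include <cmath>
- // Bilinear lookup at (u, v): blend the four texels surrounding the sample.
- float bilinear(float (*T)(int, int), float u, float v)
- {
-     int x = (int)std::floor(u), y = (int)std::floor(v);
-     float a = u - x, b = v - y; // fractional parts
-     return (1 - a) * (1 - b) * T(x,     y)
-          +      a  * (1 - b) * T(x + 1, y)
-          + (1 - a) *      b  * T(x,     y + 1)
-          +      a  *      b  * T(x + 1, y + 1);
- }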
6. Mipmapping
- Mip = multum in parvo ("many in a small place")
- Solves the aliasing problem when scaling down
- It's possible for more than one texel to cover the area of a pixel (edges of objects, objects in the distance). We could find all texels that fall under that pixel and blend them, but that's too much work
- This problem causes temporal aliasing
- Will bilinear filtering help? Will it solve the problem?
- Solution: more samples per pixel, or lower the frequency of the texture
- Mipmapping solves the problem by taking the latter approach
- Doing this in real time is too much work, so we'll precompute
- Take the original texture and reduce the area by 0.25 until we reach a 1 x 1 texture
- Use a good filter and gamma correction when scaling
- If we use a Gaussian filter, this is called a Gaussian pyramid
- Predict how bad the aliasing is to determine which mipmap level to use
- How much more memory are we using? (See the arithmetic after this slide.)
- Can potentially increase texture performance (Lars' story)
- Cards do mipmapping and bilinear filtering by default. A good deal of console games don't do mipmapping. Why?
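The memory question has a tidy answer (geometric-series arithmetic, not spelled out on the slide): each level has 1/4 the area of the one below it, so the whole chain costs 1 + 1/4 + 1/16 + ... = 4/3 of the base texture, i.e. one third extra. A quick C++ check:
- #include <cstdio>
- int main()
- {
-     unsigned size = 256, total = 0, levels = 0;
-     // halve the resolution per level until we reach 1 x 1
-     for (unsigned s = size; s >= 1; s /= 2) { total += s * s; ++levels; }
-     std::printf("%u levels, %u texels (base alone: %u)\n",
-                 levels, total, size * size);
-     return 0; // prints: 9 levels, 87381 texels (base alone: 65536)
- }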
7-8. We're good. A little too good.
- We got rid of aliasing, but now everything is too blurry!
- Let's take more samples: take a sample from the mipmap levels above and below the current one in addition to the bilinear filtering on the current mipmap level. This is trilinear filtering (see the sketch after this slide).
- Trilinear filtering makes it look a little better, but we're still missing something. If we're going to take even more samples, we'd better be taking them correctly.
- Key observation: suppose we take a pixel and backwards-map it onto a texture. Is the pixel always a nice little square with sides parallel to the texture walls? NO!
- Bilinear and trilinear filtering are isotropic, because they sample the same distance in all directions.
- Now we're going to sample more where it is actually needed
- Of course, a pixel is NOT a tiny little square. But let's suppose it is...
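A hedged C++ sketch of the blend; bilinearAtLevel is an assumed helper that does a bilinear lookup within one mip level, with texel coordinates halving at each coarser level:
- // Trilinear filter: bilinearly sample the two mip levels bracketing the
- // desired level of detail, then linearly blend between them.
- float trilinear(float (*bilinearAtLevel)(int, float, float),
-                 float lod, float u, float v)
- {
-     int   level = (int)lod;    // sharper of the two bracketing levels
-     float f     = lod - level; // blend factor toward the blurrier level
-     float sharp  = bilinearAtLevel(level,     u,        v);
-     float blurry = bilinearAtLevel(level + 1, u * 0.5f, v * 0.5f);
-     return (1 - f) * sharp + f * blurry;
- }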
9. Anisotropic Filtering
- Anisotropic = not isotropic (surprise). Also called "aniso" or AF for short.
- There are a couple of aniso algorithms that don't use mipmapping, but our cards already do mipmapping really well, so we'll build off of that.
- When the pixel is backwards-mapped, the longest side of the quad determines the line of anisotropy, and we'll take a hojillion samples along that line across mipmaps.
- Aniso and FSAA are the two big features of today's modern cards
- ATI and NVIDIA have different algorithms that they guard secretively and continue to improve/screw up
- We could be taking up to 128 samples per pixel! That takes serious bandwidth; it is orders of magnitude more costly than bilinear (4 texel samples) or trilinear (8) filtering.
- Pictures!
10. Aniso Rules (1/3)
- www.richleader.com
11. Aniso Rules (2/3)
- Serious Sam (extremetech.com)
12. Aniso Rules (3/3)
- Serious Sam (extremetech.com)
13. Texture Generation
- Who needs artists?
- Procedural texturing:
- Use a random procedure to generate the colors
- Perlin noise (not just for color; a simplified noise sketch follows this slide)
- Good for wood, marble, water, fire
- Unreal Tournament did it quite a bit
- Texture synthesis:
- No games use this to my knowledge
- Efros & Leung and Ashikhmin use neighborhood statistics
- Cohen (SIGGRAPH 2003) has a much faster tile-based method
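A hedged C++ sketch of value noise, a simpler cousin of Perlin's gradient noise (the hash constants are arbitrary; real Perlin noise interpolates pseudo-random gradients rather than values):
- #include <cmath>
- // Deterministic pseudo-random value in [0, 1) for a lattice point.
- static float hash2(int x, int y)
- {
-     unsigned h = (unsigned)x * 374761393u + (unsigned)y * 668265263u;
-     h = (h ^ (h >> 13)) * 1274126177u;
-     return (h ^ (h >> 16)) / 4294967296.0f;
- }
- float valueNoise(float u, float v)
- {
-     int x = (int)std::floor(u), y = (int)std::floor(v);
-     float a = u - x, b = v - y;
-     a = a * a * (3 - 2 * a); // smoothstep the fractions so cell edges blend
-     b = b * b * (3 - 2 * b);
-     float top = (1 - a) * hash2(x, y)     + a * hash2(x + 1, y);
-     float bot = (1 - a) * hash2(x, y + 1) + a * hash2(x + 1, y + 1);
-     return (1 - b) * top + b * bot; // sum several octaves for turbulence
- }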
23. Summary Points
- Solid modeling overview
- Data structures:
- Boundary representations (B-reps): last time
- Spatial partitioning representations: today
- Algorithms:
- Construction (composition)
- Intersection, point classification
- Know the difference between B-reps and spatial partitioning: pros and cons
- Spatial partitioning (review guide):
- Cell decomposition: know how to obtain for a composite object (simple primitives)
- Planar and spatial occupancy
- Simple uniform subdivision (grid / pixel, volumetric / voxel)
- Hierarchical: quadtrees and octrees; know how to obtain for 2D and 3D scenes
- Binary Space Partitioning (BSP) trees: know how to obtain for a simple 2D object
- Constructive Solid Geometry (CSG): know typical primitives and how to combine them
- Next class: color models; visible surface data structures
24. Programmable Hardware
- Doom III
- Research
- Halo 2
- Jet Set Radio Future
25. History
- 1992: id's Wolfenstein 3D video game rocks the gaming world; all objects are billboards (flat planes) and rendered in software
- 1996: id's Quake introduces a full 3D polygonal game; lighting vertices and shading pixels is still done in software
- 1996: the 3dfx Voodoo graphics card is released; it does shading operations (such as texturing) in hardware. QuakeWorld brings hardware acceleration to Quake
- 1999: the GeForce 256 graphics card is released; now transform and lighting (T&L) of vertices is done in hardware as well (using the fixed function pipeline)
- 2001: the GeForce 3 graphics card lets programmers download assembly programs to control vertex lighting and pixel shading, keeping the speed of the fixed function pipeline with none of the restrictions
- Future: expanded features and high-level APIs for vertex and pixel shaders, increased use of lighting effects such as bump mapping and shadowing, higher-resolution color values. Doom III and Half-Life 2 usher in a new era of realism
26. Fixed Function Pipeline
- Starting in 1999, some graphics cards began to do the standard lighting model and transformations in hardware (T&L). CPUs everywhere sighed in relief.
- Hardware T&L existed in the '60s and '70s; it was just really slow and really expensive.
- Implementing the pipeline in hardware made processing polygons much faster, but the developer could not modify the pipeline (hence "fixed function pipeline"). The fixed function pipeline dates back to the first SGI workstations.
- New programmable hardware allows programmers to write vertex and pixel programs to change the pipeline
- Vertex and pixel programs aren't necessarily slower than the fixed function alternative
- Note that the common term "vertex shader" to describe a vertex program is misleading: vertices are lit and pixels are shaded
27. A Quick Review
- By default, GL will do the following:
- Take as input various per-vertex quantities (color, light source, eye point, texture coordinates, etc.)
- Calculate a final color for each vertex using a basic lighting model (OpenGL uses Phong lighting)
- For each pixel, linearly interpolate the three surrounding vertex colors to shade the pixel (OpenGL uses Gouraud shading)
- Write the pixel color value to the frame buffer
28. Programmable Hardware Pipeline
- "Clip space" refers to the space of the canonical view volume
- New graphics cards can use either the fixed function pipeline or vertex/pixel programs
29. Example: Cartoon Shading
- Cartoon shading is a cheap and neat-looking effect used in video games such as Jet Set Radio Future
- Instead of using traditional methods to light a vertex, use the dot product of the light vector and the normal of the vertex to index into a one-dimensional texture (a texture is simply a lookup function for colors, nothing more and nothing less; see the sketch after this slide)
- Instead of a smooth transition from low-intensity light (small dot product) to high-intensity light (large dot product), make the one-dimensional texture have sharp transitions
- Textures aren't just for wrapping 2D images onto 3D geometry!
- Voila! Cartoon teapot
[Diagram: the dot product of the light vector and the normal, running from 0.0 to 1.0, indexes into a one-dimensional texture]
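A hedged Cg fragment of the idea, written in the style of the vertex programs later in this lecture; the TexCoord0 output and the toonRamp sampler are illustrative names, not from the slides:
- // vertex program: the clamped dot product becomes a texture coordinate
- float intensity = clamp(dot(wsnorm, lightvec), 0.0, 1.0);
- OUT.TexCoord0 = float4(intensity, 0, 0, 1); // index into the 1D ramp
- // per pixel, the hard-stepped ramp replaces the smooth falloff:
- // float4 toon = tex1D(toonRamp, texcoord.x);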
30. What is Cg?
- Cg is a C-like language that the graphics card compiles into a program
- The program is run once per-vertex and/or per-pixel on the graphics card
- Cg does not have all the functionality of C:
- Different type system
- Can't include standard system headers
- No malloc
- http://www.cgshaders.org/articles/ has the technical documentation for Cg
- Cg is actually an abstraction of the more primitive assembly language that the programmable hardware originally supported
31. Cg Tips
- Understand the different spaces your vertices may exist in:
- Model space: the space in which your input vertex positions exist; in this space the center of the model is at the origin
- World space: the space in which you will do most of your calculations
- Clip space: the space in which your output vertex positions must exist; this space represents the canonical view volume
- If you want a vector to have length 1, make sure to normalize it; this often happens when you want to use a vector to represent a direction
- When writing a Cg program, try to go one step at a time. One sequence of steps might be:
- Make sure the model vertex positions are being calculated correctly
- Set the color or texture coordinates to an arbitrary value; verify that you are changing the surface color
- Calculate the color or texture coordinates correctly
- Check out http://cgshaders.org/articles/ for some helpful documents
32. The Big Picture
- Write a .cg file. This will invariably take some sort of information as a parameter to its main() function
- Note that this main() is not compiled by gcc (or any C/C++ compiler). That would generate a symbol conflict, among other things. It is only processed by NVIDIA's Cg compiler
- Write a class that extends CGEffect. This is cs123's object-oriented wrapper around the basic C interface provided by NVIDIA
- The CGEffect subclass allows you to bind data from your .C files to variables in your .cg vertex program
- Make that CGEffect the IScene's current CGEffect by calling IScene::setCGEffect(). IScene will take ownership of the CGEffect at this point, so you will not be deleting the memory you allocated yourself. Rendering will now be done using your vertex shader
- Call IScene::removeCGEffect() if you want to turn vertex shaders off again
33. Cg Example Code (1/2)
- #pragma bind appin.Position = ATTR0
- #pragma bind appin.Normal = ATTR2
- #pragma bind appin.Col0 = ATTR3
- // define inputs from application
- struct appin : application2vertex
- {
-     float4 Position;
-     float4 Normal;
-     float4 Col0;
- };
- #pragma bind vertout.HPosition = HPOS
- #pragma bind vertout.Col0 = COL0
- // define outputs from vertex shader
- struct vertout : vertex2fragment
- {
-     float4 HPosition;
-     float4 Col0;
- };
34. Cg Example Code (2/2)
- vertout main(appin IN,
-              uniform float4 lightpos,
-              uniform float4x4 ModelViewInvTrans,
-              uniform float4x4 ModelView,
-              uniform float4x4 ModelViewProj,
-              uniform float4x4 Projection)
- {
-     vertout OUT;
-     OUT.HPosition = mul(ModelViewProj, IN.Position);
-     float4 wsnorm = mul(ModelViewInvTrans, IN.Normal);
-     wsnorm.w = 0;
-     wsnorm = normalize(wsnorm);
-     float4 worldpoint = mul(ModelView, IN.Position);
-     float4 lightvec = lightpos - worldpoint;
-     lightvec.w = 0;
-     lightvec = normalize(lightvec);
35. Cg Explanation (1/6)
Declare input struct and bindings:
- #pragma bind appin.Position = ATTR0
- #pragma bind appin.Normal = ATTR2
- #pragma bind appin.Col0 = ATTR3
- // define inputs from application
- struct appin : application2vertex
- {
-     float4 Position;
-     float4 Normal;
-     float4 Col0;
- };
- The appin struct extends application2vertex, indicating to Cg that appin will be used to hold per-vertex input. The name "appin" is arbitrary, but the name "application2vertex" is part of Cg
- The #pragma statements establish the mapping between OpenGL's representation for vertex input and the members of appin
- #pragma bind statements are kind of confusing. Vertex inputs are supplied by the OpenGL program and are then stored in registers on the graphics card. These statements tell Cg how to initialize each member of the input struct: i.e., use the value stored in the register specified by the #pragma binding
36. Cg Explanation (2/6)
Declare output struct and bindings:
- #pragma bind vertout.HPosition = HPOS
- #pragma bind vertout.Col0 = COL0
- // define outputs from vertex shader
- struct vertout : vertex2fragment
- {
-     float4 HPosition;
-     float4 Col0;
- };
- The vertout struct extends vertex2fragment, indicating to Cg that vertout will be used to return per-vertex output. The name "vertout" is arbitrary, but the name "vertex2fragment" is part of Cg
- The #pragma statements establish the mapping between the members of vertout and OpenGL's representation for vertex output
- Similarly to inputs, the graphics card expects the vertex outputs to be stored in registers. These #pragma bind statements tell Cg what to do with the values stored in the members of the output struct returned from main(): put them in the register specified by the #pragma bind
- The card then uses the values in these registers in the rest of the pipeline
37. Cg Explanation (3/6)
Entry point to the Cg program:
- vertout main(appin IN,
-              uniform float4 lightpos,
-              uniform float4x4 ModelViewInvTrans,
-              uniform float4x4 ModelView,
-              uniform float4x4 ModelViewProj,
-              uniform float4x4 Projection)
- Cg requires a main() function in every vertex program and uses this function as the entry point
- The return type "vertout" indicates we must return a structure of type vertout, which will hold the per-vertex output
- The IN parameter is of type appin; Cg uses the #pragma bindings from the previous slide to initialize IN with per-vertex input before it is passed to main(). This is read-only
- The "uniform" keyword indicates to Cg that the specified input parameter is constant across all vertices in the current glBegin()/glEnd() block and is supplied by the application
- The ModelView matrix maps from object space to world space
- The ModelViewProj matrix maps from object space to the film plane
- The ModelViewInvTrans is the inverse transpose of the modelview matrix
- Used to move normals from object space to world space
- The Projection matrix maps from world space to the film plane
38. Cg Explanation (4/6)
Create the output vertex; compute and set its clip space position:
- vertout OUT;
- OUT.HPosition = mul(ModelViewProj, IN.Position);
The first thing we do is declare a struct "OUT" of type vertout, which we will use to return per-vertex output; this is a write-only variable. We calculate the vertex's clip space position by multiplying the model space position by the composite modelview and projection matrix.
Compute and normalize the world space normal:
- float4 wsnorm = mul(ModelViewInvTrans, IN.Normal);
- wsnorm.w = 0;
- wsnorm = normalize(wsnorm);
We calculate the world space normal by multiplying the model space normal by the inverse transpose of the modelview matrix. We set w equal to 0 for the world space normal, since all vectors should have 0 as a homogeneous coordinate. Do NOT assume that Cg will do this sort of thing for you; it's not IAlgebra. We normalize the world space normal to ensure that it is of length 1.
39. Cg Explanation (5/6)
Compute the vertex's world space position; compute and normalize the vector from the vertex to the light (in world space):
- float4 worldpoint = mul(ModelView, IN.Position);
- float4 lightvec = lightpos - worldpoint;
- lightvec.w = 0;
- lightvec = normalize(lightvec);
We calculate the vertex's world space position by multiplying its model space position by the modelview matrix (we previously calculated the vertex's clip space position). Since the lightpos constant used in this example is already in world space coordinates, we calculate the vector from the vertex to the light by subtracting the vertex's position from the light's position. Again, to normalize the light vector we set the homogeneous coordinate to 0 and call normalize().
40. Cg Explanation (6/6)
Compute and clamp the dot product (used in the lighting calculation):
- float dp = dot(wsnorm, lightvec);
- dp = clamp(dp, 0.0, 1.0);
To calculate the intensity associated with the incoming light, we dot the world space normal with the world space light vector. So that we do not have to worry about negative dot product values, we clamp the dot product to be between 0.0 and 1.0. Note that we don't use a conditional here; you should almost never have a branch instruction in one of your vertex shaders.
Set the output color; return the output vertex:
- OUT.Col0 = IN.Col0 * dp;
- return OUT;
To calculate the diffuse contribution of the light source, we scale the diffuse color of the object by the dot product. We have set both the clip space position and the color in the OUT structure, so we now return the OUT structure from main().
41. How Can I Set The Parameters?
- We have two different address spaces:
- You have parameters to your main() function in a .cg file
- You have floats and pointers to floats in a C/C++ file
- We provide support code to help bind the two together. Our wrappers also make this all a bit more object-oriented
- Look at the documentation for CGEffect.H/C
- There are bindings for the actual vertex programs and for the individual parameters sent to the vertex program
- The support code handles the ModelView/Projection/etc. matrices automatically
- Let's take a look at a .C file
42. The .C File (1/2)
- #include "CGDiffuse.H"
- CGDiffuse::CGDiffuse(CGcontext context,
-                      const char* strCgFileName,
-                      const char* strModelViewName,
-                      const char* strModelViewProjName,
-                      const char* strProjectionName,
-                      const char* strMVInvTransName,
-                      const double_t lightPosX,
-                      const double_t lightPosY,
-                      const double_t lightPosZ) :
-     CGEffect(context, strCgFileName, strModelViewName,
-              strModelViewProjName, strProjectionName, strMVInvTransName)
- {
-     m_lightPos[0] = lightPosX;
-     m_lightPos[1] = lightPosY;
-     m_lightPos[2] = lightPosZ;
-     m_cgLightPosParam = NULL;
- }
43. The .C File (2/2)
- void CGDiffuse::initializeStudentCgBindings()
- {
-     m_cgLightPosParam = cgGetNamedParameter(m_cgProgramHandle, "lightPos");
-     assert(cgIsParameter(m_cgLightPosParam));
- }
- void CGDiffuse::bindStudentUniformCgParameters()
- {
-     if (cgIsParameter(m_cgLightPosParam))
-     {
-         cgGLSetParameter4f(m_cgLightPosParam,
-                            m_lightPos[0],
-                            m_lightPos[1],
-                            m_lightPos[2],
-                            1);
-     }
- }
44. The .C File Explained (1/3)
Initialize the effect:
- #include "CGDiffuse.H"
- CGDiffuse::CGDiffuse(CGcontext context,
-                      const char* strCGFileName,
-                      const char* strModelViewName,
-                      const char* strModelViewProjName,
-                      const char* strProjectionName,
-                      const char* strMVInvTransName,
-                      const double_t lightPosX,
-                      const double_t lightPosY,
-                      const double_t lightPosZ) :
-     CGEffect(context, strCGFileName, strModelViewName,
-              strModelViewProjName, strProjectionName, strMVInvTransName)
- {
-     // this stuff shouldn't need explanation, so it is elided
- }
- Initializing the effect simply involves calling the superclass constructor, passing it:
- The CGcontext, which IScene stores as the protected variable m_cgContext
- strCGFileName: the .cg file with the Cg code for this effect
- The names of the modelview, composite modelview projection, projection, and modelview inverse transpose matrices.
- These names should be the names of our parameters in the main function of the .cg file, i.e. "ModelViewInvTrans", "ModelView", "ModelViewProj", and "Projection"
45. The .C File Explained (2/3)
Initializing bindings:
- void CGDiffuse::initializeStudentCgBindings()
- {
-     m_cgLightPosParam = cgGetNamedParameter(m_cgProgramHandle, "lightPos");
-     assert(cgIsParameter(m_cgLightPosParam));
- }
- This function is called when the effect is created, to initialize your bindings
- cgGetNamedParameter takes a CGprogram and a string:
- The first parameter is a handle to the text of the corresponding Cg program for this effect
- The CGDiffuse class inherits m_cgProgramHandle from CGEffect; this protected variable is used in most of the Cg calls
- The second parameter, "lightPos", is a string naming the uniform variable
- The uniform variable is in the .cg file, not this .C file!
- It returns a CGparameter
- This binding will be used later on to set a value for the uniform variable lightPos
- We'll see how to do this on the next slide
- Initializing a binding does not give it a value!
46. The .C File Explained (3/3)
Assigning values to a binding:
- void CGDiffuse::bindStudentUniformCgParameters()
- {
-     if (cgIsParameter(m_cgLightPosParam))
-     {
-         cgGLSetParameter4f(m_cgLightPosParam,
-                            m_lightPos[0],
-                            m_lightPos[1],
-                            m_lightPos[2],
-                            1);
-     }
- }
- This function is called to give actual values to a binding
- It is called exactly once by the support code with each call you make to redraw()
- Here, our binding represents the position of the light in our scene
- cgGLSetParameter4f takes the variable in our .C file representing the binding, and four floats
- The binding we're specifying must be to a variable of type float4. In this case we are binding to "lightPos", which is a float4 in our .cg program.
- The variable's fields are initialized to the four floats we specify
- Essentially, this function determines the actual parameters for the uniform variables in the .cg file the next time the Cg program is run
47. Let's Code!
- As a class, let's reconstruct the shader we just saw and add specular lighting.
- Then let's work out what needs to change in the .C file
- Fun!
48. Revised Cg Code
- // the stuff at the top of the file is unchanged in this case. Not so if we
- // were using textures, etc, etc.
- float4 reflect(float4 incoming, float4 normal)
- {
-     float4 temp = 2 * dot(normal, incoming) * normal;
-     return (temp - incoming);
- }
- vertout main(appin IN,
-              uniform float4 eye,
-              uniform float4 lightPos,
-              uniform float4x4 ModelViewInvTrans,
-              uniform float4x4 ModelView,
-              uniform float4x4 ModelViewProj,
-              uniform float4x4 Projection)
- {
-     // same
-     float4 reflectedlight = reflect(lightvec, wsnorm);
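The slide stops here. A hedged guess at the continuation, based on the checklist on slide 51 ("determined the specular component and added it to the diffuse color"); the eyevec computation and the exponent are illustrative, not from the original:
- // a possible continuation (not in the original slide); dp is computed
- // in the elided "same" section exactly as in the diffuse shader:
-     float4 eyevec = normalize(eye - worldpoint); // vertex-to-eye direction
-     float spec = clamp(dot(reflectedlight, eyevec), 0.0, 1.0);
-     spec = pow(spec, 8);                         // illustrative shininess
-     OUT.Col0 = IN.Col0 * dp + spec;              // diffuse + specular
-     return OUT;
- }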
49. Revised .C File (1/2)
- void CGDiffuse::initializeStudentCgBindings()
- {
-     m_cgLightPosParam = cgGetNamedParameter(m_cgProgramHandle, "lightPos");
-     assert(cgIsParameter(m_cgLightPosParam));
-     m_cgEyePointParam = cgGetNamedParameter(m_cgProgramHandle, "eye");
-     assert(cgIsParameter(m_cgEyePointParam));
- }
50. Revised .C File (2/2)
- void CGDiffuse::bindStudentUniformCgParameters()
- {
-     if (cgIsParameter(m_cgLightPosParam))
-     {
-         cgGLSetParameter4f(m_cgLightPosParam,
-                            m_lightPos[0],
-                            m_lightPos[1],
-                            m_lightPos[2],
-                            1);
-     }
-     if (cgIsParameter(m_cgEyePointParam))
-     {
-         const IAPoint eyept = m_camera->eyePoint();
-         cgGLSetParameter4f(m_cgEyePointParam,
-                            eyept[0], eyept[1], eyept[2], 1);
-     }
- }
51. Changes Checklist
- When we went from a diffuse shader to a specular shader, we did the following:
- Wrote the new .cg file
- Added "uniform float4 eye" to the main function
- Determined the specular component and added it to the diffuse color when setting OUT.Col0
- Added the m_cgEyePointParam member variable of type CGparameter to the .H file (.H file not shown)
- Initialized the new binding in initializeStudentCgBindings using cgGetNamedParameter
- Used the program handle inherited from CGEffect, m_cgProgramHandle
- The string was "eye" because we wanted the binding to specify the parameter eye in the .cg program.
- Gave a value to the binding in bindStudentUniformCgParameters using cgGLSetParameter4f
- We got the eye point from the camera and passed it as four floats
- When our Cg program is run, we know that eye will be float4(eyept[0], eyept[1], eyept[2], 1)
52. Debugging Cg
- Debugging Cg can be hard:
- Compile errors happen at runtime, when the shader is loaded, and do not carry any helpful information
- All you get is "The compile returned an error"
- To get some useful feedback, use the Cg compiler:
- /course/cs123/bin/cgc -profile vp20
- No printf
- The only output you have is the vertex you return, so you can use the output color to do primitive testing
- Comment a lot! Treat it as if you were writing assembly code from cs31
53. Cg Types (1/2)
- Used in .C and .H files:
- CGprogram (in example code: m_cgProgramHandle)
- All of the NVIDIA Cg calls are global functions. We need this pointer to tell the NVIDIA Cg library which program we're talking about
- CGparameter (in example code: m_cgLightPosParam, m_cgEyePointParam)
- We need to connect the values ((0,0,1), say) to a parameter (lightpos, for example). This variable represents that connection, or binding.
54. Cg Types (2/2)
- Used in .cg files:
- float4 (in example code: eye, lightpos)
- This is a 4-vector (think IAPoint). You can access the elements in different ways: lightpos.x or lightpos[0], lightpos.y or lightpos[1]
- float4x4 (in example code: ModelView, etc.)
- This is a 4x4 matrix (think IAMatrix). You can do matrix multiplications in hardware with these
- float (can be used within a function, but not as a parameter)
- (Think float.) Unfortunately, you will probably try to pass one as a parameter and get one of the absolutely opaque Cg compile errors. Don't try it! You can't bind to a single float.
- Do use them within the body of a Cg function definition
55. Where did Cg come from?
(or, culture is good for you)
56. Old vs. New
- Before the GeForce 3, graphics programmers sent position, normal, color, transparency, and texture data to the card, and it used the fixed function pipeline to render the vertices (the left side of the picture on slide 5). Sceneview used the fixed function pipeline to render.
- This meant the programmer had limited control over how the hardware created the final image. Doing non-standard effects, like cartoon shading, required a lot of hackery.
- Programmers had to trick the card into doing different effects, or handle a lot of the effects in software
- The current generation of hardware, however, takes a different view of rendering. The programmer simply sends data to the card and then writes a program to interpret the data and create an image. Most programmers still send standard types of data like position, normal, color, and texture data, since it often makes the most sense.
57. Basics
- In the first generation of programmable cards, the programmer wrote short assembly language programs to create a final image.
- Vertex shader programs take as input per-vertex information (object space position, object space normal, etc.) and per-frame constants (perspective matrix, modeling matrix, light position, etc.). They produce some of the following outputs: clip space position, diffuse color, specular color, transparency, texture coordinates, and fog coordinates.
- Pixel shader programs take as input the outputs from the vertex shader program and texture maps. They produce a final color and transparency as output. They are often called fragment shaders.
- Pixel shaders are trickier, so we don't cover them. Take CS224 if you want to learn more!
- [Diagram labels: pixel shaders, vertex shaders]
58. Using a Shader
- A programmer would write the vertex and pixel shaders as simple text files.
- Then a program would load each of the shaders it intended to use. This sends the text file to the driver, where it is compiled into a binary representation and stored on the graphics card.
- Each rendering pass, the program would enable one vertex shader and/or one pixel shader. This tells the graphics card to use them to render the objects instead of the fixed function pipeline.
- Finally, the program passes data to the card. It's interpreted by the shaders and an image is produced!
- Disabling the shaders (or never enabling any) prompts the card to use the fixed function pipeline.
59. Ack, assembly!
- MUL R1, R1.x, R2
- DP4 R1.x, R3, -R1
- MUL oCOL0, v3, R1.x
- MOV R2.xyz, -c1
- MOV R2.w, c18.x
- DP4 R1.x, R2, R2
- RSQ R1.x, R1.x
- MUL R4, R1.x, R2
- MUL R1, c0.yzxw, R4.zxyw
- MAD R2, R4.yzxw, c0.zxyw, -R1
- DP4 R1.x, R2, R2
- RSQ R1.x, R1.x
- Not only do we need to write some CPU assembly to make our game run fast; now we have to write assembly for the graphics cards, too
- Different graphics cards support different versions of the vertex and pixel shader assembly languages
- Shader programs run at different speeds on different cards, so you end up writing different assembly for each card
- John Carmack, a man not afraid of assembly, believes that high-level shader languages are critical for the future success of programmable hardware, and he's right
60. Enter the High Level Languages
- Microsoft's HLSL
- New in DirectX 9
- struct VS_OUTPUT
- {
-     float4 Pos   : POSITION;
-     float3 Light : TEXCOORD0;
-     float3 Norm  : TEXCOORD1;
- };
- VS_OUTPUT VS(float4 Pos : POSITION, float3 Normal : NORMAL)
- {
-     VS_OUTPUT Out = (VS_OUTPUT)0;
-     Out.Pos = mul(Pos, matWorldViewProj);
-     Out.Light = vecLightDir;
-     Out.Norm = normalize(mul(Normal, matWorld));
-     return Out;
- }
- float4 PS(float3 Light : TEXCOORD0, float3 Norm : TEXCOORD1) : COLOR
- {
-     float4 diffuse = { 1.0, 0.0, 0.0, 1.0 };
-     float4 ambient = { 0.1, 0.0, 0.0, 1.0 };
-     return ambient + diffuse * saturate(dot(Light, Norm));
- }
- NVIDIA's Cg
- Geared towards OpenGL, but it can work with DirectX and compile to DirectX-specific assembly languages
- The OpenGL ARB's SLang
- Will be in OpenGL 2.0, whenever that comes out
- (HLSL code from gamasutra.com)
61. Cg
- During the summer of 2002, NVIDIA released Cg. Cg, as we've seen, is a language specification for vertex and pixel shaders that looks a lot like C. It is useful because it's more intuitive and easier to program in than the assembly language used prior to its release.
- For the Modeler assignment you will take advantage of Cg to make a vertex program.
- Cg programs are still simple text files processed by the graphics card, just like the assembly programs. The only difference is the language.
- We can't use HLSL because we're using Linux (duh), and SLang isn't out yet, so... Cg!
62. The Future
- Pixel shaders, pixel shaders, and more pixel shaders
- Shader performance and power are increasing at an insane rate. Look at how far we've come in only two years!
- Real-time ray tracing? Radiosity?
- BRDF/BSSRDF
- Real-time RenderMan?
- Scientific computing
- Who needs the CPU?