Title: Advanced Real-Time Shader Techniques
1Advanced Real-Time Shader Techniques
- Natalya Tatarchuk
- 3D Application Research Group
- ATI Research
2Overview
- Problems that RenderMonkey solves for you
- Real-time shader creation using the RenderMonkey development environment
- Building an effect from scratch
- Tips and tricks for efficient HLSL shaders
- Advanced Shader Examples
3RenderMonkey Addresses the Needs of Real-Time
Effect Developers
- Shaders are more than just assembly code
- Encapsulating shaders can be complex
- Cannot deliver shader-based effects using a standard mechanism
- Solve problems for shader development
- Designing software on emerging hardware is difficult
- Lack of currently existing tools for full shader development
- Need a collaboration tool for artists and programmers for shader creation
4Designed for Extensibility
- Flexible framework design
- Evolves with the graphics API
- Allows easy incorporation of existing APIs
- Full DirectX 9.0 support, including Microsoft HLSL
- Extensible framework to support emerging HL standards
- OpenGL 2.0 Shading Language Prototype
5Making Your Shader Development More Effective
- Simplifies shader creation
- Fast prototyping and debugging of new graphics algorithms
- Helps you share graphics effects with other developers and artists
- Easy integration of effects into existing applications
- Quickly create new components using our flexible framework
- A communication tool between artists and developers
6Program Your Shaders Using Our Intuitive,
Convenient IDE
Workspace View
7Creating a Shader-based Effect Using RenderMonkey
- Set up all your effect parameters
- Variables
- Stream mapping
- Models and textures
- Rendering states
- Create your vertex and pixel shaders
- Link RenderMonkey parameter nodes to your shaders as necessary
- Compile and explore!
8Phong Illumination Example
Classic lighting equation example
where each of the lighting contribution
components can be computed as follows
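The equations on this slide did not survive as text. As a hedged reconstruction, assuming the slide matches the specular lighting pixel shader shown later in this deck (coefficients Ka, Kd, Ks and the ambient, diffuse and specular colors), the classic Phong sum can be written as below. The view vector V in the specular term and the exponent n (the shader constant N) follow the standard Phong model and are assumptions here, since the shader listing is cut off at that point.

```latex
I = K_a\,C_{ambient}
  + K_d\,C_{diffuse}\,(N \cdot L)
  + K_s\,C_{specular}\,\max(R \cdot V,\ 0)^{n},
\qquad
R = 2\,(N \cdot L)\,N - L
```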
9RenderMonkey Run-Time Database Overview
- Encapsulate all effect data in a single text file
- Each Effect Workspace consists of
- Effect Group(s)
- Effect(s)
- Pass(es)
- Render State
- Pixel Shader
- Vertex Shader
- Geometry
- Textures
- Variables, notes and stream mapping nodes
10RenderMonkey Database Uses Standard XML File
Format
- Allows easy data representation
- Industry standard
- User-extensible
- Parsers are readily available
- User-readable file format
- Easily add Perl scripts to post-process your files
- Published DTD for RenderMonkey XML allows robust file validation
- Describes all effect-related information
- Shader code
- Render states
- Models / texture information
- Rendering states
- Import and export easily from your native formats
- Use our parser and run-time format
- Write an exporter / importer plug-in (Example: our Microsoft FX Exporter)
11Workspace View: Your Window into Our Database
- All effect data organized in a hierarchical Workspace tree view
- Node rules enforce correct data relationships
- Full Cut / Copy / Paste / Undo
- Easily identify node data types by their icons
- Quickly identify invalid data
- Easy perusal of data values via automatic tooltips
12Effect Group Nodes
- Group related effects in one container
- Provides a mechanism for dealing with a lot of effects
- Facilitate fallback versions of same effect
- Group shaders by type, pick the first one that validates
- How you group effects is entirely up to you
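The fallback idea ("pick the first one that validates") can be sketched outside the tool as a simple ordered search. This is only an illustration in Python: the effect names, the `target` field and the `supported` set are hypothetical and are not part of the RenderMonkey API.

```python
def pick_first_valid(effects, validates):
    """Return the first effect for which validates(effect) is true, else None."""
    for effect in effects:
        if validates(effect):
            return effect
    return None


# Hypothetical effect group, ordered from most to least demanding version.
group = [
    {"name": "paint_ps_2_0", "target": "ps_2_0"},
    {"name": "paint_ps_1_4", "target": "ps_1_4"},
    {"name": "paint_ps_1_1", "target": "ps_1_1"},
]

# Assumed capabilities of the hardware the effect has to run on.
supported = {"ps_1_4", "ps_1_1"}

chosen = pick_first_valid(group, lambda e: e["target"] in supported)
```

Ordering the group from the richest version down to the simplest is what makes the first valid entry the best available fallback.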
13Effect Nodes
- Encompass all information needed to implement a real-time visual effect
- Composed of one or more passes
- Inherit data and states from a Default Effect
- Used to set a known starting state for all effects
14Data Scope and Traversal
- Store common effect data in the default effect
- All other effects inherit data from it
- Easy to share data from a common point
- Common rendering states
- Common vertex and pixel shaders
- Effects don't inherit data from other effects in the workspace
- Common data that should be global to the effects
- Stream mapping
- Texture variables
- Model variables
- Variable scope is similar to C-style variable scope
- Data validation occurs upward through passes in the effect, then upward through passes in the default effect
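The traversal order described above can be modelled as a lookup that walks upward through the passes of the current effect and then through the passes of the default effect. The sketch below is a simplified Python model under that reading; representing each pass as a dictionary of named nodes is an assumption made for illustration, not RenderMonkey's actual data structure.

```python
def resolve(name, pass_index, effect_passes, default_passes):
    """Find a node by name, searching the scopes in the order the slide describes."""
    # Walk upward from the current pass to the first pass of the effect.
    for scope in reversed(effect_passes[: pass_index + 1]):
        if name in scope:
            return scope[name]
    # Then walk upward through the passes of the default effect.
    for scope in reversed(default_passes):
        if name in scope:
            return scope[name]
    raise KeyError(name)


# Hypothetical workspace: one default effect pass, one effect with two passes.
default_passes = [{"view_proj_matrix": "default_vp", "baseMap": "default_tex"}]
effect_passes = [{"baseMap": "wood_tex"}, {"Ks": 0.5}]

local_texture = resolve("baseMap", 1, effect_passes, default_passes)
inherited_matrix = resolve("view_proj_matrix", 1, effect_passes, default_passes)
```

As with C-style scoping, the nearest definition wins, and a node declared in a later pass is not visible from an earlier one.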
15Pass Nodes
- Every pass is a draw call
- Passes inherit data from the previous pass within the effect
- First pass inherits from default effect
- A typical pass contains
- A vertex and pixel shader pair (either HLSL or ASM) (required)
- Render state block
- Render states are inherited from pass to pass
- Texture objects
- Must reference a valid texture variable
- Each texture object stores associated texture states
- Geometry model reference (required)
- Stream mapping reference (required)
- May also contain nodes of other types (variables, etc.)
- Different geometry can be used in each pass
16Variable Nodes
- Parameters to your shaders
- More intuitive way of dealing with constant store registers
- Give shader constants meaningful names and types
- Manipulate shader constants using convenient GUI widgets
- Supported variable types
- Matrices
- Vectors
- Scalars
- Colors
- Textures
- Strings (Notes)
17Pre-defined Variables Help You Set Up Your
Shaders Quickly
- RenderMonkey IDE calculates their values at run-time
- Provide a set of commonly used parameters
- View projection matrix
- View matrix
- Inverse view matrix
- Projection matrix
- View direction vector
- View position vector
- Time, time cycle period
- cos_time, sin_time, tan_time
- and more
18Edit Shader Parameters Using GUI Widgets via
Editor plug-ins
- Utilize knowledge of the variable type for editing
- Color Editor
- Vector Editor
- Matrix Editor
- Scalar Editor
- Edit variables with custom widgets: easy to create your own
- Only accept changes you are happy with
19Phong Illumination Parameters
- Ambient, diffuse, and specular light contributions
- Coefficients
- Color intensities
- Can be expressed as shader constants
- Normal, view and reflection vectors
- Normal vector is one of the vertex shader inputs
- View and reflection vectors are computed per-vertex
- Specular reflection parameter
- Can also be expressed as a shader constant parameter
20Simplified Stream Mapping Setup
- Set up each stream using the Stream Mapping Editor
- A stream mapping node can be created at any point in the workspace
- Stream mapping nodes can be shared by multiple effects
- Multiple references to a stream mapping node can be created throughout the workspace
- Each pass must have its own stream mapping reference
21Full Support for DirectX 9.0 Shader Models
- Assembly shaders
- Vertex shader versions 1.0/1.1, 2.0
- Pixel shader versions 1.0/1.1/1.3/1.4, 2.0
- Integrated HLSL support
- Supported compilation targets: vs_1_0, vs_2_0, ps_1_x and ps_2_0
- Future support for shader versions 2_0_sw, 2_0_x, and 3_0
22Edit Shaders Using Specialized Editors
- Tabbed view allows editing multiple passes in the effect
- Easy switching between vertex and pixel shaders in the pass
- Full support for standard Windows editor functionality
- Undo / Cut / Copy / Paste / Font settings
- Automatic syntax coloring of shader code
- Separate syntax rules for HLSL and ASM shaders
- Type your code, compile and see instant results!
23Assembly Shader Editor Plug-in
- Easily link RenderMonkey variable nodes to constant store registers
- Syntax colored for shader assembly language
- Automatically maintains count of ALU and texture ops in your shader as you type!
- Allows saving shader code to text files
24HLSL Shader Editor Plug-in
- Allows incredible ease of shader creation
- Simple interface links RenderMonkey nodes to HLSL variables and samplers
- Link vectors, colors, scalars and matrices to variable parameters
- Link texture objects to samplers
- Control target and entry points for your shaders
25Specular Lighting Vertex Shader
    struct VS_OUTPUT
    {
        float4 Pos    : POSITION;
        float3 Normal : TEXCOORD0;
        float3 Light  : TEXCOORD1;
        float3 View   : TEXCOORD2;
    };

    VS_OUTPUT main( float4 Pos : POSITION, float3 Norm : NORMAL )
    {
        VS_OUTPUT Out = (VS_OUTPUT) 0;

        Out.Pos    = mul( view_proj_matrix, Pos );           // Transformed position
        Out.Normal = normalize( mul( view_matrix, Norm ) );  // Normal
        Out.Light  = normalize( -lightDir );                 // Light vector
        float3 Pview = mul( view_matrix, Pos );              // Compute view position
        ...
26Specular Lighting Pixel Shader
    float4 ambient;
    float4 diffuse;
    float4 specular;
    float  Ka;
    float  Ks;
    float  Kd;
    float  N;

    float4 main( float4 Diff   : COLOR0,
                 float3 Normal : TEXCOORD0,
                 float3 Light  : TEXCOORD1,
                 float3 View   : TEXCOORD2 ) : COLOR
    {
        // Compute the reflection vector
        float3 vReflect = normalize( 2 * dot( Normal, Light ) * Normal - Light );

        // Final color is composed of ambient, diffuse and specular
        // contributions
        float4 FinalColor = Ka * ambient +
                            Kd * diffuse * dot( Normal, Light ) +
                            Ks * specular * pow( max( dot( vReflect, ...
27Linking Variable Nodes to Constant Store
Registers is Easy
- Use constant store editor
- Bind a constant storage register to a variable from the effect workspace
- Preview the incoming values
28Read All Application Messages in a Single Place -
The Output Module
- Output the results of shader compilation and application messages
- Linked with the shader editor for compilation error highlighting
- View messages from the renderer of your effect
- Notifies you whenever resources are created (textures loaded, shaders compiled)
29Integrated Compile Time Error Reporting
Simplifies Shader Creation
- Compilation errors displayed in the output module window
- Double-clicking on the error highlights the line containing erroneous code in the editor
- Optional compilation of a single shader, all shaders in active effect or all shaders in the workspace
30Interactively Preview Your Effects in the Viewer
Plug-in
- All changes to the shader or its parameters modify the rendered image in real time
- DirectX 9.0 preview
- HAL / REF
- Customize the preview
- Standard trackball navigation
- Customizable settings for camera and clear colors
- A set of preset views (Front / Back / Side / etc)
31Setup All Render States in the Render State
Editor Plug-in
- Modify any render state within a particular pass
- Render states are inherited
- From previous passes in the effect
- From the default effect
- Great for exploring the results of changing a render state: a useful learning tool
32Using Textures in a RenderMonkey Effect
- Texture variables can be shared between different effects
- Support the following texture types
- 1D, 2D texture maps (JPEG, TGA, BMP formats)
- Cube maps (DDS)
- Volume textures (DDS)
- Dynamic renderable textures
- Texture objects belong to a pass node
- Reference a texture variable to sample from
- Store all related texture and sampler states
- Maps directly to HLSL samplers
33Texture and Sampler State Editing
- Specify texture and sampler state values (filtering, clamping, etc.) for each texture node within a pass
- Texture and sampler states are bundled together in one editor
- State changes modify rendered output in real time
34Dynamic Texture Rendering Example
- Direct output of one or more passes to a texture map
- Sample from that texture to create interesting effects
- Scene post-processing
- Depth of Field
- Gaussian Blur
- Tone Mapping
- HDR Rendering
- Other image processing techniques
35Rendering to a Texture Using RenderMonkey
- Simple to set up
- Special texture variable type
- Renderable Texture
- Direct pass output to a texture
- Use Render Target node to link to a Renderable Texture
- Modify parameters using appropriate editors
- Render Target Editor
- Renderable Texture Editor
36Customize Renderable Texture
- Specify desired texture format
- Select from a variety of formats
- Control texture dimensions
- Width and Height, or
- Tie the texture size to viewport dimensions
- Specify whether mip maps will be generated automatically
37Control Render Target Parameters
- Modify parameters to suit your needs
- Specify whether the texture should be cleared
- Specify clear color
- Toggle whether the depth buffer should be cleared
- Specify the depth clear value
38Develop Shaders With Your Artists
- Use the Artist Editor Interface to explore shaders
- Expose the power of programmable shaders to artists and designers
- Programmers and Artists living in harmony!
- View workspace using Art tab
- Only view data relevant to the artists
- Programmers can select which data is artist-editable
- Provides look and feel of GUI widgets artists are familiar with
- See changes in real time
39Edit All Artist Parameters Using a Single
Interface
- Programmer has control of what parameters can be modified
- Flag variables as artist-editable as needed
- Artist modifies only specified variables
- Use the Artist Editor interface
- Data organization follows the effect structure
- Variables are grouped in tab sheets by passes/effects they belong to
- Artists can tweak parameters and instantly see changes
- Modify vectors, scalars, colors using convenient controls
40Artist Editor Interface
41Overview
- Writing optimal HLSL code
- Compiling issues
- Optimization strategies
- Code structure pointers
- HLSL Shader Examples
- Multi-layer car paint effect
- Translucent Iridescent Shader
- Überlight Shader
42Why use HLSL?
- Faster, easier effect development
- Instant readability of your shader code
- Better code re-use and maintainability
- Optimization
- Added benefit of HLSL compiler optimizations
- Still helps to know what's under the hood
- Industry standard which will run on cards from any vendor
- Current and future industry direction
- Increase your ability to iterate on a given shader design, resulting in better looking games
- Conveniently manage shader permutations
43Compile Targets
- Legal HLSL is still independent of compile target chosen
- But having an HLSL shader doesn't mean it will always run on any hardware!
- Currently supported compile targets
- vs_1_1, vs_2_0, vs_2_sw
- ps_1_1, ps_1_2, ps_1_3, ps_1_4, ps_2_0, ps_2_sw
- Compilation is vendor-independent and is done by a D3DX component that Microsoft can update independent of the runtime release schedule
44Compilation Failure
- The obvious program errors (bad syntax, etc)
- Compile target specific reasons: your shader is too complex for the selected target
- Not enough resources in the selected target
- Uses too many registers (temporaries, for example)
- Too many resulting asm instructions for the compile target
- Lack of capability in the target
- Such as trying to sample a texture in vs_1_1
- Using dynamic branching when unsupported in the target
- Sampling texture too many times for the target (Example: more than 6 for ps_1_4)
- Compiler provides useful messages
45Use Disassembly for Hints
- Very helpful for understanding relationship between compile targets and code generation
- Disassembly output provides valuable hints when compiling down to an older compile target
- If successfully compiled for a more recent target (e.g. ps_2_0), look at the disassembly output for hints when failing to compile to an older target (e.g. ps_1_4)
- Check out instruction count for ALU and tex ops
- Figure out how HLSL instructions get mapped to assembly
46Getting Disassembly Output for Your Shaders
- Directly use FXC
- Compile for any target desired
- Compile both individual shader files and full effects
- Various input arguments
- Allows turning shader optimizations on / off
- Specify different entry points
- Enable / disable generating debug information
47Easier Path to Disassembly
- Use RenderMonkey while developing shaders
- See your changes in real-time
- Disassembly output is updated every time a shader is compiled
- Displays count for ALU and texture ops, as well as the limits for the selected target
- Can save resulting assembly code into text file
48Optimizing HLSL Shaders
- Don't forget you are running on a vector processor
- Do your computations at the most efficient frequency
- Don't do something per-pixel that you can do per-vertex
- Don't perform computation in a shader that you can precompute in the app
- Use HLSL intrinsic functions
- Helps hardware to optimize your shaders
- Know your intrinsics and how they map to asm, especially asm modifiers
49HLSL Syntax Not Limited
- The HLSL code you write is not limited by the compile target you choose
- You can always use loops, subroutines, if-else statements etc.
- If not natively supported in the selected compile target, the compiler will still try to generate code
- Loops will be unrolled
- Subroutines will be inlined
- If-else statements will execute both branches, selecting appropriate output as the result
- Code generation is dependent upon compile target
- Use appropriate data types to improve instruction count
- Store your data in a vector when needed
- However, using appropriate data types helps the compiler do a better job at optimizing your code
50Using If Statement in HLSL
- Can have large performance implications
- Lack of branching support in most asm models
- Both sides of an if statement will be executed
- The output is chosen based on which side of the if would have been taken
- Optimization is different than in the CPU programming world
51Example of Using If in vs_1_1
    if ( Threshold > 0.0 )
        Out.Position = Value1;
    else
        Out.Position = Value2;

generates the following assembly output:

    // calculate lerp value based on Value > 0
    mov r1.w, c2.x
    slt r0.w, c3.x, r1.w
    // lerp between Value1 and Value2
    mov r7, -c1
    add r2, r7, c0
    mad oPos, r0.w, r2, c1
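To see why this assembly needs no branch, the same arithmetic can be traced on the CPU. The sketch below is a Python model of the `slt` and `mad` steps, assuming c0 holds Value1, c1 holds Value2, c2.x holds Threshold and c3.x holds 0.0, which is how the listing reads; the function names are made up for the illustration.

```python
def slt(a, b):
    """Model of the vs_1_1 'slt' instruction: 1.0 if a < b, otherwise 0.0."""
    return 1.0 if a < b else 0.0


def select_position(threshold, value1, value2):
    """Branch-free select: both values are available and a 0/1 weight picks one."""
    t = slt(0.0, threshold)  # 1.0 when threshold > 0
    # mad: t * (value1 - value2) + value2, evaluated per component
    return tuple(t * (v1 - v2) + v2 for v1, v2 in zip(value1, value2))


value1 = (1.0, 2.0, 3.0, 4.0)
value2 = (5.0, 6.0, 7.0, 8.0)
taken = select_position(0.5, value1, value2)
not_taken = select_position(-1.0, value1, value2)
```

Both sides of the `if` are always evaluated; the comparison only produces the weight of a lerp between them.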
52Example of Function Inlining
    // Bias and double a value to take it from 0..1 range to -1..1 range
    float4 bx2( float4 x )
    {
        return 2.0f * x - 1.0f;
    }

    float4 main( float4 tc0 : TEXCOORD0, float4 tc1 : TEXCOORD1,
                 float4 tc2 : TEXCOORD2, float4 tc3 : TEXCOORD3 ) : COLOR
    {
        // Sample noise map three times with different texture coordinates
        float4 noise0 = tex2D( fire_distortion, tc1 );
        float4 noise1 = tex2D( fire_distortion, tc2 );
        float4 noise2 = tex2D( fire_distortion, tc3 );

        // Weighted sum of signed noise
        float4 noiseSum = bx2( noise0 ) * distortion_amount0 +
                          bx2( noise1 ) * distortion_amount1 +
                          bx2( noise2 ) * distortion_amount2;

        // Perturb base coordinates in direction of noiseSum as function of height (y)
        float4 perturbedBaseCoords = tc0 + noiseSum * ( tc0.y * height_attenuation.x +
                                                        height_attenuation.y );

        // Sample base and opacity maps with perturbed coordinates
        float4 base    = tex2D( fire_base, perturbedBaseCoords );
        float4 opacity = tex2D( fire_opacity, perturbedBaseCoords );

        return base * opacity;
    }
53Code Permutations By Compiling Out Portions of
the Code
54Scalar and Vector Data Types Optimization Notes
- Scalar data types are not all natively supported in hardware
- i.e. integers are emulated on float hardware
- Not all targets have native half and none currently have double
- Can apply swizzles to vector types
- float2 vec = pos.xy
- But!
- Not all targets have fully flexible swizzles
- Acquaint yourself with the swizzles native to the relevant compile targets (particularly ps_2_0 and lower)
55Integer Data Type
- Added to make relative addressing more efficient
- Using floats for addressing purposes without defined truncation rules can result in incorrect access to arrays
- All inputs used as ints should be defined as ints in your shader
56Example of Integer Data Type Usage
- Matrix palette indices for skinning
- Declaring a variable as an int is a free operation => no truncation occurs
- Using a float and casting it to an int or using it directly => truncation will happen
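The hazard with float indices can be shown with a small CPU-side sketch. It assumes truncation toward zero, as a float-to-int cast does, and uses a made-up index value that has drifted slightly below 3; the palette contents are stand-ins for skinning matrices.

```python
def fetch_bone(palette, index):
    """Index a matrix palette; a float index is truncated toward zero first."""
    return palette[int(index)]


palette = ["bone0", "bone1", "bone2", "bone3"]  # stand-ins for skinning matrices

exact_index = 3            # declared as an int: used as-is
drifted_index = 2.9999998  # assumed float that lost precision before reaching the shader

from_int = fetch_bone(palette, exact_index)
from_float = fetch_bone(palette, drifted_index)
```

The float path silently selects the neighbouring matrix, which is the kind of incorrect array access that declaring the input as an int avoids.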
57Real-World Shader Examples
- Will present several case studies of developing shaders used in ATI's demos
- Multi-tone car paint effect
- Translucent iridescent effect
- Classic überlight example
- Examples are presented as RenderMonkey™ workspaces
- Distributed publicly with version 1.0 release
58Multi-Tone Car Paint
59Multi-Tone Car Paint Effect
- Multi-tone base color layer
- Microflake layer simulation
- Clear gloss coat
- Dynamically Blurred Reflections
60Car Paint Layers Build Up
Multi-Tone Base Color
Microflake Layer
Clear gloss coat
Final Color Composite
61Multi-Tone Base Paint Layer
- View-dependent lerping between three paint colors
- Normal from appearance-preserving simplification process, N
- Uses subtractive tone to control overall color accumulation
62Multi-Tone Base Coat Vertex Shader
    VS_OUTPUT main( float4 Pos      : POSITION,
                    float3 Normal   : NORMAL,
                    float2 Tex      : TEXCOORD0,
                    float3 Tangent  : TANGENT,
                    float3 Binormal : BINORMAL )
    {
        VS_OUTPUT Out = (VS_OUTPUT) 0;

        // Propagate transformed position out
        Out.Pos = mul( view_proj_matrix, Pos );

        // Compute view vector
        Out.View = normalize( mul( inv_view_matrix, float4( 0, 0, 0, 1 ) ) - Pos );

        // Propagate texture coordinates
        Out.Tex = Tex;

        // Propagate tangent, binormal, and normal vectors to pixel shader
        Out.Normal   = Normal;
        Out.Tangent  = Tangent;
        Out.Binormal = Binormal;

        return Out;
    }
63Multi-Tone Base Coat Pixel Shader
    float4 main( float4 Diff     : COLOR0,
                 float2 Tex      : TEXCOORD0,
                 float3 Tangent  : TEXCOORD1,
                 float3 Binormal : TEXCOORD2,
                 float3 Normal   : TEXCOORD3,
                 float3 View     : TEXCOORD4 ) : COLOR
    {
        float3 vNormal = tex2D( normalMap, Tex );
        vNormal = 2 * vNormal - 1.0;

        float3 vView = normalize( View );

        float3x3 mTangentToWorld = transpose( float3x3( Tangent, Binormal, Normal ) );
        float3   vNormalWorld    = normalize( mul( mTangentToWorld, vNormal ) );

        float fNdotV   = saturate( dot( vNormalWorld, vView ) );
        float fNdotVSq = fNdotV * fNdotV;

        float4 paintColor = fNdotV * paintColor0 +
                            fNdotVSq * paintColorMid +
                            fNdotVSq * fNdotVSq * paintColor2;

        return float4( paintColor.rgb, 1.0 );
    }
64Microflake Layer
65Microflake Deposit Layer
- Simulating light interaction resulting from metallic flakes suspended in the enamel coat of the paint
- Uses high frequency normalized vector noise map (Nn) which is repeated across the surface of the car
66Computing Microflake Layer Normals
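- Start out by using normal vector fetched from the normal map, N
- Using the high frequency noise map, compute perturbed normal Np
- Simulate two layers of microflake deposits by computing perturbed normals Np1 and Np2
$N_{p1} = \dfrac{a N_n + b N}{\lVert a N_n + b N \rVert}$, where $a \ll b$
$N_{p2} = \dfrac{c N_n + b N}{\lVert c N_n + b N \rVert}$, where $c = b$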
67Microflake Layer Pixel Shader
    float4 main( float4 Diff       : COLOR0,
                 float2 Tex        : TEXCOORD0,
                 float3 Tangent    : TEXCOORD1,
                 float3 Binormal   : TEXCOORD2,
                 float3 Normal     : TEXCOORD3,
                 float3 View       : TEXCOORD4,
                 float3 SparkleTex : TEXCOORD5 ) : COLOR
    {
        // Fetch and signed scale the normal fetched from the normal map
        float3 vFlakesNormal = 2 * tex2D( microflakeNMap, SparkleTex ) - 1;

        float3 vNp1 = microflakePerturbationA * vFlakesNormal +
                      normalPerturbation * vNormal;
        float3 vNp2 = microflakePerturbation * ( vFlakesNormal + vNormal );

        float3   vView           = normalize( View );
        float3x3 mTangentToWorld = transpose( float3x3( Tangent, Binormal, Normal ) );

        float3 vNp1World = normalize( mul( mTangentToWorld, vNp1 ) );
        float  fFresnel1 = saturate( dot( vNp1World, vView ) );

        float3 vNp2World = normalize( mul( mTangentToWorld, vNp2 ) );
        float  fFresnel2 = saturate( dot( vNp2World, vView ) );

        float fFresnel1Sq = fFresnel1 * fFresnel1;

        float4 paintColor = fFresnel1 * flakeColor +
                            fFresnel1Sq * flakeColor +
                            fFresnel1Sq * fFresnel1Sq * flakeColor +
                            pow( fFresnel2, 16 ) * flakeColor;
        ...
68Clear Gloss Coat
69Dynamically Blurred Reflections
Blurred Reflections
70Dynamic Blurring of Environment Map Reflections
- A gloss map can be supplied to specify the regions where reflections can be blurred
- Use bias when sampling the environment map to vary blurriness of the resulting reflections
- Use texCUBEbias to access the cubic environment map
- For rough specular, the bias is high, causing a blurring effect
- Can also convert color fetched from environment map to luminance in rough trim areas
71Clear Gloss Coat Pixel Shader
    float4 ps_main( ... /* same inputs as in the previous shader */ )
    {
        // ... use normal in world space (see Multi-tone pixel shader)

        // Compute reflection vector
        float  fFresnel    = saturate( dot( vNormalWorld, vView ) );
        float3 vReflection = 2 * vNormalWorld * fFresnel - vView;

        float fEnvBias = glossLevel;

        // Sample environment map using this reflection vector and bias
        float4 envMap = texCUBEbias( showroomMap, float4( vReflection, fEnvBias ) );

        // Premultiply by alpha
        envMap.rgb = envMap.rgb * envMap.a;

        // Brighten the environment map sampling result
        envMap.rgb *= brightnessFactor;
        ...
72Compositing Multi-Tone Base Layer and Microflake
Layer
- Base color and flake effect are derived from Np1 and Np2 using the following polynomial
$\underbrace{color_0 (N_{p1} \cdot V) + color_1 (N_{p1} \cdot V)^2 + color_2 (N_{p1} \cdot V)^4}_{\text{Base Color}} + \underbrace{color_3 (N_{p2} \cdot V)^{16}}_{\text{Flake}}$
    ...
    // Compute final paint color: combines all layers of paint as well
    // as two layers of microflakes
    float fFresnel1Sq = fFresnel1 * fFresnel1;

    float4 paintColor = fFresnel1 * paintColor0 +
                        fFresnel1Sq * paintColorMid +
                        fFresnel1Sq * fFresnel1Sq * paintColor2 +
                        pow( fFresnel2, 16 ) * flakeLayerColor;

    // Combine result of environment map reflection with the paint color
    float fEnvContribution = 1.0 - 0.5 * fNdotV;

    // Assemble the final look
    float4 finalColor;
    finalColor.a   = 1.0;
    finalColor.rgb = envMap * fEnvContribution + paintColor;

    return finalColor;
74Original Hand-Tuned Assembly
75Car Paint Shader HLSL Compiler Disassembly Output
76Full Result of the Application of Multi-Layer
Paint to Car Body
77Translucent Iridescent Shader Butterfly Wings
78Translucent Iridescent Shader Butterfly Wings
- Simulates translucency of delicate butterfly wings
- Wings glow from scattered reflected light
- Similar to the effect of softly backlit rice paper
- Displays subtle iridescent lighting
- Similar to rainbow pattern on the surface of soap bubbles
- Caused by the interference of light waves resulting from multiple reflections of light off of surfaces of varying thickness
- Combines gloss, opacity and normal maps for a multi-layered final look
- Gloss map contributes to satiny highlights
- Opacity map allows portions of wings to be transparent
- Normal map is used to give wings a bump-mapped look
79RenderMonkey Butterfly Wings Shader Example
- Parameters that contribute to the translucency and iridescence look
- Light position and scene ambient color
- Translucency coefficient
- Gloss scale and bias
- Scale and bias for speed of iridescence change
- Workspace: HLSL_IridescentButterly.xml
80Translucent Iridescent Shader Algorithm Basic
Steps
- Compute light, view and halfway vectors in tangent space
- Load base texture map, gloss map, opacity map and normal map
- Compute diffusely reflected light
- Compute scattered illumination contribution
- Adjust for transparency of wings
- Compute iridescence contribution
- Add gloss highlights
- Assemble final color
81Translucent Iridescent Shader Vertex Shader
    ...
    // Propagate input texture coordinates
    Out.Tex = Tex;

    // Define tangent space matrix
    float3x3 mTangentSpace;
    mTangentSpace[0] = Tangent;
    mTangentSpace[1] = Binormal;
    mTangentSpace[2] = Normal;

    // Compute the light vector (object space)
    float3 vLight = normalize( mul( inv_view_matrix, lightPos ) - Pos );

    // Output light vector in tangent space
    Out.Light = mul( mTangentSpace, vLight );

    // Compute the view vector (object space)
    float3 vView = normalize( mul( inv_view_matrix, float4( 0, 0, 0, 1 ) ) - Pos );
    ...
82Translucent Iridescent Shader Loading
Information
    float3 vNormal, baseColor;
    float  fGloss, fTranslucency;

    // Load normal and gloss map
    float4( vNormal, fGloss ) = tex2D( bump_glossMap, Tex );

    // Load base and opacity map
    float4( baseColor, fTranslucency ) = tex2D( base_opacityMap, Tex );
83Diffuse Illumination For Translucency
    float3 scatteredIllumination = saturate( dot( -vNormal, Light ) ) *
                                   fTranslucency * translucencyCoeff;

    float3 diffuseContribution = saturate( dot( vNormal, Light ) ) + ambient;

    baseColor *= scatteredIllumination + diffuseContribution;
84Adding Opacity to Butterfly Wings
- Resulting color is modulated by the opacity value to add transparency to the wings

    // Premultiply alpha blend to avoid clamping the highlights
    baseColor *= fOpacity;
85Making Butterfly Wings Iridescent
    // Compute index into the iridescence gradient map, which
    // consists of N·V coefficient
    float fGradientIndex = dot( vNormal, View ) * iridescence_speed_scale +
                           iridescence_speed_bias;

    // Load the iridescence value from the gradient map
    float4 iridescence = tex1D( gradientMap, fGradientIndex );
86Assembling Final Color
    // Compute glossy highlights using values from gloss map
    float fGlossValue = fGloss * ( saturate( dot( vNormal, Half ) ) *
                                   gloss_scale + gloss_bias );

    // Assemble the final color for the wings
    baseColor += fGlossValue * iridescence;
87HLSL Disassembly Comparison to the Hand-Tuned
Assembly
12 ALU + 3 Texture = 15 Total
15 ALU + 3 Texture = 18 Total
88Example of Translucent Iridescent Shader
89Optimization Study Überlight
- Flexible light described in JGT article "Lighting Controls for Computer Cinematography" by Ronen Barzel of Pixar
- Überlight is procedural and has many controls
- light type, intensity, light color, cuton, cutoff, near edge, far edge, falloff, falloff distance, max intensity, parallel rays, shearx, sheary, width, height, width edge, height edge, roundness and beam distribution
- Code here is based upon the public domain RenderMan implementation by Larry Gritz
90Überlight Spotlight Mode
- Spotlight mode defines a procedural volume with smooth boundaries
- Shape of spotlight is made up of two nested superellipses which are swept along direction of light
- Also has smooth cuton and cutoff planes
- Can tune parameters to get all sorts of looks
91Überlight Spotlight Volume
[Figure: spotlight volume with roundness = ½]
92Überlight Spotlight Volume
[Figure: inner swept superellipse (dimensions a, b) nested inside outer swept superellipse (dimensions A, B); roundness = 1]
93Original clipSuperellipse() routine
- Computes attenuation as a function of a point's position in the swept superellipse
- Directly ported from original RenderMan source
- Compiles to 42 cycles in ps_2_0, 40 cycles on R3x0

    float clipSuperellipse( float3 Q,           // Test point on the x-y plane
                            float  a,           // Inner superellipse
                            float  b,
                            float  A,           // Outer superellipse
                            float  B,
                            float  roundness )  // Same roundness for both ellipses
    {
        float x = abs( Q.x ), y = abs( Q.y );
        float re = 2 / roundness;               // roundness exponent
        float q = a * b * pow( pow( b * x, re ) + pow( a * y, re ), -1 / re );
        float r = A * B * pow( pow( B * x, re ) + pow( A * y, re ), -1 / re );
        return smoothstep( q, r, 1 );
    }
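The behaviour of this routine is easier to see with numbers, so here is a CPU-side port as a sketch. It is written in Python rather than HLSL so it can be run standalone; `smoothstep` follows the saturate-based definition, and the early return for a point exactly on the axis is an added guard (an assumption, since the shader listing has no such case) that avoids raising zero to a negative power.

```python
def saturate(x):
    """Clamp to the 0..1 range, like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)


def smoothstep(edge0, edge1, x):
    """Hermite interpolation between edge0 and edge1, clamped outside that range."""
    t = saturate((x - edge0) / (edge1 - edge0))
    return t * t * (3.0 - 2.0 * t)


def clip_superellipse(qx, qy, a, b, A, B, roundness):
    """Return 0 inside the inner superellipse, 1 outside the outer one, smooth in between."""
    x, y = abs(qx), abs(qy)
    if x == 0.0 and y == 0.0:
        return 0.0  # assumption: the axis is always inside the inner superellipse
    re = 2.0 / roundness  # roundness exponent
    q = a * b * ((b * x) ** re + (a * y) ** re) ** (-1.0 / re)
    r = A * B * ((B * x) ** re + (A * y) ** re) ** (-1.0 / re)
    return smoothstep(q, r, 1.0)


# With roundness 1 the superellipses are circles of radius 1 (inner) and 2 (outer).
inside = clip_superellipse(0.5, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0)
between = clip_superellipse(1.5, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0)
outside = clip_superellipse(3.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0)
```

In this example a point halfway between the two circles gets a value of about 0.5, and the value saturates to 0 and 1 on either side of the band.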
94Vectorized Version
- Precompute functions of roundness in app
- Vectorize abs() and all of the multiplications
- Compiles to 33 cycles in ps_2_0, 28 cycles on R3x0

    float clipSuperellipse( float2 Q,     // Test point on the x-y plane
                            float4 aABb,  // Dimensions of superellipses
                            float2 r )    // Two precomputed functions of roundness
    {
        float2 qr, Qabs = abs( Q );
        float2 bx_Bx = Qabs.x * aABb.wzyx;  // Swizzle to unpack bB
        float2 ay_Ay = Qabs.y * aABb;
        qr.x = pow( pow( bx_Bx.x, r.x ) + pow( ay_Ay.x, r.x ), r.y );
        qr.y = pow( pow( bx_Bx.y, r.x ) + pow( ay_Ay.y, r.x ), r.y );
        qr *= aABb * aABb.wzyx;
        return smoothstep( qr.x, qr.y, 1 );
    }
95smoothstep() function
- Standard function in procedural shading
- Intrinsics built into RenderMan and DirectX HLSL
[Figure: smoothstep curve rising from 0 at edge0 to 1 at edge1]
96C implementation
    float smoothstep( float edge0, float edge1, float x )
    {
        if ( x < edge0 )
            return 0;
        if ( x >= edge1 )
            return 1;

        // Scale/bias into 0..1 range
        x = ( x - edge0 ) / ( edge1 - edge0 );

        return x * x * ( 3 - 2 * x );
    }
97HLSL implementation
- The free saturate handles x outside of the edge0..edge1 range

    float smoothstep( float edge0, float edge1, float x )
    {
        // Scale, bias and saturate x to 0..1 range
        x = saturate( ( x - edge0 ) / ( edge1 - edge0 ) );

        // Evaluate polynomial
        return x * x * ( 3 - 2 * x );
    }
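That the clamped form matches the branching form can be checked numerically. The sketch below is in Python rather than HLSL so it can be run standalone; `saturate` is modelled with `min` and `max`, and the function names are chosen for the illustration.

```python
def smoothstep_branching(edge0, edge1, x):
    """Branching version: explicit tests for x below edge0 or at/above edge1."""
    if x < edge0:
        return 0.0
    if x >= edge1:
        return 1.0
    t = (x - edge0) / (edge1 - edge0)
    return t * t * (3.0 - 2.0 * t)


def smoothstep_saturate(edge0, edge1, x):
    """Clamped version: one saturate replaces both branches."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


# Sample points below, inside and above the edge0..edge1 range.
samples = [-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0]
differences = [
    abs(smoothstep_branching(0.0, 1.0, x) - smoothstep_saturate(0.0, 1.0, x))
    for x in samples
]
```

The polynomial evaluates to exactly 0 at t = 0 and 1 at t = 1, so clamping t first gives the same result as returning early.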
98Vectorized HLSL Implementation
- With these optimizations, the entire spotlight
volume computation of überlight compiles to 47
cycles in ps_2_0, 41 cycles on R3x0
    float3 smoothstep3( float3 edge0, float3 edge1,
                        float3 OneOverWidth, float3 x )
    {
        // Scale, bias and saturate x to 0..1 range
        x = saturate( ( x - edge0 ) * OneOverWidth );

        // Evaluate polynomial
        return x * x * ( 3 - 2 * x );
    }
99Conclusion
- RenderMonkey IDE is a powerful, intuitive environment for developing shaders
- Prototype your shaders quickly
- Explore all parameters for your shaders using convenient GUI widgets
- Let your artists tweak shader parameters to find the desired look
- Develop shaders with your artists!
100Presentation Summary
- Building effects in real-time with RenderMonkey
- Writing optimal HLSL code
- Shader Examples
- Shipped with RenderMonkey version 1.0; see www.ati.com/developer
HLSL Car Paint.xml
HLSL Iridescent Butterfly.xml