DirectX Graphics 9 Overview


Slides: 66
Provided by: chas158

Transcript and Presenter's Notes


1
DirectX Graphics 9 Overview
  • PM
  • DirectX Graphics Team
  • Microsoft Corporation

2
Overview
  • Higher Order Surface update
  • Displacement Mapping
  • Vertex Shader 2.0
  • Pixel Shader 2.0
  • Infrastructure
  • Future plans

3
API Update
  • DXG9 is a new .dll
  • DX8 interfaces handed to previous dll
  • API tokens renamed from 8 to 9
  • All object handles are converted to interfaces so
    can be Released()
  • TSS split up to handle samplers vs texture
    coord/parameter iterators
  • Will clearly delineate Renderstates that still
    apply when using shaders

4
Scissor Plane
  • One per-device rect. that can be set to clip
    pixels
  • Supported in all HW
  • Interesting for productivity apps
  • API cap and enable flag
  • Devcaps->Scissor
  • D3DRS_SCISSORENABLE
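
The scissor test above can be sketched as a simple rectangle check per pixel. This is only an illustration: the function name, the tuple layout of the rectangle, and the choice to treat the right and bottom edges as exclusive are assumptions, not details from the slides.

```python
def scissor_pass(x, y, rect, enabled=True):
    """Return True if pixel (x, y) survives the scissor test.

    rect is (left, top, right, bottom). Right and bottom are treated
    as exclusive, which is an assumption made for this sketch.
    """
    if not enabled:
        # Mirrors the enable flag being off: nothing is clipped.
        return True
    left, top, right, bottom = rect
    return left <= x < right and top <= y < bottom
```

With a rectangle of (0, 0, 10, 10), pixel (5, 5) passes and pixel (10, 5) is clipped.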

5
Quads
  • May return as higher order primitive
  • Shared vertex patch scheme, and/or
  • Catmull-Clark subdivision surfaces
  • Implementations can use 2 tris for planar case
  • Would use gl-style color provoking
  • Last vert of quad sets flat-shade color
  • Not a high priority for DXG9

6
Fog
  • Still supported outside shaders
  • In FF pipeline controlled by renderstate
  • Old apps still rely on this behavior
  • Many newer apps will use shader code to implement
    it instead
  • May move into shader in DX10

7
Deprecated Features
  • W-Buffering
  • Less important with high-precision pixel ops
  • May be slower than just using z depth
  • Color key
  • Likely to be cut from textures
  • May support some keying in video path

8
HO Primitive Overview
  • More support for float tessellation levels
  • Allows smoother variation of LOD
  • Rational surface support
  • Adaptive tessellation
  • Displacement mapping
  • 2 methods

9
Rational Surfaces
  • Improves quality of surface model
  • Reduces primitives reqd. in content
  • Enabled in most current hardware
  • API support
  • D3DRS_RATIONALSURFACE
  • Devcaps->RationalSurface
  • Affects N-patches and RT patches

10
Adaptive Tessellation
  • Screen-space edge length-based criterion
  • Avoids cracks across shared edges
  • Supported for both RT-patches and N-patches
  • API specifies
  • Desired edge length in screen space
  • Min/max limits on resulting segment count
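
One way to picture the edge-length criterion is the sketch below. The slides do not give a formula, so the rounding rule and all names here are assumptions. The point it illustrates is that the segment count depends only on the edge's two endpoints, so two patches sharing an edge arrive at the same count and no crack opens between them.

```python
import math

def segment_count(p0, p1, desired_len, min_segs, max_segs):
    """Segments for one edge, given screen-space endpoints p0 and p1.

    desired_len is the target segment length in pixels; the result is
    clamped to [min_segs, max_segs]. The ceil() rounding is an assumption.
    """
    edge_len = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    segs = math.ceil(edge_len / desired_len)
    return max(min_segs, min(max_segs, segs))
```

Swapping the endpoints gives the same answer, which is what a neighbouring patch would compute for the shared edge.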

11
Displacement Mapping
  • Implemented as perturbation to tessellated
    vertices
  • Not happening per pixel yet
  • Displacement is special register in vertex shader
  • Defined by vertex shader declaration
  • If present in FF pipeline, displacement will be
  • Scaled by normal and added to position
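
The fixed-function behaviour described above (scale the normal by the displacement, then add to the position) can be written as a one-line sketch. The function name and the tuple representation of vectors are illustrative assumptions.

```python
def displace(position, normal, displacement):
    """Offset a tessellated vertex along its normal by a scalar amount."""
    return tuple(p + n * displacement for p, n in zip(position, normal))
```

For example, a vertex at (1.0, 2.0, 3.0) with normal (0.0, 1.0, 0.0) and displacement 0.5 moves to (1.0, 2.5, 3.0).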

12
D-Mapping Methods
  • 2 types of capability supported
  • 1. Precomputed Displacement Mapping
  • Better for objects
  • 2. Sampled Displacement Mapping
  • Better for terrain

13
1. Precomputed D-Mapping
  • Assumes all primitives tessellated to same level
  • Displacements are pre-computed at points to be
    produced by tessellator
  • Ordering may be HW-specific
  • Displacements provided as additional
    asynchronous vertex stream
  • Displacement available in vertex shader as
    read-only register

14
Precomputed D-Map API
  • API routine provided to generate displacement
    values
  • Syntax proposals
  • Set result vector as vertex shader input stream
    component
  • Or possibly as vshader texture

15
Precomputed D-Map Issues
  • Advantages
  • Precomputation enables fast runtime
  • Useful for characters and compact objects
  • Limitations
  • Won't work with adaptive tessellation
  • => May not be as good for terrain
  • Can't animate displacement map
  • Even scrolling it requires regeneration
  • Procedural mods to d-map likewise

16
2. Sampled D-Mapping
  • Filter/sample dmap at render time
  • Input surface specified as special texture for
    vertex access
  • Evaluator samples using tri-linear
  • Makes scalar result available in special register
    for use by vertex shader
  • No support for dependent reads
  • i.e., no shader modification of u/v values before
    sampling

17
Sampled D-Map Issues
  • Advantages
  • Works with adaptive tessellation
  • Therefore faster on terrain
  • Provides automatic LOD for objects
  • Limitations
  • Must use care with displacement maps across seams
    to avoid cracking
  • Implementations must guarantee order of operations

18
D-Map Usage Scenario A
  • Encode normal maps assuming displacements present
  • Compute lighting basis from base mesh normals
    only
  • Do all lighting per-pixel
  • Dmap provides occlusion effect only
  • Can be easily skipped in far field
  • Requires dmap and nmap data to be synched - can't
    tile independently

19
D-Map Usage Scenario B
  • Compute pixel lighting basis per post-tessellated
    polygon
  • Use with per-pixel or per-vertex lighting
  • Requires displacement map to contain normal
    displacements also

20
D-Map Authoring
  • Tool development underway in D3DX
  • Take high polycount models
  • Convert to low-polycount displacement map
  • Many other issues being handled

21
DMAP Data Formats
  • 32-bit float pixel format is ideal
  • Other formats supported
  • 16-bit float
  • 16 and 8-bit signed integer
  • 2 Channel counts supported
  • 1-channel fmt. for displacement only
  • 4-channel fmt. for normal displacement

22
Vertex Shader Update
  • Minimal changes from DirectX8
  • Double instruction and const count
  • Both set at 256
  • No new math ops
  • Add support for flow control

23
Vs.2.0 Model
  • 4-D address register a0 w-o 1
  • 16 vertex inputs v r-o 1
  • 16 temporary registers r r/w 3
  • 256 constants c r-o 1
  • 256 instructions
  • using same constant twice in one instruction is
    ok

24
Vertex Shader Flow Control
  • DX9 shaders will support flow control, based on
    constants only
  • Designed to solve current issues
  • enable/disable envt. mapping, etc.
  • varying of lights problem
  • Otherwise shaders are a regression vs. fixed
    function DX7-style capability

25
Instruction Counts vs. Slots
  • Flow control means slots != counts
  • Instruction store is 256, but more instructions
    can be executed than are stored
  • Actual instruction count limit is higher
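
A toy model makes the slots-versus-counts distinction concrete. The program encoding below is entirely hypothetical (tuples for ordinary instructions, a `("loop", repeat, body)` tuple for loops), and it assumes the loop instruction itself occupies one slot and executes once.

```python
def count_slots_and_executed(program):
    """Return (stored slots, executed instructions) for a toy program.

    Each entry is either ("op_name",) or ("loop", repeat, body), where
    body is a nested list in the same format. Hypothetical encoding.
    """
    slots = executed = 0
    for instr in program:
        if instr[0] == "loop":
            _, repeat, body = instr
            body_slots, body_executed = count_slots_and_executed(body)
            slots += 1 + body_slots               # loop instruction + stored body
            executed += 1 + repeat * body_executed
        else:
            slots += 1
            executed += 1
    return slots, executed
```

A five-slot program such as `[("mov",), ("loop", 8, [("dp3",), ("mad",)]), ("mov",)]` executes 19 instructions in this model, which is how the executed count can exceed the 256-slot store.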

26
Vertex Shader Jump
  • Syntax
  • jump condition, label
  • Where
  • condition evaluates to a constant reg, and
  • label is an instruction downstream of this one

27
Vertex Shader Loop
  • Syntax
  • loop condition, label
  • Where
  • condition evaluates to a constant reg, and
  • label is an instruction downstream of this one
  • Indexing of constant array bank via
    auto-increment counter

28
Vertex Shader Subroutine
  • Syntax
  • jsr condition, label
  • Where
  • condition evaluates to a constant reg, and
  • label is an instruction
  • Ret statement marks end of routine
  • Consumes an instruction slot

29
Vertex Flow Rules
  • Jumps can appear inside loops
  • Subroutines can be called inside loops
  • Jumps and loops can be in subroutines
  • Jumps cannot enter or leave loops
  • Subroutines cannot be nested

30
Vertex Shader Declaration
  • Syntax will be updated to decouple declaration
    from execution
  • Declaration will allow redefinition of VB inputs
    without requiring recompilation of shader body
  • E.g. Separate handle for declaration
  • Will also support more orthogonal feature set

31
Vertex Input Formats
  • Orthogonal Types
  • 1-, 2-, 3-, and 4-D float
  • 2 and 4-D short/word
  • 11-11-10 for normals
  • BYTE4 0xWWZZYYXX
  • D3DCOLOR 0xWWXXYYZZ
  • Pad unspecified elements with
  • x,y,z,w = 0,0,0,1
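
The two packed byte layouts and the padding rule above can be sketched as follows. The function names are made up for illustration; the byte positions follow the 0xWWZZYYXX and 0xWWXXYYZZ patterns given on the slide.

```python
def unpack_byte4(value):
    """BYTE4 layout 0xWWZZYYXX -> (x, y, z, w), x in the lowest byte."""
    return (value & 0xFF, (value >> 8) & 0xFF,
            (value >> 16) & 0xFF, (value >> 24) & 0xFF)

def unpack_d3dcolor(value):
    """D3DCOLOR layout 0xWWXXYYZZ -> (x, y, z, w), z in the lowest byte."""
    return ((value >> 16) & 0xFF, (value >> 8) & 0xFF,
            value & 0xFF, (value >> 24) & 0xFF)

def pad_to_4d(components):
    """Fill missing trailing components with the defaults 0, 0, 0, 1."""
    defaults = (0.0, 0.0, 0.0, 1.0)
    return tuple(components) + defaults[len(components):]
```

Note that the same (x, y, z, w) = (1, 2, 3, 4) is stored as 0x04030201 in BYTE4 but as 0x04010203 in D3DCOLOR, and a 2-D input (0.5, 0.25) is padded to (0.5, 0.25, 0.0, 1.0).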

32
Orthogonal Data Mappings
  • Support all the preceding with
  • Signed and unsigned
  • 2's complement or not
  • Normalized and not
  • Normalized signed values are -1.0 to 1.0
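
As a rough sketch of what "normalized" means for integer inputs, the helpers below map integers to floats. The slides only state the target range of -1.0 to 1.0 for signed values; dividing by the largest positive value and clamping the most negative one is an assumption made here, and the function names are illustrative.

```python
def normalize_signed(value, bits):
    """Map a 2's complement integer of the given width to [-1.0, 1.0].

    Assumption: divide by the largest positive value and clamp, so the
    most negative integer also maps to -1.0.
    """
    max_positive = (1 << (bits - 1)) - 1
    return max(value / max_positive, -1.0)

def normalize_unsigned(value, bits):
    """Map an unsigned integer of the given width to [0.0, 1.0]."""
    return value / ((1 << bits) - 1)
```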

33
Pixel Shading Overview
  • 2 Levels of Shader Programming
  • High Level Language
  • Assembly Language
  • Both are implemented in D3DX
  • Just as in DX8.1

34
Assembly Language ps.2.0
  • Provides more Direct control
  • Exposes details of hw
  • Register counts
  • Instruction counts
  • Will support .cpp macro capability
  • Can be embedded in higher-level language as in
    effect files

35
Syntax Note
  • Syntax today is not final
  • It is intended to represent the logical model
  • Please provide feedback on syntax

36
ps.2.0 Model
  • 4-D Vector Registers
  • 2 color iterators v r-o 1
  • 8 texcoord iterators t r/w 2
  • 16 textures/samplers s r-o 1
  • 16 temporary registers r r/w 3
  • 32 constant registers c r-o 2
  • Arbitrarily intermixable Instructions
  • 32 address ops
  • 64 math ops

37
ps.2.0 Sample Shader
  • ps.2.0
  • dcl v0.rgba // diffuse color
  • dcl t0.rg // comes from oT0
  • dcl s0.a, 1D // scalar texture
  • dcl s1.rgb, 2D
  • dcl s2.rgb, 3D, bx2 // convert to signed
  • dcl s3.rgb, cube
  • texld r0, s1, t0 // sample 2-D texture
  • mul r1, r0, v0 // modulate diffuse
  • out r1 // emit

38
Register Declarations
  • Iterators and samplers must be declared before
    use
  • Registers v, t, and s
  • Temp regs r possible
  • Provides opportunity for comment
  • Identifies resources allocated for use in shader
  • Using fewer helps performance

39
Declaration Instruction
  • dcl regname.components
  • Identifies components that this shader will need
    to use
  • Must specify components used
  • No defaults in declarations
  • Identifies 1-D, 2-D, 3-D, vs cube map
  • All other state is in TSS

40
Initializer Instruction (def)
  • def c, val, val, val, val
  • Just an initializer
  • Not a declaration
  • Can be overridden by subsequent calls to
    SetShaderConstant()

41
Address Instructions
  • texld r, s, t/r
  • Loads r with value sampled from stage s at
    coordinate t or r
  • Using r as sampling address indicates dependent
    read, t is non-dependent

42
Address Instructions2
  • texldp r, s, t/r
  • Loads r with value sampled from stage s at
    coordinate t or r dividing by w just before
    sampling
  • Using r as sampling address indicates dependent
    read, t is non-dependent
  • Used for projected texturing
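
The only difference between texldp and texld is the divide by w just before sampling, which the sketch below isolates. The sampler is modelled as a plain callable taking (u, v); that interface and both function names are assumptions for illustration.

```python
def project_coords(coord):
    """Divide x, y, z by w, as done just before a projected sample."""
    x, y, z, w = coord
    return (x / w, y / w, z / w)

def texldp(sampler, coord):
    """Sample a 2-D 'sampler' callable at the projected (u, v)."""
    u, v, _ = project_coords(coord)
    return sampler(u, v)
```

For example, the coordinate (1.0, 3.0, 0.0, 2.0) is sampled at (0.5, 1.5).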

43
Address Instructions3
  • texkill t/r
  • Works with either t or r register inputs
  • Kills pixel if any input components < 0
  • As in DX8
  • Needs flag for > vs
  • Introduces aliasing artifacts that won't be
    helped by most FSAA schemes
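
The kill rule as stated on this slide is a per-pixel yes/no decision, which is also why it produces hard, aliased edges. A minimal sketch, with an illustrative function name:

```python
def texkill(components):
    """Return True if the pixel should be discarded.

    Follows the slide's rule: kill if any input component is below zero.
    """
    return any(c < 0 for c in components)
```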

44
Output Instructions
  • out r
  • Indicates register to contain final color
  • Register r cannot be re-used
  • zout r
  • Indicates register to contain final z-buffer
    value
  • Register r

45
Math Instructions
  • Parallel ops
  • add, sub, mul, mad, frc, cmp
  • Vector ops
  • dp3, dp4
  • Scalar ops
  • rcp, rsq, exp2, log2 , pow
  • Logic ops
  • min, max, sge, slt
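
For readers unfamiliar with these mnemonics, the sketch below models a few of them on 4-component tuples. The semantics are taken from the DirectX 8 shader instruction definitions and are assumed to carry over unchanged; the `op_` function names are illustrative.

```python
import math

def op_mad(a, b, c):
    """Parallel op: per-component a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))

def op_frc(a):
    """Parallel op: fractional part, x - floor(x), always in [0, 1)."""
    return tuple(x - math.floor(x) for x in a)

def op_cmp(a, b, c):
    """Parallel op: per component, pick b where a >= 0, else c."""
    return tuple(y if x >= 0 else z for x, y, z in zip(a, b, c))

def op_dp3(a, b):
    """Vector op: 3-component dot product, replicated to all channels."""
    d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return (d, d, d, d)

def op_sge(a, b):
    """Logic op: 1.0 where a >= b, else 0.0."""
    return tuple(1.0 if x >= y else 0.0 for x, y in zip(a, b))

def op_slt(a, b):
    """Logic op: 1.0 where a < b, else 0.0."""
    return tuple(1.0 if x < y else 0.0 for x, y in zip(a, b))
```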

46
Possible Macro Instructions
  • From DX8 pixel shaders
  • lrp, cnd
  • From DX8 vertex shaders
  • m4x4, m3x3, dst, lit
  • Common requests
  • norm, cross, abs, sincos

47
Input Argument Modifiers
  • Negate any input
  • Arbitrary swizzles/replicates on inputs
  • r0.xyzw r2.rgba r0.a
  • No complement or bias/_bx2 modifiers
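
Swizzles, replicates and negation on a source register can be modelled as below. The helper name and the rule that a single-letter selector replicates to all four channels are assumptions made for this sketch; the x/y/z/w and r/g/b/a names are treated as aliases for the same four channels.

```python
_CHANNELS = {"x": 0, "y": 1, "z": 2, "w": 3,
             "r": 0, "g": 1, "b": 2, "a": 3}

def read_source(reg, swizzle="xyzw", negate=False):
    """Read a 4-component register through a swizzle, optionally negated."""
    if len(swizzle) == 1:
        swizzle = swizzle * 4      # replicate, e.g. r0.a -> (a, a, a, a)
    values = tuple(reg[_CHANNELS[ch]] for ch in swizzle)
    return tuple(-v for v in values) if negate else values
```

With r0 = (1.0, 2.0, 3.0, 4.0), reading `r0.a` yields four copies of 4.0 and reading `r0.wzyx` reverses the components.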

48
Note on Unsigned Textures
  • _bx2 functionality still available for loading
    unsigned formats as signed data in range -1..1
  • DX6/7/8 content needs this
  • Specified on sampler stage declaration or in
    TextureStageState
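
The _bx2 conversion referred to above is the usual bias-and-scale that maps an unsigned value in 0..1 onto -1..1. A minimal sketch, with an illustrative function name:

```python
def bx2(unsigned_value):
    """Map an unsigned value in [0, 1] to the signed range [-1, 1]."""
    return 2.0 * (unsigned_value - 0.5)
```

So a mid-grey texel of 0.5 becomes 0.0, while 0.0 and 1.0 become -1.0 and 1.0.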

49
Output Modifiers
  • Arbitrary mask on outputs
  • r0.y
  • _sat is still required as in frc
  • Clamps to range 0..1
  • No shifts supported
  • _d2, _d4, _x2, etc.

50
Co-Issue
  • None.

51
Dependent Reads
  • Can be serialized, but only to a max depth of 4
  • dcl t0.xy
  • dcl s0.rg, 2D
  • ld r0, s0, t0
  • ld r1, s1, r0
  • ld r2, s1, r1
  • ld r3, s1, r2
  • Is legal
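
A small checker can make the depth limit concrete. The counting convention is an assumption: a load addressed by a t register counts as level 1 and each load addressed by a previously loaded r register adds one, which makes the four-load chain above come out at depth 4. Math instructions between loads are not modelled, and the tuple encoding is hypothetical.

```python
MAX_DEPTH = 4  # limit stated on the slide

def max_dependent_depth(loads):
    """loads is a list of (dest, sampler, address) register names."""
    depth = {}
    deepest = 0
    for dest, _sampler, address in loads:
        if address.startswith("r"):
            level = depth.get(address, 0) + 1   # dependent read
        else:
            level = 1                           # t register: non-dependent
        depth[dest] = level
        deepest = max(deepest, level)
    return deepest

def is_legal(loads):
    """True if the longest chain of loads stays within MAX_DEPTH."""
    return max_dependent_depth(loads) <= MAX_DEPTH
```

Under this convention the slide's chain is legal, and appending a fifth load addressed by r3 would exceed the limit.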

52
TextureStageState
  • Most components unchanged
  • Number of stages updated to 16
  • Some components affect the 8 texture parameters
  • D3DTSS_WRAPUV
  • All others control the 16 samplers
  • D3DTSS_FILTER, D3DTSS

53
DXG9 Surface Formats
  • Format    Filter  Blend
  • ABGR8     y       y
  • ABGR10    y       y
  • ABGR16    y       n
  • ABGR16f   n       n
  • ABGR32f   n       n

54
Channel Counts
  • Will expose support for the following
  • 1-, 2- and 4-channel 32-bit formats
  • 1-, 2- and 4-channel 16-bit formats
  • No odd-byte formats

55
Multiple Color Buffers
  • Looking at writing to multiple color buffers from
    one shader pass
  • Concept of compound render targets
  • 2 different surfaces in same RT
  • Read by setting each surface at separate stage
  • No filtering or blending supported in early
    implementations
  • Reason for out ps.2.0 instruction

56
DXG Infrastructure Update
  • Color converting Present()
  • Perf improvements likely
  • Including stretch case
  • Copy with Vertical Synch flag
  • D3DSWAPEFFECT_GDI_MODE
  • Enables better GDI integration
  • Flag bits at Reset() vs Create()
  • Enables more flexible reset

57
Volume Textures
  • More compression techniques
  • DXTc supported since DX8
  • DXVc added in DX9

58
Gamma Correction
  • Many apps need correct γ math
  • Rendering quality
  • Math (adds) for light or shadow intensities
    breaks down if γ != 1.0
  • Compositing quality
  • α-blending transparent objects
  • e.g., video effects, decals
  • α-blending for antialiasing
  • greatly improves quality IF γ is handled

59
Gamma Correction
  • Hardware now has enough precision to handle gamma
    correctly
  • Correcting gamma from 8-bit textures was
    pointless on 8-bit hardware
  • DXG9 adds explicit API support for gamma
    correction on reads and writes
  • 2 color space standards supported
  • sRGB: gamma 2.2, all current content
  • scRGB: gamma 1.0 (linear), 16-bit precision

60
Gamma Input Syntax
  • New TextureStageState in DXG9
  • D3DTSS_GAMMA
  • sRGB
  • default
  • Indicates that gamma correction from 2.2 space to
    linear will occur on read
  • Performs exponentiation of each channel by 2.2
    before pixel shader

61
Gamma Output Syntax
  • Need to correct back to γ 2.2 for framebuffer
    display
  • Involves taking high-precision data to 0.45th
    power before write to fb blender
  • Enabled by Renderstate
  • D3DRS_RTsRGB, TRUE
  • Indicates that RenderTarget is in sRGB format
    that is gamma corrected
  • relative to pixel shader math
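
The read and write conversions described on these two slides can be sketched with a pure power curve. This follows the slides' 2.2 and 0.45 (roughly 1/2.2) exponents; the actual sRGB standard uses a piecewise curve, so treat this as an approximation. Function names are illustrative.

```python
GAMMA = 2.2

def to_linear(c):
    """Read side: gamma 2.2 value in [0, 1] -> linear intensity."""
    return c ** GAMMA

def to_gamma(c):
    """Write side: linear intensity -> gamma 2.2 (about the 0.45th power)."""
    return c ** (1.0 / GAMMA)

def add_lights(a, b):
    """Add two gamma-encoded intensities correctly, in linear space."""
    return to_gamma(min(to_linear(a) + to_linear(b), 1.0))
```

Adding two lights of encoded value 0.5 this way gives roughly 0.685 rather than the 1.0 that a naive add on the encoded values would produce, which is the kind of error that arises when math is done at γ != 1.0.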

62
Potential Scenarios
  • All current texture art authored at γ 2.2
  • 8-bit γ 2.2 render target
  • 10-bit γ 2.2 render target
  • 10-bit linear render target
  • Correct in DACs?
  • 16-bit integer render target

63
Gamma Usage Patterns
  • Some applications need to use multipass blending
  • Gamma correction is in pixel shader, not frame
    buffer blender
  • Apps will need to do lots of
  • SetTexture( RenderTarget )

64
DirectX10
  • Continue to increase generality
  • Conditionals (predicate)
  • Subroutines (finite depth stack)
  • Still arg calling conventions
  • Pixel-Vertex shader integration

65
Pixel-Vertex Integration
  • Equivalent precisions
  • No double requirement
  • Same instruction set
  • Implementations may share math units?
  • Opaque routines (provided by impl)
  • Tessellate( vertex value, vertex value)
  • Evaluate( pixel value, vertex value)
  • Rationalized input formats
  • Vertex streams vs texture pixel fmts

66
DX10 Gamma
  • Full gamma support in ps for DX9
  • DXG10 hw should support gamma in FB blender
  • Performs x^γ on reads and x^(1/γ) on write
  • With blend math at >14-bit precision
  • Enables blending of data with arbitrary input
    gamma for multi-pass methods