Title: Dependency Graph Scheduling in a Ray Tracing Architecture
1Dependency Graph Scheduling in a Ray Tracing Architecture
- Susan Frank and Arie Kaufman
- Center for Visual Computing
- Department of Computer Science
- State University of New York at Stony Brook, USA
2Why use ray tracing?
- Global illumination
- Unifying technique for volumes
- Processing and rendering
- Triangles, points, implicit surfaces etc.
- Early ray termination
- Scene complexity independence
- Inherently parallel
3Why not use ray tracing?
- Non-uniform memory access
- Need spatial coherence
4Ray Tracing Systems
- Ray Queues [Pharr et al. 97]
- GI-Cube [Dachille, Kaufman 00]
- Pyramid clipping and octree subdivision [Reinhard et al. 99]
- Kilauea system [Nishimura et al. 01]
- AR250 [ART 99]
- Coherent Ray Tracing [Wald et al. 01]
5Outline
- Our System
- Cell Tree
- Dependency Graph Scheduling
- Peel Algorithm
- Results
6GI-Cube Architecture
(Figure: block diagram. Components: DSP, PCI Bus, SDRAM, Ray Bus, four Block Processors, four RDRAM modules. Label: 800MHz)
7Single Processor
(Figure: block diagram. Components: Frame Buffer, Block Processor, DSP, Main Memory, Ray Bus, PCI Bus, SDRAM, RDRAM, CPU)
8Ray Queues
- Maintain ray queue for each cell
- Process all rays while a cell is in cache
- Spawned rays added to queue of next intersected
cell
Subdivide Volume Into Cells
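The ray queue idea above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any cited system: `run_ray_queues`, `most_queued` and `follow_path` are hypothetical names, and the policy of loading the cell with the most waiting rays is an assumption made only to have something to run.

```python
from collections import deque

def most_queued(queues):
    """Hypothetical policy: pick the cell with the most waiting rays."""
    return max(queues, key=lambda cell: len(queues[cell]))

def follow_path(cell, ray):
    """Toy tracer: a ray is the tuple of cells it still has to visit."""
    return [(ray[0], ray[1:])] if ray else []

def run_ray_queues(initial_rays, trace_in_cell, pick_cell=most_queued):
    """Process rays cell by cell and return the order in which cells were loaded.

    initial_rays: iterable of (cell, ray) pairs.
    trace_in_cell(cell, ray): returns (next_cell, ray) pairs for every ray
    that leaves the cell, including spawned rays.
    """
    queues = {}
    for cell, ray in initial_rays:
        queues.setdefault(cell, deque()).append(ray)

    loads = []
    while queues:
        cell = pick_cell(queues)
        loads.append(cell)            # the cell is fetched into the cache once
        queue = queues.pop(cell)
        while queue:                  # process all rays while the cell is cached
            ray = queue.popleft()
            for next_cell, next_ray in trace_in_cell(cell, ray):
                if next_cell == cell:
                    queue.append(next_ray)
                else:
                    queues.setdefault(next_cell, deque()).append(next_ray)
    return loads
```

The length of the returned list is the number of cell fetches, which is the quantity a better schedule tries to reduce.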
9Our Scheduling Schema
- Cell Tree
  - Ray-cell dependencies from frame i used to create schedule for frame i+1
- Max Work
  - First frame (ray dependencies unknown) and if rays remain after Cell Tree schedule
- Any level of the memory hierarchy
  - Cell size set to memory size
10Outline
- Our System
- Cell Tree
- Dependency Graph Scheduling
- Peel Algorithm
- Results
11Pseudo-Random Ray Traversal
12Cell Tree
- Gathers clusters of rays as they're generated
- Concisely describes all ray-cell dependencies of completed frame
- 100 times fewer nodes than rays represented
- Predict better schedule for next frame
13Cell Tree Creation
- Initialize
- Maintain CellTreeNode in Ray Packet
- Add nodes to Cell Tree as needed to represent
ray-cell dependencies
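One way to picture the creation steps above is as building a prefix tree over the cells each ray visits. The sketch below is an assumption-laden illustration: it takes each ray as a complete list of cells (with a spawned ray's list starting with its parent's path), whereas the slides describe the node reference travelling inside the ray packet as rays are traced. `CellTreeNode.child`, `build_cell_tree` and `count_nodes` are hypothetical helpers.

```python
class CellTreeNode:
    """One ray-cell dependency: 'rays reaching this node entered `cell` next'."""

    def __init__(self, cell, parent=None):
        self.cell = cell
        self.parent = parent
        self.children = {}            # cell id -> CellTreeNode

    def child(self, cell):
        """Return the child for `cell`, adding it only if it is missing."""
        node = self.children.get(cell)
        if node is None:
            node = CellTreeNode(cell, self)
            self.children[cell] = node
        return node

def build_cell_tree(ray_paths):
    """Build a tree from ray paths, each a list of cells in visiting order."""
    root = CellTreeNode(cell=None)    # initialize: the tree starts as a bare root
    for path in ray_paths:
        node = root                   # plays the role of the packet's CellTreeNode field
        for cell in path:
            node = node.child(cell)   # nodes are added only as needed
    return root

def count_nodes(node):
    """Number of nodes in the subtree rooted at `node`, including `node`."""
    return 1 + sum(count_nodes(c) for c in node.children.values())
```

Rays that share a prefix of cells share nodes, which is why the tree can be much smaller than the set of rays it represents.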
14Ray Packet
(Figure: ray packet bit layout, with bit offsets marked 256, 196, 128, 64, 0 on one axis and 0 to 32 in steps of 4 on the other. Fields: Position X, Position Y, Position Z, Direction X, Direction Y, Direction Z, Destination U, Destination V, Lifetime, Contribution, Ray ID, Generation, Opacity, Red, Green, Blue, CellTreeNode, Type, Cell, Interaction)
15Initialization
(Figure: slides 15-18 animate the initialization of the Cell Tree, showing cells 0-7 and the first nodes added under the root)
19Reflections, Refractions and Shadows
(Figure: slides 19-24 animate nodes being added to the Cell Tree for reflected, refracted and shadow rays, shown alongside cells 0-7)
25Secondary Reflections
(Figure: slides 25-28 animate nodes being added to the Cell Tree for secondary reflections, shown alongside cells 0-7)
29Outline
- Our System
- Cell Tree
- Dependency Graph Scheduling
- Peel Algorithm
- Results
30Task Scheduling Problem
- Goal - minimize memory fetches
- Equivalently - minimize color changes in a supersequence that contains every ray's cell sequence
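As a small illustration of this formulation, the sketch below treats a schedule as a list of cells, counts a fetch each time the cell changes, and checks that every ray's cell sequence appears in the schedule in order. The function names are hypothetical and the model is a simplification: one cached cell at a time, rays given as plain lists of cells.

```python
def count_fetches(schedule):
    """Count cell loads; consecutive repeats of a cell reuse the cached copy."""
    fetches = 0
    previous = object()               # sentinel that equals no cell
    for cell in schedule:
        if cell != previous:
            fetches += 1
            previous = cell
    return fetches

def is_feasible(schedule, ray_sequences):
    """True if every ray sequence is a subsequence of the schedule."""
    for sequence in ray_sequences:
        remaining = iter(schedule)
        # `cell in remaining` advances the iterator, so order is respected.
        if not all(cell in remaining for cell in sequence):
            return False
    return True
```

For example, with rays `[0, 1, 3]`, `[0, 2, 3]` and `[1, 0]`, the schedule `[0, 1, 2, 0, 3]` is feasible, while `[0, 1, 2, 3]` is not because the third ray needs cell 0 again after cell 1.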
31Cyclic Dependency Graphs
- Rays must visit cells in a particular order
- A ray may revisit a cell several times
- Sub-volume must be cached each time
(Figure: cyclic dependency graph over Cell 0, Cell 1, Cell 2 and Cell 3)
32Cache Saving Links
- feasible schedule - all rays can be processed in required order
- conflict - no feasible schedule contains both links
(Figure: example Cell Tree with cache saving links, shown alongside cells 0-7)
33Cache Saving Links
- optimal schedule - maximal group of non-conflicting links
(Figure: example Cell Tree with cache saving links, shown alongside cells 0-7)
34Chains
- Chain of non-conflicting links may produce a non-feasible schedule
(Figure: example Cell Tree with a chain of cache saving links, shown alongside cells 0-7)
35Multiple Chains
- A combination of chains may also produce a non-feasible schedule
(Figure: example Cell Tree with several chains of cache saving links, shown alongside cells 0-7)
36Definitions
- generation(node) - number of nodes between root and node with the same cell as node
- maxGen(cell) - maximum number of times any ray enters cell
(Figure: example Cell Tree)
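The two definitions can be made concrete with a short sketch. The tree encoding (a dict from node id to a `(cell, parent id)` pair) is an assumption chosen for brevity, and so is the choice to count the node itself in its generation, which makes maxGen(cell) the largest generation among the nodes of that cell.

```python
def generation(tree, node_id):
    """Count the nodes on the root-to-node path that share the node's cell.

    tree maps node id -> (cell, parent id); the root has parent None.
    The node itself is counted (an assumption about the definition).
    """
    cell, parent = tree[node_id]
    count = 1
    while parent is not None:
        parent_cell, grandparent = tree[parent]
        if parent_cell == cell:
            count += 1
        parent = grandparent
    return count

def max_gen(tree):
    """Map each cell to the largest generation of any of its nodes."""
    result = {}
    for node_id, (cell, parent) in tree.items():
        if parent is None:
            continue                  # the root is not a cell node
        result[cell] = max(result.get(cell, 0), generation(tree, node_id))
    return result

# Hypothetical example: one ray path visits cell 3, then cell 1, then cell 3 again.
example_tree = {
    0: (None, None),                  # root
    1: (3, 0),
    2: (1, 1),
    3: (3, 2),                        # second visit to cell 3 on this path
    4: (1, 0),
    5: (3, 4),
}
```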
37Optimal Bound
- schedule size $\geq \sum_{\text{cells}} \text{maxGen}(\text{cell})$
(Figure: example Cell Tree)
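A quick way to evaluate this bound for a set of rays is sketched below. It assumes each ray is given as the list of cells it enters, so the number of times a ray enters a cell is just the count of that cell in its list; `schedule_lower_bound` is a hypothetical name.

```python
from collections import Counter

def schedule_lower_bound(ray_sequences):
    """Sum over cells of the most times any single ray enters that cell."""
    max_gen = Counter()
    for sequence in ray_sequences:
        for cell, entries in Counter(sequence).items():
            max_gen[cell] = max(max_gen[cell], entries)
    return sum(max_gen.values())
```

A ray that enters a cell k times visits other cells in between, so any feasible schedule has to load that cell at least k times. Summing over cells gives the bound.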
38Outline
- Our System
- Cell Tree
- Dependency Graph Scheduling
- Peel Algorithm
- Results
39Peel Algorithm
- Peel tree leaves to create reverse schedule
- Gather cache saving links
40Completion Peel
- Remove the leaf nodes of a ready cell from the tree and add the cell to the schedule
- ready cell - all the maxGen nodes of a cell are leaf nodes
(Figure: example Cell Tree)
41Split Peel
- Remove the leaf nodes of a non-ready cell from the tree and add the cell to the schedule
(Figure: example Cell Tree)
42Peel (tree)
- While tree has any nodes
  - Does tree have a ready cell c?
    - yes - add c to schedule and peel c leaf nodes
    - no - find cellmax, the cell with the most leaf nodes; peel cellmax and add it to schedule
- Return schedule
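The loop above might be sketched as follows. This is an interpretation rather than the authors' implementation: the tree encoding (node id -> `(cell, parent id)`), the reading of "ready" as "every remaining node of the cell at its current highest generation is a leaf", the tie-break among ready cells, and the assumption that cells are integers are all choices made for the example. The sketch also rescans the remaining nodes on every pass, so it makes no attempt to match the running time quoted in the talk.

```python
def peel(tree):
    """Peel leaves of a cell tree and return a forward schedule of cells.

    tree maps node id -> (cell, parent id); the root has cell None, parent None.
    """
    child_count = {node: 0 for node in tree}
    for node, (cell, parent) in tree.items():
        if parent is not None:
            child_count[parent] += 1

    def generation(node):
        # Nodes on the root-to-node path with the same cell, node included.
        cell, parent = tree[node]
        count = 1
        while parent is not None:
            if tree[parent][0] == cell:
                count += 1
            parent = tree[parent][1]
        return count

    remaining = {n for n, (cell, parent) in tree.items() if parent is not None}
    gen = {n: generation(n) for n in remaining}
    reverse_schedule = []

    while remaining:
        leaves, top_gen = {}, {}
        for n in remaining:
            cell = tree[n][0]
            top_gen[cell] = max(top_gen.get(cell, 0), gen[n])
            if child_count[n] == 0:
                leaves.setdefault(cell, []).append(n)
        # A cell is not ready if one of its top-generation nodes has children.
        not_ready = {tree[n][0] for n in remaining
                     if gen[n] == top_gen[tree[n][0]] and child_count[n] > 0}
        ready = [cell for cell in leaves if cell not in not_ready]
        if ready:
            cell = min(ready)                                  # completion peel
        else:
            cell = max(leaves, key=lambda c: len(leaves[c]))   # split peel
        reverse_schedule.append(cell)
        for n in leaves[cell]:
            remaining.discard(n)
            child_count[tree[n][1]] -= 1

    return reverse_schedule[::-1]     # leaves were peeled last-visited first
```

Because a node is only removed once all of its children are gone, every root-to-leaf path appears in order in the reversed list, which is what makes the resulting schedule feasible.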
47(Figure: slides 47-57 step through the Peel algorithm on an example Cell Tree, with fewer nodes remaining on each slide)
58Algorithm Performance
- Guaranteed feasible
- Not guaranteed optimal
- Worst-case time O(n)
- Improvement over Max Work
- Hardware implementation reasonable
59Outline
- Our System
- Cell Tree
- Dependency Graph Scheduling
- Peel Algorithm
- Results
60Tests
- C simulation
- SGI O2/RISC 10000, 128MB
- Volumes split into 8 cells and 27 cells
- Image resolution 256 x 256
64Cell Tree Sizes
65 30% Fewer Fetches
66Conclusion
- Cell Tree captures all ray-cell dependencies
- Dependency graph based algorithm significantly
improves cache performance
67Future Work
- Dynamic load balance
- Dynamic volume subdivision
- Multi-level memory hierarchy
- Limited depth recursion
68Acknowledgments
- ONR Grant N00140110034
- NYSTAR Grant COD0057
- CES Computer Solutions Inc.
- Kevin Kreeger, Frank Dachille, Michael Bender,
Nan Zhang
69Thank you!