Efficient Partitioning of Fragment Shaders for Multiple-Output Hardware - PowerPoint PPT Presentation

Slides: 44

Provided by: ericc150

Learn more at: https://www.graphicshardware.org

Transcript and Presenter's Notes

1
Efficient Partitioning of Fragment Shaders for
Multiple-Output Hardware

Tim Foley
Mike Houston
Pat Hanrahan
Computer Graphics Lab
Stanford University

2
Motivation

GPU Programming
Interactive shading
Offline rendering
Computation
physical simulations
numerical methods
BrookGPU [Buck et al. 2004]
Shouldn't be constrained by hardware limits
but demand high runtime performance

3
Motivation Multipass Partitioning

Divide GPU program (shader) into a partition
set of rendering passes
each pass satisfies all resource constraints
save/restore intermediate values in textures
Many possible partitions exist
The problem
given a program, find the best partition

4
Related Work

SGI's ISL [Peercy et al. 2000]
treat OpenGL machine as SIMD processor
Recursive Dominator Split (RDS) [Chan et al. 2002]
graph partitioning of shader DAG
Data-Dependent Multipass Control Flow on GPU [Popa and McCool 2004]
partition around flow control and schedule passes
Mio [Riffel et al. 2004]
instruction scheduling with backtracking

5
Contribution

Merging Recursive Dominator Split (MRDS)
MRDS Extends RDS
support shaders with multiple outputs
support hardware with multiple render targets
generate more optimal partitions
same running time as RDS

6
Outline

Motivation
Related Work
RDS Algorithm
MRDS Algorithm
Results
Future Work

7
RDS - Overview

Input: DAG of n nodes
shader ops
inputs
interpolants
constants
textures
Goal: mark a subset of nodes as splits
split nodes define pass boundaries
2^n possible subsets
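
As a rough illustration of how a set of split nodes turns into passes, here is a minimal Python sketch. The DAG, the node names, and the helpers `passes_for` and `all_split_sets` are hypothetical and are not taken from the RDS implementation; the sketch also leaves out input nodes and resource limits. It only shows that each split node (plus the final output) roots one pass, and that the number of candidate split sets doubles with every node.

```python
from itertools import combinations

# Hypothetical shader DAG: each op maps to the ops it reads.
# Leaf inputs (textures, constants, interpolants) are omitted for brevity.
dag = {
    "out": ["a", "b"],
    "a": ["m"],
    "b": ["m"],
    "m": [],
}

def passes_for(dag, root, splits):
    """Return {pass_root: set of ops evaluated in that pass}."""
    result = {}
    # The final output always ends a pass, whether or not it is marked.
    for pass_root in sorted(set(splits) | {root}):
        ops, stack = set(), [pass_root]
        while stack:
            node = stack.pop()
            if node in ops:
                continue
            ops.add(node)
            for child in dag[node]:
                # A split child was saved to a texture by its own pass,
                # so this pass fetches it instead of evaluating it.
                if child not in splits:
                    stack.append(child)
        result[pass_root] = ops
    return result

def all_split_sets(dag):
    """Enumerate every subset of nodes that could be marked as splits."""
    nodes = list(dag)
    for k in range(len(nodes) + 1):
        for subset in combinations(nodes, k):
            yield set(subset)

no_splits = passes_for(dag, "out", set())   # one pass evaluates everything
split_m = passes_for(dag, "out", {"m"})     # m gets its own pass
count = sum(1 for _ in all_split_sets(dag)) # grows exponentially with n
```

With no splits the whole shader is a single pass; marking `m` as a split yields one pass that computes `m` and a second that reads it back. Brute-force enumeration is only feasible for tiny graphs, which is why the search space has to be limited.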

10
RDS - Overview

Combination of approaches to limit search space
Save/recompute decisions
primary performance tradeoff
Dominator tree
used to avoid save/recompute tradeoffs

11
RDS Save / Recompute

M: multiply referenced node
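
The tradeoff for a multiply referenced node can be sketched with a toy cost model. Everything below is an assumption made for illustration: the function names, the unit costs, and the default `pass_overhead` and `fetch_cost` values are hypothetical and are not the cost model used by RDS.

```python
def save_cost(subtree_ops, references, pass_overhead, fetch_cost):
    # Evaluate the node once in its own pass, write it to a texture,
    # then fetch it at every reference.
    return pass_overhead + subtree_ops + references * fetch_cost

def recompute_cost(subtree_ops, references):
    # Re-evaluate the node's whole subtree at every reference.
    return references * subtree_ops

def choose(subtree_ops, references, pass_overhead=10, fetch_cost=1):
    """Pick the cheaper option under the toy model (costs are made up)."""
    save = save_cost(subtree_ops, references, pass_overhead, fetch_cost)
    recompute = recompute_cost(subtree_ops, references)
    return "save" if save < recompute else "recompute"

cheap = choose(subtree_ops=3, references=2)       # save 15 vs recompute 6
expensive = choose(subtree_ops=40, references=2)  # save 52 vs recompute 80
```

Under these made-up numbers a small subtree is cheaper to recompute, while a large one is cheaper to save, which is why the decision has to be made per node.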

15
Dominator

B dom G (B dominates G)
all paths to G go through B
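
For readers who want to see the relation computed, the following Python sketch uses the standard graph-theory definition: a node dominates another if every path from the root (here, the shader output) to the second node passes through the first. The DAG and node names are hypothetical, and the function assumes a small acyclic graph; it is not the dominator-tree construction used in RDS.

```python
def dominators(dag, root):
    """dag maps each node to the nodes it reads. Returns node -> dominator set."""
    # Invert the edges so each node knows who references it.
    parents = {n: [] for n in dag}
    for node, children in dag.items():
        for child in children:
            parents[child].append(node)

    # Reverse postorder visits every node after all of its parents.
    order, seen = [], set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for child in dag[node]:
            visit(child)
        order.append(node)
    visit(root)
    order.reverse()

    dom = {root: {root}}
    for node in order[1:]:
        # A node is dominated by itself plus whatever dominates all its parents.
        shared = set.intersection(*(dom[p] for p in parents[node] if p in dom))
        dom[node] = {node} | shared
    return dom

# Hypothetical DAG: m is referenced by both a and b; t is read only by m.
dag = {"out": ["a", "b"], "a": ["m"], "b": ["m"], "m": ["t"], "t": []}
dom = dominators(dag, "out")
```

Here `m` is dominated only by `out` and itself, because it can be reached through either `a` or `b`, while `t` is dominated by `m` since `m` is its only reference.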

16
Dominator Tree
17
Key Insight

if B, G in same pass
and B dom G
then no save/recompute costs for G

18
MRDS Multiple-Output Shaders
20
MRDS Multiple-Output Hardware
float4 x, y;
...
for( i=0; i<N; i++ ) {
  x' = x*x - y*y;
  y' = 2*x*y;
  x = x';
  y = y';
}
...
21
MRDS Multiple-Output Hardware
float4 x, y;
...
for( i=0; i<N; i++ ) {
  x' = f( x, y );
  y' = g( x, y );
  x = x';
  y = y';
}
...
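
A small Python simulation can make the cost of this loop concrete. It assumes that a pass on single-output hardware can write only one of the two values, so each iteration needs one pass for x' and another for y', both reading the old x and y. The bodies of `f` and `g` are placeholders borrowed from the x*x - y*y and 2*x*y example, and the pass counters are purely illustrative.

```python
def f(x, y):
    return x * x - y * y   # placeholder body for f

def g(x, y):
    return 2 * x * y       # placeholder body for g

def run_single_output(x, y, n):
    """One value written per pass: two passes per loop iteration."""
    passes = 0
    for _ in range(n):
        new_x = f(x, y)    # pass 1 writes x'
        passes += 1
        new_y = g(x, y)    # pass 2 writes y', still reading the old x and y
        passes += 1
        x, y = new_x, new_y
    return x, y, passes

def run_two_outputs(x, y, n):
    """Two values written per pass: one pass per loop iteration."""
    passes = 0
    for _ in range(n):
        x, y = f(x, y), g(x, y)
        passes += 1
    return x, y, passes

single = run_single_output(1.0, 1.0, 3)
multi = run_two_outputs(1.0, 1.0, 3)
```

Both versions produce the same final values, but under this assumption the single-output version uses twice as many passes.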
23
MRDS Multiple-Output Hardware