Title: Compiler-Assisted Optimization for Graphics
1. Compiler-Assisted Optimization for Graphics
- Paul Arthur Navrátil
- The University of Texas at Austin
2. Motivation: OpenGL Extensions
- Hardware manufacturers create acceleration features for market advantage
- Not all extensions are supported on all hardware
  - 300 extensions in the OpenGL Extension Registry
  - 90 extensions supported by NVIDIA NV3x
  - 75 extensions supported by ATI RADEON 9x00
- Programmers won't become experts on each implementation of each extension
- Extension pragmatics are not documented
3. Current Compilers Are Dumb
- Current compilers understand only syntax
- Incorrect use of library abstractions is allowed

    float *vtx = func_023(200 * func_041(float));
    func_174(2, 0x1406, 0, vtx);
    func_288(0x8074);
    func_075(vtx);

    GLfloat *vtx = malloc(200 * sizeof(GLfloat));
    glVertexPointer(2, GL_FLOAT, 0, vtx);
    glEnableClientState(GL_VERTEX_ARRAY);
    free(vtx);

    GLfloat *vtx = glXAllocateMemoryNV(200 * sizeof(GLfloat), 0.0, 0.0, 1.0);
    glVertexPointer(2, GL_FLOAT, 0, vtx);
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_NV);
    glEnableClientState(GL_VERTEX_ARRAY);
    glXFreeMemoryNV(vtx);

Adapted from Sam Guyer
4. Solution: Library Annotations
- Expert annotates extension definition
- Makes programmer knowledge explicit
  - Relationships among functions (e.g. malloc and free)
  - Function arguments, e.g. glBegin(), glEnable()
  - Specializations, e.g. GL_VERTEX_ARRAY, GL_VERTEX_ARRAY_RANGE_NV
- Compiler applies expert knowledge to user code
- Effort amortized across use of library
- Need a complete system to perform all optimizations
5. System: Broadway [Guyer, Lin 2000, 2003; Guyer 2003]
- Library meta-interface to describe
  - Domain-specific analysis
  - Domain-specific transformations
- Client-driven adaptive analysis strength
  - Flow and context: insensitive to sensitive
  - Provides powerful analysis at low cost
- Pointer analysis is important too
  - Need an accurate picture of memory
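To make "analysis strength" concrete, the following toy model contrasts a flow-insensitive and a flow-sensitive view of one object's property across a straight-line sequence of assignments. It is only a sketch under simplifying assumptions, not Broadway's implementation; the `PROP_*` constants and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy property values, encoded as bits so a set of them fits in an unsigned. */
enum { PROP_NORMAL = 1, PROP_VECTOR = 2 };

/* Flow-insensitive: ignore statement order and merge every value that is
   ever assigned, so one (less precise) answer covers the whole program. */
unsigned flow_insensitive(const unsigned *assigned, size_t n)
{
    unsigned merged = 0;
    for (size_t i = 0; i < n; i++)
        merged |= assigned[i];
    return merged;
}

/* Flow-sensitive: respect statement order. The answer just after the
   statement numbered `point` (1-based) is the value assigned there. */
unsigned flow_sensitive(const unsigned *assigned, size_t n, size_t point)
{
    assert(point >= 1 && point <= n);
    return assigned[point - 1];
}
```

For the sequence Normal, then Vector, the insensitive view reports "Normal or Vector" everywhere, while the sensitive view reports exactly Vector after the second statement. The cheaper view is enough for some clients and too coarse for others, which is the trade-off an adaptive analysis is meant to manage.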
6. Example: Vertex Arrays
    %{
    // Mesa GL types
    #include "pnav_enum.h"
    #include <malloc.h>
    /* these OpenGL functions use vertex and color arrays */
    void glVertexPointer( int size, int type, int stride, int *ptr );
    void glColorPointer( int size, int type, int stride, int *ptr );
    void glTexCoordPointer( int size, int type, int stride, int *ptr );
    /* this is called to enable the use of the arrays */
    /* vals are GL_VERTEX_ARRAY, GL_COLOR_ARRAY, GL_TEXTURE_COORD_ARRAY */
    void glEnableClientState( int type );
    }%
    property MemoryType : { Normal, Vector, Color, Texture } initially Normal
7. Example: Vertex Arrays
    procedure glVertexPointer( size, type, stride, ptr )
    {
      on_entry { ptr --> memory_chunk }
      analyze MemoryType { memory_chunk <- Vector }
      when (1 == 1) replace-with %{
        glVertexArrayRangeNV( size, ptr );
        glVertexPointer( size, type, stride, ptr );
      }%
    }
    procedure glEnableClientState( type )
    {
      when (type == GL_VERTEX_ARRAY) replace-with %{
        glEnableClientState(GL_VERTEX_ARRAY_RANGE_NV);
        glEnableClientState(GL_VERTEX_ARRAY);
      }%
    }
8. Example: Vertex Arrays
    procedure malloc( size )
    {
      on_exit { return --> new memory_chunk }
      modify { memory_chunk }
      when (MemoryType : memory_chunk is-exactly Vector)
        replace-with %{ glXAllocateMemoryNV(size, 0.0, 0.0, 0.75) }%
    }
    procedure free( ptr )
    {
      on_entry { ptr --> memory_chunk }
      modify { memory_chunk }
      when (MemoryType : memory_chunk is-exactly Vector)
        replace-with %{ glXFreeMemoryNV( ptr ) }%
    }
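The effect of the annotations on the three vertex-array slides can be sketched as two passes over a list of call sites: an analysis pass that marks a memory chunk as Vector once it is passed to glVertexPointer, and a transformation pass that swaps the allocator and deallocator of exactly those chunks. This is a simplified simulation, not how Broadway is implemented; `CallSite`, `analyze_calls`, `rewrite_calls`, and the integer chunk ids are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_CHUNKS = 8 };

typedef enum { MEM_NORMAL, MEM_VECTOR } MemoryType;

typedef struct {
    const char *callee;  /* function named at the call site */
    int chunk;           /* id of the memory chunk the call returns or takes */
} CallSite;

/* Analysis pass: every chunk starts as Normal ("initially Normal");
   a chunk passed to glVertexPointer becomes Vector. */
void analyze_calls(const CallSite *calls, size_t n, MemoryType *types)
{
    for (int c = 0; c < MAX_CHUNKS; c++)
        types[c] = MEM_NORMAL;
    for (size_t i = 0; i < n; i++) {
        assert(calls[i].chunk >= 0 && calls[i].chunk < MAX_CHUNKS);
        if (strcmp(calls[i].callee, "glVertexPointer") == 0)
            types[calls[i].chunk] = MEM_VECTOR;
    }
}

/* Transformation pass: for chunks that are Vector, replace malloc and
   free with the NV allocator and deallocator. Other calls are untouched. */
void rewrite_calls(CallSite *calls, size_t n, const MemoryType *types)
{
    for (size_t i = 0; i < n; i++) {
        if (types[calls[i].chunk] != MEM_VECTOR)
            continue;
        if (strcmp(calls[i].callee, "malloc") == 0)
            calls[i].callee = "glXAllocateMemoryNV";
        else if (strcmp(calls[i].callee, "free") == 0)
            calls[i].callee = "glXFreeMemoryNV";
    }
}
```

Note that the malloc call usually precedes the glVertexPointer call that reveals how the chunk is used, so the sketch finishes the analysis over the whole list before rewriting anything.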
9. Initial Results: gfxbench
- Adaptation of class project code (Ian Buck)
  - Originally designed to test rasterization pattern
- Triangle strip: 100 triangles, 202 vertices
  - Strip rotated 360° in 5° increments
  - Strip repeatedly rendered for one second of clock time per orientation ( 3 sec)
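The measurement loop described above might look roughly like the sketch below. It is not the gfxbench source: `run_benchmark`, `RenderFn`, and `stub_render` are hypothetical names, the use of `clock()` (which measures processor time) is an assumption, and the stub stands in for the real OpenGL drawing code so the loop can be tried without a graphics context.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical callback that draws the strip once at the given angle. */
typedef void (*RenderFn)(int angle_degrees);

/* Render at each orientation from 0 up to (but not including) 360 degrees
   in `step`-degree increments, redrawing until `seconds_per_orientation`
   has elapsed at each one. Returns the total number of frames drawn. */
long run_benchmark(RenderFn render, int step, double seconds_per_orientation)
{
    assert(step > 0 && 360 % step == 0);
    long frames = 0;
    for (int angle = 0; angle < 360; angle += step) {
        clock_t start = clock();
        do {
            render(angle);
            frames++;
        } while ((double)(clock() - start) / CLOCKS_PER_SEC
                 < seconds_per_orientation);
    }
    return frames;
}

/* Stand-in renderer: counts distinct orientations instead of drawing. */
static int last_angle = -1;
static int orientations = 0;

void stub_render(int angle_degrees)
{
    if (angle_degrees != last_angle) {
        orientations++;
        last_angle = angle_degrees;
    }
}
```

Calling `run_benchmark(stub_render, 5, 1.0)` visits 72 orientations; frames per second at each orientation is the frame count for that orientation divided by the time budget.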
10. Initial Results: gfxbench
11. Issues: gfxbench and beyond
- Is the extension used correctly?
  - Are vertices getting flushed prematurely?
  - Is the vertex declaration in the right place to capitalize on the benefits of the extension?
- GLUT windowing callback structure
  - Broadway's analysis cannot reach into the function specified by glutDisplayFunc()
  - Annotation of glutMainLoop() may fix this
- Do we even need Broadway?
  - Implemented gfxbench optimization in Perl
  - More complex programs will need pointer analysis
  - How many others can be inserted automatically?
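One reason a textual rewrite stops being enough: if the array is allocated through one pointer and registered or freed through a copy of it, matching on variable names misses the connection. The sketch below shows a very coarse, unification-based alias approximation (in the spirit of Steensgaard's analysis) over pointer-copy statements. It is an illustration only and is not Broadway's pointer analysis; the integer variable ids and all function names are hypothetical.

```c
#include <assert.h>

enum { MAX_VARS = 16 };

/* Union-find forest: variables in the same set may point to the same chunk. */
static int parent[MAX_VARS];

void alias_init(void)
{
    for (int v = 0; v < MAX_VARS; v++)
        parent[v] = v;
}

int alias_find(int v)
{
    assert(v >= 0 && v < MAX_VARS);
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];  /* path halving keeps trees shallow */
        v = parent[v];
    }
    return v;
}

/* Models the pointer copy `dst = src` by merging the two alias sets. */
void alias_assign(int dst, int src)
{
    parent[alias_find(dst)] = alias_find(src);
}

int may_alias(int a, int b)
{
    return alias_find(a) == alias_find(b);
}
```

With ids standing for `buf`, `vtx`, and an unrelated pointer, recording `vtx = buf` makes `may_alias(vtx, buf)` true, so a `free(buf)` can be tied back to the chunk that was registered through `vtx`.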
12. Future Work
- More complex models
  - Tessellated spheres: 13k, 130k triangles
  - Texture-mapped surfaces
- Suggest opportunities to use subjective extensions
- Expand to errors
  - OpenGL
    - Matching glBegin(), glEnd() pairs
    - Invalid command between glBegin(), glEnd()
  - Windowing libraries (GLUT, X, MFC)
    - No glutPostRedisplay() call in function marked with glutReshapeFunc()
    - glutKeyboardFunc() vs. glutSpecialFunc()
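The two OpenGL checks listed above can be sketched as a small state machine over a straight-line sequence of call names. This is a hypothetical illustration of the kind of error detection proposed, not an existing Broadway annotation; `check_begin_end` and `CheckResult` are made-up names, and the list of commands allowed inside a glBegin()/glEnd() pair is a deliberately incomplete sample.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    CHECK_OK,
    CHECK_NESTED_BEGIN,       /* glBegin() while already inside a pair */
    CHECK_END_WITHOUT_BEGIN,  /* glEnd() with no open glBegin() */
    CHECK_INVALID_INSIDE,     /* disallowed command between the pair */
    CHECK_MISSING_END         /* sequence ends with glBegin() still open */
} CheckResult;

/* Sample (not exhaustive) of commands legal between glBegin() and glEnd(). */
static int allowed_inside_begin(const char *name)
{
    static const char *const allowed[] = {
        "glVertex2f", "glVertex3f", "glColor3f", "glNormal3f", "glTexCoord2f"
    };
    for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
        if (strcmp(name, allowed[i]) == 0)
            return 1;
    return 0;
}

/* Scans the call names in order and reports the first problem found. */
CheckResult check_begin_end(const char *const *calls, size_t n)
{
    assert(calls != NULL || n == 0);
    int inside = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(calls[i], "glBegin") == 0) {
            if (inside)
                return CHECK_NESTED_BEGIN;
            inside = 1;
        } else if (strcmp(calls[i], "glEnd") == 0) {
            if (!inside)
                return CHECK_END_WITHOUT_BEGIN;
            inside = 0;
        } else if (inside && !allowed_inside_begin(calls[i])) {
            return CHECK_INVALID_INSIDE;
        }
    }
    return inside ? CHECK_MISSING_END : CHECK_OK;
}
```

A real checker would have to follow branches, loops, and calls rather than a flat list, which is where a flow-sensitive compiler analysis would come in.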