Title: Advanced Graphics: Part 2
1 Advanced Graphics: Part 2
- Quick review of OpenGL
- Some more information about OpenGL
- Performance optimisation techniques generally
- Performance optimisation in OpenGL
- Assignment
- Project proposal
- Last time's homework
2 Quick revision
- Walk through hello.c and rot.c
3 A scene graph
- Can represent all objects in a scene as a graph.
- Each node contains geometry.
- Each edge stores a transformation.
- [Figure: example scene graph rooted at World, with nodes Xwing1, TIEFighter, Pilot, R2D2, Torpedo, Left Arm, Head]
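A minimal sketch of the idea in C, not taken from the lecture code. It assumes each edge stores only a translation (a real scene graph would normally store a full 4x4 matrix), and the names SceneNode, MAX_CHILDREN and world_position are hypothetical. Walking from the root and accumulating the edge transformations gives a node's position in world space.

```c
#include <stddef.h>

/* Hypothetical scene-graph node. Assumption: the edge transformation is a
   translation only; a real system would store a full 4x4 matrix. */
#define MAX_CHILDREN 4

typedef struct SceneNode {
    const char *name;                         /* geometry would hang off the node */
    float offset[3];                          /* transformation on the edge from the parent */
    struct SceneNode *children[MAX_CHILDREN];
    int child_count;
} SceneNode;

/* Depth-first search for target below node. Accumulates the edge
   translations on the way down. Returns 1 and writes the world-space
   position to out if target is found, otherwise returns 0. */
int world_position(const SceneNode *node, const SceneNode *target,
                   const float parent_pos[3], float out[3])
{
    float pos[3];
    int i;

    for (i = 0; i < 3; i++)
        pos[i] = parent_pos[i] + node->offset[i];

    if (node == target) {
        for (i = 0; i < 3; i++)
            out[i] = pos[i];
        return 1;
    }
    for (i = 0; i < node->child_count; i++) {
        if (world_position(node->children[i], target, pos, out))
            return 1;
    }
    return 0;
}
```

Moving a parent (say an X-wing) then moves everything below it (pilot, head) without touching the children's own offsets.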
4 The Basic OpenGL pipeline
- Start with scene graph and lights.
- Transform everything into world space.
- Transform into camera space.
- Clip against the view volume.
- Light the vertices.
- Rasterise polygons (including visible surface determination).
5 Performance Optimisation in OpenGL
- Some quotes on optimisation:
- "We should forget about small efficiencies, say about 97 per cent of the time: premature optimization is the root of all evil." - Donald Knuth
- "Rules of Optimization: Rule 1: Don't do it. Rule 2 (for experts only): Don't do it yet." - M.A. Jackson
6 But in graphics ...
- Frequently, performance is critical to utility/value.
- Working on the edge of the possible.
- So may have to optimise.
- Systems are engineered for optimisation.
7 Making things run faster
- Approaches to optimisation:
- Hardware acceleration
- Right data in the right place at the right time
- Getting rid of redundant calculations
- Tricking the eye ("close enough is good enough")
- Trading space for time
- Not drawing (elimination of what wouldn't be seen anyway)
- Writing it in assembly/C (but very rarely, usually last)
8 Hardware acceleration
- Can be a good option.
- Problem: the price-performance curve is exponential.
- [Figure: price plotted against performance (polys/sec)]
9 More on hardware acceleration
- Implication: it's easy to get very good performance using hardware acceleration, but it gets extremely expensive when trying to obtain excellent performance.
- Don't forget Moore's law.
- Interaction between long development times and Moore's law means the problem sometimes "fixes itself".
10 Right data in the right place
- One of the best techniques.
- The basis of caching.
- Exploits locality: likely to reuse the same information again and again.
- Two types:
- Temporal (the same data is likely to be used again soon)
- Spatial (data stored nearby is likely to be used soon)
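A small sketch of spatial locality in plain C, offered as an illustration rather than anything from the lecture code. The names ROWS, COLS, sum_row_major and sum_column_major are hypothetical. Both functions compute the same sum, but the first walks memory in the order it is laid out, so consecutive accesses tend to hit data the cache has already fetched; the second jumps a whole row ahead on every access. How large the timing difference is depends on the machine.

```c
#define ROWS 256
#define COLS 256

/* Row by row: consecutive accesses touch adjacent memory,
   because C stores 2D arrays in row-major order. */
long sum_row_major(int grid[ROWS][COLS])
{
    long total = 0;
    int r, c;
    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

/* Column by column: each access is COLS ints away from the previous one,
   so spatial locality is poor even though the result is identical. */
long sum_column_major(int grid[ROWS][COLS])
{
    long total = 0;
    int r, c;
    for (c = 0; c < COLS; c++)
        for (r = 0; r < ROWS; r++)
            total += grid[r][c];
    return total;
}
```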
11 Another way to think of OpenGL
- OpenGL can be thought of as a client-server architecture.
- Some examples of client-server:
- The Web
- X windows
- When did we ever say that the client and server were on the same machine?
- OpenGL can run on a network.
12 Client-server concept
- The program that makes API calls is the client.
- The OpenGL implementation is the server.
- The client sends requests to the server.
- Client and server may be different machines, e.g. client is a big mainframe spewing OpenGL commands; server is a PC with hardware acceleration.
- Still convenient to think of it as: client = my program, server = OS/driver/graphics card.
13 Client-server concept
- Client-server concept is still useful on a single machine.
- Intuition: client is your program, server is your graphics card.
- Why is it a useful concept? Important from a performance point of view: performance differs depending on whether data is stored at the client or the server.
14 Right place at the right time
- This is where the client-server stuff comes in.
- Now have graphics cards with 128MB on board.
- What use is it?
- Once data is on the graphics card, everything is faster.
- Problem: once it's on the graphics card, it can't (easily) be modified.
15 Display lists
- A very simple way to speed up OpenGL.
- Idea: take almost any sequence of OpenGL commands and package them up; then you can use them like macros.
- Other libraries have similar concepts, e.g. DX has "execute buffers".
16 When and why
- Why?
- Convenience: gives something akin to a function calling structure, but more efficient.
- Efficiency: hardware can optimise, reduces function call overhead, data can live on the graphics card.
- When?
- When what you want to render is unlikely to change
- When you are reusing structure
- When you need speed
17 Initialisation
- 3 steps: initialise, define, use.
- Get a display list ID (actually an int) using glGenLists(size).
- Can request more than one list at a time: size is the number of contiguous list IDs wanted.
- Returns the first ID of the block. Returns 0 if none available.
18 Definition
- Like glBegin() and glEnd():
- glNewList(index, GL_COMPILE)
- ... code for rendering things ...
- glEndList()
- Instead of GL_COMPILE, can be GL_COMPILE_AND_EXECUTE.
19 Use
- To render stuff, use glCallList(index).
- IMPORTANT NOTES:
- Almost anything can go in a display list: matrix ops, material defs, textures, geometry, lights, whatever ...
- Display lists COPY data: you can't modify the data once it's in a display list, even if it's a reference (e.g. if you use gl*fv(object), it won't notice when object changes).
- Display lists affect and are affected by the current matrix stack values!!
20 What CAN'T you call for a DL
- Some things are not allowed:
- Anything that asks about the current state.
- Anything that changes the rendering mode.
- Anything that makes or deletes a list (but calling another display list is fine; can use this to build a hierarchy).
21 Code example
- Look at nodisplaylist.c vs displaylist.c
- Conclusion:
- Likely to be much faster, since data lives on graphics card.
- Not much effort.
22 Redundant calculations
- Also very important optimisation technique.
- Closely related to locality idea.
23 Redundant calculations
- An example: vertex arrays.
- Consider rendering a cube in OpenGL.
- [Figure: cube with its eight vertices labelled 0 to 7]
    glBegin(GL_QUADS);
    glVertex3f(x0, y0, z0);
    glVertex3f(x1, y1, z1);
    glVertex3f(x2, y2, z2);
    glVertex3f(x3, y3, z3);
    glEnd();
    glBegin(GL_QUADS);
    glVertex3f(x1, y1, z1);
    glVertex3f(x5, y5, z5);
    glVertex3f(x6, y6, z6);
    glVertex3f(x2, y2, z2);
    glEnd();
- The remaining four faces follow the same pattern.
24 Question
- How many points are transformed and lit in the previous rendering of the cube?
- How many points would minimally have to be transformed and lit in the previous rendering?
- What fraction of the calculations is wasted?
- Answers: 24, 8, 67 per cent.
25 Huge waste!
- Same calculations are repeated.
- How to solve?
- Use indexed face set data structure.
- Consists of two lists:
- A list of coordinates.
- A list of polygons: a list of lists of vertex indices.
26 Cube example
- float vertices[] = { x0,y0,z0, x1,y1,z1, x2,y2,z2, ..., x7,y7,z7 };
- int faces[] = { 0,1,2,3, 1,5,6,2, ..., 4,5,6,7 };
- But what about other data, e.g. surface normals?
- Need to store them too.
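The 24 / 8 / 67 per cent figures can be checked with a small indexed face set in plain C. This is a sketch under assumptions: the unit-cube coordinates, the four face entries not listed on the slide, and all the names (cube_vertices, cube_faces, immediate_mode_vertex_count, unique_vertex_count, wasted_fraction) are hypothetical, and winding order is not made consistent.

```c
#define NUM_VERTICES 8
#define NUM_FACES 6
#define VERTS_PER_FACE 4

/* List 1: coordinates. Assumed unit cube, vertices 0-3 at z = 0, 4-7 at z = 1. */
const float cube_vertices[NUM_VERTICES][3] = {
    {0, 0, 0}, {1, 0, 0}, {1, 1, 0}, {0, 1, 0},
    {0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1}
};

/* List 2: polygons, each a list of indices into cube_vertices.
   Only the first two and the last face appear on the slide;
   the middle three are filled in for illustration. */
const int cube_faces[NUM_FACES][VERTS_PER_FACE] = {
    {0, 1, 2, 3}, {1, 5, 6, 2}, {0, 4, 7, 3},
    {0, 1, 5, 4}, {3, 2, 6, 7}, {4, 5, 6, 7}
};

/* Vertices sent down the pipeline when every face lists its own corners. */
int immediate_mode_vertex_count(void)
{
    return NUM_FACES * VERTS_PER_FACE;
}

/* Distinct vertices actually referenced by the faces. */
int unique_vertex_count(void)
{
    int used[NUM_VERTICES] = {0};
    int f, v, count = 0;

    for (f = 0; f < NUM_FACES; f++)
        for (v = 0; v < VERTS_PER_FACE; v++)
            used[cube_faces[f][v]] = 1;
    for (v = 0; v < NUM_VERTICES; v++)
        count += used[v];
    return count;
}

/* Fraction of per-vertex work that is repeated. */
double wasted_fraction(void)
{
    int sent = immediate_mode_vertex_count();
    return (double)(sent - unique_vertex_count()) / sent;
}
```

Per-vertex attributes such as normals would go in further arrays laid out like cube_vertices and looked up with the same indices.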
27 Problem: needs API support
- To do this efficiently, the API needs to support such an approach.
- Any good graphics API (e.g. OpenGL, DX8, Inventor, VRML97, etc.) supports this.
- Have various names.
- In OpenGL, called a vertex array.
28 Using Vertex Arrays
- Can have up to 6 different arrays, for:
- Vertex coordinates
- Normals
- Colours
- Texture coordinates
- A few other funky ones: index, edge flag
- Enable whichever arrays you need:
- glEnableClientState(GL_VERTEX_ARRAY)
29 Step 2
- After initialising, tell it where the data lives.
- e.g. glVertexPointer(size, type, stride, vertices)
- Size is the number of values per vertex (typ. 2, 3 or 4).
- Type: GL_FLOAT or whatever.
- Stride is for more funky stuff (e.g. interleaved arrays).
- Similar calls: glNormalPointer, glTexCoordPointer, etc.
30 Step 3: Access the data
- Lots of different ways to call. Simplest: glArrayElement(index).
- Action depends on what's enabled, but let's say only vertex arrays are enabled. Then this looks up index in the last thing glVertexPointer was called on (say the result is x) and does glVertex3fv(x).
- If normal arrays were enabled (and the normal for index was y), this would do glNormal3fv(y) then glVertex3fv(x).
- NOTE: belongs between glBegin and glEnd.
31 Bunches of indices
- Can also give multiple points at once: use glDrawElements(mode, count, type, indices).
- Mode is GL_LINES, GL_POLYGON, etc.
- Count is the number of indices.
- Type is usually GL_UNSIGNED_INT.
- NOTE: does NOT go between a glBegin/glEnd.
32 glDrawElements
- Functionally equivalent to:
    glBegin(mode);
    for (i = 0; i < count; i++)
        glArrayElement(indices[i]);
    glEnd();
- glDrawRangeElements() is similar, but you specify a constrained range of indices.
33 What does OpenGL do?
- Can cache previously transformed vertices.
- Can use glDrawRangeElements to help tell OpenGL what's going to change.
- glDrawElements can draw lots of objects in one call. Example: if all polys have four vertices, use GL_QUADS and give a single list of 24 vertex indices for the whole cube.
34 Funky stuff
- Can do some weird things with interleaved arrays.
- OpenGL extension: compiled vertex arrays.
- You tell OpenGL when you won't be fiddling with the arrays and when you will: use glLockArraysEXT() to lock them and glUnlockArraysEXT() to unlock.
35 Code Example
- vertexarray.c
- Note: can mix and match ordinary (immediate-mode) calls with vertex arrays.
36 Practical implications
- You CAN use display lists and vertex arrays at the same time, but it's a bit tricky.
- When you change data in a vertex array and render immediately, that's fine. But with a display list, the data is copied.
- Example: say you have a creature with a constantly moving body. Can't use a display list for it.
- But can use one for, say, a helmet or a head.
37 Space-time tradeoff
- Sometimes can use more space to make an algorithm faster, or vice versa.
- E.g. can sometimes precompute values if they will be reused a lot.
- Trading space for time, example: precomputing sin/cos tables.
- Trading time for space, example: compressed textures (but really still about time).
38 Tricking the eye
- Lots of examples in what you've already studied.
- E.g. Gouraud shading is nonsense theoretically.
- Strictly, Gouraud shading should be perspective-corrected.
- Not noticeable for Gouraud, but IS noticeable for texture maps.
39 Not rendering things
- Back face culling: not drawing polygons facing away from us.
- Easy to enable in OpenGL: glEnable(GL_CULL_FACE)
- But lots of other examples, e.g. using visibility trees (similar to BSP trees) and portal systems to cut back on polygons. Any coincidence games are indoors? (more later)
- Also the multires stuff and LOD (more later).
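The geometric idea behind back face culling can be sketched in a few lines of C. This is an illustration, not how OpenGL implements it: OpenGL decides facing from the winding order of the projected polygon (see glFrontFace), whereas the hypothetical helper is_back_facing below assumes an outward surface normal is available and tests it against the viewing direction.

```c
/* Returns 1 if a polygon faces away from the viewer, 0 otherwise.
   normal:   outward-pointing surface normal of the polygon (assumed given)
   view_dir: direction from the eye towards the polygon
   If the two point broadly the same way, their dot product is positive
   and we are looking at the polygon's back. */
int is_back_facing(const float normal[3], const float view_dir[3])
{
    float dot = normal[0] * view_dir[0]
              + normal[1] * view_dir[1]
              + normal[2] * view_dir[2];
    return dot > 0.0f;
}
```

For a closed, opaque object such as the cube, roughly half the faces fail this test at any time and never need to be rasterised.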
40 Rewriting code
- Usually the last resort.
- Usually the big gains are in algorithmic improvement, not rewriting code more efficiently or re-implementing in C/Assembly.
- Assembly less significant with RISC processors.
- Very time consuming, both initially and long-term.
41 Profiling
- Profiling is analysing software as it runs to see how much time is spent executing different parts of the code.
- General observation: 90 per cent of time is spent executing 10 per cent of the code.
- Pointless optimising the wrong thing.
- Example: say you improve the code outside the top 10 per cent by 100 per cent (it runs twice as fast). That code accounts for only 10 per cent of the run time, so the program will only run about 5 per cent faster.
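The arithmetic in the example is an instance of Amdahl's law and can be written as a one-line C function. This is a sketch: the name remaining_time is hypothetical, and "improve by 100 per cent" is read here as "runs twice as fast".

```c
/* Run time after an optimisation, as a fraction of the original run time.
   fraction: share of the original run time spent in the code being improved (0..1)
   speedup:  factor by which that code got faster (2.0 = twice as fast) */
double remaining_time(double fraction, double speedup)
{
    return (1.0 - fraction) + fraction / speedup;
}
```

With fraction = 0.10 and speedup = 2.0 the result is 0.95, i.e. the run takes about 5 per cent less time. Doubling the speed of the hot code instead (fraction = 0.90) gives 0.55, which is why profiling to find that code comes first.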
42 Bottlenecks
- Profiling frequently reveals the bottleneck (the thing that slows everything down). The type of bottleneck suggests the solution.
- Typical bottlenecks:
- Fill-limited: rasterising/texturing polygons. Occurs with software renderers.
- Geometry-limited: calculations of geometry. Too many polygons/vertices.
- Client-side limited: calculations on the client side (e.g. of vertex/texture coordinates). Code optimisation? Maybe.
43 Assignment 1
- Should not take too long: I estimate 15-20 hours (if you know C and OpenGL).
- Two parts:
- Code: due 29 August 2003
- Report: due 4 September 2003
44 What is it?
- Read a vertex array (OFF format).
- Render it rotating using immediate mode, display lists and vertex arrays.
- Small component on quaternions, which we'll discuss next week.
- Compare performance.
- Either C/C++ or Java.
- Must compile in CSE labs.
45 Extras
- Encouraged to run it on different hardware/operating systems so we get a good cross-section, then discuss results in class.
- Extension mark (20 per cent of assignment mark) available.
46 Project proposal
- Project/seminar proposals due week 3.
- Worth 10 per cent of the assessment.
- Four pages maximum
47 What to cover
- Who (partners, workload division, etc).
- Project outline.
- Existing work/software/sources (depends).
- Resources required (depends).
- Estimated time commitment (how long). Should be around 40-60 hours.
- Basic sketch of your project: what it will contain, minimal functionality.