Title:
1QSplat: A Multiresolution Point Rendering
System for Large Data Meshes
- Authors
- Szymon Rusinkiewicz
- Marc Levoy
- Presentation
- Nathaniel Fout
2Motivation
- A quick review
- - Rendering time is a very strong function of scene complexity
- - Which class of rendering algorithms is this not true for?
- - Does this pose a problem for rendering the Stanford Bunny in real time? What about 100 Bunnies? What about 1000 Bunnies? (4.5 billion tri/sec)
- - Current graphics hardware: nVidia Quadro at 17 million tri/sec
3Motivation
- Who would want to render 1000 Bunnies in real time?
- Practical Applications
- - rendering complex terrain (games, simulators)
- - rendering sampled models of physical objects
- Advances in scanning technology have enabled the creation of very large meshes with hundreds of millions of polygons
- Conventional rendering will not work. Why not?
- Obviously insufficient triangle throughput, but what about storage?
- - 1,000,000 triangles → 36 MB
- - 100,000,000 triangles → 3600 MB
- - etc.
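The storage figures above work out to 36 bytes per triangle. A minimal sketch of that arithmetic follows; the breakdown into three vertices of three 4-byte floats with no vertex sharing is an assumption, since the slides give only the totals.

```python
# Assumed layout: 3 vertices x 3 coordinates x 4-byte float, no shared vertices.
BYTES_PER_TRIANGLE = 3 * 3 * 4  # 36 bytes


def mesh_megabytes(num_triangles: int) -> float:
    """Storage in MB (10^6 bytes) for an unindexed triangle list."""
    return num_triangles * BYTES_PER_TRIANGLE / 1e6


print(mesh_megabytes(1_000_000))    # 36.0
print(mesh_megabytes(100_000_000))  # 3600.0
```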
4Rendering Large Data Sets
- Methodologies for dealing with this problem
- 1) Visibility Culling: includes frustum culling, back-face culling, occlusion culling
- 2) LOD Control: discrete or fine-grained control
- 3) Geometric Compression: saves on storage costs, but must be decoded to render
- 4) Point Rendering: use a simpler primitive, the point, instead of triangles
- Many algorithms use some of these techniques; QSplat uses all of them
5What is QSplat ?
- QSplat is a point-based rendering system that
uses visibility culling, LOD control, and
geometric compression to render very large
triangular meshes. The core of the renderer is a
hierarchical data structure consisting of
bounding spheres.
6General Description
- Basic Idea: instead of rendering all those polygons, let's approximate the mesh with points along the surface
- We can then splat these points on the image plane; the z-buffer takes care of visibility as usual
- Point samples are organized in a hierarchical fashion using bounding spheres; this facilitates easy visibility culling, LOD control, and rendering
- Hierarchy construction is a preprocessing step; it is done only once and saved to disk
7Rendering
- The rendering algorithm
-
- TraverseHierarchy(node)
-     if (node not visible)                          // Visibility Culling
-         skip this branch of the tree
-     else if (node is a leaf node)
-         draw a splat
-     else if (benefit of recursing further is low)  // LOD Control
-         draw a splat
-     else
-         for each child in children(node)
-             TraverseHierarchy(child)
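The traversal above can be sketched in Python as follows. The `Node` fields (`visible`, `projected_size`) and the `threshold` parameter are hypothetical stand-ins for the real visibility tests and the screen-space size test; they are not taken from the QSplat source.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    name: str
    visible: bool = True          # stand-in for the frustum and back-face tests
    projected_size: float = 1.0   # stand-in for screen-space size in pixels
    children: List["Node"] = field(default_factory=list)


def traverse(node: Node, threshold: float, draw_splat: Callable[[Node], None]) -> None:
    if not node.visible:
        return                    # visibility culling: skip this branch
    if not node.children:
        draw_splat(node)          # leaf node
    elif node.projected_size <= threshold:
        draw_splat(node)          # LOD control: little benefit in recursing
    else:
        for child in node.children:
            traverse(child, threshold, draw_splat)


root = Node("root", projected_size=40, children=[
    Node("a", projected_size=10, children=[Node("a1"), Node("a2")]),
    Node("b", visible=False, children=[Node("b1")]),
    Node("c", projected_size=30, children=[Node("c1"), Node("c2")]),
])
drawn: List[str] = []
traverse(root, threshold=15, draw_splat=lambda n: drawn.append(n.name))
print(drawn)  # ['a', 'c1', 'c2']
```

Node "a" is drawn as a single splat because it is already small on screen, "b" is culled along with its subtree, and "c" is refined down to its leaves.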
8Rendering Visibility Culling
- Frustum culling is performed by testing the bounding sphere against all six planes of the viewing frustum
- Each node stores a normal cone, which is a collective representation of the normals of the subtree for that node; this cone is used for back-face culling
- Occlusion culling is not used
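Both tests can be sketched in a few lines. This is a simplified illustration, not the QSplat implementation: the plane representation, the function names, and the use of a single view direction per node (ignoring the extent of the sphere) are all assumptions.

```python
from typing import Iterable, Sequence, Tuple

Vec3 = Sequence[float]
Plane = Tuple[Vec3, float]  # (unit normal n, offset d); "inside" is n.p + d >= 0


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def sphere_outside_frustum(center: Vec3, radius: float, planes: Iterable[Plane]) -> bool:
    """True if the sphere lies entirely behind at least one frustum plane."""
    return any(dot(n, center) + d < -radius for n, d in planes)


def cone_is_backfacing(axis: Vec3, sin_half_angle: float, view_dir: Vec3) -> bool:
    """True if every normal in the cone points away from the viewer.

    view_dir is the unit vector from the eye toward the node. A normal n is
    back-facing when n . view_dir > 0, and that holds for the whole cone when
    the axis is within (90 degrees - half_angle) of view_dir.
    """
    return dot(axis, view_dir) > sin_half_angle


# Stand-in "frustum": the six planes of the cube [-1, 1]^3.
CUBE = [((1, 0, 0), 1), ((-1, 0, 0), 1), ((0, 1, 0), 1),
        ((0, -1, 0), 1), ((0, 0, 1), 1), ((0, 0, -1), 1)]

print(sphere_outside_frustum((3, 0, 0), 0.5, CUBE))       # True: fully outside
print(sphere_outside_frustum((1.2, 0, 0), 0.5, CUBE))     # False: straddles a plane
print(cone_is_backfacing((0, 0, 1), 4 / 16, (0, 0, 1)))   # True
print(cone_is_backfacing((0, 0, 1), 16 / 16, (0, 0, 1)))  # False: cone too wide to cull
```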
9Rendering LOD Control
- LOD control is accomplished by adjusting the depth of recursion when traversing the tree
- There are two factors which control the depth of recursion
- - projected screen space area of the bounding sphere
- - user selected frame rate
- If the projected area of the sphere exceeds a threshold value, then we descend to the next level
- A feedback adjustment takes place to keep the frame rate at a user-specified value; this adjustment is based simply on the ratio of actual to desired frame rate
- Progressive refinement is initiated once the user stops moving; the area threshold is successively reduced until it is the size of a pixel
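A minimal sketch of the three pieces described above: the projected-area estimate, the frame-rate feedback, and progressive refinement. The pinhole-camera formula, the exact form of the feedback rule, and the refinement factor of 0.5 are assumptions chosen for illustration; the slides state only that the adjustment uses the ratio of actual to desired frame rate.

```python
import math
from typing import Iterator


def projected_area(radius: float, distance: float, focal_px: float) -> float:
    """Approximate screen-space area (pixels^2) of a sphere under a pinhole camera."""
    r_px = radius * focal_px / distance
    return math.pi * r_px * r_px


def adjust_threshold(threshold: float, actual_fps: float, desired_fps: float,
                     min_threshold: float = 1.0) -> float:
    """Raise the threshold (coarser model) when too slow, lower it when too fast."""
    return max(min_threshold, threshold * desired_fps / actual_fps)


def refine(threshold: float, factor: float = 0.5,
           min_threshold: float = 1.0) -> Iterator[float]:
    """Yield successively smaller thresholds until one pixel is reached."""
    while threshold > min_threshold:
        threshold = max(min_threshold, threshold * factor)
        yield threshold


print(adjust_threshold(10.0, actual_fps=15.0, desired_fps=30.0))  # 20.0
print(adjust_threshold(10.0, actual_fps=60.0, desired_fps=30.0))  # 5.0
print(list(refine(16.0)))  # [8.0, 4.0, 2.0, 1.0]
```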
10LOD Control
- Threshold 15 pixels: 130,712 points, rendering time 132 ms
- Threshold 1 pixel: 14,835,967 points, rendering time 8308 ms
Michelangelo's statue of St. Matthew
11Preprocessing
- Building the Hierarchy tree
- What do the nodes look like?
12Preprocessing
- Building the hierarchy tree
- - we begin with a list of vertices
- - next we find a bounding box which contains the vertices
- - find the midpoint vertex along the longest axis of the bounding box
- - split the set of vertices into two parts
- - this creates the two children of the current node
- - the current node corresponds to the bounding sphere of the two child nodes
- - continue recursively
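The recursive construction can be sketched as below. This is an illustration under assumptions, not the QSplat preprocessor: it splits at the midpoint of the bounding box along its longest axis, gives every leaf the same hypothetical `leaf_radius`, and builds a binary tree without the later node merging.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Sphere:
    center: Point
    radius: float
    children: List["Sphere"] = field(default_factory=list)


def enclosing_sphere(a: Sphere, b: Sphere) -> Sphere:
    """Smallest sphere containing both a and b, with a and b as its children."""
    d = math.dist(a.center, b.center)
    if d + b.radius <= a.radius:          # b already inside a
        return Sphere(a.center, a.radius, [a, b])
    if d + a.radius <= b.radius:          # a already inside b
        return Sphere(b.center, b.radius, [a, b])
    r = (d + a.radius + b.radius) / 2
    t = (r - a.radius) / d                # d > 0 in this branch
    c = tuple(pa + t * (pb - pa) for pa, pb in zip(a.center, b.center))
    return Sphere(c, r, [a, b])


def build(points: List[Point], leaf_radius: float) -> Sphere:
    if len(points) == 1:
        return Sphere(points[0], leaf_radius)
    lo = [min(p[i] for p in points) for i in range(3)]
    hi = [max(p[i] for p in points) for i in range(3)]
    axis = max(range(3), key=lambda i: hi[i] - lo[i])   # longest box axis
    mid = (lo[axis] + hi[axis]) / 2
    left = [p for p in points if p[axis] < mid]
    right = [p for p in points if p[axis] >= mid]
    if not left or not right:             # coincident points: split arbitrarily
        half = len(points) // 2
        left, right = points[:half], points[half:]
    return enclosing_sphere(build(left, leaf_radius), build(right, leaf_radius))


root = build([(0, 0, 0), (2, 0, 0)], leaf_radius=0.5)
print(root.center, root.radius)  # (1.0, 0.0, 0.0) 1.5
```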
13Preprocessing
- Preprocessing Issues
- - to ensure that there are no holes in the rendering, we set the leaf node spheres to be a certain size:
- - If two vertices are joined by an edge, then the spheres for those vertices are made large enough to touch each other.
- - Also, the size of a sphere at a vertex is set to the size of the maximum sphere of the vertices which make up that triangle
- - to decrease the size of the tree, nodes are combined to increase the average branching factor to 4
- - after the tree is created, the properties of the nodes are calculated
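One plausible reading of the leaf-sizing rule is sketched below. The choice of half the longest triangle edge as the per-triangle radius is an assumption made so that spheres at the ends of every edge touch or overlap; the slides do not give the exact formula, and `leaf_radii` is a hypothetical helper name.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def leaf_radii(vertices: Sequence[Point],
               triangles: Sequence[Tuple[int, int, int]]) -> List[float]:
    """Per-vertex sphere radius: the maximum, over incident triangles,
    of half that triangle's longest edge."""
    radii = [0.0] * len(vertices)
    for tri in triangles:
        a, b, c = (vertices[i] for i in tri)
        r = max(math.dist(a, b), math.dist(b, c), math.dist(c, a)) / 2
        for i in tri:
            radii[i] = max(radii[i], r)
    return radii


verts = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (6, 0, 0)]
tris = [(0, 1, 2), (1, 3, 2)]
radii = leaf_radii(verts, tris)
print([round(r, 3) for r in radii])
```

With this rule, the two radii at the ends of any edge sum to at least the edge length, so no gap opens along an edge.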
14Design Overview
15Design Details
position and radius
- Position and radius of sphere encoded as offsets relative to parent and quantized to 13 values
- Not all of the 13^4 values are valid; in fact, only 7621 are valid
- Incremental encoding of geometry essentially spreads out the bits of information among the levels of the hierarchy
- Note that connectivity information is discarded
- Encoding saves space but increases rendering time due to the necessity of decoding on-the-fly
- Quantization saves space but pays for it by sacrificing accuracy
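The benefit of counting only the valid combinations can be checked with a little arithmetic. The sketch below uses the two numbers quoted on the slide; treating the four quantized components as the x, y, z offsets plus the radius is an assumption.

```python
import math

LEVELS = 13   # quantization levels per component (from the slide)
VALID = 7621  # valid combinations (from the slide)

combos = LEVELS ** 4                       # assumed: x, y, z offsets and radius
bits_all = math.ceil(math.log2(combos))    # bits to index every combination
bits_valid = math.ceil(math.log2(VALID))   # bits to index only the valid ones

print(combos, bits_all, bits_valid)  # 28561 15 13
```

Indexing only the valid combinations therefore saves two bits per node compared with indexing all of them.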
16Design Details
tree structure
- Information as to the structure of the tree is necessary for traversal since the number of children may vary
- Normally a pointer is kept for each child; however, if we store the tree in breadth-first order then we only need one pointer for each group of siblings
- This one pointer (along with the tree structure bits) is enough for traversal
- The first two bits represent the number of children: 0, 2, 3, or 4
- The last bit indicates whether or not all children are leaf nodes
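The three tree-structure bits can be packed and unpacked as sketched below. The mapping of the 2-bit code to child counts and the placement of the leaf flag in the lowest bit are assumptions for illustration; the slide specifies only what the bits mean, not their exact layout.

```python
from typing import Tuple

CHILD_COUNTS = (0, 2, 3, 4)  # the four child counts a 2-bit code can name


def pack_structure(num_children: int, all_children_are_leaves: bool) -> int:
    """Pack into 3 bits: two bits of child-count code, one leaf flag."""
    code = CHILD_COUNTS.index(num_children)  # raises ValueError for 1 or > 4
    return (code << 1) | int(all_children_are_leaves)


def unpack_structure(bits: int) -> Tuple[int, bool]:
    return CHILD_COUNTS[(bits >> 1) & 0b11], bool(bits & 1)


print(pack_structure(3, True))   # 5
print(unpack_structure(5))       # (3, True)
```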
17Design Details
normal
- Normals are quantized to 14 bits
- These bits hold an encoded direction: a virtual cube with each face subdivided into a 52 x 52 grid represents the possible values
- Grid positions are warped to sample normal space more uniformly
- Unlike the range of positions, normal space is bounded; this makes it efficient to use a single look-up table for rendering
- Incremental encoding is more expensive to decode and is not used for normals
- Banding artifacts can be seen in specular highlights
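A cube with six 52 x 52 faces has 6 * 52 * 52 = 16,224 cells, which fits in 14 bits (16,384 codes). The sketch below encodes a nonzero normal to such a cell and decodes it back. It omits the warping step mentioned above, and the face numbering and index layout are assumptions.

```python
import math
from typing import Sequence, Tuple

GRID = 52  # cells per side on each cube face


def encode_normal(n: Sequence[float]) -> int:
    """Map a nonzero normal to a cell index on the cube (no warping applied)."""
    axis = max(range(3), key=lambda i: abs(n[i]))        # dominant axis
    face = 2 * axis + (0 if n[axis] > 0 else 1)          # 0..5
    u, v = [n[i] / abs(n[axis]) for i in range(3) if i != axis]  # each in [-1, 1]
    iu = min(GRID - 1, int((u + 1) / 2 * GRID))
    iv = min(GRID - 1, int((v + 1) / 2 * GRID))
    return (face * GRID + iu) * GRID + iv


def decode_normal(code: int) -> Tuple[float, ...]:
    """Return the unit normal at the center of the encoded cell."""
    code, iv = divmod(code, GRID)
    face, iu = divmod(code, GRID)
    axis, negative = divmod(face, 2)
    comps = [(iu + 0.5) / GRID * 2 - 1, (iv + 0.5) / GRID * 2 - 1]
    comps.insert(axis, -1.0 if negative else 1.0)
    length = math.sqrt(sum(c * c for c in comps))
    return tuple(c / length for c in comps)


code = encode_normal((0.0, 0.0, 1.0))
print(code < 2 ** 14, decode_normal(code))
```

In a renderer, `decode_normal` would typically be evaluated once for every code to fill the look-up table.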
18Design Details
width of normal cone
- Width of normal cone is quantized to four values: cones whose half-angles have sines of 1/16, 4/16, 9/16, or 16/16
- On typical data sets, back-face culling with these quantized cone values discards over 90% of the nodes which would be discarded were exact normal cone widths to be used
- Again, incremental encoding could be used, but with a penalty in rendering time
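The four stored widths are (k/4)^2 for k = 1 to 4, so a cone width fits in two bits. A minimal sketch of the quantization follows; rounding up to the next stored width, so that the stored cone always contains the exact one and culling stays conservative, is an assumption about how the rounding is done.

```python
QUANTIZED_SINES = (1 / 16, 4 / 16, 9 / 16, 16 / 16)


def quantize_cone(sin_half_angle: float) -> int:
    """Return the 2-bit code of the narrowest stored cone containing the exact one."""
    for code, s in enumerate(QUANTIZED_SINES):
        if sin_half_angle <= s:
            return code
    return len(QUANTIZED_SINES) - 1  # widest cone: such nodes are never culled


print(quantize_cone(0.05))  # 0
print(quantize_cone(0.5))   # 2
print(quantize_cone(0.6))   # 3
```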
19Design Details
color
- Colors stored using 16 bits (RGB as 5-6-5)
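A 5-6-5 color can be packed from and expanded back to 8-bit channels as sketched below. Placing red in the high bits is the common convention but is an assumption here, as are the function names.

```python
from typing import Tuple


def pack_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit channels into 16 bits: 5 bits red, 6 green, 5 blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(c: int) -> Tuple[int, int, int]:
    """Expand back to 8 bits per channel by replicating the high bits."""
    r5, g6, b5 = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


print(hex(pack_rgb565(255, 0, 0)))  # 0xf800
print(unpack_rgb565(0xFFFF))        # (255, 255, 255)
```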
20Design Details
- file layout
- - as a consequence of storing the tree in breadth-first order, the information necessary to render at low resolution is located in the first part of the file
- - therefore only a working set needs to be loaded into memory: wait to load in a tree level until it is needed
- - this progressive loading may slow frame rates temporarily when zooming in for the first time, but greatly decreases initial load time
- - speculative prefetching could help to alleviate this problem
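The effect of breadth-first storage can be illustrated with a small sketch: coarse levels land at the front of the array, siblings are contiguous, and one index per node locates its whole group of children. The dictionary-based tree and the `first_child` list are hypothetical representations, not the QSplat file format.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def breadth_first_layout(root: str, children: Dict[str, List[str]]
                         ) -> Tuple[List[str], List[Optional[int]]]:
    """Return the nodes in breadth-first order and, for each node, the index
    of its first child in that order (None for leaves)."""
    order: List[str] = [root]
    first_child: List[Optional[int]] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        kids = children.get(node, [])
        first_child.append(len(order) if kids else None)
        order.extend(kids)   # siblings are stored next to each other
        queue.extend(kids)
    return order, first_child


tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
order, first_child = breadth_first_layout("root", tree)
print(order)        # ['root', 'a', 'b', 'a1', 'a2', 'b1']
print(first_child)  # [1, 3, 5, None, None, None]
```

Reading only the first three entries is enough to render the two coarsest levels, which is what makes loading deeper levels on demand possible.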
21Design Details Splatting
- Splat Shape
- - OpenGL point (rendered as a square)
- - opaque circle
- - fuzzy spot which decays radially as a Gaussian, using alpha blending
- In order to render points as fuzzy spots we need to make sure splats are drawn in the correct order.
- We can accomplish this with multi-pass rendering
- - 1. Offset depth values by some amount z0
- - 2. Render only into the depth buffer
- - 3. Unset depth offset and render additively into the color buffer using depth comparison but not depth update
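The effect of the passes on a single pixel can be simulated in software, as in the sketch below. It is not OpenGL code: the fragment tuples are hypothetical, weights are assumed positive and z0 non-negative, and dividing by the accumulated weight at the end is an added normalization step that the slides do not mention.

```python
from typing import List, Tuple

Fragment = Tuple[float, float, float]  # (depth, gray color, blend weight)


def blend_pixel(fragments: List[Fragment], z0: float) -> float:
    """Blend the fragments that lie within z0 of the nearest surface."""
    # Passes 1 and 2: depth only, with every depth pushed back by z0.
    zbuf = min(d + z0 for d, _, _ in fragments)
    # Pass 3: no offset, depth test on, depth writes off, additive blending.
    total_w = sum(w for d, _, w in fragments if d <= zbuf)
    total_c = sum(w * c for d, c, w in fragments if d <= zbuf)
    return total_c / total_w


# Two overlapping splats on the front surface and one on a surface far behind.
frags = [(1.00, 0.2, 1.0), (1.05, 0.4, 1.0), (5.00, 0.9, 1.0)]
print(round(blend_pixel(frags, z0=0.1), 3))  # 0.3
```

The far fragment fails the depth test and contributes nothing, while the two front splats are blended regardless of the order in which they arrive.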
22Design Details Splatting
- Comparison of splats with a constant size
- Gaussian kernel exhibits less aliasing
- Relative rendering times for square, circle, and Gaussian are 1, 2, and 4 respectively
- Constant threshold of 20 pixels
23Design Details Splatting
- Based on this comparison it is better to use Gaussian kernels, right?
- Not all splats are rendered in the same amount of time
- What if we allow the threshold to fluctuate, but constrain the rendering times to be the same?
- Sample Rate vs. Reconstruction Quality
24Design Details Splatting
- Comparison of splats with constant rendering time
- Based on rendering time the square is the best splat shape to use
- Note that results will be hardware dependent
25Design Details Splatting
- Another consideration is whether the splats are always round or if they can be elliptical (perspectively correct)
- Can use the node normal to determine eccentricity of ellipse
- Using elliptical splats reduces noise and enhances the smoothness of silhouettes
- Using ellipses can cause holes to occur
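A disk of radius r whose normal makes an angle t with the view direction projects (orthographically) to an ellipse with semi-axes r and r*|cos t|. The sketch below computes those axes; the function name and the use of unit vectors are assumptions, and perspective effects are ignored.

```python
from typing import Sequence, Tuple


def splat_axes(radius: float, normal: Sequence[float],
               view_dir: Sequence[float]) -> Tuple[float, float]:
    """Semi-major and semi-minor axes of the projected splat.

    normal and view_dir are assumed to be unit vectors.
    """
    cos_tilt = abs(sum(n * v for n, v in zip(normal, view_dir)))
    return radius, radius * cos_tilt


print(splat_axes(2.0, (0, 0, 1), (0, 0, -1)))  # (2.0, 2.0): facing the viewer
print(splat_axes(2.0, (1, 0, 0), (0, 0, -1)))  # (2.0, 0.0): edge-on
```

The edge-on case shows why ellipses can open holes: near silhouettes the minor axis shrinks toward zero and neighboring splats may no longer overlap.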
26Design Details Splatting
- A visual comparison of circles vs. ellipses
27Performance
- Typical preprocessing times
28Performance
- Preprocessing timing comparisons
- - Hoppe reports 10 hours for 200,000 vertices
- - Luebke and Erikson report 121 seconds for 281,000 vertices
- - QSplat can process 200,000 vertices in under 5 seconds
- - Comparisons with mesh simplification for a bunny with 35,000 vertices:
- - Lindstrom and Turk report 30 s to 45 min
- - Rossignac and Borrel report less than 1 s
- - QSplat takes 0.6 s
29Performance
30Performance
- Rendering Performance
- - QSplat can render between 1.5 and 2.5 million points per second
- - Hoppe reports 480,000 polygons per second with progressive meshes
- - ROAM system (a terrain rendering system) reports 180,000 polygons per second
- - QSplat can render 250 to 400 thousand points per second on a laptop with no 3D graphics hardware
31Performance
- Rendering Performance
- - Comparison with polygon rendering
a) Points
b) Polygons: same number of primitives and same rendering time as a)
c) Polygons: same number of vertices as a) but twice the rendering time
32Conclusions
- QSplat accomplishes its goal of interactive rendering of very large data sets
- QSplat's performance, both in preprocessing and rendering, is competitive with the fastest progressive display algorithms and mesh simplification algorithms
- Geometric compression achieved by QSplat is close to that of current geometric compression techniques
- QSplat can be implemented independent of 3D graphics hardware
33Future Work
- Huffman coding could be used to achieve even greater compression, but would require further decompression prior to rendering
- For cases when rendering speed is more important and storage is not a problem, the incremental encoding could be removed
- The rendering algorithm could be parallelized by distributing different parts of the tree to different processors
- Exploration into using the data structure for ray tracing acceleration
- Exploration into instancing for scenes
- Exploration into storing additional items in the nodes such as transparency, etc.
34Some final pictures
QSplat on display at a museum in Florence; some
kids kept crashing the program by zooming in too
close.