Title: Data Structures for Orthogonal Range Queries
1Data Structures for Orthogonal Range Queries
- A New Data Structure and Comparison to Previous
Work. - Application to Contact Detection in Solid
Mechanics. - Sean Mauch
- Caltech
- April, 2003
2Terminology Orthogonal Range Queries
- An orthogonal range query, (ORQ), determines the
points in a set which lie within a given window
(bounding box). - To the right is a set of points representing the
location and population of cities. - We do an orthogonal range query on the cities We
show the window and its projections onto the
coordinate planes.
3Example Application Contact Detection
- Algorithm for solid mechanics simulation with
contact detection - Search for potential contacts between nodes and
surfaces. Which nodes are close to each face? - Do detailed contact check among potential
contacts. - Enforce contacts by computing force required to
remove overlap.
Contact search.
Contact check.
4Contact Detection Depends on ORQs
- Searches for potential contacts are done with
orthogonal range queries. - The capture box contains the nodes that may
contact the face. - An ORQ finds the nodes in the capture box.
- Detecting possible contacts is computationally
expensive, typically half of total solid
simulation execution time. - Good performance depends on choosing an efficient
data structure and search algorithm for the
orthogonal range queries.
5Data Structures for ORQs
- Octrees and kd-trees.
- Tried and tested data structures from
computational geometry. - Point in box method, (Swegle, et al.).
- Developed at Sandia National Laboratories for
doing parallel contact. - Cell arrays.
- A bucket sort in 3-D.
- Optimal computational complexity when properly
tuned. - Dense arrays may have a high storage overhead.
- Cell arrays coupled with binary and forward
searching. - New data structure.
6Kd-trees
- Kd-trees recursively divide the domain by
choosing the median in the chosen coordinate
direction. - Depth is determined only by the number of points.
7Octrees
- Quadtrees (2-D) and Octrees (3-D) recursively
divide space into quadrants or octants. - Depth is determined by the number of points and
point spacing.
8Point-in-box Method (Swegle et. al.)
- Sorts and ranks the points in each coordinate
direction. - There are three slices for a window. Choose the
slice with the least points and see if those
points are in the window. - The rank array allows integer instead of floating
point comparisons which is much faster on some
processors.
9Cell Arrays
- The computational domain is spanned by an array
of cells. Each cell contains a set of points. - Constant time access to cells, but perhaps large
storage overhead because of empty cells.
10Sparse Cell Arrays
- Array is sparse in one dimension.
- Store only non-empty cells.
- Trade reduced storage for increased cell access
time.
11Cell Arrays Coupled with Searching
- Cell array does not span one dimension. Produces
long, thin cells. - Compared to sparse cell array, we search on
records instead of on cells. - Binary search on records is similar to sparse
cell array. - If we sort the queries we can do forward
searching in each cell.
12List of Methods
Label Description
seq. scan sequential scan
projection projection
pt-in-box point-in-box
kd-tree kd-tree
kd-tree d. kd-tree with domain checking
octree octree
cell cell array
sparse cell sparse cell array
cell b. s. cell array with binary search
cell f. s. cell array with forward search
cell f. s. k. cell array with forward search on keys
13Parameters
- N The number of records.
- K The records are in K-dimensional space.
- M The number of cells.
- I The number of records in the query.
- R The ratio of the length of a query range to
the length of a cell. - Q The number of queries.
14Computational Complexity Table
Method Expected Complexity of ORQ Storage
seq. scan N N
projection K log NN1-1/K K N
pt-in-box K log NN1-1/K (2K1) N
kd-tree log NI N
kd-tree d. log NI N
octree log NI (D1) N
cell (R1)K(11/R)K I MN
sparse cell (R1)K-1 log(M1/K)(R1)K(11/R)K I M1-1/KN
cell b. s. (R1)K-1 log(N/M1-1/K)(11/R)K-1 I M1-1/KN
cell f. s. (R1)K-1N/Q(11/R)K-1 I M1-1/KN
cell f. s. k. (R1)K-1N/Q(11/R)K-1 I M1-1/KN
15Test Problem Randomly Distributed Points
- Records are randomly distributed points in the
unit cube. - Number of records varies from 100 to 1,000,000.
- ORQ on a box containing approximately 10 records
is performed around each point.
16Execution Time Random Points
17Memory Usage Random Points
18Conclusions
- Projection methods like Point-In-Box perform
poorly. - The kd-tree does not yield the best performance,
but it performs fairly well over a wide range of
problems. - Cell methods offer the best performance when they
are well suited to the problem at hand. - Dense cell arrays work well for uniformly
distributed data. - Sparse cell arrays and cell arrays coupled with
binary searching offer good performance for most
problems. - Cell arrays coupled with forward searching offers
the best performance when doing multiple queries.