Title: Lower Bounds, Alternative Models, etc.
1 Lower Bounds, Alternative Models, etc.
2 External Memory Data Structures
[Figure: (space, query) bounds: (n, log n + T/B), (nB^ε, log_B n + T/B), (n, log n + T/B)?]
3 Previous Results: Internal Memory
- Computation model: pointer machine
- Most upper bound structures also fall into this model
- Range searching (T is the output size)
- O(N) space, O(N^ε + T) time (BM 80)
- O(N log N / log log N) space, O(log N + T) time (Chazelle 88)
- Tight for O(log^c N + T) query structures (Chazelle 90)
- Point enclosure (Chazelle 86)
- Persistent interval tree
- Θ(N) space, Θ(log N + T) time
- Optimal in both space and time
4 External Memory Models
- External pointer machine
- Natural generalization of the internal pointer machine
- Each node contains B data objects
- Out-degree grows from 2 to B
- Little success
- Bounding-volume hierarchy (non-replicating index structure)
- Tree structure
- Each object is stored only once
- Lower bounds known for R-trees, kdB-trees
- Indexability model (Hellerstein et al., PODS 97; Arge, Samoladas, and Yi, ESA 04)
[Figure: two-level I/O model with disk D, block I/O, memory M, and processor P]
5 External Memory Models
- Indexability model
- No structure at all!
- Only models the layout of data
- Each block contains B data objects
- Can magically find the smallest set of blocks whose union contains all results
- Cost is defined to be the number of blocks in that set
[Figure: relationship among the indexability model, the external pointer machine, and the bounding-volume hierarchy]
6 Indexability Model in Detail
- N data objects laid out in disk blocks, possibly with redundancy
- Each block holds at most B objects
- Cost of a query q: minimum number of blocks needed to retrieve all answers
- Can find those blocks without cost
- Redundancy r and access overhead (A0, A1)
- r: average number of copies of an object in the index
- Size is rn blocks, where n = N/B
- A0, A1: any query with T results can be covered by at most A0 + A1⌈T/B⌉ blocks
- Lower bound expressed as a tradeoff between r and A0, A1
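The cost and redundancy definitions above can be made concrete on a toy layout. The following is only a minimal sketch: the function names, the three-block layout, and the brute-force search are illustrative assumptions, not part of the indexability model, which charges nothing for finding the covering blocks.

```python
from itertools import combinations


def redundancy(blocks, num_objects):
    """Average number of stored copies per object (the r of the model)."""
    total_copies = sum(len(block) for block in blocks)
    return total_copies / num_objects


def query_cost(blocks, answer):
    """Minimum number of blocks whose union contains every object in `answer`.

    Brute force over subsets of blocks: exponential time, toy inputs only.
    """
    answer = set(answer)
    if not answer:
        return 0
    for k in range(1, len(blocks) + 1):
        for subset in combinations(blocks, k):
            if answer <= set().union(*subset):
                return k
    raise ValueError("some answer object is not stored in any block")


# Hypothetical layout: N = 6 objects, B = 3, objects 2 and 3 stored twice.
blocks = [{0, 1, 2}, {3, 4, 5}, {2, 3}]
print(redundancy(blocks, 6))        # 8 copies over 6 objects
print(query_cost(blocks, {2, 3}))   # one block suffices thanks to the extra copies
print(query_cost(blocks, {0, 5}))   # needs two blocks
```

Storing objects 2 and 3 a second time raises r above 1 but lets the query {2, 3} be answered from a single block, which is the kind of tradeoff the lower bounds quantify.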
7 Trade-off Results
- Range searching
- Point enclosure
- Dual bounds in external memory!
8 Alternative Models: Cache-Oblivious Model
9 Memory Hierarchy
- Modern machines have a complicated memory hierarchy
- Levels get larger and slower further away from the CPU
- Block sizes and memory sizes differ from level to level!
- There have been a few attempts to model the hierarchy, but none were successful
- They are too complicated!
10 The Cache-Oblivious Model (FLPR 99)
- Assume any (constant) number of levels in the hierarchy, each with different M and B
- Theorem: if an algorithm is optimal in the two-level model for any values of M and B, without knowing them, then it uses the optimal (up to a constant) number of memory transfers on every level of the memory hierarchy
- We still analyze the algorithm in the standard two-level model!
11 Example: Cache-Oblivious Search Tree
Question: How to lay out the binary tree in memory such that any root-to-leaf path visits O(log_B N) blocks?
How can we make it work for any block size B?
12 Van Emde Boas Layout
- Consider the first level of recursion where the tree size is < B
- Tree size is between √B and B
- So the tree has height Θ(log B)
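The recursive layout can be sketched in a few lines. This is an illustrative sketch, not the slides' own code: it assumes a complete binary tree with heap numbering (children of v are 2v and 2v + 1), splits the height roughly in half at each level of recursion, and assumes blocks are aligned at position 0 of the array. The names veb_order and worst_path_blocks are hypothetical.

```python
def veb_order(height, root=1):
    """List the nodes of a complete binary tree in van Emde Boas order.

    Nodes use heap numbering; `height` is the number of levels
    below and including `root`.
    """
    if height == 1:
        return [root]
    top_h = height // 2          # levels in the top subtree
    bot_h = height - top_h       # levels in each bottom subtree
    order = veb_order(top_h, root)
    first = root << top_h        # leftmost descendant top_h levels below root
    for sub_root in range(first, first + (1 << top_h)):
        order.extend(veb_order(bot_h, sub_root))
    return order


def worst_path_blocks(order, height, B):
    """Largest number of distinct size-B blocks on any root-to-leaf path."""
    position = {node: i for i, node in enumerate(order)}
    worst = 0
    for leaf in range(1 << (height - 1), 1 << height):
        blocks, v = set(), leaf
        while v >= 1:                      # walk from the leaf up to node 1
            blocks.add(position[v] // B)
            v //= 2
        worst = max(worst, len(blocks))
    return worst


height, B = 4, 3
veb = veb_order(height)
bfs = list(range(1, 1 << height))          # level-by-level layout, for comparison
print(veb)
print(worst_path_blocks(veb, height, B), worst_path_blocks(bfs, height, B))
```

For this 15-node tree with B = 3, every root-to-leaf path touches 2 blocks in the van Emde Boas layout and 3 in the level-by-level layout. The layout code never looks at B, which is what lets the same array work for any block size.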
13 Alternative Models: Dynamic Memory Model (Barve and Vitter, FOCS 98)
- Memory allocated to the algorithm changes over time
14 Memory Allocated Changes Over Time
- Sorting
- One I/O allowed in each time unit
- Memory (in blocks) changes by 0, 1 or -1 in each time unit
- Resources consumed by one I/O when the memory has m blocks is defined to be log m
- A sorting algorithm is measured in terms of resource consumption
- Gave a merge-sort algorithm with resource consumption O(n log n); this is also tight
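The cost measure above can be illustrated with a small accounting routine. This is a sketch under stated assumptions: the algorithm performs exactly one I/O in every time unit, the logarithm is taken base 2, and the function name resource_consumption and the sample traces are hypothetical.

```python
import math


def resource_consumption(memory_trace):
    """Total resources for one I/O per time unit.

    memory_trace[t] is the number of memory blocks m at time unit t;
    each I/O is charged log2(m) (base 2 is an assumption here).
    """
    for prev, cur in zip(memory_trace, memory_trace[1:]):
        if abs(cur - prev) > 1:
            raise ValueError("memory may change by at most one block per time unit")
    if any(m < 1 for m in memory_trace):
        raise ValueError("memory must hold at least one block")
    return sum(math.log2(m) for m in memory_trace)


print(resource_consumption([4, 4, 4, 4]))      # constant memory: four I/Os at log2(4) each
print(resource_consumption([2, 3, 4, 3, 2]))   # memory grows, then shrinks
```

Under this measure an I/O issued while memory is large costs more than one issued while memory is small, so an algorithm is rewarded for making each I/O more productive when it has more memory.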