Title: Fast, precise and dynamic distance queries
1Fast, precise and dynamic distance queries
- Yair Bartal, Hebrew U.
- Lee-Ad Gottlieb, Weizmann → Hebrew U.
- Liam Roditty, Bar Ilan
- Tsvi Kopelowitz, Bar Ilan → Weizmann
- Moshe Lewenstein, Bar Ilan
2Distance oracles
- A distance oracle for a point set S with distance function d() preprocesses S so that given any two points x,y in S, d(x,y) (or an approximation thereof) can be retrieved quickly.
- Interesting cases
  - Expensive to store all n^2 point pairs
    - Sublinear space
  - Expensive to query distance function d()
    - for example, when d() is graph-induced
3Distance oracles
- Introduced by TZ-05
  - Setting: weighted graph
  - Approximation ratio: 2k-1 (k > 1)
  - Query time: O(k)
  - Space: n^(1+1/k)
- Other possible parameters
  - Setting
    - Planar, Euclidean, graph, metric
  - Approximation to d(x,y)
    - O(k), O(log n), 1+ε
  - Query time
    - O(k), O(log n), O(1)
  - Space
    - O(n), n^(1+1/k)
  - Dynamic updates
    - addition or removal of points or graph edges
4Preliminaries Doubling dimension
- Definition: Ball B(x,r) = all points within distance r from x.
- The doubling constant λ (of a metric M) is the minimum value λ > 0 such that every ball can be covered by λ balls of half the radius
  - First used by Assouad 83, algorithmically by Clarkson 97.
- The doubling dimension is ddim(M) = log_2 λ(M)
  - Euclidean: ddim(R^d) = O(d)
- Packing property of doubling spaces
  - A set with diameter diam and minimum inter-point distance a contains at most (diam/a)^O(ddim) points
Here λ ≤ 7.
5Survey of oracle results
Reference       Setting                   Distortion     Query time  Space
TZ-05           Weighted graph            2k-1 (k > 1)   O(k)        n^(1+1/k)
MN-06           Metric                    O(k)           O(1)        n^(1+1/k)
Kle-02, Tho-04  Planar graph              1+ε            O(ε^-1)     O(n log n / ε)
HM-06           Doubling metric           1+ε            O(ddim)     ε^-O(ddim) n
BGKRL-11        Doubling metric, dynamic  1+ε            O(1)        ε^-O(ddim) n + 2^O(ddim log ddim) n
Caveat: word RAM model, assuming a word is sufficient to store any single interpoint distance.
Related model: distance labeling (Tal-04, Sli-05)
6Overview of techniques
- Some tools we'll need (both static and dynamic versions)
- Point hierarchies for doubling spaces
  - By now a standard construction
- Metric embeddings
  - Into trees
  - Into Euclidean space
- Tree search structures
  - Level ancestor queries in O(1) time
  - Least common ancestor (LCA) queries in O(1) time
7Preliminaries Spanners
- Oracle central idea: motivated by an observation originally made in the context of low-stretch spanners.
  - GGN-04, GR-08a, GR-08b
- A spanner of G is a subgraph H
  - H contains all vertices of G
  - H contains a subset of the edges of G
- Interesting properties of H
  - Stretch, degree, hop diameter
[Figure: graph G and its spanner H, with edge weights 1 and 2]
8Point hierarchies
- To explain the observation motivating the oracle, we need to introduce point hierarchies
- Hierarchies are the starting point for problems in doubling spaces
  - NNS, spanners, routing, embeddings
- A point hierarchy is composed of levels of r-nets
- An r-net for a point set S is a set of balls of radius r centered at points of S
  - Packing: the centers are separated from each other by some minimum distance r
  - Covering: the balls cover all the points of S.
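The packing and covering conditions can be made concrete with a greedy construction. The following is a minimal Python sketch, not the construction used in the talk: the function name `greedy_r_net` is hypothetical, points are assumed to be coordinate tuples, and the Euclidean `math.dist` stands in for a general metric d().

```python
import math

def greedy_r_net(points, r, dist=math.dist):
    """Greedily pick net centers from `points`.

    Packing: any two chosen centers are more than r apart.
    Covering: every input point is within distance r of some center.
    """
    centers = []
    for p in points:
        # p becomes a center only if no existing center already covers it.
        if all(dist(p, c) > r for c in centers):
            centers.append(p)
    return centers

points = [(0, 0), (0.5, 0), (1.5, 0), (3, 0), (3.2, 0.1)]
net = greedy_r_net(points, 1.0)
```

Each point is compared against all current centers, so this sketch takes quadratic time in the worst case; it is meant only to illustrate the definition.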
9Point hierarchies
1-net 2-net 4-net 8-net
10Point hierarchies
1-net 2-net 4-net 8-net
Packing
Radius 1
Covering: all points are covered
11Point hierarchies
1-net 2-net 4-net 8-net
Covering: all 1-net points are covered
18Another perspective
1-net 2-net 4-net 8-net
DAG
Number of levels: log(aspect ratio)
19Another perspective
1-net 2-net 4-net 8-net
Make arbitrary parent-child assignments
DAG → spanning tree
Number of levels: log(aspect ratio)
20Another perspective
1-net 2-net 4-net 8-net
Spanning tree
Number of levels: log(aspect ratio)
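The step from nested nets to a spanning tree can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the dynamic hierarchy of the talk: `build_hierarchy` and `base_radius` are hypothetical names, points are coordinate tuples, the bottom level simply holds all points, and "nearest center" is just one of the arbitrary parent choices the slide allows.

```python
import math

def build_hierarchy(points, base_radius=1.0, dist=math.dist):
    """Return (levels, parents).

    levels[0] holds all points; levels[i] (i >= 1) is a greedy net of
    levels[i-1] with radius base_radius * 2**i.
    parents[i][p] is the point of levels[i+1] chosen as the parent of p.
    """
    levels = [list(points)]
    parents = []
    r = base_radius
    while len(levels[-1]) > 1:
        r *= 2
        prev = levels[-1]
        net = []
        for p in prev:
            if all(dist(p, c) > r for c in net):
                net.append(p)
        # Covering guarantees a center within distance r; pick the nearest one.
        parents.append({p: min(net, key=lambda c: dist(p, c)) for p in prev})
        levels.append(net)
    return levels, parents

levels, parents = build_hierarchy([(0, 0), (1, 0), (2, 0), (3, 0), (8, 0)])
```

The loop stops once the radius reaches the diameter of the set, so the number of levels grows with the logarithm of the aspect ratio, as stated on the slide.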
21Towards an oracle
- Oracle stores all tree parent-child links
  - O(n) space
- Define c-neighbors: r-net point pairs within distance c·r = 3r/ε
- Store all distances between c-neighbors, and between their children
  - ε^-O(ddim) n space
- Note that the c-neighbor property is hereditary
  - If nodes a,b are c-neighbors in tree level r
  - Then the ancestors a',b' of a,b in any tree level r_i are c-neighbors as well (or are the same node)
  - Proof: d(a',b') ≤ d(a',a) + d(a,b) + d(b,b')
    ≤ 2·r_i + c·r + 2·r_i
    < c·r_i
22c-neighbors
1-net 2-net 4-net 8-net
23Spanner observation
- Let x,y denote two points in S, and by extension their corresponding tree leaf nodes.
- Let x',y' be the highest tree ancestors of x,y that are not c-neighbors.
- Note that d(x',y') is stored by the oracle, since the parents of x',y' are c-neighbors.
- Spanner Theorem
  - d(x',y') approximates d(x,y) to within a factor of 1+ε
- Proof by illustration
24Spanner observation
1-net 2-net 4-net 8-net
[Figure labels: x, y and their ancestors x', y']
25Spanner observation
1-net 2-net 4-net 8-net
[Figure labels: x, y, x', y'; distance > 12/ε; 6]
Distortion: (12/ε + 12)/(12/ε) = 1 + ε
26Oracle query
- Oracle query
  - For x,y in S, find d(x,y)
- Oracle does this instead
  - For x,y in S, find x',y' (the highest ancestors that are not c-neighbors)
  - Return stored d(x',y')
- Left with the following question
  - Ancestral non-neighbors query: find the highest tree ancestors that are not c-neighbors
  - We could view this as an abstract problem on trees and ignore the metric
27Ancestral non-neighbors query
- Some ideas (static case). Recall that neighborliness is hereditary
  - Brute force: try all ancestors, O(log aspect ratio)
  - Binary search using level ancestor queries: O(log log aspect ratio)
  - Balanced tree + brute force: O(log n)
  - Balanced tree + binary search: O(log log n)
- But we can do better
  - Make use of the tree structure
  - Get some help from the metric structure
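The binary-search idea works because neighborliness is hereditary: once two ancestors are c-neighbors at some level, they stay c-neighbors (or merge) at every higher level, so the predicate is monotone in the level. A minimal Python sketch follows; the names `highest_non_neighbor_level`, `anc_x`, `anc_y` and `are_neighbors` are hypothetical, and indexing into a precomputed ancestor list stands in for an O(1) level ancestor query.

```python
def highest_non_neighbor_level(anc_x, anc_y, are_neighbors):
    """Binary search for the highest level whose ancestors are not c-neighbors.

    anc_x[i], anc_y[i]: ancestors of x and y at tree level i (0 = leaf).
    are_neighbors(a, b, i): True if a, b are c-neighbors (or equal) at level i.
    Returns the level index, or None if x and y are already neighbors at level 0.
    """
    lo, hi = 0, len(anc_x)  # the first neighbor level lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if are_neighbors(anc_x[mid], anc_y[mid], mid):
            hi = mid          # neighbors here, so also at all higher levels
        else:
            lo = mid + 1      # not neighbors here, so the answer is at mid or above
    return lo - 1 if lo > 0 else None

# Toy 1-D example: level i has radius 2**i and the neighbor threshold is c * 2**i.
c = 4
near = lambda a, b, i: abs(a - b) <= c * 2**i
level = highest_non_neighbor_level([0] * 6, [100, 100, 100, 100, 100, 0], near)
```

With L levels this makes O(log L) predicate evaluations, matching the O(log log aspect ratio) bound on the slide when L = log(aspect ratio).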
28Ancestral non-neighbors query
- Lemma: d(x',y') is closely related to the tree level r of ancestors x',y'
  - log r = log d(x,y) - log c ± O(1)
- Corollary
  - A b-approximation to d(x,y) pinpoints the level of x',y' to log b + O(1) possible tree levels
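The corollary can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the talk: `candidate_levels` and `slack` are hypothetical names, levels are indexed by the base-2 logarithm of the net radius, and `slack` stands in for the unspecified O(1) term of the lemma.

```python
import math

def candidate_levels(estimate, b, c, slack=1):
    """Tree levels that may hold x', y', given d(x,y) <= estimate <= b * d(x,y).

    Assumes the true level is within `slack` of log2(d(x,y)) - log2(c).
    """
    high = math.floor(math.log2(estimate) - math.log2(c)) + slack
    low = math.ceil(math.log2(estimate / b) - math.log2(c)) - slack
    return range(low, high + 1)

levels_to_check = candidate_levels(estimate=1024, b=16, c=4)
```

The number of candidates is at most log2(b) + 2·slack + 1, which is the "log b + O(1)" of the corollary.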
29Oracle query
- Oracle Step 1: Run the oracle of MS-09 (similar in flavor to TZ-05, MN-06) on x,y with parameter k = O(log n)
  - Approximation ratio: O(k) = O(log n)
  - Query time: O(1)
  - Space: n^(1+1/k) = O(n)
- By the Corollary, an approximation ratio of O(log n) to d(x,y) limits the tree level of x',y' to O(log log n) possible levels.
30Oracle query
O(loglog n) levels
31Oracle query
- Snowflake embedding of Ass-04 and GKL-03
  - Given a set S in metric space
  - Embed S into O(ddim log ddim)-dimensional Euclidean space
  - Distortion O(ddim) into the snowflake d^(1/2)
- Oracle Step 2
  - Recall that the level of x',y' has been narrowed down to O(log log n) candidate levels.
  - Embed neighborhoods of O(log log n) levels into Euclidean space
32Oracle query
- What's going on?
  - We've narrowed down the level of x',y' to O(log log n) levels
  - These neighborhoods are small
- Build a snowflake for each neighborhood
  - O(ddim) = O(log^(1/3) n) dimensions
  - O(log ddim log log n) bits per dimension
  - So the Euclidean representation of each point fits into o(log^(1/2) n) bits (into a word)
- Lemma: the embedded (snowflake) distance between two points can be returned in O(1) time
  - Proof outline: the squared distance between two vectors w,z is w·w - 2 w·z + z·z.
  - A dot product can be computed in O(1) time by manipulating the multiplication operator
33Oracle query
- Dot product via multiplication, proof by example
- w = (1,2,3,4)
- z = (5,6,7,8)
- w packed (in reverse order): 0004000300020001
- z packed: 0005000600070008
- w × z, summing the partial products:
              0032002400160008
          00280021001400070000
      002400180012000600000000
  0020001500100005000000000000
  ----------------------------
  0020003900560070004400230008
- The middle field, 0070, is the dot product w·z = 5 + 12 + 21 + 32
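The packing trick in this example can be checked with a short Python sketch. The names `pack` and `packed_dot` are hypothetical, and decimal fields of four digits are used only to mirror the slide; a word-RAM implementation would presumably use binary fields with shifts and masks. The sketch assumes every field sum stays below 10^4, so no carries spill between fields.

```python
def pack(values, field_digits=4, reverse=False):
    """Pack small non-negative integers into one integer, one decimal field each.

    The first value of the (possibly reversed) sequence lands in the highest field.
    """
    seq = reversed(values) if reverse else values
    word = 0
    for v in seq:
        word = word * 10**field_digits + v
    return word

def packed_dot(w, z, field_digits=4):
    """Dot product of w and z using a single integer multiplication."""
    n = len(w)
    base = 10**field_digits
    product = pack(w, field_digits, reverse=True) * pack(z, field_digits)
    # Field n-1 (counting from the right) collects w[0]*z[0] + ... + w[n-1]*z[n-1].
    return (product // base**(n - 1)) % base

w, z = [1, 2, 3, 4], [5, 6, 7, 8]
result = packed_dot(w, z)
```

Packing w in reverse order is what aligns the matching coordinates: the products w[i]·z[i] all land in the same field of the result, while the cross terms fall into the other fields and are discarded.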
34Oracle query
- Result of Step 2
  - O(ddim) approximation to the snowflake distance between x,y (or rather, their ancestors in the appropriate neighborhood)
  - By the corollary, restricts the candidate levels of x',y' to O(log ddim) levels
- Oracle Step 3
  - Preprocessing: in neighborhoods of O(log ddim) levels, store a pointer from each pair to the highest ancestors which are not c-neighbors
  - Space: 2^O(ddim log ddim) per neighborhood or point
  - O(1) query time
35Dynamic oracle
- Steps that needed to be made dynamic
  - Hierarchy: already done (CG-06)
  - MS-09 oracle: Problem! Answer: tree embedding (Bar96)
  - Level ancestor query: Problem! Answer: jump trees
  - Snowflake embedding: Problem! Extension of above techniques
- Conclusion
  - There exists a dynamic (1+ε)-approximate distance oracle for doubling spaces with O(1) query time, which uses ε^-O(ddim) n + 2^O(ddim log ddim) n space and can be updated in time 2^-O(ddim) log n + 2^O(ddim log ddim)