Title: Advances in Metric Embedding Theory
1. Advances in Metric Embedding Theory
- Ofer Neiman
- Ittai Abraham, Yair Bartal
- Hebrew University
2. Talk Outline
- Current results
- New method of embedding.
- New partition techniques.
- Constant average distortion.
- Extend notions of distortion.
- Optimal results for scaling embeddings.
- Tradeoff between distortion and dimension.
- Work in progress
- Low dimension embedding for doubling metrics.
- Scaling distortion into a single tree.
- Nearest neighbors preserving embedding.
3. Embedding Metric Spaces
- Metric spaces (X, d_X), (Y, d_Y).
- An embedding is a function f: X → Y.
- f is non-contracting if d_Y(f(u), f(v)) ≥ d_X(u, v) for all u, v.
- For a non-contracting embedding f and u, v in X, let dist_f(u, v) = d_Y(f(u), f(v)) / d_X(u, v).
- f has distortion c if max_{u,v ∈ X} dist_f(u, v) ≤ c (worked example below).
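A small worked example (illustrative, not from the talk): take the uniform 3-point metric X = {a, b, c} with all pairwise distances 1, and embed it into the line by f(a) = 0, f(b) = 1, f(c) = 2. Then

\[
\mathrm{dist}_f(a,b) = \mathrm{dist}_f(b,c) = \frac{|f(a)-f(b)|}{d(a,b)} = 1,
\qquad
\mathrm{dist}_f(a,c) = \frac{|f(a)-f(c)|}{d(a,c)} = 2,
\]

so f is non-contracting and its distortion is max_{u,v} dist_f(u, v) = 2.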
4. Low-Dimension Embeddings into Lp
- For an arbitrary metric space on n points:
- Bourgain 85: distortion O(log n).
- LLR 95: distortion Θ(log n), dimension O(log² n).
- Can the dimension be reduced?
- For p = 2, yes: using JL, the dimension drops to O(log n).
- Theorem: embedding into Lp with distortion O(log n) and dimension O(log n), for any p.
- Theorem: distortion O(log^{1+θ} n), dimension Θ(log n / (θ log log n)).
5. Average Distortion Embeddings
- In many practical uses, the quality of an embedding is measured by its average distortion:
  - Network embedding
  - Multi-dimensional scaling
  - Biology
  - Vision
- Theorem: Every n-point metric space can be embedded into Lp with average distortion O(1), worst-case distortion O(log n), and dimension O(log n).
6. Variation on Distortion: The Lq-Distortion of an Embedding
- Given a non-contracting embedding f from (X, d_X) to (Y, d_Y),
- define its Lq-distortion as dist_q(f) = E[dist_f(u, v)^q]^{1/q}, where the expectation is over pairs u, v chosen uniformly at random (a computational sketch follows below).
- Theorem: The Lq-distortion is bounded by O(q).
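A minimal computational sketch of this definition (illustrative Python, not from the talk; it assumes the original metric is given as a distance matrix D, the embedding as points Y in Euclidean space, and that the embedding is non-contracting):

    import numpy as np

    def lq_distortion(D, Y, q):
        """D: n x n matrix of original distances; Y: n x m embedded points (L2); q >= 1."""
        n = D.shape[0]
        ratios = []
        for u in range(n):
            for v in range(u + 1, n):
                d_emb = np.linalg.norm(Y[u] - Y[v])
                ratios.append(d_emb / D[u, v])  # dist_f(u, v); >= 1 when f is non-contracting
        ratios = np.array(ratios)
        return float((ratios ** q).mean() ** (1.0 / q))  # E[dist_f^q]^(1/q) over uniform pairs

Setting q = 1 gives the average distortion; letting q → ∞ recovers the worst-case distortion.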
7. Partial and Scaling Distortion
- Definition: A (1−ε)-partial embedding has distortion D(ε) if at least a (1−ε) fraction of the pairs satisfy dist_f(u, v) ≤ D(ε).
- Definition: An embedding has scaling distortion D(·) if it is a (1−ε)-partial embedding with distortion D(ε), for all ε > 0 simultaneously.
- KSW 04:
  - Introduced the problem in the context of network embeddings.
  - Initial results.
- A 05:
  - Partial distortion and dimension O(log(1/ε)) for all metrics.
  - Scaling distortion O(log(1/ε)) for doubling metrics.
- Theorem: Scaling distortion O(log(1/ε)) for all metrics.
8. Lq-Distortion vs. Scaling Distortion
- An upper bound of O(log 1/ε) on the scaling distortion implies:
  - Lq-distortion O(min{q, log n}).
  - Average distortion O(1).
  - Worst-case distortion O(log n).
- For any metric:
  - ½ of the pairs have distortion ≤ c·log 2 = c
  - ¼ of the pairs have distortion ≤ c·log 4 = 2c
  - ⅛ of the pairs have distortion ≤ c·log 8 = 3c
  - ...
  - 1/n² of the pairs have distortion ≤ 2c·log n
  - For ε ≤ 1/n² this already accounts for all pairs, so the worst-case distortion is O(log n) (the summation is written out below).
- A lower bound of Ω(log 1/ε) on partial distortion implies Lq-distortion Ω(min{q, log n}).
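Written out, the dyadic summation behind these bullets (a standard calculation; c is the constant from the scaling bound): at most a 2^{-j} fraction of the pairs has distortion above c·log₂(2^j) = c·j, and those pairs have distortion at most c·(j+1), so

\[
\mathbb{E}\big[\mathrm{dist}_f(u,v)\big] \;\le\; \sum_{j \ge 0} 2^{-j}\cdot c\,(j+1) \;=\; 4c \;=\; O(1).
\]

The same sum with q-th powers bounds the Lq-distortion by O(q), and since no pair exceeds the worst-case O(log n), the Lq-distortion is O(min{q, log n}).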
9. Probabilistic Partitions
- P = {S_1, S_2, ..., S_t} is a partition of X if the clusters S_i are pairwise disjoint and their union is X.
- P(x) is the cluster containing x.
- P is Δ-bounded if diam(S_i) ≤ Δ for all i.
- A probabilistic partition is a distribution over a set of partitions.
- It is η-padded if, for every x, B(x, η·Δ) ⊆ P(x) with probability at least ½ (a construction sketch follows below).
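A minimal sketch of one standard way to generate a Δ-bounded partition, by random ball carving (illustrative Python, not necessarily the construction used in the talk; points, dist, and Delta are assumed inputs):

    import random

    def random_bounded_partition(points, dist, Delta):
        """Return a list of clusters (sets of points), each of diameter at most Delta."""
        unassigned = set(points)
        order = list(points)
        random.shuffle(order)  # process candidate centers in random order
        clusters = []
        for c in order:
            if c not in unassigned:
                continue
            # carve the ball of radius Delta/2 around c out of the unassigned points,
            # so every cluster has diameter at most Delta
            cluster = {x for x in unassigned if dist(c, x) <= Delta / 2}
            unassigned -= cluster
            clusters.append(cluster)
        return clusters

Each cluster sits inside a ball of radius Δ/2, which is what makes the partition Δ-bounded; obtaining the padding property requires a more careful choice of radii (as in CKR/FRT), which this sketch does not attempt.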
10. Partitions and Embedding
- Let Δ_i = 4^i be the scales.
- For each scale i, create a probabilistic Δ_i-bounded partition P_i that is η-padded.
- For each cluster S choose σ_i(S) ~ Ber(½) i.i.d.
- f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x)) (a code sketch of this coordinate follows below).
- Repeat O(log n) times.
- Distortion O(η^{-1} · log^{1/p} Δ), dimension O(log n · log Δ), where Δ is the diameter of X.
[Figure: the scales Δ_i = 4, 8, ... up to the diameter of X; a point x with its padding distance d(x, X \ P(x)).]
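A minimal sketch of one such coordinate (illustrative Python; it assumes clusters is a list of disjoint sets covering points, e.g. as returned by the ball-carving sketch above):

    import random

    def partition_coordinate(points, dist, clusters):
        """One embedding coordinate: f_i(x) = sigma_i(P_i(x)) * d(x, X minus P_i(x))."""
        sigma = [random.randint(0, 1) for _ in clusters]  # sigma_i(S) ~ Ber(1/2), i.i.d. per cluster
        cluster_index = {x: j for j, S in enumerate(clusters) for x in S}
        coord = {}
        for x in points:
            j = cluster_index[x]
            outside = [y for y in points if cluster_index[y] != j]
            # d(x, X \ P_i(x)): distance from x to the nearest point outside its own cluster
            pad = min(dist(x, y) for y in outside) if outside else 0.0
            coord[x] = sigma[j] * pad
        return coord

Concatenating such coordinates over all scales i, with O(log n) independent repetitions per scale as in the bullets above, gives the embedding analyzed on the next slides.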
11. Upper Bound
- f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x))
- For all x, y ∈ X:
  - P_i(x) ≠ P_i(y) implies d(x, X \ P_i(x)) ≤ d(x, y).
  - P_i(x) = P_i(y) implies |d(x, A) − d(y, A)| ≤ d(x, y) (for A = X \ P_i(x) = X \ P_i(y)).
- In both cases |f_i(x) − f_i(y)| ≤ d(x, y).
12. Lower Bound
- Take a scale i such that Δ_i ≈ d(x, y)/4.
- Then it must be that P_i(x) ≠ P_i(y).
- With probability ≥ ½: d(x, X \ P_i(x)) ≥ η·Δ_i.
- With probability ¼: σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0 (combined below).
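Combining the two events (the σ's are chosen independently of the partition), with probability at least ½ · ¼ = ⅛ we get f_i(y) = 0 and

\[
|f_i(x) - f_i(y)| \;=\; d\big(x, X \setminus P_i(x)\big) \;\ge\; \eta\,\Delta_i \;=\; \Omega\big(\eta\, d(x, y)\big).
\]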
13. η-Padded Partitions
- The parameter η determines the quality of the embedding.
- Bartal 96: η = Ω(1/log n) for any metric space.
- Rao 99: η = Ω(1), used to embed planar metrics into L2.
- CKR 01 + FRT 03: improved partitions with η(x) = Ω(1/log ρ(x, Δ)).
- KLMN 03: used to embed general and doubling metrics into Lp with distortion O(η^{-(1-1/p)} · log^{1/p} n) and dimension O(log² n).
- The local growth rate ρ(x, r) of x at radius r measures how much the ball around x grows between comparable radii (roughly, the ratio |B(x, 2r)| / |B(x, r/2)|).
14. Uniform Probabilistic Partitions
- In a uniform probabilistic partition, the padding parameter is a function η: X → [0, 1].
- All points in a cluster have the same padding parameter.
- Uniform partition lemma: there exists a uniform probabilistic Δ-bounded partition with η(x) = Ω(1/log ρ(v, Δ)), where v is the point of minimal local growth rate in P(x), v = argmin_{u ∈ P(x)} ρ(u, Δ).
[Figure: two clusters C1 and C2 containing points v1, v2, v3, with padding parameters η(C1) and η(C2).]
15. Embedding into One Dimension
- Let Δ_i = 4^i.
- For each scale i, create a uniformly padded probabilistic Δ_i-bounded partition P_i.
- For each cluster S choose σ_i(S) ~ Ber(½) i.i.d.
- f(x) = Σ_i f_i(x), where f_i(x) = σ_i(P_i(x)) · η_i^{-1}(x) · d(x, X \ P_i(x)).
- Upper bound: |f(x) − f(y)| ≤ O(log n) · d(x, y).
- Lower bound: E[|f(x) − f(y)|] ≥ Ω(d(x, y)).
- Replicate D = Θ(log n) times to get high probability.
16. Upper Bound: |f(x) − f(y)| ≤ O(log n) · d(x, y)
- For all x, y ∈ X:
  - P_i(x) ≠ P_i(y) implies f_i(x) ≤ η_i^{-1}(x) · d(x, y).
  - P_i(x) = P_i(y) implies |f_i(x) − f_i(y)| ≤ η_i^{-1}(x) · d(x, y).
- Uses the uniform padding within the cluster.
17. Lower Bound
- Take a scale i such that Δ_i ≈ d(x, y)/4.
- Then it must be that P_i(x) ≠ P_i(y).
- With probability ½: f_i(x) = η_i^{-1}(x) · d(x, X \ P_i(x)) ≥ Δ_i.
18. Lower Bound: E[|f(x) − f(y)|] ≥ Ω(d(x, y))
- Let R = |Σ_{j ≠ i} (f_j(x) − f_j(y))| be the contribution of the other scales. Two cases:
- If R < Δ_i/2:
  - With probability ⅛: σ_i(P_i(x)) = 1, σ_i(P_i(y)) = 0, and x is padded.
  - Then f_i(x) ≥ Δ_i and f_i(y) = 0,
  - so |f(x) − f(y)| ≥ Δ_i/2 = Ω(d(x, y)).
- If R ≥ Δ_i/2:
  - With probability ¼: σ_i(P_i(x)) = 0 and σ_i(P_i(y)) = 0,
  - so f_i(x) = f_i(y) = 0,
  - and |f(x) − f(y)| ≥ R ≥ Δ_i/2 = Ω(d(x, y)) (summarized below).
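In both cases |f(x) − f(y)| ≥ Δ_i/2 with probability at least ⅛, and Δ_i ≈ d(x, y)/4, so

\[
\mathbb{E}\big[|f(x) - f(y)|\big] \;\ge\; \frac{1}{8} \cdot \frac{\Delta_i}{2} \;=\; \Omega\big(d(x, y)\big).
\]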
19. Coarse Scaling Embedding into Lp
- Definition: For u ∈ X, r_ε(u) is the minimal radius such that |B(u, r_ε(u))| ≥ εn.
- Coarse scaling embedding: for each u ∈ X, preserve distances outside B(u, r_ε(u)) (a code sketch of r_ε follows below).
[Figure: three points u, v, w with their balls of radii r_ε(u), r_ε(v), r_ε(w).]
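A minimal sketch of r_ε(u) as defined above (illustrative Python; points, dist, and eps are assumed inputs):

    import math

    def r_eps(u, points, dist, eps):
        """Minimal radius r such that |B(u, r)| >= eps * n (the ball includes u itself)."""
        n = len(points)
        k = max(1, math.ceil(eps * n))
        radii = sorted(dist(u, x) for x in points)  # dist(u, u) = 0 is included
        return radii[k - 1]                         # distance to the k-th closest point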
20. Scaling Distortion
- Claim: If d(x, y) ≥ r_ε(x), then 1 ≤ dist_f(x, y) ≤ O(log 1/ε).
- Let l be the scale with d(x, y) ≈ Δ_l.
- Lower bound: E[|f(x) − f(y)|] ≥ Ω(d(x, y)).
- Upper bound for high-diameter terms.
- Upper bound for low-diameter terms.
- Replicate D = Θ(log n) times to get high probability.
21. Upper Bound for High-Diameter Terms: |f(x) − f(y)| ≤ O(log 1/ε) · d(x, y)
- Consider scales l with r_ε(x) ≤ d(x, y) ≤ Δ_l.
22. Upper Bound for Low-Diameter Terms: |f(x) − f(y)| ≤ O(1) · d(x, y)
- Let l be the scale such that d(x, y) ≤ Δ_l ≤ 4·d(x, y).
- Every lower scale i ≤ l contributes at most Δ_i, and these contributions sum to O(d(x, y)) (see below).
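Explicitly, with Δ_i = 4^i the low-diameter scales contribute at most

\[
\sum_{i \le l} \Delta_i \;=\; \sum_{i \le l} 4^{i} \;\le\; \frac{4}{3}\,\Delta_l \;\le\; \frac{16}{3}\, d(x, y) \;=\; O\big(d(x, y)\big).
\]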
23. Embedding into Lp
- A partition P is (η, δ)-padded if B(x, η·Δ) ⊆ P(x) with probability at least δ.
- Lemma: there exist (η, δ)-padded partitions with η(x) = Ω(log(1/δ) / log ρ(v, Δ)), where v = argmin_{u ∈ P(x)} ρ(u, Δ).
- Hierarchical partition: every cluster in level i is a refinement of a cluster in level i+1.
- Theorem: Every n-point metric space can be embedded into Lp with dimension O(e^p · log n), such that for every q the Lq-distortion is O(min{q, log n}).
24. Embedding into Lp
- Embedding into Lp with scaling distortion:
  - Use partitions with a small padding probability, δ ≈ e^{-p}.
  - Hierarchical uniform partitions.
  - Combination with Matousek's sampling techniques.
25. Low-Dimension Embeddings
- Embedding with distortion O(log^{1+θ} n) and dimension Θ(log n / (θ log log n)).
- Optimal trade-off between distortion and dimension.
- Use partitions with a high padding probability, δ = 1 − log^{-θ} n.
26. Additional Results: Weighted Averages
- Embedding with weighted average distortion O(log Φ) for weights with aspect ratio Φ.
- Algorithmic applications:
  - Sparsest cut,
  - Uncapacitated quadratic assignment,
  - Multiple sequence alignment.
27. Low-Dimension Embeddings: Doubling Metrics
- Definition: A metric space has doubling constant λ if any ball of radius r > 0 can be covered by λ balls of half the radius.
- Doubling dimension: log λ.
- GKL 03: embedding of doubling metrics with tight distortion.
- Theorem: Embedding arbitrary metrics into Lp with distortion O(log^{1+θ} n) and dimension O(log λ).
  - Essentially the same embedding, with similar techniques.
  - Uses nets.
  - Uses the Lovász Local Lemma.
- Theorem: Embedding arbitrary metrics into Lp with distortion O(log^{1−1/p} λ · log^{1/p} n) and dimension Õ(log n · log λ).
  - Uses hierarchical partitions as well.
28. Scaling Distortion into Trees
- A 05: Probabilistic embedding into a distribution of ultrametrics with scaling distortion O(log(1/ε)).
- Theorem: Embedding into a single ultrametric with scaling distortion O(√(1/ε)).
- Theorem: Every graph contains a spanning tree with scaling distortion O(√(1/ε)).
- These imply:
  - Average distortion O(1).
  - L2-distortion O(√(log n)).
- Can be viewed as a network design objective.
- Theorem: Probabilistic embedding into a distribution of spanning trees with scaling distortion Õ(log²(1/ε)).
29. New Results: Nearest-Neighbor Preserving Embeddings
- Definition: x, y are k-nearest neighbors if |B(x, d(x, y))| ≤ k (code sketch below).
- Theorem: Embedding into Lp with distortion Õ(log k) on k-nearest neighbors, for all k simultaneously, and dimension O(log n).
- Theorem: For fixed k, embedding into Lp with distortion O(log k) and dimension O(log k).
- Practically the same embedding:
  - Every level is scaled down, higher levels more aggressively.
  - Uses the Lovász Local Lemma.
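A minimal sketch of the k-nearest-neighbor relation defined above (illustrative Python; points and dist are assumed inputs):

    def are_k_nearest_neighbors(x, y, points, dist, k):
        """x, y are k-nearest neighbors iff |B(x, d(x, y))| <= k (the ball includes x itself)."""
        ball_size = sum(1 for z in points if dist(x, z) <= dist(x, y))
        return ball_size <= k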
30. Nearest-Neighbor Preserving Embeddings
- Theorem: Probabilistic embedding into a distribution of ultrametrics with distortion Õ(log k) for all k-nearest neighbors.
- Theorem: Embedding into a single ultrametric with distortion k−1 for all k-nearest neighbors.
- Applications:
  - Sparsest cut with neighboring demand pairs.
  - Approximate ranking / k-nearest-neighbor search.
31. Conclusions
- Unified framework for embedding arbitrary metrics.
- New measures of distortion.
- Embeddings with improved properties
- Optimal scaling distortion.
- Constant average distortion.
- Tight distortion-dimension tradeoff.
- Embedding metrics in their doubling dimension.
- Nearest-neighbor preserving embeddings.
- Constant average distortion spanning trees.