Title: Algorithmic Aspects of Finite Metric Spaces
- Moses Charikar
- Princeton University
2Metric Space
- A set of points X
- Distance function $d(x,y)$, with $d : X \times X \to [0, \infty)$
- $d(x,y) = 0$ iff $x = y$
- $d(x,y) = d(y,x)$ (Symmetric)
- $d(x,z) \le d(x,y) + d(y,z)$ (Triangle inequality)
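The axioms can be checked mechanically on any finite point set. The sketch below is only an illustration; the function name `is_metric` and the sample points are assumptions, not part of the talk.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Return True if d satisfies the metric axioms on the finite set `points`."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False                      # distances are non-negative
        if (d(x, y) <= tol) != (x == y):
            return False                      # d(x,y) = 0 iff x = y
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # symmetry
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False                      # triangle inequality
    return True

pts = [(0, 0), (1, 0), (0, 1), (2, 2)]
l1 = lambda p, q: sum(abs(a - b) for a, b in zip(p, q))
sq = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))  # squared Euclidean

l1_ok = is_metric(pts, l1)   # l1 distance is a metric
sq_ok = is_metric(pts, sq)   # squared Euclidean distance violates the triangle inequality
```

On these points the squared Euclidean distance fails because going from (0,0) to (2,2) directly costs 8, while the detour through (1,0) costs 1 + 5 = 6.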
3Example Metrics: Normed spaces
- $x = (x_1, x_2, \ldots, x_d)$, $y = (y_1, y_2, \ldots, y_d)$
- $\ell_p$ norm: $\ell_1$, $\ell_2$ (Euclidean), $\ell_\infty$
- $\ell_p^d$: $\ell_p$ norm in $\mathbb{R}^d$
- Hamming cube $\{0,1\}^d$
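For reference, the standard definitions behind these names (written here in the notation of the list above) are:

```latex
\|x - y\|_p = \Bigl( \sum_{i=1}^{d} |x_i - y_i|^p \Bigr)^{1/p},
\qquad
\|x - y\|_\infty = \max_{1 \le i \le d} |x_i - y_i|
```

On the Hamming cube, the Hamming distance between two bit strings coincides with their $\ell_1$ distance.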
4Example Metrics: domain specific
- Shortest path distances on graph
- Symmetric difference on sets
- Edit distance on strings
- Hausdorff distance, Earth Mover Distance on sets of n points
5Metric Embeddings
- General idea: Map complex metrics to simple metrics
- Why? Richer algorithmic toolkit for simple metrics
- Simple metrics
  - normed spaces $\ell_p$
  - low dimensional normed spaces $\ell_p^d$
  - tree metrics
- Mapping should not change distances much (low distortion)
6Low Distortion Embeddings
- Metric spaces $(X_1,d_1)$ and $(X_2,d_2)$: an embedding $f : X_1 \to X_2$ has distortion $D$ if the ratio of distances changes by at most a factor of $D$ for all $x,y \in X_1$
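One common way to make this precise is to take the distortion to be the largest expansion times the largest contraction over all pairs. The sketch below assumes that definition; the helper name `distortion` and the 4-cycle example are hypothetical.

```python
from itertools import combinations

def distortion(points, d1, d2, f):
    """Max expansion times max contraction of f over all pairs of distinct points."""
    ratios = [d2(f(x), f(y)) / d1(x, y) for x, y in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Hypothetical example: the 4-cycle with shortest-path distances, laid out on a line.
n = 4
cycle = list(range(n))
d_cycle = lambda i, j: min(abs(i - j), n - abs(i - j))
d_line = lambda a, b: abs(a - b)
D = distortion(cycle, d_cycle, d_line, lambda i: i)
```

Here every pair keeps its distance except the pair (0, 3), which is adjacent on the cycle but three apart on the line, so this particular embedding has distortion 3.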
- High dimensional → Low dimensional (Dimension reduction)
  - Algorithmic efficiency (running time)
  - Compact representation (storage space)
  - Streaming algorithms
- Specific metrics → normed spaces
  - Nearest neighbor search
  - Optimization problems
- General metrics → tree metrics
  - Optimization problems, online algorithms
Solve problems on very large data sets in one pass using a very small amount of storage
8A (very) Brief History: fundamental results
- Metric spaces studied in functional analysis
- n point metric embeds into $\ell_\infty^n$ with no distortion [Frechet]
- n point metric embeds into $\ell_p$ with distortion $\log n$ [Bourgain 85]
- Dimension reduction for n point Euclidean metric with distortion $1+\epsilon$ [Johnson, Lindenstrauss 84]
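The first result has a short constructive proof: map each point to its vector of distances to all n points. A minimal sketch of that standard construction follows; the function names and the 5-cycle example are assumptions made for illustration.

```python
from itertools import combinations

def frechet_embedding(points, d):
    """Map each point x to the vector (d(x, p_1), ..., d(x, p_n))."""
    return {x: [d(x, p) for p in points] for x in points}

def linf(u, v):
    """l_infinity distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))

# Hypothetical example: shortest-path metric of the 5-cycle.
n = 5
cycle = list(range(n))
d_cycle = lambda i, j: min(abs(i - j), n - abs(i - j))
emb = frechet_embedding(cycle, d_cycle)
preserved = all(linf(emb[x], emb[y]) == d_cycle(x, y)
                for x, y in combinations(cycle, 2))
```

The triangle inequality gives $|d(x,p) - d(y,p)| \le d(x,y)$ in every coordinate, and the coordinate $p = y$ attains equality, so all distances are preserved exactly.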
9A (very) Brief History: applications in Computer Science
- Optimization problems
  - Application to graph partitioning [Linial, London, Rabinovich 95] [Arora, Rao, Vazirani 04]
  - n point metrics into tree metrics [Bartal 96, 98] [FRT 03]
- Efficient algorithms
  - Dimension reduction
  - Nearest neighbor search, Streaming algorithms
metric as model
- This is not an attempt at a survey
- Biased by my own interests
- Much more relevant and related work than I can do justice to in limited time.
- Goal: Give glimpse of different applications of finite metric spaces
- Core ideas, no messy details
12Disclaimer Community Bias
- Theoretical viewpoint
- Focus on algorithmic techniques with performance guarantees
- Worst case guarantees
14Metric as data
- What is the data?
- Mathematical representation of objects (e.g. documents, images, customer profiles, queries).
- Sets, vectors, points in Euclidean space, points in a metric space, vertices of a graph.
- Metric is part of data
15Johnson Lindenstrauss [JL84]
- n points in Euclidean space ($\ell_2$ norm) can be mapped down to $O((\log n)/\epsilon^2)$ dimensions with distortion at most $1+\epsilon$.
- Quite simple [JL84, FM88, IM98, AV99, DG99, Ach01]
- Project onto random unit vectors
- projection of $(u-v)$ onto one random vector behaves like Gaussian scaled by $\|u-v\|_2$
- Need $\log n$ dimensions for tight concentration bounds
- Even a random $\{-1,1\}$ vector works
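A minimal sketch of such a projection, assuming i.i.d. Gaussian matrix entries scaled by $1/\sqrt{k}$ (a common variant of projecting onto random unit vectors). The name `jl_project` and the sizes (8 points, 500 dimensions down to 200) are illustrative choices, not values from the talk.

```python
import math
import random

def jl_project(points, k, rng):
    """Project points (lists of floats) to k dimensions with a random Gaussian matrix."""
    d = len(points[0])
    # The matrix is drawn without looking at the points (an oblivious linear map).
    R = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(r[j] * p[j] for j in range(d)) for r in R] for p in points]

def l2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
pts = [[rng.uniform(-1, 1) for _ in range(500)] for _ in range(8)]
low = jl_project(pts, 200, rng)
# Ratio of projected distance to original distance, for every pair of points.
ratios = [l2(low[i], low[j]) / l2(pts[i], pts[j])
          for i in range(8) for j in range(i + 1, 8)]
```

Each ratio concentrates around 1, and the spread shrinks roughly like $1/\sqrt{k}$ as the target dimension k grows.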
16Dimension reduction for $\ell_2$
- Two interesting properties
  - Linear mapping
  - Oblivious: choice of linear mapping does not depend on point set
- Many applications
  - Making high dimensional problems tractable
  - Streaming algorithms
  - Learning mixtures of gaussians [Dasgupta 99]
  - Learning robust concepts [Arriaga, Vempala 99] [Klivans, Servedio 04]
17Dimension reduction for $\ell_1$
- [C, Sahai 02] Linear embeddings are not good for dimension reduction in $\ell_1$
- There exist n points in $\ell_1$ in n dimensions, such that any linear mapping with distortion $\delta$ needs $n/\delta^2$ dimensions
18Dimension reduction for $\ell_1$
- [C, Brinkman 03] Strong lower bounds for dimension reduction in $\ell_1$
- There exist n points in $\ell_1$, such that any embedding with constant distortion $\delta$ needs $n^{1/\delta^2}$ dimensions
- Alternate, simpler proof [Lee, Naor 03]
20Frequency Moments
[Alon, Matias, Szegedy 99]
- Data stream is sequence of elements in $[n]$
- $n_i$: frequency of element $i$
- $F_k = \sum_i n_i^k$: kth frequency moment
- $F_0$ = number of distinct elements
- $F_2$ = skewness measure of data stream
- Goal
  - Given a data stream, estimate $F_k$ in one pass and sub-linear space
21Estimating F2
- Consider a single counter $c$ and randomly chosen $x_i \in \{+1, -1\}$ for each $i$ in $[n]$
- On seeing each element $i$, update $c \leftarrow c + x_i$
- $c = \sum_i n_i x_i$
- Claim: $E[c^2] = \sum_i n_i^2 = F_2$, $\mathrm{Var}[c^2] \le 2 (F_2)^2$ (4-wise independence)
- Average $1/\epsilon^2$ copies of this estimator to get $(1+\epsilon)$ approximation
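The estimator can be sketched in a few lines. For simplicity this version stores fully independent random signs in a table, which uses space linear in the universe size; a real streaming implementation would generate the signs from a 4-wise independent hash family instead. The name `f2_estimate` and all sizes are assumptions for illustration.

```python
import random

def f2_estimate(stream, universe_size, copies, rng):
    """Average of c^2 over independent counters, each c = sum_i n_i * x_i."""
    # Simplification: an explicit table of fully independent +/-1 signs.
    signs = [[rng.choice((-1, 1)) for _ in range(universe_size)]
             for _ in range(copies)]
    counters = [0] * copies
    for item in stream:                 # a single pass over the stream
        for j in range(copies):
            counters[j] += signs[j][item]
    return sum(c * c for c in counters) / copies

rng = random.Random(1)
stream = [rng.randrange(20) for _ in range(200)]
exact_f2 = sum(stream.count(i) ** 2 for i in range(20))
approx_f2 = f2_estimate(stream, 20, 1000, rng)
```

With 1000 copies the standard deviation of the average is at most $\sqrt{2/1000} \cdot F_2$, under 5% of the true value.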
22Differences between data streams
- $n_i$: frequency of element $i$ in stream 1
- $m_i$: frequency of element $i$ in stream 2
- Goal: measure $\sum_i (n_i - m_i)^2$
- $F_2$ sketches are additive: $\sum_i n_i x_i - \sum_i m_i x_i = \sum_i (n_i - m_i) x_i$
- Basically, dimension reduction in $\ell_2$ norm
- Very useful primitive, e.g. frequent items [C, Chen, Farach-Colton 02]
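Additivity follows from the sketch being a linear function of the frequency vector, which a small example can confirm. The frequency vectors and the helper `sketch` below are made up for illustration.

```python
import random

def sketch(freq, signs):
    """Linear sketch: one counter per sign vector, c_j = sum_i freq[i] * signs[j][i]."""
    return [sum(f * s for f, s in zip(freq, row)) for row in signs]

rng = random.Random(2)
n, copies = 6, 4
signs = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(copies)]
stream1 = [3, 0, 2, 5, 1, 0]   # frequencies n_i
stream2 = [1, 4, 2, 0, 1, 2]   # frequencies m_i
diff = [a - b for a, b in zip(stream1, stream2)]

# Subtracting the two sketches gives exactly the sketch of the difference vector.
lhs = [a - b for a, b in zip(sketch(stream1, signs), sketch(stream2, signs))]
rhs = sketch(diff, signs)
```

Because the two sides agree counter by counter, the squared counters of the subtracted sketch estimate $\sum_i (n_i - m_i)^2$ without either stream being stored.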
23Estimate $\ell_1$ norms?
- [Indyk 00]
- p-stable distribution: Distribution over $\mathbb{R}$ such that $\sum_i n_i x_i$ is distributed as $(\sum_i |n_i|^p)^{1/p} X$
- Cauchy distribution $c(x) = \frac{1}{\pi (1+x^2)}$: 1-stable
- Gaussian distribution: 2-stable
- As before, $c = \sum_i n_i x_i$
- Cauchy does not have finite expectation!
- Estimate scale factor by taking median
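A minimal sketch of the median estimator, assuming standard Cauchy variables generated as $\tan(\pi(U - 1/2))$ for uniform $U$ and, as a simplification, stored in an explicit table. The function name `l1_estimate`, the update format `(item, delta)` and the example frequencies are all hypothetical.

```python
import math
import random
import statistics

def l1_estimate(updates, universe_size, copies, rng):
    """Estimate sum_i |n_i| from a stream of (item, delta) updates."""
    # One standard Cauchy variable per (counter, item): tan(pi * (U - 1/2)).
    cauchy = [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(universe_size)]
              for _ in range(copies)]
    counters = [0.0] * copies
    for item, delta in updates:
        for j in range(copies):
            counters[j] += delta * cauchy[j][item]
    # |Cauchy| has median 1, so the median of |c_j| estimates the scale factor.
    return statistics.median(abs(c) for c in counters)

freq1 = [5, 0, 3, 8, 2, 0, 1, 4]   # stream 1 frequencies
freq2 = [2, 6, 3, 0, 2, 1, 1, 9]   # stream 2 frequencies
updates = [(i, +1) for i, f in enumerate(freq1) for _ in range(f)]
updates += [(i, -1) for i, f in enumerate(freq2) for _ in range(f)]
exact_l1 = sum(abs(a - b) for a, b in zip(freq1, freq2))
approx_l1 = l1_estimate(updates, len(freq1), 2001, random.Random(3))
```

By 1-stability each counter is distributed as the $\ell_1$ norm of the difference vector times a standard Cauchy variable. The median is used because the mean of the counters does not converge.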
25Similarity Preserving Hash Functions
- Similarity function sim(x,y)
- Family of hash functions F with probability distribution such that $\Pr_{h \in F}[h(x) = h(y)] = \mathrm{sim}(x,y)$
- Compact representation scheme for estimating similarity
- Approximate nearest neighbor search [Indyk, Motwani 98] [Kushilevitz, Ostrovsky, Rabani]
27Estimating Set Similarity
- [Broder, Manasse, Glassman, Zweig 97]
- [Broder, C, Frieze, Mitzenmacher 98]
- Collection of subsets
28Minwise Independent Permutations
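As min-wise hashing is usually described, a set is hashed to its minimum element under a random permutation of the universe, and two sets collide with probability $|A \cap B| / |A \cup B|$. The sketch below assumes that description and uses truly random permutations rather than a compact min-wise independent family; the names and the example sets are illustrative.

```python
import random

def minhash_agreement(a, b, universe, trials, rng):
    """Fraction of random permutations under which a and b have the same minimum."""
    order = list(universe)
    agree = 0
    for _ in range(trials):
        rng.shuffle(order)  # a uniformly random permutation of the universe
        rank = {elem: pos for pos, elem in enumerate(order)}
        if min(a, key=rank.__getitem__) == min(b, key=rank.__getitem__):
            agree += 1
    return agree / trials

A = set(range(0, 60))
B = set(range(30, 90))
jaccard = len(A & B) / len(A | B)          # 30 / 90
estimate = minhash_agreement(A, B, range(90), 2000, random.Random(4))
```

The element of $A \cup B$ that comes first in the permutation is equally likely to be any of its members, and the two minima coincide exactly when that element lies in $A \cap B$.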
29Existence of SPH schemes [C 02]
- sim(x,y) admits an SPH scheme if $\exists$ family of hash functions F such that $\Pr_{h \in F}[h(x) = h(y)] = \mathrm{sim}(x,y)$
- Theorem: If sim(x,y) admits an SPH scheme then 1-sim(x,y) satisfies triangle inequality and embeds into $\ell_1$
- Rounding procedures for LPs and SDPs yield similarity and distance preserving hashing
30Random Hyperplane Rounding based SPH
- Collection of vectors
- Pick a random hyperplane through the origin, specified by its normal vector
- [Goemans, Williamson]
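A minimal sketch of the resulting one-bit hash, assuming the normal vector has i.i.d. Gaussian coordinates (so its direction is uniform). Two vectors at angle $\theta$ then receive the same bit with probability $1 - \theta/\pi$. The function names and the example vectors are hypothetical.

```python
import math
import random

def hyperplane_hash(vec, normal):
    """One-bit hash: which side of the hyperplane with this normal vec lies on."""
    return 1 if sum(a * b for a, b in zip(vec, normal)) >= 0 else 0

def collision_rate(u, v, trials, rng):
    """Empirical probability that u and v fall on the same side."""
    hits = 0
    for _ in range(trials):
        r = [rng.gauss(0.0, 1.0) for _ in u]  # random normal vector
        if hyperplane_hash(u, r) == hyperplane_hash(v, r):
            hits += 1
    return hits / trials

u, v = (1.0, 0.0), (1.0, 1.0)              # angle pi/4 between them
expected = 1 - (math.pi / 4) / math.pi      # 1 - theta/pi = 0.75
rate = collision_rate(u, v, 4000, random.Random(5))
```

Concatenating many such bits gives a compact signature whose Hamming distance reflects the angle between the original vectors.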
31Earth Mover Distance (EMD)
LP rounding algorithms for an optimization problem (metric labelling) yield a $\log n$ approximate estimator for EMD on n points. This implies that EMD embeds into $\ell_1$ with distortion $\log n$.
33Graph partitioning problems
- Given graph, partition into U,V
- Maximum cut: maximize $|E(U,V)|$
- Sparsest cut
- minimize
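Both objectives can be evaluated by brute force on a tiny graph. The sketch below assumes the sparsity of a cut is $|E(U,V)| / (|U| \cdot |V|)$, one common normalization and not necessarily the one on the slide. The graph (two triangles joined by an edge) and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

def cut_size(edges, U):
    """Number of edges with exactly one endpoint in U."""
    return sum((a in U) != (b in U) for a, b in edges)

def best_cuts(n, edges):
    """Brute force over all nontrivial subsets U; only sensible for tiny graphs."""
    max_cut, sparsest = 0, None
    for size in range(1, n):
        for U in map(set, combinations(range(n), size)):
            c = cut_size(edges, U)
            max_cut = max(max_cut, c)
            # Assumed normalization: cut edges divided by |U| * |V|.
            sparsity = Fraction(c, len(U) * (n - len(U)))
            if sparsest is None or sparsity < sparsest:
                sparsest = sparsity
    return max_cut, sparsest

# Two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
max_cut, sparsest = best_cuts(6, edges)
```

The sparsest cut separates the two triangles (one edge across a 3-by-3 split), while the maximum cut takes two edges from each triangle plus the bridge.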
34Correlation clustering
[Figure: example graph over the mentions "Mr. Rumsfeld", "The secretary" and "Saddam Hussein", with edges marked Similar (+) or Dissimilar (-); example courtesy Shuchi Chawla]
35Graph partitioning as metric problem
- Partitioning is equivalent to finding appropriate $\{0,1\}$ metric
- possibly additional constraints
- Objective function linear in metric
- Find best $\{0,1\}$ metric (cut metric)
36Metric relaxation approaches
- Max Cut [Goemans, Williamson 94]
  - map vertices to points on unit sphere (SDP)
  - exploit geometry to get good solution (random hyperplane cut)
- Sparsest Cut [Linial, London, Rabinovich 95]
  - LP gives best metric; need $\ell_1$ metric
  - [Bourgain 85] embeds any metric into $\ell_1$ with distortion $\log n$
  - Existential theorem can be made algorithmic
  - $\log n$ approximation
  - recent SDP based $\sqrt{\log n}$ approximation [Arora, Rao, Vazirani 04]
37Metric relaxation approaches
- Correlation clustering [C, Guruswami, Wirth 03] [Emanuel, Fiat 03] [Immorlica, Karger 03]
  - Find best $\{0,1\}$ metric from similarity/dissimilarity data via LP
  - Use metric to guide clustering
    - close points in same cluster
    - distant points in different clusters
- Learning best metric?
- Note: In many cases, LP/SDP can be eliminated to yield efficient algorithms
39Some connections to learning
- Dimension reduction in $\ell_2$
  - Learning mixtures of Gaussians [Dasgupta 99]: Random projections make skewed gaussians more spherical, making learning easier
  - Learning with large margin [Arriaga, Vempala 99] [Klivans, Servedio 04]: Random projections preserve margin; large margin → few dimensions
- Kernel methods for SVMs
  - mappings to $\ell_2$
40Ongoing developments
- Notion of intrinsic dimensionality of metric [..., Mendel, Naor 04]
- Doubling dimension: How many balls of radius R needed to cover ball of radius 2R?
- Complexity measure of metric space
- natural parameter for embeddings
- Open: Can every metric of constant doubling dimension in $\ell_2$ be embedded into $\ell_2$ with O(1) dimensions and O(1) distortion?
- Not true for $\ell_1$
- related to learning low dimension manifolds, PCA, MDS, LLE, Isomap
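The covering question behind the doubling dimension can be explored on a finite point set with a greedy cover. The sketch below only gives an upper bound on the number of balls needed, and only for the radii it is asked to try. The function names and the 16-point line example are assumptions.

```python
import math

def greedy_cover_count(points, d, center, R):
    """Greedily cover the ball of radius 2R around `center` with radius-R balls."""
    uncovered = [p for p in points if d(center, p) <= 2 * R]
    count = 0
    while uncovered:
        c = uncovered[0]                      # open a new ball at an uncovered point
        uncovered = [p for p in uncovered if d(c, p) > R]
        count += 1
    return count

def doubling_constant_upper_bound(points, d, radii):
    """Largest greedy cover size over all centers and the given radii."""
    return max(greedy_cover_count(points, d, x, R) for x in points for R in radii)

# Hypothetical example: 16 evenly spaced points on a line.
line = list(range(16))
lam = doubling_constant_upper_bound(line, lambda a, b: abs(a - b), radii=[1, 2])
dim_bound = math.log2(lam)   # doubling dimension is log2 of the doubling constant
```

For points on a line the bound stays small regardless of how many points there are, matching the intuition that the line is one-dimensional.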
41Some things I didn't mention
- Approximating general metrics via tree metrics
- modified notion of distortion
- useful for approximation, online algorithms
- Many mathematically appealing questions
- Embeddings between normed spaces
- Spectral methods for approximating matrices (SVD, LSI)
- PCA, MDS, LLE, Isomap
- Whirlwind tour of finite metrics
- Rich algorithmic toolkit for finite metric spaces
- Synergy between Computer Science and Mathematics
- Exciting area of active research
- range from practical applications to deep theoretical questions
- Many more applications to be discovered