Title: Unfolding Manifolds: Locally Linear K-Isomaps
1Unfolding Manifolds: Locally Linear K-Isomaps
- Ashutosh Saxena
- Abhinav Gupta
2Agenda
- Need for Unfolding Manifolds
- Tenenbaum Isomaps
- Local Linear Embedding
- Short-circuiting problem in Isomaps
- Proposed Locally Linear K-Isomaps
- Results
3Dimensionality Reduction
- Need to analyze large amounts of multivariate data.
- Human Faces
- Handwritten characters
- Speech Waveforms
- Global Climate patterns
- Discover compact representations of high dimensional data.
- Visualization
- Compression
- Better Recognition
- Probably meaningful dimensions
5Types of structures in Multivariate Data
- Clusters.
- On or around low Dimensional Manifolds
- Linear
- NonLinear
6Concepts of Manifolds
- A manifold is a topological space which is locally Euclidean.
- Manifolds arise naturally whenever there is a smooth variation of parameters
- Like the pose of the face in the previous example
- Curve parameters in handwritten characters
- The dimension of a manifold is the minimum integer number of co-ordinates necessary to identify each point in that manifold.
7Non-linear Manifolds
PCA and MDS see the Euclidean distance
What is important is the geodesic distance
Unroll the manifold
8Preserve the geodesic distance and not the Euclidean distance.
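To make the difference between the two distances concrete, here is a minimal sketch on a toy manifold that is not part of the slides: a unit semicircle, a 1-D curve embedded in 2-D. The function names are illustrative assumptions.

```python
import math

def semicircle(n):
    """Return n points evenly spaced along a unit semicircle (a 1-D manifold in 2-D)."""
    return [(math.cos(math.pi * i / (n - 1)), math.sin(math.pi * i / (n - 1)))
            for i in range(n)]

def path_length(points):
    """Approximate the geodesic length by summing short hops along the curve."""
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

points = semicircle(200)
straight = math.dist(points[0], points[-1])  # Euclidean: cuts through the ambient space
along = path_length(points)                  # geodesic: stays on the manifold
```

The Euclidean distance between the endpoints is about 2, while the distance along the curve is close to pi. Unrolling the curve onto a line should preserve the second number, not the first.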
9Two Methods
- Tenenbaum et al.'s Isomap Algorithm [1]
- Global approach.
- On a low dimensional embedding
- Nearby points should be nearby.
- Faraway points should be faraway.
- Roweis and Saul's Locally Linear Embedding [2]
- Local approach
- Nearby points nearby
- [1] Tenenbaum et al., Science, 22 Dec 2000, vol. 290.
- [2] Roweis et al., Science, 22 Dec 2000, vol. 290.
10Isomap
- Estimate the geodesic distance between faraway points.
- For neighboring points, Euclidean distance is a good approximation to the geodesic distance.
- For faraway points, estimate the distance by a series of short hops between neighboring points.
- Find shortest paths in a graph with edges connecting neighboring data points
Once we have all pairwise geodesic distances, use classical metric MDS
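The "series of short hops" idea can be sketched as a K-nearest-neighbor graph plus Dijkstra's algorithm. This is only an illustration under assumptions: the helper names `knn_graph` and `shortest_paths_from` and the semicircle toy data are not from the slides.

```python
import heapq
import math

def knn_graph(points, k):
    """Connect each point to its k nearest neighbors; edge weight = Euclidean distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph

def shortest_paths_from(graph, source):
    """Dijkstra's algorithm: graph distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy data (an assumption for illustration): 50 points along a unit semicircle.
points = [(math.cos(math.pi * i / 49), math.sin(math.pi * i / 49)) for i in range(50)]
graph = knn_graph(points, k=2)
geodesic = shortest_paths_from(graph, 0)
```

Here `geodesic[49]` approximates the arc length between the two ends of the semicircle (slightly under pi), which is well above the straight-line distance of 2.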
11Isomap
12Isomap-Algorithm
- Determine the neighbors.
- All points in a fixed radius.
- K nearest neighbors
- Construct a neighborhood graph.
- Each point is connected to the other if it is a K nearest neighbor.
- Edge Length equals the Euclidean distance
- Compute the shortest paths between all pairs of nodes
- Construct a lower dimensional embedding.
- Classical Multi-Dimensional Scaling (MDS)
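The final embedding step can be sketched with NumPy as follows. This is a minimal version of classical MDS under the usual double-centering formulation. The function name `classical_mds` is an assumption, and the input here is a small exact Euclidean distance matrix rather than graph distances, so the result is easy to check.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed points so that Euclidean distances approximate the entries of D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against small negative values
    return eigvecs[:, order] * np.sqrt(top_vals)

# Sanity check on distances that are exactly Euclidean (corners of a unit square).
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, n_components=2)
D_embedded = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

In Isomap, `D` would be the matrix of shortest-path distances from the neighborhood graph. The recovered coordinates `Y` are only defined up to rotation and reflection, so it is the pairwise distances that should be compared, not the coordinates.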
13Local Linear Embedding
Fit Locally, Think Globally
14Fit Locally
We expect each data point and its neighbours to lie on or close to a locally linear patch of the manifold.
Each point can be written as a linear combination of its neighbors. The weights are chosen to minimize the reconstruction error.
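A minimal sketch of the weight computation, assuming the standard LLE formulation: minimize the squared reconstruction error of a point from its neighbors, with the weights constrained to sum to one. The function name, the regularization constant `reg`, and the toy points are assumptions for illustration.

```python
import numpy as np

def reconstruction_weights(x, neighbors, reg=1e-3):
    """Weights w (summing to 1) that minimize ||x - sum_j w[j] * neighbors[j]||^2."""
    Z = neighbors - x                          # shift neighbors so x is the origin
    C = Z @ Z.T                                # local Gram matrix
    # Regularize: C is singular when there are more neighbors than dimensions.
    C = C + reg * np.trace(C) * np.eye(len(neighbors))
    w = np.linalg.solve(C, np.ones(len(neighbors)))
    return w / w.sum()                         # enforce the sum-to-one constraint

x = np.array([0.5, 0.5])
neighbors = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
w = reconstruction_weights(x, neighbors)
error = np.linalg.norm(x - w @ neighbors)
```

Because `x` is the midpoint of the first two neighbors, nearly all of the weight goes to those two and the reconstruction error is small. The regularization keeps it from being exactly zero.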
15Think Globally
16Tenenbaum Isomaps Short Circuit Problem
- How to choose neighborhoods.
- Susceptible to short-circuit errors if the neighborhood is larger than the folds in the manifold.
- If it is too small, we get isolated patches.
- With noisy data, the short-circuit problem surfaces
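The effect of neighborhood size can be illustrated on a toy hairpin curve: two parallel arms one unit apart, joined at one end. Everything here is an assumption for illustration (the data, the helper names, and the use of fixed-radius neighborhoods rather than K nearest neighbors).

```python
import heapq
import math

def radius_graph(points, radius):
    """Connect every pair of points no farther apart than `radius`."""
    graph = {i: {} for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius:
                graph[i][j] = graph[j][i] = d
    return graph

def graph_distance(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf

step = 0.25
lower = [(i * step, 0.0) for i in range(41)]         # x from 0 to 10 at y = 0
bend = [(10.0, j * step) for j in range(1, 4)]       # connector at x = 10
upper = [(10.0 - i * step, 1.0) for i in range(41)]  # x from 10 back to 0 at y = 1
points = lower + bend + upper

# Graph distance between the two free ends for three neighborhood sizes.
results = {r: graph_distance(radius_graph(points, r), 0, len(points) - 1)
           for r in (0.1, 0.3, 1.1)}
```

With radius 0.3 the graph follows the curve and the distance between the free ends is 21. With radius 1.1 an edge jumps across the gap and the distance collapses to 1, which is the short circuit. With radius 0.1 no edges form at all and the ends are disconnected.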
17Sparse Swiss-roll data used
18Original Data: Correct 2-D embedding
19Noisy data: Short-Circuit occurs
20Proposed Algorithm
- A better method for choosing the neighborhood in Tenenbaum's algorithm is proposed.
- We explicitly use the fact that manifolds are locally linear.
- Manifolds arise due to smooth variation of parameters.
- LL-K-Isomaps
- KLL neighbors out of the K nearest neighbors are chosen, based on how well they reconstruct the point linearly
21LL-K-Isomaps algorithm
- Determine the neighbors.
- K nearest possible neighbors
- KLL neighbors based on local linearity
- Construct a neighborhood graph.
- Each point is connected to the other if it is a K nearest neighbor.
- Edge Length equals the Euclidean distance
- Compute the shortest paths between all pairs of nodes
- Construct a lower dimensional embedding.
- Classical Multi-Dimensional Scaling (MDS)
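The slides do not spell out the exact rule for picking the KLL neighbors, so the sketch below is only one possible reading, not the authors' method. It assumes that LLE-style reconstruction weights are computed over all K candidates and that the KLL candidates with the largest absolute weight are kept. The function name `select_ll_neighbors`, the parameter `reg`, and the toy data are hypothetical.

```python
import numpy as np

def select_ll_neighbors(X, i, k, k_ll, reg=1e-3):
    """Pick k_ll of the k nearest neighbors of X[i] by reconstruction weight.

    Hypothetical criterion (an assumption, not taken from the slides):
    solve for sum-to-one reconstruction weights over all k candidates,
    then keep the k_ll candidates with the largest |weight|.
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    order = np.argsort(dists)
    candidates = order[order != i][:k]           # k nearest, excluding the point itself
    Z = X[candidates] - X[i]
    C = Z @ Z.T
    C = C + reg * np.trace(C) * np.eye(k)        # regularize the local Gram matrix
    w = np.linalg.solve(C, np.ones(k))
    w = w / w.sum()
    keep = np.argsort(-np.abs(w))[:k_ll]
    return candidates[keep]

# Point 0 sits on a flat patch (z = 0) with four in-plane neighbors;
# point 5 lies on a nearby "fold" above the patch and is the closest candidate.
X = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
    [0.0, 0.0, 0.9],
])
chosen = select_ll_neighbors(X, i=0, k=5, k_ll=4)
```

In this toy case the off-plane point is the nearest by Euclidean distance, yet it receives a very small reconstruction weight and is dropped, so the neighborhood graph would not get an edge across the fold. Whether this criterion behaves well on real noisy data is not something the sketch establishes.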
22LL-K-Isomaps Results
- Noise was added to the Swiss-Roll Data
- Tenenbaum's original algorithm suffered from short-circuiting
- LL-K-Isomaps were able to find the correct 2-D embedding with noisy data
23Results contd
Residual Variance: Tenenbaum
Residual Variance: Proposed
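For reference, residual variance in the Isomap literature is usually computed as 1 - R^2, where R is the linear correlation between the graph (geodesic) distances and the distances in the embedding. A minimal sketch, assuming that definition and using made-up distance lists:

```python
import math

def residual_variance(d_graph, d_embed):
    """Return 1 - R^2, where R is the linear correlation of the two distance lists."""
    n = len(d_graph)
    mean_g = sum(d_graph) / n
    mean_e = sum(d_embed) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(d_graph, d_embed))
    var_g = sum((g - mean_g) ** 2 for g in d_graph)
    var_e = sum((e - mean_e) ** 2 for e in d_embed)
    r = cov / math.sqrt(var_g * var_e)
    return 1.0 - r * r

d_graph = [1.0, 2.0, 3.0, 4.0]   # made-up geodesic distances
faithful = [2.0, 4.0, 6.0, 8.0]  # embedding distances that preserve them up to scale
distorted = [1.0, 2.0, 1.5, 0.5] # embedding distances that do not
good = residual_variance(d_graph, faithful)
bad = residual_variance(d_graph, distorted)
```

A value near zero means the embedding reproduces the graph distances up to a linear rescaling. Larger values indicate distortion, for example from a short circuit or from too few embedding dimensions.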
24Results contd (disconnected neighborhood)
Tenenbaum
Proposed
25Future Work
- Convergence of the algorithm will be proved mathematically
- Will be tested on
- Swiss-roll and S-roll data-sets
- Synthetic faces pose variations
- Handwritten characters
26Thank you. Questions?