Graph Diffusion - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Graph Diffusion


1
Graph Diffusion
  • John Paisley
  • Duke University

2
Motivation
  • Graph diffusion provides a new method for
    answering old questions:
  • What is the distance between points 1, 2 and 3?
  • How would I cluster/label this dataset?
  • What is an effective dimensionality reduction?
  • Graph diffusion provides meaningful answers via:
  • Diffusion maps
  • Diffusion distances
  • Spectral clustering, etc.

3
Random Walk Graph
  • The first step for all applications is to define
    a kernel function, e.g. a Gaussian kernel,
    $L_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$.
  • With N data points, we obtain an N x N matrix
    where closer points have a larger weight. The
    kernel width, $\sigma$, controls the meaning of
    closeness.
  • Using the total weight of a node (data point),
    $d_i = \sum_j L_{ij}$, we normalize each row of the
    matrix, L, to obtain a random walk matrix,
    $M = D^{-1} L$, where $D = \mathrm{diag}(d_1, \dots, d_N)$.
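
The construction on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's code: the Gaussian form of the kernel, the value of `sigma`, the function names and the random toy data are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """N x N matrix L with L[i, j] = exp(-||x_i - x_j||^2 / sigma^2) (assumed kernel)."""
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / sigma ** 2)

def random_walk_matrix(L):
    """Divide each row of L by the node's total weight (its row sum)."""
    d = L.sum(axis=1)
    return L / d[:, None]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))          # toy data (an assumption): 6 points in 2-D
L = gaussian_kernel_matrix(X, sigma=1.0)
M = random_walk_matrix(L)
print(M.sum(axis=1))                 # every row of M sums to 1, up to rounding
```

A smaller `sigma` makes the walk more local, since the weights to distant points shrink faster.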

4
Symmetric Graph
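  • A symmetric matrix, $M_s$, that is adjoint to the
    original random walk matrix, $M$, is also defined:
    $M_s = D^{1/2} M D^{-1/2}$, where $D$ is the diagonal
    matrix of node weights (the row sums of L).
  • This means $M$ and $M_s$ share the same
    eigenvalues, $\lambda_j$, and eigenvectors are related
    as follows:
    $\phi_j = D^{1/2} v_j$, $\psi_j = D^{-1/2} v_j$,
  • where $v_j$ is the jth eigenvector of $M_s$, and
    $\phi_j$ and $\psi_j$ are left and right eigenvectors
    of $M$.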
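
The relation between the two matrices can be checked numerically. The sketch below assumes a Gaussian kernel with `sigma = 1` and random toy data; the variable names (`M_s`, `Phi`, `Psi`) are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))                       # toy data (an assumption)
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq_dist / 1.0 ** 2)                   # Gaussian kernel, sigma = 1 (assumed)
d = L.sum(axis=1)                                 # total weight of each node
M = L / d[:, None]                                # random walk matrix, D^{-1} L

# Symmetric counterpart: D^{1/2} M D^{-1/2} = D^{-1/2} L D^{-1/2}
M_s = L / np.sqrt(np.outer(d, d))

# eigh returns eigenvalues in ascending order; reverse so the largest comes first
lam, V = np.linalg.eigh(M_s)
lam, V = lam[::-1], V[:, ::-1]

Phi = V * np.sqrt(d)[:, None]                     # columns: left eigenvectors of M
Psi = V / np.sqrt(d)[:, None]                     # columns: right eigenvectors of M

print(np.allclose(M @ Psi, Psi * lam))                 # M psi_j = lambda_j psi_j
print(np.allclose(Phi.T @ M, lam[:, None] * Phi.T))    # phi_j^T M = lambda_j phi_j^T
```

Working with the symmetric matrix is convenient in practice because a symmetric eigensolver returns real eigenvalues and orthonormal eigenvectors.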

5
Diffusion Mapping
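  • These right eigenvectors, $\psi_j$, provide new
    coordinates for the mapping, with each dimension
    weighted by its eigenvalue, $\lambda_j$.
  • For example, for the ith observation, the new
    coordinates are
    $x_i \mapsto (\lambda_1 \psi_1(i), \lambda_2 \psi_2(i), \dots, \lambda_{N-1} \psi_{N-1}(i))$
    (the constant eigenvector $\psi_0$, with $\lambda_0 = 1$,
    carries no information and is omitted).
  • If d < N, where d is the original dimension,
    haven't we increased the dimensionality?
  • Answer: Not necessarily. Since the eigenvalues
    are decreasing,
    $1 = \lambda_0 \geq \lambda_1 \geq \dots \geq \lambda_{N-1}$,
    each dimension is weighted less. By looking at
    the values of these eigenvalues, one can
    threshold and select the top few dimensions,
    discarding the rest.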
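
A minimal sketch of the mapping with eigenvalue thresholding follows. The Gaussian kernel, the helix toy data, the function name `diffusion_map` and the threshold value of 0.1 are assumptions for illustration; the slides do not specify how the cut-off is chosen.

```python
import numpy as np

def diffusion_map(X, sigma, threshold):
    """Return thresholded diffusion coordinates and the eigenvalues that were kept."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq_dist / sigma ** 2)          # Gaussian kernel (assumed form)
    d = L.sum(axis=1)                          # total weight of each node
    M_s = L / np.sqrt(np.outer(d, d))          # symmetric matrix D^{-1/2} L D^{-1/2}
    lam, V = np.linalg.eigh(M_s)
    lam, V = lam[::-1], V[:, ::-1]             # decreasing order, lambda_0 = 1 first
    Psi = V / np.sqrt(d)[:, None]              # right eigenvectors of M
    # Skip the constant eigenvector psi_0, then keep eigenvalues above the threshold.
    keep = 1 + np.flatnonzero(lam[1:] > threshold)
    return Psi[:, keep] * lam[keep], lam[keep]

# Toy data (an assumption): 40 points along a helix in 3-D
t = np.linspace(0.0, 2.0 * np.pi, 40)
X = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
coords, lam_kept = diffusion_map(X, sigma=0.5, threshold=0.1)
print(coords.shape, lam_kept)
```

Row i of `coords` holds the new coordinates of the ith observation, one column per retained eigenvalue.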

6
Example 1: Spectral Clustering
  • Spectral clustering is clustering in this
    diffusion-mapped space. K-means or more
    advanced clustering methods can be used.
  • The mapping is shown in 2-D, though more
    dimensions exist. Notice how diffusion mapping
    has unwrapped the data manifold, making
    clustering much easier. Now, Euclidean distance
    is meaningful.
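
As a rough illustration of clustering in the diffusion-mapped space, the sketch below maps two concentric rings and runs a small hand-written k-means on the first nontrivial diffusion coordinate. The toy data, the Gaussian kernel, `sigma = 0.7`, the choice to keep a single coordinate and the `kmeans` helper are all assumptions for this example, not the method used to produce the figure on the slide.

```python
import numpy as np

# Toy data (an assumption): two concentric rings, 40 points each
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
ring = np.column_stack([np.cos(theta), np.sin(theta)])
X = np.vstack([1.0 * ring, 3.0 * ring])

sigma = 0.7                                        # kernel width (assumed)
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq_dist / sigma ** 2)
d = L.sum(axis=1)
lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
lam, V = lam[::-1], V[:, ::-1]                     # largest eigenvalue first
Psi = V / np.sqrt(d)[:, None]                      # right eigenvectors of M
Y = Psi[:, 1:2] * lam[1:2]                         # first nontrivial coordinate, shape (N, 1)

def kmeans(Y, centers, n_iter=20):
    """Plain Lloyd iterations starting from the given initial centers."""
    for _ in range(n_iter):
        dist = np.sum((Y[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = dist.argmin(axis=1)
        centers = np.array([Y[labels == k].mean(axis=0)
                            for k in range(len(centers))])
    return labels

# Deterministic initialization: the two extreme points along the coordinate
init = Y[[Y[:, 0].argmin(), Y[:, 0].argmax()]]
labels = kmeans(Y, init)
print(labels[:40], labels[40:])
```

K-means applied directly to the original 2-D points cannot separate the rings, because both share the same centroid; in the mapped coordinate each ring collapses to its own value.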

7
Example 2: Data Sorting
  • Consider the task of sorting data by similarity.
    Euclidean distance in the original space is not
    always the best way to do this.
  • By diffusion mapping, we can sort along the
    manifold. In a diffusion-mapped space, we simply
    use the Euclidean distance.
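
One simple way to sort along a one-dimensional manifold is to order the points by their first nontrivial diffusion coordinate. The sketch below does this for shuffled points on an arc; the toy data, the Gaussian kernel and `sigma = 0.2` are assumptions, and a single coordinate is only adequate when the data really lie along one curve.

```python
import numpy as np

# Toy data (an assumption): 50 points on three quarters of a circle, in shuffled order
rng = np.random.default_rng(0)
t = rng.permutation(np.linspace(0.0, 1.5 * np.pi, 50))   # shuffled positions on the arc
X = np.column_stack([np.cos(t), np.sin(t)])

sigma = 0.2                                        # kernel width (assumed)
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq_dist / sigma ** 2)
d = L.sum(axis=1)
lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
lam, V = lam[::-1], V[:, ::-1]                     # largest eigenvalue first
first_coord = lam[1] * V[:, 1] / np.sqrt(d)        # lambda_1 * psi_1

order = np.argsort(first_coord)                    # sort by position in the mapped space
print(np.round(t[order], 2))                       # arc positions in the recovered order
```

The two ends of the arc are close in the plane but far apart along the curve, which is the situation where sorting by the original Euclidean distance is misleading.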

8
Calculating Distances
  • The diffusion distance is written below. In this
    form, we can see that the eigenvalues effectively
    weight the contribution of each dimension to the
    final distance measure.
    $\mathcal{D}^2(a, b) = \sum_j \lambda_j^2 (\psi_j(a) - \psi_j(b))^2$
  • For a random walk of t steps, the matrix $M$ is
    simply raised to that power, $M^t$, and the
    diffusion distance becomes
    $\mathcal{D}_t^2(a, b) = \sum_j \lambda_j^{2t} (\psi_j(a) - \psi_j(b))^2$
  • Only the eigenvalues change in how they weight
    dimensions. Therefore, we can deduce that
    lower-indexed eigenvectors (those with the largest
    eigenvalues) contain coarser information and vice
    versa. For this reason, diffusion maps can be
    used for multi-scale problems.
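
The effect of the number of steps t can be seen by evaluating the eigenvalue weights and the resulting distance for a few values of t. The sketch below is illustrative only: the Gaussian kernel, `sigma = 1`, the random toy data and the helper names are assumptions.

```python
import numpy as np

def diffusion_eigs(X, sigma):
    """Eigenvalues (decreasing) and right eigenvectors of the random walk matrix."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq_dist / sigma ** 2)          # Gaussian kernel (assumed form)
    d = L.sum(axis=1)
    lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
    lam, V = lam[::-1], V[:, ::-1]
    return lam, V / np.sqrt(d)[:, None]

def diffusion_distance(lam, Psi, a, b, t=1):
    """Diffusion distance between points a and b for a t-step random walk."""
    diff = Psi[a, 1:] - Psi[b, 1:]             # psi_0 is constant, so it cancels
    return np.sqrt(np.sum(lam[1:] ** (2 * t) * diff ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                   # toy data (an assumption)
lam, Psi = diffusion_eigs(X, sigma=1.0)
for t in (1, 4, 16):
    # weights of the first three dimensions, then the distance between points 0 and 1
    print(t, lam[1:4] ** (2 * t), diffusion_distance(lam, Psi, 0, 1, t))
```

As t grows, the weights of the higher-indexed dimensions shrink fastest, so the distance is increasingly determined by the coarse, leading eigenvectors.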

9
Some Intuition Behind the Mapping
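  • Intuitively, the diffusion mapping can be
    understood by writing the equations for one
    eigenvector, $M \psi = \lambda \psi$:
    $\psi(i) = \frac{1}{\lambda} \sum_j M_{ij} \psi(j)$
  • The coordinate, $\psi(i)$, is proportional to a
    weighted average of the coordinates of all points,
    with the weights given by row i of $M$.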
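
The weighted-average reading of the eigenvector equation can be verified for a single point and a single eigenvector. The kernel, `sigma`, the toy data and the chosen indices below are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))                      # toy data (an assumption)
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq_dist / 1.0 ** 2)                   # Gaussian kernel, sigma = 1 (assumed)
d = L.sum(axis=1)
M = L / d[:, None]                                # random walk matrix
lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
lam, V = lam[::-1], V[:, ::-1]                    # largest eigenvalue first
Psi = V / np.sqrt(d)[:, None]                     # right eigenvectors of M

i, j = 3, 1                                       # point index, eigenvector index
weighted_average = M[i] @ Psi[:, j]               # sum_k M[i, k] * psi_j(k)
print(Psi[i, j], weighted_average / lam[j])       # the two values agree up to rounding
```

Since row i of M sums to one, `weighted_average` is a genuine average of the other coordinates, weighted by the transition probabilities out of point i.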

10
Extension to New Data
  • Considered this way, a simple extension can be
    formulated for estimating the coordinates of the
    (N+1)st data point.
  • We append a new row to $M$ consisting of the
    normalized kernel values to all N existing
    points.
  • If the first N data points adequately represent
    all data that will come, we can easily and
    quickly map new data without having to reconstruct
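
The extension described above can be sketched as follows: compute the normalized kernel row for the new point, then apply the weighted-average relation with the existing eigenvectors. The Gaussian kernel, `sigma = 1`, the helper names `fit` and `extend`, and the choice of three coordinates are assumptions made for this illustration.

```python
import numpy as np

def fit(X, sigma):
    """Eigenvalues (decreasing) and right eigenvectors for the N existing points."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq_dist / sigma ** 2)          # Gaussian kernel (assumed form)
    d = L.sum(axis=1)
    lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
    lam, V = lam[::-1], V[:, ::-1]
    return lam, V / np.sqrt(d)[:, None]

def extend(x_new, X, sigma, lam, Psi, n_coords):
    """Estimate psi_1..psi_n_coords at a new point from its normalized kernel row."""
    k = np.exp(-np.sum((X - x_new) ** 2, axis=1) / sigma ** 2)
    m = k / k.sum()                            # new row appended to M
    j = np.arange(1, n_coords + 1)             # skip the constant eigenvector psi_0
    return (m @ Psi[:, j]) / lam[j]            # psi_j(new) = (1/lambda_j) sum_i m_i psi_j(i)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                   # existing data (toy, an assumption)
lam, Psi = fit(X, sigma=1.0)

x_new = np.array([0.1, -0.2])                  # hypothetical new observation
print(extend(x_new, X, 1.0, lam, Psi, n_coords=3))
```

Only the leading eigenvectors are extended here because dividing by a very small eigenvalue would amplify noise. Multiplying the result by `lam[1:4]` gives the diffusion-map coordinates of the new point.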

11
More Intuition: A Direct Method
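  • The values of $M$ can be found from the
    eigenfunctions.
  • The Spectral Theorem:
    $M_s = \sum_j \lambda_j v_j v_j^T$,
    where $v_j$ are the orthonormal eigenvectors of the
    symmetric matrix $M_s$.
  • Using the relation of the two matrices'
    eigenvectors:
    $M = \sum_j \lambda_j \psi_j \phi_j^T$, i.e. $M_{ak} = \sum_j \lambda_j \psi_j(a) \phi_j(k)$,
    where $\phi_j$ and $\psi_j$ are the left and right
    eigenvectors of $M$.
  • This leads to the following calculation of the
    diffusion distance:
    $\mathcal{D}^2(a, b) = \sum_k (M_{ak} - M_{bk})^2 / d_k$,
    where $d_k$ is the total weight of node k.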
  • The intuition is that the diffusion distance can
    be (almost) viewed as the Euclidean distance
    between rows of the random walk matrix. Instead
    of measuring distance between a and b directly,
    their distance is measured by their relative
    distance to all other points in the dataset,
    making this distance robust to noise.
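
The agreement between the spectral form and the direct, row-based form of the diffusion distance can be checked numerically. In the sketch below the rows of the t-step matrix are compared with each coordinate difference divided by the node weight, which is the reason for the "almost" above. The kernel, `sigma`, the toy data and the chosen indices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))                      # toy data (an assumption)
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq_dist / 1.0 ** 2)                   # Gaussian kernel, sigma = 1 (assumed)
d = L.sum(axis=1)                                 # total weight of each node
M = L / d[:, None]                                # random walk matrix
lam, V = np.linalg.eigh(L / np.sqrt(np.outer(d, d)))
lam, V = lam[::-1], V[:, ::-1]                    # orthonormal eigenvectors of M_s
Psi = V / np.sqrt(d)[:, None]                     # right eigenvectors of M

a, b, t = 0, 4, 3
M_t = np.linalg.matrix_power(M, t)                # t-step transition probabilities

# Direct form: weighted Euclidean distance between rows a and b of M^t
direct = np.sum((M_t[a] - M_t[b]) ** 2 / d)
# Spectral form: eigenvalue-weighted distance between diffusion coordinates
spectral = np.sum(lam[1:] ** (2 * t) * (Psi[a, 1:] - Psi[b, 1:]) ** 2)

print(direct, spectral)                           # squared distances, equal up to rounding
```

The direct form needs no eigendecomposition, while the spectral form makes it possible to truncate to a few dimensions.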