Learning with Local and Global Consistency
1
Learning with Local and Global Consistency
By Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston and Bernhard Schölkopf, NIPS 2003
Presented by Qiuhua Liu, Duke University Machine Learning Group, March 23, 2007
2
Outline
  • The consistency assumption for semi-supervised
    learning: why can unlabeled data help
    classification?
  • The consistency algorithm: a very simple
    algorithm based on the above assumption.
  • The relation to Xuejun's label iteration
    algorithm
  • Experimental results
  • Conclusions

3
Semi-supervised Learning Problem
  • We all know that semi-supervised learning is very
    important, but why can unlabeled data help at all?
  • The key to semi-supervised learning problems is
    the prior assumption of consistency:
  • (1) Local consistency: nearby points are
    likely to have the same label.
  • (2) Global consistency: points on the same
    structure (a cluster or manifold) are likely to
    have the same label.

4
Local and Global Consistency
  • The key idea of the consistency algorithm is to let
    every point iteratively spread its label
    information to its neighbors until a globally
    stable state is reached.

5
Some Terms
  • $X = \{x_1, \ldots, x_n\}$, the data point set;
    the first $l$ points are labeled.
  • $L = \{1, \ldots, c\}$, the label set.
  • $F$, a classification on $X$: an $n \times c$ matrix
    that labels each point $x_i$ as $y_i = \arg\max_{j \leq c} F_{ij}$.
  • $Y$, the initial classification on $X$, which is a
    special case of $F$ with $Y_{ij} = 1$ if $x_i$ is
    labeled as $j$ and $Y_{ij} = 0$ otherwise.
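
As a concrete illustration of this notation, here is a minimal NumPy sketch of how $Y$ is built from the partial labels (the sizes and label assignments below are made up for illustration, not taken from the slides):

```python
import numpy as np

# n points, c classes; only a few points carry labels (indices chosen arbitrarily).
n, c = 6, 2
known = {0: 0, 1: 1}            # point index -> known class; the rest are unlabeled

# Y is the n x c initial classification: Y[i, j] = 1 iff x_i is labeled as class j.
Y = np.zeros((n, c))
for i, j in known.items():
    Y[i, j] = 1.0
```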

6
The Consistency Algorithm
  • 1. Construct the affinity matrix $W$ defined by a
    Gaussian kernel: $W_{ij} = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$
    for $i \neq j$, and $W_{ii} = 0$.
  • 2. Normalize $W$ symmetrically by $S = D^{-1/2} W D^{-1/2}$,
    where $D$ is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$.
  • 3. Iterate $F(t+1) = \alpha S F(t) + (1-\alpha) Y$
    with $\alpha \in (0, 1)$ until convergence.
  • 4. Let $F^*$ denote the limit of the sequence
    $\{F(t)\}$. The classification result is
    $y_i = \arg\max_{j \leq c} F^*_{ij}$.
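
A minimal NumPy sketch of the four steps above (the function name and the default $\sigma$ are illustrative choices, not from the slides; the paper uses $\alpha = 0.99$ in its experiments):

```python
import numpy as np

def consistency_iterative(X, Y, sigma=1.0, alpha=0.99, n_iter=1000):
    # Step 1: affinity matrix from a Gaussian kernel, with a zero diagonal.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Step 2: symmetric normalization S = D^{-1/2} W D^{-1/2},
    # where D is diagonal with D_ii = sum_j W_ij.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Step 3: iterate F(t+1) = alpha * S F(t) + (1 - alpha) * Y.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y

    # Step 4: label each point by the largest entry of its row of F*.
    return np.argmax(F, axis=1)
```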
7
The Consistency Algorithm (Cont.)
  • The first two steps are the same as in spectral
    clustering.
  • The third step: $F(t+1) = \alpha S F(t) + (1-\alpha) Y$.
  • First term: each point receives information from
    its neighbors.
  • Second term: retains the initial label information.
  • From the iteration equation, it is easy to
    show that $F^* = \lim_{t \to \infty} F(t) = (1-\alpha)(I - \alpha S)^{-1} Y$.
  • So we can compute $F^*$ directly, without any
    iterations.
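
Since the limit has this closed form, the whole iteration can be replaced by a single linear solve. A sketch, reusing $S$ and $Y$ as defined on the previous slides:

```python
import numpy as np

def consistency_closed_form(S, Y, alpha=0.99):
    # F* = (1 - alpha) (I - alpha S)^{-1} Y, computed as a linear solve
    # rather than an explicit matrix inverse. The (1 - alpha) factor does
    # not change the row-wise argmax, but is kept to match the formula.
    n = S.shape[0]
    F_star = np.linalg.solve(np.eye(n) - alpha * S, (1.0 - alpha) * Y)
    return np.argmax(F_star, axis=1)
```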

8
The convergence process
  • The initial label information is diffused along
    the two moons.
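
A toy reproduction of this diffusion, assuming the consistency_iterative() sketch from the algorithm slide and scikit-learn's make_moons generator; the kernel width sigma = 0.15 is an illustrative guess, not a value from the paper:

```python
import numpy as np
from sklearn.datasets import make_moons

X, true_labels = make_moons(n_samples=200, noise=0.05, random_state=0)

# Label exactly one point per moon; all other points start unlabeled.
Y = np.zeros((200, 2))
Y[np.argmax(true_labels == 0), 0] = 1.0
Y[np.argmax(true_labels == 1), 1] = 1.0

pred = consistency_iterative(X, Y, sigma=0.15, alpha=0.99)
print("agreement with the true moons:", (pred == true_labels).mean())
```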

9
Another view of the algorithm: the regularization
framework
  • It can easily be shown that the iteration limit $F^*$
    minimizes the following cost function:

$$Q(F) = \frac{1}{2} \left( \sum_{i,j=1}^{n} W_{ij} \left\| \frac{F_i}{\sqrt{D_{ii}}} - \frac{F_j}{\sqrt{D_{jj}}} \right\|^2 + \mu \sum_{i=1}^{n} \left\| F_i - Y_i \right\|^2 \right),$$

with $\mu = (1-\alpha)/\alpha > 0$.
  • The first term is the smoothing constraint:
    nearby points are likely to have the same label.
  • The second term is the fitting constraint: the
    classification result should not change too much
    from the initial assignment.
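
The step from this cost function to the closed form can be made explicit. Setting the derivative of $Q$ to zero, following the paper's derivation:

```latex
\frac{\partial Q}{\partial F}\Big|_{F=F^*} = F^* - S F^* + \mu\,(F^* - Y) = 0
\;\Longrightarrow\;
F^* = \tfrac{1}{1+\mu}\, S F^* + \tfrac{\mu}{1+\mu}\, Y .
% With \alpha = 1/(1+\mu) (equivalently \mu = (1-\alpha)/\alpha), this is the
% fixed point of step 3, and solving for F^* gives
% F^* = (1-\alpha)(I - \alpha S)^{-1} Y.
```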

10
Some variations of the consistency algorithm
Let $P = D^{-1} W$.
Variation (1): substitute $P$ for $S$ in the iteration,
giving $F^* = (I - \alpha P)^{-1} Y$.
Variation (2): $F^* = (D - \alpha W)^{-1} Y$.
Variation (3) is Xuejun's label iteration
algorithm, where $G_0$ is our graph.
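
A sketch of variations (1) and (2) in closed form, under the assumption that they match the variants given in the NIPS paper; variation (3) is the group's own algorithm and is not reproduced here:

```python
import numpy as np

def variant_closed_forms(W, Y, alpha=0.99):
    n = W.shape[0]
    d = W.sum(axis=1)
    P = W / d[:, None]                              # P = D^{-1} W (row-normalized W)
    F1 = np.linalg.solve(np.eye(n) - alpha * P, Y)  # variation (1): (I - alpha P)^{-1} Y
    F2 = np.linalg.solve(np.diag(d) - alpha * W, Y) # variation (2): (D - alpha W)^{-1} Y
    return F1, F2
```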
11
Experiment Results
(a) Digit recognition: digits 1-4 from the USPS
data set.
(b) Text classification: topics including autos,
motorcycles, baseball and hockey from the
20-newsgroups data set.
12
Conclusions
  • The key to semi-supervised learning problems is
    the consistency assumption.
  • The proposed consistency algorithm was
    demonstrated to be effective on the data sets
    considered.