1
Semi-Supervised Learning Using Randomized Mincuts
  • Avrim Blum, John Lafferty, Raja Reddy, Mugizi
    Rwebangira
  • Carnegie Mellon

2
Motivation
  • Often have little labeled data but lots of
    unlabeled data.
  • We want to use the relationships between the
    unlabeled examples to guide our predictions.
  • Assumption: Similar examples should generally
    be labeled similarly.

3
Learning using Graph Mincuts (Blum and Chawla, ICML 2001)
4
Construct an (unweighted) Graph
5
Add auxiliary super-nodes
6
Obtain s-t mincut
(Figure: the s-t mincut drawn between the + super-node side and the - super-node side.)
7
Classification

(Figure: nodes on the + side of the mincut are labeled +, nodes on the - side are labeled -.)
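The procedure on slides 3-7 can be summarized in a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes a k-NN similarity graph built with scikit-learn, uses networkx's max-flow/min-cut routine (networkx >= 3.0 API), and the helper name and parameters are illustrative.

```python
# A minimal sketch (not the authors' code) of mincut-based semi-supervised
# learning: build a k-NN graph, attach +/- super-nodes, take the s-t mincut.
import networkx as nx
import numpy as np
from sklearn.neighbors import kneighbors_graph

def mincut_classify(X, y, n_neighbors=5):
    """X: feature matrix; y: +1 / -1 for labeled examples, 0 for unlabeled."""
    # Unweighted k-NN graph over all examples (labeled and unlabeled).
    A = kneighbors_graph(X, n_neighbors=n_neighbors, mode='connectivity')
    G = nx.from_scipy_sparse_array(A)      # assumes networkx >= 3.0
    nx.set_edge_attributes(G, 1, 'capacity')

    # Auxiliary super-nodes: a source tied to every '+' example and a sink
    # tied to every '-' example, with infinite capacity on those ties.
    for i, label in enumerate(y):
        if label == +1:
            G.add_edge('s', i, capacity=float('inf'))
        elif label == -1:
            G.add_edge(i, 't', capacity=float('inf'))

    # The s-t mincut partitions the graph; the source side is labeled '+'.
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    return np.array([+1 if i in source_side else -1 for i in range(len(y))])
```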
8
  • Problem
  • Plain mincut gives no indication of its
    confidence on different examples.
  • Solution
  • Add random weights to the edges.
  • Run plain mincut and obtain a classification.
  • Repeat the above process several times.
  • For each unlabeled example, take a majority vote.
  • The margin of the vote gives a measure of
    confidence (a code sketch of this procedure
    follows below).
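To make the recipe concrete, here is a rough sketch (not the authors' code) that reuses a graph G built as in the earlier mincut sketch, with unit capacities and 's'/'t' super-nodes; the noise range, number of rounds, and seed are illustrative choices rather than values from the paper.

```python
# A rough sketch of the randomized-mincut vote over a prebuilt graph G.
import numpy as np
import networkx as nx

def randomized_mincut_vote(G, n_examples, n_rounds=20, seed=0):
    rng = np.random.default_rng(seed)
    votes = np.zeros(n_examples)
    for _ in range(n_rounds):
        H = G.copy()
        # Add small random weights to every example-example edge.
        for u, v, attr in H.edges(data=True):
            if u not in ('s', 't') and v not in ('s', 't'):
                attr['capacity'] = 1.0 + rng.random()
        # Re-run plain mincut on the perturbed graph and record the labels.
        _, (source_side, _) = nx.minimum_cut(H, 's', 't')
        for i in range(n_examples):
            votes[i] += 1.0 if i in source_side else -1.0
    # The sign of the vote is the prediction; the normalized margin of the
    # vote serves as a confidence estimate for each example.
    predictions = np.where(votes >= 0, +1, -1)
    margins = np.abs(votes) / n_rounds
    return predictions, margins
```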

9
Before adding random weights

(Figure: the unweighted graph and the mincut it induces.)
10
After adding random weights

(Figure: the same graph with random edge weights and the resulting mincut.)
11
  • PAC-Bayes
  • PAC-Bayes bounds show that the average of
    several hypotheses that are all consistent with
    the training data will probably be more accurate
    than any single hypothesis.
  • In our case, each distinct cut corresponds to a
    different hypothesis.
  • Hence the average of these cuts will probably be
    more accurate than any single cut.

12
  • Markov Random Fields
  • Ideally we would like to assign a weight to each
    cut in the graph (a higher weight to small cuts)
    and then take a weighted vote over all the cuts
    in the graph.
  • This corresponds to a Markov Random Field model.
  • We don't know how to do this efficiently, but we
    can view randomized mincuts as an approximation
    (one way to write the weighting is sketched below).
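One common way to make "a higher weight to small cuts" precise, offered here as an illustrative formulation rather than the exact model from the talk, is an Ising-style Markov random field over labelings y that agree with the labeled data:

```latex
% cut(y) counts the graph edges whose endpoints receive different labels.
P(y) \;\propto\; \exp\bigl(-\beta \,\mathrm{cut}(y)\bigr),
\qquad
\mathrm{cut}(y) \;=\; \sum_{(i,j)\in E} \mathbf{1}\bigl[\,y_i \neq y_j\,\bigr].
```

Under this reading, the weighted vote for an unlabeled node i is the marginal probability that y_i = +1, which is hard to compute exactly; randomized mincut can be viewed as approximating it by sampling a handful of small cuts.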

13
Related Work: Gaussian Fields
  • Zhu, Ghahramani, and Lafferty (ICML 2003).
  • Each unlabeled example receives a label that is
    the average of its neighbors.
  • Equivalent to minimizing the squared difference
    of the labels across edges (the energy is written
    out below).
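Written out, in the standard harmonic-function formulation of Zhu, Ghahramani, and Lafferty (2003), the labels f minimize a quadratic energy over the edge weights w_ij, with f clamped to the given labels on the labeled nodes; the minimizer is harmonic, i.e. each unlabeled node takes the weighted average of its neighbors:

```latex
E(f) \;=\; \tfrac{1}{2} \sum_{(i,j)\in E} w_{ij}\,\bigl(f_i - f_j\bigr)^2,
\qquad
f_i \;=\; \frac{\sum_{j \sim i} w_{ij}\, f_j}{\sum_{j \sim i} w_{ij}}
\quad \text{for each unlabeled node } i.
```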

14
  • How to construct the graph?
  • k-NN
  • Graph may not have small balanced cuts.
  • How to learn k?
  • Connect all points within distance d
  • Can have disconnected components.
  • How to learn d?
  • Minimum Spanning Tree
  • No parameters to learn.
  • Gives connected, sparse graph.
  • Seems to work well on most datasets (all three
    constructions are sketched below).
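A rough sketch of the three constructions, assuming scikit-learn and SciPy; the helper names and default parameters are illustrative, and the MST variant symmetrizes the tree and drops edge weights to match the unweighted graphs used earlier.

```python
# Illustrative sketches of the three graph constructions above; helper names
# and defaults are assumptions, not from the paper.
import scipy.sparse as sp
from scipy.sparse.csgraph import minimum_spanning_tree
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

def knn_graph(X, k=5):
    # k-NN: may lack small balanced cuts, and k must be chosen.
    return kneighbors_graph(X, n_neighbors=k, mode='connectivity')

def distance_graph(X, d=1.0):
    # Distance threshold: may leave the graph disconnected, and d must be chosen.
    return radius_neighbors_graph(X, radius=d, mode='connectivity')

def mst_graph(X):
    # Minimum spanning tree: parameter-free, connected, and sparse.
    D = pairwise_distances(X)                     # Euclidean by default
    T = minimum_spanning_tree(sp.csr_matrix(D))   # tree as a sparse matrix
    T = T + T.T                                   # symmetrize into an undirected graph
    T.data[:] = 1.0                               # drop weights: unweighted edges
    return T
```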

15
Experiments
  • ONE vs. TWO: 1128 examples
    (8 x 8 array of integers, Euclidean distance).
  • ODD vs. EVEN: 4000 examples
    (16 x 16 array of integers, Euclidean distance).
  • PC vs. MAC: 1943 examples
    (20 newsgroups dataset, TFIDF distance).

16
ONE vs. TWO
17
ODD vs. EVEN
18
PC vs. MAC
19
Accuracy vs. coverage: PC vs. MAC (12 labeled)
20
  • Conclusions
  • We can get useful estimates of the confidence of
    our predictions.
  • Often get better accuracy than plain mincut.
  • Minimum spanning tree gives good results across
    different datasets.

21
  • Future Work
  • Sample complexity lower bounds (i.e. how much
    unlabeled data do we need to see?).
  • More principled way of sampling cuts?

22
THE END
23
Questions?