Semisupervised Learning and Class Discovery

1
Semisupervised Learning and Class Discovery
  • David Bazell Eureka Scientific, Inc.
  • David Miller Penn State University
  • Funded by NASA/AISRP

2
Overview
  • Objectives
  • Algorithm
  • Semisupervised Learning
  • Results
  • Future Work

3
Objectives
  • Use Available Labeled and Unlabeled Data
  • Use Prior Knowledge of Existing Classes
  • Identify Known Classes
  • Identify New and Interesting Classes

4
Semisupervised Learning
  • Unsupervised Learning: no labeled data
  • Supervised Learning: labeled data
  • Semisupervised Learning: use both types of data
  • More unlabeled data than labeled data
  • Unknown Object Classes may exist in the data

5
Mixture Model
6
Mixture Model
  • Each data point generated by a mixture component
  • Each class associated with one or more mixture
    components
  • Predefined components generate both labeled and
    unlabeled data
  • Nonpredefined components generate only
    unlabeled data
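
The generative picture in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-D Gaussian components, the class names "A", "B" and "new", the function name sample_dataset, and the label_keep_prob parameter are all hypothetical and do not come from the slides.

```python
import random


def sample_dataset(components, n, label_keep_prob=0.3, seed=0):
    """Draw n points from a 1-D Gaussian mixture (assumed form, for illustration).

    Each component is a dict with keys: weight, mean, std, cls, predefined.
    Returns a list of (x, label) pairs; label is None when it is missing.
    """
    rng = random.Random(seed)
    weights = [c["weight"] for c in components]
    data = []
    for _ in range(n):
        # Each data point is generated by exactly one mixture component.
        comp = rng.choices(components, weights=weights, k=1)[0]
        x = rng.gauss(comp["mean"], comp["std"])
        if comp["predefined"] and rng.random() < label_keep_prob:
            # Predefined components produce labeled data some of the time.
            label = comp["cls"]
        else:
            # Nonpredefined components never produce a label.
            label = None
        data.append((x, label))
    return data


# Hypothetical setup: two known classes and one undiscovered class.
COMPONENTS = [
    {"weight": 0.4, "mean": -2.0, "std": 0.5, "cls": "A", "predefined": True},
    {"weight": 0.4, "mean": 2.0, "std": 0.5, "cls": "B", "predefined": True},
    {"weight": 0.2, "mean": 8.0, "std": 0.5, "cls": "new", "predefined": False},
]

data = sample_dataset(COMPONENTS, n=500)
labeled = [(x, c) for x, c in data if c is not None]
unlabeled = [x for x, c in data if c is None]
```

In this sketch the unlabeled pool mixes points from all three components, while the labeled pool contains only classes "A" and "B", which is the situation the model is meant to handle.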

7
Mixture Model
  • Data points: (x_j, c_j, l), (x_i, m), (x_k, m)
  • Predefined mixture component: labels randomly missing
  • Nonpredefined mixture component: all labels missing
  • Key parameter: $\beta_{c|k} = \mathrm{Prob}(c \mid M_k)$, the fraction of
    samples from component k that belong to class c, including the
    unknown class u
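
As a rough illustration of the key parameter above (the fraction of samples from component k that belong to class c), the sketch below computes those fractions from hard component assignments. The slides do not give the estimation procedure, so the hard-assignment counting, the function name estimate_beta, and the toy data are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def estimate_beta(assignments):
    """Estimate per-component class fractions from (component, class) pairs.

    Returns a dict mapping component index k to a dict {class: fraction},
    where the fractions for each component sum to 1.
    """
    counts = defaultdict(Counter)
    for k, c in assignments:
        counts[k][c] += 1
    beta = {}
    for k, class_counts in counts.items():
        total = sum(class_counts.values())
        beta[k] = {c: n / total for c, n in class_counts.items()}
    return beta


# Toy data: component 0 holds mostly class "A"; component 1 holds only
# samples of the unknown class "u".
assignments = [(0, "A"), (0, "A"), (0, "A"), (0, "B"), (1, "u"), (1, "u")]
beta = estimate_beta(assignments)
```

A component whose fraction for "u" is close to 1, like component 1 here, is a candidate for a new class.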
8
Error Measures
  • Criterion 1 Error: error in classifying known classes as new
    classes and new classes as known classes
  • Criterion 2 Error: error in classifying examples to known classes
  • Criterion 3 Error: error in classifying examples to new classes
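
One possible reading of the three criteria can be written out as code. The slides do not define the denominators, so the choices here are assumptions: Criterion 1 is taken over all examples, Criterion 2 over examples that are truly known and predicted known, and Criterion 3 over examples that are truly new and predicted new. The sketch also assumes that discovered classes have already been matched to the true new-class names. The function name error_measures and the toy labels are hypothetical.

```python
def error_measures(true_labels, predicted_labels, known_classes):
    """Compute three error rates under the assumptions stated in the lead-in."""
    known = set(known_classes)
    pairs = list(zip(true_labels, predicted_labels))

    def rate(wrong, total):
        # Avoid division by zero when a subset is empty.
        return wrong / total if total else 0.0

    # Criterion 1: known examples sent to new classes, or new to known.
    crossed = sum((t in known) != (p in known) for t, p in pairs)

    # Criterion 2: wrong class among examples that stay within known classes.
    both_known = [(t, p) for t, p in pairs if t in known and p in known]

    # Criterion 3: wrong class among examples that stay within new classes.
    both_new = [(t, p) for t, p in pairs if t not in known and p not in known]

    return {
        "criterion_1": rate(crossed, len(pairs)),
        "criterion_2": rate(sum(t != p for t, p in both_known), len(both_known)),
        "criterion_3": rate(sum(t != p for t, p in both_new), len(both_new)),
    }


true_labels = ["A", "A", "B", "N1", "N1", "N2"]
predicted_labels = ["A", "B", "N1", "N1", "N2", "A"]
errors = error_measures(true_labels, predicted_labels, known_classes=["A", "B"])
```

With these toy labels, two of six examples cross the known/new boundary, and one of two examples is misclassified within each of the known and new groups.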

9
Data Sets
  • ESOLV data
      • 5217 objects
      • 12 parameters
      • 5 classes of galaxies
  • SDSS Early Release Data
      • 54007 objects
      • 6 parameters
      • 7 classes of objects

10
Results
11
Results
12
Results
13
Results
14
Applications and Future Work
  • Classification/discovery in large data sets with
    limited labeled data
  • Expand to larger feature sets (spectra)
  • Explore other methods of model selection
  • Guiding of unsupervised clustering