Building Classifiers in Environments with Multimodal Inputs

1
Building Classifiers in Environments with
Multimodal Inputs
  • Lee Wee Sun
  • Singapore-MIT Alliance
  • Department of Computer Science
  • National University of Singapore

2
Classifiers and Multimodal Inputs
A classifier takes an input (e.g. an image) and
outputs a class (e.g. person k in a database).
In certain environments, we have more than one
type of input corresponding to the same object,
e.g. the image and voice of the same person.
3
Classifiers and Multimodal Inputs
[Figure labels: Wong Weng Fai, Albert Einstein]
Multimodal inputs can help improve
classification accuracy.
This talk: Multimodal inputs can help us build
(learn) classifiers without the help of a teacher.
4
Outline
  • Co-training
  • Sufficient conditions
  • Conditional Independence
  • Learning from noise
  • Potential applications in distance interaction
    and education

5
Supervised Learning
  • Standard method for training classifiers.
  • A teacher labels a set of examples.
  • Machine learning algorithm learns from labeled
    examples.
  • Aims to do well on future unseen examples.

[Figure: training examples labeled Lee, Lee, Goh, Goh, Lee, Goh, and one unlabeled example marked "?"]
6
Co-training
  • Blum & Mitchell 1998
  • Motivating application: classifying web pages
  • Web pages have two descriptions:
  • Content of the page itself, and content of the
    links (anchor text) to the page
  • Algorithm:
  • Build two classifiers, one on page content, one
    on link contents
  • Iteratively make the two classifiers agree with
    each other
  • A slight advantage in one classifier creates a
    virtuous cycle
  • Iterations eventually result in two good
    classifiers
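
The loop described above can be pictured with a small sketch. This is a simplified illustration, not the exact procedure from Blum & Mitchell: the synthetic two-view data, the nearest-mean classifier, and every function name here are assumptions made for the example. Each view's classifier labels the unlabeled examples it is most confident about, and those labels go into a pool that the other view's classifier trains on in its next turn.

```python
import random

def make_two_view_data(n, rng):
    """Hypothetical data: two 1-D views drawn independently given the class."""
    ys = [i % 2 for i in range(n)]
    view_a = [rng.gauss(6.0 * y, 1.0) for y in ys]
    view_b = [rng.gauss(6.0 * y, 1.0) for y in ys]
    return view_a, view_b, ys

def fit_means(xs, ys):
    """Nearest-mean classifier: store the mean feature value of each class."""
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c) for c in (0, 1)}

def predict(means, x):
    """Return (label, confidence); confidence is the gap between the two distances."""
    d0, d1 = abs(x - means[0]), abs(x - means[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)

def co_train(view_a, view_b, seed_labels, rounds=10, per_round=5):
    """Grow a shared label pool; each view labels its most confident examples."""
    labels = dict(seed_labels)  # assumed to contain at least one example per class
    for _ in range(rounds):
        for view in (view_a, view_b):
            known = list(labels)
            means = fit_means([view[i] for i in known], [labels[i] for i in known])
            unlabeled = [i for i in range(len(view)) if i not in labels]
            unlabeled.sort(key=lambda i: predict(means, view[i])[1], reverse=True)
            for i in unlabeled[:per_round]:
                labels[i] = predict(means, view[i])[0]
    return labels

rng = random.Random(0)
view_a, view_b, ys = make_two_view_data(100, rng)
# Only the first four examples (two per class) come with teacher labels.
labels = co_train(view_a, view_b, {i: ys[i] for i in range(4)})
accuracy = sum(labels[i] == ys[i] for i in labels) / len(labels)
print(f"labeled {len(labels)} examples, accuracy {accuracy:.2f}")
```

Because both classifiers read from and write to the same pool, a confident label from one view becomes training data for the other, which is one way to read "make the two classifiers agree with each other".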

7
Conditional Independence
  • Sufficient conditions:
  • Possible to classify correctly with either input
    mode (or with at least one of the modes)
  • The two input modes are conditionally independent
    given the class.

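[Figure label: Identity]
$P(\text{Image} \mid \text{Identity}, \text{Voice}) = P(\text{Image} \mid \text{Identity})$
$P(\text{Voice} \mid \text{Identity}, \text{Image}) = P(\text{Voice} \mid \text{Identity})$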
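
As a short worked step, conditional independence given the class, i.e. P(Image | Identity, Voice) = P(Image | Identity), implies that the joint distribution of the two modes factorizes once the identity is fixed. The first line below is the chain rule; the second substitutes the condition:

```latex
\begin{align*}
P(\text{Image}, \text{Voice} \mid \text{Identity})
  &= P(\text{Image} \mid \text{Identity}, \text{Voice}) \, P(\text{Voice} \mid \text{Identity}) \\
  &= P(\text{Image} \mid \text{Identity}) \, P(\text{Voice} \mid \text{Identity})
\end{align*}
```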
8
Conditional Independence
  • Each observation is an image-voice pair,
    corresponding to a link in the bipartite graph
  • If conditional independence holds, an image
    instance is paired with any voice instance of the
    same person according to the distribution of that
    person's voice instances, regardless of what the
    image is.
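
A small simulation may make the pairing statement concrete. It is only a sketch under assumed data: the identities reuse the names Lee and Goh from the earlier slide, while the instance names (`lee_img1`, `lee_voice1`, and so on) and both function names are hypothetical. Image and voice are drawn independently once the identity is chosen, so the voice instances observed alongside one image of a person should follow roughly the same distribution as those observed alongside another image of that person.

```python
import random
from collections import Counter

IMAGES = {"lee": ["lee_img1", "lee_img2"], "goh": ["goh_img1", "goh_img2"]}
VOICES = {"lee": ["lee_voice1", "lee_voice2"], "goh": ["goh_voice1", "goh_voice2"]}

def sample_pairs(n, rng):
    """Draw (identity, image, voice); image and voice are independent given identity."""
    triples = []
    for _ in range(n):
        who = rng.choice(["lee", "goh"])
        triples.append((who, rng.choice(IMAGES[who]), rng.choice(VOICES[who])))
    return triples

def voice_distribution(triples, image):
    """Empirical distribution of the voice instances paired with one image instance."""
    counts = Counter(voice for _, img, voice in triples if img == image)
    total = sum(counts.values())
    return {voice: count / total for voice, count in counts.items()}

triples = sample_pairs(20000, random.Random(0))
dist_img1 = voice_distribution(triples, "lee_img1")
dist_img2 = voice_distribution(triples, "lee_img2")
print(dist_img1)
print(dist_img2)
```

The two printed distributions should be close to each other, and neither should contain any of Goh's voice instances, since links in the bipartite graph only join instances of the same person.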

[Figure: bipartite graph linking Image and Voice instances of the same person]
9
Conditional Independence
  • What if conditional independence does not hold?

[Figure: bipartite graph linking Image and Voice instances; label: Same person]
10
Learning with Noise
  • Sufficient to be able to classify correctly using
    one of the input modes.
  • The other input mode gives a noisy label.
  • If the conditional independence condition holds,
    the noise is independent classification noise.
  • Can be learned using any method that can tolerate
    independent classification noise.
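
A minimal sketch of why independent classification noise is tolerable, under assumptions that are not from the slides: one-dimensional Gaussian features, balanced classes, a 30% flip rate, and a nearest-mean threshold as the learner. Labels that flip independently of the input pull the two class means toward each other but leave their midpoint roughly where it was, so the learned threshold still separates the true classes.

```python
import random

def make_noisy_data(n, noise_rate, rng):
    """Hypothetical data: feature x from one mode, label flipped with fixed probability."""
    xs, true_ys, noisy_ys = [], [], []
    for _ in range(n):
        y = rng.randrange(2)
        xs.append(rng.gauss(4.0 * y, 1.0))
        true_ys.append(y)
        flipped = rng.random() < noise_rate  # independent of x
        noisy_ys.append(1 - y if flipped else y)
    return xs, true_ys, noisy_ys

def fit_threshold(xs, ys):
    """Nearest-mean rule in one dimension: threshold at the midpoint of the class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2, mean1 > mean0

def predict(model, x):
    """Assign class 1 to the side of the threshold where class 1's mean lies."""
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)

xs, true_ys, noisy_ys = make_noisy_data(2000, 0.3, random.Random(1))
model = fit_threshold(xs, noisy_ys)  # trained on noisy labels only
label_agreement = sum(a == b for a, b in zip(noisy_ys, true_ys)) / len(xs)
accuracy = sum(predict(model, x) == y for x, y in zip(xs, true_ys)) / len(xs)
print(f"noisy labels agree with truth: {label_agreement:.2f}")
print(f"classifier accuracy on true labels: {accuracy:.2f}")
```

The classifier is scored against the true labels, which it never saw during training; the gap between the two printed numbers is the point of the example.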

11
Applications
  • Learning to classify web pages from content of
    the page and anchor text of links pointing to the
    page (Blum & Mitchell 1998)
  • Learning to classify named entities from spelling
    and appositive (Collins & Singer 1999)
  • e.g. "Goh Chok Tong, Prime Minister of Singapore"
  • ...
  • Learning to classify web images from image
    features and text on a web page
  • Word sense disambiguation from discourse topic
    features and collocation features

12
Potential Applications
  • Distance interaction
  • Automatically learn the identity of the students
    in the class
  • Usually have both video and voice of the same
    person at the same time during interaction.

[Figure label: Identity]
13
Potential Applications
  • Automatic transcribing of lectures
  • Usually write on the board and speak at the same
    time
  • A written word is spoken around the same time,
    and vice versa
  • Adapt to the person's speech as well as
    handwriting

14
Potential Applications
  • Automatic grading of certain types of exercises
  • For example: "True or False, and Justify"
  • True or False: one mode, easy to grade
    automatically
  • Justify: natural language answer, the other mode,
    hard to grade automatically
  • True or False answers used as noisy labels for
    learning to grade the Justify part.

15
Conclusions
  • Multimodal inputs common in distance interaction
    and education
  • Exploiting multimodal inputs can improve
    performance of classifier
  • Multimodal inputs often allow learning without
    human teacher
  • Reduces effort in building classifiers
  • Useful for building tools for distance
    interaction and education
  • These tools are actually quite general, e.g. face
    recognition, speaker recognition, speech
    recognition, handwriting recognition, natural
    language processing.
  • Distance interaction is a useful testbed for
    building tools that adapt to humans