A speech about Boosting - PowerPoint PPT Presentation

1
A speech about Boosting
  • Presenter: Roberto Valenti

2
The Paper
R. Schapire. The Boosting Approach to Machine
Learning: An Overview, 2001
3
I want YOU
TO UNDERSTAND
4
Overview
  • Introduction
  • Adaboost
  • How Does it work?
  • Why does it work?
  • Demo
  • Extensions
  • Performance &amp; Applications
  • Summary &amp; Conclusions
  • Questions

5
Introduction to Boosting
  • Let's start

6
Introduction
  • An example of machine learning: a spam classifier
  • A highly accurate rule is difficult to find
  • An inaccurate rule is easy: message contains "BUY NOW"
  • Introducing Boosting
  • An effective method of producing an accurate
    prediction rule from inaccurate rules
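The idea can be sketched in a few lines of Python. The keyword rules below are made-up stand-ins for real inaccurate spam rules, combined here by a plain majority vote (boosting proper weights both examples and rules, as later slides show):

```python
# Hypothetical weak rules for spam detection; each is inaccurate on its own.
def rule_buy_now(msg):
    return "buy now" in msg.lower()      # flags "BUY NOW" messages

def rule_free(msg):
    return "free" in msg.lower()         # flags "free" offers

def rule_exclaim(msg):
    return msg.count("!") >= 3           # flags shouty punctuation

def majority_vote(msg, rules=(rule_buy_now, rule_free, rule_exclaim)):
    """Combine inaccurate rules into one (hopefully better) prediction."""
    votes = sum(r(msg) for r in rules)
    return votes > len(rules) / 2        # True -> predict spam

print(majority_vote("BUY NOW!!! Free offer!!!"))  # True
print(majority_vote("Meeting moved to 3pm"))      # False
```

Adaboost improves on this naive vote by weighting each rule by its measured accuracy.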

7
Introduction
  • History of boosting
  • 1989 Schapire
  • First provable polynomial-time boosting algorithm
  • 1990 Freund
  • Much more efficient, but with practical drawbacks
  • 1995 Freund &amp; Schapire
  • Adaboost: the focus of this presentation

8
Introduction
  • The Boosting Approach
  • Lots of Weak Classifiers
  • One Strong Classifier
  • Boosting key points
  • Give importance to misclassified data
  • Find a way to combine weak classifiers into a
    general rule

9
Adaboost
  • How does it work?

10
Adaboost: How does it work?
11
Adaboost: How does it work?
  • The base learner's job:
  • Find a base hypothesis ht
  • Minimize the error εt
  • Choose the weight αt
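The loop sketched on this slide can be written out in plain Python. This is a minimal, illustrative AdaBoost with 1-D threshold "stumps" as the base learner; the function names and toy setup are mine, not from the paper:

```python
import math

# Minimal AdaBoost sketch for binary labels in {-1, +1}.
def train_adaboost(xs, ys, T):
    m = len(xs)
    D = [1.0 / m] * m                  # uniform initial distribution
    ensemble = []                      # list of (alpha_t, stump)
    for _ in range(T):
        # Base learner's job: pick the stump minimizing the weighted error
        best = None
        for thr in xs:
            for sign in (+1, -1):
                h = lambda x, t=thr, s=sign: s if x <= t else -s
                err = sum(D[i] for i in range(m) if h(xs[i]) != ys[i])
                if best is None or err < best[0]:
                    best = (err, h)
        err, h = best
        err = max(err, 1e-10)          # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)   # choose alpha_t
        ensemble.append((alpha, h))
        # Re-weight: misclassified examples gain importance
        D = [D[i] * math.exp(-alpha * ys[i] * h(xs[i])) for i in range(m)]
        Z = sum(D)
        D = [d / Z for d in D]         # renormalize to a distribution
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all base hypotheses."""
    return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```

For example, `train_adaboost([1, 2, 3, 4, 5, 6], [1, 1, 1, -1, -1, -1], 3)` learns an ensemble that classifies this toy training set perfectly.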

12
Adaboost: How does it work?
13
Adaboost
  • Why does it work?

14
Adaboost: Why does it work?
  • Basic property: it reduces the training error
  • On binary distributions:
  • εt = 1/2 − γt
  • Training error bounded by ∏t 2√(εt (1 − εt))
  • which is at most e^(−2Tγ²) → drops exponentially!
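As a quick numeric sanity check of this bound, assume every round has the same edge γ (γ = 0.1 and T = 50 below are arbitrary assumed values):

```python
import math

# Assume every round has weighted error eps_t = 1/2 - gamma (gamma = "edge").
gamma, T = 0.1, 50
eps = 0.5 - gamma

# Product bound: prod over T rounds of 2 * sqrt(eps_t * (1 - eps_t))
product_bound = (2 * math.sqrt(eps * (1 - eps))) ** T

# Exponential bound: e^(-2 * T * gamma^2)
exp_bound = math.exp(-2 * T * gamma ** 2)

print(product_bound <= exp_bound < 1)  # True: training error drops exponentially
```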

15
Adaboost: Why does it work?
  • Generalization error bounded by P̂r[H(x) ≠ y] + Õ(√(Td / m))
  • T: number of iterations
  • m: sample size
  • d: Vapnik-Chervonenkis dimension of the base classifier space
  • P̂r[·]: empirical probability
  • Õ(·): hides logarithmic and constant factors
  • Suggests overfitting as T grows!

16
Adaboost: Why does it work?
  • Margins of the training examples
  • margin(x, y) = y · Σt αt ht(x) / Σt αt
  • Positive only if correctly classified by H
  • Measures confidence in the prediction
  • A qualitative explanation of effectiveness
  • Not quantitative.
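The margin definition translates directly to code. The α weights and base predictions below are made-up numbers for illustration:

```python
# margin(x, y) = y * sum_t(alpha_t * h_t(x)) / sum_t(alpha_t), in [-1, +1]
def margin(y, alphas, preds):
    """y in {-1, +1}; preds[t] is h_t(x) in {-1, +1}."""
    return y * sum(a * p for a, p in zip(alphas, preds)) / sum(alphas)

# Two of three (hypothetical) base classifiers vote +1 on a positive example:
m = margin(+1, [0.9, 0.5, 0.3], [+1, +1, -1])
print(m > 0)  # True: a positive margin means H classifies the example correctly
```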

17
Adaboost: Another View
  • Adaboost as a zero-sum game
  • Game matrix M
  • Row player: Adaboost
  • Column player: base learner
  • Row player plays rows with distribution P
  • Column player plays columns with distribution Q
  • Expected loss: PᵀMQ
  • Played as a repeated matrix game

18
Adaboost: Another View
  • Von Neumann's minmax theorem
  • If there exists a classifier with error ε &lt; 1/2 − γ
  • then there exists a combination of base classifiers
    with margin ≥ 2γ
  • Adaboost has the potential to succeed
  • Relations with linear programming and online
    learning
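The expected loss PᵀMQ from this game view is just a weighted sum over entries of M. The 2x2 matching-pennies-style matrix below is an assumed toy example, not from the paper:

```python
# Expected loss P^T M Q: the row player mixes over rows with P,
# the column player mixes over columns with Q, M[i][j] is the loss entry.
def expected_loss(P, M, Q):
    return sum(P[i] * M[i][j] * Q[j]
               for i in range(len(P))
               for j in range(len(Q)))

M = [[0.0, 1.0],   # toy 2x2 loss matrix (assumed for illustration)
     [1.0, 0.0]]
print(expected_loss([0.5, 0.5], M, [0.5, 0.5]))  # 0.5, the value of this game
```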

19
Adaboost
  • Demo

20
Demo
21
Adaboost
  • Extensions

22
Adaboost - Extensions
  • History of Boosting
  • 1997 Freund &amp; Schapire
  • Adaboost.M1
  • First multiclass generalization
  • Fails if the weak learner achieves less than 50%
    accuracy
  • Adaboost.M2
  • Creates a set of binary problems
  • For x, is label l1 or l2 better?
  • 1999 Schapire &amp; Singer
  • Adaboost.MH
  • For x, is label l1 better, or one of the others?

23
Adaboost - Extensions
  • 2001 Rochery, Schapire et al.
  • Incorporating human knowledge
  • Adaboost is data-driven
  • Human knowledge can compensate for lack of data
  • Human expert:
  • Chooses a rule p mapping x to p(x) ∈ [0,1]
  • Difficult!
  • Simple rules should work...

24
Adaboost - Extensions
  • To incorporate human knowledge
  • Where
  • RE(p ‖ q) = p ln(p/q) + (1 − p) ln((1 − p)/(1 − q))
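The binary relative entropy above translates directly to code (valid only for 0 &lt; p, q &lt; 1):

```python
import math

# Binary relative entropy RE(p || q), defined for 0 < p, q < 1.
def re_binary(p, q):
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

print(re_binary(0.5, 0.5))        # 0.0: zero exactly when p == q
print(re_binary(0.3, 0.7) > 0)    # True: positive whenever p != q
```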

25
Adaboost
  • Performance and Applications

26
Adaboost - Performance &amp; Applications
(Chart: error rates on text categorization for Reuters newswire articles and AP newswire headlines)
27
Adaboost - Performance &amp; Applications
(Chart: six-class text classification (TREC); training and test error curves)
28
Adaboost - Performance &amp; Applications
(Chart: spoken language classification on the "How may I help you" and Help Desk tasks)
29
Adaboost - Performance &amp; Applications
(Images: OCR outliers shown at rounds 4, 12, and 25; each annotated as class, label1/weight1, label2/weight2)
30
Adaboost - Applications
  • Text filtering
  • Schapire, Singer, Singhal. Boosting and Rocchio
    applied to text filtering. 1998
  • Routing
  • Iyer, Lewis, Schapire, Singer, Singhal. Boosting
    for document routing. 2000
  • Ranking problems
  • Freund, Iyer, Schapire, Singer. An efficient
    boosting algorithm for combining preferences. 1998
  • Image retrieval
  • Tieu, Viola. Boosting image retrieval. 2000
  • Medical diagnosis
  • Merler, Furlanello, Larcher, Sboner. Tuning
    cost-sensitive boosting and its application to
    melanoma diagnosis. 2001

31
Adaboost - Applications
  • Learning problems in natural language processing
  • Abney, Schapire, Singer. Boosting applied to
    tagging and PP attachment. 1999
  • Collins. Discriminative reranking for natural
    language parsing. 2000
  • Escudero, Marquez, Rigau. Boosting applied to
    word sense disambiguation. 2000
  • Haruno, Shirai, Ooyama. Using decision trees to
    construct a practical parser. 1999
  • Moreno, Logan, Raj. A boosting approach for
    confidence scoring. 2001
  • Walker, Rambow, Rogati. SPoT: A trainable
    sentence planner. 2001

32
Summary and Conclusions
  • At last

33
Summary
  • Boosting takes a weak learner and converts it to
    a strong one
  • Works by asymptotically minimizing the training
    error
  • Effectively maximizes the margin of the combined
    hypothesis
  • Adaboost is related to many other topics
  • It Works!

34
Conclusions
  • Adaboost advantages:
  • Fast, simple, and easy to program
  • No parameters to tune (except T)
  • Performance depends on:
  • (Skurichina, 2001): boosting is only useful for
    large sample sizes
  • Choice of the weak classifier
  • Incorporation of classifier weights
  • Data distribution

35
Questions
  • ?

(don't be mean)