1
Smooth ε-Insensitive Regression by Loss Symmetrization
  • Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer
  • School of Computer Science and Engineering
  • The Hebrew University
  • {oferd, shais, singer}@cs.huji.ac.il
  • COLT 2003, The Sixteenth Annual Conference on Learning Theory

2
Before We Begin
Linear Regression: given a training set $S = \{(x_i, y_i)\}_{i=1}^m \subset \mathbb{R}^n \times \mathbb{R}$,
find $\lambda \in \mathbb{R}^n$ such that $\lambda \cdot x_i \approx y_i$
Least Squares: minimize $\sum_{i=1}^m (\lambda \cdot x_i - y_i)^2$
Support Vector Regression: minimize $\tfrac{1}{2}\|\lambda\|^2$ s.t. $|\lambda \cdot x_i - y_i| \le \epsilon$
(with slack variables in the soft-margin version)
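For reference (not on the slide itself): the ε-insensitive loss that SVR charges each example, and which the rest of the talk smooths and symmetrizes, is
$$|\delta|_{\epsilon} = \max(0,\, |\delta| - \epsilon), \qquad \delta = \lambda \cdot x - y .$$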
3
Loss Symmetrization
Loss functions used in classification and Boosting: the log-loss and the exp-loss
Symmetric versions of these losses can be used for regression
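As a hedged reconstruction of the formulas that accompanied this slide (the discrepancy notation $\delta = \lambda \cdot x - y$ and the placement of $\epsilon$ follow the accompanying paper, not the transcript):
$$L_{\log}(\delta) = \log\!\big(1 + e^{\delta - \epsilon}\big) + \log\!\big(1 + e^{-\delta - \epsilon}\big), \qquad L_{\exp}(\delta) = e^{\delta - \epsilon} + e^{-\delta - \epsilon}$$
Each loss penalizes over- and under-estimation symmetrically; for $\epsilon = 0$ the symmetric log-loss behaves like a smooth version of $|\delta|$ at large discrepancies.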
4
A General Reduction
  • Begin with a regression training set $S = \{(x_i, y_i)\}_{i=1}^m$,
    where $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$
  • Generate 2m classification training examples of
    dimension n+1
  • Learn a weight vector in $\mathbb{R}^{n+1}$, while maintaining its
    last coordinate fixed, by minimizing a margin-based
    classification loss (a sketch follows below)
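A minimal sketch of one way to realize this reduction, assuming the convention that the augmented weight vector's last coordinate is held fixed at -1; the function name and the label signs are illustrative, not taken from the slides:

```python
import numpy as np

def symmetrize(X, y, eps):
    """Turn m regression examples into 2m classification examples
    of dimension n+1 (hedged reconstruction of the reduction)."""
    m = X.shape[0]
    # Append the shifted target as an extra (n+1)-th coordinate.
    X_over  = np.hstack([X, (y + eps).reshape(-1, 1)])   # labeled -1
    X_under = np.hstack([X, (y - eps).reshape(-1, 1)])   # labeled +1
    X_cls = np.vstack([X_over, X_under])
    y_cls = np.concatenate([-np.ones(m), np.ones(m)])
    return X_cls, y_cls
```

With the last weight fixed at -1, summing the classification log-loss over the two generated examples recovers $\log(1+e^{\delta_i-\epsilon}) + \log(1+e^{-\delta_i-\epsilon})$, the symmetric ε-insensitive log-loss of the original regression example.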

5
A Batch Algorithm
  • An illustration of a single batch iteration
  • Simplifying assumptions (just for the demo)
  • Instances are in
  • Set
  • Use the Symmetric Log-loss

6
A Batch Algorithm
Calculate discrepancies and weights
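The slide's illustration (training points plotted against their discrepancies) does not survive the transcript. As a hedged reconstruction of the quantities it shows, with $\epsilon$ dropped for readability, each example $i$ gets a discrepancy and a pair of weights:
$$\delta_i = \lambda \cdot x_i - y_i, \qquad q_i^{+} = \frac{e^{\delta_i}}{1 + e^{\delta_i}}, \qquad q_i^{-} = \frac{e^{-\delta_i}}{1 + e^{-\delta_i}},$$
so that $q_i^{+} - q_i^{-}$ is the derivative of the symmetric log-loss with respect to $\delta_i$.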
7
A Batch Algorithm
Cumulative weights
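Again as a hedged sketch of what the (omitted) figure denotes by cumulative weights: the per-coordinate aggregates
$$W_j^{+} = \sum_{i=1}^{m} q_i^{+}\, x_{i,j}, \qquad W_j^{-} = \sum_{i=1}^{m} q_i^{-}\, x_{i,j},$$
so that the partial derivative of the total loss with respect to $\lambda_j$ is $W_j^{+} - W_j^{-}$.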
8
Two Batch Algorithms
Update the regressor
Log-Additive update
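Putting the pieces together, a minimal sketch of a single batch iteration with the Log-Additive update, under a boosting-style assumption that the features are non-negative and each row of X sums to at most 1; the exact constant and edge-case handling follow my reading of the talk, not the slides verbatim:

```python
import numpy as np

def batch_iteration(lmbda, X, y, eps=0.0):
    """One batch iteration for the symmetric log-loss (sketch).
    Assumes X >= 0 with rows summing to at most 1."""
    delta = X @ lmbda - y                               # discrepancies
    q_plus  = 1.0 / (1.0 + np.exp(-(delta - eps)))      # over-estimation weights
    q_minus = 1.0 / (1.0 + np.exp(delta + eps))         # under-estimation weights
    W_plus, W_minus = q_plus @ X, q_minus @ X           # cumulative weights
    # Log-Additive update; the Additive update instead moves lmbda by a
    # suitably scaled multiple of (W_minus - W_plus).
    return lmbda + 0.5 * np.log(W_minus / W_plus)
```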
9
Progress Bounds
  • Theorem (Log-Additive update): a lower bound on the progress made on each iteration
  • Theorem (Additive update): an analogous progress bound
  • Lemma: both bounds are non-negative and equal zero only at the optimum
10
Boosting Regularization
  • A new form of regularization for regression and
    classification Boosting
  • Can be implemented by adding pseudo-examples
  • Communicated by Rob Schapire

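The regularized objective itself did not survive the transcript; a hedged reconstruction of the kind of term the slide refers to, for the symmetric log-loss and with a regularization coefficient here written ν (my notation):
$$\mathrm{Loss}(\lambda) \;+\; \nu \sum_{j=1}^{n} \Big( \log\big(1 + e^{\lambda_j}\big) + \log\big(1 + e^{-\lambda_j}\big) \Big)$$
Each regularization term is exactly the symmetric log-loss (with ε = 0) of the pseudo-example $(e_j, 0)$, the j-th unit vector with target 0, which is one way to read "can be implemented by adding pseudo-examples": append n such pseudo-examples, each weighted by ν, to the training set.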
11
Regularization Contd.
  • Regularization ⇒ compactness of the feasible set
    for $\lambda$
  • Regularization ⇒ a unique attainable optimizer of
    the loss function

Proof of Convergence
Progress + compactness + uniqueness ⇒ asymptotic
convergence to the optimum
12
Exp-loss vs. Log-loss
  • Two synthetic datasets

[Figure: regression fits on the two synthetic datasets, comparing the Log-loss and the Exp-loss]
13
Extensions
  • Parallel vs. Sequential updates
  • Parallel - update all elements of $\lambda$ in parallel
  • Sequential - update the weight of a single weak
    regressor on each round (like classic boosting)
  • Another loss function: the Combined Loss

[Figure: the Log-loss, the Exp-loss, and the Comb-loss]
14
On-line Algorithms
  • GD and EG online algorithms for the Log-loss (a GD sketch follows below)
  • Relative loss bounds
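A minimal sketch of what an online gradient-descent (GD) step on the symmetric log-loss could look like in the notation used above; the learning-rate name eta and the per-round structure are assumptions, and the EG variant would update multiplicatively instead:

```python
import numpy as np

def gd_step(lmbda, x, y, eta, eps=0.0):
    """One online GD step on the symmetric log-loss for example (x, y)."""
    delta = lmbda @ x - y
    q_plus  = 1.0 / (1.0 + np.exp(-(delta - eps)))
    q_minus = 1.0 / (1.0 + np.exp(delta + eps))
    grad = (q_plus - q_minus) * x      # gradient of the loss w.r.t. lmbda
    return lmbda - eta * grad
```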

Future Directions
  • Regression tree learning
  • Solving one-class and various ranking problems
    using similar constructions
  • Regression generalization bounds based on natural
    regularization