1
No-Regret Algorithms for Online Convex Programs
  • Geoffrey J. Gordon
  • Carnegie Mellon University
  • Presented by Nicolas Chapados
  • 21 February 2007

2
Outline
  • Online learning setting
  • Definition of Regret
  • Safe Set
  • Lagrangian Hedging (gradient form)
  • Lagrangian Hedging (optimization form)
  • Mention of Theoretical Results
  • Application: One-Card Poker

3
Online Learning
  • Sequence of trials t = 1, 2, ...
  • At each trial we must pick a hypothesis y_t
  • The correct answer is revealed in the form of a convex
    loss function \ell_t(y_t)
  • Just before seeing the t-th example, the total loss is
    L_t = \sum_{i=1}^{t-1} \ell_i(y_i)  (a runnable sketch of this
    protocol follows below)
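A minimal runnable sketch of this protocol, assuming the linear losses introduced on a later slide; the uniform play and the random cost vectors are placeholders of mine, not part of the paper:

```python
import numpy as np

# Online learning protocol: the learner commits to y_t before the
# loss for trial t is revealed (placeholder data, illustrative only).
rng = np.random.default_rng(0)
d, T = 3, 100                     # number of actions, number of trials
total_loss = 0.0
for t in range(T):
    y_t = np.full(d, 1.0 / d)     # hypothesis chosen before seeing the loss
    c_t = rng.uniform(size=d)     # revealed linear loss: l_t(y) = c_t . y
    total_loss += c_t @ y_t       # accumulate the total loss L_t
print(total_loss)
```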

4
Goal of Paper
  • Introduce the Lagrangian Hedging algorithm
  • It generalizes several existing algorithms:
  • Hedge (Freund and Schapire)
  • Weighted Majority (Littlestone and Warmuth)
  • External-regret matching (Hart and Mas-Colell)
  • (The CMU technical report is much clearer than the NIPS
    paper)

5
Regret
  • If we had used a fixed hypothesis y, the loss would have been
    L_t(y) = \sum_{i=1}^{t-1} \ell_i(y)
  • The regret is the difference between the total losses of the
    adaptive and fixed hypotheses:
    \rho_t(y) = \sum_{i=1}^{t-1} \ell_i(y_i) - \sum_{i=1}^{t-1} \ell_i(y)
  • Positive regret means that we should have preferred the fixed
    hypothesis

6
Hypothesis Set
  • Assume that the hypothesis set Y is a convex subset of R^d
  • For example, the simplex of probability distributions
    Y = \{ y \in R^d : y \ge 0,\ \sum_i y_i = 1 \}
  • The corners of Y represent pure actions, and points in the
    interior represent probability distributions over actions

7
Loss Function
  • Minimize a linear loss \ell_t(y) = c_t \cdot y, where c_t is the
    cost vector revealed at trial t

8
Regret Vector
  • Keeps the state of the learning algorithm
  • A vector that accumulates information about the actual losses
    and the gradients of the loss functions
  • Define the regret vector s_t by the recursion
    s_t = s_{t-1} + \ell_t(y_t)\, u - c_t, with s_0 = 0
  • Here u is an arbitrary vector which satisfies u \cdot y = 1 for
    all y \in Y
  • Example: if y is a probability distribution, then u can be the
    vector of all ones (see the code sketch after this list)
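A small sketch of one step of this recursion for the simplex case; the function name and the numeric values are illustrative, not from the paper:

```python
import numpy as np

# Regret-vector recursion s_t = s_{t-1} + l_t(y_t) u - c_t for the
# simplex case, where u is the all-ones vector (u . y = 1 for every
# probability vector y).
def update_regret_vector(s, y, c, u):
    loss = c @ y                  # linear loss l_t(y_t) = c_t . y_t
    return s + loss * u - c       # the recursion from this slide

d = 3
s = np.zeros(d)                   # s_0 = 0
u = np.ones(d)                    # u . y = 1 on the probability simplex
y = np.array([0.5, 0.3, 0.2])     # hypothesis played at this trial
c = np.array([1.0, 0.2, 0.4])     # revealed cost vector
s = update_regret_vector(s, y, c, u)
# Regret against the second pure action is s . e_2 (next slide):
print(s @ np.array([0.0, 1.0, 0.0]))
```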

9
Use of Regret Vector
  • Given any hypothesis y, we can use the regret vector to compute
    the regret with respect to y: \rho_t(y) = s_t \cdot y

10
Safe Set
  • The region of regret space in which the regret is guaranteed to
    be nonpositive for all hypotheses:
    S = \{ s : s \cdot y \le 0 \text{ for all } y \in Y \}
  • The goal of the Lagrangian Hedging algorithm is to keep its
    regret vector "near" the safe set

11
Safe Set (continued)
[Diagram: the hypothesis set Y and the corresponding safe set S]
12
Unnormalized Hypotheses
  • Consider the cone of unnormalized hypotheses, i.e. all
    nonnegative multiples of hypotheses in Y (defined below)
  • The safe set is the cone polar to this cone of unnormalized
    hypotheses
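A reconstruction of the definitions this slide refers to; the symbol \hat{Y} for the cone is my notation, chosen to match the surrounding slides:

```latex
% Reconstructed definitions (notation \hat{Y} is mine).
\[
\hat{Y} = \{ \lambda y \mid y \in Y,\ \lambda \ge 0 \},
\qquad
S = \hat{Y}^{\circ} = \{ s \mid s \cdot w \le 0 \ \text{for all } w \in \hat{Y} \}.
\]
% Since the regret against y is s_t . y, having s_t in S makes the
% regret nonpositive against every hypothesis simultaneously.
```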

13
Lagrangian Hedging (Setting)
  • At each step, the algorithm chooses its play according to the
    current regret vector and a closed convex potential function F(s)
  • Define the (sub)gradient of F(s) as f(s)
  • The potential function is what defines the problem to be solved
  • E.g. Hedge / Weighted Majority arise from an exponential
    potential (sketched below)
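A plausible form of that example. This is my reconstruction, since the slide's formula did not survive the transcript; the paper's exact normalization may differ, and \eta > 0 is a learning rate:

```latex
% Assumed form of the exponential potential that recovers Hedge.
\[
F(s) = \frac{1}{\eta} \ln \sum_{i=1}^{d} e^{\eta s_i},
\qquad
f(s)_i = \frac{e^{\eta s_i}}{\sum_{j=1}^{d} e^{\eta s_j}},
\]
% so playing the normalized gradient reproduces the exponentially
% weighted distribution used by Hedge / Weighted Majority.
```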

14
Lagrangian Hedging (Gradient)
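The slide's pseudocode did not survive the transcript. Below is a minimal runnable sketch of the gradient form under the assumptions above (simplex hypothesis set, linear losses); the function names and the random test data are mine:

```python
import numpy as np

# Gradient form of Lagrangian Hedging (reconstruction): play the
# normalized (sub)gradient of the potential, then update the regret
# vector with the revealed cost.
def lagrangian_hedging(f, costs, d):
    u = np.ones(d)                    # u . y = 1 on the probability simplex
    s = np.zeros(d)                   # regret vector, s_0 = 0
    for c in costs:
        g = f(s)                      # (sub)gradient of the potential at s
        y = g / (u @ g) if g.sum() > 0 else np.full(d, 1.0 / d)
        loss = c @ y                  # linear loss l_t(y_t) = c_t . y_t
        s = s + loss * u - c          # regret-vector recursion (slide 8)
        yield y

# With the exponential potential of the previous slide, f is the
# softmax of the scaled regrets, so each play is a Hedge distribution.
def softmax_gradient(s, eta=0.5):
    w = np.exp(eta * (s - s.max()))   # shift for numerical stability
    return w / w.sum()

rng = np.random.default_rng(1)
plays = list(lagrangian_hedging(softmax_gradient, rng.uniform(size=(50, 3)), 3))
```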
15
Optimization Form
  • In practice, it may be difficult to define, evaluate, and
    differentiate an appropriate potential function
  • Optimization form: same pseudo-code as before, but F is defined
    in terms of a simpler hedging function W
  • Example corresponding to the previous potential F

16
Optimization Form (continued)
  • We may then obtain F as
    F(s) = \max_{w \in \hat{Y}} \left[ s \cdot w - W(w) \right]
  • and the (sub)gradient as
    f(s) = \arg\max_{w \in \hat{Y}} \left[ s \cdot w - W(w) \right]
  • which we may plug into the previous pseudo-code (a worked
    instance follows)
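As a concrete instance, with a hedging function of my own choosing rather than one taken from the slides: the quadratic W(w) = ||w||^2 / 2 over the nonnegative orthant gives F and f in closed form and, after normalization, recovers the external-regret matching rule mentioned on slide 4.

```python
import numpy as np

# Quadratic hedging function W(w) = ||w||^2 / 2 over the nonnegative
# orthant (the cone of unnormalized hypotheses when Y is the simplex).
# The inner problem  F(s) = max_{w >= 0} [ s . w - ||w||^2 / 2 ]
# is solved coordinate-wise by w* = max(s, 0), so f(s) = [s]_+.
def f_quadratic(s):
    return np.maximum(s, 0.0)

# Normalizing f(s) as in the gradient-form pseudo-code plays each
# action with probability proportional to its positive regret:
# exactly external-regret matching (Hart and Mas-Colell).
s = np.array([2.0, -1.0, 0.5])
g = f_quadratic(s)
y = g / g.sum() if g.sum() > 0 else np.full(len(s), 1.0 / len(s))
print(y)                              # [0.8  0.   0.2]
```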

17
Theoretical Results (in a nutshell: it all works)
  • Under conditions on the potential function, the paper bounds the
    regret of Lagrangian Hedging so that the average regret per trial
    vanishes as the number of trials grows
18
One-Card Poker
  • The hypothesis space is the set of sequence weight vectors
  • These encode when it is a player's turn to move and which
    actions are available at that time
  • Two players: gambler and dealer
  • Each antes 1 chip and is dealt 1 card from a 13-card deck
  • Betting proceeds: gambler bets, then dealer, then gambler
  • A player may fold instead of betting
  • If neither folds, the player with the highest card wins the
    pot

19
Why is it interesting?
  • Contains elements of more complicated games:
  • Incomplete information
  • Chance events
  • Multiple stages
  • Optimal play requires randomization and bluffing

20
Results in Self-Play
21
Results Against Fixed Opponent