Transcript and Presenter's Notes

Title: Efficient learning algorithms for changing environments


1
Efficient learning algorithms for changing
environments
  • Elad Hazan and C. Seshadhri
  • (IBM Almaden)

2
The online learning setting
[Diagram: a sequence of rounds/games G1, G2, …, GT]
3
The online setting
[Diagram: in each round t = 1, …, T the learner plays xt, the function ft is revealed, and the learner pays ft(xt)]
  • Convex bounded functions
  • Total loss Σt ft(xt)
  • Adversary chooses any function from family
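As a concrete illustration of this protocol, a minimal Python sketch (the quadratic losses and the simple gradient step below are illustrative stand-ins, not the algorithms from this talk):

    import numpy as np

    T = 100
    rng = np.random.default_rng(0)
    targets = rng.uniform(-1.0, 1.0, size=T)   # the adversary's hidden choices

    def f(t, x):
        # Convex, bounded loss revealed at round t (illustrative: squared distance).
        return (x - targets[t]) ** 2

    x = 0.0            # x_1: our first decision in the set [-1, 1]
    total_loss = 0.0
    for t in range(T):
        total_loss += f(t, x)                   # we pay f_t(x_t)
        grad = 2.0 * (x - targets[t])           # f_t is revealed only after we play
        x = float(np.clip(x - grad / np.sqrt(t + 1), -1.0, 1.0))  # pick x_{t+1}

    print("total loss:", total_loss)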

4
Regret
[Diagram: loss functions f1, f2, …, fT and the point x minimizing Σt ft(x), the fixed optimum in hindsight]
  • Loss of our algorithm Σt ft(xt)
  • Regret Σt ft(xt) − Σt ft(x) (standard notion
    of performance)
  • Continuum of experts
  • Online learning problem - design efficient
    algorithms that attain low regret
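Written out as one formula (this is just the slide's definition), the regret after T rounds is

    Regret(T) = Σt ft(xt) − minx Σt ft(x),

where the minimum is over a single fixed point x in the decision set.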

5
Sublinear Regret
  • We want Regret = o(T)
  • Why?
  • Loss per round converges to optimal
  • Obviously, can't compete with the best per-round
    sequence of points
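To see why sublinear regret gives per-round optimality: if Regret(T) = o(T), then dividing by T,

    (1/T) Σt ft(xt) ≤ (1/T) minx Σt ft(x) + o(1),

so the average loss per round approaches that of the best fixed point in hindsight.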

6
Portfolio Management
[Figure: per-round loss of a portfolio]
  • HKKA: efficient algorithms that give O(log T)
    regret
  • (Much smaller than the usual O(√T) regret)
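For reference, the per-round loss in this portfolio setting is usually taken to be the negative log-return of the rebalanced portfolio (the standard choice in this literature; the formula itself is not visible in this transcript):

    ft(x) = −log(rt · x),

where x gives the fraction of wealth on each stock and rt is the vector of that day's price relatives. These losses are exp-concave, which is what the HKKA O(log T) bound exploits.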

7
Convergence behaviour
[Diagram: the iterates x1, …, xT converging to a point x]
  • As t increases, ||xt − xt+1|| decreases
  • As t increases, learning decreases?
  • Does not adapt to environment

8
Adapting with time
[Example: two loss functions, one from returns (1, ½) and one from returns (½, 1)]
  • Optimal fixed portfolio is (½, ½): put equal
    money on both stocks
  • Low-regret algorithms will converge to this
  • But this is terrible!
  • We want the algorithm to make a switch!
  • Cannot happen with convergence behaviour
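A worked reading of this example (treating the two return vectors as two phases of the sequence, which is an assumption about the figure): for the first T/2 days the price relatives are (1, ½), and for the last T/2 days they are (½, 1). A fixed rebalanced portfolio (w, 1−w) multiplies its wealth by (1+w)/2 per day in the first phase and by (2−w)/2 in the second, so its final wealth is [(1+w)(2−w)/4]^(T/2); this is maximized at w = ½, giving a factor of ¾ per day, i.e. wealth (¾)^T → 0. An algorithm that switches (all money on stock 1 in the first phase, all on stock 2 in the second) keeps its wealth unchanged. So competing only with the best fixed portfolio is not enough; we need to compete with the switch.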

9
Something better than regret?
  • Littlestone-Warmuth, Herbster-Warmuth,
    Bousquet-Warmuth: study k-shifting optima
  • Finite expert setting
  • Freund-Schapire-Singer-Warmuth: sleeping experts
  • Lehrer, Blum-Mansour: time selection functions

10
Adaptive Regret
[Diagram: decisions x1, x2, x3, …, xT and losses f1, f2, f3, …, fT over time, with an interval J highlighted]
11
Adaptive Regret
  • Max regret over all intervals
  • Different optimum xJ for every interval J
  • Captures movement of optimum as time progresses
  • We want Adaptive-Regret = o(T)
  • In any interval of size Ω(AR), the algorithm
    converges to the optimum
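In symbols, the slide's definition written as one formula:

    Adaptive-Regret(T) = max over intervals J = [r, s] ⊆ [T] of ( Σt∈J ft(xt) − minx Σt∈J ft(x) ).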

12
Results
  • We want efficient algorithms to get low
    Adaptive-Regret for Portfolio Management
  • Normal regret can be as low as O(log T)
  • Can we get Adaptive-Regret close to that?
  • We will deal with a larger class of problems and
    give general results

13
FLH
  • We will describe the algorithm
    Follow-the-Leading-History (FLH)
  • It uses standard low-regret algorithms as a black
    box
  • Bootstrapping procedure: convert low regret into
    low adaptive regret efficiently
  • Done via a streaming technique

14
And now for something completely different
  • For the exp-concave setting (e.g. square loss,
    portfolio management), HKKA give O(log T) regret

15
Other work
  • Auer-Cesa Bianchi-Freund-Schapire, Zinkevich,
    Y. Singer
  • Kozat-A. Singer: independent work in the DSP
    community
  • k-shifting results for portfolio management
  • We give a different, more general technique

16
Study your history!
[Diagram: a 'room of experts': one copy of HKKA started from each time step (from f1, from f2, from f3, …, from ft); their outputs are combined into the prediction xt]
17
Who to choose?
[Diagram: at round t, the experts 'HKKA from f1', 'HKKA from f2', 'HKKA from f3', …, 'HKKA from ft' are combined by a multiplicative update based on Herbster-Warmuth, using the losses of all experts]
  • Weight wi for each expert (forming probabilities)
  • Choose according to these weights
  • After ft is revealed
  • wi is updated with a multiplicative factor, and then
    mixed with the uniform distribution
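A minimal Python sketch of this mixing step (a fixed pool of experts is used for simplicity, although the algorithm actually adds a new expert each round; the learning rate and the mixing weight below are illustrative guesses, not the paper's constants):

    import numpy as np

    def mix_experts(expert_predictions, expert_losses, eta=1.0):
        # expert_predictions, expert_losses: arrays of shape (T, N).
        # Maintain a probability w_i per expert, predict with the weighted average,
        # multiplicatively downweight each expert by its loss, then mix a little
        # uniform mass back in (Herbster-Warmuth / fixed-share style).
        T, N = expert_predictions.shape
        w = np.ones(N) / N
        ours = []
        for t in range(T):
            ours.append(w @ expert_predictions[t])       # play the weighted average
            w = w * np.exp(-eta * expert_losses[t])      # multiplicative update
            w = w / w.sum()
            alpha = 1.0 / (t + 2)                        # small amount of uniform mixing
            w = (1 - alpha) * w + alpha / N
        return np.array(ours)

    # Example: 5 scalar experts over 100 rounds with squared losses.
    preds = np.random.rand(100, 5)
    ours = mix_experts(preds, (preds - 0.3) ** 2)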

18
Running time problem
[Diagram: experts 'FTL from f1', 'FTL from f2', 'FTL from f3', …, 'FTL from ft', with an interval J]
  • Regret in J is O(log T)
  • Adaptive Regret O(log T)
  • But Ω(T) experts needed
  • Running time O(RT), since we run Ω(T) FTLs
    (R = per-step time of one copy)!

19
Removing experts
Working set
  • Stream through experts
  • We remove experts
  • Once removed, they are banished forever
  • Working set is very dynamic

20
Working set
[Diagram: timeline 1, …, t with the indices currently in St marked]
  • St = working set at time t
  • Subset of {1, …, t}
  • Properties
  • St+1 \ St = {t+1}
  • |St| = O(log t)
  • Well spread out

  • Woodruff: elegant deterministic construction
  • Rule on who to throw out from St to get St+1
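One deterministic rule of this flavor, sketched in Python (the specific lifetime constant below is a guess for illustration; Woodruff's construction and the paper's may differ in details): give expert i = r·2^k (r odd) a lifetime of roughly 2^(k+2) rounds, and keep it in the working set only while it is alive.

    def lifetime(i):
        # i = r * 2^k with r odd; larger powers of two live longer (constant is a guess).
        k = 0
        while i % 2 == 0:
            i //= 2
            k += 1
        return 4 * (2 ** k) + 1

    def working_set(t):
        # S_t: experts i <= t that are still within their lifetime.
        # (Recomputed from scratch here; incrementally, only t+1 is ever added
        # and only expired experts are removed.)
        return [i for i in range(1, t + 1) if i + lifetime(i) > t]

    # |S_t| grows like log t rather than t:
    print(len(working_set(10_000)))   # a few dozen, not 10,000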
21
And therefore
  • Working set always of size O(log T)
  • Running time for each step is only O(R log T)
  • We get O(log² T) Adaptive Regret with O(log T)
    copies of the original low-regret algorithm

22
To summarize
  • Defined Adaptive-Regret, a generalization of
    regret that captures moving solutions
  • Low Adaptive-Regret means we converge to fixed
    optimum in every interval
  • Gave bootstrapping algorithm that converts low
    regret into low Adaptive-Regret (almost optimal)
  • For (say) portfolio management, what is the right
    history to look at?

23
Further directions
  • Can streaming/sublinear ideas be used for
    efficiency?
  • Applications to learning scenarios with cost of
    shifting
  • Maybe this technique can be used for online
    algorithms
  • Competitive ratio instead of regret
  • What kind of competitive ratio can these learning
    techniques give?

24
Thanks!
  • No, we didn't make/lose any money playing the
    stock market with this algorithm... yet.

25
Tree update problem
[Diagram: universe [n]; element at is accessed in a binary search tree Bt on [n]]
Loss = cost of accessing at in Bt
26
Tree update problem
[Diagram: binary search tree Bt on universe [n]]
27
Tree update problem
[Diagram: rotations transform the binary search tree Bt into Bt+1]
  • Total cost = total access cost + total rotation
    cost
  • Sleator-Tarjan: splay trees are O(1)-competitive
  • Conjecture: dynamic optimality, i.e. splay trees are
    O(1)-competitive against the best offline sequence of trees

28
Tree update problem
Given sequence a1, a2, …, aT
Binary search tree B
  • Total cost = total access cost + total rotation
    cost
  • Regret = total cost − total cost of B = o(T)
  • Regret = o(cost of B)
  • Static optimality

29
For tree update
  • Given query sequence a1, a2, …, aT, let OPT be the
    cost of the best tree
  • KV: an FTL-based approach gives
  • Total cost ≤ (1 + 1/√T) · OPT
  • Given a contiguous sequence J of queries, OPTJ is the
    cost of the best tree for J
  • We get
  • Cost for J ≤ (1 + 1/T^(1/4)) · OPTJ + T^(3/4)

30
Square Loss
[Diagram: square loss at yt; the decision moves from xt to xt+1]
  • Have to pay ||xt − xt+1||
  • Get competitive ratio bounds?

31
Being lazy
  • Do we have to update the decision every round?
  • Could be expensive - e.g. in the tree update problem
  • We can be lazy, and only make a total of m updates
    (see the sketch after this list)
  • But pay regret T/m
  • Used to get low Adaptive-Regret for the tree update
    problem
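A naive way to enforce "only m updates in total", sketched below (this batching schedule is only an illustration of the idea; the schedule and the T/m regret analysis in the talk may differ): re-query the underlying low-regret algorithm every roughly T/m rounds and keep replaying its last answer in between.

    import math

    def run_lazily(T, m, base_step, x0):
        # base_step(t) queries the underlying low-regret algorithm for a fresh
        # decision; it is called at most m times, at roughly evenly spaced rounds.
        gap = math.ceil(T / m)
        x = x0
        played = []
        for t in range(T):
            if t % gap == 0:
                x = base_step(t)     # the expensive update, e.g. rebuilding a tree
            played.append(x)         # in between, keep playing the stale decision
        return played

    # Example with a trivial base algorithm that just returns the round index:
    decisions = run_lazily(T=10, m=3, base_step=lambda t: t, x0=0)
    # -> [0, 0, 0, 0, 4, 4, 4, 4, 8, 8]  (3 updates: at t = 0, 4, 8)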

32
Study your history!
[Diagram: a 'room of experts': one copy of FTL started from each time step (from f1, from f2, from f3, …, from ft); their outputs are combined into the prediction xt]
33
Running time
[Diagram: experts 'FTL from f1', 'FTL from f2', 'FTL from f3', …, 'FTL from ft']
  • Adaptive Regret O(log T)
  • But Ω(T) experts needed
  • Running time O(RT), since we run Ω(T) FTLs!

34
Removing experts
Working set
  • Stream through experts
  • We remove experts
  • Once removed, they are banished forever
  • Working set is very dynamic

35
Working set
[Diagram: timeline 1, …, t with the indices currently in St marked]
  • St = working set at time t
  • Subset of {1, …, t}
  • Properties
  • St+1 \ St = {t+1}
  • |St| = O(log t)
  • Well spread out

36
Maintaining experts
  • Woodruff: elegant deterministic construction
  • Rule on who to throw out from St to get St+1
  • Completely combinatorial working set

37
And therefore
  • We get O(log² T) Adaptive Regret with O(log T)
    copies of the original low-regret algorithm
  • Same ideas for general convex functions
  • Different math though!
  • regret with O(log T) copies