Blind online optimization: Gradient descent without a gradient

1
Blind online optimization: Gradient descent without a gradient
  • Abie Flaxman (CMU)
  • Adam Tauman Kalai (TTI)
  • Brendan McMahan (CMU)

2
Standard convex optimization
  • Convex feasible set $S \subseteq \mathbb{R}^d$
  • Concave function $f : S \to \mathbb{R}$

3
Steepest ascent
  • Move in the direction of steepest ascent
  • Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions)
  • Works for convex optimization
  • (and many other problems)
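
A minimal sketch of steepest ascent in one dimension. The concave function $f(x) = -(x - 3)^2$, the starting point, and the fixed step size are illustrative assumptions, not taken from the slides.

```python
def steepest_ascent(grad, x0, step=0.1, iters=100):
    """Repeatedly move in the direction of the derivative (steepest ascent)."""
    x = x0
    for _ in range(iters):
        x = x + step * grad(x)
    return x


# Hypothetical concave function f(x) = -(x - 3)^2, with derivative f'(x) = -2 (x - 3).
def f_prime(x):
    return -2.0 * (x - 3.0)


x_max = steepest_ascent(f_prime, x0=0.0)
```

With these choices the distance to the maximizer shrinks by a factor of 0.8 per step, so `x_max` ends up very close to 3.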

[Figure: iterates x1, x2, x3, x4]
4
Typical application
  • Company produces certain numbers of cars per month
  • Vector $x \in \mathbb{R}^d$ (Corollas, Camrys, ...)
  • Profit of company is concave function of production vector
  • Maximize total (equivalently, average) profit

PROBLEMS
5
Problem definition and results
  • Sequence of unknown concave functions
  • In period $t$, pick $x_t \in S$, find out only $f_t(x_t)$
  • convex

Theorem
6
Online model
expected regret
  • Holds for arbitrary sequences
  • Stronger than stochastic model
  • $f_1, f_2, \ldots$ i.i.d. from $D$
  • $x^* = \arg\min_{x \in S} \mathbb{E}_D[f(x)]$
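
A small sketch of how regret can be computed in the online model, assuming the profit-maximization convention of the earlier slides: the total profit of the best fixed point in hindsight minus the total profit actually earned. The functions, the played points, and the grid over $S = [0, 1]$ are hypothetical.

```python
def regret(fs, xs, candidates):
    """Profit of the best fixed candidate in hindsight minus profit actually earned."""
    earned = sum(f(x) for f, x in zip(fs, xs))
    best_fixed = max(sum(f(c) for f in fs) for c in candidates)
    return best_fixed - earned


# Hypothetical concave profit functions f_t(x) = -(x - c_t)^2 on S = [0, 1].
centers = [0.2, 0.8, 0.5, 0.5]
fs = [lambda x, c=c: -(x - c) ** 2 for c in centers]
xs = [0.0, 0.0, 0.0, 0.0]                    # points the learner played
candidates = [i / 100 for i in range(101)]   # grid over S for the hindsight optimum

r = regret(fs, xs, candidates)
```

Here the best fixed point is 0.5 (total profit about -0.18) while always playing 0 earns about -1.18, so the regret is about 1.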

7
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

8
First try
[Zinkevich 03]: if we could only compute gradients...
[Figure: PROFIT vs. CAMRYS; curves f1, f2, f3, f4 with the points f1(x1), f2(x2), f3(x3), f4(x4) above x1, x2, x3, x4]
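
A sketch of projected online gradient ascent, the full-information method this slide attributes to Zinkevich. It assumes the gradient of each $f_t$ is revealed after $x_t$ is played; the interval $S = [0, 1]$, the step size, and the profit functions are illustrative assumptions.

```python
def project(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the interval S = [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_ascent(grads, x1, step):
    """Play x_t, then observe the gradient of f_t at x_t and take a projected step."""
    xs = [x1]
    for grad_t in grads:
        xs.append(project(xs[-1] + step * grad_t(xs[-1])))
    return xs[:-1]  # the points actually played, one per period


# Hypothetical profit functions f_t(x) = -(x - c_t)^2 with gradients -2 (x - c_t).
centers = [0.9, 0.1] * 50
grads = [lambda x, c=c: -2.0 * (x - c) for c in centers]
played = online_gradient_ascent(grads, x1=0.0, step=0.05)
```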
9
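Idea: one-point gradient
With probability ½, estimate = $f(x + \delta)/\delta$
With probability ½, estimate = $-f(x - \delta)/\delta$
$\mathbb{E}[\text{estimate}] \approx f'(x)$
[Figure: PROFIT vs. CAMRYS, with the points $x - \delta$, $x$, $x + \delta$ marked]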
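
A sketch of the one-point estimate in one dimension. Averaging the two equally likely outcomes gives the central difference quotient $(f(x + \delta) - f(x - \delta)) / (2\delta)$, which approximates $f'(x)$. The function `f` and the value of `delta` are hypothetical.

```python
import random


def one_point_estimate(f, x, delta, rng=random):
    """Single-evaluation derivative estimate: f(x + delta)/delta or -f(x - delta)/delta."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta


def expected_estimate(f, x, delta):
    """Exact expectation of the estimate: the central difference quotient."""
    return 0.5 * f(x + delta) / delta - 0.5 * f(x - delta) / delta


# Hypothetical concave profit f(x) = -(x - 3)^2, whose true derivative at x = 1 is 4.
f = lambda x: -(x - 3.0) ** 2
avg = expected_estimate(f, x=1.0, delta=0.1)
```

Each individual estimate is far from $f'(x)$ (its size is about $|f(x)|/\delta$); only the average is informative, which is why the variance of the estimate matters in the analysis.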
10
d-dimensional online algorithm
[Figure: points x1, x2, x3, x4 and the feasible set S]
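
The algorithm itself appears only as a figure on this slide. The following is a hedged sketch of how a d-dimensional version of the one-point idea can look: perturb the current point along a random unit direction, evaluate once, and take a projected step along that direction scaled by $(d/\delta)$ times the observed value. The unit-ball feasible set, the shrunken projection radius $1 - \delta$, the parameter values, and all function names are assumptions made for illustration.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform direction on the sphere: normalize a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project_to_ball(x, radius):
    """Euclidean projection onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    return list(x) if norm <= radius else [c * radius / norm for c in x]


def bandit_gradient_ascent(fs, d, delta, step, rng):
    """Each period: evaluate f_t once at a perturbed point, then take a projected step."""
    x = [0.0] * d
    played = []
    for f_t in fs:
        u = random_unit_vector(d, rng)
        y = [xi + delta * ui for xi, ui in zip(x, u)]   # point actually played
        value = f_t(y)                                   # the only feedback received
        g = [(d / delta) * value * ui for ui in u]       # one-point gradient estimate
        x = project_to_ball([xi + step * gi for xi, gi in zip(x, g)], 1.0 - delta)
        played.append(y)
    return played


# Hypothetical setup: S is the unit ball in R^3 and every f_t is the same concave function.
target = [0.3, -0.2, 0.1]
f = lambda y: 1.0 - sum((yi - ti) ** 2 for yi, ti in zip(y, target))
played = bandit_gradient_ascent([f] * 2000, d=3, delta=0.2, step=0.01, rng=random.Random(0))
```

Projecting onto the smaller ball keeps every played point $x_t + \delta u_t$ inside $S$.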
11
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

12
Analysis ingredients
  • E[1-point estimate] is gradient of
  • is small
  • Online gradient ascent analysis [Z03]
  • Online expected gradient ascent analysis
  • (Hidden complications)

13
1-pt gradient analysis
[Figure: PROFIT vs. CAMRYS, with the points $x - \delta$ and $x + \delta$ marked]
14
1-pt gradient analysis (d-dim)
  • E[1-point estimate] is gradient of
  • is small
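
The formulas on this slide did not survive extraction. For orientation, here is the standard smoothing identity behind one-point estimates, given as background from the literature rather than as the slide's own text. $B$ is the unit ball, $\mathbb{S}$ the unit sphere, and $\hat f$ is a name assumed here for the smoothed function.

```latex
\hat f(x) = \mathbb{E}_{v \in B}\bigl[f(x + \delta v)\bigr],
\qquad
\nabla \hat f(x) = \frac{d}{\delta}\,\mathbb{E}_{u \in \mathbb{S}}\bigl[f(x + \delta u)\,u\bigr]
```

Under this identity the one-point estimate $(d/\delta)\, f(x + \delta u)\, u$ is an unbiased estimate of $\nabla \hat f(x)$, and if $f$ is $L$-Lipschitz then $|\hat f(x) - f(x)| \le \delta L$.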

15
Online gradient ascent [Z03]

(concave, bounded gradient)
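
The bound shown on this slide is missing from the transcript. As a hedged reminder, the standard textbook guarantee for projected online gradient ascent with a fixed step size $\eta$ has the following form, assuming $D$ is the diameter of $S$ and $G$ bounds $\|\nabla f_t\|$ (the symbol names are assumptions):

```latex
\max_{x \in S} \sum_{t=1}^{T} f_t(x) - \sum_{t=1}^{T} f_t(x_t)
\;\le\; \frac{D^2}{2\eta} + \frac{\eta G^2 T}{2}
\;=\; D G \sqrt{T}
\quad\text{for}\quad \eta = \frac{D}{G\sqrt{T}}
```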
16
Expected gradient ascent analysis
  • Regular deterministic gradient ascent on $g_t$

(concave, bounded gradient)
17
Adaptive adversary
18
Hidden complication
[Figure: feasible set S]
21
Hidden complication
  • Thin sets are bad

[Figure: feasible set S]
22
Hidden complication
  • Round sets are good

Reshape into isotropic position [LV03]
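
A rough sketch of what reshaping into isotropic position means, assuming the usual definition: an affine map after which points drawn uniformly from the set have zero mean and identity covariance. The code works on samples in two dimensions and uses a Cholesky factor of the sample covariance; the thin rectangle and the function names are hypothetical, and this is not the procedure of [LV03].

```python
import math
import random


def whitening_map(points):
    """Return an affine map that gives 2-D points zero mean and identity covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Cholesky factor L of the covariance (lower triangular, covariance = L L^T).
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 * l21)

    def apply(p):
        # Solve L z = p - mean by forward substitution.
        z1 = (p[0] - mx) / l11
        z2 = (p[1] - my - l21 * z1) / l22
        return (z1, z2)

    return apply


# Hypothetical thin set: a 10 x 0.1 rectangle, sampled uniformly.
rng = random.Random(0)
thin = [(rng.uniform(0, 10), rng.uniform(0, 0.1)) for _ in range(5000)]
to_isotropic = whitening_map(thin)
round_pts = [to_isotropic(p) for p in thin]
```

After the map, the long and short sides of the rectangle have the same spread, so a perturbation of fixed radius is equally meaningful in every direction.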
23
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

24
Variations
diameter
gradient bound
  • Works against adaptive adversary
  • Chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$
  • Also works if we only get a noisy estimate of $f_t(x_t)$, i.e. $\mathbb{E}[h_t(x_t) \mid x_t] = f_t(x_t)$
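
A sketch of why unbiased noise in the evaluations is harmless for the one-point estimate: in one dimension, replacing $f$ with a noisy evaluation $h$ that is correct in expectation leaves the mean of the estimate unchanged. The Gaussian noise model, the function `f`, and all parameter values are illustrative assumptions.

```python
import random


def noisy_one_point_estimate(f, x, delta, noise_sd, rng):
    """One-point derivative estimate built from a noisy evaluation h = f + noise."""
    sign = 1.0 if rng.random() < 0.5 else -1.0
    h = f(x + sign * delta) + rng.gauss(0.0, noise_sd)   # E[h | point] = f(point)
    return sign * h / delta


# Hypothetical concave profit f(x) = -(x - 3)^2; its derivative at x = 1 is 4.
f = lambda x: -(x - 3.0) ** 2
rng = random.Random(0)
samples = [noisy_one_point_estimate(f, 1.0, 0.5, 0.1, rng) for _ in range(200000)]
mean_estimate = sum(samples) / len(samples)
```

The sample mean settles near 4, the same value the noiseless estimate averages to for this quadratic.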

25
Related convex optimization
Gradient descent, ... | Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference
Gradient descent (stoch.) | 1-pt. gradient appx. [G89, S97], Finite difference
Gradient descent (online) [Z03] | 1-pt. gradient appx. [BKM04], Finite difference [Kleinberg04]
26
Related discrete optimization
27
Switching lanes (experts)
[Figure: grid of numeric values]
28
Multi-armed bandit (experts)
[Figure: grid of numeric values]
[R52, ACFS95, ...]
29
Driving to work (online routing)
[TW02, KV02, AK04, BM04]
  • Exponentially many paths
  • Exponentially many slot machines?
  • Finite dimensions
  • Exploration/exploitation tradeoff
[Figure: feasible set S]
30
Online product design
31
High dimensions
One-dimensional problem easy
  • Discretize: special case of multi-armed bandit problem
  • $1/\delta$ slot machines
  • No need for convexity

d-dimensional problem harder
  • Discretizing at $\delta$ granularity
  • Exponentially many ($(1/\delta)^d$) slot machines $\Rightarrow$ exponential regret
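
A tiny sketch of the counting argument, assuming the feasible set is the hypothetical unit cube $[0, 1]^d$: a grid with spacing $\delta$ has $1/\delta$ cells per axis, so the number of slot machines grows exponentially with $d$.

```python
def num_slot_machines(delta, d):
    """Number of grid cells when [0, 1]^d is discretized at granularity delta."""
    per_axis = round(1.0 / delta)
    return per_axis ** d


counts = {d: num_slot_machines(0.1, d) for d in (1, 2, 10)}
```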
32
Non-linear applications
33
Conclusions and future work
  • Can learn to optimize a sequence of unrelated functions from evaluations
  • Answer to "What is the sound of one hand clapping?"
  • Applications
      • Cholesterol
      • Paper airplanes
      • Advertising
  • Future work
      • Many players using same algorithm (game theory)