Title: Blind online optimization: Gradient descent without a gradient
1Blind online optimization: Gradient descent without a gradient
- Abie Flaxman CMU
- Adam Tauman Kalai TTI
- Brendan McMahan CMU
2Standard convex optimization
- Convex feasible set $S \subseteq \mathbb{R}^d$
- Concave function $f : S \to \mathbb{R}$
[Figure: axis label x]
3Steepest ascent
- Move in the direction of steepest ascent
- Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions)
- Works for convex optimization
- (and many other problems)
[Figure: successive iterates x1, x2, x3, x4]
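The steepest-ascent loop on this slide can be sketched in a few lines. This is a minimal one-dimensional illustration, not the presenters' code: the function name `steepest_ascent`, the fixed step size, and the quadratic profit function are all assumptions made for the example.

```python
def steepest_ascent(grad, x0, step=0.1, iters=100):
    """Repeatedly move in the direction of the derivative (1-D case)."""
    x = x0
    path = [x]
    for _ in range(iters):
        x = x + step * grad(x)  # ascent step: follow the sign of f'(x)
        path.append(x)
    return path

# Hypothetical concave profit f(x) = -(x - 3)^2, so f'(x) = -2 (x - 3).
path = steepest_ascent(lambda x: -2.0 * (x - 3.0), x0=0.0)
x_final = path[-1]  # approaches the maximizer x = 3
```

Note that every step needs the derivative at the current point, which is exactly the information the blind setting later in the talk does not provide.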
4Typical application
- Company produces certain numbers of cars per month
  - Vector $x \in \mathbb{R}^d$ (Corollas, Camrys, ...)
- Profit of company is concave function of production vector
  - Maximize total (eq. average) profit
PROBLEMS
5Problem definition and results
- Sequence of unknown concave functions
- Period t: pick $x_t \in S$, find out only $f_t(x_t)$
- convex
Theorem
6Online model
expected regret
- Holds for arbitrary sequences
- Stronger than stochastic model
  - $f_1, f_2, \ldots$ i.i.d. from $D$
  - $x^* = \arg\min_{x \in S} \mathbb{E}_D[f(x)]$
7Outline
- Problem definition
- Simple algorithm
- Analysis sketch
- Variations
- Related work and applications
8First try
[Zinkevich '03] If we could only compute gradients...
[Figure: PROFIT vs. CAMRYS; curves f1, f2, f3, f4 with iterates x1, x2, x3, x4 and values f1(x1), f2(x2), f3(x3), f4(x4)]
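9Idea: one point gradient
With probability ½, estimate = $f(x + \delta)/\delta$
With probability ½, estimate = $-f(x - \delta)/\delta$
$\mathbb{E}[\text{estimate}] \approx f'(x)$
[Figure: PROFIT vs. CAMRYS, with points $x - \delta$, $x$, $x + \delta$ marked]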
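The one-point idea can be checked numerically: averaging many single-evaluation estimates should recover the central difference $(f(x+\delta) - f(x-\delta))/(2\delta)$, which approximates $f'(x)$. The sketch below is illustrative only; the `profit` function, the seed, and the sample count are assumptions, not values from the talk.

```python
import random

def one_point_estimate(f, x, delta, rng):
    """One evaluation of f per call:
    with prob. 1/2 return f(x + delta)/delta, else -f(x - delta)/delta."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

# Hypothetical concave profit function (an assumption for illustration).
def profit(x):
    return 10.0 - (x - 3.0) ** 2

rng = random.Random(0)
x, delta, n = 1.0, 0.1, 200_000
mean_est = sum(one_point_estimate(profit, x, delta, rng) for _ in range(n)) / n

true_derivative = -2.0 * (x - 3.0)
# The exact expectation of the estimate is the central difference.
exact_expectation = (profit(x + delta) - profit(x - delta)) / (2 * delta)
```

Each individual estimate is large in magnitude (on the order of $f(x)/\delta$), so the estimate is unbiased for the central difference but very noisy; only the average is informative.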
10d-dimensional online algorithm
[Figure: iterates x1, x2, x3, x4 inside the feasible set S]
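The slide's algorithm survives only as a picture, so the following is a hedged sketch of how a d-dimensional blind gradient ascent loop could look, not the authors' exact procedure. It assumes the feasible set S is an origin-centered Euclidean ball (so that projection is trivial), and the names `blind_gradient_ascent`, `radius`, `delta`, and `step` are hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def project_to_ball(y, radius):
    """Euclidean projection onto the origin-centered ball of given radius."""
    norm = math.sqrt(sum(c * c for c in y))
    if norm <= radius:
        return y
    return [c * radius / norm for c in y]

def blind_gradient_ascent(fs, d, radius, delta, step, rng):
    """Play one point per period, seeing only the scalar value f_t(x_t)."""
    y = [0.0] * d
    played, rewards = [], []
    for f in fs:
        u = random_unit_vector(d, rng)
        x = [yi + delta * ui for yi, ui in zip(y, u)]  # point actually played
        value = f(x)                                   # the only feedback
        g = [(d / delta) * value * ui for ui in u]     # one-point gradient estimate
        # Ascent step, then project onto a shrunken ball so that the next
        # perturbed point y + delta * u still lies inside S.
        y = project_to_ball([yi + step * gi for yi, gi in zip(y, g)],
                            radius - delta)
        played.append(x)
        rewards.append(value)
    return played, rewards

# Usage with a hypothetical fixed concave function repeated for 500 periods.
rng = random.Random(1)
target = [0.5, -0.25]
fs = [lambda x, c=target: 1.0 - sum((xi - ci) ** 2 for xi, ci in zip(x, c))] * 500
played, rewards = blind_gradient_ascent(fs, d=2, radius=1.0, delta=0.2,
                                        step=0.01, rng=rng)
```

Projecting onto the ball of radius `radius - delta` rather than onto S itself is the design choice that keeps every played point feasible.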
11Outline
- Problem definition
- Simple algorithm
- Analysis sketch
- Variations
- Related work and applications
12Analysis ingredients
- E[1-point estimate] is gradient of $\hat{f}$, a smoothed version of $f$
- $|\hat{f} - f|$ is small
- Online gradient ascent analysis [Z03]
- Online expected gradient ascent analysis
- (Hidden complications)
131-pt gradient analysis
[Figure: PROFIT vs. CAMRYS, with points $x + \delta$ and $x - \delta$ marked]
141-pt gradient analysis (d-dim)
- E[1-point estimate] is gradient of $\hat{f}$, a smoothed version of $f$
- $|\hat{f} - f|$ is small
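The d-dimensional claim can be illustrated with a Monte Carlo check. The sketch assumes the estimate takes the form $(d/\delta)\, f(x + \delta u)\, u$ with $u$ uniform on the unit sphere, and uses a hypothetical quadratic $f$, for which smoothing over a ball only subtracts a constant, so the smoothed gradient coincides with $\nabla f$. All names and constants are illustrative.

```python
import math
import random

def unit_vector(d, rng):
    """Uniform direction on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def one_point_gradient(f, x, delta, rng):
    """Gradient estimate from a single evaluation of f."""
    d = len(x)
    u = unit_vector(d, rng)
    value = f([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * value * ui for ui in u]

center = [0.5, -0.25, 0.0]  # hypothetical maximizer

def f(x):
    return 1.0 - sum((xi - ci) ** 2 for xi, ci in zip(x, center))

rng = random.Random(0)
x, delta, n = [0.0, 0.0, 0.0], 0.5, 100_000
avg = [0.0] * len(x)
for _ in range(n):
    g = one_point_gradient(f, x, delta, rng)
    avg = [a + gi / n for a, gi in zip(avg, g)]

true_grad = [-2.0 * (xi - ci) for xi, ci in zip(x, center)]
```

For non-quadratic functions the average converges to the gradient of the smoothed function rather than of $f$ itself, which is why the analysis also needs the two functions to be close.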
15Online gradient ascent [Z03]
(concave, bounded gradient)
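Since the slide's formulas did not survive, here is a hedged sketch of online projected gradient ascent in the sighted setting, where the gradient of $f_t$ is revealed after $x_t$ is played. The interval feasible set, the step size $1/\sqrt{t}$, the alternating quadratic functions, and the constants `D` and `G` are assumptions for illustration; the bound computed at the end is the regret bound from [Z03] as it is commonly stated.

```python
import math

def online_gradient_ascent(grads, x0, lo, hi):
    """grads[t](x) returns f_t'(x), revealed only after x_t is chosen."""
    x, xs = x0, []
    for t, grad in enumerate(grads, start=1):
        xs.append(x)                              # commit to x_t first
        eta = 1.0 / math.sqrt(t)                  # decreasing step size
        x = min(hi, max(lo, x + eta * grad(x)))   # step, then project onto [lo, hi]
    return xs

# Hypothetical sequence: f_t(x) = -(x - c_t)^2 with c_t alternating 0.2 / 0.8.
n = 1000
centers = [0.2 if t % 2 == 0 else 0.8 for t in range(n)]
fs = [lambda x, c=c: -(x - c) ** 2 for c in centers]
grads = [lambda x, c=c: -2.0 * (x - c) for c in centers]
xs = online_gradient_ascent(grads, x0=0.0, lo=0.0, hi=1.0)

# Regret against the best fixed point in hindsight (the mean of the centers).
best_fixed = sum(centers) / n
regret = sum(f(best_fixed) for f in fs) - sum(f(x) for f, x in zip(fs, xs))

D, G = 1.0, 2.0  # diameter of [0, 1] and a bound on |f_t'| there
zinkevich_bound = D * D * math.sqrt(n) / 2 + (math.sqrt(n) - 0.5) * G * G
```

The bound grows like $\sqrt{n}$, so the average regret per period vanishes as the number of periods grows.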
16Expected gradient ascent analysis
- Regular deterministic gradient ascent on $g_t$
(concave, bounded gradient)
17Adaptive adversary
18Hidden complication
[Figure: feasible set S]
22Hidden complication
Reshape into isotropic position [LV03]
23Outline
- Problem definition
- Simple algorithm
- Analysis sketch
- Variations
- Related work and applications
24Variations
diameter
gradient bound
- Works against adaptive adversary
  - Chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$
- Also works if we only get a noisy estimate of $f_t(x_t)$, i.e. $\mathbb{E}[h_t(x_t) \mid x_t] = f_t(x_t)$
25Related convex optimization
Setting | Sighted (see entire function(s)) | Blind (evaluations only)
Regular (single f) | Gradient descent, ... | Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference
Stochastic (dist. over f's or dist. over errors) | Gradient descent (stoch.) | 1-pt. gradient appx. [G89, S97], Finite difference
Online (f1, f2, f3, ...) | Gradient descent (online) [Z03] | 1-pt. gradient appx. [BKM04], Finite difference [Kleinberg04]
26Related discrete optimization
Linear function(s) over discrete set | Sighted (see entire function(s)) | Blind aka bandit (evaluations only)
Regular (single f) | Shortest path, max, ... |
Stochastic (dist. over f's) | Huffman trees, ... |
Online (f1, f2, f3, ...) | Weighted majority, Online linear optimization [Hannan57, KV03] | Adversarial bandits, Blind linear optimization [AK04, MB04] (adaptive adversary)
27Switching lanes (experts)
[Figure: table of per-lane values over successive periods]
28Multi-armed bandit (experts)
[Figure: table of per-arm values over successive periods]
[R52, ACFS95, ...]
29Driving to work (online routing)
[TW02, KV02, AK04, BM04]
- Exponentially many paths
- Exponentially many slot machines?
- Finite dimensions
- Exploration/exploitation tradeoff
30Online product design
31High dimensions
- One-dimensional problem: easy
  - Discretize, special case of multi-armed bandit problem
  - $1/\delta$ slot machines
  - No need for convexity
- d-dimensional problem: harder
  - Discretizing at $\delta$ granularity
  - Exponentially many ($1/\delta^d$) slot machines $\Rightarrow$ exponential regret
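A quick count shows how fast the number of slot machines grows when a grid is laid over the unit cube. This is only an arithmetic illustration; the helper name `num_slot_machines` and the choice of 10 cells per axis (granularity 1/10) are assumptions.

```python
def num_slot_machines(cells_per_axis, d):
    """Number of grid cells (arms) covering [0, 1]^d with the given resolution."""
    return cells_per_axis ** d

# Granularity 1/10 per axis, in increasing dimension.
counts = {d: num_slot_machines(10, d) for d in (1, 2, 5, 10)}
```

With ten cells per axis, one dimension needs ten arms, while ten dimensions already need ten billion, far more than any realistic number of periods available for exploration.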
32Non-linear applications
33Conclusions and future work
- Can learn to optimize a sequence of unrelated functions from evaluations
- Answer to "What is the sound of one hand clapping?"
- Applications
  - Cholesterol
  - Paper airplanes
  - Advertising
- Future work
  - Many players using same algorithm (game theory)