Title: Incremental Subgradient Methods for Convex Nondifferentiable Optimization
1. Incremental Subgradient Methods for Convex Nondifferentiable Optimization
5th Ballarat Workshop on Global and Nonsmooth Optimization, 28 Nov. 2006
Angelia Nedich
Dept. of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign
2. Outline
- Large Convex Optimization
- Subgradient Methods
- Incremental Subgradient Method
- Convergence Results
- Randomized Subgradient Method
- Partially Asynchronous Incremental Method
- Convergence Rate Comparison
3. Convex Problem with Large Number of Component Functions
- The functions F_i are convex and possibly non-differentiable
- The constraint set C is nonempty, closed, convex, and simple for projection
- F* denotes the optimal value
- C* denotes the set of optimal solutions
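In the slide's notation, the problem presumably has the standard composite form:

```latex
\min_{x \in C} \; F(x) \;=\; \sum_{i=1}^{m} F_i(x),
\qquad
F^* \;=\; \inf_{x \in C} F(x),
\qquad
C^* \;=\; \{\, x \in C : F(x) = F^* \,\}.
```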
4. Agent-Task Scheduling Problem
- The number of tasks can be very large
- Relaxing the agent time constraints yields a dual problem (a sketch follows below)
- Issue:
  - The dual function q is non-differentiable
  - Gradient-based methods do not work
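As a hedged illustration of why the relaxed problem fits this framework, a generic assignment model with agent time constraints (the symbols c_ij, a_ij, t_i and the exact constraints are illustrative, not taken from the slide) dualizes as follows:

```latex
% Assign tasks j = 1,...,n to agents i = 1,...,m; x_{ij} = 1 if task j goes to agent i
\min_{x_{ij} \in \{0,1\}} \ \sum_{i,j} c_{ij} x_{ij}
\quad \text{s.t.} \quad
\sum_{j} a_{ij} x_{ij} \le t_i \ \ (\text{agent time constraints}),
\qquad
\sum_{i} x_{ij} = 1 \ \ \text{for each task } j.

% Relaxing the time constraints with multipliers \mu_i \ge 0 gives the dual function
q(\mu) \;=\; \sum_{j} \min_{i} \big( c_{ij} + \mu_i a_{ij} \big) \;-\; \sum_{i} \mu_i t_i,

% a sum of one piecewise-linear concave term per task: many components, nondifferentiable.
```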
5. Subgradient
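For reference, g is a subgradient of a convex function F at x exactly when the linear approximation through x underestimates F everywhere:

```latex
g \in \partial F(x)
\quad\Longleftrightarrow\quad
F(y) \;\ge\; F(x) + g^{\top}(y - x) \quad \text{for all } y.
```

For example, for F(x) = |x| every g in [-1, 1] is a subgradient at x = 0.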
6. Basic Subgradient Property
- Property ensuring that subgradient methods work
- Subgradient descent property: at each iteration, for a range of stepsize values,
  - either the function value decreases,
  - or the distance of the iterate from the optimal set decreases
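The inequality behind this dichotomy, for the projected step x_{k+1} = P_C[x_k - a_k g_k] with g_k a subgradient of F at x_k and any x* in C*, is the standard one (using nonexpansiveness of the projection P_C):

```latex
\|x_{k+1} - x^*\|^2
\;\le\;
\|x_k - x^*\|^2 \;-\; 2\alpha_k \big( F(x_k) - F^* \big) \;+\; \alpha_k^2 \|g_k\|^2,
```

so whenever F(x_k) > F* and 0 < a_k < 2(F(x_k) - F*)/||g_k||^2, the distance to the optimal set strictly decreases.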
7. Classical Subgradient Method
- Requires computing the subgradients of all components F_i at the current iterate
- Not suitable for some real-time operations (m large)
- Large volume of incoming data to the processing node
- Issues with the link capacities and congestion
- Large delays between updates
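A minimal Python sketch of the classical projected subgradient iteration described here; subgrad_F (a subgradient of the whole sum F) and project_C (Euclidean projection onto C) are assumed, user-supplied callables, not names from the slides:

```python
import numpy as np

def classical_subgradient(x0, subgrad_F, project_C, stepsizes, num_iters):
    """Projected subgradient method: x_{k+1} = P_C[x_k - a_k * g(x_k)].

    subgrad_F(x) must return a subgradient of the entire sum F = sum_i F_i,
    which is exactly the per-iteration cost that becomes problematic when
    the number of components m is large.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        g = subgrad_F(x)                      # needs all m components at once
        x = project_C(x - stepsizes[k] * g)   # projected step
    return x
```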
8. Incremental Subgradient Method
- Admits decentralized computations
- Suitable for real-time computations
- Intermediate updates along subgradients of the individual functions F_i
- Cycle through the components F_1, ..., F_m within each iteration k (see the sketch below):
  y_{0,k} = x_k
  y_{i,k} = P_C[ y_{i-1,k} - a_k g_{i,k}(y_{i-1,k}) ],   i = 1, ..., m
  x_{k+1} = y_{m,k}
- g_{i,k}(y_{i-1,k}) is a subgradient of F_i at y_{i-1,k}
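A sketch of the incremental update, assuming subgrads is a list of per-component subgradient callables and project_C the projection onto C (illustrative names):

```python
import numpy as np

def incremental_subgradient(x0, subgrads, project_C, stepsizes, num_cycles):
    """One cycle through the components per outer iteration k:
        y_0 = x_k,
        y_i = P_C[y_{i-1} - a_k * g_i(y_{i-1})],  i = 1, ..., m,
        x_{k+1} = y_m.
    subgrads[i](y) returns a subgradient of component F_i at y.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_cycles):
        y = x.copy()
        for g_i in subgrads:                        # cycle through F_1, ..., F_m
            y = project_C(y - stepsizes[k] * g_i(y))
        x = y                                       # x_{k+1} = y_{m,k}
    return x
```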
9. Stepsize Rules
- Constant stepsize
- Diminishing stepsize
- Adaptive stepsize (F* known)
- Adaptive stepsize (F* estimated)
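Hedged sketches of the four rules as stepsize schedules; the adaptive rules are written here as Polyak-type steps based on the known or estimated optimal value F*, and the function names and constants are illustrative choices, not the slides' exact formulas:

```python
def constant_step(alpha):
    # a_k = alpha for all k
    return lambda k, F_xk=None, g_norm=None: alpha

def diminishing_step(c=1.0):
    # a_k = c / (k + 1): not summable to a finite value, square summable
    return lambda k, F_xk=None, g_norm=None: c / (k + 1)

def adaptive_step_known(F_star, gamma=1.0):
    # Polyak-type step: a_k = gamma * (F(x_k) - F*) / ||g_k||^2, with 0 < gamma < 2
    return lambda k, F_xk, g_norm: gamma * max(F_xk - F_star, 0.0) / max(g_norm ** 2, 1e-12)

def adaptive_step_estimated(F_level, gamma=1.0):
    # Same rule with F* replaced by a running target level F_level(k)
    # that estimates the optimal value.
    return lambda k, F_xk, g_norm: gamma * max(F_xk - F_level(k), 0.0) / max(g_norm ** 2, 1e-12)
```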
10. Convergence Results
- Proposition. Let {x_k} be the sequence generated by the incremental method. Then the following hold (standard forms are sketched below):
  - For a constant stepsize
  - For a diminishing stepsize
  - For an adaptive stepsize (F* known), when C* is nonempty
- B is the norm bound on the subgradients of F_i for all i
- Motivated by a result in Rubinov and Demyanov's book on Approximate Methods for Optimization Problems
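With B the uniform bound on the subgradient norms of the F_i, standard statements behind these bullets (a hedged paraphrase following the cited SIAM paper, not the slide's exact formulas) read roughly:

```latex
% Constant stepsize \alpha > 0:
\liminf_{k \to \infty} F(x_k) \;\le\; F^* + \frac{\alpha\, m^2 B^2}{2}.

% Diminishing stepsize with \sum_k \alpha_k = \infty:
\liminf_{k \to \infty} F(x_k) \;=\; F^*.

% Adaptive (Polyak-type) stepsize with F^* known and parameters bounded in (0,2),
% when C^* is nonempty: the iterates converge to an optimal solution,
x_k \;\longrightarrow\; x^* \in C^*.
```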
11. Proof Idea
Let {x_k} be the sequence generated by the incremental method. Then for any feasible vector y and any k, we have a basic relation (a standard form is given below). Thus, for any x_k and any feasible vector y with lower cost than x_k, the next iterate x_{k+1} is closer to y than x_k, provided the stepsize a_k is sufficiently small.
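A standard form of the relation invoked here, again with B the uniform bound on the subgradient norms:

```latex
\|x_{k+1} - y\|^2
\;\le\;
\|x_k - y\|^2 \;-\; 2\alpha_k \big( F(x_k) - F(y) \big) \;+\; \alpha_k^2\, m^2 B^2
\qquad \text{for all } y \in C \text{ and all } k,
```

so if F(y) < F(x_k) and a_k < 2(F(x_k) - F(y))/(m^2 B^2), then ||x_{k+1} - y|| < ||x_k - y||.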
12. Randomized Subgradient Methods
- Pick randomly an index w from {1, ..., m}
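A sketch of the randomized variant: at each iteration one component index is drawn at random (uniform sampling is assumed here) and a single projected subgradient step is taken along that component; the callable names are illustrative:

```python
import numpy as np

def randomized_incremental_subgradient(x0, subgrads, project_C, stepsizes,
                                       num_iters, rng=None):
    """At iteration k, draw w from {0, ..., m-1} and update
        x_{k+1} = P_C[x_k - a_k * g_w(x_k)],
    where g_w is a subgradient of the randomly selected component F_w.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    m = len(subgrads)
    for k in range(num_iters):
        w = rng.integers(m)                               # random component index
        x = project_C(x - stepsizes[k] * subgrads[w](x))  # single-component step
    return x
```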
13. Convergence Results
- Proposition. Let {x_k} be the sequence generated by the randomized method. Then the following hold (standard forms are sketched below):
  - For a constant stepsize
    - When C* is nonempty
  - For a diminishing stepsize with appropriate conditions on the stepsizes
  - For an adaptive stepsize (F* known)
- Here B is the norm bound on the subgradients of F_i for all i
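Standard forms of these statements for uniform random sampling (a hedged paraphrase of the cited results, with the same subgradient-norm bound B) are roughly:

```latex
% Constant stepsize \alpha > 0 (with probability 1):
\inf_{k \ge 0} F(x_k) \;\le\; F^* + \frac{\alpha\, m B^2}{2}.

% Diminishing stepsize with \sum_k \alpha_k = \infty and \sum_k \alpha_k^2 < \infty,
% when C^* is nonempty: x_k converges to some x^* \in C^* with probability 1.

% Adaptive stepsize with F^* known: convergence to the optimal set with probability 1.
```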
14. Convergence Rate Comparison
- Convergence rate results for the constant stepsize
- Assumption: the optimal solution set C* is nonempty
- For any given error level ε > 0, we have bounds comparing the cyclic and randomized methods (a hedged sketch follows below)
- B is a bound on the subgradient norms
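A hedged summary of the comparison: for the same constant stepsize α and subgradient bound B, the error term of the fixed cyclic order scales with m^2 while that of the uniformly randomized order scales with m, i.e., the randomized selection improves the dependence on the number of components by a factor of m:

```latex
\text{cyclic order: } \ \frac{\alpha\, m^2 B^2}{2}
\qquad \text{vs.} \qquad
\text{randomized order: } \ \frac{\alpha\, m B^2}{2}.
```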
15. Partially Asynchronous Incremental Method
- Estimates x_k are communicated with delays
- Processing nodes send their estimates in slotted times t_k
- D is an upper bound on the communication delay
- Convergence holds for a suitably chosen stepsize (a toy sketch of the delay model follows below)
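A toy sketch of the bounded-delay (partially asynchronous) model: the subgradient applied at step k is evaluated at an iterate that may be up to D steps old; the cyclic component choice and the random delay here are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def delayed_incremental_subgradient(x0, subgrads, project_C, stepsizes,
                                    num_iters, D, rng=None):
    """Incremental update with communication delays bounded by D:
    the component subgradient used at iteration k is evaluated at a stale
    iterate x_{k-d} with 0 <= d <= D (an estimate received with delay d).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    history = [x.copy()]                         # recent iterates, newest last
    m = len(subgrads)
    for k in range(num_iters):
        d = int(rng.integers(min(D, k) + 1))     # delay of the received estimate
        stale_x = history[-(d + 1)]              # iterate from d steps ago
        i = k % m                                # component handled in this slot
        x = project_C(x - stepsizes[k] * subgrads[i](stale_x))
        history.append(x.copy())
        if len(history) > D + 1:                 # keep only the last D+1 iterates
            history.pop(0)
    return x
```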
16. Conclusions
- A new subgradient method for large-scale convex optimization
- Convergence and rate-of-convergence analysis
- Extensions to partially asynchronous settings
- Ongoing work on fully asynchronous implementations
17. References
- Nedich and Bertsekas, "Incremental subgradient methods for nondifferentiable optimization," SIAM Journal on Optimization.
- Nedich and Bertsekas, "Convergence rate of incremental subgradient algorithms," in Stochastic Optimization: Algorithms and Applications.
- Nedich, Bertsekas, and Borkar, "Distributed asynchronous incremental subgradient methods," in Inherently Parallel Algorithms in Feasibility and Optimization and Their Applications.