Title: Primal Methods
1. Primal Methods
2. Primal Methods
- By a primal method of solution we mean a search method that works on the original problem directly by searching through the feasible region for the optimal solution.
- Methods that work on an approximation of the original problem are often referred to as transformation methods.
- Each point in the process is feasible (theoretically) and the value of the objective function constantly decreases.
- Given n variables and m constraints, primal methods can be devised that work in spaces of dimension n-m, n, m, or n+m. In other words, a large variety exists.
3. Advantages of Primal Methods
- Primal methods possess three significant advantages (Luenberger):
- 1) Since each point generated in the search process is feasible, if the process is terminated before reaching the solution, the terminating point is feasible. Thus, the final point is feasible and probably nearly optimal.
- 2) Often it can be guaranteed that if they generate a convergent sequence, the limit point of that sequence must be at least a local constrained minimum.
- 3) Most primal methods do not rely on a special problem structure, such as convexity, and hence these methods are applicable to general nonlinear programming problems.
- Furthermore, their convergence rates are competitive with those of other methods, and particularly for linear constraints they are often among the most efficient.
4. Disadvantages of Primal Methods
- Primal methods are not without disadvantages:
- They require a (Phase I) procedure to obtain an initial feasible point.
- They are all plagued, particularly for problems with nonlinear constraints, by computational difficulties arising from the necessity to remain within the feasible region as the method progresses.
- Some methods can fail to converge for problems with inequality constraints (!) unless elaborate precautions are taken.
5. Some Typical Primal Algorithm Classes
- The following classes of algorithms are typically noted under primal methods:
- Feasible direction methods, which search only in directions that are always feasible.
  - Zoutendijk's feasible direction method
- Active set methods, which partition inequality constraints into two groups of active and inactive constraints. Constraints treated as inactive are essentially ignored.
- Gradient projection methods, which project the negative gradient of the objective onto the constraint surface.
- (Generalized) reduced gradient methods, which partition the problem variables into basic and non-basic variables.
6. Active Sets
7. Dividing the Constraint Set
- Constrained optimization can be made much more efficient if you know which constraints are active and which are inactive.
- Mathematically, active constraints are always equalities (!)
- Considering only the active constraints leads to a family of constrained optimization algorithms that can be classified as active set methods.
8. Active Set Methods
- The idea underlying active set methods is to partition inequality constraints into two groups:
  - those that are active and
  - those that are inactive.
- The constraints treated as inactive are essentially ignored.
- Clearly, if the active constraints (for the solution) were known, then the original problem could be replaced by a corresponding problem having equality constraints only.
- Alternatively, suppose we guess an active set and solve the resulting equality-constrained problem. If all constraints and optimality conditions are then satisfied, we have found the correct solution.
9. Basic Active Set Method
- The idea behind active set methods is to define at each step of the algorithm a set of constraints, termed the working set, that is to be treated as the active set.
- Active set methods consist of two components:
  - 1) determine a current working set that is a subset of the active set,
  - 2) move on the surface defined by the working set to an improved solution. This surface is often referred to as the working surface.
- The direction of movement is generally determined by first- or second-order approximations of the functions.
10. Basic Active Set Algorithm
- The basic active set algorithm is as follows (a code sketch for a simple case follows this slide):
- Start with a given working set and begin minimizing over the corresponding working surface.
- If new constraint boundaries are encountered, they may be added to the working set, but no constraints are dropped from the working set.
- Finally, a point is obtained that minimizes the objective function with respect to the current working set of constraints.
- For this point, optimality criteria are checked, and if it is deemed optimal, the solution has been found.
- Otherwise, one or more constraints are dropped from the working set (treated as inactive) and the whole procedure is restarted with this new working set.
- Many variations are possible.
- Specific examples:
  - Gradient projection algorithm
  - (Generalized) reduced gradient algorithm
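The sketch below is a minimal, hypothetical illustration of this loop for the simple bound-constrained quadratic program minimize 0.5 x'Qx + c'x subject to x ≥ 0, not the general algorithm: it minimizes over the current working surface, adds a blocking bound when one is hit, and drops the bound with the most negative multiplier when the working-surface minimizer is not optimal. All names (active_set_qp, Q, c) are illustrative.

```python
# Minimal, illustrative active set loop for: minimize 0.5*x'Qx + c'x  s.t.  x >= 0.
import numpy as np

def active_set_qp(Q, c, x0, tol=1e-9, max_iter=100):
    n = len(c)
    x = np.asarray(x0, dtype=float).copy()
    W = {i for i in range(n) if x[i] <= tol}          # working set: bounds treated as active
    for _ in range(max_iter):
        # Minimize over the working surface (x_i = 0 for i in W, free otherwise).
        F = [i for i in range(n) if i not in W]
        x_ws = np.zeros(n)
        if F:
            x_ws[F] = np.linalg.solve(Q[np.ix_(F, F)], -c[F])
        p = x_ws - x                                   # step toward the working-surface minimizer
        # Ratio test: add the first bound encountered along the step, if any.
        alpha, blocking = 1.0, None
        for i in F:
            if p[i] < -tol and -x[i] / p[i] < alpha:
                alpha, blocking = -x[i] / p[i], i
        x = x + alpha * p
        if blocking is not None:
            W.add(blocking)                            # new constraint boundary encountered
            continue
        # Minimizer w.r.t. the working set reached: check multipliers of active bounds.
        lam = Q @ x + c                                # multiplier of x_i >= 0 is (Qx + c)_i here
        worst = min(W, key=lambda i: lam[i], default=None)
        if worst is None or lam[worst] >= -tol:
            return x                                   # all multipliers nonnegative: optimal
        W.remove(worst)                                # drop that constraint and restart
    return x

# Example: minimize 0.5*(x1^2 + x2^2) - 2*x1 + x2 subject to x >= 0; solution is (2, 0).
print(active_set_qp(np.eye(2), np.array([-2.0, 1.0]), x0=np.zeros(2)))
```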
11. Some Problems With Active Set Methods
- The accuracy with which constraint activity is determined can cause some problems.
- Also, the calculation of the Lagrange multipliers may not be accurate if we are just a bit off the exact optimum.
- In practice, constraints are dropped from the working set using various criteria before an exact minimum on the working surface is found.
- For many algorithms, convergence cannot be guaranteed and jamming may occur in (very) rare cases.
- Active set methods with various refinements are often very effective.
12. Feasible Direction Methods
13. Basic Algorithm
- Each iteration in a feasible direction method consists of:
  - selecting a feasible direction and
  - a constrained line search.
14. (Simplified) Zoutendijk Method
One of the earliest proposals for a feasible direction method uses a linear programming subproblem. Consider
  minimize f(x)
  subject to a_1^T x ≤ b_1, ..., a_m^T x ≤ b_m
Given a feasible point x_k, let I be the set of indices representing active constraints, that is, a_i^T x_k = b_i for i ∈ I. The direction vector d_k is then chosen as the solution to the linear program
  minimize ∇f(x_k)^T d
  subject to a_i^T d ≤ 0, i ∈ I
             (normalizing constraint on d)
where d = (d_1, d_2, ..., d_n). The constraints assure that vectors of the form x_k + α d will be feasible for sufficiently small α > 0, and subject to these conditions, d is chosen to line up as closely as possible with the negative gradient of f. This will result in the locally best direction in which to proceed. The overall procedure progresses by generating feasible directions in this manner, and moving along them to decrease the objective.
15. Feasible Descent Directions
- Basic problem:
  - min f(x)
  - subject to g_i(x) ≤ 0, i = 1, ..., m
- Now think of a direction vector d that is both descending and feasible:
  - descent direction (= reducing f(x))
  - feasible direction (= reducing g(x), i.e., increasing feasibility)
- If d reduces f(x), then the following holds: ∇f(x)^T d < 0
- If d increases feasibility of g_i(x), then the following holds: ∇g_i(x)^T d < 0 (a tiny numerical check of both conditions follows this slide)
- Given that you know d, you now need to know how far to go along d:
  - x_{k+1} = x_k + α_k d_k
16. Finding the Direction Vector: An LP Problem
- The following condition expresses the value of α_k:
  - α_k = max{ ∇f(x)^T d, ∇g_j(x)^T d for each j ∈ I }
  - where I is the set of active constraints.
- Note that α_k < 0 MUST hold if you want both a reduction in f(x) and an increase in feasibility (remember g(x) ≤ 0, thus lower g(x) is better).
- The best α_k is the lowest-valued (most negative) α_k; thus the problem now becomes:
  - minimize α
  - subject to
    - ∇f(x)^T d ≤ α
    - ∇g_j(x)^T d ≤ α for each j ∈ I
    - -1 ≤ d_i ≤ 1 for i = 1, ..., n
- This linear programming problem now has n+1 variables (the n elements of vector d plus the scalar α). A code sketch of this subproblem follows.
17. Next Step: Constrained Line Search
- The idea behind feasible direction methods is to take steps through the feasible region of the form
  - x_{k+1} = x_k + α_k d_k
  - where d_k is a direction vector and α_k is a nonnegative scalar.
- Given that we have d_k, next we need to know how far to move along d_k.
- The scalar α_k is chosen to minimize the objective function with the restriction that the point x_{k+1} and the line segment joining x_k and x_{k+1} be feasible.
- IMPORTANT: Note that while moving along d_k, we may encounter constraints that were inactive but now become active.
- Thus, we do need to do a constrained line search to find the maximum step α_u.
- Approach in the textbook (a rough sketch follows this slide):
  - Determine the maximum step size based on the bounds of the variables.
  - If all constraints are feasible at that maximum step, take it as the step size.
  - Otherwise, search along d_k for the constraint that first causes infeasibility.
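The sketch below is a crude stand-in for that procedure, not the textbook's exact method: it caps the step by the variable bounds and then simply backtracks until every constraint g_i(x + αd) ≤ 0 holds. All names and the toy data are illustrative.

```python
import numpy as np

def max_step_from_bounds(x, d, lb, ub):
    """Largest a >= 0 with lb <= x + a*d <= ub (component-wise ratio test)."""
    a = np.inf
    for xi, di, lo, hi in zip(x, d, lb, ub):
        if di > 0:
            a = min(a, (hi - xi) / di)
        elif di < 0:
            a = min(a, (lo - xi) / di)
    return a

def feasible_step(x, d, g_list, lb, ub, shrink=0.5, tol=1e-12):
    """Return a step length along d that keeps all constraints satisfied
    (a feasible step, not necessarily the objective-minimizing one)."""
    a = max_step_from_bounds(x, d, lb, ub)
    while a > tol and any(g(x + a * d) > 0 for g in g_list):
        a *= shrink                                  # back off until feasible
    return a

# Toy usage: one constraint x1 + x2 - 2 <= 0, box 0 <= x <= 10.
g_list = [lambda x: x[0] + x[1] - 2.0]
x, d = np.array([0.5, 0.5]), np.array([1.0, 1.0])
print(feasible_step(x, d, g_list, lb=np.zeros(2), ub=10 * np.ones(2)))
```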
18. Major Shortcomings
- Two major shortcomings of feasible direction methods require modification of the methods in most cases:
- 1) For general problems, there may not exist any feasible direction. (example??)
  - In such cases, either
    - relax the definition of feasibility or allow points to deviate, or
    - introduce the concept of moving along curves rather than straight lines.
- 2) Feasible direction methods can be subject to jamming, a.k.a. zigzagging, that is, the method does not converge to a constrained local minimum.
  - In Zoutendijk's method, this can be caused because the method for finding a feasible direction changes if another constraint becomes active.
19. Gradient Projection Methods
20. Basic Problem Formulation
- Gradient projection started from the nonlinear optimization problem with linear constraints:
  - min f(x)
  - s.t.
    - a_r^T x ≤ b_r (inequality constraints)
    - a_s^T x = b_s (equality constraints)
21. Gradient Projection Methods
- The gradient projection method is motivated by the ordinary method of steepest descent for unconstrained problems.
- Fundamental concept: the negative gradient of the objective function is projected onto the working surface (the subset of active constraints) in order to define the direction of movement.
- The major task is to calculate the projection matrix P and the subsequent feasible direction vector d.
22. Feasible Direction Vector and Projection Matrix
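For the linear-constraint case, with A_q holding the gradients of the active constraints as rows, the standard projection matrix is P = I - A_q^T (A_q A_q^T)^{-1} A_q and the direction of movement is d = -P ∇f(x). A small sketch with illustrative data:

```python
import numpy as np

def projection_matrix(A_q):
    """Project onto the null space of the active-constraint gradients (rows of A_q)."""
    return np.eye(A_q.shape[1]) - A_q.T @ np.linalg.inv(A_q @ A_q.T) @ A_q

def projected_direction(grad_f, A_q):
    return -projection_matrix(A_q) @ grad_f          # d = -P grad_f(x)

# Example: one active constraint x1 + x2 = 2 (gradient (1, 1)) and grad_f = (2, 0).
A_q = np.array([[1.0, 1.0]])
print(projected_direction(np.array([2.0, 0.0]), A_q))    # [-1.  1.], tangent to the constraint
```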
23. Nonlinear Constraints
For the general case of
  min f(x)
  s.t. h(x) = 0, g(x) ≤ 0
the basic idea is that at a feasible point x_k one determines the active constraints and projects the negative gradient of f onto the subspace tangent to the surface determined by these constraints. This vector (if nonzero) determines the direction for the next step. However, this vector is in general not a feasible direction, since the working surface may be curved. Therefore, it may not be possible to move along this projected negative gradient to obtain the next point.
24. Overcoming Curvature Difficulties
- What is typically done to overcome the problem of curvature and loss of feasibility is to search along a curve on the constraint surface, the direction of the search being defined by the projected negative gradient.
- A new point is found as follows (a code sketch follows this slide):
  - First, a move is made along the projected negative gradient to a point y.
  - Then a move is made in the direction perpendicular to the tangent plane at the original point, to a nearby feasible point on the working surface.
  - Once this point is found, the value of the objective function is determined.
  - This is repeated with various y's until a feasible point is found that satisfies the descent criteria for improvement relative to the original point.
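A hedged sketch of those two moves for a single equality constraint h(x) = 0 is given below: step along the projected negative gradient to y, then pull y back to the surface with Newton-style corrections y ← y - Jh^T (Jh Jh^T)^{-1} h(y). The constraint (the unit circle), the objective gradient, and the step length are all made up for illustration.

```python
import numpy as np

def restore_feasibility(y, h, Jh, tol=1e-10, max_iter=50):
    """Move perpendicular to the tangent plane until h(y) = 0 (approximately)."""
    for _ in range(max_iter):
        hy = np.atleast_1d(h(y))
        if np.linalg.norm(hy) < tol:
            break
        J = np.atleast_2d(Jh(y))
        y = y - J.T @ np.linalg.solve(J @ J.T, hy)
    return y

# Working surface: h(x) = x1^2 + x2^2 - 1 = 0 (the unit circle).
h  = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
Jh = lambda x: np.array([[2 * x[0], 2 * x[1]]])

x = np.array([1.0, 0.0])                  # feasible starting point
grad_f = np.array([0.0, -1.0])            # made-up objective gradient at x
J = Jh(x)
P = np.eye(2) - J.T @ np.linalg.inv(J @ J.T) @ J
y = x + 0.5 * (-P @ grad_f)               # move along the projected negative gradient
print(restore_feasibility(y, h, Jh))      # back on the circle, roughly (0.894, 0.447)
```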
25. Difficulties and Complexities
- The movement away from the feasible region and then back again introduces difficulties that require a series of interpolations and nonlinear equation solutions for their resolution, because:
  - 1) you first have to get back to the feasible region, and
  - 2) next, you have to find a point on the active set of constraints.
- Thus, a satisfactory gradient projection method is quite complex.
- Computation of the projection matrix for nonlinear constraints is also more time consuming than for linear constraints.
- Nevertheless, the gradient projection method has been successfully implemented and found to be effective (your book says otherwise). But all the extra features needed to maintain feasibility require skill.
26. (Generalized) Reduced Gradient Method
27. Reduced Gradient Method
- The reduced gradient method is closely related to the simplex LP method because variables are split into basic and non-basic groups.
- From a theoretical viewpoint, the method behaves very much like the gradient projection method.
- Like the gradient projection method, it can be regarded as a steepest descent method applied on the surface defined by the active constraints.
- The reduced gradient method seems to perform better than gradient projection methods.
28. Dependent and Independent Variables
Consider
  min f(x)
  s.t. Ax = b, x ≥ 0
Partition the variables into two groups, x = (y, z), where y has dimension m and z has dimension n-m. This partition is formed such that all variables in y are strictly positive. Now the original problem can be expressed as
  min f(y, z)
  s.t. By + Cz = b, y ≥ 0, z ≥ 0 (with, of course, A = [B, C])
The key notion is that if z is specified (the independent variables), then y (the dependent variables) can be uniquely solved for (a small numerical illustration follows this slide).
NOTE: y and z are dependent. Because of this dependency, if we move z along the line z + αΔz, then y will have to move along a corresponding line y + αΔy.
Dependent variables y are also referred to as basic variables; independent variables z are also referred to as non-basic variables.
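A small numerical illustration of that uniqueness, with made-up data: once A is split as [B, C] and z is fixed, the dependent variables follow from B y = b - C z.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])

B, C = A[:, :2], A[:, 2:]           # y = (x1, x2) basic, z = (x3, x4) non-basic
z = np.array([0.0, 0.0])            # fix the independent variables
y = np.linalg.solve(B, b - C @ z)   # dependent variables are uniquely determined
print(y)                            # [2. 2.], so x = (2, 2, 0, 0) satisfies Ax = b
```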
29. The Reduced Gradient
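With the partition of the previous slide, the reduced gradient with respect to z is commonly written r = ∇_z f - (B^{-1} C)^T ∇_y f, i.e. the gradient of f along z after y has been eliminated through By + Cz = b. A sketch, reusing the illustrative B and C from the previous example with a made-up gradient of f:

```python
import numpy as np

def reduced_gradient(grad_y, grad_z, B, C):
    # Solve B^T w = grad_y instead of forming B^{-1} explicitly.
    w = np.linalg.solve(B.T, grad_y)
    return grad_z - C.T @ w            # r = grad_z f - (B^{-1} C)^T grad_y f

B = np.array([[1.0, 1.0], [1.0, 2.0]])
C = np.array([[1.0, 0.0], [0.0, 1.0]])
print(reduced_gradient(grad_y=np.array([1.0, 0.0]),
                       grad_z=np.array([0.0, 0.0]),
                       B=B, C=C))      # [-2.  1.]
```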
30. Generalized Reduced Gradient
- The generalized reduced gradient method solves nonlinear programming problems in the standard form:
  - minimize f(x)
  - subject to h(x) = 0, a ≤ x ≤ b
  - where h(x) is of dimension m.
- The GRG algorithm works similarly to the linear-constraint case. However, it is also plagued by problems similar to those of gradient projection methods regarding maintaining feasibility.
- Well-known implementation: the GRG2 software.
31. Movement Basics
- The basic idea in GRG is to search with the z variables along the reduced gradient r for improvement of the objective function and use the y variables to maintain feasibility.
- If some z_i is at its bound (see Eq. 5.62), then set the search direction for that variable d_i = 0, depending on the sign of the reduced gradient r (a sketch of this rule follows this slide).
  - Reason: you do not want to violate a variable's bound. Thus, that variable is fixed by not allowing it to change (d_i = 0 means it now has no effect on f(x)).
- Search: x_{k+1} = x_k + α d with d = (d_y, d_z) (column vector).
- If the constraints are linear, then the new point is (automatically) feasible.
  - See the derivation on page 177: constraints and objective function are combined in the reduced gradient.
- If constraint(s) are nonlinear, you have to adjust y by some Δy to get back to feasibility.
  - Different techniques exist, but basically this is equivalent to an unconstrained optimization problem that minimizes the constraint violation.
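One way this rule is often written, sketched below with illustrative data reusing the earlier B and C: move z against the reduced gradient, freeze any z_i sitting at a bound whose reduced-gradient sign would push it outside, and let d_y = -B^{-1} C d_z so the linearized constraints stay satisfied.

```python
import numpy as np

def grg_direction(z, r, z_lo, z_hi, B, C, tol=1e-10):
    d_z = -r.copy()
    at_lo = (z <= z_lo + tol) & (r > 0)      # step would push z_i below its lower bound
    at_hi = (z >= z_hi - tol) & (r < 0)      # step would push z_i above its upper bound
    d_z[at_lo | at_hi] = 0.0                 # freeze those variables: d_i = 0
    d_y = -np.linalg.solve(B, C @ d_z)       # keep B y + C z = b to first order
    return d_y, d_z

B = np.array([[1.0, 1.0], [1.0, 2.0]])
C = np.array([[1.0, 0.0], [0.0, 1.0]])
d_y, d_z = grg_direction(z=np.array([0.0, 1.0]), r=np.array([3.0, -2.0]),
                         z_lo=np.zeros(2), z_hi=10 * np.ones(2), B=B, C=C)
print(d_y, d_z)     # z1 stays frozen at its lower bound; z2 moves against r
```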
32. Changing Basis
- Picking a basis is sometimes poorly discussed in textbooks:
  - Some literally only say "pick a set z and y."
- Your textbook provides a method based on Gaussian elimination on pages 180-181 that is carried out at every iteration of the method.
- Other (recent) implementations favor changing the basis only when a basic variable reaches zero (or, equivalently, its upper or lower bound), since this saves recomputation of B^-1.
- Thus, if a dependent variable (y) becomes zero, then this zero-valued dependent (basic) variable is declared independent and one of the strictly positive independent variables is made dependent.
- Either way, this is analogous to an LP pivot operation.
33. Reduced Gradient Algorithm (one implementation)
- 1) Pick a set of independent (z) and dependent (y) variables.
- 2) Let Δz_i = -r_i if r_i < 0 or z_i > 0; otherwise let Δz_i = 0.
- 3) If Δz = 0, stop: the current point is the solution. Otherwise, find Δy = -B^-1 C Δz.
- 4) Find α_1, α_2, α_3 achieving, respectively (a sketch of the first two ratio tests follows this slide):
  - max { α_1 : y + α_1 Δy ≥ 0 }
  - max { α_2 : z + α_2 Δz ≥ 0 }
  - min { f(x + α_3 Δx) : 0 ≤ α_3 ≤ α_1, 0 ≤ α_3 ≤ α_2 }
- 5) Let x = x + α_3 Δx.
- 6) If α_3 < α_1, return to step 1. Otherwise, declare the vanishing variable in the dependent set (y) independent and declare one of the strictly positive variables in the independent set (z) dependent (pivot operation). Update B and C.
- Note that your book has a slightly different implementation on page 181!
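The first two quantities are simple ratio tests; a small sketch with made-up vectors:

```python
import numpy as np

def max_nonneg_step(v, dv):
    """Largest a >= 0 with v + a*dv >= 0 (np.inf if no component decreases)."""
    neg = dv < 0
    return np.min(-v[neg] / dv[neg]) if np.any(neg) else np.inf

y, dy = np.array([2.0, 1.0]), np.array([-1.0,  0.5])
z, dz = np.array([0.5, 3.0]), np.array([ 0.2, -1.5])
print(max_nonneg_step(y, dy), max_nonneg_step(z, dz))   # alpha_1 = 2.0, alpha_2 = 2.0
```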
34. Comments
- The GRG method can be quite complex.
- Also note that inequality constraints have to be converted to equalities first, through slack and surplus variables.
- GRG2 is a very well-known implementation.