Primal Methods - PowerPoint PPT Presentation


Primal Methods



Slides: 35
Provided by: Dig79


Transcript and Presenter's Notes

Title: Primal Methods

Primal Methods
Primal Methods
  • By a primal method of solution we mean a search
    method that works on the original problem
    directly by searching through the feasible region
    for the optimal solution.
  • Methods that work on an approximation of the
    original problem are often referred to as
    Transformation Methods
  • Each point in the process is feasible
    (theoretically) and the value of the objective
    function constantly decreases.
  • Given n variables and m constraints, primal
    methods can be devised that work in spaces of
    dimension n-m, n, m, or n+m. In other words, a
    large variety exists.

Advantages of Primal Methods
  • Primal methods possess 3 significant advantages
  • 1) Since each point generated in the search
    process is feasible, if the process is terminated
    before reaching the solution, the terminating
    point is feasible. Thus, the final point is
    feasible and probably nearly optimal.
  • 2) Often it can be guaranteed that if they
    generate a convergent sequence, then the limit
    point of that sequence must be at least a local
    constrained minimum.
  • 3) Most primal methods do not rely on a special
    problem structure, such as convexity, and hence
    these methods are applicable to general nonlinear
    programming problems.
  • Furthermore, their convergence rates are
    competitive with other methods, and particularly
    for linear constraints, they are often among the
    most efficient.

Disadvantages of Primal Methods
  • Primal methods are not without disadvantages
  • They require a (Phase I) procedure to obtain an
    initial feasible point.
  • They are all plagued, particularly for problems
    with nonlinear constraints, with computational
    difficulties arising from the necessity to remain
    within the feasible region as the method
    progresses.
  • Some methods can fail to converge for problems
    with inequality constraints (!) unless elaborate
    precautions are taken.

Some Typical Primal Algorithm Classes
  • The following classes of algorithms are typically
    noted under primal methods
  • Feasible direction methods which search only in
    directions which are always feasible
  • Zoutendijk's Feasible Direction method
  • Active set methods which partition inequality
    constraints into two groups of active and
    inactive constraints. Constraints treated as
    inactive are essentially ignored.
  • Gradient projection methods which project the
    negative gradient of the objective onto the
    constraint surface.
  • (Generalized) reduced gradient methods which
    partition the problem variables into basic and
    non-basic variables.

Active Sets
Dividing the Constraint Set
  • Constrained optimization can be made much more
    efficient if you know which constraints are
    active and which are inactive.
  • Mathematically, active constraints are always
    equalities (!)
  • Only considering the active constraints leads to
    a family of constrained optimization algorithms
    that can be classified as active set methods

Active Set Methods
  • The idea underlying active set methods is to
    partition inequality constraints into two groups
  • those that are active and
  • those that are inactive.
  • The constraints treated as inactive are
    essentially ignored.
  • Clearly, if the active constraints (for the
    solution) would be known, then the original
    problem could be replaced by a corresponding
    problem having equality constraints only.
  • Alternatively, suppose we guess an active set and
    solve the equality problem. If all constraints
    and optimality conditions are then satisfied,
    we have found the correct solution.

Basic Active Set Method
  • The idea behind active set methods is to define at
    each step of the algorithm a set of constraints,
    termed the working set, that is to be treated as
    the active set.
  • Active set methods consist of two components
  • 1) determine a current working set that is a
    subset of the active set,
  • 2) move on the surface defined by the working
    set to an improved solution. This surface is
    often referred to as the working surface.
  • The direction of movement is generally determined
    by first or second order approximations of the
    functions.

Basic Active Set Algorithm
  • Basic active set algorithm is as follows
  • Start with a given working set and begin
    minimizing over the corresponding working
    surface.
  • If new constraint boundaries are encountered,
    they may be added to the working set, but no
    constraints are dropped from the working set.
  • Finally, a point is obtained that minimizes the
    objective function with respect to the current
    working set of constraints.
  • For this point, optimality criteria are checked,
    and if it is deemed optimal, the solution has
    been found.
  • Otherwise, one or more constraints (typically
    those with negative Lagrange multipliers) are
    dropped from the working set and the whole
    procedure is restarted with this new working set.
  • Many variations are possible
  • Specific examples
  • Gradient Projection algorithm
  • (Generalized) Reduced Gradient algorithm
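
The loop described above can be sketched for one special case. This is only an illustration under stated assumptions, not the method from the slides: the objective is a convex quadratic 0.5 x'Gx + c'x, the constraints are linear (A x <= b), the starting point is feasible, and the rows of A in the working set are linearly independent. The function name active_set_qp and the example data are hypothetical.

```python
import numpy as np

def active_set_qp(G, c, A, b, x0, working, tol=1e-10, max_iter=50):
    """Sketch: minimize 0.5 x'Gx + c'x subject to A x <= b (G positive definite)."""
    x = np.asarray(x0, dtype=float)
    W = list(working)                      # indices treated as equalities
    n = x.size
    for _ in range(max_iter):
        g = G @ x + c                      # gradient at the current point
        k = len(W)
        AW = np.array([A[i] for i in W], dtype=float).reshape(k, n)
        # Equality-constrained subproblem on the working surface (KKT system).
        K = np.zeros((n + k, n + k))
        K[:n, :n] = G
        K[:n, n:] = AW.T
        K[n:, :n] = AW
        sol = np.linalg.solve(K, np.concatenate([-g, np.zeros(k)]))
        p, lam = sol[:n], sol[n:]
        if np.linalg.norm(p) < tol:        # minimum on the working surface
            if k == 0 or lam.min() >= -tol:
                return x, W                # all multipliers nonnegative: optimal
            W.pop(int(np.argmin(lam)))     # drop most negative multiplier
            continue
        # Move along p until the first constraint outside W is hit.
        alpha, blocking = 1.0, None
        for i in range(len(b)):
            slope = A[i] @ p
            if i not in W and slope > tol:
                ratio = (b[i] - A[i] @ x) / slope
                if ratio < alpha:
                    alpha, blocking = ratio, i
        x = x + alpha * p
        if blocking is not None:
            W.append(blocking)             # newly encountered boundary
    raise RuntimeError("active set sketch did not converge")

# Hypothetical example: minimize (x1 - 1)^2 + (x2 - 2.5)^2 over a polygon.
G = 2.0 * np.eye(2)
c = np.array([-2.0, -5.0])
A = np.array([[-1.0, 2.0], [1.0, 2.0], [1.0, -2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([2.0, 6.0, 2.0, 0.0, 0.0])
x_opt, W_final = active_set_qp(G, c, A, b, x0=[2.0, 0.0], working=[2, 4])
```

In this example the method first drops both initial constraints, then picks up the first constraint as a blocking boundary and finishes on it.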

Some Problems With Active Set Methods
  • Deciding numerically whether a constraint is
    active (the accuracy of activity) can cause some
    problems.
  • Also, the calculation of the Lagrangian
    multipliers may not be accurate if we are just a
    bit off the exact optimum.
  • In practice, constraints are dropped from the
    working set using various criteria before an
    exact minimum on the working surface is found.
  • For many algorithms, convergence cannot be
    guaranteed and jamming may occur in (very) rare
    cases.
  • Active set methods with various refinements are
    often very effective.

Feasible Direction Methods
Basic Algorithm
  • Each iteration in a feasible direction method
    consists of
  • selecting a feasible direction and
  • a constrained line search.

(Simplified) Zoutendijk Method
One of the earliest proposals for a feasible
direction method uses a linear programming
subproblem. Consider

    minimize   $f(x)$
    subject to $a_1^T x \le b_1, \; \dots, \; a_m^T x \le b_m$

Given a feasible point $x_k$, let $I$ be the set of
indices representing active constraints, that is,
$a_i^T x_k = b_i$ for $i \in I$. The direction vector
$d_k$ is then chosen as the solution to the linear
program

    minimize   $\nabla f(x_k)^T d$
    subject to $a_i^T d \le 0, \; i \in I$
               a normalizing constraint on $d$,
               for example $\sum_{i=1}^{n} |d_i| = 1$

where $d = (d_1, d_2, \dots, d_n)$. The constraints
assure that vectors of the form $x_k + \alpha d_k$
will be feasible for sufficiently small $\alpha > 0$,
and subject to these conditions, $d$ is chosen to
line up as closely as possible with the negative
gradient of $f$. This will result in the locally
best direction in which to proceed. The overall
procedure progresses by generating feasible
directions in this manner, and moving along them
to decrease the objective.
Feasible Descent Directions
  • Basic problem
  • Min $f(x)$
  • Subject to $g_i(x) \le 0$ with $i = 1, \dots, m$
  • Now think of a direction vector $d$ that is both
    descending and feasible
  • descent direction (= reducing $f(x)$)
  • feasible direction (= reducing $g(x)$ =
    increasing feasibility)
  • If $d$ reduces $f(x)$, then the following holds:
    $\nabla f(x)^T d < 0$
  • If $d$ increases feasibility of $g_i(x)$, then the
    following holds: $\nabla g_i(x)^T d < 0$
  • Given that you know $d$, you now need to know how
    far to go along $d$.
  • $x_{k+1} = x_k + \alpha_k d_k$
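
The two first-order conditions above can be checked numerically. The following is a minimal sketch; the function name is_feasible_descent and the example gradients are assumptions made for illustration only.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def is_feasible_descent(grad_f, active_grads, d):
    """True if d reduces f and every active g_j to first order."""
    descends = dot(grad_f, d) < 0.0
    stays_feasible = all(dot(grad_g, d) < 0.0 for grad_g in active_grads)
    return descends and stays_feasible

# Hypothetical example: f(x) = x1^2 + x2^2 at x = (1, 1), so grad f = (2, 2).
# The constraint g(x) = 1 - x1 <= 0 is active there, with grad g = (-1, 0).
grad_f = [2.0, 2.0]
active = [[-1.0, 0.0]]
good = is_feasible_descent(grad_f, active, [1.0, -2.0])   # descends, stays feasible
bad = is_feasible_descent(grad_f, active, [-1.0, -1.0])   # descends, but g increases
```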

Finding the Direction Vector: An LP Problem
  • The following condition expresses the value of
    $\alpha_k$ (here the LP objective, not a step size):
  • $\alpha_k = \max \{ \nabla f(x)^T d, \; \nabla g_j(x)^T d \text{ for each } j \in I \}$
  • where $I$ is the set of active constraints
  • Note that $\alpha_k < 0$ MUST hold if you want both a
    reduction in $f(x)$ and increase in feasibility
    (remember $g(x) \le 0$, thus lower $g(x)$ is better)
  • The best $\alpha_k$ is the lowest valued (most negative)
    $\alpha_k$, thus the problem now becomes
  • minimize $\alpha$
  • Subject to
  • $\nabla f(x)^T d \le \alpha$
  • $\nabla g_j(x)^T d \le \alpha$ for each $j \in I$
  • $-1 \le d_i \le 1$ where $i = 1, \dots, n$
  • This Linear Programming problem now has n+1
    variables (n elements of vector d plus scalar $\alpha$)
Next Step: Constrained Line Search
  • The idea behind feasible direction methods is to
    take steps through the feasible region of the
    form
  • $x_{k+1} = x_k + \alpha_k d_k$
  • where $d_k$ is a direction vector and $\alpha_k$ is a
    nonnegative scalar.
  • Given that we have $d_k$, next we need to know how
    far to move along $d_k$.
  • The scalar $\alpha_k$ is chosen to minimize the objective
    function with the restriction that the point
    $x_{k+1}$ and the line segment joining $x_k$ and $x_{k+1}$ be
    feasible.
  • IMPORTANT: Note that while moving along $d_k$, we
    may encounter constraints that were inactive but
    now become active.
  • Thus, we do need to do a constrained line search
    to find the maximum step size $\alpha_u$.
  • Approach in textbook
  • Determine maximum step size based on bounds of
    the variables.
  • If all constraints are satisfied at the variable
    bounds, take this maximum step size as the step.
  • Otherwise, search along $d_k$ until you find the
    first constraint that causes infeasibility.
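
For linear constraints A x <= b the maximum step can be computed in closed form with a ratio test, after which a one-dimensional search picks the step. The sketch below assumes linear constraints and a unimodal objective along the line; it is not the textbook's procedure, and the names max_feasible_step and golden_section are hypothetical.

```python
import math

def max_feasible_step(A, b, x, d, cap=math.inf):
    """Largest alpha with A (x + alpha d) <= b, assuming A x <= b already holds."""
    alpha_u = cap
    for a_i, b_i in zip(A, b):
        slope = sum(a * dj for a, dj in zip(a_i, d))
        if slope > 0.0:                    # only constraints we move toward can block
            slack = b_i - sum(a * xj for a, xj in zip(a_i, x))
            alpha_u = min(alpha_u, slack / slope)
    return alpha_u

def golden_section(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal function phi on the finite interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        if phi(m1) < phi(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical example: constraints x1 <= 2 and x1 + x2 <= 3, start at the origin.
A = [[1.0, 0.0], [1.0, 1.0]]
b = [2.0, 3.0]
x_k, d_k = [0.0, 0.0], [1.0, 0.0]
f = lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2

alpha_u = max_feasible_step(A, b, x_k, d_k)
phi = lambda alpha: f([xi + alpha * di for xi, di in zip(x_k, d_k)])
alpha_k = golden_section(phi, 0.0, alpha_u)
```

If no constraint blocks the direction, max_feasible_step returns its cap, so a finite cap is needed before the interval search can be used.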

Major Shortcomings
  • Two major shortcomings of feasible direction
    methods that require modification of the methods
    in most cases
  • 1) For general problems, there may not exist any
    feasible direction. (example??)
  • In such cases, either
  • relax definition of feasibility or allow
    points to deviate, or
  • introduce concept of moving along curves
    rather than straight lines.
  • 2) Feasible direction methods can be subject to
    jamming a.k.a. zigzagging, that is, it does not
    converge to a constrained local minimum.
  • In Zoutendijk's method, this can be caused
    because the method for finding a feasible
    direction changes if another constraint becomes
    active.

Gradient Projection Methods
Basic Problem Formulation
  • Gradient projection started from the nonlinear
    optimization problem with linear constraints
  • Min $f(x)$
  • S.t.
  • $a_r^T x \le b_r$
  • $a_s^T x = b_s$

Gradient Projection Methods
  • Gradient projection method is motivated by the
    ordinary methods of steepest descent for
    unconstrained problems.
  • Fundamental Concept: The negative gradient of
    the objective function is projected on the
    working surface (subset of active constraints) in
    order to define the direction of movement.
  • Major task is to calculate projection matrix (P)
    and subsequent feasible direction vector d.
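
For linear constraints, a common textbook form of the projection matrix is P = I - A^T (A A^T)^{-1} A, where the rows of A are the gradients of the constraints in the working set, and the direction is d = -P grad f. The sketch below assumes NumPy and linearly independent rows of A; the function name projected_direction and the data are hypothetical.

```python
import numpy as np

def projected_direction(A_w, grad_f):
    """Project the negative gradient onto the subspace {d : A_w d = 0}."""
    A_w = np.atleast_2d(np.asarray(A_w, dtype=float))
    grad_f = np.asarray(grad_f, dtype=float)
    n = grad_f.size
    # P = I - A^T (A A^T)^{-1} A, using a linear solve instead of an explicit inverse.
    P = np.eye(n) - A_w.T @ np.linalg.solve(A_w @ A_w.T, A_w)
    return P, -P @ grad_f

# Hypothetical example: one working constraint x1 + x2 + x3 = const.
A_w = np.array([[1.0, 1.0, 1.0]])
grad_f = np.array([1.0, 2.0, 3.0])
P, d = projected_direction(A_w, grad_f)
```

Because A_w d = 0, a step along d stays on the working surface when the constraints are linear.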

Feasible Direction Vector and Projection Matrix
Nonlinear Constraints
For the general case of min $f(x)$ s.t. $h(x) = 0$,
$g(x) \le 0$, the basic idea is that at a
feasible point $x_k$ one determines the active
constraints and projects the negative gradient of
$f$ onto the subspace tangent to the surface
determined by these constraints. This vector (if
nonzero) determines the direction for the next
step. However, this vector is in general not a
feasible direction since the working surface may
be curved. Therefore, it may not be possible to
move along this projected negative gradient to
obtain the next point.
Overcoming Curvature Difficulties
  • What is typically done to overcome the problem of
    curvature and loss of feasibility is to search
    along a curve along the constraint surface, the
    direction of the search being defined by the
    projected negative gradient.
  • A new point is found as follows
  • First, a move is made along the projected
    gradient to a point y.
  • Then a move is made in the direction
    perpendicular to the tangent plane at the
    original point to a nearby feasible point on the
    working set.
  • Once this point is found, the value of the
    objective function is determined.
  • This is repeated with various y's until a
    feasible point is found that satisfies the
    descent criteria for improvement relative to the
    original point.

Difficulties and Complexities
  • The movement away from the feasible region and
    then coming back introduces difficulties that
    require a series of interpolations and
    nonlinear equation solutions for their
    resolution, because
  • 1) You first have to get back in the feasible
    region, and
  • 2) next, you have to find a point on the active
    set of constraints.
  • Thus, a satisfactory gradient projection method
    is quite complex.
  • Computation of the nonlinear projection matrix is
    also more time consuming than for linear
    constraints.

Nevertheless, the gradient projection method has
been successfully implemented and found to be
effective (your book says otherwise). However, all
the extra features needed to maintain feasibility
require skill to implement.
(Generalized) Reduced Gradient Method
Reduced Gradient Method
  • Reduced gradient method is closely related to
    simplex LP method because variables are split
    into basic and non-basic groups.
  • From a theoretical viewpoint, the method behaves
    very much like the gradient projection method.
  • Like gradient projection method, it can be
    regarded as a steepest descent method applied on
    the surface defined by the active constraints.
  • Reduced gradient method seems to be better than
    gradient projection methods.

Dependent and Independent Variables
Consider min $f(x)$ s.t. $Ax = b$, $x \ge 0$.
Partition variables into two groups $x = (y, z)$,
where $y$ has dimension $m$ and $z$ has dimension $n-m$.
This partition is formed such that all variables
in $y$ are strictly positive. Now, the original
problem can be expressed as min $f(y, z)$ s.t.
$By + Cz = b$, $y \ge 0$, $z \ge 0$ (with, of
course, $A = [B, C]$). Key notion is that if $z$ is
specified (independent variables), then $y$ (the
dependent variables) can be uniquely solved.
NOTE: $y$ and $z$ are dependent. Because of this
dependency, if we move $z$ along the line $z + \alpha \Delta z$,
then $y$ will have to move along a corresponding
line $y + \alpha \Delta y$.
Dependent variables $y$ are also referred to as
basic variables. Independent variables $z$ are also
referred to as non-basic variables.
The Reduced Gradient
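
For the linearly constrained problem with the partition $By + Cz = b$, the standard textbook expression for the reduced gradient can be sketched as follows. It assumes $B$ is nonsingular and writes gradients as row vectors; the symbol $r$ matches its use in the algorithm slides.

```latex
% Eliminate the dependent variables y, then differentiate f(y(z), z) with respect to z.
\begin{align*}
By + Cz = b \;&\Longrightarrow\; y = B^{-1} b - B^{-1} C z, \\
r^T &= \nabla_z f(y, z) - \nabla_y f(y, z)\, B^{-1} C, \\
\Delta y &= -B^{-1} C \, \Delta z .
\end{align*}
```

The last line states that any change in $z$ forces a matching change in $y$ so that the equality constraints stay satisfied.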
Generalized Reduced Gradient
  • The generalized reduced gradient solves nonlinear
    programming problems in the standard form
  • minimize $f(x)$
  • subject to $h(x) = 0$, $a \le x \le b$
  • where h(x) is of dimension m.

The GRG algorithm works similarly to the linearly
constrained case. However, it is also plagued by
the same problems as gradient projection methods
regarding maintaining feasibility. A well-known
implementation is the GRG2 software.
Movement Basics
  • Basic idea in GRG is to search with z variables
    along reduced gradient r for improvement of
    objective function and use y variables to
    maintain feasibility.
  • If some $z_i$ is at its bound (see Eq. 5.62), then
    set the search direction for that variable $d_i = 0$,
    depending on the sign of the reduced gradient.
  • Reason: you do not want to violate a variable's
    bound. Thus, that variable is fixed by not
    allowing it to change ($d_i = 0$ means it now has no
    effect on $f(x)$).
  • Search $x_{k+1} = x_k + \alpha d$ with $d = [d_y, d_z]$ (column
    vector).
  • If constraints are linear, then the new point is
    (automatically) feasible.
  • See derivation on page 177: constraints and
    objective function are combined in the reduced
    gradient.
  • If constraint(s) are non-linear, you have to
    adjust $y$ with some $\Delta y$ to get back to feasibility.
  • Different techniques exist, but basically it is
    equivalent to an unconstrained optimization
    problem that minimizes constraint violation.
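
One common way to restore feasibility is a Newton iteration on h(y, z) = 0 with z held fixed. The sketch below shows that single technique under stated assumptions, not the approach of any particular GRG code: the Jacobian of h with respect to y must be nonsingular, and the names restore_feasibility, h, and jac_y are hypothetical.

```python
import numpy as np

def restore_feasibility(h, jac_y, y, z, tol=1e-10, max_iter=20):
    """Adjust the dependent variables y until h(y, z) = 0, keeping z fixed."""
    y = np.asarray(y, dtype=float).copy()
    for _ in range(max_iter):
        residual = np.atleast_1d(h(y, z))
        if np.linalg.norm(residual) < tol:
            return y
        # Newton correction: solve (dh/dy) * delta_y = -h.
        y = y - np.linalg.solve(np.atleast_2d(jac_y(y, z)), residual)
    raise RuntimeError("Newton correction did not converge")

# Hypothetical example: one constraint, the unit circle y^2 + z^2 = 1.
h = lambda y, z: np.array([y[0] ** 2 + z[0] ** 2 - 1.0])
jac_y = lambda y, z: np.array([[2.0 * y[0]]])
y_new = restore_feasibility(h, jac_y, y=[0.9], z=[0.6])
```

If the iteration fails to converge, practical codes usually shorten the step along the search direction and try again.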

Changing Basis
  • Picking a basis is sometimes poorly discussed in
  • Some literally only say pick a set z and y
  • Your textbook provides a method based on Gaussian
    elimination on pages 180-181 that is done every
    iteration of the method in your textbook.
  • Other (recent) implementations favor changing
    basis only when a basic variable reaches zero (or
    equivalently, its upper or lower bound) since
    this saves recomputation of $B^{-1}$.
  • Thus, if a dependent variable (y) becomes zero,
    then this zero-valued dependent (basic) variable
    is declared independent and one of the strictly
    positive independent variables is made dependent.
  • Either way, this is analogous to an LP pivot

Reduced Gradient Algorithm (one implementation)
  • 0) Pick a set of independent (z) and dependent (y)
    variables.
  • 1) Let $\Delta z_i = -r_i$ if $r_i < 0$ or $z_i > 0$. Otherwise let
    $\Delta z_i = 0$.
  • 2) If $\Delta z = 0$, then stop because the current point is
    the solution.
  • Otherwise, find $\Delta y = -B^{-1} C \Delta z$.
  • 3) Find $\alpha_1$, $\alpha_2$, $\alpha_3$ achieving, respectively,
  • $\max \{ \alpha_1 : y + \alpha_1 \Delta y \ge 0 \}$
  • $\max \{ \alpha_2 : z + \alpha_2 \Delta z \ge 0 \}$
  • $\min \{ f(x + \alpha_3 \Delta x) : 0 \le \alpha_3 \le \alpha_1, \; 0 \le \alpha_3 \le \alpha_2 \}$
  • Let $x = x + \alpha_3 \Delta x$.
  • 4) If $\alpha_3 < \alpha_1$, return to (1). Otherwise, declare
    the vanishing variable in the dependent set (y)
    independent and declare a strictly positive
    variable in the independent set (z) dependent
    (pivot operation). Update B and C.
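
The direction-finding part of this algorithm can be sketched as below. It assumes NumPy, the linearly constrained form By + Cz = b with nonnegative variables, and the rule that a non-basic variable moves when its reduced gradient is negative or the variable is strictly positive. The line search for the third step size and the pivot are left out, and the function names and example data are hypothetical.

```python
import numpy as np

def max_step(v, dv):
    """Largest alpha with v + alpha * dv >= 0 (infinite if nothing decreases)."""
    decreasing = dv < 0
    return np.min(-v[decreasing] / dv[decreasing]) if decreasing.any() else np.inf

def reduced_gradient_step(B, C, y, z, grad_y, grad_z, tol=1e-12):
    """Reduced gradient, search direction, and step bounds for one iteration."""
    Binv_C = np.linalg.solve(B, C)              # B^{-1} C without forming B^{-1}
    r = grad_z - Binv_C.T @ grad_y              # reduced gradient
    dz = np.where((r < 0) | (z > tol), -r, 0.0) # variables at a bound may only increase
    dy = -Binv_C @ dz                           # keeps B y + C z = b
    return r, dy, dz, max_step(y, dy), max_step(z, dz)

# Hypothetical data: f = x1^2 + x2^2 + x3^2 + x4^2 - 2 x1 - 3 x4,
# constraints 2 x1 + x2 + x3 + 4 x4 = 7 and x1 + x2 + 2 x3 + x4 = 6,
# current point x = (2, 2, 1, 0) with y = (x1, x2) and z = (x3, x4).
B = np.array([[2.0, 1.0], [1.0, 1.0]])
C = np.array([[1.0, 4.0], [2.0, 1.0]])
y, z = np.array([2.0, 2.0]), np.array([1.0, 0.0])
grad_y = np.array([2.0 * y[0] - 2.0, 2.0 * y[1]])
grad_z = np.array([2.0 * z[0], 2.0 * z[1] - 3.0])
r, dy, dz, a1, a2 = reduced_gradient_step(B, C, y, z, grad_y, grad_z)
```

Here the first step bound is finite because one basic variable decreases along the direction, so a pivot would be needed if the line search ran all the way to that bound.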

Note that your book has a slightly different
implementation on page 181!
  • GRG method can be quite complex.
  • Also note that inequality constraints have to be
    converted to equalities first through slack and
    surplus variables.
  • GRG2 is a very well known implementation.