Nonlinear Programming - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Nonlinear Programming


1
(No Transcript)
2
Tier I Mathematical Methods of Optimization
  • Section 3
  • Nonlinear Programming

3
Introduction to Nonlinear Programming
  • We already talked about a basic aspect of
    nonlinear programming (NLP) in the Introduction
    Chapter when we considered unconstrained
    optimization.

4
Introduction to Nonlinear Programming
  • We optimized one-variable nonlinear functions
    using the 1st and 2nd derivatives.
  • We will use the same concept here extended to
    functions with more than one variable.

5
Multivariable Unconstrained Optimization
  • For functions with one variable, we use the 1st
    and 2nd derivatives.
  • For functions with multiple variables, we use
    analogous information: the gradient and the
    Hessian.
  • The gradient is the vector of first derivatives
    with respect to all of the variables, whereas the
    Hessian is the matrix equivalent of the second
    derivative.

6
The Gradient
  • Review of the gradient (∇)
  • For a function f of variables x1, x2, …, xn:
    ∇f = [ ∂f/∂x1  ∂f/∂x2  …  ∂f/∂xn ]ᵀ

Example
7
The Hessian
  • The Hessian (?2) of f(x1, x2, , xn) is

8
Hessian Example
  • Example (from previously)
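As a rough illustration (not from the slides), the gradient and Hessian can also be approximated numerically. The sketch below uses central finite differences on a hypothetical two-variable function; the function, step sizes, and helper names are assumptions for illustration only.

```python
# Central-difference approximations of the gradient and Hessian
# for a hypothetical function f(x1, x2) = x1^2 + x1*x2 + x2^2.

def f(x):
    return x[0]**2 + x[0]*x[1] + x[1]**2

def gradient(f, x, h=1e-5):
    # g_i ~ (f(x + h e_i) - f(x - h e_i)) / (2h)
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def hessian(f, x, h=1e-4):
    # H_ij ~ (f(++) - f(+-) - f(-+) + f(--)) / (4 h^2)
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp, xpm, xmp, xmm = list(x), list(x), list(x), list(x)
            xpp[i] += h; xpp[j] += h
            xpm[i] += h; xpm[j] -= h
            xmp[i] -= h; xmp[j] += h
            xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * h * h)
    return H
```

For the sample function the analytic gradient is [2x1 + x2, x1 + 2x2] and the Hessian is [[2, 1], [1, 2]], which the finite differences reproduce closely.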

9
Unconstrained Optimization
  • The optimization procedure for multivariable
    functions is
  • Set the gradient of the function equal to zero,
    ∇f(x) = 0, and solve to obtain candidate points.
  • Obtain the Hessian of the function and evaluate
    it at each of the candidate points.
  • If the result is positive definite (defined
    later) then the point is a local minimum.
  • If the result is negative definite (defined
    later) then the point is a local maximum.

10
Positive/Negative Definite
  • A matrix is positive definite if all of the
    eigenvalues of the matrix are positive (> 0)
  • A matrix is negative definite if all of the
    eigenvalues of the matrix are negative (< 0)

11
Positive/Negative Semi-definite
  • A matrix is positive semi-definite if all of
    the eigenvalues are non-negative (≥ 0)
  • A matrix is negative semi-definite if all of
    the eigenvalues are non-positive (≤ 0)
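A minimal sketch of this classification for a symmetric 2x2 matrix, where the eigenvalues have a closed form; the helper names are made up for illustration, and larger matrices would use a numerical eigensolver (e.g. numpy.linalg.eigvalsh).

```python
import math

def eigenvalues_2x2(a, b, c):
    # eigenvalues of the symmetric matrix [[a, b], [b, c]]:
    # mean of the diagonal plus/minus the radius of the spectrum
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius

def classify(a, b, c):
    lo, hi = eigenvalues_2x2(a, b, c)
    if lo > 0:
        return "positive definite"
    if hi < 0:
        return "negative definite"
    if lo >= 0:
        return "positive semi-definite"
    if hi <= 0:
        return "negative semi-definite"
    return "indefinite"
```

For example, [[2, 1], [1, 2]] has eigenvalues 1 and 3 and is classified positive definite, while [[1, 1], [1, 1]] has eigenvalues 0 and 2 and is only positive semi-definite.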

12
Example Matrix
  • Given the matrix A

The eigenvalues of A are
This matrix is negative definite
13
Unconstrained NLP Example
  • Consider the problem
  • Minimize f(x1, x2, x3) = (x1)^2 + x1(1 − x2) +
    (x2)^2 − x2x3 + (x3)^2 + x3
  • First, we find the gradient with respect to each xi:
    ∂f/∂x1 = 2x1 + 1 − x2
    ∂f/∂x2 = −x1 + 2x2 − x3
    ∂f/∂x3 = −x2 + 2x3 + 1

14
Unconstrained NLP Example
  • Next, we set the gradient equal to zero:
    2x1 + 1 − x2 = 0
    −x1 + 2x2 − x3 = 0
    −x2 + 2x3 + 1 = 0

So, we have a system of 3 equations and 3
unknowns. When we solve, we get x* = (−1, −1, −1).
15
Unconstrained NLP Example
  • So we have only one candidate point to check.
  • Find the Hessian:
    ∇²f(x) = [  2  −1   0
               −1   2  −1
                0  −1   2 ]

16
Unconstrained NLP Example
  • The eigenvalues of this matrix are 2 − √2 ≈ 0.586,
    2, and 2 + √2 ≈ 3.414.

All of the eigenvalues are > 0, so the Hessian is
positive definite.
So, the point is a minimum.
17
Unconstrained NLP Example
  • Unlike in linear programming, unless we know the
    shape of the function being minimized, or can
    determine whether it is convex, we cannot tell
    whether this point is the global minimum or
    whether there are function values smaller than it.

18
Method of Solution
  • In the previous example, when we set the gradient
    equal to zero, we had a system of 3 linear
    equations in 3 unknowns.
  • For other problems, these equations could be
    nonlinear.
  • Thus, the problem can become trying to solve a
    system of nonlinear equations, which can be very
    difficult.

19
Method of Solution
  • To avoid this difficulty, NLP problems are
    usually solved numerically.
  • We will now look at examples of numerical methods
    used to find the optimum point for
    single-variable NLP problems. These and other
    methods may be found in any numerical methods
    reference.

20
Newton's Method
  • When solving the equation f′(x) = 0 to find a
    minimum or maximum, one can use the iteration
    step
    x_{k+1} = x_k − f′(x_k) / f″(x_k)

where k is the current iteration. Iteration is
continued until |x_{k+1} − x_k| < ε, where ε is some
specified tolerance.
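The iteration above can be sketched in a few lines; f1 and f2 stand for the first and second derivatives (names assumed), and the sample function in the usage note is illustrative.

```python
def newton(f1, f2, x0, eps=1e-10, max_iter=100):
    # f1, f2: first and second derivatives of the function being optimized
    x = x0
    for _ in range(max_iter):
        x_new = x - f1(x) / f2(x)      # x_{k+1} = x_k - f'(x_k)/f''(x_k)
        if abs(x_new - x) < eps:       # |x_{k+1} - x_k| < eps
            return x_new
        x = x_new
    return x
```

For example, minimizing f(x) = x^4 − 3x^2 + x means solving f′(x) = 4x^3 − 6x + 1 = 0 with f″(x) = 12x^2 − 6; starting from x0 = 1.0 the iteration converges to a stationary point with f″ > 0, i.e. a local minimum.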
21
Newton's Method Diagram
(Figure: the tangent of f′(x) at x_k crosses the x-axis at x_{k+1}, which lies closer to the solution x*.)
  • Newton's Method approximates f′(x) as a straight
    line at x_k and obtains a new point, x_{k+1}, which
    is used to approximate the function at the next
    iteration. This is carried on until the new point
    is sufficiently close to x*.

22
Newton's Method Comments
  • One must ensure that f(x_{k+1}) < f(x_k) for
    finding a minimum, and f(x_{k+1}) > f(x_k) for
    finding a maximum.
  • Disadvantages:
  • Both the first and second derivatives must be
    calculated.
  • The initial guess is very important: if it is
    not close enough to the solution, the method may
    not converge.

23
Regula-Falsi Method
  • This method requires two points, xa and xb, that
    bracket the solution to the equation f′(x) = 0:
    xc = xb − f′(xb)(xb − xa) / (f′(xb) − f′(xa))

where xc will be between xa and xb. The next
interval will be xc and either xa or xb,
whichever has a derivative value with sign opposite
to f′(xc).
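A sketch of the method, assuming the bracketing points and tolerance are supplied by the caller; f1 stands for the derivative f′ whose root is sought.

```python
def regula_falsi(f1, xa, xb, eps=1e-10, max_iter=200):
    fa, fb = f1(xa), f1(xb)
    assert fa * fb < 0, "xa and xb must bracket the solution"
    xc = xa
    for _ in range(max_iter):
        # interpolate the straight line through (xa, fa) and (xb, fb)
        xc = xb - fb * (xb - xa) / (fb - fa)
        fc = f1(xc)
        if abs(fc) < eps:
            break
        if fa * fc < 0:          # solution lies between xa and xc
            xb, fb = xc, fc
        else:                    # solution lies between xc and xb
            xa, fa = xc, fc
    return xc
```

For instance, minimizing f(x) = (x − 2)^2 means solving f′(x) = 2x − 4 = 0 on a bracket such as [0, 5], which yields x = 2.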
24
Regula-Falsi Diagram
(Figure: the straight line through (xa, f′(xa)) and (xb, f′(xb)) crosses the x-axis at xc, near the solution x*.)
  • The Regula-Falsi method approximates the function
    f′(x) as a straight line and interpolates to
    find the root.

25
Regula-Falsi Comments
  • This method requires initial knowledge of two
    points bounding the solution.
  • However, it does not require the calculation of
    the second derivative.
  • The Regula-Falsi Method usually requires slightly
    more iterations to converge than Newton's Method.

26
Multivariable Optimization
  • Now we will consider unconstrained multivariable
    optimization.
  • Nearly all multivariable optimization methods do
    the following:
  • Choose a search direction d_k
  • Minimize along that direction to find a new point:
    x_{k+1} = x_k + a_k d_k

where k is the current iteration number and a_k
is a positive scalar called the step size.
27
The Step Size
  • The step size, a_k, is calculated in the following
    way:
  • We want to minimize the function f(x_{k+1}) =
    f(x_k + a_k d_k), where the only variable is a_k,
    because x_k and d_k are known.
  • We set df(x_k + a_k d_k)/da_k = 0 and solve for
    a_k using a single-variable solution method such
    as the ones shown previously.
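As a sketch, the step-size subproblem can also be solved directly by a one-dimensional search on phi(a) = f(x_k + a d_k). Here golden-section search is used in place of the derivative-based methods above; the bracket [0, a_max] and the helper names are assumptions.

```python
def line_search(f, x, d, a_max=10.0, tol=1e-8):
    # phi(a) = f(x + a*d): the one-variable function of the step size
    phi = lambda a: f([xi + a * di for xi, di in zip(x, d)])
    gr = (5 ** 0.5 - 1) / 2              # golden-section ratio, ~0.618
    lo, hi = 0.0, a_max
    while hi - lo > tol:
        a1 = hi - gr * (hi - lo)
        a2 = lo + gr * (hi - lo)
        if phi(a1) < phi(a2):
            hi = a2                      # minimum lies in [lo, a2]
        else:
            lo = a1                      # minimum lies in [a1, hi]
    return (lo + hi) / 2
```

For f(x) = x1^2 + x2^2 at x = (1, 1) with direction d = (−1, −1), phi(a) = 2(1 − a)^2, so the optimum step size is a = 1.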

28
Steepest Descent Method
  • This method is very simple: it uses the gradient
    (for maximization) or the negative gradient (for
    minimization) as the search direction:
    d_k = +∇f(x_k) for maximization
    d_k = −∇f(x_k) for minimization
So, x_{k+1} = x_k ± a_k ∇f(x_k)
29
Steepest Descent Method
  • Because the gradient is the rate of change of the
    function at that point, using the gradient (or
    negative gradient) as the search direction helps
    reduce the number of iterations needed

(Figure: contours f(x) = 5, f(x) = 20, and f(x) = 25 in the (x1, x2) plane; at x_k, ∇f(x_k) points toward increasing contours and −∇f(x_k) toward decreasing ones.)
30
Steepest Descent Method Steps
  • So the steps of the Steepest Descent Method are:
  • Choose an initial point x0
  • Calculate the gradient ∇f(x_k), where k is the
    iteration number
  • Calculate the search vector d_k = −∇f(x_k)
  • Calculate the next x: x_{k+1} = x_k + a_k d_k.
    Use a single-variable optimization method to
    determine a_k.

31
Steepest Descent Method Steps
  • To determine convergence, either use some given
    tolerance ε1 and evaluate ||∇f(x_{k+1})|| ≤ ε1
    for convergence
  • Or, use another tolerance ε2 and evaluate
    ||x_{k+1} − x_k|| ≤ ε2 for convergence
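The steps above can be sketched as follows. A backtracking rule stands in for the exact single-variable solve of a_k, and the example function and gradient use the signs as reconstructed in this transcript (an assumption).

```python
def steepest_descent(f, grad, x0, tol=1e-6, max_iter=5000):
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = sum(gi * gi for gi in g)
        if gnorm2 ** 0.5 < tol:          # convergence test on ||grad f||
            break
        d = [-gi for gi in g]            # steepest-descent direction
        # backtracking stands in for the exact solve of the step size a_k
        a, fx = 1.0, f(x)
        while f([xi + a * di for xi, di in zip(x, d)]) > fx - 1e-4 * a * gnorm2:
            a *= 0.5
        x = [xi + a * di for xi, di in zip(x, d)]
    return x

# the example function as reconstructed here (signs assumed)
f_ex = lambda x: (x[0]**2 + x[0]*(1 - x[1]) + x[1]**2
                  - x[1]*x[2] + x[2]**2 + x[2])
grad_ex = lambda x: [2*x[0] + 1 - x[1],
                     -x[0] + 2*x[1] - x[2],
                     -x[1] + 2*x[2] + 1]
```

Starting from (0, 0, 0), the iterates approach the analytic solution of the reconstructed problem, x* = (−1, −1, −1).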

32
Convergence
  • These two criteria can be used for any of the
    multivariable optimization methods discussed here
  • Recall: the norm of a vector x is given by
    ||x|| = √((x1)^2 + (x2)^2 + … + (xn)^2)
33
Steepest Descent Example
  • Let's solve the earlier problem with the Steepest
    Descent Method:
  • Minimize f(x1, x2, x3) = (x1)^2 + x1(1 − x2) +
    (x2)^2 − x2x3 + (x3)^2 + x3
  • Let's pick

34
Steepest Descent Example
Now, we need to determine a0
35
Steepest Descent Example
Now, set the derivative equal to zero and solve for a0
36
Steepest Descent Example
  • So,

37
Steepest Descent Example
  • Take the negative gradient to find the next
    search direction

38
Steepest Descent Example
Update the iteration formula
39
Steepest Descent Example
Insert into the original function and take the
derivative so that we can find a1
40
Steepest Descent Example
Now we can set the derivative equal to zero and
solve for a1
41
Steepest Descent Example
  • Now, calculate x2

42
Steepest Descent Example
  • So,

43
Steepest Descent Example
  • Find a2

Set the derivative equal to zero and solve
44
Steepest Descent Example
  • Calculate x3

45
Steepest Descent Example
Find the next search direction
46
Steepest Descent Example
  • Find a3

47
Steepest Descent Example
  • So, x4 becomes

48
Steepest Descent Example
The next search direction
49
Steepest Descent Example
  • Find a4

50
Steepest Descent Example
  • Update x5

51
Steepest Descent Example
  • Let's check to see if the convergence criterion is
    satisfied
  • Evaluate ∇f(x5)

52
Steepest Descent Example
  • So, ||∇f(x5)|| = 0.0786, which is very small, and
    we can take it to be close enough to zero for our
    example
  • Notice that the answer of

is very close to the value that we obtained
analytically
53
Quadratic Functions
  • Quadratic functions are important for the next
    method we will look at
  • A quadratic function can be written in the form
    xᵀQx, where x is the vector of variables and Q is
    a matrix of coefficients
  • Example

54
Conjugate Gradient Method
  • The Conjugate Gradient Method has the property
    that if f(x) is quadratic, it will take exactly n
    iterations to converge, where n is the number of
    variables in the x vector
  • Although it works especially well with quadratic
    functions, this method will work with
    non-quadratic functions also

55
Conjugate Gradient Steps
  • Choose a starting point x0 and calculate f(x0).
    Let d0 = −∇f(x0).
  • Calculate x1 using x1 = x0 + a0 d0. Find a0 by
    performing a single-variable optimization on
    f(x0 + a0 d0) using the methods discussed earlier.
    (See illustration after algorithm explanation)

56
Conjugate Gradient Steps
  • Calculate f(x1) and ∇f(x1). The new search
    direction is calculated using the equation
    d1 = −∇f(x1) + β0 d0, with
    β0 = [∇f(x1)ᵀ∇f(x1)] / [∇f(x0)ᵀ∇f(x0)]

This can be generalized for the kth iteration:
d_{k+1} = −∇f(x_{k+1}) + β_k d_k, with
β_k = [∇f(x_{k+1})ᵀ∇f(x_{k+1})] / [∇f(x_k)ᵀ∇f(x_k)]
57
Conjugate Gradient Steps
  • Use either of the two criteria discussed earlier
    to determine convergence:
    ||∇f(x_{k+1})|| ≤ ε1

Or,
    ||x_{k+1} − x_k|| ≤ ε2
58
Number of Iterations
  • For quadratic functions, this method will
    converge in n iterations (k = n)
  • For non-quadratic functions, after n iterations
    the algorithm cycles again, with d_{n+1} becoming d0.
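A sketch of the method on a quadratic f(x) = (1/2)xᵀQx − bᵀx, combining the Fletcher-Reeves β above with the exact quadratic step size derived on the following slides; for a quadratic it reaches the minimum in at most n steps. The helper names and the example data are assumptions.

```python
def matvec(Q, v):
    return [sum(qij * vj for qij, vj in zip(row, v)) for row in Q]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(Q, b, x0):
    # minimize (1/2) x^T Q x - b^T x, whose gradient is Qx - b
    x = list(x0)
    g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
    if dot(g, g) ** 0.5 < 1e-12:
        return x                                       # already optimal
    d = [-gi for gi in g]
    for _ in range(len(b)):                            # at most n steps
        a = -dot(g, d) / dot(d, matvec(Q, d))          # exact step size
        x = [xi + a * di for xi, di in zip(x, d)]
        g_new = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
        if dot(g_new, g_new) ** 0.5 < 1e-12:
            break                                      # converged early
        beta = dot(g_new, g_new) / dot(g, g)           # Fletcher-Reeves
        d = [-gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
    return x
```

With Q = [[2, −1, 0], [−1, 2, −1], [0, −1, 2]] and b = (−1, 0, −1), the exact solution of Qx = b is (−1, −1, −1), and the sketch reaches it within n = 3 iterations.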

59
Step Size for Quadratic Functions
  • When optimizing the step size, we can approximate
    the function to be optimized in the following
    manner:
    f(x_k + a d_k) ≈ f(x_k) + a ∇f(x_k)ᵀd_k +
    (a²/2) d_kᵀ Q d_k
  • For a quadratic function, this is not an
    approximation; it is exact.

60
Step Size for Quadratic Functions
  • We take the derivative of that function with
    respect to a and set it equal to zero:
    df/da = ∇f(x_k)ᵀd_k + a d_kᵀ Q d_k = 0

The solution to this equation is
    a_k = −∇f(x_k)ᵀd_k / (d_kᵀ Q d_k)
61
Step Size for Quadratic Functions
  • So, for the problem of optimizing a quadratic
    function,
    a_k = −∇f(x_k)ᵀd_k / (d_kᵀ Q d_k)
  • is the optimum step size.
  • For a non-quadratic function, this is an
    approximation of the optimum step size.

62
Multivariable Newton's Method
  • We can approximate the gradient of f at a point
    x0 by
    ∇f(x) ≈ ∇f(x0) + ∇²f(x0)(x − x0)

We can set the right-hand side equal to zero and
rearrange to give
    x = x0 − [∇²f(x0)]⁻¹ ∇f(x0)
63
Multivariable Newton's Method
  • We can generalize this equation to give an
    iterative expression for Newton's Method:
    x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k)

where k is the iteration number
64
Newton's Method Steps
  • Choose a starting point, x0
  • Calculate ∇f(x_k) and ∇²f(x_k)
  • Calculate the next x using the equation
    x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k)
  • Use either of the convergence criteria discussed
    earlier to determine convergence. If it hasn't
    converged, return to step 2.

65
Comments on Newton's Method
  • We can see that unlike the previous two methods,
    Newtons Method uses both the gradient and the
    Hessian
  • This usually reduces the number of iterations
    needed, but increases the computation needed for
    each iteration
  • So, for very complex functions, a simpler method
    is usually faster
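A sketch of the iteration: rather than inverting the Hessian, it solves the linear system ∇²f(x_k) s = −∇f(x_k) at each step by Gaussian elimination. The example data reuse the gradient and Hessian as reconstructed in this transcript (an assumption).

```python
def solve_linear(A, b):
    # Gaussian elimination with partial pivoting, for small dense systems
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def newton_nd(grad, hess, x0, tol=1e-8, max_iter=50):
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        s = solve_linear(hess(x), [-gi for gi in g])   # Newton step
        x = [xi + si for xi, si in zip(x, s)]
    return x

# example data as reconstructed in this transcript (signs assumed)
grad_ex = lambda x: [2*x[0] + 1 - x[1],
                     -x[0] + 2*x[1] - x[2],
                     -x[1] + 2*x[2] + 1]
hess_ex = lambda x: [[2.0, -1.0, 0.0],
                     [-1.0, 2.0, -1.0],
                     [0.0, -1.0, 2.0]]
```

Because the reconstructed example is quadratic, the sketch reaches x* = (−1, −1, −1) in a single Newton step, mirroring the slide's observation.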

66
Newton's Method Example
  • For an example, we will use the same problem as
    before:
  • Minimize f(x1, x2, x3) = (x1)^2 + x1(1 − x2) +
    (x2)^2 − x2x3 + (x3)^2 + x3

67
Newton's Method Example
  • The Hessian is
    ∇²f = [  2  −1   0
            −1   2  −1
             0  −1   2 ]

And we will need the inverse of the Hessian
68
Newton's Method Example
  • So, pick

Calculate the gradient for the 1st iteration
69
Newton's Method Example
  • So, the new x is

70
Newton's Method Example
  • Now calculate the new gradient

Since the gradient is zero, the method has
converged
71
Comments on Example
  • Because it uses the 2nd derivative, Newton's
    Method models quadratic functions exactly and can
    find the optimum point in one iteration.
  • If the function had been of higher order, the
    Hessian would not have been constant, and it would
    have been much more work to calculate the Hessian
    and take the inverse at each iteration.

72
Constrained Nonlinear Optimization
  • Previously in this chapter, we solved NLP
    problems that only had objective functions, with
    no constraints.
  • Now we will look at methods on how to solve
    problems that include constraints.

73
NLP with Equality Constraints
  • First, we will look at problems that only contain
    equality constraints
  • Minimize f(x), where x = (x1, x2, …, xn)ᵀ
  • Subject to hi(x) = bi, i = 1, 2, …, m

74
Illustration
  • Consider the problem
  • Minimize x1 + x2
  • Subject to (x1)^2 + (x2)^2 − 1 = 0
  • The feasible region is a circle with a radius of
    one. The possible objective function curves are
    lines with a slope of -1. The minimum will be
    the point where the lowest line still touches the
    circle.

75
Graph of Illustration
(Figure: the unit-circle feasible region with objective lines f(x) = 1, f(x) = 0, and f(x) = −1.414; the gradient of f points in the direction of increasing f.)
76
More on the Graph
  • Since the objective function contours are
    straight, parallel lines, the gradient of f is a
    constant vector pointing in the direction of
    increasing f, which is toward the upper right
  • The gradient of h points outward from the circle,
    so its direction will depend on the point at
    which the gradient is evaluated.

77
Further Details
(Figure: the feasible circle in the (x1, x2) plane, with the tangent plane drawn at a point and objective lines f(x) = 1, f(x) = 0, and f(x) = −1.414.)
78
Conclusions
  • At the optimum point, ∇f(x) is parallel to ∇h(x),
    i.e. perpendicular to the tangent plane of the
    constraint
  • As we can see at point x1, ∇f(x) is not parallel
    to ∇h(x), and we can move (down along the circle)
    to improve the objective function
  • We can say that at a max or min, ∇f(x) must be
    parallel to ∇h(x)
  • Otherwise, we could improve the objective
    function by changing position

79
First Order Necessary Conditions
  • So, in order for a point to be a minimum (or
    maximum), it must satisfy the following equation:
    ∇f(x) + λ ∇h(x) = 0

This equation means that ∇f(x) and ∇h(x) must
point along the same line, in exactly opposite
directions for λ > 0, at a minimum or maximum point
80
The Lagrangian Function
  • To help in using this fact, we introduce the
    Lagrangian function L(x, λ):
    L(x, λ) = f(x) + Σi λi hi(x)

Review: The notation ∇x f(x, y) means the gradient
of f with respect to x. So, ∇x L(x, λ) = ∇f(x) +
Σi λi ∇hi(x)
81
First Order Necessary Conditions
  • So, using the new notation to express the First
    Order Necessary Conditions (FONC): if x* is a
    minimum (or maximum), then
    ∇x L(x*, λ) = 0 and ∇λ L(x*, λ) = 0
82
First Order Necessary Conditions
  • Another way to think about it is that the one
    Lagrangian function includes all of the
    information about our problem
  • So, we can treat the Lagrangian as an
    unconstrained optimization problem with variables
    x1, x2, …, xn and λ1, λ2, …, λm.
  • We can solve it by solving the equations
    ∇x L(x, λ) = 0 and ∇λ L(x, λ) = 0
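As a sketch, the FONC can at least be verified numerically once candidate points are in hand. The example below checks ∇f + λ∇h = 0 and h = 0 for the earlier circle example (minimize x1 + x2 on the unit circle); the helper names are illustrative.

```python
import math

# circle example: minimize f = x1 + x2  subject to  h = x1^2 + x2^2 - 1 = 0
grad_f = lambda x: [1.0, 1.0]
grad_h = lambda x: [2.0 * x[0], 2.0 * x[1]]
h = lambda x: x[0]**2 + x[1]**2 - 1.0

def fonc_residual(x, lam):
    # largest violation of stationarity (grad f + lam * grad h = 0)
    # and feasibility (h = 0); zero means the FONC hold at (x, lam)
    gf, gh = grad_f(x), grad_h(x)
    stationarity = [a + lam * b for a, b in zip(gf, gh)]
    return max(abs(v) for v in stationarity + [h(x)])

# candidate points obtained by solving the FONC by hand
x_min = [-1/math.sqrt(2), -1/math.sqrt(2)]   # with lambda = +1/sqrt(2)
x_max = [ 1/math.sqrt(2),  1/math.sqrt(2)]   # with lambda = -1/sqrt(2)
```

At the minimizing candidate, f = x1 + x2 = −√2 ≈ −1.414, matching the objective line in the earlier graph.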

83
Using the FONC
  • Using the FONC for the previous example, the
    Lagrangian is
    L(x, λ) = x1 + x2 + λ[(x1)^2 + (x2)^2 − 1]

And the first FONC equation is
    ∇x L(x, λ) = 0
84
FONC Example
  • This becomes:
    1 + 2λx1 = 0
    1 + 2λx2 = 0

The feasibility equation is ∇λ L(x, λ) = 0, or
    (x1)^2 + (x2)^2 − 1 = 0
85
FONC Example
  • So, we have three equations and three unknowns.
  • When they are solved simultaneously, we obtain
    x1 = x2 = ±1/√2 ≈ ±0.707, with λ = ∓1/√2

We can see from the graph that positive x1 and x2
corresponds to the maximum, while negative x1 and x2
corresponds to the minimum.
86
FONC Observations
  • If you go back to the LP Chapter and look at the
    mathematical definition of the KKT conditions,
    you may notice that they look just like our FONC
    that we just used
  • This is because it is the same concept
  • We simply used a slightly different derivation
    this time but obtained the same result

87
Limitations of FONC
  • The FONC do not guarantee that the solution(s)
    will be minima or maxima.
  • As in the case of unconstrained optimization,
    they only provide us with candidate points that
    need to be verified by the second order
    conditions.
  • Only if the problem is convex do the FONC
    guarantee that the solutions will be extreme
    points.

88
Second Order Necessary Conditions (SONC)
  • For a point x* that satisfies the FONC, with
    multipliers λ,
  • and for every y where Jh(x*) y = 0:

If x* is a local minimum, then yᵀ ∇x²L(x*, λ) y ≥ 0
89
Second Order Sufficient Conditions (SOSC)
  • y can be thought of as lying in a tangent plane,
    as in the graphical example shown previously
  • Jh is just the matrix of the gradients of each
    h(x) equation, and we saw in the example that the
    tangent plane must be perpendicular to ∇h(x);
    that is why Jh(x*) y = 0
90
The y Vector
(Figure: in (x1, x2, x3) space, the tangent plane, containing all possible y vectors, passes through the point x*.)
  • The tangent plane is the location of all y
    vectors and passes through x*
  • It must be orthogonal (perpendicular) to ∇h(x*)

91
Maximization Problems
  • The previous definitions of the SONC and SOSC
    were for minimization problems
  • For maximization problems, the sense of the
    inequality sign is reversed
  • For maximization problems:
  • SONC: yᵀ ∇x²L y ≤ 0
  • SOSC: yᵀ ∇x²L y < 0 for all admissible y ≠ 0

92
Necessary & Sufficient
  • The necessary conditions are required for a point
    to be an extremum but even if they are satisfied,
    they do not guarantee that the point is an
    extremum.
  • If the sufficient conditions are true, then the
    point is guaranteed to be an extremum. But if
    they are not satisfied, this does not mean that
    the point is not an extremum.

93
Procedure
  • Solve the FONC to obtain candidate points.
  • Test the candidate points with the SONC.
  • Eliminate any points that do not satisfy the SONC.
  • Test the remaining points with the SOSC.
  • The points that satisfy them are minima/maxima.
  • For the points that do not satisfy them, we cannot
    say whether they are extreme points or not.

94
Problems with Inequality Constraints
  • We will consider problems such as:
  • Minimize f(x)
  • Subject to hi(x) = 0, i = 1, …, m
  •            gj(x) ≤ 0, j = 1, …, p

An inequality constraint gj(x) ≤ 0 is called
active at x* if gj(x*) = 0. Let the set I(x*)
contain all the indices of the active constraints
at x*:
gj(x*) = 0 for all j in the set I(x*)
95
Lagrangian for Equality & Inequality Constraint
Problems
  • The Lagrangian is written
    L(x, λ, μ) = f(x) + Σi λi hi(x) + Σj μj gj(x)
  • We use λ's for the equalities and μ's for the
    inequalities.

96
FONC for Equality & Inequality Constraints
  • For the general Lagrangian, the FONC become
    ∇x L(x, λ, μ) = 0
    hi(x) = 0, gj(x) ≤ 0, μj ≥ 0

and the complementary slackness condition
    μj gj(x) = 0 for each j
97
SONC for Equality & Inequality Constraints
  • The SONC (for a minimization problem) are
    yᵀ ∇x²L(x, λ, μ) y ≥ 0
  • where J(x) y = 0, as before.

This time, J(x) is the matrix of the gradients
of all the equality constraints and only the
inequality constraints that are active at x.
98
SOSC for Equality & Inequality Constraints
  • The SOSC for a minimization problem with equality
    and inequality constraints are
    yᵀ ∇x²L(x, λ, μ) y > 0 for all y ≠ 0 with J(x) y = 0
99
Generalized Lagrangian Example
  • Solve the problem:
  • Minimize f(x) = (x1 − 1)^2 + (x2)^2
  • Subject to h(x) = (x1)^2 + (x2)^2 + x1 + x2 = 0
  •            g(x) = x1 + (x2)^2 ≤ 0
  • The Lagrangian for this problem is
    L(x, λ, μ) = (x1 − 1)^2 + (x2)^2 +
    λ[(x1)^2 + (x2)^2 + x1 + x2] + μ[x1 + (x2)^2]

100
Generalized Lagrangian Example
  • The first order necessary conditions:
    ∂L/∂x1 = 2(x1 − 1) + λ(2x1 + 1) + μ = 0
    ∂L/∂x2 = 2x2 + λ(2x2 + 1) + 2μx2 = 0
    h(x) = (x1)^2 + (x2)^2 + x1 + x2 = 0
    μ g(x) = μ[x1 + (x2)^2] = 0

101
Generalized Lagrangian Example
  • Solving the 4 FONC equations, we get 2 solutions:

1) x(1) = (0, 0), with λ = 0 and μ = 2
and
2) x(2) = (−1, −1), with λ = −10/3 and μ = 2/3
102
Generalized Lagrangian Example
  • Now try the SONC at the 1st solution, x(1) = (0, 0)
  • Both h(x) and g(x) are active at this point (they
    both equal zero). So, the Jacobian is the
    gradient of both functions evaluated at x(1):
    J(x(1)) = [ 1  1
                1  0 ]

103
Generalized Lagrangian Example
  • The only solution to the equation J(x(1)) y = 0

is y = 0.
And the Hessian of the Lagrangian is
104
Generalized Lagrangian Example
  • So, the SONC equation is yᵀ ∇x²L y ≥ 0, which
    holds trivially here, because the only admissible
    y is y = 0.

This inequality is true, so the SONC is satisfied
for x(1), and it is still a candidate point.
105
Generalized Lagrangian Example
  • The SOSC equation is yᵀ ∇x²L y > 0 for all y ≠ 0
    with J(x) y = 0.
  • And we just calculated the left-hand side of that
    inequality to be zero. So, in our case for x(1),
    the strict inequality 0 > 0 fails.

So, the SOSC are not satisfied.
106
Generalized Lagrangian Example
  • For the second solution, x(2) = (−1, −1):
  • Again, both h(x) and g(x) are active at this
    point. So, the Jacobian is
    J(x(2)) = [ −1  −1
                 1  −2 ]

107
Generalized Lagrangian Example
  • The only solution to the equation J(x(2)) y = 0

is y = 0.
And the Hessian of the Lagrangian is
108
Generalized Lagrangian Example
  • So, the SONC equation is yᵀ ∇x²L y ≥ 0, which
    again holds trivially, because the only admissible
    y is y = 0.

This inequality is true, so the SONC is satisfied
for x(2), and it is still a candidate point.
109
Generalized Lagrangian Example
  • The SOSC equation is yᵀ ∇x²L y > 0 for all y ≠ 0
    with J(x) y = 0.
  • And we just calculated the left-hand side of that
    inequality to be zero. So, in our case for x(2),
    the strict inequality 0 > 0 fails.

So, the SOSC are not satisfied.
110
Example Conclusions
  • So, we can say that both x(1) and x(2) may be
    local minima, but we cannot be sure, because the
    SOSC are not satisfied at either point.

111
Numerical Methods
  • As you can see from this example, the most
    difficult step is to solve a system of nonlinear
    equations to obtain the candidate points.
  • Instead of taking gradients of functions,
    automated NLP solvers use various methods to
    change a general NLP into an easier optimization
    problem.
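One such transformation, sketched below under stated assumptions, is a quadratic-penalty method: the equality-constrained circle example from earlier is replaced by a sequence of unconstrained problems, each minimized here by a simple backtracking gradient descent. The penalty schedule, tolerances, and helper names are assumptions, not the slides' method.

```python
def penalty_solve(x0):
    # minimize x1 + x2  subject to  h(x) = x1^2 + x2^2 - 1 = 0,
    # via the penalized objective P(x) = x1 + x2 + rho * h(x)^2
    # with an increasing penalty parameter rho
    x = list(x0)
    for rho in (1.0, 10.0, 100.0, 1000.0, 10000.0):
        P = lambda v, r=rho: v[0] + v[1] + r * (v[0]**2 + v[1]**2 - 1.0)**2
        gradP = lambda v, r=rho: [
            1.0 + 4.0 * r * (v[0]**2 + v[1]**2 - 1.0) * v[0],
            1.0 + 4.0 * r * (v[0]**2 + v[1]**2 - 1.0) * v[1],
        ]
        for _ in range(20000):
            g = gradP(x)
            gnorm2 = g[0] * g[0] + g[1] * g[1]
            if gnorm2 ** 0.5 < 1e-9:
                break                      # this rho's subproblem is solved
            # backtracking (Armijo) line search on the penalized objective
            a, fx = 1.0, P(x)
            while P([x[0] - a * g[0], x[1] - a * g[1]]) > fx - 1e-4 * a * gnorm2:
                a *= 0.5
            x = [x[0] - a * g[0], x[1] - a * g[1]]
    return x
```

Starting from (−0.5, −0.5), the iterates approach the constrained minimum (−1/√2, −1/√2) ≈ (−0.707, −0.707) as rho grows, just as a general-purpose NLP solver would report.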

112
Excel Example
  • Let's solve the previous example with Excel:
  • Minimize f(x) = (x1 − 1)^2 + (x2)^2
  • Subject to h(x) = (x1)^2 + (x2)^2 + x1 + x2 = 0
  •            g(x) = x1 + (x2)^2 ≤ 0

113
Excel Example
  • We enter the objective function and constraint
    equations into the spreadsheet

114
Excel Example
  • Now, open the Solver dialog box under the Tools
    menu, specify the objective function value as
    the target cell, and choose the Min option. As it
    is written, A3 and B3 are the variable cells. And
    the constraints should be added: the equality
    constraint and the inequality constraint.

115
Excel Example
  • The solver box should look like the following

116
Excel Example
  • This is a nonlinear model, so unlike the examples
    in the last chapter, we won't choose the Assume
    Linear Model option.
  • Also, x1 and x2 are not specified to be positive,
    so we don't check the Assume Non-negative box.
  • If desired, the tolerance may be decreased to 0.1.

117
Excel Example
  • When we solve the problem, the spreadsheet
    doesn't change, because our initial guess of
    x1 = 0 and x2 = 0 is an optimum solution, as we
    found when we solved the problem analytically.

118
Excel Example
  • However, if we choose initial values of both x1
    and x2 as −1, we get the following solution

119
Conclusions
  • So, by varying the initial values, we can get
    both of the candidate points we found previously.
  • However, the NLP solver tells us that they are
    both local minimum points.

120
References
  • Material for this chapter has been taken from:
  • Optimization of Chemical Processes, 2nd Ed.
    Thomas F. Edgar, David M. Himmelblau, and
    Leon S. Lasdon.