Title: Smooth, Unconstrained Nonlinear Optimization
1 Smooth, Unconstrained Nonlinear Optimization
- Objective is nonlinear
- 6/16/05
2 References
- ADS Handout (optional Conmin handout)
- iSIGHT MDOL Reference Guide
- Optimization in Operations Research. Ronald Rardin. Prentice Hall, 1998.
- Numerical Optimization Techniques for Engineering Design. Garret Vanderplaats, 1999.
- The slides are copied directly from these references.
3 Unconstrained Optimization
- Introduces many concepts that we will build on in constrained optimization
  - Search direction calculation
  - Amount to move along S
- One form of constrained optimization is solving a series of unconstrained problems where constraint violations are added to the objective as penalties.
- Introduces mathematical concepts that allow some algorithms to claim a local optimum is a global optimum
4 Topic Plan
- Examples
- Regression
- Sam's Club without Half-Mile Restriction
- Smooth
- Unimodal
- Continuous first and second derivatives
- One dimensional search
- Derivatives and Conditions for Optimality
- Gradient approximations
- Gradient search
- ADS package
- Lab
5 Dell Computer Regression Example
6 Sam's Club Location
Choose a location for the next Sam's Club department store. The dots on the map below show the 3 population centers of the areas to be served. Population center 1 has 60,000 persons, center 2 has 20,000 and center 3 has 30,000. Locate the store to maximize business from the three populations. Experience shows that business attracted from any population follows a gravity pattern: proportional to the population and inversely proportional to 1 plus the square of its distance from the chosen location.
7 Sam's Club Unconstrained Optimization Model
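The model slide itself is not reproduced in this transcript. Below is a minimal sketch of the implied objective, assuming the gravity pattern population / (1 + distance^2) described on the previous slide; the center coordinates used here are hypothetical placeholders, not the map values.

    # Hypothetical population-center coordinates (the real map values are not in
    # this transcript); populations come from the slide: 60,000, 20,000, 30,000.
    centers = [((-1.0, 3.0), 60000.0), ((1.0, 0.0), 20000.0), ((4.0, 2.0), 30000.0)]

    def business(x1, x2):
        """Total business attracted: each center contributes pop / (1 + distance^2)."""
        total = 0.0
        for (cx, cy), pop in centers:
            d2 = (x1 - cx) ** 2 + (x2 - cy) ** 2
            total += pop / (1.0 + d2)
        return total

    # Maximize business(x1, x2) over the store location (x1, x2); equivalently,
    # minimize -business(x1, x2) with an unconstrained method.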
8 Vanderplaats Search Categorization
- First order (ADS): this lecture
- Second order (Newton): this lecture
- Zero order (no gradients): Hooke-Jeeves and Downhill Simplex, next lecture
9 Smooth Nonlinear Functions
A function f(x) is said to be smooth if it is continuous and differentiable at all relevant x. Otherwise it is nonsmooth. iSIGHT uses gradient approximations (it does not yet use automatic differentiation).
10 Are Your Computer Programs Smooth?
- Non-smooth programming constructs
- if
- switch
- abs
- max
- min
- floor
- ceiling
- integers
Codes only need to be smooth in the area of interest.
11 Improving Directions
Vector Δx is an improving direction at current solution x^(t) if the objective function value at x^(t) + λΔx is superior to that of x^(t) for all λ > 0 sufficiently small.
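A standard first-order check (not spelled out on this slide): for smooth f in a minimize problem, Δx is improving at x^(t) whenever

    \nabla f(x^{(t)})^{\mathsf{T}} \, \Delta x < 0,

and in a maximize problem whenever the inner product is positive; a zero inner product is inconclusive at first order.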
12 Determine Gradient of Sam's Club Optimization
Model at (2,0)
13 Gradient Approximations Using Forward Differences
Forward difference formula
Central difference formula
What should the step size ε be?
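The formula slides were images in the original deck; the standard forms, with step size ε and i-th unit vector e_i, are

    \frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + \varepsilon e_i) - f(x)}{\varepsilon}   (forward)

    \frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + \varepsilon e_i) - f(x - \varepsilon e_i)}{2\varepsilon}   (central)

Choosing ε trades truncation error (too large) against round-off cancellation (too small); a common rule of thumb for forward differences is ε on the order of the square root of machine precision times max(|x_i|, 1). In ADS this choice is governed by the FDCH/FDCHM parameters exercised later in the lab.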
14 Sam's Club Finite Differences
15 One-Dimensional Search
Methods: golden section or polynomial
16 Golden Section Search
17 (No Transcript)
18 (No Transcript)
19 Sam's Club Golden Section Example
20 Sam's Club Golden Section Example
21 Sam's Club Golden Section Example
22 Sam's Club Golden Section Example
23 Sam's Club Golden Section Example
24 Sam's Club Golden Section Example
25 Sample Exercise: Applying Golden Section Search
Beginning with the interval [0, 40], apply golden section search to the unconstrained nonlinear program. Continue until the interval containing the optimum has length < 10.
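The worked solution slides are not reproduced here. A minimal sketch of the method for minimization, assuming a placeholder objective since the exercise's nonlinear program is not shown in this transcript:

    import math

    GOLDEN = (math.sqrt(5) - 1) / 2  # ~0.618, the golden-section ratio

    def golden_section_min(f, lo, hi, tol):
        """Shrink [lo, hi] until its length is below tol; returns the final interval."""
        x1 = hi - GOLDEN * (hi - lo)
        x2 = lo + GOLDEN * (hi - lo)
        f1, f2 = f(x1), f(x2)
        while hi - lo >= tol:
            if f1 <= f2:                      # minimum lies in [lo, x2]
                hi, x2, f2 = x2, x1, f1
                x1 = hi - GOLDEN * (hi - lo)
                f1 = f(x1)
            else:                             # minimum lies in [x1, hi]
                lo, x1, f1 = x1, x2, f2
                x2 = lo + GOLDEN * (hi - lo)
                f2 = f(x2)
        return lo, hi

    # Usage on the exercise interval [0, 40] with a placeholder objective:
    lo, hi = golden_section_min(lambda x: (x - 12.0) ** 2, 0.0, 40.0, 10.0)

Each iteration reuses one interior point from the previous iteration, which is the reason for the 0.618 ratio: only one new function evaluation is needed per interval reduction.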
26 Golden Section Solution
27 Bracketing and 3-Point Patterns
Need a bracket to get low and high points for subsequent sectioning.
In one-dimensional search, a 3-point pattern is a collection of three decision variable values x^(lo) < x^(mid) < x^(hi) with the objective value at x^(mid) superior to that of the other two (greater for maximize, lesser for minimize).
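The bracketing slides that follow were images. A minimal sketch of one common bracketing scheme for minimization, stepping out with doubling step sizes until the objective turns back up (an illustration, not necessarily the deck's exact procedure):

    def bracket_min(f, x0, step=1.0):
        """Expand from x0 until the middle point is the lowest; returns a 3-point pattern."""
        x_lo, x_mid = x0, x0 + step
        if f(x_mid) > f(x_lo):            # wrong way: search in the other direction
            x_lo, x_mid = x_mid, x_lo
            step = -step
        x_hi = x_mid + step
        while f(x_hi) <= f(x_mid):        # keep doubling until the function rises again
            x_lo, x_mid = x_mid, x_hi
            step *= 2.0
            x_hi = x_mid + step
        return tuple(sorted((x_lo, x_mid, x_hi)))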
28 (No Transcript)
29 (No Transcript)
30 (No Transcript)
31 Sam's Club Bracketing Example
32 Sam's Club
33 Exercise
34 Exercise Solution
35 Quadratic Fit Search
The golden section search is reliable, but its slow and steady narrowing of the interval containing the optimum can require considerable computation.
A quadratic or polynomial fit can close in more rapidly.
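A minimal sketch of one quadratic-fit step: fit a parabola through a 3-point pattern and jump to its vertex (the deck's own formula slide is not reproduced here):

    def quadratic_fit_step(x1, x2, x3, f1, f2, f3):
        """Vertex of the parabola through (x1, f1), (x2, f2), (x3, f3), with x1 < x2 < x3."""
        num = (x2 - x1) ** 2 * (f2 - f3) - (x2 - x3) ** 2 * (f2 - f1)
        den = (x2 - x1) * (f2 - f3) - (x2 - x3) * (f2 - f1)
        return x2 - 0.5 * num / den   # den == 0 means the three points are collinear

In practice the new point replaces one of the three and the fit is repeated, typically with a golden-section fallback when the fit misbehaves.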
36 Optimum Computation
37 (No Transcript)
38 (No Transcript)
39 (No Transcript)
40 Derivatives, Taylor Series and Conditions for Local Optima
Unconstrained optimization is possible without derivatives (e.g., Hooke-Jeeves). However, if derivatives are available they can substantially accelerate the search progress.
Improving search paradigm
41 Local Information and Neighborhoods
The next move must be chosen using only experience with points already visited plus local information.
42 First Derivatives and Gradients
43 Second Derivatives and Hessian Matrices
44 Taylor Series Approximations with One Variable
45 Taylor Series Approximations with Multiple Variables
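The approximation slides were images; the standard first- and second-order forms they refer to are

    f(x^{(t)} + d) \approx f(x^{(t)}) + f'(x^{(t)}) d + \tfrac{1}{2} f''(x^{(t)}) d^2   (one variable)

    f(x^{(t)} + \Delta x) \approx f(x^{(t)}) + \nabla f(x^{(t)})^{\mathsf{T}} \Delta x + \tfrac{1}{2} \Delta x^{\mathsf{T}} \nabla^2 f(x^{(t)}) \Delta x   (multiple variables)

where ∇²f is the Hessian matrix of second partial derivatives.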
46 Approximating Hessian
47 Local Optima
48 Convex/Concave Functions and Global Optimality
49 Sample Exercise
50 Sample Exercise Solution
51 Sufficient Conditions for Unconstrained Global Optima
If f(x) is a convex function, every unconstrained local minimum of f is an unconstrained global minimum. If f(x) is concave, every unconstrained local maximum is an unconstrained global maximum.
Every stationary point of a smooth convex function is an unconstrained global minimum, and every stationary point of a smooth concave function is an unconstrained global maximum.
Both convex objective functions in minimize problems and concave objective functions in maximize problems are unimodal.
52 Gradient Search
53 Gradient Search Algorithm
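The algorithm slide itself is not reproduced in this transcript. A minimal sketch of the improving-search loop it describes, with a crude backtracking line search standing in for the one-dimensional search (ADS uses its own line-search options):

    def steepest_descent_min(f, grad, x, max_iter=50, tol=1e-6):
        """Repeatedly step along the negative gradient, choosing the step by backtracking."""
        for _ in range(max_iter):
            g = grad(x)
            if sum(gi * gi for gi in g) ** 0.5 < tol:     # near-zero gradient: stationary point
                break
            d = [-gi for gi in g]                          # steepest-descent direction
            lam, f0 = 1.0, f(x)
            while lam > 1e-12 and f([xi + lam * di for xi, di in zip(x, d)]) >= f0:
                lam *= 0.5                                 # shrink the step until f improves
            if lam <= 1e-12:
                break                                      # line search failed to improve f
            x = [xi + lam * di for xi, di in zip(x, d)]
        return x

    # Usage on a simple quadratic bowl:
    xmin = steepest_descent_min(
        lambda x: (x[0] - 1) ** 2 + 4 * (x[1] + 2) ** 2,
        lambda x: [2 * (x[0] - 1), 8 * (x[1] + 2)],
        [4.0, 4.0],
    )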
54 Spring Example
55 (No Transcript)
56 Steepest Ascent/Descent Property
Uses only neighborhood information. Past experience is not used.
Although gradient search may produce good initial progress, zigzagging as it approaches a stationary point makes the method too slow and unreliable to provide satisfactory results in many unconstrained nonlinear problems.
Can be seen as pursuing the move direction suggested by first-order Taylor.
57 Steepest Descent on Spring Example
58 Newton's Method
Improves over steepest descent by using a second-order Taylor series approximation.
The Newton step Δx, which moves to a stationary point of the second-order Taylor series approximation to f(x) at the current point x^(t), is obtained by solving the linear equation system.
Computing both first and second partial derivatives, plus solving a linear system of equations at each iteration, makes Newton's method computationally expensive as the decision vector becomes large.
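The linear equation system referred to above is, in standard notation,

    \nabla^2 f(x^{(t)}) \, \Delta x = -\nabla f(x^{(t)}),

after which the update is x^{(t+1)} = x^{(t)} + Δx.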
59 (No Transcript)
60 Quasi-Newton Methods
- Try to get some of the benefits of Newton methods at reduced cost.
- Conjugate Gradient Methods / Fletcher-Reeves
- BFGS
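For reference (the update itself is not shown in this transcript), the BFGS approximation B_k to the Hessian is revised each iteration using s_k = x^{(k+1)} - x^{(k)} and y_k = ∇f(x^{(k+1)}) - ∇f(x^{(k)}):

    B_{k+1} = B_k + \frac{y_k y_k^{\mathsf{T}}}{y_k^{\mathsf{T}} s_k} - \frac{B_k s_k s_k^{\mathsf{T}} B_k}{s_k^{\mathsf{T}} B_k s_k}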
61 Conjugate Gradient Method
Uses past information along with neighborhood information.
Requires only a simple modification to the steepest descent algorithm and yet dramatically improves the convergence rate.
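The simple modification, in the Fletcher-Reeves form, is to deflect each new steepest-descent direction by the previous search direction:

    d_{k+1} = -\nabla f(x^{(k+1)}) + \beta_k d_k, \qquad \beta_k = \frac{\lVert \nabla f(x^{(k+1)}) \rVert^2}{\lVert \nabla f(x^{(k)}) \rVert^2},

with d_0 taken as the plain steepest-descent direction.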
62 (No Transcript)
63 (No Transcript)
64 Newton
BFGS
65 Conmin
- Developed by Vanderplaats
- His subsequent work is ADS, DOT, VisualDOC
- Generic Conmin solves
  - Unconstrained problems using Conjugate Gradients (Fletcher-Reeves) or Steepest Descent.
  - Constrained problems using the Method of Feasible Directions.
- Does not support equality constraints
- Supports linear and nonlinear constraints
- Supports numerical gradient approximation
- IMHO it is in iSIGHT because a NASA client has confidence and experience in its use. I prefer ADS for three reasons:
  - Same author
  - 11 more years of development and experience went into it.
  - Supports equality constraints and many, many additional options. (Swiss Army Knife)
66 iSIGHT ADS
- iSIGHT Problem Formulation was largely influenced by ADS
- iSIGHT does not allow
  - Design variables without constraints
  - Unconstrained problems
- iSIGHT treats all constraints as nonlinear.
- A dual-sided constraint is translated into two < constraints.
- A constraint violation of 0.00 is allowed. Can be overridden through an API call, not through the GUI.
- iSIGHT parameter names and ADS names do not necessarily match, as ADS uses x1-xn.
67 iSIGHT Formulation
Design Variables, Objectives and Constraints
should be normalized
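A common convention (the transcript does not show iSIGHT's exact scaling) is to map each design variable onto [0, 1] using its bounds,

    \bar{x}_i = \frac{x_i - x_i^{\mathrm{lower}}}{x_i^{\mathrm{upper}} - x_i^{\mathrm{lower}}},

and to divide objectives and constraints by a representative magnitude so that their gradients are of comparable order.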
68 Constraints
api_SetDeltaForInEqualityConstraintViolation -
the iSIGHT default is 0.0
69 Side Constraints are always given
70 iSIGHT Parameter Table from a single evaluation
Task Process Status cannot be removed
71 ADS Manual
- I strongly encourage you to read it cover to cover.
- It is overwhelming for a novice, but you should have enough background to start to feel comfortable.
- ADS has an excellent output file that the user can easily filter from the iSIGHT log.
- Page 26, 25: Vanderplaats' recommendations on algorithms. (IMHO commercialization and training changed his decisions over the years.)
- iSIGHT prevents us from running unconstrained optimization. We will have to set up a Method of Feasible Directions and enforce Steepest Descent. (ICNDIR 1, Theta 0)
72 ADS Continued
- Great academic teaching tool
- Unconstrained: Steepest Descent, Fletcher-Reeves, BFGS
- Constrained: Exterior Penalty, MMFD, SLP, SQP
- One-dimensional line searches: golden section and polynomial
- We will focus on using ADS for steepest descent in this lab. We will use it in a follow-on lecture for exterior penalty.
73 Key Control Parameters
- Finite Difference Step Size
- Termination criterion
- Number of iterations
- Objective convergence (Absolute and Relative)
- Zero gradient.
- What is an iteration?????
- How many evaluations will it take?
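A rough count, assuming forward-difference gradients with n design variables and m evaluations per one-dimensional search, and reading one ADS iteration as one gradient computation plus one line search (an assumption to verify against the ADS output in the lab): total evaluations ≈ 1 + iterations × (n + m).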
74 ADS Customization for Steepest Descent
75 ADS Basic Parameters
76 ADS Advanced Parameters - 1
77 ADS Advanced Parameters - 2
78 ADS Diagnostics
- IPRINT 3552
- Can use the log window to control the printout.
- Why did ADS terminate? (Diagnostics, pages 51-52, points 3, 4, 5, 6)
- Need to look at the initial gradient values. If they are of different orders of magnitude, then you need to provide scaling.
79 Lab
- Objectives
  - Isolate ADS output by filtering to the log window.
  - Understand and analyze ADS output for unconstrained optimization.
  - Customize ADS to meet your needs with Tcl calls.
  - Run one iteration to analyze gradients.
  - Apply ADS to solve two unconstrained problems: Spring and NFL.
80 Unconstrained Spring Lab - Use (-4, 4) as the starting point
- A Spring_Start.desc file has been provided that implements a Calculation to calculate the potential energy of the spring. The code has already been coupled. Your tasks are to:
- Create an Optimization Plan called SteepestDescent. Have the plan made up of one optimization technique, Modified Method of Feasible Directions (MMFD).
- Customize the MMFD to use the Method of Feasible Directions with a Golden Section one-dimensional search by customizing the prolog with calls to api_SetTechniqueOption.
- Use the Advanced Parameters GUI to use steepest descent by setting ICNDIR and thetaz. Also set Print Level to give you all of the details. (See the ADS Manual for the proper values.)
- Bring up a detached log window and set View Filters to display only "all other types". This will show only the ADS-generated messages.
- Run the optimization. Upon completion do the following:
  - Bring up the Solution Monitor and open the db file. Scroll down the file; the column Internal has a value of 1 for gradient calculations and 2 for one-dimensional search. The row with a feasibility of 9 preceding the row with an Internal value of 1 contains the starting point of the iteration. Plot the start and end points of each iteration on slide 58 to verify that you can obtain the optimal solution.
  - Review the detached log file; you can also extract the starting value of each iteration from this file. Does it match the values from the Solution Monitor?
  - From the ADS messages, why did ADS terminate?
  - What was the final value of the objective and design variables?
  - Write down the total number of function evaluations. In a later task, you will compare the efficiency of Golden Section to Polynomial Interpolation.
  - Fill out the table on the next page with the gradient values for each iteration.
81 Spring Lab Continued
Iteration   Gradients of Objective Function   Objective
1
2
3
4
5
- Notice the decay rate of the gradients with each iteration. Is this what you expected?
- Rerun the same lab, but this time use a one-dimensional search of Polynomial Interpolation (IONED 7). How many function evaluations did it take? Is Golden Section or Polynomial Interpolation more efficient?
- An important aspect of any lab is to see if the problem is well scaled. Set up the problem to use Polynomial Interpolation but only have it run one iteration. The intent of doing this is to analyze the gradients to see if they are of the same order of magnitude and to conduct an experiment with FDCH to see if the default setting is appropriate. Run one iteration using the FDCH values listed on the next page and see what effect they have. (Note: We are looking for no change in the first three or four digits to indicate whether we need a tighter FDCH.)
82 Spring Lab Continued
FDCH     FDCHM     Gradients
.1       .0001
.01      .0001
.001     .0001
.0001    .0001
- Did you achieve a global optimum? List your
rationale.
83 NFL (see attachment for team names)
- This is a cool lab taken from Practical Management Science by Winston and Albright. It is a good example of a curve fit where we are trying to minimize the sum of squared errors between the actual scores and the predicted scores.
- Our prediction formula for making money is: Predicted margin = Home Team i rating - Visiting Team j rating + Home advantage (see the sketch after this list).
- The design variables are the team ratings for the 31 NFL teams and the scalar value of Home advantage. The objective is to minimize the sum of squared errors between the actual point spreads and the predicted point spreads based on data input for all of the NFL games in 1998. The intent is to enter data throughout the season so that we can make a fortune in the weekly football pools.
- You are to load the description file NFL_tcl.desc. Create an optimization plan to run ADS with a steepest descent unconstrained approach using polynomial search.
- Run the optimization and answer the following questions:
  - If Minnesota plays the Detroit Lions in Detroit on Thanksgiving, then what is the predicted point spread?
  - Why did ADS terminate?
  - What is the recommended FDCH?
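A minimal sketch of the curve-fit objective, using a hypothetical list of game records (home index, visitor index, actual home margin); the real 1998 data and the iSIGHT parameter names live in NFL_tcl.desc and are not reproduced here:

    # Each game record: (home_team_index, visiting_team_index, actual_home_margin)
    games = [(0, 1, 7.0), (2, 0, -3.0), (1, 2, 10.0)]   # hypothetical sample data

    def sum_squared_errors(ratings, home_advantage):
        """Sum over games of (actual margin - predicted margin)^2, where
        predicted margin = home rating - visitor rating + home advantage."""
        sse = 0.0
        for home, visitor, actual in games:
            predicted = ratings[home] - ratings[visitor] + home_advantage
            sse += (actual - predicted) ** 2
        return sse

    # Design variables: the 31 team ratings plus the scalar home advantage; an
    # unconstrained minimizer adjusts all 32 values to drive the error down.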
84 Rating9
Rating15