Chapter 11 Dynamic Programming

Slides: 83

Provided by: All57

1
Chapter 11 Dynamic Programming

by Dr. Peitsang Wu
Department of Industrial
Engineering and Management
I-Shou University

2
An Example
3
Network Representation
4
A Solution

By greedily selecting the cheapest leg offered at each successive stage, we obtain the route A → B → F → I → J, with cost 13.
This myopic rule does not guarantee the shortest path: replacing A → B → F with A → D → F gives another route with a cost of only 11.
One possible approach is to enumerate all possible routes, of which there are 18. This is the so-called exhaustive enumeration method.
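The comparison above can be sketched in a few lines of Python. The leg costs in `COST` are an assumption: the slide's cost table was not transcribed, so the values below are the commonly used stagecoach data, chosen because they reproduce the costs 13 and 11 and the 18 routes quoted on this slide. The function names are illustrative only.

```python
# Assumed leg costs (the slide's own cost table was not transcribed).
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


def greedy_route(start="A", goal="J"):
    """Always take the cheapest next leg (the myopic rule)."""
    route, total, node = [start], 0, start
    while node != goal:
        nxt = min(COST[node], key=COST[node].get)
        total += COST[node][nxt]
        route.append(nxt)
        node = nxt
    return route, total


def all_routes(node="A", goal="J"):
    """Yield (route, cost) for every route from node to goal."""
    if node == goal:
        yield [goal], 0
        return
    for nxt, leg in COST[node].items():
        for tail, tail_cost in all_routes(nxt, goal):
            yield [node] + tail, leg + tail_cost


greedy = greedy_route()
routes = list(all_routes())
best_cost = min(cost for _, cost in routes)
print(greedy)                     # greedy route and its cost
print(len(routes), best_cost)     # number of routes and the cheapest cost
```

Exhaustive enumeration is fine for 18 routes, but the number of routes grows multiplicatively with the number of stages, which is the motivation for the stage-by-stage approach.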

5
Dynamic Programming

Stage
State
Decision variable
Optimal policy (Optimal solution)

6
Dynamic Programming

There is no standard mathematical formulation of the dynamic programming problem. Rather, dynamic programming is a general type of approach to problem solving, and the particular equations used must be developed to fit each situation.

7
Dynamic Programming

Dynamic programming starts with a small portion of the original problem and finds the optimal solution for this smaller problem. It then gradually enlarges the problem, finding the current optimal solution from the preceding one, until the original problem is solved in its entirety.

8
Formulation

Let the decision variable $x_n$ ($n = 1, 2, 3, 4$) be the immediate destination at stage $n$. The route selected is A → $x_1$ → $x_2$ → $x_3$ → $x_4$, where $x_4 = J$.
Let $f_n(s, x_n)$ be the total cost of the best overall policy for the remaining stages, given that you are in state $s$, ready to start stage $n$, and select $x_n$ as the immediate destination.
Given $s$ and $n$, let $x_n^*$ denote any value of $x_n$ (not necessarily unique) that minimizes $f_n(s, x_n)$, and let $f_n^*(s)$ be the corresponding minimum value of $f_n(s, x_n)$.

9
Formulation

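Thus
$$f_n^*(s) = \min_{x_n} f_n(s, x_n) = f_n(s, x_n^*),$$
where
$$f_n(s, x_n) = \text{immediate cost (stage } n\text{)} + \text{minimum future cost (stages } n+1 \text{ onward)} = c_{s,x_n} + f_{n+1}^*(x_n).$$
The value of $c_{s,x_n}$ is given by the preceding tables for $c_{ij}$ by setting $i = s$ (the current state) and $j = x_n$ (the immediate destination). Here $f_5^*(J) = 0$.
The objective is to find $f_1^*(A)$ and the corresponding route.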
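A minimal sketch of this backward recursion follows. The leg costs are an assumption (the slide's cost tables were not transcribed); the values used are the commonly cited stagecoach data, and the names `COST`, `STAGES`, and `solve_stagecoach` are illustrative only.

```python
# Assumed leg costs c[s][x]; the original tables were not transcribed.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}
# Possible states at the start of stages 1, 2, 3, 4.
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]


def solve_stagecoach():
    """Backward recursion: f*_n(s) = min over x of c[s][x] + f*_{n+1}(x)."""
    f_star = {"J": 0}          # boundary condition f*_5(J) = 0
    x_star = {}                # all minimizing destinations for each state
    for states in reversed(STAGES):
        for s in states:
            totals = {x: c + f_star[x] for x, c in COST[s].items()}
            f_star[s] = min(totals.values())
            x_star[s] = [x for x, v in totals.items() if v == f_star[s]]
    return f_star, x_star


f_star, x_star = solve_stagecoach()

# Recover one optimal route by following the first minimizer at each state.
route = ["A"]
while route[-1] != "J":
    route.append(x_star[route[-1]][0])

print(f_star["A"], route)
```

Keeping every minimizer in `x_star` rather than just one makes ties visible, which matters here because, under the assumed costs, more than one route attains the minimum.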

10
Solution

Stage n = 4

11
Solution

Stage n = 3

12
Solution

Stage n = 3

(Table of $f_3(s, x_3)$ with rows $s$ and columns $x_3$: No Transcript)
13
Solution

Stage n = 2

14
Solution

Stage n = 2

(Table of $f_2(s, x_2)$ with rows $s$ and columns $x_2$: No Transcript)
15
Solution

Stage n = 1

16
Solution

Stage n = 1

(Table of $f_1(s, x_1)$ with rows $s$ and columns $x_1$: No Transcript)
17
Optimal Solution
18
Characteristic of DP Problems

The problem can be divided into stages, with a policy decision required at each stage.
Each stage has a number of states associated with the beginning of that stage.
The effect of the policy decision at each stage is to transform the current state into a state associated with the beginning of the next stage (possibly according to a probability distribution).
The solution procedure is designed to find an optimal policy for the overall problem, i.e., a prescription of the optimal policy decision at each stage for each of the possible states.

19
Characteristic of DP Problems

Given the current state, an optimal policy for the remaining stages is independent of the policy decisions adopted in previous stages. Therefore, the optimal immediate decision depends only on the current state and not on how you got there. This is the principle of optimality for DP.
The solution procedure begins by finding the optimal policy for the last stage.
A recursive relationship that identifies the optimal policy for stage $n$, given the optimal policy for stage $n + 1$, is available.

20
Characteristic of DP Problems

For the stagecoach problem, this recursive relationship was
$$f_n^*(s) = \min_{x_n} \{ c_{s,x_n} + f_{n+1}^*(x_n) \}.$$
Therefore, finding the optimal policy decision when you start in state $s$ at stage $n$ requires finding the minimizing value of $x_n$.
For this particular problem, the corresponding minimum cost is achieved by using this value of $x_n$ and then following the optimal policy when you start in state $x_n$ at stage $n + 1$.

21
Characteristic of DP Problems

The precise form of the recursive relationship differs somewhat among DP problems. However, notation analogous to that introduced in the preceding section will continue to be used here, as summarized below.
$N$ = number of stages.
$n$ = label for the current stage ($n = 1, 2, \ldots, N$).
$s_n$ = current state for stage $n$.
$x_n$ = decision variable for stage $n$.
$x_n^*$ = optimal value of $x_n$ (given $s_n$).

22
Characteristic of DP Problems

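$f_n(s_n, x_n)$ = contribution of stages $n, n+1, \ldots, N$ to the objective function if the system starts in state $s_n$ at stage $n$, the immediate decision is $x_n$, and optimal decisions are made thereafter.
$f_n^*(s_n) = f_n(s_n, x_n^*)$.
The recursive relationship will always be of the form
$$f_n^*(s_n) = \max_{x_n} \{ f_n(s_n, x_n) \} \quad \text{or} \quad f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \},$$
where $f_n(s_n, x_n)$ would be written in terms of $s_n$, $x_n$, $f_{n+1}^*(s_{n+1})$, and probably some measure of the immediate contribution of $x_n$ to the objective function.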

23
Characteristic of DP Problems

It is the inclusion of $f_{n+1}^*(s_{n+1})$ on the right-hand side, so that $f_n^*(s_n)$ is defined in terms of $f_{n+1}^*(s_{n+1})$, that makes the expression for $f_n^*(s_n)$ a recursive relationship.
The recursive relationship keeps recurring as we move backward stage by stage. When the current stage number $n$ is decreased by 1, the new $f_n^*(s_n)$ function is derived by using the $f_{n+1}^*(s_{n+1})$ function that was just derived during the preceding iteration, and then this process keeps repeating.

24
Characteristic of DP Problems

When we use this recursive relationship, the solution procedure starts at the end and moves backward stage by stage, each time finding the optimal policy for that stage, until it finds the optimal policy starting at the initial stage.
This optimal policy immediately yields an optimal solution for the entire problem, namely, $x_1^*$ for the initial state $s_1$, then $x_2^*$ for the resulting state $s_2$, and so forth to $x_N^*$ for the resulting state $s_N$.
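The backward pass and the forward read-off of the optimal decisions can be sketched generically as below. This is only an illustration under stated assumptions: stage contributions are additive, nothing remains after stage N, and the callback names (`states`, `decisions`, `transition`, `contribution`) and the small return table `R` are hypothetical, not taken from the slides.

```python
def backward_induction(n_stages, states, decisions, transition,
                       contribution, better=max):
    """Build f_star[n][s] and x_star[n][s] for n = N, N-1, ..., 1."""
    f_star, x_star = {}, {}
    for n in range(n_stages, 0, -1):
        f_star[n], x_star[n] = {}, {}
        for s in states(n):
            candidates = []
            for x in decisions(n, s):
                # Assumed boundary condition: no contribution after stage N.
                if n == n_stages:
                    future = 0
                else:
                    future = f_star[n + 1][transition(n, s, x)]
                candidates.append((contribution(n, s, x) + future, x))
            f_star[n][s], x_star[n][s] = better(candidates, key=lambda c: c[0])
    return f_star, x_star


def recover_policy(x_star, transition, s1, n_stages):
    """Forward pass: read x*_1 for s_1, then x*_2 for the resulting s_2, ..."""
    s, plan = s1, []
    for n in range(1, n_stages + 1):
        x = x_star[n][s]
        plan.append(x)
        s = transition(n, s, x)
    return plan


# Hypothetical two-stage example: split 3 units between two stages.
R = {1: [0, 4, 6, 7], 2: [0, 3, 7, 8]}     # made-up returns R[n][x]
move = lambda n, s, x: s - x               # units left for the next stage
f_star, x_star = backward_induction(
    2,
    states=lambda n: range(4),
    decisions=lambda n, s: range(s + 1),
    transition=move,
    contribution=lambda n, s, x: R[n][x],
)
plan = recover_policy(x_star, move, 3, 2)
print(f_star[1][3], plan)
```

Passing `better=min` turns the same routine into a cost minimizer.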

25
Characteristic of DP Problems

A table such as the following would be obtained for each stage ($n = N, N-1, \ldots, 1$).

(Table of $f_n(s_n, x_n)$ with rows $s_n$ and columns $x_n$: No Transcript)
26
Deterministic DP

Deterministic dynamic programming can be
described diagrammatically as shown in Fig. 11.3.
Thus, at stage n the process will be in some
state sn.

27
Dynamic Programming

Making policy decision $x_n$ then moves the process to some state $s_{n+1}$ at stage $n + 1$.
The contribution thereafter to the objective function under an optimal policy has been previously calculated to be $f_{n+1}^*(s_{n+1})$.
Optimizing with respect to $x_n$ then gives $f_n^*(s_n) = f_n(s_n, x_n^*)$. After $x_n^*$ and $f_n^*(s_n)$ are found for each possible value of $s_n$, the solution procedure is ready to move back one stage.

28
Example 2: Distributing Medical Teams to Countries
29
Formulation

This problem requires making three interrelated decisions, namely, how many medical teams to allocate to each of the three countries.
The decision variables $x_n$ ($n = 1, 2, 3$) are the number of teams to allocate to stage (country) $n$.
$s_n$ = number of medical teams still available for allocation to the remaining countries ($n, \ldots, 3$).
To state the overall problem mathematically, let $p_i(x_i)$ be the measure of performance from allocating $x_i$ medical teams to country $i$, as given in Table 11.1.

30
(No Transcript)
31
Basic Structure
32
Formulation

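The objective is to choose $x_1, x_2, x_3$ so as to maximize $\sum_{i=1}^{3} p_i(x_i)$, subject to $\sum_{i=1}^{3} x_i = 5$, and $x_i$ are nonnegative integers. Using the notation presented in Sec. 11.2, we see that $f_n(s_n, x_n)$ is
$$f_n(s_n, x_n) = p_n(x_n) + \max \sum_{i=n+1}^{3} p_i(x_i),$$
where the maximum is taken over $x_{n+1}, \ldots, x_3$ such that $\sum_{i=n}^{3} x_i = s_n$ and the $x_i$ are nonnegative integers. In addition,
$$f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} f_n(s_n, x_n).$$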

33
Formulation

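Therefore,
$$f_n(s_n, x_n) = p_n(x_n) + f_{n+1}^*(s_n - x_n)$$
(with $f_4^*$ defined to be zero).
Consequently, the recursive relationship relating functions $f_1^*$, $f_2^*$, and $f_3^*$ for this problem is
$$f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} \{ p_n(x_n) + f_{n+1}^*(s_n - x_n) \}, \quad n = 1, 2.$$
For the last stage ($n = 3$),
$$f_3^*(s_3) = \max_{x_3 = 0, 1, \ldots, s_3} p_3(x_3).$$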

34
Solution Procedure

Stage n = 3

35
Solution Procedure

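Stage n = 2
Formula: $f_2(s_2, x_2) = p_2(x_2) + f_3^*(s_2 - x_2)$, evaluated in turn for
$x_2 = 0$,
$x_2 = 1$,
$x_2 = 2$.
Because the objective is maximization, $x_2^*$ is the value of $x_2$ that gives the largest $f_2(s_2, x_2)$, with $f_2^*(s_2)$ equal to that largest value.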

36
Solution Procedure

Stage n = 2

37
Solution Procedure

Stage n = 1
Formula: $f_1(5, x_1) = p_1(x_1) + f_2^*(5 - x_1)$.
Similar calculations for $x_1 = 2, 3, 4$ (try it) verify that $x_1^* = 1$ with $f_1^*(5) = 170$, as shown in the following table.

38
Solution Procedure

Stage n = 1

39
Solution Procedure

Thus, the optimal solution has $x_1^* = 1$, which makes $s_2 = 5 - 1 = 4$, so $x_2^* = 3$, which makes $s_3 = 4 - 3 = 1$, so $x_3^* = 1$. Since $f_1^*(5) = 170$, this (1, 3, 1) allocation of medical teams to the three countries will yield an estimated total of 170,000 additional person-years of life, which is at least 5,000 more than for any other allocation.
These results of the dynamic programming analysis are also summarized in Fig. 11.6.
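The three-stage recursion for this example can be sketched as follows. Table 11.1 was not transcribed, so the performance values in `P` (thousands of additional person-years of life) are an assumption; they are the values usually quoted for this example and they reproduce the (1, 3, 1) allocation and the total of 170 stated above. The function name `allocate_teams` is illustrative.

```python
# Assumed Table 11.1: P[n][x] = thousands of additional person-years of
# life when x teams go to country n (the original table was not transcribed).
P = {
    1: [0, 45, 70, 90, 105, 120],
    2: [0, 20, 45, 75, 110, 150],
    3: [0, 50, 70, 80, 100, 130],
}
TEAMS = 5


def allocate_teams(p=P, teams=TEAMS):
    """f*_n(s) = max over x in 0..s of p_n(x) + f*_{n+1}(s - x), f*_4 = 0."""
    last = len(p)
    f_star = {last + 1: [0] * (teams + 1)}
    x_star = {}
    for n in range(last, 0, -1):
        f_star[n], x_star[n] = [], []
        for s in range(teams + 1):
            best_x = max(range(s + 1),
                         key=lambda x: p[n][x] + f_star[n + 1][s - x])
            x_star[n].append(best_x)
            f_star[n].append(p[n][best_x] + f_star[n + 1][s - best_x])
    # Forward pass: read the optimal decisions off the stage tables.
    plan, s = [], teams
    for n in range(1, last + 1):
        plan.append(x_star[n][s])
        s -= x_star[n][s]
    return f_star[1][teams], plan


total, plan = allocate_teams()
print(total, plan)
```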

40
(No Transcript)
41
Example 3: Wyndor Glass Company Problem
42
Formulation

This problem requires making two interrelated decisions, namely, the level of activity 1, denoted by $x_1$, and the level of activity 2, denoted by $x_2$.
Let stage $n$ = activity $n$ ($n = 1, 2$). Thus, $x_n$ is the decision variable at stage $n$.
Interpret the right-hand sides of these constraints (4, 12, and 18) as the total available amounts of resources 1, 2, and 3.
State $s_n$ = amounts of the respective resources still available for allocation to the remaining activities.

43
Formulation

$s_n = (R_1, R_2, R_3)$, where $R_i$ is the amount of resource $i$ remaining to be allocated ($i = 1, 2, 3$).
Therefore,
$s_1 = (4, 12, 18)$,
$s_2 = (4 - x_1, 12, 18 - 3x_1)$.
However, when we begin by solving for stage 2, we do not yet know the value of $x_1$, and so we use $s_2 = (R_1, R_2, R_3)$ at that point.

44
Formulation

$f_2(R_1, R_2, R_3, x_2)$ = contribution of activity 2 to $Z$ if the system starts in state $(R_1, R_2, R_3)$ at stage 2 and the decision is $x_2$.
$f_1(4, 12, 18, x_1)$ = contribution of activities 1 and 2 to $Z$ if the system starts in state (4, 12, 18) at stage 1, the immediate decision is $x_1$, and then an optimal decision is made at stage 2.

45
Formulation

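Similarly, for $n = 1, 2$,
$$f_n^*(R_1, R_2, R_3) = \max_{x_n} f_n(R_1, R_2, R_3, x_n),$$
where the maximum is taken over the feasible values of $x_n$.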

46
Basic structure
47
Solution Procedure

Stage 2: To solve at the last stage ($n = 2$), Eq. (1) indicates that $x_2^*$ must be the largest value of $x_2$ that simultaneously satisfies $2x_2 \le R_2$, $2x_2 \le R_3$, and $x_2 \ge 0$.
n = 2

48
Solution Procedure

Stage 1: $(R_1, R_2, R_3) = (4 - x_1, 12, 18 - 3x_1)$,
so that

49
Solution Procedure
Because both expressions achieve their maximum at $x_1 = 2$, it follows that $x_1^* = 2$ and that this maximum is 36.
50
Solution Procedure

n = 1
Because $x_1^* = 2$ leads to $R_1 = 4 - 2 = 2$, $R_2 = 12$, $R_3 = 18 - 3(2) = 12$ for stage 2, the n = 2 table yields $x_2^* = 6$. Consequently, $x_1^* = 2$, $x_2^* = 6$ is the optimal solution for this problem.
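A rough numerical check of this two-stage procedure is sketched below. It assumes the standard Wyndor model, which the transcript does not state explicitly: maximize Z = 3x1 + 5x2 subject to x1 ≤ 4, 2x2 ≤ 12, 3x1 + 2x2 ≤ 18, with continuous nonnegative variables. Stage 2 is solved in closed form from the constraints quoted above, and stage 1 is searched on a grid, which is an approximation chosen for illustration rather than the analytic argument used on the slides.

```python
def f2_star(r1, r2, r3):
    """Stage 2: x2 is the largest value with 2*x2 <= r2 and 2*x2 <= r3.

    Returns (5 * x2, x2); the profit coefficient 5 is an assumption.
    """
    x2 = max(0.0, min(r2 / 2, r3 / 2))
    return 5 * x2, x2


def f1(x1):
    """Stage 1 value from state (4, 12, 18); coefficient 3 is assumed."""
    value2, _ = f2_star(4 - x1, 12, 18 - 3 * x1)
    return 3 * x1 + value2


# Grid search over 0 <= x1 <= 4 in steps of 0.01 (illustrative only).
grid = [i / 100 for i in range(0, 401)]
x1_best = max(grid, key=f1)
z_best = f1(x1_best)
_, x2_best = f2_star(4 - x1_best, 12, 18 - 3 * x1_best)
print(x1_best, x2_best, z_best)
```

Because the state here is a continuous vector, a table over all states is not possible; the stage 2 solution has to be a formula in (R1, R2, R3), which is what `f2_star` encodes.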

51
Inventory Problem

A company knows that the demand for its product during each of the next four months will be as follows: month 1, 1 unit; month 2, 3 units; month 3, 2 units; month 4, 4 units.
During a month in which any units are produced, a setup cost of 3 is incurred.
In addition, there is a variable cost of 1 for every unit produced.
At the end of each month, a holding cost of 0.5 per unit on hand is incurred.

52
Inventory Problem

Capacity limitations allow a maximum of 5 units to be produced during each month.
The size of the company's warehouse restricts the ending inventory for each month to at most 4 units.
Assume that 0 units are on hand at the beginning of the first month.
The company wants to determine a production schedule that will meet all demands on time and will minimize the sum of production and holding costs during the four months.

53
Definition

Stage: time (month $t$).
State: the beginning inventory level $i$.
Decision variable: $x_t(i)$, a production level during month $t$ that minimizes the total cost during months $t, t+1, \ldots, 4$ if $i$ units are on hand at the beginning of month $t$.
Define $f_t(i)$ to be the minimum cost of meeting demands for months $t, t+1, \ldots, 4$ if $i$ units are on hand at the beginning of month $t$.
Define $c(x)$ to be the cost of producing $x$ units during a period, so that $c(0) = 0$ and $c(x) = 3 + x$ for $x > 0$.

54
Solution Procedure

Stage 4: During month 4, the firm will produce just enough units to ensure that the month 4 demand of 4 units is met. This yields
$f_4(0) = c(4) = 7$ and $x_4(0) = 4 - 0 = 4$
$f_4(1) = c(3) = 6$ and $x_4(1) = 4 - 1 = 3$
$f_4(2) = c(2) = 5$ and $x_4(2) = 4 - 2 = 2$
$f_4(3) = c(1) = 4$ and $x_4(3) = 4 - 3 = 1$
$f_4(4) = c(0) = 0$ and $x_4(4) = 4 - 4 = 0$

55
Solution Procedure

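Stage 3: The cost $f_3(i)$ is the minimum cost incurred during months 3 and 4 if the inventory at the beginning of month 3 is $i$.
For each possible production level $x$ during month 3, the total cost during months 3 and 4 is
$$\tfrac{1}{2}(i + x - 2) + c(x) + f_4(i + x - 2).$$
Therefore,
$$f_3(i) = \min_x \{ \tfrac{1}{2}(i + x - 2) + c(x) + f_4(i + x - 2) \},$$
where $x$ must be a member of $\{0, 1, 2, 3, 4, 5\}$ and $x$ must satisfy $0 \le i + x - 2 \le 4$.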

56
Solution Procedure
57
Solution Procedure
58
Solution Procedure
59
Solution Procedure

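Stage 2: The cost $f_2(i)$ is the minimum cost incurred during months 2, 3, and 4 if the inventory at the beginning of month 2 is $i$.
For each possible production level $x$ during month 2, the total cost during months 2, 3, and 4 is
$$\tfrac{1}{2}(i + x - 3) + c(x) + f_3(i + x - 3).$$
Therefore,
$$f_2(i) = \min_x \{ \tfrac{1}{2}(i + x - 3) + c(x) + f_3(i + x - 3) \},$$
where $x$ must be a member of $\{0, 1, 2, 3, 4, 5\}$ and $x$ must satisfy $0 \le i + x - 3 \le 4$.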

60
Solution Procedure
61
Solution Procedure
62
Solution Procedure
63
Solution Procedure

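Stage 1: The cost $f_1(i)$ is the minimum cost incurred during months 1 through 4 if the inventory at the beginning of month 1 is $i$.
For each possible production level $x$ during month 1, the total cost during months 1 through 4 is
$$\tfrac{1}{2}(i + x - 1) + c(x) + f_2(i + x - 1).$$
Therefore,
$$f_1(i) = \min_x \{ \tfrac{1}{2}(i + x - 1) + c(x) + f_2(i + x - 1) \},$$
where $x$ must be a member of $\{0, 1, 2, 3, 4, 5\}$ and $x$ must satisfy $0 \le i + x - 1 \le 4$.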

64
Solution Procedure
65
Solution Procedure
66
Optimal Schedule

Since our initial inventory is 0 units, the cost for the four months will be $f_1(0) = 20$.
To attain $f_1(0)$, we must produce $x_1(0) = 1$ unit during month 1.
The inventory at the beginning of month 2 will be $0 + 1 - 1 = 0$. Thus we should produce $x_2(0) = 5$ units.
At the beginning of month 3, our inventory will be $0 + 5 - 3 = 2$. Hence, during month 3, we need to produce $x_3(2) = 0$ units.
Month 4 will begin with $2 + 0 - 2 = 0$ units on hand. Thus, $x_4(0) = 4$ units should be produced during month 4.

67
Optimal Schedule

In summary, the optimal production schedule incurs a total cost of 20 and produces 1 unit during month 1, 5 units during month 2, 0 units during month 3, and 4 units during month 4.
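The whole inventory recursion can be checked with a short script. The data (demands, setup cost 3, unit cost 1, holding cost 0.5, capacity 5, warehouse limit 4) come from the problem statement above; the function and variable names are illustrative, and $f_5$ is assumed to be zero for every ending inventory.

```python
DEMAND = {1: 1, 2: 3, 3: 2, 4: 4}      # units demanded in months 1..4
SETUP, UNIT, HOLD = 3.0, 1.0, 0.5      # setup, per-unit, and holding costs
MAX_PROD, MAX_INV = 5, 4               # monthly capacity, warehouse limit


def c(x):
    """Production cost: nothing if idle, setup plus unit cost otherwise."""
    return 0.0 if x == 0 else SETUP + UNIT * x


def solve_inventory():
    last = max(DEMAND)
    # Assumed boundary: no cost after the final month.
    f = {last + 1: {i: 0.0 for i in range(MAX_INV + 1)}}
    x_best = {}
    for t in range(last, 0, -1):
        f[t], x_best[t] = {}, {}
        for i in range(MAX_INV + 1):
            options = []
            for x in range(MAX_PROD + 1):
                end = i + x - DEMAND[t]          # ending inventory
                if 0 <= end <= MAX_INV:          # meet demand, fit warehouse
                    cost = HOLD * end + c(x) + f[t + 1][end]
                    options.append((cost, x))
            f[t][i], x_best[t][i] = min(options)
    return f, x_best


f, x_best = solve_inventory()

# Forward pass from zero initial inventory.
schedule, inv = [], 0
for t in sorted(DEMAND):
    x = x_best[t][inv]
    schedule.append(x)
    inv = inv + x - DEMAND[t]

print(f[1][0], schedule)
```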

68
General Resource Allocation Problem

Suppose we have $w$ units of a resource and $T$ activities to which the resource can be allocated.
If activity $t$ is implemented at a level $x_t$, then $g_t(x_t)$ units of the resource are used by activity $t$, and a benefit $r_t(x_t)$ is obtained.

69
General Resource Allocation Problem

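Define $f_t(d)$ to be the maximum benefit that can be obtained from activities $t, t+1, \ldots, T$ if $d$ units of the resource can be allocated to activities $t, t+1, \ldots, T$. Then
$$f_t(d) = \max_{x_t} \{ r_t(x_t) + f_{t+1}(d - g_t(x_t)) \},$$
where $x_t$ must be a nonnegative integer satisfying $g_t(x_t) \le d$, and $f_{T+1}(d) = 0$ for all $d$.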
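A memoized sketch of this recursion is given below. It assumes that level 0 uses none of the resource, that levels can be capped at a finite `max_level`, and that activities are indexed from 0 in the code; the two-activity data at the bottom are made up purely for illustration.

```python
from functools import lru_cache


def allocate(w, uses, benefits, max_level):
    """Return (best benefit, levels) for activities 0..T-1 sharing w units.

    uses[t](x) plays the role of g_t(x) and benefits[t](x) of r_t(x).
    Assumes uses[t](0) == 0 so that doing nothing is always feasible.
    """
    n_activities = len(uses)

    @lru_cache(maxsize=None)
    def f(t, d):
        if t == n_activities:            # boundary: f_{T+1}(d) = 0
            return 0, ()
        best = None
        for x in range(max_level + 1):
            used = uses[t](x)
            if used > d:                 # level x does not fit into d units
                continue
            tail_value, tail_plan = f(t + 1, d - used)
            value = benefits[t](x) + tail_value
            if best is None or value > best[0]:
                best = (value, (x,) + tail_plan)
        return best

    return f(0, w)


# Hypothetical data: two activities sharing 7 units of the resource.
uses = [lambda x: 2 * x, lambda x: 3 * x]
benefits = [lambda x: 5 * x, lambda x: 8 * x]
result = allocate(7, uses, benefits, max_level=7)
print(result)
```

Memoizing on the pair (t, d) is what keeps the work proportional to the number of stage and state combinations rather than to the number of complete allocations.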

70
Knapsack Problem

Consider the following knapsack problem.
Three different types of items can be used to fill a 10-lb knapsack.
We want to use dynamic programming to solve it.
The procedure is the same as for the resource allocation problem.

71
Solution Procedure

Stage 3
where $5x_3 \le d$ and $x_3$ is a nonnegative integer.
f3(10) and x3(10)
$f_3(9) = f_3(8) = f_3(7) = f_3(6) = f_3(5)$
with $x_3(9) = x_3(8) = x_3(7) = x_3(6) = x_3(5)$
$f_3(4) = f_3(3) = f_3(2) = f_3(1) = f_3(0)$
with $x_3(4) = x_3(3) = x_3(2) = x_3(1) = x_3(0)$

72
Solution Procedure

Stage 2
where $3x_2 \le d$ and $x_2$ is a nonnegative integer.

73
Solution Procedure
74
Solution Procedure

Stage 2
f1( ) and x1( )
f2( ) and x2( )
f3( ) and x3( )

75
Alternative Procedure for KP

Suppose $g(w)$ is the maximum benefit that can be gained from a $w$-lb knapsack.
Let $b_j$ be the benefit earned from a single type $j$ item, and let $w_j$ be the weight of a single type $j$ item.

76
Alternative Procedure for KP

To fill a $w$-lb knapsack optimally, we must begin by putting some type of item into the knapsack.
If we put a type $j$ item into a $w$-lb knapsack, the best we can do is earn $b_j$ + (best we can do from a $(w - w_j)$-lb knapsack), that is, $b_j + g(w - w_j)$.
A type $j$ item can be placed into a $w$-lb knapsack only if $w_j \le w$. Hence
$$g(w) = \max_{j:\, w_j \le w} \{ b_j + g(w - w_j) \}.$$
Define $x(w)$ to be any type of item that attains the maximum benefit, and let $x(w) = 0$ mean that no item can fit into a $w$-lb knapsack.

77
Alternative Procedure for KP

$g(0) = g(1) = g(2) = 0$, and $x(0) = x(1) = x(2) = 0$.
g(3) and x(3) .
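The alternative procedure builds $g(w)$ for $w = 0, 1, \ldots, 10$ in increasing order. The sketch below uses assumed item data, because the slide stating the problem was not transcribed: weights of 3 lb and 5 lb for types 2 and 3 are implied by the constraints quoted earlier, while the 4-lb weight of type 1 and all three benefits (11, 7, 12) are hypothetical values chosen for illustration.

```python
# type j -> (benefit b_j, weight w_j); partly assumed, see the note above.
ITEMS = {1: (11, 4), 2: (7, 3), 3: (12, 5)}


def knapsack(capacity, items=ITEMS):
    """g[w] = max over j with w_j <= w of b_j + g[w - w_j]; x[w] = best j."""
    g = [0] * (capacity + 1)
    x = [0] * (capacity + 1)      # 0 means no item fits (or none helps)
    for w in range(1, capacity + 1):
        for j, (b, wj) in items.items():
            if wj <= w and b + g[w - wj] > g[w]:
                g[w], x[w] = b + g[w - wj], j
    return g, x


def contents(x, capacity, items=ITEMS):
    """Unpack the knapsack by repeatedly removing the recorded item type."""
    chosen, w = [], capacity
    while x[w] != 0:
        chosen.append(x[w])
        w -= items[x[w]][1]
    return chosen


g, x = knapsack(10)
packed = contents(x, 10)
print(g[10], packed)
```

Unlike the stage-by-stage version, this recursion has a single state variable (the remaining capacity), so only one table is needed.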

78
Equipment Replacement Problem

An auto repair shop always needs to have an engine analyzer available.
A new engine analyzer costs 1000.
The cost $m_i$ of maintaining an analyzer during its $i$-th year of operation is $m_1 = 60$, $m_2 = 80$, $m_3 = 120$.
An analyzer may be kept for 1, 2, or 3 years, after which it may be traded in for a new one.
The trade-in price (salvage value $s_i$) for an $i$-year-old analyzer is $s_1 = 800$, $s_2 = 600$, $s_3 = 500$.
The shop wants to determine an optimal replacement policy for the next 5 years.

79
Solution Procedure

Define $g(t)$ to be the minimum net cost incurred from time $t$ until time 5, given that a new machine has been purchased at time $t$.
Define $x$ to be the time at which that machine is replaced.
Define $c_{tx}$ to be the net cost of purchasing a machine at time $t$ and operating it until time $x$. Then
$$g(t) = \min_x \{ c_{tx} + g(x) \},$$
where $t + 1 \le x \le t + 3$ and $x \le 5$, with $g(5) = 0$.

80
Solution Procedure

Net cost = maintenance costs + replacement cost - salvage value.

81
Solution Procedure
82
Optimal Schedule

The machine purchased at time 0 should be replaced at time 1 or time 3.
If it is replaced at time 1, the new time 1 machine may be traded in at time 2 or time 4.
If the time 1 machine is traded in at time 2, the new machine should be kept until time 5.
And so on.
The optimal replacement policies are
(1) trade in at times 1, 2, and 5.
(2) trade in at times 1, 4, and 5.
(3) trade in at times 3, 4, and 5.
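The replacement recursion can be verified with the data given in the problem statement (purchase price 1000, maintenance costs 60, 80, 120, salvage values 800, 600, 500, horizon of 5 years). The sketch below is one way to code it; the names `net_cost`, `solve_replacement`, and `policies` are illustrative, and the net cost is assumed to depend only on how long a machine is kept.

```python
NEW_COST = 1000
MAINT = [60, 80, 120]        # m1, m2, m3: maintenance in years 1, 2, 3
SALVAGE = [800, 600, 500]    # s1, s2, s3: trade-in value at ages 1, 2, 3
HORIZON = 5


def net_cost(age):
    """c_{t,x} for x - t = age: purchase + maintenance - salvage."""
    return NEW_COST + sum(MAINT[:age]) - SALVAGE[age - 1]


def solve_replacement():
    """g(t) = min over x of c_{t,x} + g(x), t+1 <= x <= min(t+3, 5)."""
    g = {HORIZON: 0}
    choices = {}             # all minimizing replacement times for each t
    for t in range(HORIZON - 1, -1, -1):
        last_x = min(t + len(MAINT), HORIZON)
        totals = {x: net_cost(x - t) + g[x] for x in range(t + 1, last_x + 1)}
        g[t] = min(totals.values())
        choices[t] = [x for x, v in totals.items() if v == g[t]]
    return g, choices


def policies(choices, t=0):
    """Enumerate every optimal sequence of trade-in times starting at t."""
    if t == HORIZON:
        yield []
        return
    for x in choices[t]:
        for rest in policies(choices, x):
            yield [x] + rest


g, choices = solve_replacement()
optimal = list(policies(choices))
print(g[0], optimal)
```

Storing all minimizers at each time, rather than a single one, is what allows the three tied policies listed above to be enumerated.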