BAMS 517 Decision Analysis: A Dynamic Programming Perspective - PowerPoint PPT Presentation

About This Presentation

Title: BAMS 517 Decision Analysis
Author: Eric Cope
Last modified by: Sauder School of Business
Created Date: 12/20/2004 6:46:51 PM
Slides: 44

Transcript and Presenter's Notes

1
BAMS 517 Decision Analysis: A Dynamic
Programming Perspective
  • Martin L. Puterman
  • UBC Sauder School of Business
  • Winter Term 2011

2
Introduction to Decision Analysis - Outline
  • Course info
  • Dynamic decision problem introduction
  • Decision problems and decision trees
  • Single decision problems
  • Multiple decision problems
  • Probability
  • Expected value decision making
  • Value of Perfect Information
  • Value of Imperfect Information
  • Utility and Prospect Theory
  • Finite Horizon Dynamic Programming

3
Some dynamic decision problems
  • Assigning customers to tables in a restaurant
  • Deciding when to release an auction on eBay
  • Choosing the quantity to produce (inventory
    models)
  • Deciding when to start a medical treatment or
    accept an organ transplant
  • Playing Tetris
  • Deciding when to add capacity to a system
  • Advanced patient scheduling
  • Managing a bank of elevators
  • Deciding when to replace a car
  • Managing a portfolio
  • Deciding when to stop a clinical trial
  • Guiding a robot to a target
  • Playing golf

In each case there is a trade-off between
immediate reward and uncertain long term gain
4
Common ingredients of these dynamic decision
problems
  • Problem persists over time.
  • Problem structure remains the same every period.
  • Current decisions impact future system behavior
    probabilistically.
  • Current decision may result in immediate costs or
    rewards.
  • These problems are all examples of Markov
    decision problems or MDPs or stochastic dynamic
    programs
  • They were first formulated in the 1940s for
    problems in reservoir management (Masse) and
    sequential statistical estimation problems (Wald)
  • They were formalized in the 1950s by Bellman and
    Howard.
  • Theory was developed between 1960-1990.
  • Rediscovered in the 1990s by computer scientists
  • Reinforcement learning
  • Approximate dynamic programming

5
Basic Decision Analysis

6
Decision Analysis
  • Goal: to understand how to properly structure,
    and then solve, decision problems of nearly any
    type
  • Structuring the decision problem and obtaining
    the inputs is usually the hard part
  • Once the right structure has been found, solving
    for the best course of action is usually
    straightforward
  • We will be guided by mathematical and scientific
    principles
  • These principles will ensure that
  • Our decision-making is rational and logically
    coherent
  • We choose the best course of action based on our
    preferences for certain outcomes and knowledge
    available at the time of the decision
  • We might not always be satisfied with the outcome,
    but we will be confident that the process we
    used was the best available.

7
Decision Analysis
  • Our analysis will tell us what decision ought to
    be taken, as a rational person, and not what
    decision people actually tend to make in the same
    situation
  • Normative (or prescriptive) analysis, rather than
    a descriptive analysis
  • Many studies have shown that people do not always
    act rationally
  • The methods we introduce provide a framework
    that translates your preferences for outcomes and
    your assessments of the likelihood of each
    consequence into a recipe for action
  • Places minimal requirements on your preferences
    and assessments
  • Does not impose someone else's values in place of
    your own
  • We begin by exploring how to assess the
    likelihood of outcomes. We will discuss how to
    determine your preference for outcomes in a few
    classes.

8
Simple decision problems
  • The basic problem is to select an action from a
    finite set without knowing which outcome will
    occur
  • In order to decide on the proper action, we need
    to
  • Quantify the uncertainty of future events
  • Assign probabilities to the events
  • Evaluate and compare the goodness of the
    possible outcomes
  • Assign utilities to the outcomes
  • Once these are in place, we have fully specified
    the decision problem

9
Assessing Probabilities Through Decision Trees

10
The election stock market problem
  • Suppose we are faced with the following
    opportunity on September 8, 2008.
  • You can pay $0.56, and if Obama wins the election
    you receive $1; if he loses you receive $0.
  • http://iemweb.biz.uiowa.edu/graphs/graph_Pres08_WTA.cfm
  • Decision: Invest $0.56 or do not
  • Uncertain Event: Obama wins.
  • Suppose this has probability q
  • The election stock market problem is perhaps the
    simplest decision problem we will study. It
    contains, however, all the basic elements of many
    more complex problems.

11
The election problem on September 8
Payoff (Gain)   Obama wins    Obama loses
Buy 1 share     $1 ($0.44)    $0 (-$0.56)
Do not          $0            $0
12
A decision tree for the election problem
  Buy 1 share --+-- Obama wins  (prob. q)      payoff  0.44
                +-- Obama loses (prob. 1-q)    payoff -0.56
  Do not invest                                payoff  0
13
Valuing gambles
  • Under certain conditions (to be discussed in
    class 3) it is advantageous to evaluate gambles
    by their mathematical expectation.
  • For the previous problem the expected value of
    the gamble would be
  • .44 q - .56 (1 - q) = q - .56
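The slides give only the algebra; as an illustrative sketch (the function name `gamble_ev` is mine, not from the slides), the break-even calculation can be checked in Python:

```python
# Expected value of the election gamble as a function of q,
# the probability that Obama wins. Buying one share costs $0.56
# and pays $1 if Obama wins, $0 otherwise.

def gamble_ev(q: float, price: float = 0.56, payoff: float = 1.0) -> float:
    """Expected net payoff of buying one share at `price`."""
    return (payoff - price) * q - price * (1 - q)  # = q - price when payoff = 1

# Break-even: invest only if q exceeds the share price.
print(gamble_ev(0.56))  # ~0: indifferent at q = .56
print(gamble_ev(0.70))  # ~0.14: positive expected payoff, so invest
```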

14
Solving the election problem - a reduced
problem
We replace the gamble by its expectation (later we
will use the expected utility of the gamble):

  Buy 1 share      q - .56
  Do not invest    0
15
The election problem solution
  • Assume you will choose the decision which
    maximizes your expected payoff.
  • If you invest, your expected payoff is q - .56; if
    you do not, your expected payoff is 0. Thus if
    you thought (on September 8) that q, the
    probability Obama wins, exceeded .56, you would
    invest; if you didn't, you would not invest.
  • You will be indifferent when q = .56.
  • On September 8, the consensus (among investors in
    the Iowa Electronic Stock Market) probability of
    Obama winning was 0.56
  • Why?
  • Thus an Electronic Stock Market provides an
    alternative to polls when predicting outcomes of
    random events and a method for assessing
    probabilities.
  • Current markets
  • Wikipedia Article (Gives comparison of accuracy
    compared to polls)

16
Odds - definitions
  • If p is the probability an event occurs,
  • o = p / (1 - p)
  • is called the odds of the event occurring
  • Often we consider l = ln(o) = ln(p / (1 - p)), which
    is called the log-odds or logit of p.
  • Aside: this is a key ingredient in a logistic
    regression model:
  • ln(p / (1 - p)) = ß0 + ß1x
  • Thus the odds (on September 8) of Obama winning
    are
  • o = .56 / .44 ≈ 1.27 (to one)
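These definitions translate directly into code; the following sketch (function names are mine) computes odds and log-odds and inverts the odds back to a probability:

```python
import math

def odds(p: float) -> float:
    """o = p / (1 - p): the odds of an event with probability p."""
    return p / (1 - p)

def logit(p: float) -> float:
    """l = ln(p / (1 - p)): the log-odds, as used in logistic regression."""
    return math.log(odds(p))

def prob_from_odds(o: float) -> float:
    """Invert: p = o / (1 + o)."""
    return o / (1 + o)

print(round(odds(0.56), 2))  # 1.27 (to one), matching the slide
```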

17
Odds - Examples
  • On December 30, 2008 The Globe and Mail gave the
    following odds for various teams winning the
    Super Bowl
  • NY Giants 2 to 1
  • Tennessee Titans 4 to 1
  • Arizona Cardinals 40 to 1
  • They have the following meaning in this context
  • If you bet $1 on the Cardinals (on Dec 30) and
    they win the Super Bowl, you get back $41
    for a net gain of $40
  • The relation of these odds to probabilities can
    be determined using decision analysis
  • What is the odds maker's implied probability, q,
    that Arizona wins the Super Bowl?

18
A decision tree for the Super Bowl problem
  Bet on Arizona --+-- Arizona wins  (prob. q)      payoff  40
                   +-- Arizona loses (prob. 1-q)    payoff  -1
  Do not bet                                        payoff   0
19
Solving the Super Bowl problem
You would be indifferent between the two
decisions if q = 1/41, or 1 - q = 40/41:

  Bet on Arizona   41q - 1
  Do not bet       0
20
Odds and Bookmakers Odds
  • Based on the decision tree and expectations the
    probability of winning is 1/41
  • So using the above definition of odds, you would
    find the odds of winning are
  • oW = (1/41) / (1 - 1/41) = 1/40 (to 1)
  • The odds of losing would be
  • oL = (40/41) / (1 - 40/41) = 40 (to 1)
  • Thus quoted odds for sports events are the odds
    of losing.
  • Hence the odds (on December 30) that the Giants
    don't win the Super Bowl are 2 to 1, and the odds
    they win the Super Bowl are 1 to 2.
  • The implied probability the Giants win the Super
    Bowl is 1/3.
  • Another interpretation (courtesy Wikipedia):
  • Generally, 'odds' are not quoted to the general
    public in the format p/(1-p) because of the
    natural confusion with the chance of an event
    occurring being expressed fractionally as a
    probability.
  • Example: Suppose that you are told to pick a
    digit from 0 to 9. Then the odds are 9 to 1
    against you choosing a 7. One way to think about
    this interpretation is that there are 10 outcomes;
    in 1 you succeed in picking a 7 and in 9 you
    don't succeed.
  • This interpretation doesn't work for one-time
    events like the Super Bowl.
  • I'll refer to these as bookmaker's odds.
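The conversion from quoted "odds against" to an implied win probability can be sketched as follows (the helper `implied_win_prob` is my own, not from the slides):

```python
def implied_win_prob(against_a: float, against_b: float = 1) -> float:
    """Quoted sports odds 'a to b' are odds *against* winning,
    so the implied probability of winning is b / (a + b)."""
    return against_b / (against_a + against_b)

print(implied_win_prob(40))  # Cardinals at 40 to 1 -> 1/41 ~ 0.024
print(implied_win_prob(2))   # Giants at 2 to 1     -> 1/3, as on the slide
```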

21
Games of Chance and Odds
The payout on a successful bet on a single number
is 35 to 1, plus the amount bet. The true
bookmaker's odds are 37 to 1 on an American
roulette wheel (with 0 and 00), assuming a fair
wheel.
22
A decision tree for a single number bet in
roulette
  Bet $1 on 7 --+-- Ball stops on 7  (prob. 1/38)   payoff  35
                +-- Lose             (prob. 37/38)  payoff  -1
  Do not bet                                        payoff   0
23
Solving the roulette problem
  Bet $1 on 7    -0.0526
  Do not bet      0
24
Bet name Winning spaces Payout Odds against winning Expected value (on a $1 bet)
0 0 35 to 1 37 to 1 -0.053
00 00 35 to 1 37 to 1 -0.053
Straight up Any single number 35 to 1 37 to 1 -0.053
Row 00 0, 00 17 to 1 18 to 1 -0.053
Split any two adjoining numbers vertical or horizontal 17 to 1 18 to 1 -0.053
Trio 0, 1, 2 or 00, 2, 3 11 to 1 11.667 to 1 -0.053
Street any three numbers horizontal (1, 2, 3 or 4, 5, 6 etc.) 11 to 1 11.667 to 1 -0.053
Corner any four adjoining numbers in a block (1, 2, 4, 5 or 17, 18, 20, 21 etc. ) 8 to 1 8.5 to 1 -0.053
Five Number Bet 0, 00, 1, 2, 3 6 to 1 6.6 to 1 -0.079
Six Line any six numbers from two horizontal rows (1, 2, 3, 4, 5, 6 or 28, 29, 30, 31, 32, 33 etc.) 5 to 1 5.33 to 1 -0.053
1st Column 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34 2 to 1 2.167 to 1 -0.053
2nd Column 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35 2 to 1 2.167 to 1 -0.053
3rd Column 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36 2 to 1 2.167 to 1 -0.053
1st Dozen 1 through 12 2 to 1 2.167 to 1 -0.053
2nd Dozen 13 through 24 2 to 1 2.167 to 1 -0.053
3rd Dozen 25 through 36 2 to 1 2.167 to 1 -0.053
Odd 1, 3, 5, ..., 35 1 to 1 1.111 to 1 -0.053
Even 2, 4, 6, ..., 36 1 to 1 1.111 to 1 -0.053
Red 1, 3, 5, 7, 9, 12,14, 16, 18, 19, 21, 23,25, 27, 30, 32, 34, 36 1 to 1 1.111 to 1 -0.053
Black 2, 4, 6, 8, 10, 11,13, 15, 17, 20, 22, 24,26, 28, 29, 31, 33, 35 1 to 1 1.111 to 1 -0.053
1 to 18 1, 2, 3, ..., 18 1 to 1 1.111 to 1 -0.053
19 to 36 19, 20, 21, ..., 36 1 to 1 1.111 to 1 -0.053
25
Roulette
  • Based on the previous table, every $1 bet in
    roulette has an expected value of -0.0526
    (-0.079 for the five-number bet).
  • Thus roulette is an unfavorable game.
  • Note there is research on how to play unfavorable
    games optimally based on dynamic programming.
  • But if you play many times, and the wheel is
    fair, you will lose money.
  • Why do people play?
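As a check on the table's expected values, here is a small sketch (my own, assuming a fair 38-space American wheel as the slide does):

```python
from fractions import Fraction

def roulette_ev(winning_spaces: int, payout_to_one: int,
                spaces: int = 38) -> Fraction:
    """Expected value of a $1 bet on an American wheel (0 and 00)."""
    p_win = Fraction(winning_spaces, spaces)
    return p_win * payout_to_one - (1 - p_win) * 1

print(float(roulette_ev(1, 35)))  # straight-up bet:  ~ -0.0526
print(float(roulette_ev(5, 6)))   # five-number bet:  ~ -0.079
```

Every bet except the five-number bet works out to the same -2/38 house edge, which is why the table's last column is so uniform.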

26
Money Lines or odds sets
  • Another way of expressing odds.
  • Used frequently for hockey and baseball betting.
  • The Globe and Mail (December 31, 2008):
  • In an NHL game the favorite, Calgary, has a line of
    -175 and Edmonton, the underdog, has a line of
    +155.
  • This means that if you want to bet on Calgary,
    you must bet $175 to win $100, and if you want to
    bet on Edmonton you must bet $100 to win $155.
  • This implies
  • P(Calgary) = 175/275 = 7/11 ≈ .636
  • P(Edmonton) = 100/255 = 20/51 ≈ .392
  • What's happening? What about ties?
  • Suppose the Calgary probability is correct (and
    ties are not possible). Then the probability of
    Edmonton winning should be 1 - .636 = .364, and the
    money line on Edmonton should be
    $100 × (.636/.364) ≈ +175!
  • So the House is taking about $20 off the payout on
    a winning $100 Edmonton bet.
  • The same argument for Calgary implies ?
  • So again, as in roulette, the House is taking a
    premium on every bet by reducing the payoff below
    the expected value of the gamble.
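The money-line arithmetic can be sketched as follows (the function `implied_prob` is mine; the "overround" check at the end is an illustration of the House premium, not from the slides):

```python
def implied_prob(line: int) -> float:
    """Implied win probability from a money line.
    Negative line (favorite): bet |line| to win 100 -> p = |line| / (|line| + 100)
    Positive line (underdog): bet 100 to win line   -> p = 100 / (line + 100)"""
    if line < 0:
        return -line / (-line + 100)
    return 100 / (line + 100)

p_cal = implied_prob(-175)  # 7/11  ~ 0.636
p_edm = implied_prob(+155)  # 20/51 ~ 0.392
print(round(p_cal, 3), round(p_edm, 3))
print(round(p_cal + p_edm - 1, 3))  # ~ 0.029: the implied probabilities
                                    # sum past 1 -- the House's margin
```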

27
Assigning probabilities to events
  • The uncertainty of an event will be measured
    according to its probability of occurrence
  • For events that have been repeated several times
    and regularly observed, it's easy to assign a
    probability
  • Outcomes of gambling games
  • Tossing coins, rolling dice, spinning roulette
    wheels, etc.
  • Actuarial and statistical events
  • A 30-year-old female driver having an accident in
    the next year
  • The chance of rain tomorrow, given today's
    weather conditions
  • The number of cars driving over the Lions Gate
    bridge tomorrow between 8 and 9 AM
  • The number of admits to the emergency room at VGH
    on January 7, 2009

28
Assigning probabilities to events
  • However, not all events occur with statistical
    regularity
  • General Motors will be bankrupt by July 1, 2010
  • A democrat will win the 2012 US presidential
    election
  • The uncertainty of an event often derives from a
    lack of precise knowledge
  • How many jellybeans are in a jar?
  • Was W.L. MacKenzie King Prime Minister of Canada
    in 1936?
  • Or there is not much data available
  • Will a new medical treatment be effective in a
    specific patient?
  • Since these events cannot be repeated in any
    meaningful way, how can we assign a probability
    to their occurrence?
  • We can rely on election stock markets or odds if
    they are available.
  • What if they're not?

29
Assigning probabilities to events
  • It is important to recognize that two different
    people in the same situation might assign two
    different probabilities to the same event
  • A probability assignment reflects your personal
    assessment of the likelihood of an event; the
    uncertainty being measured is your uncertainty
  • Different people may have different knowledge
    about the event in question
  • Even people with the same knowledge could still
    differ in their opinion of the likelihood of an
    event
  • Someone could coherently assign a probability of
    ¼ to a coin coming up heads, if he/she had reason
    to believe the coin is not fair
  • They are often called subjective probabilities.
  • The assessment of subjective probabilities is a
    key topic in research on decision analysis (and
    forecasting)

30
Assigning probabilities to events
  • Example: Suppose we wish to assign a
    probability to the event "A thumbtack lands with
    its point up"
  • How could we find this probability?
  • We could guess.
  • We can gauge our belief of the likelihood of an
    event by comparing it to a set of standard
    statistical probabilities through a reference
    lottery.
  • We can compare the following two gambles to
    assess this probability
  • Choice A: Toss the thumbtack. If it lands point
    up, you win $1; otherwise you receive $0
  • Choice B: Spin the spinner. If it ends on blue,
    you win $1; otherwise you receive $0
  • We can adjust the portion of the spinner that is
    blue until we are indifferent between the two
    choices.
  • This Probability spinner provides a way of
    varying the blue portion systematically.

31
The implied decision tree
  Choice A --+-- Thumbtack lands point up   $1
             +-- Thumbtack lands tip down   $0

  Choice B --+-- Spinner lands on blue      $1
             +-- Spinner lands on red       $0
32
Implications of using reference lottery
  • If the spinner is set so that the probability of
    blue is .5 and you prefer A to B, then you
    believe the probability thumbtack up is greater
    than .5
  • If the spinner is set so that the probability of
    blue is .9 and you prefer B to A, then you
    believe the probability thumbtack up is less
    than .9
  • Repeating this can give a plausible range for the
    probability of "thumbtack up".
  • This is hard to do!
  • There is a big literature on biases of such
    assignments.
  • Alternatively we could construct a distribution
    of plausible values for this probability and the
    likelihood of each of these values instead of
    assigning one number.
  • Or we could input our assessment into the
    decision problem and do sensitivity analysis.

33
Another option acquire information
  • Suppose you were faced with Choice A only. What
    would this gamble be worth?
  • One approach: provide a prior distribution on p,
    the probability of the event.
  • Example: Uniformly distributed on [0, 1].
  • Base the decision on the mean, median or mode of
    this distribution.
  • Toss the thumbtack once, and use Bayes' theorem
    to update this probability.
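The one-toss update described above can be written using the standard Beta-Binomial conjugacy (my illustration; the slides do not specify an implementation). A Uniform(0, 1) prior on p is Beta(1, 1), and each toss adds 1 to the matching parameter:

```python
# Bayesian update of the "point up" probability p for a thumbtack,
# starting from a Uniform(0,1) = Beta(1,1) prior.

def beta_update(alpha: float, beta: float, point_up: bool):
    """Posterior (alpha, beta) after observing one thumbtack toss."""
    return (alpha + 1, beta) if point_up else (alpha, beta + 1)

a, b = 1, 1                     # Uniform prior
a, b = beta_update(a, b, True)  # one toss lands point up
print(a / (a + b))              # posterior mean = 2/3
```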

34
Assigning probabilities to events
  • Let E be an event, and let H represent the
    knowledge and background information used to make
    a probability judgment. We denote the assigned
    probability as P(E | H),
  • the probability of event E given information H
  • We do not consider probabilities as separate from
    the information used to assess them
  • This reflects the fact that we consider all
    probabilities to be based on the judgment of an
    individual and the individual's knowledge at the
    time of the assessment.
  • Even though we consider probabilities to be based
    on an individual's judgment, they cannot be
    arbitrarily assigned
  • Certain rules must be obeyed for the assignments
    to be coherent
  • Using the method outlined above to assign
    probabilities avoids incoherent assignments

35
Axioms of Probability
  • The probability assignments P(E | H) must obey
    the following basic axioms:
  • 0 ≤ P(E | H) ≤ 1
  • (Addition law) Suppose that E1 and E2 are two
    events that could not both occur together (they
    are mutually exclusive). Then
  • P(E1 or E2 | H) = P(E1 | H) + P(E2 | H)
  • If E1 and E2 are mutually exclusive and
    collectively exhaustive, then
  • P(E1 or E2 | H) = P(E1 | H) + P(E2 | H) = 1
  • (Multiplication law) For any two events E1 and
    E2,
  • P(E1 and E2 | H) = P(E1 | H) P(E2 | E1 and H)
  • If E1 and E2 are independent (i.e., P(E2 | E1
    and H) = P(E2 | H)), then
  • P(E1 and E2 | H) = P(E1 | H) P(E2 | H)
  • These rules can be used to compute probability
    assignments for complex events based on those for
    simpler events

36
The law of total probabilities
  • This law can be derived from the axioms and the
    definition of conditional probability. It says
    that for any two events A and E,
  • P(A | H) = P(A and E | H) + P(A and Ec | H)
  •          = P(A | E and H) P(E | H)
             + P(A | Ec and H) P(Ec | H)
  • This law is useful because it allows one to
    divide a complex event into subparts for which it
    may be easier to assess probabilities.
  • Also it generalizes to more than just
    conditioning on E and Ec.
  • We can replace it by any set (or continuum) of
    events that partitions the sample space.
  • It is used widely in probability theory to
    compute complex probabilities and is fundamental
    for evaluating Markov chains

37
Bayes rule
  • This is a very important rule that we will use
    extensively.
  • It is a way to systematically include information
    in assessing probabilities
  • To simplify notation, let's drop the conditioning
    on H and assume it is understood that all
    probabilities are conditional on history.
  • Bayes' rule can be written as
  • P(A | E) = P(E | A) P(A) /
    [P(E | A) P(A) + P(E | Ac) P(Ac)]
  • It is derived using the definition of conditional
    probability and the law of total probabilities
  • It generalizes to any set of events that
    partitions the sample space.

38
Updating probability assessments
  • Suppose that you can't see inside a jellybean jar
    containing only red and white beans, but I tell
    you that either 25% of the beans are red or 75%
    are red. You think these possibilities are
    equally likely.
  • Suppose I pick a bean at random. What is the
    probability it is red?
  • Now you draw 5 jellybeans from the jar with
    replacement, and find that 4 of them are red.
    How should you revise your belief in the
    probability that 25% of the beans are red, in
    light of this information?
  • Let A be the event "25% of beans in the jar are
    red" and let E be the event of drawing 5 beans
    and obtaining 4 red and 1 white
  • We want to find P(A | E), and we know
  • P(A) = P(Ac) = 0.5
  • P(E | A) = .75 (.25)^4 ≈ 0.00293
  • P(E | Ac) = .25 (.75)^4 ≈ 0.0791

39
Updating probability assessments
  • Using Bayes' rule, we now compute
  • P(A | E) = .00293(0.5) / [.00293(0.5) +
    .0791(0.5)] ≈ .0357
  • Thus, you should now believe that there is about
    a 3.5% chance that 25% of the jellybeans are red.
  • You also think there is about a 96.5% chance that
    75% of the beans are red
  • Obviously, we have received strong evidence
    regarding the contents of the jar, since our
    beliefs have gone from complete uncertainty (50%)
    to high probability (96.5%)
  • Let's look at some of the terms involved in
    Bayes' rule
  • The expression P(A | E) is known as the
    posterior probability of A, i.e., the assessed
    probability of A after we learn that E has
    occurred
  • P(A) is known as the prior probability of A,
    i.e., the assessment of the probability of A
    before the new information was received
  • P(E | A) is known as the likelihood of E
    occurring given that A is true
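The jellybean calculation can be reproduced numerically (the `bayes` helper is my own name for the two-hypothesis form of Bayes' rule):

```python
# A  = "25% of the beans are red"; Ac = "75% are red".
# E  = a particular draw of 4 red and 1 white, with replacement.

def bayes(prior_a: float, like_e_a: float, like_e_ac: float) -> float:
    """Posterior P(A | E) for a two-hypothesis Bayes' rule."""
    return (like_e_a * prior_a /
            (like_e_a * prior_a + like_e_ac * (1 - prior_a)))

p_e_a  = 0.75 * 0.25**4  # 4 reds at .25 each, one white at .75 -> ~0.00293
p_e_ac = 0.25 * 0.75**4  # same draw if 75% of beans are red    -> ~0.0791
print(round(bayes(0.5, p_e_a, p_e_ac), 4))  # 0.0357
```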

40
Probabilities for single events
  • To follow on the previous example, let's now ask
    what we think the probability is that the
    next bean drawn from the jar is red
  • We assign a probability of .965 to there being
    75% red beans in the jar, and a chance of .035 to
    there being 25% red beans in the jar
  • Let A = "75% of the beans in the jar are red"
  • Let Ac = "25% of the beans in the jar are red"
  • Let B = "The next bean drawn from the jar is red"
  • We use the law of total probability to compute
    P(B):
  • P(B) = P(B | A) P(A) + P(B | Ac) P(Ac)
  •      = .75(.965) + .25(.035) ≈ .733
  • P(B) is called a marginal probability
  • What was this probability before sampling?
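The marginal probability can be checked the same way (the helper name `total_prob` is mine); it also answers the closing question about the pre-sampling probability:

```python
# Predictive probability that the next bean is red, via the law of
# total probability, with updated beliefs P(A) = .965, P(Ac) = .035.

def total_prob(p_b_a: float, p_a: float, p_b_ac: float) -> float:
    """P(B) = P(B|A) P(A) + P(B|Ac) P(Ac)."""
    return p_b_a * p_a + p_b_ac * (1 - p_a)

print(total_prob(0.75, 0.965, 0.25))  # ~0.7325, the slide's .733
print(total_prob(0.75, 0.5, 0.25))    # 0.5 before sampling
```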

41
The Monty Hall problem
  • Monty Hall was the host of the once-popular game
    show "Let's Make a Deal"
  • In the show, contestants were shown three doors,
    behind each of which was a prize. The contestant
    chose a door and received the prize behind that
    door
  • This setup was behind one of the most notorious
    problems in probability
  • Suppose you are the contestant, and Monty tells
    you that there is a car behind one of the doors,
    and a goat behind each of the other doors. (Of
    course, Monty knows where the car is)
  • Suppose you choose door 1

42
The Monty Hall problem
  • Before revealing what's behind door 1, Monty
    says "Now I'm going to reveal to you one of the
    other doors you didn't choose" and opens door 3
    to show that there is a goat behind the door.
  • Monty now says "Before I open door 1, I'm going
    to allow you to change your choice. Would you
    rather that I open door 2 instead, or do you
    want to stick with your original choice of door
    1?"
  • What do you do?
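One way to answer the question is by simulation; this sketch (my own, not from the slides) plays the game many times under the rule that Monty always opens a goat door you didn't pick:

```python
import random

def monty_trial(switch: bool) -> bool:
    """One round: the car is placed at random, you pick door 0, Monty
    opens a goat door you didn't pick, you switch or stay.
    Returns True if you end up with the car."""
    car = random.randrange(3)
    pick = 0
    # Monty opens a door that is neither your pick nor the car.
    opened = next(d for d in range(3) if d != pick and d != car)
    if switch:
        pick = next(d for d in range(3) if d != pick and d != opened)
    return pick == car

random.seed(0)
n = 100_000
stay = sum(monty_trial(False) for _ in range(n)) / n    # ~ 1/3
switch = sum(monty_trial(True) for _ in range(n)) / n   # ~ 2/3
print(round(stay, 2), round(switch, 2))
```

The simulation suggests that switching roughly doubles the chance of winning the car, which matches the standard conditional-probability analysis.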

43
Summary
  • Sequential Decision Problems
  • Decision Trees
  • Probability Assessment, Odds and Gambling
  • Probability updating
  • Monty Hall and Hatton Realty for next time.