Title: Making Simple Decisions
1. Making Simple Decisions
Some material borrowed from Jean-Claude Latombe
and Daphne Koller by way of Marie desJardins.
2. Topics
- Decision making under uncertainty
- Utility theory and rationality
- Expected utility
- Utility functions
- Multiattribute utility functions
- Preference structures
- Decision networks
- Value of information
3. Uncertain Outcomes of Actions
- Some actions may have uncertain outcomes
- Action: spend $10 to buy a lottery ticket that pays $1000 to the winner
- Outcomes: win, not-win
- Each outcome is associated with some merit (utility)
  - Win: gain $990
  - Not-win: lose $10
- There is a probability distribution associated with the outcomes of this action: (0.0001, 0.9999)
- Should I take this action?
4. Expected Utility
- Random variable X with n values x1, ..., xn and distribution (p1, ..., pn)
- X is the outcome of performing action A (i.e., the state reached after A is taken)
- Function U of X: U is a mapping from states to numerical utilities (values)
- The expected utility of performing action A is
  EU(A) = Σ_{i=1..n} P(xi | A) U(xi)
  where U(xi) is the utility of each outcome and P(xi | A) is the probability of each outcome
- Expected utility of the lottery:
  EU = 0.0001 × 990 + 0.9999 × (-10) ≈ -9.9
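As a small check of the definition above, here is a minimal Python sketch (not part of the original slides; the names are illustrative only) that reproduces the lottery computation:

def expected_utility(outcomes):
    # outcomes: list of (probability, utility) pairs for the action's possible results
    return sum(p * u for p, u in outcomes)

# lottery from the slide: win with prob. 0.0001 (gain 990), otherwise lose 10
lottery = [(0.0001, 990), (0.9999, -10)]
print(expected_utility(lottery))   # ≈ -9.9, so the ticket is a bad deal in expectation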
5. One State/One Action Example
U(S0, A1) = 100 × 0.2 + 50 × 0.7 + 70 × 0.1
          = 20 + 35 + 7 = 62
6. One State/Two Actions Example
- U1(S0, A1) = 62
- U2(S0, A2) = 74
- U(S0) = max{U1(S0, A1), U2(S0, A2)} = 74
7. Introducing Action Costs
- U1(S0, A1) = 62 - 5 = 57
- U2(S0, A2) = 74 - 25 = 49
- U(S0) = max{U1(S0, A1), U2(S0, A2)} = 57
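The two-action example with costs can be checked with a short sketch (again illustrative, not from the slides; EU(S0, A2) = 74 is taken as given because slide 6 does not show its underlying distribution):

def eu(transition):
    # transition: list of (probability, utility of resulting state) pairs
    return sum(p * u for p, u in transition)

eu_a1 = eu([(0.2, 100), (0.7, 50), (0.1, 70)])   # = 62
eu_a2 = 74             # given directly on slide 6; its distribution is not shown
cost = {"A1": 5, "A2": 25}
u = {"A1": eu_a1 - cost["A1"],    # 62 - 5  = 57
     "A2": eu_a2 - cost["A2"]}    # 74 - 25 = 49
best = max(u, key=u.get)
print(best, u[best])              # A1 57.0  ->  U(S0) = 57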
8. MEU Principle
- Decision theory: a rational agent should choose the action that maximizes the agent's expected utility
- Maximizing expected utility (MEU) is a normative criterion for the rational choice of actions
- Must have a complete model of:
  - Actions
  - States
  - Utilities
- Even with a complete model, the computation may be intractable
9. Comparing outcomes
- Which is better?
  A: Being rich and sunbathing where it's warm
  B: Being rich and sunbathing where it's cool
  C: Being poor and sunbathing where it's warm
  D: Being poor and sunbathing where it's cool
- Multiattribute utility theory
  - A clearly dominates B: A > B. Also A > C, C > D, and A > D. What about B vs. C?
  - Simplest case: additive value function (just add the individual attribute utilities); see the sketch after this list
  - Others use a weighted utility, based on the relative importance of the attributes
  - Learning the combined utility function (similar to a joint probability table)
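A minimal sketch of an additive (weighted) value function for the rich/poor, warm/cool example. The attribute utilities and weights below are illustrative guesses, not values from the slides:

# V(x1, ..., xn) = sum_j w_j * V_j(x_j)
V_wealth = {"rich": 100, "poor": 0}
V_climate = {"warm": 80, "cool": 20}
weights = {"wealth": 0.7, "climate": 0.3}

def additive_value(wealth, climate):
    return weights["wealth"] * V_wealth[wealth] + weights["climate"] * V_climate[climate]

options = {"A": ("rich", "warm"), "B": ("rich", "cool"),
           "C": ("poor", "warm"), "D": ("poor", "cool")}
for label, (w, c) in options.items():
    print(label, additive_value(w, c))   # A 94, B 76, C 24, D 6

With these particular weights B beats C, but a different weighting could reverse that, which is why dominance alone cannot settle B vs. C.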
10. Multiattribute Utility Theory
- A given state may have multiple utilities
- ...because of multiple evaluation criteria
- ...because of multiple agents (interested
parties) with different utility functions
11. Decision networks
- Extend Bayesian nets to handle actions and utilities
- a.k.a. influence diagrams
- Make use of Bayesian net inference
- Useful application: Value of Information
12. R&N example
13. Decision network representation
- Chance nodes: random variables, as in Bayesian nets
- Decision nodes: actions that the decision maker can take
- Utility/value nodes: the utility of the outcome state
14. Evaluating decision networks
- Set the evidence variables for the current state.
- For each possible value of the decision node (assume just one decision node):
  - Set the decision node to that value.
  - Calculate the posterior probabilities for the parent nodes of the utility node, using BN inference.
  - Calculate the resulting utility for the action.
- Return the action with the highest utility (see the sketch below).
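A minimal Python sketch of the loop above. Here bn_posterior is a stand-in for a real Bayesian-network inference routine (e.g., variable elimination) and is assumed, not provided; the key name "decision" is likewise just an illustrative convention:

def evaluate_decision_network(decision_values, evidence, bn_posterior, utility):
    # Return (best_action, best_EU) for a network with a single decision node.
    # bn_posterior(evidence) -> dict: assignment of the utility node's parents -> probability
    # utility(assignment)    -> numeric utility of that outcome
    best_action, best_eu = None, float("-inf")
    for action in decision_values:                  # each possible decision value
        posterior = bn_posterior({**evidence, "decision": action})
        eu = sum(p * utility(outcome) for outcome, p in posterior.items())
        if eu > best_eu:
            best_action, best_eu = action, eu
    return best_action, best_eu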
15. Exercise: Umbrella network
Decision node: Umbrella (take / don't take)
Chance nodes: Weather (P(rain) = 0.4), Lug umbrella, Forecast
Utility node: Happiness

P(lug | take) = 1.0, P(¬lug | ¬take) = 1.0

Forecast model P(f | w):
  P(sunny | rain) = 0.3      P(rainy | rain) = 0.7
  P(sunny | ¬rain) = 0.8     P(rainy | ¬rain) = 0.2

Utilities:
  U(lug, rain) = -25      U(lug, ¬rain) = 0
  U(¬lug, rain) = -100    U(¬lug, ¬rain) = 100

EU(take)  = U(lug, rain) P(lug | take) P(rain) + U(lug, ¬rain) P(lug | take) P(¬rain)
          = -25 × 0.4 + 0 × 0.6 = -10
EU(¬take) = U(¬lug, rain) P(¬lug | ¬take) P(rain) + U(¬lug, ¬rain) P(¬lug | ¬take) P(¬rain)
          = -100 × 0.4 + 100 × 0.6 = 20
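The two expected utilities above can be checked with a few lines of Python (illustrative names, values copied from the slide):

P_rain = 0.4
U = {("lug", "rain"): -25, ("lug", "no_rain"): 0,
     ("no_lug", "rain"): -100, ("no_lug", "no_rain"): 100}

def eu(lug_state):
    # taking the umbrella forces lug, not taking it forces no_lug (both with prob. 1.0)
    return U[(lug_state, "rain")] * P_rain + U[(lug_state, "no_rain")] * (1 - P_rain)

print(eu("lug"))     # EU(take)  = -25*0.4 + 0*0.6    = -10.0
print(eu("no_lug"))  # EU(~take) = -100*0.4 + 100*0.6 = 20.0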
16. Umbrella network
The decision may be helped by the forecast (additional information).
Example policy: D(F = Sunny) = Take, D(F = Rainy) = Not_Take
(Same network, probabilities, and utilities as slide 15.)
17. Value of Perfect Information (VPI)
- How much is it worth to observe (with certainty) a random variable X?
- Suppose the agent's current knowledge is E. The value of the current best action α is
  EU(α | E) = max_A Σ_i U(Result_i(A)) P(Result_i(A) | E, Do(A))
- The value of the new best action α_x after observing that X = x is
  EU(α_x | E, X = x) = max_A Σ_i U(Result_i(A)) P(Result_i(A) | E, X = x, Do(A))
- But we don't know the value of X yet, so we have to average over its possible values
- The value of perfect information for X is therefore
  VPI(X) = ( Σ_k P(X = x_k | E) EU(α_{x_k} | X = x_k, E) ) - EU(α | E)
  where EU(α | E) is the expected utility of the best action if we don't know X (i.e., currently),
  EU(α_{x_k} | X = x_k, E) is the expected utility of the best action given that value of X,
  and P(X = x_k | E) is the probability of each value of X.
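The formula transcribes directly into a short Python function; the inputs are assumed to have already been computed by decision-network evaluation, and the names are illustrative:

def vpi(p_x, eu_best_given_x, eu_best_current):
    # p_x:              dict value -> P(X = value | E)
    # eu_best_given_x:  dict value -> EU of the best action once X = value is observed
    # eu_best_current:  EU of the best action under evidence E alone
    return sum(p_x[v] * eu_best_given_x[v] for v in p_x) - eu_best_current

The numbers from the umbrella exercise on slide 19 plug straight into this function.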
18. Umbrella network
The decision may be helped by the forecast (additional information).
Example policy: D(F = Sunny) = Take, D(F = Rainy) = Not_Take
(Same network, probabilities, and utilities as slide 15; repeated as the setup for the exercise on slide 19.)
19. Exercise: Umbrella network (with forecast)
First compute the posterior weather probabilities, P(W | F) = α P(F | W) P(W):
  P(sunny | rain) P(rain) = 0.3 × 0.4 = 0.12
  P(sunny | ¬rain) P(¬rain) = 0.8 × 0.6 = 0.48
  α = 1 / (0.12 + 0.48) = 5/3
  P(rain | sunny) = 0.12 × 5/3 = 0.2      P(¬rain | sunny) = 0.48 × 5/3 = 0.8
Similarly, for a rainy forecast:
  P(rainy | rain) P(rain) = 0.7 × 0.4 = 0.28
  P(rainy | ¬rain) P(¬rain) = 0.2 × 0.6 = 0.12
  α = 1 / (0.28 + 0.12) = 2.5
  P(rain | rainy) = 0.28 × 2.5 = 0.7      P(¬rain | rainy) = 0.12 × 2.5 = 0.3

Given a rainy forecast:
  EU(take | F = rainy)  = -25 × P(rain | rainy) + 0 × P(¬rain | rainy) = -25 × 0.7 = -17.5
  EU(¬take | F = rainy) = -100 × 0.7 + 100 × 0.3 = -40
  → best action α_rainy = take
Given a sunny forecast:
  EU(take | F = sunny)  = -25 × P(rain | sunny) + 0 × P(¬rain | sunny) = -25 × 0.2 = -5
  EU(¬take | F = sunny) = -100 × 0.2 + 100 × 0.8 = 60
  → best action α_sunny = ¬take

With P(F = sunny) = 0.12 + 0.48 = 0.6, P(F = rainy) = 0.4, and EU of the current best action = 20 (slide 15):
  VPI(F) = 60 × P(F = sunny) + (-17.5) × P(F = rainy) - 20
         = 60 × 0.6 - 17.5 × 0.4 - 20 = 36 - 7 - 20 = 9
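The whole exercise can be verified end to end with a short Python check (plain Python, illustrative names, values from the slides):

P_rain = 0.4
P_f_given_w = {("sunny", "rain"): 0.3, ("rainy", "rain"): 0.7,
               ("sunny", "no_rain"): 0.8, ("rainy", "no_rain"): 0.2}
U = {("lug", "rain"): -25, ("lug", "no_rain"): 0,
     ("no_lug", "rain"): -100, ("no_lug", "no_rain"): 100}

def posterior_rain(f):
    # P(rain | F = f) by Bayes' rule
    joint_rain = P_f_given_w[(f, "rain")] * P_rain
    joint_dry = P_f_given_w[(f, "no_rain")] * (1 - P_rain)
    return joint_rain / (joint_rain + joint_dry)

def best_eu(f):
    # EU of the better action (take vs. don't take) once forecast f is seen
    p = posterior_rain(f)
    eu_take = U[("lug", "rain")] * p + U[("lug", "no_rain")] * (1 - p)
    eu_not_take = U[("no_lug", "rain")] * p + U[("no_lug", "no_rain")] * (1 - p)
    return max(eu_take, eu_not_take)

P_sunny = 0.3 * 0.4 + 0.8 * 0.6          # P(F = sunny) = 0.6
vpi_f = P_sunny * best_eu("sunny") + (1 - P_sunny) * best_eu("rainy") - 20
print(best_eu("sunny"), best_eu("rainy"), vpi_f)   # ≈ 60.0, -17.5, 9.0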