Title: Random Variables
1Random Variables and Expectation
2Random Variable
- A random variable (r.v.) is a well-defined rule
for assigning a numerical value to each possible
outcome of an experiment.
- example
- experiment: taking a course
- outcomes: grades A, B, C, D, F
- sample space S: discrete, finite
- random variable: Y = 4 if grade is A
- Y = 3 if grade is B
- Y = 2 if grade is C
- Y = 1 if grade is D
- Y = 0 if grade is F
3Experiment: throw 2 dice. What are the possible
outcomes?
- 1,1 2,1 3,1 4,1 5,1 6,1
- 1,2 2,2 3,2 4,2 5,2 6,2
- 1,3 2,3 3,3 4,3 5,3 6,3
- 1,4 2,4 3,4 4,4 5,4 6,4
- 1,5 2,5 3,5 4,5 5,5 6,5
- 1,6 2,6 3,6 4,6 5,6 6,6
4Define the random variable X to be the sum of the
dots on the 2 dice.
5For which outcomes does X = 9?
- 3,6 4,5 5,4 6,3
7What is Pr(X = 9)?
Since there are 36 equally likely outcomes, each
has a probability of 1/36. Since 4 of those
outcomes yield X = 9, Pr(X = 9) = 4/36 = 1/9.
8Let's calculate the probabilities of all the
possible values x of the random variable X.
 x    Pr(X = x)
 2    1/36
 3    2/36
 4    3/36
 5    4/36
 6    5/36
 7    6/36
 8    5/36
 9    4/36
10    3/36
11    2/36
12    1/36
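The table of Pr(X = x) can be reproduced by enumerating the 36 outcomes and counting how many give each sum. The following is a minimal sketch; the names `outcomes`, `counts`, and `pmf` are illustrative choices, not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of throwing two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes give each value of X = sum of the dots.
counts = Counter(a + b for a, b in outcomes)

# Each outcome has probability 1/36, so Pr(X = x) = count / 36.
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

# Print the table with the unreduced denominator used on the slides.
for x in pmf:
    print(f"{x:2d}  {counts[x]}/36")
```

Exact fractions are used so that the probabilities can be compared with the table without rounding error.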
20Let's graph the probability distribution of X.
[Graph: Pr(X = x) plotted against x = 2, ..., 12]
21Pr(X = x) = f(x) = p(x), as described in the table
above or in a graph, is called the probability distribution
or probability mass function (p.m.f.).
22Properties of Probability Distributions
- $0 \le \Pr(X = x) \le 1$ for all x
- $\sum_x \Pr(X = x) = 1$
23Cumulative Mass Function
- $F(x) = \Pr(X \le x)$
24Cumulative Mass Function (2 dice problem)
 x    Pr(X = x)    $F(x) = \Pr(X \le x)$
 2    1/36         1/36
 3    2/36         3/36
 4    3/36         6/36
 5    4/36         10/36
 6    5/36         15/36
 7    6/36         21/36
 8    5/36         26/36
 9    4/36         30/36
10    3/36         33/36
11    2/36         35/36
12    1/36         1
[Graph: F(x) plotted against x, for x from 0 to 13]
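The cumulative column is just a running total of the p.m.f. column. A short sketch of that step, assuming the p.m.f. counts from the two-dice table (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import accumulate

# p.m.f. of X = sum of two dice, taken from the table: Pr(X = x) = count/36.
xs = list(range(2, 13))
counts = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
pmf = [Fraction(c, 36) for c in counts]

# F(x) = Pr(X <= x) is the running total of the p.m.f. up to and including x.
cmf = dict(zip(xs, accumulate(pmf)))

for x in xs:
    print(x, cmf[x])
```

Note that `Fraction` prints values in lowest terms, so 30/36 appears as 5/6.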
25Expectation, Expected Value, or Mean of a Random
Variable
- $E(X) = \mu = \sum_x x\,p(x)$
26Notice the similarity of the definitions of the
mean of a random variable and the mean of a
frequency distribution for a population.
- mean of a random variable: $\mu = \sum_x x\,p(x)$
- mean of a frequency distribution: $\mu = \sum x f / N = \sum x\,(f/N)$
Recall that probability p(x) is the relative
frequency f/N with which something occurs over
the long run. So these definitions are saying
the same thing.
27Example: Suppose that a stock broker wants to
estimate the price of a certain stock one year
from now. If the probability mass function of
the price in a year is as given, determine the
expected price.
x = price in one year    p(x)    x p(x)
 94                      0.25     23.5
 98                      0.25     24.5
102                      0.25     25.5
106                      0.25     26.5
Total                    1.00    100.0
The expected price is E(X) = 100.0.
Notice that you do NOT divide by the number of
observations when you're done adding. Also, the
probabilities do not have to be equal; they just
have to add up to one.
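The stock-price calculation can be sketched in a few lines. The function name `expected_value` and the dictionary layout are assumptions made for illustration; the check that the probabilities total one mirrors the remark above.

```python
def expected_value(pmf):
    """Return E(X) = sum of x * p(x) for a p.m.f. given as {x: p(x)}."""
    total = sum(pmf.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must add up to one")
    # No division by the number of values: the weights already sum to one.
    return sum(x * p for x, p in pmf.items())

# p.m.f. of the stock price in one year, from the example.
price_pmf = {94: 0.25, 98: 0.25, 102: 0.25, 106: 0.25}
mean_price = expected_value(price_pmf)
print(mean_price)
```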
31Theorem: Suppose that g(X) is a function of a
random variable X, and the probability mass
function of X is $p_X(x)$. Then the expected value
of g(X) is
$E[g(X)] = \sum_x g(x)\,p_X(x)$
32Example: Suppose $Y = X^2$ and the distribution of X
is as given below. Determine the mean of Y = g(X) by
using 1. the definition of expected value, 2.
the previous theorem.
 x    p(x)
-2    0.1
-1    0.2
 1    0.3
 2    0.4
1. Using the definition: first find the distribution
of Y. Y = 1 when X is -1 or 1, so p(1) = 0.2 + 0.3 = 0.5.
Y = 4 when X is -2 or 2, so p(4) = 0.1 + 0.4 = 0.5.
 y    p(y)    y p(y)
 1    0.5     0.5
 4    0.5     2.0
E(Y) = 2.5
2. Using the theorem: multiply each $y = x^2$ by $p_X(x)$
and add up, without finding the distribution of Y.
 x    p(x)    $y = x^2$    $y\,p_X(x)$
-2    0.1     4            0.4
-1    0.2     1            0.2
 1    0.3     1            0.3
 2    0.4     4            1.6
E(Y) = 2.5
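Both routes to E(Y) can be compared side by side. The sketch below assumes the distribution of X from the example; the helper `g` and the variable names are illustrative.

```python
from collections import defaultdict

# Distribution of X from the example.
x_pmf = {-2: 0.1, -1: 0.2, 1: 0.3, 2: 0.4}

def g(x):
    """The function of X under study: Y = g(X) = X squared."""
    return x ** 2

# Method 1: build the distribution of Y, then apply the definition of E(Y).
y_pmf = defaultdict(float)
for x, p in x_pmf.items():
    y_pmf[g(x)] += p  # values of x that map to the same y pool their probability
mean_by_definition = sum(y * p for y, p in y_pmf.items())

# Method 2: the theorem, E[g(X)] = sum of g(x) * p(x), without finding p(y).
mean_by_theorem = sum(g(x) * p for x, p in x_pmf.items())

print(dict(y_pmf), mean_by_definition, mean_by_theorem)
```

The theorem saves the intermediate step of collapsing X values into Y values, which matters more when g maps many x values to the same y.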
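41Definition: Variance of a random variable X
$V(X) = \sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2\,p(x)$
42Theorem: The variance of X can also be calculated
as follows
$V(X) = E(X^2) - [E(X)]^2$
43Standard Deviation of a random variable X
$\sigma = \sqrt{V(X)}$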
44Example: Suppose sales at a donut shop are
distributed as below. Calculate (a) the mean
number of donuts sold, (b) the variance (using
both the definition of the variance and the
theorem), (c) the standard deviation.
 x    p(x)
 1    0.08
 2    0.27
 4    0.10
 6    0.33
12    0.22
45First, the mean.
 x    p(x)    x p(x)
 1    0.08    0.08
 2    0.27    0.54
 4    0.10    0.40
 6    0.33    1.98
12    0.22    2.64
$\mu = 5.64$
47Next, the variance using the definition.
 x    p(x)    x p(x)    $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2 p(x)$
 1    0.08    0.08      -4.64        21.53            1.72
 2    0.27    0.54      -3.64        13.25            3.58
 4    0.10    0.40      -1.64         2.69            0.27
 6    0.33    1.98       0.36         0.13            0.04
12    0.22    2.64       6.36        40.45            8.90
$\mu = 5.64$, $\sigma^2 = 14.51$
51Now, the variance using the theorem
$V(X) = E(X^2) - [E(X)]^2$.
 x    p(x)    x p(x)    $x^2$    $x^2 p(x)$
 1    0.08    0.08        1       0.08
 2    0.27    0.54        4       1.08
 4    0.10    0.40       16       1.60
 6    0.33    1.98       36      11.88
12    0.22    2.64      144      31.68
$\mu = 5.64$, $E(X^2) = 46.32$
$\sigma^2 = V(X) = E(X^2) - [E(X)]^2 = 46.32 - (5.64)^2 = 14.51$
55And lastly, the standard deviation, by taking the
square root of the variance.
$\sigma = \sqrt{14.51} = 3.81$
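The donut example can be checked numerically. This is a sketch using the p.m.f. from the example; the variable names are illustrative, and results are rounded to two decimals only for display.

```python
from math import sqrt

# p.m.f. of donut sales from the example.
donut_pmf = {1: 0.08, 2: 0.27, 4: 0.10, 6: 0.33, 12: 0.22}

# (a) Mean: sum of x * p(x).
mean = sum(x * p for x, p in donut_pmf.items())

# (b) Variance by the definition: sum of (x - mean)^2 * p(x).
var_definition = sum((x - mean) ** 2 * p for x, p in donut_pmf.items())

# (b) Variance by the theorem: E(X^2) - [E(X)]^2.
mean_of_square = sum(x ** 2 * p for x, p in donut_pmf.items())
var_theorem = mean_of_square - mean ** 2

# (c) Standard deviation: square root of the variance.
std_dev = sqrt(var_definition)

print(round(mean, 2), round(var_definition, 2),
      round(var_theorem, 2), round(std_dev, 2))
```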
56Important Theorem
- If X has mean $\mu$ and variance $\sigma^2$, then $(X - \mu)/\sigma$
has mean 0 and variance 1.
57Example: $(G - \mu)/\sigma$
- Suppose your course grades have a mean of 2.7 and
a standard deviation of 1.2.
- Suppose you took your grades, subtracted 2.7 from
each one, then divided those results by 1.2.
- The new set of numbers would have a mean of 0 and
a standard deviation of 1.
58Expectation Rules: Let k, a, b be constants.
- 1. E(k) = k. The mean of a constant is the constant.
- 2. V(k) = 0. The variance of a constant is zero.
- 3. E(a + bX) = a + b E(X)
- 4. $V(a + bX) = b^2\,V(X)$
59Example: If X has a mean of 3 and a variance of
2/3, what are the mean and variance of Y = 5 + 2X?
- First find the mean: E(Y) = E(5 + 2X).
- E(a + bX) = a + b E(X).
- Let a = 5 and b = 2. Then just plug into the formula.
- So, E(Y) = E(5 + 2X) = 5 + 2 E(X) = 5 + 2(3) = 11.
- Next find the variance: V(Y) = V(5 + 2X).
- $V(a + bX) = b^2\,V(X)$.
- Again let a = 5 and b = 2 and just plug into the
formula.
- $V(Y) = V(5 + 2X) = 2^2\,V(X) = 4\,V(X) = 4(2/3) = 8/3$.
- Notice that the constant term shifts the mean but
has no effect on the spread of the distribution.
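The rules for E(a + bX) and V(a + bX) can be verified on a concrete distribution. The example only states the mean and variance of X, so the sketch below assumes a hypothetical X that is equally likely to be 2, 3, or 4, which happens to have mean 3 and variance 2/3. The helper names `mean` and `variance` are also illustrative.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) for a p.m.f. given as {x: p(x)}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) by the definition: sum of (x - mean)^2 * p(x)."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Hypothetical distribution (an assumption) with E(X) = 3 and V(X) = 2/3.
x_pmf = {2: Fraction(1, 3), 3: Fraction(1, 3), 4: Fraction(1, 3)}

a, b = 5, 2
# Distribution of Y = a + bX: same probabilities, transformed values.
y_pmf = {a + b * x: p for x, p in x_pmf.items()}

print(mean(y_pmf), a + b * mean(x_pmf))           # both sides of rule 3
print(variance(y_pmf), b ** 2 * variance(x_pmf))  # both sides of rule 4
```

Computing the mean and variance of Y directly from its own distribution, and then from the rules, gives the same pair of numbers.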
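60Joint Probability Distribution for 2 Discrete
Random Variables X and Y
- $p(x, y) = \Pr(X = x \text{ and } Y = y)$
61Properties of Joint Probability Distributions
- $0 \le p(x, y) \le 1$ for all x and y
- $\sum_x \sum_y p(x, y) = 1$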
62Example: Consider the following joint
distribution of the number of jobs and the number
of promotions of college graduates in their first 5
years out of college.
                        Number of promotions (y)
Number of jobs (x)      1       2       3       4
        1               0.10    0.15    0.12    0.06
        2               0.05    0.07    0.10    0.05
        3               0.04    0.02    0.14    0.10
63For example, the probability of 3 jobs and 2
promotions is 0.02.
64We can determine the marginal distribution of
the 2 random variables X and Y just as we did
before for 2 events. Just add across the row or
down the column.
65For the probability of 1 job, add across the first
row: 0.10 + 0.15 + 0.12 + 0.06 = 0.43. Similarly for
the probabilities of 2 or 3 jobs.
67For the probability of 1 promotion, add down the
first column: 0.10 + 0.05 + 0.04 = 0.19. Similarly
for the probabilities of 2, 3, or 4 promotions.
                        Number of promotions (y)
Number of jobs (x)      1       2       3       4       $p_X(x)$
        1               0.10    0.15    0.12    0.06    0.43
        2               0.05    0.07    0.10    0.05    0.27
        3               0.04    0.02    0.14    0.10    0.30
     $p_Y(y)$           0.19    0.24    0.36    0.21    1.00
69Notice again that you must get a total of one when
you total the marginal probabilities for x and
for y.
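Adding across rows and down columns can be sketched as follows. The joint table is stored as a dictionary keyed by (x, y); that layout and the names `joint`, `p_x`, and `p_y` are illustrative choices.

```python
from collections import defaultdict

# Joint distribution p(x, y): x = number of jobs, y = number of promotions.
joint = {
    (1, 1): 0.10, (1, 2): 0.15, (1, 3): 0.12, (1, 4): 0.06,
    (2, 1): 0.05, (2, 2): 0.07, (2, 3): 0.10, (2, 4): 0.05,
    (3, 1): 0.04, (3, 2): 0.02, (3, 3): 0.14, (3, 4): 0.10,
}

p_x = defaultdict(float)  # marginal distribution of X (row totals)
p_y = defaultdict(float)  # marginal distribution of Y (column totals)
for (x, y), p in joint.items():
    p_x[x] += p  # add across the row for this x
    p_y[y] += p  # add down the column for this y

print({x: round(p, 2) for x, p in p_x.items()})
print({y: round(p, 2) for y, p in p_y.items()})
```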
70Conditional Probabilities for Random
Variables: Example
- The probability that X is 2 given that Y is 3:
- $p_{X|Y}(2 \mid 3) = \Pr(X = 2 \mid Y = 3)$
- $= \Pr(X = 2 \text{ and } Y = 3)/\Pr(Y = 3)$.
- The probability that Y is 2 given that X is 3:
- $p_{Y|X}(2 \mid 3) = \Pr(Y = 2 \mid X = 3)$
- $= \Pr(Y = 2 \text{ and } X = 3)/\Pr(X = 3)$.
71Let's do the calculations using our previous
example.
- $p_{X|Y}(2 \mid 3) = \Pr(X = 2 \text{ and } Y = 3)/\Pr(Y = 3) = 0.10/0.36 = 0.278$.
- $p_{Y|X}(2 \mid 3) = \Pr(Y = 2 \text{ and } X = 3)/\Pr(X = 3) = 0.02/0.30 = 0.067$.
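The two conditional probabilities can be computed directly from the joint table. In this sketch the function names `prob_x_given_y` and `prob_y_given_x` are hypothetical helpers introduced for illustration.

```python
# Joint distribution p(x, y): x = number of jobs, y = number of promotions.
joint = {
    (1, 1): 0.10, (1, 2): 0.15, (1, 3): 0.12, (1, 4): 0.06,
    (2, 1): 0.05, (2, 2): 0.07, (2, 3): 0.10, (2, 4): 0.05,
    (3, 1): 0.04, (3, 2): 0.02, (3, 3): 0.14, (3, 4): 0.10,
}

def prob_x_given_y(joint, x, y):
    """Pr(X = x | Y = y) = p(x, y) / pY(y)."""
    p_y = sum(p for (_, yj), p in joint.items() if yj == y)
    return joint[(x, y)] / p_y

def prob_y_given_x(joint, y, x):
    """Pr(Y = y | X = x) = p(x, y) / pX(x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / p_x

print(round(prob_x_given_y(joint, 2, 3), 3))  # X = 2 given Y = 3
print(round(prob_y_given_x(joint, 2, 3), 3))  # Y = 2 given X = 3
```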
72Cumulative Joint Mass Function for 2 Discrete
Random Variables X and Y
- $F(x, y) = \Pr(X \le x \text{ and } Y \le y)$
73Job/Promotion Example: Find the probability that a
person had 2 or fewer jobs and 3 or fewer promotions.
Here f(x, y) is the joint probability of x jobs and
y promotions from the table.
F(2,3) = f(1,1) + f(1,2) + f(1,3) + f(2,1) + f(2,2) + f(2,3)
= 0.10 + 0.15 + 0.12 + 0.05 + 0.07 + 0.10
= 0.59
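F(2,3) adds every cell with 2 or fewer jobs and 3 or fewer promotions. A minimal sketch, with `joint_cmf` as a hypothetical helper name:

```python
# Joint distribution p(x, y): x = number of jobs, y = number of promotions.
joint = {
    (1, 1): 0.10, (1, 2): 0.15, (1, 3): 0.12, (1, 4): 0.06,
    (2, 1): 0.05, (2, 2): 0.07, (2, 3): 0.10, (2, 4): 0.05,
    (3, 1): 0.04, (3, 2): 0.02, (3, 3): 0.14, (3, 4): 0.10,
}

def joint_cmf(joint, x, y):
    """F(x, y) = Pr(X <= x and Y <= y): sum the cells at or below both limits."""
    return sum(p for (xi, yj), p in joint.items() if xi <= x and yj <= y)

print(round(joint_cmf(joint, 2, 3), 2))
```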
84Independence
- Recall that 2 events A and B were independent if
$\Pr(A \cap B) = \Pr(A)\,\Pr(B)$.
- Similarly, 2 random variables are independent if
$p(x, y) = p_X(x)\,p_Y(y)$ for all values of x and y.
85In our previous example, are the number of jobs
and the number of promotions independent?
We must have $p(x, y) = p_X(x)\,p_Y(y)$ for all
values of x and y. To start, does p(1,1) equal
$p_X(1)\,p_Y(1)$?
- p(1,1) = 0.10
- $p_X(1)\,p_Y(1) = (0.43)(0.19) = 0.0817 \ne 0.10$
So X and Y are not independent. If that case
had been equal, we wouldn't be done yet. We'd
have to verify that equality held for all the
cells.
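Checking every cell by hand is tedious, so the test can be sketched in code. The function `is_independent` and its tolerance argument are assumptions made for this illustration; the tolerance only guards against floating-point rounding.

```python
# Joint distribution p(x, y): x = number of jobs, y = number of promotions.
joint = {
    (1, 1): 0.10, (1, 2): 0.15, (1, 3): 0.12, (1, 4): 0.06,
    (2, 1): 0.05, (2, 2): 0.07, (2, 3): 0.10, (2, 4): 0.05,
    (3, 1): 0.04, (3, 2): 0.02, (3, 3): 0.14, (3, 4): 0.10,
}

def is_independent(joint, tol=1e-9):
    """Return True only if p(x, y) = pX(x) * pY(y) holds in every cell."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    p_x = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        abs(joint.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
        for x in xs
        for y in ys
    )

print(is_independent(joint))
```

A single failing cell, such as (1, 1) here, is enough to rule out independence; a passing result requires every cell to match.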
86Theorem: mean of a function of 2 random
variables X and Y
$E[g(X, Y)] = \sum_x \sum_y g(x, y)\,p(x, y)$
87Suppose that, based on the joint distribution of
the length X and width Y of lumber sold by a
lumberyard, we would like to determine the mean
length, mean width, and mean area of the lumber.
- So we want to calculate
- E(X),
- E(Y), and
- E(XY).
88Given the joint distribution below, calculate
E(X), E(Y), and E(XY).
           Y = 2    Y = 4    Y = 6
X = 4      0.05     0.05     0.10
X = 8      0.10     0.50     0.20
89First, determine the marginal distributions of X
and Y, and check that the marginal distribution
probabilities sum to 1.
           Y = 2    Y = 4    Y = 6    $p_X(x)$
X = 4      0.05     0.05     0.10     0.20
X = 8      0.10     0.50     0.20     0.80
$p_Y(y)$   0.15     0.55     0.30     1.00
93Next we calculate the mean length and mean width.
94For E(X), remember we need to multiply the
values by their probabilities and add up.
 x    p(x)    x p(x)
 4    0.20    0.80
 8    0.80    6.40
E(X) = 7.20
99For E(Y), we do the same thing.
 y    p(y)    y p(y)
 2    0.15    0.30
 4    0.55    2.20
 6    0.30    1.80
E(Y) = 4.30
104To calculate the mean area E(XY), we use the
theorem. For the mean area, E(XY), the theorem translates
to
$E(XY) = \sum_x \sum_y x\,y\,p(x, y)$
105To keep track of the xy terms, we are going to
put them in our table, in parentheses after each
probability.
           Y = 2        Y = 4        Y = 6        $p_X(x)$
X = 4      0.05 (8)     0.05 (16)    0.10 (24)    0.20
X = 8      0.10 (16)    0.50 (32)    0.20 (48)    0.80
$p_Y(y)$   0.15         0.55         0.30         1.00
112Next, we need to multiply the xy terms by the
corresponding probabilities, and then add it all up.
So we have 0.05(8) + 0.05(16) + 0.10(24)
+ 0.10(16) + 0.50(32) + 0.20(48) = 30.8 for
the mean area.
121You might wonder if we could get E(XY) by just
multiplying E(X) by E(Y).
- The answer is: generally not.
- In our example, we had E(X) = 7.2, E(Y) = 4.3, and
E(XY) = 30.8.
- E(X) E(Y) = 30.96, not 30.80.
- Close in this case, but not the same.
122If X and Y are independent, then it is true that
E(XY) = E(X) E(Y).
- It may also hold occasionally in other cases.
- But generally, it doesn't work.
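The lumber calculations all follow one pattern: multiply g(x, y) by p(x, y) in each cell and add up. The sketch below wraps that pattern in a hypothetical helper named `expect`; the dictionary layout is likewise an illustrative choice.

```python
# Joint distribution of lumber length X and width Y from the example.
lumber = {
    (4, 2): 0.05, (4, 4): 0.05, (4, 6): 0.10,
    (8, 2): 0.10, (8, 4): 0.50, (8, 6): 0.20,
}

def expect(joint, g):
    """E[g(X, Y)] = sum over all cells of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lumber, lambda x, y: x)        # mean length
e_y = expect(lumber, lambda x, y: y)        # mean width
e_xy = expect(lumber, lambda x, y: x * y)   # mean area

print(round(e_x, 2), round(e_y, 2), round(e_xy, 2), round(e_x * e_y, 2))
```

Printing E(XY) next to E(X) E(Y) makes the small gap between the two visible.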
123Definition: Covariance of X and Y
$C(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\,p(x, y)$
What does this mean?
124Suppose that two variables tend to move in the
same direction, like study time and grades. Then,
when x is large, so that it is larger than its
mean, $x - \mu_X > 0$. When x is large, y tends to
be large as well, so that $y - \mu_Y > 0$
also. Remember that the p(x,y) values are
probabilities and therefore cannot be negative. So
those terms in the formula would look like
(+)(+)(+)
These products are positive.
125Similarly, since x and y tend to be small
together, we have $x - \mu_X < 0$ with $y - \mu_Y < 0$
too. Those terms would look like
(-)(-)(+)
These products are positive too. So we're adding
up a lot of positive numbers. What all that means
is that when 2 variables tend to move in the same
direction, the covariance will be positive.
126When 2 variables tend to move in opposite
directions,
- their covariance C(X,Y) < 0,
- perhaps like party time and grades.
127If variables don't tend to move either in the
same or opposite directions,
- their covariance C(X,Y) = 0.
- This case includes independent variables.
128It is usually easier to calculate covariances
using this theorem.
- Theorem: $C(X,Y) = E(XY) - E(X)\,E(Y)$
129Returning to the lumber example
- Remember we had $E(X) = 7.2$, $E(Y) = 4.3$, $E(XY) = 30.8$.
- Then the covariance would be
- $C(X,Y) = E(XY) - E(X)\,E(Y)$
- $= 30.8 - (7.2)(4.3)$
- $= -0.16$
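As a cross-check, the covariance can be computed both from the definition and from the shortcut theorem. The sketch below is illustrative only; the helper name `expect` is an assumption introduced here, and the dictionary restates the lumber table.

```python
# Lumber joint pmf: keys are (x, y), values are p(x, y).
joint = {
    (4, 2): 0.05, (4, 4): 0.05, (4, 6): 0.10,
    (8, 2): 0.10, (8, 4): 0.50, (8, 6): 0.20,
}

def expect(g):
    """E[g(X, Y)] over the lumber joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)   # E(X)
mu_y = expect(lambda x, y: y)   # E(Y)

# Definition: C(X,Y) = E[(X - mu_X)(Y - mu_Y)]
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Shortcut theorem: C(X,Y) = E(XY) - E(X) E(Y)
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

print(round(cov_definition, 4), round(cov_shortcut, 4))
```

Both routes should agree up to floating-point rounding; the shortcut just needs fewer subtractions by hand.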
130Difficulty
- The value of the covariance changes when you
change units.
- That is, you get different answers if you use
feet, inches, or meters.
- So it's difficult to tell if a particular answer
means a strong relationship or not.
- Fortunately, we have a solution to this problem.
131Correlation Coefficient
- The correlation coefficient is similar to the
covariance, but it doesn't vary with the units
used.
132Correlation Coefficient
The correlation coefficient is denoted by the
Greek letter rho, $\rho$. It's computed by dividing
the covariance of X and Y by the product of the
standard deviations of X and Y.
$$\rho = \frac{C(X,Y)}{\sigma_X\,\sigma_Y}$$
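To see the effect of units, here is a small sketch using the lumber table. The unit change is hypothetical: the code simply multiplies every x value by 12, as if converting feet to inches, which is an assumption made for illustration rather than something stated on the slides. The function name `cov_and_corr` is also made up here.

```python
import math

# Lumber joint pmf from the slides: keys are (x, y), values are p(x, y).
joint = {
    (4, 2): 0.05, (4, 4): 0.05, (4, 6): 0.10,
    (8, 2): 0.10, (8, 4): 0.50, (8, 6): 0.20,
}

def cov_and_corr(pmf):
    """Return (covariance, correlation) for a joint pmf {(x, y): p}."""
    ex = sum(x * p for (x, _), p in pmf.items())
    ey = sum(y * p for (_, y), p in pmf.items())
    exy = sum(x * y * p for (x, y), p in pmf.items())
    var_x = sum(x * x * p for (x, _), p in pmf.items()) - ex ** 2
    var_y = sum(y * y * p for (_, y), p in pmf.items()) - ey ** 2
    cov = exy - ex * ey
    return cov, cov / math.sqrt(var_x * var_y)

# Hypothetical unit change: every x value is multiplied by 12.
rescaled = {(12 * x, y): p for (x, y), p in joint.items()}

cov_original, rho_original = cov_and_corr(joint)
cov_rescaled, rho_rescaled = cov_and_corr(rescaled)

print(cov_original, cov_rescaled)   # covariance scales with the units
print(rho_original, rho_rescaled)   # correlation should not change
```

The covariance picks up the factor of 12, while the same factor appears in the standard deviation of X and cancels in the ratio.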
133The correlation coefficient is always between -1
and 1.
$-1 \le \rho \le 1$.
134Correlation Coefficient
$-1 \le \rho \le 1$
So, if your correlation coefficient $\rho$ is close to
1, you have a strong positive relationship. If it
is close to -1, you have a strong negative
relationship. If it is close to zero, there is no
strong linear relationship at all.
135Back to the lumber example again
- We had $C(X,Y) = -0.16$.
- We need the standard deviations of X and Y, which
we have not calculated yet.
136This is what we had for X so far.
x      p(x)    x p(x)
4      0.20    0.80
8      0.80    6.40
E(X) = 7.20
137Recall we said previously that we can calculate
V(X) as $V(X) = E(X^2) - [E(X)]^2$.
x      p(x)    x p(x)
4      0.20    0.80
8      0.80    6.40
E(X) = 7.20
We have $E(X)$ but we need $E(X^2)$. The theorem
$E[g(X)] = \sum_x g(x)\,p(x)$ gives us $E(X^2) = \sum_x x^2 p(x)$.
138$E(X^2) = \sum_x x^2 p(x)$
x      p(x)    x p(x)   x^2    x^2 p(x)
4      0.20    0.80     16
8      0.80    6.40     64
E(X) = 7.20
139$E(X^2) = \sum_x x^2 p(x)$
x      p(x)    x p(x)   x^2    x^2 p(x)
4      0.20    0.80     16     3.2
8      0.80    6.40     64     51.2
E(X) = 7.20
140$E(X^2) = \sum_x x^2 p(x)$
x      p(x)    x p(x)   x^2    x^2 p(x)
4      0.20    0.80     16     3.2
8      0.80    6.40     64     51.2
E(X) = 7.20             E(X^2) = 54.4
141Now we need to subtract to get V(X).
x      p(x)    x p(x)   x^2    x^2 p(x)
4      0.20    0.80     16     3.2
8      0.80    6.40     64     51.2
E(X) = 7.20             E(X^2) = 54.4
$V(X) = E(X^2) - [E(X)]^2$
142x      p(x)    x p(x)   x^2    x^2 p(x)
   4      0.20    0.80     16     3.2
   8      0.80    6.40     64     51.2
   E(X) = 7.20             E(X^2) = 54.4
$V(X) = E(X^2) - [E(X)]^2 = 54.4 - (7.2)^2$
143x      p(x)    x p(x)   x^2    x^2 p(x)
   4      0.20    0.80     16     3.2
   8      0.80    6.40     64     51.2
   E(X) = 7.20             E(X^2) = 54.4
$V(X) = E(X^2) - [E(X)]^2 = 54.4 - (7.2)^2 = 2.56$
144Take the square root to get the standard
deviation $\sigma_X$.
x      p(x)    x p(x)   x^2    x^2 p(x)
4      0.20    0.80     16     3.2
8      0.80    6.40     64     51.2
E(X) = 7.20             E(X^2) = 54.4
$V(X) = E(X^2) - [E(X)]^2 = 54.4 - (7.2)^2 = 2.56$, so $\sigma_X = 1.60$
145We do the same thing with Y.
y      p(y)    y p(y)
2      0.15    0.30
4      0.55    2.20
6      0.30    1.80
E(Y) = 4.30
146Get $y^2$.
y      p(y)    y p(y)   y^2    y^2 p(y)
2      0.15    0.30     4
4      0.55    2.20     16
6      0.30    1.80     36
E(Y) = 4.30
147Multiply by p(y).
y      p(y)    y p(y)   y^2    y^2 p(y)
2      0.15    0.30     4      0.60
4      0.55    2.20     16     8.80
6      0.30    1.80     36     10.80
E(Y) = 4.30
148Add to get $E(Y^2)$.
y      p(y)    y p(y)   y^2    y^2 p(y)
2      0.15    0.30     4      0.60
4      0.55    2.20     16     8.80
6      0.30    1.80     36     10.80
E(Y) = 4.30             E(Y^2) = 20.20
149Subtract to get V(Y).
y      p(y)    y p(y)   y^2    y^2 p(y)
2      0.15    0.30     4      0.60
4      0.55    2.20     16     8.80
6      0.30    1.80     36     10.80
E(Y) = 4.30             E(Y^2) = 20.20
$V(Y) = E(Y^2) - [E(Y)]^2 = 20.20 - (4.3)^2 = 1.71$
150Take the square root to get the standard
deviation $\sigma_Y$.
y      p(y)    y p(y)   y^2    y^2 p(y)
2      0.15    0.30     4      0.60
4      0.55    2.20     16     8.80
6      0.30    1.80     36     10.80
E(Y) = 4.30             E(Y^2) = 20.20
$V(Y) = E(Y^2) - [E(Y)]^2 = 20.20 - (4.3)^2 = 1.71$, so $\sigma_Y = 1.31$
151Now we have everything we need to compute the
correlation coefficient for the lumber problem.
$$\rho = \frac{C(X,Y)}{\sigma_X\,\sigma_Y} = \frac{-0.16}{(1.60)(1.31)} \approx -0.076$$
This number is much closer to 0 than it is to
-1. So the negative relation between the length
and width of the lumber is very weak.
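The whole chain of steps for the lumber problem (marginals, means, variances, standard deviations, covariance, correlation) can be sketched as follows. The helper names `marginal` and `mean_and_variance` are hypothetical and introduced only for this illustration; the numbers come from the lumber table.

```python
import math

# Lumber joint pmf: keys are (x, y), values are p(x, y).
joint = {
    (4, 2): 0.05, (4, 4): 0.05, (4, 6): 0.10,
    (8, 2): 0.10, (8, 4): 0.50, (8, 6): 0.20,
}

def marginal(pmf, axis):
    """Sum the joint pmf over the other variable (axis 0 = X, axis 1 = Y)."""
    out = {}
    for key, p in pmf.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def mean_and_variance(dist):
    """Mean and variance of a one-variable pmf, using V = E(X^2) - [E(X)]^2."""
    mean = sum(v * p for v, p in dist.items())
    second_moment = sum(v ** 2 * p for v, p in dist.items())
    return mean, second_moment - mean ** 2

mean_x, var_x = mean_and_variance(marginal(joint, 0))
mean_y, var_y = mean_and_variance(marginal(joint, 1))

e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - mean_x * mean_y
rho = cov / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(var_x, 2), round(var_y, 2), round(rho, 3))
```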
152Theorem
- $E(aX + bY) = aE(X) + bE(Y)$
- $V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,C(X,Y)$
153Example: The mean and variance of X are 1 and 5,
respectively. The mean and variance of Y are 2 and 6,
respectively. The covariance of X and Y is 7.
Determine the mean and variance of 4X + 3Y.
- Recall $E(aX + bY) = aE(X) + bE(Y)$
- $V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,C(X,Y)$
- To solve this problem, what should a and b be?
- a is 4 and b is 3.
- $E(4X + 3Y) = 4E(X) + 3E(Y) = 4(1) + 3(2)$
- $= 4 + 6 = 10$
- $V(4X + 3Y) = 4^2 V(X) + 3^2 V(Y) + 2(4)(3)\,C(X,Y)$
- $= 16(5) + 9(6) + 24(7)$
- $= 80 + 54 + 168$
- $= 302$
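The two formulas in the theorem translate directly into a small function. This is a sketch for checking arithmetic; the function name `linear_combination` and its parameter names are assumptions made here.

```python
def linear_combination(a, b, mean_x, mean_y, var_x, var_y, cov_xy):
    """Mean and variance of aX + bY from the moments of X and Y."""
    mean = a * mean_x + b * mean_y
    variance = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
    return mean, variance

# Example from the slide: E(X)=1, V(X)=5, E(Y)=2, V(Y)=6, C(X,Y)=7, a=4, b=3.
mean, variance = linear_combination(4, 3, 1, 2, 5, 6, 7)
print(mean, variance)
```

Setting `a = b = 1` gives the mean and variance of the plain sum X + Y.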
154Consider the following joint distribution of X and
Y.
         y = 2    y = 4
x = 1    0.20     0.25
x = 3    0.15     0.20
x = 5    0.15     0.05
- Determine the following:
- The mean and variance of X
- The mean and variance of Y
- The covariance and correlation coefficient of X and Y
- The mean and variance of X + Y
155First, determine the marginal distribution of X
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
156and the marginal distribution of Y.
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50
157Verify that they sum to 1.
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50     1
158Set up a table to compute the mean and variance of X.
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50     1
x      p(x)    x p(x)   x^2 p(x)
159Fill in the values of X and their probabilities.
x      p(x)    x p(x)   x^2 p(x)
1      0.45
3      0.35
5      0.20
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50     1
160Multiply x by p(x).
x      p(x)    x p(x)   x^2 p(x)
1      0.45    0.45
3      0.35    1.05
5      0.20    1.00
161Add to get the mean of X.
x      p(x)    x p(x)   x^2 p(x)
1      0.45    0.45
3      0.35    1.05
5      0.20    1.00
E(X) = 2.50
162To calculate the variance, first compute
$E(X^2) = \sum_x x^2 p(x)$.
x      p(x)    x p(x)   x^2 p(x)
1      0.45    0.45     0.45
3      0.35    1.05     3.15
5      0.20    1.00     5.00
E(X) = 2.50
163To calculate the variance, first compute
$E(X^2) = \sum_x x^2 p(x)$.
x      p(x)    x p(x)   x^2 p(x)
1      0.45    0.45     0.45
3      0.35    1.05     3.15
5      0.20    1.00     5.00
E(X) = 2.50             E(X^2) = 8.60
164Calculate the variance as $V(X) = E(X^2) - [E(X)]^2$.
x      p(x)    x p(x)   x^2 p(x)
1      0.45    0.45     0.45
3      0.35    1.05     3.15
5      0.20    1.00     5.00
E(X) = 2.50             E(X^2) = 8.60
$V(X) = E(X^2) - [E(X)]^2 = 8.6 - (2.5)^2 = 2.35$
165Set up a table to compute the mean and variance of Y.
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50     1
y      p(y)    y p(y)   y^2 p(y)
166Fill in the values of Y and their probabilities.
         y = 2    y = 4    pX(x)
x = 1    0.20     0.25     0.45
x = 3    0.15     0.20     0.35
x = 5    0.15     0.05     0.20
pY(y)    0.50     0.50     1
y      p(y)    y p(y)   y^2 p(y)
2      0.5
4      0.5
167Multiply y by p(y)
y      p(y)    y p(y)   y^2 p(y)
2      0.5     1
4      0.5     2
168and add to get E(Y).
y      p(y)    y p(y)   y^2 p(y)
2      0.5     1
4      0.5     2
E(Y) = 3
169To calculate the variance, first compute
$E(Y^2) = \sum_y y^2 p(y)$.
y      p(y)    y p(y)   y^2 p(y)
2      0.5     1        2
4      0.5     2        8
E(Y) = 3
170To calculate the variance, first compute
$E(Y^2) = \sum_y y^2 p(y)$.
y      p(y)    y p(y)   y^2 p(y)
2      0.5     1        2
4      0.5     2        8
E(Y) = 3                E(Y^2) = 10
171Calculate the variance as $V(Y) = E(Y^2) - [E(Y)]^2$.
y      p(y)    y p(y)   y^2 p(y)
2      0.5     1        2
4      0.5     2        8
E(Y) = 3                E(Y^2) = 10
$V(Y) = E(Y^2) - [E(Y)]^2 = 10 - (3)^2 = 1$
172To determine $C(X,Y) = E(XY) - E(X)\,E(Y)$, we
need $E(XY) = \sum_x \sum_y xy\,p(x,y)$.
173As before, we'll put the xy values in the table
next to the probability values
         y = 2        y = 4        pX(x)
x = 1    0.20 (2)     0.25 (4)     0.45
x = 3    0.15 (6)     0.20 (12)    0.35
x = 5    0.15 (10)    0.05 (20)    0.20
pY(y)    0.50         0.50         1.00
174Then we multiply and add.
         y = 2        y = 4        pX(x)
x = 1    0.20 (2)     0.25 (4)     0.45
x = 3    0.15 (6)     0.20 (12)    0.35
x = 5    0.15 (10)    0.05 (20)    0.20
pY(y)    0.50         0.50         1.00
$E(XY) = (0.20)(2) + (0.25)(4) + (0.15)(6) + (0.20)(12) + (0.15)(10) + (0.05)(20)$
$= 0.40 + 1.00 + 0.90 + 2.40 + 1.50 + 1.00 = 7.20$
175$C(X,Y) = E(XY) - E(X)\,E(Y)$
- Since $E(XY) = 7.2$, $E(X) = 2.5$, $E(Y) = 3.0$,
- $C(X,Y) = 7.2 - (2.5)(3)$
- $= 7.2 - 7.5$
- $= -0.3$
176Next, the correlation coefficient.
Since $C(X,Y) = -0.3$, $V(X) = 2.35$, $V(Y) = 1$,
$$\rho = \frac{C(X,Y)}{\sigma_X\,\sigma_Y} = \frac{-0.3}{\sqrt{2.35}\,\sqrt{1}} \approx -0.196$$
177The next part of the problem asked for $E(X+Y)$.
- We know that $E(X) = 2.5$ and $E(Y) = 3.0$.
- $E(aX + bY) = aE(X) + bE(Y)$
- What should a and b be?
- 1 and 1
- So $E(X+Y) = 1\,E(X) + 1\,E(Y)$
- $= E(X) + E(Y)$
- $= 2.5 + 3.0$
- $= 5.5$
178Lastly, $V(X+Y)$
- We know $V(X) = 2.35$, $V(Y) = 1$, $C(X,Y) = -0.3$.
- $V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,C(X,Y)$
- What are a and b?
- 1 and 1
- $V(X+Y) = 1^2 V(X) + 1^2 V(Y) + 2(1)(1)\,C(X,Y)$
- $= V(X) + V(Y) + 2C(X,Y)$
- $= 2.35 + 1 + 2(-0.3)$
- $= 2.75$
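One way to double-check these two answers is to skip the theorem and build the distribution of the sum S = X + Y directly from the joint table. The sketch below does that; the variable names are made up for this illustration.

```python
# Joint pmf of the example: keys are (x, y), values are p(x, y).
joint = {
    (1, 2): 0.20, (1, 4): 0.25,
    (3, 2): 0.15, (3, 4): 0.20,
    (5, 2): 0.15, (5, 4): 0.05,
}

# Collect the probability of each possible value of S = X + Y.
pmf_sum = {}
for (x, y), p in joint.items():
    pmf_sum[x + y] = pmf_sum.get(x + y, 0.0) + p

mean_s = sum(s * p for s, p in pmf_sum.items())
var_s = sum(s ** 2 * p for s, p in pmf_sum.items()) - mean_s ** 2

print(sorted(pmf_sum.items()))
print(round(mean_s, 2), round(var_s, 2))
```

The direct calculation should reproduce the mean and variance obtained from the formulas for E(aX + bY) and V(aX + bY) with a = b = 1.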
179Specific Discrete Distributions
- Uniform
- Binomial
- Hypergeometric
- Multinomial
- Poisson
180Uniform Distribution
- The uniform distribution assigns all the possible
values equal probabilities.
- Example: a fair die has possible values
1, 2, 3, 4, 5, and 6, each with
probability 1/6.
181Graph of Uniform Distribution (Example: Fair Die)
182Binomial Distribution
- Example: What is the probability of getting 3
heads on 5 tosses of an unfair (lopsided) coin
whose probability of getting a head on any toss
is 1/3?
183What is the probability of getting specifically
HTHHT?
- $(1/3)(2/3)(1/3)(1/3)(2/3)$
- $= (1/3)^3 (2/3)^2$
- What is the probability of any other specific
outcome with 3 heads on 5 tosses?
- The same.
- So we just have to figure out how many different
ways you can get 3 heads on 5 tosses, and
multiply that by the probability of each
individual outcome.
- That will give us the probability of getting 3
heads on 5 tosses.
184How many ways can you get 3 heads on 5 tosses?
- It's the number of combinations of 5 objects
taken 3 at a time.
$$\binom{5}{3} = \frac{5!}{3!\,2!} = 10$$
185So the probability of getting 3 heads on 5 tosses
is
$$\binom{5}{3}\left(\frac{1}{3}\right)^3\left(\frac{2}{3}\right)^2 = 10 \cdot \frac{4}{243} = \frac{40}{243} \approx 0.1646$$
186In general, the probability of getting x
successes on n trials in which the probability of
success on any given trial is p is
$$p(x) = \frac{n!}{x!\,(n-x)!}\,p^x (1-p)^{n-x}, \qquad x = 0, 1, \dots, n$$
This is the binomial distribution.
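As a sketch, the binomial probability can be coded with Python's standard-library `math.comb` for the number of combinations. The function name `binomial_pmf` is an assumption made here; the coin example (n = 5, p = 1/3, x = 3) is the one from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success prob. p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# 3 heads on 5 tosses of a coin with P(head) = 1/3.
three_heads = binomial_pmf(3, 5, 1 / 3)
print(round(three_heads, 4))

# The probabilities over all possible x should add up to 1.
total = sum(binomial_pmf(x, 5, 1 / 3) for x in range(6))
print(round(total, 10))
```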
187Notes
- $0! = 1$
- Each trial that can result in either success or
failure is called a Bernoulli trial.
188Example: If the probability that any person
passes this course is 0.95, what is the
probability that in a class of 30 people,
exactly 28 people pass?
$$p(28) = \frac{30!}{28!\,2!}\,(0.95)^{28}(0.05)^{2} \approx 0.2586$$
189Let's go back to the example in which we flipped
a coin 5 times and the probability of heads on each
toss was 1/3.
- For 3 heads, the probability was 0.1646.
- Using the binomial formula, we can determine the
probabilities of the other possibilities.
x        p(x)
0        0.1317
1        0.3292
2        0.3292
3        0.1646
4        0.0412
5        0.0041
Total    1
190If we graph this distribution, it looks like
x        p(x)
0        0.1317
1        0.3292
2        0.3292
3        0.1646
4        0.0412
5        0.0041
Total    1
Notice that there is a bump on the left and a
tail on the right. Such a distribution is said to
be skewed to the right. The skew is where the
tail is.
191Binomial Distribution
- The binomial distribution graph we just did was
for $p = 1/3$ and the skew was to the right.
- A binomial distribution with $p < 1/2$ will always
have a skew to the right.
- What do you think the distribution will look like
if $p > 1/2$?
- It will be skewed to the left. (The tail will be
on the left and the bump will be on the right.)
192Binomial Distribution
- What do you think the distribution will look like
if $p = 1/2$?
- It will be symmetric. The left and right sides
will be mirror images of each other.
- If the number of trials n (tosses in our example)
is large, the graph will be roughly symmetric
even if $p \ne 1/2$.
- How large does n have to be for the graph to be
roughly symmetric? That depends on how far p is
from 1/2.
- There are two sets of rules that are sometimes
used to determine if the graph is roughly
symmetric.
- One rule requires that $np \ge 5$ and $n(1-p) \ge 5$.
- The other rule requires that $np(1-p) \ge 3$.
- These rules are not exactly equivalent, but they
both work reasonably well.
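The two rules of thumb are easy to check mechanically. The sketch below assumes the thresholds are read as "at least 5" and "at least 3" (the comparison symbols are an interpretation of the slide), and the function name `roughly_symmetric` is made up for this illustration.

```python
def roughly_symmetric(n, p):
    """Return (rule_a, rule_b) for a binomial with n trials and success prob. p.

    rule_a: np >= 5 and n(1-p) >= 5   (assumed reading of the first rule)
    rule_b: np(1-p) >= 3              (assumed reading of the second rule)
    """
    rule_a = n * p >= 5 and n * (1 - p) >= 5
    rule_b = n * p * (1 - p) >= 3
    return rule_a, rule_b

print(roughly_symmetric(5, 1 / 3))    # the coin example: far too few trials
print(roughly_symmetric(100, 0.64))   # many trials: both rules satisfied
print(roughly_symmetric(10, 0.5))     # a case where the two rules disagree
```

The last call shows why the rules are not exactly equivalent: with n = 10 and p = 0.5 the first rule passes while the second does not.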
193Mean and Variance of the Binomial Distribution
- Mean: $\mu = np$
- Variance: $\sigma^2 = np(1-p)$
194Example: What are the mean, variance, and standard
deviation for our binomial distribution example
in which $n = 5$ and $p = 1/3$?
- Mean: $\mu = np = (5)(1/3) = 5/3$
- Variance: $\sigma^2 = np(1-p) = (5)(1/3)(2/3) = 10/9$
- Standard deviation: $\sigma = \sqrt{10/9} \approx 1.054$
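The shortcut formulas np and np(1-p) can be compared against the general definitions of mean and variance applied to the full binomial table. This is a verification sketch with variable names chosen here, not a method from the slides.

```python
from math import comb, sqrt

n, p = 5, 1 / 3

# Full binomial table for x = 0, 1, ..., n.
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# General definitions: E(X) = sum of x p(x), V(X) = E(X^2) - [E(X)]^2.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum(x ** 2 * px for x, px in enumerate(pmf)) - mean ** 2
std_dev = sqrt(variance)

print(mean, n * p)                    # both should be about 5/3
print(variance, n * p * (1 - p))      # both should be about 10/9
print(std_dev)
```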
195Using Excel to calculate Binomial Probabilities
- On an Excel spreadsheet, you can get the binomial
distribution as follows:
- click insert, and then click function
- select statistical as the category of function,
scroll down to the binomdist function, and click
on it
- fill in the information in the dialog box.
196Suppose that you wanted to calculate a messy
binomial, such as the probability of between 60
and 70 successes inclusive, on 100 trials with
success probability on each trial of 0.64.
- This would be a lot of work with just a
calculator. You would have to calculate 11
separate binomial probabilities (the
probabilities for 60, 61, 62, ..., 70) and then add
them up.
- It's much easier with Excel.
197Remember, you want the probability of between 60
and 70 successes inclusive, on 100 trials with
success probability on each trial of 0.64.
- You can calculate the (cumulative) probability of
70 or fewer successes.
- Then calculate the cumulative probability of 59
or fewer successes.
- Then take the difference.
198To get the probability of 70 or fewer successes,
specify the following:
- number of successes: 70
- number of trials: 100
- prob. of success on any trial: 0.64
- cumulative: True (because you want 70 or fewer,
not just 70)
199To get the probability of 59 or fewer successes,
specify the following:
- number of successes: 59
- number of trials: 100
- prob. of success on any trial: 0.64
- cumulative: True
200Then just subtract the two cumulative function
values you calculated.
- If you do this, you get
- 0.91368 - 0.17394 = 0.7397
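The same "cumulative minus cumulative" idea works outside Excel. Below is a rough Python equivalent using only the standard library; the function name `binomial_cdf` is an assumption made for this sketch and is not an Excel or library function.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k): add the binomial probabilities for x = 0, 1, ..., k."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

upper = binomial_cdf(70, 100, 0.64)   # 70 or fewer successes
lower = binomial_cdf(59, 100, 0.64)   # 59 or fewer successes
between = upper - lower               # 60 through 70 inclusive

print(round(upper, 5), round(lower, 5), round(between, 4))
```

Subtracting the cumulative value at 59 (not 60) is what keeps the outcome "exactly 60" inside the interval.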
201We can also study binomial problems using
proportions.
- For example, we might want to know the
probability of getting 60% heads on 5 tosses of a
coin with probability of heads on each toss of
1/3. (This is the same as getting 3 heads.)
- In general, if X is the number of successes on n
trials, the proportion of successes is X/n.
- We can easily determine the mean and variance of
this binomial proportion variable X/n.
- If p again is the probability of success on any
given trial,
- $E(X/n) = p$
- $V(X/n) = p(1-p)/n$
202When can we use the binomial distribution?
- We have exactly two possibilities on each trial
(success or failure, heads or tails, male or
female, yes or no, etc.).
- The probability of success is the same on each
trial.
- The trials are independent. (What happens on one
trial has no effect on what happens on the next
trial.)
203Sampling with and without Replacement
- Suppose we have a bowl with 6 red and 4 green
marbles. We select 3 marbles at random without
replacement. We want to know the probability of
selecting exactly 2 red marbles.
- What's the probability of getting a red marble on
the 1st draw?
- 6/10
- What's the probability of getting a red marble on
the 2nd draw?
- It depends on what we got on the first draw.
- If we got a red one, then the probability is 5/9.
- If we got a green one, then the probability is
6/9.
- Since the probability of red on a draw depends on
what was drawn before, the trials are not independent,
and we cannot use the binomial distribution.
- We will discuss very shortly what we use instead.
204What if we selected the marbles with replacement?
- Then the probability of a red marble would be the
same on each draw, regardless of what you pulled
out previously.
- Then we could use the binomial distribution.
205Suppose that instead of having 6 red marbles and 4
green marbles, we had 6000 red ones and 4000
green ones.
- The probability of red on the 1st draw would be
6,000/10,000 = 0.6.
- If we got red on the 1st draw, the probability of
red on the 2nd draw would be 5999/9999 ≈ 0.59996.
- If we got green on the 1st draw, the probability
of red on the 2nd would be 6000/9999 ≈ 0.60006.
- These three numbers are very close.
- So you could use the binomial distribution to get
a very good approximation of the probability.
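To see how good the approximation is, the sketch below computes the probability of exactly 2 red marbles in 3 draws without replacement by listing every colour sequence and multiplying the draw-by-draw probabilities, then compares it with the with-replacement (binomial) answer. The function name `prob_red_count` is hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction
from itertools import product

def prob_red_count(red, green, draws, k):
    """P(exactly k red in `draws` draws without replacement)."""
    total = Fraction(0)
    for seq in product("RG", repeat=draws):
        if seq.count("R") != k:
            continue
        r, g, prob = red, green, Fraction(1)
        for colour in seq:
            # Probability of this colour given what is left in the bowl.
            if colour == "R":
                prob *= Fraction(r, r + g)
                r -= 1
            else:
                prob *= Fraction(g, r + g)
                g -= 1
        total += prob
    return total

# With replacement the draws are independent, so the binomial applies:
# 3 sequences with two reds, each with probability 0.6 * 0.6 * 0.4.
with_replacement = 3 * Fraction(6, 10) ** 2 * Fraction(4, 10)

small_bowl = prob_red_count(6, 4, 3, 2)         # 6 red, 4 green
large_bowl = prob_red_count(6000, 4000, 3, 2)   # 6000 red, 4000 green

print(float(with_replacement), float(small_bowl), float(large_bowl))
```

For the small bowl the two answers differ noticeably, while for the large bowl they agree to several decimal places.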
206So if we have two options on each trial, when
can we use the binomial distribution?
- If we sample with replacement, or
- if we sample without replacement, but the sample is
small relative to the population.
- A rule that is often used is that the sample
is less than 5% of the population ($n < 0.05N$).
207If our sample is more than 5% of our population,
then we will use the hypergeometric
distribution.
208Let's return to our marble problem.
- Suppose we have a bowl with 6 red and 4 green
marbles. We select 3 marbles at random without
replacement. We want to know the probability of
selecting exactly 2 red marbles.
- Remember that the number of ways of selecting x
objects from n is $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$.
- So there are $\binom{6}{2} = 15$ ways of selecting 2 red
marbles from 6.
- There are $\binom{4}{1} = 4$ ways of selecting 1 green
marble from 4.
- There are $\binom{10}{3} = 120$ ways of selecting 3 marbles
from 10.
209So the probability of getting exactly 2 red
marbles on 3 draws will be
$$\frac{\binom{6}{2}\binom{4}{1}}{\binom{10}{3}}$$
210and our probability is
$$\frac{(15)(4)}{120} = \frac{60}{120} = 0.5$$
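The counting argument for the marble problem can be written as a short function. This is a sketch using the standard-library `math.comb`; the function and parameter names are assumptions chosen for this illustration.

```python
from math import comb

def hypergeometric_pmf(x, n, successes, population):
    """P(exactly x successes in a sample of n drawn without replacement
    from a population containing `successes` success items)."""
    failures = population - successes
    return comb(successes, x) * comb(failures, n - x) / comb(population, n)

# Marble problem: 6 red and 4 green, draw 3, want exactly 2 red.
prob_two_red = hypergeometric_pmf(2, 3, 6, 10)
print(prob_two_red)
```

Summing the function over x = 0, 1, 2, 3 for the same bowl should give 1, which is a quick sanity check on the counts.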
211The hypergeometric distribution can also be used
if you have more than 2 categories.
- If you had 3 categories, for example, you would