Title: Chapter 2 in Undergraduate Econometrics
Chapter 2 Probability
A random variable (r.v.) is a variable whose value
is unknown until it is observed. The value of a
random variable results from an experiment.
Experiments can be either controlled (laboratory)
or uncontrolled (observational). Most economic
variables are random and are the result of
uncontrolled experiments.
Random Variables
- A discrete random variable can take on only a finite (or countable) number of values, such as:
  - The number of visits to a doctor's office
  - Number of children in a household
  - Flip of a coin
  - Dummy (binary) variable: $D = 0$ if male, $D = 1$ if female
- A continuous random variable can take any real value (not just whole numbers) in an interval on the real number line, such as:
  - Gross Domestic Product next year
  - Price of a share in Microsoft
  - Interest rate on a 30-year mortgage
Probability Distributions of Random Variables
- All random variables have probability distributions that describe the values the random variable can take on and the associated probabilities of these values.
- Knowing the probability distribution of a random variable gives us some indication of the value the r.v. may take on.
Probability Distribution for Discrete Random Variable
- Expressed as a table, graph or function
- 1. Suppose $X$ = number of tails when a coin is flipped twice. $X$ can take on the values 0, 1 or 2. Let $f(x)$ be the associated probabilities.
- Table:
  x    f(x)
  0    0.25
  1    0.50
  2    0.25
- Graph: probability is represented as the height of each bar in a bar graph of $f(x)$.
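The table above can be reproduced by listing the outcomes of two flips. This is a minimal sketch that assumes a fair coin, so that the four outcomes HH, HT, TH, TT are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# X = number of tails in each outcome.
tails_counts = Counter(outcome.count("T") for outcome in outcomes)

# f(x) = (number of outcomes with x tails) / (total number of outcomes).
f = {x: Fraction(n, len(outcomes)) for x, n in sorted(tails_counts.items())}

for x, prob in f.items():
    print(x, float(prob))
```

Each probability is a count of outcomes divided by four, which is why the middle value $x = 1$ (HT or TH) is twice as likely as either extreme.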
- 2. Suppose $X$ is a binary variable that can take on two values, 0 or 1. Furthermore, assume $P(X=1) = p$ and $P(X=0) = 1-p$.
- Function: $P(X=x) = f(x) = p^x (1-p)^{1-x}$ for $x = 0, 1$
- Table:
  x    f(x)
  0    1 - p
  1    p
Suppose $p = 0.10$. Then $X$ takes on 0 with probability 0.90 and $X$ takes on 1 with probability 0.10.
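The function form of the binary distribution can be checked against the table with a few lines of code. The helper name `bernoulli_pmf` is a hypothetical choice for this sketch, not something from the text.

```python
def bernoulli_pmf(x, p):
    """Return f(x) = p**x * (1 - p)**(1 - x) for x equal to 0 or 1."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    return p**x * (1 - p) ** (1 - x)


p = 0.10
f0 = bernoulli_pmf(0, p)  # probability that X = 0
f1 = bernoulli_pmf(1, p)  # probability that X = 1
print(round(f0, 2), round(f1, 2))
```

When $x = 1$ the exponent on $(1-p)$ is zero, so only $p$ survives; when $x = 0$ only $(1-p)$ survives. That is all the formula does.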
Facts about discrete probability distribution functions
- 1. Each probability $P(X=x) = f(x)$ must lie between 0 and 1: $0 \le f(x) \le 1$
- 2. The sum of the probabilities must be 1. If $X$ can take on $n$ different values then $f(x_1) + f(x_2) + \dots + f(x_n) = 1$
Probability Distribution (Density) for Continuous Random Variables
- Expressed as a function or graph.
- Continuous r.v.s can take on an infinite number of values in a given interval
- A table isn't appropriate to express the pdf
- Ex: $f(x) = 2x$ for $0 \le x \le 1$, and $f(x) = 0$ otherwise
- Because a continuous random variable has an uncountably infinite number of values, the probability of any single value occurring is zero: $P(X = a) = 0$
- Instead, we ask: what is the probability that $X$ is between $a$ and $b$? $P[a < X < b] = ?$
- The probability $P[a < X < b]$ is the proportion of the time, in many repetitions of the experiment, that $X$ will fall between $a$ and $b$.
- Probability is represented as area under the function.
- Total area must be 1.0. For $f(x) = 2x$ on $0 \le x \le 1$, the area of the triangle is 1.0
- Probability that $x$ lies between 0 and 1/2: $P[0 \le X \le 1/2] = 0.25$
- Area of any triangle is ½ × Base × Height
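The two areas quoted above can be checked numerically. The sketch below assumes the density is the earlier example $f(x) = 2x$ on $[0, 1]$ and approximates the area with a simple midpoint rule; the function names are illustrative.

```python
def pdf(x):
    """Density from the example: f(x) = 2x on [0, 1], 0 otherwise."""
    return 2 * x if 0 <= x <= 1 else 0.0


def area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


total_area = area(pdf, 0, 1)    # should be close to 1.0
prob_half = area(pdf, 0, 0.5)   # should be close to 0.25
print(round(total_area, 6), round(prob_half, 6))
```

The same numbers follow from the triangle formula: base 1 and height 2 give area 1, while base 1/2 and height 1 give area 0.25.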
- Uniform random variable: $u$ is distributed uniformly between $a$ and $b$
- The p.d.f. is a horizontal line between $a$ and $b$ of height $1/(b-a)$
- $f(u) = 1/(b-a)$ if $a \le u \le b$, and $f(u) = 0$ otherwise
- Ex: spin a dial on a clock, so $a = 0$ and $b = 12$
- Find the probability that $u$ lies between 1 and 2
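For a uniform density the area is a rectangle, so the clock question reduces to width times height. A small sketch, with `uniform_prob` as a hypothetical helper name:

```python
def uniform_prob(lo, hi, a, b):
    """P[lo <= u <= hi] for u uniform on [a, b]: overlap width times height 1/(b - a)."""
    if b <= a:
        raise ValueError("need a < b")
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


# Clock dial example: a = 0, b = 12, interval from 1 to 2.
prob = uniform_prob(1, 2, a=0, b=12)
print(round(prob, 4))
```

The interval from 1 to 2 has width 1 and the density has height 1/12, so the probability is 1/12, or roughly 0.083.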
In calculus, the integral of a function defines the area under it.
For continuous random variables it is the area under $f(x)$, and not $f(x)$ itself, which defines the probability of an event. We will NOT be integrating functions; when necessary we use tables and/or computers to calculate the necessary probability (integral).
Rules of Summation
Rule 3: $\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$
Rules of Summation (continued)
[Equation for Rule 6 not recoverable from the slide]
From Rule 6, we can prove (in class) that [result not recoverable from the slide]
Rules of Summation (continued)
The order of summation does not matter
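Two of the summation rules can be checked with small numbers. This sketch reads the order-of-summation statement as a claim about double sums, which is an assumption, and all values in it are made up for illustration.

```python
# Rule 3: a constant factor can be pulled outside the summation sign.
x = [2.0, 5.0, 7.0]  # hypothetical values x_1, x_2, x_3
a = 3.0              # hypothetical constant
lhs = sum(a * xi for xi in x)
rhs = a * sum(x)

# Order of summation: summing a table row by row or column by column
# gives the same total.
table = [[1, 2, 3],
         [4, 5, 6]]
rows_first = sum(sum(row) for row in table)
cols_first = sum(sum(row[j] for row in table) for j in range(len(table[0])))

print(lhs, rhs, rows_first, cols_first)
```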
The Mean of a Random Variable
The mean of a random variable is its mathematical expectation, or expected value. For a discrete random variable, this is
$E(X) = \sum_{i=1}^{n} x_i f(x_i) = x_1 f(x_1) + x_2 f(x_2) + \dots + x_n f(x_n)$
where $n$ is the number of values $X$ can take on.
It is a probability-weighted average of the possible values the random variable $X$ can take on. This is a sum for discrete r.v.s and an integral for continuous r.v.s.
- $E(X)$ tells us the long-run average value for $X$. It is not the value one would expect $X$ to take on in any single draw.
- If you were to randomly draw values of $X$ from its pdf an infinite number of times and average these values, you would get $E(X)$
- $E(X) = \mu$; this Greek letter mu is not used in your text but is commonly used to denote the mean of $X$.
Example: Roll a fair die
$E(X) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 21/6 = 3.5$
Interpretation: In a large number of rolls of a fair die, one-sixth of the values will be 1s, one-sixth of the values will be 2s, etc., and the average of these values will be 3.5.
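The long-run interpretation can be illustrated by comparing the exact expected value with the average of many simulated rolls. This is a sketch only: the seed and the number of rolls are arbitrary choices, and the simulated average will be close to, not exactly, 3.5.

```python
import random
from fractions import Fraction

# Exact expected value: each face 1..6 has probability 1/6.
f = {x: Fraction(1, 6) for x in range(1, 7)}
expected_value = sum(x * p for x, p in f.items())

# Long-run average from simulated rolls (seeded so the run is repeatable).
rng = random.Random(0)
n_rolls = 100_000
average = sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

print(float(expected_value))  # 3.5
print(round(average, 2))
```

Note that 3.5 is not a face of the die, which is the sense in which $E(X)$ is a long-run average rather than a value you would see on a single roll.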
Mathematical Expectation
- Think of $E(\cdot)$ as an operator that requires you to weight by probabilities any expression inside the parentheses, and then sum
- $E(g(X)) = \sum_{i=1}^{n} g(x_i) f(x_i) = g(x_1) f(x_1) + g(x_2) f(x_2) + \dots + g(x_n) f(x_n)$
Rules of Mathematical Expectation
- $E(c) = c$ where $c$ is a constant
- $E(cX) = cE(X)$ where $c$ is a constant and $X$ is a random variable
- $E(a + cX) = a + cE(X)$ where $a$ and $c$ are constants and $X$ is a random variable.
Variance of a Random Variable
- Like the mean, the variance of a r.v. is an expected value, but it is the expected value of the squared deviations from the mean
- Let $g(x) = (x - E(x))^2$
- Variance: $\sigma^2 = \mathrm{Var}(x) = E[(x - E(x))^2] = \sum_i g(x_i) f(x_i) = \sum_i (x_i - E(x))^2 f(x_i)$
- It measures the amount of dispersion in the possible values for $X$.
About Variance
- Unit of measurement is $X$ units squared
- When we create a new random variable as a linear transformation of $X$: $y = a + cx$
- We know that $E(y) = a + cE(x)$
- But $\mathrm{Var}(y) = c^2 \mathrm{Var}(x)$ (proof in class)
- This property tells us that the amount of variation in $y$ is determined by the amount of variation in $X$ and the constant $c$. The additive constant $a$ in no way alters the amount of variation in the values of $x$.
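The mean and variance rules for a linear transformation can be checked on a small discrete distribution. The sketch below reuses the two-coin-flip distribution for $X$; the constants `a` and `c` are hypothetical.

```python
def mean(dist):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = E[(X - E(X))^2] for a discrete distribution."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


# X = number of tails in two coin flips.
f_x = {0: 0.25, 1: 0.50, 2: 0.25}

# Hypothetical constants for y = a + c*x.
a, c = 10.0, 3.0

# y takes the value a + c*x with the same probability that X takes the value x.
f_y = {a + c * x: p for x, p in f_x.items()}

print(mean(f_y), a + c * mean(f_x))
print(variance(f_y), c**2 * variance(f_x))
```

Changing `a` shifts every value of $y$ by the same amount, so the deviations from the mean, and therefore the variance, are unaffected.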
About Variance (cont)
- $E[(x - E(x))^2] = E[x^2 - 2E(x)x + [E(x)]^2]$
- $= E(x^2) - 2E(x)E(x) + [E(x)]^2$
- $= E(x^2) - 2[E(x)]^2 + [E(x)]^2$
- $= E(x^2) - [E(x)]^2$
- Run the $E(\cdot)$ operator through, pulling out constants and stopping on random variables. Remember that $E(x)$ is itself a constant, so $E(E(x)) = E(x)$
Standard Deviation
- Because variance is in squared units of the r.v., we can take the square root of the variance to obtain the standard deviation.
- $\sigma = \sqrt{\sigma^2} = \sqrt{\mathrm{Var}(x)}$
- Be sure to take the square root after you square and sum the deviations from the mean.
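The definition of variance and the shortcut $E(x^2) - [E(x)]^2$ should give the same number, and the standard deviation is the square root taken at the very end. A sketch using the fair die as an illustrative distribution:

```python
import math
from fractions import Fraction

# Fair die: each face 1..6 with probability 1/6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in f.items())

# Definition: probability-weighted sum of squared deviations from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in f.items())

# Shortcut: E(x^2) minus the square of E(x).
var_shortcut = sum(x**2 * p for x, p in f.items()) - mean**2

# Square root only after squaring and summing the deviations.
std_dev = math.sqrt(var_definition)

print(var_definition, var_shortcut, round(std_dev, 4))
```

Using exact fractions makes the agreement between the two formulas exact rather than approximate.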
Joint Probability
- An experiment can randomly determine the outcome of more than one variable.
- When there are 2 random variables of interest, we study the joint probability density function
- When there are more than 2 random variables of interest, we study the multivariate probability density function.
For a discrete joint pdf, probability is expressed in a matrix
Let $X$ = return on stocks, $Y$ = return on bonds
          X = -10    X = 0    X = 10    X = 20    f(y)
Y = 6     0          0        0.10      0.10
Y = 8     0          0.10     0.30      0.20
Y = 10    0.10       0.10     0         0
f(x)
$P(X=x, Y=y) = f(x,y)$
e.g. $P(X=10, Y=8) = 0.30$
About Joint P.d.f.s
- Marginal probability distribution: what is the probability distribution for $X$ regardless of what values $Y$ takes on? $f(x) = \sum_y f(x,y)$
- What is the probability distribution for $Y$ regardless of what values $X$ takes on? $f(y) = \sum_x f(x,y)$
- Conditional probability distribution
- What is the probability distribution for $X$ given that $Y$ takes on a particular value? $f(x \mid y) = f(x,y)/f(y)$
- What is the probability distribution for $Y$ given that $X$ takes on a particular value? $f(y \mid x) = f(x,y)/f(x)$
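The marginal and conditional formulas can be applied to the stocks and bonds table. This sketch stores the joint pdf as a dictionary keyed by $(x, y)$; the function names are illustrative, and conditioning on $Y = 8$ is an arbitrary choice.

```python
# Joint pdf f(x, y): X = return on stocks, Y = return on bonds.
joint = {
    (-10, 6): 0.00, (0, 6): 0.00, (10, 6): 0.10, (20, 6): 0.10,
    (-10, 8): 0.00, (0, 8): 0.10, (10, 8): 0.30, (20, 8): 0.20,
    (-10, 10): 0.10, (0, 10): 0.10, (10, 10): 0.00, (20, 10): 0.00,
}


def marginal_x(joint):
    """f(x) = sum over y of f(x, y)."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out


def marginal_y(joint):
    """f(y) = sum over x of f(x, y)."""
    out = {}
    for (_x, y), p in joint.items():
        out[y] = out.get(y, 0.0) + p
    return out


def conditional_x_given_y(joint, y_value):
    """f(x | y) = f(x, y) / f(y) for one particular value of y."""
    f_y = marginal_y(joint)[y_value]
    return {x: p / f_y for (x, y), p in joint.items() if y == y_value}


f_x = marginal_x(joint)
f_y = marginal_y(joint)
f_x_given_8 = conditional_x_given_y(joint, 8)

print({x: round(p, 2) for x, p in f_x.items()})
print({y: round(p, 2) for y, p in f_y.items()})
print({x: round(p, 3) for x, p in f_x_given_8.items()})
```

The marginals are the column and row totals of the table, and the conditional distribution is one row rescaled so that it sums to 1.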
- Covariance: a measure that summarizes the joint probability distribution between two random variables.
- $\mathrm{Cov}(x,y) = E[(x - E(x))(y - E(y))] = \sum_i \sum_j (x_i - E(x))(y_j - E(y)) f(x_i, y_j)$
About Covariance
It measures the joint association between 2 random variables. Try asking: when $X$ is large, is $Y$ more or less likely to also be large? If the answer is that $Y$ is likely to be large when $X$ is large, then we say $X$ and $Y$ have a positive relationship: $\mathrm{Cov}(x,y) > 0$. If the answer is that $Y$ is likely to be small when $X$ is large, then we say that $X$ and $Y$ have a negative relationship: $\mathrm{Cov}(x,y) < 0$.
$\mathrm{Cov}(x,y) = E[(x - E(x))(y - E(y))]$
$= E[xy - E(x)y - xE(y) + E(x)E(y)]$
$= E(xy) - E(x)E(y) - E(x)E(y) + E(x)E(y)$
$= E(xy) - E(x)E(y)$ (useful!!)
- Correlation
- Covariance has awkward units of measurement.
- Correlation removes all units of measurement by dividing covariance by the product of the standard deviations
- $\rho_{xy} = \mathrm{Cov}(x,y)/(\sigma_x \sigma_y)$ and $-1 \le \rho_{xy} \le 1$
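As a worked illustration, the covariance shortcut $E(xy) - E(x)E(y)$ and the correlation formula can be applied to the stocks and bonds joint pdf. This is a sketch under the assumption that the table values are as transcribed; the helper `expect` is a hypothetical name.

```python
import math

# Joint pdf f(x, y): X = return on stocks, Y = return on bonds.
joint = {
    (-10, 6): 0.00, (0, 6): 0.00, (10, 6): 0.10, (20, 6): 0.10,
    (-10, 8): 0.00, (0, 8): 0.10, (10, 8): 0.30, (20, 8): 0.20,
    (-10, 10): 0.10, (0, 10): 0.10, (10, 10): 0.00, (20, 10): 0.00,
}


def expect(g):
    """E[g(x, y)]: weight g by the joint probabilities and sum."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

# Shortcut formulas: Cov = E(xy) - E(x)E(y), Var = E(x^2) - [E(x)]^2.
cov_xy = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x**2) - e_x**2
var_y = expect(lambda x, y: y**2) - e_y**2

# Correlation: covariance divided by the product of standard deviations.
rho = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(e_x, 2), round(e_y, 2), round(cov_xy, 2), round(rho, 3))
```

In this table the covariance is negative: high stock returns tend to occur with low bond returns. The correlation expresses the same association on a unit-free scale between -1 and 1.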
What does correlation look like?
[Figure: scatter plots with panels labeled $\rho = 0$, $\rho = 0.7$, $\rho = 0.3$ and $\rho = 0.9$]
Statistical Independence
- Two random variables are statistically independent if knowing the value that one will take on does not reveal anything about what value the other may take on: $f(x \mid y) = f(x)$ or $f(y \mid x) = f(y)$
- This implies that $f(x,y) = f(x)f(y)$ if $X$ and $Y$ are independent.
- If 2 r.v.s are independent, then their covariance will necessarily be equal to 0.
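Both independence statements can be checked on a simple case. The sketch below uses two separate fair coin flips, coded 1 for tails and 0 for heads; this example is an assumption chosen for illustration and does not come from the slides.

```python
from itertools import product

# Joint pdf of two independent fair coin flips: each (x, y) pair has probability 1/4.
joint = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}

# Marginals: sum the joint pdf over the other variable.
f_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
f_y = {y: sum(p for (_, yi), p in joint.items() if yi == y) for y in (0, 1)}

# Independence: the joint pdf equals the product of the marginals in every cell.
factorizes = all(abs(joint[(x, y)] - f_x[x] * f_y[y]) < 1e-12 for (x, y) in joint)

# Covariance via E(xy) - E(x)E(y).
e_x = sum(x * p for x, p in f_x.items())
e_y = sum(y * p for y, p in f_y.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - e_x * e_y

print(factorizes, cov_xy)
```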
Functions of more than one Random Variable
- Suppose that $X$ and $Y$ are two random variables. If we form a weighted sum of them, we create a new random variable that has the following mean and variance
- $Z = aX + bY$
- $E(Z) = E(aX + bY) = aE(X) + bE(Y)$
- $\mathrm{Var}(Z) = \mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X,Y)$
- If $X$ and $Y$ are independent: $\mathrm{Var}(Z) = \mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y)$ (see page 31)
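The mean and variance formulas for $Z = aX + bY$ can be compared with a direct calculation from the joint pdf. This sketch treats $Z$ as a portfolio that puts half its weight on stocks and half on bonds; the weights `a = b = 0.5` are hypothetical.

```python
# Joint pdf f(x, y): X = return on stocks, Y = return on bonds.
joint = {
    (-10, 6): 0.00, (0, 6): 0.00, (10, 6): 0.10, (20, 6): 0.10,
    (-10, 8): 0.00, (0, 8): 0.10, (10, 8): 0.30, (20, 8): 0.20,
    (-10, 10): 0.10, (0, 10): 0.10, (10, 10): 0.00, (20, 10): 0.00,
}
a, b = 0.5, 0.5  # hypothetical portfolio weights


def expect(g):
    """E[g(x, y)]: weight g by the joint probabilities and sum."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - e_x) ** 2)
var_y = expect(lambda x, y: (y - e_y) ** 2)
cov_xy = expect(lambda x, y: (x - e_x) * (y - e_y))

# Formulas from the slide.
mean_formula = a * e_x + b * e_y
var_formula = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

# Direct calculation: treat Z = aX + bY as its own random variable.
mean_direct = expect(lambda x, y: a * x + b * y)
var_direct = expect(lambda x, y: (a * x + b * y - mean_direct) ** 2)

print(round(mean_formula, 2), round(mean_direct, 2))
print(round(var_formula, 2), round(var_direct, 2))
```

Because the covariance in this table is negative, the covariance term lowers the variance of the combination below what the two variance terms alone would give.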
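Normal Probability Distribution
- Many random variables tend to have a normal distribution (a well known bell shape)
- Theoretically, $x \sim N(\beta, \sigma^2)$ where $E(x) = \beta$ and $\mathrm{Var}(x) = \sigma^2$
- The probability density function is $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\beta)^2}{2\sigma^2}\right)$
[Figure: bell-shaped density plotted against $x$, with points $a$ and $b$ marked on the horizontal axis]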
Normal Distribution (cont)
- A family of distributions, each with its own mean and variance. The mean anchors the distribution's center and the variance captures the spread of the bell-shaped curve
- Finding the area under the curve would require integrating the p.d.f., which is too complicated. A computer-generated table gives all the probabilities we need for a normal r.v. that has mean 0 and variance 1
To use the table (pg. 389), we need to take a normal random variable $x \sim N(\mu, \sigma^2)$ and transform it by subtracting the mean and dividing by the standard deviation. This is a linear transformation of $X$ that creates a new random variable that has mean 0 and variance 1: $Z = (x - \mu)/\sigma$ where $Z \sim N(0,1)$
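The standardizing transformation can be sketched in code, with the error function from Python's standard library playing the role of the printed table. The mean, variance, and interval below are hypothetical numbers chosen for illustration, and the function names are not from the text.

```python
import math


def standard_normal_cdf(z):
    """P[Z <= z] for Z ~ N(0, 1), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(a, b, mean, variance):
    """P[a <= X <= b] for X ~ N(mean, variance), via Z = (X - mean) / sd."""
    sd = math.sqrt(variance)
    z_a = (a - mean) / sd  # standardized lower bound
    z_b = (b - mean) / sd  # standardized upper bound
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)


# Hypothetical example: X ~ N(3, 9), probability that X lies between 4 and 6.
prob = normal_prob(4, 6, mean=3, variance=9)
print(round(prob, 4))
```

Here the bounds 4 and 6 become $z$ values of 1/3 and 1, so the answer is the area under the standard normal curve between those two points, which is what a lookup in the table would return.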
Statistical inference: drawing conclusions about a population based on a sample