Title: The Vision Thing Power Thirteen
1The Vision Thing: Power Thirteen
- Bivariate Normal Distribution
2Outline
- Circles around the origin
- Circles translated from the origin
- Horizontal ellipses around the (translated) origin
- Vertical ellipses around the (translated) origin
- Sloping ellipses
3[Figure: axes x and y]
$\mu_x = 0$, $\sigma_x^2 = 1$; $\mu_y = 0$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$
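4[Figure: axes x and y, with a marked on the x axis and b on the y axis]
$\mu_x = a$, $\sigma_x^2 = 1$; $\mu_y = b$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$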
5[Figure: axes x and y]
$\mu_x = 0$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} = 0$
6[Figure: axes x and y]
$\mu_x = 0$, $\sigma_x^2 < \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} = 0$
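7[Figure: axes x and y, with a marked on the x axis and b on the y axis]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$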
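8[Figure: axes x and y, with a marked on the x axis and b on the y axis]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} < 0$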
9Why? The Bivariate Normal Density and Circles
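- $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right\}$
- If the means are zero, the variances are one, and there is no correlation, then
- $f(x, y) = \frac{1}{2\pi}\exp\left[-\tfrac{1}{2}(x^2 + y^2)\right]$, where $f(x, y) = k$, a constant, for an isodensity line
- $\ln(2\pi k) = -\tfrac{1}{2}(x^2 + y^2)$, and $x^2 + y^2 = -2\ln(2\pi k) = r^2$, which is a circle of radius $r$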
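The circle result can be checked numerically. The sketch below assumes the standardized case on this slide (zero means, unit variances, no correlation); the function names and the density level k = 0.05 are illustrative choices, not part of the lecture.

```python
import math

def standard_bvn_density(x, y):
    """Bivariate normal density with zero means, unit variances, rho = 0."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def isodensity_radius(k):
    """Radius r of the circle x^2 + y^2 = r^2 on which the density equals k.

    Meaningful only for 0 < k < 1/(2*pi), the peak density at the origin.
    """
    return math.sqrt(-2.0 * math.log(2.0 * math.pi * k))

k = 0.05  # arbitrary illustrative density level (an assumption)
r = isodensity_radius(k)

# Points at different angles on the circle of radius r share the density k.
for angle in (0.0, 1.0, 2.5):
    x, y = r * math.cos(angle), r * math.sin(angle)
    print(round(standard_bvn_density(x, y), 6))
```

Lower values of k give larger circles, since the density falls off with distance from the origin.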
10Ellipses
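- If $\sigma_x^2 > \sigma_y^2$ and $\rho = 0$, $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left\{-\tfrac{1}{2}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right\}$; now let $x$ stand for the deviation $(x - \mu_x)$, etc.
- $f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left\{-\tfrac{1}{2}\left[\left(\frac{x}{\sigma_x}\right)^2 + \left(\frac{y}{\sigma_y}\right)^2\right]\right\}$, where $f(x, y) = k$, a constant, and
- $\ln(2\pi k\,\sigma_x\sigma_y) = -\tfrac{1}{2}\left[\left(\frac{x}{\sigma_x}\right)^2 + \left(\frac{y}{\sigma_y}\right)^2\right]$
- Rearranging gives $x^2/c^2 + y^2/d^2 = 1$, with $c^2 = -2\sigma_x^2\ln(2\pi k\,\sigma_x\sigma_y)$ and $d^2 = -2\sigma_y^2\ln(2\pi k\,\sigma_x\sigma_y)$, which is an ellipse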
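As a rough check of the ellipse result, the sketch below computes the semi-axes c and d for the uncorrelated case, with x and y measured as deviations from their means. The function names and the numbers (sigma_x = 2, sigma_y = 1, k = 0.02) are assumptions made for illustration only.

```python
import math

def uncorrelated_density(x, y, sigma_x, sigma_y):
    """Bivariate normal density with rho = 0; x and y are deviations from the means."""
    quad = (x / sigma_x) ** 2 + (y / sigma_y) ** 2
    return math.exp(-0.5 * quad) / (2.0 * math.pi * sigma_x * sigma_y)

def ellipse_semi_axes(k, sigma_x, sigma_y):
    """Semi-axes (c, d) of the isodensity ellipse x^2/c^2 + y^2/d^2 = 1.

    Meaningful only for 0 < k < 1/(2*pi*sigma_x*sigma_y), the peak density.
    """
    q = -2.0 * math.log(2.0 * math.pi * k * sigma_x * sigma_y)
    return sigma_x * math.sqrt(q), sigma_y * math.sqrt(q)

sigma_x, sigma_y, k = 2.0, 1.0, 0.02  # illustrative values (assumptions)
c, d = ellipse_semi_axes(k, sigma_x, sigma_y)

# The ends of both axes lie on the same isodensity line.
print(uncorrelated_density(c, 0.0, sigma_x, sigma_y))
print(uncorrelated_density(0.0, d, sigma_x, sigma_y))
```

Because c/d equals sigma_x/sigma_y, a larger variance in x stretches the ellipse horizontally, and a larger variance in y stretches it vertically.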
11Correlation and Rotation of the Axes
[Figure: original axes x and y with rotated axes X and Y]
$\mu_x = 0$, $\sigma_x^2 < \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} < 0$
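The slide gives only the figure, so the following is a sketch under an assumption: that the rotated axes X and Y are the principal axes of the ellipse, found from the standard relation tan(2θ) = 2ρσxσy / (σx² − σy²). The function names and parameter values are hypothetical. In the rotated coordinates the covariance is zero, so a sloping ellipse becomes an unsloped one.

```python
import math

def principal_axis_angle(sigma_x, sigma_y, rho):
    """Angle in radians from the x axis to the rotated X axis (the major axis)."""
    cov_xy = rho * sigma_x * sigma_y
    return 0.5 * math.atan2(2.0 * cov_xy, sigma_x ** 2 - sigma_y ** 2)

def rotated_covariance(sigma_x, sigma_y, rho, theta):
    """Cov(X, Y) for X = x cos(theta) + y sin(theta), Y = -x sin(theta) + y cos(theta)."""
    cov_xy = rho * sigma_x * sigma_y
    return (0.5 * (sigma_y ** 2 - sigma_x ** 2) * math.sin(2.0 * theta)
            + cov_xy * math.cos(2.0 * theta))

# Illustrative values for the slide's case: sigma_x^2 < sigma_y^2 and rho < 0.
sigma_x, sigma_y, rho = 1.0, 2.0, -0.5
theta = principal_axis_angle(sigma_x, sigma_y, rho)

print(math.degrees(theta))                                # rotation angle in degrees
print(rotated_covariance(sigma_x, sigma_y, rho, theta))   # essentially zero
```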
12Bivariate Normal: Marginal and Conditional
- If x and y are independent, then $f(x, y) = f(x)\,f(y)$, i.e. the product of the marginal distributions, $f(x)$ and $f(y)$
- The conditional density function, the density of y conditional on x, $f(y \mid x)$, is the joint density function divided by the marginal density function of x: $f(y \mid x) = f(x, y)/f(x)$
13Conditional Distribution
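- $f(y \mid x) = \frac{1}{\sigma_y\sqrt{2\pi(1-\rho^2)}}\exp\left\{-\frac{\left[y - \mu_y - \rho(x - \mu_x)(\sigma_y/\sigma_x)\right]^2}{2(1-\rho^2)\sigma_y^2}\right\}$
- The mean of the conditional distribution is $\mu_y + \rho(x - \mu_x)(\sigma_y/\sigma_x)$, i.e. this is the expected value of y for a given value of x, $x = x_0$:
- $E(y \mid x = x_0) = \mu_y + \rho(x_0 - \mu_x)(\sigma_y/\sigma_x)$
- The variance of the conditional distribution is $\mathrm{VAR}(y \mid x = x_0) = \sigma_y^2(1 - \rho^2)$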
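The conditional mean and variance can be checked against the definition of the conditional density as the joint density divided by the marginal density of x. The sketch below is a numerical check only; the function names, parameter values, and evaluation point are illustrative assumptions.

```python
import math

def normal_pdf(z, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bvn_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal joint density."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * quad) / norm

def conditional_mean(x0, mu_x, mu_y, sigma_x, sigma_y, rho):
    """E(y | x = x0) = mu_y + rho * (x0 - mu_x) * (sigma_y / sigma_x)."""
    return mu_y + rho * (x0 - mu_x) * (sigma_y / sigma_x)

def conditional_variance(sigma_y, rho):
    """VAR(y | x = x0) = sigma_y^2 * (1 - rho^2); it does not depend on x0."""
    return sigma_y ** 2 * (1.0 - rho ** 2)

# Illustrative parameter values and evaluation point (assumptions).
params = dict(mu_x=1.0, mu_y=2.0, sigma_x=1.5, sigma_y=0.8, rho=0.6)
x0, y0 = 1.7, 2.4

# Joint density divided by the marginal density of x ...
ratio = bvn_pdf(x0, y0, **params) / normal_pdf(x0, params["mu_x"], params["sigma_x"] ** 2)
# ... should match a normal density with the conditional mean and variance.
direct = normal_pdf(y0, conditional_mean(x0, **params),
                    conditional_variance(params["sigma_y"], params["rho"]))
print(abs(ratio - direct) < 1e-12)
```

Setting rho to zero in the same functions reproduces the independence case, where the joint density is the product of the two marginal densities.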
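14[Figure: regression line on axes x and y, with $\mu_x$ and $\mu_y$ marked]
Regression line: intercept $\mu_y - \rho\,\mu_x(\sigma_y/\sigma_x)$, slope $\rho(\sigma_y/\sigma_x)$
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$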
15Bivariate Regression Another Perspective
- The regression line is the $E(y \mid x)$ line if y and x are bivariate normal
- intercept: $\mu_y - \rho\,\mu_x(\sigma_y/\sigma_x)$
- slope: $\rho(\sigma_y/\sigma_x)$
16Example Lab Six
17Example Lab Six
18Correlation Matrix
         GE        INDEX
GE       1.000000  0.636290
INDEX    0.636290  1.000000
19Bivariate Regression Another Perspective
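- The regression line is the $E(y \mid x)$ line if y and x are bivariate normal
- intercept: $\mu_y - \rho\,\mu_x(\sigma_y/\sigma_x)$
- slope: $\rho(\sigma_y/\sigma_x)$
- $\mu_y = 0.022218$
- $\rho = 0.63629$
- $\mu_x = 0.014361$
- $\sigma_x = 0.02543$, $\sigma_y = 0.043669$, so $\sigma_y/\sigma_x = 0.043669/0.02543$
- intercept = 0.0064
- slope = 1.094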
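The arithmetic behind the Lab Six intercept and slope can be sketched as follows. It uses the slope in the form ρ(σy/σx), which is the form consistent with the conditional mean E(y|x) given earlier, and it takes 0.043669 as the standard deviation of y and 0.02543 as that of x. The function name is hypothetical. The computed values come out close to the slide's 0.0064 and 1.094; small differences may reflect rounding in the reported inputs.

```python
def regression_from_moments(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Intercept and slope of the E(y | x) line implied by bivariate normal moments."""
    slope = rho * (sigma_y / sigma_x)
    intercept = mu_y - slope * mu_x
    return intercept, slope

# Values reported on the slide.
intercept, slope = regression_from_moments(
    mu_x=0.014361, mu_y=0.022218,
    sigma_x=0.02543, sigma_y=0.043669,
    rho=0.63629,
)
print(f"intercept = {intercept:.4f}, slope = {slope:.3f}")
```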
21Compare: intercept vs. 0.0064; slope vs. 1.094
22Bivariate Normal Distribution and the Linear Probability Model
23[Figure: axes education and income, showing the Non-Players and Players distributions, with mean education and mean income marked for each group]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$
24[Figure: axes education and income, showing the Non-Players and Players distributions, with mean education and mean income marked for each group]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$
25[Figure: axes education and income, showing the Non-Players and Players distributions separated by a discriminating line, with mean education and mean income marked for each group]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$
26Discriminant Function, Linear Probability Function, and Decision Theory, Lab 6
- Expected Costs of Misclassification
- $E(C) = C(P \mid N)\,P(P \mid N)\,P(N) + C(N \mid P)\,P(N \mid P)\,P(P)$
- Assume $C(P \mid N) = C(N \mid P)$
- Relative frequencies: $P(N) = 23/100 \approx 1/4$, $P(P) = 77/100 \approx 3/4$
- Equalize the two costs of misclassification by setting the fitted value of $P(P \mid N)$, i.e. Bern, to 3/4
- $E(C) = C(P \mid N)(3/4)(1/4) + C(N \mid P)(1/4)(3/4)$
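The expected-cost expression above can be evaluated directly. The sketch below uses the slide's rounded probabilities and, as an added assumption, sets both misclassification costs to 1 (the slide only assumes they are equal). The function and variable names are hypothetical.

```python
from fractions import Fraction

def expected_cost_terms(cost_p_given_n, prob_p_given_n, prior_n,
                        cost_n_given_p, prob_n_given_p, prior_p):
    """The two terms of E(C): non-players called players, players called non-players."""
    term_n = cost_p_given_n * prob_p_given_n * prior_n
    term_p = cost_n_given_p * prob_n_given_p * prior_p
    return term_n, term_p

# Slide values: P(N) ~ 1/4, P(P) ~ 3/4, P(P|N) = 3/4, P(N|P) = 1/4.
# Unit costs are an illustrative assumption.
term_n, term_p = expected_cost_terms(
    cost_p_given_n=1, prob_p_given_n=Fraction(3, 4), prior_n=Fraction(1, 4),
    cost_n_given_p=1, prob_n_given_p=Fraction(1, 4), prior_p=Fraction(3, 4),
)
print(term_n, term_p, term_n + term_p)
```

With equal costs, the two terms are equal, which is the sense in which the 3/4 cutoff equalizes the two expected costs of misclassification.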
27[Figure: axes education and income, showing the Non-Players and Players distributions separated by a discriminating line, with mean education and mean income marked for each group]
$\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$
Note: $P(P \mid N)$ is the area of the non-players distribution below (southwest of) the line
28Set $\text{Bern} = 3/4 = 1.39 - 0.0216\,\text{education} - 0.0105\,\text{income}$, solve for education as a function of income, and plot
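Solving the fitted equation for education gives the discriminating line. The sketch below uses the coefficients quoted on the slide; the function names and the sample income value are hypothetical, and the units of income and education are whatever the lab data use.

```python
def fitted_bern(education, income,
                const=1.39, b_education=-0.0216, b_income=-0.0105):
    """Fitted value of the linear probability model, using the slide's coefficients."""
    return const + b_education * education + b_income * income

def education_on_line(income, cutoff=0.75,
                      const=1.39, b_education=-0.0216, b_income=-0.0105):
    """Education at which the fitted Bern equals `cutoff` for a given income."""
    return (cutoff - const - b_income * income) / b_education

income = 20.0  # arbitrary illustrative income value (an assumption)
education = education_on_line(income)
print(education, fitted_bern(education, income))

# Slope of the line in the (income, education) plane.
print(education_on_line(1.0) - education_on_line(0.0))
```

Passing a different `cutoff`, for example 0.5 instead of 0.75, shifts the line parallel to itself and so changes which observations fall on each side.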
29Seven non-players misclassified, as well as 14 players misclassified
31Decision Theory
- Moving the discriminant line, i.e. changing the cutoff value from 0.75 to 0.5, changes the numbers of those misclassified, favoring one population at the expense of another
- You need an implicit or explicit notion of the costs of misclassification, such as $C(P \mid N)$ and $C(N \mid P)$, to make the necessary judgment of where to draw the line