Title: Multivariate Data Analysis
1 Multivariate Data Analysis
- Sociology 315, Winter 2002
- Week 5 Feb. 4-8
2 Multiple Regression Residualization
$R_{y.12}$ (the multiple correlation) tells us how well the regression plane fits the three-dimensional scatterplot.
$R^2$: proportion of variance in Y explained by all the independent variables.
$r^2$: proportion of variance in Y explained by a single X (the correlation squared).
3 Residualization
[Figure: diagram of Y, X1, and X2]
1. Remove the linear effects of X2 from Y.
$\hat{y} = a + b_y x_2$
$y = a + b_y x_2 + e$
$e = y - \hat{y} = E_{yx_2}$
4
[Figure: two diagrams of Y, X1, and X2]
5
[Figure: diagram of Y and X1]
$r_{res,x_2} = 0$
There should be no correlation between the residuals (the variance in Y left unexplained by X2) and X2.
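As a rough illustration of this check, the sketch below residualizes Y on X2 with a simple least-squares line and confirms that the residuals are uncorrelated with X2. The scores and the helper names (`mean`, `pearson`, `residualize`) are hypothetical, invented for the example; they are not the course data.

```python
def mean(v):
    return sum(v) / len(v)

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def residualize(target, control):
    """Residuals e = target - (a + b * control) from a least-squares line."""
    mt, mc = mean(target), mean(control)
    b = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
         / sum((c - mc) ** 2 for c in control))
    a = mt - b * mc
    return [t - (a + b * c) for t, c in zip(target, control)]

# Hypothetical scores, invented for illustration only.
y = [2, 4, 5, 4, 5, 7, 8, 9]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]

e_y = residualize(y, x2)       # Y with the linear effect of X2 removed
r_res_x2 = pearson(e_y, x2)    # should be 0 up to rounding error
print(abs(r_res_x2) < 1e-9)
```

The same `residualize` helper could be applied to X1 in place of Y; only the target variable changes.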
6
2. Remove the linear effects of X2 from X1.
$\hat{x}_1 = a + b x_2$
$x_1 = a + b x_2 + e$
$e = x_1 - \hat{x}_1 = E_{x_1 x_2}$
7
[Figure: two diagrams of Y, X1, and X2]
8
There should be no correlation between the residuals (the variance in X1 left unexplained by X2) and X2.
[Figure: diagram of Y, X1, and X2]
$r_{res,x_2} = 0$
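9
$$r_{1y.2} = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$$
where X and Y are the residualized X1 and Y scores, or
$$r_{1y.2} = \frac{r_{1y} - r_{12} r_{y2}}{\sqrt{(1 - r_{12}^2)(1 - r_{y2}^2)}}$$
Values range from -1 to 1.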
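The two routes to the partial correlation can be compared numerically. The sketch below computes it once from the three zero-order correlations and once by correlating the two sets of residuals. The data and helper names are hypothetical and chosen only for illustration.

```python
import math

def mean(v):
    return sum(v) / len(v)

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den

def residualize(target, control):
    """Residuals of target after a least-squares line on control."""
    mt, mc = mean(target), mean(control)
    b = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
         / sum((c - mc) ** 2 for c in control))
    a = mt - b * mc
    return [t - (a + b * c) for t, c in zip(target, control)]

# Hypothetical scores, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
y = [2, 4, 5, 4, 5, 7, 8, 9]

r_1y, r_12, r_y2 = pearson(x1, y), pearson(x1, x2), pearson(y, x2)

# Route 1: formula based on the three zero-order correlations.
partial_from_r = (r_1y - r_12 * r_y2) / math.sqrt((1 - r_12 ** 2) * (1 - r_y2 ** 2))

# Route 2: correlate X1 and Y after each has been residualized on X2.
partial_from_residuals = pearson(residualize(x1, x2), residualize(y, x2))

print(abs(partial_from_r - partial_from_residuals) < 1e-9)
```

Both routes should agree up to floating-point rounding, which is the sense in which a partial correlation is a correlation between residuals.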
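12
$$r_{1y.2} = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$$
$$= \frac{5(1.22) - (0.01)(-0.01)}{\sqrt{[5(195.91) - (0.01)^2][5(5.16) - (-0.01)^2]}}$$
$$= \frac{6.10}{\sqrt{25{,}272.39}} \approx 0.04$$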
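The arithmetic on this slide can be rechecked by plugging the sums into the computing formula. The sketch below does that; the function name `r_from_sums` is hypothetical, and the inputs are the sums as they appear in the transcript (the underlying data table is not available here). The sign of the result follows the sign of the numerator, so it depends on the sign of ΣXY as entered.

```python
import math

def r_from_sums(n, sum_xy, sum_x, sum_y, sum_x2, sum_y2):
    """Pearson r from raw sums (the computing formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Sums as transcribed from the slide: N, sum XY, sum X, sum Y, sum X^2, sum Y^2.
r = r_from_sums(5, 1.22, 0.01, -0.01, 195.91, 5.16)
print(round(abs(r), 2))  # magnitude of the coefficient: 0.04
```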
13
[Figure: diagram of Y, X1, and X2 with the pairwise correlations $r_{1y}$, $r_{y2}$, and $r_{12}$ labeled]
14
[Figure: diagram of Y, X1, and X2 with the numerator and denominator areas labeled]
The squared partial correlation ($r^2_{1y.2}$) is the proportion of variance in Y that is shared with X1, controlling for X2.
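15
[Figure: diagram of Y, X1, and X2 with the areas $r_{y1}^2$ and $r_{y2}^2$ labeled]
$r_{y1}^2 + r_{y2}^2$ = total proportion of variance in Y explained by both independent variables.
$R^2_{y.12} = r_{y1}^2 + r_{y2}^2$ if $r_{12}^2 = 0$ (the independent variables have a correlation of 0).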
16
[Figure: diagram of Y, X1, and X2 with X1 and X2 overlapping, $r_{12} > 0$]
If $r_{12}^2 > 0$, then $R^2_{y.12} = r_{y1}^2 + r_{y2}^2$ won't work, because the shaded area would be counted twice.
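The double-counting problem can be seen numerically. The sketch below compares $R^2_{y.12}$ with the simple sum $r_{y1}^2 + r_{y2}^2$ for one data set with uncorrelated predictors and one with highly correlated predictors. Both data sets are hypothetical, and $R^2$ is obtained from the standard two-predictor identity, which is an addition here and does not appear on the slides.

```python
def mean(v):
    return sum(v) / len(v)

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

def compare(y, x1, x2):
    """Return (R^2 of y on x1 and x2, r_y1^2 + r_y2^2)."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # Standard identity for two predictors (assumed, not from the slides).
    r2_multiple = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_multiple, r_y1 ** 2 + r_y2 ** 2

# Case A (hypothetical): predictors with r_12 = 0, so the sum equals R^2.
r2_a, sum_a = compare([1, 2, 4, 7], [-1, -1, 1, 1], [-1, 1, -1, 1])

# Case B (hypothetical): strongly correlated predictors, so the sum overshoots.
r2_b, sum_b = compare([2, 4, 5, 4, 5, 7, 8, 9],
                      [1, 2, 3, 4, 5, 6, 7, 8],
                      [2, 1, 4, 3, 6, 5, 8, 7])

print(abs(r2_a - sum_a) < 1e-12, sum_b > r2_b)
```

In case B the simple sum even exceeds 1, which no proportion of variance can do; that is the overlap being counted twice.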
17 Semi-Partial Correlation $r_{y(1.2)}$
X1 = age
Y = attitude
X2 = education
$r_{y(1.2)}$ = the correlation of age and attitude, controlling for the relationship between education and age.
18
A semi-partial correlation is used to obtain the proportion of variance in the dependent variable (Y) that the first independent variable (X1) accounts for over and above the variance accounted for by the second independent variable (X2).
$$r^2_{y(1.2)} = R^2_{y.12} - r^2_{y2}$$
$$r_{y(1.2)} = \frac{r_{1y} - r_{y2} r_{12}}{\sqrt{1 - r_{12}^2}}$$
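A short numerical check of the semi-partial correlation, under assumed data: the sketch computes it from the zero-order correlations, compares it with the correlation between Y and X1 residualized on X2, and confirms that its square equals $R^2_{y.12} - r^2_{y2}$. The scores and helper names are hypothetical, and $R^2_{y.12}$ comes from the standard two-predictor identity rather than from the slides.

```python
import math

def mean(v):
    return sum(v) / len(v)

def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den

def residualize(target, control):
    """Residuals of target after a least-squares line on control."""
    mt, mc = mean(target), mean(control)
    b = (sum((c - mc) * (t - mt) for c, t in zip(control, target))
         / sum((c - mc) ** 2 for c in control))
    a = mt - b * mc
    return [t - (a + b * c) for t, c in zip(target, control)]

# Hypothetical scores, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
y = [2, 4, 5, 4, 5, 7, 8, 9]

r_1y, r_12, r_y2 = pearson(x1, y), pearson(x1, x2), pearson(y, x2)

# Semi-partial from the zero-order correlations.
semi = (r_1y - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)

# Same quantity: Y is left intact, only X1 is residualized on X2.
semi_from_residuals = pearson(y, residualize(x1, x2))

# R^2 for two predictors (standard identity, assumed here).
r2_multiple = (r_1y ** 2 + r_y2 ** 2 - 2 * r_1y * r_y2 * r_12) / (1 - r_12 ** 2)

print(abs(semi - semi_from_residuals) < 1e-9,
      abs(semi ** 2 - (r2_multiple - r_y2 ** 2)) < 1e-9)
```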
19
[Figure: Venn diagram of Y, X1, and X2 with areas numbered 1 to 4]
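20
Partial: $r^2_{y1.2} = \frac{\text{area } 1}{\text{area } 1 + \text{area } 4}$
Semi-partial: $r^2_{y(1.2)} = \frac{\text{area } 1}{\text{area } 1 + \text{area } 2 + \text{area } 3 + \text{area } 4}$
What is the squared correlation between Y and X1, controlling for X2? What proportion of variance in Y is shared with X1, controlling for X2?
For the semi-partial correlation, Y keeps its total variation (as indicated by 1 + 2 + 3 + 4). The semi-partial correlation will therefore be smaller in absolute value than the partial correlation (the two are equal only when X2 is uncorrelated with Y).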
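The area ratios can be tied back to correlations. In the sketch below, the three correlation values are hypothetical; area 1 is taken to be the squared semi-partial, areas 2 and 3 together are taken to be $r^2_{y2}$, and area 4 is the part of Y that neither predictor explains. That mapping is inferred from the two ratios above, since the diagram itself is not in the transcript.

```python
import math

# Hypothetical zero-order correlations, chosen only for illustration.
r_1y, r_y2, r_12 = 0.5, 0.4, 0.3

semi = (r_1y - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)
partial = (r_1y - r_12 * r_y2) / math.sqrt((1 - r_12 ** 2) * (1 - r_y2 ** 2))

# Areas of Y, with Y's total variance scaled to 1.
area_1 = semi ** 2                    # Y shared with X1 only
areas_2_3 = r_y2 ** 2                 # Y shared with X2 (alone or jointly with X1)
area_4 = 1 - (area_1 + areas_2_3)     # Y explained by neither predictor

partial_sq_from_areas = area_1 / (area_1 + area_4)
semi_sq_from_areas = area_1 / (area_1 + areas_2_3 + area_4)

print(abs(partial ** 2 - partial_sq_from_areas) < 1e-12,
      abs(semi) <= abs(partial))
```

Because the partial correlation drops areas 2 and 3 from the denominator, its square can never be smaller than the squared semi-partial.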
21 Understanding the Partial Correlation Coefficient
1. A partial correlation coefficient is the correlation between two variables after removing from each of them the linear effects of one or more control variables.
2. A partial correlation coefficient is the correlation between two variables among cases that are homogeneous on the control variable(s).
3. A partial correlation coefficient is the correlation between two variables that have each been residualized on one or more control variables.
22 Standardized Partial Regression Coefficient
The slope you would get if you converted Y, X1, and X2 into z-scores:
$$\hat{z}_y = \beta_{y1.2} z_1 + \beta_{y2.1} z_2$$
Why is there no y-intercept? Because when the variables are standardized, their means are 0.
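23
$\beta_{y1.2}$ and $\beta_{y2.1}$ are the standardized coefficients; use the unstandardized b's to compare across groups (for the same independent variable).
$$\beta_{y1.2} = b_{y1.2} \frac{S_1}{S_y} \qquad b_{y1.2} = \beta_{y1.2} \frac{S_y}{S_1}$$
$$\beta_{y2.1} = b_{y2.1} \frac{S_2}{S_y} \quad \text{or} \quad \beta_{y2.1} = \frac{r_{y2} - r_{y1} r_{21}}{1 - r_{21}^2}$$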
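A minimal sketch of the conversion between standardized and unstandardized slopes, under assumed data. It computes the standardized coefficients from the correlations, rescales them by the standard deviations, and checks two things: the resulting equation leaves residuals that are orthogonal to both predictors (the defining property of a least-squares fit), and the intercept vanishes once every variable is a z-score. The scores and all names are hypothetical.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def pearson(u, v):
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den

def zscores(v):
    m, s = mean(v), sd(v)
    return [(a - m) / s for a in v]

# Hypothetical scores, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
y = [2, 4, 5, 4, 5, 7, 8, 9]

r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)

# Standardized coefficients from the correlations.
beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
beta_2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

# Unstandardized slopes: rescale by the ratio of standard deviations.
b_1 = beta_1 * sd(y) / sd(x1)
b_2 = beta_2 * sd(y) / sd(x2)
intercept = mean(y) - b_1 * mean(x1) - b_2 * mean(x2)

residuals = [yi - (intercept + b_1 * u + b_2 * v) for yi, u, v in zip(y, x1, x2)]
cross_1 = sum(e * u for e, u in zip(residuals, x1))  # about 0 for a least-squares fit
cross_2 = sum(e * v for e, v in zip(residuals, x2))  # about 0 for a least-squares fit

# With z-scores every mean is 0, so the intercept is 0 as well.
z_intercept = (mean(zscores(y)) - beta_1 * mean(zscores(x1))
               - beta_2 * mean(zscores(x2)))

print(abs(cross_1) < 1e-8, abs(cross_2) < 1e-8, abs(z_intercept) < 1e-9)
```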
24 Standard error of estimate
$$S_e = S_y \sqrt{1 - R^2_{y.12}}$$
The smaller the standard error of estimate, the larger the $R^2$.
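The inverse relationship between the standard error of estimate and $R^2$ follows directly from the formula. The sketch below evaluates it for a fixed, hypothetical $S_y$ of 10 and a few hypothetical $R^2$ values; the function name is an assumption made for the example.

```python
import math

def standard_error_of_estimate(s_y, r_squared):
    """S_e = S_y * sqrt(1 - R^2), as given on the slide."""
    return s_y * math.sqrt(1 - r_squared)

s_y = 10.0  # hypothetical standard deviation of Y
errors = [standard_error_of_estimate(s_y, r2) for r2 in (0.0, 0.36, 0.75)]
print(errors)  # shrinks from S_y toward 0 as R^2 grows
```

With $R^2 = 0$ the standard error equals $S_y$ itself, so knowing the predictors gives no improvement over predicting the mean of Y.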