Title: Correlation Calculations for Multiple Regression
After calculating the regression parameters (b
values), we can also calculate the correlation
coefficient.
To get the correlation coefficient (r), we first
need to calculate r^2.
Actually, in the multiple-regression case, we
only get the absolute value of the correlation
coefficient (r); the "direction" of the
correlation is determined by the sign of each b
value.
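As a small illustration of the point about direction, here is a minimal sketch for the single-variable case, where the sign of r matches the sign of the slope b1. The function name signed_r and the sample numbers are hypothetical, chosen only for this example.

```python
import math

def signed_r(r_squared, b1):
    """Single-variable case: take sqrt(r^2) and attach the sign of the slope b1."""
    return math.copysign(math.sqrt(r_squared), b1)

# Hypothetical values: r^2 = 0.64 with a negative slope.
r = signed_r(0.64, -0.8)
print(r)
```

With several independent variables there is no single slope to borrow a sign from, which is why only the absolute value of r is reported and each b value is inspected separately.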
Before trying multiple regression, let's look
again at the case with one independent variable.
These data points come from test case 4 of lab 3.
We already know how to calculate the regression
parameters.
b0 = -0.351493739, b1 = 0.094962426
[Figure: plot of the data points; x axis from 0 to 2000, y axis from 0.0 to 200.0]
If we evaluate the regression line equation at
each x value, we get the predicted y values.
ypred = 0.094962426 x - 0.351493739
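The evaluation step can be sketched in a few lines. The b0 and b1 values are the ones quoted in these slides; the x values in the list are hypothetical placeholders, not the lab's data points.

```python
# Regression parameters quoted in the slides (test case 4 of lab 3).
B0 = -0.351493739
B1 = 0.094962426

def predict(x):
    """Evaluate the regression line ypred = B1 * x + B0 at one x value."""
    return B1 * x + B0

# Hypothetical x values, used only to show the call pattern.
xs = [0, 500, 1000, 1500, 2000]
y_pred = [predict(x) for x in xs]
print(y_pred)
```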
To determine the correlation, we also need to
calculate the mean y value (yavg).
yavg = 60.3 (mean of the original y values)
Next, we need to sum the squares of two
differences: (y - yavg) and (ypred - yavg).
Σ (y - yavg)^2
Σ (ypred - yavg)^2
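A minimal sketch of the mean and the two sums follows. The data are hypothetical (not the lab's test case): four y values whose least-squares line over x = 1, 2, 3, 4 is ypred = 0.8 x + 0.5. The helper names are also assumptions made for this example.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def sum_squared_diffs(values, center):
    """Sum of (v - center)^2 over all values."""
    return sum((v - center) ** 2 for v in values)

# Hypothetical observed y values and their predicted values.
ys = [1.0, 3.0, 2.0, 4.0]
y_pred = [1.3, 2.1, 2.9, 3.7]

y_avg = mean(ys)
total_ss = sum_squared_diffs(ys, y_avg)          # sum of (y - yavg)^2
predicted_ss = sum_squared_diffs(y_pred, y_avg)  # sum of (ypred - yavg)^2
print(y_avg, total_ss, predicted_ss)
```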
Once we have the two sums, we can calculate the
correlation coefficient.
r^2 = Σ (ypred - yavg)^2 / Σ (y - yavg)^2, and r is the square root of r^2.
Just in case you are curious, statisticians
label the sum-of-squares values like this:
- Sum of squares error (unexplained): Σ (y - ypred)^2
- Total sum of squares (variability): Σ (y - yavg)^2
- Sum of squares predicted (explained): Σ (ypred - yavg)^2
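The whole calculation can be sketched as one function that takes the observed and predicted y values. The function name correlation and the data are hypothetical; the data are four y values whose least-squares line over x = 1, 2, 3, 4 is ypred = 0.8 x + 0.5. The last lines check that, for a least-squares fit, the total sum of squares equals the explained part plus the unexplained part.

```python
import math

def correlation(ys, y_pred):
    """Return (r_squared, r) from observed and predicted y values."""
    y_avg = sum(ys) / len(ys)
    total_ss = sum((y - y_avg) ** 2 for y in ys)          # total (variability)
    predicted_ss = sum((p - y_avg) ** 2 for p in y_pred)  # predicted (explained)
    r_squared = predicted_ss / total_ss
    return r_squared, math.sqrt(r_squared)

# Hypothetical observed y values and their least-squares predictions.
ys = [1.0, 3.0, 2.0, 4.0]
y_pred = [1.3, 2.1, 2.9, 3.7]
r_squared, r = correlation(ys, y_pred)

# Error (unexplained) sum of squares, for comparison with the other two sums.
error_ss = sum((y - p) ** 2 for y, p in zip(ys, y_pred))
print(r_squared, r, error_ss)
```

Note that this sketch divides by the total sum of squares, so it assumes the y values are not all identical.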
Here is an example using the data points from
test case 4 of lab 3.
To extend the correlation calculation to handle
multiple independent variables, the only change
is in calculating the predicted y values.
One independent variable ("linear regression"):
ypred = b0 + b1 x
One or more independent variables ("multiple
regression"):
ypred = b0 + b1 x1 + b2 x2 + ... + bk xk
Obviously, both are forms of "linear" regression,
despite the names.
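To show that only the prediction step changes, here is a minimal sketch with two independent variables. Everything in it is hypothetical: the function names, the data, and the b values. The y values were generated from y = 1 + 2 x1 + 3 x2, so the b values below fit the data exactly and r comes out as 1.

```python
import math

def predict(b, xs):
    """ypred = b[0] + b[1]*xs[0] + ... + b[k]*xs[k-1] for one data point."""
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], xs))

def correlation(ys, y_pred):
    """Absolute value of r; identical to the single-variable calculation."""
    y_avg = sum(ys) / len(ys)
    total_ss = sum((y - y_avg) ** 2 for y in ys)
    predicted_ss = sum((p - y_avg) ** 2 for p in y_pred)
    return math.sqrt(predicted_ss / total_ss)

# Hypothetical data: each row is (x1, x2); ys follow y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
ys = [1, 3, 4, 6, 14]
b = [1, 2, 3]  # b0, b1, b2 (assumed to be already calculated)

y_pred = [predict(b, row) for row in rows]
r = correlation(ys, y_pred)
print(y_pred, r)
```

The same predict function covers the single-variable case when b has two entries and each row holds one x value.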
Please complete the following assignment before
the start of the next class.
- Read textbook pages 133-202.
- Complete (at least most of) lab 4
- Due Tuesday evening when we return
- Enjoy your break!