Title: Multiple regression
1Multiple regression
2Overview
- Simple linear regression
- SPSS output
- Linearity assumption
- Multiple regression
- in action: 7 steps
- checking assumptions (and repairing)
- Presenting multiple regression in a paper
3Simple linear regression
- Class attendance and language learning
- Bob: 10 classes, 100 words
- Carol: 15 classes, 150 words
- Dave: 12 classes, 120 words
- Ann: 17 classes, 170 words
Here's some data. We expect that the more classes
someone attends, the more words they learn.
4The straight line is the model for the data. The
definition of the line (y = mx + c) summarises
the data.
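As a rough illustration of how the line y = mx + c is obtained, the sketch below fits an ordinary least-squares line to the four students listed above. The function name `fit_line` is made up for this example. Note that these four points happen to lie exactly on a line, so the fit is perfect here; real data will scatter around the line.

```python
# Data from the slide: name -> (classes attended, words learned)
students = {
    "Bob": (10, 100),
    "Carol": (15, 150),
    "Dave": (12, 120),
    "Ann": (17, 170),
}

xs = [classes for classes, _ in students.values()]
ys = [words for _, words in students.values()]


def fit_line(xs, ys):
    """Ordinary least-squares slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


m, c = fit_line(xs, ys)
print(f"words = {m:.2f} * classes + {c:.2f}")
```

The slope m is the number of extra words associated with one extra class; the intercept c is the predicted vocabulary for someone who attended no classes.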
5SPSS output for simple regression (1/3)
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .792 (a)   .627       .502                25.73131
a. Predictors: (Constant), classes
b. Dependent Variable: vocabulary
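The figures in the Model Summary are related to each other, and a short sketch can make that concrete. R Square is simply R squared, and Adjusted R Square shrinks R Square according to the number of cases and predictors. The number of cases is not given on the slide; n = 5 below is an assumption, chosen because it reproduces the reported value to within rounding.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R squared for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r = 0.792          # R from the Model Summary
r2 = 0.627         # R Square from the Model Summary
n_cases = 5        # ASSUMPTION: not stated on the slide
k_predictors = 1   # one predictor: classes

adj = adjusted_r_squared(r2, n_cases, k_predictors)
print(f"R squared from R: {r ** 2:.3f}")
print(f"Adjusted R squared: {adj:.3f}")
```

The adjusted value comes out very close to the .502 in the table; the small difference in the third decimal arises because the inputs here are already rounded.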
6SPSS output for simple regression (2/3)
7SPSS output for simple regression (3/3)
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t       Sig.
1  (Constant)    -19.178    64.837                                         -.296   .787
   classes       10.685     4.762              .792                        2.244   .111
a. Dependent Variable: vocabulary
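Two things can be read off the Coefficients table, sketched below using only the numbers it reports. First, each t value is the coefficient B divided by its standard error. Second, the B column gives the regression equation, vocabulary = -19.178 + 10.685 × classes. The helper name `predict_vocabulary` is made up for this example.

```python
# (B, Std. Error) for each row of the Coefficients table
coefficients = {
    "(Constant)": (-19.178, 64.837),
    "classes": (10.685, 4.762),
}

# t is the coefficient divided by its standard error
t_values = {name: b / se for name, (b, se) in coefficients.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")


def predict_vocabulary(classes):
    """Predicted vocabulary from the fitted regression equation."""
    intercept = coefficients["(Constant)"][0]
    slope = coefficients["classes"][0]
    return intercept + slope * classes


print(f"Predicted vocabulary for 12 classes: {predict_vocabulary(12):.3f}")
```

With Sig. = .111 for classes, the slope is not significantly different from zero at the conventional .05 level in this small sample.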
8Linearity assumption
- Always check that the relationship between each
predictor variable and the outcome is linear
9Multiple regression
- More than one predictor
- e.g. predict vocabulary from
- classes + homework + L1 vocabulary
10Multiple regression in action
- Bivariate correlations and scatterplots: check for outliers
- Analyse / Regression
- Overall fit (R²) and its significance (F)
- Coefficients for each predictor (the "m"s)
- Regression equation
- Check multicollinearity (Tolerance)
- Check residuals are normally distributed
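The fitting steps above (overall fit, coefficients, regression equation) can be sketched outside SPSS as follows. This is a minimal illustration, not what SPSS does internally: it solves the least-squares normal equations in plain Python. The data are HYPOTHETICAL, invented for this example, and are built to follow vocabulary = 20 + 8 × classes + 3 × homework exactly so that the recovered coefficients can be checked by eye. The function names are also made up.

```python
# HYPOTHETICAL data, invented for illustration: (classes, homework hours)
predictors = [(10, 2), (15, 1), (12, 4), (17, 3), (9, 5), (14, 0)]
# Vocabulary scores constructed exactly as 20 + 8*classes + 3*homework
vocabulary = [20 + 8 * c + 3 * h for c, h in predictors]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple(rows, y):
    """Least-squares coefficients [constant, b1, b2, ...] via (X'X) b = X'y."""
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, y, coefs):
    """Proportion of variance in y accounted for by the fitted equation."""
    predicted = [coefs[0] + sum(b * x for b, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


coefs = fit_multiple(predictors, vocabulary)
constant, b_classes, b_homework = coefs
r2 = r_squared(predictors, vocabulary, coefs)
print(f"vocabulary = {constant:.2f} + {b_classes:.2f}*classes + {b_homework:.2f}*homework")
print(f"R squared = {r2:.3f}")
```

Because the made-up scores contain no noise, R² is 1 here; with real data it will be lower, and its significance is what the F test reports.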
11Bivariate outlier
12Multivariate outlier
- Test
- Mahalanobis distance
- (In SPSS, click the Save button in the Regression dialog)
- to test significance, treat as a chi-square value
- with df = number of predictors
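To show what the saved Mahalanobis values represent, here is a minimal sketch for the special case of two predictors, where the covariance matrix can be inverted by hand. The data are HYPOTHETICAL, invented for this example, with the last case placed far from the others. The p-value helper uses the fact that a chi-square variable with df = 2 has upper-tail probability exp(-x/2); it is valid for two predictors only, and other df would need a chi-square table or a statistics library.

```python
import math

# HYPOTHETICAL data, invented for illustration: (classes, homework hours)
cases = [(10, 2), (15, 1), (12, 4), (17, 3), (9, 5), (14, 0), (11, 3), (30, 12)]


def mahalanobis_squared(cases):
    """Squared Mahalanobis distance of each case from the centroid (2 variables)."""
    n = len(cases)
    mean_x = sum(x for x, _ in cases) / n
    mean_y = sum(y for _, y in cases) / n
    # Sample variances and covariance (denominator n - 1)
    sxx = sum((x - mean_x) ** 2 for x, _ in cases) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in cases) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in cases) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in cases:
        dx, dy = x - mean_x, y - mean_y
        # d' S^-1 d written out for a 2x2 covariance matrix
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi_square_p_df2(value):
    """Upper-tail p for a chi-square value with df = 2 (two predictors only)."""
    return math.exp(-value / 2)


d2 = mahalanobis_squared(cases)
for case, dist in zip(cases, d2):
    print(f"{case}: D squared = {dist:.2f}, p = {chi_square_p_df2(dist):.3f}")
```

The case with the largest distance and smallest p is the candidate multivariate outlier.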
13Multicollinearity
- Tolerance should not be too close to zero
- T = 1 - R²
- where R² is for prediction of this predictor by the others
- If it fails, you need to reduce the number of predictors (you don't need the extra ones anyway)
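Tolerance is one minus the R² obtained when a predictor is itself predicted from the other predictors. With only two predictors, that R² is just the squared correlation between them, which makes for a short sketch. The data are HYPOTHETICAL, invented for this example, and the third variable is deliberately built as an exact linear function of classes to show what a failing tolerance looks like.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def tolerance_two_predictors(a, b):
    """Tolerance of predictor a when b is the only other predictor."""
    return 1 - pearson_r(a, b) ** 2


# HYPOTHETICAL data, invented for illustration
classes = [10, 15, 12, 17, 9, 14]
homework = [2, 1, 4, 3, 5, 0]
minutes_in_class = [50 * c + 10 for c in classes]  # redundant with classes

print(f"classes vs homework: T = {tolerance_two_predictors(classes, homework):.3f}")
print(f"classes vs minutes_in_class: T = {tolerance_two_predictors(classes, minutes_in_class):.3f}")
```

The redundant predictor yields a tolerance of essentially zero, which is the situation where one of the two should be dropped.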
14Failed normality assumption
- If residuals do not (roughly) follow a normal distribution
- it is often because one or more predictors is not normally distributed
- You may be able to transform the predictor
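As one example of such a repair, the sketch below applies a log transform to a positively skewed predictor and compares skewness before and after. The log transform is only one common option and is an assumption here, since the slide does not name a specific transformation; the data are HYPOTHETICAL, invented for this example.

```python
import math


def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# HYPOTHETICAL homework hours with a long right tail, invented for illustration
hours = [1, 2, 2, 3, 4, 5, 8, 13, 40]
log_hours = [math.log(h) for h in hours]

print(f"skewness before: {skewness(hours):.2f}")
print(f"skewness after log transform: {skewness(log_hours):.2f}")
```

After transforming, the regression would be rerun with the transformed predictor and the residuals checked again.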
15Categorical predictor
- Typically predictors are continuous variables
- Categorical predictors
- e.g. Sex (male, female)
- can code as 0, 1
- Compare simple regression with t-test
- (vocabulary = constant + Sex)
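The link between the regression and the t-test can be seen in a small sketch. With Sex coded 0 and 1, the fitted constant is the mean of the group coded 0 and the coefficient for Sex is the difference between the two group means, which is the quantity an independent-samples t-test examines. The data are HYPOTHETICAL, invented for this example, and which sex receives which code is arbitrary.

```python
# HYPOTHETICAL data, invented for illustration
sex = [0, 0, 0, 0, 1, 1, 1, 1]  # arbitrary coding, e.g. 0 = male, 1 = female
vocabulary = [100, 120, 110, 130, 150, 140, 160, 170]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def group_mean(code):
    """Mean vocabulary of the cases with the given Sex code."""
    scores = [v for s, v in zip(sex, vocabulary) if s == code]
    return sum(scores) / len(scores)


slope, intercept = fit_line(sex, vocabulary)
mean0, mean1 = group_mean(0), group_mean(1)

print(f"constant = {intercept:.1f}, mean of group 0 = {mean0:.1f}")
print(f"coefficient for Sex = {slope:.1f}, difference in means = {mean1 - mean0:.1f}")
```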
16Presenting multiple regression
- Table is a good idea
- Include correlations (bivariate)
- Adjusted R²
- Report F (df, df), and its p, for the overall model
- Report N
- Coefficient, t, and p (sig.) for each predictor
- Mention that assumptions of linearity, normality,
and absence of multicollinearity were checked,
and satisfied
17Further reading
- Tabachnick & Fidell (2001, 2007). Using
Multivariate Statistics. Ch. 5: Multiple regression