Title: PSY294 Statistics for Psychology
To think about
What causes crime? Suppose that an investigation
is to be carried out into the possible causes of
crime. The response variable (Y) is the crime
rate (number of offences per 1000 population) in
each authority area (county or group of
counties). List the variables which might be
measured in a survey, to explain why crime rates
vary.
Possible answers
- Population size/breakdown
- Urban/rural
- M/F ratio
- Youth population
- Per capita income
- Racial mix
- Unemployment rate
- Percentage of single-parent families
- Truancy rates
- Drug use
- Educational achievement (e.g. university participation)
- Number of police
- Unreported crime
- Number of pubs
- Number of churches
- Alcohol consumption
- Car ownership
- Weather
- Successful local football team
Property of correlation coefficient
$r^2$ (or R-SQ) = proportion of the total variability in the
Y-variable which can be explained by its
dependence on the predictor variable X
e.g. In week 2 (IQ vs GP-average),
73.2% of the variability in GP-average can be
explained by its dependence on IQ
Multiple linear regression
Single response variable, more than one predictor
variable
Best fitting relationship
$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$
to which we add a random scatter e
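As a rough illustration of what "best fitting" means here, the sketch below fits a relationship of the form Y = b0 + b1*X1 + b2*X2 by least squares, using plain Python and the normal equations. The function names and the small data set are made up for illustration (they are not the course data), and a statistical package would normally do this work.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the constant term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up illustration data (NOT the course data set): two predictors, no scatter
rows = [(100.0, 5.0), (110.0, 12.0), (120.0, 8.0),
        (130.0, 15.0), (105.0, 20.0), (125.0, 10.0)]
y_exact = [1.0 + 0.05 * x1 + 0.1 * x2 for x1, x2 in rows]

coefs = fit_multiple_regression(rows, y_exact)
print([round(c, 4) for c in coefs])
```

Because no scatter was added to the made-up Y values, the fit should recover the coefficients used to generate them; with real data the scatter e means the fitted line only approximates each observation.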
Multiple linear regression
- Main problems
  - Calculations become very difficult to do by hand
  - Scatterplots do not tell the whole story
- Solutions
  - Use a statistical computer package
  - Analyse the residuals carefully
Multiple linear regression
MTB > regress c1 2 c2 c3;
SUBC> residuals c4;
SUBC> fits c5;
SUBC> predict 110 8.

Regression Analysis: GP-ave versus IQ, study

The regression equation is
GP-ave = - 5.25 + 0.0494 IQ + 0.118 study

Predictor     Coef  SE Coef      T      P
Constant    -5.249    1.166  -4.50  0.001
IQ         0.04940  0.01047   4.72  0.001
Study      0.11800  0.02803   4.21  0.002

S = 0.265390   R-Sq = 91.0%   R-Sq(adj) = 89.0%

Annotations on the output:
- T and P in the IQ row: test of significance of IQ
- T and P in the Study row: test of significance of Study
- S: scatter
- R-Sq: multiple correlation
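The T column in the Minitab output is each coefficient divided by its standard error. A minimal sketch that reproduces the three T values from the Coef and SE Coef columns (the dictionary layout and variable names are just for illustration):

```python
# (Coef, SE Coef) pairs copied from the Minitab output
coefficient_table = {
    "Constant": (-5.249, 1.166),
    "IQ": (0.04940, 0.01047),
    "Study": (0.11800, 0.02803),
}

# T = Coef / SE Coef for each predictor
t_ratios = {name: coef / se for name, (coef, se) in coefficient_table.items()}

for name, t in t_ratios.items():
    print(f"{name:<9} T = {t:6.2f}")
```

A T value far from zero (and hence a small P) suggests that the predictor contributes to the fit after the other predictor has been taken into account.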
Multiple linear regression
Analysis of Variance

Source          DF      SS      MS      F      P
Regression       2  6.3886  3.1943  45.35  0.000
Residual Error   9  0.6339  0.0704
Total           11  7.0225

Source  DF  Seq SS
IQ       1  5.1406
study    1  1.2480
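The summary figures reported by Minitab (S, R-Sq, R-Sq(adj), F) can be cross-checked from the sums of squares in the analysis of variance table. The sketch below uses the standard definitions; the variable names are illustrative, and the last two lines use the sequential sum of squares for IQ to recover the figures for the IQ-only model.

```python
import math

# Sums of squares and degrees of freedom from the analysis of variance table
ss_regression, ss_error, ss_total = 6.3886, 0.6339, 7.0225
df_regression, df_error, df_total = 2, 9, 11
seq_ss_iq = 5.1406  # sequential SS for IQ, i.e. IQ fitted first

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_ratio = ms_regression / ms_error                   # F statistic
s = math.sqrt(ms_error)                              # standard deviation of scatter
r_sq = ss_regression / ss_total                      # R-Sq as a proportion
r_sq_adj = 1 - ms_error / (ss_total / df_total)      # R-Sq(adj) as a proportion

# IQ-only model: one predictor, so residual df = 11 - 1 = 10
r_sq_iq_only = seq_ss_iq / ss_total
s_iq_only = math.sqrt((ss_total - seq_ss_iq) / (df_total - 1))

print(f"F = {f_ratio:.2f}, S = {s:.4f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"IQ only: R-Sq = {100 * r_sq_iq_only:.1f}%, S = {s_iq_only:.4f}")
```

Small differences in the last decimal place against the Minitab output are to be expected, because the table entries are themselves rounded.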
Multiple linear regression
Unusual Observations

Obs   IQ  GP-ave     Fit  SE Fit  Residual  St Resid
  8  130  2.0000  2.5883  0.0875   -0.5883    -2.35R

R denotes an observation with a large standardized residual.

Predicted Values for New Observations

New Obs     Fit  SE Fit            95% CI            95% PI
      1  1.1284  0.1529  (0.7825, 1.4743)  (0.4355, 1.8212)
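The intervals and the standardized residual in this output can be reproduced from S, Fit and SE Fit using the usual textbook formulas. This is a sketch under two assumptions that are not stated on the slide: the two-sided 95% t value for 9 residual degrees of freedom is taken from standard tables as 2.262, and the standardized residual is taken to be the residual divided by sqrt(S^2 - SE Fit^2).

```python
import math

T_CRIT = 2.262   # assumed: tabulated t value, 9 df, 95% two-sided
s = 0.265390     # standard deviation of scatter from the regression output

# New observation (IQ = 110, study = 8)
fit, se_fit = 1.1284, 0.1529

# Confidence interval for the mean response: fit +/- t * SE Fit
ci = (fit - T_CRIT * se_fit, fit + T_CRIT * se_fit)

# Prediction interval for a single new student: also allows for the scatter s
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - T_CRIT * se_pred, fit + T_CRIT * se_pred)

# Standardized residual for observation 8
residual_8, se_fit_8 = -0.5883, 0.0875
st_resid_8 = residual_8 / math.sqrt(s ** 2 - se_fit_8 ** 2)

print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"95% PI: ({pi[0]:.4f}, {pi[1]:.4f})")
print(f"St Resid for obs 8: {st_resid_8:.2f}")
```

The prediction interval is wider than the confidence interval because it covers an individual student rather than the average of all students with the same IQ and study hours.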
Multiple linear regression
GP-ave = - 5.25 + 0.0494 IQ + 0.118 study
c.f. last week: GP-ave = - 7.01 + 0.0741 IQ
Effect on prediction: e.g. student number 12 has
IQ = 138, study-hrs = 18.
If we use IQ only, we predict GP-average = 3.216
If we use IQ + study hrs, we predict 3.691
The first is out by 0.38, the second by 0.09
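The two predictions for student number 12 follow directly from substituting IQ = 138 and study hours = 18 into the two fitted equations. A minimal sketch (the function names are illustrative):

```python
def predict_iq_only(iq):
    """Last week's simple regression: GP-ave = -7.01 + 0.0741 IQ."""
    return -7.01 + 0.0741 * iq


def predict_iq_and_study(iq, study_hours):
    """This week's multiple regression: GP-ave = -5.25 + 0.0494 IQ + 0.118 study."""
    return -5.25 + 0.0494 * iq + 0.118 * study_hours


prediction_iq_only = predict_iq_only(138)
prediction_both = predict_iq_and_study(138, 18)

print(f"IQ only:        {prediction_iq_only:.3f}")
print(f"IQ + study hrs: {prediction_both:.3f}")
```

The results agree with the slide values to within rounding of the printed coefficients.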
Multiple linear regression
- Note other changes
  - Standard deviation of scatter (s) falls from 0.4338 to 0.2654 (less scatter)
  - R-SQ increases from 73.2% to 91% (more variability is explained)
  - Both p-values are small (both X variables make important contributions)
Multiple and partial correlation
Calculate pairwise correlations (using Minitab), with Y = GP-average, $X_1$ = IQ, $X_2$ = study hours:
Corr(Y, $X_1$) = 0.856, R-SQ = 73.2%
Corr(Y, $X_2$) = 0.829, R-SQ = 68.7%
Also Corr($X_1$, $X_2$) = 0.560.
Multiple and partial correlation
The multiple R-SQ can be calculated directly from the pairwise correlations using the formula
in the notes
c.f. Minitab R-SQ = 91%
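The formula from the notes is not reproduced on this slide. As a sketch, assuming it is the standard two-predictor result R-SQ = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2), the pairwise correlations 0.856, 0.829 and 0.560 give a value close to the Minitab figure. The function and argument names are illustrative.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Multiple R-squared for two predictors from the three pairwise correlations.

    r_y1, r_y2: correlations of Y with each predictor
    r_12: correlation between the two predictors
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Pairwise correlations: Y with IQ, Y with study hours, IQ with study hours
multiple_r_sq = r_squared_two_predictors(0.856, 0.829, 0.560)
print(f"R-SQ = {100 * multiple_r_sq:.1f}%")
```

The small difference from Minitab's 91.0% comes from using correlations rounded to three decimal places.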
Multiple and partial correlation
Partial correlation is the contribution of a
variable after fitting another. For example, take the
correlation of Y with $X_2$ (study hours) after fitting $X_1$ (IQ):
$X_2$ explains 66.6% of the variation remaining after $X_1$ has been fitted
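The 66.6% figure can be reproduced from the pairwise correlations Corr(Y, IQ) = 0.856, Corr(Y, study) = 0.829 and Corr(IQ, study) = 0.560. The sketch below assumes the standard first-order partial correlation formula, which may be written differently in the notes; the function and argument names are illustrative.

```python
import math


def partial_correlation(r_y2, r_y1, r_12):
    """Correlation of Y with predictor 2 after fitting predictor 1.

    r_y2: correlation of Y with predictor 2
    r_y1: correlation of Y with predictor 1 (the one already fitted)
    r_12: correlation between the two predictors
    """
    return (r_y2 - r_y1 * r_12) / math.sqrt((1 - r_y1 ** 2) * (1 - r_12 ** 2))


# Study hours after fitting IQ
partial_r = partial_correlation(r_y2=0.829, r_y1=0.856, r_12=0.560)
partial_r_sq = partial_r ** 2

print(f"partial correlation = {partial_r:.3f}")
print(f"share of remaining variation explained = {100 * partial_r_sq:.1f}%")
```

Squaring the partial correlation gives the proportion of the variation left over after fitting IQ that is then explained by study hours.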