Title: Assumptions Underlying Parametric Statistical Techniques
1Chapter 13
- Assumptions Underlying Parametric Statistical
Techniques
2Parametric Statistics
- We have been studying parametric statistics.
- They include estimations of mu and sigma,
correlation, t tests and F tests.
3Five Assumptions
To validly use parametric statistics, we make
- two research assumptions,
- two assumptions about the type of the
distributions in the samples,
- and one assumption about the kind of numbering
system that we are using.
4Research Assumptions
- Subjects have to be randomly selected from the
population.
- Experimental error is randomly distributed across
samples in the design.
- (We will not discuss these any further.)
5Distribution Assumptions
- The distribution of sample means fits a normal
curve.
- Homogeneity of variance (using FMAX).
6Assumptions about Numbering Schemes
- The measures we take are on an interval scale.
- (Other numbering scales, such as ordinal and
nominal, do not allow the estimation of
population parameters such as mu and sigma, and
the tests used to analyze such data are therefore
called nonparametric.)
7Violating the Assumptions
If any of these assumptions are violated, we
cannot use parametric statistics. We must use
less-powerful, non-parametric statistics.
8Sample Means Form a Normal Curve
9Sample Means
- An assumption we need to make is that the
distribution of sample means is normally
distributed.
- This is not as extreme an assumption as it might
seem.
- We will follow the example in the book to
demonstrate (only smaller).
10An Artificial Population
- Seven subjects.
- Each subject has a different score.
- We sample five subjects.
Subject  A  B  C  D  E  F  G
Score    1  2  3  4  5  6  7
11The Distribution is Rectangular
[Figure: frequency (0 to 3) by score (1 to 7); each score occurs exactly once.]
12All Possible Samples
Sample  Scores  Mean
ABCDE   12345   3.0
ABCDF   12346   3.2
ABCDG   12347   3.4
ABCEF   12356   3.4
ABCEG   12357   3.6
ABCFG   12367   3.8
ABDEF   12456   3.6
ABDEG   12457   3.8
ABDFG   12467   4.0
ABEFG   12567   4.2
ACDEF   13456   3.8
ACDEG   13457   4.0
ACDFG   13467   4.2
ACEFG   13567   4.4
ADEFG   14567   4.6
BCDEF   23456   4.0
BCDEG   23457   4.2
BCDFG   23467   4.4
BCEFG   23567   4.6
BDEFG   24567   4.8
CDEFG   34567   5.0
13Sample Distribution
14Normal Curve for Sample Means Conclusion
- Even if we have a small population (7),
- with a rectangular distribution,
- and a small sample size (5),
- which yields a small number of possible samples
(21),
- the sample means tend to fall in an
(approximately) normal distribution.
- This assumption that the distribution of sample
means will basically fit a normal curve is seldom
violated.
- This assumption is robust.
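The enumeration behind this conclusion can be reproduced with a short sketch. It assumes the seven-subject population from the slides (scores 1 to 7) and samples of five drawn without replacement; the variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Artificial population from the slides: subjects A-G with scores 1-7.
population = dict(zip("ABCDEFG", range(1, 8)))

# Every possible sample of five subjects, paired with its mean.
sample_means = {
    "".join(sample): sum(population[s] for s in sample) / 5
    for sample in combinations(population, 5)
}

# Tally how often each sample mean occurs.
distribution = Counter(round(m, 1) for m in sample_means.values())

# Crude text histogram of the distribution of sample means.
for value in sorted(distribution):
    print(f"{value:.1f} {'*' * distribution[value]}")
```

The tally is symmetric around 4.0, with the most frequent means in the middle (3.8 to 4.2) and single samples at the extremes (3.0 and 5.0).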
15But it can happen -Violating the Normal Curve
Assumption
Distributions of sample means can vary from
normal in several ways.
- Normal curves
- are symmetric
- are bell-shaped
- have a single peak
- Non-normal curves
- have skew
- have kurtosis - platykurtic or leptokurtic
- are polymodal
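As a rough illustration of how skew and kurtosis can be quantified, the sketch below computes moment-based skewness and excess kurtosis. This is one common definition, not necessarily the one used in the book, and the right-skewed scores are hypothetical.

```python
from statistics import mean

def central_moment(scores, order):
    """Average of (score - mean) raised to the given power."""
    m = mean(scores)
    return sum((x - m) ** order for x in scores) / len(scores)

def skewness(scores):
    """Third standardized moment: 0 for a symmetric distribution."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Fourth standardized moment minus 3, the value for a normal curve."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

rectangular = [1, 2, 3, 4, 5, 6, 7]    # flat and symmetric
right_skewed = [1, 1, 1, 2, 2, 3, 7]   # hypothetical scores with a long right tail

print(skewness(rectangular), excess_kurtosis(rectangular))
print(skewness(right_skewed))
```

With this definition, negative excess kurtosis indicates a platykurtic (flatter than normal) distribution and positive excess kurtosis a leptokurtic (more peaked) one. The rectangular scores come out symmetric but platykurtic.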
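16Symmetry
[Figure: symmetric curve; axes: frequency by score.]
17Skewed
[Figure: skewed curve shown against a NORMAL curve.]
18Bell-shaped
[Figure: bell-shaped curve; axes: frequency by score.]
About 34% of scores fall between the mean and 1 SD,
about 48% between the mean and 2 SD, etc.
19Kurtosis
[Figure: curve with kurtosis shown against a NORMAL curve.]
20One mode
[Figure: single-peaked curve; axes: frequency by score.]
There is only one mode and it equals the
median and the mean.
21Polymodality
[Figure: curve with more than one mode shown against a NORMAL curve.]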
22Violation of normally distributed sample means
- If the distribution of sample means is
- skewed,
- or has kurtosis,
- or more than one mode,
- then we cannot use parametric statistics.
- BUT THIS IS RARE.
23Homogeneity of Variance and FMAX
24For F Ratios and t Tests
- We assume that the distribution of scores around
each sample mean is similar.
- The distributions within each group all estimate
the same thing, that is, sigma squared.
- The mean squares within each group should be
approximately the same in each group, differing
only because of random sampling fluctuation.
- For F ratios and t tests, this is called
homogeneity of variance.
25For Correlation
- For correlation, the scores must vary roughly the
same amount around the entire length of the
regression line.
- This is called homoscedasticity.
26Homoscedasticity
27Non-Homoscedasticity
28Homogeneity of Variance
- In mathematical terms, homogeneity of variance
means that the mean squares for each group are
the same.
We use the FMAX test to check if the group with
the smallest mean squares is too different from
the group with the largest mean squares.
29FMAX
- If FMAX is significant, then the Mean Squares
deviate from each other too much.
- The assumption of homogeneity of variance is
violated.
- We cannot use parametric statistics!
30Why???
- Because all parametric statistical procedures
rely on our ability to estimate sigma squared with MSW.
- If the mean squares differ among the groups so
much that FMAX is significant, the odds are that
someone (most likely the senior experimenter)
messed up and created a measure with too small a
range of scores.
31When that happens all the scores pile up at one
end of the scale.
- When everyone scores at the top or bottom of a
scale, individual differences and measurement
problems seem to disappear.
- We call this a ceiling effect (if the scores are
all at the top of the scale) and a floor effect
(if the scores are all at the bottom).
32- Because individual differences (ID) and measurement
problems (MP) in one or more groups have been
pushed up against the top or bottom of the scale,
there is practically no within-group variation.
- So, while adding df, the group contributes little
or nothing to the sum of squares within groups (SSW).
- So, when you include one or more groups with
practically no within-group variation in your
total sum of squares and mean square, you wind
up with an underestimate of sigma squared.
33- This makes it possible to get significant results
not because you have pushed the means apart with
an IV, but because MSW is an underestimate of
sigma squared and therefore the denominator of the
F or t test will be too small.
- So you can get significant results more often
than you should when the null is true.
34Uncrowded vs crowded groups: How crowded do you
feel?
Subject  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8
Score    1    2    3    5    5    6    6    4
Subject  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8
Score    9    8    9    9    9    9    8    9
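A minimal sketch of the arithmetic for these two groups follows. Reading subjects 1.1 to 1.8 and 2.1 to 2.8 as two separate groups is an assumption, as are the variable names; the result applies to these slide values only.

```python
from statistics import mean, variance

# Scores from the slide; treating 1.x and 2.x as two groups is an assumption.
group_1 = [1, 2, 3, 5, 5, 6, 6, 4]
group_2 = [9, 8, 9, 9, 9, 9, 8, 9]   # scores piled up near the top of the scale

def sum_of_squares(scores):
    """Sum of squared deviations from the group mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

# Mean square for each group: SS / (n - 1).
ms_1 = variance(group_1)
ms_2 = variance(group_2)

# FMAX: largest mean square divided by the smallest.
f_max = max(ms_1, ms_2) / min(ms_1, ms_2)

# Pooled mean square within groups (MSW).
ss_within = sum_of_squares(group_1) + sum_of_squares(group_2)
df_within = (len(group_1) - 1) + (len(group_2) - 1)
ms_within = ss_within / df_within

print(f"MS group 1 = {ms_1:.2f}, MS group 2 = {ms_2:.2f}")
print(f"FMAX = {f_max:.2f}, pooled MSW = {ms_within:.2f}")
```

The second group adds 7 df but only 1.5 to SSW, so the pooled MSW (about 1.82) falls well below the mean square of the first group (about 3.43).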
35FMAX
- In FMAX, the MAX part refers to the largest
ratio that can be obtained by comparing the
estimated variances from 2 experimental groups.
The significance of FMAX is checked in an FMAX
table.
36Critical values of FMAX, alpha = .01
(columns: k, number of variances; rows: dfFMAX)
dfFMAX   k=2    3     4     5     6     7     8     9     10
4        23.2   37    49    59    69    79    89    97    106
5        14.9   22    28    33    38    42    46    50    54
6        11.1   15.5  19.1  22    25    27    30    32    34
7        8.89   12.1  14.5  16.5  18.4  20    22    23    24
8        7.50   9.9   11.7  13.2  14.5  15.8  16.9  17.9  18.9
9        6.54   8.5   9.9   11.1  12.1  13.1  13.9  14.7  15.3
10       5.85   7.4   8.6   9.6   10.4  11.1  11.8  12.4  12.9
12       4.91   6.1   6.9   7.6   8.2   8.7   9.1   9.5   9.9
15       4.07   4.9   5.5   6.0   6.4   6.7   7.1   7.3   7.5
20       3.32   3.8   4.3   4.6   4.9   5.1   5.3   5.5   5.6
30       2.63   3.0   3.3   3.4   3.6   3.7   3.8   3.9   4.0
60       1.96   2.2   2.3   2.4   2.4   2.5   2.5   2.6   2.6
37The critical values.
38Book Example
39FMAX = 16.33 > 8.89, the critical value for k = 2
variances and dfFMAX = 7 (alpha = .01). FMAX exceeds
the critical value. We cannot use parametric statistics.
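The table lookup in the book example can be sketched as follows. The dictionary holds the alpha = .01 critical values from the slides; the function names are hypothetical. Falling back to the next-lower tabled df when the exact df is not in the table is an assumption of this sketch, since the slides do not say how to handle that case.

```python
# Critical values of FMAX at alpha = .01, copied from the table on the slides.
# Key: dfFMAX. List positions correspond to k = 2..10 variances.
FMAX_CRITICAL_01 = {
    4:  [23.2, 37, 49, 59, 69, 79, 89, 97, 106],
    5:  [14.9, 22, 28, 33, 38, 42, 46, 50, 54],
    6:  [11.1, 15.5, 19.1, 22, 25, 27, 30, 32, 34],
    7:  [8.89, 12.1, 14.5, 16.5, 18.4, 20, 22, 23, 24],
    8:  [7.50, 9.9, 11.7, 13.2, 14.5, 15.8, 16.9, 17.9, 18.9],
    9:  [6.54, 8.5, 9.9, 11.1, 12.1, 13.1, 13.9, 14.7, 15.3],
    10: [5.85, 7.4, 8.6, 9.6, 10.4, 11.1, 11.8, 12.4, 12.9],
    12: [4.91, 6.1, 6.9, 7.6, 8.2, 8.7, 9.1, 9.5, 9.9],
    15: [4.07, 4.9, 5.5, 6.0, 6.4, 6.7, 7.1, 7.3, 7.5],
    20: [3.32, 3.8, 4.3, 4.6, 4.9, 5.1, 5.3, 5.5, 5.6],
    30: [2.63, 3.0, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 4.0],
    60: [1.96, 2.2, 2.3, 2.4, 2.4, 2.5, 2.5, 2.6, 2.6],
}

def critical_fmax(k, df):
    """Return the tabled critical value for k variances and dfFMAX = df.

    Assumption: if df is not tabled, use the nearest tabled df below it,
    which gives the larger of the two neighbouring critical values.
    """
    if not 2 <= k <= 10:
        raise ValueError("k must be between 2 and 10")
    usable = [row for row in FMAX_CRITICAL_01 if row <= df]
    if not usable:
        raise ValueError("df is below the smallest tabled value")
    return FMAX_CRITICAL_01[max(usable)][k - 2]

def violates_homogeneity(f_max, k, df):
    """True when the observed FMAX exceeds the critical value."""
    return f_max > critical_fmax(k, df)

print(critical_fmax(2, 7))                 # 8.89
print(violates_homogeneity(16.33, 2, 7))   # True
```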
40Examples
Design   Number of Means   Subjects in larger NG   Critical value of FMAX
2X4      8                 21                      5.3
2X2      4                 16                      ?
3X3      9                 11                      ?
2X3      6                 9                       ?
42
Design   Number of Means   Subjects in larger NG   Critical value of FMAX
2X4      8                 21                      5.3
2X2      4                 16                      5.5
3X3      9                 11                      ?
2X3      6                 9                       ?
44
Design   Number of Means   Subjects in larger NG   Critical value of FMAX
2X4      8                 21                      5.3
2X2      4                 16                      5.5
3X3      9                 11                      12.4
2X3      6                 9                       ?
46
Design   Number of Means   Subjects in larger NG   Critical value of FMAX
2X4      8                 21                      5.3
2X2      4                 16                      5.5
3X3      9                 11                      12.4
2X3      6                 9                       14.5
47Example other way
Design                  2X4    2X3    2X2    3X3
Number of Means         8      6      4      9
MSG max                 18.2   26.3   34.2   18.0
MSG min                 1.1    2.0    4.6    0.5
FMAX                    16.5   13.2   7.4    36.0
Subjects in larger NG   10     12     21     7
dfFMAX                  9      11     20     6
p ≤ .01                 .01    ?      ?      ?
48FMAX(6,11) = 13.2 (k = 6 variances, dfFMAX = 11)
p ≤ .01
49
Design                  2X4    2X3    2X2    3X3
Number of Means         8      6      4      9
MSG max                 18.2   26.3   34.2   18.0
MSG min                 1.1    2.0    4.6    0.5
FMAX                    16.5   13.2   7.4    36.0
Subjects in larger NG   10     12     21     7
dfFMAX                  9      11     20     6
p ≤ .01                 .01    .01    ?      ?
50FMAX(4,20) = 7.4 (k = 4 variances, dfFMAX = 20)
p ≤ .01
51
Design                  2X4    2X3    2X2    3X3
Number of Means         8      6      4      9
MSG max                 18.2   26.3   34.2   18.0
MSG min                 1.1    2.0    4.6    0.5
FMAX                    16.5   13.2   7.4    36.0
Subjects in larger NG   10     12     21     7
dfFMAX                  9      11     20     6
p ≤ .01                 .01    .01    .01    ?
52FMAX(9,6) = 36.0 (k = 9 variances, dfFMAX = 6)
p ≤ .01
53Answers to examples
Design                  2X4    2X3    2X2    3X3
Number of Means         8      6      4      9
MSG max                 18.2   26.3   34.2   18.0
MSG min                 1.1    2.0    4.6    0.5
FMAX                    16.5   13.2   7.4    36.0
Subjects in larger NG   10     12     21     7
dfFMAX                  9      11     20     6
p ≤ .01                 .01    .01    .01    .01
You cannot use the F test for any of
these experiments!
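The quantities in these examples follow a fixed recipe, sketched below with hypothetical helper names. It assumes, as the slide values suggest, that the number of means is the product of the design's factors, that FMAX is the largest mean square divided by the smallest, and that dfFMAX is the number of subjects in the larger group minus one.

```python
# Values copied from the example slides; helper names are hypothetical.
experiments = [
    # (design, largest MS, smallest MS, subjects in larger group)
    ("2X4", 18.2, 1.1, 10),
    ("2X3", 26.3, 2.0, 12),
    ("2X2", 34.2, 4.6, 21),
    ("3X3", 18.0, 0.5, 7),
]

def number_of_means(design):
    """Number of group means (k) in a two-factor design such as '2X4'."""
    rows, cols = design.upper().split("X")
    return int(rows) * int(cols)

def f_max(ms_largest, ms_smallest):
    """Ratio of the largest to the smallest mean square."""
    return ms_largest / ms_smallest

for design, ms_max, ms_min, n in experiments:
    k = number_of_means(design)
    df = n - 1   # dfFMAX: subjects in the larger group minus one
    print(f"{design}: k = {k}, dfFMAX = {df}, FMAX = {f_max(ms_max, ms_min):.1f}")
```

Each observed FMAX is then compared with the tabled critical value for the same k and dfFMAX.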
54Homogeneity of Variance Conclusions
If FMAX is significant, then the assumption of
homogeneity of variance has been violated. If
the assumption of homogeneity of variance is
violated, then we cannot estimate sigma squared and
therefore cannot compute the t or F test or the
Pearson correlation coefficient (r).
55Interval Scales
56Assumption
- The last assumption that we must meet to use
parametric statistics is that the measures in our
experiment use an interval scale.
- An interval scale is a set of numbers whose
differences are equal at all points along the
scale.
57Examples of Interval Scales
- Integers - 1, 2, 3, 4, ...
- Real numbers - 1.0, 1.1, 1.2, 1.3, ...
- Time - 1 minute, 2 minutes, 3 minutes, ...
- Distance - 1 foot, 2 feet, 3 feet, 4 feet, ...
58Examples of Non-Interval Scales
- Ordinal - ranks, such as first, second, third,
or high, medium, low, etc.
- The difference in time between first and second
can be very different from the time between
second and third.
- The median is the best measure of central
tendency for ordinal data.
59Examples of Non-Interval Scales
- Nominal - categories, such as male/female or
pass/fail.
- There is not even an order for nominal data.
- Categories should be mutually exclusive and
exhaustive.
- The best measure of central tendency is the mode.
60Comparing Scales
- Interval scales have more information than
ordinal scales, which in turn have more
information than nominal scales.
- The more information that is available, the more
sensitive a given statistical test can be.
61Example - test grades
Interval Scale   SCORES      98  84  77  76  75  62  61  60
Ordinal Scale    RANKS       1   2   3   4   5   6   7   8
Nominal Scale    Pass/Fail   P   P   P   P   P   F   F   F
62Book Example - test grades
Ordinal scales show the relative order of
individual measures. However, there is no
information about how far apart individuals are.
63Book Example - test grades
Categories are mutually exclusive: you either
pass or fail. Categories are exhaustive: you
can only pass or fail.
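The loss of information in moving from interval to ordinal to nominal data can be seen in a short sketch using the test grades above. The pass mark of 70 is a hypothetical cutoff; the slides do not state one, and any cutoff between 63 and 75 reproduces the same pass/fail split.

```python
from statistics import median, mode

scores = [98, 84, 77, 76, 75, 62, 61, 60]   # interval scale (from the slides)
PASS_MARK = 70                               # hypothetical cutoff, not given in the slides

# Ordinal: rank 1 is the highest score (there are no ties in these data).
ordered = sorted(scores, reverse=True)
ranks = [ordered.index(s) + 1 for s in scores]

# Nominal: keep only the category each score falls into.
categories = ["P" if s >= PASS_MARK else "F" for s in scores]

print(ranks)                 # [1, 2, 3, 4, 5, 6, 7, 8]
print("".join(categories))   # PPPPPFFF
print(median(ranks))         # 4.5, central tendency for ordinal data
print(mode(categories))      # P, central tendency for nominal data
```

The ranks no longer show that the top score is 14 points ahead of the second while the next three are one point apart, and the categories no longer show order within the pass or fail groups.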
64Interval Scale Conclusion
- Parametric tests can only be performed on
interval data.
- Non-parametric tests must be used on ordinal and
nominal data.
- Researchers prefer parametric tests because more
information is available, which makes it easier
to find
- significant differences between experimental
group means or
- significant correlations between two variables.
- If any assumptions are violated, it is common
practice to convert from the interval scale to
another scale. Then you can use the weaker,
non-parametric statistics.
- There are non-parametric statistics that
correspond to all of the parametric statistics
that we have studied.
65Summary - Assumptions
To use parametric statistics, it must be true
that
- Subjects are randomly selected from the
population.
- Experimental error is randomly distributed across
samples in the design.
- The distribution of sample means fits a normal
curve.
- There is homogeneity of variance, demonstrated by
using FMAX.
- The measures we take are on an interval scale.