Title: Variability
1 Variability
- Taking into account the spread of a distribution of scores
2 Concepts
- Variability
- See Fig 4.11
- Range
- Range = URL X(max) − LRL X(min), where URL and LRL are the upper and lower real limits
- Consider 4, 8, 7, 9, 6, 11
- Range = 11.5 − 3.5 = 8
- How much information does the range provide about the variability of scores?
- Consider 4, 5, 5, 5, 5, 11
- How is the range different for the above?
- Interquartile range
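A minimal sketch of the range calculation above, assuming whole-number scores so that each real limit lies 0.5 beyond the extreme score. The function name `score_range` and the `precision` parameter are illustrative and do not come from the slides.

```python
def score_range(scores, precision=1.0):
    """Range = upper real limit of X(max) minus lower real limit of X(min).

    Assumes scores are measured to the nearest `precision` unit, so each
    real limit lies half a unit beyond the observed score.
    """
    half_unit = precision / 2
    upper_real_limit = max(scores) + half_unit
    lower_real_limit = min(scores) - half_unit
    return upper_real_limit - lower_real_limit


spread_out = [4, 8, 7, 9, 6, 11]
clustered = [4, 5, 5, 5, 5, 11]

print(score_range(spread_out))  # 11.5 - 3.5 = 8.0
print(score_range(clustered))   # same extremes, so also 8.0
```

Both sets give a range of 8 even though the second set is clustered tightly around 5, because the range uses only the two extreme scores.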
3 Interquartile Range (IQR)
- Range for the middle 50% of the observations (the half surrounding Q2, the median)
- Chop off top 25% of observations (above Q3)
- Chop off bottom 25% of observations (below Q1)
- See Fig 4.3
4 Procedure for determining IQR
- 1. Order observations from least to most
- 2. Find the positions in the distribution that divide observations into quarters: (number of observations + 1)/4
- 3. Now we can remove the top and bottom quarters.
- 4. Count from the bottom up for first quartile (Q1)
- 5. Count from the top down for third quartile (Q3)
- 6. IQR = Q3 − Q1
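The procedure can be sketched in Python as follows. The slides do not say what to do when the quartile position is not a whole number, so this sketch assumes linear interpolation between the two neighbouring scores; the function names are illustrative.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based, possibly fractional, position.

    Fractional positions are resolved by linear interpolation between
    neighbouring scores (an assumption; other conventions exist).
    """
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return float(ordered[0])
    if whole >= len(ordered):
        return float(ordered[-1])
    current, following = ordered[whole - 1], ordered[whole]
    return current + fraction * (following - current)


def interquartile_range(scores):
    ordered = sorted(scores)                          # step 1: order the scores
    position = (len(ordered) + 1) / 4                 # step 2: quartile position
    q1 = value_at_position(ordered, position)         # step 4: count from the bottom
    q3 = value_at_position(ordered[::-1], position)   # step 5: count from the top
    return q1, q3, q3 - q1                            # step 6: IQR = Q3 - Q1


q1, q3, iqr = interquartile_range([4, 8, 7, 9, 6, 11])
print(q1, q3, iqr)  # 5.5 9.5 4.0
```

For the six scores used earlier, the position is (6 + 1)/4 = 1.75, so Q1 lies three quarters of the way from 4 to 6 and Q3 three quarters of the way from 11 down to 9.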
9 Advantage of IQR over range
- One more thing: statisticians more often use the semi-interquartile range (SIQR = IQR/2)
- Conveys the same information: the variability of the scores in the middle of the distribution.
- Not sensitive to extreme values
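To illustrate the insensitivity to extreme values, here is a small sketch comparing the range and the SIQR on two made-up data sets that differ only in their top score. It assumes Python's `statistics.quantiles` with `method="exclusive"`, which places the quartiles using n + 1 positions; the data and helper names are hypothetical.

```python
import statistics


def siqr(scores):
    """Semi-interquartile range: half the distance between Q1 and Q3."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="exclusive")
    return (q3 - q1) / 2


def simple_range(scores):
    """Plain max-minus-min range (real limits ignored for brevity)."""
    return max(scores) - min(scores)


typical = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]  # top score replaced

print(simple_range(typical), siqr(typical))            # 10 3.0
print(simple_range(with_outlier), siqr(with_outlier))  # 99 3.0
```

The single extreme score multiplies the range almost tenfold, while the SIQR is unchanged because Q1 and Q3 are determined by scores in the middle of the distribution.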
10 Variance
- If we want to describe how observations deviate from the mean, why not find the average of the deviations?
- Deviation = X − X̄ (X-bar)
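A short sketch of why averaging the raw deviations does not work, using the six scores from the range example: deviations above and below the mean cancel, so their sum (and therefore their average) is zero.

```python
scores = [4, 8, 7, 9, 6, 11]

mean = sum(scores) / len(scores)           # X-bar
deviations = [x - mean for x in scores]    # X - X-bar for each score
mean_deviation = sum(deviations) / len(deviations)

print(mean)            # 7.5
print(deviations)      # [-3.5, 0.5, -0.5, 1.5, -1.5, 3.5]
print(mean_deviation)  # 0.0
```

This cancellation happens for any set of scores, which is why the deviations are squared before they are averaged when computing the variance.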
15 Note: n should be N in the formula below
17 Note: n should be N in the formula below, and s should be σ (lowercase sigma, a population parameter)
18 Note: n should be N in the formula below, and s should be σ (lowercase sigma, a population parameter)
22 Note: s should be σ (lowercase sigma, a population parameter)
23 Note: n should be N in the formula below, and s should be σ (lowercase sigma, a population parameter)
25 Standard Deviation
28 Note: n should be N in the formula below, and s should be σ (lowercase sigma, a population parameter)
29 Note: n should be N in the formula below, and s should be σ (lowercase sigma, a population parameter)
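For reference, the standard population definitions written in the corrected notation (N and σ). This is offered as a sketch of what the corrected formulas would look like, not as the exact formula from the slide; the symbol μ for the population mean is an assumption, since the slides write the mean as X-bar.

```latex
\sigma^2 = \frac{\sum (X - \mu)^2}{N},
\qquad
\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}
```

The standard deviation is the square root of the variance, which returns the measure to the original units of the scores.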
30 Standard Deviation and Variance for Samples
- A sample statistic is said to be biased if, on average, it consistently overestimates or underestimates the corresponding population parameter
- Generally, the variability of a sample underestimates the variability of the population
- See Fig 4.6
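The underestimation can be checked exactly on a tiny example. The sketch below assumes a hypothetical three-score population and lists every possible sample of size 2 drawn with replacement, computing each sample's variance with the population formula (dividing by n).

```python
from itertools import product

population = [0, 3, 9]   # hypothetical population, chosen for easy arithmetic
sample_size = 2


def variance_dividing_by_n(values):
    """Mean squared deviation: sum of squared deviations divided by n."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


population_variance = variance_dividing_by_n(population)

# Every possible sample of size 2, drawn with replacement (9 samples).
samples = list(product(population, repeat=sample_size))
sample_variances = [variance_dividing_by_n(s) for s in samples]
average_sample_variance = sum(sample_variances) / len(sample_variances)

print(population_variance)      # 14.0
print(average_sample_variance)  # 7.0
```

Averaged over all possible samples, the divide-by-n variance is only half the population variance here, so as a sample statistic it is biased downward.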
31 Computing unbiased statistics
- Instead of n, use degrees of freedom (df), or n − 1, to calculate the sample variance and sample standard deviation
- What would that look like?
- The notation for the sample variance is s², not σ² (and s, not σ, for the sample standard deviation).
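One way this could look in code, as a sketch: the sum of squared deviations is divided by n − 1 rather than n. The function names are illustrative, and the standard library's `statistics.variance` (which also divides by n − 1) is used only as a cross-check.

```python
import math
import statistics


def sample_variance(scores):
    """s^2 = SS / (n - 1), where SS is the sum of squared deviations."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (n - 1)   # df = n - 1


def sample_standard_deviation(scores):
    """s is the square root of the sample variance."""
    return math.sqrt(sample_variance(scores))


scores = [4, 8, 7, 9, 6, 11]
print(sample_variance(scores))            # 29.5 / 5 = 5.9
print(sample_standard_deviation(scores))
print(statistics.variance(scores))        # standard-library cross-check
```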
32 Factors that affect variability
- Extreme scores
  - The range is most affected by extreme scores
  - The SIQR is least affected by extreme scores
- Sample size
  - The range tends to increase as sample size increases
- Stability under sampling
  - The range changes unpredictably from sample to sample
- Open-ended distributions
  - Cannot compute range or standard deviation
  - Only available measure of variability is SIQR.
33 The role of variability in inferential statistics
- The question is whether the sample data reflect patterns that exist in the population, or simply show random fluctuations that occur by chance.
- For an unbiased statistic, the average of the statistic across all possible samples should equal the corresponding population parameter.
- See Table 4.1
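A small sketch of what "the average across samples" means, assuming a hypothetical three-score population and every possible sample of size 2 drawn with replacement. It uses the standard library's `statistics.variance`, which divides by n − 1; the population values are made up for easy arithmetic and are not taken from Table 4.1.

```python
from itertools import product
import statistics

population = [0, 3, 9]                          # hypothetical population
samples = list(product(population, repeat=2))   # all 9 samples of size 2

mu = statistics.fmean(population)                 # population mean
sigma_squared = statistics.pvariance(population)  # population variance (divide by N)

sample_means = [statistics.fmean(s) for s in samples]
sample_variances = [statistics.variance(s) for s in samples]  # divide by n - 1

average_of_sample_means = statistics.fmean(sample_means)
average_of_sample_variances = statistics.fmean(sample_variances)

print(mu, average_of_sample_means)                 # 4.0 4.0
print(sigma_squared, average_of_sample_variances)  # 14 14.0
```

Individual samples vary widely (sample variances here run from 0 to 40.5), but the sample mean and the n − 1 sample variance both average out to the population values.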