Title: Summary descriptive statistics: means and standard deviations
Summary descriptive statistics: means and standard deviations
Measures of central tendency ("averages")
Measures of dispersion (spread of scores)
1. The Mode
The most frequent score in a set of scores. e.g. 6, 11, 22, 22, 96, 98. Mode = 22.
Advantages of the mode
(i) Simple to calculate, easy to understand.
(ii) The only average which can be used with nominal data.
Disadvantages of the mode
(i) May be unrepresentative and hence misleading. e.g. 3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96. Mode is 96 - but most of the scores are low numbers.
(ii) There may be more than one mode in a set of scores. e.g. 3, 3, 3, 4, 4, 4, 6, 6, 6 has three modes!
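The mode examples above can be checked with a short Python sketch. It assumes Python 3.8 or later, where the standard library provides `statistics.multimode`; the helper name `modes` and the colour list used as nominal data are illustrative only and do not come from the slides.

```python
from statistics import multimode


def modes(scores):
    """Return every most-frequent value, in order of first appearance."""
    return multimode(scores)


single = modes([6, 11, 22, 22, 96, 98])                   # one clear mode
misleading = modes([3, 4, 4, 5, 6, 7, 8, 8, 96, 96, 96])  # mode sits among the extremes
several = modes([3, 3, 3, 4, 4, 4, 6, 6, 6])              # three modes
nominal = modes(["red", "blue", "red", "green"])          # hypothetical nominal data

print(single, misleading, several, nominal)
```

Returning a list rather than a single value makes the "more than one mode" case visible instead of silently picking one of the tied scores.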
2. The Median
When scores are arranged in order of size, the median is either
(a) the middle score (if there is an odd number of scores): 4, 5, 6, 7, 8, 8, 96. Median = 7.
or (b) the average of the middle two scores (if there is an even number of scores): 4, 5, 6, 7, 8, 8, 96, 96. Median = (7 + 8) / 2 = 7.5.
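As a rough illustration of rules (a) and (b), here is a minimal Python sketch. The function name `median_of` is made up for this example; the standard library's `statistics.median` follows the same rule and is used only as a cross-check.

```python
from statistics import median


def median_of(scores):
    """Median by the slide's rule: middle score, or mean of the middle two."""
    ordered = sorted(scores)          # arrange the scores in order of size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # (a) odd number of scores
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # (b) even number of scores


odd_case = median_of([4, 5, 6, 7, 8, 8, 96])
even_case = median_of([4, 5, 6, 7, 8, 8, 96, 96])
library_check = median([4, 5, 6, 7, 8, 8, 96, 96])

print(odd_case, even_case, library_check)
```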
Advantages of the median
(i) Resistant to the distorting effects of extreme high or low scores.
Disadvantages of the median
(i) Ignores scores' numerical values, which is wasteful if data are interval or ratio.
(ii) More susceptible to sampling fluctuations than the mean.
(iii) Less mathematically useful than the mean.
3. The Mean
Add all the scores together and divide by the total number of scores. e.g. (3 + 4 + 4 + 5 + 6) / 5 = 22 / 5 = 4.4
Advantages of the mean
(i) Uses information from every single score.
(ii) Resistant to sampling fluctuation - i.e., varies the least from sample to sample. (Important since we normally want to extrapolate from samples to populations.)
Disadvantages of the mean
(i) Susceptible to distortion from extreme scores. e.g. 4, 5, 5, 6: mean = 5. 4, 5, 5, 106: mean = 30.
(ii) Can only be used with interval or ratio data, not with ordinal or nominal data.
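The effect of one extreme score on the mean, compared with the median, can be sketched in Python as follows. The helper name `mean_of` is illustrative; the data are the examples from the slides.

```python
from statistics import median


def mean_of(scores):
    """Add all the scores together and divide by the number of scores."""
    return sum(scores) / len(scores)


basic = mean_of([3, 4, 4, 5, 6])             # 22 / 5

typical = [4, 5, 5, 6]
with_extreme = [4, 5, 5, 106]                # one extreme score replaces the 6

mean_typical = mean_of(typical)
mean_extreme = mean_of(with_extreme)
median_typical = median(typical)
median_extreme = median(with_extreme)

print(basic, mean_typical, mean_extreme, median_typical, median_extreme)
```

In this pair of samples the mean moves from 5 to 30 while the median stays at 5, which is the resistance to extreme scores listed as the median's advantage.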
1. The Range
The difference between the highest and lowest scores (i.e. range = highest - lowest).
Advantages
Quick and easy to calculate, easy to understand.
Disadvantages
Unduly influenced by extreme scores. 3, 4, 4, 5, 100: range = (100 - 3) = 97. 3, 4, 4, 5, 5: range = (5 - 3) = 2.
Conveys no information about the spread of scores between the highest and lowest scores. e.g. 2, 2, 2, 2, 2, 20 and 2, 20, 20, 20, 20, 20 have exactly the same range (18) but very different distributions.
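The range examples above can be reproduced with a few lines of Python. The function name `score_range` is an illustrative choice, not something from the slides.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


with_extreme = score_range([3, 4, 4, 5, 100])
without_extreme = score_range([3, 4, 4, 5, 5])

# Two very different distributions that share the same range.
mostly_low = score_range([2, 2, 2, 2, 2, 20])
mostly_high = score_range([2, 20, 20, 20, 20, 20])

print(with_extreme, without_extreme, mostly_low, mostly_high)
```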
2. The Standard Deviation (S.D.)
The "average difference of scores from the mean". The bigger the s.d., the more scores differ from the mean and between themselves, and the less satisfactory the mean is as a summary of the data.
Advantages
Like the mean, uses information from every score.
Disadvantages
Not intuitively easy to understand!
Can only be used with interval or ratio data.
How to calculate the standard deviation
For the set of scores 5, 6, 7, 9, 11:
(a) Work out the mean: (5 + 6 + 7 + 9 + 11) / 5 = 38 / 5 = 7.6
The formula being worked through is $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$.
(b) Subtract the mean from each score: 5 - 7.6 = -2.6; 6 - 7.6 = -1.6; 7 - 7.6 = -0.6; 9 - 7.6 = 1.4; 11 - 7.6 = 3.4
(c) Square the differences just obtained: (-2.6)² = 6.76; (-1.6)² = 2.56; (-0.6)² = 0.36; 1.4² = 1.96; 3.4² = 11.56
(d) Add up the squared differences: 6.76 + 2.56 + 0.36 + 1.96 + 11.56 = 23.20
(e) Divide this by the total number of scores, to get the variance: 23.20 / 5 = 4.64
(f) Standard deviation is the square root of the variance (we do this to get back to the original units): √4.64 = 2.15. 2.15 is our sample standard deviation.
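Steps (a) to (f) can be followed line by line in a short Python sketch. The variable names are illustrative; the scores and the divisor n (describing the sample itself) are those of the worked example.

```python
from math import sqrt

scores = [5, 6, 7, 9, 11]
n = len(scores)

mean = sum(scores) / n                        # (a) work out the mean
deviations = [x - mean for x in scores]       # (b) subtract the mean from each score
squared = [d ** 2 for d in deviations]        # (c) square the differences
sum_of_squares = sum(squared)                 # (d) add up the squared differences
variance = sum_of_squares / n                 # (e) divide by the number of scores
sd = sqrt(variance)                           # (f) square root of the variance

print(round(mean, 2), round(sum_of_squares, 2), round(variance, 2), round(sd, 2))
```

Intermediate values are rounded only when printed, so small floating-point differences in the deviations do not affect the final result.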
Complications in using the mean and s.d.
We
usually obtain the mean and s.d. from a sample -
very rarely from the parent population. Sometimes
we are content to describe our sample per se, but
usually we want to extrapolate to the population
from our sample. A sample mean is a good
estimate of the population mean. A sample s.d.
tends to underestimate the population s.d. Hence,
when using the sample s.d. as a description of
the sample, divide by n. When using the sample
s.d. as an estimate of the population s.d.,
divide by n-1 (to make the s.d. larger than it
would otherwise have been).
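The two divisors can be compared using Python's standard library, where `statistics.pstdev` divides by n and `statistics.stdev` divides by n-1. This is only a sketch, reusing the scores 5, 6, 7, 9, 11 from the worked example; the variable names are illustrative.

```python
from statistics import pstdev, stdev

scores = [5, 6, 7, 9, 11]

# Divide by n: the s.d. as a description of these particular scores.
sd_describing_sample = pstdev(scores)

# Divide by n-1: the s.d. as an estimate of the population s.d.
sd_estimating_population = stdev(scores)

print(round(sd_describing_sample, 2), round(sd_estimating_population, 2))
```

The n-1 version is always the larger of the two, and the gap shrinks as the number of scores grows.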
population s.d., if you measure every member of the population (σn on calculators)
sample s.d. as an estimate of the population s.d. (σn-1 on calculators)
sample s.d. as a description of a sample (σn ("sigma n") on calculators)
population mean: μ (mu)
sample mean as an estimate of the population mean
sample mean as a description of a sample
The Standard Error of the Mean
This is the standard deviation of a set of sample means. It shows how much variation there is within a set of sample means, and hence how likely our particular sample mean is to be in error as an estimate of the true population mean.
[Figure labels: means of different samples; actual population mean]
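Formula for the standard error: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
We normally estimate this from our obtained data: $s_{\bar{X}} = \dfrac{s}{\sqrt{n}}$, where $s$ is the sample s.d. used as an estimate of the population s.d.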
So - find the standard deviation, then divide this by the square root of the number of scores.
If the S.E. is small, our obtained sample mean is more likely to be similar to the true population mean than if the S.E. is large.
Increasing n reduces the size of the S.E. A sample mean based on 100 scores is probably closer to the population mean than a sample mean based on 10 scores.
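A minimal Python sketch of the standard error calculation follows. It assumes the n-1 standard deviation (`statistics.stdev`) is used as the estimate of the population s.d.; the function names and the s.d. of 10 in the second part are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import stdev


def standard_error(scores):
    """Estimated S.E. of the mean: n-1 standard deviation divided by sqrt(n)."""
    return stdev(scores) / sqrt(len(scores))


def se_from_sd(sd, n):
    """S.E. for a given standard deviation and number of scores."""
    return sd / sqrt(n)


se_example = standard_error([5, 6, 7, 9, 11])

# Same (hypothetical) s.d. of 10, different sample sizes.
se_n10 = se_from_sd(10, 10)
se_n100 = se_from_sd(10, 100)

print(round(se_example, 3), round(se_n10, 3), round(se_n100, 3))
```

With the s.d. held fixed, going from 10 to 100 scores cuts the S.E. by a factor of √10, which is why the larger sample's mean is probably closer to the population mean.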