Title: Learning Objectives
1. Learning Objectives
- Calculate the mean, median, and mode for small sets of data and for frequency distributions
- Understand the advantages and disadvantages of each type of average
- Understand the need for measures of spread, and be able to use the cumulative frequency ogive to find the interquartile range
- Calculate the standard deviation for small sets of data and for frequency distributions
2. Summary Measures
- An average measures location, or central tendency: it tells us at what general level the data lie.
- The range measures scatter, or dispersion: it indicates how widely spread the data are.
- Symmetry measures the "shape" of the data: it tells us how equally the data are dispersed.
3. Measures of Location - The Arithmetic Mean
- The arithmetic mean: $\bar{x} = \frac{\sum x}{n}$
- For example, the mean of 3, 4, 5, 5, 5, 6, 7, 8, 11 is 54/9 = 6
- Or, for a frequency distribution, the mean is $\bar{x} = \frac{\sum f x}{\sum f}$

    x         f      fx
    0       364       0
    1       362     362
    2       226     452
    3        44     132
    4         4      16
    Total  1000     962

- $\bar{x} = 962/1000 = 0.962$
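As a quick check of the two calculations above, a minimal Python sketch follows. The variable names (`data`, `freq`, and so on) are illustrative choices rather than anything from the slides; the numbers are taken from the worked example and the frequency table.

```python
# Raw data from the worked example
data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
mean_raw = sum(data) / len(data)          # sum of readings / number of readings

# Frequency table from the slide: value x -> frequency f
freq = {0: 364, 1: 362, 2: 226, 3: 44, 4: 4}
total_f = sum(freq.values())              # sum of f
total_fx = sum(x * f for x, f in freq.items())  # sum of f*x
mean_freq = total_fx / total_f

print(mean_raw)            # mean of the raw data
print(total_f, total_fx)   # column totals of the table
print(mean_freq)           # mean of the frequency distribution
```

The totals reproduce the "Total" row of the table (1000 and 962), and the two means agree with the values 6 and 0.962 quoted above.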
4. The Arithmetic Mean
- The arithmetic mean for grouped data: $\bar{x} = \frac{\sum_{i=1}^{j} f_i x_i}{\sum_{i=1}^{j} f_i}$
- where
- $f_i$ = frequency of the ith class interval
- $x_i$ = mid-point of the ith class interval
- $j$ = number of class intervals
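The grouped-data formula can be sketched in a few lines of Python. The function name `grouped_mean` and the three class intervals below are hypothetical, chosen only to illustrate the calculation; they are not the survey data from the slides.

```python
def grouped_mean(classes):
    """Mean of grouped data.

    classes: list of (lower, upper, frequency) tuples, one per class interval.
    Each class is represented by its mid-point (lower + upper) / 2.
    """
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

# Hypothetical class intervals (illustration only, not the slide's data)
example_classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
example_mean = grouped_mean(example_classes)
print(example_mean)  # (2*5 + 5*15 + 3*25) / 10
```

Because each class is replaced by its mid-point, the result is an approximation to the mean of the underlying readings.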
5. Suppose that a survey of the prices of 60 items sold in a shop gives the results below (five class intervals for prices)
[Table of results not transcribed]
7. Measures of Location - Median
- This is the middle value of a set of numbers arranged in order of size
- Median for ungrouped data: the middle item, or the arithmetic mean of the middle two values
- E.g. the median of 3, 4, 5, 5, 5, 6, 7, 8, 11 is 5,
- whereas the median of 3, 4, 5, 5, 5, 6, 7, 8, 11, 12 is (5 + 6)/2 = 5.5
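The rule for ungrouped data (middle item, or the mean of the middle two) can be written out as a short Python sketch. The helper name `median_of` is an illustrative choice; the standard library's `statistics.median` follows the same rule and is used here as a cross-check.

```python
import statistics

def median_of(values):
    """Middle item of the sorted values, or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                 # odd count: a single middle item
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two

odd_data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
even_data = [3, 4, 5, 5, 5, 6, 7, 8, 11, 12]

median_odd = median_of(odd_data)
median_even = median_of(even_data)
print(median_odd, median_even)

# Cross-check against the standard library
assert median_odd == statistics.median(odd_data)
assert median_even == statistics.median(even_data)
```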
9. Measures of Location - Mode
- The mode is the most frequently occurring value of the variable
- The mode of the ungrouped data 5, 9, 7, 14, 8, 7, 3 is 7
- With a frequency distribution, the mode is the value with the greatest frequency
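Both cases, ungrouped data and a frequency distribution, can be illustrated with a minimal Python sketch. The helper name `modes` is hypothetical; it returns a list because a data set may have more than one most-frequent value. The frequency table is the one used earlier for the arithmetic mean.

```python
from collections import Counter

def modes(values):
    """All values that occur with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

# Ungrouped data from the slide
ungrouped = [5, 9, 7, 14, 8, 7, 3]
ungrouped_modes = modes(ungrouped)
print(ungrouped_modes)

# Frequency distribution: the mode is the value with the greatest frequency
freq = {0: 364, 1: 362, 2: 226, 3: 44, 4: 4}
freq_mode = max(freq, key=freq.get)
print(freq_mode)
```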
11. The mode of a grouped frequency distribution
[Figure: axis marked 30, 35, ..., 60, with the modal value indicated]
12. The mode of a grouped frequency distribution
[Figure: axis marked 40, 45, ..., 80, with the modal value indicated]
13. ... scored by each of twenty participants in a driving competition
14. ... by a sample of 20 views in a 19 part serial
15. ... from work in a 1-year period for a sample of 20 employees
16. Answers for calculations
- The mean is the first choice as long as the data are symmetrical
- The median should be considered when there are large outliers
- The mode is a good measure when the data have two or more clusters
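The effect of an outlier on the mean and the median can be seen in a small Python sketch. The first data set is the one used in the earlier examples; the second replaces 11 with 110, a hypothetical outlier introduced here purely for illustration.

```python
from statistics import mean, median

data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
with_outlier = [3, 4, 5, 5, 5, 6, 7, 8, 110]  # hypothetical: 11 replaced by 110

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # original data
print(mean_after, median_after)    # the mean is pulled upward, the median is unchanged
```

A single extreme reading moves the mean from 6 to 17 while the median stays at 5, which is why the median is worth considering when large outliers are present.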
18. Measures of Dispersion (or Scatter)
- Range = largest reading - smallest reading
- E.g., for 1, 3, 4, 5, 5, 5, 6, 7, 8, 13: Range = 13 - 1 = 12
- Interquartile Range = range of the middle 50% of readings
- E.g., for 1, 3, 4, 5, 5, 5, 6, 7, 8, 13:
- remove the lowest and highest quarters of the readings, leaving the middle 50%
- Interquartile Range = 7 - 4 = 3
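19. [Figure: the quartiles Q1, Q2, and Q3, which divide the ordered readings at the 25%, 50%, and 75% points]
- Q1 = the lower quartile
- Q3 = the upper quartile
- IQR = Q3 - Q1
- Semi-IQR = IQR/2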
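The range, quartiles, IQR, and semi-IQR can be sketched in Python as follows. The slides do not say which quartile convention they use, so this sketch assumes the "median of each half" convention, which happens to reproduce Q1 = 4 and Q3 = 7 for the example data; other conventions (for instance the interpolating methods in `statistics.quantiles`) give slightly different values. The helper name `quartiles` is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 by the median-of-halves convention (an assumption).

    For an odd number of readings the middle reading is left out of both halves.
    Needs at least two readings.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

data = [1, 3, 4, 5, 5, 5, 6, 7, 8, 13]

data_range = max(data) - min(data)   # largest reading - smallest reading
q1, q2, q3 = quartiles(data)
iqr = q3 - q1                        # spread of the middle 50% of readings
semi_iqr = iqr / 2

print(data_range, q1, q2, q3, iqr, semi_iqr)
```

Unlike the range, the IQR ignores the extreme readings 1 and 13, so it is not distorted by them.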
20. Measures of Dispersion (or Scatter)
- Mean absolute deviation (MAD): the average distance of the readings from their arithmetic mean
- $\mathrm{MAD} = \frac{\sum |x - \bar{x}|}{n}$, where $\bar{x}$ is the arithmetic mean and $|\cdot|$ denotes the modulus (absolute value)
- E.g., the MAD of 3, 4, 5, 5, 5, 6, 7, 8, 11, where the mean = 6 and n = 9: $x - \bar{x}$ = -3, -2, -1, -1, -1, 0, 1, 2, 5. Thus,
- MAD = (3 + 2 + 1 + 1 + 1 + 0 + 1 + 2 + 5)/9 = 16/9 ≈ 1.8
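The worked MAD example can be reproduced step by step with a minimal Python sketch. The variable names are illustrative; the data are those of the example above.

```python
data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
n = len(data)
mean = sum(data) / n

# Deviations of each reading from the mean
deviations = [x - mean for x in data]

# Mean absolute deviation: average of the absolute deviations
total_abs_dev = sum(abs(d) for d in deviations)
mad = total_abs_dev / n

print(deviations)
print(total_abs_dev, round(mad, 1))
```

Taking absolute values matters: the signed deviations always sum to zero, so averaging them directly would say nothing about spread.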
21. Measures of Dispersion (or Scatter)
MAD = 71.2/60 ≈ 1.19
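22. Measures of Dispersion (or Scatter)
- Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
- Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
- E.g., $s^2 = 792/(7 - 1) = 132$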
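The difference between the population and sample variance is only the divisor, as a short Python sketch shows. The seven readings behind the 792/(7 - 1) example are not in the transcript, so this sketch reuses the nine readings from the earlier examples instead; the variable names are illustrative.

```python
import math
import statistics

data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean
sum_sq = sum((x - mean) ** 2 for x in data)

population_variance = sum_sq / n        # divide by the number of readings
sample_variance = sum_sq / (n - 1)      # divide by one fewer than the number of readings

print(sum_sq, population_variance, sample_variance)

# Cross-check against the standard library
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```

The sample variance is always the larger of the two, and the gap shrinks as the number of readings grows.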
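23. Measures of Dispersion (or Scatter)
- Standard deviation: the square root of the variance
- sample standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
- population standard deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
- Grouped data: each squared deviation is weighted by its class frequency, with the class mid-point used as x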
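A minimal Python sketch of the standard deviation for a small data set and for a frequency distribution follows. It reuses the nine readings and the frequency table from the arithmetic mean slide. For the frequency distribution it divides by the total frequency (the population form); that choice is an assumption, since the transcript does not show which divisor the slides use, and the sample form would divide by the total frequency minus one.

```python
import math

# Small data set: sample and population standard deviation
data = [3, 4, 5, 5, 5, 6, 7, 8, 11]
n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)
sample_sd = math.sqrt(sum_sq / (n - 1))
population_sd = math.sqrt(sum_sq / n)

# Frequency distribution: weight each squared deviation by its frequency f
freq = {0: 364, 1: 362, 2: 226, 3: 44, 4: 4}
total_f = sum(freq.values())
freq_mean = sum(x * f for x, f in freq.items()) / total_f
freq_sum_sq = sum(f * (x - freq_mean) ** 2 for x, f in freq.items())
freq_sd = math.sqrt(freq_sum_sq / total_f)   # population form (assumed divisor)

print(round(sample_sd, 3), round(population_sd, 3))
print(round(freq_sd, 3))
```

For grouped data the same calculation applies with each class mid-point taking the place of x. Because the square root undoes the squaring, the standard deviation is in the same units as the readings, unlike the variance.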
25. Summary
- Measures of location: the arithmetic mean is chosen unless there is good reason to do otherwise
- Measures of dispersion:
  - the range is easily understood, but distorted by outliers
  - the interquartile range is easily understood but not well known
  - the MAD is sensible but unfamiliar and difficult to handle mathematically
  - the variance is mathematically tractable but in the wrong (squared) units and difficult to understand
  - the standard deviation is mathematically tractable and well known