Measures of Central Tendency - PowerPoint PPT Presentation

1
Measures of Central Tendency
  • By Rahul Jain

2
The Motivation
  • Measures of central tendency are used to describe
    the typical member of a population.
  • Depending on the type of data, "typical" can have
    several reasonable meanings.
  • We will discuss four of these possible choices.

3
4 Measures of Central Tendency
  • Mean: the arithmetic average. This is used for
    continuous data.
  • Median: a value that splits the data into two
    halves; that is, one half of the data is smaller
    than that number and the other half is larger. May
    be used for continuous or ordinal data.
  • Mode: the category that has the most data. As the
    description implies, it is used for categorical
    data.
  • Midrange: not used as often as the other three;
    it is found by taking the average of the lowest
    and highest numbers in the data set. Also
    primarily used for continuous data.

4
Measures of Central Tendency
  • The central tendency is measured by averages.
    These describe the point about which the various
    observed values cluster.
  • In mathematics, an average, or central tendency
    of a data set refers to a measure of the "middle"
    or "expected" value of the data set.

5
Mean
  • To find the mean, add all of the values, then
    divide by the number of values.
  • The lowercase Greek letter mu (μ) is used for
    the population mean.
  • An x with a bar over it, read x-bar, is used
    for the sample mean.

6
Mean Example
7
Arithmetic Mean of Group Data
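  • If $x_1, x_2, \ldots, x_k$ are the mid-values and
    $f_1, f_2, \ldots, f_k$ are the corresponding
    frequencies, where the subscript k stands for
    the number of classes, then the mean is
    $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$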

8
Exercise-1 Find the Arithmetic Mean
Class    Frequency (f)   Mid-value (x)   fx
20-29    3               24.5            73.5
30-39    5               34.5            172.5
40-49    20              44.5            890
50-59    10              54.5            545
60-69    5               64.5            322.5
Sum      N = 43                          2003.5
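The exercise can be checked with a short Python sketch. The function name `grouped_mean` is an illustrative choice, not something from the slides; the mid-values and frequencies are copied from the table above.

```python
# Class mid-values (x) and frequencies (f) copied from Exercise-1.
mid_values = [24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [3, 5, 20, 10, 5]


def grouped_mean(mids, freqs):
    """Return sum(f * x) / sum(f) for grouped data."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for f, x in zip(freqs, mids)) / total


# 2003.5 / 43, rounded to two decimals
print(round(grouped_mean(mid_values, frequencies), 2))  # 46.59
```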
9
Median
  • The median is a number chosen so that half of the
    values in the data set are smaller than that
    number, and the other half are larger.
  • To find the median:
  • List the numbers in ascending order.
  • If there is a number in the middle (odd number of
    values), that is the median.
  • If there is not a middle number (even number of
    values), take the two in the middle; their average
    is the median.
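The steps above can be sketched in Python as follows. The function name and the sample values are hypothetical and only illustrate the odd and even cases.

```python
def median(values):
    """Median by the sort-and-pick-the-middle rule described above."""
    ordered = sorted(values)          # step 1: ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```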

10
Median Example
11
Median
  • The implication of this definition is that a
    median is the middle value of the observations
    such that the number of observations above it is
    equal to the number of observations below it.

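If n is even, the median is the average of the (n/2)-th and (n/2 + 1)-th ordered observations.
If n is odd, the median is the ((n + 1)/2)-th ordered observation.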
12
Median of Group Data
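  • $\text{Median} = L_0 + \frac{h}{f_0}\left(\frac{N}{2} - F\right)$, where
  • L0: lower class boundary of the median class
  • h: width of the median class
  • f0: frequency of the median class
  • F: cumulative frequency of the pre-median class
  • N: total number of observations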

13
Steps to find Median of group data
  1. Compute the less than type cumulative
    frequencies.
  2. Determine N/2 , one-half of the total number of
    cases.
  3. Locate the median class: the first class whose
    cumulative frequency is at least N/2.
  4. Determine the lower class boundary of the median
    class. This is L0.
  5. Sum the frequencies of all classes prior to the
    median class. This is F.
  6. Determine the frequency of the median class. This
    is f0.
  7. Determine the class width of the median class.
    This is h.

14
Example-Find Median
Age in years   Number of births   Cumulative number of births
14.5-19.5      677                677
19.5-24.5      1908               2585
24.5-29.5      1737               4322
29.5-34.5      1040               5362
34.5-39.5      294                5656
39.5-44.5      91                 5747
44.5-49.5      16                 5763
All ages       5763               -
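A minimal Python sketch of the grouped-median steps, applied to the births table. It assumes the usual interpolation formula, median = L0 + (h / f0)(N/2 − F); the function name `grouped_median` is illustrative.

```python
# Class boundaries and frequencies copied from the births table.
boundaries = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5),
              (34.5, 39.5), (39.5, 44.5), (44.5, 49.5)]
births = [677, 1908, 1737, 1040, 294, 91, 16]


def grouped_median(class_boundaries, freqs):
    """Interpolated median for grouped data with ascending classes."""
    half = sum(freqs) / 2             # N/2
    cumulative = 0                    # F: frequency before the current class
    for (lower, upper), f in zip(class_boundaries, freqs):
        if f > 0 and cumulative + f >= half:
            # this is the median class: L0 = lower, h = upper - lower, f0 = f
            return lower + (upper - lower) / f * (half - cumulative)
        cumulative += f
    raise ValueError("frequency table is empty")


# Median class is 24.5-29.5 (F = 2585, f0 = 1737, h = 5, N/2 = 2881.5).
print(round(grouped_median(boundaries, births), 2))  # 25.35
```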
15
Mode
  • The mode is simply the category or value that
    occurs most often in a data set.
  • If a category has radically more entries than the
    others, it is a mode.
  • Generally speaking we do not consider more than
    two modes in a data set.
  • No clear guideline exists for deciding how many
    more entries a category must have than the others
    to constitute a mode.

16
Obvious Example
  • There is obviously more yellow than red or blue.
  • Yellow is the mode.
  • The mode is the class, not the frequency.

17
Bimodal
18
No Mode
Category   Frequency
1          51
2          51
3          66
4          62
5          65
6          57
7          47
8          43
           64
  • Although the third category is the largest, it is
    not sufficiently different to be called the mode.

19
Example-2 Find Mean, Median and Mode of Ungrouped
Data
The weekly pocket money for 9 first-year pupils
was found to be 3, 12, 4, 6, 1, 4, 2, 5, 8.
Mean = 5
Median = 4
Mode = 4
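These three answers can be verified with Python's standard `statistics` module. This is only a quick check of the example, with the variable name `pocket_money` chosen for illustration.

```python
import statistics

# Weekly pocket money values from Example-2.
pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]

print(statistics.mean(pocket_money))    # 5
print(statistics.median(pocket_money))  # 4
print(statistics.mode(pocket_money))    # 4
```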
20
Mode of Group Data
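  • $\text{Mode} = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times H$, where
  • L1: lower boundary of the modal class
  • Δ1: difference in frequency between the modal
    class and the class before it
  • Δ2: difference in frequency between the modal
    class and the class after it
  • H: class interval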

21
Steps of Finding Mode
  • Find the modal class, which has the highest
    frequency.
  • L1: lower class boundary of the modal class
  • H: class interval of the modal class
  • Δ1: difference between the frequency of the modal
    class and that of the class before it
  • Δ2: difference between the frequency of the modal
    class and that of the class after it

22
Example -4 Find Mode
Slope Angle (°)   Midpoint (x)   Frequency (f)   Midpoint × frequency (fx)
0-4               2              6               12
5-9               7              12              84
10-14             12             7               84
15-19             17             5               85
20-24             22             0               0
Total                            n = 30          Σ(fx) = 265
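A possible Python sketch for this example. It assumes the usual grouped-mode formula, L1 + Δ1 / (Δ1 + Δ2) × H, and it assumes class boundaries half a unit outside the stated limits (so the class 5-9 runs from 4.5 to 9.5). The slides do not state the boundaries, so that choice, like the function name, is an assumption.

```python
# Lower class boundaries (assumed: stated lower limit minus 0.5),
# common class width, and frequencies from the slope-angle table.
lower_boundaries = [-0.5, 4.5, 9.5, 14.5, 19.5]
class_width = 5
slope_freqs = [6, 12, 7, 5, 0]


def grouped_mode(lowers, width, freqs):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * H."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])   # modal class index
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - before            # difference with the class before
    d2 = freqs[i] - after             # difference with the class after
    if d1 + d2 == 0:
        raise ValueError("no single modal class can be identified")
    return lowers[i] + d1 / (d1 + d2) * width


# Modal class 5-9: d1 = 12 - 6 = 6, d2 = 12 - 7 = 5.
print(round(grouped_mode(lower_boundaries, class_width, slope_freqs), 2))  # 7.23
```

If the stated lower limit 5 were used as L1 instead of the boundary 4.5, the result would shift by 0.5.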
23
Midrange
  • The midrange is the average of the lowest and
    highest value in the data set.
  • This measure is not often used since it is based
    strictly on the two extreme values in the data.
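As a small illustration, the midrange of the pocket-money data from Example-2 can be computed like this (the function name is an illustrative choice):

```python
def midrange(values):
    """Average of the smallest and largest value."""
    if not values:
        raise ValueError("midrange of an empty data set is undefined")
    return (min(values) + max(values)) / 2


# Lowest value 1, highest value 12.
print(midrange([3, 12, 4, 6, 1, 4, 2, 5, 8]))  # 6.5
```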

24
Midrange Example
25
Measures of Variation
Same mean, but y varies more than x.
26
Three Measures of Variation
  • While there are other measures, we will look at
    only three:
  • Variance
  • Standard deviation
  • Coefficient of variation
  • Population mean and sample mean use an identical
    formula for calculation.
  • There is a minor difference in the formulas for
    variation.

27
Population Variance
  • The population variance, σ², is found using
    either of the following equivalent formulas:
    $\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2}{N} - \mu^2$
  • The differences are squared because the raw
    deviations from the mean always sum to zero.
  • N is the size of the population, µ is the
    population mean.
  • Note that variance is always positive if x can
    take on more than one value.

28
Population Standard Deviation
  • The standard deviation can be thought of as the
    average amount we could expect the x values in
    the population to differ from the mean value of
    the population.
  • To get the standard deviation, simply take the
    square root of the variance.

29
Sample Variance
  • The sample variance, s², is found using either of
    the following equivalent formulas:
    $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - (\sum x)^2 / n}{n - 1}$
  • The differences are squared because the raw
    deviations from the mean always sum to zero.
  • The sample size is n, x-bar is the sample mean.
  • Note that n-1 is used rather than n. This
    adjustment prevents bias in the estimate.

30
Sample Standard Deviation
  • Just like the standard deviation of a population,
    to find the standard deviation of a sample, take
    the square root of the sample variance.
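The difference between the population and sample versions can be seen in a short Python sketch. The data are the pocket-money values from Example-2, reused here purely for illustration; the function names are not from the slides.

```python
import math

data = [3, 12, 4, 6, 1, 4, 2, 5, 8]


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Squared deviations from the mean (5) add up to 90.
print(population_variance(data))                        # 10.0
print(sample_variance(data))                            # 11.25
print(round(math.sqrt(population_variance(data)), 2))   # 3.16
print(round(math.sqrt(sample_variance(data)), 2))       # 3.35
```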

31
Coefficient of Variation
  • The measures discussed so far are primarily
    useful when comparing members from the same
    population, or comparing similar populations.
  • When looking at two or more dissimilar
    populations, it does not make any more sense to
    compare standard deviations than it does to
    compare means.

32
Coefficient of Variation Cont.
  • Example 1 Weight loss programs A and B.
  • Two different programs with the same goal and
    target population.
  • While program B averages more weight loss, it
    also has less consistent results.

                               A     B
Mean (weight loss per month)   20    25
Standard deviation             15    30
33
Coefficient of Variation Cont.
  • Example 2 Weight loss program A and tax refund
    B.
  • Two different programs with different goals and
    different target populations.
  • We know that average weight loss and average tax
    refund are not comparable. Are the standard
    deviations comparable?

                     A     B
Mean                 20    650
Standard deviation   15    30
34
Coefficient of Variation Cont.
  • In the last example we can see an argument that
    standard deviation does not give the complete
    picture.
  • The coefficient of variation addresses this issue
    by establishing a ratio of the standard deviation
    to the mean. This ratio is expressed as a
    percentage.

35
Coefficient of Variation Cont.
  • Looking at the two examples, we see that in both
    cases the standard deviation for B is twice that
    of A.
  • In the first example, B has 1.6 times the
    relative variation of A.
  • In the second example, A has a little over 16
    times as much relative variation as B.

                 A      B
CV Example 1     75%    120%
CV Example 2     75%    4.6%
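The CV values in the table can be reproduced with a short Python sketch, taking the coefficient of variation as standard deviation divided by mean, expressed as a percentage. The function name is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


cv_a = coefficient_of_variation(15, 20)          # program A
cv_b_weight = coefficient_of_variation(30, 25)   # Example 1, program B
cv_b_tax = coefficient_of_variation(30, 650)     # Example 2, tax refund B

print(round(cv_a, 1), round(cv_b_weight, 1), round(cv_b_tax, 1))  # 75.0 120.0 4.6
print(round(cv_a / cv_b_tax, 2))  # 16.25
```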
36
Measures of Position
The dot on the left is at about -1, the dot on
the right is at approximately 0.8. But where are
they relative to the rest of the values in this
distribution?
37
Quartiles, Percentiles and Other Fractiles
  • We will only consider the quartile, but the same
    concept is often extended to percentages or other
    fractions.
  • The median is a good starting point for finding
    the quartiles.
  • Recall that to find the median, we wanted to
    locate a point so that half of the data was
    smaller, and the other half larger than that
    point.

38
Quartile
  • For quartiles, we want to divide our data into 4
    equal pieces.

Suppose we had the following data set (already in
order): 2 3 7 8 8 8 9 13 17 20 21 21
Choosing the numbers 7.5, 8.5, and 18.5 as
markers would divide the data into 4 groups, each
with three elements. These numbers would be the
three quartiles for this data set.
39
Quartiles Continued
  • Conceptually, this is easy: find the median, then
    treat the left-hand side as if it were a data set
    and find its median, then do the same to the
    right-hand side.
  • This is not always simple. Consider the following
    data set.
  • 3 3 3 3 3 5 6 8 8 8 8 8 9
  • The first difficulty is that the data set does
    not divide nicely.
  • Using the rules for finding a median, we would
    get quartiles of 3, 6 and 8.
  • The second difficulty is how many of the 3s are
    in the first quartile, and how many in the second?
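A rough Python sketch of this "median of each half" approach. It assumes that, when the number of values is odd, the middle value is left out of both halves; that convention reproduces the quartiles 3, 6 and 8 quoted above, but other conventions exist and can give different answers. The function names are illustrative.

```python
def _median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # assumption: with an odd count, the middle value belongs to neither half
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return _median_sorted(lower), _median_sorted(ordered), _median_sorted(upper)


print(quartiles([2, 3, 7, 8, 8, 8, 9, 13, 17, 20, 21, 21]))  # (7.5, 8.5, 18.5)
print(quartiles([3, 3, 3, 3, 3, 5, 6, 8, 8, 8, 8, 8, 9]))    # (3.0, 6, 8.0)
```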

40
Quartiles Continued
  • For this course, let's pretend that this is not
    an issue.
  • I will give you the quartiles.
  • I will not ask how many are in a quartile.

41
Interquartile Range
  • One method for identifying outliers involves the
    use of quartiles.
  • The interquartile range (IQR) is Q3 − Q1.
  • All numbers less than Q1 − 1.5(IQR) are probably
    too small.
  • All numbers greater than Q3 + 1.5(IQR) are
    probably too large.
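The rule can be sketched in Python as below. The quartiles 7.5 and 18.5 are taken as given from the earlier Quartile slide; the value 40 is a hypothetical test value that does not appear in the slides, and the function names are illustrative.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) cut-offs based on 1.5 times the IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x falls below the lower fence or above the upper fence."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Quartiles from the earlier data set: Q1 = 7.5, Q3 = 18.5, so IQR = 11.
print(outlier_fences(7.5, 18.5))   # (-9.0, 35.0)
print(is_outlier(21, 7.5, 18.5))   # False
print(is_outlier(40, 7.5, 18.5))   # True (40 is a hypothetical value)
```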

42
Measures of Variation: Variance and Standard
Deviation for Grouped Data
  • The grouped variance is
    $s^2 = \frac{\sum f (X_m - \bar{X})^2}{n - 1}$
  • The grouped standard deviation is
    $s = \sqrt{s^2}$

43
Example 3-24 (p130) Miles Run per Week
  • Find the variance and the standard deviation for
    the frequency distribution below. The data
    represents the number of miles that 20 runners
    ran during one week.

Class        f    Xm   f·Xm          f(Xm − X̄)²
5.5-10.5     1    8    1·8 = 8       1(8 − 24.5)² = 272.25
10.5-15.5    2    13   2·13 = 26     2(13 − 24.5)² = 264.50
15.5-20.5    3    18   3·18 = 54     3(18 − 24.5)² = 126.75
20.5-25.5    5    23   5·23 = 115    5(23 − 24.5)² = 11.25
25.5-30.5    4    28   4·28 = 112    4(28 − 24.5)² = 49.00
30.5-35.5    3    33   3·33 = 99     3(33 − 24.5)² = 216.75
35.5-40.5    2    38   2·38 = 76     2(38 − 24.5)² = 364.50
Total        20        ΣfXm = 490    Σf(Xm − X̄)² = 1305.00
Mean: X̄ = 490 / 20 = 24.5
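The computation can be sketched in Python from the class midpoints and frequencies alone. This assumes the sample form of the grouped variance, with n − 1 in the denominator; the function name is illustrative.

```python
import math

# Class midpoints (Xm) and frequencies (f) for the miles-run data.
midpoints = [8, 13, 18, 23, 28, 33, 38]
runners = [1, 2, 3, 5, 4, 3, 2]


def grouped_sample_variance(mids, freqs):
    """Return (mean, variance) for grouped data, dividing by n - 1."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    squared = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids))
    return mean, squared / (n - 1)


mean, variance = grouped_sample_variance(midpoints, runners)
print(mean)                           # 24.5
print(round(variance, 2))             # 68.68
print(round(math.sqrt(variance), 2))  # 8.29
```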
44
Mean Deviation
  • The mean deviation is an average of the absolute
    deviations of individual observations from the
    central value of a series. The average deviation
    about the mean is
    $MD = \frac{\sum_{i=1}^{k} f_i |x_i - \bar{x}|}{\sum_{i=1}^{k} f_i}$
  • k: number of classes
  • xi: midpoint of the i-th class
  • fi: frequency of the i-th class
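As an illustration, the mean deviation about the mean can be computed for the miles-run data from Example 3-24 (midpoints 8 to 38, 20 runners). The sketch assumes the frequency-weighted form, sum of f·|x − mean| divided by the total frequency; the function name is illustrative.

```python
# Class midpoints and frequencies reused from Example 3-24.
class_midpoints = [8, 13, 18, 23, 28, 33, 38]
class_freqs = [1, 2, 3, 5, 4, 3, 2]


def grouped_mean_deviation(mids, freqs):
    """Frequency-weighted average of absolute deviations from the mean."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    return sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / n


# Mean is 24.5; weighted absolute deviations add up to 133.
print(grouped_mean_deviation(class_midpoints, class_freqs))  # 6.65
```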

45
Coefficient of Mean Deviation
  • The third relative measure is the coefficient of
    mean deviation. As the mean deviation can be
    computed from the mean, median, mode, or any
    arbitrary value A, a general formula for the
    coefficient of mean deviation may be put as
    follows:
    $\text{Coefficient of mean deviation} = \frac{\text{mean deviation about } A}{A}$

46
Coefficient of Range
  • The coefficient of range is a relative measure
    corresponding to range and is obtained by the
    following formula:
    $\text{Coefficient of range} = \frac{L - S}{L + S}$
  • where L and S are, respectively, the largest
    and the smallest observations in the data set.

47
Coefficient of Quartile Deviation
  • The coefficient of quartile deviation is computed
    from the first and the third quartiles using the
    following formula:
    $\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$
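A brief Python sketch of the coefficient of range and the coefficient of quartile deviation, assuming the plain ratio forms (L − S) / (L + S) and (Q3 − Q1) / (Q3 + Q1). The data and quartiles are reused from the earlier Quartile slide; the function names are illustrative.

```python
def coefficient_of_range(values):
    """(largest - smallest) / (largest + smallest)."""
    largest, smallest = max(values), min(values)
    if largest + smallest == 0:
        raise ValueError("undefined when largest + smallest is zero")
    return (largest - smallest) / (largest + smallest)


def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1)."""
    if q3 + q1 == 0:
        raise ValueError("undefined when Q3 + Q1 is zero")
    return (q3 - q1) / (q3 + q1)


sample = [2, 3, 7, 8, 8, 8, 9, 13, 17, 20, 21, 21]
print(round(coefficient_of_range(sample), 3))                   # 0.826
print(round(coefficient_of_quartile_deviation(7.5, 18.5), 3))   # 0.423
```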

48
Assignment-1
  • Find the following measures of dispersion from
    the data set given on the next slide:
  • Range, percentile range, quartile range
  • Quartile deviation, mean deviation, standard
    deviation
  • Coefficient of variation, coefficient of mean
    deviation, coefficient of range, coefficient of
    quartile deviation

49
Data for Assignment-1
Marks     No. of students   Cumulative frequencies
40-50     6                 6
50-60     11                17
60-70     19                36
70-80     17                53
80-90     13                66
90-100    4                 70
Total     70