Title: Basic Statistics
1. Basic Statistics
Measures of Central Tendency
2. Characteristics of Distributions
- Location or Center
  - Can be indexed by using a measure of central tendency
- Variability or Spread
  - Can be indexed by using a measure of variability
3. Consider the following distributions of scores
How do the red and blue distributions differ?
How do the red and green distributions differ?
4. Consider the following distributions
How do the green and blue distributions differ?
5. Consider the following two distributions
How do the green and red distributions differ?
6. Characteristics of Distributions
- Location or Central Tendency
- Variability
- Symmetry
- Kurtosis
7. Measures of Central Tendency
Summarizing Data
- The Mean
- The Median
- The Mode
Each gives you one score or measure that represents, or is typical of, an entire group of scores.
8. Central Tendency
Most scores tend to center toward a point in the distribution.
[Figure: frequency distribution; axes labeled score and frequency]
9. Measures of Central Tendency
Measures of central tendency are statistics that describe typical, average, or representative scores.
The most common measures of central tendency (mean, median, and mode) are quite different in conception and calculation. These three statistics reflect different notions of the center of a distribution.
10. The Mode
The score that occurs most frequently (in the case of an ungrouped frequency distribution)
11. Unimodal Distribution (One Mode)
Bimodal Distribution (Two Modes)
12. Mode and Measurement Scales
Can you find a mode for each data set?
Nominal Scale: Nationality (1 = American, 2 = Asian, 3 = Mexican)
  Data: 1 2 1 3 3 2 3 3 3 1 2 1 2 3 3 2 1 2 3 2
  Mode: 3
Ordinal Scale: Football Poll (1 = first, 2 = second, 3 = third, 4 = fourth)
  Data: 1 2 3 4 4 3 4 3 2 4 4 2 1 2 4 4 3 2 3 4
  Mode: 4
Interval Scale: IQ score
  Data: 112 132 112 113 112 150 125 114
  Mode: 112
Ratio Scale: Weight
  Data: 68 56 39 56 44 56 45 56 75 81 67 59
  Mode: 56
13. The Mode
- It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
- It can be found for ratio-level, interval-level, ordinal-level, and nominal-level data.
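As a rough illustration of finding the mode at each level of measurement, here is a minimal Python sketch applied to the four data sets on the Mode and Measurement Scales slide. The function name `modes` and the variable names are assumptions made for this example, not part of the slides. The function returns every value tied for the highest frequency, so a bimodal data set would yield two values.

```python
from collections import Counter

def modes(scores):
    """Return every value tied for the highest frequency, in sorted order."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

# Data sets copied from the slide (variable names are illustrative).
nationality = [1, 2, 1, 3, 3, 2, 3, 3, 3, 1, 2, 1, 2, 3, 3, 2, 1, 2, 3, 2]
football_poll = [1, 2, 3, 4, 4, 3, 4, 3, 2, 4, 4, 2, 1, 2, 4, 4, 3, 2, 3, 4]
iq_scores = [112, 132, 112, 113, 112, 150, 125, 114]
weights = [68, 56, 39, 56, 44, 56, 45, 56, 75, 81, 67, 59]

for name, data in [("Nationality", nationality),
                   ("Football Poll", football_poll),
                   ("IQ score", iq_scores),
                   ("Weight", weights)]:
    print(name, modes(data))
```

Counting frequencies needs only equality between values, which is why the mode remains usable even for nominal codes such as nationality.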
14. The Median
The Median is the 50th percentile of a distribution.
- The point where half of the observations fall below and half of the observations fall above
In any distribution there will always be an equal number of cases above and below the Median.
Oh my!! Where is the median?
Location
15. For an odd number of untied scores (11, 13, 18, 19, 20)
11 12 13 14 15 16 17 18 19 20
The Median is the middle score when scores are arranged in rank order.
Median Location = (N + 1)/2 = 3rd score
Median Score = 18
16. For an even number of untied scores (11, 15, 19, 20)
11 12 13 14 15 16 17 18 19 20
The Median is halfway between the two central values when scores are arranged in rank order.
Median Location = (N + 1)/2 = 2.5th score
Median = (15 + 19)/2 = 17
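The median-location rule for odd and even numbers of scores can be sketched in a few lines of Python. This is a minimal illustration, assuming the scores are numeric; the function name `median_by_location` is invented for the example.

```python
import math

def median_by_location(scores):
    """Return (median location, median) using the (N + 1)/2 rule."""
    if not scores:
        raise ValueError("at least one score is required")
    ordered = sorted(scores)               # arrange scores in rank order
    location = (len(ordered) + 1) / 2      # 1-based position of the median
    lower = ordered[math.floor(location) - 1]
    upper = ordered[math.ceil(location) - 1]
    # For an odd N, lower and upper are the same middle score;
    # for an even N, the median is halfway between the two central scores.
    return location, (lower + upper) / 2

print(median_by_location([11, 13, 18, 19, 20]))  # odd number of scores
print(median_by_location([11, 15, 19, 20]))      # even number of scores
```

A fractional location such as 2.5 signals that the median lies between the 2nd and 3rd ranked scores.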
17.
- The Median of a group of scores is that point on the number line such that the sum of the distances of all scores to that point is no larger than the sum of the distances to any other point.
- There is a unique median for each data set.
- It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
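The minimum-distance property can be checked numerically. The sketch below reuses the odd-numbered example scores (11, 13, 18, 19, 20), whose median is 18, and compares the sum of absolute distances at a grid of candidate points. The helper name and the grid of candidates are assumptions chosen for illustration.

```python
def sum_abs_distances(scores, point):
    """Sum of the absolute distances from every score to the given point."""
    return sum(abs(x - point) for x in scores)

scores = [11, 13, 18, 19, 20]

# Candidate points from 10.0 to 21.0 in steps of 0.5 (an arbitrary grid).
candidates = [k / 2 for k in range(20, 43)]
best = min(candidates, key=lambda p: sum_abs_distances(scores, p))

print("best point:", best)
print("sum of distances at the median:", sum_abs_distances(scores, 18))
print("sum of distances at the mean:", sum_abs_distances(scores, sum(scores) / len(scores)))
```

With an even number of scores, every point between the two central values gives the same minimum, which is why the halfway point is used by convention.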
18. The Median
- Can be computed for
  - Ordinal-level data
  - Interval-level data
  - Ratio-level data
19. Median and Levels of Measurement
Can you find a median for each type of data?
Nationality
  Data: 1 2 1 3 3 2 3 3 3 1 2 1 2 3 3 2 1 2 3 2
  Median? No
Football Poll
  Data: 1 2 3 4 4 3 4 3 2 4 4 2 1 2 4 4 3 2 3 4
  Median? Yes
IQ score
  Data: 112 132 112 113 112 150 125 114
  Median? Yes
Weight
  Data: 68 56 39 56 44 56 45 56 75 81 67 59
  Median? Yes
20. The Mean
21. The Population Mean
- For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values. To compute the population mean, use the following formula.
\[ \mu = \frac{\sum X}{N} \]
where \mu is the population mean, \sum (Sigma) denotes summation, X is an individual value, and N is the population size.
22. The Sample Mean
- For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values. To compute the sample mean, use the following formula.
\[ \bar{X} = \frac{\sum X}{n} \]
where \bar{X} is the sample mean, \sum (Sigma) denotes summation, X is an individual value, and n is the sample size.
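The population mean and the sample mean use the same arithmetic: sum the values and divide by how many there are. As a small sketch, the function below applies that rule to the IQ and weight data sets from the earlier slides; treating them as samples is an assumption made only for this example, as is the function name `mean`.

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    return sum(values) / len(values)

# Data sets copied from the measurement-scale slides.
iq_scores = [112, 132, 112, 113, 112, 150, 125, 114]
weights = [68, 56, 39, 56, 44, 56, 45, 56, 75, 81, 67, 59]

print("mean IQ score:", mean(iq_scores))
print("mean weight:", mean(weights))
```

Whether the result is read as a population mean or a sample mean depends only on whether the values are the whole population or a sample drawn from it.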
23. Characteristics of The Mean
Center of Gravity of a Distribution
24. Center of Gravity of a Distribution
[Figure: number line from 1 to 8 with the Mean marked]
25. How much error do you expect for each case?
Data set    Deviation Scores
25          -6
27          -4
29          -2
31           0
33           2
35           4
37           6
The Mean = 31
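Deviation scores are each score minus the mean. The following sketch recomputes them for the data set on this slide (25, 27, 29, 31, 33, 35, 37); the variable names are chosen for illustration only.

```python
data = [25, 27, 29, 31, 33, 35, 37]

mean = sum(data) / len(data)               # center of the data set
deviations = [x - mean for x in data]      # how far each score is from the mean

for score, deviation in zip(data, deviations):
    print(score, deviation)

# Negative and positive deviations cancel out exactly.
print("sum of deviations:", sum(deviations))
```

The cancellation of negative and positive deviations is what makes the mean behave like a center of gravity.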
26. On average, I feel fine
It's too hot!
It's too cold!
27.
The Mean of a group of scores is the point on the number line such that the sum of the squared differences between the scores and the mean is smaller than the sum of the squared differences to any other point. If you summed the differences without squaring them, the result would be zero.
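A quick numerical check of this least-squares property is sketched below, using the scores 25 through 37 from the deviation-score slide, whose mean is 31. The helper name and the grid of candidate points are assumptions made for the example.

```python
def sum_squared_differences(scores, point):
    """Sum of the squared differences between every score and the given point."""
    return sum((x - point) ** 2 for x in scores)

scores = [25, 27, 29, 31, 33, 35, 37]
mean = sum(scores) / len(scores)

# Candidate points from 25.0 to 37.0 in steps of 0.5 (an arbitrary grid).
candidates = [25 + 0.5 * k for k in range(25)]
best = min(candidates, key=lambda p: sum_squared_differences(scores, p))

print("best point:", best)
print("sum of squares at the mean:", sum_squared_differences(scores, mean))
print("sum of squares at 30:", sum_squared_differences(scores, 30))
```

Moving the point away from the mean in either direction increases the sum of squares, so the mean is the unique minimizer.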
28. Mean and Measurement Scales
- Every set of interval-level and ratio-level data has a mean.
- Can you find the Mean for the following data sets?
Nominal data: Nationality (1 = American, 2 = Asian, 3 = Mexican)
  Data: 1 2 3
  Mean? NO
Ordinal data: Football Poll (1 = first, 2 = second, 3 = third)
  Data: 1 2 3
  Mean? NO
Interval data: IQ Test
  Data: 1 2 3
  Mean? YES (Mean = 2)
Ratio data: Weight
  Data: 1 2 3
  Mean? YES (Mean = 2)
29.
- All the values are included in computing the mean.
30.
- A set of data has a unique mean, and the mean is affected by unusually large or small data values (outliers).
[Figure: The Mean, illustrated on example data sets]
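To see how an outlier pulls the mean while leaving the median in place, consider the following small sketch. The two data sets are hypothetical and were made up for this illustration; they are not taken from the slides. The functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical scores: the second list replaces the largest score with an outlier.
scores = [1, 3, 5, 7, 9]
scores_with_outlier = [1, 3, 5, 7, 90]

print("mean without outlier:", statistics.mean(scores))
print("mean with outlier:", statistics.mean(scores_with_outlier))
print("median without outlier:", statistics.median(scores))
print("median with outlier:", statistics.median(scores_with_outlier))
```

Because every value enters the sum, a single extreme score shifts the mean, whereas the median depends only on the rank order of the scores.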
31.
- Every set of interval-level and ratio-level data has a mean.
- All the values are included in computing the mean.
- A set of data has a unique mean.
- The mean is affected by unusually large or small data values.
- The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.
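The zero-sum property of deviations follows directly from the definition of the mean. A short derivation, assuming the notation \bar{X} for the sample mean, X for an individual value, and n for the sample size:

```latex
\[
\sum (X - \bar{X})
  = \sum X - n\bar{X}
  = \sum X - n \cdot \frac{\sum X}{n}
  = \sum X - \sum X
  = 0
\]
```

The same steps hold for a population, with \mu and N in place of \bar{X} and n.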
32. The Relationships between Measures of Central Tendency and Shape of a Distribution
33. Normal Distribution
Symmetric
Unimodal
Mean = Median = Mode
34. Positively Skewed Distribution
[Figure: positively skewed distribution with the Mode, Median, and Mean marked]
Mode < Median < Mean
The median falls closer to the mean than to the mode.
35. Negatively Skewed Distribution
[Figure: negatively skewed distribution with the Mode, Median, and Mean marked]
Mode > Median > Mean
The median falls closer to the mean than to the mode.
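The ordering of the three measures in skewed distributions can be illustrated with a small example. The scores below are hypothetical, invented to have a long right tail; the negatively skewed set is simply their mirror image. This is a sketch of the pattern described on the slides, not a guarantee that every skewed data set behaves this way.

```python
import statistics

# Hypothetical positively skewed scores (long tail toward high values).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 4, 5, 10]
# Mirror image around 5.5 gives a negatively skewed set.
negative_skew = [11 - x for x in positive_skew]

for name, data in [("positive skew", positive_skew),
                   ("negative skew", negative_skew)]:
    mode = statistics.mode(data)
    median = statistics.median(data)
    mean = statistics.mean(data)
    print(name, "mode:", mode, "median:", median, "mean:", mean)
```

In both cases the mean is pulled furthest toward the elongated tail, with the median lying between the mode and the mean.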
36. Bimodal Distribution
[Figure: bimodal distribution with Mode1, Mode2, and Mean = Median marked]
Mode1 < Mean = Median < Mode2
37. SUMMARY
There are three common measures of central tendency.
The mean is the most widely used and the most precise for inferential purposes and is the foundation for statistical concepts that will be introduced in subsequent classes. The mean is the ratio of the sum of the observations to the number of observations. The value of the mean is influenced by the value of every score in a distribution. Consequently, in skewed distributions it is drawn toward the elongated tail more than is the median or mode.
The median is the 50th percentile of a distribution. It is the point in a distribution from which the sum of the absolute differences of all scores is at a minimum. In perfectly symmetrical distributions the median and mean have the same value. When the mean and median differ greatly, the median is usually the most meaningful measure of central tendency for descriptive purposes.
The mode, unlike the mean and median, has descriptive meaning even with nominal scales of measurement. The mode is the most frequently occurring observation. When the median or mean is applicable, the mode is the least useful measure of central tendency. In a symmetrical unimodal distribution the mode, median, and mean have the same value.