Title: PSY 360
1PSY 360
- Describing Middle
- Describing Spread
2Description With Statistics
- Aspects or characteristics of data that we can describe are:
- Middle
- Spread
- Skewness
- Kurtosis
- Statistics that measure/describe middle are the mean, median, and mode.
- Statistics that measure/describe spread are the range, variance, standard deviation, and midrange.
3Description With Statistics
- Middle: central tendency, location, center
- Measures of middle are mean, median, mode (keywords)
- Spread: variability, dispersion
- Measures of spread are range, variance, standard deviation, midrange (keywords)
- Skewness: departure from symmetry
- Positive skewness: tail (extreme scores) in the positive direction
- Negative skewness: tail (extreme scores) in the negative direction
- Kurtosis: peakedness relative to the normal curve
6Skewness
Positive Skewness
Negative Skewness
7Kurtosis
8Description With Statistics
- Another name for middle is central tendency, location, or center.
- The mean describes/measures middle.
- Another name for spread is variability or dispersion.
- The variance describes/measures spread.
- Why is "mean" not a correct alternative name for middle? Because the mean is a statistic, and the name "mean" is a reserved keyword.
- Why is "variance" not a correct alternative name for spread? Because the variance is a statistic, and the name "variance" is a reserved keyword.
9Describing the Middle of Data
- Another name for middle is _________.
- Middle is the aspect of data we want to describe.
- We describe/measure the middle of data in a sample with the statistics:
- Mean.
- Median.
- Mode.
- We describe/measure the middle of data in a population with the parameter μ (mu); we usually don't know μ, so we estimate it with X̄.
10Sample Mean
- The sample mean is the sum of the scores divided by the number of scores, and is symbolized by X-bar: X̄ = ΣX / N.
- For example, for the scores 4, 1, 7: N = 3, ΣX = 12, and X̄ = ΣX/N = 12/3 = 4.
- Characteristics:
- X̄ is the balance point.
- Σ(X - X̄) = 0.
- X̄ minimizes Σ(X - X̄)² (least squares criterion).
- X̄ is pulled in the direction of extreme scores.
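The characteristics of the mean listed on this slide can be checked numerically. The following is a minimal Python sketch using the slide's scores 4, 1, 7. The helper names `mean` and `sum_sq_dev`, the short list of candidate values, and the extreme score of 100 are illustrative assumptions, not part of the lecture.

```python
# Scores from the slide's example.
scores = [4, 1, 7]

def mean(xs):
    """Sample mean: sum of the scores divided by the number of scores."""
    return sum(xs) / len(xs)

def sum_sq_dev(xs, c):
    """Sum of squared deviations of the scores from a candidate value c."""
    return sum((x - c) ** 2 for x in xs)

x_bar = mean(scores)                       # 12 / 3 = 4.0

# Balance point: deviations from the mean cancel out.
dev_sum = sum(x - x_bar for x in scores)   # 0.0

# Least squares: among these candidates, the mean gives the smallest
# sum of squared deviations.
candidates = [3.0, 3.5, 4.0, 4.5, 5.0]
best = min(candidates, key=lambda c: sum_sq_dev(scores, c))

# Pull toward extreme scores: one extreme score moves the mean a long way.
pulled = mean(scores + [100])              # 112 / 4 = 28.0

print(x_bar, dev_sum, best, pulled)
```

The candidate list only samples a few values, so it illustrates the least squares property rather than proving it.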
11Sample Mean
- What is the mean for the following data: 4, 1, 7, 6?
- N = 4
- ΣX = 18
- X̄ = ΣX/N = 18/4 = 4.5
12Sample Median
- The median is the middle of the ordered scores, and is symbolized as X50.
- Median position (as distinct from the median itself) is (N + 1)/2 and is used to find the median.
- Find the median of these scores: 4, 1, 7.
- N = 3.
- Median position is (3 + 1)/2 = 4/2 = 2.
- Place the scores in order: 1, 4, 7.
- X50 is the score in position/rank 2.
- So X50 = 4.
13Sample Median
- Another example: 4, 1, 7, 6.
- N = 4.
- Median position is (N + 1)/2 = (4 + 1)/2 = 5/2 = 2.5.
- Place the scores in order: 1, 4, 6, 7.
- X50 is the score in position/rank 2.5, that is, halfway between the scores in positions 2 and 3.
- So X50 = (4 + 6)/2 = 10/2 = 5.
- Characteristics:
- Depends on only one or two middle values.
- For quantitative data when the distribution is skewed.
- Minimizes Σ|X - X50|.
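The median-position procedure from the two examples can be sketched in Python as follows. The function names are illustrative assumptions, and the sketch assumes a non-empty list of scores.

```python
def median_position(n):
    """Median position (not the median itself): (N + 1) / 2."""
    return (n + 1) / 2

def score_at_position(ordered, pos):
    """Score at a 1-based position in an ordered list.

    A position ending in .5 falls between two scores, so the two
    neighbouring scores are averaged.
    """
    lower = int(pos)                 # whole-number part of the position
    if pos == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2

def median(xs):
    ordered = sorted(xs)             # place the scores in order
    return score_at_position(ordered, median_position(len(ordered)))

print(median([4, 1, 7]))      # position 2 -> 4
print(median([4, 1, 7, 6]))   # position 2.5 -> (4 + 6) / 2 = 5.0
```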
14Sample Mode
- The mode is the most frequent score.
- Examples
- 1 1 4 7, the mode is 1.
- 1 1 4 7 7, there are two modes, 1 and 7.
- 1 4 7, there is no mode.
- Characteristics
- Has problems: there may be more than one mode, or none; it may not be in the middle; it gives little information about the data.
- Best for qualitative data, e.g. gender.
- If it exists, it is always one of the scores.
- Is rarely used.
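A short Python sketch of the tallying behind the three examples on this slide. The function name `modes` is an illustrative assumption, and so is the rule used for "no mode": the sketch reports no mode whenever every distinct score occurs equally often, which matches the 1 4 7 example but is only one possible reading of the slide.

```python
from collections import Counter

def modes(xs):
    """Return all modes in sorted order, or [] when no score stands out."""
    counts = Counter(xs)                 # tally how often each score occurs
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []                        # e.g. 1 4 7: every score occurs once
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 1, 4, 7]))       # [1]
print(modes([1, 1, 4, 7, 7]))    # [1, 7]
print(modes([1, 4, 7]))          # []
print(modes(["F", "M", "F"]))    # ['F'], qualitative data works too
```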
15Describing the Spread of Data
- Another name for spread is _________.
- Spread is the aspect of data we want to describe.
- Any statistic that describes/measures spread should have these characteristics. It should:
- Equal zero when the spread is zero.
- Increase as spread increases.
- Measure just spread, not middle.
16Describing the Spread of Data
- We describe/measure the spread of data in a sample with the statistics:
- Range = high score - low score.
- Midrange, MR.
- Sample variance, s² (N in the denominator).
- Sample standard deviation, s (square root of the sample variance).
- Unbiased variance estimate, s² (N - 1 in the denominator).
- Standard deviation, s (square root of the unbiased variance estimate).
- We describe/measure the spread of data in a population with the parameter σ (sigma) or σ²; we usually don't know σ or σ², so we estimate them with one of the statistics.
17Range
- Formula is Range = high score - low score.
- Example: 4 1 5 3 3 6 1 2 6 4 5 3 4 1, N = 14
- Arrange data in order: 1 1 1 2 3 3 3 4 4 4 5 5 6 6
- Range = high score - low score = 6 - 1 = 5
18Midrange (MR)
- Formula is MR = UH - LH.
- UH = upper hinge
- LH = lower hinge
- Hinges cut off 25% of the data in each tail.
- Hinge position is (whole-number part of median position + 1)/2.
- The whole-number part of the median position is the median position with any fraction dropped (remember, median position = (N + 1)/2).
- Use hinge position to count in from the tails to find the hinges.
19Midrange (MR)
- Example: 4 1 5 3 3 6 1 2 6 4 5 3 4 1, N = 14
- Arrange data in order: 1 1 1 2 3 3 3 4 4 4 5 5 6 6
- Compute median position: (N + 1)/2 = (14 + 1)/2 = 15/2 = 7.5
- Compute hinge position: (whole-number part of median position + 1)/2 = (7 + 1)/2 = 8/2 = 4
- Count in to the 4th score from each tail to find UH and LH.
- UH = 5 and LH = 2
- MR = UH - LH = 5 - 2 = 3
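The range and the hinge-based midrange defined in this lecture can be sketched in Python as below. The function names are illustrative assumptions. A hinge position ending in .5 is handled by averaging the two neighbouring scores, the same halfway rule used for the median.

```python
def score_at_position(ordered, pos):
    """Score at a 1-based position; a .5 position averages two neighbours."""
    lower = int(pos)
    if pos == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2

def value_range(xs):
    """Range = high score - low score."""
    return max(xs) - min(xs)

def midrange(xs):
    """MR = UH - LH, with hinges found by counting in from each tail."""
    ordered = sorted(xs)
    median_pos = (len(ordered) + 1) / 2
    hinge_pos = (int(median_pos) + 1) / 2              # whole-number part, plus 1, over 2
    lh = score_at_position(ordered, hinge_pos)         # count in from the low tail
    uh = score_at_position(ordered[::-1], hinge_pos)   # count in from the high tail
    return uh - lh

data = [4, 1, 5, 3, 3, 6, 1, 2, 6, 4, 5, 3, 4, 1]
print(value_range(data))   # 6 - 1 = 5
print(midrange(data))      # UH - LH = 5 - 2 = 3
```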
20Sample Variance, s²
- Definitional formula: s² = Σ(X - X̄)² / N
23Sample Variance, s²
- Definitional formula: s² = Σ(X - X̄)² / N, the average squared deviation from X̄.
- Example: 1 2 3
- N = 3, X̄ = ΣX/N = 6/3 = 2
- Σ(X - X̄)² = (1 - 2)² + (2 - 2)² + (3 - 2)² = (-1)² + 0² + 1² = 1 + 0 + 1 = 2
- s² = 2/N = 2/3 = .6667
- Computational formula: s² = [NΣX² - (ΣX)²] / N²
- ΣX² = 1² + 2² + 3² = 1 + 4 + 9 = 14, ΣX = 6, N = 3
- s² = [3(14) - (6)²]/3² = (42 - 36)/9 = 6/9 = 2/3 = .6667
- s² is in squared units of measure.
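As a check that the definitional and computational formulas agree, here is a minimal Python sketch using the slide's scores 1, 2, 3. Both functions divide by N, as on this slide; the function names are illustrative assumptions.

```python
import math

def variance_definitional(xs):
    """Average squared deviation from the mean (N in the denominator)."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / n

def variance_computational(xs):
    """[N * sum(X^2) - (sum X)^2] / N^2, algebraically the same quantity."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x ** 2 for x in xs)
    return (n * sum_x2 - sum_x ** 2) / n ** 2

scores = [1, 2, 3]
v_def = variance_definitional(scores)     # 2 / 3
v_comp = variance_computational(scores)   # (42 - 36) / 9
print(round(v_def, 4), round(v_comp, 4), math.isclose(v_def, v_comp))
```

The computational form avoids computing deviations one by one, which is convenient by hand. With floating-point numbers and large scores it can lose precision, so the definitional form is often the safer choice in code.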
24Sample Standard Deviation, s
- Formula: s = √s²
- Example: 1 2 3
- N = 3, X̄ = ΣX/N = 6/3 = 2
- Σ(X - X̄)² = (1 - 2)² + (2 - 2)² + (3 - 2)² = 1 + 0 + 1 = 2
- s² = 2/3 = .6667
- s = √.6667 = .8165
- s is in original units of measure.
25Unbiased Variance Estimate, s²
- Definitional formula: s² = Σ(X - X̄)² / (N - 1)
- Example: 1 2 3
- N = 3, X̄ = ΣX/N = 6/3 = 2
- Σ(X - X̄)² = (1 - 2)² + (2 - 2)² + (3 - 2)² = 1 + 0 + 1 = 2
- s² = 2/2 = 1.0
- Computational formula: s² = [NΣX² - (ΣX)²] / [N(N - 1)]
- ΣX² = 1² + 2² + 3² = 1 + 4 + 9 = 14, ΣX = 6, N = 3
- s² = [3(14) - (6)²]/[3(2)] = (42 - 36)/6 = 6/6 = 1.0
- s² is in squared units of measure.
26Standard Deviation, s
- Formula: s = √s², where s² is the unbiased variance estimate
- Example: 1 2 3
- N = 3, X̄ = ΣX/N = 6/3 = 2
- Σ(X - X̄)² = (1 - 2)² + (2 - 2)² + (3 - 2)² = 1 + 0 + 1 = 2
- s² = 1.0
- s = √1 = 1.0
- s is in original units of measure.
- s is the typical distance of scores from the mean.
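The two variances and two standard deviations for the scores 1, 2, 3 can be computed side by side. The sketch below is a hedged illustration in Python: the variable names are assumptions, and the comparison with Python's standard `statistics` module is added only as a cross-check (its `pvariance`/`pstdev` functions divide by N, while `variance`/`stdev` divide by N - 1).

```python
import math
import statistics

scores = [1, 2, 3]
n = len(scores)
x_bar = sum(scores) / n
ss = sum((x - x_bar) ** 2 for x in scores)   # sum of squared deviations: 2.0

var_n = ss / n                    # sample variance, N in the denominator
var_unbiased = ss / (n - 1)       # unbiased variance estimate, N - 1
sd_n = math.sqrt(var_n)           # back in original units
sd_unbiased = math.sqrt(var_unbiased)

# Cross-check against the standard library.
same_as_library = (
    math.isclose(var_n, statistics.pvariance(scores))
    and math.isclose(var_unbiased, statistics.variance(scores))
    and math.isclose(sd_n, statistics.pstdev(scores))
    and math.isclose(sd_unbiased, statistics.stdev(scores))
)

print(round(var_n, 4), round(sd_n, 4), var_unbiased, sd_unbiased, same_as_library)
```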
27Why do we care about measures of middle and
variability?
- Once we've collected data, the first step is usually to organize the information using simple descriptive statistics (e.g., measures of middle and variability).
- Measures of middle are AVERAGES. Mean, median, and mode are different ways of finding the one value that best represents all of your data.
- Measures of variability tell us how scores DIFFER FROM ONE ANOTHER.
28What do those formulas mean?
- Computing the mean: X̄ = ΣX / N
- List the entire set of values in one or more columns. These are all the Xs.
- Compute the sum or total of the values.
- Divide the total or sum by the number of values.
- Computing the median: (N + 1)/2 is the median position.
- List the values in order (from lowest to highest).
- Find the middle-most score (i.e., the score in the median position). That's the median.
- Computing the mode:
- List the entire set of values, but list each only once.
- Tally the number of times that each value occurs.
- The value that occurs most often is the mode.
29What do those formulas mean?
- Computing the range: Range = highest score - lowest score
- Find the highest and lowest scores.
- Subtract the lowest score from the highest.
- Computing the midrange: MR = UH - LH
- Find the hinge position, (whole-number part of median position + 1)/2. Use this to count in from the tails to find the hinges.
- Subtract the lower hinge from the upper hinge.
- Computing the sample variance: s² = Σ(X - X̄)² / N
- Compute the mean for the group.
- Subtract the mean from each score.
- Square each of these difference scores. (This gets rid of negative numbers.)
- Sum all of the squared deviations around the mean.
- Divide the sum by N. (This gives you the AVERAGE SQUARED DEVIATION AROUND THE MEAN.)
30Why do we have two formulas for variance and
standard deviation?
- Remember that our statistics are ESTIMATES of the parameters in the population.
- When we use N as the denominator (as in the sample variance and the sample standard deviation), we produce a biased estimate (it is too small).
- Since we are trying to be good scientists, we will be conservative and use the unbiased variance estimate and the standard deviation computed from it (the versions with N - 1 in the denominator).
- Why did I confuse you with these different formulas? We will address the idea of bias later in the semester, and this is a good introduction. Plus, some of the instructors for your later statistics courses may expect you to have a solid understanding of biased and unbiased estimates of variability.
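The claim that the N-denominator variance is "too small" can be illustrated with a tiny exhaustive example. This is a sketch under stated assumptions, not part of the lecture: the population is taken to be just the three scores 1, 2, 3, and every possible sample of size 2 drawn with replacement is treated as equally likely.

```python
import math
from itertools import product

population = [1, 2, 3]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)   # 2/3

def sum_sq_dev(sample):
    """Sum of squared deviations of a sample around its own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))   # all 9 samples, with replacement

# Average each estimator over every possible sample.
avg_var_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_var_unbiased = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(round(sigma2, 4), round(avg_var_n, 4), round(avg_var_unbiased, 4))
```

Under these assumptions, dividing by N averages out to 1/3, below the population variance of 2/3, while dividing by N - 1 averages out to exactly 2/3.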
31PRACTICE
- Data: 1, 3, 5, 2, 13, 11, 1, 4
- Compute the mean
- X̄ = ΣX/N = 40/8 = 5
- Compute the median
- Place scores in order: 1, 1, 2, 3, 4, 5, 11, 13
- Median position = (N + 1)/2 = (8 + 1)/2 = 9/2 = 4.5
- Count in from the end 4.5 places
- X50 = (3 + 4)/2 = 7/2 = 3.5
- Compute the mode
- Tally the scores
- Mode = 1
32PRACTICE
- Data: 1, 3, 5, 2, 13, 11, 1, 4
- Compute the Range
- Range = High score - Low score = 13 - 1 = 12
- Compute the Midrange
- Place scores in order: 1, 1, 2, 3, 4, 5, 11, 13
- MR = UH - LH
- Hinge position = (whole-number part of median position + 1)/2 = (4 + 1)/2 = 2.5
- Count in from each end 2.5 places
- UH = (5 + 11)/2 = 8
- LH = (1 + 2)/2 = 1.5
- MR = UH - LH = 8 - 1.5 = 6.5
33PRACTICE
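- Data: 1, 3, 5, 2, 13, 11, 1, 4
- Compute s² (the sample variance, with N in the denominator)
- s² = Σ(X - X̄)² / N
- = [(1 - 5)² + (3 - 5)² + (5 - 5)² + (2 - 5)² + (13 - 5)² + (11 - 5)² + (1 - 5)² + (4 - 5)²] / 8
- = [(-4)² + (-2)² + (0)² + (-3)² + (8)² + (6)² + (-4)² + (-1)²] / 8
- = (16 + 4 + 0 + 9 + 64 + 36 + 16 + 1) / 8 = 146/8 = 18.25
- Compute s
- s = √s²
- s = √18.25 = 4.27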
34PRACTICE
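- Data: 1, 3, 5, 2, 13, 11, 1, 4
- Compute s² (the unbiased variance estimate, with N - 1 in the denominator)
- s² = Σ(X - X̄)² / (N - 1)
- = [(1 - 5)² + (3 - 5)² + (5 - 5)² + (2 - 5)² + (13 - 5)² + (11 - 5)² + (1 - 5)² + (4 - 5)²] / (8 - 1)
- = [(-4)² + (-2)² + (0)² + (-3)² + (8)² + (6)² + (-4)² + (-1)²] / (8 - 1)
- = (16 + 4 + 0 + 9 + 64 + 36 + 16 + 1) / (8 - 1) = 146/7 = 20.86
- Compute s
- s = √s²
- s = √20.86 = 4.57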
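The practice answers can be cross-checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are assumptions, and the module's own median and mode routines stand in for the hand procedures (they agree with them on this data set).

```python
import statistics

data = [1, 3, 5, 2, 13, 11, 1, 4]

mean = statistics.mean(data)               # 40 / 8
median = statistics.median(data)           # (3 + 4) / 2
mode = statistics.mode(data)               # most frequent score
value_range = max(data) - min(data)        # 13 - 1

var_n = statistics.pvariance(data)         # divides by N
sd_n = statistics.pstdev(data)
var_unbiased = statistics.variance(data)   # divides by N - 1
sd_unbiased = statistics.stdev(data)

print(mean, median, mode, value_range)
print(round(var_n, 2), round(sd_n, 2))
print(round(var_unbiased, 2), round(sd_unbiased, 2))
```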
35Homework for Next Time
- 8.8, 6.4, 8.4, 7.8, 8.8, 8.9, 7.8, 7.7, 4.5, 3.7, 9.1, 8.3, 7.5, 3.2, 8.9, 4.4, 6.8
- Use SPSS to find N, Minimum score, Maximum score, Mean, Standard Deviation.
- Which standard deviation does SPSS compute: s with N in the denominator, or s with N - 1 in the denominator?