Title: Stat 2411 Statistical Methods
- Chapter 3
- Measures of Location
3.1 Populations and Samples
- Population: all conceivably or hypothetically
possible observations
- Sample: the particular observations actually taken
Population
- Example: temperatures of patients with
meningitis.
- There are unlimited (infinitely many) potential
observations.
[Figure: population of potential measurements, with example values 100.2, 101.5, 100.3]
Sample
Sample (n = 10): 104.0 100.2 100.8 108.0 104.8 102.4
104.2 103.8 101.6 101.4
Notation: $x_i$ denotes the value of the $i$-th observation, e.g. $x_1 = 104.0$, $x_2 = 100.2$
3.2 The mean
Center of gravity
Summation notation
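In summation notation the sample mean is written $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. As a minimal sketch, the Python below applies that formula to the ten temperatures from the Sample slide. The names `temps` and `sample_mean` are illustrative only and do not come from the course materials.

```python
# Sample of n = 10 temperatures from the Sample slide.
temps = [104.0, 100.2, 100.8, 108.0, 104.8, 102.4, 104.2, 103.8, 101.6, 101.4]

def sample_mean(xs):
    """Return x-bar = (1/n) * (x_1 + x_2 + ... + x_n)."""
    n = len(xs)
    return sum(xs) / n

print(round(sample_mean(temps), 2))  # 103.12
```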
Population Descriptions
- The population mean is the average of all values
in the population of potential values.
- Population mean: $\mu$
- Population descriptions are denoted by Greek
letters such as $\mu$.
- Meningitis example: $\mu$ is the average of all
potential temperature measurements over all
meningitis cases.
Parameter and Statistic
- Population descriptions = parameters
- Sample descriptions = statistics
- Sample statistics are usually used to estimate
the corresponding population parameters.
3.3 Weighted mean
Item       Weight   X
Homework   20       90
Exam 1      8       82
Exam 2     11       87
Exam 3     13       85
Exam 4     13       92
Final      35       83
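A weighted mean multiplies each value by its weight, sums the products, and divides by the total weight: $\bar{x}_w = \sum w_i x_i / \sum w_i$. The sketch below applies this to the grading table above. The names `grades` and `weighted_mean` are made up for illustration.

```python
# (weight, score) pairs from the grading table; the weights sum to 100.
grades = [
    (20, 90),  # Homework
    (8, 82),   # Exam 1
    (11, 87),  # Exam 2
    (13, 85),  # Exam 3
    (13, 92),  # Exam 4
    (35, 83),  # Final
]

def weighted_mean(pairs):
    """Return sum(w * x) / sum(w) for a list of (w, x) pairs."""
    total_weight = sum(w for w, _ in pairs)
    weighted_sum = sum(w * x for w, x in pairs)
    return weighted_sum / total_weight

print(weighted_mean(grades))  # 86.19
```

Because the Final carries the largest weight, the result sits closer to 83 than an unweighted average of the six scores would.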
Geometric Mean
- (problem 3.15)
- Sometimes data are analyzed in the log scale
(for reasons discussed later).
- Geometric mean = back-transformed mean of logs
- Transform: $y = \log_{10} x$; back-transform: geometric mean $= 10^{\bar{y}}$
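Geometric mean
- Example: x = 1, 10, 100
- y = log10(x) = 0, 1, 2, so $\bar{y} = 1$ and the geometric mean is $10^{1} = 10$
Algebraically equivalent formula: $\sqrt[n]{x_1 x_2 \cdots x_n}$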
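The two routes to the geometric mean (back-transforming the mean of the logs, and taking the n-th root of the product) can be compared numerically. This is a minimal sketch for positive data only; both function names are hypothetical.

```python
import math

def geometric_mean_via_logs(xs):
    """Back-transform the mean of the base-10 logs: 10 ** mean(log10(x))."""
    ys = [math.log10(x) for x in xs]
    y_bar = sum(ys) / len(ys)
    return 10 ** y_bar

def geometric_mean_via_product(xs):
    """Algebraically equivalent form: n-th root of the product of the x's."""
    return math.prod(xs) ** (1 / len(xs))

xs = [1, 10, 100]
print(round(geometric_mean_via_logs(xs), 6))     # 10.0
print(round(geometric_mean_via_product(xs), 6))  # 10.0
```

The two results can differ in the last few floating-point digits, which is why the example rounds before printing.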
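Harmonic Mean
- Back-transformed mean of 1/x: $y = 1/x$, harmonic mean $= 1/\bar{y}$
x      y = 1/x
1      1
10     0.1
100    0.01
Example (time and rate): a 15-mile trip upstream and the same 15 miles back downstream
Current: 1 mph
Speed: 3 mph upstream, 5 mph downstream
Harmonic mean = 30 miles / (5 hours up + 3 hours down) = 3.75 mph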
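The upstream/downstream example can be checked two ways: as total distance over total time, and as the back-transformed mean of the reciprocals of the two speeds. The sketch below assumes equal distances in each direction, as in the example; the names are illustrative.

```python
def harmonic_mean(xs):
    """Back-transform the mean of the reciprocals: 1 / mean(1/x)."""
    ys = [1 / x for x in xs]
    y_bar = sum(ys) / len(ys)
    return 1 / y_bar

distance = 15        # miles each way
speeds = [3, 5]      # mph upstream, mph downstream

# Direct calculation: total distance divided by total time.
total_time = distance / speeds[0] + distance / speeds[1]  # 5 h + 3 h
average_speed = 2 * distance / total_time

print(round(average_speed, 4))          # 3.75
print(round(harmonic_mean(speeds), 4))  # 3.75
```

The arithmetic mean of the two speeds (4 mph) overstates the average speed because more time is spent at the slower speed.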
3.4 The Median
- The median M is the midpoint of a data set. When
observations are ordered from smallest to
largest, M is in the middle, with half the
observations smaller and half larger.
Example (n = 5, odd): 3 5 7 9 38, so M = 7
Example (n = 4, even): 3 5 7 9, so M = (5 + 7)/2 = 6
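The definition translates into a short procedure: sort, then take the middle value (odd n) or the average of the two middle values (even n). A minimal sketch, with a hypothetical function name, applied to the two data sets above:

```python
def median(values):
    """Return the midpoint of the ordered data."""
    xs = sorted(values)
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle observation.
        return xs[mid]
    # Even n: average of the two middle observations.
    return (xs[mid - 1] + xs[mid]) / 2

print(median([3, 5, 7, 9, 38]))  # 7
print(median([3, 5, 7, 9]))      # 6.0
```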
Means vs. Medians
- The two values can behave VERY differently,
because the median (M) is resistant to the
magnitude of possible outliers, but the mean
($\bar{x}$) is not, so it can be drawn toward them.
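To see the resistance of the median in numbers, the sketch below compares the mean and median of the data 3 5 7 9 with and without the extra value 38 from the median example. It uses Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

without_outlier = [3, 5, 7, 9]
with_outlier = [3, 5, 7, 9, 38]

# Both measures agree when the data have no outlier.
print(f"{mean(without_outlier):.1f} {median(without_outlier):.1f}")  # 6.0 6.0

# Adding 38 pulls the mean up sharply; the median moves by only 1.
print(f"{mean(with_outlier):.1f} {median(with_outlier):.1f}")        # 12.4 7.0
```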
Mode
- The value that occurs most frequently
- Mode = 108
Stem-and-leaf plot (stem = tens, leaf = units):
 9 | 0 6
10 | 0 2 6 8 8 8 8 8
11 | 2 2 2 2 4 4 6 6 6
12 | 0 2 4 4 8
13 | 0 4
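Finding the mode amounts to counting how often each value occurs. The sketch below does this for the 26 systolic blood pressure readings that the stem-and-leaf display encodes, using `collections.Counter` from the standard library. The variable names are illustrative, and the code reports a single most frequent value without handling ties.

```python
from collections import Counter

# The 26 systolic blood pressure readings used in these slides.
bp = [90, 96, 100, 102, 106, 108, 108, 108, 108, 108, 112, 112, 112,
      112, 114, 114, 116, 116, 116, 120, 122, 124, 124, 128, 130, 134]

counts = Counter(bp)
# most_common(1) returns a list holding the (value, frequency) pair
# with the highest frequency.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # 108 5
```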
Fractiles
- Quartiles divide data into 4 parts.
- Deciles divide data into 10 parts.
- Percentiles divide data into a hundred parts.
Among the many fractiles, quartiles are used
very often in describing data. Quartiles are
the values at which 25% (Q1), 50% (Q2 = median), and
75% (Q3) of the observations fall at or below
them, and they can be used to describe the internal
variability.
Defining the Quartiles
To calculate the quartiles:
1. Arrange the observations in increasing order and
locate the median M in the ordered list of observations.
2. The first quartile Q1 is the median of the
observations whose position in the ordered list
is to the left of the location of the overall median.
3. The third quartile Q3 is the median of the
observations whose position in the ordered list
is to the right of the location of the overall median.
Calculating (Identifying) the Quartiles
26 systolic blood pressure readings:
90 96 100 102 106 108 108 108 108 108 112 112 112
112 114 114 116 116 116 120 122 124 124 128 130 134
Q1 = 108, Q3 = 120
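The median-of-halves rule for quartiles can be sketched in a few lines. This follows the three steps given on the previous slide; statistical software often uses other interpolation rules and may return slightly different quartiles. The function name `quartiles` is hypothetical, and the sketch assumes at least two observations.

```python
from statistics import median

bp = [90, 96, 100, 102, 106, 108, 108, 108, 108, 108, 112, 112, 112,
      112, 114, 114, 116, 116, 116, 120, 122, 124, 124, 128, 130, 134]

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Observations positioned to the left of the overall median.
    lower = xs[:half]
    # Observations to the right; for odd n the median itself is skipped.
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return median(lower), median(xs), median(upper)

q1, m, q3 = quartiles(bp)
print(q1, m, q3)  # 108 112.0 120
```

With n = 26 the overall median falls between the 13th and 14th observations, so each half contains 13 values and its median is the 7th value of that half.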
The Box Plot: Graphing the Five-Number Summary
(Min, Q1, Median, Q3, Max)
[Figure: box plot. Axis: values of the variable. Marked from top to bottom: maximum (largest observation), Q3 (75th percentile), median M (50th percentile), Q1 (25th percentile), minimum (smallest observation).]
- Box plots can summarize very large datasets and
highlight skewness.
- Because they show less detail than histograms or
stemplots, they are best used for side-by-side
comparison of more than one dataset.
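The five numbers a box plot draws can be computed directly. The sketch below does so for the blood pressure data, using the median-of-halves quartile rule from these slides; the function name is hypothetical, and plotting libraries may compute quartiles with a different rule.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For odd n, the overall median is excluded from the upper half.
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

bp = [90, 96, 100, 102, 106, 108, 108, 108, 108, 108, 112, 112, 112,
      112, 114, 114, 116, 116, 116, 120, 122, 124, 124, 128, 130, 134]

print(five_number_summary(bp))  # (90, 108, 112.0, 120, 134)
```

Here the upper whisker (134 - 120 = 14) is a little shorter than the lower one (108 - 90 = 18), which is the kind of asymmetry a box plot makes visible at a glance.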
- Read section 3.8 on summation notation.