Title: Week 8
1Statistical Inference
Week 8
To guess is cheap. To guess wrongly is
expensive. (An old Chinese proverb)
2OUTLINE
1. Statistical Inference
2. Point and Interval Estimation
3. Confidence Intervals: Rationale
4. Assumptions and Conditions
5. A Six-Step Model
6. Estimation of a Population Mean ($\sigma$ known)
7. Choosing the Sample Size
3To recap Central Limit Theorem
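When n is large, the sample mean $\bar{x}$ is approximately normally
distributed around the population mean $\mu$, with standard deviation
$\sigma/\sqrt{n}$.
68% probability that our $\bar{x}$ will be in this region (within one
standard deviation of $\mu$)
95% probability that our $\bar{x}$ will be in this region (within two
standard deviations of $\mu$)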
4STANDARD ERROR OF THE SAMPLE MEAN
- The standard error of the sample mean is computed by
  $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
- $\sigma_{\bar{x}}$ is the symbol for the standard error of the sample mean.
- $\sigma$ is the standard deviation of the population.
- $n$ is the size of the sample.
The standard error shows how close the sample mean
tends to be to the population mean. It is a TYPICAL
deviation of a sample mean from the POPULATION mean.
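The standard error formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `standard_error` and the input values (a population standard deviation of 50 with samples of size 4 and 16) are assumptions chosen only to show the arithmetic.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)


# Illustrative values (assumed): quadrupling n halves the standard error.
se_small_sample = standard_error(50, 4)
se_large_sample = standard_error(50, 16)
print(se_small_sample, se_large_sample)
```

Because n sits under a square root, halving the standard error requires four times as many observations.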
5EXERCISE 1
61. Statistical Inference
A population is a complete set of observations,
patients, entities, or measurements about which
we would like to draw conclusions.
- Statistical inference is a formal process that
uses information from a sample to draw
conclusions about a population.
- It also provides a statement of how much
confidence can be placed in the conclusion.
7Statistical Inference
- Estimation
  - Estimating the unknown value of a population parameter.
- Hypothesis testing
  - Making decisions about the value of a parameter
    by testing a preconceived hypothesis.
- Both types of inference are based on the
sampling distribution of statistics.
- Both report probabilities that state what
would happen if we used the inference method many
times.
8Statistical Inference
- When do we use estimation?
- When we want to estimate the unknown population
parameters and we do not have any previous
knowledge about the population.
92. POINT AND INTERVAL ESTIMATION
- Point Estimate
  - is a single number (our best guess), calculated
    from available sample data, that is used to
    estimate the value of an unknown population
    parameter.
- Interval Estimate (Confidence Interval)
  - is an interval that provides an upper and lower
    bound for a specific unknown population
    parameter.
  - Arguably the most useful type of inference.
103. Confidence Intervals RATIONALE
11The following data were obtained from a study
involving 4 female rest home patients.
Platelet counts (in thousands per mm$^3$): 125, 170, 144, 101
The sample mean is 135, and appears to
be a reasonable estimate of the population
mean. Can we conclude that the POPULATION mean
is 135? Why? How reliable is this estimate? A
second sample would almost certainly not give the
same mean again. A point estimate like this is of
little value unless we indicate its variability.
12Let us suppose that we know (unrealistically)
that the standard deviation, $\sigma$, of the population
is 50. We know (CLT!) that in repeated sampling
the sample mean follows the normal distribution
centred at the unknown population mean. The
standard deviation of that distribution is
$\sigma/\sqrt{n} = 50/\sqrt{4} = 50/2 = 25$
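13- We say that we are 99.7% confident that the
unknown population mean for all SA elderly women
lies between 60 and 210 (that is, $135 \pm 3 \times 25$).
- There are only two possibilities:
  1. The CI contains the true $\mu$.
  2. Our sample was one of the few (about 0.3%) for which the CI
     does not contain the true $\mu$.
The interval of numbers between the values
$\bar{x} \pm 75$ is called a 99.7% confidence interval for $\mu$.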
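The arithmetic on slides 11 to 13 can be reproduced with a short Python sketch. The data and the value 50 for the population standard deviation come from the slides; the variable names are illustrative, and the factor 3 is the usual three-standard-deviation rule behind the 99.7% figure.

```python
import math
from statistics import mean

platelets = [125, 170, 144, 101]  # thousands per mm^3, from slide 11
sigma = 50                        # population SD, assumed known (slide 12)

x_bar = mean(platelets)                  # point estimate of the mean
se = sigma / math.sqrt(len(platelets))   # standard error of the sample mean
margin = 3 * se                          # three standard errors (99.7% rule)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar}, SE = {se}, interval = ({lower}, {upper})")
```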
14Confidence Intervals
- Confidence Interval
  - is an interval computed from a sample that is
    accompanied by a specific level of confidence
    (probability) of being correct (that is, of
    encompassing the true value of the parameter).
- The generic formula:
  Estimate - Margin of error < Parameter < Estimate + Margin of error
The margin of error shows how accurate we believe
our guess (estimate) is, based on the variability
of the estimate.
155. A Six-Step Confidence Interval Model
STEP 1: Describe the population parameter of concern
STEP 2: Check the assumptions and conditions
STEP 3: State the level of confidence
STEP 4: Find the point estimate
STEP 5: Obtain the critical values and calculate the CI
STEP 6: Interpret the results
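166. Confidence Intervals for a Population Mean $\mu$
($\sigma$ is known)
Estimate - Margin of error < Parameter < Estimate + Margin of error
OR Estimate $\pm$ Margin of error
For a mean with $\sigma$ known: $\bar{x} \pm z \, \sigma/\sqrt{n}$,
where $z$ is the critical value for the chosen confidence level.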
17Confidence Intervals for a Population Mean $\mu$
- The two confidence levels that are used
extensively are 95% and 99%.
- A 95% confidence interval means that about 95% of
similarly constructed intervals will contain
the parameter being estimated.
184. Assumptions and Conditions
(i) Observations are independent (this is
always the case when the sample is random)
(ii) The population has a normal distribution, OR the
sample size is large enough (say n > 30)
(iii) Standard deviation of population, $\sigma$, is known
19Confidence Intervals Example 1
Example: Average potency (strength) of
tablets. Suppose that we want to determine the
average strength of a certain tablet
product. Assume that the standard deviation of
the population is known and is equal to 0.5 mg. A
random sample of 25 tablets was taken
and the results are shown in Table 1.
Obtain the 95% confidence interval for the mean
tablet strength.
20Confidence Intervals Example 1
Table 1 Results of 25 Single-Tablet Assays
21Confidence Intervals Example 1
Step 1: Describe the population parameter of
concern: $\mu$, the mean tablet strength.
Step 2: Check the assumptions and conditions.
a) Observations are independent, since a random
sample was taken.
b) Assume that the population is normal!
c) $\sigma$ is known ($\sigma = 0.5$ mg).
Step 3: State the level of confidence:
95%
22Step 5: Obtain the critical values and calculate
the lower and upper confidence limits.
Margin of error: $m = z \, \sigma/\sqrt{n}$, where $z$ is the critical value.
Confidence level   Tail probability   z
90%                0.05               1.645
95%                0.025              1.96
99%                0.005              2.576
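The critical values in the table can be checked with Python's standard library. This is a minimal sketch: the helper name `z_critical` is an assumption, and it relies on `statistics.NormalDist.inv_cdf`, which returns the standard normal quantile for a given cumulative probability.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    tail = (1 - confidence) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%}: tail = {(1 - c) / 2:.3f}, z = {z_critical(c):.3f}")
```

The tail probability is half of what the confidence level leaves over, because the interval is two-sided.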
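23Standard Error: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.5/\sqrt{25} = 0.1$ mg
C = 95%: z = 1.96; C = 99%: z = 2.58
For C = 95%:
What is the margin of error? $m = 1.96 \times 0.1 = 0.196$ mg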
24Step 6: Interpret the results. We are 95%
confident that the mean tablet potency lies
somewhere in the obtained interval.
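Steps 4 to 6 for the tablet example can be sketched in Python as follows. The values 0.5 mg, n = 25 and z = 1.96 come from the slides. The Table 1 measurements are not reproduced in this text, so the sample mean below is a hypothetical placeholder, and the function name `z_interval` is likewise an assumption.

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z: float):
    """Return (lower, upper, margin) for a mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# HYPOTHETICAL sample mean: substitute the mean of the 25 assays in Table 1.
x_bar_placeholder = 10.0

lower, upper, margin = z_interval(x_bar_placeholder, sigma=0.5, n=25, z=1.96)
print(f"margin of error = {margin:.3f} mg")
print(f"95% CI = ({lower:.3f}, {upper:.3f}) mg")
```

Only the centre of the interval depends on the placeholder. The margin of error depends on the critical value, the standard deviation and the sample size alone.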
25GRAPHICAL DESCRIPTION OF C.I. (simulation)
Hence we have calculated an interval with a
method that catches the true mean
95% of the time in the long run.
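The long-run behaviour can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal, the true mean of 10.0 is a hypothetical value chosen for the simulation, and the function name `coverage` is illustrative. The values 0.5, 25 and 1.96 match the tablet example.

```python
import math
import random


def coverage(mu: float, sigma: float, n: int, z: float,
             trials: int, seed: int = 0) -> float:
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)            # fixed seed for repeatable runs
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials


rate = coverage(mu=10.0, sigma=0.5, n=25, z=1.96, trials=10_000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

Any single interval either contains the true mean or it does not. The 95% describes the method across many repeated samples.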
26How Do Confidence Intervals Behave?
- What can the user control?
  - confidence level
  - sample size
- What are the desirable properties of the C.I.?
  - High confidence (this would imply that the
    method gives the correct answer almost always)
  - Small margin of error (this would mean that the
    C.I. has pinned down the parameter quite precisely)
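The trade-off between the two controls can be sketched numerically. The helper name `margin_of_error` is an assumption, as is the choice of the values tried. The value 0.5 for the population standard deviation is borrowed from the tablet example.

```python
import math
from statistics import NormalDist


def margin_of_error(confidence: float, sigma: float, n: int) -> float:
    """Margin of error z * sigma / sqrt(n) for a two-sided z-interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)


sigma = 0.5  # known population SD, as in the tablet example

# Higher confidence widens the interval (n fixed at 25).
for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}, n = 25: m = {margin_of_error(c, sigma, 25):.3f}")

# A larger sample narrows the interval (confidence fixed at 95%).
for n in (25, 100, 400):
    print(f"C = 95%, n = {n}: m = {margin_of_error(0.95, sigma, n):.3f}")
```

Raising the confidence level alone costs precision. Only a larger sample improves precision without lowering confidence.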
277. CHOOSING THE SAMPLE SIZE
OBJECTIVE: You want to have both a high
confidence level and a small margin of error. How
can this be done? By taking a large enough
sample. So if $m$ is your specified margin of
error and $z$ is the critical value for your
desired confidence level, the formula for the
sample size is
$n = \left( \dfrac{z \, \sigma}{m} \right)^2$
In our example 2: What is the sample size
required to decrease the margin of error to 0.1?
$n = (z \sigma / m)^2 = (1.96 \times 0.5 / 0.1)^2 = 96.04$, so about 96
tablets; rounding up to 97 guarantees a margin of error no larger than 0.1.
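The sample size formula can be sketched in Python. The function name `required_sample_size` is an assumption, and so is the choice to round up: the raw value here is 96.04, the nearest whole number is 96, and rounding up to 97 is the conservative option that keeps the margin of error at or below the target.

```python
import math


def required_sample_size(z: float, sigma: float, m: float) -> int:
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most m."""
    if m <= 0:
        raise ValueError("margin of error m must be positive")
    return math.ceil((z * sigma / m) ** 2)


n_needed = required_sample_size(z=1.96, sigma=0.5, m=0.1)
print(n_needed)

# Check the achieved margin of error at n_needed and at one fewer.
print(1.96 * 0.5 / math.sqrt(n_needed))
print(1.96 * 0.5 / math.sqrt(n_needed - 1))
```

Halving the margin of error from 0.196 to about 0.1 takes roughly four times the original 25 tablets, because n grows with the square of z times sigma over m.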
28Cautions When Computing Confidence Intervals
- The data must be an SRS (simple random sample) from the population.
- If the design used is more complex than an SRS,
the formula used might not be correct.
- If data are haphazardly collected, there is
no way to obtain valid inference.
- Since the sample mean is strongly affected by
outliers, outliers can have a large effect on the
confidence interval as well.
- When the sample size is small and the population
is not approximately normal, the true confidence level
might be different from the value C used in
computing the interval.
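The caution about outliers can be illustrated with the platelet data from slide 11. In this sketch the value 1010 is a hypothetical recording error (101 typed as 1010), introduced only for illustration. The population standard deviation of 50 is the value assumed on slide 12, and the function name `z_interval_from_data` is an assumption.

```python
import math
from statistics import mean

SIGMA = 50    # population SD assumed known, as on slide 12
Z_95 = 1.96   # critical value for 95% confidence


def z_interval_from_data(data, sigma=SIGMA, z=Z_95):
    """95% z-interval (lower, upper) for the mean of `data`, sigma known."""
    centre = mean(data)
    margin = z * sigma / math.sqrt(len(data))
    return centre - margin, centre + margin


clean = [125, 170, 144, 101]          # data from slide 11
with_outlier = [125, 170, 144, 1010]  # HYPOTHETICAL typo: 101 entered as 1010

print("clean:       ", z_interval_from_data(clean))
print("with outlier:", z_interval_from_data(with_outlier))
```

With the population standard deviation fixed, the width of the interval is unchanged. A single bad value moves the centre so far that the two intervals no longer overlap.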