Title: CONFIDENCE INTERVALS
1CONFIDENCE INTERVALS
2Outline
- Return / Review Exams
- Confidence Intervals when σ is known
- Margin of Error
- Break
- Confidence Intervals when σ is unknown
- In class project
3A flighty example
- Say a biologist wants to determine the home range
of a particular species of bird.
- From the whole population of this species, the
biologist collects a sample of 100 birds and
attaches a radio transmitter to the leg of each
bird.
- After monitoring the position of the birds for 1
month, the biologist finds that the mean home
range of the sample is 23.1 km.
- Is this the exact value for the population?
NO!
4Point Estimation
- The question we can ask is: how close is
23.1 to the population mean?
- This value (23.1) is called the point estimate;
it is the value we will use to estimate the
population value.
- This is the "best guess" of the value of the
population parameter, given your sample
information.
- The way we will answer this question is to
generate a range of values within which we can be
reasonably confident that the population value
falls.
- This process is called interval estimation.
- The resulting range of values is called a
confidence interval.
5Point Estimate Example
- The signing bonuses for 30 new players in the NFL
are used to estimate the mean bonus for all new
players. The sample mean is 130,000 with a
standard deviation of 25,500. What is the point
estimate of the mean signing bonus for all new
NFL players?
- Answer: the sample mean is 130,000, so this is
your point estimate of the population mean, μ
6Returning to our flighty example
- The biologist has a point estimate of 23.1 km,
and can guess that the true population mean is
probably between 20 and 26.
- While the biologist may be confident that the
true value is between 20 and 26, she may be even
more confident with a range of 15-30.
- Thus, the wider the interval estimate, the
greater the confidence that we have that the
interval contains the population mean.
7Confidence Intervals
- Now, it is possible and preferable to be more
quantitative about how we go about determining
the interval (rather than just guessing).
- This is what we'll do for the rest of class
- So, a confidence interval provides a range of
numbers along with the percentage confidence that
the parameter lies within
- A 95% confidence interval means that 95% of
similarly constructed intervals will contain the
population parameter
- Note also that although there are many different
CIs that we could construct, in practice the 90%,
95%, and 99% CIs are used most often.
8Computing Confidence Intervals Sampling Error
- Last class, we saw that whenever we use a sample
to estimate a population characteristic, we are
going to have some amount of sampling error.
- Sometimes, we will overestimate the true value
- Sometimes, we will underestimate the true value
[Figure: estimates from Sample 1, Sample 2, Sample 3, and Sample 4 scattered around the True Value]
9Standard Error
- But, we also know (from the central limit
theorem) that if our sample size is sufficiently
large (n > 30), the sampling distribution of the
sample means will be approximately normal.
- Thus, about 95% of the sample means will fall
within 2 standard deviations (of the sampling
distribution) of the mean of the population
- The standard deviation of the sample means =
Standard Error, or the degree to which particular
means of samples are typically in error as
estimates of the mean of the population
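The claim above can be checked with a small simulation. What follows is only a sketch under stated assumptions: the skewed (exponential) population, its mean of 10, the sample size of 50, and the number of repetitions are all hypothetical choices made for illustration, not values from this lecture.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: exponential, so mean = sd = 10 (deliberately not normal).
population_mean = 10.0
population_sd = 10.0
n = 50              # sample size, larger than 30
num_samples = 5000  # how many samples we draw

# Draw many samples and keep the mean of each one.
sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1 / population_mean) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

# Standard error predicted by theory: sigma / sqrt(n).
standard_error = population_sd / math.sqrt(n)

# Standard deviation of the simulated sample means.
observed_sd = statistics.stdev(sample_means)

# Share of sample means that land within 2 standard errors of the population mean.
within_2_se = sum(
    abs(m - population_mean) <= 2 * standard_error for m in sample_means
) / num_samples

print(f"theoretical standard error:   {standard_error:.3f}")
print(f"sd of simulated sample means: {observed_sd:.3f}")
print(f"share within 2 SE:            {within_2_se:.3f}")
```

The two standard deviation figures should come out close to each other, and the share within 2 standard errors should be near 95%, even though the population itself is not normal.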
10What does this mean?
- When we collect a sample of data, we can be
reasonably certain that the true population value
falls within 2 standard errors (plus or
minus) of our sample mean!
- Bottom line: if you take X̄ and add and subtract
about 2 standard errors, this is a 95%
confidence interval
11Elements of a Confidence Interval
[Figure: a confidence interval drawn around the sample statistic (point estimate), running from the lower confidence limit to the upper confidence limit]
12Calculating confidence intervals
- Two methods:
- When the standard deviation of the population is
known
- When the standard deviation of the population is
unknown
- When will we already know the standard deviation
of the population?
- Well, most of the time this value is unknown
- However, sometimes a previous study (or group of
studies) has established the standard deviation
for a population.
- What else do we need?
- Set level of confidence
13Level of Confidence and Alpha level
- What determines the width of our confidence
interval?
- We do! Before we start calculating the CI, we
must determine our level of confidence
- This is another way of saying "How comfortable am
I with being wrong?"
- Alpha (α) indicates the probability that our
confidence interval does not include the
population parameter
- The level of confidence we set determines our
alpha
- Level of Confidence = 1 - α
- Thus, if we set a confidence level of 95%, our
alpha will be (1 - 0.95) = 0.05, or 5%
- So, if we have a 95% CI, this means that we will
have a 5% chance that our CI will not include the
true value.
14Confidence Interval Procedure when σ is known.
15Dividing α in half
- Note that when we set alpha, we need to consider
the fact that sometimes our point estimate will
underestimate the population value and sometimes
will overestimate the true value.
- So this means that for a 95% CI, we want a 2.5%
chance of making an overestimation and a 2.5%
chance of making an underestimation.
16Common Levels of Confidence
- Commonly used confidence levels are 90%, 95%, and
99%
Confidence Level   Confidence Coefficient (1 - α)   Normal z value (Z_{α/2})
80%                0.80                             1.28
90%                0.90                             1.645
95%                0.95                             1.96
98%                0.98                             2.33
99%                0.99                             2.58
99.8%              0.998                            3.09
99.9%              0.999                            3.29
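The z value for any confidence level can be computed rather than looked up. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is made up for this illustration. It splits alpha into two equal tails and asks the standard normal curve for the cutoff that leaves α/2 in the upper tail.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided critical z value for a confidence level such as 0.95."""
    alpha = 1 - confidence_level
    # Leave alpha/2 in each tail, so we need the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"confidence {level:.1%}  ->  z = {z_critical(level):.3f}")
```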
17Confidence Depends on Interval (z)
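$\bar{X} = \mu \pm Z\sigma_{\bar{X}}$
[Figure: sampling distribution of $\bar{X}$, centered at $\mu$, with three nested intervals]
- 90% CI: $\mu - 1.65\sigma_{\bar{X}}$ to $\mu + 1.65\sigma_{\bar{X}}$
- 95% CI: $\mu - 1.96\sigma_{\bar{X}}$ to $\mu + 1.96\sigma_{\bar{X}}$
- 99% CI: $\mu - 2.575\sigma_{\bar{X}}$ to $\mu + 2.575\sigma_{\bar{X}}$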
18Guidelines for using this procedure
- For small sample sizes (n < 15), the z curve can
only be used when the variable under consideration
is normally distributed
- For moderate sample sizes (15 < n < 30), the z
curve can be used unless the data contain extreme
values or the sample is not normally distributed
- For large sample sizes (n > 30), the z curve can
be used no matter what the distribution of the
variable under consideration.
19Factors Affecting Interval Width
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$
- 1. Data Dispersion
- Measured by $\sigma$
- 2. Sample Size
- $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
- 3. Level of Confidence ($1 - \alpha$)
- Affects Z
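Each of the three factors can be varied one at a time to see its effect on the interval width. This is a rough sketch: the function name `interval_width` and the particular values tried (doubling σ, quadrupling n, moving from 95% to 99%) are illustrative assumptions, with σ = 4.7 and n = 100 borrowed from the bird example as a baseline.

```python
import math
from statistics import NormalDist

def interval_width(sigma, n, confidence_level):
    """Full width of a z-based confidence interval: 2 * Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sigma / math.sqrt(n)

baseline = interval_width(sigma=4.7, n=100, confidence_level=0.95)

# 1. More dispersion (sigma doubled): the interval gets wider.
more_spread = interval_width(sigma=9.4, n=100, confidence_level=0.95)

# 2. Larger sample (n quadrupled): the interval gets narrower.
larger_sample = interval_width(sigma=4.7, n=400, confidence_level=0.95)

# 3. Higher confidence (99% instead of 95%): the interval gets wider.
more_confident = interval_width(sigma=4.7, n=100, confidence_level=0.99)

print(f"baseline width:        {baseline:.3f}")
print(f"sigma doubled:         {more_spread:.3f}")
print(f"n quadrupled:          {larger_sample:.3f}")
print(f"99% instead of 95%:    {more_confident:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the width.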
20Let's return to our flighty example
- Our biologist found a point estimate of 23.1 km.
- Say the biologist knows that the standard
deviation of the population of birds is 4.7 km.
- To find the 95% CI:
- $23.1 \pm 1.96 \, (4.7/\sqrt{100})$
- $23.1 \pm 0.9212$
- 95% CI: 22.18 to 24.02
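The same arithmetic can be written as a short function. This is a minimal sketch; the name `z_interval` is invented for this example, and it assumes σ is known and the z curve is appropriate.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence_level=0.95):
    """Confidence interval for a mean when the population sd (sigma) is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / math.sqrt(n)  # Z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Values from the bird example: mean 23.1 km, sigma 4.7 km, n = 100.
lower, upper = z_interval(sample_mean=23.1, sigma=4.7, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```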
21More examples on board
22Does interval estimation work?
- The best way to test whether this works is to run
a simulation.
- http://www.ruf.rice.edu/%7Elane/stat_sim/conf_interval/index.html
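A simulation of this kind can also be sketched in a few lines of Python. The setup is an assumption for illustration: we pretend the true population is normal with mean 23.1 and standard deviation 4.7 (the numbers from the bird example), draw many samples of 100, build a 95% interval from each, and count how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical "true" population values, known only because we are simulating.
mu, sigma = 23.1, 4.7
n = 100
trials = 2000

z = NormalDist().inv_cdf(0.975)    # two-sided 95% critical value
margin = z * sigma / math.sqrt(n)  # same margin for every sample, since sigma is known

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"{hits} of {trials} intervals contained the true mean ({coverage:.1%})")
```

The share of intervals that capture the true mean should land close to 95%, which is what "95% confidence" refers to.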
23Margin of Error
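The part of the equation that is added to and
subtracted from the point estimate,
$Z_{\alpha/2} \times \sigma / \sqrt{n}$, is the margin
of error, or E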
24Margin of Error
- Sometimes, we will have in mind a specific margin
of error before we start our study.
- For example, some political pollsters say "I want
to determine the job approval for candidate A
with a margin of error of some value"
- When we have a predetermined margin of error, we
can determine the sample size needed to get the
estimate of the population value within that
margin
25Margin of Error Equation
- $E = (Z_{\alpha/2} \times \sigma) / \sqrt{n}$
- Through some algebra, we get
- $n = \left( (Z_{\alpha/2} \times \sigma) / E \right)^2$
- Let's do an example
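As one possible illustration (not the example worked in class), the sample size calculation can be sketched in Python. The function name `required_sample_size` and the target margin of 0.5 km are hypothetical; σ = 4.7 km is borrowed from the bird example. The code solves the margin of error equation for n, which gives n = (Z σ / E)^2, and rounds up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, sigma, confidence_level=0.95):
    """Smallest n for which Z * sigma / sqrt(n) is at most the margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # n = (Z * sigma / E)^2, rounded up to the next whole observation.
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical target: estimate the mean home range to within 0.5 km at 95% confidence.
n_needed = required_sample_size(margin_of_error=0.5, sigma=4.7)
print(f"sample size needed: {n_needed}")
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.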
26Margin of Error Example
27Class project