Title: Confidence Intervals with Means
What is the purpose of a confidence interval?
- To estimate an unknown population parameter
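Formula
$$\bar{x} \pm z^*\left(\frac{\sigma}{\sqrt{n}}\right)$$
- statistic: $\bar{x}$
- Critical value: $z^*$
- Standard deviation of statistic: $\sigma/\sqrt{n}$
- Margin of error: $z^*\left(\sigma/\sqrt{n}\right)$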
- In a randomized comparative experiment on the
effects of calcium on blood pressure, researchers
divided 54 healthy, white males at random into
two groups, taking calcium or placebo. The paper
reports a mean seated systolic blood pressure of
114.9 with standard deviation of 9.3 for the
placebo group. Assume systolic blood pressure is
normally distributed.
- Can you find a z-interval for this problem? Why
or why not?
Student's t-distribution
- Continuous distribution
- Unimodal, symmetrical, bell-shaped density curve
- Above the horizontal axis
- Area under the curve equals 1
- Based on degrees of freedom
- df = n - 1
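The properties listed above can be checked numerically. This is a minimal sketch using only the Python standard library; the helper names `t_pdf` and `area` are made up for illustration, and df = 4 is an arbitrary choice (a sample of n = 5).

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area(df, lo=-50.0, hi=50.0, steps=100_000):
    """Trapezoid-rule area under the density between lo and hi."""
    h = (hi - lo) / steps
    total = 0.5 * (t_pdf(lo, df) + t_pdf(hi, df))
    for i in range(1, steps):
        total += t_pdf(lo + i * h, df)
    return total * h

df = 4  # df = n - 1 for a sample of n = 5 (illustrative choice)

print(t_pdf(1.5, df) == t_pdf(-1.5, df))  # symmetric about 0
print(t_pdf(0.0, df) > t_pdf(1.0, df))    # single peak at 0
print(area(df))                           # very close to 1
```

The area is computed over a wide but finite range, so it falls just short of 1 by the tiny amount left in the tails.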
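Formula
$$\bar{x} \pm t^*\left(\frac{s}{\sqrt{n}}\right)$$
- statistic: $\bar{x}$
- Critical value: $t^*$
- Standard deviation of statistic: $s/\sqrt{n}$ (the standard error, when you substitute s for σ)
- Margin of error: $t^*\left(s/\sqrt{n}\right)$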
How to find t*
- Use Table B for t distributions
- Look up confidence level at bottom & df on the sides
- df = n - 1
- Find these t*
  - 90% confidence when n = 5: t* = 2.132
  - 95% confidence when n = 15: t* = 2.145
Can also use invT on the calculator! You need the upper t* value:
for 90% confidence, 5% is above t*, so 95% is below. Use invT(p, df)
with p = the area below t*.
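The table lookup above can be reproduced in code. This is a sketch under stated assumptions: it uses only the Python standard library, and `t_cdf`, `inv_t`, and `t_star` are hypothetical helper names written to mimic the calculator's invT(p, df). A statistics library such as SciPy would normally be used instead (`scipy.stats.t.ppf`).

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, via Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = x / steps                      # integrate from 0 to x (h < 0 if x < 0)
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3         # half the area lies below 0

def inv_t(p, df, lo=-100.0, hi=100.0):
    """Value t with P(T <= t) = p, found by bisection (like invT(p, df))."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_star(conf_level, n):
    """Upper critical value for a confidence level and sample size n."""
    area_below = 1 - (1 - conf_level) / 2   # e.g. 90% confidence -> 0.95 below
    return inv_t(area_below, n - 1)

print(round(t_star(0.90, 5), 3))    # slide value: 2.132
print(round(t_star(0.95, 15), 3))   # slide value: 2.145
```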
Steps for doing a confidence interval
- Assumptions
- Calculate the interval
- Write a statement about the interval in the
context of the problem.
Statement (memorize!!)
- We are ________% confident that the true mean
(context) is between ______ and ______.
Assumptions for t-inference
- Have an SRS from population (or randomly assigned
treatments)
- σ unknown
- Normal (or approx. normal) distribution
  - Given
  - Large sample size
  - Check graph of data
Use only one of these methods to check normality.
- Ex. 1) Find a 95% confidence interval for the
true mean systolic blood pressure of the placebo
group.
- Assumptions
  - Have randomly assigned males to treatment
  - Systolic blood pressure is normally distributed
(given)
  - σ is unknown
- We are 95% confident that the true mean systolic
blood pressure is between 111.22 and 118.58.
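The interval in Ex. 1 can be rebuilt from the numbers on the slides. This is a sketch with two assumptions the slides do not state: that the 54 subjects were split evenly, so the placebo group has n = 27, and that t* = 2.056 is the Table B value for 95% confidence with df = 26. Both are consistent with the reported interval.

```python
import math

# Values quoted on the slide for the placebo group.
mean = 114.9   # sample mean seated systolic blood pressure
s = 9.3        # sample standard deviation

# Assumptions (not stated on the slide):
n = 27          # 54 subjects split evenly into two groups
t_star = 2.056  # t table value, 95% confidence, df = n - 1 = 26

standard_error = s / math.sqrt(n)          # s / sqrt(n)
margin_of_error = t_star * standard_error  # t* times the standard error
lower = mean - margin_of_error
upper = mean + margin_of_error

print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # slide reports 111.22 to 118.58
```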
Find a sample size
- If a certain margin of error is wanted, then to
find the sample size necessary for that margin of
error use
$$m = z^*\left(\frac{\sigma}{\sqrt{n}}\right)$$
and solve for n.
Always round up to the nearest person!
Ex 4) The heights of PWSH male students are
normally distributed with σ = 2.5 inches. How
large a sample is necessary to be accurate within
.75 inches with a 95% confidence interval?
n = 43
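The arithmetic behind n = 43 can be sketched as follows. It solves the margin-of-error formula for n, giving n = (z*σ/m)², and uses `statistics.NormalDist` from the Python standard library for z*; the variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

sigma = 2.5        # population standard deviation, inches (given)
margin = 0.75      # desired margin of error, inches (given)
conf_level = 0.95  # confidence level (given)

# Upper critical value z*: for 95% confidence, 97.5% of the area is below it.
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

# Solve m = z* * sigma / sqrt(n) for n, then round UP to a whole person.
n_exact = (z_star * sigma / margin) ** 2
n = math.ceil(n_exact)

print(round(z_star, 2), round(n_exact, 2), n)
```

Rounding up matters: rounding 42.68 down to 42 would give a margin of error slightly larger than .75 inches.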
Hypothesis Tests: One-Sample Means
How can I tell if they really are underweight?
A government agency has received numerous
complaints that a particular restaurant has been
selling underweight hamburgers. The restaurant
advertises that its patties are a quarter
pound (4 ounces).
A hypothesis test will allow me to decide if the
claim is true or not!
Steps for doing a hypothesis test
- Assumptions
- Write hypotheses & define parameter
- Calculate the test statistic & p-value
- Write a statement in the context of the problem.
H0: μ = 12 vs Ha: μ (<, >, or ≠) 12
Since the p-value < (>) α, I reject (fail to
reject) the H0. There is (is not) sufficient
evidence to suggest that Ha (in context).
Formulas
For a mean μ with σ unknown:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \qquad df = n - 1$$
Calculating p-values
- For z-test statistic
  - Use normalcdf(lb, rb)
  - using standard normal curve
- For t-test statistic
  - Use tcdf(lb, rb, df)
Draw & shade a curve & calculate the p-value
- 1) right-tail test, t = 1.6, n = 20
  P-value = .0630
- 2) two-tail test, t = 2.3, n = 25
  P-value = (.0152)2 = .0304
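The two p-values above can be checked with a small stand-in for the calculator's tcdf(lb, rb, df). This is a sketch using only the Python standard library; the function `tcdf` below is a hypothetical helper, not the calculator's implementation. Because the curve is symmetric with total area 1, a right-tail area is computed as 0.5 minus the area between 0 and t.

```python
import math

def tcdf(lb, rb, df, steps=2000):
    """Area under the t density between lb and rb (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = (rb - lb) / steps
    total = pdf(lb) + pdf(rb)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(lb + i * h)
    return total * h / 3

def right_tail(t, df):
    """P(T >= t) for t >= 0, using symmetry about 0."""
    return 0.5 - tcdf(0.0, t, df)

# 1) right-tail test, t = 1.6, n = 20, so df = 19
p_right = right_tail(1.6, 19)

# 2) two-tail test, t = 2.3, n = 25, so df = 24: double the one-tail area
p_two = 2 * right_tail(2.3, 24)

print(round(p_right, 4), round(p_two, 4))  # slide values: .0630 and .0304
```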
Example 1: Bottles of a popular cola are supposed
to contain 300 mL of cola. There is some
variation from bottle to bottle. An inspector,
who suspects that the bottler is under-filling,
measures the contents of six randomly selected
bottles. Is there sufficient evidence that the
bottler is under-filling the bottles?
Use α = .1
299.4  297.7  298.9  300.2  297  301
SRS?
Normal? How do you know?
- Since the boxplot is approximately symmetrical
with no outliers, the sampling distribution is
approximately normally distributed
Do you know σ?
What are your hypothesis statements? Is there a
key word?
H0: μ = 300
Ha: μ < 300
where μ is the true mean amount of cola in bottles
Plug values into formula.
p-value = .0880
α = .1
Compare your p-value to α & make decision.
Since p-value < α, I reject the null hypothesis.
Write conclusion in context in terms of Ha.
There is sufficient evidence to suggest that the
true mean cola in the bottles is less than 300 mL.
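The "plug values into formula" step for the cola data can be sketched in code. This uses only the Python standard library; `t_cdf` is a hypothetical helper standing in for the calculator's tcdf, and the variable names are chosen for illustration.

```python
import math
import statistics

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, via Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = x / steps                      # integrate from 0 to x (h < 0 if x < 0)
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

contents = [299.4, 297.7, 298.9, 300.2, 297, 301]  # mL, from the slide
mu_0 = 300    # H0: mu = 300
alpha = 0.1

n = len(contents)
x_bar = statistics.mean(contents)
s = statistics.stdev(contents)               # sample standard deviation
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, n - 1)               # left tail, since Ha: mu < 300

print(round(t_stat, 3), round(p_value, 4))   # slide p-value: .0880
print("reject H0" if p_value < alpha else "fail to reject H0")
```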
Matched Pairs Test
- A special type of t-inference
Matched Pairs: two forms
Form 1:
- Pair individuals by certain characteristics
- Randomly select treatment for individual A
- Individual B is assigned to other treatment
- Assignment of B is dependent on assignment of A
Form 2:
- Individual persons or items receive both
treatments
- Order of treatments is randomly assigned, or before &
after measurements are taken
- The two measures are dependent on the individual
Is this an example of matched pairs?
- 1) A college wants to see if there's a difference
in the time it took last year's class to find a
job after graduation and the time it took the
class from five years ago to find work after
graduation. Researchers take a random sample
from both classes and measure the number of days
between graduation and first day of employment.
No, there is no pairing of individuals; you have
two independent samples.
Is this an example of matched pairs?
- 2) In a taste test, a researcher asks people in a
random sample to taste a certain brand of spring
water and rate it. Another random sample of
people is asked to taste a different brand
of water and rate it. The researcher wants to
compare these samples.
No, there is no pairing of individuals; you have
two independent samples. If you had the
same people taste both brands in random order,
then it would be an example of matched pairs.
Is this an example of matched pairs?
- 3) A pharmaceutical company wants to test its new
weight-loss drug. Before giving the drug to a
random sample, company researchers take a weight
measurement on each person. After a month
of using the drug, each person's weight is
measured again.
Yes, you have two measurements that are dependent
on each individual.
A whale-watching company noticed that many
customers wanted to know whether it was better to
book an excursion in the morning or the
afternoon. To test this question, the company
collected the following data on 15 randomly
selected days over the past month. (Note:
days were not consecutive.)
Day          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Morning      8   9   7   9  10  13  10   8   2   5   7   7   6   8   7
Afternoon    8  10   9   8   9  11   8  10   4   7   8   9   6   6   9
Since you have two values for each day, they are
dependent on the day, making this data matched
pairs.
You may subtract either way; just be careful
when writing Ha.
First, you must find the differences for each day.
Day          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Differences  0  -1  -2   1   1   2   2  -2  -2  -2  -1  -2   0   2  -2
I subtracted Morning - Afternoon. You could
subtract the other way!
- Assumptions
  - Have an SRS of days for whale-watching
  - σ unknown
  - Since the normal probability plot is
approximately linear, the distribution of
differences is approximately normal.
You need to state assumptions using the
differences!
Notice the granularity in this plot; it still
displays a nice linear relationship!
Is there sufficient evidence that more whales are
sighted in the afternoon?
Be careful writing your Ha! Think about how you
subtracted: M - A. If afternoon is more, should the
differences be + or -? Don't look at numbers!!!!
H0: μD = 0
Ha: μD < 0
where μD is the true mean difference in whale
sightings, morning minus afternoon.
If you subtract afternoon - morning, then Ha: μD > 0.
Notice we used μD for differences & it equals 0
since the null should be that there is NO
difference.
Finishing the hypothesis test:
In your calculator, perform a t-test using the
differences (L3).
Since p-value > α, I fail to reject H0. There is insufficient
evidence to suggest that more whales are sighted
in the afternoon than in the morning.
Notice that if you subtracted A - M, then your test
statistic would be t = +.945, but the p-value would be the
same.
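The matched-pairs test statistic can be sketched from the whale data. The code takes the differences (Morning - Afternoon) and runs a one-sample t calculation on them, using only the Python standard library; the variable names are chosen for illustration. The p-value would then come from tcdf with df = 14.

```python
import math
import statistics

# Sightings on the 15 sampled days, from the slide.
morning   = [8, 9, 7, 9, 10, 13, 10, 8, 2, 5, 7, 7, 6, 8, 7]
afternoon = [8, 10, 9, 8, 9, 11, 8, 10, 4, 7, 8, 9, 6, 6, 9]

# Matched pairs: work with the per-day differences, Morning - Afternoon.
diffs = [m - a for m, a in zip(morning, afternoon)]

n = len(diffs)
d_bar = statistics.mean(diffs)    # mean difference
s_d = statistics.stdev(diffs)     # sample standard deviation of differences
df = n - 1

# H0: mu_D = 0, so the hypothesized mean difference is 0.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

# Subtracting the other way (Afternoon - Morning) only flips the sign.
t_stat_reversed = -t_stat

print(diffs)
print(round(t_stat, 3), round(t_stat_reversed, 3), df)
```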
How could I increase the power of this test?