Title: APSTAT UNIT 4A INFERENCE PART 1
1 APSTAT UNIT 4A: INFERENCE PART 1
2 APSTAT Chapter 18: Sampling Distributions and Sample Means
3 Let's Just DO IT!!!!
Proportion of correct answers on last AP Stat Exam
Regular Old Distribution
[Histogram bins: .55-.59 .60-.64 .65-.69 .70-.74 .75-.79 .80-.84 .85-.89]
4 Let's Just DO IT!!!!
Proportion of correct answers on last AP Stat Exam
- Now, everyone take a 5-person random sample
- Do randint(1,13,5) to choose your subjects
- Add their scores and divide by 5 to get x-bar (sample mean)
- Now we will do a distribution of our sample means: a SAMPLING DISTRIBUTION!!!!!
5 Let's Just DO IT!!!!
Proportion of correct answers on last AP Stat Exam
Class Sample Means
Sampling Distribution
[Histogram bins: .55-.59 .60-.64 .65-.69 .70-.74 .75-.79 .80-.84 .85-.89]
6 We Just DID IT!!!!
- Give me at least 2 things that are different between the regular distribution and the sampling distribution
7 Bias
- Unbiased Statistic
- Mean of the sampling distribution should equal the true population mean.
- How did ours look earlier? The true mean of the population was about 72.3%.
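8 Sample Proportions
- Mean of Sample Proportion
- In the last section, the mean of a binomial count was np
- Proportion is just outcome divided by n, so mean of p-hat = np/n = p
9 Standard Deviation of Sample Proportion
- In the last section, the standard deviation of a binomial count was sqrt(np(1-p))
- Proportion is just outcome divided by n, so throw down a little algebra: SD of p-hat = sqrt(np(1-p))/n = sqrt(p(1-p)/n)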
10 Now try it: Sample Proportions
- Find mean and standard deviation for
- 60 samples of 10 coin flips, p = .5
- 60 samples of 50 coin flips, p = .5
- What does this say about variability in regards to sample size?
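Here is a minimal sketch of the coin-flip exercise, assuming the sampling-distribution formulas mean of p-hat = p and SD of p-hat = sqrt(p(1-p)/n). The helper name is made up for illustration and is not part of the lesson.

```python
import math

def sample_proportion_mean_sd(p, n):
    """Hypothetical helper: mean and SD of the sampling distribution of p-hat.

    Assumes mean = p and SD = sqrt(p(1-p)/n).
    """
    return p, math.sqrt(p * (1 - p) / n)

# The two cases from the slide: 10 flips and 50 flips of a fair coin
for n in (10, 50):
    mean, sd = sample_proportion_mean_sd(0.5, n)
    print(f"n = {n}: mean = {mean:.2f}, SD = {sd:.4f}")
```

The mean stays at .5 for both, while the SD shrinks as the sample size grows.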
11 2 Rules of Thumb - Assumptions/Conditions
- Population size large enough
- Population should be at least 10 times the sample size - 10% Condition
- Normalness
- n should be large enough to produce an approximately normal sampling distribution.
- np > 10 AND n(1-p) > 10
12 Try them out
- A San Jose firm decides to sample 25 residents to determine if they oppose off-shore oil drilling. They predict that P(oppose) = 0.4
- Large enough population?
- Normalness?
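A rough sketch of checking both rules of thumb for the San Jose sample. The function name and the population figure of 1,000,000 are assumptions for illustration only (the slide gives no population size). Note that np comes out to exactly 10, a borderline case: it passes the "at least 10" version of the rule but not a strict "greater than 10".

```python
def check_conditions(n, p, population_size=None):
    """Hypothetical helper: the quantities the two rules of thumb look at."""
    successes = round(n * p, 9)        # expected successes, np
    failures = round(n * (1 - p), 9)   # expected failures, n(1-p)
    result = {
        "np": successes,
        "n(1-p)": failures,
        # "at least 10" version of the Normalness rule
        "normal_ok_at_least_10": successes >= 10 and failures >= 10,
    }
    if population_size is not None:
        # population at least 10 times the sample size
        result["ten_percent_ok"] = population_size >= 10 * n
    return result

# population_size is a stand-in value, not taken from the slides
print(check_conditions(25, 0.4, population_size=1_000_000))
```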
13 Example.
- If the true proportion of students who pass the APStat exam is .64, what is the probability that a random sample of 100 students will have at least 70 students pass?
14 p = .64, n = 100, 70 or more
- Check Conditions - Briefly Explain
- 10% Condition
- np and n(1-p) > 10
- Draw Picture (Find SD too)
- Find P-Value
- Conclusion
15 Same Problem Data
- Within what range would we expect to find 95% of sample proportions from samples of size 100?
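One way to work both questions for p = .64 and n = 100, sketched with Python's standard library. It assumes the Normal model applies (no continuity correction) and uses the Empirical Rule's 2 SD for the 95% range. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.64, 100
sd = math.sqrt(p * (1 - p) / n)              # SD of p-hat
z = (0.70 - p) / sd                          # z-score for 70 of 100 passing
prob_at_least_70 = 1 - NormalDist().cdf(z)   # area to the right of z

# Empirical Rule: about 95% of sample proportions fall within 2 SD of p
low, high = p - 2 * sd, p + 2 * sd

print(f"SD = {sd:.3f}, z = {z:.2f}, P(p-hat >= .70) = {prob_at_least_70:.4f}")
print(f"About 95% of sample proportions fall between {low:.3f} and {high:.3f}")
```

Under these assumptions the SD is .048, so 70 passing sits 1.25 SD above the mean, and the 95% range runs from about .544 to .736.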
16 Sample Means
- Take a whole bunch of samples and find the means
- Why sample means?
- Remember our sample of class scores?
- Less variable
- More normal
17 Mean and Standard Deviation of X-BAR
- If we take all the possible samples from a population, the mean of the sampling distribution will equal the population mean (if the population mean was accurate in the first place, but more on that later)
- Mean of x-bar = μ
18 Mean and Standard Deviation of X-BAR
- Standard deviation of the sampling distribution of x-bar is
- SD of x-bar = σ/sqrt(n)
19 Let's try it!
- If adult males have height N(68,2), what would be the mean and standard deviation for the sampling distribution of x-bar if
- n = 10
- n = 40
- What happened to the standard deviation when n was quadrupled?
- What would happen to the standard deviation if n was multiplied by 9?
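A small sketch for the height questions, assuming SD of x-bar = σ/sqrt(n). The function name is hypothetical. It also tries n = 90 (9 times the original 10) to show the pattern.

```python
import math

mu, sigma = 68, 2   # adult male heights, N(68, 2)

def sd_of_xbar(sigma, n):
    """Hypothetical helper: SD of the sampling distribution of x-bar."""
    return sigma / math.sqrt(n)

for n in (10, 40, 90):
    print(f"n = {n}: mean of x-bar = {mu}, SD of x-bar = {sd_of_xbar(sigma, n):.4f}")
```

Quadrupling n cuts the SD in half, and multiplying n by 9 cuts it to a third, since the SD shrinks with the square root of n.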
20 CLT: The Central Limit Theorem
- If the population we are sampling from is already normal with N(μ,σ), the sampling distribution will be normal as well with mean μ and standard deviation σ/sqrt(n)
- But what if the population we are sampling from is not normal?
21 Age of Pennies
- Riebhoff has 50 pennies. He subtracted the date on each penny from the current year to obtain the following data
22 Penny Ages
23 Sample Size n = 1
24 Sample Size n = 4 (everyone do 3 SRS)
25 Sample Size n = 8 (everyone do 3 SRS)
26 What happened?
- The distribution got "normaler" as the sample size increased. Cool?
- Central Limit Theorem says that even if a population is not normal, the sampling distribution of the sample mean will approach normality when n is large.
- Allows us to use z-scores and such, even when the larger population is not normally distributed.
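The penny activity can be imitated with a short simulation. This is only a sketch: the 50 ages below are made-up, right-skewed stand-ins (the real penny data is not reproduced here), and the helper names are hypothetical. It draws many SRSs of size 1, 4, and 8 and reports the spread and a simple skewness measure of the sample means.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Made-up stand-in for the 50 penny ages: mostly young, a few old (right-skewed)
penny_ages = [int(random.expovariate(1 / 8)) for _ in range(50)]

def sample_means(population, n, reps=2000):
    """Means of `reps` simple random samples of size n (without replacement)."""
    return [statistics.mean(random.sample(population, n)) for _ in range(reps)]

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

spread = {}
for n in (1, 4, 8):
    means = sample_means(penny_ages, n)
    spread[n] = statistics.stdev(means)
    print(f"n = {n}: SD of sample means = {spread[n]:.2f}, skewness = {skewness(means):.2f}")
```

With a skewed population like this one, the skewness of the sample means should typically drift toward 0 as n grows, while the spread shrinks.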
27 Assumptions/Conditions
- Random Sample - Always describe
- Independence - Describe
- 10% Condition
28 Try it
- If the APSTAT EXAM 2005 had a mean score of 3.2 with a standard deviation of 1.2
- Old Skool - Find the probability that a single student would have a score of 4 or higher?
- New Skool - Find the probability that an SRS of 20 students would have a mean score of 4 or higher?
29 Old Skool - Find the probability that a single student would have a score of 4 or higher?
30 New Skool - Find the probability that an SRS of 20 students would have a mean score of 4 or higher?
- Check Conditions - Briefly Explain
- 10% Condition
- Independence and Random Sample
- Draw Picture (Find SD too)
- Find P-Value
- Conclusion
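A sketch of the Old Skool and New Skool calculations side by side. The Old Skool part treats individual scores as roughly Normal, which is an assumption made here for the calculation (AP scores are really whole numbers from 1 to 5). The New Skool part uses SD of x-bar = σ/sqrt(n).

```python
import math
from statistics import NormalDist

mu, sigma, n = 3.2, 1.2, 20

# Old Skool: one student, using the population SD
z_single = (4 - mu) / sigma
p_single = 1 - NormalDist().cdf(z_single)

# New Skool: the mean of an SRS of 20 students, using sigma / sqrt(n)
sd_xbar = sigma / math.sqrt(n)
z_mean = (4 - mu) / sd_xbar
p_mean = 1 - NormalDist().cdf(z_mean)

print(f"Single student: z = {z_single:.2f}, P = {p_single:.4f}")
print(f"Mean of {n}:     z = {z_mean:.2f}, P = {p_mean:.4f}")
```

A single score of 4 or higher is fairly common, but a sample mean of 4 or higher from 20 students is rare, because sample means vary much less than individuals.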
31 Standard Error
- Sometimes we do not have the population standard deviation. If we have to estimate it from the sample, we call the estimate the Standard Error and roll an SE.
32 APSTAT Chapter 19: Confidence Intervals for Proportions
33 Confidence Interval for a Proportion (aka One-Proportion Z-Interval)
- At Woodside High, 80 students are surveyed and 32% of them had tried marijuana.
- How confident am I that the true proportion of WH students that have tried marijuana is at or near 32%?
- CONFIDENCE INTERVAL!!!
34 The Dealio
- If I do know the population mean
- If I sample, I know the sample mean might be quite different than the population mean
- BUT... that difference is predictable.
- For instance, if N(0.70,0.1) and n = 4
- Mean of sample means = 0.70, SD of sample means = 0.1/sqrt(4) = .05
- We expect 95% of sample means (Empirical Rule) to fall within 2 SD of the mean
- Therefore 95% of sample means will fall between 0.6 and 0.8.
35 Confidence Intervals
- Work in reverse
- (From Woodside High Example) I sampled 80 and got sample p = .32
- I want to know the true population proportion.
- The true population proportion will lie within 2 SD of the sample proportion in 95/100 samples of this size.
- Let's Do It!!!!
36 Do It!
- List what you know
- p-hat = .32, n = 80
- Conditions/Assumptions
- 10% Condition for Independence
- Woodside HS has over 800 students
- np and nq > 10 to use Normal Model
- Both .32 x 80 and .68 x 80 > 10
- Find Standard Error
- SE(p-hat) = sqrt(p-hat x q-hat / n) = sqrt(.32 x .68 / 80) ≈ .052
37 Do It!
- Draw the Picture
- Conclusion
- We are 95% confident that the TRUE proportion of ____________ falls between ____ and ____
38 Do It!
- We can also write confidence intervals in the form
- (estimate) ± (margin of error)
- At 95% confidence, margin of error = 2 x Standard Error
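A quick sketch of the Woodside interval, assuming SE(p-hat) = sqrt(p-hat x q-hat / n) and the Empirical Rule's 2 SE for 95% confidence, as in the slides so far. Variable names are illustrative.

```python
import math

p_hat, n = 0.32, 80

se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
margin = 2 * se                           # Empirical Rule margin of error for 95%
low, high = p_hat - margin, p_hat + margin

print(f"SE = {se:.4f}, margin of error = {margin:.4f}")
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly .216 to .424, which can be dropped into the blanks of the conclusion sentence.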
39 What Does 95% Confidence Mean?
- If we did a whole bunch of confidence intervals at this sample size, we would expect 95 out of 100 intervals to contain the true proportion.
- Picture of this
TRUE POPULATION PROPORTION
40 AHOY!
- We do not always want 95% confidence.
- Example: if a part on an airplane's landing gear needs to be a certain size to work, wouldn't you want a little more confidence in the sample being within certain parameters?
- Common confidence levels are 90%, 95% and 99%
- Denote as C = .90, C = .95, or C = .99
41 But 90% and 99% aren't Empirically Cool
- We need this z-score! It's critical!
- So critical, it is called the critical value and denoted as z*
42 Mas z*
- Now check the t distribution critical values chart (back of book or formula sheet)
- Look at the bottom. It gives you C, and right above it is...
- Yeah!
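If no chart is handy, the critical value can also be computed. A minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `z_star` is made up here. It finds the z-score that leaves the chosen confidence level C in the middle of the standard Normal curve.

```python
from statistics import NormalDist

def z_star(confidence):
    """Hypothetical helper: critical value with `confidence` area in the middle."""
    tail = (1 - confidence) / 2            # area in each tail
    return NormalDist().inv_cdf(1 - tail)  # z-score with that much area to its right

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}: z* = {z_star(c):.3f}")
```

These should agree with the chart values of about 1.645, 1.960, and 2.576.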
43 Try it!
- A poll asked, "Who would you vote for if an election were held today between Sen. Barack Obama and Sen. John McCain?" 115 of the 250 respondents chose Sen. McCain. Construct and interpret a 90% confidence interval for the proportion of voters choosing McCain.
44 Try It!
- Conditions
- Mean, SE, z*
- Calculate CI
- Conclusion
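A sketch of the calculation steps for the McCain poll, assuming the conditions check out (115 successes and 135 failures are both well above 10) and using the form estimate ± z* x SE. Variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, confidence = 115, 250, 0.90

p_hat = x / n                                            # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)                  # standard error
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value for 90%
margin = z_star * se
low, high = p_hat - margin, p_hat + margin

print(f"p-hat = {p_hat:.2f}, SE = {se:.4f}, z* = {z_star:.3f}")
print(f"90% CI: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval is roughly .408 to .512, so the conclusion would read along the lines of "we are 90% confident that the true proportion of voters choosing McCain falls between about 41% and 51%."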
45 Last thing
- Finding sample size needed for a CI with a given level of confidence and a given margin of error
- NBC News is doing a poll on who will be the next Governor of California. They want a 3% margin of error at 95% confidence. What sample size should they use?
46 Sample size needed
Margin of Error = z* x sqrt(p(1-p)/n)
47 Sample size needed
Solve for n with p = 0.5: n = (z*/ME)^2 x p(1-p) = (1.96/.03)^2 x .25 ≈ 1067.1
Why 0.5? Gives us largest n value. Safety First!
ALWAYS ROUND UP TO STAY WITHIN THE MARGIN OF ERROR! n SHOULD BE 1068.
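The sample-size calculation can be sketched as follows, assuming Margin of Error = z* x sqrt(p(1-p)/n) solved for n, with p = 0.5 as the safe guess. The function name and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_guess=0.5):
    """Hypothetical helper: smallest n that keeps the margin of error at or below `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)   # always round UP, never to the nearest whole number

print(sample_size_for_proportion(margin=0.03, confidence=0.95))
```

Rounding down to 1067 would leave the margin of error slightly above 3%, which is why the ceiling is used.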
48 APSTAT Chapter 20: One Proportion Hypothesis Tests
49 Significance Tests
- Example. AP Stat Exam 2005
- National Proportion Who Passed = .58
- Priory Students n = 32, p-hat = .78
- Two Possibilities
- Higher WPS proportion just happened by chance (natural variation of a sample)
- The likelihood of 78% of 32 students passing is so remote we must conclude that Priory Students are likely better at APStat than national average.
50 Hypothesis Testing
- Reflect our two possibilities from above
- NOTHING IS STRANGE (difference could have been by natural variation of sample)
- SOMETHIN IS GOIN ON (difference is so improbable we must assume there is a difference)
- Here is how we write them
- H0: Null Hypothesis (Nothing Strange)
- Ha: Alternative Hypothesis (Somethin is goin on)
51 In our WPS AP Stat Example
- In practice, we describe the hypotheses in both symbols and words
- H0: p = .58, Priory students perform at the same level as the National Average
- Ha: p > .58, Priory students perform better than the National Average
- We will perform test(s) that give evidence against the H0 (kinda like a trial)
52 What to do with the Hypothesis
- After we conduct a test we will have evidence based on our understanding of probability and sample variation. With this info we can
- Reject H0 in favor of Ha
- if there is SIGNIFICANT evidence that the result did not likely happen by chance variation.
- Fail to Reject H0
- if there is not enough evidence to reject it. The variation could likely have happened by chance
53 Be Careful
- Notice we NEVER, NEVER, NEVER
- Accept either Hypothesis
- Say one or the other is true or false
- We only have evidence, we could still be wrong.
- BUT... the stronger the evidence the more confident we can be!
54 Where do we get evidence?
- One way: P-value from a z-score. What is the probability that this event happened given the population mean, standard deviation and n in our trial?
- Our old friend, the z-score
z = (p-hat - p0) / sqrt(p0(1-p0)/n)
We are using a sample here, so we throw in the standard deviation of the sample proportion.
55 Let's Do It! WPS AP Stat Example
- Step 1 Define Parameter
- p = the true passing proportion of WPS APStat test-takers
- Step 2 Hypotheses
- H0: p = .58, Priory students perform at the same level as the national proportion
- Ha: p > .58, Priory students perform better than the national proportion
56 WPS AP Stat Example Continued
- Step 3 Assumptions
- SRS
- No, but we will assume WPS Students are a representative sample of the population of all AP Stat test-takers.
- Independence
- Priory sample of 32 is less than 10% of population of AP Stat test-takers
- Normalness
- .58(32) and .42(32) both > 10
57 WPS AP Stat Example Continued
- Step 4 Name Test and DO IT
- One Sample Z-Test for a Proportion
58 WPS AP Stat Example Continued
- Step 5 P-value and sketch of normal curve
- P(z > 2.31) = .01053
- Step 6 Interpret P-value and Conclusion
- A P-Value of .01053 indicates that, if H0 were true, there would be only about a 1 in 100 chance of getting a sample proportion this far above p = .58 merely by chance. Therefore, reject H0 in favor of Ha. There is strong evidence that WPS students performed better than the National Average on the 2005 APStat exam
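The Priory test can be reproduced with a short sketch. It assumes 25 of the 32 students passed (25/32 ≈ .78; the exact count is an assumption, since the slides give only the proportion) and uses the null value p0 = .58 in the standard deviation. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_z_test_upper(x, n, p0):
    """Hypothetical helper: one-proportion z-test for Ha: p > p0. Returns (z, P-value)."""
    p_hat = x / n
    sd = math.sqrt(p0 * (1 - p0) / n)   # SD of p-hat if H0 is true
    z = (p_hat - p0) / sd
    p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    return z, p_value

# x = 25 is an assumed count consistent with p-hat = .78 and n = 32
z, p_value = one_prop_z_test_upper(x=25, n=32, p0=0.58)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Under that assumption the result lines up with the slide's z of about 2.31 and P-value of about .0105.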
59 PHAT-PI (MUCH LOVE TO AL YOUNG)
- P - Parameters (What are we studying)
- H - Hypothesis (In words and symbols)
- A - Assumptions (depends on type of test)
- T - Test (Name it. Do it.)
- P - P-Value (Calculate it-Draw it)
- I - Interpret (Reject/Fail to Reject, Why, ATQ)
60 Alternative Hypotheses
- Can be
- Greater Than (Ha: ------ > blah)
- Less Than (Ha: ------ < blah)
- Not Equal (Ha: ------ ≠ blah)
- Double your one-sided P-value for the two-sided (≠) alternative
61 On TI-83
- Still have to do all of Phat Pi, but helps with calculations.
- Stat > Test > 1-PropZTest
- p0 - Population proportion
- x - successes in sample, n(p-hat)
- n - sample size
- Do it for AP Stat Example
62 Defective Products
- A company claims that just 3% of its products are defective. A simple random sample of 400 of their products yielded 14 defective items. Do these sample data suggest that the company's claim is too low?
63 PHAT-PI
64 PHAT-PI
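A sketch of the T and P steps of PHAT-PI for the defective products problem, assuming H0: p = .03 against Ha: p > .03 and that the Normal model is reasonable (400 x .03 = 12 expected defectives). Variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0 = 14, 400, 0.03

p_hat = x / n                        # 14/400 = .035
sd = math.sqrt(p0 * (1 - p0) / n)    # SD of p-hat if the company's claim is true
z = (p_hat - p0) / sd
p_value = 1 - NormalDist().cdf(z)    # one-sided: is the true proportion higher?

print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.3f}")
```

With a P-value of roughly .28, a sample like this would happen by chance fairly often if the claim were true, so we would fail to reject H0.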
65 How Much Evidence?
- GTang (and many texts) give a rule of thumb of 5%. If there is a 5% probability or less that the outcome would happen by chance, you can throw down the "enough evidence to reject H0"
- If it is 1% or less, you can throw down the "very strong evidence against H0. Reject H0 in favor of Ha"
66 Significance level
- Sometimes a problem will specify a certain amount of evidence that is needed.
- α = Significance Level
- Usually α = 0.05 or 0.01
- Basically, your P-value must be below that level to reject the null hypothesis.
- Example: your p-value is .03 and α = 0.05, so you reject H0
- Be careful with one and two-sided alternatives and significance levels
- Your p-value doubles in a 2-sided.
67 APSTAT Chapter 21: More Stuff About Hypothesis Tests
68 Great Chapter
- Make sure you read it
- Important concepts
- What a Null Hypothesis is and isn't
- What P-Value Means
- Significance (α) Level
- Critical Value - One v. Two sided
- Relationship Between Confidence Intervals and Tests of Significance
69 Great Chapter... But.
- Goes further than you need in explaining
- Types of Error
- Type I
- Type II
- Power
70 Errors - Can We Make Mistakes?
- Sure, Rejecting a Good Shipment
- For Example, I need batteries that work 99% of the time. My significance test of a sample from a battery shipment tells me to reject the shipment, but it is actually ok.
- We Can also Fail to Reject a Bad Shipment
- If I had accepted a shipment that was actually bad because my sample proportion ended up close to the proportion I was looking for.
- Which of these is worse in real life?
71 Errors
- Type I: Reject H0 when it is actually true
- Usually not so bad
- Rejecting a good shipment
- Probability is equal to α
- Type II: Failing to Reject H0 when it is actually false
- Usually bad
- Accepting a bad shipment
- Probability (β) is a bear to calculate
- Check book to see how! Ooooo, fun!
- Be happy you will NEVER be asked to do it
72 Errors - 2
- Decrease both Type I and II errors by
- Increasing n
- Decrease Type II Errors by
- Increasing α
- You end up rejecting more/failing to reject less
- Causes an increase in Type I errors
73 POWER
- Basically, how sure we are that we will not get a Type II error
- Power = 1 - P(Type II)
- OR Power = 1 - β
- Never will you be asked to compute (unless the probability of a type II error is given)
- Increase Power by
- Increasing n (Sample size)
- Increasing α (say from .01 to .05)
74 Power and Error Wrap
- What you have to know
- Explain Power, Type I, and Type II errors in context of the problem.
- Calculate P(Type I error) given α
- How to Decrease
- Type I Error
- Type II Error
- How to increase Power
75 APSTAT Chapter 22: Two Proportion Hypothesis Tests
76 Let's Hop Right In
- A recent report found that men wash their hands 75% of the time after using the restroom and women 85% of the time. If SRSs of 1200 men and 1100 women were surveyed, can we statistically say there is a significant difference between hand washing habits of men and women?
77 Handwashing
- Parameter (group 1 = female, 2 = male)
- p1 - p2 = Difference between female and male hand washing proportions
- Hypotheses
- H0: p1 - p2 = 0, No difference in hand washing
- Ha: p1 - p2 ≠ 0, Is a difference in hand washing
78 Handwashing
- Assumptions
- SRSs: Yep
- Independent samples: Safe to Assume
- n1p1 > 5 and n1(1-p1) > 5
- n2p2 > 5 and n2(1-p2) > 5: Yep
- Population at least 10X Sample: Yep
79 Handwashing
- Test: Two Sample Proportion Z-Test
POOLED
Pool if variances are equal (since our null theorizes that the populations, and thus the variances, are equal)
80 Handwashing
- P-Value
- 2P(Z > 5.965) = Really Really Really Small
- Interpretation
- P-Value is so small, there is VERY significant evidence against the assumption that males and females wash hands at the same proportion. Reject Null Hypothesis in favor of the Alternative. Males and females almost assuredly have different hand washing proportions.
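The pooled test statistic can be checked with a short sketch. The counts are derived here from the stated percentages (85% of 1100 women = 935, 75% of 1200 men = 900), and the function name is made up for illustration.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Hypothetical helper: pooled two-proportion z-test, two-sided. Returns (z, P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # combined proportion under H0: p1 = p2
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # double the one-sided tail
    return z, p_value

# Group 1 = female, group 2 = male; counts derived from 85% and 75%
z, p_value = two_prop_z_test(935, 1100, 900, 1200)
print(f"z = {z:.3f}, P-value = {p_value:.2e}")
```

The z-score should come out at about 5.965, matching the slide, with a P-value far below any usual significance level.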
81 Pooled vs. Non-Pooled
- Use Pooled when you hypothesize populations have the same variance (in proportions, the same p means the same variance)
- Use Non-pooled when populations are likely to have separate variances. (If your null shows a non-zero difference)
82 Confidence Interval
Use Non-Pooled because there is no null to test for.
83 So, To Review
- PhatPi is on, but with these changes
- P: Parameter of interest is now the difference between ___ and ___
- H: H0: p1 = p2 (or p1 - p2 = 0)
- Ha: p1 > p2 (or p1 - p2 > 0)
- or Ha: p1 < p2 (or p1 - p2 < 0)
- or Ha: p1 ≠ p2 (or p1 - p2 ≠ 0)
- Plus, you have to choose Pooled v. Non-Pooled (Pooled if Null is p1 = p2)
84 Using TI 83
- Stat > Test > 2-PropZTest
- Can Also Do Interval
- Stat > Test > 2-PropZInt
- Put in C-Level (usually .9, .95, or .99)
85 Let's do one!
- Some scientists suggest that sickle-cell trait protects against malaria. A study in Africa tested 543 children for sickle-cell trait and also for malaria. In all, 136 of the children had sickle-cell trait and 36 of these had malaria. The other 407 children lacked the sickle-cell trait and 157 of them had malaria. Is there evidence that malaria infection is lower among children with the sickle-cell trait?
86 Malaria v. Sickle Cell
87 Malaria v. Sickle Cell
88 Do Using a 95% C-Interval
- Assumptions
- Interval Calculation
- Interpretation
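A sketch of both approaches for the malaria data: a pooled one-sided test (Ha: p1 < p2, with group 1 = sickle-cell trait) and a non-pooled 95% interval for p1 - p2. It assumes the usual conditions are met. Variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, n1 = 36, 136     # children with sickle-cell trait who had malaria
x2, n2 = 157, 407    # children without the trait who had malaria

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Pooled test: H0 says p1 = p2, so combine the samples for the SE
pooled = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = NormalDist().cdf(z)   # lower tail, since Ha is p1 < p2

# Non-pooled 95% interval: no null to assume, so keep separate proportions
se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_star = NormalDist().inv_cdf(0.975)
low, high = diff - z_star * se_unpooled, diff + z_star * se_unpooled

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval lies entirely below 0, which agrees with the small P-value: both point toward a lower malaria rate among children with the trait.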
89 That is it!
- Just one section left to go!