Title: Stat 155, Section 2, Last Time
1Stat 155, Section 2, Last Time
- Binomial Distribution
- Normal Approximation
- Continuity Correction
- Proportions (different scale from counts)
- Distribution of Sample Means
- Law of Averages, Part 1
- Normal Data ⇒ Normal Mean
- Law of Averages, Part 2
- Everything (averaged) ⇒ Normal
2Reading In Textbook
- Approximate Reading for Today's Material
- Pages 382-396, 400-416
- Approximate Reading for Next Class
- Pages 425-428, 431-439
3Chapter 6 Statistical Inference
- Main Idea
- Form conclusions by quantifying uncertainty
- (will study several approaches; first is confidence intervals)
4Section 6.1 Confidence Intervals
- Background
- The sample mean, $\bar{X}$, is an estimate of the population mean, $\mu$
- How accurate?
- (there is variability, how much?)
5Confidence Intervals
- Recall the Sampling Distribution: $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$
- (maybe an approximation)
6Confidence Intervals
- Thus understand error through the spread of the sampling distribution, $\sigma/\sqrt{n}$
- How to explain to untrained consumers?
- (who don't know randomness, distributions, normal curves)
7Confidence Intervals
- Approach: present an interval
- With endpoints: Estimate ± margin of error
- I.e. $\bar{X} \pm m$, reflecting variability
- How to choose $m$?
8Confidence Intervals
- Choice of Confidence Interval radius, i.e. margin of error, $m$
- Notes:
- No Absolute Range (i.e. including everything) is available
- From infinite tail of normal dist'n
- So need to specify desired accuracy
9Confidence Intervals
10Confidence Intervals
- Choice of margin of error, $m$
- Approach:
- Choose a Confidence Level
- Often 0.95
- (e.g. FDA likes this number for approving new drugs, and it is a common standard for publication in many fields)
- And take margin of error to include that part of the sampling distribution
11Confidence Intervals
- E.g. for confidence level 0.95, want
- [Figure: $\bar{X}$ distribution, with 0.95 = Area within the margin of error $m$ on either side of the center]
12Confidence Intervals
- Computation: Recall NORMINV takes areas (probs), and returns cutoffs
- Issue: NORMINV works with lower areas
- Note: lower tail included
13Confidence Intervals
- So adapt needed probs to lower areas.
- When inner area = 0.95,
- Right tail = 0.025
- Shaded Area = 0.975
- So need to compute NORMINV(0.975, $\mu$, $\sigma/\sqrt{n}$)
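The conversion from an inner area to the lower area that NORMINV expects can be sketched in Python. This is only an illustration: it uses the standard library's `statistics.NormalDist.inv_cdf` as a stand-in for NORMINV, and the helper name `lower_area` is made up for this sketch.

```python
from statistics import NormalDist

def lower_area(inner_area):
    """Turn a central (inner) area into the lower area (everything left of the upper cutoff)."""
    right_tail = (1 - inner_area) / 2   # e.g. 0.025 when inner area = 0.95
    return 1 - right_tail               # e.g. 0.975

area = lower_area(0.95)
# Cutoff on the standard normal scale (mean 0, SD 1); inv_cdf works with lower areas.
z_cutoff = NormalDist().inv_cdf(area)
print(round(area, 3), round(z_cutoff, 3))
```

The same helper gives the lower area for other confidence levels, e.g. `lower_area(0.90)` for a 90% interval.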
14Confidence Intervals
- Need to compute NORMINV(0.975, $\mu$, $\sigma/\sqrt{n}$)
- Major problem: $\mu$ is unknown
- But should answer depend on $\mu$?
- Accuracy is only about spread
- Not centerpoint
- Need another view of the problem
15Confidence Intervals
- Approach to unknown $\mu$:
- Recenter, i.e. look at dist'n of $\bar{X} - \mu$
- Key concept: Centered at 0
- Now can calculate as $m$ = NORMINV(0.975, 0, $\sigma/\sqrt{n}$)
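A small numerical check of the recentering idea, as a sketch only. The values $\sigma = 10$ and $n = 15$ and the trial centers are assumptions picked for illustration, and `NormalDist.inv_cdf` again stands in for NORMINV. The point is that the distance from the center to the 0.975 cutoff is the same whatever the center is.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 10.0, 15            # illustrative values, assumed for this sketch
sd_mean = sigma / sqrt(n)      # SD of the sample mean

def margin_via_mu(mu):
    """0.975 cutoff of a normal centered at mu, measured as a distance from mu."""
    return NormalDist(mu, sd_mean).inv_cdf(0.975) - mu

# Recentered version: no mu needed at all.
m_recentered = NormalDist(0, sd_mean).inv_cdf(0.975)

# Try several hypothetical centers; the margin should not change.
margins = [margin_via_mu(mu) for mu in (0.0, 50.0, 123.8)]
for mu, m in zip((0.0, 50.0, 123.8), margins):
    print(mu, round(m, 4), round(m_recentered, 4))
```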
16Confidence Intervals
- Computation of $m$ = NORMINV(0.975, 0, $\sigma/\sqrt{n}$)
- Smaller Problem: Don't know $\sigma$
- Approach 1: Estimate $\sigma$ with $s$
- Leads to complications
- Will study later
- Approach 2: Sometimes know $\sigma$
17Confidence Intervals
- E.g. Crop researchers plant 15 plots with a new variety of corn. The yields, in bushels per acre, are:
- 138, 139.1, 113, 132.5, 140.7, 109.7, 118.9, 134.8, 109.6, 127.3, 115.6, 130.4, 130.2, 111.7, 105.5
- Assume that $\sigma = 10$ bushels / acre
18Confidence Intervals
- E.g. Find:
- The 90% Confidence Interval for the mean value, $\mu$, for this type of corn.
- The 95% Confidence Interval.
- The 99% Confidence Interval.
- How do the CIs change as the confidence level increases?
- Solution, part 1 of
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg22.xls
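For readers without the spreadsheet, the corn computation can be sketched in Python. This is an illustration, not the class's EXCEL solution: it takes the 15 yields and the assumed SD of 10 bushels / acre from the example, and uses `NormalDist.inv_cdf` in place of NORMINV.

```python
from math import sqrt
from statistics import NormalDist, mean

# Corn yields (bushels per acre) from the example; sigma = 10 is the stated assumption.
yields = [138, 139.1, 113, 132.5, 140.7, 109.7, 118.9, 134.8,
          109.6, 127.3, 115.6, 130.4, 130.2, 111.7, 105.5]
sigma = 10.0
n = len(yields)
xbar = mean(yields)

intervals = {}
for level in (0.90, 0.95, 0.99):
    lower = 1 - (1 - level) / 2                         # inner area -> lower area
    m = NormalDist(0, sigma / sqrt(n)).inv_cdf(lower)   # margin of error
    intervals[level] = (xbar - m, xbar + m)
    print(f"{level:.2f}: [{xbar - m:.2f}, {xbar + m:.2f}]  margin = {m:.2f}")
```

The intervals widen as the confidence level increases, since a larger central area needs a cutoff further out in the tail.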
19Confidence Intervals
- An EXCEL shortcut:
- CONFIDENCE
- Careful: parameter $\alpha$ is the 2-tailed outer area
- So for level 0.90, $\alpha = 0.10$
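A rough Python analogue of the CONFIDENCE shortcut, as a sketch. The function name `confidence` and its argument order mirror the EXCEL call, but the code is an illustration built on `NormalDist.inv_cdf`, not EXCEL's implementation. The inputs (SD 10, sample size 15) are the corn example's values.

```python
from math import sqrt
from statistics import NormalDist

def confidence(alpha, standard_dev, size):
    """Margin of error for a mean with known SD; alpha is the 2-tailed outer area."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # outer area alpha -> lower area 1 - alpha/2
    return z * standard_dev / sqrt(size)

# Level 0.90 means alpha = 0.10 (not 0.90).
m90 = confidence(0.10, 10, 15)
print(round(m90, 3))
```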
20Confidence Intervals
- HW 6.5, 6.9, 6.13, 6.15, 6.19
21Choice of Sample Size
- Additional use of margin of error idea
- Background: distributions of $\bar{X}$
- [Figure: two panels, Small n and Large n]
22Choice of Sample Size
- Could choose n to make $\sigma/\sqrt{n}$ = desired value
- But S. D. is not very interpretable, so make margin of error, $m$ = desired value
- Then get: $\bar{X}$ is within $m$ units of $\mu$, 95% of the time
23Choice of Sample Size
- Given m, how do we find n?
- Solve for n (the equation): $P\{|\bar{X} - \mu| \le m\} = 0.95$
24Choice of Sample Size
- Graphically, find m so that
- [Figure: normal curve with central Area = 0.95, i.e. lower Area = 0.975]
25Choice of Sample Size
26Choice of Sample Size
- Numerical fine points:
- Change the 0.975 for coverage prob. ≠ 0.95
- Round decimals upwards, to be sure of desired coverage
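The sample size calculation can be sketched as follows. Setting the margin of error $m = z \cdot \sigma/\sqrt{n}$, where $z$ is the standard normal cutoff for the chosen coverage, and solving gives $n = (z\sigma/m)^2$, rounded upwards. The function name `sample_size` and the inputs ($m = 2$, $\sigma = 10$) are assumptions for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(m, sigma, level=0.95):
    """Smallest n whose margin of error is at most m, for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # 0.975 when level = 0.95
    return ceil((z * sigma / m) ** 2)               # round upwards for coverage

n_needed = sample_size(m=2.0, sigma=10.0)

# Check: the achieved margin at n_needed is within the target, one fewer is not.
z95 = NormalDist().inv_cdf(0.975)
achieved = z95 * 10.0 / sqrt(n_needed)
one_fewer = z95 * 10.0 / sqrt(n_needed - 1)
print(n_needed, round(achieved, 3), round(one_fewer, 3))
```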
27Choice of Sample Size
- EXCEL Implementation:
- Class Example 22, Part 2
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg22.xls
- HW 6.22 (1945), 6.23
28Interpretation of Conf. Intervals
- 2 Equivalent Views
- [Figures: pic 1 and pic 2, two distribution curves with 95% area]
29Interpretation of Conf. Intervals
- Mathematically
- pic 1
- pic 2
- no pic
30Interpretation of Conf. Intervals
- Frequentist View: If repeat the experiment many times,
- About 95% of the time, CI will contain $\mu$
- (and 5% of the time it won't)
31Confidence Intervals
- Nice Illustration
- Publisher's Website
- Statistical Applets
- Confidence Intervals
- Shows proper interpretation
- If repeat drawing the sample
- Interval will cover truth 95% of time
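The repeated-sampling interpretation can also be tried with a small simulation. This sketch is not the publisher's applet: the "true" mean 123.8, the SD 10, the sample size 15, the seed, and the number of repetitions are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(155)                     # fixed seed so the run is repeatable
mu, sigma, n = 123.8, 10.0, 15       # assumed "truth" and design for the simulation
m = NormalDist(0, sigma / sqrt(n)).inv_cdf(0.975)   # 95% margin of error

reps = 4000
covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]   # draw a new sample
    xbar = sum(sample) / n
    if xbar - m <= mu <= xbar + m:                          # does this CI contain mu?
        covered += 1

coverage = covered / reps
print(f"fraction of intervals covering mu: {coverage:.3f}")
```

The fraction should land near 0.95, with some natural variation from run to run if the seed is changed.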
32Interpretation of Conf. Intervals
- Revisit Class Example 17
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls
- Recall Class HW:
- Estimate % of Male Students at UNC
- C.I. View: Class Example 23
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg23.xls
- Illustrates idea:
- CI should cover 95% of time
33Interpretation of Conf. Intervals
- Class Example 23
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg23.xls
- Q1: SD too small ⇒ Too many cover
- Q2: SD too big ⇒ Too few cover
- Q3: Big Bias ⇒ Too few cover
- Q4: Good sampling ⇒ About right
- Q5: Simulated Bi ⇒ Shows natural var'n
34Interpretation of Conf. Intervals
35And now for something completely different.
- A fun dance video
- http://ebaumsworld.com/2006/07/robotdance.html
- Suggested by David Moltz
36Sec. 6.2 Tests of Significance
- Hypothesis Tests
- Big Picture View:
- Another way of handling random error
- I.e. a different view point
- Idea: Answer yes or no questions, under uncertainty
- (e.g. from sampling or measurement error)
37Hypothesis Tests
- Some Examples
- Will Candidate A win the election?
- Does smoking cause cancer?
- Is Brand X better than Brand Y?
- Is a drug effective?
- Is a proposed new business strategy effective?
- (marketing research focuses on this)
38Hypothesis Tests
- E.g. A fast food chain currently brings in profits of \$20,000 per store, per day. A new menu is proposed. Would it be more profitable?
- Test: Have 10 stores (randomly selected!) try the new menu, let $\bar{X}$ = average of their daily profits.
39Fast Food Business Example
- Simplest View: for $\bar{X} > \$20,000$, new menu looks better.
- Otherwise looks worse.
- Problem: New menu might be no better (or even worse), but could have $\bar{X} > \$20,000$ by bad luck of sampling
- (only sample of size 10)
40Fast Food Business Example
- Problem: How to handle & quantify gray area in these decisions.
- Note: Can never make a definite conclusion e.g. as in Mathematics,
- Statistics is more about real life
- (E.g. even if $\bar{X}$ is far above or far below \$20,000, that might be bad luck of sampling, although very unlikely)
41Hypothesis Testing
- Note: Can never make a definite conclusion,
- Instead measure strength of evidence.
- Approach I: (note: different from text)
- Choose among 3 Hypotheses:
- $H_+$: Strong evidence new menu is better
- $H_0$: Evidence is inconclusive
- $H_-$: Strong evidence new menu is worse
42Caution!!!
- Not following text right now
- This part of course can be slippery
- I am breaking this down to basics
- Easier to understand
- (If you pay careful attention)
- Will tie things together later
- And return to textbook approach later
43Hypothesis Testing
- Terminology:
- $H_0$ is called null hypothesis
- Setup: $H_+$, $H_0$, $H_-$ are in terms of parameters, i.e. population quantities
- (recall population vs. sample)
44Fast Food Business Example
- E.g. Let $\mu$ = true (over all stores) daily profit from new menu.
- $H_+$: $\mu > \$20,000$ (new is better)
- $H_0$: $\mu = \$20,000$ (about the same)
- $H_-$: $\mu < \$20,000$ (new is worse)
45Fast Food Business Example
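- Base decision on best guess: $\bar{X}$
- Will quantify strength of the evidence using probability distribution of $\bar{X}$
- E.g. $\bar{X} \gg \$20,000$ ⇒ Choose $H_+$
- $\bar{X} \approx \$20,000$ ⇒ Choose $H_0$
- $\bar{X} \ll \$20,000$ ⇒ Choose $H_-$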
46Fast Food Business Example
- How to draw line?
- (There are many ways, here is traditional approach)
- Insist that $H_+$ (or $H_-$) show strong evidence
- I.e. They get burden of proof
- (Note: one way of solving gray area problem)
47Fast Food Business Example
- Assess strength of evidence by asking:
- How strange is observed value $\bar{X}$, assuming $H_0$ is true?
- In particular, use tails of $H_0$ distribution as measure of strength of evidence
48Fast Food Business Example
- Use tails of $H_0$ distribution as measure of strength of evidence
- [Figure: distribution of $\bar{X}$ under $H_0$, with the tail area beyond the observed value of $\bar{X}$ marked]
- Use this probability to measure strength of evidence
49Hypothesis Testing
- Define the p-value, for either $H_+$ or $H_-$, as:
- P{what was seen, or more conclusive | $H_0$}
- Note 1: small p-value ⇒ strong evidence against $H_0$, i.e. for $H_+$ (or $H_-$)
- Note 2: p-value is also called observed significance level.
50Fast Food Business Example
- Suppose observe a value of $\bar{X}$,
- based on the sample of 10 stores
- Note: $\bar{X} > \$20,000$, but is this conclusive?
- or could this be due to natural sampling variation?
- (i.e. do we risk losing money from new menu?)
51Fast Food Business Example
- Assess evidence for $H_+$ by:
- $H_+$ p-value = Area [Figure: upper tail of the $H_0$ distribution of $\bar{X}$, beyond the observed value]
52Fast Food Business Example
- Computation in EXCEL:
- Class Example 24, Part 1
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg24.xls
- P-value = 0.094.
- About 1 in 10, could be random variation,
- not very strong evidence
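A sketch of the tail-area computation in Python. The observed mean and the SD used in class did not survive in these notes, so the values below (observed mean \$21,000, SD \$2,400) are hypothetical, chosen only because they give a p-value close to the 0.094 quoted above. `NormalDist.cdf` stands in for EXCEL's NORMDIST.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 20000.0     # H0: mean daily profit is unchanged
n = 10            # stores in the test
xbar = 21000.0    # HYPOTHETICAL observed mean (not from the notes)
sigma = 2400.0    # HYPOTHETICAL SD of daily profit (not from the notes)

# Distribution of the sample mean, assuming H0 is true.
null_dist = NormalDist(mu0, sigma / sqrt(n))

p_plus = 1 - null_dist.cdf(xbar)   # upper tail: evidence for "new is better"
p_minus = null_dist.cdf(xbar)      # lower tail: evidence for "new is worse"
print(f"H+ p-value = {p_plus:.3f}, H- p-value = {p_minus:.3f}")
```

With these inputs the upper-tail area is a little under 0.1, which matches the reading above: roughly a 1 in 10 chance of seeing a sample mean this high from random variation alone.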