Title: 45-733: lecture 10 (chapter 8)
1. 45-733 lecture 10 (chapter 8)
2. CI for difference in means, N()
- Suppose we have two populations we are interested in
- Sample from population X: $X_1, X_2, \ldots, X_n$
- X has mean $\mu_X$
- X has variance $\sigma_X^2$
- Sample from population Y: $Y_1, Y_2, \ldots, Y_m$
- Y has mean $\mu_Y$
- Y has variance $\sigma_Y^2$
3. CI for difference in means, N()
- Suppose we have two populations we are interested in
- We are interested in a CI for
- E(X) - E(Y)
- $\mu_X - \mu_Y$
- Example: How much lower are defect rates at the
Cleveland plant compared to the Pittsburgh plant?
4. CI for difference in means, N()
- Case 1: matched pairs
- When we do our sampling, we sample an equal
number from populations X and Y
- When we do our sampling, we carefully match our
choices from each population to try to make sure
all characteristics other than the one we are
interested in are the same
- Example: We select 100 items from the Clv and Pgh
plants, matching by type of product, day of week,
hour of day, week of year, etc.
5. CI for difference in means, N()
- Case 1: matched pairs
- We are interested in E(X) - E(Y)
- Notice that we can essentially sample from X - Y
by looking at $d_i = x_i - y_i$
- E(D) = E(X) - E(Y) by rules of expectation
- V(D) = V(X) + V(Y) - 2Cov(X, Y) by rules of variance
(the covariance term is zero only if X and Y are independent)
- So, we can just use everything we know about
making CIs for means!
6. CI for difference in means, N()
- Case 1: matched pairs
- Notice, if the sample size is large, we do not
need to assume normal distributions
- If the sample size is large, then the CLT will
assure that the sample mean of the differences,
$\bar{D}$, is approximately normal
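The matched-pairs recipe above can be sketched in a few lines. This is a minimal illustration, assuming a large sample so that the normal approximation applies; the function name `matched_pairs_ci` and its arguments are hypothetical, not from the lecture.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def matched_pairs_ci(x, y, conf=0.95):
    """Large-sample CI for E(X) - E(Y) from matched pairs (x[i], y[i])."""
    if len(x) != len(y):
        raise ValueError("matched pairs need equally many x and y values")
    # Work with the differences d_i = x_i - y_i as a single sample.
    d = [xi - yi for xi, yi in zip(x, y)]
    d_bar = mean(d)
    # Standard error of the mean difference, estimated directly from the d_i.
    se = stdev(d) / sqrt(len(d))
    # Two-sided standard normal critical value.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return d_bar - z * se, d_bar + z * se
```

Estimating the variance from the differences themselves avoids having to know how strongly the matched X and Y values are correlated.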
7. CI for difference in means, N()
- Case 1: matched pairs
- Example problem: pg 311, number 34
8. CI for difference in means, N()
- Case 2: independent samples
- In this case, we just literally have two totally
independent samples.
- Number of obs may be different
- Not carefully matched
- Example: What is the difference between our
company's mean compensation of salespeople and
that of other companies in our industry?
- Not likely to be carefully matched because too
expensive
- Much effort to find employees of other companies
with the same education, experience, past
performance, etc. as ours
9. CI for difference in means, N()
- Case 2: independent samples
- We are interested in E(X) - E(Y)
- But the populations may also differ in variance,
so that we must also be concerned with V(X) and
V(Y)
- Well, we can still calculate
10. CI for difference in means, N()
- Case 2: independent samples
- We are interested in $\mu_X - \mu_Y$
- It would be natural, therefore, to look at $\bar{X} - \bar{Y}$
- Well, $E(\bar{X} - \bar{Y}) = \mu_X - \mu_Y$
11. CI for difference in means, N()
- Case 2: independent samples
- If X and Y are from normally distributed
populations, then $\bar{X}$ and $\bar{Y}$ are normally
distributed
- Since the sum or difference of normals is normal,
$\bar{X} - \bar{Y}$ is normally distributed too
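12. CI for difference in means, N()
- Case 2: independent samples
- We can easily calculate the variance:
$V(\bar{X} - \bar{Y}) = \sigma_X^2/n + \sigma_Y^2/m$
- So, $\bar{X} - \bar{Y} \sim N(\mu_X - \mu_Y,\ \sigma_X^2/n + \sigma_Y^2/m)$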
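13. CI for difference in means, N()
- Case 2: independent samples
- Normalizing:
$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{\sigma_X^2/n + \sigma_Y^2/m}} \sim N(0, 1)$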
15. CI for difference in means, N()
- Case 2: independent samples
- Of course, we don't know the variance
- If the sample sizes are large, the CLT comes to
the rescue
- Notice, if we are using the CLT, we do NOT need
to assume that X and Y are normal in this case!
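With large samples, the usual approach is to plug the sample variances in for the unknown population variances and use a standard normal critical value. The sketch below assumes that approach and works from summary statistics; `indep_means_ci` and its parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def indep_means_ci(x_bar, s2_x, n, y_bar, s2_y, m, conf=0.95):
    """Large-sample CI for mu_X - mu_Y from two independent samples.

    x_bar, y_bar: sample means; s2_x, s2_y: sample variances;
    n, m: sample sizes (they may differ).
    """
    # Estimated standard error of X-bar minus Y-bar.
    se = sqrt(s2_x / n + s2_y / m)
    # Two-sided standard normal critical value.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = x_bar - y_bar
    return diff - z * se, diff + z * se
```

Because the interval relies on the CLT, nothing in the function requires X or Y to be normal, only that n and m are both large.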
16. CI for difference in means, N()
- Case 2: independent samples
- Example: pg 312, number 36
17. CI for difference in means, N()
- Case 2: independent samples
- Of course, we don't know the variance
- If we know that the population variances are the
same
19. CI for difference in means, N()
- Case 2: independent samples
- Population variances the same
- Then, as with our previous arguments
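When the two population variances are equal, the textbook approach is to pool both samples into one variance estimate and use a t critical value with n + m - 2 degrees of freedom. The sketch below assumes that standard pooled estimator and normal populations; `pooled_ci` is a hypothetical name, and the critical value `t_crit` is looked up in a t-table by the caller rather than computed.

```python
from math import sqrt


def pooled_ci(x_bar, s2_x, n, y_bar, s2_y, m, t_crit):
    """CI for mu_X - mu_Y assuming equal population variances.

    t_crit: two-sided t critical value for n + m - 2 degrees of
    freedom, taken from a t-table.
    """
    df = n + m - 2
    # Pooled estimate of the common variance, weighting each sample
    # variance by its own degrees of freedom.
    s2_pooled = ((n - 1) * s2_x + (m - 1) * s2_y) / df
    se = sqrt(s2_pooled * (1 / n + 1 / m))
    diff = x_bar - y_bar
    return diff - t_crit * se, diff + t_crit * se
```

Passing the critical value in keeps the sketch dependency-free and mirrors looking it up in a table.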
20. CI for difference in proportions
21. CI for difference in proportions
- Suppose we have two populations we are interested in
- Sample from population X: $X_1, X_2, \ldots, X_n$
- X is Bernoulli
- X has parameter $p_X$
- Sample from population Y: $Y_1, Y_2, \ldots, Y_m$
- Y is Bernoulli
- Y has parameter $p_Y$
22. CI for difference in proportions
- Suppose we have two populations we are interested in
- We are interested in a CI for
- E(X) - E(Y)
- $p_X - p_Y$
- Example: How much less likely are women than men
to vote for George W. Bush?
23. CI for difference in proportions
- Since we are interested in $p_X - p_Y$
- We will want to examine $\hat{p}_X - \hat{p}_Y$
- This has mean $p_X - p_Y$ and variance
$p_X(1 - p_X)/n + p_Y(1 - p_Y)/m$
24. CI for difference in proportions
- Obviously, the two sample proportions are sample
means
- So, for large samples, we can apply a CLT
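Since each sample proportion is just a sample mean of 0/1 values, the large-sample interval has the same shape as the one for means. A minimal sketch, assuming independent samples and the usual plug-in standard error; `prop_diff_ci` and its parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prop_diff_ci(successes_x, n, successes_y, m, conf=0.95):
    """Large-sample CI for p_X - p_Y from two independent Bernoulli samples."""
    p_x = successes_x / n
    p_y = successes_y / m
    # Plug the sample proportions into the Bernoulli variance p(1 - p).
    se = sqrt(p_x * (1 - p_x) / n + p_y * (1 - p_y) / m)
    # Two-sided standard normal critical value.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p_x - p_y
    return diff - z * se, diff + z * se
```

If the resulting interval contains 0, the data are consistent with the two populations having the same proportion at that confidence level.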
25. CI for difference in proportions
- Example: pg 313, number 44
26. CI for difference in variance
- Suppose we have two populations we are interested in
- Sample from population X: $X_1, X_2, \ldots, X_n$
- X has mean $\mu_X$
- X has variance $\sigma_X^2$
- Sample from population Y: $Y_1, Y_2, \ldots, Y_m$
- Y has mean $\mu_Y$
- Y has variance $\sigma_Y^2$
27. CI for difference in variance
- Suppose we have two populations we are interested in
- We are interested in a CI for
- V(X)/V(Y)
- $\sigma_X^2 / \sigma_Y^2$
- Example: How much lower is the variance in
income in Sweden compared to the US?
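29. CI for difference in variance
- Since we are interested in $\sigma_X^2 / \sigma_Y^2$
- We will examine $s_X^2 / s_Y^2$
- We know that, for normal populations,
$(n-1)s_X^2/\sigma_X^2 \sim \chi^2_{n-1}$ and $(m-1)s_Y^2/\sigma_Y^2 \sim \chi^2_{m-1}$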
30. CI for difference in variance
- The F-distribution
- The ratio of two independent chi-squared random
variables (each divided by its own degrees of
freedom) follows the F-distribution
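In symbols, the definition above can be written as follows. The names $U$ and $V$ for the two chi-squared variables are introduced here for illustration; $\nu_1$ and $\nu_2$ are their degrees of freedom.

```latex
F = \frac{U/\nu_1}{V/\nu_2} \sim F_{\nu_1,\nu_2},
\qquad U \sim \chi^2_{\nu_1},\quad V \sim \chi^2_{\nu_2},\quad
U \text{ and } V \text{ independent}.
```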
32. CI for difference in variance
- The F-distribution
- $\nu_1$ is called numerator degrees of freedom
- $\nu_2$ is called denominator degrees of freedom
- The F-distribution is tabulated in an F-table in
our book
33. CI for difference in variance
- The F-distribution
- We know that
- So that means
34. CI for difference in variance
- The F-distribution
- Then that means
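A sketch of the standard argument these steps lead to, assuming both populations are normal and the samples are independent. Here $s_X^2$ and $s_Y^2$ are the sample variances, and $F_{\alpha}(\nu_1,\nu_2)$ denotes the value that an $F_{\nu_1,\nu_2}$ variable exceeds with probability $\alpha$; this notation is an assumption made for the sketch.

```latex
\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}, \qquad
\frac{(m-1)s_Y^2}{\sigma_Y^2} \sim \chi^2_{m-1}
\;\Longrightarrow\;
\frac{s_X^2/\sigma_X^2}{s_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}

P\!\left( F_{1-\alpha/2}(n-1,m-1) \le
\frac{s_X^2/s_Y^2}{\sigma_X^2/\sigma_Y^2}
\le F_{\alpha/2}(n-1,m-1) \right) = 1-\alpha

\frac{s_X^2/s_Y^2}{F_{\alpha/2}(n-1,m-1)} \;\le\;
\frac{\sigma_X^2}{\sigma_Y^2} \;\le\;
\frac{s_X^2/s_Y^2}{F_{1-\alpha/2}(n-1,m-1)}
```

The last line is obtained by solving the middle inequality for $\sigma_X^2/\sigma_Y^2$, which flips the two cutoffs.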
36. CI for difference in variance
- Making the CI
- Our table is limited
- Our table has tabulations for $P(F > a) = \alpha$
- For $\alpha = 0.05$
- For $\alpha = 0.01$
- And for various values of numerator and
denominator degrees of freedom
- (Picture)
38. CI for difference in variance
- Making a 90% CI (picture)
- This can be had from the table
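Because the table lists only upper-tail values, the lower cutoff of a 90% interval can be obtained from the reciprocal property of the F-distribution. Writing $F_{\alpha}(\nu_1,\nu_2)$ for the value with $P(F > F_{\alpha}) = \alpha$ (notation assumed here):

```latex
F_{0.95}(\nu_1,\nu_2) = \frac{1}{F_{0.05}(\nu_2,\nu_1)}
\quad\Longrightarrow\quad
P\!\left( \frac{1}{F_{0.05}(\nu_2,\nu_1)} \le F \le F_{0.05}(\nu_1,\nu_2) \right) = 0.90
```

Note that the degrees of freedom swap places in the reciprocal, so both cutoffs come from the $\alpha = 0.05$ table.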
40. CI for difference in variance
- Example
- Two classes
- Class 1 has 30 students
- Class 2 has 22 students
- Variance
- Class 1 std dev on a test is 10
- Class 2 std dev on a test is 15
- Question: assuming normality, is there a
difference in the variance of the test scores?
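One way to set up this example is sketched below, assuming normal populations and independent classes. `variance_ratio_ci` is a hypothetical helper; the two F critical values are not hard-coded and must be looked up in an F-table (for a 90% CI, the $\alpha = 0.05$ entries with 29 and 21 degrees of freedom, in both orders).

```python
def variance_ratio_ci(s_x, n, s_y, m, f_crit_xy, f_crit_yx):
    """CI for sigma_X^2 / sigma_Y^2, assuming normal populations.

    f_crit_xy: upper-tail table value F_{alpha/2}(n-1, m-1)
    f_crit_yx: upper-tail table value F_{alpha/2}(m-1, n-1)
    """
    ratio = s_x**2 / s_y**2      # point estimate s_X^2 / s_Y^2
    lower = ratio / f_crit_xy
    # Reciprocal property: 1 / F_{1-alpha/2}(n-1, m-1) = F_{alpha/2}(m-1, n-1)
    upper = ratio * f_crit_yx
    return lower, upper


# Class data from the example: sizes and standard deviations.
n, s_x = 30, 10.0
m, s_y = 22, 15.0
ratio = s_x**2 / s_y**2
print(f"point estimate of the variance ratio: {ratio:.4f}")
print(f"degrees of freedom: numerator {n - 1}, denominator {m - 1}")
```

If the interval returned by `variance_ratio_ci` excludes 1, the data suggest the two variances differ at that confidence level.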