Inference about the difference of statistical analysis - PowerPoint PPT Presentation


About This Presentation



Slides: 36

Provided by: JJ16


Transcript and Presenter's Notes

Title: Inference about the difference of statistical analysis

1
Chapter 9

Inference about the difference of statistical
analysis

2
Sec. 9.1 Introduction

Experiment design: the procedure used to choose
and assign subjects to the two groups.
Two types of design for comparative experiments:
Independent samples design: the sample assigned to group 1 is selected
independently of the sample assigned to group 2.
Matched design

3
Samples from population 1.
1a. All observations in this group are
independent of each other. This is ok.
Samples from population 2.
1b. All observations in this group are
independent of each other. This is ok.
2. An observation in one group must be
independent of an observation in the opposing
group.
We also need the measurements in each group to
come from a normal distribution. This is probably ok.
4
Example 9.1

A survey of 436 workers showed that 192 of them
said that it was seriously unethical to monitor
employee e-mail. When 121 senior-level bosses
were surveyed, 40 said that it was seriously
unethical to monitor employee e-mail.
Let πw and πB be the population proportions of
workers and bosses, respectively, who feel it is unethical to
monitor e-mail.

5
Example 9.2

A sample of 5 students was selected to take an
SAT preparatory course. They took the SAT exam
before they took the course and then they took it
again after the course.
Student       A    B    C    D    E
SAT Before  700  840  830  860  840
SAT After   720  840  820  900  870
Let μB denote the mean SAT score before the
course,
μA the mean SAT score after the course.
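In a matched design each "after" score belongs to the same student as a "before" score, so the natural quantity to look at is the per-student difference. A minimal sketch using the table above (the variable names are illustrative, not from the slides):

```python
# SAT scores from Example 9.2, in student order A..E.
before = [700, 840, 830, 860, 840]
after = [720, 840, 820, 900, 870]

# Pair each student's "after" score with that same student's "before" score.
differences = [a - b for a, b in zip(after, before)]
mean_difference = sum(differences) / len(differences)

print(differences)      # [20, 0, -10, 40, 30]
print(mean_difference)  # 16.0
```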

6
Sec. 9.2 Inference about difference between two
population proportions
7
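Example 9.1 (continued): Find an 80% CI for πw − πB

1. First find a point estimate of this difference:
pw − pB = 0.4403 − 0.3305 = 0.1098.

2. The standard error of pw − pB is estimated by
√(pw(1 − pw)/nw + pB(1 − pB)/nB), where
pw = 192/436 = 0.4403 and
pB = 40/121 = 0.3305. This gives a
standard error of approximately 0.0489.

3. The 80% CI for πw − πB is
(pw − pB) ± z (standard error),
where z = 1.28 is the upper 10% critical value of the standard normal.
Hence the desired CI is 0.1098 ± (1.28)(0.0489),
i.e. approximately (0.047, 0.172).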

10
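Confidence Interval for π1 − π2

Application: 2 Bernoulli populations
Assumption: Independent samples, n1 > 30 and n2 > 30.
A (1 − α)100% confidence interval for π1 − π2 is
given by
(p1 − p2) ± z(α/2) √(p1(1 − p1)/n1 + p2(1 − p2)/n2),
where p1 and p2 are the sample proportions.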
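The large-sample interval can be sketched in a few lines of Python. This is only an illustration: the function name `two_proportion_ci` and its parameters are assumptions made here, and the critical value comes from the standard library's `statistics.NormalDist` rather than a z-table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, conf_level):
    """Large-sample CI for the difference of two population proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error of p1 - p2.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Upper alpha/2 critical value of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Example 9.1: 192 of 436 workers, 40 of 121 bosses, 80% confidence.
low, high = two_proportion_ci(192, 436, 40, 121, 0.80)
print(f"80% CI: ({low:.3f}, {high:.3f})")
```

The same function could be called with the counts from Exercise 9.1 and `conf_level=0.98`.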

11
Exercise 9.1

A study suggests that nicotine-laced gum helps
smokers to stop smoking. The study shows that 29
of 106 smokers who chewed nicotine-laced gum
remained smoke free for 1 year and 16 out of 100
smokers who chewed regular gum remained smoke
free for 1 year. Use this information to find a
98% confidence interval for the difference
between the proportions of smokers who
successfully use nicotine-laced gum and those who
successfully use regular gum.

12
Large-Sample Hypothesis Test of π1 − π2
13
Example 9.1 (continued)

Given those data, is the evidence sufficient to
suggest that a larger percentage of workers than bosses
feel that it is unethical to monitor e-mail?
Solution: 1. That is, to test
H0: πw − πB = 0
vs.
Ha: πw − πB > 0
14

2. Under H0, the standardized test statistic is
z = (pw − pB) / √(p(1 − p)(1/nw + 1/nB)),
where p = (192 + 40)/(436 + 121) = 0.4165 is the pooled estimate
of the common proportion π.
Plugging in pw = 192/436 = 0.4403 and pB =
40/121 = 0.3305 yields the observed value of the
test statistic zobs = 2.1656.

3. Similar to the one-sample tests, we can make a
decision by comparing the p-value to α.

Since p-value = P(Z > 2.1656) = 0.015 < 0.05, based
on the data we reject H0; i.e., there is
significant evidence that a larger percentage
of workers than bosses feel that it is unethical to monitor
e-mail.
16
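Large-Sample Hypothesis Test of π1 − π2

Assumption: the two samples are independent of
each other
1. Observe p1 and p2
2. Construct hypotheses
3. Test statistic and sampling distribution under the
null hypothesis:
p1 − p2 ~ N(0, p(1 − p)(1/n1 + 1/n2)) approximately, where p is the pooled sample proportion
z = (p1 − p2) / √(p(1 − p)(1/n1 + 1/n2))
4. p-value of zobs (use z-table)
5. Make decision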
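The steps of the large-sample test can be sketched as follows for the upper-tailed case used in Example 9.1. The function name `two_proportion_z_test` is an assumption made for illustration, and the p-value is taken from `statistics.NormalDist` instead of a z-table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Upper-tailed test of H0: pi1 - pi2 = 0 against Ha: pi1 - pi2 > 0."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under H0.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z_obs = (p1 - p2) / se
    # Upper-tail area under the standard normal curve.
    p_value = 1 - NormalDist().cdf(z_obs)
    return z_obs, p_value


# Example 9.1: 192 of 436 workers versus 40 of 121 bosses.
z_obs, p_value = two_proportion_z_test(192, 436, 40, 121)
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.3f}")
```

Because the code works with unrounded proportions, its z value can differ in the third decimal place from a hand calculation that rounds intermediate results; the decision at the 5% level is unaffected.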

17
Exercise 9.2

The campaign manager for a presidential candidate
wishes to test the claim that the proportion of
Ohio voters who favor the candidate is at least
as large as the proportion of California voters
who favor the candidate. Given these data, test
the manager's claim at a 5% level of significance.

18
Sec. 9.3 Inferences about difference between
two population means
19
Example 9.3

1.) What is the 90% confidence interval for the
difference between the mean salary of
statisticians in New York and that of those in
Massachusetts?
2.) Test whether the mean salary of statisticians in
New York is significantly different from that of those in
Massachusetts
(α = 0.05).

1. The point estimate of μN − μM is x̄N − x̄M.
2. The standard error is
√(σN²/nN + σM²/nM) when the population variances are known,
√(sN²/nN + sM²/nM) when the population variances are unknown,
where nN and nM are the sample sizes for NY and Mass.
21
Recall

Think about the one-sample case first.
When we tested something about a single
mean, there were 2 cases to consider:
σ known, which means we use the standard normal
(Z) to make inferences
σ unknown, which means we use the t distribution
to make inferences

22
A little more complicated
σN and σM are both known:
Use the standard normal to obtain p-values and
confidence intervals.
σN and σM are both unknown, with
σN ≠ σM: This is a 2-sample t-test. We use a
t-distribution, but the df has to be approximated.
23
Two sample t-test

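3. So if σN and σM are both unknown, the
standardized test statistic
t = ((x̄N − x̄M) − (μN − μM)) / √(sN²/nN + sM²/nM)
has approximately a t distribution with degrees of freedom
df = (sN²/nN + sM²/nM)² / [ (sN²/nN)²/(nN − 1) + (sM²/nM)²/(nM − 1) ].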

24
Note

For a conservative approach to the two-sample
t-procedures, the degrees of
freedom are given by
df = min(nN − 1, nM − 1).

For the example concerning New York and Mass.
salaries, the degrees of freedom to use are
min(45 − 1, 37 − 1) = 36.
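The two choices of degrees of freedom can be compared with a short sketch. It assumes the approximated df is the usual Welch-Satterthwaite formula; the function names are illustrative, and the standard deviations below are hypothetical placeholders, since the transcript does not reproduce the salary data (only the sample sizes 45 and 37 come from the slides).

```python
from math import floor


def welch_df(s1, n1, s2, n2):
    """Approximate df for the two-sample t statistic (Welch-Satterthwaite)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Conservative df: the smaller sample size minus one."""
    return min(n1 - 1, n2 - 1)


# HYPOTHETICAL sample standard deviations, for illustration only.
s_n, s_m = 9000.0, 9200.0

df_approx = floor(welch_df(s_n, 45, s_m, 37))  # rounded down to an integer
df_safe = conservative_df(45, 37)

print(df_approx, df_safe)
```

The approximated df always lies between the conservative value and n1 + n2 − 2, which is why the conservative choice gives wider intervals and larger p-values.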
25

4. For 1): the 90% confidence interval for
μN − μM is of the form
(x̄N − x̄M) ± t (standard error),
where t is the upper critical value of t(36)
for confidence level 0.9:
t = 1.684

Remark: I used 40 degrees of freedom since 36 is
not in the book
26

A 90% confidence interval for μN − μM is
(−1690, 5090).

5. For 2): testing
H0: μN − μM = 0 vs. Ha: μN − μM ≠ 0.
Under H0,
the standardized test statistic
t = (x̄N − x̄M) / √(sN²/nN + sM²/nM)
conservatively has a t(36) distribution.

Given the data,
tobs = 0.8473.

P-value = 2P(t > tobs) = 2P(t(36) > 0.8473). Since
0.8473 < 0.851, P(t > 0.8473) > P(t > 0.851) = 0.2. Then
P-value > 0.4 > 0.05. Based on the data, we do not
reject H0; i.e., there is insufficient evidence to
reject the null hypothesis, and the difference
between the mean salaries of statisticians in the two
states is not significant.
29
Remark

Actually, the approximated df of the t-statistic in this example is about 75.
The test could be carried out using t(75), but
the test result is the same.

30
Inferences about difference between two
population means

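Assumption: the two samples are independent of
each other
The estimator: x̄1 − x̄2
t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/n1 + s2²/n2)
CI for μ1 − μ2 is based on
Estimator ± t (standard error),
where t is based on confidence level (1 − α) and

degrees of freedom
df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
(round it down to the nearest integer).
A conservative approach is to use
df = min(n1 − 1, n2 − 1)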

32
Exercise 9.3

Wind speed data were gathered during January and
July at the site proposed for a wind generator, to see whether wind speeds
will be different in the two months. From the
summary data, construct a 99% confidence interval
for the difference between the mean wind speeds
in January and July.

Solution: (3.934, 10.466)
By the conservative approach,
7.2 ± (2.75)(1.228), i.e. (3.823, 10.577)
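The arithmetic behind both intervals can be checked with a short sketch. It reads 7.2 as the point estimate and 1.228 as the standard error, which is how the conservative line is laid out; the helper name `t_interval` is illustrative. The critical value 2.660 (the 99% table value for 60 df) is an assumption made here: the slide does not state which value it used, but this one reproduces the first interval.

```python
def t_interval(estimate, t_crit, std_error):
    """Return estimate ± t_crit * std_error as a (low, high) tuple."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


# Values quoted in the solution to Exercise 9.3.
estimate, std_error = 7.2, 1.228

# Conservative approach: t = 2.75, as quoted on the slide.
cons_low, cons_high = t_interval(estimate, 2.75, std_error)

# ASSUMPTION: t = 2.660 is inferred from the quoted interval, not stated.
low, high = t_interval(estimate, 2.660, std_error)

print(f"conservative:   ({cons_low:.3f}, {cons_high:.3f})")
print(f"approximate df: ({low:.3f}, {high:.3f})")
```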

34
Exercise 9.4

Plastic grocery bags have almost replaced the
standard brown paper bags at the supermarket. One
particular company was trying to increase the
tensile strength of the bags. These summary data
are from two independent random samples and give
the tensile strengths of plastic bags from two
different production runs.
Sample 1
Sample 2
Determine whether there is a significant
difference between the mean tensile strengths
from the two production runs.

df = 64
P-value = 2P(t > 3.45) ≈ 0.001
Reject H0; this small p-value indicates that the
difference between the mean tensile strengths of
the plastic bags from the two different
production runs is highly significant.
By the conservative approach,
df = 31, P-value = 2P(t > 3.45) < 2P(t > 3.385) = 0.002 < 0.05
Reject H0
