CSC 323 Quarter: Spring

CSC 323 Quarter: Spring 02/03, Daniela Stan Raicu, School of CTI, DePaul University

Learn more at: http://facweb.cs.depaul.edu

1
CSC 323 Quarter: Spring 02/03

Daniela Stan Raicu
School of CTI, DePaul University

2
Outline

Confidence intervals when the population
distribution is unknown
Confidence intervals when the sample size is
small
Tests of significance

3
Is x̄ normally distributed?
Is the population normal?
Yes: Is n ≥ 30?
  Yes: x̄ is considered to be normal
  No: x̄ has a Student's t distribution
No: Is n ≥ 30?
  Yes: x̄ is considered to be normal
  No: x̄ may or may not be considered normal (we need more info)
4
Assumptions when applying z-statistic

1. The population has a normal distribution with
mean µ and standard deviation σ.
2. The standard deviation σ is known.
3. The size n of the simple random sample
(SRS) is large.

5
Assumptions when applying z-statistic

Is the z-statistic appropriate to use when
1. The sample size is small?
2. The population does not have a normal
distribution?
3. The population has a normal distribution
but the standard deviation σ is unknown?

What is the distribution of (x̄ - µ) / (s / √n)?
It is not normal!
6
Inference on averages for small samples (cont.)
If data arise from a population with normal
distribution and n is small (n < 30), we can use a
different curve, called the t-distribution or
Student's curve.
The t-distribution was discovered by W. S. Gosset
(born on 13 June 1876 in Canterbury, England),
the chief statistician of the Guinness brewery in
Dublin, Ireland. He discovered the
t-distribution in order to deal with small
samples arising in statistical quality control.
The brewery had a policy against employees
publishing under their own names, thus he
published his results about the t-distribution
under the pen name "Student", and that name has
become attached to the distribution.
7
Student's t distributions
Suppose that an SRS of size n is drawn from an
N(µ, σ) population. Then the one-sample t statistic
t = (x̄ - µ) / (s / √n)
has the t-distribution with n - 1 degrees of
freedom.
- The degrees of freedom come from the standard
deviation s in the denominator of t.
- There are many Student's curves! There is one
Student's curve for each number of degrees of
freedom. For tests on averages:
degrees of freedom = number of observations - 1
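The one-sample t statistic described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `one_sample_t` and the data values are invented here and do not come from the course material.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, degrees_of_freedom) for the one-sample t statistic."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements, invented purely for illustration.
data = [5.1, 4.9, 5.6, 5.2, 4.8, 5.4]
t, df = one_sample_t(data, mu0=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The degrees of freedom returned are n - 1 because s, not σ, sits in the denominator.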
8
Comparing the Student's curve and the standard
normal curve
[Figure: Student's curve plotted against the standard normal curve in three panels, d.f. = 5, d.f. = 15, and d.f. = 30; horizontal axis t]
The Student's curve has fatter tails than the
standard normal curve. For d.f. around 30, the
Student's curve is very similar to the standard
normal curve.
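The "fatter tails" claim can be checked numerically. The sketch below evaluates the standard textbook density of the t-distribution and of the standard normal curve at the point t = 3. The function names and the choice of t = 3 are assumptions made for this illustration.

```python
import math


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def normal_density(z):
    """Density of the standard normal curve."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Compare the height of each curve far out in the tail (t = 3).
for df in (5, 15, 30):
    print(f"d.f. = {df:2d}: Student's curve at 3 = {t_density(3.0, df):.4f}")
print(f"standard normal curve at 3 = {normal_density(3.0):.4f}")
```

The tail height shrinks toward the normal value as the degrees of freedom grow, which is the pattern the three panels illustrate.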
9
When to use the t-test

When should we use it? Each of the following
conditions should hold:
- The test is a statistical test on averages.
- The sample is a simple random sample.
- The number of observations is small: the sample
size n is less than 30.
- The distribution of the population is
bell-shaped, that is, not too different from the
normal distribution. (Not easy to check, but
typically true for measurements!)

10
Tests on averages: z-test or t-test?
If the amount of current data is
- Large: use the z-test (the normal curve)
- Small (n < 30): the distribution of the population is
  - Unknown but not different from the normal curve: use the t-test (the Student's curve)
  - Unknown but quite different from the normal curve: do not use the t-test!
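The decision flow on this slide can be written as a small function. This is a sketch only: `choose_test` and its `roughly_normal` flag are hypothetical names, and whether a population is "not different from the normal curve" remains a judgment the analyst has to make.

```python
def choose_test(n, roughly_normal):
    """Pick a test on averages following the slide's flowchart.

    n              -- sample size
    roughly_normal -- True if the population is judged not too different
                      from the normal curve (an analyst's judgment call)
    """
    if n >= 30:
        return "z-test"  # large sample: use the normal curve
    if roughly_normal:
        return "t-test"  # small sample, bell-shaped population
    return "do not use the t-test"


print(choose_test(100, roughly_normal=False))
print(choose_test(12, roughly_normal=True))
print(choose_test(12, roughly_normal=False))
```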
11
Confidence intervals for proportions

Assignment
1. Draw the flowchart for estimating the population proportions.
2. Calculate the confidence interval for the population proportion
for different situations from the flowchart.
3. Calculate the confidence intervals for different confidence
levels such as C = 0.96, 0.98, etc.
4. Give examples where the way we calculated the confidence
intervals does not work.

12
Tests of Significance
Example 1: In the courtroom, juries must make a
decision about the guilt or innocence of a
defendant. Suppose you are on the jury in a
murder trial. It is obviously a mistake if the
jury claims the suspect is guilty when in fact he
or she is innocent.
What is the other type of mistake the jury could
make?
Which is more serious?
13
Tests of Significance
Example 2: Suppose exactly half, or 0.50, of a
certain population would answer yes when asked if
they support the death penalty. A random sample
of 400 people results in 220, or 0.55, who answer
yes. The Rule for Sample Proportions tells us
that the potential sample proportions in this
situation are approximately bell-shaped, with
standard deviation of 0.025. Find the
standardized score for the observed value of
0.55. Then determine how often you would
expect to see a standardized score at least that
large or larger.
14
Tests of Significance
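Example 2 (cont.)
n = 400, mean = 0.50, standard deviation = 0.025
Standardized score: z = (0.55 - 0.50) / 0.025 = 2.0
A standardized score at least this large occurs about 2.27% of the time.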
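Example 2 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library. The variable names are chosen for this illustration, and the square-root formula for the standard deviation is the usual one for sample proportions.

```python
import math
from statistics import NormalDist

p0 = 0.50             # assumed population proportion
n = 400               # sample size
observed = 220 / n    # observed sample proportion, 0.55

# Standard deviation of the sample proportion under the assumed p0.
sd = math.sqrt(p0 * (1 - p0) / n)

# Standardized score of the observed proportion.
z = (observed - p0) / sd

# How often a standardized score at least this large would occur.
upper_tail = 1 - NormalDist().cdf(z)

print(f"sd = {sd:.3f}, z = {z:.2f}, upper tail = {upper_tail:.4f}")
```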
15
The Five Steps of Hypothesis Testing

1. Determining the Two Hypotheses H0, Ha
2. Computing the Sampling Distribution
3. Collecting and Summarizing the
Data (calculating the observed test statistic)
4. Determining How Unlikely the Test Statistic is
if the Null Hypothesis is True (calculating the
P-value)
5. Making a Decision/Conclusion (based on the
P-value, is the result statistically significant?)

16
1.A. The Null Hypothesis H0

population parameter equals some value
no relationship
no change
no difference in two groups, etc.
When performing a hypothesis test, we assume that
the null hypothesis is true until we have
sufficient evidence against it.

1.B. The Alternative Hypothesis Ha

population parameter differs from some value
relationship exists
a change occurred
two groups are different, etc.

17
The Hypotheses for Proportions

Null: H0: p = p0
One-sided alternatives:
Ha: p > p0
Ha: p < p0
Two-sided alternative:
Ha: p ≠ p0

19
Example Parental Discipline

Nationwide random telephone survey of 1,250
adults.
474 respondents had children under 18 living at
home
Results on behavior are based on the smaller sample.
Reported margin of error:
3% for the full sample
5% for the smaller sample

Results of the study
The 1994 survey marks the first time a majority
of parents reported not having physically
disciplined their children in the previous year.
Figures over the past six years show a steady
decline in physical punishment, from a peak of 64
percent in 1988. The 1994 sample proportion who
did not spank or hit was 51%! Is this evidence
that a majority of the population did not spank
or hit?
20
Case Study The Hypotheses

Null: The proportion of parents who physically
disciplined their children in the previous year
is the same as the proportion p of parents who
did not physically discipline their children.
H0: p = 0.5
Alt: A majority of parents did not physically
discipline their children in the previous year.
Ha: p > 0.5

2. Sampling Distributions of p
If numerous samples or repetitions of size n are
taken, the sampling distribution of the sample
proportions from various samples will be
approximately normal with mean equal to p (the
population proportion) and standard deviation
equal to √(p(1 - p) / n).
Since we assume the null hypothesis is true, we
replace p with p0 to complete the test.
21
3. Test Statistic for Proportions
To determine if the observed proportion is
unlikely to have occurred under the assumption
that H0 is true, we must first convert the
observed value to a standardized score:
z = (sample proportion - p0) / √(p0(1 - p0) / n)
Case study: Based on the sample
n = 474 (large, so proportions follow a normal distribution)
sample proportion = 0.51 (no physical discipline)
standard deviation = √(0.50(1 - 0.50) / 474) = 0.023 (0.50 is p0 from the null hypothesis)
standardized score (test statistic): z = (0.51 - 0.50) / 0.023 = 0.43
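The case-study arithmetic can be reproduced with a short function. This is a sketch: `proportion_z` is a hypothetical helper name, and it uses the usual standard deviation of a sample proportion under H0, √(p0(1 - p0) / n).

```python
import math


def proportion_z(p_hat, p0, n):
    """Return (z, sd) for testing a sample proportion p_hat against p0."""
    sd = math.sqrt(p0 * (1 - p0) / n)  # standard deviation assuming H0 is true
    z = (p_hat - p0) / sd
    return z, sd


z, sd = proportion_z(p_hat=0.51, p0=0.50, n=474)
print(f"sd = {sd:.4f}, z = {z:.3f}")
```

Carrying the unrounded standard deviation through gives a z slightly above the slide's 0.43, which was computed from the rounded value 0.023. The conclusion is unaffected.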
22
4. P-value

The P-value is the probability of observing data
this extreme or more so in a sample of this size,
assuming that the null hypothesis is true.
A small P-value indicates that the observed data
(or relationship) is unlikely to have occurred if
the null hypothesis were actually true.
The P-value tends to be small when there is
evidence in the data against the null hypothesis.

23
Case Study P-value
P-value = 0.3446
From the normal distribution table (Table B),
z = 0.4 is the 65.54th percentile, so the area above it is 1 - 0.6554 = 0.3446.
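The table lookup can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch below computes the upper-tail area for the rounded score z = 0.4 used with the table, and also for z = 0.43 for comparison. The variable names are chosen for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Upper-tail area for the table value z = 0.4.
p_table = 1 - std_normal.cdf(0.4)

# Same calculation without rounding z down to one decimal.
p_finer = 1 - std_normal.cdf(0.43)

print(f"P-value using z = 0.4:  {p_table:.4f}")
print(f"P-value using z = 0.43: {p_finer:.4f}")
```

Both values are far above the usual cut-offs, so the rounding does not change the outcome of the test.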
24
5. Decision

If we think the P-value is too low to believe the
observed test statistic is obtained by chance
only, then we would reject chance (reject the
null hypothesis) and conclude that a
statistically significant relationship exists
(accept the alternative hypothesis).
Otherwise, we fail to reject chance and do not
reject the null hypothesis of no relationship
(result not statistically significant).

Typical Cut-off for the P-value

Commonly, P-values less than 0.05 are considered
to be small enough to reject chance.
Some researchers use 0.10 or 0.01 as the cut-off
instead of 0.05.
This cut-off value is typically referred to as
the significance level α of the test.
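The decision rule amounts to a single comparison. In the sketch below, `decide` is a hypothetical function name and 0.05 is used as the default cut-off only because it is the most common choice.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the significance level alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "fail to reject H0 (not statistically significant)"


# The case-study P-value of 0.3446 is well above every usual cut-off.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha}: {decide(0.3446, alpha)}")
```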

25
P-value for Testing Proportions

Ha: p > p0
P-value is the probability of getting a value as
large or larger than the observed test statistic
(z) value.
Ha: p < p0
P-value is the probability of getting a value as
small or smaller than the observed test statistic
(z) value.
Ha: p ≠ p0
P-value is two times the probability of getting a
value as large or larger than the absolute value
of the observed test statistic (z) value.
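The three cases above can be collected into one function. This is a sketch assuming the test statistic follows the standard normal curve. The function name `p_value` and the labels "greater", "less", and "two-sided" are naming choices made for this illustration.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value for an observed z under one of the three alternatives."""
    cdf = NormalDist().cdf
    if alternative == "greater":    # Ha: p > p0
        return 1 - cdf(z)
    if alternative == "less":       # Ha: p < p0
        return cdf(z)
    if alternative == "two-sided":  # Ha: p differs from p0
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(f"z = 2.0, {alt}: {p_value(2.0, alt):.4f}")
```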