Biostatistics and Computer Applications

Provided by: dafen

1
Biostatistics and Computer Applications
  • Parameter Estimation
  • Confidence interval
  • Sample size estimation
  • Inference for variance
  • SAS Programming
  • 1/2/2003

2
Recap (Hypothesis test)
  • Steps in hypothesis test
  • Regions of acceptance and rejection
  • One tailed and two-tailed test
  • Type I error and Type II error
  • One sample hypothesis test
  • Two sample independent test
  • Two paired sample test

3
Recap (Region of acceptance and rejection)
[Figure: sampling distribution of X-bar on the z scale. Accept H0 for -1.96 ≤ z ≤ 1.96; reject H0 in either tail.]
4
Recap (Hypothesis test)
  • One sample
  • Two samples (independent)
  • Paired t-test

5
Interval estimation
  • According to the sampling distribution, set an
    interval [L1, L2] for a parameter θ of a
    population such that the probability that θ lies
    within the interval is 1 - α, i.e.
    P(L1 ≤ θ ≤ L2) = 1 - α.
  • L1 and L2 are the parameter's confidence limits
    (CL), [L1, L2] is the confidence interval (CI),
    and 1 - α is the confidence coefficient.

6
Confidence Interval
  • A Confidence Interval is a range (or an interval)
    of values that is likely to contain the true
    value of the population parameter (e.g., mean,
    standard deviation).
  • Influenced by degree of confidence (1 - α)
  • Balance between precision (as reflected in the
    width of the CI) and reliability (as expressed by
    the degree of confidence). Common choices are
    95% and 99%.

7
Calculating the Confidence Interval (CI) of a
Mean
  • Z distribution
  • Considering sample mean
  • In general,
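
The formulas on this slide did not survive in the transcript. As a minimal sketch, assuming the usual z-based interval (sample mean ± critical z × σ/√n, with σ known), the calculation could look like this in Python. The function name and the input numbers are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_z(xbar, sigma, n, confidence=0.95):
    """Two-sided z-based confidence interval for a population mean."""
    alpha = 1.0 - confidence
    # Two-tailed critical value, about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sigma / sqrt(n)  # standard error of the sample mean
    return xbar - z_crit * se, xbar + z_crit * se


# Hypothetical inputs, chosen only to illustrate the call.
low, high = mean_ci_z(xbar=100.0, sigma=2.0, n=16)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

When σ is unknown and n is small, the critical value would come from the t distribution instead, which the Python standard library does not provide.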

8
Interpreting the Confidence Interval
  • We have the sample mean x̄ and do not know the
    population mean μ, but we have (1 - α) confidence
    to say that the interval [L1, L2] includes μ.

9
Example
  • Sugar packing machine: calculate the 95%
    confidence interval for μ.

As this CI does not include 100, we reject
H0: μ = 100 at the 0.05 level.
10
What the Confidence Interval Does Not Mean!
  • We cannot state that there is a 95% chance that
    the true population mean is contained within any
    particular observed confidence interval, because
    the population mean is a parameter, or a fixed
    value, and therefore is either inside or outside
    of the estimated interval. It cannot be inside an
    interval 95% of the time.
  • There is no uncertainty about the sample
    statistics (mean, SD, etc). We are 100% sure that
    we calculated them correctly.

Interpretation: We do not know the population
mean, but we can be sure that on average 95 out
of 100 CIs similarly obtained would include the
population mean. If we repeat this procedure 100
times, the intervals constructed in this manner
will include the true mean (μ) about 95 times.
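
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: it assumes a normal population with known σ, and the population values, sample size, number of repeats and seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage(mu=100.0, sigma=2.0, n=16, repeats=2000, confidence=0.95, seed=1):
    """Fraction of simulated z-based CIs that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z_crit * sigma / sqrt(n)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        # mu is fixed; the interval is what varies from sample to sample.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / repeats


print(f"Share of intervals covering mu: {coverage():.3f}")
```

The printed share should be close to 0.95. Each individual interval either contains μ or does not.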
11
Relation between confidence intervals and
hypothesis test (significance test)
  • If the 95% confidence interval does not contain
    μ0, then the null hypothesis H0: μ = μ0 is
    rejected at the 0.05 level (two-tailed test).
  • Conversely, if the 95% confidence interval does
    contain μ0, then the null hypothesis is accepted
    at the 0.05 level.

12
Another example
  • Light level at the floor of a tree canopy: the 95%
    confidence interval of the population mean μ.

As this CI includes 3.0, we accept H0: μ = 3.0
at the α = 0.05 level.
13
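Confidence Interval (CI) for μ1 - μ2
For the z distribution, the (1 - α) confidence
interval for (μ1 - μ2) is
(x̄1 - x̄2) ± z_α · σ(x̄1 - x̄2).
For the t distribution, the (1 - α) confidence
interval for (μ1 - μ2) is
(x̄1 - x̄2) ± t_α(df) · s(x̄1 - x̄2).
For paired samples, the (1 - α) confidence
interval for μd is d̄ ± t_α(n-1) · s(d̄).
Here z_α and t_α(df) are two-tailed critical values,
and σ(·) and s(·) are the standard errors.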
14
Confidence Interval (CI) for μ1 - μ2
Calculation of the standard error and df is the same
as in the hypothesis test. If the interval
[L1, L2] includes 0, we accept H0: μ1 = μ2
at the α level; otherwise, we reject
H0.
15
Example of Confidence Interval (CI) for μ1 - μ2
Example 1: n1 = n2 = 200. Example 2: two viruses,
tobacco leaves, dead spots.
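
The data for these examples are missing from the transcript. The sketch below therefore uses hypothetical means and variances, keeping only n1 = n2 = 200 from the slide. It assumes the large-sample z interval for a difference of two independent means, with the standard error taken as sqrt(s1²/n1 + s2²/n2).

```python
from math import sqrt
from statistics import NormalDist


def diff_ci_z(mean1, var1, n1, mean2, var2, n2, confidence=0.95):
    """Large-sample z interval for mu1 - mu2 (independent samples)."""
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical summary statistics; only the sample sizes follow the slide.
low, high = diff_ci_z(52.0, 16.0, 200, 51.0, 25.0, 200)
reject_h0 = not (low <= 0.0 <= high)  # H0: mu1 = mu2
print(f"95% CI for mu1 - mu2: [{low:.3f}, {high:.3f}], reject H0: {reject_h0}")
```

An interval that excludes 0 corresponds to rejecting H0: μ1 = μ2 at the matching α level.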
16
One-sided Confidence Interval
  • Only one confidence limit is calculated.

  • the confidence interval is
  • the
    confidence interval is
  • Use the critical z or t value for 1 - α rather
    than 1 - α/2. Thus, for the z distribution with
    α = 0.05, we use 1.645 instead of 1.96.
  • The relationship between confidence interval and
    hypothesis test is the same. If μ0 is included in
    the confidence interval for one mean, or 0 is
    included in the confidence interval for two
    means, accept H0; otherwise, reject H0.
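
The two critical values quoted above can be reproduced from the standard normal quantile function. This is a small check in Python; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
# Two-sided interval: split alpha between both tails.
z_two_sided = NormalDist().inv_cdf(1 - alpha / 2)
# One-sided interval: put all of alpha in one tail.
z_one_sided = NormalDist().inv_cdf(1 - alpha)

print(f"two-sided: {z_two_sided:.3f}, one-sided: {z_one_sided:.3f}")
```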

17
Sample size determination
  • As the standard error decreases, the confidence
    interval becomes narrower and the estimate
    becomes more precise.
  • We can decrease the standard error by increasing
    the sample size n.
  • But increasing n increases cost and other
    expenses.
  • So we need a sample size n that guarantees the
    precision of the parameter estimate from the
    sample with a certain confidence.

18
Sample size determination
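  • One sample mean: n = t_α² · s² / d², where d is
    the largest acceptable difference between the
    sample mean and the population mean
  • t_α is the two-tailed critical value t_α(df)

We first use z_α in place of t_α to calculate n; if
n < 30, we recalculate n using t_α(df).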
19
Sample size determination
  • As variance increases, sample size increases
  • As significance level decreases, sample size
    increases
  • As the difference between means increases, sample
    size decreases

20
Examples
  • We measured a certain variable with s = 5.5 units.
    In order to get a sample mean that differs from
    the population mean by no more than 1 unit with
    99% confidence, how many individuals do we need to
    sample?
  • s = 5.5, α = 0.01, z_0.01 = 2.58
21
Sample Size Estimation for Comparison of Two Means
  • Two independent samples
  • Paired samples

22
Example of Sample Size Estimation
  • Two treatments: A = 24, 20, 29, 25 kg; B = 18, 24,
    15, 19 kg. Is there any difference between the
    two treatments? If we want the difference between
    the sample means to be within 4 kg of the
    difference between the population means with 95%
    confidence, how many individuals do we need to
    measure?

23
Statistical inference for variance
24
One Sample Chi-Square Test
  • We have a sample (n, s²) and a known population
    variance σ0² = C.
  • H0: σ² = σ0² vs HA: σ² ≠ σ0²
  • Test statistic is
  • χ² = (n-1) s²/σ0² ~ χ²(n-1)
  • (if n > 30, )
  • Reject H0 if χ² > χ²_{α/2}(n-1) or
    χ² < χ²_{1-α/2}(n-1)
  • (1 - α) confidence interval for σ²:
  • L1 = (n-1) s²/χ²_{α/2}, L2 = (n-1) s²/χ²_{1-α/2}
25
Example One Sample Chi-Square Test
  • Packing machine: σ0² = 2. Sample: n = 10, s² = 2.5.
    Is the sample variance significantly different
    from 2? What is the 95% confidence interval for σ²?
  • H0: σ² = σ0² vs HA: σ² ≠ σ0²
  • χ²_{0.975}(9) = 2.70, χ²_{0.025}(9) = 19.02
  • χ² = (n-1) s²/σ0² = 11.25
  • Accept H0, as 2.70 < χ² = 11.25 < 19.02
  • 95% confidence interval for σ² (not symmetric):
  • L1 = (n-1) s²/χ²_{α/2} = 1.18
  • L2 = (n-1) s²/χ²_{1-α/2} = 8.33
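
The arithmetic of this example can be retraced in a few lines of Python. The sketch takes the two critical values for 9 degrees of freedom (2.70 and 19.02) from the slide's table, because the standard library has no chi-square quantile function. The function name is hypothetical.

```python
def chi_square_variance_test(n, s2, sigma0_sq, chi2_upper, chi2_lower):
    """One-sample chi-square test and CI for a variance, given table values."""
    df = n - 1
    statistic = df * s2 / sigma0_sq
    # Two-sided test: reject when the statistic falls in either tail.
    reject = statistic > chi2_upper or statistic < chi2_lower
    # Dividing by the larger critical value gives the lower limit.
    ci = (df * s2 / chi2_upper, df * s2 / chi2_lower)
    return statistic, reject, ci


statistic, reject, (l1, l2) = chi_square_variance_test(
    n=10, s2=2.5, sigma0_sq=2.0, chi2_upper=19.02, chi2_lower=2.70
)
print(f"chi2 = {statistic:.2f}, reject H0: {reject}, CI = [{l1:.2f}, {l2:.2f}]")
```

The interval is not symmetric around s² = 2.5 because the chi-square distribution is skewed.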

26
Two Samples Variance (F Test)
  • We have two samples: (n1, s1²) and (n2, s2²).
  • H0: σ1² ≤ σ2² vs HA: σ1² > σ2²
  • Test statistic is (always put the larger s² as s1²)
  • F = s1²/s2²
  • Reject H0 if
  • F > F_α(n1-1, n2-1)
  • If n1 and n2 > 100, then use a z test

27
Example Two Samples Variance
  • Test whether the variance of boys' height is
    larger than that of girls' (data not real).
  • n1 = 10, s1² = 22.15; n2 = 8, s2² = 4.11
  • H0: σ1² ≤ σ2² vs HA: σ1² > σ2²
  • F_0.05(9,7) = 3.68
  • F = s1²/s2² = 22.15/4.11 = 5.39
  • Reject H0, as F > F_0.05
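
A short Python sketch of the same variance-ratio calculation follows. The critical value 3.68 is the table value quoted on the slide, and the function name is hypothetical.

```python
def variance_ratio_test(s1_sq, s2_sq, f_crit):
    """One-sided F test of HA: sigma1^2 > sigma2^2, given a table value."""
    f_stat = s1_sq / s2_sq  # sample 1 is the one claimed to vary more
    return f_stat, f_stat > f_crit


# Boys: n1 = 10, s1^2 = 22.15; girls: n2 = 8, s2^2 = 4.11; F_0.05(9, 7) = 3.68.
f_stat, reject = variance_ratio_test(22.15, 4.11, f_crit=3.68)
print(f"F = {f_stat:.2f}, reject H0: {reject}")
```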

28
Multiple variances (Bartlett test)
  • Draw k independent samples from normally
    distributed populations, with sample size n and
    variances si².
  • H0: all σi² are the same vs HA: at least one σi²
    differs from the others
  • Test statistic is
  • Reject H0 if
  • χ² > χ²_α(k-1)

29
Example Multiple variances (Bartlett test)
k = 5, n = 20
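
Neither the test statistic nor the data for this example survived in the transcript. The sketch below assumes the textbook form of Bartlett's statistic (pooled-variance log term divided by a correction factor). Only k = 5 and n = 20 follow the slide, and the five variances are made up. The critical value 9.49 is the standard table value for χ² with 4 degrees of freedom at α = 0.05.

```python
from math import log


def bartlett_statistic(variances, sizes):
    """Bartlett's chi-square statistic for homogeneity of k variances."""
    k = len(variances)
    dfs = [n - 1 for n in sizes]
    total_df = sum(dfs)
    # Pooled variance: df-weighted average of the sample variances.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    numerator = total_df * log(pooled) - sum(
        df * log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction


# Hypothetical sample variances for k = 5 groups of n = 20.
variances = [4.1, 3.6, 5.2, 4.8, 3.9]
stat = bartlett_statistic(variances, [20] * 5)
chi2_crit = 9.49  # table value, chi-square with k - 1 = 4 df, alpha = 0.05
print(f"chi2 = {stat:.3f}, reject H0: {stat > chi2_crit}")
```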
30
Statistical Inference - Proportions
  • One sample: hypothesis test, confidence interval
  • Two samples: hypothesis test, confidence interval

31
One sample Tests Binomial Proportion

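Test if the sample proportion p̂ (the estimate of p)
is different from a prescribed value p0, and estimate
the confidence interval for p.
H0: p = p0 vs HA: p ≠ p0
Test statistic: z = (p̂ - p0)/σ_p, where
σ_p = sqrt(p0(1 - p0)/n)
Confidence interval: L1 = p̂ - z_α σ_p, L2 = p̂ + z_α σ_p
If H0 is rejected: L1 = p̂ - z_α s_p, L2 = p̂ + z_α s_p,
where s_p = sqrt(p̂(1 - p̂)/n)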
32
Example

Suppose that there is an equal chance that a
child is male or female. We find in a sample of
114 workers at a pesticide plant (each with only
one child) that 66 of the children are female. Is
this evidence that the working conditions change
the proportion of males and females (p0 = 0.5)?
Data: n = 114, p̂ = 66/114 = 0.5789
H0: p = p0 vs HA: p ≠ p0
The critical value for a two-sided α = 0.05 test
is 1.96. Since the test statistic, z = 1.69, is
smaller than the critical value, we accept H0.
95% confidence interval:
L1 = 0.5789 - 1.96 × 0.046 = 0.49
L2 = 0.5789 + 1.96 × 0.046 = 0.67
The interval includes 0.50.
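
The numbers in this example can be retraced with a short Python sketch. It assumes the normal approximation, with the standard error under H0 (based on p0) used for the test and the standard error based on the sample proportion used for the interval. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05):
    """Two-sided one-sample z test and CI for a binomial proportion."""
    p_hat = x / n
    sigma_p = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / sigma_p
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    s_p = sqrt(p_hat * (1 - p_hat) / n)  # standard error from the sample
    ci = (p_hat - z_crit * s_p, p_hat + z_crit * s_p)
    return z, abs(z) > z_crit, ci


# 66 female children among 114, tested against p0 = 0.5.
z, reject, (l1, l2) = proportion_z_test(x=66, n=114, p0=0.5)
print(f"z = {z:.2f}, reject H0: {reject}, CI = [{l1:.2f}, {l2:.2f}]")
```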
33
Hypothesis Testing for 2 Sample Proportions
The hypothesis that the two populations are the
same is addressed by the hypotheses
H0: p1 = p2 vs HA: p1 ≠ p2
Test statistic (same rules as for one sample)
34
Hypothesis Testing for 2 Sample Proportions
(1 - α) confidence interval for (p1 - p2)
If H0 is rejected
35
Example for 2 Sample Proportions
Seat belt safety study: We can test H0: p1 = p2,
but we first need a common estimate of p (under
the null).
36
Example for 2 Sample Proportions
Since z < 1.96, we fail to reject H0 and
conclude that the observed difference is not
statistically significant at the 0.05 level.
95% confidence interval for (p1 - p2): this
interval includes 0, which confirms that H0
should not be rejected.
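
The seat belt data are not preserved in the transcript, so the sketch below uses made-up counts. It assumes the usual approach: a pooled (common) estimate of p for the test statistic under H0, and unpooled standard errors for the confidence interval. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-sided z test of H0: p1 = p2, plus a CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common estimate under the null
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_null
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The interval does not assume p1 = p2, so it uses unpooled variances.
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return z, abs(z) > z_crit, (diff - z_crit * se_ci, diff + z_crit * se_ci)


# Hypothetical counts: 12 of 100 in group 1, 18 of 100 in group 2.
z, reject, ci = two_proportion_z_test(12, 100, 18, 100)
print(f"z = {z:.3f}, reject H0: {reject}, CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```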
37
Sample Size
1-sample Proportion 2-sample Proportion
38
Example for sample Size
Example: We know the probability of a purple-flowered
plant in the F2 generation is p = 0.75. We want a
sample p between 0.74 and 0.76 with 95% confidence;
how many plants do we need to sample?
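
The sample size formulas on the previous slide are missing from the transcript. As a sketch, assuming the normal-approximation formula n = z² · p(1 - p) / d² for one proportion, with d = 0.01 as the half-width of the target range, the example could be worked as follows. The function name is hypothetical.

```python
from math import ceil


def sample_size_proportion(p, d, crit):
    """Smallest n with crit * sqrt(p * (1 - p) / n) <= d."""
    raw = crit ** 2 * p * (1 - p) / d ** 2
    # Subtract a tiny tolerance so floating-point noise on an exact
    # integer result does not push the answer up by one.
    return ceil(raw - 1e-9)


# p = 0.75, target range 0.74 to 0.76 (d = 0.01), two-tailed z = 1.96.
n = sample_size_proportion(p=0.75, d=0.01, crit=1.96)
print(n)
```

The required n grows with 1/d², so halving the allowed margin roughly quadruples the number of plants.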
39
SAS Programming
  • PROC TTEST
  • TTEST performs t tests for one sample, two
    samples, and paired observations. The one-sample
    t test compares the mean of the sample to a given
    number. The two-sample t test compares the mean
    of the first sample minus the mean of the second
    sample to a given number. The paired observations
    t test compares the mean of the differences in
    the observations to a given number.
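
To make the three comparisons concrete outside SAS, here is a rough Python sketch of the t statistics involved. It is not PROC TTEST itself: it computes only the statistics (no p-values, since the Python standard library has no t distribution) and assumes equal variances for the two-sample case. The data are treatments A and B from the sample size example earlier in the deck. Treating them as paired, and the reference value 20, are purely illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(x, mu0=0.0):
    """t statistic comparing the mean of x to a given number mu0."""
    return (mean(x) - mu0) / (stdev(x) / sqrt(len(x)))


def two_sample_pooled_t(x, y, diff0=0.0):
    """Pooled-variance t statistic for mean(x) - mean(y) against diff0."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y) - diff0) / sqrt(pooled * (1 / n1 + 1 / n2))


def paired_t(x, y, diff0=0.0):
    """t statistic for the mean of the pairwise differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    return one_sample_t(diffs, diff0)


a = [24, 20, 29, 25]  # treatment A (kg)
b = [18, 24, 15, 19]  # treatment B (kg)
print(f"one-sample t (A vs 20): {one_sample_t(a, 20):.3f}")
print(f"two-sample pooled t:    {two_sample_pooled_t(a, b):.3f}")
print(f"paired t (illustrative): {paired_t(a, b):.3f}")
```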

40
PROC TTEST
  • PROC TTEST options;
  • CLASS variable;
  • VAR variables;
  • PAIRED x1*x2;

41
PROC TTEST
  • PROC TTEST options;
  • Options
  • ALPHA=p: set the significance level; CIs use
    confidence level 1 - p
  • CI=EQUAL: equal-tailed CI for the standard
    deviation
  • COCHRAN
  • DATA=SAS-data-set: data set name
  • H0=m: set H0: μ = m instead of μ = 0.

42
PROC TTEST
  • CLASS variable: specifies the variable that
    separates the whole data set into two parts. The
    variable can have only two levels, for two
    independent samples. No CLASS statement is needed
    for one-sample and paired t-tests.

43
PROC TTEST
  • PAIRED x1*x2: tests d = x1 - x2, i.e. whether
    μd = 0, for the paired t-test.
  • VAR variables: specifies the variables to be
    analyzed. These variables should not be in the
    PAIRED statement.