Introduction to Statistical Inference - PowerPoint PPT Presentation

Slides: 34
Provided by: larryw4


1
Chapter 6
  • Introduction to Statistical Inference

2
Introduction
  • Goal: Make statements regarding a population (or
    state of nature) based on a sample of
    measurements
  • Probability statements are used to substantiate
    claims
  • Example: Clinical trial for Pravachol (5-year
    follow-up)
  • Of 3302 subjects receiving Pravachol, 174 had
    heart incidents
  • Of 3293 subjects receiving placebo, 248 had heart
    incidents
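
As a quick check on the trial counts above, the sample proportions can be computed directly. This is a minimal sketch; the helper name `event_rate` is an illustrative choice and does not come from the slides.

```python
def event_rate(events: int, subjects: int) -> float:
    """Sample proportion of subjects who experienced the event."""
    return events / subjects

pravachol = event_rate(174, 3302)  # counts taken from the slide
placebo = event_rate(248, 3293)

print(f"Pravachol: {pravachol:.2%}")
print(f"Placebo:   {placebo:.2%}")
print(f"Difference (placebo - Pravachol): {placebo - pravachol:.2%}")
```

The two rates come to roughly 5.27% and 7.53%. Whether a gap of that size could plausibly arise by chance is the question the rest of the chapter addresses.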

3
Estimating with Confidence
  • Goal: Estimate a population mean (proportion)
    based on sample mean (proportion)
  • Unknown: Parameter (μ, p)
  • Known: Approximate Sampling Distribution of
    Statistic
  • Recall: For a random variable that is normally
    distributed, the probability that it will fall
    within 2 standard deviations of its mean is
    approximately 0.95
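
The "within 2 standard deviations" rule can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of landing within 2 (or 1.96) standard deviations of the mean
within_2_sd = z.cdf(2) - z.cdf(-2)
within_196_sd = z.cdf(1.96) - z.cdf(-1.96)

print(f"P(|Z| < 2)    = {within_2_sd:.4f}")
print(f"P(|Z| < 1.96) = {within_196_sd:.4f}")
```

The value 2 is a convenient rounding: the exact 95% cutoff is about 1.96, and 2 standard deviations covers slightly more than 95%.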

4
Estimating with Confidence
  • Although the parameter is unknown, it is highly
    likely that our sample mean or proportion
    (estimate) will lie within 2 standard deviations
    (aka standard errors) of the population mean or
    proportion (parameter)
  • Margin of Error: Measure of the upper bound on
    sampling error with a fixed level (we will use
    95%) of confidence. That will correspond to 2
    standard errors

5
Confidence Interval for a Mean μ
  • Confidence Coefficient (C): Probability (based on
    repeated samples and construction of intervals)
    that a confidence interval will contain the true
    mean μ
  • Common choices of C and resulting intervals
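
A sketch of how common choices of C translate into intervals, assuming the standard known-σ form x̄ ± z*·σ/√n. That form is an assumption here, and the function name `z_interval` and the numbers passed to it are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Known-sigma interval for a mean: xbar +/- z* * sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)  # leaves (1 - C)/2 in each tail
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# xbar, sigma and n below are hypothetical values for illustration only
for conf in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    low, high = z_interval(xbar=10.0, sigma=2.0, n=25, conf=conf)
    print(f"C = {conf:.2f}: z* = {z_star:.3f}, interval = ({low:.3f}, {high:.3f})")
```

The critical values come out near 1.645, 1.960, and 2.576 for C = 0.90, 0.95, and 0.99.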

6
[Figure: labels C and μ]
7
[Figure: labels C and 0]
8
Philadelphia Monthly Rainfall (1825-1869)
9
4 Random Samples of Size n = 20, 95% CIs
10
Factors Affecting Confidence Interval Width
  • Goal: Have precise (narrow) confidence intervals
  • Confidence Level (C): Increasing C increases the
    probability that an interval contains the
    parameter, which implies a wider confidence
    interval. Reducing C will shorten the interval
    (at a cost in confidence)
  • Sample size (n): Increasing n decreases the
    standard error of the estimate, the margin of
    error, and the width of the interval (quadrupling
    n cuts the width in half)
  • Standard Deviation (σ): The more variable the
    individual measurements, the wider the interval.
    Potential ways to reduce σ are to focus on a more
    precise target population or use a more precise
    measuring instrument. Often nothing can be done,
    as nature determines σ
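
The three factors can be seen numerically. This sketch assumes the known-σ z-interval, whose full width is 2·z*·σ/√n; the function name `ci_width` and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, conf=0.95):
    """Full width of a known-sigma z-interval: 2 * z* * sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return 2 * z_star * sigma / sqrt(n)

base = ci_width(sigma=10, n=25)  # hypothetical baseline
print(f"baseline (sigma=10, n=25, C=0.95): {base:.2f}")
print(f"n quadrupled to 100:               {ci_width(sigma=10, n=100):.2f}")
print(f"C raised to 0.99:                  {ci_width(sigma=10, n=25, conf=0.99):.2f}")
print(f"sigma halved to 5:                 {ci_width(sigma=5, n=25):.2f}")
```

Quadrupling n and halving σ both cut the width in half, while raising C widens the interval.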

11
Selecting the Sample Size
  • Before collecting sample data, we usually have a
    goal for how large the margin of error should be
    to have a useful estimate of the unknown parameter
    (particularly when comparing two populations)
  • Let m be the desired level of the margin of error
    and σ be the standard deviation of the population
    of measurements (typically will be unknown and
    must be estimated based on previous research or a
    pilot study)
  • The sample size giving this margin of error is
    n = (z*σ/m)^2, where z* is the critical value for
    the chosen confidence level (about 2 for 95%)
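
A minimal sketch of the sample-size calculation, assuming the standard relationship n = (z*·σ/m)² rounded up to the next whole subject. The function name `sample_size_for_margin` and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_margin(sigma, margin, conf=0.95):
    """Smallest n whose known-sigma margin of error is at most `margin`."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    return ceil((z_star * sigma / margin) ** 2)

# Hypothetical planning values: sigma guessed from a pilot study
print(sample_size_for_margin(sigma=8, margin=1))  # tighter margin, larger n
print(sample_size_for_margin(sigma=8, margin=2))  # doubling m cuts n to about a quarter
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.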

12
Precautions
  • Data should be a simple random sample from the
    population (or at least be treatable as
    independent observations)
  • More complex sampling designs require adjustments
    to the formulas (see texts such as Elementary
    Survey Sampling by Scheaffer, Mendenhall, Ott)
  • Biased sampling designs give meaningless results
  • Small sample sizes from nonnormal distributions
    will have coverage probabilities typically
    below the nominal level (C)
  • Typically σ is unknown. Replacing it with the
    sample standard deviation s works as a good
    approximation in large samples

13
Significance Tests
  • Method of using sample (observed) data to
    challenge a hypothesis regarding a state of
    nature (represented as particular parameter
    value(s))
  • Begin by stating a research hypothesis that
    challenges a statement of status quo (or
    equality of 2 populations)
  • State the current state or status quo as a
    statement regarding population parameter(s)
  • Obtain sample data and see to what extent it
    agrees/disagrees with the status quo
  • Conclude that the status quo is not true if the
    observed data would be highly unlikely (low
    probability) were it true

14
Pravachol and Olestra
  • Pravachol vs Placebo wrt heart disease/death
  • Pravachol: 5.27% of 3302 patients suffer MI or
    death due to CHD
  • Placebo: 7.53% of 3293 patients suffer MI or
    death due to CHD
  • Probability of a difference this large for
    Pravachol if it is no more effective than placebo
    is .000088 (will learn formula later)
  • Olestra vs Triglyceride Chips wrt GI Symptoms
  • Olestra: 15.81% of 563 subjects report GI
    symptoms
  • Triglyceride: 17.58% of 529 subjects report GI
    symptoms
  • Probability of a difference this large in either
    direction (olestra better or worse) is .4354
  • Strong evidence of a Pravachol effect vs placebo
  • Weak to no evidence of an Olestra effect vs
    Triglyceride
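
The slide defers the formula, so the following is only a sketch under an assumption: that the probabilities come from a pooled two-proportion z-test. The Pravachol counts (174 and 248) appear earlier in the slides; the Olestra counts (89 and 93) are back-calculated from the reported percentages and are not given in the source.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

std_normal = NormalDist()

# Pravachol vs placebo: 1-sided (is the Pravachol event rate lower?)
z_prav = two_proportion_z(174, 3302, 248, 3293)
p_prav = std_normal.cdf(z_prav)

# Olestra vs triglyceride: 2-sided. Counts 89 and 93 are back-calculated
# from 15.81% of 563 and 17.58% of 529 (an assumption, not slide data).
z_ole = two_proportion_z(89, 563, 93, 529)
p_ole = 2 * std_normal.cdf(-abs(z_ole))

print(f"Pravachol: z = {z_prav:.2f}, 1-sided P = {p_prav:.6f}")
print(f"Olestra:   z = {z_ole:.2f}, 2-sided P = {p_ole:.4f}")
```

Under these assumptions the results land close to the slide's .000088 and .4354. Small discrepancies are expected if the original values were read from a z table with z rounded to two decimals.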

15
Elements of a Significance Test
  • Null hypothesis (H0): Statement or theory being
    tested. Will be stated in terms of parameters and
    contain an equality. Test is set up under the
    assumption of its truth.
  • Alternative Hypothesis (Ha): Statement
    contradicting H0. Will be stated in terms of
    parameters and contain an inequality. Will only
    be accepted if strong evidence refutes H0 based
    on sample data. May be 1-sided or 2-sided,
    depending on theory being tested.
  • Test Statistic (TS): Quantity measuring
    discrepancy between sample statistic (estimate)
    and parameter value under H0
  • P-value: Probability (assuming H0 true) that we
    would observe sample data (test statistic) this
    extreme or more extreme in favor of the
    alternative hypothesis (Ha)

16
Example: Interference Effect
  • Does the way items are presented affect task
    time?
  • Subjects shown list of color names in 2 colors:
    different/black
  • Xi is the difference in times to read lists for
    subject i: diff-blk
  • H0: No interference effect, mean difference is 0
    (μ = 0)
  • Ha: Interference effect exists, mean difference >
    0 (μ > 0)
  • Assume standard deviation in differences is σ =
    8 (unrealistic)
  • Experiment to be based on n = 70 subjects

How likely is it to observe a sample mean
difference ≥ 2.39 if μ = 0?
17
P-value
[Figure: axis marks at 0 and 2.39]
18
Computing the P-Value
  • 2-sided Tests: How likely is it to observe a
    sample mean as far or farther from the value of
    the parameter under the null hypothesis? (H0:
    μ = μ0, Ha: μ ≠ μ0)

After obtaining the sample data, compute the mean
and convert it to a z-score (zobs) and find the
area above |zobs| and below -|zobs| from the
standard normal (z) table
  • 1-sided Tests: Obtain the area above zobs for
    upper tail tests (Ha: μ > μ0) or below zobs for
    lower tail tests (Ha: μ < μ0)
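
The table lookups described above can be sketched as one small function. The name `z_test_p_value` and the `alternative` labels are illustrative choices, and the normal CDF stands in for the printed z table.

```python
from statistics import NormalDist

def z_test_p_value(z_obs, alternative="two-sided"):
    """P-value for an observed z-score under the standard normal."""
    std_normal = NormalDist()
    if alternative == "greater":      # upper tail, Ha: mu > mu0
        return 1 - std_normal.cdf(z_obs)
    if alternative == "less":         # lower tail, Ha: mu < mu0
        return std_normal.cdf(z_obs)
    if alternative == "two-sided":    # both tails, Ha: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z_obs)))
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("greater", "less", "two-sided"):
    print(f"z = 1.96, {alt:>9}: P = {z_test_p_value(1.96, alt):.4f}")
```

For a given zobs, the 2-sided P-value is twice the smaller of the two 1-sided P-values.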

19
Interference Effect (1-sided Test)
  • Testing whether population mean time to read list
    of colors is higher when color is written in
    different color
  • Data: Xi = difference score for subject i
    (Different-Black)
  • Null hypothesis (H0): No interference effect (μ =
    0)
  • Alternative hypothesis (Ha): Interference effect
    (μ > 0)
  • Known: n = 70, σ = 8 (This won't be known in
    practice but can be replaced by sample s.d. for
    large samples)
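
A sketch of the 1-sided calculation for this example, using n = 70, σ = 8, and the observed sample mean difference of 2.39 quoted earlier in the slides. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sigma, mu0 = 70, 8.0, 0.0
xbar = 2.39  # observed sample mean difference quoted in the slides

se = sigma / sqrt(n)                    # standard error of the sample mean
z_obs = (xbar - mu0) / se               # observed z-score
p_upper = 1 - NormalDist().cdf(z_obs)   # area above z_obs (Ha: mu > 0)

print(f"standard error = {se:.3f}")
print(f"z_obs          = {z_obs:.2f}")
print(f"1-sided P      = {p_upper:.4f}")
```

The z-score is about 2.50 and the upper-tail area about 0.006, so a mean difference this large would be unusual if there were no interference effect.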

20
Interference Effect (2-sided Test)
  • Testing whether population mean time to read list
    of colors is affected (higher or lower) when
    color is written in different color
  • Data: Xi = difference score for subject i
    (Different-Black)
  • Null hypothesis (H0): No interference effect (μ =
    0)
  • Alternative hypothesis (Ha): Interference effect
    (+ or -) (μ ≠ 0)
  • Known: n = 70, σ = 8 (This won't be known in
    practice but can be replaced by sample s.d. for
    large samples)

21
Equivalence of 2-sided Tests and CIs
  • For α = 1-C, a 2-sided test conducted at the α
    significance level will give equivalent results
    to a C-level confidence interval
  • If entire interval > μ0, P-value < α, zobs > 0
    (conclude μ > μ0)
  • If entire interval < μ0, P-value < α, zobs < 0
    (conclude μ < μ0)
  • If interval contains μ0, P-value > α (don't
    conclude μ ≠ μ0)
  • Confidence interval is the set of parameter
    values for which we would fail to reject the null
    hypothesis (based on a 2-sided test)
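
The equivalence can be illustrated with the interference example (n = 70, σ = 8, observed mean difference 2.39, μ0 = 0). This is a sketch with illustrative variable names, assuming α = 0.05 and C = 0.95.

```python
from math import sqrt
from statistics import NormalDist

n, sigma, xbar, mu0 = 70, 8.0, 2.39, 0.0
alpha = 0.05
std_normal = NormalDist()

se = sigma / sqrt(n)
z_obs = (xbar - mu0) / se

# 2-sided P-value
p_two_sided = 2 * (1 - std_normal.cdf(abs(z_obs)))

# Confidence interval at level C = 1 - alpha
z_crit = std_normal.inv_cdf(1 - alpha / 2)
low, high = xbar - z_crit * se, xbar + z_crit * se

rejects_h0 = p_two_sided < alpha
interval_excludes_mu0 = not (low <= mu0 <= high)

print(f"2-sided P = {p_two_sided:.4f}, reject H0: {rejects_h0}")
print(f"95% CI = ({low:.3f}, {high:.3f}), excludes mu0: {interval_excludes_mu0}")
```

Both routes agree here: the interval lies entirely above 0 and the 2-sided P-value is below 0.05.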

22
Decision Rules and Critical Values
  • Once a significance (α) level has been chosen, a
    decision rule can be stated, based on a critical
    value
  • 2-sided tests: H0: μ = μ0, Ha: μ ≠ μ0
  • If test statistic (zobs) > zα/2: Reject H0 and
    conclude μ > μ0
  • If test statistic (zobs) < -zα/2: Reject H0 and
    conclude μ < μ0
  • If -zα/2 < zobs < zα/2: Do not reject H0: μ = μ0
  • 1-sided tests (Upper Tail): H0: μ = μ0, Ha: μ > μ0
  • If test statistic (zobs) > zα: Reject H0 and
    conclude μ > μ0
  • If zobs < zα: Do not reject H0: μ = μ0
  • 1-sided tests (Lower Tail): H0: μ = μ0, Ha: μ <
    μ0
  • If test statistic (zobs) < -zα: Reject H0 and
    conclude μ < μ0
  • If zobs > -zα: Do not reject H0: μ = μ0
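
The decision rules above can be sketched as a single function. The name `decide` and the returned strings are illustrative; critical values come from the standard normal inverse CDF rather than a table.

```python
from statistics import NormalDist

def decide(z_obs, alpha=0.05, alternative="two-sided"):
    """Apply the critical-value decision rule for a z-test."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
        if z_obs > crit:
            return "reject H0: conclude mu > mu0"
        if z_obs < -crit:
            return "reject H0: conclude mu < mu0"
        return "do not reject H0"
    crit = std_normal.inv_cdf(1 - alpha)           # z_{alpha}
    if alternative == "greater":
        return "reject H0: conclude mu > mu0" if z_obs > crit else "do not reject H0"
    if alternative == "less":
        return "reject H0: conclude mu < mu0" if z_obs < -crit else "do not reject H0"
    raise ValueError(f"unknown alternative: {alternative!r}")

print(decide(1.70))                         # between -1.96 and 1.96
print(decide(1.70, alternative="greater"))  # above 1.645
print(decide(-2.50))                        # below -1.96
```

The same zobs of 1.70 is not significant in the 2-sided test but is significant in the upper-tail test, which is why the direction of Ha should be fixed before seeing the data.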

23
Potential for Abuse of Tests
  • Should choose a significance (α) level in advance
    and report test conclusion
    (significant/nonsignificant) as well as the
    P-value. Significance level of 0.05 is widely
    used in the academic literature
  • Very large sample sizes can detect very small
    differences for a parameter value. A clinically
    meaningful effect should be determined, and a
    confidence interval reported when possible
  • A nonsignificant test result does not imply no
    effect (that H0 is true).
  • Many studies test many variables simultaneously.
    This can increase overall type I error rates
24
Large-Sample Test H0: μ1-μ2 = 0 vs HA: μ1-μ2 > 0
  • H0: μ1-μ2 = 0 (No difference in population means)
  • HA: μ1-μ2 > 0 (Population Mean 1 > Pop Mean 2)
  • Conclusion - Reject H0 if test statistic falls
    in rejection region, or equivalently the P-value
    is ≤ α
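
A sketch of the usual large-sample statistic for this test, z = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2). That form is assumed here rather than taken from the slides, and the summary numbers are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic for H0: mu1 - mu2 = 0."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se

alpha = 0.05
# Hypothetical summary statistics for illustration only
z_obs = two_sample_z(mean1=12.0, sd1=4.0, n1=50, mean2=10.0, sd2=5.0, n2=60)
z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
p_value = 1 - NormalDist().cdf(z_obs)       # upper-tail P-value

print(f"z_obs = {z_obs:.3f}, critical value = {z_alpha:.3f}")
print(f"P-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```

Comparing zobs with the critical value and comparing the P-value with α always lead to the same decision.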

25
Example - Botox for Cervical Dystonia
  • Patients - Individuals suffering from cervical
    dystonia
  • Response - Tsui score of severity of cervical
    dystonia (higher scores are more severe) at week
    8 of Tx
  • Research (alternative) hypothesis - Botox A
    decreases mean Tsui score more than placebo
  • Groups - Placebo (Group 1) and Botox A (Group 2)
  • Experimental (Sample) Results

Source: Wissel, et al (2001)
26
Example - Botox for Cervical Dystonia
Test whether Botox A produces lower mean Tsui
scores than placebo (α = 0.05)
Conclusion: Botox A produces lower mean Tsui
scores than placebo (since 2.82 > 1.645 and
P-value < 0.05)
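
The two numbers in the conclusion can be reproduced from the standard normal distribution. This sketch takes the test statistic 2.82 as given on the slide; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
z_obs = 2.82  # test statistic reported on the slide

z_alpha = std_normal.inv_cdf(1 - alpha)  # 1-sided critical value
p_value = 1 - std_normal.cdf(z_obs)      # upper-tail area beyond z_obs

print(f"critical value = {z_alpha:.3f}")
print(f"P-value        = {p_value:.4f}")
```

The critical value is about 1.645 and the upper-tail area beyond 2.82 is about 0.0024, consistent with rejecting H0 at α = 0.05.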
27
2-Sided Tests
  • Many studies don't assume a direction wrt the
    difference μ1-μ2
  • H0: μ1-μ2 = 0, HA: μ1-μ2 ≠ 0
  • Test statistic is the same as before
  • Decision Rule:
  • Conclude μ1-μ2 > 0 if zobs ≥ zα/2 (α = 0.05 ⇒
    zα/2 = 1.96)
  • Conclude μ1-μ2 < 0 if zobs ≤ -zα/2 (α = 0.05 ⇒
    -zα/2 = -1.96)
  • Do not reject μ1-μ2 = 0 if -zα/2 < zobs < zα/2
  • P-value = 2P(Z ≥ |zobs|)

28
Power of a Test
  • Power - Probability a test rejects H0 (depends on
    μ1-μ2)
  • H0 True: Power = P(Type I error) = α
  • H0 False: Power = 1-P(Type II error) = 1-β
  • Example:
  • H0: μ1-μ2 = 0, HA: μ1-μ2 > 0
  • σ1² = σ2² = 25, n1 = n2 = 25
  • Decision Rule: Reject H0 (at α = 0.05 significance
    level) if zobs ≥ 1.645, that is, if the difference
    in sample means is at least 1.645(1.414) = 2.326

29
Power of a Test
  • Now suppose in reality that μ1-μ2 = 3.0 (HA is
    true)
  • Power now refers to the probability we
    (correctly) reject the null hypothesis. Note that
    the sampling distribution of the difference in
    sample means is approximately normal, with mean
    3.0 and standard deviation (standard error)
    1.414.
  • Decision Rule (from last slide): Conclude
    population means differ if the sample mean for
    group 1 is at least 2.326 higher than the sample
    mean for group 2
  • Power for this case can be computed as
    P(Z ≥ (2.326 - 3.0)/1.414) = P(Z ≥ -0.48) ≈ 0.68
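
A sketch of the power calculation for this example, using the values stated on the slides (σ1² = σ2² = 25, n1 = n2 = 25, α = 0.05, true difference 3.0). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

var1 = var2 = 25.0
n1 = n2 = 25
alpha = 0.05
true_diff = 3.0

std_normal = NormalDist()
se = sqrt(var1 / n1 + var2 / n2)                # standard error of the difference
cutoff = std_normal.inv_cdf(1 - alpha) * se     # reject H0 if difference >= cutoff

# Under HA the difference in sample means is centered at true_diff, not 0
power = 1 - std_normal.cdf((cutoff - true_diff) / se)

print(f"standard error = {se:.3f}")
print(f"cutoff         = {cutoff:.3f}")
print(f"power          = {power:.3f}")
```

The standard error (about 1.414) and cutoff (about 2.326) match the slide, and the power comes out near 0.68: with these sample sizes, a true difference of 3.0 would be detected roughly two times in three.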

30
Power of a Test
  • All else being equal
  • As sample sizes increase, power increases
  • As population variances decrease, power
    increases
  • As the true mean difference increases, power
    increases

31
Power of a Test
Distribution (H0)
Distribution (HA)
32
Power of a Test
  • Power Curves for group sample sizes of
    25, 50, 75, 100 and varying true values μ1-μ2 with
    σ1 = σ2 = 5.
  • For given μ1-μ2, power increases with sample
    size
  • For given sample size, power increases with
    μ1-μ2
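
The curves themselves are not reproduced in the transcript, but the pattern can be tabulated. The sketch below assumes an upper-tail z-test at α = 0.05 with equal group sizes and σ1 = σ2 = 5, as in the earlier example. The function name `power_upper_tail` and the grid of true differences are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(delta, n, sigma=5.0, alpha=0.05):
    """Power of the upper-tail z-test of H0: mu1 - mu2 = 0, equal group sizes n."""
    std_normal = NormalDist()
    se = sqrt(2 * sigma**2 / n)                 # standard error of the difference
    z_alpha = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_alpha - delta / se)

deltas = (1, 2, 3, 4)
print("true diff: " + "  ".join(f"{d:5d}" for d in deltas))
for n in (25, 50, 75, 100):
    row = "  ".join(f"{power_upper_tail(d, n):5.3f}" for d in deltas)
    print(f"n = {n:3d}:   {row}")
```

Reading across a row shows power rising with the true difference, and reading down a column shows it rising with sample size. When the true difference is 0, the function returns α.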

33
Sample Size Calculations for Fixed Power
  • Goal - Choose sample sizes to have a favorable
    chance of detecting a clinically meaningful
    difference
  • Step 1 - Define an important difference in means
  • Case 1: σ approximated from prior experience or
    pilot study - difference can be stated in units of
    the data
  • Case 2: σ unknown - difference must be stated in
    units of standard deviations of the data
  • Step 2 - Choose the desired power to detect the
    clinically meaningful difference (1-β,
    typically at least .80). For 2-sided test