Introduction to choosing the correct statistical test

Slides: 89
Provided by: stanfordE4
Learn more at: https://web.stanford.edu
Transcript and Presenter's Notes

1
Introduction to choosing the correct statistical test
  • Tests for Continuous Outcomes I

2
Questions to ask yourself
  • What is the outcome (dependent) variable?
  • Is the outcome variable continuous,
    binary/categorical, or time-to-event?
  • What is the unit of observation?
  • person (most common)
  • lesion
  • half a face
  • physician
  • clinical center
  • Are the observations independent or correlated?
  • Independent: observations are unrelated (usually
    different, unrelated people)
  • Correlated: some observations are related to one
    another, for example the same person over time
    (repeated measures), lesions within a person,
    halves of a face, hands within a person, controls
    who have each been matched to a particular case,
    sibling pairs, husband-wife pairs, mother-infant
    pairs

3
Correlated data example
  • Split-face trial
  • Researchers assigned 56 subjects to apply SPF 85
    sunscreen to one side of their faces and SPF 50
    to the other prior to engaging in 5 hours of
    outdoor sports during mid-day.
  • Sides of the face were randomly assigned;
    subjects were blinded to SPF strength.
  • Outcome: sunburn

Russak JE et al. JAAD 2010;62:348-349.
4
Results
Table I. Dermatologist grading of sunburn after an
average of 5 hours of skiing/snowboarding
(P = .03, Fisher's exact test)

Sun protection factor   Sunburned   Not sunburned
85                      1           55
50                      8           48

Fisher's exact test compares the following
proportions: 1/56 versus 8/56. Note that
individuals are being counted twice!
5
Correct analysis of data
Table 1. Correct presentation of the data from
Russak JE et al. JAAD 2010;62:348-349
(P = .016, McNemar's test).

                    SPF-50 side
SPF-85 side         Sunburned   Not sunburned
Sunburned           1           0
Not sunburned       7           48

McNemar's test evaluates the probability of the
following: in all 7 of the 7 cases where the
sides of the face were discordant (i.e., one side
burned and the other did not), the SPF 50
side sustained the burn.
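
The two analyses above can be reproduced with a short sketch. It assumes exact two-sided versions of both tests (the slides do not say which variant of McNemar's test was used), and the function names are illustrative rather than taken from any package.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), total)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    return float(sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed))


def mcnemar_exact_two_sided(b, c):
    """Exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    tail = sum(comb(n, k) for k in range(min(b, c) + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Unpaired layout (each subject counted twice): SPF 85 sides vs. SPF 50 sides.
p_fisher = fisher_exact_two_sided(1, 55, 8, 48)
# Paired layout: 0 subjects burned only on the SPF 85 side, 7 only on the SPF 50 side.
p_mcnemar = mcnemar_exact_two_sided(0, 7)
print(round(p_fisher, 3), round(p_mcnemar, 3))
```

Only the discordant pairs enter McNemar's test; the 49 subjects whose two sides agreed carry no information about which sunscreen did better.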
6
Overview of common statistical tests
Outcome variable: Continuous (e.g., blood pressure, age, pain score)
  • Independent observations: t-test, ANOVA, linear correlation, linear regression
  • Correlated observations: paired t-test, repeated-measures ANOVA, mixed models/GEE modeling
  • Assumptions: outcome is normally distributed (important for small samples); outcome and predictor have a linear relationship
Outcome variable: Binary or categorical (e.g., breast cancer yes/no)
  • Independent observations: chi-square test, relative risks, logistic regression
  • Correlated observations: McNemar's test, conditional logistic regression, GEE modeling
  • Assumptions: chi-square test assumes sufficient numbers in each cell (>5)
Outcome variable: Time-to-event (e.g., time-to-death, time-to-fracture)
  • Independent observations: Kaplan-Meier statistics, Cox regression
  • Correlated observations: n/a
  • Assumptions: Cox regression assumes proportional hazards between groups
8
Continuous outcome (means)
Outcome variable: Continuous (e.g., blood pressure, age, pain score)
Independent observations
  • t-test: compares means between two independent groups
  • ANOVA: compares means between more than two independent groups
  • Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
  • Linear regression: multivariate regression technique when the outcome is continuous; gives slopes or adjusted means
Correlated observations
  • Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
  • Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
  • Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups
Alternatives if the normality assumption is violated (and small n): non-parametric statistics
  • Wilcoxon signed-rank test: non-parametric alternative to the paired t-test
  • Wilcoxon rank-sum test (Mann-Whitney U test): non-parametric alternative to the t-test
  • Kruskal-Wallis test: non-parametric alternative to ANOVA
  • Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation coefficient
10
Example: two-sample t-test
  • In 1980, some researchers reported that men have
    more mathematical ability than women, as
    evidenced by the 1979 SATs, where a sample of 30
    random male adolescents had a mean score ± 1
    standard deviation of 436 ± 77 and 30 random female
    adolescents scored lower: 416 ± 81 (genders were
    similar in educational backgrounds,
    socio-economic status, and age). Do you agree
    with the authors' conclusions?

11
Two-sample t-test
  • Statistical question: Is there a difference in
    SAT math scores between men and women?
  • What is the outcome variable? Math SAT scores
  • What type of variable is it? Continuous
  • Is it normally distributed? Yes
  • Are the observations correlated? No
  • Are groups being compared, and if so, how many?
    Yes, two
  • → two-sample t-test

12
Two-sample t-test mechanics
13
Data Summary
                  n     Sample mean   Sample standard deviation
Group 1: women    30    416           81
Group 2: men      30    436           77
14
Two-sample t-test
  • 1. Define your hypotheses (null, alternative)
  • H0: $\mu_{men} - \mu_{women}$ (math SAT) $= 0$
  • Ha: $\mu_{men} - \mu_{women}$ (math SAT) $\neq 0$ (two-sided)

15
Two-sample t-test
  • 2. Specify your null distribution
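  • F and M have approximately equal standard
    deviations/variances, so make a pooled estimate
    of standard deviation/variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

The standard error of a difference of two means
is

$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

Differences in means follow a T-distribution.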
16
T distribution
  • A t-distribution is like a Z distribution, except
    that it has slightly fatter tails to reflect the
    uncertainty added by estimating the standard
    deviation.
  • The bigger the sample size (i.e., the bigger the
    sample size used to estimate σ), the closer
    t becomes to Z.
  • If n > 100, t approaches Z.

17
Student's t Distribution
[Figure: density curves for the standard normal (t with df = ∞), t (df = 13), and t (df = 5), plotted against t and centered at 0]
Note: t → Z as n increases.
t-distributions are bell-shaped and symmetric,
but have fatter tails than the normal.
From Statistics for Managers Using Microsoft
Excel, 4th Edition, Prentice-Hall 2004
18
Student's t Table
        Upper tail area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

Let n = 3, so df = n - 1 = 2; α = .10, so α/2 = .05
in each tail. The corresponding t value is 2.920.
The body of the table contains t values, not
probabilities.
From Statistics for Managers Using Microsoft
Excel, 4th Edition, Prentice-Hall 2004
19
t distribution values
With comparison to the Z value

Confidence level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z
.80                1.372         1.325         1.310         1.28
.90                1.812         1.725         1.697         1.64
.95                2.228         2.086         2.042         1.96
.99                3.169         2.845         2.750         2.58

Note: t → Z as n increases.
From Statistics for Managers Using Microsoft
Excel, 4th Edition, Prentice-Hall 2004
20
Two-sample t-test
  • 2. Specify your null distribution
  • F and M have approximately equal standard
    deviations/variances, so make a pooled estimate
    of standard deviation/variance:

$$s_p^2 = \frac{29(81^2) + 29(77^2)}{58} = 6245$$

The standard error of a difference of two means
is

$$\sqrt{\frac{6245}{30} + \frac{6245}{30}} \approx 20.4$$

Differences in means follow a T-distribution;
here we have a T-distribution with 58 degrees of
freedom (60 observations - 2 means).
21
Two-sample t-test
  • 3. Observed difference in our experiment = 20
    points

22
Two-sample t-test
  • 4. Calculate the p-value of what you observed

T = observed difference / standard error
= 20 / 20.4 = 0.98
Critical value for a two-tailed p-value of .05 for
T with 58 df = 2.000. 0.98 < 2.000, so p > .05.
5. Do not reject the null! No evidence that men
are better in math.
23
Corresponding confidence interval
Note that the 95% confidence interval crosses 0
(the null value).
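
Steps 2 through 5 and the confidence interval can be checked from the summary statistics alone. This is a minimal sketch, assuming equal variances (the pooled estimate) and using the slide's critical value of 2.000 instead of an exact look-up; the function name is illustrative.

```python
from math import sqrt


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference, standard error, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var / n1 + pooled_var / n2)
    diff = mean1 - mean2
    return diff, se, diff / se, df


# Men (436, SD 77, n 30) minus women (416, SD 81, n 30), from the data summary.
diff, se, t_stat, df = pooled_t_from_summary(436, 77, 30, 416, 81, 30)

T_CRIT = 2.000  # two-tailed .05 critical value for 58 df, as quoted on the slide
ci_low, ci_high = diff - T_CRIT * se, diff + T_CRIT * se
reject_null = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f} on {df} df, SE = {se:.1f}, reject H0: {reject_null}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

The interval is simply the observed difference plus or minus the critical value times the standard error, which is why a non-significant test and an interval that crosses 0 always go together.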
24
Review Question 1
  • A t-distribution:
  • Is approximately a normal distribution if n > 100.
  • Can be used interchangeably with a normal
    distribution as long as the sample size is large
    enough.
  • Reflects the uncertainty introduced when using
    the sample, rather than population, standard
    deviation.
  • All of the above.

26
Review Question 2
  • In a medical student class, the 6 people born on
    odd days had heights of 64.6 ± 4 inches; the 10
    people born on even days had heights of 71.1 ± 5
    inches. Height is roughly normally distributed.
    Which of the following best represents the
    correct statistical test for these data?
  • a.
  • b.
  • c.
  • d.

29
Example: paired t-test
TABLE 1. Difference between Means of "Before"
and "After" Botulinum Toxin A Treatment

                       Before BTxnA   After BTxnA   Difference   Significance
Social skills          5.90           5.84          NS           .293
Academic performance   5.86           5.78          .08          .068
Date success           5.17           5.30          .13          .014
Occupational success   6.08           5.97          .11          .013
Attractiveness         4.94           5.07          .13          .030
Financial success      5.67           5.61          NS           .230
Relationship success   5.68           5.68          NS           .967
Athletic success       5.15           5.38          .23          .000

Table footnotes: significant at 5% level; significant at 1% level.



30
Paired t-test
  • Statistical question: Is there a difference in
    date success after BoTox?
  • What is the outcome variable? Date success
  • What type of variable is it? Continuous
  • Is it normally distributed? Yes
  • Are the observations correlated? Yes, it's the
    same patients before and after
  • How many time points are being compared? Two
  • → paired t-test

31
Paired t-test mechanics
  1. Calculate the change in date success score for
    each person.
  2. Calculate the average change in date success for
    the sample. (.13)
  3. Calculate the standard error of the change in
    date success. (.05)
  4. Calculate a T-statistic by dividing the mean
    change by the standard error (T = .13/.05 = 2.6).
  5. Look up the corresponding p-value. (T = 2.6
    corresponds to p = .014.)
  6. Significant p-values indicate that the average
    change is significantly different from 0.
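
Steps 1 through 4 can be sketched in a few lines. The ratings below are made up for illustration (they are not the study's data), and the function name is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (mean change, standard error, t statistic, degrees of freedom)."""
    changes = [a - b for b, a in zip(before, after)]  # step 1: per-person change
    mean_change = mean(changes)                       # step 2: average change
    se = stdev(changes) / sqrt(len(changes))          # step 3: standard error
    return mean_change, se, mean_change / se, len(changes) - 1  # step 4


# Hypothetical ratings for six people, before and after treatment.
before = [5.0, 5.2, 4.8, 5.5, 5.1, 4.9]
after = [5.3, 5.2, 5.0, 5.9, 5.2, 5.1]
mean_change, se, t_stat, df = paired_t(before, after)
print(f"mean change = {mean_change:.2f}, SE = {se:.3f}, t = {t_stat:.2f}, df = {df}")
```

The test works on one column of differences, so the pairing is used directly and the between-person variability drops out of the standard error.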

32
Paired t-test example 2
33
Example problem: paired t-test
Null hypothesis: average change = 0
34
Example problem: paired t-test
With 5 df, T > 2.571 corresponds to p < .05
(two-sided test)
35
Example problem: paired t-test
Note: does not include 0.
37
Using our class data
  • Hypothesis: Students who consider themselves
    street smart drink more alcohol than students who
    consider themselves book smart.
  • Null hypothesis: no difference in alcohol
    drinking between street smart and book smart
    students.

38
Non-normal class data: alcohol

39
Wilcoxon rank-sum test
  • Statistical question: Is there a difference in
    alcohol drinking between street smart and book
    smart students?
  • What is the outcome variable? Weekly alcohol
    intake (drinks/week)
  • What type of variable is it? Continuous
  • Is it normally distributed? No (and small n)
  • Are the observations correlated? No
  • Are groups being compared, and if so, how many?
    Yes, two
  • → Wilcoxon rank-sum test

40
Results


Book smart: mean = 1.6 drinks/week; median = 1.5
Street smart: mean = 2.7 drinks/week; median = 3.0
41
Wilcoxon rank-sum test mechanics
  • Book smart values (n = 13): 0 0 0 0 1 1 2 2 2 3 3
    4 5
  • Street smart values (n = 7): 0 0 2 3 3 5 6
  • Combined groups (n = 20): 0 0 0 0 0 0 1 1 2 2 2 2 3
    3 3 3 4 5 5 6
  • Corresponding ranks: 3.5 3.5 3.5 3.5 3.5 3.5 7.5
    7.5 10.5 10.5 10.5 10.5 14.5 14.5 14.5 14.5 17
    18.5 18.5 20
  • Ties are assigned average ranks; e.g., there are
    6 zeros, so zeros get the average of the ranks
    1 through 6.

42
Wilcoxon rank-sum test
  • Ranks, book smart: 3.5 3.5 3.5 3.5 7.5 7.5 10.5
    10.5 10.5 14.5 14.5 17 18.5
  • Ranks, street smart: 3.5 3.5 10.5 14.5 14.5 18.5
    20
  • Sum of ranks, book smart: 3.5 + 3.5 + 3.5 + 3.5 +
    7.5 + 7.5 + 10.5 + 10.5 + 10.5 + 14.5 + 14.5 + 17 +
    18.5 = 125
  • Sum of ranks, street smart: 3.5 + 3.5 + 10.5 + 14.5 +
    14.5 + 18.5 + 20 = 85
  • The Wilcoxon rank-sum test compares these numbers,
    accounting for the differences in sample size in
    the two groups.
  • Resulting p-value (from computer): 0.24
  • Not significantly different!
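
The ranking step, including the average ranks for ties, can be sketched as follows. The helper name is illustrative, and the p-value is deliberately left to statistical software, as on the slide.

```python
def average_ranks(values):
    """Map each distinct value to its rank, giving ties the average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1         # lowest 1-based rank held by v
        last = first + ordered.count(v) - 1  # highest rank held by v
        ranks[v] = (first + last) / 2
    return ranks


book_smart = [0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 4, 5]
street_smart = [0, 0, 2, 3, 3, 5, 6]

# Rank the pooled sample, then add up the ranks within each group.
ranks = average_ranks(book_smart + street_smart)
sum_book = sum(ranks[v] for v in book_smart)
sum_street = sum(ranks[v] for v in street_smart)
print(sum_book, sum_street)
```

The two sums always add up to n(n + 1)/2 for the pooled sample (here 20 × 21 / 2 = 210), which is a quick check on the ranking.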

43
Example 2, Wilcoxon rank-sum test
10 dieters following the Atkins diet vs. 10 dieters
following Jenny Craig (hypothetical)
RESULTS: The Atkins group loses an average of 34.5
lbs. The J. Craig group loses an average of 18.5
lbs. Conclusion: Atkins is better?
44
Example: non-parametric tests
BUT, take a closer look at the individual data:
Atkins, change in weight (lbs): 4, 3,
0, -3, -4, -5, -11, -14, -15, -300
J. Craig, change in weight (lbs): -8, -10, -12, -16, -18,
-20, -21, -24, -26, -30
45
[Figure: histogram for Jenny Craig; x-axis: weight change (lbs), -30 to 20; y-axis: percent, 0 to 30]
46
[Figure: histogram for Atkins; x-axis: weight change (lbs), -300 to 20; y-axis: percent, 0 to 30]
47
Wilcoxon Rank-Sum test
  • RANK the values, 1 being the least weight loss
    and 20 being the most weight loss.
  • Atkins
  • Changes: 4, 3, 0, -3, -4, -5, -11, -14, -15, -300
  • Ranks: 1, 2, 3, 4, 5, 6, 9, 11, 12, 20
  • J. Craig
  • Changes: -8, -10, -12, -16, -18, -20, -21, -24, -26, -30
  • Ranks: 7, 8, 10, 13, 14, 15, 16, 17, 18,
    19

48
Wilcoxon Rank-Sum test
  • Sum of Atkins ranks:
  • 1 + 2 + 3 + 4 + 5 + 6 + 9 + 11 + 12 + 20 = 73
  • Sum of Jenny Craig's ranks:
  • 7 + 8 + 10 + 13 + 14 + 15 + 16 + 17 + 18 + 19 = 137
  • Jenny Craig clearly ranked higher!
  • P-value (from computer): .018
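
A short sketch shows why the means and the ranks tell different stories here. It uses the hypothetical diet data above and only the Python standard library; the variable names are illustrative.

```python
from statistics import mean, median

atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
jenny_craig = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]

# Rank 1 = least weight loss (largest change value); there are no ties here.
ordered = sorted(atkins + jenny_craig, reverse=True)
rank_of = {value: position for position, value in enumerate(ordered, start=1)}

rank_sum_atkins = sum(rank_of[v] for v in atkins)
rank_sum_jc = sum(rank_of[v] for v in jenny_craig)

print("means:    ", mean(atkins), mean(jenny_craig))
print("medians:  ", median(atkins), median(jenny_craig))
print("rank sums:", rank_sum_atkins, rank_sum_jc)
```

The single -300 value drives the Atkins mean, but it can contribute at most rank 20 to the rank sum, so the rank-based comparison is not distorted by it.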

49
Review Question 3
  • When you want to compare mean blood pressure
    between two groups, you should:
  • Use a t-test.
  • Use a nonparametric test.
  • Use a t-test if blood pressure is normally
    distributed.
  • Use a two-sample proportions test.
  • Use a two-sample proportions test only if blood
    pressure is normally distributed.

52
DHA and eczema
Figure 3 from Koch C, Dölle S, Metzger M, Rasche
C, Jungclas H, Rühl R, Renz H, Worm M.
Docosahexaenoic acid (DHA) supplementation in
atopic eczema: a randomized, double-blind,
controlled trial. Br J Dermatol. 2008
Apr;158(4):786-92. Epub 2008 Jan 30.
53
Wilcoxon signed-rank test
  • Statistical question: Did patients improve in
    SCORAD score from baseline to 8 weeks?
  • What is the outcome variable? SCORAD
  • What type of variable is it? Continuous
  • Is it normally distributed? No (and small
    numbers)
  • Are the observations correlated? Yes, it's the
    same people before and after
  • How many time points are being compared? Two
  • → Wilcoxon signed-rank test

54
Wilcoxon signed-rank test mechanics
  • 1. Calculate the change in SCORAD score for each
    participant.
  • 2. Rank the absolute values of the changes in
    SCORAD score from smallest to largest.
  • 3. Add up the ranks from the people who improved
    and, separately, the ranks from the people who
    got worse.
  • 4. The Wilcoxon signed-rank test compares these
    values to determine whether improvements
    significantly exceed declines (or vice versa).
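
Steps 1 through 3 can be sketched as below. The scores are hypothetical (not the trial's data), a decrease in SCORAD is treated as improvement, and zero changes are dropped before ranking, which is a common convention but an assumption here.

```python
def signed_rank_sums(before, after):
    """Return (rank sum for decreases, rank sum for increases) of after - before."""
    # Step 1: per-person change, dropping people with no change.
    changes = [a - b for b, a in zip(before, after) if a != b]
    # Step 2: rank the absolute changes from smallest to largest.
    magnitudes = sorted(abs(c) for c in changes)

    def avg_rank(m):
        # Tied magnitudes share the average of the ranks they occupy.
        first = magnitudes.index(m) + 1
        return first + (magnitudes.count(m) - 1) / 2

    # Step 3: add up the ranks separately by direction of change.
    down = sum(avg_rank(abs(c)) for c in changes if c < 0)
    up = sum(avg_rank(abs(c)) for c in changes if c > 0)
    return down, up


# Hypothetical SCORAD scores for eight participants.
baseline = [40, 35, 50, 28, 45, 38, 30, 42]
week_8 = [30, 36, 38, 25, 45, 31, 34, 27]
improved, worsened = signed_rank_sums(baseline, week_8)
print(improved, worsened)
```

Step 4, turning the two sums into a p-value, is left to statistical software.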

56
ANOVA example
Mean micronutrient intake from the school lunch
by school
a: School 1 (most deprived; 40% subsidized
lunches). b: School 2 (medium deprived; <10%
subsidized). c: School 3 (least deprived; no
subsidization, private school). d: ANOVA;
significant differences are highlighted in bold
(P < 0.05).
FROM Gould R, Russell J, Barker ME. School lunch
menus and 11 to 12 year old children's food
choice in three secondary schools in England: are
the nutritional standards being met? Appetite.
2006 Jan;46(1):86-92.
57
ANOVA
  • Statistical question: Does calcium content of
    school lunches differ by school type (privileged,
    average, deprived)?
  • What is the outcome variable? Calcium
  • What type of variable is it? Continuous
  • Is it normally distributed? Yes
  • Are the observations correlated? No
  • Are groups being compared and, if so, how many?
    Yes, three
  • → ANOVA

58
ANOVA (ANalysis Of VAriance)
  • Idea: For two or more groups, test the difference
    between means, for normally distributed
    variables.
  • Just an extension of the t-test (an ANOVA with
    only two groups is mathematically equivalent to a
    t-test).

59
One-Way Analysis of Variance
  • Assumptions, same as t-test:
  • Normally distributed outcome
  • Equal variances between the groups
  • Groups are independent

60
Hypotheses of One-Way ANOVA
61
ANOVA
  • It's like this: if I have three groups to
    compare,
  • I could do three pairwise t-tests, but this would
    increase my type I error.
  • So, instead I want to look at the pairwise
    differences all at once.
  • To do this, I can recognize that variance is a
    statistic that lets me look at more than one
    difference at a time.

62
The F-test
Is the difference in the means of the groups more
than background noise (variability within
groups)?
63
The F-distribution
  • A ratio of variances follows an F-distribution
  • The F-test tests the hypothesis that two
    variances are equal.
  • F will be close to 1 if sample variances are
    equal.

64
ANOVA example 2
  • Randomize 33 subjects to three groups: 800 mg
    calcium supplement vs. 1500 mg calcium supplement
    vs. placebo.
  • Compare the spine bone density of all 3 groups
    after 1 year.

65
Spine bone density vs. treatment
[Figure: spine bone density (y-axis, 0.7 to 1.2) by treatment group: placebo, 800 mg calcium, 1500 mg calcium]
66
Group means and standard deviations
  • Placebo group (n = 11)
  • Mean spine BMD = .92 g/cm²
  • Standard deviation = .10 g/cm²
  • 800 mg calcium supplement group (n = 11)
  • Mean spine BMD = .94 g/cm²
  • Standard deviation = .08 g/cm²
  • 1500 mg calcium supplement group (n = 11)
  • Mean spine BMD = 1.06 g/cm²
  • Standard deviation = .11 g/cm²
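
An F statistic can be rebuilt from these group summaries as a ratio of between-group to within-group variance. This is a sketch: the function name is illustrative, and because it works from the rounded means and standard deviations above, the result may differ slightly from one computed on the raw data.

```python
def anova_f_from_summary(means, sds, ns):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k, total_n = len(means), sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-group variability: how far each group mean sits from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group variability: the "background noise" inside each group.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placebo, 800 mg, and 1500 mg groups (spine BMD in g/cm2).
f_stat, df_b, df_w = anova_f_from_summary(
    means=[0.92, 0.94, 1.06], sds=[0.10, 0.08, 0.11], ns=[11, 11, 11]
)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

An F far above 1 says the spread of the group means is large relative to the variability within groups; the p-value comes from the F-distribution with these two degrees of freedom.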

67
The F-Test
68
Review Question 4
  • Which of the following is an assumption of ANOVA?
  • The outcome variable is normally distributed.
  • The variance of the outcome variable is the same
    in all groups.
  • The groups are independent.
  • All of the above.
  • None of the above.

70
ANOVA summary
  • A statistically significant ANOVA (F-test) only
    tells you that at least two of the groups differ,
    but not which ones differ.
  • Determining which groups differ (when it's
    unclear) requires more sophisticated analyses to
    correct for the problem of multiple comparisons.

71
Question: Why not just do 3 pairwise t-tests?
  • Answer: because, at an error rate of 5% for each
    test, you have an overall chance of up
    to 1 - (.95)^3 = 14% of making a type I error (if all
    3 comparisons were independent).
  • If you wanted to compare 6 groups, you'd have to
    do 15 pairwise t-tests, which would give you a
    high chance of finding something significant just
    by chance.
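
The arithmetic behind this answer is easy to check. The sketch below assumes independent tests at a 5% error rate each, as in the bullet above; the function name is illustrative.

```python
from math import comb


def familywise_error(num_tests, alpha=0.05):
    """Chance of at least one type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests


for groups in (3, 6):
    tests = comb(groups, 2)  # number of pairwise comparisons among the groups
    rate = familywise_error(tests)
    print(f"{groups} groups: {tests} tests, overall error rate {rate:.2f}")
```

With 6 groups the chance of at least one false positive is already above one half, which is the motivation for a single overall test followed by corrected pairwise comparisons.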

72
Multiple comparisons
73
Correction for multiple comparisons
  • How to correct for multiple comparisons post hoc:
  • Bonferroni correction (adjusts p by the most
    conservative amount: treating all tests as
    independent, multiply p by the number of tests
    or, equivalently, divide the α cut-off by the
    number of tests)
  • Tukey (adjusts p)
  • Scheffé (adjusts p)

74
1. Bonferroni
For example, to make a Bonferroni correction,
divide your desired alpha cut-off level (usually
.05) by the number of comparisons you are making.
This treats the comparisons as completely
independent, which is usually far too conservative.
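
As a minimal sketch of the correction, with made-up p-values and an illustrative function name:

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test cut-off and whether each p-value passes it."""
    cutoff = alpha / len(p_values)
    return cutoff, [p <= cutoff for p in p_values]


# Hypothetical p-values from three pairwise comparisons.
cutoff, significant = bonferroni([0.010, 0.030, 0.200])
print(round(cutoff, 4), significant)
```

Here a p-value of .03 would pass the usual .05 cut-off for a single test but not the corrected cut-off of .05/3.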
75
2/3. Tukey and Scheffé
  • Both methods increase your p-values to account
    for the fact that you've done multiple
    comparisons, but are less conservative than
    Bonferroni (let the computer calculate for you!).

76
Review Question 5
  • I am doing an RCT of 4 treatment regimens for
    blood pressure. At the end of the day, I compare
    blood pressures in the 4 groups using ANOVA. My
    p-value is .03. I conclude
  • All of the treatment regimens differ.
  • I need to use a Bonferroni correction.
  • One treatment is better than all the rest.
  • At least one treatment is different from the
    others.
  • In pairwise comparisons, no treatment will be
    different.

79
Non-parametric ANOVA (Kruskal-Wallis test)
  • Statistical question: Do nevi counts differ by
    training velocity (slow, medium, fast) group in
    marathon runners?
  • What is the outcome variable? Nevi count
  • What type of variable is it? Continuous
  • Is it normally distributed? No (and small sample
    size)
  • Are the observations correlated? No
  • Are groups being compared and, if so, how many?
    Yes, three
  • → non-parametric ANOVA

80
Example: Nevi counts and marathon runners
Richtig et al. Melanoma Markers in Marathon
Runners: Increase with Sun Exposure and Physical
Strain. Dermatology 2008;217:38-44.
81
Non-parametric ANOVA
  • Kruskal-Wallis one-way ANOVA
  • (just an extension of the Wilcoxon rank-sum test
    for 2 groups; based on ranks)
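
The rank-based idea extends to three groups as sketched below. The nevi counts are hypothetical (not the study's data), the values are assumed to be distinct so no tie correction is applied, and the function name is illustrative.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction; assumes distinct values)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: i for i, v in enumerate(pooled, start=1)}
    n_total = len(pooled)
    # For each group: (sum of its ranks) squared, divided by the group size.
    spread = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * spread - 3 * (n_total + 1)


# Hypothetical nevi counts for slow, medium, and fast runners.
slow = [12, 15, 18, 22]
medium = [20, 25, 27, 31]
fast = [30, 34, 40, 45]
h_stat = kruskal_wallis_h(slow, medium, fast)
print(round(h_stat, 2))
```

H is usually compared with a chi-square distribution with (number of groups - 1) degrees of freedom; with samples this small, software would typically report an exact p-value instead.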

82
Example: Nevi counts and marathon runners
By non-parametric ANOVA, the groups differ
significantly in nevi count (p < .05) overall. By
Wilcoxon rank-sum test (adjusted for multiple
comparisons), the lowest velocity group differs
significantly from the highest velocity group
(p < .05).
Richtig et al. Melanoma Markers in Marathon
Runners: Increase with Sun Exposure and Physical
Strain. Dermatology 2008;217:38-44.
83
Review Question 6
  • I want to compare depression scores between
    three groups, but I'm not sure if depression is
    normally distributed. What should I do?
  • Don't worry about it; run an ANOVA anyway.
  • Test depression for normality.
  • Use a Kruskal-Wallis (non-parametric) ANOVA.
  • Nothing, I can't do anything with these data.
  • Run 3 nonparametric t-tests.

85
Review Question 7
  • If depression score turns out to be very
    non-normal, then what should I do?
  • Don't worry about it; run an ANOVA anyway.
  • Test depression for normality.
  • Use a Kruskal-Wallis (non-parametric) ANOVA.
  • Nothing, I can't do anything with these data.
  • Run 3 nonparametric t-tests.

87
Review Question 8
  • I measure blood pressure in a cohort of elderly
    men yearly for 3 years. To test whether or not
    their blood pressure changed over time, I compare
    the mean blood pressures in each time period
    using a one-way ANOVA. This strategy is
  • Correct. I have three means, so I have to use
    ANOVA.
  • Wrong. Blood pressure is unlikely to be normally
    distributed.
  • Wrong. The variance in BP is likely to greatly
    differ at the three time points.
  • Correct. It would also be OK to use three t-tests.
  • Wrong. The samples are not independent.
