Topics in Biostatistics Part 2

1
Topics in Biostatistics Part 2
  • Sarah J. Ratcliffe, Ph.D.
  • Center for Clinical Epidemiology and
    Biostatistics
  • University of Penn School of Medicine

2
Outline
  • Hypothesis testing
  • Examples
  • Interpreting results
  • Resources

3
Hypothesis testing
  • Steps
  • Select a one-sided or two-sided test.
  • Establish the level of significance (e.g., α =
    .05).
  • Select an appropriate test statistic.
  • Compute test statistic with actual data.
  • Calculate degrees of freedom (df) for the test
    statistic.

4
Hypothesis testing
  • Steps cont'd
  • Obtain a tabled value for the statistical test.
  • Compare the test statistic to the tabled value.
  • Calculate a p-value.
  • Decide whether to reject or fail to reject the
    null hypothesis.

5
Hypothesis testing
  • Steps
  • Select a one-sided or two-sided test.
  • Establish the level of significance (e.g., α =
    .05).
  • Select an appropriate test statistic.
  • Compute test statistic with actual data.
  • Calculate degrees of freedom (df) for the test
    statistic.

6
Hypothesis testing: One-sided versus Two-sided
  • Determined by the alternative hypothesis.
  • Unidirectional: one-sided
  • Example
  • Infected macaques given vaccine or placebo.
    Higher viral replication in the vaccine group
    has no benefit of interest.
  • H0: vaccine has no beneficial effect on
    viral-replication levels at 6 weeks after
    infection.
  • Ha: vaccine lowers viral-replication levels by 6
    weeks after infection.

7
Hypothesis testing: One-sided versus Two-sided
  • Bi-directional: two-sided
  • Example
  • Infected macaques given vaccine or placebo.
  • Interested in whether vaccine has any effect on
    viral-replication levels, regardless of
    direction of effect.
  • H0: vaccine has no effect on
    viral-replication levels at 6 weeks after
    infection.
  • Ha: vaccine affects the viral-replication levels.
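
As a rough illustration of how the choice of alternative hypothesis changes the p-value, the sketch below converts one test statistic into a one-sided and a two-sided p-value. It assumes a standard-normal (z) test statistic and a hypothetical value of z = -1.8; neither comes from the slides.

```python
from statistics import NormalDist

def z_p_values(z):
    """Return (lower-tail one-sided p, two-sided p) for a z statistic."""
    std_normal = NormalDist()                    # mean 0, sd 1
    p_lower = std_normal.cdf(z)                  # Ha: effect is negative (e.g. vaccine lowers levels)
    p_two_sided = 2 * std_normal.cdf(-abs(z))    # Ha: effect in either direction
    return p_lower, p_two_sided

# Hypothetical statistic: vaccine group lower than placebo by 1.8 standard errors.
p_one, p_two = z_p_values(-1.8)
print(f"one-sided p = {p_one:.3f}, two-sided p = {p_two:.3f}")
```

With this hypothetical statistic the one-sided p-value falls below .05 while the two-sided p-value does not, which is why the direction of the test should be fixed before looking at the data.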

8
Hypothesis testing
  • Steps
  • Select a one-sided or two-sided test.
  • Establish the level of significance (e.g., α =
    .05).
  • Select an appropriate test statistic.
  • Compute test statistic with actual data.
  • Calculate degrees of freedom (df) for the test
    statistic.

9
Hypothesis testing: Level of Significance
  • How many different hypotheses are being
    examined?
  • How many comparisons are needed to answer this
    hypothesis?
  • Are any interim analyses planned?
  • e.g. test data, depending on results collect more
    data and re-test.
  • => How many tests will be run in total?

10
Hypothesis testing: Level of Significance
  • αtotal = desired total Type-I error (false
    positives) for all comparisons.
  • One test
  • α1 = αtotal
  • Multiple tests / comparisons
  • If αi = αtotal, then Σαi > αtotal
  • Need to use a smaller α for each test.
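
A small sketch of why the per-test level has to shrink. It assumes the tests are independent and all run at the same α, which is a simplification not stated on the slide; under that assumption the chance of at least one false positive is 1 - (1 - α)^m for m tests.

```python
def familywise_error(alpha_per_test, n_tests):
    """P(at least one false positive) across independent tests at the same alpha."""
    return 1 - (1 - alpha_per_test) ** n_tests

# Running every test at alpha = .05 inflates the total Type-I error quickly.
for m in (1, 5, 10):
    print(f"{m:2d} tests: total Type-I error = {familywise_error(0.05, m):.3f}")
```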

11
Hypothesis testing: Level of Significance
  • Conservative approach
  • αi = αtotal / number of comparisons
  • Can give different αs to each comparison.
  • Formal methods include Bonferroni, Tukey-Kramer,
    Scheffé's method, Duncan-Walker.
  • O'Brien-Fleming boundary or a Lan-DeMets
    analog can be used to determine αi for interim
    analyses.
  • Benjamini Y, and Hochberg Y (1995). Controlling
    the false discovery rate: a practical and
    powerful approach to multiple testing. JRSSB,
    57: 125-133.
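
The two simplest corrections named above can be sketched in a few lines. The p-values are hypothetical, and the function names are illustrative rather than taken from any library; the second function follows the step-up rule described by Benjamini and Hochberg.

```python
def bonferroni_reject(p_values, alpha_total=0.05):
    """Reject H0 for tests whose p-value is at most alpha_total / (number of tests)."""
    threshold = alpha_total / len(p_values)
    return [p <= threshold for p in p_values]

def benjamini_hochberg_reject(p_values, fdr=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= (k / m) * fdr."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject

p_values = [0.001, 0.012, 0.014, 0.039, 0.2]   # hypothetical results of 5 comparisons
print("Bonferroni rejects:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejects:", sum(benjamini_hochberg_reject(p_values)))
```

Bonferroni controls the chance of any false positive and is the more conservative of the two; the false discovery rate approach tolerates a small expected fraction of false positives among the rejected hypotheses.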

12
Hypothesis testing
  • Steps
  • Select a one-tailed or two-tailed test.
  • Establish the level of significance (e.g., α =
    .05).
  • Select an appropriate test statistic.
  • Compute test statistic with actual data.
  • Calculate degrees of freedom (df) for the test
    statistic.

13
Hypothesis testing: Selecting an Appropriate test
  • How many samples are being compared?
  • One sample
  • Two samples
  • Multi-samples
  • Are these samples independent?
  • Unrelated subjects in each sample.
  • Subjects in each sample related / same.

14
Hypothesis testing: Selecting an Appropriate test
  • Are your variables continuous or categorical?
  • If continuous, are the data normally distributed?
  • Normality can be assessed using a P-P (or Q-Q)
    plot.
  • Plot should be approximately a straight line for
    normality.
  • If not normal, can it be transformed to
    normality?
  • Blindly assuming normality can lead to wrong
    conclusions!!!

15
Hypothesis testing: Selecting an Appropriate test
Approximately a straight line => normal assumption
okay.
16
Hypothesis testing: Selecting an Appropriate test
Not a straight line => NOT normal. Can it be
transformed to normality?
17
Hypothesis testing: Selecting an Appropriate test
The natural log transform of the data is
approximately a straight line => normal assumption
okay. Analyze the transformed data, NOT the
original data.
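
The straight-line check can also be summarized numerically. The sketch below is one possible approach, not the method used for the slides: it correlates the sorted data with standard-normal quantiles, so a value near 1 corresponds to a Q-Q plot that is close to a straight line. The measurements and the plotting positions (i - 0.5)/n are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean

def qq_correlation(values):
    """Correlation between sorted data and matching standard-normal quantiles."""
    n = len(values)
    observed = sorted(values)
    # Plotting positions (i - 0.5) / n are one common convention.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, my = mean(theoretical), mean(observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, observed))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in observed)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical right-skewed measurements (e.g. viral loads).
raw = [140, 280, 420, 570, 750, 970, 1240, 1600, 2120, 2900, 4360, 8800]
logged = [math.log(v) for v in raw]

r_raw = qq_correlation(raw)
r_log = qq_correlation(logged)
print(f"raw data: r = {r_raw:.3f}; log-transformed: r = {r_log:.3f}")
```

A plot is still worth looking at, since a single correlation can hide where the departure from the line occurs.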
18
Hypothesis testing: Geometric versus Arithmetic
mean
  • Geometric mean of n positive numerical values is
    the nth root of the product of the n values.
  • Geometric mean is never greater than the
    arithmetic mean; they are equal only when all
    values are identical.
  • Geometric mean is better when some values are
    very large in magnitude and others are small.
  • If the geometric mean is used, log-transform the
    data before analyzing.
  • Arithmetic mean of log-transformed data is the
    log of the geometric mean of the data.
  • E.g. a t-test on log-transformed data is a test
    for the location of the geometric mean.
  • Langley R., Practical Statistics Simply
    Explained, 1970, Dover Press
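
The relationship between the two means can be checked directly. This is a minimal sketch using hypothetical positive values; it shows that back-transforming the mean of the logs recovers the geometric mean.

```python
import math
from statistics import mean, geometric_mean

values = [2, 8, 32, 128, 512]   # hypothetical positive measurements (e.g. titers)

arith = mean(values)
geo = geometric_mean(values)

# The arithmetic mean of the logs, back-transformed, equals the geometric mean.
mean_of_logs = mean([math.log(v) for v in values])
back_transformed = math.exp(mean_of_logs)

print(f"arithmetic mean = {arith}, geometric mean = {geo:.1f}")
print(f"exp(mean of logs) = {back_transformed:.1f}")
```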

19
Source: Richardson & Overbaugh (2005). Basic
statistical considerations in virological
experiments. Journal of Virology, 29(2): 669-676.
20
Hypothesis testing: Selecting an Appropriate test
  • Other tests are available for more complex
    situations. For example,
  • Repeated measures ANOVA: >2 measurements taken
    on each subject; usually interested in time
    effect.
  • GEEs / Mixed-effects models: >2 measurements
    taken on each subject; adjust for other
    covariates.

21
Hypothesis testing
  • Steps
  • Select a one-tailed or two-tailed test.
  • Establish the level of significance (e.g., α =
    .05).
  • Select an appropriate test statistic.
  • Run the test.

22
Example 1
  • Expression of chemokine receptors on CD14+/CD14-
    populations of blood monocytes.
  • Percent of cells positive by FACS.

24
Example 1 cont'd
  • Continuous data, 2 samples
  • => t-test, if normal, OR
  • => Wilcoxon rank-sum (independent) or signed-rank
    (paired) test, if non-normal
  • Are samples independent or paired?
  • If independent, can test for equality of
    variances using Levene's test

25
Example 1 cont'd
  • T-tests in Excel
  • =TTEST(L6:L15,M6:M15,2,2)
  • Arguments, in order:
  • Cells containing data from sample 1
  • Cells containing data from sample 2
  • 1-sided or 2-sided test
  • Type of t-test: 1 = paired; 2 = independent, equal
    variance; 3 = independent, unequal variance
27
Example 1 cont'd
  • Possible results for different assumptions

28
Example 1 cont'd
  • Which result is correct?
  • Data are paired
  • The differences for each subject are normally
    distributed.
  • => Paired t-test
  • p = .0095
  • There is a difference in the percentage of
    positive CD14+ and CD14- cells.
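
To see how the pairing assumption changes the answer, the sketch below computes the paired and the equal-variance independent t statistics by hand for the same numbers. The data are hypothetical (not the values from this example), and the critical values in the comments are the standard tabled two-sided 5% values for 9 and 18 degrees of freedom.

```python
import math
from statistics import mean, stdev, variance

def paired_t(a, b):
    """t statistic for the mean within-pair difference (df = n - 1)."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def independent_t(a, b):
    """Equal-variance two-sample t statistic (df = n_a + n_b - 2)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical percent-positive values measured twice on the same 10 subjects.
sample_a = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
sample_b = [7, 18, 26, 39, 47, 58, 67, 76, 88, 97]

t_paired = paired_t(sample_a, sample_b)      # compare with tabled t = 2.262 (df = 9)
t_indep = independent_t(sample_a, sample_b)  # compare with tabled t = 2.101 (df = 18)
print(f"paired t = {t_paired:.2f}, independent t = {t_indep:.2f}")
```

Here the subjects differ widely from one another but change by a consistent small amount, so only the paired statistic exceeds its tabled value.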

29
A graph of the 95% CIs for the means would give
the impression there is no difference
30
When it's really the differences we are testing.
31
Example 1 cont'd
  • Note: paired tests don't always give lower
    p-values.
  • A 1-sided test on the CCR5 values would give
    p-values of
  • p = 0.06, independent samples
  • p = 0.11, paired samples
  • WHY?

32
Example 1 cont'd
  • The differences have a larger spread than the
    individual variables.
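
One way this can happen: the variance of a difference is var(x) + var(y) - 2 cov(x, y), so pairing only helps when the two measurements are positively correlated. The sketch below uses small hypothetical samples to show both cases.

```python
from statistics import stdev

def diff_spread(x, y):
    """Standard deviation of the within-pair differences y - x."""
    return stdev([b - a for a, b in zip(x, y)])

x = [1, 2, 3, 4, 5]
y_pos = [2, 3, 4, 5, 7]   # rises with x (positive correlation)
y_neg = [7, 5, 4, 3, 2]   # same values reordered to fall as x rises

print(f"sd(x) = {stdev(x):.2f}, sd(y) = {stdev(y_pos):.2f}")
print(f"sd of differences, positively correlated pairs: {diff_spread(x, y_pos):.2f}")
print(f"sd of differences, negatively correlated pairs: {diff_spread(x, y_neg):.2f}")
```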

33
Example 2
  • Does the level of CCR5 expression on PBLs (basal
    or upregulated using lentiviral vector) determine
    the % of entry that occurs via CCR5?
  • Two viruses
  • 89.6
  • DH12

34
Example 2 cont'd
35
Example 2 cont'd
  • How do we know if the slope of the line is
    significantly different from 0?
  • Can perform a t-test on the slope estimate. For
    simple linear regression, this is the same as a
    t-test for the correlation (= square root of R²).
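
The equivalence can be verified numerically. This sketch fits a simple linear regression by least squares and computes the t statistic two ways: slope divided by its standard error, and r·sqrt(n - 2)/sqrt(1 - r²). The expression and entry values are hypothetical, not the data from this example.

```python
import math
from statistics import mean

def slope_and_correlation_t(x, y):
    """Return (t for the slope, t for the correlation); both have n - 2 df."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_slope = slope / se_slope

    r = sxy / math.sqrt(sxx * syy)
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t_slope, t_corr

ccr5 = [5, 10, 20, 35, 50, 65, 80]      # hypothetical % CCR5 expression
entry = [12, 18, 30, 41, 58, 66, 85]    # hypothetical % entry via CCR5

t_slope, t_corr = slope_and_correlation_t(ccr5, entry)
print(f"t (slope) = {t_slope:.4f}, t (correlation) = {t_corr:.4f}")
```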

36
Example 2 cont'd
37
Interpreting Results
  • P-values
  • Is there a statistically significant result?
  • If not, was the sample size large enough to
    detect a biologically meaningful difference?
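
A rough sample-size check for the last question. The sketch uses the common normal-approximation formula for comparing two means with equal group sizes and a two-sided test, n per group = 2·((z_{1-α/2} + z_{power})·σ/δ)²; the function name, the defaults (α = .05, 80% power), and the inputs are assumptions for illustration, and a dedicated power calculator should be preferred for real planning.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects per group needed to detect a mean difference
    `delta` when the outcome has standard deviation `sigma`."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided significance level
    z_power = z(power)
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# Halving the difference to detect roughly quadruples the sample size needed.
print(n_per_group(delta=1.0, sigma=1.0))
print(n_per_group(delta=0.5, sigma=1.0))
```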

38
Online Resources
  • Power / sample size calculators
  • http://calculators.stat.ucla.edu/powercalc/
  • http://www.stat.uiowa.edu/~rlenth/Power/
  • Free statistical software
  • http://members.aol.com/johnp71/javasta2.html#Freebies

39
BECC Consulting Center
  • www.cceb.upenn.edu/main/center/becc.html
  • Hourly fee service
  • Design and analysis strategies for research
    proposals
  • Selecting and implementing appropriate
    statistical methods for specific applications to
    research data
  • Statistical and graphical analysis of data
  • Statistical review of manuscripts.