Introduction to Statistics: Frequentist

1
Introduction to Statistics: Frequentist and
Bayesian Approaches (for Non-Statisticians)

Ryung Suh, MD
Becker Associates Consulting, Inc.
Internal Staff Training
June 8, 2004
ryung.suh_at_becker-consult.com

2
Objectives

To provide a basic understanding of the terms and
concepts that underlie statistical analyses of
clinical trials data
To introduce Bayesian approaches and their
application to FDA submissions

3
Table of Contents

Sources of Statistical Data
Frequentist Approaches
Bayesian Approaches
Insights from the Experts (from the Bayesian
Approaches meeting, May 20-21, 2004)
Take-Aways and Strategic Insights
Corporate Resources

4
Sources of Data

Retrospective Studies: Design, Bias, Matching, Relative Risk, Odds Ratio
Prospective Studies: Design, Loss to Follow-up, Analysis, Relative Risk, Nonconcurrent Prospective Studies, Incidence, Prevalence
Randomized Controlled Trials: Design, Elimination of Bias, Placebo Effect, Analysis
Survival Analysis: Person-Time, Life-Tables, Proportional Hazard Models

5
FREQUENTIST APPROACHES
6
Classical Frequentist

Hypothesis Testing: In order to draw a valid statistical inference that an independent variable has a statistically significant effect (not the same as a clinically significant effect), it is important to rule out chance or random variability as an explanation for the effects seen in the sample data.

7
Statistical Inference

Two inferential techniques
Hypothesis Testing
Confidence Intervals
Inference is the process of making statements
(hypotheses) with a degree of statistical
certainty about population parameters based on a
sampling distribution

8
Hypothesis Testing Terms

Null Hypothesis (Ho): initially held to be true unless proven otherwise
e.g. there is NO difference between treatment and control
e.g. µ = 11, or µ2 - µ1 = 0
Akin to "the accused is innocent"
Alternative Hypothesis (Ha): the claim we usually want to prove
e.g. there is a difference between treatment and control
e.g. µ ≠ 11, or µ2 - µ1 ≠ 0
Akin to "the accused is guilty"
We assume innocence until proven guilty beyond a reasonable doubt; the same applies with Ho

9
Hypothesis Testing Decisions

Decision Options
Reject Ho (and assert Ha to be true)
Fail to Reject Ho (due to insufficient evidence)
Errors in Decisions

10
Level of Significance

Alpha: α = P(Type I Error) = P(Reject Ho | Ho is true)
Beta: β = P(Type II Error) = P(Fail to Reject Ho | Ho is false)
Power = 1 - β
We want both α and β to be small, but increasing one decreases the other

This example is a simplification to aid understanding: the exact β is generally unknown, although a large β is frequently due to sample sizes that are too small.
[Figure: sampling distributions under the Null Hypothesis and the Alternative Hypothesis]
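
As a rough illustration of how alpha, beta, and power interact, the sketch below computes the power of a one-sided z-test in Python. The function name z_test_power and the numbers (a null mean of 100, an alternative mean of 105, a standard deviation of 15) are hypothetical and chosen only for illustration; the slides do not give them. It also assumes the population standard deviation is known.

```python
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of Ho: mu = mu0 against Ha: mu = mu1 (mu1 > mu0).

    Assumes a known population standard deviation (sigma).
    """
    se = sigma / n ** 0.5                      # standard error of the sample mean
    # Smallest sample mean that leads to rejecting Ho at level alpha
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(fail to reject Ho | Ha is true)
    beta = NormalDist(mu1, se).cdf(critical_mean)
    return 1 - beta

# Hypothetical numbers: power grows with sample size
for n in (10, 25, 50):
    print(n, round(z_test_power(mu0=100, mu1=105, sigma=15, n=n), 3))
```

Under these assumptions, lowering alpha for a fixed n lowers the power (raises beta), while a larger n raises the power without touching alpha.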
11
Sampling Distribution

Population Distribution: usually a normal distribution with a mean of µ and a variance of σ² (but tough to measure the entire population)
Sampling Distribution: a distribution of means from random samples drawn from the population; a random variable (x̄) normally distributed with a mean (µ_x̄) and variance of (σ²/n)
Take random samples from the population and calculate a statistic
Describes the chance fluctuations of the statistic and the variability of sample averages around the population mean, for a given sample size (n)
Sample mean (x̄) serves as a point estimate for the population mean (µ)
Central Limit Theorem: as n → ∞, the sampling distribution approaches a normal distribution (and the estimate becomes more precise)
http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
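
The behavior of the sampling distribution can be checked with a small simulation. The Python sketch below draws repeated samples from a deliberately skewed population (an exponential distribution with mean 1 and variance 1, which is an assumption made here and not taken from the slides) and compares the variance of the sample means with the theoretical value of the population variance divided by n. The helper name sample_means is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, draws=5000):
    """Means of `draws` random samples of size n from an exponential(1) population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]

for n in (2, 10, 50):
    means = sample_means(n)
    # Columns: n, mean of the sample means, their variance, theoretical variance 1/n
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.variance(means), 4),
          round(1.0 / n, 4))
```

As n grows, the sample means cluster more tightly around the population mean, which is the sense in which the estimate becomes more precise.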

12
Determining the P(x̄ | µ)

Key Question: Does the sample mean reflect the population mean, given the effects of variability/chance?
If the population standard deviation (σ) is known, we can standardize (mean = 0, s.d. = 1) and compare with the standard normal distribution
Z = (x̄ - µ_x̄) / (σ / √n)
If σ is unknown, we can estimate it with the sample standard deviation (s) from the same set of sample data and compare with a t-distribution
T = (x̄ - µ_x̄) / (s / √n)
a continuous distribution symmetric about zero
an infinite number of t-distributions indexed by degrees of freedom
as degrees of freedom (n-1) increase, t-distributions approach the standard normal distribution
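
The two standardized statistics above can be computed in a few lines. This is a minimal Python sketch; the function names and the sample values are hypothetical and only illustrate the arithmetic. The t version estimates the standard deviation from the sample with an n-1 denominator and reports the degrees of freedom alongside the statistic.

```python
import math
import statistics

def z_statistic(sample, mu, sigma):
    """Z = (sample mean - mu) / (sigma / sqrt(n)), for a known population sigma."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))

def t_statistic(sample, mu):
    """T = (sample mean - mu) / (s / sqrt(n)), with s estimated from the sample.

    Returns the statistic together with its degrees of freedom (n - 1).
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (statistics.fmean(sample) - mu) / (s / math.sqrt(n)), n - 1

# Hypothetical measurements, tested against a hypothesized mean of 100
sample = [102, 98, 110, 105, 95, 108, 101, 99, 104, 107]
print(z_statistic(sample, mu=100, sigma=15))
print(t_statistic(sample, mu=100))
```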

13
Normal versus t-distribution
T-distributions are flatter and have more area in the tails compared to Normal distributions
T-distributions approximate the Normal as degrees of freedom (n-1) increase
[Figure: density curves for N(0,1), t(1), and t(5)]
14
Hypothesis Testing More Terms

Test Statistic: the computed statistic used to make the decisions in hypothesis testing; relates to a probability distribution (e.g. Z, t, χ²)
Critical Region: contains the values of the test statistic such that Ho is rejected
Critical Value: the endpoint(s) of the critical region
One-tailed versus two-tailed tests: depends on Ha
P-Value: the smallest value of α such that Ho will be rejected (a probability associated with the calculated value of the test statistic)

15
Steps in Hypothesis Testing: The Classical/Frequentist Approach

Define parameter and specify Ho and Ha
Specify n (sample size), α (significance level), the test statistic, and the critical value(s) and critical regions
Take a sample and compute the value of the test statistic; compare to the relevant probability distribution
Reject or fail to reject Ho and draw statistical inferences
Remember: the P-value is not the probability of the null hypothesis being true (the null hypothesis is either true or not). It is the probability, assuming Ho is true, of a result at least as extreme as the one observed, and it is compared against the chosen level of significance.
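
The steps above can be sketched end to end for the simplest case, a two-sided z-test with a known population standard deviation. That choice is an assumption made to keep the example within the Python standard library; with an estimated standard deviation one would compare against a t-distribution instead. The function name and the data are hypothetical.

```python
from statistics import NormalDist, fmean

def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Test Ho: mu = mu0 against Ha: mu != mu0, assuming sigma is known."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)   # test statistic
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)      # critical value for |z|
    p_value = 2 * (1 - std_normal.cdf(abs(z)))        # two-sided p-value
    return {
        "z": z,
        "critical": critical,
        "p_value": p_value,
        "reject": abs(z) > critical,
    }

# Hypothetical data: 16 observations averaging 12, tested against mu0 = 10
result = two_sided_z_test([12] * 16, mu0=10, sigma=4, alpha=0.05)
print(result)
```

Rejecting when the statistic falls in the critical region and rejecting when the p-value is below alpha are the same decision, which is why the p-value can be read as the smallest alpha at which Ho would be rejected.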

16
Confidence Intervals

(1-α)100% CI: x̄ ± t(n-1, α/2) (s/√n)
Provides CI for population mean (µ) at the chosen level of confidence (e.g. 90%, 95%, 99%)
Provides interval estimate of the population mean (vs. the point estimate that the sample mean gives)
Depends on the amount of variability in the data
Depends on the level of certainty we require
Increasing (1-α) will increase the CI width
Increasing sample size (n) will decrease the CI width
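
A minimal Python sketch of the interval formula follows. Because the standard library has no t-distribution, the critical value is passed in as a parameter; the two constants below are the usual table values for 9 degrees of freedom. The function name and the sample are hypothetical.

```python
import math
import statistics

def t_confidence_interval(sample, t_crit):
    """Interval: sample mean +/- t_crit * s / sqrt(n).

    t_crit is the critical value t(n-1, alpha/2), looked up by the caller.
    """
    n = len(sample)
    xbar = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical sample of n = 10, so degrees of freedom = 9
sample = [102, 98, 110, 105, 95, 108, 101, 99, 104, 107]
T_CRIT_95_DF9 = 2.262  # t(9, 0.025) from a standard t-table
T_CRIT_99_DF9 = 3.250  # t(9, 0.005) from a standard t-table

print(t_confidence_interval(sample, T_CRIT_95_DF9))
print(t_confidence_interval(sample, T_CRIT_99_DF9))  # wider: more confidence required
```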

17
Issues for Frequentists (and others)

Multiplicity: the chance of a Type I error when multiple hypotheses are tested is larger than the chance of a Type I error in each hypothesis test
Multiple Endpoints: Frequentists worry about the dimensions of the sample space (the Bayesian looks at the dimensions of the parameter space); both tend to be skeptical of believing what they think they see in high-dimensional problems (Permutt)
Multiple Looks: Trials are expensive, so sequential methods are attractive, but stopping rules tend to be fixed in frequentist approaches
Multiple Studies: Frequentist meta-analysis (to look at combined evidence from several studies) cannot rely simply on a fixed p-value (e.g. 0.05); it must look at the entirety of the evidence and the strength of each piece
Garbage In, Garbage Out
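
The multiplicity point can be made concrete with a short calculation. The Python sketch below assumes the tests are independent, which the slides do not state, and it shows the Bonferroni correction as one common remedy; that correction is not mentioned in the presentation and is included here only for illustration.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one Type I error across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha, k):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / k

for k in (1, 5, 20):
    print(k,
          round(familywise_error_rate(0.05, k), 4),
          round(familywise_error_rate(bonferroni_alpha(0.05, k), k), 4))
```

With 20 independent tests at the 0.05 level, the chance of at least one false positive is roughly 64 percent under these assumptions.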

18
BAYESIAN APPROACHES
19
Bayesian Statistics

Thomas Bayes (1702-1761): English theologian and mathematician; "Essay towards solving a problem in the doctrine of chances" (1763)
Bayesian methods: iterative processes that make better decisions based on learning from experiences
combines a prior probability distribution for the states of nature with new sample information
the combined data gives a revised probability distribution about the states of nature, which is then used as a prior probability distribution with new (future) sample information
and so on and so on
Key feature: using an empirically derived probability distribution for a population parameter
May use objective data or subjective opinions in specifying a prior distribution
Criticized for lack of objectivity in specifying prior probability distribution

20
A Bayesian example

From http://www.abelard.org/briefings/bayes.htm
15 blue taxis, 85 black taxis; only 100 taxis in the entire town
Witness claims seeing a blue taxi in a hit-and-run
Witness is given a random ordered test and successfully identifies 4/5 taxis correctly (80%)
If the witness claims blue, how likely is she to have the color correct?
Blue taxis: 80% of 15 is 12 called blue, 3 called black
Black taxis: 80% of 85 is 68 called black, 17 called blue
In the given sample space, 12/29 claims of blue are actually blue taxis (41%)
A claim of black would be correct 68/71 of the time (in the given sample space): 96%
Bayesians take into account the rate of false positives for black taxis as well as for blue taxis (note that black taxis are in greater supply here)
Bayesian stats useful for calculating relatively small risks (e.g. rare disorders)
Bayesian stats useful in non-random distributions
21
Perspectives on Probability

Frequentist: probability is the relative frequency of an event, given the experiment is repeated an infinite number of times
Bayesian: probability is a degree of belief, or the likelihood of an event happening given what is known about the population

22
Bayesian Hypothesis Testing

Non-Bayesians: navigate the optimal tradeoff between the probabilities of a false alarm (Type I error) and a miss (Type II error)
One can compare the likelihood ratio (the probability of the data under one hypothesis relative to the other) to a nonnegative threshold value (or the log likelihood ratio to an arbitrary real threshold value)
Increasing the threshold makes the test less sensitive (higher chance of a miss); decreasing the threshold makes the test more sensitive (but with a higher chance of a false alarm)
More data improves the limits of this ratio (the limit relation is often given as Stein's lemma, which approaches the Kullback-Leibler distance)
Bayesians: instead of optimizing a probability tradeoff, a miss event or false alarm event is assigned a cost; additionally, we have prior distributions
Decision function is based on the Bayes Risk, or expected cost
Threshold value is a function of costs and priors
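
To show how a threshold can depend on costs and priors, the Python sketch below uses the textbook minimum-Bayes-risk rule for two simple hypotheses: choose Ha when the likelihood ratio exceeds (cost of a false alarm × P(Ho)) / (cost of a miss × P(Ha)). It assumes correct decisions cost nothing and uses two hypothetical normal models; none of these specifics come from the slides.

```python
from statistics import NormalDist

def bayes_threshold(prior_h0, cost_false_alarm, cost_miss):
    """Likelihood-ratio threshold that minimizes expected cost (correct decisions cost 0)."""
    prior_ha = 1 - prior_h0
    return (cost_false_alarm * prior_h0) / (cost_miss * prior_ha)

def likelihood_ratio(x, h0, ha):
    """Density of the observation under Ha relative to Ho."""
    return ha.pdf(x) / h0.pdf(x)

def decide(x, h0, ha, prior_h0, cost_false_alarm, cost_miss):
    """Return 'Ha' if the likelihood ratio exceeds the Bayes threshold, else 'Ho'."""
    threshold = bayes_threshold(prior_h0, cost_false_alarm, cost_miss)
    return "Ha" if likelihood_ratio(x, h0, ha) > threshold else "Ho"

# Hypothetical models: Ho says mean 0, Ha says mean 1, both with s.d. 1
h0, ha = NormalDist(0, 1), NormalDist(1, 1)
print(decide(0.6, h0, ha, prior_h0=0.5, cost_false_alarm=1, cost_miss=1))
print(decide(0.6, h0, ha, prior_h0=0.5, cost_false_alarm=10, cost_miss=1))
```

Making false alarms ten times as costly raises the threshold, so the same observation no longer triggers a decision for Ha.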

23
Bayesian Parameter Estimation

Non-Bayesians: the probability of an event is estimated as the empirical frequency of the event in a data sample
Bayesians: include empirical prior information; as the data sample goes to infinity, the effects of the past trials wash out
If there is no empirical prior information, it is possible to create a prior distribution based on reasonable beliefs
We calculate the posterior distribution from the sample data and the prior distribution using Bayes' Theorem
P(A|B) = P(B|A) P(A) / P(B)
The posterior becomes the new prior distribution; when the prior is a conjugate prior (so that the posterior belongs to the same family), this process allows efficient sequential updating of the posterior distributions as the study proceeds
The output of the Bayesian analysis is the entire posterior distribution (not just a single point estimate); it summarizes ALL our information to date
As we get more data, the posterior distribution will become more sharply peaked about a single value
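
Sequential updating is easiest to see with a conjugate pair. The Python sketch below uses a Beta prior for a success probability with binomial data, a standard textbook choice that the slides do not name; the flat Beta(1, 1) starting prior and the batches of results are hypothetical.

```python
def update_beta(prior, successes, failures):
    """Posterior Beta(a, b) parameters after observing binomial data."""
    a, b = prior
    return (a + successes, b + failures)

def beta_mean(params):
    a, b = params
    return a / (a + b)

def beta_variance(params):
    a, b = params
    return (a * b) / ((a + b) ** 2 * (a + b + 1))

# Hypothetical interim looks: (successes, failures) in each new batch
batches = [(7, 3), (12, 8), (30, 20)]
belief = (1, 1)  # flat prior, an assumption made for illustration
for successes, failures in batches:
    belief = update_beta(belief, successes, failures)  # posterior becomes the next prior
    print(belief, round(beta_mean(belief), 3), round(beta_variance(belief), 5))
```

The variance shrinks with each batch, which is the sharpening of the posterior described above, and updating batch by batch gives the same result as updating once with all the data.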

24
Bayesian Sequential Analysis

Given no fixed number of observations, and the observations come in sequence (until we decide to stop)
Non-Bayesians: the sequential probability ratio test is comparable to the log likelihood ratio and is used to decide on outcome 1, outcome 2, or to keep collecting observations (assigning threshold values to the log ratio functions)
Bayesians: use the sequential Bayes risk by assigning a cost (of false alarms and misses) proportional to the number of observations prior to stopping; the goal is to minimize expected cost using a strategy of optimal stopping
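
The non-Bayesian sequential test can be sketched with Wald's sequential probability ratio test for yes/no outcomes. The thresholds log((1-β)/α) and log(β/(1-α)) are Wald's usual approximations; the success rates under each hypothesis and the error targets below are hypothetical, and the function name is an assumption.

```python
import math

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.20):
    """Wald's SPRT for Ho: p = p0 against Ha: p = p1 on a stream of 0/1 outcomes.

    Returns (decision, number of observations used), where decision is
    'Ha', 'Ho', or 'continue' if neither threshold was crossed.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it favors Ha
    lower = math.log(beta / (1 - alpha))   # crossing it favors Ho
    log_ratio = 0.0
    used = 0
    for outcome in observations:
        used += 1
        if outcome:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= upper:
            return "Ha", used
        if log_ratio <= lower:
            return "Ho", used
    return "continue", used

print(sprt_bernoulli([1, 1, 0, 1, 1, 1, 1, 1, 1], p0=0.5, p1=0.8))
```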

25
INSIGHTS FROM THE EXPERTS (BAYESIANS AND
FREQUENTISTS)
26
Steve Goodman (Hopkins)

Medical Inference is inductive
Deductive (disease → signs/symptoms): traditional statistical methods
Inductive (signs/symptoms → disease): Bayesian approaches more appropriate
Bayes Theorem
prior odds × Bayes factor = posterior odds
Pretest odds × likelihood factor = posttest odds
P-Value: P(X being more extreme than observed result, assuming null hypothesis to be true)
Does not represent the probability of observed data being true
Does not represent the probability of observed data being by chance
Does not represent the probability of the truth of the null hypothesis
If P(data | hypothesis) = p, then likelihood of (hypothesis | data) = cp, where c is an arbitrary constant
P(H0 | data) / P(Ha | data) = [g / (1-g)] × [P(data | H0) / P(data | Ha)]
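
The odds form of Bayes' theorem on this slide is a one-line calculation. The Python sketch below reads g as the prior probability of the null hypothesis, which is an interpretation of the slide's notation and not something it states; the input numbers are hypothetical.

```python
def posterior_odds(prior_prob_h0, p_data_given_h0, p_data_given_ha):
    """Posterior odds of H0 against Ha = prior odds x Bayes factor."""
    prior_odds = prior_prob_h0 / (1 - prior_prob_h0)
    bayes_factor = p_data_given_h0 / p_data_given_ha
    return prior_odds * bayes_factor

def odds_to_probability(odds):
    """Convert odds in favor of H0 into a probability of H0."""
    return odds / (1 + odds)

# Hypothetical inputs: even prior odds, data four times likelier under Ha
odds = posterior_odds(prior_prob_h0=0.5, p_data_given_h0=0.1, p_data_given_ha=0.4)
print(odds, odds_to_probability(odds))
```

Unlike a p-value, the result is a statement about the hypotheses themselves, and it depends explicitly on the prior and on a stated alternative.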

27
Steve Goodman (Hopkins)

P-Value                         Bayes Factor
Noncomparative                  Comparative
Observed + hypothetical data    Only observed data
Implicit Ha                     Pre-defined explicit Ha
Evidence can only be negative   Positive or negative evidence
Sensitive to stopping rules     Insensitive to stopping rules
No formal interpretation        Formal interpretation

P-Value asks you to look at the data only → then make inferences later
Bayesian methods ask you to ask the question first → and look at existing data that is evidence for the question
28
Tom Louis (Hopkins)

Bayesian Inference
Specify the multi-level structure of prior
probability distributions
Compute the joint posterior distribution for all
unknowns
Compute the posterior distribution of quantities
by integrating known conditions
Use the joint distribution to make inferences
Bayesian Advantages
Precision increases with more available
information
Repeated sampling gives information on the prior
More flexible when looking at partially related Gaussian distributions
Allows inclusion and structuring of historical data (allows a compromise between ignoring historical data (no weight) and data-pooling (full weight))
Captures relevant uncertainties
Structures complicated inferences
Adds flexibility in designs
Documents assumptions

29
Don Berry (M.D. Anderson)

Approaches to drug/device development
Fully Bayes, following the likelihood principle (for company
decision-making)
Bayesian tools for expanding the frequentist
envelope (for designing and analyzing
registration studies)
Bayesian advantages
Sequential learning is useful in study design
Predictive distributions (frequentists cannot
emulate this)
Borrowing strength from historical data,
concomitant trials, or from across patient and
disease groups
Early data allows Adaptive Randomization
Ethical advantage stop clearly harmful or
ineffective drugs/devices early in the trial
Find nuggets quickly and with higher
probability
Learn quickly, treat patients in trial more
effectively, save resources
May save resources (base development on early
decision-analysis)
May test multiple experimental drugs (e.g. cancer
drug cocktails)
Seamless transitions through clinical trial
phases (e.g. do not stop accrual)
Increase statistical power with much smaller
sample populations
Relates response and survival rates as well
Early decisions on treatment and on ending a
trial

30
Bob Temple (CDER)

FDA is nervous and inexperienced with regard to Bayesian analysis (perhaps with the exception of CDRH)
Strategy: should show both frequentist and Bayesian results (and show the difference)
Pitfalls: Bayesian approaches can sometimes be longer and more expensive for the company
Bottom line: Bayesian approaches are still new and need to be better understood by investigators and regulators

31
Larry Kessler (CDRH)

Bayesians at CDRH: Greg Campbell, Don Malec, Gene Pennello, Telba Irony
White Paper (1997): http://ftp.isds.duke.edu/WorkingPapers/97-21.ps
Applications to devices
Devices tend to have a great deal of prior
information (mechanism of action is physical and
local, as opposed to pharmacokinetic and
systemic)
Devices usually evolve in small steps
Studies gain strength by using quantitative
prior information
Prediction models available for surrogate
variables
Sensitivity analysis available for missing data
Adaptive trial designs often useful for decision
theoretics, non-inferiority trials, and
post-market surveillance
Helps determine sample size and interim-look
strategies
Risks and Challenges
Often a trade-off between clinical burden and
computational burden
Can be more expensive (e.g. if the prior
information is NOT predictive or is useless)
Beware of the regression to the mean effect
Hierarchical structure is not good if too little
(single prior study) or too much prior info

32
Larry Kessler (CDRH)

Considerations
Restrict to quantitative prior information
Need legal permission because companies tend to
own prior studies and data
Published literature and SSEs often lack
patient-level data
FDA/companies need to reach agreement on the
validity of any prior info
Need new decision rules for the clinical study
process
Frequentist: statistically significant result for primary endpoint effectiveness
Bayesian: posterior probability exceeding some predetermined value (or some interval within which it behaves consistently)
Bayesian trials must be prospectively designed
(no switching mid-stream)
Control group cannot be used as a source of prior
info for the new device
Need new formats for Labeling and for the Summary
of Safety and Effectiveness
Simulations are important (show that Type I
error is well-controlled)
FDA review team plays role in choice of decision
rules for success and for the exchangeability of
prior studies in a hierarchical model
Recommendations
Prospectively planned, with legally available and
valid prior information
Good communications with the FDA, with a good
statistician, and proper electronic Data

33
Ralph D'Agostino (Boston Univ) (Advisory Committee Member)

Randomized Controlled Trials need to be kept simple
Challenge is that Bayesian methods can sometimes
seem complex
Promise is that Bayesian methods can be made more
intuitive
Should NOT use Bayesian methods to salvage
studies that have failed frequentist approaches
Sometimes Bayesians are too optimistic about
their ability to see validity across studies with
different populations, different endpoints, and
different analytical methods

34
Bob O'Neill (CDER)

Too many people misinterpret the p-value
We rely on statistical significance with little
regard for effect size or magnitude
The FDA needs to develop more format and content
guides about reporting Bayesian statistics
Dealing with missing data is essentially a
Bayesian exercise (i.e. model-building)
Bayesian statistics cut both ways (may require
more time, expenses, and data to reach required
evidence)

35
Stacy Lindborg (Global Statistics) and Greg
Campbell (CDRH)

SL: Need validated computer software for Bayesian statistics and need a great deal of education to help regulators and clinicians understand the meaning of predictive posterior probabilities and to trust in Bayesian statistics
SL: Great promise with regard to
Looking at data more comprehensively
Conducting trials more ethically
GC: Bayesian designs need to be done prospectively
CANNOT switch to Bayesian analysis to rescue/salvage studies that are not going well
GC: Bayesian methods have the potential to shorten study duration, cut costs (by reducing number of patients), and enhance product development
GC: Between 1999 and 2003, there were 14 original PMAs and Supplements in which Bayesian estimation was the primary analysis; many more are in the works

36
Don Rubin (Harvard) and Jay Siegel (Centocor)

DR: Bayesian thinking is our natural way to look at the world
DR: Frequentist approaches need to work with Bayesian thinking (they are still just rules)
DR: Validation is needed to ensure that both the model and the analysis are appropriate
JS: Bayesian approaches (which rely on Predictive Value) and Frequentist approaches (which rely on Specificity) will converge to the extent that prior probabilities are similar
e.g. adult-use drugs/devices now applied to pediatric use
e.g. the same class of drug being applied to similar therapeutic uses
JS: Concerns about movement toward Bayesian approaches
Shifts incentives toward the non-innovative (more valid priors for existing therapies)
Priors constantly change during a trial (need predictable, prospective standards)
Legal concerns about using competitors' data

37
Susan Ellenberg (OBE, CBER) and Norris Alderson
(FDA)

SE: If Bayesian approaches are really a better mousetrap, they will spread and people will begin to demand them
NA: Bayesian is NOT a religion
NA: Incorporating a priori knowledge is useful, but we need frequentist checks at times (reality checks)
NA: Clear guidelines on methods, formats, content, analysis, etc. are needed; FDA regulators will need to work with statisticians, clinicians, and industry to accomplish this
NA: Bayesian approaches still must deal with the common sources of bias found in frequentist approaches

38
TAKE-AWAYS
39
Statistical Terms and Concepts

Sources of Data
Statistical Inference
Frequentist Hypothesis Testing
Null and Alternative Hypotheses
Test Statistics and Sampling Distribution
Type I and Type II Errors; Power
P-Value and Significance Level (α)
Confidence Intervals
Bayesian Statistics
Prior probability distribution
Posterior (or Joint) probability distribution
Bayes Factor (or Likelihood Ratio)
Adaptive Randomization

40
Strategic FDA Insights

FDA (especially CDRH) favorable to Bayesian
approaches
Not effective in rescuing/salvaging troubled
studies; must be done prospectively
May lead to quicker, less expensive approvals
(but may be longer, more expensive as well)
Useful in predictive models, sensitivity analysis
for missing data, adaptive trial designs, and for
looking at data more comprehensively (and perhaps
ethically)
Need to use valid quantitative prior information
(work with owners of data and with the FDA)
New decision rules, content, format, method,
analysis, and reporting guidelines are needed (as
well as new labeling and SSE)
A good statistician with both Bayesian and Frequentist credentials is perhaps our best advocate; many Bayesians already have good relationships with the FDA

41
Final Thoughts

Clinical versus Statistical Significance
Why p-values of 0.05?
Importance of the research question
Bayesian is not a religion, although some
Bayesians seem to see it that way
The promise of new statistical approaches
Our need to understand (at least at a basic
level) the statistical work we do for our clients

42
Corporate Resources

Carlos Alzola, MS
Aldo Crossa, MS
Campbell Tuskey, MSPH
Reine Lea Speed, MPH
Ryung Suh, MD
Expert Associates: Simon, D'Agostino, Rubin, HCRI, Hopkins
Firm Library and Statistical Literature

43
References

Bayesian Approaches, U.S. Food and Drug Administration. Meeting at Masur Auditorium, National Institutes of Health, May 20-21, 2004.
Morton, Richard F., J. Richard Hebel, and Robert J. McCarter. A Study Guide to Epidemiology and Biostatistics. 3rd ed. 1990.
Permutt, Thomas. Three Nonproblems in the Frequentist Approach to Clinical Trials, U.S. Food and Drug Administration.
Stockburger, David W. Introductory Statistics: Concepts, Models, and Applications. http://www.psychstat.smsu.edu/introbook/sbk19m.htm
Thornburg, Harvey. Introduction to Bayesian Statistics, CCRMA. Stanford University, Spring 2000-2001.
Sampling Distribution Demonstration. http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/