Title: Introduction to Bayesian Methods (I)
1Introduction to Bayesian Methods (I)
- C. Shane Reese
- Department of Statistics
- Brigham Young University
2Outline
- Definitions
- Classical or Frequentist
- Bayesian
- Comparison (Bayesian vs. Classical)
- Bayesian Data Analysis
- Examples
3Definitions
- Problem: an unknown population parameter (θ) must be estimated.
- EXAMPLE 1
- θ = probability that a randomly selected person will be a cancer survivor
- Data are binary; the parameter is unknown and continuous
- EXAMPLE 2
- θ = mean survival time of cancer patients
- Data are continuous; the parameter is continuous
4Definitions
- Step 1 of either formulation is to pose a statistical (or probability) model for the random variable that represents the phenomenon.
- EXAMPLE 1
- A reasonable choice for f(y | θ) (the sampling density or likelihood function): the number of 6-month survivors (Y) follows a binomial distribution, with a total of n subjects followed and probability θ that any one subject survives.
- EXAMPLE 2
- A reasonable choice for f(y | θ): survival time (Y) has an exponential distribution with mean θ.
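For reference, the two sampling densities named above have the standard forms below. The symbols n, y, and θ follow the slides; the exponential is written with mean θ (rather than a rate) to match the parameterization stated in Example 2.

```latex
% Example 1: number of survivors Y out of n subjects
f(y \mid \theta) = \binom{n}{y}\,\theta^{y}(1-\theta)^{n-y}, \qquad y = 0, 1, \dots, n

% Example 2: survival time Y with mean \theta
f(y \mid \theta) = \frac{1}{\theta}\, e^{-y/\theta}, \qquad y > 0
```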
5Classical (Frequentist) Approach
- All pertinent information enters the problem through the likelihood function, in the form of data (Y1, . . . , Yn)
- objective in nature
- software packages all have this capability
- maximum likelihood, unbiased estimation, etc.
- confidence intervals, difficult interpretation
6Bayesian Data Analysis
- data enter through the likelihood function, with allowance for other information as well
- Bayes' theorem reads: the posterior distribution is a constant multiplied by the likelihood multiplied by the prior distribution
- posterior distribution: in light of the data, our updated view of the parameter
- prior distribution: before any data collection, the view of the parameter
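In symbols, the verbal statement above is usually written as follows. The symbol π for the prior and posterior is an assumed notation; f(y | θ) is the likelihood, as on the earlier slides, and the "constant" is one over the integral in the denominator.

```latex
\pi(\theta \mid y)
  = \frac{f(y \mid \theta)\,\pi(\theta)}{\int f(y \mid \theta')\,\pi(\theta')\,d\theta'}
  \;\propto\; f(y \mid \theta)\,\pi(\theta)
```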
7Additional Information
- Prior Distributions
- can come from expert opinion, historical studies, previous research, or general knowledge of a situation (see examples)
- there exists a flat or noninformative prior, which represents a state of ignorance
- Controversial piece of Bayesian methods
- Objective Bayes, Empirical Bayes
8Bayesian Data Analysis
- inherently subjective (prior is controversial)
- few software packages have this capability
- result is a probability distribution
- credible intervals use the language that everyone uses anyway (the probability that θ is in the interval is 0.95)
- see examples for demonstration
9Mammography
                         Test Result
Patient Status      Positive    Negative
Cancer                 88          12
Healthy                24          76
- Sensitivity
- True Positive
- Cancer ID'd!
- Specificity
- True Negative
- Healthy not ID'd!
10Mammography Illustration
- My friend (40!!!) heads into her OB/GYN for a mammography (according to her Dr.'s orders) and gets a positive test result.
- Does she have cancer?
- Specificity, sensitivity both high! Seems likely ... or does it?
- Important point: the incidence of breast cancer in 40-year-old women is 126.2 per 100,000 women.
11Bayes Theorem for Mammography
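A minimal sketch of the Bayes' theorem calculation for this example. It assumes the mammography table entries are row percentages (sensitivity 0.88, false-positive rate 0.24) and uses the quoted incidence of 126.2 per 100,000 as the prior probability of cancer. The function name `posterior_given_positive` is illustrative, not from the slides.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(cancer | positive test) by Bayes' theorem."""
    true_pos = sensitivity * prior                    # P(+ and cancer)
    false_pos = false_positive_rate * (1.0 - prior)   # P(+ and healthy)
    return true_pos / (true_pos + false_pos)


# Values read from the slides (table assumed to be row percentages).
prior_40 = 126.2 / 100_000
p_cancer = posterior_given_positive(
    prior_40, sensitivity=0.88, false_positive_rate=0.24
)
print(f"P(cancer | positive) = {p_cancer:.4f}")
```

Under these assumptions the posterior probability comes out at about 0.0046, that is, under half a percent, because the healthy group is so much larger than the cancer group that false positives dominate.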
12Mammography Tradeoffs
- Impacts of false positive
- Stress
- Invasive follow-up procedures
- Worth the trade-off with a less than 1% (0.46%) chance you actually have cancer???
13Mammography Illustration
- My mother-in-law had the same diagnosis in 2001.
- Holden, UT is a downwinder city; she was 65.
- Does she have cancer?
- Specificity, sensitivity both high! Seems likely ... or does it?
- Important point: the incidence of breast cancer in 65-year-old women is 470 per 100,000 women, and approx 43 in downwinder cities.
- Does this change our assessment?
14Downwinder Mammography
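A worked version of the same Bayes' theorem calculation for a 65-year-old. It assumes the test characteristics from the mammography table (sensitivity 0.88, false-positive rate 0.24) and uses the general incidence of 470 per 100,000 as the prior. The downwinder-specific figure is not used here because its units are not stated.

```latex
P(\text{cancer} \mid +)
  = \frac{0.88 \times 0.0047}{0.88 \times 0.0047 + 0.24 \times 0.9953}
  = \frac{0.004136}{0.004136 + 0.238872}
  \approx 0.017
```

Even at the higher age-specific incidence the posterior probability stays under 2%; a larger prior for downwinder cities would raise it further.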
15Modified Example 1
- One person in the class stands at the back and throws the ball to the target on the board (10 times).
- Before we have the person throw the ball ten times, does the choice of person change the a priori belief you have about the probability they will hit the target (θ)?
- Before we have the person throw the ball ten times, does the choice of target size change the a priori belief you have about the probability they will hit the target (θ)?
16Prior Distributions
- A convenient choice for this prior information is the Beta distribution, whose two parameters are the numbers of a priori successes and failures. For example, if you believe your prior opinion is worth 8 throws and you think the person selected can hit the target drawn on the board 6 of those times, we would say that θ has a Beta(6,2) distribution.
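For concreteness, the Beta density and the Beta(6,2) case from the example can be written out as below. The symbol π for the prior density is an assumed notation; a and b are the a priori successes and failures.

```latex
\pi(\theta) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\,\theta^{a-1}(1-\theta)^{b-1}, \qquad 0 < \theta < 1

% Beta(6, 2): six prior successes, two prior failures
\pi(\theta) = 42\,\theta^{5}(1-\theta), \qquad E[\theta] = \frac{a}{a+b} = \frac{6}{8} = 0.75
```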
17Bayes for Example 1
- If our data are Binomial(n, θ), then we would calculate Y/n as our estimate and use a confidence interval formula for a proportion.
- If our data are Binomial(n, θ) and our prior distribution is Beta(a, b), then our posterior distribution is Beta(a + y, b + n - y).
- thus, in our example
- a = , b = , n = , y =
- and so the posterior distribution is Beta( , )
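The conjugate update above can be sketched in a few lines. The prior Beta(6, 2) and n = 10 throws come from the earlier slides; the observed number of hits (`hits = 7`) is a hypothetical value chosen only for illustration, since the class result is not recorded here.

```python
def beta_binomial_update(a, b, n, y):
    """Posterior Beta parameters for a Beta(a, b) prior and y successes in n trials."""
    if not 0 <= y <= n:
        raise ValueError("y must be between 0 and n")
    return a + y, b + (n - y)


prior_a, prior_b = 6, 2   # Beta(6, 2) prior from the prior-distribution slide
n_throws = 10             # ten throws, as in the class exercise
hits = 7                  # HYPOTHETICAL outcome, for illustration only

post_a, post_b = beta_binomial_update(prior_a, prior_b, n_throws, hits)
posterior_mean = post_a / (post_a + post_b)
classical_estimate = hits / n_throws

print(f"posterior: Beta({post_a}, {post_b}), mean {posterior_mean:.3f}")
print(f"classical estimate Y/n: {classical_estimate:.3f}")
```

With these inputs the posterior is Beta(13, 5); its mean sits between the prior mean (0.75) and the classical estimate (0.70).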
18Bayesian Interpretation
- Therefore we can say that the probability that θ is in the interval ( , ) is 0.95.
- Notice that we don't have to address the problem of "in repeated sampling"
- this is a direct probability statement
- relies on the prior distribution
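A sketch of how such an interval could be computed without external libraries. It uses an equal-tailed interval (one common choice; the slides do not say which kind is intended) and relies on the identity that, for positive integer a and b, the Beta(a, b) CDF equals a binomial tail probability. The Beta(13, 5) posterior is hypothetical: the Beta(6, 2) prior updated with an assumed 7 hits in 10 throws.

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive INTEGER a and b only."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))


def beta_quantile(p, a, b, tol=1e-10):
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def credible_interval(a, b, level=0.95):
    """Equal-tailed credible interval for a Beta(a, b) posterior."""
    tail = (1 - level) / 2
    return beta_quantile(tail, a, b), beta_quantile(1 - tail, a, b)


# HYPOTHETICAL posterior: Beta(6, 2) prior plus 7 hits in 10 throws.
lower, upper = credible_interval(13, 5)
print(f"P({lower:.3f} < theta < {upper:.3f}) = 0.95")
```

The printed statement is a direct probability statement about θ given the data and the prior, which is the interpretation described above.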
19Example Phase II Dose Finding
- Goal
- Fit models of the form
- Where
- And d = 1, ..., D is the dose level
20Definition of Terms
- ED(Q)
- Lowest dose for which Q of efficacy is achieved
- Multiple definitions
- Def. 1
- Def. 2
- Example: Q = .95; the ED95 dose is the lowest dose for which .95 efficacy is achieved
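One possible reading of ED(Q), sketched below, is the lowest dose whose mean improvement over placebo reaches a fraction Q of the largest improvement among the doses studied. This is an assumption for illustration and is not necessarily either of the two definitions on the slide. The function name `ed_q` and the response values are hypothetical; the sketch also assumes doses are sorted in increasing order with placebo first and that larger responses are better.

```python
def ed_q(doses, mean_response, q=0.95):
    """Lowest dose whose effect over placebo is at least q times the maximum effect.

    Assumes doses are in increasing order, with placebo (dose 0) first.
    Returns None if no dose improves on placebo.
    """
    placebo = mean_response[0]
    effects = [m - placebo for m in mean_response]
    max_effect = max(effects)
    if max_effect <= 0:
        return None
    for dose, effect in zip(doses, effects):
        if effect >= q * max_effect:
            return dose
    return None


doses = [0, 10, 20, 40, 80, 120]
# HYPOTHETICAL mean responses, chosen to be non-monotonic in dose.
mean_response = [0.0, 0.5, 1.2, 1.95, 2.0, 1.9]

print("ED95 under this reading:", ed_q(doses, mean_response, q=0.95))
```

Here the maximum effect is 2.0 at dose 80, but dose 40 already achieves more than 95% of it, so dose 40 is returned.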
21Classical Approach
- Completely randomized design
- Perform F-test for difference between groups
- If significant at the chosen significance level, then call the trial a success, and determine the most effective dose as the lowest dose that achieves some pre-specified criterion (ED95)
22Bayesian Adaptive Approach
- Assign patients to doses adaptively based on the amount of information about the dose-response relationship.
- Goal: maximize expected change in information gain
- Weighted average of the posterior variances and the probability that a particular dose is the ED95 dose.
23Probability of Allocation
- Assign patients to doses based on
-
- Where is the probability of being assigned to
dose
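The exact allocation formula is not reproduced here. Purely as an illustration, the sketch below takes the earlier description literally: each dose's allocation probability is a weighted average of its share of the posterior variance and its probability of being the ED95 dose. The function name, the weight `w = 0.5`, and all input values are hypothetical assumptions, not the rule used in the trial.

```python
def allocation_probabilities(post_var, prob_ed95, w=0.5):
    """Illustrative allocation rule (an assumption, not the trial's formula).

    Combines each dose's share of total posterior variance with its
    probability of being the ED95 dose, using weight w on the variance share.
    """
    var_total = sum(post_var)
    prob_total = sum(prob_ed95)
    var_share = [v / var_total for v in post_var]
    ed_share = [p / prob_total for p in prob_ed95]
    return [w * v + (1 - w) * p for v, p in zip(var_share, ed_share)]


# HYPOTHETICAL posterior summaries for three doses.
post_var = [4.0, 2.0, 2.0]     # more uncertainty at the first dose
prob_ed95 = [0.1, 0.3, 0.6]    # third dose most likely to be the ED95

alloc = allocation_probabilities(post_var, prob_ed95)
print([round(r, 3) for r in alloc])
```

Doses that are either poorly estimated or likely to be the ED95 receive more patients, which reflects the two goals of the adaptive design described in these slides.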
24Four Decisions at Interim Looks
- Stop trial for success: the trial is a success, let's move on to the next phase.
- Stop trial for futility: the trial is going nowhere, let's stop now and cut our losses.
- Stop trial because the maximum number of patients allowed is reached (Stop for cap): trial outcome is still uncertain, but we can't afford to continue the trial.
- Continue
25Stop for Futility
- The dose-finding trial is stopped because there is insufficient evidence that any of the doses is efficacious.
- If the posterior probability that the mean change for the most likely ED95 dose is within a clinically meaningful amount of the placebo response is greater than 0.99, then the trial stops for futility.
26Stop for Success
- The dose-finding trial is stopped when the current probability that the ED95 is sufficiently efficacious is sufficiently high.
- If the posterior probability that the most likely ED95 dose is better than placebo reaches a high value (0.99 or higher), then the trial stops early for success.
- Note: posterior (after updated data) probability drives this decision.
27Stop for Cap
- Cap: if the sample size reaches the maximum (the cap) defined for all dose groups, the trial stops.
- Refine definition based on application. Perhaps one dose group reaching max is of interest.
- Almost always driven.
28Continue
- Continue: if none of the above three conditions holds, then the trial continues to accrue.
- Decision to continue or stop is made at each interim look at the data (accrual is in batches)
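The four decisions can be sketched as a single function evaluated at each interim look. The 0.99 thresholds come from the slides; the function and argument names are illustrative, the posterior probabilities are assumed to be computed elsewhere, and the order in which the conditions are checked is an assumption.

```python
def interim_decision(p_better_than_placebo, p_within_placebo_margin,
                     n_enrolled, n_cap,
                     success_cut=0.99, futility_cut=0.99):
    """Return one of the four interim decisions.

    p_better_than_placebo: posterior probability that the most likely
        ED95 dose is better than placebo.
    p_within_placebo_margin: posterior probability that its mean change is
        within a clinically meaningful amount of the placebo response.
    """
    if p_better_than_placebo >= success_cut:
        return "stop for success"
    if p_within_placebo_margin > futility_cut:
        return "stop for futility"
    if n_enrolled >= n_cap:
        return "stop for cap"
    return "continue"


# HYPOTHETICAL interim looks.
print(interim_decision(0.995, 0.001, n_enrolled=120, n_cap=780))
print(interim_decision(0.400, 0.995, n_enrolled=120, n_cap=780))
print(interim_decision(0.900, 0.050, n_enrolled=780, n_cap=780))
print(interim_decision(0.900, 0.050, n_enrolled=120, n_cap=780))
```

The four calls return "stop for success", "stop for futility", "stop for cap", and "continue" respectively.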
29Benefits of Approach
- Statistical: weighting by the variance of the response at each dose allows quicker resolution of the dose-response relationship.
- Medical: integrating over the probability that each dose is ED95 allows quicker allocation to more efficacious doses.
30Example of Approach
- Reduction in average number of events
- Y = reduction in number of events
- D = 6 (5 active, 1 placebo)
- Potential exists that there is a non-monotonic dose-response relationship.
- Let be the dose value for dose d.
31Model for Example
32Dynamic Model Properties
- Allows for flexibility.
- Borrows strength from neighboring doses and similarity of response at neighboring doses.
- Simplified version of Gaussian Process Models.
- Potential problem: semi-parametric, thus only considers doses within dose range
33Example Curves
34Simulations
- 5000 simulated trials at each of the 5 scenarios
- Fixed dose design,
- Bayesian adaptive approach as outlined above
- Compare two approaches for each of 5 cases with
sample size, power, and type-I error
35Results (power alpha)
Case   Pr(S)   Pr(F)   Pr(cap)   P(Rej)
1      .018    .973    .009      .049
2      1       0       0         .235
3      1       0       0         .759
4      1       0       0         .241
5      1       0       0         .802
36Results (n)
        Dose
Case    0      10     20     40     80     120
1       51.6   26.1   26.2   31.2   33.5   36.8
2       28.4   10.9   13.8   18.9   22.5   19.2
3       27.7   11.3   14.5   25.2   17     15.2
4       31.2   10.8   13.3   19.6   22.2   27.8
5       28.9   18.0   22.3   21.1   14.5   10.7
Fixed   130    130    130    130    130    130
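To compare overall sample sizes, the per-dose figures in the table can be summed for each case. The sketch below assumes each entry is the average number of patients allocated to that dose (columns being the dose values 0 through 120) and that the fixed design uses 130 patients at every dose.

```python
# Values copied from the Results (n) table; interpretation as average
# patients per dose is an assumption.
doses = [0, 10, 20, 40, 80, 120]
adaptive_n = {
    1: [51.6, 26.1, 26.2, 31.2, 33.5, 36.8],
    2: [28.4, 10.9, 13.8, 18.9, 22.5, 19.2],
    3: [27.7, 11.3, 14.5, 25.2, 17.0, 15.2],
    4: [31.2, 10.8, 13.3, 19.6, 22.2, 27.8],
    5: [28.9, 18.0, 22.3, 21.1, 14.5, 10.7],
}
fixed_n = [130] * len(doses)

totals = {case: sum(row) for case, row in adaptive_n.items()}
fixed_total = sum(fixed_n)

for case, total in totals.items():
    print(f"case {case}: adaptive total {total:.1f} vs fixed {fixed_total}")
```

Under this reading, the adaptive design averages roughly 110 to 205 patients per trial across the five cases, against 780 for the fixed design.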
37Observations
- Adaptive design serves two purposes
- Get patients to efficacious doses
- More efficient statistical estimation
- Sample size considerations
- Dose expansion -- inclusion of safety considerations
- Incorporation of uncertainties!!! Predictive inference is POWERFUL!!!
38Conclusions
- Science is subjective (what about the choice of a likelihood?)
- Bayes uses all available information
- Makes interpretation easier
- BAD NEWS: I have shown very simple cases . . . they get much harder.
- GOOD NEWS: They are possible (and practical) with advanced computational procedures