Data Mining in Pharmacovigilance

1
Data Mining in Pharmacovigilance
David Madigan, Aimin Feng, Ivan Zorych
Rutgers University
dmadigan@rutgers.edu
http://stat.rutgers.edu/madigan
2
What is Data Mining?
  • Finding interesting structure in data
  • Structure refers to statistical patterns,
    predictive models, hidden relationships
  • Examples of tasks addressed by Data Mining
  • Predictive Modeling (classification, regression)
  • Segmentation (Data Clustering)
  • Anomaly Detection
  • Visualization

3
Ronny Kohavi, ICML 1998
6
Safety in Lifecycle of a Drug/Biologic product
[Figure: timeline of stages: Pre-clinical, Phase 1, Phase 2, Phase 3, APPROVAL, Post-Marketing Safety Monitoring. Stage annotations: Safety; Safety, Dose-Ranging; Safety, Efficacy; Safety Concern.]
7
Why Post-marketing Surveillance?
  • Limitations on pre-licensure trials:
  • Size
  • Duration
  • Patient population: age, comorbidity, severity
  • Fact:
  • Several hundred drugs have been removed from
    the market in the last 30 years due to safety
    problems that became known after approval

9
Databases of Spontaneous ADRs
  • FDA Adverse Event Reporting System (AERS)
  • Online since 1997; replaced the Spontaneous Reporting System (SRS)
  • Over 250,000 ADR reports annually
  • 15,000 drugs, 16,000 ADRs
  • CDC/FDA Vaccine Adverse Event Reporting System (VAERS)
  • Initiated in 1990
  • 12,000 reports per year
  • 50 vaccines and 700 adverse events
  • Other SRSs
  • WHO international pharmacovigilance program

11
Weakness of SRS Data
  • Passive surveillance
  • Underreporting
  • Lack of accurate denominator, only numerator
  • Numerator: number of reports of suspected reaction
  • Denominator: number of doses of administered drug
  • Lack of known background rates of disease
  • No certainty that a reported reaction was causal
  • Missing, inaccurate or duplicated data


14
Existing Methods
  • Multi-item Gamma Poisson Shrinker (MGPS)
  • US Food and Drug Administration (FDA)
  • Bayesian Confidence Propagation Neural Network
  • WHO Uppsala Monitoring Centre (UMC)
  • Proportional Reporting Ratio (PRR and aPRR)
  • UK Medicines Control Agency (MCA)
  • Reporting Odds Ratios and Incidence Rate Ratios
  • Other national spontaneous reporting centers and
    drug safety research units

15
Existing Methods (Cont'd)
  • Focus on 2x2 contingency table projections
  • 15,000 drugs x 16,000 AEs = 240 million tables
  • Most Nij = 0, even though N.. is very large
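
As a rough illustration of what one 2x2 projection yields, the sketch below computes three of the measures named on the previous slide (relative reporting ratio, PRR, and reporting odds ratio) for a single drug-AE pair. The cell labels a, b, c, d and the counts are hypothetical and chosen only for the example; they do not come from AERS.

```python
def two_by_two_measures(a, b, c, d):
    """Disproportionality measures for one drug-AE pair.

    a: reports with the drug and the AE
    b: reports with the drug, without the AE
    c: reports without the drug, with the AE
    d: reports with neither
    """
    total = a + b + c + d
    expected = (a + b) * (a + c) / total   # count expected under independence
    rr = a / expected                      # relative reporting ratio N/E
    prr = (a / (a + b)) / (c / (c + d))    # proportional reporting ratio
    ror = (a * d) / (b * c)                # reporting odds ratio
    return {"RR": rr, "PRR": prr, "ROR": ror}


# Hypothetical counts for illustration only.
measures = two_by_two_measures(a=20, b=80, c=100, d=9800)
print(measures)
```

All three measures break down when a cell is zero or tiny, which is the common situation when most Nij are 0.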

16
The Different Measures
17
These Measures Are Not Robust
18
These Measures Are Not Robust
Nij/Eij is the same in both cases
19
Think about this
Denote by θ the probability that the next
operation in Hospital A results in a death. Use
the data to estimate (i.e., guess the value of) θ.
20
Think about this
Denote by θi the probability that the next
operation in Hospital i results in a death. Assume
θi ~ Beta(a, b). Compute the joint posterior
distribution for all the θi simultaneously.
21
Borrowing strength: shrinks each estimate towards the
common mean (7.4). Empirical Bayes: use the data
to estimate a and b.
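
A minimal sketch of the shrinkage idea, assuming a crude method-of-moments fit for a and b (the slides do not say how a and b were estimated, and a fuller treatment would maximize the beta-binomial marginal likelihood). The hospital counts are hypothetical, apart from 0 deaths in 27 operations, which appears in a later slide title.

```python
def empirical_bayes_rates(deaths, operations):
    """Shrink raw death rates towards a common mean with a Beta(a, b) prior.

    a and b are matched to the mean and variance of the raw rates. This is
    a crude moment estimate that ignores binomial sampling noise.
    """
    raw = [d / n for d, n in zip(deaths, operations)]
    mean = sum(raw) / len(raw)
    var = sum((r - mean) ** 2 for r in raw) / len(raw)
    strength = mean * (1 - mean) / var - 1      # this is a + b
    a = mean * strength
    b = (1 - mean) * strength
    # Posterior mean of each rate under Beta(a + deaths, b + survivors).
    shrunk = [(a + d) / (a + b + n) for d, n in zip(deaths, operations)]
    return a, b, raw, shrunk


# Hypothetical data: (deaths, operations) for five hospitals.
deaths = [0, 12, 5, 30, 9]
operations = [27, 150, 100, 400, 180]
a, b, raw, shrunk = empirical_bayes_rates(deaths, operations)
for r, s in zip(raw, shrunk):
    print(f"raw {r:.3f} -> shrunk {s:.3f}")
```

Each shrunk rate is a weighted average of the hospital's raw rate and the common mean a/(a+b), with small hospitals pulled hardest.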
22
Relative Reporting Ratio (RRij = Nij/Eij)
  • Advantages
  • Simple
  • Easy to interpret
  • Disadvantages
  • Extreme sampling variability when baseline and
    observed frequencies are small
  • (N = 1, E = 0.01 vs. N = 100, E = 1)
  • GPS provides a shrinkage estimate of RR that
    addresses this concern.

Eij = Ni. N.j / N.., so RRij = Nij N.. / (Ni. N.j)
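
The expected counts can be sketched for a whole drug-by-AE table at once, assuming the usual independence baseline Eij = Ni. N.j / N.. (row total times column total over grand total). The 2x2 table of counts is hypothetical.

```python
def expected_counts(counts):
    """Expected count for each cell under independence of drug and AE."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def relative_reporting_ratios(counts):
    """RRij = Nij / Eij; assumes every drug and AE has at least one report."""
    expected = expected_counts(counts)
    return [[n / e for n, e in zip(n_row, e_row)]
            for n_row, e_row in zip(counts, expected)]


# Hypothetical table: rows are drugs, columns are adverse events.
counts = [[10, 20], [30, 40]]
expected = expected_counts(counts)
rr = relative_reporting_ratios(counts)
print(expected)
print(rr)

# The two cases from the slide give the same ratio on very different evidence.
print(1 / 0.01, 100 / 1)
```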
23
GPS/MGPS
  • GPS/MGPS follows the same recipe as for the
    hospitals
  • Denote by λij the true RR for Drug i and AE j
  • Assumes the λij's arise from a particular
    5-parameter distribution
  • Use empirical Bayes to use the data to estimate
    these five parameters.

24
GPS-EBGM
  • Define λij = μij / Eij, where
  • Nij ~ Poisson(μij)
  • λij ~ p g(λ; α1, β1) + (1 - p) g(λ; α2, β2)
  • a mixture of two Gamma distributions
  • EBGM = geometric mean of the posterior distribution of λij
  • Estimates μij / Eij
  • Shrinks Nij / Eij
  • EB05 = 5th percentile of the posterior distribution of λij
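
A sketch of the EBGM computation under the model above, assuming g(λ; α, β) is a Gamma density with shape α and rate β. Under that assumption the posterior is again a two-component Gamma mixture, and EBGM = exp(E[log λ | N]). The five hyperparameter values below are hypothetical placeholders; in GPS they would be estimated from the data by empirical Bayes. The digamma helper is hand-written because Python's standard library does not provide one.

```python
import math


def digamma(x):
    """Digamma function for x > 0: recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def log_marginal(n, e, alpha, beta):
    """Log negative-binomial marginal of N when N ~ Poisson(lambda * E)
    and lambda ~ Gamma(shape=alpha, rate=beta)."""
    return (math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
            + alpha * math.log(beta / (beta + e))
            + n * math.log(e / (beta + e)))


def ebgm(n, e, p, alpha1, beta1, alpha2, beta2):
    """Geometric mean of the posterior of lambda given N = n and E = e."""
    log_w1 = math.log(p) + log_marginal(n, e, alpha1, beta1)
    log_w2 = math.log(1 - p) + log_marginal(n, e, alpha2, beta2)
    diff = log_w2 - log_w1
    # Posterior weight of the first Gamma component (guarded against overflow).
    q = 0.0 if diff > 700 else 1.0 / (1.0 + math.exp(diff))
    expected_log = (q * (digamma(alpha1 + n) - math.log(beta1 + e))
                    + (1 - q) * (digamma(alpha2 + n) - math.log(beta2 + e)))
    return math.exp(expected_log)


# Hypothetical hyperparameters, for illustration only.
params = dict(p=1 / 3, alpha1=0.2, beta1=0.1, alpha2=2.0, beta2=4.0)
sparse = ebgm(1, 0.01, **params)    # raw N/E = 100, from a single report
dense = ebgm(100, 1.0, **params)    # raw N/E = 100, from 100 reports
print(sparse, dense)
```

With these placeholder values the single-report case is shrunk far below its raw ratio of 100, while the 100-report case stays close to it, which is the behaviour the slide describes.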

25
[Figure: GPS shrinkage on AERS data; axis label: number of reports]
26
Simpson's Paradox
  • Contingency table analysis ignores effects of
    drug-drug association on drug-AE association

[Diagram: nodes Ganclex, Rosinex, and Nausea, with one link marked X]
27
Rosinex, Ganclex, Nausea
P(Rosinex=1) = 0.1
P(Ganclex=1 | Rosinex=1) = 0.9, P(Ganclex=1 | Rosinex=0) = 0.01
P(Nausea=1 | Rosinex=1) = 0.9, P(Nausea=1 | Rosinex=0) = 0.1
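
The probabilities on this slide can be worked through directly. The sketch below assumes that nausea depends only on Rosinex (that is, Ganclex has no effect on nausea once Rosinex use is known), which is how the example appears to be constructed but is not stated in the transcript.

```python
# Probabilities from the slide.
p_rosinex = 0.1
p_ganclex_given_rosinex = {1: 0.9, 0: 0.01}
p_nausea_given_rosinex = {1: 0.9, 0: 0.1}


def p_nausea_given_ganclex(g):
    """P(Nausea=1 | Ganclex=g), marginalizing over Rosinex.

    Assumes nausea is independent of Ganclex given Rosinex.
    """
    numerator = 0.0
    denominator = 0.0
    for r in (0, 1):
        p_r = p_rosinex if r == 1 else 1 - p_rosinex
        p_g = p_ganclex_given_rosinex[r]
        if g == 0:
            p_g = 1 - p_g
        numerator += p_r * p_g * p_nausea_given_rosinex[r]
        denominator += p_r * p_g
    return numerator / denominator


with_ganclex = p_nausea_given_ganclex(1)
without_ganclex = p_nausea_given_ganclex(0)
print(with_ganclex, without_ganclex, with_ganclex / without_ganclex)
```

Under that assumption a 2x2 table for Ganclex and nausea shows a strong association (roughly 0.83 versus 0.11) even though Ganclex has no effect of its own, because Ganclex is mostly co-prescribed with Rosinex.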
28
Logistic Regression
  • log[P/(1 - P)] = intercept + Σ (each drug effect)
  • P = Pr(report with these drugs will have the AE)
  • Classic logistic regression is hard to scale up
  • Huge number of predictors (drugs, drug x drug,
    etc.)
  • Alternative approach
  • Bayesian Logistic Regression (Shrinkage Method)
  • Put a prior on the coefficients (β1, ..., βp), and
    shrink their estimates towards zero
  • Stabilizes the estimation when there are many
    predictors
  • Bayesian solution to the multiple comparison
    problem

29
Bayesian Logistic Regression
  • Two shrinkage methods
  • Ridge regression - Gaussian prior
  • βj ~ N(0, τ)
  • Lasso regression - Laplace prior
  • f(βj) ∝ exp(-λ|βj|)
  • Choosing the hyperparameter (τ or λ)
  • Decide how much to shrink
  • Cross-validation: choose the prior that best fits
    left-out data
  • Aggregation method by Bunea and Nobel (2005)
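
To make the shrinkage concrete, here is a minimal sketch of a MAP fit for logistic regression with a Gaussian (ridge) prior, using plain gradient ascent. It is not the BBR software and makes several assumptions: the prior variance is a single shared value, the intercept is left unpenalized, and the ten reports (two drug indicators and an AE flag) are hypothetical. The Laplace prior is not covered because its non-smooth penalty needs a different optimizer.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_map_logistic(X, y, prior_variance, steps=5000, lr=0.1):
    """Maximize log-likelihood minus sum(beta_j^2) / (2 * prior_variance)."""
    n = len(X)
    intercept = 0.0
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad_intercept = 0.0
        grad_beta = [-b / prior_variance for b in beta]   # prior gradient
        for row, label in zip(X, y):
            p = sigmoid(intercept + sum(b * x for b, x in zip(beta, row)))
            err = label - p
            grad_intercept += err
            for j, x in enumerate(row):
                grad_beta[j] += err * x
        intercept += lr * grad_intercept / n
        beta = [b + lr * g / n for b, g in zip(beta, grad_beta)]
    return intercept, beta


# Hypothetical reports: columns are indicators for drug A and drug B.
X = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 1],
     [0, 1], [0, 0], [0, 0], [1, 0], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]      # 1 = report mentions the AE

_, beta_loose = fit_map_logistic(X, y, prior_variance=100.0)
_, beta_tight = fit_map_logistic(X, y, prior_variance=0.1)
print(beta_loose, beta_tight)
```

A smaller prior variance pulls the coefficients (log odds ratios) closer to zero; choosing that variance is the hyperparameter question raised on the slide.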

31
Bayesian Logistic Regression
  • Software: Bayesian Binary Regression (BBR)
  • http://stat.rutgers.edu/madigan/BBR
  • Two priors: Gaussian and Laplace
  • Hyperparameter: fixed, default, or CV
  • Handles millions of predictors efficiently
  • Safety signal: an apparent excess of an adverse
    effect associated with use of a drug
  • Coefficients βs: logs of odds ratios
  • Pr(AEj | drugi) - Pr(AEj | not drugi)

32
Evaluation Strategies
  • Top-Rank Plot for Safety Signal
  • To compare the timeliness of outbreak detection
  • Similar to AMOC (Activity Monitor Operating
    Characteristic) curve in fraud detection
  • Y: window (month in 1999)
  • X: top rank of association from window 1 to the
    corresponding window

33
RV vs. INTUSS
  • Rotavirus
  • Severe diarrhea (with fever and vomiting)
  • Hospitalizes 55,000 children each year in the US
  • Intussusception (INTUSS)
  • Uncommon type of bowel obstruction
  • RotaShield (RV)
  • Licensed on 8/31/1998 in US
  • Recommended for routine use in infants
  • Increased the risk for intussusception
  • 1 or 2 cases among each 10,000 infants
  • On 10/14/1999, the manufacturer withdrew RV

36
Simulation
  • Step-by-step procedure
  • Choose either a rare (5, 1), intermediate (50,
    3), or common (95, 100) vaccine - adverse event
    (V-A) combination
  • Use year 1998 data as baseline
  • Add extra report(s) per month of 1999 containing
    the chosen V-A combination
  • Generate the AMOC curve

41
Conclusions of Simulation
  • The Bayesian Logistic Regressions (Normal-CV and
    Laplace-CV) signal consistently, and are at least
    as good as the MGPS method
  • Simple RR cannot signal for intermediate and
    common cases
  • GPS is relatively good on rare and intermediate
    cases, but not stable on common cases
  • Pattern of dependencies in AERS is likely to be
    more complex than in VAERS

42
Discussion of Logistic Method
  • Advantages over low-dimensional tables
  • Corrects for some confounding and masking effects
  • Analyze multiple drugs/vaccines simultaneously
  • Limitations
  • Build separate model for each AE
  • Ignore dependencies between AEs
  • Fail to adjust for unmeasured/unrecorded factors
  • health status, unreported drugs, etc.
  • Model-based approach
  • Require model assumptions

43
Causal Inference View
  • Rubin causal model
  • Potential outcomes
  • Factual outcome
  • I'm a smoker and I get lung cancer
  • Counterfactual outcome
  • If I hadn't been a smoker, I wouldn't have gotten
    lung cancer
  • Define
  • Zi = treatment applied to unit i (0 = control,
    1 = treat)
  • Yi(0) = response for unit i if Zi = 0
  • Yi(1) = response for unit i if Zi = 1
  • Unit-level causal effect: Yi(1) - Yi(0)
  • Fundamental problem: we only see one of these!

45
Bias Due To Confounding
  • Individuals are observed already under their
    respective conditions
  • The two groups may differ in ways other than just
    the observed condition
  • Average effects may be biased due to confounding
    between covariates and group condition
  • We can simulate randomization, or a counterfactual
    world, using information from an observational
    study... sort of

46
Propensity Score Method
  • Definition
  • e(xi) = P(Zi = 1 | Xi = xi)
  • Conditional probability of assignment to test
    treatment Zi = 1 given observed covariates
  • Assuming no unmeasured confounders, stratifying
    on e(xi) leads to causal inferences just as valid
    as in randomized trials
  • Methods with propensity scores
  • Inverse weighting
  • Regression adjustment
  • Matching
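
Inverse weighting can be sketched on a small constructed population. Everything here is hypothetical: one binary covariate x, treatment that is more likely when x = 1, and an outcome built as y = z + 2x, so the true treatment effect is 1 by construction. The propensity e(x) is estimated as the treated fraction within each covariate stratum.

```python
# Each unit is (x, z, y): covariate, treatment indicator, observed outcome.
# Hypothetical population: outcome is z + 2x, so the true effect of z is 1.
units = []
for x, n_treated, n_control in [(0, 10, 40), (1, 40, 10)]:
    units += [(x, 1, 1.0 + 2.0 * x)] * n_treated
    units += [(x, 0, 2.0 * x)] * n_control


def propensity_by_stratum(units):
    """Estimate e(x) = P(Z=1 | X=x) as the treated fraction in each stratum."""
    counts = {}
    for x, z, _ in units:
        treated, total = counts.get(x, (0, 0))
        counts[x] = (treated + z, total + 1)
    return {x: treated / total for x, (treated, total) in counts.items()}


def naive_difference(units):
    """Difference in mean outcome, treated minus control, ignoring x."""
    treated = [y for _, z, y in units if z == 1]
    control = [y for _, z, y in units if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_effect(units):
    """Inverse-probability-weighted estimate of the average treatment effect."""
    e = propensity_by_stratum(units)
    total = 0.0
    for x, z, y in units:
        total += z * y / e[x] - (1 - z) * y / (1 - e[x])
    return total / len(units)


naive = naive_difference(units)
ipw = ipw_effect(units)
print(naive, ipw)
```

The naive comparison overstates the effect because treated units mostly have x = 1; weighting by the propensity score recovers the built-in effect, but only because x is the sole confounder and it is observed.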

50
Conclusion
  • First generation Method
  • Contingency table methods
  • Deal with each drug and each adverse event in
    isolation
  • Second generation Method
  • Bayesian logistic regression
  • Propensity score
  • Deal with large numbers of drugs jointly and with
    multi-drug interactions
  • Ultimate Method
  • Not only interactions and relationships among
    drugs, but also among adverse events
  • Question: which sets of drugs cause which sets of
    adverse events?

52
Overview
  • Brief Introduction to Data Mining
  • Data Mining Algorithms
  • Currently fashionable DMAs for drug safety
  • Future Directions, etc.

53
Of Laws, Monsters, and Giants
  • Moore's law: processing capacity doubles every
    18 months (CPU, cache, memory)
  • Its more aggressive cousin:
  • Disk storage capacity doubles every 9 months

What do the two laws combined produce? A
rapidly growing gap between our ability to store
data, and our ability to make use of it.
54
Data Mining Algorithms
"A data mining algorithm is a well-defined
procedure that takes data as input and produces
output in the form of models or patterns"
(Hand, Mannila, and Smyth)
  • well-defined: can be encoded in software
  • algorithm: must terminate after some finite
    number of steps
55
Algorithm Components
1. The task the algorithm is used to address
   (e.g. classification, clustering, etc.)
2. The structure of the model or pattern we are
   fitting to the data (e.g. a linear regression model)
3. The score function used to judge the quality of
   the fitted models or patterns (e.g. accuracy, BIC, etc.)
4. The search or optimization method used to search
   over parameters and/or structures (e.g. steepest
   descent, MCMC, etc.)
5. The data management technique used for storing,
   indexing, and retrieving data (critical when data
   too large to reside in memory)
56
Association Rules Support and Confidence
[Venn diagram: customer buys diapers, customer buys beer, customer buys both]
  • Find all the rules Y ⇒ Z with minimum confidence
    and support
  • support, s: probability that a transaction
    contains both Y and Z
  • confidence, c: conditional probability that a
    transaction having Y also contains Z

  • Let minimum support = 50% and minimum confidence
    = 50%; we have
  • A ⇒ C (50%, 66.6%)
  • C ⇒ A (50%, 100%)
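
Support and confidence are simple to compute directly. The transaction database behind the slide's numbers is not in the transcript, so the four transactions below are a hypothetical database chosen to reproduce them.

```python
# Hypothetical transaction database (not from the slides).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Conditional probability that a transaction containing the
    antecedent also contains the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


support_ac = support({"A", "C"}, transactions)
conf_a_to_c = confidence({"A"}, {"C"}, transactions)
conf_c_to_a = confidence({"C"}, {"A"}, transactions)
print(support_ac, conf_a_to_c, conf_c_to_a)
```

The two rules share the same support but differ in confidence because A appears in three transactions and C in only two.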

57
Mining Association Rules: An Example
Min. support 50%, min. confidence 50%
  • For rule A ⇒ C
  • support = support({A, C}) = 50%
  • confidence = support({A, C})/support({A}) = 66.6%
  • The Apriori principle
  • Any subset of a frequent itemset must be frequent

58
Mining Frequent Itemsets: the Key Step
  • Find the frequent itemsets: the sets of items
    that have minimum support
  • A subset of a frequent itemset must also be a
    frequent itemset
  • i.e., if {A, B} is a frequent itemset, both {A} and
    {B} must be frequent itemsets
  • Iteratively find frequent itemsets with
    cardinality from 1 to k (k-itemset)
  • Use the frequent itemsets to generate association
    rules.
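
The level-wise search can be sketched as follows. This is a minimal, unoptimized version of the Apriori idea (join frequent (k-1)-itemsets, prune candidates with an infrequent subset, then count support), run on a hypothetical four-transaction database that is not taken from the slides.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    items = {item for t in transactions for item in t}
    level = keep_frequent(frozenset([i]) for i in items)
    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = keep_frequent(candidates)
    return frequent


# Hypothetical transaction database.
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), s)
```

The prune step is where the Apriori principle pays off: a candidate is never counted against the database if any of its subsets already failed the support threshold.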

59
The Apriori Algorithm Example
[Figure: Apriori worked example on database D: scan D to get candidates C1 and frequent set L1, form C2, scan D to get L2, form C3, scan D to get L3]
60
Association Rule Mining A Road Map
  • Boolean vs. quantitative associations (Based on
    the types of values handled)
  • buys(x, SQLServer) ∧ buys(x, DMBook) ⇒
    buys(x, DBMiner) [0.2%, 60%]
  • age(x, 30..39) ∧ income(x, 42..48K) ⇒
    buys(x, PC) [1%, 75%]
  • Single dimension vs. multiple dimensional
    associations (see examples above)
  • Single level vs. multiple-level analysis
  • What brands of beers are associated with what
    brands of diapers?
  • Various extensions (thousands!)

62
Statistics
The subject of statistics concerns itself with
using data to make inferences and predictions
about the world.
Researchers assembled the vast bulk of the
statistical knowledge base prior to the
availability of significant computing.
Lots of assumptions and brilliant mathematics took
the place of computing and led to useful and
widely-used tools.
Serious limits on the applicability of many of
these methods: small data sets, unrealistically
simple models.
Produce hard-to-interpret outputs like p-values
and confidence intervals.
63
Bayesian Statistics
The Bayesian approach has deep historical roots
but required the algorithmic developments of the
late 1980s before it was of any use.
The old sterile Bayesian-Frequentist debates are
a thing of the past.
Most data analysts take a pragmatic point of view
and use whatever is most useful.
64
Bayes Theorem
65
Bayes Theorem Example
66
Bayes Theorem for Densities
67
Hospital Example (0/27)
[Figure: prior distribution, likelihood, and posterior distribution]
69
Unreasonable prior distribution implies
unreasonable posterior distribution
70
What to report?
  • Mode? Mean? Median?
  • Posterior probability that θ exceeds 0.2?
  • θ* such that Pr(θ > θ*) = 0.05
  • θ* such that Pr(θ > θ*) = 0.95
Posterior probability that θ is in (0.002, 0.095) is 90%
[Figure: posterior density annotated with the values 0.002, 0.013, 0.023, 0.032, and 0.095]
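
The candidate summaries can be computed in closed form for the hospital example (0 deaths in 27 operations) if one assumes a uniform Beta(1, 1) prior. That prior is an assumption made here for illustration; the prior used on the slides is not in the transcript, so the numbers below will not match the slide's values exactly.

```python
deaths, operations = 0, 27
a_prior, b_prior = 1.0, 1.0      # hypothetical uniform prior, not from the slides

# Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior.
a_post = a_prior + deaths
b_post = b_prior + operations - deaths

posterior_mean = a_post / (a_post + b_post)


def posterior_cdf(theta):
    """CDF of Beta(1, b_post); this closed form requires a_post == 1."""
    return 1.0 - (1.0 - theta) ** b_post


def posterior_quantile(prob):
    """Inverse of posterior_cdf; also requires a_post == 1."""
    return 1.0 - (1.0 - prob) ** (1.0 / b_post)


median = posterior_quantile(0.5)
lower, upper = posterior_quantile(0.05), posterior_quantile(0.95)
prob_exceeds_02 = 1.0 - posterior_cdf(0.2)

print(f"mean {posterior_mean:.4f}, median {median:.4f}")
print(f"90% interval ({lower:.4f}, {upper:.4f})")
print(f"Pr(theta > 0.2) = {prob_exceeds_02:.4f}")
```

With zero observed deaths and a flat prior the posterior mode sits at 0, so the mean, median, and interval give noticeably different pictures of the same distribution.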