Title: From Population to Individual Drug Dosing in Chronic Illness
1 From Population to Individual Drug Dosing in Chronic Illness
- Intelligent Control for Management of Renal Anemia
Adam E Gaweda, University of Louisville, Department of Medicine
Challenges in Dynamic Treatment Regimes and Multistage Decision-Making
2 Overview
- Anemia management
- Dose-response modeling
- Model-based control in drug dosing
- Model-free control in drug dosing
3 Anemia Management: Biological vs. clinical
4 Anemia Management: Clinical guidelines
- Dosing guidelines (NKF KDOQI)
  - Maintain Haemoglobin (Hb) between 11 and 12 g/dL (Hematocrit (Hct) between 33% and 36%).
- Titration of EPO
  - If the increase in Hb after EPO initiation or after a dose increase has been less than 1 g/dL over a 2- to 4-week period, the dose of EPO should be increased by 50%.
  - If the absolute rate of increase of Hb after EPO initiation or after a dose increase exceeds 3 g/dL per month (e.g., an increase from an Hgb of 7 to 10 g/dL), or if the Hgb exceeds the target, reduce the weekly dose of EPO by 25%.
  - When the weekly EPO dose is being increased or decreased, a change may be made in the amount administered in a given dose and/or the frequency of dosing.
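
The titration rules quoted above can be written as a small decision function. This is only a sketch under stated assumptions, not the guideline's own algorithm: the function name and arguments are hypothetical, a month is taken as 4 weeks, and the 50% increase is applied only while Hb is still below the 11 g/dL lower bound (the quoted text does not state that condition).

```python
def titrate_weekly_epo(dose, hb_change, weeks, hb_now,
                       target_low=11.0, target_high=12.0):
    """Return an adjusted weekly EPO dose following the quoted titration rules.

    dose      -- current weekly EPO dose (any consistent unit)
    hb_change -- Hb change in g/dL since initiation or the last dose increase
    weeks     -- length of the observation window (the rule covers 2 to 4 weeks)
    hb_now    -- latest Hb value in g/dL
    """
    if not 2 <= weeks <= 4:
        raise ValueError("the rule is stated for a 2- to 4-week window")
    monthly_rate = hb_change * 4.0 / weeks  # assumption: 4 weeks per month
    if monthly_rate > 3.0 or hb_now > target_high:
        return dose * 0.75  # reduce the weekly dose by 25%
    # Assumption: only escalate while Hb is below the target range.
    if hb_change < 1.0 and hb_now < target_low:
        return dose * 1.5   # increase the dose by 50%
    return dose             # otherwise leave the dose unchanged


print(titrate_weekly_epo(10000, hb_change=0.5, weeks=4, hb_now=9.5))
print(titrate_weekly_epo(10000, hb_change=3.5, weeks=4, hb_now=10.5))
```

The first call falls under the "less than 1 g/dL" rule and the second under the "exceeds 3 g/dL per month" rule.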
5 Anemia Management: Current state-of-the-art
- Anemia Management Protocols (AMP)
  - Frequency of Hb observation
    - Every 4 weeks if Hb within the target
    - Every 2 weeks if Hb outside of the target
  - EPO dose adjustment
    - Minimum adjustment amount: 10% (of current dose)
    - Maximum decrease: 50% (if Hb > 15 g/dL)
    - Maximum increase: 70% (if Hb < 9 g/dL)
- Problem with AMP
  - Based on average response.
  - Only 1/3 of the patient population achieve the target.
- Can we improve the outcome of anemia management by making it patient-specific using control theory and machine learning techniques?
6 Dose-response modeling: Overview
- In control system design and simulation, a good process model is priceless.
- Models of erythropoiesis
  - Physiological model (Uehlinger et al. 1992)
  - PK/PD model (Brockmöller et al. 1992)
  - Bayesian network model (Bellazzi et al. 1993)
  - Artificial Neural Network (ANN) models (Martin Guerrero et al. 2003, Gaweda et al. 2003, Gabutti et al. 2006)
7 Dose-response modeling: Population vs. subpopulation modeling
8 Dose-response modeling: Example of response prediction
9 Dose-response modeling: Open problems
- Prediction seems to lag behind the actual value
- Do our data allow us to build a model that shows the true effect of EPO on Hb (Hct)?
- Let's estimate a dynamic linear model Hb(k+1) = f( Hb(k), EPO(k) )
  - Hbm(k+1) = 0.82 Hb(k) + 0.011 EPO(k) + 1.91
- Let's now estimate a model of ΔHb(k+1) = f( EPO(k) )
  - ΔHbm(k+1) = 0.015 EPO(k) - 0.23
- Both models achieve comparable accuracy.
- The second model explains the dose effect better.
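
A minimal sketch of how the two linear models on this slide could be estimated by ordinary least squares. Everything here is illustrative: the data are synthetic (generated from the increment model with added noise and randomly chosen doses), the dose units are arbitrary because the slides do not state them, and the function names are hypothetical. It is not the fitting procedure used in the study.

```python
import numpy as np


def simulate_delta_data(n=500, seed=0):
    """Synthetic Hb/EPO series (assumption: increment model plus Gaussian noise)."""
    rng = np.random.default_rng(seed)
    epo = rng.uniform(0.0, 30.0, size=n)   # open-loop, randomly chosen doses
    hb = np.empty(n)
    hb[0] = 11.0
    for k in range(n - 1):
        hb[k + 1] = hb[k] + 0.015 * epo[k] - 0.23 + rng.normal(0.0, 0.1)
    return hb, epo


def fit_level_model(hb, epo):
    """Fit Hb(k+1) = a*Hb(k) + b*EPO(k) + c; returns (a, b, c)."""
    X = np.column_stack([hb[:-1], epo[:-1], np.ones(len(hb) - 1)])
    coef, *_ = np.linalg.lstsq(X, hb[1:], rcond=None)
    return coef


def fit_delta_model(hb, epo):
    """Fit dHb(k+1) = b*EPO(k) + c, where dHb(k+1) = Hb(k+1) - Hb(k); returns (b, c)."""
    X = np.column_stack([epo[:-1], np.ones(len(hb) - 1)])
    coef, *_ = np.linalg.lstsq(X, np.diff(hb), rcond=None)
    return coef


hb, epo = simulate_delta_data()
print("level model (a, b, c):", fit_level_model(hb, epo))
print("delta model (b, c):   ", fit_delta_model(hb, epo))
```

Because the synthetic doses are chosen at random rather than by a clinician reacting to Hb, this example does not reproduce the closed-loop effect that the slides raise as an open problem.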
10 Dose-response modeling: Open problems
- Our data come from clinical treatment (closed-loop system)
- How does that affect the model?
Martin Guerrero et al. report the same phenomenon.
11 Model-based control: Model Predictive Control (MPC)
- Rationale for using Model Predictive Control
  - There is a delay between EPO administration and Hb response (about 17 days, from EPO manufacturer information).
  - The relationship between EPO dose and Hb increase is nonlinear (monotonically increasing with saturation; Uehlinger et al. 1992).
  - The effect of EPO continues throughout the lifetime of red blood cells (up to 120 days).
  - We plan to include constraints on EPO dose in the future (such as minimization of the total dose or minimization of dose changes).
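12 Model-based control: MPC - Schematic diagram
[Diagram: blocks CONTROLLER, PATIENT, and MODEL (population); signals EPO (dose), Hb (patient response), and Hbm (model prediction)]
MODEL (population): Hb(k+1) = Hb(k) + FNN( EPO(k), EPO(k-1), EPO(k-2) )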
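
To make the loop in the schematic concrete, here is a minimal receding-horizon sketch. It assumes the model has the additive form Hb(k+1) = Hb(k) + FNN(EPO(k), EPO(k-1), EPO(k-2)) and replaces the trained FNN with a hypothetical saturating function, `delta_hb`, whose coefficients are made up for illustration. The controller searches a grid of candidate doses and holds each candidate constant over a short horizon. The actual controller used in the trial is not described on the slide.

```python
import math

TARGET_HB = 11.5  # g/dL, the treatment goal used in the slides


def delta_hb(e0, e1, e2):
    """Hypothetical stand-in for FNN: monotonically increasing with saturation."""
    drive = 0.5 * e0 + 0.3 * e1 + 0.2 * e2
    return 1.2 * (1.0 - math.exp(-drive / 10.0)) - 0.5


def predict(hb, history, dose, horizon):
    """Roll the model forward with `dose` held constant; history = (EPO(k-1), EPO(k-2))."""
    e1, e2 = history
    path = []
    for _ in range(horizon):
        hb = hb + delta_hb(dose, e1, e2)
        path.append(hb)
        e1, e2 = dose, e1
    return path


def mpc_dose(hb, history, candidates, horizon=3):
    """Pick the candidate dose with the smallest squared deviation from target."""
    def cost(dose):
        return sum((h - TARGET_HB) ** 2 for h in predict(hb, history, dose, horizon))
    return min(candidates, key=cost)


def simulate(hb0, steps, candidates):
    """Closed loop in which the 'patient' is the model itself (no mismatch)."""
    hb, history, log = hb0, (0, 0), []
    for _ in range(steps):
        dose = mpc_dose(hb, history, candidates)
        hb = hb + delta_hb(dose, *history)
        history = (dose, history[0])
        log.append((dose, hb))
    return log


for dose, hb in simulate(9.0, steps=8, candidates=range(0, 31)):
    print(f"dose={dose:2d}  Hb={hb:.2f}")
```

Only the first dose of each optimized plan is applied before re-planning, which is the receding-horizon idea; dose constraints such as limits on dose changes would enter through the cost function or the candidate set.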
13 Model-based control: MPC clinical trial - setup
- Trial population
  - 60 patients
  - 30 controls (dosed by physicians) / 30 treatment (dosed by MPC)
  - 45 African-American / 15 Caucasian
  - 35 males / 25 females
  - Average age 58, min 21, max 84
- Trial length
  - 8 months
  - 2 months wash-out period / 6 months for outcome analysis
- Treatment goal
  - Maintain Hb at 11.5 g/dL
  - Performance measure: mean absolute deviation from 11.5 g/dL
14 Model-based control: MPC - Clinical trial results (thus far)
[Figure: mean |11.5 - Hb| by month]
15 Model-based control: Open problems
- Simulating MPC
  - How do we accurately represent the mismatch between the model and the patient?
  - How do we effectively simulate adverse events?
- Measuring success
  - We try to individualize the treatment, yet we use a mean performance measure. What are the alternatives?
  - Individual performance measures (e.g. within-subject StDev of Hb)?
  - How do we eliminate the influence of Hb changes due to adverse events on the performance measure?
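
The contrast between a mean performance measure and an individual one can be illustrated with a few lines of code. This sketch uses invented Hb series for two hypothetical patients; the function names are assumptions, and the within-subject measure shown is the sample standard deviation mentioned on the slide as one candidate.

```python
from statistics import mean, stdev


def mean_abs_deviation(hb_by_patient, target=11.5):
    """Population measure: mean |target - Hb| over all patients and visits."""
    deviations = [abs(target - hb)
                  for series in hb_by_patient.values() for hb in series]
    return mean(deviations)


def within_subject_sd(hb_by_patient):
    """Individual measure: sample standard deviation of each patient's Hb series."""
    return {pid: stdev(series) for pid, series in hb_by_patient.items()}


# Invented data: patient A is stable but off target, patient B oscillates.
hb_by_patient = {
    "A": [12.5, 12.5, 12.5, 12.5],
    "B": [10.5, 12.5, 10.5, 12.5],
}
print(mean_abs_deviation(hb_by_patient))
print(within_subject_sd(hb_by_patient))
```

Both patients contribute the same absolute deviation from 11.5 g/dL at every visit, so the pooled measure cannot tell them apart, while the within-subject standard deviation separates the stable patient from the oscillating one.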
16 Model-free control: Reinforcement Learning
- Drug administration in chronic conditions is a trial-and-error control process that resembles reinforcement learning
  - disease symptoms → initial state (s0)
  - (standard) initial dose → action (a0)
  - k = 1
  - Repeat (infinitely)
    - evaluate patient (remission/progression/side effects) → new state (sk), reward (rk)
    - adjust dosing strategy → update state-action table/function (Qk), extract policy (πk)
    - administer new dose → action (ak)
    - k = k + 1
  - End
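
The loop above maps directly onto tabular Q-learning. The following is a minimal sketch under stated assumptions: the patient is a hypothetical toy simulator with made-up coefficients, Hb is discretized into 0.5 g/dL bins, the reward is taken to be -|Hb - 11.5| (the slides' reward function is not recoverable from the text), and action selection is epsilon-greedy. None of these choices are taken from the study.

```python
import random

TARGET = 11.5
HB_BINS = [8.0 + 0.5 * i for i in range(13)]   # states: 8.0, 8.5, ..., 14.0 g/dL
DOSES = [0, 5, 10, 15, 20]                      # actions: arbitrary dose units


def to_state(hb):
    """Index of the Hb bin closest to the observed value."""
    return min(range(len(HB_BINS)), key=lambda i: abs(HB_BINS[i] - hb))


def toy_patient(hb, dose, rng):
    """Hypothetical patient response (assumption), clamped to a plausible range."""
    hb_next = hb + 0.05 * dose - 0.5 + rng.gauss(0.0, 0.1)
    return min(max(hb_next, 7.0), 15.0)


def run_q_learning(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(DOSES) for _ in HB_BINS]
    hb = 9.0                       # disease symptoms -> initial state s0
    s = to_state(hb)
    a = DOSES.index(10)            # (standard) initial dose -> action a0
    for _ in range(steps):
        hb = toy_patient(hb, DOSES[a], rng)      # evaluate patient
        s_next = to_state(hb)                    # new state sk
        r = -abs(hb - TARGET)                    # reward rk (assumed form)
        # update the state-action table Qk
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next
        # extract the policy (epsilon-greedy) and administer the new dose ak
        if rng.random() < epsilon:
            a = rng.randrange(len(DOSES))
        else:
            a = max(range(len(DOSES)), key=lambda j: q[s][j])
    policy = {HB_BINS[i]: DOSES[max(range(len(DOSES)), key=lambda j: q[i][j])]
              for i in range(len(HB_BINS))}
    return q, policy


q_table, policy = run_q_learning()
for hb_level, dose in policy.items():
    print(f"Hb {hb_level:4.1f} -> dose {dose}")
```

The printed policy has the same shape as the rule base in the Q-learning schematic: one dose per Hb level.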
17 Model-free control: Q-Learning simulation - Schematic diagram
[Diagram: Q-LEARNING AGENT in a loop with PATIENT SIMULATOR; Hb is the state (s), EPO is the action (a), IRON is a disturbance]
POLICY (π): Ri: IF Hb = Hbi THEN EPO = EPOi
PATIENT SIMULATOR (subpopulation model): Hb(k+1) = F( Hb(k), EPO(k), IRON(k) )
18 Model-free control: Reward function
[Figure: reward function; the target Hb of 11.5 g/dL is marked on the plots]
19 Model-free control: Q-table update
- Dose-response relationship (EPO to ΔHb) is monotonically increasing with saturation (Uehlinger et al. 1992).
- Let's update multiple entries in the Q-table at a time
  - If Hb(k) < 11.5 and Hb(k+1) ≤ Hb(k), or Hb(k) = 11.5 and Hb(k+1) < Hb(k), then update Q( s, a ) for all s ≤ Hb(k) and all a ≤ EPO(k)
  - If Hb(k) > 11.5 and Hb(k+1) ≥ Hb(k), or Hb(k) = 11.5 and Hb(k+1) > Hb(k), then update Q( s, a ) for all s ≥ Hb(k) and all a ≥ EPO(k)
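
A sketch of how the multiple-entry update could select Q-table entries. The comparison operators are partly lost in the slide text, so this reads the first rule as "Hb at or below target and not rising: every state at or below Hb(k) and every dose at or below EPO(k)" and the second rule as its mirror image; that reading is an assumption. The slide also does not say what value each entry receives, so `update_q` simply applies the same temporal-difference target to all selected entries, which is a further assumption. All names are hypothetical.

```python
def entries_to_update(hb_k, hb_next, epo_k, hb_states, epo_actions, target=11.5):
    """Return the (state, action) pairs touched by one observed transition."""
    dose_too_low = ((hb_k < target and hb_next <= hb_k)
                    or (hb_k == target and hb_next < hb_k))
    dose_too_high = ((hb_k > target and hb_next >= hb_k)
                     or (hb_k == target and hb_next > hb_k))
    if dose_too_low:
        # By monotonicity, no smaller dose would have helped at this or a lower Hb.
        return [(s, a) for s in hb_states if s <= hb_k
                for a in epo_actions if a <= epo_k]
    if dose_too_high:
        # By monotonicity, no larger dose would have helped at this or a higher Hb.
        return [(s, a) for s in hb_states if s >= hb_k
                for a in epo_actions if a >= epo_k]
    return [(hb_k, epo_k)]  # ordinary single-entry update


def update_q(q, entries, td_target, alpha=0.1):
    """Move every selected entry toward the same target (assumed update form)."""
    for key in entries:
        old = q.get(key, 0.0)
        q[key] = old + alpha * (td_target - old)
    return q


states = [10.5, 11.0, 11.5, 12.0, 12.5]
doses = [0, 5, 10]
print(entries_to_update(11.0, 10.8, 5, states, doses))
print(entries_to_update(12.0, 12.3, 5, states, doses))
```

One observed transition thus informs several table entries at once, which is the point of exploiting the monotonic dose-response relationship.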
20 Model-free control: Q-Learning - Simulated clinical trial
- Trial population
  - 200 individuals with various degrees of response to EPO
  - 100 distinct responders / 100 distinct non-responders
  - In the first run, all individuals dosed by AMP
  - In the second run, all individuals dosed by a policy updated on-line by Q-learning
- Trial length
  - 24 months
- Treatment goal
  - Drive Hb to, and maintain it at, 11.5 g/dL
  - Performance measure: mean absolute deviation from 11.5 g/dL
21 Model-free control: Q-Learning - Simulation results
[Figure: mean |11.5 - Hb| by month]
22 Conclusions and open problems
- We believe that we are on a good path to successfully individualize anemia management using the presented techniques.
- However, we need to address the following:
  - How do we produce reliable dose-response models that perform well on under-represented data instances?
  - What performance measure do we need to use in order to adequately evaluate the success of an individualized treatment?
23 Acknowledgments
- UofL Division of Nephrology
- George R Aronoff
- Michael E Brier
- Alfred A Jacobs
- UofL Dept Electrical and Computer Engineering
- Mehmet K Muezzinoglu
- Jacek M Zurada
Michael E Brier has been sponsored by Department
of Veterans Affairs Merit Review Grant. Adam E
Gaweda is sponsored by NIDDK (1K25DK072085-01A2).