Title: New Statistics
1 New Statistics
2 Patent
- This presentation is based on a patent
application on personalized medication held by
George Mason University. Scientists and
government organizations have free access to this
patent
3 Acknowledgement
- Farrokh Alemi, Ph.D.
- Harold Erdman, Ph.D.
- Igor Griva, Ph.D.
- Charles H. Evans, Jr., M.D., Ph.D.
- Jee Vang, Ph.D.
- Seminar on data mining
4 Case of George
- Military service
- Industrial manager, polite, describes himself as a medieval knight
- First depression episode at 26, treated with clomipramine, dose unknown
- At 30, married with a daughter
- At 45, return of depressive symptoms, treated with
fluvoxamine 200-300 mg and mirtazapine 15 mg
- Depression continues, loss of interest in work,
difficulty with bipolar daughter
- Loss of daughter, divorce, and loss of work
- At 48, suicide
5 Potential for Guided Treatment to Increase Remission
Data are based on 1,933 cases in STAR*D with
genetic profiles. Remission was defined as a
score of less than 5 on the Quick Inventory of
Depressive Symptoms, Clinical Version.
Cumulative remission rate = rate at previous
level + (1 - rate at previous level) × rate at
current level.
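The cumulative rule on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming "rate at previous level" means the cumulative remission rate reached through the previous treatment level. The function name and the example rates are hypothetical and are not STAR*D results.

```python
def cumulative_remission(level_rates):
    """Combine per-level remission rates into a cumulative rate.

    Only patients not yet in remission can remit at the next level:
    cumulative = previous + (1 - previous) * rate at current level.
    """
    cumulative = 0.0
    for rate in level_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each rate must be between 0 and 1")
        cumulative = cumulative + (1.0 - cumulative) * rate
    return cumulative


# Hypothetical rates for two successive treatment levels (illustration only).
print(round(cumulative_remission([0.30, 0.25]), 3))
```

With these made-up rates, 30% remit at the first level and a quarter of the remaining 70% remit at the second, so the cumulative rate is 0.30 + 0.70 × 0.25.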
6 Clinical Success Has Been Elusive
Examined 68 candidate genes and 768
single-nucleotide polymorphisms in 1,953 patients
7 Clinical Success Has Been Elusive
8 Make Genetic Progress Relevant to Patient at Hand
9 Patients Like Me Algorithm
- For a patient, select cases in the database in
order of their similarity to the patient
- Check whether the number of failures exceeds the threshold of
Wald's Sequential Probability Ratio Test
Additional steps on selection of features are not
presented in this slide.
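The two steps on this slide can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented method: the similarity measure (squared Euclidean distance), the hypothesized failure rates `p0` and `p1`, the error rates `alpha` and `beta`, and all function names are hypothetical choices. The feature-selection steps the slide mentions are omitted.

```python
import math


def sprt_decision(outcomes, p0=0.3, p1=0.6, alpha=0.05, beta=0.05):
    """Wald's SPRT on a sequence of binary outcomes (1 = failure, 0 = success).

    H0: failure probability is p0; H1: failure probability is p1 (p1 > p0).
    Returns (decision, number of cases examined).
    """
    upper = math.log((1 - beta) / alpha)  # at or above: accept H1
    lower = math.log(beta / (1 - alpha))  # at or below: accept H0
    llr = 0.0  # running log-likelihood ratio
    for n, failed in enumerate(outcomes, start=1):
        if failed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_h1", n
        if llr <= lower:
            return "accept_h0", n
    return "undecided", len(outcomes)


def patients_like_me(patient, cases, **sprt_kwargs):
    """Order cases by similarity to the patient, then test failures in order.

    `cases` is a list of (features, failed) pairs. Similarity is measured by
    squared Euclidean distance, which is an assumption of this sketch.
    """
    def distance(features):
        return sum((a - b) ** 2 for a, b in zip(patient, features))

    ordered = sorted(cases, key=lambda case: distance(case[0]))
    return sprt_decision([failed for _, failed in ordered], **sprt_kwargs)
```

Because the most similar cases are examined first, the test can stop as soon as the evidence from near neighbors is strong enough, without reading the whole database.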
10 Patients Like Me Algorithm
- For a patient, select cases in the database in
order of their similarity to the patient
- Check whether the number of failures exceeds the threshold of
Wald's Sequential Probability Ratio Test
11 Enables Personalized Medicine
- No one is the average patient (Duncan Neuhauser)
- Different recommendations to different groups
- Subgroup analysis is not practical in traditional
statistics
- For example, different markers for males and females
"These findings are consistent with the
hypothesis that major depression may be related
to some distinct genes in male versus female
patients." Paddock et al., Am J Psychiatry 2007;
164: 1168
12 Patients Like Me Algorithm
13 Simulated Data of Treatment Success in Two Dimensions
14 Simulated Data of Treatment Success in Two Dimensions
15 Simulated Data of Treatment Success in Two Dimensions
16 STAR*D Data
- Citalopram
- 1,933 cases
- Remission of symptoms
- Hamilton score of less than 7
- 50% decrease in Inventory of Depressive
Symptomatology, Clinician-Rated
- Many independent predictors
- 45 phenotypic features
- 216 single-nucleotide polymorphisms
17 Success Has Been Elusive
- SNPs in STAR*D not predictive of outcomes
- Random forest
- Logistic regression
- BESt
18 5-fold Cross Validation of Patients Like Me
Algorithm in 386 Cases
- Select 1/5 of the data for validation
- Use training data to estimate parameters
- Use SKNN to predict validation set
- Repeat the process
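The validation loop listed above can be sketched in plain Python. This is a minimal sketch: `fit` and `predict` are hypothetical hooks standing in for whatever classifier is being validated (the slides use SKNN, which is not reproduced here), and the majority-outcome classifier is only a toy stand-in so the example runs.

```python
import random


def five_fold_accuracy(cases, fit, predict, k=5, seed=0):
    """Estimate accuracy by k-fold cross-validation.

    `cases` is a list of (features, outcome) pairs. `fit(train)` returns a
    model and `predict(model, features)` returns a predicted outcome.
    """
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k roughly equal parts

    correct = 0
    for i, validation in enumerate(folds):
        # Train on the other k - 1 folds, then predict the held-out fold.
        train = [case for j, fold in enumerate(folds) if j != i for case in fold]
        model = fit(train)
        correct += sum(predict(model, x) == y for x, y in validation)
    return correct / len(cases)


# Toy stand-in classifier: always predict the most common training outcome.
def fit_majority(train):
    outcomes = [y for _, y in train]
    return max(set(outcomes), key=outcomes.count)


def predict_majority(model, features):
    return model
```

Each case is held out exactly once, so the returned value is the share of all cases predicted correctly by a model that never saw them during training.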
19 5-fold Cross Validation of Patients Like Me
Algorithm in 386 Cases
20 5-fold Cross Validation of Patients Like Me
Algorithm in 386 Cases
21 Optimization of SKNN
- Use training data to set SKNN parameters
- Percent of predicted cases correctly classified: 92%
22 SKNN helps some patients using data otherwise ignored
23 Medications withdrawn from market could benefit some patients
24 Scientists should publish databases, not just their papers
25 Questions or Comments
26 SKNN Feature Selection
27 SKNN Feature Selection
28 5-fold Cross Validation Confusion Matrices (no optimization)