Title: Mathematical Modeling and Classification of Eye Disease
1. Mathematical Modeling and Classification of Eye Disease
- Srinivasan Parthasarathy
- Joint work with M. Bullimore, K. Marsolo and M. Twa
Aspects of this work are funded by the NIH, NSF and the DOE.
2. Desiderata for Clinical Diagnosis
- Should be accurate and ideally interoperable
- Can we use mathematical modeling?
- Can we improve accuracy by meta-learning?
- Should be interpretable
- Can we visualize the decision-making process effectively?
- Should be responsive
- Can we leverage distributed computing tools to speed up the process?
Synopsis of Approach
3. Ocular Anatomy 101
4. Case Study: Keratoconus
- Progressive, degenerative, non-inflammatory disease.
- A leading cause of blindness and corneal transplant.
- Early detection is difficult but important
- Has implications for eye surgery and control of the disease
- Initial symptoms: minor fluctuations in corneal shape
- Diagnosis procedure
- Video-keratography exam
- Manual analysis of results by clinician
- Challenges to detection
- Voluminous data
- One image is 1000s of data points representing the corneal surface
- Spatial and temporal (longitudinal)
- Features of interest are small in scale relative to the mean shape
- Leads to variance in prognosis
[Images: late-stage keratoconus vs. normal (clinically ideal) cornea]
5. Raw Data Description
- Corneal surface represented as a 7000-point matrix output by the video-keratographic device
- Fixed angular sampling in concentric circles around the center
- Data stored in cylindrical coordinates
- Radius (ρ)
- Angle (θ)
- Height / Elevation (z)
- 3 classes: keratoconus, surgically repaired (LASIK), normal
- 508 eyes from 254 people (L-R normalization)
- Can we use this data to construct a 3-D surface of the cornea? (see the sketch below)
- How to model?
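A minimal sketch of turning the cylindrical samples into Cartesian 3-D points, which is all that is needed to plot or interpolate the corneal surface. The arrays `rho`, `theta`, and `z` are hypothetical stand-ins for the values read from one video-keratography exam.

```python
import numpy as np

# Hypothetical arrays: rho (mm), theta (radians), z (elevation),
# standing in for the ~7000 samples of one exam.
rho = np.random.uniform(0.0, 3.5, 7000)
theta = np.random.uniform(0.0, 2 * np.pi, 7000)
z = np.random.normal(0.0, 5.0, 7000)

# Convert cylindrical (rho, theta, z) samples to Cartesian (x, y, z)
# so the corneal surface can be visualized or gridded.
x = rho * np.cos(theta)
y = rho * np.sin(theta)
points = np.column_stack([x, y, z])
print(points.shape)  # (7000, 3)
```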
6. Modeling
- Desired Properties
- Reflect the general shape and structure of the entity of interest
- Should capture important features such as local harmonics
- Need for a compact representation
- Capable of tuning to a desired resolution
- Capable of dealing with multiple dimensions
- Options Evaluated
- Zernike polynomials, pseudo-Zernike polynomials, wavelets
7. Modeling: Zernike and Pseudo-Zernike Polynomials
- Hypergeometric radial basis functions
- Each term (mode) in the series represents a 3D geometric surface.
- The value of each term represents the contribution of that mode to the overall surface (independent)
- Benefits
- Lower-order modes show correlation to general surface features of the cornea.
- Higher-order modes capture local harmonics
- Orthogonal
- Anatomic correspondence to clinical concepts
- Drawbacks
- Can be computationally expensive.
- Can model noise as well, especially at higher orders
8. Details
9. Z / PZ Transformation Algorithm
- Compute a least-squares fit between the model and the original data
- Then use the coefficients as a feature vector for classification (a minimal fitting sketch follows below)
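A minimal sketch of the standard Zernike least-squares fit; the pseudo-Zernike case differs only in its radial polynomial and is not shown. Function names, the default order, and the fit radius are illustrative, not the authors' code, and the sample arrays are assumed to lie inside the fit radius.

```python
import numpy as np
from math import factorial

def zernike_radial(n, m, rho):
    """Zernike radial polynomial R_n^|m|(rho); n - |m| must be even."""
    m = abs(m)
    R = np.zeros_like(rho)
    for k in range((n - m) // 2 + 1):
        c = ((-1) ** k * factorial(n - k)
             / (factorial(k) * factorial((n + m) // 2 - k) * factorial((n - m) // 2 - k)))
        R += c * rho ** (n - 2 * k)
    return R

def zernike_design_matrix(rho, theta, order):
    """Columns are Zernike modes up to the given radial order,
    evaluated at the (radius-normalized) sample locations."""
    cols = []
    for n in range(order + 1):
        for m in range(-n, n + 1, 2):  # n - |m| even
            R = zernike_radial(n, m, rho)
            cols.append(R * (np.cos(m * theta) if m >= 0 else np.sin(-m * theta)))
    return np.column_stack(cols)

def fit_surface(rho, theta, z, order=10, radius=3.5):
    """Least-squares fit of corneal elevations; the returned coefficient
    vector is the feature vector passed to the classifier."""
    A = zernike_design_matrix(rho / radius, theta, order)
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs
```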
10. Wavelet Modeling
- Convert to a 1-dimensional signal
- A. Sample along concentric circles (Klyce / Smolek)
- Use standard 1D wavelets to model the signal and use the coefficients to classify (see the sketch below)
- B. Sample along a space-filling curve (us)
- Key idea is to maintain spatial coherence
- Same as above to classify (works better than A)
- Apply 2-dimensional wavelet models (us)
- Use coefficients to classify (does not work as well as 1.B)
- Pros
- Fast and efficient
- Cons
- Overall performance worse (5-10%) than Zernike-based approaches
- No anatomic correspondence; difficult to interpret
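A minimal sketch of the 1-D wavelet variant using PyWavelets. The `signal` array is a hypothetical stand-in for elevations re-ordered along concentric circles (variant A) or a space-filling curve (variant B); the wavelet family and decomposition level are illustrative choices.

```python
import numpy as np
import pywt  # PyWavelets

# Hypothetical 1-D signal of re-ordered corneal elevations.
signal = np.random.normal(0.0, 5.0, 4096)

# Multi-level 1-D discrete wavelet transform; the concatenated
# coefficients become the feature vector handed to the classifier.
coeffs = pywt.wavedec(signal, wavelet="db4", level=5)
features = np.concatenate(coeffs)
print(features.shape)
```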
11. Experiments: Model Fidelity
- Model error of Z vs. PZ?
- Model error on different patient classes?
- What set of parameters provides the best model fit? (an error-measurement sketch follows below)
- Transformation Parameters
- Polynomial order: 4th to 10th
- The larger the order, the more likely the model captures signal noise
- Radius: 2.0, 2.5, 3.0, 3.5 mm (max)
- The larger the radius, the more points there are to model
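A sketch of how the fidelity sweep could be measured as RMS residual error over the order and radius grids listed above. It assumes the `zernike_design_matrix` helper from the earlier fitting sketch is in scope, along with `rho`, `theta`, `z` arrays for one eye; all names are illustrative.

```python
import numpy as np

# Assumes zernike_design_matrix(...) from the earlier sketch is in scope,
# plus hypothetical rho, theta, z arrays for one eye.
def rms_fit_error(rho, theta, z, order, radius):
    """Root-mean-square residual between measured elevations and the
    order-limited reconstruction inside the given fit radius."""
    mask = rho <= radius
    A = zernike_design_matrix(rho[mask] / radius, theta[mask], order)
    coeffs, *_ = np.linalg.lstsq(A, z[mask], rcond=None)
    residual = z[mask] - A @ coeffs
    return np.sqrt(np.mean(residual ** 2))

# Sweep the transformation parameters reported on the slide.
for radius in (2.0, 2.5, 3.0, 3.5):
    for order in range(4, 11):
        err = rms_fit_error(rho, theta, z, order, radius)
        print(f"radius={radius} mm, order={order}: RMS error={err:.3f}")
```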
12. Results: Model Fidelity
- General Trend
- Increasing polynomial order decreases error.
- Increasing transformation radius increases error.
- At the same order, error: Z > PZ
- With the same number of coefficients, Z and PZ are comparable
- Between patient classes, error: Keratoconus > LASIK > Normal
13. Classification and Clinical Decision Support
- Prefer transparent algorithms over black-box classifiers.
- Use simple classifiers and provide a way to visually explore the decision-making process
- Desire high accuracy
- Use an ensemble of simple and interpretable classifiers
- Desire efficiency
- Use the NetSolve distributed computing tool
14. Basic Classification Performance
- Accuracy of classifiers based on PZ vs. Z?
- Zernike works better
- Which classifiers work well? (see the cross-validation sketch below)
- C4.5 (84-85%), Naïve Bayes (84%), VFI (84%)
- Neural networks (81%), one-vs-all SVM (82%)
- SVMs and NNs are also difficult to explain (interpretability)
- Performance of ensemble techniques
- Boosting, bagging and random forests
- All improve performance (3-4%)
- Bagging preferred: easy to interpret, performance marginally better
- Does a more accurate model imply higher classification accuracy?
- C4.5 (4th order works best but other orders are not bad)
- NB/VFI/NN/SVM (higher orders do not work well; noise or irrelevant features hamper performance)
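A sketch of comparing simple classifiers by cross-validation with scikit-learn. The feature matrix `X` (Zernike coefficient vectors) and labels `y` are hypothetical; a `DecisionTreeClassifier` stands in for C4.5, and VFI has no direct scikit-learn equivalent, so it is omitted.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: one row of coefficients per eye, one label per eye.
X = np.random.normal(size=(508, 66))
y = np.random.choice(["keratoconus", "lasik", "normal"], size=508)

classifiers = {
    "decision tree (C4.5-like)": DecisionTreeClassifier(),
    "naive Bayes": GaussianNB(),
    "one-vs-all SVM": SVC(decision_function_shape="ovr"),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```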
15. Ensemble Learning
- Combine the results of multiple classification models built from different samples of the dataset to improve accuracy.
- Training data represents a sample of the population.
- A classifier built on one sample can overfit and model noise.
- Constructing multiple models can filter noise and reduce generalization error [Breiman, 1996].
- Traditional Methods (a minimal bagging sketch follows below)
- Bootstrap Aggregation (Bagging)
- Boosting
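A minimal bagging sketch: each tree is trained on a different bootstrap sample and predictions are combined by voting. It reuses the hypothetical `X`, `y` from the previous sketch and assumes a recent scikit-learn version (the constructor argument is `estimator`).

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Bootstrap aggregation over decision trees; X, y are the hypothetical
# coefficient features and labels introduced above.
bagged = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10)
bagged.fit(X, y)
print(bagged.predict(X[:5]))
```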
16. Spatial Averaging (SA)
- Use classifiers built on different resolutions and models of the dataset to improve accuracy.
- Build a classifier for each spatial transformation and resolution.
- Take the modal label of the classifiers to reach the final decision (see the voting sketch below).
- Can be viewed as a structured column-bagging algorithm
- Intuition
- Lower-order transformations result in a more general, global model.
- Higher-order transformations are better at capturing local harmonics, but can model noise.
- If the errors are uncorrelated, SA should smooth out noise effects.
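A sketch of the spatial-averaging vote described above: one decision tree per spatial transformation, with the final label chosen as the modal prediction. The dictionary of per-transformation feature matrices is hypothetical and only a subset of the orders is listed.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

# Hypothetical coefficient features for each transformation over the same eyes.
transforms = {name: np.random.normal(size=(508, 30))
              for name in ["4Z", "7Z", "10Z", "4PZ", "7PZ", "10PZ"]}
y = np.random.choice(["keratoconus", "lasik", "normal"], size=508)

# One tree per transformation.
trees = {name: DecisionTreeClassifier().fit(X_t, y)
         for name, X_t in transforms.items()}

def spatial_average_predict(sample_by_transform):
    """Tally one vote per transformation-specific tree; return the modal label."""
    votes = [trees[name].predict(x.reshape(1, -1))[0]
             for name, x in sample_by_transform.items()]
    return Counter(votes).most_common(1)[0][0]

example = {name: X_t[0] for name, X_t in transforms.items()}
print(spatial_average_predict(example))
```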
17. Spatial Averaging
[Figure: spatial averaging pipeline. The original data is (1) transformed at multiple orders and bases (4Z-10Z, 4PZ-10PZ), (2) a classifier is run on each transformation, and (3) the votes are tallied to reach the final label.]
18. Spatial Averaging with Sub-Selection (Combined)
[Figure: same pipeline (transform, classify, tally votes), restricted to a selected subset of five transformations: 4Z, 7Z, 10Z, 4PZ and 7PZ.]
19. Experiments
- How does SA compare to a single decision tree?
- How does SA compare to traditional ensemble-learning methods?
- Can SA be combined with ensemble-learning methods to further improve results?
- Ensemble methods evaluated include
- Bagging, boosting, random forests
20. Spatial Averaging vs. C4.5
- 10-fold cross-validation
- Zernike-based SA
- 7 trees (4th to 10th order)
- 3-5% improvement over an individual tree.
- PZ-based SA
- Up to 7% improvement
- Combined SA classifier (5 trees)
- Accuracy of 91.1%, a 6-10% improvement.
- Rationale: clinically it appears that PZ and Z do better on different varieties of keratoconus
21. SA and Traditional Ensemble Learning
- Combined SA (5 trees) outperforms boosted (10) or bagged C4.5 (10)
- Bagging does marginally better than boosting; RF results not shown
- Combined SA with bagging (5)
- 94.1% accuracy
- However, it trades off interpretability (5x5 trees) for accuracy
22. Visualization of Results
- Task: visualize results to provide decision support for clinicians.
- Give intuition as to why a group of patients is classified the way it is.
- Contrast an individual patient with others in the same group
- How?
- Modes of the Zernike/pseudo-Zernike polynomials correspond to specific features of the cornea.
23. Patient-Specific Decision Surface
- Treat each path through the decision tree as a rule.
- Cluster the training data by rule.
- Compute the average coefficient values for each cluster.
- Given a patient, classify and keep the rule coefficients, setting the others to zero.
- Construct the overall surface and the rule surface (a minimal sketch follows below).
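A sketch of the rule-clustering step with a scikit-learn decision tree, standing in for C4.5. Each leaf corresponds to one rule; training samples are grouped by the leaf they reach, and a patient keeps only the coefficients tested along their decision path. `X` and `y` are the hypothetical inputs from the earlier snippets.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier().fit(X, y)

leaf_ids = clf.apply(X)  # rule (leaf) reached by each training sample
rule_means = {leaf: X[leaf_ids == leaf].mean(axis=0)
              for leaf in np.unique(leaf_ids)}  # average coefficients per rule

def rule_coefficients(patient):
    """Keep only the coefficients tested along the patient's decision path
    (rest set to zero) and return the matching rule-mean coefficients."""
    path = clf.decision_path(patient.reshape(1, -1))
    used = clf.tree_.feature[path.indices]
    used = used[used >= 0]  # drop the leaf marker (-2)
    masked = np.zeros_like(patient)
    masked[used] = patient[used]
    return masked, rule_means[clf.apply(patient.reshape(1, -1))[0]]

patient_coeffs, rule_mean = rule_coefficients(X[0])
```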
24. Patient-Specific Decision Surface
- Create surfaces using
- All patient coefficients.
- All rule-mean coefficients.
- Patient coefficients used in the classifying rule (rest zero).
- Rule-mean coefficients used in the classifying rule (rest zero).
- Also
- A bar chart with the relative error between the patient and rule-mean coefficients (a reconstruction sketch follows below).
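A sketch of reconstructing a displayable surface from any of the four coefficient vectors above, reusing `zernike_design_matrix` from the fitting sketch and `patient_coeffs` / `rule_mean` from the previous one. Grid size and radius are illustrative.

```python
import numpy as np

def surface_from_coeffs(coeffs, order=10, radius=3.5, n=200):
    """Evaluate the Zernike expansion on a Cartesian grid for plotting;
    points outside the fit radius are set to NaN."""
    xs = np.linspace(-radius, radius, n)
    X_grid, Y_grid = np.meshgrid(xs, xs)
    rho = np.hypot(X_grid, Y_grid)
    theta = np.arctan2(Y_grid, X_grid)
    inside = rho <= radius
    A = zernike_design_matrix(rho[inside] / radius, theta[inside], order)
    Z = np.full(X_grid.shape, np.nan)
    Z[inside] = A @ coeffs
    return X_grid, Y_grid, Z

# Relative error between patient and rule-mean coefficients for the bar chart.
rel_err = np.abs(patient_coeffs - rule_mean) / (np.abs(rule_mean) + 1e-9)
```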
25. Visualization: Strongest Rules
Rule 1 - Keratoconus
Rule 8 - Normal
Rule 4 - LASIK
26. High-Performance Results
- Optimize and parallelize (5 nodes) the key steps of the code over a grid environment (a generic parallel sketch follows below)
- M: unoptimized algorithm
- NS: NetSolve version
- Times shown are for computing one decision tree using a particular model (Z/PZ), and include the model-building time.
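NetSolve's grid API is not shown here; the following is only a generic sketch of the same idea (farming out one model fit and tree build per transformation) using Python's standard library. `fit_and_train` is a hypothetical helper.

```python
from concurrent.futures import ProcessPoolExecutor

def fit_and_train(order_and_basis):
    order, basis = order_and_basis
    # Hypothetical helper: fit the Z/PZ model at this order, build a
    # decision tree on the coefficients, and return the trained tree.
    ...

jobs = [(order, basis) for basis in ("Z", "PZ") for order in range(4, 11)]
with ProcessPoolExecutor(max_workers=5) as pool:  # 5 nodes, as on the slide
    results = list(pool.map(fit_and_train, jobs))
```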
27. Case Study: Glaucoma
- Progressive neuropathy of the optic nerve
- Disease characteristics
- Symptom free
- Elevated intraocular pressure
- Structural loss of retinal ganglion cells
- Gradual restriction of the visual field from the periphery to the center
28. Clinical Management of Glaucoma
- Monitoring intraocular pressure
- Static threshold visual field sensitivity
- Observations of structural change at the optic nerve head
[Images: glaucoma vs. normal optic nerve head]
29. Topographic Modeling
- Objectives
- Feature reduction
- Preservation of spatial correlation
- Polynomial Modeling
- Zernike
- Pseudo-Zernike
- Spline Modeling
- Knot locations, coefficients
- Wavelet Modeling
- 1D vs. 2D
30. (No transcript)
31. (No transcript)
32. Concluding Remarks
- Modeling and classifying corneal shape
- Low-order Zernike polynomials provide an adequate model of the corneal surface.
- Higher-order polynomials begin to model noise but may contain a few useful features for classification
- Decision trees provide classification accuracy greater than or equal to other classification methods.
- Accuracy can be further improved by using the SA strategy.
- Visualization
- Using the classification attributes as the basis for visualization provides a method of decision support for clinicians.
- High-performance implementations can help
- Modeling and classifying glaucoma: ongoing
33. General Thoughts on Interdisciplinary Collaboration
- Steep learning curve
- Need to learn language and requirements
- Need to express results in domain language
- Patience, patience, patience
- Communities are inertia bound
- Often difficult to make headway
- Potential for incredible rewards
- Scientific/medical implications
- Good working relationship essential
- Equal partners
34. 900 µm diameter fit
35. (No transcript)
36. How is RMS related to Class?
- Greater variance in glaucoma
- Higher mean in glaucoma
37. Conclusions
38. Single Decision Tree
39. Decision Surface
40. Data
- 254 Patient Records
- 3 Patient Categories
- Normal (119)
- Diseased (99)
- Post-LASIK (36)
41. Imaging Scripting: Crop Routines
42. Imaging Scripting: Centering Routines
- 2D mean and SD
- What is the best center point (disc vs cup)?
- Failure associated with class assignment
- Normals fail more often
43. Re-centered