Title: Feature Selection using Mutual Information
1. Feature Selection using Mutual Information
- SYDE 676 Course Project
- Eric Hui
- November 28, 2002
2. Outline
- Introduction: prostate cancer project
- Definition of ROI and Features
- Estimation of PDFs using Parzen Density Estimation
- Feature Selection using MI-Based Feature Selection (MIFS)
- Evaluation of Selection using Generalized Divergence
- Conclusions
3. Ultrasound Image of Prostate
4. Prostate Outline
5. Guesstimated Cancerous Region
6. Regions of Interest (ROI)
7. Features as Mapping Functions
- Mapping from image space to feature space
8. Parzen Density Estimation
- Histogram bins: bad estimation with the limited data available!
- Parzen density estimation: reasonable approximation with limited data.
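The comparison above can be illustrated with a minimal sketch of a Parzen estimate using a Gaussian window; the window width `h` and the kernel choice are assumptions for illustration, not taken from the project:

```python
import numpy as np

def parzen_density(samples, x, h):
    """Parzen (kernel) density estimate with a Gaussian window.

    samples : 1-D array of observed feature values
    x       : points at which to evaluate the estimated PDF
    h       : window width (smoothing parameter, assumed here)
    """
    samples = np.asarray(samples, dtype=float)
    x = np.asarray(x, dtype=float)
    # Sum one Gaussian kernel centred on each sample.
    diffs = (x[:, None] - samples[None, :]) / h
    kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * h)
```

With only a handful of samples, this gives a smooth, everywhere-defined PDF estimate where a histogram would be mostly empty bins.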
9. Features
- Gray-Level Difference Matrix (GLDM)
  - Contrast
  - Mean
  - Entropy
  - Inverse Difference Moment (IDM)
  - Angular Second Moment (ASM)
- Fractal Dimension
  - FD
- Linearized Power Spectrum
  - Slope
  - Y-Intercept
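The five GLDM features listed above can be sketched from the gray-level difference histogram. The formulas below follow the common textbook GLDM definitions; the displacement and any other conventions used in the project are assumptions:

```python
import numpy as np

def gldm_features(image, dy=0, dx=1):
    """GLDM texture features for one pixel displacement (dy, dx).

    Builds the histogram of absolute gray-level differences between
    pixel pairs, then derives the standard features from it.
    """
    a = np.asarray(image, dtype=float)
    # Absolute differences between each pixel and its displaced neighbour.
    diff = np.abs(a[: a.shape[0] - dy, : a.shape[1] - dx] - a[dy:, dx:])
    hist = np.bincount(diff.astype(int).ravel())
    p = hist / hist.sum()          # normalized difference histogram
    i = np.arange(len(p))
    nz = p > 0
    return {
        "contrast": float((i**2 * p).sum()),
        "mean": float((i * p).sum()),
        "entropy": float(-(p[nz] * np.log2(p[nz])).sum()),
        "idm": float((p / (i**2 + 1)).sum()),   # inverse difference moment
        "asm": float((p**2).sum()),             # angular second moment
    }
```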
10. P(X|Cancerous), P(X|Benign), and P(X)
11. Entropy and Mutual Information
- Mutual information I(C;X) measures the degree of interdependence between X and C.
- Entropy H(C) measures the degree of uncertainty of C.
- I(C;X) = H(C) - H(C|X).
- I(C;X) ≤ H(C), so H(C) is the upper bound.
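The identity I(C;X) = H(C) - H(C|X) can be sketched directly for discretized feature values; the binning scheme and the use of base-2 logarithms are assumptions:

```python
import numpy as np

def mutual_information(x_bins, c):
    """I(C;X) = H(C) - H(C|X) from paired discrete observations.

    x_bins : discretized feature values (e.g. histogram bin indices)
    c      : class labels for the same samples
    """
    x_bins = np.asarray(x_bins)
    c = np.asarray(c)

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    h_c = entropy(c)
    # H(C|X): entropy of C within each bin of X, weighted by the bin mass.
    h_c_given_x = 0.0
    for v in np.unique(x_bins):
        mask = x_bins == v
        h_c_given_x += mask.mean() * entropy(c[mask])
    return h_c - h_c_given_x
```

When X determines C exactly, H(C|X) = 0 and I(C;X) reaches its upper bound H(C); when X is independent of C, I(C;X) = 0.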
12. Results: Mutual Information I(C;X)
13. Feature Images - GLDM
14. Feature Images - Fractal Dim.
15. Feature Images - PSD
16. Interdependence between Features
- Expensive to compute all features.
- Some features might be similar to each other.
- Thus, we need to measure the interdependence between features, I(Xi;Xj).
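Feature-feature interdependence I(Xi;Xj) can be estimated from the joint histogram of two discretized features; this is a sketch under the same discretization assumption as above, not the project's code:

```python
import numpy as np

def pairwise_mi(xi, xj):
    """I(Xi;Xj) between two discretized features via their joint histogram."""
    xi = np.asarray(xi)
    xj = np.asarray(xj)
    vi, ii = np.unique(xi, return_inverse=True)
    vj, jj = np.unique(xj, return_inverse=True)
    # Joint count table over the observed value pairs.
    joint = np.zeros((len(vi), len(vj)))
    np.add.at(joint, (ii, jj), 1)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    # Sum of p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over non-empty cells.
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```

Two near-duplicate features give a high I(Xi;Xj), which is exactly the redundancy MIFS penalizes.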
17. Results: Interdependence between Features
18. Mutual Information Based Feature Selection (MIFS)
- Select the first feature with the highest I(C;X).
- Select the next feature with the highest I(C;X) - β Σs I(X;Xs), summing over the already-selected features Xs.
- Repeat until the desired number of features has been selected.
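The greedy loop above can be sketched as follows, assuming the standard MIFS criterion I(C;X) - β Σ I(X;Xs) with precomputed mutual-information tables; the dict-based inputs and the β default are illustrative choices, not the project's interface:

```python
def mifs(mi_class, mi_pair, n_select, beta=0.5):
    """Greedy MIFS selection.

    mi_class : dict {feature: I(C;X)}
    mi_pair  : dict {(f1, f2): I(X1;X2)}, one entry per unordered pair
    n_select : number of features to pick
    beta     : penalty on redundancy with already-selected features
    """
    remaining = set(mi_class)
    selected = []
    # First pick: the feature with the highest I(C;X).
    first = max(remaining, key=lambda f: mi_class[f])
    selected.append(first)
    remaining.remove(first)
    while len(selected) < n_select and remaining:
        def score(f):
            redundancy = sum(mi_pair.get((f, s), mi_pair.get((s, f), 0.0))
                             for s in selected)
            return mi_class[f] - beta * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With β = 0 the method reduces to ranking by I(C;X) alone; raising β steers it away from features that duplicate ones already chosen.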
19. Mutual Information Based Feature Selection (MIFS)
- This method takes into account both
  - the interdependence between class and features, and
  - the interdependence between the selected features.
- The parameter β controls the amount of interdependence between the selected features.
20. Varying β in MIFS
21. Generalized Divergence J
- If the features are biased towards a class, J is large.
- A good set of features should have a small J.
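The slides do not reproduce the formula for J. One classical candidate that measures how differently the feature densities behave across the two classes is Kullback's symmetric J-divergence; the exact "generalized divergence" used in the project may be defined differently:

```latex
J = \int \left( p(x \mid \text{Cancerous}) - p(x \mid \text{Benign}) \right)
    \log \frac{p(x \mid \text{Cancerous})}{p(x \mid \text{Benign})} \, dx
```

The class-conditional densities here are the Parzen estimates P(X|Cancerous) and P(X|Benign) from slide 10.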
22. Results: J with respect to β
- First feature selected: GLDM ASM
- Second feature selected
23. Conclusions
- Mutual Info. Based Feature Selection (MIFS)
- Generalized Divergence
24. Questions and Comments