Title: Classification
1. Classification
- and application in remote sensing
2. Overview
- introduction to the classification problem
- an application of classification in remote sensing: vegetation classification and band selection
- multi-class classification
3. Introduction
- goal: make a program that automatically recognizes handwritten numbers
4. Introduction: the classification problem
- from raw data to decisions
- learn from examples and generalize
- given training examples (x, f(x)) for some unknown function f, find a good approximation to f
5. Examples
- Handwriting recognition
  - x: data from pen motion
  - f(x): letter of the alphabet
- Disease diagnosis
  - x: properties of patient (symptoms, lab tests)
  - f(x): disease (or perhaps recommended therapy)
- Face recognition
  - x: bitmap picture of a person's face
  - f(x): name of the person
- Spam detection
  - x: email message
  - f(x): spam or not spam
6. Steps for building a classifier
- data acquisition / labeling (ground truth)
- preprocessing
- feature selection / feature extraction
- classification (learning/testing)
- post-processing
- decision
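The steps above can be sketched end to end. This is a minimal illustration with synthetic data and a nearest-mean classifier; all data, names, and parameters are hypothetical, not the method used later in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. data acquisition / labeling: synthetic two-class data with ground truth
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# 2. preprocessing: normalize each feature to zero mean, unit variance
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 3. feature selection: here we simply keep both features

# 4. classification (learning): estimate one mean per class
means = np.array([X[y == k].mean(axis=0) for k in (0, 1)])

def classify(x):
    # predict the class whose mean is closest in Euclidean distance
    return int(np.argmin(np.linalg.norm(means - x, axis=1)))

# 5./6. post-processing and decision would follow here
accuracy = np.mean([classify(x) == t for x, t in zip(X, y)])
```

On well-separated classes like these, even this trivial classifier scores high; the point is the pipeline structure, not the model.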
7. Data acquisition
- acquiring the data and labeling it
- the data is independently randomly sampled according to an unknown distribution P(x, y)
8. Pre-processing
- e.g. image processing:
  - histogram equalization
  - filtering
  - segmentation
- data normalization
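Two of the pre-processing steps listed here can be sketched in a few lines of numpy; the helper names and test images are illustrative, not from the talk.

```python
import numpy as np

def hist_equalize(img, levels=256):
    # spread gray levels out via the normalized cumulative histogram
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / img.size
    return np.floor(cdf[img] * (levels - 1)).astype(img.dtype)

def normalize(X):
    # per-feature zero mean, unit variance (a common data normalization)
    std = X.std(axis=0)
    std[std == 0] = 1.0   # leave constant features unchanged
    return (X - X.mean(axis=0)) / std
```

`hist_equalize` expects integer gray levels in [0, levels); `normalize` expects one sample per row.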
9. Pre-processing example
10. Feature selection/extraction
- this is generally the most important step
- it conveys the information in the data to the classifier
- the number of features:
  - should be high: more information is better
  - should be low: curse of dimensionality
- will include prior knowledge of the problem
- in part manual, in part automatic
11. Feature selection/extraction
- user knowledge
- automatic:
  - PCA: reduce the number of features by decorrelation
  - look for the features that give the best classification result
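The PCA route mentioned above can be sketched with an SVD; the synthetic correlated data is purely illustrative.

```python
import numpy as np

def pca(X, n_components):
    # center the data, then project onto the top principal directions (via SVD)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
# two strongly correlated features plus an independent one:
# PCA decorrelates them and orders components by variance
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.1 * rng.normal(size=200), rng.normal(size=200)])
Z = pca(X, 2)
```

The projected components are mutually uncorrelated, which is exactly the decorrelation property the slide refers to.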
12. Feature extraction example
13. Feature scatterplot
14. Classification
- learn from the features and generalize
- the learning algorithm analyzes the examples and produces a classifier f
- given a new data point (x, y), the classifier is given x and predicts ŷ = f(x)
- the loss L(ŷ, y) is then measured
- goal of the learning algorithm: find the f that minimizes the expected loss
15. Classification: Bayesian decision theory
- a fundamental statistical approach to the problem of pattern classification
- assumes that the decision problem is posed in probabilistic terms
- using the posterior probability P(y|x), make the classification (maximum a posteriori classification)
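Maximum a posteriori classification can be made concrete with a tiny 1-D example. The class names, priors, and Gaussian class-conditional parameters below are invented for illustration only.

```python
import numpy as np

# illustrative priors P(y) and 1-D Gaussian class-conditionals p(x|y)
priors = {"water": 0.3, "grass": 0.7}
params = {"water": (0.2, 0.05), "grass": (0.5, 0.1)}  # (mean, std)

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def map_classify(x):
    # argmax_y P(y) * p(x|y), which is proportional to the posterior P(y|x)
    scores = {y: priors[y] * gaussian_pdf(x, *params[y]) for y in priors}
    return max(scores, key=scores.get)
```

Near x = 0.2 the water class dominates despite its smaller prior, because its class-conditional density is much higher there.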
16. Classification
- need to estimate p(y) and p(x|y), the prior and class-conditional probability density, using only the data (density estimation)
- often not feasible: too little data in too high-dimensional a space
- alternatives:
  - assume a simple parametric probability model (e.g. normal)
  - non-parametric estimation
  - directly find a discriminant function
17. Example
18. Example
19. Post-processing
- include context
- e.g. in images, signals
- integrate multiple classifiers
20. Decision
- minimize risk, considering the cost of misclassification: when unsure, select the class with minimal cost of error
21. No free lunch theorem
- don't wait until a generic best classifier arrives!
22. Applications in Remote Sensing
23. Remote sensing acquisition
- images are acquired from air or space
24. Spectral response
25. Spectral response
27. Westhoek, Brugge
- hyperspectral sensor: AISA Eagle (July 2004)
- 400-900 nm at 1 m resolution
28. Labeling
29. Labeling: spectral class mean
30. Feature extraction
- used here in an exploratory way: automatically look for relevant features
- which spectral bands (wavelengths) should be measured, and at which spectral resolution (width), for my application?
- results can be used for classification, sensor design, or interpretation
31. Feature extraction: band selection
- with spectral response function
32. Hypothetical 12-band sensor
33. Class distribution: normal
34. Class separation criterion
- two-class: Bhattacharyya bound
- multi-class criterion
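The slides do not transcribe the formula, but for normal class distributions (slide 33) the two-class Bhattacharyya bound on the error probability has the standard form

```latex
P(\text{error}) \le \sqrt{P(\omega_1)\,P(\omega_2)}\; e^{-B},
\qquad
B = \tfrac{1}{8}\,(\mu_2-\mu_1)^\top
\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(\mu_2-\mu_1)
\;+\;
\tfrac{1}{2}\ln\frac{\left|\frac{\Sigma_1+\Sigma_2}{2}\right|}
{\sqrt{|\Sigma_1|\,|\Sigma_2|}}
```

where \(\mu_i, \Sigma_i\) are the class means and covariances; a multi-class criterion can be built by combining such pairwise bounds.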
35. Optimization
- minimize the criterion
- gradient descent is possible, but local minima prevent it from giving good optimal values; therefore, we use global optimization: simulated annealing
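Simulated annealing itself is easy to sketch. This is a generic illustration on a toy 1-D criterion with many local minima, not the band-selection objective of the talk; all parameters are illustrative.

```python
import math
import random

def simulated_annealing(cost, x0, neighbor, t0=10.0, cooling=0.995, steps=3000):
    # accept worse solutions with probability exp(-delta / T), so the search
    # can escape local minima; the temperature T is lowered gradually
    x, c = x0, cost(x0)
    best_x, best_c = x, c
    t = t0
    for _ in range(steps):
        xn = neighbor(x)
        cn = cost(xn)
        if cn < c or random.random() < math.exp(-(cn - c) / t):
            x, c = xn, cn
            if c < best_c:
                best_x, best_c = x, c
        t *= cooling
    return best_x, best_c

random.seed(0)
# toy multimodal criterion: local minima near every integer, global minimum at 0
cost = lambda x: x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))
best_x, best_c = simulated_annealing(
    cost, x0=4.0, neighbor=lambda x: x + random.uniform(-0.5, 0.5))
```

Plain gradient (or greedy) descent started at x0 = 4.0 would stay in that local minimum; the stochastic acceptance rule is what lets annealing move past the barriers.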
42. Remote sensing classification
43. Multi-class classification
- linear multi-class classifier
- combining binary classifiers:
  - one against all: K-1 classifiers
  - one against one: K(K-1)/2 classifiers
44. Combining linear multi-class classifiers
45. Combining binary classifiers
- maximum voting, 4-class example

  Binary classifier   Result
  1-2                 2
  1-3                 3
  1-4                 4
  2-3                 2
  2-4                 4
  3-4                 4

  Votes: class 1: 0, class 2: 2, class 3: 1, class 4: 3 (winner)
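The vote count on this slide can be reproduced in a few lines:

```python
from collections import Counter

# one-against-one results from the 4-class example above:
# binary classifier (i, j) -> predicted class
pairwise = {(1, 2): 2, (1, 3): 3, (1, 4): 4,
            (2, 3): 2, (2, 4): 4, (3, 4): 4}

votes = Counter(pairwise.values())
winner = votes.most_common(1)[0][0]   # class 4, with 3 votes
```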
46. Problem with max voting
- no probabilities, just class labels
- hard classification
- probabilities are useful for:
  - spectral unmixing
  - post-processing
47. Combining binary classifiers: coupling probabilities
- look for class probabilities p_i
- with r_ij the probability of class ω_i for binary classifier i-j
- K-1 free parameters and K(K-1)/2 constraints!
- Hastie and Tibshirani find approximations by minimizing the Kullback-Leibler distance
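A simplified sketch of pairwise coupling, in the spirit of Hastie and Tibshirani's iterative scheme (their full method weights each pair by its sample count n_ij; here all weights are taken equal). The example r-matrix is constructed to be consistent with p = (0.5, 0.3, 0.2).

```python
import numpy as np

def couple_probabilities(r, iters=200):
    # r[i, j] approximates P(class i | x, class i or j), with r[j, i] = 1 - r[i, j]
    # rescale each p_i until mu_ij = p_i / (p_i + p_j) matches r_ij,
    # renormalizing after every sweep
    k = r.shape[0]
    p = np.full(k, 1.0 / k)
    for _ in range(iters):
        for i in range(k):
            num = sum(r[i, j] for j in range(k) if j != i)
            den = sum(p[i] / (p[i] + p[j]) for j in range(k) if j != i)
            p[i] *= num / den
        p /= p.sum()
    return p

# pairwise estimates consistent with p = (0.5, 0.3, 0.2)
r = np.array([[0.0,      0.625, 0.714286],
              [0.375,    0.0,   0.6],
              [0.285714, 0.4,   0.0]])
p = couple_probabilities(r)
```

When the r_ij are consistent, the true p is a fixed point of the update; with noisy r_ij the scheme yields the KL-closest approximation mentioned on the slide.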
48. Classification result
49. Single-pixel classes are not wanted
50. Remote sensing post-processing
- use contextual information to adjust the classification
- look at the classes and probabilities of neighboring pixels; if necessary, adjust the pixel's class
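One simple contextual adjustment of this kind is a majority filter over each pixel's neighborhood; this sketch uses class labels only (the talk's method also uses probabilities), and all parameters are illustrative.

```python
import numpy as np
from collections import Counter

def majority_filter(labels, min_votes=5):
    # replace each interior pixel's class by the majority class of its
    # 3x3 neighborhood when that majority is strong enough
    h, w = labels.shape
    out = labels.copy()
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = labels[i - 1:i + 2, j - 1:j + 2].ravel()
            cls, votes = Counter(window.tolist()).most_common(1)[0]
            if votes >= min_votes:
                out[i, j] = cls
    return out

# a single-pixel class inside a homogeneous region is smoothed away
img = np.zeros((5, 5), dtype=int)
img[2, 2] = 1
smoothed = majority_filter(img)
```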
51. Post-processed classification result
52. Pixel mixing
- green grass
- dry grass
- moss
- sand
53. Pixel mixing
54. Unmixing with sand
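Linear spectral unmixing can be sketched as a least-squares problem. The endmember spectra below are invented 4-band values for three of the materials on slide 52, and a real unmixing would also enforce non-negativity and sum-to-one constraints on the abundances.

```python
import numpy as np

# illustrative endmember spectra (one column per material:
# green grass, dry grass, sand), 4 spectral bands
E = np.array([[0.10, 0.30, 0.40],
              [0.40, 0.35, 0.45],
              [0.15, 0.40, 0.50],
              [0.60, 0.45, 0.55]])

# linear mixing model: observed pixel spectrum = E @ abundances
true_a = np.array([0.6, 0.1, 0.3])
pixel = E @ true_a

# unmix with unconstrained least squares
a_hat, *_ = np.linalg.lstsq(E, pixel, rcond=None)
```

For this noise-free pixel the least-squares solution recovers the mixing fractions exactly.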
55. The End