Title: Lecture 6 Classifiers and Pattern Recognition Systems
1 Lecture 6: Classifiers and Pattern Recognition Systems
2 Outline
- Comparison of Classifiers
- Error Estimation for Classifiers
- ROC (Receiver Operating Characteristic)
- Bayes framework for image classification: an application case study
5 Error Estimation of Classifiers
- Use independent data sets for training and testing.
- Both data sets must be large enough for the complexity of the problem; hundreds or thousands of samples are typically needed.
- When there are not enough samples, use one of the following methods (a cross-validation sketch follows this list):
  - Re-substitution
  - Holdout
  - Leave-one-out
  - N-fold cross-validation
  - Bootstrap
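As a minimal illustration of N-fold cross-validation (a sketch; the scikit-learn API, the k-NN classifier, and the synthetic data are assumptions, not from the slides):

```python
# Minimal sketch of N-fold cross-validation for error estimation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

n_folds = 5
kf = KFold(n_splits=n_folds, shuffle=True, random_state=0)
errors = []
for train_idx, test_idx in kf.split(X):
    clf = KNeighborsClassifier(n_neighbors=3)
    clf.fit(X[train_idx], y[train_idx])        # train on N-1 folds
    acc = clf.score(X[test_idx], y[test_idx])  # test on the held-out fold
    errors.append(1.0 - acc)

# The error estimate averages the per-fold error rates.
print(f"{n_folds}-fold error estimate: {np.mean(errors):.3f}")
```

Holdout and leave-one-out follow the same pattern: holdout uses a single split, and leave-one-out is the special case where N equals the number of samples.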
7 ROC (Receiver Operating Characteristic)
- The ROC curve plots the proportion of correct responses (hits) against the false positives as the decision boundary changes.
- The ROC curve gives information that is independent of the observer's loss function.
- The ROC presents a fuller picture than a single error rate:
  - A set of error rates under different confidence levels
  - Users can select proper threshold values according to their needs
8 - For class ω2
  - Hit rate: correct classification of ω2
  - Miss rate: missed detection of ω2
- For class ω1
  - False alarm rate: falsely accepted as ω2
  - Rejection rate: correct rejection of ω1 (a threshold-sweep sketch follows)
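A minimal sketch of how these rates trace out an ROC curve as the decision threshold sweeps over the classifier scores (the Gaussian score distributions and all variable names are assumptions, not from the slides):

```python
# Sketch: trace an ROC curve by sweeping a threshold over class scores.
import numpy as np

rng = np.random.default_rng(0)
scores_w1 = rng.normal(0.0, 1.0, 500)  # scores for class w1 (reject class)
scores_w2 = rng.normal(1.5, 1.0, 500)  # scores for class w2 (target class)

for t in np.linspace(-3.0, 5.0, 9):
    hit_rate = np.mean(scores_w2 >= t)     # w2 correctly classified
    false_alarm = np.mean(scores_w1 >= t)  # w1 falsely accepted as w2
    # miss rate = 1 - hit_rate; rejection rate = 1 - false_alarm
    print(f"t={t:+.1f}  hit={hit_rate:.2f}  false_alarm={false_alarm:.2f}")
```

Plotting hit rate against false-alarm rate over all thresholds yields the ROC curve.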
9 ROC and rejection rate
- Rejection is used to reduce recognition error rates when samples are close to the decision boundary.
- A higher rejection rate improves the recognition rate, as shown in Fig. 3-13; a reject-option sketch follows.
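A minimal sketch of the reject option: classify only when the top posterior exceeds a confidence threshold (the posterior values and the 0.8 threshold are assumptions for illustration):

```python
# Sketch: reject samples whose maximum posterior falls below a threshold.
import numpy as np

def classify_with_reject(posteriors, threshold=0.8):
    """posteriors: (n_samples, n_classes) array; returns the class index,
    or -1 for samples that are rejected as too close to the boundary."""
    best = np.argmax(posteriors, axis=1)
    confidence = np.max(posteriors, axis=1)
    return np.where(confidence >= threshold, best, -1)

# The middle sample lies near the decision boundary and is rejected.
p = np.array([[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]])
print(classify_with_reject(p))  # -> [ 0 -1  1]
```

Raising the threshold rejects more borderline samples, which lowers the error rate on the samples that are actually classified.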
10 Classification Error vs. Number of Training Samples per Class
11 Face recognition performance comparison: computer vs. human (Andy Adler et al., 2006)
12 Advances in Pattern Recognition
- Multimodality in features and classifiers
  - Image, audio, text
  - Different modalities of medical images, such as CT, MRI, X-ray, ultrasound, etc.
- Combination of classifiers
- Open databases for performance benchmarking of pattern detection and classification in specific application domains such as face, fingerprint, handwritten characters, object recognition, content-based retrieval, etc.
13 Neural networks
- Based on massively parallel simple processing units
- Perform nonlinear mapping from input features to output category nodes
- Black box: hard to interpret after learning
- Good performance if training samples are large enough
- Matlab NN toolbox (a Python sketch follows)
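As a Python analogue of the Matlab NN toolbox workflow, a minimal sketch of training a small multilayer perceptron (scikit-learn, the single hidden layer of 32 units, and the synthetic data are assumptions):

```python
# Sketch: a small feedforward network mapping features to category nodes.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# One hidden layer performs the nonlinear mapping from inputs to outputs.
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
net.fit(X_tr, y_tr)
print(f"test accuracy: {net.score(X_te, y_te):.3f}")
```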
14 Application Case: Bayes Image Classification
- "A Bayesian Framework for Semantic Classification of Outdoor Vacation Images," by Vailaya et al., 2001
17 Problem and approach
- Classify city vs. landscape (sunset, forest, mountains)
- Database: 2,716 images
- Classification accuracy: 94-95%
- Features used: color histograms, color coherence vectors, edge direction histograms, and edge direction coherence vectors (a feature-extraction sketch follows this list)
- Representation: VQ codebook
- Bayes framework
  - VQ as conditional density estimator
  - MAP criterion (Maximum A Posteriori)
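A minimal sketch of two of the listed features, a 64-bin color histogram and a 72-bin edge direction histogram (the numpy-only implementation and the edge-strength cutoff are assumptions; the paper's exact extraction details may differ):

```python
# Sketch: color histogram and edge direction histogram features.
import numpy as np

def color_histogram(img, bins_per_channel=4):
    """64-bin histogram: 4 bins per RGB channel. img: (H, W, 3) uint8."""
    q = (img // (256 // bins_per_channel)).reshape(-1, 3)
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    hist = np.bincount(idx, minlength=bins_per_channel ** 3)
    return hist / hist.sum()

def edge_direction_histogram(gray, n_bins=72):
    """72-bin histogram of gradient directions (5-degree bins)."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    strong = mag > mag.mean()              # keep only stronger edges (assumed)
    hist, _ = np.histogram(ang[strong], bins=n_bins, range=(0.0, 360.0))
    return hist / max(hist.sum(), 1)

img = (np.random.default_rng(0).random((150, 150, 3)) * 255).astype(np.uint8)
feat = np.concatenate([color_histogram(img),
                       edge_direction_histogram(img.mean(axis=2))])
print(feat.shape)  # (136,) = 64 color bins + 72 edge direction bins
```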
18 Bayes Framework
- Given: image x belongs to one of K classes
- A priori knowledge
- 0/1 loss function
- Bayes law (see the formulation below)
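Under the 0/1 loss, the Bayes rule reduces to the MAP decision; a sketch of the standard formulation (the notation here is assumed, since the slide's equations were images):

```latex
% Bayes law: posterior of class \omega_k given image features x
P(\omega_k \mid x) = \frac{p(x \mid \omega_k)\, P(\omega_k)}
                          {\sum_{j=1}^{K} p(x \mid \omega_j)\, P(\omega_j)}

% MAP decision under 0/1 loss (the denominator is common to all classes)
k^{*} = \arg\max_{k \in \{1, \dots, K\}} \; p(x \mid \omega_k)\, P(\omega_k)
```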
19 Feature extraction and VQ representation
- Feature extraction: color- and edge-related features
- Assume the features are independent
- Class-conditional density estimation based on VQ (vector quantization); see the sketch below
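A minimal sketch of VQ-based class-conditional density estimation: learn a codebook per class with k-means and estimate the density from the proportion of training samples in each Voronoi cell (scikit-learn's KMeans, the add-one smoothing, and the synthetic data are assumptions; the codebook size of 40 follows the system architecture slide):

```python
# Sketch: a VQ codebook as a class-conditional density estimator.
import numpy as np
from sklearn.cluster import KMeans

def fit_vq_density(X, q=40):
    """Learn a q-vector codebook and per-cell probabilities for one class."""
    km = KMeans(n_clusters=q, n_init=10, random_state=0).fit(X)
    counts = np.bincount(km.labels_, minlength=q)
    probs = (counts + 1) / (counts.sum() + q)  # add-one smoothing (assumed)
    return km, probs

def log_likelihood(km, probs, x):
    """Score a sample by the probability mass of its Voronoi cell."""
    cell = km.predict(x.reshape(1, -1))[0]
    return np.log(probs[cell])

# MAP classification: pick the class maximizing likelihood times prior.
rng = np.random.default_rng(0)
X_city, X_land = rng.normal(0, 1, (500, 8)), rng.normal(2, 1, (500, 8))
models = [fit_vq_density(Xc) for Xc in (X_city, X_land)]
log_priors = np.log([0.5, 0.5])
x = rng.normal(2, 1, 8)
scores = [log_likelihood(km, p, x) + lp
          for (km, p), lp in zip(models, log_priors)]
print("predicted class:", int(np.argmax(scores)))  # 1 = landscape here
```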
20 Nearest Neighbor and Voronoi Tessellation
- Parzen window: approximate the density by the proportion of sample data falling in each cell
- Different from the standard Parzen windows? Why?
- Increasing the codebook size q may cause over-fitting
- Consider the total description length of both the data and the model: Minimum Description Length (MDL); a standard form is sketched below
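A sketch of one standard two-part MDL criterion for choosing the codebook size (this generic form is an assumption; the paper's exact penalty term may differ):

```latex
\hat{q} = \arg\min_{q}
  \Big[ \underbrace{-\sum_{i=1}^{n} \log p(x_i \mid \theta_q)}_{\text{code length of the data}}
      + \underbrace{\tfrac{k_q}{2} \log n}_{\text{code length of the model}} \Big]
```

where $k_q$ is the number of free parameters of a codebook of size $q$ and $n$ is the number of samples.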
21 System Architecture
[Architecture diagram: input images from 150x150 up to 750x750 pixels, 24 bits/pixel; color histogram (64 bins) and edge direction (72 bins) features; codebook of size 40]
22 Experimental Results
- Edge information is important for detecting city images
- Color information is important for discriminating landscape subclasses
23 Wrongly classified images
24 Summary
- Choose the right classifier based on your needs in terms of accuracy, speed, memory constraints, and available samples.
- Classifier performance evaluation is critical for the design of pattern recognition systems.
- ROC is a powerful tool for the evaluation of classifiers.
- Designing, implementing, and evaluating pattern recognition systems requires many iterative steps to achieve good performance and meet application needs.
25 Reading
- Chapter 2, Pattern Classification by Duda, Hart, and Stork, 2001, Section 2.8.3, pp. 48-51
- Chapter 9, Pattern Classification by Duda, Hart, and Stork, 2001, Section 9.6, pp. 482-485
- A.K. Jain, R.P.W. Duin, and J. Mao, "Statistical pattern recognition: a review," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, 2000, Section 7, pp. 24-27
- A. Vailaya, et al., "A Bayesian Framework for Semantic Classification of Outdoor Vacation Images," SPIE Conference on Electronic Imaging, 1999, San Jose, California, USA.
26 Backup slides
27 Generative vs. Discriminative
- Generative methods
  - Determine models of how patterns are formed.
  - Use these models to perform discrimination.
  - Pattern Theory (Grenander, 1996).
- Discriminative methods
  - Don't model pattern formation.
  - Instead, extract features from patterns and make decisions using these features.
- Both generative and discriminative methods require training data to learn the models/features/decision rules.
- Machine learning concentrates on learning discrimination rules.
- Key issue: do we have enough training data to learn?
28 - The generative approach will attempt to estimate the Gaussian distributions from the data and then derive the decision rule.
- The discriminative approach will seek to estimate the decision rule directly by learning the discriminant plane (both approaches are sketched below).
- In practice, we will not know the form of the distributions or the form of the discriminant.
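A minimal sketch contrasting the two approaches on the same data: estimate per-class Gaussians and derive the decision rule (generative) vs. learn the discriminant plane directly with logistic regression (discriminative). The scikit-learn usage, the equal-prior/shared-covariance simplification, and the synthetic data are all assumptions:

```python
# Sketch: generative (Gaussian + Bayes rule) vs. discriminative (logistic).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 1.0, (200, 2))  # class 0 samples
X1 = rng.normal([2.0, 2.0], 1.0, (200, 2))  # class 1 samples
X, y = np.vstack([X0, X1]), np.r_[np.zeros(200), np.ones(200)]

# Generative: estimate each class's Gaussian mean; with equal priors and
# shared spherical covariance, Bayes rule picks the nearer class mean.
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
def generative_predict(x):
    return int(np.sum((x - mu1) ** 2) < np.sum((x - mu0) ** 2))

# Discriminative: learn the separating plane directly from the labels.
disc = LogisticRegression().fit(X, y)

x = np.array([1.5, 1.0])
print("generative:", generative_predict(x),
      " discriminative:", int(disc.predict(x.reshape(1, -1))[0]))
```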
29 - Bayes decision theory gives a framework for both generative and discriminative approaches.
- Current wisdom:
  - (i) Discriminative methods are simpler, computationally faster, and easier to apply.
  - (ii) Generative methods are needed for the most complex problems.
- Hybrid methods are increasingly popular.
30 SVM (Support Vector Machine)
- SVMs perform structural risk minimization to achieve good generalization:
  - a function that (1) minimizes the empirical risk and (2) has low VC dimension
- The optimization criterion is the width of the margin between the classes.
- Primarily two-class classifiers, but they can be extended to multiple classes.
- Linear SVM vs. nonlinear SVM
  - Map samples to a higher-dimensional space where the different classes can be separated by a hyperplane (the kernel trick)
- The performance of SVMs depends on the choice of the kernel and its parameters (a sketch follows this list).
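A minimal sketch of a linear vs. a nonlinear (RBF-kernel) SVM on a problem that is not linearly separable; scikit-learn's SVC and the synthetic data are assumptions, not from the slides:

```python
# Sketch: linear vs. kernel SVM on a non-linearly-separable problem.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not separable by a hyperplane in the input space.
X, y = make_circles(n_samples=400, noise=0.1, factor=0.4, random_state=0)

linear = SVC(kernel="linear", C=1.0).fit(X, y)
rbf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)  # kernel trick

print(f"linear SVM training accuracy: {linear.score(X, y):.2f}")
print(f"RBF SVM training accuracy:    {rbf.score(X, y):.2f}")
# The margin is defined by the support vectors (see the next two slides):
print("support vectors (RBF):", rbf.support_vectors_.shape[0])
```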
31 Margin of Separation for SVM
32 Support Vectors
- The empty area around the decision boundary is defined by the distance to the nearest training patterns (i.e., the support vectors).
- These are the most difficult patterns to classify.