Title: Intro to Pattern Recognition
1Intro to Pattern Recognition
PCTech 02
MIT 2202-501 PC Technology
Dr. Bunyarit Uyyanonvara IT Program, Sirindhorn
International Institute of Technology Thammasat
University bunyarit_at_siit.tu.ac.th
http://www.siit.tu.ac.th/bunyarit
2Decision Making Process
- The human decision-making process is often
related to the recognition of regularity
(pattern).
- Humans are good at looking for correlations and
extracting regularities from them.
3Artificial Intelligence - recap
- What is artificial intelligence (AI)?
- Russell and Norvig gave four possible definitions:
- (1) Systems that act like humans.
- (2) Systems that think like humans.
- (3) Systems that think rationally.
- (4) Systems that act rationally.
- Definitions 2 and 3 correspond to strong AI; definitions 1 and 4 to weak AI.
4Intelligent Agents - recap
- Intelligent agents: Perception (Input) -> Reason (Execute) -> Actuation (Output)
5Basic Terminology
- The most obvious example of artificial
intelligence is the pattern classification
process.
6Basic Terminology
- What is a pattern?
- A pattern is essentially an arrangement or an
ordering in which some organization of underlying
structure can be said to exist.
7Pattern Recognition
These patterns don't have to be pictures; they
can be any pieces of information. You can
recall pieces of music from only a few notes of
the melody, think of words that rhyme with
"heart", or name a politician whose name is
also a verb.
8Basic Terminology
- A pattern can be represented by a vector composed
of measured stimuli or attributes derived from
measured stimuli and their interrelationship.
9Example of Machine Perception
- Build a machine that can recognize patterns
- Speech recognition
- Fingerprint identification
- OCR (Optical Character Recognition)
- DNA sequence identification
10An Example
- Sorting incoming fish on a conveyor according to
species using optical sensing
- Species: sea bass or salmon
12Basic Terminology
13Problem analysis
- Set up a camera and take some sample images to
extract features:
- Length
- Lightness
- Width
- Number and shape of fins
- Position of the mouth, etc
- This is the set of all suggested features to
explore for use in our classifier!
14Preprocessing
- Use a segmentation operation to isolate fish
from one another and from the background
- Information from a single fish is sent to a
feature extractor whose purpose is to reduce the
data by measuring certain features
- The features are passed to a classifier
15Basic Terminology
- Preprocessing partitions the image into isolated
objects (characters)
- Feature extraction abstracts high-level
information about an individual pattern to
facilitate recognition
- The classifier identifies the category to
which the pattern belongs or, in general, the
attributes associated with the given pattern
- The context processor increases recognition accuracy
16Classification
- Select the length of the fish as a possible
feature for discrimination
17Length Graph
18First Conclusion
- The length is a poor feature alone!
- Select the lightness as a possible feature.
19Lightness Graph
20Threshold decision boundary and cost relationship
- Move our decision boundary toward smaller values
of lightness in order to minimize the cost
(reduce the number of sea bass that are
classified as salmon!)
- This is the task of decision theory
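The effect of the cost on the boundary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lightness readings, the two cost values, and the rule "lightness below the threshold means salmon" are all made up for the example and do not come from the slides.

```python
# Hypothetical lightness readings (illustrative values only).
salmon = [2.0, 3.0, 4.0, 4.5, 5.0]
sea_bass = [3.5, 5.5, 6.0, 6.5, 7.0]


def total_cost(threshold, cost_bass_as_salmon, cost_salmon_as_bass):
    """Cost of the rule: lightness < threshold -> salmon, otherwise sea bass."""
    bass_as_salmon = sum(1 for x in sea_bass if x < threshold)
    salmon_as_bass = sum(1 for x in salmon if x >= threshold)
    return (cost_bass_as_salmon * bass_as_salmon
            + cost_salmon_as_bass * salmon_as_bass)


def best_threshold(cost_bass_as_salmon, cost_salmon_as_bass):
    """Try every observed value as a threshold and keep the cheapest one.

    Ties are broken in favour of the smaller threshold.
    """
    candidates = sorted(set(salmon + sea_bass))
    return min(
        candidates,
        key=lambda t: (total_cost(t, cost_bass_as_salmon, cost_salmon_as_bass), t),
    )


# Equal costs versus a (hypothetical) five-fold penalty for
# calling a sea bass a salmon.
equal_costs = best_threshold(1, 1)
asymmetric = best_threshold(5, 1)
print(equal_costs, asymmetric)
```

With these made-up numbers, penalising "sea bass classified as salmon" more heavily pushes the chosen threshold toward smaller lightness values, which is the behaviour the slide describes.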
21Formulation
- Adopt the lightness and add the width of the fish
- Fish: $x^T = [x_1, x_2]$, where $x_1$ is the lightness and $x_2$ is the width
22Width/Lightness Graph
23Search for Boundary
- We might add other features that are not
correlated with the ones we already have. Care
should be taken not to reduce performance by
adding noisy features
24Search for Boundary
- Ideally, the best decision boundary should be the
one which provides an optimal performance such as
in the following figure
25Classifier
26Search for Boundary
27Generalization
- However, our satisfaction is premature because
the central aim of designing a classifier is to
correctly classify novel input
- Issue of generalization!
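The generalization issue can be made concrete with a small sketch. Everything here is an assumption chosen for illustration: the lightness samples are invented, the "flexible" rule is a 1-nearest-neighbour lookup, and the "simple" rule is a fixed threshold at 4.0.

```python
# Hypothetical (lightness, species) samples; 5.2 is an atypical salmon.
train = [
    (2.0, "salmon"), (3.0, "salmon"), (5.2, "salmon"),
    (5.0, "sea bass"), (6.0, "sea bass"), (7.0, "sea bass"),
]
# Novel input that was not seen while designing the classifier.
test = [
    (2.5, "salmon"), (3.5, "salmon"),
    (5.3, "sea bass"), (5.4, "sea bass"), (6.5, "sea bass"),
]


def nearest_neighbour(x):
    """Very flexible rule: copy the label of the closest training sample."""
    return min(train, key=lambda sample: abs(sample[0] - x))[1]


def simple_threshold(x, threshold=4.0):
    """Simple rule: lightness below the threshold means salmon."""
    return "salmon" if x < threshold else "sea bass"


def error_rate(classify, samples):
    """Fraction of samples whose predicted label differs from the true one."""
    wrong = sum(1 for x, label in samples if classify(x) != label)
    return wrong / len(samples)


for name, rule in [("nearest neighbour", nearest_neighbour),
                   ("threshold", simple_threshold)]:
    print(name, error_rate(rule, train), error_rate(rule, test))
```

On this toy data the flexible rule fits the training samples perfectly but makes mistakes on the novel samples, while the simpler rule accepts one training error and classifies the novel samples correctly.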
28More general Boundary
29Boundary Selection
30Boundary Selection
31Cost of MisClassification
- The boundary is chosen in order to minimize the
cost of misclassification.
32Syntactic pattern recognition
- In applications involving patterns that can be
represented meaningfully using vector notation,
the statistical pattern recognition approach is
ideal.
- However, patterns in which the components and
the relationships among them are of fundamental
importance are very difficult to represent
statistically.
33Syntactic Pattern Recognition
34Syntactic pattern recognition
- That is, the pattern is viewed as being
composed of subpatterns.
- These subpatterns may be composed of other
subpatterns or they can be primitives.
35Syntactic pattern recognition
36Syntactic pattern recognition
- Each chromosome shown in the picture can be
encoded as a string of qualifiers by tracking
each structure boundary in a clockwise direction.
- The first chromosome: abcbabdbabcbadbd
37Syntactic pattern recognition
- A set of rules governing the syntax can be viewed
as a grammar for the generation of a sentence
(string) from a given set of symbols.
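To illustrate the idea of a grammar generating and recognizing strings of primitives, here is a minimal sketch. The grammar is a hypothetical toy (S -> a S b | a b) chosen only because it is small; it is not the chromosome grammar from the slides, and the function names are invented for the example.

```python
# Hypothetical toy grammar over the primitives "a" and "b":
#   S -> a S b
#   S -> a b


def generate(depth):
    """Derive a sentence by applying S -> a S b (depth - 1) times, then S -> a b."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if depth == 1:
        return "ab"
    return "a" + generate(depth - 1) + "b"


def accepts(sentence):
    """Return True if the string can be derived from S with the rules above."""
    if sentence == "ab":
        return True
    return (
        len(sentence) >= 4
        and sentence[0] == "a"
        and sentence[-1] == "b"
        and accepts(sentence[1:-1])
    )


print(generate(3))        # a sentence of the toy language
print(accepts("aaabbb"))  # follows the rules
print(accepts("abab"))    # same primitives, wrong structure
```

The point of the sketch is that class membership is decided by structure (which rules produced the string), not by a numeric feature vector.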
38Syntactic pattern recognition
39Quiz
- (Quiz text lost in extraction. The surviving fragments refer to defining primitive subpatterns and encoding a pattern as a string of primitives.)
40Example Characters Recognition
- Many results for neural network based recognizers
are reported in the literature. Recognition rates
ranging from about 80% to the high 90s have been
reported.
41Example Characters Recognition
- What level of recognition accuracy is good enough?
- What level of reliability is good enough?
- The answers depend largely on context.
- For example, reliability is much more
crucial in a banking system than it would be in
a PDA.
42Example Characters Recognition
- Recognition accuracy rates in the low to mid 90%
range may sound good for PDA users,
- but they are considered high risk for a banking
identification system.
43Process of Designing the Learning Machine
44The System Design Cycle
- Data collection
- Feature Choice
- Model Choice
- Training
- Evaluation
- Computational Complexity
45Pattern Recognition Systems
- Sensing
- Use of a transducer (camera or microphone)
- The PR system depends on the bandwidth,
resolution, sensitivity and distortion of the
transducer
- Segmentation and grouping
- Patterns should be well separated and should not
overlap
46Pattern Recognition Systems
- Feature extraction
- Discriminative features
- Invariant features with respect to translation,
rotation and scale.
47Pattern Recognition Systems
- Consider the problem of recognizing speech
patterns. In this case the acoustic signals are a
function of time.
48Pattern Recognition Systems
- A pattern vector can be formed by sampling these
functions at discrete time intervals
$t_1, t_2, t_3, \dots, t_n$
- A feature vector for speech recognition might,
for example, consist of the first N Fourier
coefficients of the captured waveform.
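A minimal sketch of this step, using only the Python standard library. The waveform (one cycle of a sine wave sampled at 8 points), the choice N = 3, and the use of coefficient magnitudes as the final features are assumptions made for the example.

```python
import cmath
import math


def first_fourier_coefficients(samples, n_coeffs):
    """Return the first n_coeffs discrete Fourier transform coefficients."""
    total = len(samples)
    coeffs = []
    for k in range(n_coeffs):
        # Direct evaluation of the DFT sum for frequency index k.
        c = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / total)
            for t, x in enumerate(samples)
        )
        coeffs.append(c)
    return coeffs


# Pattern vector: the signal sampled at discrete times t = 0..7 (hypothetical).
pattern = [math.sin(2 * math.pi * t / 8) for t in range(8)]

# Feature vector: magnitudes of the first 3 coefficients.
features = [abs(c) for c in first_fourier_coefficients(pattern, 3)]
print(len(pattern), len(features), features)
```

Here the 8-dimensional pattern vector is reduced to a 3-dimensional feature vector, and for this particular waveform almost all of the energy shows up in the second coefficient (index 1).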
49Pattern Recognition
- Feature choice depends on the characteristics of
the problem domain. Good features are:
- simple to extract,
- invariant to irrelevant transformations,
- insensitive to noise.
50Pattern Recognition Systems
- Classification
- Use a feature vector provided by a feature
extractor to assign the object to a category
51Pattern Recognition Systems
- Thus a pattern can be viewed as a point in either
m-dimensional measurement space or the
n-dimensional feature space
- Typically, feature spaces are chosen to be of
lower dimensionality than the corresponding
measurement space.
52Pattern Recognition Systems
- Pattern recognition involves mapping a pattern
correctly from the feature/measurement space into
a class membership space.
(Figure: mapping from feature space to class space)
53The System Design Cycle
- Model Choice
- We may be unsatisfied with the performance of our
fish classifier and want to jump to another class
of model
54Statistical pattern recognition
55The System Design Cycle
- Training
- Use data to determine the classifier. There are
many different procedures for training classifiers
and choosing models
56- Decision given the posterior probabilities
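- $x$ is an observation for which:
- if $P(\omega_1 \mid x) > P(\omega_2 \mid x)$, the true state of nature is $\omega_1$
- if $P(\omega_1 \mid x) < P(\omega_2 \mid x)$, the true state of nature is $\omega_2$
- Therefore, whenever we observe a particular $x$, the probability of error is:
- $P(\text{error} \mid x) = P(\omega_1 \mid x)$ if we decide $\omega_2$
- $P(\text{error} \mid x) = P(\omega_2 \mid x)$ if we decide $\omega_1$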
57- Minimizing the probability of error
- Decide $\omega_1$ if $P(\omega_1 \mid x) > P(\omega_2 \mid x)$; otherwise decide $\omega_2$
- Therefore $P(\text{error} \mid x) = \min[P(\omega_1 \mid x), P(\omega_2 \mid x)]$ (Bayes decision)
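The two-class decision rule can be written out directly. This is a minimal sketch: the function name and the labels "w1" and "w2" (standing for the two states of nature) are chosen for the example, and the posterior values passed in are made up.

```python
def bayes_decision(p_w1_given_x, p_w2_given_x):
    """Decide the state with the larger posterior.

    Returns the decision and the resulting probability of error, which is
    the posterior of the state that was NOT chosen.
    """
    if p_w1_given_x > p_w2_given_x:
        return "w1", p_w2_given_x
    return "w2", p_w1_given_x


# Hypothetical posteriors for one observation x.
decision, p_error = bayes_decision(0.7, 0.3)
print(decision, p_error)
```

Because the rule always picks the larger posterior, the returned error probability is the smaller of the two posteriors.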
58- Overall risk
- $R$ = sum of all $R(\alpha_i \mid x)$ for $i = 1, \dots, a$
- Minimizing $R$ is equivalent to minimizing the conditional risk $R(\alpha_i \mid x)$ for $i = 1, \dots, a$
59- Select the action $\alpha_i$ for which $R(\alpha_i \mid x)$ is minimum
- $R$ is then minimum, and $R$ in this case is called the Bayes risk: the best performance that can be achieved!
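A minimal sketch of minimum-risk action selection. It assumes the usual decision-theoretic definition of conditional risk, which is not spelled out in this transcript: the risk of an action is the sum, over the possible states of nature, of the loss for taking that action in that state times the posterior of that state. The loss table and posteriors are hypothetical.

```python
def conditional_risks(loss, posteriors):
    """Conditional risk of each action.

    loss[i][j] is the (assumed) loss of taking action i when the true
    state of nature is j; posteriors[j] is the posterior of state j.
    """
    return [
        sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors))
        for row in loss
    ]


def min_risk_action(loss, posteriors):
    """Return the index of the action with the smallest conditional risk."""
    risks = conditional_risks(loss, posteriors)
    best = min(range(len(risks)), key=lambda i: risks[i])
    return best, risks[best]


# Hypothetical setup: states = (sea bass, salmon),
# actions = (0: call it sea bass, 1: call it salmon).
loss = [
    [0.0, 1.0],  # calling it sea bass costs 1 if it is really salmon
    [5.0, 0.0],  # calling it salmon costs 5 if it is really sea bass
]
posteriors = [0.25, 0.75]

action, risk = min_risk_action(loss, posteriors)
print(action, risk)
```

With these made-up losses the minimum-risk action is to call the fish a sea bass even though salmon is the more probable state, because the opposite mistake is five times as costly.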
60Pattern Recognition Systems
- In the training phase, training data that are
representative of the problem domain must first
be obtained
- The recognition engine is adjusted such that it
maps feature vectors into categories with a
minimum number of misclassifications.
61Pattern Recognition
- In the second phase (prediction phase), the
trained classifier assigns the unknown input
pattern to one of the categories based on the
extracted feature vector.
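The two phases can be sketched with a deliberately simple stand-in classifier. The nearest-mean rule used here is an assumption made for the example, not the method of the slides, and the (lightness, width) training values are invented.

```python
def train(samples):
    """Training phase: estimate one mean feature vector per category."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in acc]
        for label, acc in sums.items()
    }


def predict(model, features):
    """Prediction phase: assign the category whose mean is closest."""
    def squared_distance(mean):
        return sum((f - m) ** 2 for f, m in zip(features, mean))

    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical (lightness, width) feature vectors with known categories.
training_data = [
    ((2.0, 3.0), "salmon"), ((3.0, 3.5), "salmon"), ((2.5, 2.5), "salmon"),
    ((6.0, 5.0), "sea bass"), ((7.0, 5.5), "sea bass"), ((6.5, 6.0), "sea bass"),
]

model = train(training_data)
print(model)
print(predict(model, (3.0, 3.0)))
print(predict(model, (6.0, 5.5)))
```

The `model` dictionary is the only thing carried from the training phase into the prediction phase, which keeps the two phases cleanly separated.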
62Pattern Recognition Systems
- Post Processing
- Exploit context (input-dependent information
other than from the target pattern itself) to
improve performance
63The System Design Cycle
- Evaluation
- Measure the error rate (or performance) and
switch from one set of features to another
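A minimal sketch of this evaluation step, comparing two single-feature classifiers by their error rate. The fish measurements and both thresholds are hypothetical, and in practice the error would be measured on data not used for design.

```python
# Hypothetical labelled fish: (length, lightness, species).
fish = [
    (10.0, 2.0, "salmon"), (14.0, 3.0, "salmon"), (12.0, 2.5, "salmon"),
    (13.0, 6.0, "sea bass"), (11.0, 7.0, "sea bass"), (15.0, 6.5, "sea bass"),
]


def error_rate(feature_index, threshold):
    """Error of the rule: feature < threshold -> salmon, otherwise sea bass."""
    wrong = 0
    for sample in fish:
        predicted = "salmon" if sample[feature_index] < threshold else "sea bass"
        if predicted != sample[2]:
            wrong += 1
    return wrong / len(fish)


length_error = error_rate(feature_index=0, threshold=12.5)
lightness_error = error_rate(feature_index=1, threshold=4.5)
print(length_error, lightness_error)
```

On this toy data the length feature misclassifies a third of the fish while lightness separates them, which is the kind of evidence that would justify switching features.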
64The System Design Cycle
- Computational Complexity
- What is the trade-off between computational ease
and performance?
- (How does an algorithm scale as a function of the
number of features, patterns or categories?)
65The System Design Cycle
66Conclusion
- The reader may feel overwhelmed by the number,
complexity and magnitude of the sub-problems of
pattern recognition
- Many of these sub-problems can indeed be solved
- Many fascinating unsolved problems still remain
67Conclusion
- Designing machines that can recognize patterns
is a highly diverse and still unsolved problem
whose form depends on the nature of the
application.