Visual Classification and Regression using Scale Space Theory
1
A New Approach for Classification
Visual Simulation Viewpoint
Zongben Xu, Deyu Meng
Xi'an Jiaotong University
2
Outline
  • Introduction
  • The existing approaches
  • Visual sensation principle
  • Visual classification approach
  • Visual learning theory
  • Concluding remarks

3
1. Introduction
  • Data Mining (DM), the main procedure of KDD,
    aims at the discovery of useful knowledge from
    large collections of data. The knowledge mainly
    refers to:
  • Clustering
  • Classification
  • Regression

4
Clustering
  • Partitioning a given dataset with known or
    unknown distribution into homogeneous subgroups.
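As a minimal illustration of such partitioning, here is a k-means sketch on made-up 1-D data (k-means is one of the partitional methods listed later in the talk; it is not the presenters' scale-space clustering method):

```python
# Minimal k-means on 1-D data: partition points into k homogeneous
# subgroups. Purely illustrative, with hypothetical data.

def kmeans_1d(points, k, iters=20):
    # Initialize centers with the first k distinct values.
    centers = sorted(set(points))[:k]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[j].append(p)
        # Update step: move each center to its cluster's mean.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.7], k=2)
```

On this toy data the two centers settle near the two obvious subgroups around 1 and 9.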

5
Clustering
  • Example: object categorization/classification
    from a remotely sensed image

6
Classification
  • Finding a discriminant rule (a function f(x))
    from empirical data with k labels, generated
    from an unknown but fixed distribution
    (normally, the case k = 2 is the focus).
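For instance, the nearest-neighbor method listed later in the talk yields such a discriminant rule directly; a minimal 1-D sketch with made-up samples:

```python
# A discriminant rule f(x) learned from labeled samples: the
# 1-nearest-neighbor rule, one of the nonparametric methods
# surveyed later in this presentation. Data are hypothetical.

def nearest_neighbor_classifier(samples):
    # samples: list of (x, label) pairs with k distinct labels.
    def f(x):
        # Predict the label of the closest training point.
        return min(samples, key=lambda s: abs(x - s[0]))[1]
    return f

f = nearest_neighbor_classifier([(0.0, 'A'), (1.0, 'A'), (5.0, 'B')])
```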

7
Classification
  • Face recognition example

8
Classification
  • Fingerprint recognition example

9
Regression
  • Finding a relationship (a function f(x)) between
    the input and output training data generated by
    an unknown but fixed distribution

10
Regression
  • Air quality prediction example (data obtained
    at the Mong Kok monitoring station in Hong Kong,
    based on hourly continuous measurements
    throughout the year 2000).

11
Existing Approaches for Clustering
  • Hierarchical clustering
    - Nested hierarchical clustering
    - Non-nested hierarchical clustering
    - SLINK
    - COMLINK
    - MSTCLUS
  • Partitional clustering
    - K-means clustering
    - Neural networks
    - Kernel methods
    - Fuzzy methods

12
Existing Approaches for Classification
  • Statistical approach
    - Parametric methods (e.g., the Bayesian method)
    - Nonparametric methods (e.g., the density
      estimation method, the nearest-neighbor method)
  • Discriminant function approach
    - Linear discriminant method
    - Generalized linear discriminant method
    - Fisher discriminant method
  • Nonmetric approach
    - Decision tree methods
    - Rule-based methods
  • Computational intelligence approach
    - Fuzzy methods
    - Neural networks
    - Kernel methods: Support Vector Machine (SVM)

13
Existing Approaches for Regression
  • Interpolation methods
  • Statistical methods
    - Parametric regression
    - Nonparametric regression
  • Computational intelligence methods
    - Fuzzy regression methods (e.g., the
      ε-insensitive fuzzy c-regression model)
    - Neural networks
    - Kernel methods: Support Vector Regression (SVR)

14
Main problems encountered
  • Validity problem (clustering): is there real
    clustering, and if so, how many clusters?
  • Efficiency/scalability problem: most methods are
    efficient only for small or medium-sized datasets.
  • Robustness problem: most results are sensitive
    to model parameters and to noise in the samples.
  • Model selection problem: there is no general
    rule to specify the model type and parameters.

15
Research agenda
  • The essence of DM is modeling from data. It
    depends not only on how the data are generated,
    but also on how we sense or perceive the data.
    The existing DM methods are developed based on
    the former principle, much less on the latter.
  • Our idea is to develop DM methods based on human
    visual sensation and perception principle
    (particularly, to treat a data set as an image,
    and to mine the knowledge from the data in
    accordance with the way we observe and perceive
    the image).

16
Research agenda (Cont.)
  • We have successfully developed such an approach
    for clustering, and in particular solved the
    clustering validity problem. See:
  • Clustering by Scale Space Filtering,
  • IEEE Transactions on PAMI, 22(12) (2000), 1396-1410
  • This report aims at initiating the approach for
    classification, with an emphasis on solving the
    efficiency/scalability problem and the
    robustness problem.
  • The model selection problem is under our current
    research.

17
2.1. Visual sensation principle
The structure of the human eye
18
2.1. Visual sensation principle
Accommodation (focusing) of an image by changing
the shape of the crystalline lens of the eye (or,
equivalently, by changing the distance between
image and eye when the shape of the lens is fixed)
19
2.1. Visual sensation principle
How does the image on the retina vary with the
distance between object and eye (or, equivalently,
with the shape of the crystalline lens)? Scale
space theory provides an explanation. The theory
is directly supported by neurophysiological
findings in animals and by psychophysics in humans.
20
2.2. Scale Space Theory
21
2.2. Scale Space Theory
22
2.2. Scale Space Theory
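The scale space slides themselves are not preserved in this transcript. The core construction of linear scale space theory, convolving a signal with Gaussian kernels of increasing width σ, can be sketched as follows (the toy signal and function names are our own, not from the slides):

```python
import math

# Linear scale space: the family L(x, sigma) = (G_sigma * f)(x), where
# G_sigma is a Gaussian kernel. Larger sigma means a coarser scale,
# i.e., a more blurred view of the signal.

def gaussian_blur(signal, sigma):
    radius = int(3 * sigma) + 1
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    s = sum(kernel)
    kernel = [k / s for k in kernel]            # normalize to sum 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), len(signal) - 1)  # clamp edges
            acc += w * signal[idx]
        out.append(acc)
    return out

spike = [0.0] * 10 + [1.0] + [0.0] * 10    # a single point of light
coarse = gaussian_blur(spike, sigma=2.0)   # the same stimulus at a coarse scale
```

The blurred response stays centered on the stimulus but spreads its mass, which is the behavior the visual-sensation analogy relies on.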
23
2.3. Cell responses in retina
Only changes of light can be perceived, and only
three types of cell responses exist in the retina:
  • 'ON' response: the response to the arrival of a
    light stimulus (the blue region)
  • 'OFF' response: the response to the removal of
    a light stimulus (the red region)
  • 'ON-OFF' response: the response to a hybrid of
    on and off, since both presentation and removal
    of the stimulus may exist simultaneously (the
    yellow region)

24
2.3. Cell responses in retina
Between the on and off regions, roughly at the
boundary, is a narrow region where on-off
responses occur. Every cell has its own response
strength; roughly, the strength profile is
Gaussian-like.
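A common computational stand-in for such Gaussian-like center-surround responses is a difference-of-Gaussians filter, which reproduces the ON/OFF/ON-OFF pattern around a light/dark boundary. This sketch is our own illustration, not taken from the slides:

```python
import math

# Center-surround cell responses modeled as a difference of Gaussians
# (DoG): a narrow excitatory center minus a wider inhibitory surround.
# On a light/dark edge, the output is positive just inside the light
# region ("ON"), negative just inside the dark region ("OFF"), and
# changes sign in a narrow band in between ("ON-OFF").

def gauss(x, sigma):
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def dog_response(signal, center=1.0, surround=2.0, radius=8):
    out = []
    n = len(signal)
    for i in range(n):
        acc = 0.0
        for j in range(-radius, radius + 1):
            idx = min(max(i + j, 0), n - 1)   # clamp at the borders
            acc += (gauss(j, center) - gauss(j, surround)) * signal[idx]
        out.append(acc)
    return out

edge = [0.0] * 12 + [1.0] * 12   # dark half, then light half
resp = dog_response(edge)
```

Far from the edge the response is near zero; only change is signaled, matching the slide's observation.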

25
3. Visual Classification Approach: Our philosophy
26
3. VCA: Our philosophy (Cont.)
27
3. VCA: A method to choose scale
An observation
28
3. VCA: A method to choose scale
29
3. VCA: A method to choose scale
30
3. VCA: Procedure
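The procedure itself is not preserved in this transcript. Based on the stated philosophy (treat the labeled data as an image and model the blurring effect of retinal interconnections by Gaussian smoothing at a chosen scale), a plausible reconstruction, not the authors' exact algorithm, looks like:

```python
import math

# Hedged sketch of a scale-space classifier: each class's training
# points act as light sources on an "image"; blurring them with a
# Gaussian at scale sigma gives a smooth response field per class,
# and a query point takes the label of the strongest field. This is
# our reconstruction of the idea, not the authors' exact VCA procedure.

def scale_space_classify(samples, x, sigma=1.0):
    # samples: list of (position, label) pairs; x: query position.
    response = {}
    for p, label in samples:
        w = math.exp(-((x - p) ** 2) / (2 * sigma * sigma))
        response[label] = response.get(label, 0.0) + w
    return max(response, key=response.get)

train = [(0.0, 'A'), (0.5, 'A'), (1.0, 'A'), (4.0, 'B'), (4.5, 'B')]
```

The scale sigma plays the role of the lens accommodation discussed earlier: small sigma gives sharp, noise-sensitive boundaries, while larger sigma yields smoother, more robust ones.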
31
3. VCA: Demonstrations
Linearly separable data without noise

32
3. VCA: Demonstrations
Linearly separable data with 5% noise

33
3. VCA: Demonstrations
Circularly separable data without noise

34
3. VCA: Demonstrations
Circularly separable data with 5% noise

35
3. VCA: Demonstrations
Spirally separable data without noise

36
3. VCA: Demonstrations
Spirally separable data with 5% noise

37
3. VCA: Efficiency test
11 groups of benchmark datasets from UCI,
DELVE and STATLOG

38
3. VCA: Efficiency test
Performance comparison between VCA and SVM

39
3. VCA: Scalability test
The time complexity of VCA is quadratic in the
size of the training data (a) and linear in its
dimension (b).

(a) Datasets of fixed dimension (10-D) but varying
size are used.
(b) Datasets of fixed size (5000) but varying
dimension are used.
40
3. VCA: Conclusion
1. Without an increase in the misclassification
rate (i.e., without loss of generalization
capability), much less computational effort is
needed than for SVM (approximately 0.7% of the
computation of SVM is required, a roughly
142-fold gain in computational efficiency).
That is, VCA has very high computational
efficiency.
2. VCA's training time increases linearly with
the dimension and quadratically with the size of
the training data. This shows that VCA has very
good scalability.

41
4. Theory: Visual classification machine
  • Formalization (learning theory)
  • Let Z = X × Y be the sample space (X the
    pattern space and Y the label space), and
    assume that there exists a fixed but unknown
    relationship F between X and Y (or,
    equivalently, a fixed but unknown distribution
    ρ on Z).
  • Given a family of functions H (the hypothesis
    space) and a finite number of samples
    z = {(x_i, y_i), i = 1, ..., n},
    which are drawn independently and identically
    according to ρ.

42
4. Theory: Visual classification machine
  • Formalization (cont.)

We are asked to find a function f* in H which
approximates F in the following sense: for a
certain measure Q of the discrepancy between the
machine output f(x) and the actual output y, find
f* in H so that
    f* = arg min_{f in H} R(f)    (learning problem)
where
    R(f) = ∫_Z Q(f(x), y) dρ    (risk, or generalization error)
43
4. Theory: Visual classification machine
  • Learning algorithm (convergence)
  • A learning algorithm L is a mapping from the
    sample set z to H with the following property:
  • for any ε > 0 and δ in (0, 1), there is an
    integer N(ε, δ) such that whenever n ≥ N(ε, δ),
        P{ R(L(z)) − inf_{f in H} R(f) ≤ ε } ≥ 1 − δ,
  • where the probability is taken over the i.i.d.
    draw of the n samples.
  • In this case, we say that L(z) is an
    (ε, δ)-solution of the learning problem. Given
    an implementation scheme of a learning problem,
    we say it is convergent if it is a learning
    algorithm.

44
4. Theory: Visual classification machine
  • Visual classification machine (VCM)
  • The function set H
  • The generalization error R(f)
  • The learning implementation scheme (the procedure
    of finding f*) (Is it a learning algorithm?)

45
4. Theory: Visual classification machine
  • Learning theory of VCM
  • How can the generalization performance of VCM be
    controlled (what is the learning principle)?
  • Is it convergent? (Is it a learning algorithm?)

The key is to develop a rigorous upper bound
estimate on the gap between the actual risk and
the optimal risk.
46
4. Theory: Visual classification machine
  • This theorem shows that maximizing the
    generalization capability of the machine is
    equivalent to minimizing the upper error bound
    between the actual and optimal risks.

47
4. Theory: Visual classification machine
  • VCA is designed precisely to approximately
    minimize this bound. This reveals the learning
    principle behind VCA and explains why VCA has
    strong generalization capability.

48
4. Theory: Visual classification machine
  • This theorem shows that VCA is a learning
    algorithm. Consequently, a learning theory of
    VCM is established.

49
5. Concluding remarks
  • The existing approaches to classification have
    mainly aimed at exploring the intrinsic
    structure of the dataset, with little or no
    emphasis on simulating human sensation and
    perception. We have initiated an approach to
    classification based on the human visual
    sensation and perception principle (the core
    idea is to model the blurring effect of lateral
    retinal interconnections based on scale space
    theory). Preliminary simulations demonstrate
    that the new approach is encouraging and
    potentially very useful.
  • The main advantages of the new approach are its
    very high efficiency and excellent scalability.
    It often brings a significant reduction of
    computational effort without loss of prediction
    capability, especially compared with the
    prevalently adopted SVM approach.
  • The theoretical foundations of VCA (visual
    learning theory) have been developed, revealing
    that (1) VCA attains its high generalization
    performance by minimizing the upper error bound
    between the actual and optimal risks (the
    learning principle), and (2) VCA is a learning
    algorithm.


50
5. Concluding remarks
  • Many problems deserve further research:
  • to apply nonlinear scale space theory for
    further efficiency speed-up
  • to apply VCA to practical engineering problems
    (e.g., DNA sequence analysis)
  • to develop a visual learning theory for the
    regression problem, etc.

51
Thanks!