1
Institute for Advanced Studies in Basic Sciences
Zanjan
Kohonen Artificial Neural Networks in Analytical
Chemistry
Mahdi Vasighi
2
Contents
  • Introduction to Artificial Neural Network (ANN)
  • Self Organizing Map ANN
  • Kohonen ANN
  • Applications

3
Introduction
An artificial neural network (ANN) is a
mathematical model based on biological neural
networks.
In more practical terms, neural networks are
non-linear statistical data-modeling tools. They
can be used to model complex relationships
between inputs and outputs or to find patterns in
data.
4
The basic types of problems in analytical
chemistry for whose solution ANNs can be used
are the following:
  • Selection of samples from a large quantity of
    existing ones for further handling.
  • Classification of an unknown sample into one
    of several pre-defined (known in advance)
    classes.
  • Clustering of objects, i.e., finding the inner
    structure of the measurement space to which the
    samples belong.
  • Making models for predicting behaviors or
    effects of unknown samples in a quantitative
    manner.

5
The first thing to be aware of when considering
employing ANNs is the nature of the problem we
are trying to solve:
Supervised or Unsupervised
6
Supervised Learning
A supervised problem means that the chemist
already has at hand a set of experiments with
known outcomes for specific inputs.
In these networks, the structure consists of an
interconnected group of artificial neurons that
processes information using a connectionist
approach to computation.
7
Unsupervised Learning
An unsupervised problem means that one deals
with a set of experimental data which have no
specific associated answers (or supplemental
information) attached.
In unsupervised problems (like clustering) it is
not necessary to know in advance to which cluster
or group the training objects Xs belong. The
network automatically adapts itself in such a way
that similar input objects are associated with
topologically close neurons in the ANN.
8
Kohonen Artificial Neural Networks
The Kohonen ANN offers a considerably different
approach to ANNs. The main reason is that the
Kohonen ANN is a self-organizing system which
is capable of solving unsupervised rather than
supervised problems.
The Kohonen network is probably the closest of
all artificial neural network architectures and
learning schemes to the biological neural network.
9
As a rule, the Kohonen type of network is based on
a single layer of neurons arranged in a
two-dimensional plane having a well-defined
topology.
A defined topology means that each neuron has a
defined number of neurons as nearest neighbors,
second-nearest neighbors, etc.
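As an illustration, the ring of d-th neighbors of a neuron on a square grid can be sketched as follows (a minimal sketch; the function name is illustrative and Chebyshev distance is assumed for the square topology):

```python
# Minimal sketch: the d-th ring of neighbors of a neuron on a square
# grid, using Chebyshev distance (names are illustrative, not from
# the slides).
def neighbor_ring(rows, cols, r0, c0, d):
    """Return the neurons whose Chebyshev distance from (r0, c0) is exactly d."""
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if max(abs(r - r0), abs(c - c0)) == d
    ]

# The nearest neighbors (d = 1) of the central neuron in a 5x5 map:
ring1 = neighbor_ring(5, 5, 2, 2, 1)   # 8 neurons surround the center
```

The second-nearest ring (d = 2) of an interior neuron contains 16 neurons, and so on: each ring is the perimeter of a growing square.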
10
The neighborhood of a neuron is usually arranged
either in squares or in hexagons.
In the Kohonen conception of neural networks, the
signal similarity is related to the spatial
(topological) relation among neurons in the
network.
11
Competitive Learning
The Kohonen learning concept tries to map the
input so that similar signals excite neurons that
are very close together.
12
1st step: An m-dimensional object Xs enters the
network, and only one neuron from those in the
output layer is selected. After the input occurs,
the network selects the winner c (central)
according to some criterion: c is the neuron
having either the largest output in the entire
network or the weight vector most similar to Xs.
13
2nd step: After finding the neuron c, its weight
vector is corrected to make its response closer
to the input.
3rd step: The weights of neighboring neurons must
be corrected as well. These corrections are
usually scaled down, depending on the distance
from c. Besides decreasing with increasing
distance from c, the correction also decreases
with each iteration step (the learning rate).
14
[Figure: an input object Xs excites a neuron, and the weight corrections are scaled by the neighborhood function around the winner.]
15
4th step: After the corrections have been made,
the weights should be normalized to a constant
value, usually 1.
5th step: The next object Xs is input and the
process is repeated. After all objects have been
input once, one epoch is completed.
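The five steps above can be sketched as a compact training loop. This is a simplified illustration, not the exact algorithm of the slides: it assumes Euclidean distance for winner selection, a square (non-toroidal) grid, and a simple linearly decaying neighborhood; the helper name is hypothetical.

```python
import math
import random

# A minimal sketch of the five training steps: winner selection,
# neighborhood-scaled correction, normalization to unit length,
# repeated over epochs.
def train_kohonen(data, rows, cols, epochs=20, lr0=0.5):
    random.seed(0)
    dim = len(data[0])
    w = [[[random.random() for _ in range(dim)] for _ in range(cols)]
         for _ in range(rows)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)               # learning rate shrinks each epoch
        for x in data:                                # step 1: input an object Xs ...
            c = min(((r, k) for r in range(rows) for k in range(cols)),
                    key=lambda rc: math.dist(w[rc[0]][rc[1]], x))  # ... and select the winner c
            for r in range(rows):
                for k in range(cols):
                    d = max(abs(r - c[0]), abs(k - c[1]))  # topological distance from c
                    a = max(0.0, 1.0 - d / 3.0)            # correction scaled down with distance
                    # steps 2-3: move the winner and its neighbors toward the input
                    w[r][k] = [wi + lr * a * (xi - wi) for wi, xi in zip(w[r][k], x)]
                    # step 4: normalize the weight vector to a constant length of 1
                    n = math.sqrt(sum(wi * wi for wi in w[r][k])) or 1.0
                    w[r][k] = [wi / n for wi in w[r][k]]
        # step 5: all objects input once -> one epoch completed
    return w
```

After training, similar input objects excite neurons that lie close together on the map.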
16
Weights of a 4x4 Kohonen map (three weights per neuron):
(0.2 0.4 0.1) (0.4 0.5 0.5) (0.1 0.3 0.6) (0.6 0.8 0.0)
(0.7 0.2 0.9) (0.2 0.4 0.3) (0.3 0.1 0.8) (0.9 0.2 0.4)
(0.5 0.1 0.5) (0.0 0.6 0.3) (0.7 0.0 0.1) (0.2 0.9 0.1)
(1.0 0.0 0.1) (0.1 0.2 0.3) (0.8 0.7 0.4) (0.7 0.2 0.7)
Input vector: (1.0 0.2 0.6)
Output (dot product of each neuron's weight vector with the input):
0.34 0.80 0.52 0.76
1.28 0.46 0.80 1.18
0.82 0.30 0.76 0.44
1.06 0.32 1.18 1.16
Winner: the neuron in row 2, column 1, with the largest output (1.28).
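The grids above can be reproduced in a few lines. As the numbers indicate, each neuron's output here is the dot product of its weight vector with the input, and the winner is the neuron with the largest output:

```python
# Weights of the 4x4 map from the slide (three weights per neuron).
weights = [
    [(0.2, 0.4, 0.1), (0.4, 0.5, 0.5), (0.1, 0.3, 0.6), (0.6, 0.8, 0.0)],
    [(0.7, 0.2, 0.9), (0.2, 0.4, 0.3), (0.3, 0.1, 0.8), (0.9, 0.2, 0.4)],
    [(0.5, 0.1, 0.5), (0.0, 0.6, 0.3), (0.7, 0.0, 0.1), (0.2, 0.9, 0.1)],
    [(1.0, 0.0, 0.1), (0.1, 0.2, 0.3), (0.8, 0.7, 0.4), (0.7, 0.2, 0.7)],
]
x = (1.0, 0.2, 0.6)

# Output of every neuron: dot product of its weight vector with the input.
outputs = [[round(sum(wi * xi for wi, xi in zip(w, x)), 2) for w in row]
           for row in weights]
# The winner is the neuron with the largest output.
winner = max(((r, c) for r in range(4) for c in range(4)),
             key=lambda rc: outputs[rc[0]][rc[1]])
# outputs[0] -> [0.34, 0.8, 0.52, 0.76]; winner -> (1, 0) with output 1.28
```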
17
Input vector: (1.0 0.2 0.6)
Weights of the same 4x4 map:
(0.2 0.4 0.1) (0.4 0.5 0.5) (0.1 0.3 0.6) (0.6 0.8 0.0)
(0.7 0.2 0.9) (0.2 0.4 0.3) (0.3 0.1 0.8) (0.9 0.2 0.4)
(0.5 0.1 0.5) (0.0 0.6 0.3) (0.7 0.0 0.1) (0.2 0.9 0.1)
(1.0 0.0 0.1) (0.1 0.2 0.3) (0.8 0.7 0.4) (0.7 0.2 0.7)
Differences (input minus weight vector) for each neuron:
(0.8 -0.2 0.5) (0.6 -0.3 0.1) (0.9 -0.1 0.0) (0.4 -0.6 0.6)
(0.3 0.0 -0.3) (0.8 -0.2 0.3) (0.7 0.1 -0.2) (0.1 0.0 0.2)
(0.5 0.1 0.1) (1.0 -0.4 0.3) (0.3 0.2 0.5) (0.8 -0.7 0.5)
(0.0 0.2 0.5) (0.9 0.0 0.3) (0.2 -0.5 0.2) (0.3 0.0 -0.1)
Correction factors (neighborhood scaling times learning rate 0.9): 0.4·0.9, 1·0.9, 0.8·0.9, 0.6·0.9
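A sketch of the correction step these grids illustrate: each weight vector is moved toward the input by the difference (x − w), scaled by the learning rate of 0.9 and by a neighborhood factor (1, 0.8, 0.6, 0.4 for increasing distance from the winner, matching the factors shown); the function name is illustrative:

```python
# Move a weight vector toward the input x by lr * a(d) * (x - w),
# where a(d) is the neighborhood scaling for topological distance d.
# The factors (1, 0.8, 0.6, 0.4) and lr = 0.9 are the slide's values.
def correct(w, x, d, lr=0.9, factors=(1.0, 0.8, 0.6, 0.4)):
    a = factors[d] if d < len(factors) else 0.0
    return tuple(round(wi + lr * a * (xi - wi), 3) for wi, xi in zip(w, x))

x = (1.0, 0.2, 0.6)
winner_w = (0.7, 0.2, 0.9)         # weights of the winning neuron (row 2, col 1)
new_w = correct(winner_w, x, d=0)  # -> (0.97, 0.2, 0.63)
```

Neurons beyond the last neighbor ring receive no correction at all.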
18
Top Map
After the training process is accomplished, the
complete set of training vectors is run through
the KANN once more. In this last run, the neurons
excited by each input vector are labeled,
producing a table called the top map.
19
Weight Map
The number of weights in each neuron is equal to
the dimension m of the input vector. Hence, each
level of weights handles the data of only one
specific variable.
[Figure: a trained KANN, showing one weight level per input variable.]
20
Toroidal Topology
[Figure: toroidal topology, with the 3rd layer of neighbor neurons wrapping around the edges of the map.]
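In a toroidal topology the neighbor counting wraps around the edges of the map, so no neuron sits on a border. A minimal sketch (illustrative function name, Chebyshev distance assumed):

```python
# Sketch: topological distance on a toroidal (wrap-around) map,
# where neighborhoods continue across the map edges.
def toroidal_distance(rows, cols, a, b):
    """Chebyshev distance between neurons a and b with wrap-around."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return max(min(dr, rows - dr), min(dc, cols - dc))

# On a 10x10 torus, opposite corners are direct neighbors:
d = toroidal_distance(10, 10, (0, 0), (9, 9))   # -> 1
```

On a non-toroidal map the same two neurons would be 9 steps apart, so edge neurons would have truncated neighborhoods.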
21
Analytical Applications
  • Classification and reaction monitoring
  • Classification of photochemical and metabolic
    reactions by Kohonen self-organizing maps is
    demonstrated.
  • Changes in the 1H NMR spectrum of a mixture
    are interpreted in terms of the chemical
    reactions taking place.
  • The difference between the 1H NMR spectra of
    the products and the reactants, used as a
    reaction descriptor, was introduced as the
    input vector to a Kohonen self-organizing map.

22
Dataset: Photochemical cycloadditions. This was
partitioned into a training set of 147 reactions
and a test set of 42 reactions, all manually
classified into seven classes. The 1H NMR spectra
were simulated from the molecular structures by
SPINUS.
  • Input variables: reaction descriptors derived
    from 1H NMR spectra.
  • Topology: toroidal, 13x13 and 15x15 for
    photochemical reactions and 29x29 for metabolic
    reactions.
  • Neighborhood scaling function: linearly
    decreasing triangular, with a learning rate of
    0.1 to 0 over 50-100 epochs.
  • Winning-neuron selection criterion: Euclidean
    distance.
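The linearly decreasing learning rate in the settings above (from 0.1 down to 0 over the training run) can be sketched as follows (illustrative helper; 50 epochs is one of the counts the slide mentions):

```python
# Sketch of a learning rate that decreases linearly from 0.1 to 0
# over the training run.
def learning_rate(epoch, total_epochs, lr_start=0.1, lr_end=0.0):
    frac = epoch / (total_epochs - 1) if total_epochs > 1 else 1.0
    return lr_start + (lr_end - lr_start) * frac

rates = [learning_rate(e, 50) for e in range(50)]
# rates[0] == 0.1 and rates[-1] == 0.0
```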

23
After the predictive models for the
classification of chemical reactions were
established on the basis of simulated NMR data,
their applicability to reaction data from mixed
sources (experimental and simulated) was
evaluated.
24
A second dataset: 911 metabolic reactions
catalyzed by transferases, classified into eight
subclasses according to the Enzyme Commission
(E.C.) system.
In the resulting surface for such a SOM, each
neuron is colored according to the Enzyme
Commission subclass of the reactions activating
it, that is, the second digit of the EC number.
25
For photochemical reactions, correct
classifications were obtained by the SOMs for
94-99% of the training set and for 81-88% of the
test set.
For metabolic reactions, 94-96% correct
predictions were achieved by the SOMs. The test
set was predicted with 66-67% accuracy by
individual SOMs.
26
Analytical Applications
  • QSAR / QSTR

A general problem in QSAR modeling is the
selection of the most relevant descriptors.
27
  • Descriptor clustering
  • Calibration and test set selection

28
References
  • Chemometrics and Intelligent Laboratory
    Systems 38 (1997) 1-23
  • Neural Networks for Chemists: An Introduction
    (VCH, Weinheim)
  • Analytical Chemistry 79 (2007) 854-862
  • Current Computer-Aided Drug Design 1 (2005)
    73-78
  • Acta Chimica Slovenica (1994) 327-352

29
Thanks