Title: Optimizing number of hidden neurons in neural networks
Slide 1: Optimizing Number of Hidden Neurons in Neural Networks
IASTED International Conference on Artificial Intelligence and Applications, Innsbruck, Austria, February 2007
Janusz A. Starzyk, School of Electrical Engineering and Computer Science, Ohio University, Athens, Ohio, U.S.A.
Slide 2: Outline
- Neural networks: multi-layer perceptron
- Overfitting problem
- Signal-to-noise ratio figure (SNRF)
- Optimization using the signal-to-noise ratio figure
- Experimental results
- Conclusions
Slide 3: Neural networks - multi-layer perceptron (MLP)
Slide 4: Neural networks - multi-layer perceptron (MLP)
- Efficient mapping from inputs to outputs
- Powerful universal function approximator
- Number of inputs and outputs determined by the data
- Number of hidden neurons determines the fitting accuracy → critical
Slide 5: Overfitting problem
- Generalization
- Overfitting overestimates the function complexity and degrades generalization capability
- Bias/variance dilemma
- Excessive hidden neurons → overfitting
Slide 6: Overfitting problem
- Avoiding overfitting: cross-validation with early stopping
- All available training data (x, y) are split into training data (x, y) and testing data (x, y); MLP training gives the training error e_train, and MLP testing gives the testing error e_test
- [Figure: fitting error vs. number of hidden neurons; e_train keeps decreasing, while e_test reaches its minimum at the optimum number of hidden neurons]
- Stopping criterion: e_test starts to increase, or e_train and e_test start to diverge
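The early-stopping procedure on this slide can be sketched in a small runnable example. This is only an illustration under assumptions not in the slides: polynomial degree stands in for the number of hidden neurons, and the data are a synthetic noisy sine rather than the slides' datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data: a smooth target plus noise (illustrative stand-in).
x = np.linspace(-1.0, 1.0, 200)
y = 0.4 * np.sin(np.pi * x) + 0.5 + rng.normal(scale=0.05, size=x.size)

# Hold out part of the data as a test set (the "wasted" data on slide 7).
idx = rng.permutation(x.size)
tr, te = idx[:150], idx[150:]

best_deg, best_err = None, np.inf
for deg in range(1, 13):                  # model capacity stands in for hidden-neuron count
    coef = np.polyfit(x[tr], y[tr], deg)  # "training" on the training split
    e_test = np.mean((np.polyval(coef, x[te]) - y[te]) ** 2)
    if e_test < best_err:                 # track where e_test bottoms out
        best_deg, best_err = deg, e_test

print(best_deg, best_err)
```

The test error decreases while capacity is still needed, then flattens or rises once the model starts fitting noise; the minimum marks the "optimum number" on the slide's plot.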
Slide 7: Overfitting problem
- How should the available data be divided?
- [Figure: all available training data (x, y) are split into training data and testing data, so the testing portion is wasted for training; fitting-error plot of e_train and e_test vs. number of hidden neurons, with the optimum number marked]
- Can the test error capture the generalization error?
Slide 8: Overfitting problem
- Desired:
  - A quantitative measure of the unlearned useful information contained in e_train
  - Automatic recognition of overfitting
Slide 9: Signal-to-noise ratio figure (SNRF)
- Sampled data = function value + noise
- Error signal = approximation error component + noise component
  - Noise part: should not be learned
  - Useful signal: should be reduced
- Assumptions: continuous function, white Gaussian noise (WGN)
- Signal-to-noise ratio figure (SNRF) = signal energy / noise energy
- Compare SNRF_e and SNRF_WGN
- Learning should continue if there is useful signal left unlearned; learning should stop if noise dominates in the error signal
Slide 10: Signal-to-noise ratio figure (SNRF) - one-dimensional case
- How can the levels of the two components (the approximation error component and the noise component) be measured?
Slide 11: Signal-to-noise ratio figure (SNRF) - one-dimensional case
- High correlation between neighboring samples of the signal component; neighboring samples of WGN are uncorrelated
Slide 12: Signal-to-noise ratio figure (SNRF) - one-dimensional case
Slide 13: Signal-to-noise ratio figure (SNRF) - one-dimensional case
- Hypothesis test at the 5% significance level
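A minimal runnable sketch of the one-dimensional idea (an assumed estimator, not necessarily the paper's exact formulas): the lag-1 correlation of the error signal estimates the energy of the smooth, unlearned component, and the 5%-significance threshold for WGN is estimated by Monte Carlo rather than analytically.

```python
import numpy as np

def snrf_1d(e):
    """SNRF of a 1-D error signal: neighboring samples of a smooth
    (unlearned) component are highly correlated, while WGN samples
    are not, so the lag-1 correlation estimates the signal energy."""
    s = np.dot(e[:-1], e[1:])   # signal-energy estimate (neighbor correlation)
    total = np.dot(e, e)        # total energy = signal + noise
    return s / (total - s)

def wgn_threshold(n, trials=2000, significance=0.05, seed=0):
    """Threshold that pure WGN exceeds only 5% of the time,
    estimated by Monte Carlo (a stand-in for the analytic test)."""
    rng = np.random.default_rng(seed)
    vals = [snrf_1d(rng.normal(size=n)) for _ in range(trials)]
    return float(np.quantile(vals, 1.0 - significance))

n = 400
rng = np.random.default_rng(1)
x = np.linspace(0, 4 * np.pi, n)
noise_only = rng.normal(size=n)                      # error = pure noise
with_signal = 0.8 * np.sin(x) + rng.normal(size=n)   # error still contains signal

thr = wgn_threshold(n)
print(thr, snrf_1d(noise_only), snrf_1d(with_signal))
```

An error signal that still contains a smooth component scores well above the WGN threshold, while a pure-noise error stays near zero.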
Slide 14: Signal-to-noise ratio figure (SNRF) - multi-dimensional case
- Signal and noise levels are estimated within the neighborhood of each sample p, using its M nearest neighbors
Slide 15: Signal-to-noise ratio figure (SNRF) - multi-dimensional case
- The per-sample estimates are combined over all samples
Slide 16: Signal-to-noise ratio figure (SNRF) - multi-dimensional case
- With M = 1, the multi-dimensional threshold reduces to the one-dimensional threshold
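One way the M-neighbor idea can be carried into multiple dimensions is sketched below. The neighbor-averaging here is an assumption for illustration, not the paper's exact estimator: each sample's error is correlated with the mean error of its M nearest neighbors in input space, mirroring the 1-D neighboring-sample correlation.

```python
import numpy as np

def snrf_md(X, e, M=4):
    """Multi-dimensional SNRF sketch: correlate each sample's error
    with the errors of its M nearest neighbors in input space."""
    n = len(e)
    # Pairwise Euclidean distances between input samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbor
    signal = 0.0
    for p in range(n):
        nbrs = np.argsort(d[p])[:M]      # M nearest neighbors of sample p
        signal += e[p] * e[nbrs].mean()  # neighborhood correlation at p
    total = np.dot(e, e)
    return signal / (total - signal)

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))                                  # 2-D input samples
smooth = np.sin(2 * np.pi * X[:, 0]) + np.cos(2 * np.pi * X[:, 1])  # unlearned signal
noise = rng.normal(size=200)                                    # pure-noise error
print(snrf_md(X, smooth), snrf_md(X, noise))
```

As in the 1-D case, a smooth error component produces a large SNRF while pure noise scores near zero; with M = 1 and samples ordered on a line, this reduces to the lag-1 estimator.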
Slide 17: Optimization using SNRF
- Stopping criterion: SNRF_e < threshold determined from SNRF_WGN
- Start with a small network
- Train the MLP → e_train
- Compare SNRF_e with SNRF_WGN
- If useful signal remains, add hidden neurons and repeat
- When noise dominates in the error signal, little information is left unlearned and learning should stop
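The growth loop above can be sketched end-to-end. Everything below is illustrative and rests on stated assumptions: the trainer is a random-hidden-layer least-squares shortcut (an ELM-style stand-in, NOT the paper's back-propagation), the lag-1 SNRF estimator follows the 1-D sketch, and 1.645/sqrt(n) is the one-sided 5% normal quantile for the SNRF of WGN (whose statistic is approximately N(0, 1/n)).

```python
import numpy as np

def snrf_1d(e):
    """Lag-1 SNRF of the training-error signal (see the 1-D sketch)."""
    s = np.dot(e[:-1], e[1:])
    return s / (np.dot(e, e) - s)

def train_mlp(x, y, n_hidden, seed=0):
    """Stand-in trainer: random tanh hidden layer plus least-squares
    output weights (NOT the paper's back-propagation training)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=n_hidden)              # hidden input weights
    b = rng.uniform(-3.0, 3.0, size=n_hidden)  # hidden biases
    H = np.tanh(np.outer(x, W) + b)            # hidden activations
    H = np.column_stack([H, np.ones(len(x))])  # plus output bias
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return y - H @ beta                        # training-error signal e_train

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2 * np.pi, 300))
y = 0.4 * np.sin(x) + 0.5 + rng.normal(scale=0.1, size=x.size)

threshold = 1.645 / np.sqrt(len(x))  # 5% one-sided quantile of SNRF for WGN
n_hidden = 1
while n_hidden < 50:
    e = train_mlp(x, y, n_hidden)
    if snrf_1d(e) < threshold:       # noise dominates: stop growing the network
        break
    n_hidden += 1                    # useful signal left: add a hidden neuron
print(n_hidden)
```

Note that only the training error is examined: no data are held out for testing, which is the point of the criterion.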
Slide 18: Optimization using SNRF
- The same criterion applies to optimizing the number of iterations in back-propagation training, to avoid overfitting (overtraining):
- Set the structure of the MLP
- Train the MLP with back-propagation iterations → e_train
- Compare SNRF_e with SNRF_WGN
- Keep training with more iterations until the stopping criterion is met
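The iteration-count version can be sketched the same way. Purely illustrative assumptions again: a linear model trained by gradient descent replaces the MLP and back-propagation, and the lag-1 SNRF estimator with the 1.645/sqrt(n) WGN threshold follows the earlier 1-D sketch.

```python
import numpy as np

def snrf_1d(e):
    """Lag-1 SNRF of the training-error signal (see the 1-D sketch)."""
    s = np.dot(e[:-1], e[1:])
    return s / (np.dot(e, e) - s)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Gradient-descent fit of y ≈ a*x + b, checking the SNRF of the training
# error at every iteration instead of watching a held-out test set.
a = b = 0.0
lr = 0.1
threshold = 1.645 / np.sqrt(len(x))  # 5% one-sided WGN quantile
for it in range(1, 2001):
    e = y - (a * x + b)              # current training-error signal
    if snrf_1d(e) < threshold:       # noise dominates: stop iterating
        break
    a += lr * np.mean(e * x)         # gradient step on the squared error
    b += lr * np.mean(e)
print(it, a, b)
```

Early in training the residual still contains the linear trend and scores far above the threshold; once the fit converges, the residual is essentially noise and the loop stops, without any validation split.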
Slide 19: Experimental results
- Optimizing the number of iterations
- Target: noise-corrupted 0.4·sin(x) + 0.5
Slide 20: Optimization using SNRF
- Optimizing the order of a polynomial
Slide 21: Experimental results
- Optimizing the number of hidden neurons
- Two-dimensional test function
Slide 22: Experimental results
Slide 23: Experimental results
- Mackey-Glass database
- Every 7 consecutive samples → predict the following sample with the MLP
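The windowing step ("every 7 consecutive samples → the following sample") can be sketched with a generic sliding-window helper; the series below is a placeholder, not the actual Mackey-Glass data.

```python
import numpy as np

def make_windows(series, w=7):
    """Turn a 1-D series into (X, y) pairs: each row of X holds w
    consecutive samples and y is the sample that follows them."""
    X = np.array([series[i:i + w] for i in range(len(series) - w)])
    y = series[w:]
    return X, y

series = np.arange(20, dtype=float)  # placeholder series standing in for Mackey-Glass
X, y = make_windows(series)
print(X.shape, y.shape)              # → (13, 7) (13,)
```

The resulting (X, y) pairs are then fed to the MLP as an ordinary regression problem.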
Slide 24: Experimental results
- WGN characteristic
Slide 25: Experimental results
- Puma robot arm dynamics database
- 8 inputs (positions, velocities, torques) → angular acceleration, predicted by the MLP
Slide 26: Conclusions
- A quantitative criterion based on the SNRF to optimize the number of hidden neurons in an MLP
- Overfitting is detected from the training error only
- No separate test set is required
- The criterion is simple, easy to apply, efficient, and effective
- Applicable to optimizing other parameters of neural networks or other fitting problems