Title: Advanced Data Mining
Part III
- Advanced Data Mining Techniques
Neural Networks
8.1 Feed-Forward Neural Networks
Neural Network Input Format
- Input values must lie in the [0,1] range, so attribute values require conversion
- Numeric values are straightforward to scale
- Categorical values are trickier to encode
Neural Network Output Format
- Output values lie in the [0,1] range
- A numeric output value should be converted back to its real-world value
- Binary attribute: Yes -> 1, No -> 0
- What if the output is 0.5: Yes or No?
- Solution (see the sketch below)
  - Feed the test data to the network and record the outputs
  - Feed a new instance x, giving output v
  - Classify x with the majority class of the test instances whose outputs cluster at or near v
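A minimal sketch of this majority-vote disambiguation in Python with NumPy; the neighborhood width tol is an illustrative assumption, not a value from the slides:

import numpy as np

def classify_by_neighbors(v, test_outputs, test_labels, tol=0.05):
    # Majority class among test instances whose recorded network
    # outputs fall near the new instance's output v.
    test_outputs = np.asarray(test_outputs)
    test_labels = np.asarray(test_labels)
    near = np.abs(test_outputs - v) <= tol
    if not near.any():                      # fall back to the single nearest output
        return test_labels[np.argmin(np.abs(test_outputs - v))]
    labels, counts = np.unique(test_labels[near], return_counts=True)
    return labels[np.argmax(counts)]        # majority vote

print(classify_by_neighbors(0.5, [0.46, 0.52, 0.55, 0.91],
                            ["No", "Yes", "Yes", "Yes"]))   # prints Yes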
The Sigmoid Function
Returns values in the (0,1) range. When the node is sufficiently excited, its output is close to 1.
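The function itself, as a short Python sketch (NumPy assumed): f(x) = 1 / (1 + e^(-x)).

import numpy as np

def sigmoid(x):
    # Logistic sigmoid: squashes any real input into the (0,1) range.
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.25))   # ~0.562: moderately excited node
print(sigmoid(5.0))    # ~0.993: sufficiently excited, output close to 1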
8.2 Neural Network Training: A Conceptual View
Supervised Learning with Feed-Forward Networks
- Backpropagation Learning
- Genetic Learning
Backpropagation
- Principle
  - For each instance, there is an error between the computed and the actual output value
  - The weights (or the topology) are to blame for this error, so adjust the weights on the paths from the input nodes to the current output node; propagating the error backward in this way is called backpropagation
  - Reiterate with another instance
- With enough iterations, backpropagation is guaranteed to converge
Backpropagation Learning
1. Initialize the network
  - Create the network topology
  - Initialize the weights randomly between -1 and 1
  - Choose a learning parameter between 0 and 0.1
  - Choose a termination condition (maximum epochs or minimum rms error)
2. For all training set instances
  - Feed the instance to the network
  - Determine the output error
  - Update the weights
3. If the termination condition is not met, repeat step 2
4. Test accuracy using the test dataset; if it is less than optimal, change the topology and start over (a minimal code sketch follows)
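A minimal Python sketch of steps 1 to 3 for a single-hidden-layer network without bias nodes; the parameter defaults (r, max_epochs, min_rms) are illustrative assumptions:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_backprop(X, targets, n_hidden=2, r=0.05, max_epochs=1000, min_rms=0.05):
    rng = np.random.default_rng(0)
    W_ih = rng.uniform(-1, 1, (X.shape[1], n_hidden))   # input-to-hidden weights in [-1,1]
    W_ho = rng.uniform(-1, 1, n_hidden)                 # hidden-to-output weights in [-1,1]
    for epoch in range(max_epochs):
        sq_err = 0.0
        for x, t in zip(X, targets):
            h = sigmoid(x @ W_ih)                       # hidden-layer outputs
            o = sigmoid(h @ W_ho)                       # network output
            err_o = (t - o) * o * (1 - o)               # output-layer error
            err_h = err_o * W_ho * h * (1 - h)          # errors propagated back to hidden layer
            W_ho += r * err_o * h                       # delta-rule weight updates
            W_ih += r * np.outer(x, err_h)
            sq_err += (t - o) ** 2
        if np.sqrt(sq_err / len(X)) < min_rms:          # rms termination condition
            break
    return W_ih, W_ho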
Genetic Learning to Train an ANN
- Randomly create a population of k solutions (each solution is a set of weights for the ANN)
- For each solution si in the population
  - Assign the weights in si to the ANN
  - Feed the training data through and record the outputs
  - Compute the average squared error for the pass over the training set
  - This error is the fitness score for si
- Keep the minimum-error solutions as the good population elements; replace the bad ones using mutation, crossover, or selection
- Repeat until a termination condition is met (minimum error or maximum iterations); a sketch follows this list
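A sketch of this evolutionary loop in Python under assumed settings (population size k, one-point crossover, Gaussian mutation of scale 0.1); the slides do not fix these choices:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fitness(weights, X, targets, n_hidden=2):
    # Average squared error of the network using this weight vector.
    n_in = X.shape[1]
    W_ih = weights[: n_in * n_hidden].reshape(n_in, n_hidden)
    W_ho = weights[n_in * n_hidden :]
    outputs = sigmoid(sigmoid(X @ W_ih) @ W_ho)
    return np.mean((targets - outputs) ** 2)

def genetic_train(X, targets, k=20, n_hidden=2, max_iter=200, min_err=0.01):
    rng = np.random.default_rng(0)
    n_w = X.shape[1] * n_hidden + n_hidden
    pop = rng.uniform(-1, 1, (k, n_w))                  # k random weight-vector solutions
    for _ in range(max_iter):
        scores = np.array([fitness(p, X, targets, n_hidden) for p in pop])
        order = np.argsort(scores)                      # low error = high fitness
        if scores[order[0]] < min_err:                  # termination condition
            break
        elite = pop[order[: k // 2]]                    # selection: keep the best half
        children = []
        for _ in range(k - len(elite)):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n_w)                  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child += rng.normal(0, 0.1, n_w)            # mutation
            children.append(child)
        pop = np.vstack([elite] + children)             # replace the bad elements
    scores = np.array([fitness(p, X, targets, n_hidden) for p in pop])
    return pop[np.argmin(scores)]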
Unsupervised Clustering with Self-Organizing Maps
SOM
- Contains 2 layers
- The number of nodes in the output layer is the maximum number of clusters in the data
- Fully connected
- The output layer can be two-dimensional, e.g. for image processing
SOM
- For each training instance
  - Feed it to the SOM
  - The output node o whose weights most closely match the instance wins the instance
  - The count of won instances is incremented for o
  - The weights of o are adjusted
  - (At the beginning the neighboring nodes' weights are adjusted as well, but after many passes over the training set only the winning node's weights are adjusted)
- The output nodes that win instances are the clusters
- Output nodes that win no or very few instances are deleted
- The training instances are fed to the SOM one last time; the new winning output nodes then assign a cluster to any instance left without one (a training sketch follows)
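A minimal Python sketch of this procedure with a one-dimensional output layer; neighborhood updates are omitted (as the slides note, late training adjusts only the winner), and n_out, r, and min_wins are assumed parameters:

import numpy as np

def train_som(X, n_out=3, r=0.5, n_epochs=20, min_wins=1):
    rng = np.random.default_rng(0)
    W = rng.uniform(0, 1, (n_out, X.shape[1]))     # one weight vector per output node
    for _ in range(n_epochs):
        for x in X:
            d = np.linalg.norm(W - x, axis=1)      # closeness score for each output node
            o = np.argmin(d)                       # node o wins the instance
            W[o] += r * (x - W[o])                 # adjust the winner's weights toward x
    wins = np.bincount([int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X],
                       minlength=n_out)
    W = W[wins >= min_wins]                        # delete nodes with no or very few wins
    # One last pass: assign every instance to its surviving winning node.
    clusters = [int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X]
    return W, clusters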
8.3 Neural Network Explanation
- Transform the network architecture into a set of rules
- Sensitivity Analysis
- Average Member Technique
- Use supervised techniques to evaluate unsupervised clustering
Sensitivity Analysis
- Applied to gain insight into the effect individual attributes have on the network output
- Allows us to determine a rank ordering for the relative importance of the attributes
Sensitivity Analysis
- Divide the data into training and test datasets
- Train the network with the training data
- Use the test set to create a new instance I: each attribute value of I is the average of that attribute's values in the test data
- For each attribute
  - Vary the attribute's value within instance I and feed the modified I to the network
  - Determine the effect the variations have on the output
- The relative importance of each attribute is measured by its effect on the network output (see the sketch below)
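A sketch of this procedure in Python; predict stands for any trained network's forward function, and n_steps is an assumed sampling parameter:

import numpy as np

def sensitivity_analysis(predict, X_test, n_steps=10):
    I = X_test.mean(axis=0)                        # the average instance I
    importance = []
    for a in range(X_test.shape[1]):
        lo, hi = X_test[:, a].min(), X_test[:, a].max()
        outs = []
        for v in np.linspace(lo, hi, n_steps):     # vary only attribute a within I
            probe = I.copy()
            probe[a] = v
            outs.append(predict(probe))
        importance.append(max(outs) - min(outs))   # output swing caused by attribute a
    return np.argsort(importance)[::-1]            # attributes, most important first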
Average Member Technique
- Compute the average or most typical instance of each class by finding the average value for each class attribute (see the sketch below)
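A minimal Python sketch; X is a numeric attribute matrix, y the class labels, and the data values are made up for illustration:

import numpy as np

def average_members(X, y):
    # Per-class mean of every attribute: the "average member" of each class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

X = np.array([[0.2, 0.9], [0.4, 0.7], [0.8, 0.1], [0.6, 0.3]])
y = np.array(["yes", "yes", "no", "no"])
print(average_members(X, y))   # {'no': [0.7, 0.2], 'yes': [0.3, 0.8]}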
Supervised Technique to Evaluate Unsupervised Clustering
- Feed the data to the network
- Make each cluster a class and assign it a name
- Use the classes as a training set, perform supervised learning, and create a set of rules
- Examine the rule set to determine the nature of the clusters formed by unsupervised learning (sketched below)
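A sketch using a decision tree as the supervised rule generator; the slides do not prescribe a specific learner, and scikit-learn is an assumed dependency here:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def explain_clusters(X, cluster_labels, feature_names):
    # Treat each cluster as a class, fit a supervised learner on it,
    # and read off the induced rules to characterize the clusters.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, cluster_labels)
    return export_text(tree, feature_names=feature_names)

# cluster_labels would come from the SOM pass above, e.g.:
# print(explain_clusters(X, clusters, ["income", "age"]))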
8.4 General Considerations
- What input attributes will be used to build the network?
- How will the network output be represented?
- How many hidden layers should the network contain?
- How many nodes should there be in each hidden layer?
- What condition will terminate network training?
Neural Network Strengths
- Work well with noisy data.
- Can process numeric and categorical data.
- Appropriate for applications requiring a time element.
- Have performed well in several domains.
- Appropriate for both supervised learning and unsupervised clustering.
Weaknesses
- Lack explanation capabilities.
- May not provide optimal solutions to problems.
- Overtraining can be a problem.
8.5 Neural Network Training: A Detailed View
The Backpropagation Algorithm: An Example
Backpropagation Example
- Use the network of Figure 8.1: 3 input nodes, 1 hidden layer (2 nodes, i and j), and 1 output node k; the first instance of Table 8.1 has input values (1.0, 0.4, 0.7) and target output 0.65
- Output(j) = f(0.2 x 1.0 + 0.3 x 0.4 - 0.1 x 0.7) = f(0.25) = 0.562
- Output(i) = f(0.1 x 1.0 - 0.1 x 0.4 + 0.2 x 0.7) = f(0.20) = 0.550
- Output(k) = f(0.1 x 0.562 + 0.5 x 0.550) = f(0.331) = 0.582
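A short Python check of this forward pass; the weight values are read off the sums above, and f is the sigmoid:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([1.0, 0.4, 0.7])             # first instance of Table 8.1
w_j = np.array([0.2, 0.3, -0.1])          # input weights into hidden node j
w_i = np.array([0.1, -0.1, 0.2])          # input weights into hidden node i
w_k = np.array([0.1, 0.5])                # (j, i) weights into output node k

out_j = sigmoid(x @ w_j)                  # 0.562
out_i = sigmoid(x @ w_i)                  # 0.550
out_k = sigmoid(np.array([out_j, out_i]) @ w_k)   # 0.582
print(round(out_j, 3), round(out_i, 3), round(out_k, 3))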
Backpropagation Error: Output Layer
Error(k) = (Target(k) - Output(k)) x Output(k) x (1 - Output(k))
Error(k) = (0.65 - 0.582)(0.582)(1 - 0.582) = 0.017
Backpropagation Error: Hidden Layer
Error(j) = Error(k) x wjk x Output(j) x (1 - Output(j))
With Output(j) = 0.562, wjk = 0.1, and Error(k) = 0.017:
Error(j) = (0.017)(0.1)(0.562)(1 - 0.562) = 0.00042
The Delta Rule
Delta wjk = r x Error(k) x Output(j), where r is the learning rate; the rule is instantiated numerically below.
Backpropagation: Weight Adjustment
- r = 0.5
- Delta wjk = 0.5 x 0.017 x 0.562 = 0.0048
- New wjk = 0.1 + 0.0048 = 0.1048
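The same delta-rule update as a three-line Python check:

r, err_k, out_j, w_jk = 0.5, 0.017, 0.562, 0.1
delta = r * err_k * out_j                        # the delta rule
print(round(delta, 4), round(w_jk + delta, 4))   # 0.0048 0.1048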
Root Mean Squared Error
The termination condition is a minimum degree of learning, measured by the rms error.
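A minimal sketch of the rms computation over one pass of the training data:

import numpy as np

def rms_error(targets, outputs):
    # Root mean squared error between target and computed outputs.
    targets, outputs = np.asarray(targets), np.asarray(outputs)
    return np.sqrt(np.mean((targets - outputs) ** 2))

print(rms_error([0.65], [0.582]))   # 0.068 for the single example above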
Kohonen Self-Organizing Maps: An Example
SOM Example
- For each instance i
  - For each output node o
    - Calculate the closeness score between i and o (the Euclidean distance between the instance and the node's weight vector)
  - The output node with the minimum score wins i and gets its weights updated
Classifying a New Instance: Output Node j Wins
Input instance: (0.4, 0.7)
Score(i) = ((0.4 - 0.2)^2 + (0.7 - 0.1)^2)^0.5 = 0.632
Score(j) = ((0.4 - 0.3)^2 + (0.7 - 0.6)^2)^0.5 = 0.141
The winner is j; j is rewarded, and its weights are adjusted.
Adjusting the Weight Vectors: Output Node j
r = 0.5
Delta w1j = 0.5 x (0.4 - 0.3) = 0.05
New w1j = 0.3 + 0.05 = 0.35
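A short Python check of this SOM example; the slide adjusts only w1j, and the code applies the same rule to the second weight component as well:

import numpy as np

x = np.array([0.4, 0.7])                   # new input instance
w_i = np.array([0.2, 0.1])                 # weight vector of output node i
w_j = np.array([0.3, 0.6])                 # weight vector of output node j

score_i = np.linalg.norm(x - w_i)          # 0.632
score_j = np.linalg.norm(x - w_j)          # 0.141, so j wins
r = 0.5
w_j += r * (x - w_j)                       # reward the winner: move it toward x
print(round(score_i, 3), round(score_j, 3), w_j)   # 0.632 0.141 [0.35 0.65]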