Title: DATA MINING: Meaning, motivation, methodology and models
1 DATA MINING: Meaning, motivation, methodology and models
Owotoki Peter O. FamilyNameAttu-harburg.de
2 OUTLINE
- Meaning of Data Mining
- Motivation for Data Mining
- Methodology of Data Mining
- Models for Data Mining
4 Meaning of Data Mining
What is Data Mining?
Knowledge is the new gold. Data mining
helps us to extract it from data.
It is the nontrivial and proactive process of
extracting VALID, COMPREHENSIBLE and
INTERESTING knowledge from data
6 Motivation for Data Mining
Why is data mining necessary today?
- Technical Necessity
  - Digital revolution -> explosion of data generation
  - Cheaper hardware -> easy storage
  - Price of storage has fallen from 10/Mbyte in 1990 to 0.1c/Mbyte today
  - Increasing complexity: efficacy of Moore's Law
- Business Necessity
  - Increasing complexity: manufacturing, maintenance
  - More demanding customers
  - Fiercer and more competent competition
  - Regulatory
We are drowning in data, but starving for
knowledge!
7 Motivation for Data Mining
Data mining provides competitive advantage in the
knowledge economy
It does this by providing the maximum knowledge
needed to rapidly make valuable business
decisions despite the enormous amounts of
available data
8 Motivation for Data Mining
Measurable benefits from Data Mining have been
achieved in many different domains
- Fraud management: e.g. telecommunications, financial, insurance industries
- Market analysis: customer, competition and trend analyses
- Product development: biotechnology, pharmaceutical industry
- Entertainment: digital convergence, sports
- Diagnosis and monitoring: medical, aerospace, automotive
Interesting emerging application areas are in
the implementation of intelligent agents with a
data-mined core in embedded digital systems
10 Methodology of Data Mining
Life cycle of Data Mining projects
- Business understanding: understanding project objectives from a business perspective, data mining problem definition
- Data understanding: initial data collection, get familiar with data
- Data preparation: construct final dataset from raw data
- Modeling: select and apply modeling techniques
- Evaluation: evaluate model, decide on further deployment
- Deployment: create report, carry out actions based on new insights
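The middle phases of the life cycle (data preparation, modeling, evaluation) can be sketched in a few lines of Python. This is only an illustrative toy, not part of the original slides: the function names `prepare`, `predict_1nn` and `evaluate`, the sample records, and the choice of a nearest-neighbor model are all assumptions. For brevity it scales the data before splitting; a real project would fit the scaling on the training part only.

```python
def prepare(raw_rows):
    """Data preparation: drop incomplete records, then min-max scale each feature."""
    rows = [(f, y) for f, y in raw_rows if None not in f]
    cols = list(zip(*(f for f, _ in rows)))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        ([(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(f, lo, hi)], y)
        for f, y in rows
    ]


def predict_1nn(train, features):
    """Modeling: label of the closest training record (squared Euclidean distance)."""
    nearest = min(train, key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], features)))
    return nearest[1]


def evaluate(train, test):
    """Evaluation: fraction of held-out records that are labeled correctly."""
    hits = sum(predict_1nn(train, f) == y for f, y in test)
    return hits / len(test)


# Hypothetical raw data: (features, label); one record is incomplete.
raw = [
    ((1.0, 200.0), "low"), ((1.2, 220.0), "low"), ((None, 210.0), "low"),
    ((3.0, 900.0), "high"), ((3.2, 950.0), "high"),
    ((1.1, 210.0), "low"), ((3.1, 920.0), "high"),
]
data = prepare(raw)
train, test = data[:4], data[4:]
accuracy = evaluate(train, test)
```

The evaluation result is what feeds the decision on further deployment in the last two phases.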
12 Models for Data Mining
Functions provided by Data Mining Models
Data Mining Functions
- Predictive
  - Classification
  - Prediction and Regression
  - Time Series Analysis
- Descriptive
  - Clustering and Segmentation
  - Summarization
  - Linkage and Dependency Analysis
13 Models for Data Mining
Computational Intelligence Methods
- Artificial Neural Networks
- Fuzzy Logic
- Evolutionary Computing
- Machine Learning
- HYBRID METHODS
Machine Learning
1. Divide and Conquer Methods: decision trees, production rules
2. Instance Based Learning: nearest neighbor, case based reasoning
3. Reinforcement Learning
4. Statistical Methods: Bayesian, Monte Carlo etc.
5. Support Vector Machines: SVM, kernel methods, PCA, ICA
Artificial Neural Networks
1. Adaptive Resonance Theory
2. Back Propagation Learning
3. Hopfield's Associative Memory
4. Kohonen's Self Organizing Maps
5. Pulsed Neural Networks
6. Radial Basis Functions
7. Real Time Recurrent Learning
14 Models for Data Mining
Functions provided by Data Mining Models
15 Models for Data Mining
Adaptive Resonance Theory
Motivation is the Stability-Plasticity Dilemma
(Grossberg 1987)
How can a learning system be designed to remain
plastic, or adaptive, in response to significant
events and yet remain stable in response to
irrelevant events? How does the system know how
to switch between its stable and its plastic
modes to achieve stability without rigidity and
plasticity without chaos? In particular, how can
it preserve its previously learned knowledge
while continuing to learn new things? And what
prevents the new learning from washing away the
memories of prior learning? (Tauritz 1995)
16 Models for Data Mining
Adaptive Resonance Theory
ART1 Architecture
17 Models for Data Mining
Adaptive Resonance Theory
ART Algorithm
1. Initialisation
   - Initialise the number N of categories
   - Initialise every prototype vector (category) Pi, i ∈ [1, N], to the unitary vector
   - Initialise the vigilance parameter ρ ∈ [0, 1]
2. Apply input
   - Get the next input vector Xj
   - Enable all outputs
3. Compute the activation T for every enabled category
4. Select the category Pi with maximum T; if the set of enabled categories is empty, goto 2
5. Check for resonance
6. If the resonance equation is false, disable output Pi and goto 4
7. Adjust the winning category; β ∈ [0, 1] is the learning rate
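The steps above can be sketched in Python as follows. The slide does not spell out the activation, resonance and update equations, so this sketch assumes the usual Fuzzy-ART-style forms: activation T = |X ∧ P| / (α + |P|), resonance test |X ∧ P| / |X| ≥ ρ, and update P ← β (X ∧ P) + (1 − β) P, where ∧ is the component-wise minimum and |·| the sum of components. The function names and the small choice parameter `alpha` are illustrative assumptions, and inputs are assumed to be non-zero vectors with components in [0, 1].

```python
def fuzzy_and(a, b):
    """Component-wise minimum of two vectors."""
    return [min(x, y) for x, y in zip(a, b)]


def present(prototypes, x, rho, beta, alpha=1e-6):
    """Present one input x (steps 2-7); return the resonating category index or None.

    prototypes is modified in place when a category resonates.
    Assumes x is non-zero so that the resonance ratio is defined.
    """
    enabled = set(range(len(prototypes)))            # step 2: enable all outputs
    while enabled:                                   # empty set: move on to next input
        # Step 3: activation of every enabled category (assumed formula).
        activation = {
            i: sum(fuzzy_and(x, prototypes[i])) / (alpha + sum(prototypes[i]))
            for i in enabled
        }
        # Step 4: winner with maximum activation (lowest index wins ties).
        winner = max(sorted(enabled), key=lambda i: activation[i])
        overlap = fuzzy_and(x, prototypes[winner])
        # Step 5: resonance check against the vigilance parameter rho.
        if sum(overlap) / sum(x) >= rho:
            # Step 7: move the winning prototype towards the overlap at rate beta.
            prototypes[winner] = [
                beta * o + (1 - beta) * p for o, p in zip(overlap, prototypes[winner])
            ]
            return winner
        enabled.discard(winner)                      # step 6: disable and reselect
    return None


# Step 1: N = 2 categories, every prototype initialised to the unitary vector.
categories = [[1, 1, 1, 1], [1, 1, 1, 1]]
first = present(categories, [1, 1, 0, 0], rho=0.6, beta=1.0)
second = present(categories, [0, 0, 1, 1], rho=0.6, beta=1.0)
```

A higher vigilance `rho` makes categories narrower, so more inputs fail the resonance check; with `beta` below 1 the prototypes change more slowly, which is one way to trade plasticity against stability.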
18 Models for Data Mining
Adaptive Resonance Theory
Types of ART
- ART1: Unsupervised clustering of binary input vectors.
- ART2: Unsupervised clustering of real-valued input vectors.
- ART3: Incorporates "chemical transmitters" to control the search process in a hierarchical ART structure.
- ARTMAP: Supervised version of ART that can learn arbitrary mappings of binary patterns.
- Fuzzy ART: Synthesis of ART and fuzzy logic.
- Fuzzy ARTMAP: Supervised fuzzy ART.
- dART and dARTMAP: Distributed code representations in the F2 layer (extension of the winner-take-all approach).
- Gaussian ARTMAP: Supervised ART network that uses Gaussian-defined receptive fields.
19 Models for Data Mining
Adaptive Resonance Theory
Some applications of ART
- Robotics: navigation and control
- Pattern recognition: e.g. facial recognition, signature verification, target recognition
- Land cover classification
- Medical diagnosis
- Signal processing: e.g. speech production
- Financial applications: stock market
20 Models for Data Mining
Nested Generalized Exemplar (NGE)
An Instance Based Learner
21 Models for Data Mining
Nested Generalized Exemplar (NGE)
Use generalized hyperrectangles instead of single
points
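A minimal Python sketch of this idea, assuming axis-parallel hyperrectangles stored as per-feature lower and upper bounds. The class name `Hyperrectangle` and its methods are hypothetical, chosen for illustration only; a single stored point is the degenerate case in which both bounds coincide.

```python
class Hyperrectangle:
    """Axis-parallel hyperrectangle with a class label (illustrative sketch)."""

    def __init__(self, point, label):
        # A freshly stored exemplar is a degenerate rectangle: lower == upper == point.
        self.lower = list(point)
        self.upper = list(point)
        self.label = label

    def contains(self, point):
        """True if the point lies inside or on the boundary of the rectangle."""
        return all(lo <= v <= hi for lo, v, hi in zip(self.lower, point, self.upper))

    def extend(self, point):
        """Generalize: grow the bounds just enough to cover the new point."""
        self.lower = [min(lo, v) for lo, v in zip(self.lower, point)]
        self.upper = [max(hi, v) for hi, v in zip(self.upper, point)]


exemplar = Hyperrectangle([1.0, 2.0], "A")
exemplar.extend([3.0, 1.0])   # now covers the box from (1.0, 1.0) to (3.0, 2.0)
```

Storing bounds instead of every point lets one exemplar stand for a whole region of the feature space.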
22 Models for Data Mining
Nested Generalized Exemplar (NGE)
New Euclidean distance
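One common way to define the distance between a point and a hyperrectangle is sketched below in Python. It is an assumption rather than the formula from the slide: per feature, the gap to the nearest face of the rectangle (zero if the value lies within the bounds) is divided by the feature's range, optionally weighted, and combined as in the Euclidean distance. The function name and all parameter names are illustrative.

```python
import math


def rect_distance(point, lower, upper, feat_range, w_feat=None, w_rect=1.0):
    """Weighted distance from a point to an axis-parallel hyperrectangle.

    lower/upper are the rectangle bounds, feat_range the (max - min) of each
    feature over the data, w_feat optional per-feature weights and w_rect an
    optional weight of the exemplar itself.
    """
    if w_feat is None:
        w_feat = [1.0] * len(point)
    total = 0.0
    for v, lo, hi, rng, w in zip(point, lower, upper, feat_range, w_feat):
        if v > hi:
            gap = v - hi          # point lies above the upper bound
        elif v < lo:
            gap = lo - v          # point lies below the lower bound
        else:
            gap = 0.0             # inside the bounds along this feature
        total += (w * gap / rng) ** 2
    return w_rect * math.sqrt(total)


inside = rect_distance([1.0, 1.0], [0.0, 0.0], [2.0, 2.0], [10.0, 10.0])
outside = rect_distance([5.0, 5.0], [0.0, 0.0], [2.0, 2.0], [10.0, 10.0])
```

A point inside the rectangle has distance zero, and for a degenerate rectangle (lower equal to upper) the measure reduces to a range-normalised Euclidean distance between two points.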
23 Models for Data Mining
Nested Generalized Exemplar (NGE)
Algorithm
24 Models for Data Mining
Non Nested Generalized Exemplar (NNGE)
[Figure panels: NGE, NNGE]
NNGE has been shown to improve on the performance
of NGE
25 Models for Data Mining
Non Nested Generalized Exemplar (NNGE)
Algorithm
26 Models for Data Mining
NGE and NNGE sample applications
- In recommender systems
- Recently used for the analysis of electrical networks
- Flood management
- Fault monitoring in aircraft systems