Title: Data Mining Concepts and Research Trends
1Data Mining Concepts and Research Trends
KISS-SIGDB Tutorial 1998
Do-Heon LEE, Database Laboratory, Dept. of Computer Science, Chonnam National University
1998. 5. 21.
2Table of Contents
- Definition and Motivation of Data Mining
- Classification of Data Mining Techniques
- Mining Association Rules
- Attribute Dependencies
- Database Summarization
- Data Mining Projects
- DBMiner/GeoMiner/WebMiner
- MineSet
- Data Mining and Data Warehousing
- References
3Definition of Data Mining
- Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from a large volume of actual data.
  - implicit: beyond what is stated in databases and catalogs
  - previously unknown: excludes well-known knowledge
  - potentially useful: usefulness is application-dependent
  - large volume: calls for a performance perspective
  - actual data: contains missing and erroneous data
- Some counterexamples
  - "The 3rd attribute of table EMP is SALARY."
    - Explicit information in the DB catalog
  - "Most college students have graduated from high school."
    - Well-known information, common sense
4Motivation of Data Mining Research
- Growing reliance on database systems
  - Database: an operational data collection and a useful resource reflecting domain characteristics
- Fast advance of database system technology
  - Increasing volume of data stored in databases
- Result: mining databases for useful knowledge that can be exploited in decision making
5Comparison with Machine Learning
Data Mining                       Machine Learning
--------------------------------  ------------------------
Dynamic data                      Static data
Erroneous data                    Error-free data
Uncertain data                    Exact data
Missing data                      No missing data
Coexistence of irrelevant data    Only relevant data
Immense size                      Moderate size
Structured data                   Flat collection of data
- Data mining is an actual application of machine
learning methodologies.
6Classification of Data Mining Techniques
- On knowledge types to be discovered
  - Characterization: generalized description of data characteristics
  - Classification: description of discriminating characteristics
  - Clustering: grouping data having common properties
  - Association: co-occurrence relationships among multiple events
  - Trend analysis: characterizing the evolution trend of temporal data
  - Pattern analysis: finding specified patterns in large DBs
  - The types of mining targets evolve continuously according to emerging application demands (cf. SQL evolution).
- On database types to be mined
  - relational, transactional, object-oriented, temporal, multimedia, etc.
- On techniques adopted
  - statistics, symbolic learning, neural networks, visualization, etc.
7Association Rules Definition and Applications
- QUEST project at IBM Almaden Research Center
- Association rules (among items)
  - Given a collection of transactions, each of which is {item-1, ..., item-n},
  - an association rule has the form
      {item-11, item-12, ..., item-1m} --> {item-21, item-22, ..., item-2k}
      (antecedent items)                   (consequence items)
  - The existence of an item (or items) implies the existence of other item(s) in the same transaction.

In a POS (Point-Of-Sales) data set:
  10/15/1301  coke, bread, hamburger
  10/15/1421  coke, hamburger, juice
  10/15/1425  milk, sandwich, juice
  10/15/1513  sandwich, milk, juice, bread
  10/15/1631  hamburger, juice, coke
  .....

Association rules found:
  hamburger --> coke
  sandwich, juice --> milk
These rules support decision making for shelf layout design, direct mailing, etc.

- Other applications
  - Customer usage patterns in public communication services
  - Fault co-occurrence analysis in complex systems
8Association Rules Usefulness Measures
- Two measures for identifying useful association rules
  - support (statistical significance): the fraction of all transactions that contain all items of the rule
  - confidence (rule strength): among the transactions containing the antecedent items, the fraction that also contain the consequence items

No.  Transaction                    hamburger  coke  both
1    coke, bread, hamburger         o          o     o
2    coke, hamburger, juice         o          o     o
3    milk, sandwich, juice          x          x     x
4    sandwich, milk, juice, bread   x          x     x
5    hamburger, juice, coke         o          o     o
6    coke, bread, hamburger         o          o     o
7    coke, hamburger, juice         o          o     o
8    hamburger, juice               o          x     x
9    milk, hamburger, sweater       o          x     x
10   coke, milk, juice              x          o     x
     Count                          7          6     5

For the association rule coke --> hamburger:
  support    = 5 out of 10 = 50%
  confidence = 5 out of 6  = 83%
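
The two measures can be checked directly against the ten transactions on this slide. The following is a minimal sketch; the function names `count`, `support` and `confidence` are illustrative choices, not taken from any of the cited systems.

```python
# The ten example transactions from this slide.
TRANSACTIONS = [
    {"coke", "bread", "hamburger"},
    {"coke", "hamburger", "juice"},
    {"milk", "sandwich", "juice"},
    {"sandwich", "milk", "juice", "bread"},
    {"hamburger", "juice", "coke"},
    {"coke", "bread", "hamburger"},
    {"coke", "hamburger", "juice"},
    {"hamburger", "juice"},
    {"milk", "hamburger", "sweater"},
    {"coke", "milk", "juice"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(antecedent, consequent, transactions):
    """Fraction of all transactions containing antecedent and consequent items."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions with the antecedent, the fraction that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


sup = support({"coke"}, {"hamburger"}, TRANSACTIONS)
conf = confidence({"coke"}, {"hamburger"}, TRANSACTIONS)
print(f"coke --> hamburger: support = {sup:.0%}, confidence = {conf:.0%}")
```

Note that support is symmetric in the two sides of a rule, while confidence is not: hamburger --> coke has the same support but a confidence of 5 out of 7.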
9Association Rules Mining Procedures
The first phase: finding frequent item-sets (high support). The threshold value for support is given as 40%.

Transactions:
  1.  coke, bread, hamburger
  2.  coke, hamburger, juice
  3.  milk, sandwich, juice
  4.  sandwich, milk, juice, bread
  5.  hamburger, juice, coke
  6.  coke, bread, hamburger
  7.  coke, hamburger, juice
  8.  hamburger, juice
  9.  milk, hamburger, sweater
  10. coke, milk, juice
  11. coke, juice
  12. coke, sweater

Item-set counts (with 12 transactions, a 40% threshold requires at least 5 occurrences):
  1 item:   coke 8, bread 3, hamburger 7, juice 8, milk 4, sandwich 2, sweater 2
  2 items:  {coke, hamburger} 5, {coke, juice} 5, {hamburger, juice} 4
  3 items:  {coke, hamburger, juice} 3

Algorithms for finding frequent item-sets:
- Blind search: 2^N candidates
- AIS: basic algorithm
- SETM: sort-merge algorithm
- Apriori: tree-structured candidate sets
- AprioriTid: temporary table generation
- Partition: partitioned mining
- DHP: hash-based algorithm

The second phase: finding strong associations (high confidence). The threshold value for confidence is given as 70%.

  coke --> hamburger   5 out of 8 = 62.5%
  hamburger --> coke   5 out of 7 = 71%
  coke --> juice       5 out of 8 = 62.5%
  juice --> coke       5 out of 8 = 62.5%

Only hamburger --> coke reaches the 70% threshold.
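
The two phases can be sketched on the twelve transactions of this slide. This is a simplified level-wise search in the spirit of Apriori [AGR94], assuming plain Python sets instead of the tree-structured candidate sets of the published algorithm; the names `frequent_itemsets` and `strong_rules` are illustrative.

```python
from itertools import combinations

# The twelve example transactions from this slide.
TRANSACTIONS = [
    {"coke", "bread", "hamburger"}, {"coke", "hamburger", "juice"},
    {"milk", "sandwich", "juice"}, {"sandwich", "milk", "juice", "bread"},
    {"hamburger", "juice", "coke"}, {"coke", "bread", "hamburger"},
    {"coke", "hamburger", "juice"}, {"hamburger", "juice"},
    {"milk", "hamburger", "sweater"}, {"coke", "milk", "juice"},
    {"coke", "juice"}, {"coke", "sweater"},
]
MIN_SUPPORT = 0.40
MIN_CONFIDENCE = 0.70


def count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def frequent_itemsets():
    """Phase 1: level-wise search; only frequent k-item-sets are extended."""
    n = len(TRANSACTIONS)
    level = [frozenset([i]) for i in sorted(set().union(*TRANSACTIONS))]
    frequent = {}
    while level:
        size = len(level[0]) + 1
        counts = {c: count(c) for c in level}
        survivors = {c: k for c, k in counts.items() if k / n >= MIN_SUPPORT}
        frequent.update(survivors)
        # Join step: unions of two survivors that are exactly one item larger.
        candidates = {a | b for a in survivors for b in survivors
                      if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in survivors
                        for s in combinations(c, size - 1))]
    return frequent


def strong_rules(frequent):
    """Phase 2: split each frequent item-set into antecedent --> consequence."""
    rules = []
    for itemset, both in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), k)):
                conf = both / frequent[lhs]
                if conf >= MIN_CONFIDENCE:
                    rules.append((sorted(lhs), sorted(itemset - lhs), conf))
    return rules


for lhs, rhs, conf in strong_rules(frequent_itemsets()):
    print(f"{', '.join(lhs)} --> {', '.join(rhs)}  (confidence {conf:.0%})")
```

Because {hamburger, juice} occurs only 4 times, the prune step discards {coke, hamburger, juice} without counting it, which is the saving over blind search.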
10Sequential Patterns
Customer transactions:

CID  Time      Items
1    95/06/25  30
1    95/06/30  90
2    95/06/10  10, 20
2    95/06/15  30
2    95/06/20  40, 60, 70
3    95/06/25  30, 50, 70
4    95/06/25  30
4    95/06/30  40, 70
4    95/07/25  90
5    95/06/12  90

Customer sequences:

CID  Sequence
1    <(30) (90)>
2    <(10,20) (30) (40,60,70)>
3    <(30,50,70)>
4    <(30) (40,70) (90)>
5    <(90)>

Maximal sequential patterns with support > 25%:
  <(30) (90)>
  <(30) (40,70)>
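
The support of a sequential pattern is the fraction of customers whose sequence contains it. The sketch below only checks containment and computes support for the five customer sequences on this slide; it does not search for the maximal patterns themselves, and the names `contains` and `support` are illustrative.

```python
# Customer sequences from this slide: one list of item-sets per customer,
# ordered by transaction time.
SEQUENCES = {
    1: [{30}, {90}],
    2: [{10, 20}, {30}, {40, 60, 70}],
    3: [{30, 50, 70}],
    4: [{30}, {40, 70}, {90}],
    5: [{90}],
}


def contains(sequence, pattern):
    """True if each pattern element is a subset of a distinct transaction,
    and those transactions appear in the same order as the pattern."""
    pos = 0
    for element in pattern:
        # Skip forward to the earliest transaction that covers this element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True


def support(pattern):
    """Fraction of customers whose sequence contains the pattern."""
    hits = sum(contains(s, pattern) for s in SEQUENCES.values())
    return hits / len(SEQUENCES)


print(support([{30}, {90}]))      # customers 1 and 4
print(support([{30}, {40, 70}]))  # customers 2 and 4
```

Both patterns are supported by 2 of the 5 customers, which is above the 25% threshold.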
11Telecommunication Network Diagnosis
[Figure: a telecommunication network of nodes node-A through node-I, with alarms (C, 123), (E, 256) and (F, 678) marked on a time axis of 30 min]
Co-occurrence of alarm 123 in node C and alarm 256 in node E implies alarm 678 in node F within 30 minutes.
12Attribute Dependencies
- Given attributes A1, A2, ..., Am
  - f(A1, A2, ..., Am, a set of constants) => g(A1, A2, ..., Am, a set of constants)
  - where f and g are arbitrary (boolean) functions.
  - e.g. (A1 = c1 and A2 = c2) then (A3 = c3 and A4 = c4)
- The general problem is intractable because the number of possible functions and constants is potentially infinite.
- Thus, several constraints are given to make it tractable in actual domains.
  - e.g. LHS is a conjunction of simple predicates and RHS is an assertion of classification --> Classification problem
13Classification
- Symbolic classification rules (e.g. decision trees)
  - The most well-studied area among inductive learning problems.

Training data:

A1  A2  C
a   d   1
a   e   2
b   f   3
b   g   3

Decision tree:

A1
  a: A2
       d: 1
       e: 2
  b: 3

- Neural network approach
  - Weight values in edges --> symbolic description of classification rules
  - Still far from a practical solution <-- too costly learning time
  - Suitable for single-learning/multiple-runs problems
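
As an illustration of a symbolic classifier, the decision tree of this slide can be written down as a nested structure and flattened into rules. This is a sketch under the assumption that the tree is given by hand; it does not induce the tree from data, and the encoding (tuples for internal nodes, plain values for leaves) is an illustrative choice.

```python
# Training table from this slide: attribute values and class label C.
TRAINING = [
    ({"A1": "a", "A2": "d"}, 1),
    ({"A1": "a", "A2": "e"}, 2),
    ({"A1": "b", "A2": "f"}, 3),
    ({"A1": "b", "A2": "g"}, 3),
]

# Internal node: (attribute, {value: subtree}); leaf: class label.
TREE = ("A1", {"a": ("A2", {"d": 1, "e": 2}), "b": 3})


def classify(tree, record):
    """Follow the branch matching the record until a leaf is reached."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[record[attribute]]
    return tree


def rules(tree, conditions=()):
    """Flatten the tree into symbolic rules: (list of conditions, class)."""
    if not isinstance(tree, tuple):
        return [(list(conditions), tree)]
    attribute, branches = tree
    found = []
    for value, subtree in branches.items():
        found += rules(subtree, conditions + ((attribute, value),))
    return found


for conditions, label in rules(TREE):
    lhs = " and ".join(f"{a} = {v}" for a, v in conditions)
    print(f"if {lhs} then C = {label}")
```

Each root-to-leaf path becomes one rule whose left-hand side is a conjunction of simple predicates, which is the restricted form of attribute dependency that makes classification tractable.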
14Bottom-Up Summarization
- DBLEARN project at J. Han's Lab., Simon Fraser Univ., Canada

Original table:

Name  Major       Birth_Place  GPA  vote
Lee   music       Kwangju      3.4  1
Kim   physics     Sunchon      3.9  1
Yoon  math        Mokpo        3.7  1
Park  painting    Yeosu        3.4  1
Choi  computing   Taegu        3.8  1
Hong  statistics  Suwon        3.2  1

After attribute-oriented substitution (using domain knowledge):

Major    Birth_Place  GPA        vote
art      Chunnam      good       1
science  Chunnam      excellent  1
science  Chunnam      excellent  1
art      Chunnam      good       1
science  Kyungbuk     excellent  1
science  Kyonggi      good       1

After merging redundant records:

Major    Birth_Place  GPA        vote
art      Chunnam      good       2
science  Chunnam      excellent  2
science  Kyungbuk     excellent  1
science  Kyonggi      good       1
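
The two steps, attribute-oriented substitution and merging of redundant records, can be sketched on the six student records of this slide. The concept hierarchies below are read off the slide's tables; the GPA cut-off of 3.5 is an assumption (the slide only shows that 3.7 and above map to "excellent" and 3.4 and below to "good"), and the function names are illustrative rather than DBLEARN's.

```python
from collections import Counter

# Student records from this slide: (Name, Major, Birth_Place, GPA).
STUDENTS = [
    ("Lee", "music", "Kwangju", 3.4),
    ("Kim", "physics", "Sunchon", 3.9),
    ("Yoon", "math", "Mokpo", 3.7),
    ("Park", "painting", "Yeosu", 3.4),
    ("Choi", "computing", "Taegu", 3.8),
    ("Hong", "statistics", "Suwon", 3.2),
]

# Domain knowledge: one-level concept hierarchies taken from the slide.
MAJOR_UP = {"music": "art", "painting": "art", "physics": "science",
            "math": "science", "computing": "science", "statistics": "science"}
PLACE_UP = {"Kwangju": "Chunnam", "Sunchon": "Chunnam", "Mokpo": "Chunnam",
            "Yeosu": "Chunnam", "Taegu": "Kyungbuk", "Suwon": "Kyonggi"}


def gpa_up(gpa):
    """Generalize a numeric GPA; the 3.5 cut-off is an assumed boundary."""
    return "excellent" if gpa >= 3.5 else "good"


def summarize(records):
    """Substitute higher-level concepts, drop Name, and merge equal tuples.

    The count of each merged tuple plays the role of the vote column.
    """
    generalized = [(MAJOR_UP[major], PLACE_UP[place], gpa_up(gpa))
                   for _name, major, place, gpa in records]
    return Counter(generalized)


for (major, place, gpa), vote in summarize(STUDENTS).items():
    print(f"{major:8} {place:9} {gpa:10} vote={vote}")
```

The Name attribute is dropped because it has no higher-level concept to generalize to, which is what allows distinct students to merge into one summary record.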
15Top-Down Summarization
- CLEVER system at DB Lab., KAIST

Table to be summarized (user's selection):
  PROGRAM: vi, emacs, word, gcc, tetris
  USER: John, Tom, Lee, Park, Yang

Fuzzy set hierarchies: PROG_01 (for PROGRAM), USR_01 (for USER)

tSD 0.4

Hypotheses <PROGRAM, USER> and their computed values:
  <w, w>                     1.000
  <engineering, w>           0.833
  <w, developer>             0.800
  <w, marketer>              0.411
  <engineering, developer>   0.700
  <w, programmer>            0.589
  <editor, developer>        0.489
  <engineering, programmer>  0.522
  <editor, programmer>       0.456
16Data Mining Projects
- QUEST: IBM Almaden Research Center
  - a common set of operations in a unified framework
  - classification, association, etc.
- KDW (Knowledge Discovery Workbench): GTE Laboratory Inc.
  - focus on architectural issues of data mining systems
  - clustering, classification, summarization, deviation detection, etc.
- IMACS (Intelligent Market Analysis and Classification System): AT&T Bell Lab
  - focus on human interaction in data mining
  - data archaeology
- CoverStory: Information Resources Incorporated
  - summarization of supermarket scanner data
- DBMiner/GeoMiner/WebMiner: Simon Fraser Univ.
- MineSet: Silicon Graphics Inc.
17DBMiner
- DBMiner Research Group in Simon Fraser Univ., Canada
- DMQL: a SQL-like Data Mining Query Language
- Data structures: generalized relations, multi-dimensional data cube
18DBMiner(contd)
- Functions
  - Characterizer: the general characteristics of a set of user-specified data
    - attribute-oriented induction
    - e.g. Cold(x) => headache(x) and cough(x)
    - e.g. Fever(x) => headache(x) and low-leucocyte-count(x)
  - Discriminator: features that distinguish the target class from contrasting classes
    - e.g. Low-leucocyte-count(x) => Fever(x)
  - Classifier: generalization-based decision tree induction
  - Association rule finder: multi-level association rules
  - Meta-rule guided miner: confines the search to specific forms of rules
    - e.g. Meta-rule: major(s: student, x) and p(s, y) => GPA(s, z)
  - Predictor: predicts the possible values for missing data, after factor analysis
    - e.g. An employee's potential salary can be predicted based on the salary distribution of similar employees in the company
  - Data evolution evaluator
    - e.g. Growth patterns of certain stocks
  - Deviation evaluator
    - e.g. A set of stocks whose growth patterns deviate from the major trend
19GeoMiner/WebMiner
- GeoMiner with GMQL (Geo-Mining Query Language)
  - An extension of DBMiner for spatial data mining
  - Modules
    - Geo-characterizer
      - e.g. Given spatial hierarchies of Western Canada, discover general weather patterns according to region partitions
    - Geo-comparator (discriminator)
      - e.g. The differences in weather patterns between British Columbia and Alberta
    - Geo-associator
- WebMiner with WebQL
  - It finds resources on the Internet related to a specific topic
    - e.g. What is the most popular document about data mining in terms of number of accesses?
  - cf. Web traversal pattern discovery (by Chen, Park and Yu, 1996)
    - e.g. If a user visits h1 => h2 => h5, then he/she is apt to visit h8 => h11
20MineSet
- Developed by Silicon Graphics Inc.
- Combines intelligent data mining algorithms and multidimensional data visualization techniques
- Association rule generator/rule visualizer
- Classification tools
  - MLC++ based classification modules
    - Decision tree inducer
    - Option tree inducer
    - Evidence classifier inducer
    - Decision table inducer
  - Tree/evidence visualizers
- Map visualizer: spatial data analysis
- Clustering module
- Regression tree inducer: predicts unknown values
21Rule Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
22Decision Tree Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
23Map Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
24Two Perspectives on Data Mining
- AI practitioner's perspective
  - Extensions of machine learning technology
  - Focus on sophisticated measures and theories rather than efficiency improvement
- DB practitioner's perspective
  - Application of machine learning paradigms to massive and actual data management problems
- A suggestion as a DB practitioner
  - First step: blindly search for possible knowledge => Data Mining
    - There is no guru who could guide the search directions.
    - No heuristics are available; rather, ignore heuristics when looking for unknown patterns.
  - Second step: validate the discovered rough knowledge in detail
25Data Mining and Data Warehousing
[Figure: data warehouse architecture.
 Operational Data (process-oriented): Relational DB-1, Relational DB-2, Object-oriented DB-1, Object-oriented DB-2, Legacy DB-1, File system-1.
 Data warehouse builder/manager, with Metadata.
 Data for Decision Support (subject-oriented): Data warehouse, Data mart-1 through Data mart-5.
 Data Mining.]
26Research Issues
- Looking for useful mining targets
  - Associations, characteristic rules, classification, clustering
  - Functional dependency, regression trees
  - Similar sequential patterns/time series
- Variations of association rules
  - Alternatives to the simple support and confidence measures
  - Generalized/multilevel association rules
  - Performance enhancement for association rule discovery
- System implementation issues
  - Identify core functions (e.g. a tightly-coupled architecture [MEO98], MLC++)
  - Elicit common DBMS requirements for various data mining tasks
  - Integration with relational databases and/or multi-dimensional databases
  - Data/knowledge visualization
  - Extended query language or extended CLI, e.g. DMQL
- And so on ...
27References
- Data Mining General
  - [FRW91] W. J. Frawley, G. Piatetsky-Shapiro and C. J. Matheus, "Knowledge Discovery in Databases: An Overview", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. J. Frawley Ed., AAAI Press, 1991, pp. 1-27
  - [AGR93a] R. Agrawal, T. Imielinski and A. Swami, "Database Mining: A Performance Perspective", IEEE Trans. on Knowledge and Data Engineering, Vol. 5, No. 6, 1993, pp. 914-925
  - [MAT93] C. J. Matheus, P. Chan and G. Piatetsky-Shapiro, "Systems for Knowledge Discovery in Databases", IEEE TKDE, Vol. 5, No. 6, 1993, pp. 903-913
  - [HOL94a] M. Holsheimer and A. Siebes, "Data Mining: The Search for Knowledge in Databases", Report CS-R9406, ISSN 0169-118X, CWI (Centrum voor Wiskunde en Informatica), The Netherlands, 1994
- Association Rules
  - [AGR93b] R. Agrawal, T. Imielinski and A. Swami, "Mining Associations between Sets of Items in Massive Databases", Proc. ACM SIGMOD, Washington D.C., May 1993
  - [AGR94] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases", Proc. VLDB, Santiago, Sep. 1994, pp. 487-499
  - [KLE94] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen and A. Verkamo, "Finding Interesting Rules from Large Sets of Discovered Association Rules", Proc. CIKM, Gaithersburg, Nov. 1994, pp. 401-407
28References(Contd)
- [HOT95] M. Houtsma and A. Swami, "Set-Oriented Mining for Association Rules in Relational Databases", Proc. ICDE, Taipei, Mar. 1995, pp. 25-33
- [SAV95] A. Savasere, E. Omiecinski and S. Navathe, "An Efficient Algorithm for Mining Association Rules in Large Databases", Proc. VLDB, Zurich, Sep. 1995, pp. 432-444
- [SRI95] R. Srikant and R. Agrawal, "Mining Generalized Association Rules", Proc. VLDB, Zurich, Sep. 1995, pp. 407-419
- [HAN95] J. Han and Y. Fu, "Discovery of Multiple-level Association Rules from Large Databases", Proc. VLDB, Zurich, Sep. 1995, pp. 420-431
- [PAR95a] J.-S. Park, M.-S. Chen and P. S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules", Proc. SIGMOD, 1995, pp. 175-186
- [PAR95b] J.-S. Park, M.-S. Chen and P. S. Yu, "Efficient Parallel Data Mining for Association Rules", Proc. CIKM, 1995
- [SRI96] R. Srikant and R. Agrawal, "Mining Quantitative Association Rules in Large Relational Tables", Proc. SIGMOD, Quebec, Jun. 1996, pp. 1-12
- [FUK96] T. Fukuda, Y. Morimoto, S. Morishita and T. Tokuyama, "Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization", Proc. SIGMOD, Quebec, Jun. 1996, pp. 13-23
- [CHE96] D. Cheung, J. Han, V. Ng and C. Wong, "Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique", Proc. ICDE, New Orleans, Feb. 1996, pp. 106-114
29References(Contd)
- [BRI97a] S. Brin, R. Motwani, J. Ullman and S. Tsur, "Dynamic Itemset Counting and Implication Rules for Market Basket Data", Proc. SIGMOD, 1997, pp. 255-264
- [BRI97b] S. Brin, R. Motwani and C. Silverstein, "Beyond Market Baskets: Generalizing Association Rules to Correlations", Proc. SIGMOD, 1997, pp. 265-276
- [HAN97] E. H. Han, G. Karypis and V. Kumar, "Scalable Parallel Data Mining for Association Rules", Proc. SIGMOD, 1997, pp. 277-288
- [AGG98] C. C. Aggarwal and P. S. Yu, "Online Generation of Association Rules", Proc. Int'l Conf. on Data Engineering, 1998, pp. 402-411
- [OZD98] B. Özden, S. Ramaswamy and A. Silberschatz, "Cyclic Association Rules", Proc. Int'l Conf. on Data Engineering, 1998, pp. 412-423
- [LIN98] J.-L. Lin and M. H. Dunham, "Mining Association Rules: Anti-Skew Algorithms", Proc. Int'l Conf. on Data Engineering, 1998, pp. 486-493
- [SAV98] A. Savasere, E. Omiecinski and S. Navathe, "Mining for Strong Negative Associations in a Large Database of Customer Transactions", Proc. Int'l Conf. on Data Engineering, 1998, pp. 494-502
- [RAS98] R. Rastogi and K. Shim, "Mining Optimized Association Rules with Categorical and Numeric Attributes", Proc. Int'l Conf. on Data Engineering, 1998, pp. 503-513
30References(Contd)
- Characterization
  - [HAN91] Y. Cai, N. Cercone and J. Han, "Attribute-Oriented Induction in Relational Databases", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 213-228
  - [HAN92a] J. Han, Y. Cai and N. Cercone, "Knowledge Discovery in Databases: An Attribute-Oriented Approach", Proc. VLDB, 1992, pp. 547-559
  - [HAN92b] J. Han, Y. Cai, N. Cercone and Y. Huang, "DBLEARN: A Knowledge Discovery System for Large Databases", Proc. CIKM, 1992, pp. 473-481
  - [HAN93] J. Han, Y. Cai and N. Cercone, "Data-Driven Discovery of Quantitative Rules in Relational Databases", IEEE TKDE, Vol. 5, No. 1, Feb. 1993, pp. 29-40
  - [LEE94] D.-H. Lee and M. H. Kim, "Discovering Database Summaries through Refinements of Fuzzy Hypotheses", Proc. ICDE, Houston, Feb. 1994, pp. 223-230
  - [LEE97] D.-H. Lee and M. H. Kim, "Database Summarization Using Fuzzy ISA Hierarchies", IEEE Transactions on Systems, Man and Cybernetics, Vol. 27, No. 4, Aug. 1997, pp. 671-680
31References(Contd)
- Sequential Patterns
  - [AGR93c] R. Agrawal, C. Faloutsos and A. Swami, "Efficient Similarity Search in Sequence Databases", Proc. the 4th Int'l Conf. on Foundations of Data Organization and Algorithms, Chicago, Oct. 1993
  - [FAL94] C. Faloutsos, M. Ranganathan and Y. Manolopoulos, "Fast Subsequence Matching in Time-Series Databases", Proc. SIGMOD, Minneapolis, May 1994, pp. 419-429
  - [AGR95a] R. Agrawal and R. Srikant, "Mining Sequential Patterns", Proc. ICDE, Taipei, Mar. 1995, pp. 3-14
  - [AGR95b] R. Agrawal, K. Lin, H. Sawhney and K. Shim, "Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases", Proc. VLDB, Zurich, Sep. 1995, pp. 490-501
  - [AGR95c] R. Agrawal, G. Psaila, E. Wimmers and M. Zait, "Querying Shapes of Histories", Proc. VLDB, Zurich, Sep. 1995, pp. 502-514
  - [HAT96] K. Hatonen, M. Klemettinen, H. Mannila, P. Ronkainen and H. Toivonen, "Knowledge Discovery from Telecommunication Network Alarm Databases", Proc. ICDE, New Orleans, Feb. 1996, pp. 115-123
  - [SHA96] H. Shatkay and S. Zdonik, "Approximate Queries and Representations for Large Data Sequences", Proc. ICDE, New Orleans, Feb. 1996, pp. 536-545
  - [LI96] C. Li, P. Yu and V. Castelli, "HierarchyScan: A Hierarchical Similarity Search Algorithm for Databases of Long Sequences", Proc. ICDE, New Orleans, Feb. 1996, pp. 546-555
  - [CHE96] M.-S. Chen, J. S. Park and P. S. Yu, "Data Mining for Path Traversal Patterns in a Web Environment", Proc. ICDCS, 1996, pp. 385-392
  - [SHA97] J. Shafer and R. Agrawal, "Parallel Algorithms for High-Dimensional Proximity Joins", Proc. VLDB, 1997, pp. 176-185
32References(Contd)
- Classification/Clustering
  - [QUI89] J. Quinlan and R. Rivest, "Inferring Decision Trees Using the Minimum Description Length Principle", Information and Computation, Vol. 80, 1989, pp. 227-248
  - [YAS91] R. Yasdi, "Learning Classification Rules from Database in the Context of Knowledge Acquisition and Representation", IEEE TKDE, Vol. 3, No. 3, Sep. 1991, pp. 293-306
  - [CHA91] K. Chan and A. Wong, "A Statistical Technique for Extracting Classificatory Knowledge from Databases", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 107-123
  - [UTH91] R. Uthurusamy, U. Fayyad and S. Spangler, "Learning Useful Rules from Inconclusive Data", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 141-157
  - [ZIA91] W. Ziarko, "The Discovery, Analysis and Representation of Data Dependencies in Databases", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 195-209
  - [PIA91] G. Piatetsky-Shapiro, "Discovery, Analysis and Presentation of Strong Rules", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 229-248
  - [MAN91] M. Manago and Y. Kodratoff, "Induction of Decision Trees from Complex Structured Data", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 289-306
33References(Contd)
- [SMY92] P. Smyth and R. Goodman, "An Information Theoretic Approach to Rule Induction from Databases", IEEE TKDE, Vol. 4, No. 4, Aug. 1992, pp. 301-316
- [WAN92] L. Wang and J. Mendel, "Generating Fuzzy Rules by Learning from Examples", IEEE TSMC, Vol. 22, No. 6, Nov. 1992, pp. 1414-1427
- [AGR92] R. Agrawal, S. Ghosh, T. Imielinski, B. Iyer and A. Swami, "An Interval Classifier for Database Mining Applications", Proc. VLDB, Vancouver, Aug. 1992, pp. 207-216
- [LU95] H. Lu, R. Setiono and H. Liu, "NeuroRule: A Connectionist Approach to Data Mining", Proc. VLDB, Zurich, Sep. 1995, pp. 478-489
- [HON91] J. Hong and C. Mao, "Incremental Discovery of Rules and Structure by Hierarchical and Parallel Clustering", Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W. Frawley Ed., AAAI Press, 1991, pp. 177-194
- [NG94] R. Ng and J. Han, "Efficient and Effective Clustering Methods for Spatial Data Mining", Proc. VLDB, 1994, pp. 144-155
- [XU98] X. Xu, M. Ester, H.-P. Kriegel and J. Sander, "A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases", Proc. Int'l Conf. on Data Engineering, 1998, pp. 324-333
34References(Contd)
- System Implementations
  - [SEL96] P. Selfridge, D. Srivastava and L. Wilson, "IDEA: Interactive Data Exploration and Analysis", Proc. SIGMOD, Quebec, Jun. 1996, pp. 24-34
  - [MEO98] R. Meo, G. Psaila and S. Ceri, "A Tightly-Coupled Architecture for Data Mining", Proc. Int'l Conf. on Data Engineering, 1998, pp. 316-323
  - [HAN96] J. Han et al., "DBMiner: A System for Mining Knowledge in Large Relational Databases", Proc. KDD, 1996
  - [HAN97] J. Han et al., "GeoMiner: A System Prototype for Spatial Data Mining", Proc. SIGMOD, 1997
  - [HAN98] "WebMiner: A Resource and Knowledge Discovery System for the Internet", http://db.cs.sfu.ca/WebMiner/
  - [KOH96] R. Kohavi et al., "Data Mining Using MLC++: A Machine Learning Library in C++", Proc. Tools with AI, 1996, pp. 234-245
  - [HAL98] C. Hall ed., "MineSet 2.0 for Data Mining and Multidimensional Data Analysis", http://www.cgi.com/Products/software/MineSet/DMStrategies/index.html