Title: MAIDS (Mining Alarming Incidents in Data Streams) Implementation Discussion
1 MAIDS (Mining Alarming Incidents in Data Streams): Implementation Discussion
- MAIDS group
- NCSA and Dept. of CS
- University of Illinois at Urbana-Champaign
- www.maids.ncsa.uiuc.edu
2 Implementation Essentials
- Framework: tilted time window
- Extended FP-growth for mining frequent patterns in data streams
- Extended Naïve Bayes for mining classification models in data streams
- Extended k-means, integrating micro-clustering and macro-clustering, for cluster analysis in data streams
- Extended H-tree cubing method for multi-dimensional query answering in data streams
- Application development and testing
3 Framework: Tilted Time Window (1)
- Natural tilted time frame window
  - Example: minimal unit is a quarter; then 4 quarters → 1 hour, 24 hours → 1 day, ...
- Logarithmic tilted time frame window
  - Example: minimal unit is 1 minute; then slots of 1, 2, 4, 8, 16, 32, ... minutes (see the sketch below)
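A minimal Python sketch of one way counts might be kept in a logarithmic tilted time frame; the class name, slot layout, and merge policy are illustrative assumptions rather than the MAIDS implementation.

```python
class LogTiltedWindow:
    """Counts kept at granularities of 1, 2, 4, 8, ... base windows.

    pending[d] buffers one finished unit of 2**d base windows, waiting for a
    second unit of the same size before being merged one level up.
    """

    def __init__(self, levels=8):
        self.current = 0                 # count for the still-open base window
        self.pending = [None] * levels   # one buffered unit per granularity

    def add(self, n=1):
        self.current += n                # always add to the newest (finest) slot

    def close_base_window(self):
        # The finest window just ended: push its count up the frame,
        # merging two units of equal size into one coarser unit.
        carry, self.current = self.current, 0
        for d in range(len(self.pending)):
            if self.pending[d] is None:
                self.pending[d] = carry
                return
            carry += self.pending[d]     # merge into a 2**(d+1)-window unit
            self.pending[d] = None
        # anything older than the coarsest level is dropped (approximation)

# Example: 1-minute base windows; after 5 minutes the frame holds
# counts at 1-, 2-, and 4-minute granularities.
w = LogTiltedWindow()
for minute_count in [3, 1, 4, 1, 5]:
    w.add(minute_count)
    w.close_base_window()
print(w.pending[:3])   # [5, None, 9]: one 1-minute unit, no 2-minute unit, one 4-minute unit
```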
4 Framework: Tilted Time Window (2)
- Pyramidal tilted time frame window
  - Example: suppose there are 5 frames and each frame keeps at most 3 snapshots
  - Given a snapshot number N, if N mod 2^d = 0, insert it into frame number d; if the frame then holds more than 3 snapshots, kick out the oldest one (a sketch of this rule follows)
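A small Python sketch of the snapshot-placement rule described above (5 frames, at most 3 snapshots per frame). Placing each snapshot only in the highest-order frame it qualifies for is an assumption; the structure and function names are illustrative.

```python
def insert_snapshot(frames, n, max_per_frame=3):
    """Place snapshot number n into the pyramidal frame structure.

    frames: list of lists; frames[d] holds the snapshot numbers kept at level d.
    Rule from the slide: if n mod 2**d == 0, n belongs to frame d; here we use
    the largest such d, so each snapshot lives in exactly one frame (assumption).
    """
    d = 0
    while d + 1 < len(frames) and n % (2 ** (d + 1)) == 0:
        d += 1
    frames[d].append(n)
    if len(frames[d]) > max_per_frame:
        frames[d].pop(0)          # kick out the oldest snapshot in that frame

frames = [[] for _ in range(5)]   # 5 frames, as in the example
for n in range(1, 33):
    insert_snapshot(frames, n)
print(frames)
# [[27, 29, 31], [22, 26, 30], [12, 20, 28], [8, 24], [16, 32]]
```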
5 Frequent Pattern Finder
- Frequent Pattern Finder = Tilted Window + FP-growth
- A tilted time frame
  - Different time granularities (natural vs. pyramidal)
  - second, minute, quarter, hour, day, week, ...
- Targeted items
  - User- or expert-selected items serve as targeted items
  - Trace targeted items and their combinations using an FP-tree
- The FP-tree registers items with tilted time window information
- Mining is based on the extended FP-growth algorithm over the tilted windows
6 FP-Growth (1): FP-Tree Construction

TID | Items bought            | (Ordered) frequent items
100 | f, a, c, d, g, i, m, p  | f, c, a, m, p
200 | a, b, c, f, l, m, o     | f, c, a, b, m
300 | b, f, h, j, o, w        | f, b
400 | b, c, k, s, p           | c, b, p
500 | a, f, c, e, l, p, m, n  | f, c, a, m, p

min_support = 3
- Scan the DB once, find frequent 1-itemsets
- Sort frequent items in frequency-descending order to get the f-list
- Scan the DB again, construct the FP-tree
F-list: f-c-a-b-m-p
7 FP-Growth (2): FP-Tree Mining
- Start at the frequent-item header table of the FP-tree
- Traverse the FP-tree by following the node links of each frequent item p
- Accumulate all of the transformed prefix paths of item p to form p's conditional pattern base (a construction-and-mining sketch follows the table below)

Conditional pattern bases:
item | cond. pattern base
c    | f:3
a    | fc:3
b    | fca:1, f:1, c:1
m    | fca:2, fcab:1
p    | fcam:2, cb:1
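A minimal, self-contained Python sketch of the two scans and the conditional-pattern-base step on the example database above; the FPNode class and function names are illustrative, not the MAIDS code.

```python
from collections import defaultdict

class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    # Scan 1: count items, keep those meeting min_support, sort into the f-list.
    freq = defaultdict(int)
    for t in transactions:
        for item in t:
            freq[item] += 1
    f_list = [i for i, c in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
              if c >= min_support]
    rank = {item: r for r, item in enumerate(f_list)}

    # Scan 2: insert each transaction's frequent items in f-list order.
    root, header = FPNode(None, None), defaultdict(list)
    for t in transactions:
        node = root
        for item in sorted((i for i in t if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, header, f_list

def conditional_pattern_base(item, header):
    # Follow the node links of `item`; each prefix path, weighted by the
    # node's count, contributes one entry to the conditional pattern base.
    base = []
    for node in header[item]:
        path, p = [], node.parent
        while p is not None and p.item is not None:
            path.append(p.item)
            p = p.parent
        if path:
            base.append(("".join(reversed(path)), node.count))
    return base

db = [list("facdgimp"), list("abcflmo"), list("bfhjow"),
      list("bcksp"), list("afcelpmn")]
root, header, f_list = build_fp_tree(db, min_support=3)
print("-".join(f_list))                        # f-c-a-b-m-p
print(conditional_pattern_base("p", header))   # [('fcam', 2), ('cb', 1)]
print(conditional_pattern_base("m", header))   # [('fca', 2), ('fcab', 1)]
```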
8 FP-Tree with Tilted Window
- Use a fixed order of items (could be based on sampling, or on alphabetic ordering)
- Construct the FP-tree while scanning the data stream
- Each node contains a tilted time frame for count accumulation
  - Add new counts to the newest slot
  - Propagate counts to older slots when needed (see the sketch below)
F-list: f-c-a-b-m-p
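A brief sketch, under illustrative assumptions (a fixed number of slots and a plain shift instead of the logarithmic merge shown earlier), of how each FP-tree node could carry a tilted window of counts and add new occurrences to the newest slot.

```python
class StreamFPNode:
    """FP-tree node whose count is a list of tilted-window slots (slot 0 = newest)."""

    def __init__(self, item, parent, num_slots=8):
        self.item, self.parent = item, parent
        self.children = {}
        self.slots = [0] * num_slots

    def add(self, n=1):
        self.slots[0] += n          # accumulate into the newest slot

def insert_transaction(root, ordered_items):
    # Register one (already f-list-ordered) transaction: every node on the
    # path gets one count in its newest slot.
    node = root
    for item in ordered_items:
        node = node.children.setdefault(item, StreamFPNode(item, node))
        node.add()

def shift_all(node):
    # At a window boundary, move every node's counts one slot older
    # (the oldest slot is dropped) and reopen the newest slot.
    node.slots[1:] = node.slots[:-1]
    node.slots[0] = 0
    for child in node.children.values():
        shift_all(child)
```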
9 Mining Frequent Patterns in Dynamic Data Streams
- Mining happens when a user submits a mining query (on-demand)
- Mining on the FP-tree uses FP-growth over the data in the corresponding time windows (see the sketch below)
  - To mine frequent patterns in the last 30 minutes, use the slots covering the last 30 minutes
  - To mine frequent patterns between 6am and 8am, use the slots covering that interval
- We may compare what has changed in the last 24 hours by comparing frequent patterns, i.e., mining the current patterns, mining the patterns of 24 hours ago, and comparing the two
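A small helper illustrating the "corresponding time windows" idea: the support of a pattern over a query interval is obtained by summing the counts of the slots that overlap it. The (start, end, count) slot representation is an assumption for illustration; the answer is approximate when a coarse slot only partially overlaps the query range.

```python
def support_in_range(slots, query_start, query_end):
    """slots: list of (start, end, count) tuples for one pattern, newest first,
    with start/end measured in minutes-ago. Returns the summed count of all
    slots overlapping [query_start, query_end)."""
    return sum(count for start, end, count in slots
               if start < query_end and end > query_start)

# Example: per-slot counts of one pattern over the last 32 minutes.
slots = [(0, 1, 2), (1, 2, 1), (2, 4, 5), (4, 8, 7), (8, 16, 9), (16, 32, 20)]
# "last 30 minutes": the 16-32 slot only partially overlaps the range,
# so its 20 occurrences make the answer an over-count.
print(support_in_range(slots, 0, 30))   # 44
```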
10 Classification for Dynamic Data Streams
- Methodology: Naïve Bayes + Tilted Time Windows
- Tilted time framework as shown above (natural vs. pyramidal)
- Instead of decision trees, consider models that do not change drastically
  - Naïve Bayes with boosting is a good approach
- Major advantages
  - Store statistical information related to each variable
  - Model construction and prediction
  - Incremental updating and dynamic maintenance
- Advanced task: compare models to find changes
11 Bayesian Classification: Why?
- Probabilistic learning: calculates explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems
- Incremental: each training example can incrementally increase or decrease the probability that a hypothesis is correct; prior knowledge can be combined with observed data
- Probabilistic prediction: predicts multiple hypotheses, weighted by their probabilities
- Standard: even when Bayesian methods are computationally intractable, they provide a standard of optimal decision making against which other methods can be measured
12 Bayesian Theorem: Basics
- Let X be a data sample whose class label is unknown
- Let H be the hypothesis that X belongs to class C
- For classification problems, determine P(H|X): the probability that the hypothesis holds given the observed data sample X
- P(H): prior probability of hypothesis H (i.e., the initial probability before we observe any data; it reflects the background knowledge)
- P(X): probability that the sample data is observed
- P(X|H): probability of observing sample X, given that the hypothesis holds
13 Bayesian Theorem
- Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes' theorem:
  P(H|X) = P(X|H) P(H) / P(X)
- Informally, this can be written as
  posterior = likelihood x prior / evidence
- MAP (maximum a posteriori) hypothesis: h_MAP = argmax_h P(h|X) = argmax_h P(X|h) P(h)
- Practical difficulty: requires initial knowledge of many probabilities and significant computational cost
14 Naïve Bayes Classifier
- A simplifying assumption: attributes are conditionally independent
- The probability of occurrence of, say, two elements y1 and y2, given that the current class is C, is the product of the probabilities of each element taken separately, given the same class: P(y1, y2 | C) = P(y1 | C) P(y2 | C)
- No dependence relations between attributes
- Greatly reduces the computation cost: only the class distributions need to be counted
- Once the probability P(X|Ci) is known, assign X to the class with maximum P(X|Ci) P(Ci)
15Training dataset
Class C1buys_computer yes C2buys_computer
no Data sample X (agelt30, Incomemedium, Stud
entyes Credit_rating Fair)
16 Naïve Bayesian Classifier: Example
- Compute P(X|Ci) for each class
  - P(age <= 30 | buys_computer = yes) = 2/9 = 0.222
  - P(age <= 30 | buys_computer = no) = 3/5 = 0.6
  - P(income = medium | buys_computer = yes) = 4/9 = 0.444
  - P(income = medium | buys_computer = no) = 2/5 = 0.4
  - P(student = yes | buys_computer = yes) = 6/9 = 0.667
  - P(student = yes | buys_computer = no) = 1/5 = 0.2
  - P(credit_rating = fair | buys_computer = yes) = 6/9 = 0.667
  - P(credit_rating = fair | buys_computer = no) = 2/5 = 0.4
- X = (age <= 30, income = medium, student = yes, credit_rating = fair)
- P(X|Ci):
  - P(X | buys_computer = yes) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
  - P(X | buys_computer = no) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
- P(X|Ci) P(Ci):
  - P(X | buys_computer = yes) P(buys_computer = yes) = 0.028
  - P(X | buys_computer = no) P(buys_computer = no) = 0.007
- X belongs to class buys_computer = yes
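The arithmetic above can be checked with a few lines of Python. The conditional probabilities come from the slide; the class priors 9/14 and 5/14 are inferred from the 2/9 and 3/5 denominators (an assumption, since the training table itself is not reproduced here).

```python
# Conditional probabilities from the slide, for
# X = (age <= 30, income = medium, student = yes, credit_rating = fair)
p_x_given_yes = (2/9) * (4/9) * (6/9) * (6/9)     # ~0.044
p_x_given_no  = (3/5) * (2/5) * (1/5) * (2/5)     # ~0.019
p_yes, p_no = 9/14, 5/14                           # class priors inferred from the counts

score_yes = p_x_given_yes * p_yes                  # ~0.028
score_no  = p_x_given_no  * p_no                   # ~0.007
print(round(score_yes, 3), round(score_no, 3))     # 0.028 0.007
print("yes" if score_yes > score_no else "no")     # X is classified as buys_computer = yes
```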
17 Naïve Bayes for Data Streams
- Store single-variable statistics (Attribute-Value-ClassLabel, AVC-lists) in tilted time windows
- Incremental update based on count propagation in the tilted time window
- For computing accuracy, partition the data into a training set and a testing set, and derive prediction accuracy as for non-stream data
- Boosting: based on the testing data, put more weight on the data whose prediction is incorrect
- Advanced task: compare models to find changes (see the sketch below)
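A minimal sketch of keeping AVC (Attribute-Value-ClassLabel) statistics per window and pooling them for prediction. The class and method names are illustrative, the window handling is simplified to a plain list of per-window count tables rather than a tilted frame, and add-one smoothing is used for the conditional probabilities.

```python
from collections import defaultdict
from math import prod

class StreamingNaiveBayes:
    """Per-window AVC counts; prediction pools the selected windows."""

    def __init__(self):
        self.windows = []                     # newest window last

    def open_window(self):
        # counts[(attribute, value, class_label)] and class_counts[class_label]
        self.windows.append((defaultdict(int), defaultdict(int)))

    def update(self, x, label):
        counts, class_counts = self.windows[-1]
        class_counts[label] += 1
        for attr, value in x.items():
            counts[(attr, value, label)] += 1

    def predict(self, x, last_k=None):
        # Pool the AVC statistics of the chosen windows (all, or the last k).
        selected = self.windows[-last_k:] if last_k else self.windows
        pooled_avc, pooled_cls = defaultdict(int), defaultdict(int)
        for counts, class_counts in selected:
            for key, c in counts.items():
                pooled_avc[key] += c
            for label, c in class_counts.items():
                pooled_cls[label] += c
        total = sum(pooled_cls.values())

        def score(label):
            prior = pooled_cls[label] / total
            likelihood = prod((pooled_avc[(a, v, label)] + 1) /   # add-one smoothing
                              (pooled_cls[label] + 1) for a, v in x.items())
            return prior * likelihood

        return max(pooled_cls, key=score)

# Tiny usage example with hypothetical attribute values.
nb = StreamingNaiveBayes()
nb.open_window()
nb.update({"age": "<=30", "student": "yes"}, "yes")
nb.update({"age": ">40", "student": "no"}, "no")
print(nb.predict({"age": "<=30", "student": "yes"}))   # 'yes'
```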
18 Training and Testing for Data Streams
- Two classes of models for prediction: peer vs. future
- We study how to predict the future class
  - Take the data in the current window as the testing data
  - Take the data in the previous windows as the training set
  - Derive models based on different weighting schemes (e.g., uniform, linearly decreasing, logarithmically decreasing, etc.); a sketch of these schemes follows below
  - Test and select the best model
  - Then, based on this modeling scheme, construct the model again, including the current window data as new training data
- To predict the peer class
  - The training and test partition is along the same time framework
  - There is no retraining process
19 www.cs.uiuc.edu/hanj