Mining Unexpected Rules by Pushing User Dynamics



Slides: 22

Provided by: Jia994


1
Mining Unexpected Rules by Pushing User Dynamics

Ke Wang
Yuelong Jiang
Laks V.S. Lakshmanan

2
Unexpected Rules

Unexpectedness: the user finds the rules surprising
Existing approaches:
Syntax distance (B. Liu, W. Hsu, AAAI96)
Logical contradiction (B. Padmanabhan, A. Tuzhilin, KDD98)
Both work by direct comparison between rules

3
Our Approach: Data Violation

Knowledge rules Ui
The data rule r
r is unexpected to a user who links owning a
house at BeverlyHill to being a movie star and
being well paid
Each tuple that satisfies r but violates Ui is
evidence for the unexpectedness of r

4
Three Issues

Knowledge Dynamics (modeling)
The user decides the best knowledge to apply in a
given scenario (i.e., a tuple)
Knowledge Push (rule mining)
Push user knowledge into the search right from
the start
Unexpectedness Dynamics (rule selection)
Adjust the unexpectedness of the remaining rules
according to what has been presented so far

5
Rule Representation

Knowledge rules and data rules
Data rules use domain values; knowledge rules use
fuzzy terms (such as High, Low).
The match degree measures the match between a domain
value (e.g., Primary) and a fuzzy term (e.g., Low)
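
A minimal sketch of a match degree lookup, assuming a hand-written membership table. The attribute name Education, the numeric degrees, and the function match_degree are illustrative assumptions; the slides give only the example pair Primary and Low.

```python
# Hypothetical membership table: how well each domain value of an assumed
# attribute "Education" matches each fuzzy term, as a degree in [0, 1].
MEMBERSHIP = {
    "Education": {
        "Primary": {"Low": 1.0, "High": 0.0},
        "Secondary": {"Low": 0.5, "High": 0.5},
        "Graduate": {"Low": 0.0, "High": 1.0},
    }
}


def match_degree(attribute, domain_value, fuzzy_term):
    """Return the match degree in [0, 1]; unknown combinations match with 0."""
    by_value = MEMBERSHIP.get(attribute, {})
    by_term = by_value.get(domain_value, {})
    return by_term.get(fuzzy_term, 0.0)
```

With this table, match_degree("Education", "Primary", "Low") is 1.0, and a domain value missing from the table matches every fuzzy term with degree 0.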

Target attribute

6
Main Ideas

Preference model: the user specifies the best
knowledge rules for each tuple,
e.g., U1 and U2 for those owning a house at
BeverlyHill
Violation model: we measure the unexpectedness of
r by how much the tuples satisfying r violate
their best knowledge rules.

7
The Preference Model

The user specifies the covering knowledge for each tuple:
the d (covering depth) best knowledge rules that
match the tuple
Ways to specify "best":
Explicit enumeration (not scalable)
Rank by preference: max strength, best match,
min violation, etc.
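
A minimal sketch of ranking by preference, assuming a crisp body match and a numeric strength stored on each knowledge rule. The rule layout, the House attribute, the strength values, and the names body_match and covering_knowledge are assumptions made for illustration.

```python
def body_match(t, rule):
    """Crisp stand-in for bm(t, U): 1.0 if every body condition holds, else 0.0."""
    return 1.0 if all(t.get(a) == v for a, v in rule["body"].items()) else 0.0


def covering_knowledge(t, rules, d, preference):
    """Return up to d knowledge rules that match t, most preferred first."""
    matching = [u for u in rules if body_match(t, u) > 0]
    return sorted(matching, key=preference, reverse=True)[:d]


# Hypothetical knowledge rules; only the body and an assumed strength are modeled.
rules = [
    {"name": "U1", "body": {"House": "BeverlyHill"}, "strength": 0.9},
    {"name": "U2", "body": {"House": "BeverlyHill"}, "strength": 0.7},
    {"name": "U3", "body": {"House": "Downtown"}, "strength": 0.8},
]
t = {"House": "BeverlyHill"}
best = covering_knowledge(t, rules, d=2, preference=lambda u: u["strength"])
```

Here best holds U1 and U2, in that order. Passing a different preference function (for example, one based on match quality) changes the ranking without touching the rest of the code.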

8
The Violation Model

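For a tuple t and a knowledge rule U:
Body match degree, bm(t,U), in [0,1]
Head match degree, hm(t,U), in [0,1]
Violation of U by t, v(t,U): a piecewise definition with one case
if bm(t,U) meets a condition and another case otherwise
[formula not preserved in the transcript]
Violation of t, v(t), is v(t,U) aggregated over
the covering knowledge U of t.
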
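
A minimal sketch of how a violation could be computed from the two match degrees. The slide's own formula did not survive, so everything specific here is an assumption: the form bm * (1 - hm), the threshold min_body_match, and the use of max as the aggregate are illustrative choices, not the authors' definitions.

```python
def violation(bm, hm, min_body_match=0.5):
    """Assumed v(t, U): zero unless the body matches well enough, then
    larger the better the body matches and the worse the head matches."""
    if not (0.0 <= bm <= 1.0 and 0.0 <= hm <= 1.0):
        raise ValueError("match degrees must lie in [0, 1]")
    if bm < min_body_match:
        return 0.0
    return bm * (1.0 - hm)


def tuple_violation(matches, aggregate=max, min_body_match=0.5):
    """Assumed v(t): aggregate v(t, U) over the covering knowledge of t.

    matches is a list of (bm, hm) pairs, one per covering knowledge rule.
    """
    values = [violation(bm, hm, min_body_match) for bm, hm in matches]
    return aggregate(values) if values else 0.0
```

A tuple that fully matches the body of a rule but not its head at all gets violation 1.0 under these assumptions, and a tuple with no covering knowledge gets 0.0.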
9
The Mining Problem

Unexpectedness Support of r
Unexpectedness Confidence of r
Unexpectedness of r
Problem: find all data rules r whose Usup and Ustr
are above the specified thresholds.

10
The Mining Algorithm

Three Phases
Violation Phase
Rule Phase
Final Phase

11
Violation Phase

Compute and store v(t) for all tuples t in the
database T, pruning all t with v(t) = 0 to get a
new, pruned database T'
This prunes the data consistent with the user
knowledge and is very effective.
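
A minimal sketch of this phase, assuming the violation function v is supplied by the caller. The name violation_phase and the toy violation function below are hypothetical.

```python
def violation_phase(database, v):
    """Compute v(t) once per tuple and keep only the tuples with v(t) > 0.

    Returns a list of (tuple, v(t)) pairs so later phases can reuse the
    stored violation values instead of recomputing them.
    """
    pruned = []
    for t in database:
        vt = v(t)
        if vt > 0:
            pruned.append((t, vt))
    return pruned


# Toy data: the assumed violation is 0.5 for one donor group, 0 otherwise.
database = [{"Donor": "a"}, {"Donor": "b"}, {"Donor": "a"}]
pruned = violation_phase(database, lambda t: 0.5 if t["Donor"] == "a" else 0.0)
```

In this toy run the tuple for donor b is consistent with the assumed knowledge and is dropped, leaving two tuples for the later phases.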

12
Rule Phase

Generate all rules r with Usup(r) above the threshold,
using the pruned database T'
Usup(r) is anti-monotone:
Usup(r) does not increase as the body b(r) grows,
independent of the preference model and the violation
function v(t)
Any frequent itemset algorithm can be applied in
this phase
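
A minimal sketch of why anti-monotonicity holds, assuming Usup(r) is the sum of the stored violations v(t) over the tuples that satisfy r. That definition is an assumption (the slide's formula did not survive), as are the names satisfies and usup and the toy data.

```python
def satisfies(t, items):
    """True if tuple t agrees with every attribute-value pair in items."""
    return all(t.get(a) == v for a, v in items.items())


def usup(items, pruned):
    """Assumed Usup: total stored violation of the tuples satisfying items.

    pruned is a list of (tuple, v(t)) pairs with v(t) > 0.
    """
    return sum(vt for t, vt in pruned if satisfies(t, items))


pruned = [
    ({"A": 1, "B": 1}, 0.5),
    ({"A": 1, "B": 2}, 0.25),
    ({"A": 2, "B": 2}, 0.5),
]
shorter = usup({"A": 1}, pruned)
longer = usup({"A": 1, "B": 2}, pruned)
```

Adding a condition can only shrink the set of satisfying tuples, and every v(t) is non-negative, so the sum cannot grow. This is the same property that lets frequent itemset algorithms prune candidates level by level.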

13
Final Phase

Compute sup(r) and sup(b(r)) for the rules produced
in the rule phase
Output the rules r with Ustr(r) above the threshold.

14
The Selection Problem

Display a specified number k of rules to the
user, in order of unexpectedness
See-and-Know Assumption:
After seeing rules R, the user is interested only in
rules that are unexpected with respect to

15
The Selection Algorithm

At each step:
greedily select the most unexpected rule (until k
rules are selected or there is no rule left to select)
add the selected rule to the user knowledge
for each matching tuple, update the violation
value to reflect the new covering knowledge.
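
The steps above can be sketched as a greedy loop under simplifying assumptions: a rule's unexpectedness is taken to be the total current violation of its matching tuples, and "updating" a matching tuple is simplified to setting its violation to 0 once a rule covering it has been shown. Both simplifications, and the names satisfies and select_rules, are assumptions for illustration.

```python
def satisfies(t, items):
    """True if tuple t agrees with every attribute-value pair in items."""
    return all(t.get(a) == v for a, v in items.items())


def select_rules(rules, tuples, violations, k):
    """Greedily pick up to k rules, re-scoring after each pick."""
    v = list(violations)  # working copy; v[i] belongs to tuples[i]
    remaining = list(rules)
    selected = []

    def score(rule):
        # Assumed unexpectedness: total current violation of matching tuples.
        return sum(v[i] for i, t in enumerate(tuples) if satisfies(t, rule))

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        if score(best) <= 0:
            break  # nothing unexpected is left to show
        selected.append(best)
        remaining.remove(best)
        # Simplified update: tuples explained by the shown rule stop counting.
        for i, t in enumerate(tuples):
            if satisfies(t, best):
                v[i] = 0.0
    return selected


tuples = [{"A": 1, "B": 1}, {"A": 1, "B": 2}, {"A": 2, "B": 2}]
rules = [{"A": 1}, {"B": 2}, {"A": 1, "B": 1}]
chosen = select_rules(rules, tuples, [0.5, 0.25, 0.25], k=3)
```

In this toy run the rule on A and B together is never shown: once the rule on A alone has been presented, the tuples it covers no longer contribute, which is the effect the See-and-Know Assumption asks for.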

16
Experiment Dataset

KDD-CUP-98 Dataset
Target Attribute
NK97: donation amount in the 1997 campaign
Five scales, c0, c1, c2, c3, c4, in increasing
order.
23 non-target attributes
Their meanings are easier to understand than
those of the other attributes

17
User Knowledge

Observation: people's donation behavior tends to
remain unchanged
Four knowledge rules

18
Efficiency of Mining

Three Algorithms
UMINE(NULL): without user knowledge
UMINE-Unpruned: without tuple pruning
UMINE-Pruned: pruning those tuples with v(t) = 0

19
Interestingness of Rules

Ui(x,y): Ui covers x tuples with total violation y

20
Effectiveness of Selection

21
Conclusion

A new approach for finding interesting rules by
modeling user knowledge
Unexpectedness is measured by the violation of covering
knowledge by satisfying tuples
Model the human user as a dynamic entity in applying
knowledge and interpreting presented rules.
Push user knowledge into data preparation, mining,
and rule selection. This benefits both search and
quality.
