Title: Interactive Interaction Analysis
1 Interactive Interaction Analysis
- Aleks Jakulin, Gregor Leban
- Faculty of Computer and Information Science
- University of Ljubljana
- Slovenia
2 Overview
- Interactions
  - Correlation can be generalized to more than 2 attributes, to capture interactions: higher-order regularities.
- Information theory
  - A non-parametric approach for measuring association and uncertainty.
- Applications
  - Visualizations of the domain uncover previously unseen structure.
  - Software for interactive investigation of data assists the user in identifying interesting patterns.
- Importance
  - Understanding possible problems and assumptions in machine learning algorithms.
3 Attribute Dependencies
4 Shannon's Entropy
[Figure: diagram with attributes A and C]
5 Interaction Information
$$I(A;B;C) = I(A,B;C) - I(B;C) - I(A;C) = I(A;B \mid C) - I(A;B)$$
- Interaction information can be
  - POSITIVE: synergy between attributes
  - NEGATIVE: redundancy among attributes
  - SMALL: nothing special about the 3-way relationship
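The definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: it estimates all entropies from raw frequency counts, and the function names and the two toy data sets (`redundant`, `independent`) are hypothetical, chosen only to show the negative and near-zero cases.

```python
from collections import Counter
from math import log2

def entropy(rows, cols):
    """Shannon entropy (in bits) of the joint distribution of the given columns."""
    counts = Counter(tuple(r[c] for c in cols) for r in rows)
    n = len(rows)
    return -sum(k / n * log2(k / n) for k in counts.values())

def mutual_information(rows, xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for lists of column indices xs and ys."""
    return entropy(rows, xs) + entropy(rows, ys) - entropy(rows, xs + ys)

def interaction_information(rows, a, b, c):
    """I(A;B;C) = I(A,B;C) - I(A;C) - I(B;C)."""
    return (mutual_information(rows, [a, b], [c])
            - mutual_information(rows, [a], [c])
            - mutual_information(rows, [b], [c]))

# Hypothetical toy data; columns are (A, B, C).
# Redundancy: B is an exact copy of A, and C equals A.
redundant = [(0, 0, 0), (1, 1, 1)] * 2
# No relationship at all: every combination of values occurs equally often.
independent = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

print(interaction_information(redundant, 0, 1, 2))    # negative: redundancy
print(interaction_information(independent, 0, 1, 2))  # about zero: nothing special
```

Because the estimates come straight from counts, small samples will tend to show spurious nonzero interactions.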
6 Examples: A Useful Attribute
Mutual information or information gain between
the attribute and the label.
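As a rough sketch of this quantity, information gain can be computed as the drop in label entropy once the attribute value is known, H(label) - H(label | attribute). The helper names and the toy columns below are assumptions made for illustration; the values are not taken from the mushroom data shown on the slides.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(k / n * log2(k / n) for k in Counter(values).values())

def information_gain(attribute, label):
    """H(label) - H(label | attribute), in bits."""
    groups = defaultdict(list)
    for a, y in zip(attribute, label):
        groups[a].append(y)          # labels seen for each attribute value
    n = len(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional

# Hypothetical toy columns (illustrative values only).
label = ["poisonous", "poisonous", "edible", "edible"]
odor = ["foul", "foul", "none", "none"]                      # determines the label
stalk_shape = ["tapering", "enlarging", "tapering", "enlarging"]  # unrelated to it

print(information_gain(odor, label))         # a useful attribute
print(information_gain(stalk_shape, label))  # an uninformative attribute
```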
7 Another Useful Attribute
8 A Negative Interaction
The part of the information about the label that either of the two
attributes can provide on its own. This is the overlap
between the two mutual informations.
9 A Negative Interaction
The only column where spore-print-color succeeded
in providing some information in excess of what
we already knew from odor.
10 One Somewhat Useful Attribute
11 A (Seemingly) Useless Attribute
Stalk-shape is totally uninformative, as the
class distribution is similar at all attribute
values. That is why we cannot distinguish between
classes using this attribute alone.
12 Surprise: A Positive Interaction!
Information gained by treating both attributes
jointly! This is new mutual information with the
label that arises only when both attributes are
considered together.
13 Why a Positive Interaction?
Specific combinations of attribute values yield
perfect label predictions, but only when both
attributes are considered together.
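The textbook extreme of this effect is exclusive-or, which is used here as a stand-in and is not the mushroom example from the slides. In the sketch below (all names and data hypothetical), each attribute alone carries no information about the label, while the pair predicts it perfectly, so the interaction information is positive.

```python
from collections import Counter
from math import log2

def H(samples):
    """Shannon entropy (in bits) of a list of hashable samples."""
    n = len(samples)
    return -sum(k / n * log2(k / n) for k in Counter(samples).values())

def I(xs, ys):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return H(xs) + H(ys) - H(list(zip(xs, ys)))

# Hypothetical toy data: the label is the exclusive-or of attributes a and b.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
label = [x ^ y for x, y in zip(a, b)]

gain_a = I(a, label)                   # attribute a on its own
gain_b = I(b, label)                   # attribute b on its own
gain_ab = I(list(zip(a, b)), label)    # both attributes treated jointly
interaction = gain_ab - gain_a - gain_b

print(gain_a, gain_b, gain_ab, interaction)
```

Here the joint gain equals the full label entropy, and all of it is attributed to the interaction because neither attribute contributes anything individually.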
14 Whole Domain Interaction Matrix
15 Interaction Graph
16 An Interaction Dendrogram
17 Information Diagram
A dissected Venn diagram helps investigate
higher-order interactions.
18 Multi-Dimensional Scaling
19 Interactive Interaction Analysis
Attributes of interest
A sorted list of interactions, ordered by the
interaction magnitude.
An interaction graph
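One way such a sorted list could be produced is sketched below. This is an assumption about the general approach, not a description of the actual tool: it scores every attribute pair by its interaction information with the label, using the entropy expansion I(A;B;C) = H(AB) + H(AC) + H(BC) - H(A) - H(B) - H(C) - H(ABC), and sorts by absolute value. The function names and the toy attributes `p`, `q`, `r` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more equally long columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(k / n * log2(k / n) for k in Counter(rows).values())

def interaction(a, b, y):
    """Interaction information of attributes a, b and label y via entropies."""
    return (entropy(a, b) + entropy(a, y) + entropy(b, y)
            - entropy(a) - entropy(b) - entropy(y)
            - entropy(a, b, y))

def ranked_interactions(attributes, label):
    """List of (name1, name2, score), largest interaction magnitude first."""
    scores = [(n1, n2, interaction(attributes[n1], attributes[n2], label))
              for n1, n2 in combinations(attributes, 2)]
    return sorted(scores, key=lambda s: abs(s[2]), reverse=True)

# Hypothetical toy domain: the label depends on p and q jointly, r is noise.
rows = [(p, q, r) for p in (0, 1) for q in (0, 1) for r in (0, 1)]
attributes = {
    "p": [row[0] for row in rows],
    "q": [row[1] for row in rows],
    "r": [row[2] for row in rows],
}
label = [row[0] ^ row[1] for row in rows]

for name1, name2, score in ranked_interactions(attributes, label):
    print(f"{name1} x {name2}: {score:+.3f} bits")
```

Sorting by magnitude keeps strong redundancies (negative scores) near the top alongside strong synergies, which seems to match the slide's "ordered by the interaction magnitude".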
20 Summary
- There are relationships exclusive to groups of n attributes.
- Interaction information is a heuristic for quantification of relationships with entropy.
- Visualization methods attempt to
  - summarize the interactions in the domain (interaction graph, interaction dendrogram),
  - assist the user in exploring the domain and constructing classification models (interactive interaction analysis).
21 Work in Progress
- Overfitting: the interaction information computations do not account for the increase in complexity.
- Support for numerical and ordered attributes.
- Inductive learning algorithms which use these heuristics automatically.
- Models that are based on the real relationships in the data, not on our assumptions about them.