Title: Learning Bayesian Metanetworks from Data with Multilevel Uncertainty
- Vagan Terziyan, Oleksandra Vitko
- vagan_at_it.jyu.fi, vitko_at_kture.kharkov.ua
- University of Jyväskylä,
- Kharkov National University of Radioelectronics
AIAI-2004 (WCC 2004), Toulouse, France, 24 August 2004
2Contents
- Bayesian Metanetworks
- Metanetworks for managing conditional dependencies
- Metanetworks for managing feature relevance
- Learning Bayesian Metanetworks from Data
- Conclusions
Oleksandra Vitko, Department of Artificial Intelligence, Kharkov National University of Radioelectronics (Ukraine), http://www.cs.jyu.fi/ai/oleksandra
Vagan Terziyan, Industrial Ontologies Group, Department of Mathematical Information Technologies, University of Jyväskylä (Finland), http://www.cs.jyu.fi/ai/vagan
This presentation: http://www.cs.jyu.fi/ai/AIAI-2004.ppt
3Bayesian Metanetworks
4Bayesian Metanetwork
- Definition. A Bayesian Metanetwork is a set of Bayesian networks layered on top of each other in such a way that the elements (nodes or conditional dependencies) of each network depend on the local probability distributions associated with the nodes of the network on the next level.
5Two-level Bayesian C-Metanetwork for Managing
Conditional Dependencies
6Contextual and Predictive Attributes
[Figure: a machine and its environment observed by sensors. Labels: air pressure, dust, humidity, temperature, emission. The attribute vector X = (x1, ..., x7) is divided into contextual attributes and predictive attributes.]
7Contextual Effect on Conditional Probability (1)
[Figure: attribute vector X = (x1, ..., x7) divided into contextual attributes and predictive attributes; contextual attribute xt acts on the dependence between predictive attributes xk and xr.]
Assume a conditional dependence between predictive attributes (a causal relation between physical quantities).
Some contextual attribute may directly affect the conditional dependence between predictive attributes, but not the attributes themselves.
8Contextual Effect on Conditional Probability (2)
- X = {x1, x2, ..., xn}: predictive attribute with n values
- Z = {z1, z2, ..., zq}: contextual attribute with q values
- P(Y|X) = {p1(Y|X), p2(Y|X), ..., pr(Y|X)}: conditional dependence attribute (random variable) between X and Y with r possible values
- P(P(Y|X)|Z): conditional dependence between attribute Z and attribute P(Y|X)
9Contextual Effect on Conditional Probability (3)
Xt: Xt1 = "I am in Paris", Xt2 = "I am in Moscow"
Xk ("Order present"): Xk1 = "order flowers", Xk2 = "order wine"
Xr ("Make a visit"): Xr1 = "visit football match", Xr2 = "visit girlfriend"
P1(Xr|Xk)   Xk1   Xk2
Xr1         0.3   0.9
Xr2         0.4   0.5
P2(Xr|Xk)   Xk1   Xk2
Xr1         0.1   0.2
Xr2         0.8   0.7
10Contextual Effect on Conditional Probability (4)
Xt: Xt1 = "I am in Paris", Xt2 = "I am in Moscow"
P(P(Xr|Xk)|Xt)   Xt1   Xt2
P1(Xr|Xk)        0.7   0.2
P2(Xr|Xk)        0.3   0.8
P1(Xr|Xk)   Xk1   Xk2
Xr1         0.3   0.9
Xr2         0.4   0.5
P2(Xr|Xk)   Xk1   Xk2
Xr1         0.1   0.2
Xr2         0.8   0.7
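The two-level tables above can be combined into a single context-specific table by weighting each candidate table with its probability in the given context. The following is a minimal sketch of that arithmetic, not the authors' implementation. The table entries are copied from the slides as printed and are not renormalized; the names `P1`, `P2`, `TABLE_WEIGHTS` and `effective_table` are illustrative assumptions.

```python
# First-level tables P1(Xr|Xk) and P2(Xr|Xk), copied from the slides as printed
# (rows: Xr1, Xr2; columns: Xk1, Xk2). They are not renormalized here.
P1 = [[0.3, 0.9], [0.4, 0.5]]
P2 = [[0.1, 0.2], [0.8, 0.7]]

# Second-level table P(P(Xr|Xk) | Xt): weights of P1 and P2 in each context.
TABLE_WEIGHTS = {"Paris": (0.7, 0.3), "Moscow": (0.2, 0.8)}


def effective_table(context):
    """Mix the candidate tables using the weights of the given context."""
    w1, w2 = TABLE_WEIGHTS[context]
    return [
        [w1 * a + w2 * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(P1, P2)
    ]


if __name__ == "__main__":
    for context in TABLE_WEIGHTS:
        print(context, effective_table(context))
```

Each entry of the result is a weighted average of the corresponding entries of the two candidate tables, so the context only changes which table dominates, not the attributes themselves.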
11Contextual Effect on Unconditional Probability (1)
[Figure: attribute vector X = (x1, ..., x7) divided into contextual attributes and predictive attributes; contextual attribute xt acts on the distribution P(X) over the values x1, ..., x4 of predictive attribute xk.]
Assume some predictive attribute is a random variable with an appropriate probability distribution for its values.
Some contextual attribute may directly affect the probability distribution of the predictive attribute.
12Contextual Effect on Unconditional Probability (2)
- X = {x1, x2, ..., xn}: predictive attribute with n values
- Z = {z1, z2, ..., zq}: contextual attribute with q values, and P(Z): probability distribution for values of Z
- P(X) = {p1(X), p2(X), ..., pr(X)}: probability distribution attribute for X (random variable) with r possible values (different possible probability distributions for X), and P(P(X)): probability distribution for values of attribute P(X)
- P(Y|X) is a conditional probability distribution of Y given X
- P(P(X)|Z) is a conditional probability distribution for attribute P(X) given Z
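The definitions above chain together: the context Z selects among candidate distributions for X, and X in turn determines Y. A minimal sketch of that chain follows, assuming the standard law of total probability at each step. All numeric values and dictionary names are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
# Hypothetical inputs (not from the slides).
P_Z = {"z1": 0.5, "z2": 0.5}                      # P(Z)
CANDIDATES = {                                    # values p1(X), p2(X) of attribute P(X)
    "p1": {"x1": 0.7, "x2": 0.3},
    "p2": {"x1": 0.2, "x2": 0.8},
}
P_CAND_GIVEN_Z = {                                # P(P(X) | Z)
    "z1": {"p1": 0.4, "p2": 0.6},
    "z2": {"p1": 0.9, "p2": 0.1},
}
P_Y_GIVEN_X = {                                   # P(Y | X)
    "x1": {"y1": 0.9, "y2": 0.1},
    "x2": {"y1": 0.2, "y2": 0.8},
}


def marginal_candidates():
    """P(P(X) = p_i) = sum over z of P(p_i | z) * P(z)."""
    return {
        name: sum(P_CAND_GIVEN_Z[z][name] * pz for z, pz in P_Z.items())
        for name in CANDIDATES
    }


def marginal_x():
    """P(x) = sum over i of P(p_i) * p_i(x)."""
    weights = marginal_candidates()
    values = next(iter(CANDIDATES.values())).keys()
    return {
        x: sum(weights[name] * dist[x] for name, dist in CANDIDATES.items())
        for x in values
    }


def marginal_y():
    """P(y) = sum over x of P(y | x) * P(x)."""
    px = marginal_x()
    values = next(iter(P_Y_GIVEN_X.values())).keys()
    return {y: sum(P_Y_GIVEN_X[x][y] * p for x, p in px.items()) for y in values}


if __name__ == "__main__":
    print(marginal_candidates())
    print(marginal_x())
    print(marginal_y())
```

If the context is observed rather than uncertain, the same functions apply with `P_Z` set to a distribution that puts all its mass on the observed value.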
13Contextual Effect on Unconditional Probability (3)
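P(P(Xk)|Xt)   Xt1   Xt2
P1(Xk)        0.4   0.9
P2(Xk)        0.6   0.1
Xt: Xt1 = "I am in Paris", Xt2 = "I am in Moscow"
Xk ("Order present"): Xk1 = "order flowers", Xk2 = "order wine"
[Figure: bar charts of the two candidate distributions P1(Xk) and P2(Xk) over the values Xk1 and Xk2; bar labels shown: 0.7, 0.5, 0.3, 0.2.]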
14Causal Relation between Conditional Probabilities
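There might be a causal relationship between two pairs of conditional probabilities.
[Figure: first-level dependencies between xm and xn and between xk and xr. P(Xn|Xm) takes the values P1(Xn|Xm), P2(Xn|Xm), P3(Xn|Xm) with distribution P(P(Xn|Xm)); P(Xr|Xk) takes the values P1(Xr|Xk), P2(Xr|Xk) with distribution P(P(Xr|Xk)). The two are linked on the second level by P(P(Xr|Xk)|P(Xn|Xm)).]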
15Two-level Bayesian C-Metanetwork for managing
conditional dependencies
16Example of Bayesian C-Metanetwork
The nodes of the 2nd-level network correspond to the conditional probabilities of the 1st-level network, P(B|A) and P(Y|X). The arc in the 2nd-level network corresponds to the conditional probability P(P(Y|X)|P(B|A)).
17Two-level Bayesian R-Metanetwork for Modelling
Relevant Features Selection
18Feature relevance modelling (1)
We consider relevance as the probability that a variable is important for the inference of the target attribute in the given context. Under this definition, relevance inherits all the properties of a probability.
19Feature relevance modelling (2)
20General Case of Managing Relevance (1)
Predictive attributes:
X1 with values {x11, x12, ..., x1nx1}
X2 with values {x21, x22, ..., x2nx2}
...
XN with values {xn1, xn2, ..., xnnxn}
Target attribute: Y with values {y1, y2, ..., yny}.
Probabilities: P(X1), P(X2), ..., P(XN); P(Y|X1, X2, ..., XN).
Relevancies:
ψX1 = P(ψ(X1) = "yes")
ψX2 = P(ψ(X2) = "yes")
...
ψXN = P(ψ(XN) = "yes")
Goal: to estimate P(Y).
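One way to make this setting concrete is sketched below. It rests on two assumptions that are not stated on the slide: an attribute judged irrelevant is replaced by a uniform distribution over its values, and the predictive attributes are treated as independent of each other. With relevance 1 the attribute's own distribution is used unchanged; with relevance 0 its distribution is ignored. The function names and the example numbers are hypothetical.

```python
from itertools import product


def effective_prior(prior, relevance):
    """Blend P(X) with a uniform distribution according to the relevance of X.

    Assumption: an irrelevant attribute contributes a uniform distribution.
    """
    n = len(prior)
    return {v: relevance * p + (1.0 - relevance) / n for v, p in prior.items()}


def estimate_target(priors, relevances, cpt, target_values):
    """Estimate P(Y) from P(Xi), the relevance of each Xi, and P(Y | X1..XN).

    priors:      list of dicts, value -> probability, one per attribute
    relevances:  list of floats in [0, 1], one per attribute
    cpt:         dict mapping a tuple of attribute values to a dict y -> P(y | values)
    """
    mixed = [effective_prior(p, r) for p, r in zip(priors, relevances)]
    result = {y: 0.0 for y in target_values}
    for combo in product(*(m.keys() for m in mixed)):
        weight = 1.0
        for dist, value in zip(mixed, combo):
            weight *= dist[value]          # independence assumption
        for y in target_values:
            result[y] += weight * cpt[combo][y]
    return result


if __name__ == "__main__":
    priors = [{"a": 0.8, "b": 0.2}]
    cpt = {("a",): {"y1": 0.9, "y2": 0.1}, ("b",): {"y1": 0.1, "y2": 0.9}}
    for relevance in (1.0, 0.5, 0.0):
        print(relevance, estimate_target(priors, [relevance], cpt, ["y1", "y2"]))
```

In the single-attribute example, lowering the relevance moves the estimate of P(Y) from the ordinary marginal towards the plain average of the rows of P(Y|X).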
21General Case of Managing Relevance (2)
Probability P(XN)
22Example of Relevance Bayesian Metanetwork (1)
Conditional relevance !!!
23Example of Relevance Bayesian Metanetwork (2)
24Learning Bayesian Metanetworks from Data
25Learning Task
- Given a training set D of training examples <X1, X2, ..., Xn, Y>
- Goal is to restore
  - the set of levels of the Bayesian Metanetwork l1, l2, ..., lL, where each level is a Bayesian network
  - the interlevel links for each pair of successive levels lr, lr+1
  - the network structure and parameters at each level, particularly the probabilities P(vi) and P(vi|parents(vi)) for each variable vi.
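The three things to be restored (levels, interlevel links, and per-level structure and parameters) suggest a simple container for the learned model. The sketch below is one possible representation, not a data structure taken from the presentation; the class and field names (`Level`, `InterlevelLink`, `Metanetwork`, `links_into`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Level:
    """One level of the metanetwork: an ordinary Bayesian network."""
    name: str
    parents: dict                                # variable -> tuple of parent variables
    cpts: dict = field(default_factory=dict)     # variable -> probability table


@dataclass
class InterlevelLink:
    """Ties a variable of level r+1 to an element (node or arc) of level r."""
    lower_level: int        # index r of the level being controlled
    upper_variable: str     # variable on level r+1
    lower_element: tuple    # ("node", "X") or ("arc", "X", "Y")


@dataclass
class Metanetwork:
    levels: list
    links: list = field(default_factory=list)

    def links_into(self, level_index):
        """Return the interlevel links that control the given level."""
        return [link for link in self.links if link.lower_level == level_index]


if __name__ == "__main__":
    predictive = Level("predictive", parents={"X": (), "Y": ("X",)})
    contextual = Level("contextual", parents={"Z": ()})
    net = Metanetwork(
        levels=[predictive, contextual],
        links=[InterlevelLink(0, "Z", ("arc", "X", "Y"))],
    )
    print(net.links_into(0))
```

Keeping the links separate from the levels means that each level can still be handed to an ordinary Bayesian network learner unchanged.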
26Learning Bayesian Metanetwork
- Use well-known learning methods for learning the component Bayesian networks on each level of the Metanetwork
- Add procedures for learning interlevel relationships for the case of multilevel probabilistic Metanetworks
27Learning Process
Stage 1. Division of attributes among the levels
Stage 2. Learning the network structure
Stage 3. Learning the interlevel links to the subsequent level
Stage 4. Learning the network parameters over all levels of the Metanetwork
28Stage 1.
- Division of attributes among the levels
- The task of this stage is to divide the input vector of attributes <X1, X2, ..., Xn> into predictive, contextual and perhaps metacontextual attributes.
[Figure: attribute vector X = (x1, ..., x7) divided into contextual attributes and predictive attributes.]
29Stage 2.
- Learning the network structure at the current level of the Metanetwork
- can be done by well-known methods with good performance (Cheng-Greiner method, KA2 algorithm, etc.)
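Dependency-analysis learners such as the Cheng-Greiner method decide which arcs to keep by testing how much information one attribute carries about another. The sketch below shows only that building block, the empirical mutual information between two discrete attributes; it is not a full structure learner, and the function name `mutual_information` is an assumption made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    total = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        total += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return total


if __name__ == "__main__":
    print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # fully dependent columns
    print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # independent columns
```

A structure learner would compare such scores against a threshold, and would also need conditional versions of the test; both are left out here.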
30Stage 3.
- Learning the interlevel links between the current and subsequent levels
- This is a new stage that has been added specifically for Bayesian Metanetwork learning.
- It differs for the C-Metanetwork and for the R-Metanetwork.
31Learning Interlevel Links in C-Metanetwork
. . .
32Different probability tables corresponding to different contexts are associated with vertices of the second-level Bayesian network
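A simple way to obtain one probability table per context is to split the training examples by the value of a contextual attribute and estimate the table separately in each part. The sketch below does this with plain relative frequencies. It is an illustrative assumption about how such tables could be estimated, not the procedure used by the authors; the function name and the column names (`city`, `present`, `visit`) are hypothetical.

```python
from collections import Counter, defaultdict


def cpt_by_context(rows, parent, child, context):
    """Estimate P(child | parent) separately for every value of the context.

    rows is a list of dicts; the result has the shape
    {context_value: {parent_value: {child_value: probability}}}.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for row in rows:
        counts[row[context]][row[parent]][row[child]] += 1
    tables = {}
    for z, by_parent in counts.items():
        tables[z] = {}
        for x, child_counts in by_parent.items():
            total = sum(child_counts.values())
            tables[z][x] = {y: c / total for y, c in child_counts.items()}
    return tables


if __name__ == "__main__":
    rows = (
        [{"city": "Paris", "present": "flowers", "visit": "girlfriend"}] * 3
        + [{"city": "Paris", "present": "flowers", "visit": "football"}]
        + [{"city": "Moscow", "present": "flowers", "visit": "girlfriend"}]
        + [{"city": "Moscow", "present": "flowers", "visit": "football"}] * 3
    )
    print(cpt_by_context(rows, parent="present", child="visit", context="city"))
```

If the tables estimated for different context values differ noticeably, each distinct table can serve as one value of the corresponding second-level vertex; if they coincide, the context has no effect on that dependence.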
33Context variables in C-Metanetwork
context random variable U
context random variable W
P(W|U)
34Learning Interlevel Links in R-Metanetwork
. . .
35Different relevancies corresponding to different contexts are associated with vertices of the second-level Bayesian network
36Context variables in R-Metanetwork
context random variable U
context random variable W
P(W|U)
37Stage 4.
- Learning the parameters in the network at the current level
- is done by the standard procedure, taking into account the dynamics of parameter values in different contexts
38When Bayesian Metanetworks ?
- A Bayesian Metanetwork can be considered a powerful tool in cases where the structure (or strengths) of causal relationships between observed parameters of an object essentially depends on context (e.g. external environment parameters)
- It can also be considered a useful model for an object whose diagnosis depends on a different set of observed parameters depending on the context.
39Conclusions
- The main challenge of this work is the extension of the standard Bayesian learning procedures with the algorithm for learning the interlevel links
- Experiments on data from a highly contextual domain have shown the effectiveness of the proposed models and learning procedures
40Read more about Bayesian Metanetworks in
Terziyan V., A Bayesian Metanetwork, In: International Journal on Artificial Intelligence Tools, Vol. 14, Ns. 3-4, World Scientific (to appear). http://www.cs.jyu.fi/ai/IJAIT-2003.doc
Terziyan V., Vitko O., Bayesian Metanetwork for Modelling User Preferences in Mobile Environment, In: German Conference on Artificial Intelligence (KI-2003), Hamburg, Germany, September 15-18, 2003. http://www.cs.jyu.fi/ai/papers/KI-2003.pdf
Vitko O., The Multilevel Probabilistic Networks for Modelling Complex Information Systems under Uncertainty, Ph.D. Thesis, Kharkov National University of Radioelectronics, 2003.