Title: Learning Maximum Likelihood Bounded Semi-Naïve Bayesian Network Classifiers
1. Learning Maximum Likelihood Bounded Semi-Naïve Bayesian Network Classifiers
Huang, Kaizhu, Sept. 25, 2002
2. Outline
- Background
- Classifier
- Naïve Bayesian classifiers
- Bounded Semi-Naïve Bayes classifier
- Comparison with other traditional Semi-Naïve Bayes classifiers
- Experimental results
- Conclusion
3. Background
- Classifier
- Given a pre-classified dataset $D = \{(x_i, c_i)\}$, where $x_i \in \mathbb{R}^m$ is the training data in m-dimensional real space and $c_i$ is the class label.
- A classifier is defined as a mapping function $f: \mathbb{R}^m \to C$ that satisfies $f(x_i) = c_i$.
4. Background
- Probabilistic classifiers
- The classification mapping function is defined as $f(x) = \arg\max_{c} P(c \mid x)$.
However, the posterior probability is hard to estimate directly from the dataset, so some assumptions about the distribution have to be made.
5. Background
- Naïve Bayes classifier
- Assumption: given the class label $C$, the attributes are independent.
- This gives the classification mapping function $f(x) = \arg\max_{c} P(c) \prod_{j} P(a_j \mid c)$.
- $a_j$ is the value of attribute $A_j$ in the example.
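The Naïve Bayes mapping function above can be sketched in a few lines of Python for discrete attributes. This is a minimal illustration, not the author's implementation: the function names `train_nb` and `predict_nb`, the toy weather data, and the Laplace smoothing parameter `alpha` are all assumptions introduced here (the slides do not mention smoothing).

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Count statistics for P(C) and P(A_j = a | C) on discrete data.

    alpha is a Laplace smoothing constant (an assumption, not from the slides).
    """
    class_counts = Counter(labels)
    # value_counts[(j, c)][a] = number of class-c examples with A_j = a
    value_counts = defaultdict(Counter)
    domains = [set() for _ in rows[0]]
    for row, c in zip(rows, labels):
        for j, a in enumerate(row):
            value_counts[(j, c)][a] += 1
            domains[j].add(a)
    return class_counts, value_counts, domains, alpha

def predict_nb(model, row):
    """Return argmax_c  log P(c) + sum_j log P(a_j | c)."""
    class_counts, value_counts, domains, alpha = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, a in enumerate(row):
            numerator = value_counts[(j, c)][a] + alpha
            denominator = n_c + alpha * len(domains[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy dataset: (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "hot")))   # no
print(predict_nb(model, ("rainy", "cool")))  # yes
```

Working in log space avoids numerical underflow when the product runs over many attributes.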
6. Background
- Naïve Bayes classifier
- The NBC's performance is comparable with some state-of-the-art classifiers, even though its independence assumption does not hold in most practical cases.
Question: Can the performance be improved when the conditional independence assumption of the NBC is relaxed?
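7. Bounded Semi-Naïve Bayes classifier
- Relax the conditional independence assumption on single attributes into an assumption on large attributes: attributes combined into one large attribute may depend on each other, while the large attributes are assumed conditionally independent given the class.
Question: How do we find the optimal combination?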
8. Constraining the search space
- The number of attribute combinations that satisfy the B-SNB conditions is very large, so the search space must be reduced.
- We reduce the search space by adding the constraint that the cardinality of each large attribute is exactly equal to K.
- Underlying principle
- When K is small, a single large attribute of cardinality K is more accurate than the same attributes split into several smaller large attributes.
- For example, P(a,b)P(c,d) is closer to P(a,b,c,d) than P(a,b)P(c)P(d) is.
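The claim that P(a,b)P(c,d) is closer to P(a,b,c,d) than P(a,b)P(c)P(d) can be checked numerically if "closer" is read as Kullback-Leibler divergence; that reading is an assumption made here, as the slide does not name a distance. Under it, the gap between the two divergences equals the mutual information between c and d, which is never negative. The sketch below uses a random joint distribution over four binary variables; the helper names are hypothetical.

```python
import itertools
import math
import random

def marginal(joint, keep):
    """Marginalize a joint table {(a, b, c, d): p} onto the indices in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def kl_to_factorization(joint, groups):
    """KL(P || prod_g P(x_g)) for a partition `groups` of the variable indices."""
    margs = [marginal(joint, g) for g in groups]
    kl = 0.0
    for assignment, p in joint.items():
        if p == 0.0:
            continue
        q = 1.0
        for g, m in zip(groups, margs):
            q *= m[tuple(assignment[i] for i in g)]
        kl += p * math.log(p / q)
    return kl

# Random joint distribution over four binary variables (a, b, c, d).
random.seed(0)
states = list(itertools.product([0, 1], repeat=4))
weights = [random.random() for _ in states]
total = sum(weights)
joint = {s: w / total for s, w in zip(states, weights)}

kl_pairs = kl_to_factorization(joint, [(0, 1), (2, 3)])     # P(a,b)P(c,d)
kl_split = kl_to_factorization(joint, [(0, 1), (2,), (3,)])  # P(a,b)P(c)P(d)
print(kl_pairs, kl_split)
```

This comparison uses the true joint distribution. With finite training data, the estimate of a large attribute's table becomes less reliable as K grows, which is consistent with the slide restricting the claim to small K.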
9. Constraining the search space
- Searching for the K-Bounded-SNB model
- Find the $m = [n/K]$ K-cardinality subsets of the attribute (feature) set that satisfy the SNB conditions and maximize the log likelihood (3).
- $[x]$ means rounding $x$ to the nearest integer.
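To make the search problem concrete, the sketch below enumerates every partition of the attributes into groups of exactly K and keeps the one with the highest training log likelihood. It is an exhaustive search for illustration only, not the integer programming approach of the slides, and it assumes discrete attributes, maximum likelihood estimates, and n divisible by K. The names `k_partitions` and `log_likelihood` and the toy data are hypothetical.

```python
import itertools
import math
from collections import Counter

def k_partitions(items, k):
    """Yield every partition of `items` into disjoint groups of exactly k elements.

    Yields nothing if len(items) is not a multiple of k.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for combo in itertools.combinations(rest, k - 1):
        group = (first,) + combo
        remaining = [x for x in rest if x not in combo]
        for tail in k_partitions(remaining, k):
            yield [group] + tail

def log_likelihood(rows, labels, groups):
    """Training log likelihood of P(c) * prod_g P(a_g | c) under ML estimates."""
    n = len(rows)
    class_counts = Counter(labels)
    ll = sum(cnt * math.log(cnt / n) for cnt in class_counts.values())
    for g in groups:
        joint_counts = Counter(
            (c, tuple(row[j] for j in g)) for row, c in zip(rows, labels)
        )
        ll += sum(
            cnt * math.log(cnt / class_counts[c])
            for (c, _), cnt in joint_counts.items()
        )
    return ll

# Toy data: attribute 2 copies attribute 0, attribute 3 copies attribute 1.
rows = [(a, b, a, b) for a in (0, 1) for b in (0, 1)]
labels = [row[0] for row in rows]

candidates = list(k_partitions([0, 1, 2, 3], 2))
best = max(candidates, key=lambda groups: log_likelihood(rows, labels, groups))
print(len(candidates), best)  # 3 [(0, 2), (1, 3)]
```

The number of candidate partitions is n! / ((K!)^m m!), which grows quickly with n; that growth is what motivates replacing enumeration with an optimization formulation.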
10. Transforming into an Integer Programming (IP) Problem
Model definition
If we relax (6) into $0 \le x \le 1$, the IP is transformed into a linear programming problem, which can be solved in polynomial time.
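The model equations did not survive in this copy of the slides. As a rough guide only, a search of this kind is commonly written as a set-partitioning program. The formulation below is an assumed sketch, not the author's model: $V$ ranges over the K-cardinality attribute subsets, $w_V$ is a hypothetical weight standing for the log-likelihood contribution of $V$, and $x_V$ indicates whether $V$ is chosen as a large attribute.

```latex
\begin{align*}
\max_{x} \quad & \sum_{V} w_V \, x_V \\
\text{s.t.} \quad & \sum_{V \ni A_j} x_V = 1 \quad \text{for every attribute } A_j, \\
& x_V \in \{0, 1\} \quad \text{for every subset } V.
\end{align*}
```

Under this assumed form, relaxing the binary constraint to $0 \le x_V \le 1$ gives a linear program. An LP solution of this type can in general be fractional, so a rounding step may be needed to recover a valid partition.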
11. Comparison with related work
12. Experimental results
Recognition rates on datasets from the UCI Machine Learning Repository
13. Experimental results
14. Conclusion
- A novel bounded Semi-Naïve Bayesian classifier is proposed, with the following important properties:
- It is the first algorithm for SNB with a global nature.
- It has a polynomial time cost.
- Its overall performance is significantly better than that of the NBC.
15. Main References
- I. Kononenko. Semi-naive Bayesian classifier. In Proceedings of the Sixth European Working Session on Learning, pages 206-219. Springer-Verlag, 1991.
- M. J. Pazzani. Searching dependency in Bayesian classifiers. In D. Fisher and H.-J. Lenz, editors, Learning from Data: Artificial Intelligence and Statistics V, pages 239-248. New York, NY: Springer-Verlag, 1996.
- Nathan Srebro. Maximum likelihood bounded tree-width Markov networks. Master's thesis, MIT, 2001.
- Patrick M. Murphy. UCI repository of machine learning databases. ftp.ics.uci.edu: pub/machine-learning-databases. http://www.ics.uci.edu/~mlearn/MLRepository.html.