Title: Learning Maximum Likelihood Bounded Semi-Naïve Bayesian network classifiers
Learning Maximum Likelihood Bounded Semi-Naïve
Bayesian network classifiers
Huang, Kaizhu, Sept. 25, 2002
- Background
- Classifier
- Naïve Bayesian Classifiers
- Bounded Semi-Naïve Bayes classifier
- Comparison with other traditional Semi-Naïve Bayes
- Experimental results
- Conclusion
- Classifier
- Given a pre-classified dataset D, where each training sample is a point in m-dimensional real space and is paired with a class label.
- A classifier is defined as a mapping function from the m-dimensional sample space to the set of class labels.
- Probabilistic classifiers
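- The classification mapping function is defined through the posterior probability of the class given the attributes: the class with the highest posterior is chosen.
- However, the posterior probability is not easy to estimate from the dataset, so some assumptions about the distribution have to be made.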
- Naïve Bayes Classifier
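- Assumption: given the class label C, the attributes are independent.
- It gives the classification mapping function as the class C that maximizes P(C) times the product, over all attributes Aj, of P(Aj | C),
- where each term is evaluated at the value of attribute Aj in the sample being classified.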
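The naive Bayes classification rule can be sketched in a few lines of Python. This is a minimal illustration, assuming discrete attributes and Laplace smoothing; the function names, the toy data, and the parameters `alpha` and `n_values` are hypothetical and do not come from the slides.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(j, c)][v] = number of class-c samples whose attribute j equals v
    value_counts = defaultdict(Counter)
    for x, c in zip(samples, labels):
        for j, v in enumerate(x):
            value_counts[(j, c)][v] += 1
    return class_counts, value_counts

def predict_naive_bayes(model, x, alpha=1.0, n_values=2):
    """Return the class maximizing log P(C) + sum_j log P(A_j = a_j | C)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, v in enumerate(x):
            # Laplace smoothing (an added assumption) avoids log(0)
            p = (value_counts[(j, c)][v] + alpha) / (n_c + alpha * n_values)
            score += math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy data with two binary attributes (illustrative only)
samples = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0)]
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]
model = train_naive_bayes(samples, labels)
print(predict_naive_bayes(model, (1, 1)))
```

Working in log space keeps the product of many small probabilities from underflowing; the ranking of the classes is unchanged.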
- Naïve Bayes classifier
- NBC's performance is comparable with some state-of-the-art classifiers even when its independence assumption does not hold in normal cases.
- Question: Can the performance be better when the conditional independence assumption of NBC is relaxed?
Bounded Semi-Naïve Bayes classifier
- Relax the conditional independence assumption: attributes are combined into large attributes, and conditional dependency is allowed among the attributes inside each large attribute.
- Question: How to find the optimal combination?
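To make the idea of large attributes concrete, the sketch below scores a class under a given grouping of attributes, treating each group as one joint attribute. It assumes binary attributes and Laplace smoothing; `snb_log_score`, `snb_predict`, the `groups` argument, and the XOR-style toy data are hypothetical names and data chosen for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def snb_log_score(samples, labels, groups, x, c, alpha=1.0):
    """log P(c) + sum over large attributes of log P(joint value | c).

    groups is a partition of attribute indices, e.g. [(0, 1), (2,)].
    Binary attributes and Laplace smoothing (alpha) are assumed.
    """
    in_class = [s for s, label in zip(samples, labels) if label == c]
    score = math.log(len(in_class) / len(samples))
    for group in groups:
        # Treat the tuple of values in this group as one joint ("large") attribute
        joint_counts = Counter(tuple(s[j] for j in group) for s in in_class)
        value = tuple(x[j] for j in group)
        n_joint_values = 2 ** len(group)
        p = (joint_counts[value] + alpha) / (len(in_class) + alpha * n_joint_values)
        score += math.log(p)
    return score

def snb_predict(samples, labels, groups, x):
    """Pick the class with the highest score under the given grouping."""
    classes = sorted(set(labels))
    return max(classes, key=lambda c: snb_log_score(samples, labels, groups, x, c))

# Toy data: the label is the XOR of two binary attributes (illustrative only)
samples = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2
labels = ["even", "odd", "odd", "even"] * 2
print(snb_predict(samples, labels, [(0, 1)], (0, 1)))
```

On this toy data every single-attribute conditional probability is 0.5, so the fully factorized grouping `[(0,), (1,)]` cannot separate the classes, while the single large attribute `[(0, 1)]` can.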
Constraining the search space
- The total number of combinations that satisfy the B-SNB conditions is so large that we need to do some reduction.
- We reduce the search space by adding the constraint that
- The cardinality of each large attribute is exactly equal to K
- Hidden principle
- When K is small, a K-cardinality large attribute will be more accurate than separating it into several large attributes.
- P(a,b)P(c,d) is closer to P(a,b,c,d) than a factorization into more, smaller attributes is.
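The hidden principle can be checked numerically. The sketch below measures closeness with the Kullback-Leibler divergence, which is an assumption since the slide does not name a measure, and compares P(a,b)P(c,d) against the fully factorized P(a)P(b)P(c)P(d), which is also assumed because the slide's comparison target is cut off. The random joint distribution and all function names are illustrative.

```python
import itertools
import math
import random

def marginal(joint, keep):
    """Marginalize a joint table {(a, b, c, d): p} onto the index positions in keep."""
    out = {}
    for values, p in joint.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def kl_to_factorization(joint, groups):
    """KL(P || product of group marginals) for a partition of the variables."""
    marginals = [(g, marginal(joint, g)) for g in groups]
    kl = 0.0
    for values, p in joint.items():
        q = 1.0
        for g, table in marginals:
            q *= table[tuple(values[i] for i in g)]
        kl += p * math.log(p / q)
    return kl

# Random strictly positive joint distribution over four binary variables
random.seed(0)
weights = [random.random() for _ in range(16)]
total = sum(weights)
joint = {v: w / total for v, w in zip(itertools.product([0, 1], repeat=4), weights)}

kl_pairs = kl_to_factorization(joint, [(0, 1), (2, 3)])
kl_singles = kl_to_factorization(joint, [(0,), (1,), (2,), (3,)])
print(kl_pairs <= kl_singles)
```

The inequality holds for any joint distribution: the divergence to the fully factorized form equals the divergence to the paired form plus the mutual information inside each pair, and mutual information is never negative.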
Constraining the search space
- Searching the K-Bounded-SNB model
- Finding the m = [n/K] K-cardinality subsets of the attribute (feature) set which satisfy the SNB conditions and maximize the log likelihood (3).
- [x] means rounding x to the nearest integer
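For a handful of attributes, the search can be done by brute force, which also shows why the space has to be constrained. The sketch below enumerates every partition of the attributes into subsets of size exactly K and keeps the one with the highest training log likelihood under maximum likelihood (empirical frequency) estimates. It assumes the number of attributes is divisible by K, so the rounding case is not handled, and all names and the toy data are hypothetical. It is an exhaustive baseline, not the method proposed in the slides.

```python
import itertools
import math
from collections import Counter

def partitions_into_k_subsets(items, k):
    """Yield every partition of items into unordered subsets of size exactly k."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for others in itertools.combinations(rest, k - 1):
        remaining = [i for i in rest if i not in others]
        for tail in partitions_into_k_subsets(remaining, k):
            yield [(first,) + others] + tail

def log_likelihood(samples, labels, groups):
    """Training log likelihood of a grouping, using empirical frequencies."""
    n = len(samples)
    class_counts = Counter(labels)
    ll = sum(cnt * math.log(cnt / n) for cnt in class_counts.values())
    for group in groups:
        # Count each (class, joint value of the group) pair
        joint = Counter((c, tuple(s[j] for j in group)) for s, c in zip(samples, labels))
        ll += sum(cnt * math.log(cnt / class_counts[c]) for (c, _), cnt in joint.items())
    return ll

def best_partition(samples, labels, k):
    """Exhaustively pick the K-cardinality partition with the highest log likelihood."""
    n_attrs = len(samples[0])
    candidates = partitions_into_k_subsets(list(range(n_attrs)), k)
    return max(candidates, key=lambda groups: log_likelihood(samples, labels, groups))

# Toy data: attribute 2 copies attribute 0 and attribute 3 copies attribute 1
samples = [(a, b, a, b) for a in (0, 1) for b in (0, 1)]
labels = [str(a) for a, b in itertools.product((0, 1), repeat=2)]
best = best_partition(samples, labels, 2)
print(best)
```

Even with K = 2 the number of candidate partitions grows quickly (3 for four attributes, 15 for six), so exhaustive enumeration is only practical for very small attribute sets.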
Transforming into an Integer Programming (IP) Problem
Model definition
If we relax constraint (6) into 0 ≤ x ≤ 1, the IP is
transformed into a Linear Programming problem,
which can be solved in polynomial time.
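The model definition itself did not survive in this transcript. One plausible way to write such a problem is as a set-partitioning program, sketched below. This is an assumption, not the authors' formulation: the indicator x_S (1 if the K-cardinality subset S is chosen as a large attribute), the weight w_S (the contribution of S to the log likelihood), and the layout of the constraints are all illustrative, and no correspondence with the slide's equation numbers is claimed.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{S} w_S \, x_S \\
\text{s.t.}\quad & \sum_{S \ni i} x_S = 1 \quad \text{for every attribute } i, \\
& x_S \in \{0, 1\} \quad \text{for every } K\text{-cardinality subset } S .
\end{aligned}
```

In this sketch, the relaxation replaces the integrality condition $x_S \in \{0, 1\}$ with $0 \le x_S \le 1$, which leaves a linear objective with linear constraints.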
Comparison with related work
Experimental result
Recognition rate for datasets from the UCI Machine
Learning Repository
Conclusion
- A novel bounded Semi-Naïve Bayesian classifier is proposed, which has the following important properties:
- It is the first algorithm for SNB with a global nature.
- It has a polynomial time cost.
- Its overall performance is better than that of the NBC.
Main References
- I. Kononenko. Semi-naive Bayesian classifier. In Proceedings of sixth European Working Session on Learning, pages 206-219. Springer-Verlag, 1991.
- M. J. Pazzani. Searching dependency in Bayesian classifiers. In D. Fisher and H.-J. Lenz, editors, Learning from data: Artificial intelligence and statistics V, pages 239-248. New York, NY: Springer-Verlag, 1996.
- Nathan Srebro. Maximum likelihood bounded tree-width Markov networks. MIT Master thesis, 2001.
- Patrick M. Murphy. UCI repository of machine learning databases. In ftp.ics.uci.edu: http://www.ics.uci.edu/~mlearn/MLRepository.html.