Title: Naive Bayes
1Naive Bayes
2Bayes Theorem
Given a hypothesis h and data D that bears on the hypothesis:

$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$

- P(h): independent probability of h (prior probability)
- P(D): independent probability of D
- P(D|h): conditional probability of D given h (likelihood)
- P(h|D): conditional probability of h given D (posterior probability)
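As a rough illustration of Bayes' theorem, the sketch below computes a posterior from a prior, a likelihood, and the evidence. The spam scenario, the numbers, and the function name `posterior` are all assumptions made up for this example; they do not come from the slides.

```python
def posterior(prior_h, likelihood_d_given_h, evidence_d):
    """Return P(h|D) = P(D|h) * P(h) / P(D)."""
    if evidence_d <= 0:
        raise ValueError("P(D) must be positive")
    return likelihood_d_given_h * prior_h / evidence_d


# Hypothetical setting: h = "message is spam", D = "message contains 'offer'".
p_h = 0.2               # prior P(h)
p_d_given_h = 0.5       # likelihood P(D|h)
p_d_given_not_h = 0.05  # P(D|not h), needed only to obtain P(D)

# Law of total probability gives the evidence P(D).
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

p_h_given_d = posterior(p_h, p_d_given_h, p_d)
print(round(p_h_given_d, 3))
```

With these made-up numbers, observing D raises the probability of h from the prior of 0.2 to roughly 0.71.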
3Bayesian Classifier
- The classification problem may be formalized using a posteriori probabilities.
- P(C|X) = probability that the sample tuple X is of class C.
- E.g. P(class=N | outlook=sunny, windy=true, ...)
- Idea: assign to sample X the class label C such that P(C|X) is maximal.
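The decision rule can be sketched as an argmax over classes. Because P(X) is the same for every class, it is enough to compare P(X|C) * P(C); normalizing is optional. The class labels N and P and the scores below are hypothetical values chosen for illustration, and `classify` is a name invented here.

```python
def classify(joint_scores):
    """Pick the class with maximal P(C|X).

    joint_scores maps each class label to P(X|C) * P(C),
    i.e. the unnormalized posterior.
    """
    evidence = sum(joint_scores.values())  # P(X), identical for every class
    posteriors = {c: s / evidence for c, s in joint_scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical unnormalized scores for one sample X.
scores = {"N": 0.03, "P": 0.01}
label, post = classify(scores)
print(label, post)
```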
4Naive Bayes
Naive Bayes is a basic learning method that uses
Bayes' rule with the assumption that the
features are conditionally independent
given the class. This independence assumption
is frequently violated in practice, yet naive
Bayes often gives good classifications
nonetheless.
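Under the conditional-independence assumption, the score of a class factorizes into the class prior times one conditional probability per attribute. The sketch below shows that product for two attributes. The probability tables are invented for illustration and are not estimated from any data set; `naive_bayes_score` is a hypothetical helper name.

```python
def naive_bayes_score(prior, cond_probs, sample):
    """Return P(C) * prod_i P(x_i | C) for one class C."""
    score = prior
    for attribute, value in sample.items():
        score *= cond_probs[attribute][value]
    return score


# Hypothetical probability tables for two classes, N and P.
priors = {"N": 0.4, "P": 0.6}
cond = {
    "N": {"outlook": {"sunny": 0.6, "rain": 0.4},
          "windy": {"true": 0.6, "false": 0.4}},
    "P": {"outlook": {"sunny": 0.2, "rain": 0.8},
          "windy": {"true": 0.3, "false": 0.7}},
}

sample = {"outlook": "sunny", "windy": "true"}
scores = {c: naive_bayes_score(priors[c], cond[c], sample) for c in priors}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Each attribute contributes its own factor, so the model never needs a joint table over all attribute combinations.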
5Uses of Naive Bayes classification
- Text classification
- Spam filtering
- Hybrid recommender systems: recommender systems apply machine learning and data mining techniques for filtering unseen information and can predict whether a user would like a given resource.
- Online application: simple emotion modeling
6Why text classification?
- Learning which articles are of interest
- Classifying web pages by topic
- Information extraction
- Internet filters
7Examples of Text Classification
- CLASSES = BINARY: spam / not spam
- CLASSES = TOPICS: finance / sports / politics
- CLASSES = OPINION: like / hate / neutral
- CLASSES = TOPICS: AI / Theory / Graphics
- CLASSES = AUTHOR: Shakespeare / Marlowe / Ben Jonson
8Naive Bayes Approach
- Build the vocabulary as the list of all distinct words that appear in the documents of the training set.
- Remove stop words and markup.
- The words in the vocabulary become the attributes, assuming that classification is independent of the positions of the words.
- Each document in the training set becomes a record with the frequency of each word in the vocabulary.
- Train the classifier on the training data set by computing the prior probability of each class and the conditional probability of each attribute given the class.
- Evaluate the results on test data.
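The preprocessing steps above can be sketched roughly as follows. The tiny training set, the stop-word list, the class labels, and the `tokenize` helper are all assumptions made for illustration; a real pipeline would use a fuller stop-word list and a more careful tokenizer.

```python
from collections import Counter

# Hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "to", "of"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, drop stop words."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    return [w for w in words if w and w not in STOP_WORDS]


# Hypothetical labelled training documents.
train = [
    ("win a free prize", "spam"),
    ("meeting is at noon", "ham"),
    ("free prize inside", "spam"),
]

# Vocabulary: all distinct words in the training set.
vocabulary = sorted({w for text, _ in train for w in tokenize(text)})

# Each document becomes a record of word frequencies (positions are ignored).
records = []
for text, _ in train:
    counts = Counter(tokenize(text))
    records.append([counts[w] for w in vocabulary])

# Class priors: fraction of training documents in each class.
class_counts = Counter(label for _, label in train)
priors = {c: n / len(train) for c, n in class_counts.items()}

print(vocabulary)
print(records)
print(priors)
```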
9Text Classification Algorithm
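$$P(t \mid c) = \frac{T_{ct} + 1}{\sum_{t' \in V} (T_{ct'} + 1)} = \frac{T_{ct} + 1}{\left(\sum_{t' \in V} T_{ct'}\right) + B}$$

- T_ct: number of occurrences of a particular word t in a particular class c
- Sum of T_ct' over all t' in V: total number of words in a particular class c
- B: number of distinct words across all classes (the size of the vocabulary V)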
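A minimal sketch of the add-one (Laplace) smoothed estimate of P(t|c): add 1 to each word count and add the vocabulary size B to the total word count of the class. The vocabulary, the token list, and the function name `smoothed_likelihoods` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def smoothed_likelihoods(class_tokens, vocabulary):
    """Return P(t|c) = (T_ct + 1) / (sum_t' T_ct' + B) for every t in V."""
    counts = Counter(class_tokens)                 # T_ct for each word t
    total = sum(counts[t] for t in vocabulary)     # total words in class c
    b = len(vocabulary)                            # B, distinct words overall
    return {t: (counts[t] + 1) / (total + b) for t in vocabulary}


# Hypothetical vocabulary and the tokens of all documents in one class.
vocabulary = ["free", "meeting", "prize", "win"]
spam_tokens = ["win", "free", "prize", "free", "prize"]

p = smoothed_likelihoods(spam_tokens, vocabulary)
print(p)
```

The word "meeting" never occurs in this class, yet it receives a small nonzero probability, so a single unseen word cannot force the whole product to zero.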
10Topics for next Post
- Linear Discriminant Analysis
- Decision tree
- k-nearest neighbor algorithm
Stay Tuned with