Discriminative Parameter Learning for Bayesian Networks

1
Discriminative Parameter Learning for Bayesian Networks

Jiang Su, Harry Zhang, Charles X. Ling, Stan Matwin
University of Ottawa

2
Introduction

Learning Bayesian networks includes structure and parameter learning
Parameter learning is an inner loop of structure learning
An efficient and effective parameter learning method is required in Bayesian network learning

3
Introduction

The traditional parameter learning method is Frequency Estimate (FE)
The objective function of FE is likelihood
The objective function of classifiers should be discriminative (accuracy, conditional likelihood, etc.)
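The two objectives contrasted above can be written out as follows. This is a sketch in assumed notation that does not appear on the slides: $\theta$ denotes the network parameters, and $(x_t, c_t)$ is the attribute vector and class label of the $t$-th of $n$ training instances.

```latex
% Generative objective maximized by FE: log likelihood of the joint distribution
LL(\theta) = \sum_{t=1}^{n} \log P_\theta(c_t, x_t)

% Discriminative objective: conditional log likelihood of the class given the attributes
CLL(\theta) = \sum_{t=1}^{n} \log P_\theta(c_t \mid x_t)
            = \sum_{t=1}^{n} \Big[ \log P_\theta(c_t, x_t) - \log \sum_{c'} P_\theta(c', x_t) \Big]
```

The second term of the conditional log likelihood couples all parameters through the normalizing sum over classes, which is why it generally has no closed-form maximizer based on simple counting.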

4
Related Works

Extended Logistic Regression (ELR) performs better than FE [Greiner2002]
  uses FE to learn plug-in parameters
  conjugate gradient and line search
  cross tuning
Gradient descent methods are computationally expensive in structure learning [Friedman1997, Grossman2004]

5
Frequency Estimate

An example

6
Frequency Estimate

Frequency information in data
  The frequency of Smoke=N or Y equals the frequency of Gender=F
  The frequency of Smoke=N is not greater than the frequency of Gender=F
Frequency information offers constraints during parameter learning
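The constraints above can be checked on a small table of counts. The following is a minimal sketch, assuming a hypothetical five-row dataset with Gender as the class and Smoke as the only attribute, and reading the slide's statement as counts of Smoke values among the Gender=F instances. The data and the function name `frequency_estimate` are illustrative and not taken from the slides.

```python
from collections import Counter

# Hypothetical toy data: (Gender, Smoke) pairs; Gender plays the role of the class.
data = [("F", "N"), ("F", "Y"), ("F", "N"), ("M", "Y"), ("M", "N")]


def frequency_estimate(rows):
    """Count class frequencies and joint (class, attribute value) frequencies."""
    class_counts = Counter()
    joint_counts = Counter()
    for gender, smoke in rows:
        class_counts[gender] += 1
        joint_counts[(gender, smoke)] += 1
    return class_counts, joint_counts


class_counts, joint_counts = frequency_estimate(data)

# Constraint 1: the Smoke=N and Smoke=Y counts within Gender=F sum to the Gender=F count.
assert joint_counts[("F", "N")] + joint_counts[("F", "Y")] == class_counts["F"]
# Constraint 2: a single Smoke value within Gender=F cannot exceed the Gender=F count.
assert joint_counts[("F", "N")] <= class_counts["F"]

# FE parameters are normalized counts, here an estimate of P(Smoke=N | Gender=F).
p_smoke_n_given_f = joint_counts[("F", "N")] / class_counts["F"]
print(p_smoke_n_given_f)
```

Because every instance adds exactly 1 to each table it touches, both constraints hold by construction, whatever the data.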

7
Discriminative Frequency Estimate

Idea: discriminatively count the frequencies in data
Example

8
Discriminative Frequency Estimate

Frequency information in data
  The frequency of Smoke=N or Y equals the frequency of Gender=F
  The frequency of Smoke=N is not greater than the frequency of Gender=F
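One way to picture discriminative counting is sketched below. The slides do not spell out the update rule, so the rule used here is an assumption: each training instance adds its prediction loss, 1 - P(c | x) under the current naive Bayes estimate, to every count it touches, instead of adding 1 as FE does. The Laplace smoothing, the toy data, and the names `predict_proba` and `dfe_fit` are also illustrative choices, not the authors' implementation.

```python
from collections import defaultdict


def predict_proba(x, classes, values, class_cnt, joint_cnt):
    """Naive Bayes posterior P(c | x) from (possibly fractional) counts.

    Laplace smoothing is an assumption made here so that empty tables
    yield a uniform prediction.
    """
    total = sum(class_cnt[c] for c in classes)
    scores = {}
    for c in classes:
        score = (class_cnt[c] + 1.0) / (total + len(classes))
        for i, v in enumerate(x):
            score *= (joint_cnt[(i, v, c)] + 1.0) / (class_cnt[c] + len(values[i]))
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}


def dfe_fit(rows, classes, values, iterations=1):
    """Assumed rule: add the prediction loss 1 - P(c | x), not 1, to each count."""
    class_cnt = defaultdict(float)
    joint_cnt = defaultdict(float)
    for _ in range(iterations):
        for x, c in rows:
            loss = 1.0 - predict_proba(x, classes, values, class_cnt, joint_cnt)[c]
            class_cnt[c] += loss
            for i, v in enumerate(x):
                joint_cnt[(i, v, c)] += loss
    return class_cnt, joint_cnt


# Hypothetical toy data: one attribute (Smoke) and the class Gender.
rows = [(("N",), "F"), (("Y",), "F"), (("N",), "F"), (("Y",), "M"), (("N",), "M")]
classes = ["F", "M"]
values = [["N", "Y"]]

class_cnt, joint_cnt = dfe_fit(rows, classes, values, iterations=4)

# The same increment goes to the class count and to the attribute count,
# so the Smoke counts within Gender=F still sum to the Gender=F count.
smoke_f_total = joint_cnt[(0, "N", "F")] + joint_cnt[(0, "Y", "F")]
assert abs(smoke_f_total - class_cnt["F"]) < 1e-9

posterior = predict_proba(("N",), classes, values, class_cnt, joint_cnt)
print(posterior)
```

Under this assumed rule, instances that are already classified with high confidence contribute almost nothing, while misclassified instances contribute close to a full count, and the frequency constraints are preserved because all tables touched by an instance receive the same increment.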

9
Comparisons

The matrix from different algorithms
  Gradient Descent
  FE
  DFE

10
Discriminative Frequency Estimate

Example: a dataset with 3 training instances and 1 test instance
The predictions from FE and DFE are influenced by the frequency information in data
DFE converges slightly slower than FE

11
Experimental Setup

33 UCI datasets (2 classes, discretization, missing value)
Parameter learning methods
  FE: frequency estimate
  DFE: discriminative frequency estimate
  ELR: a gradient descent method [Greiner2002]
  Ada: use Ada boosting method to generate a set of Bayes classifiers [Freund96]
Structure learning methods
  HGC: hill-climbing search algorithm (2 parents)

12
Experiments-accuracy

DFE performs competitively with ELR, and both of them are better than FE and Ada
Structure learning improves the performance of Bayes classifiers (HGC+FE > NB+FE)
NBELRHGCFE, HGCDFENBDFE

13
Experiments-convergence

Training time: DFE is 250,000 times faster than ELR
Small datasets with strong dependencies require more than 1 iteration (vowel, 200 instances, 4 iterations)
Overfitting: training and test data accuracy are similar, and increased training effort does not change the accuracy

Figure legend: Solid: NB+DFE in training data; Dotted: NB+DFE in test data

14
Experiments-learning curve

Generative parameter learning does not have an advantage over discriminative parameter learning in small training data

Figure legend: Solid: NB+FE; Dotted: NB+DFE; Dash: NB+ELR

15
Conclusions

A parameter learning method for Bayesian network classifiers
  competitive with the gradient descent method in accuracy
  computationally efficient
  insensitive to the overfitting problem
  simple to implement
