Title: CS479/679 Pattern Recognition, Spring 2006, Prof. Bebis
1. CS479/679 Pattern Recognition, Spring 2006, Prof. Bebis
- Bayesian Belief Networks
- Chapter 2 (Duda et al.)
2. Statistical Dependences Between Variables
- Many times, the only knowledge we have about a distribution is which variables are or are not dependent.
- Such dependencies can be represented graphically using a Bayesian Belief Network (or Belief Net).
- In essence, Bayesian Nets allow us to represent a joint probability density p(x, y, z, ...) efficiently using dependency relationships (the factorization is written out below).
- p(x, y, z, ...) could be either discrete or continuous.
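The efficiency comes from the standard belief-net factorization: each variable is conditioned only on its parents in the graph. Writing it out (where pa(x_i) denotes the parents of node x_i):

    P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P\big(x_i \mid \mathrm{pa}(x_i)\big)

so each node stores a probability table only over its parents' states, rather than over all other variables.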
3. Example of Dependencies
- State of an automobile
- Engine temperature
- Brake fluid pressure
- Tire air pressure
- Wire voltages
- Etc.
- NOT causally related variables
- Engine oil pressure
- Tire air pressure
- Causally related variables
- Coolant temperature
- Engine temperature
4. Representative Applications
- Bill Gates said (LA Times, 10/28/96): "Microsoft's competitive advantage is its expertise in Bayesian nets."
- Current Microsoft products:
- Answer Wizard
- Print Troubleshooter
- Excel Workbook Troubleshooter
- Office 95 Setup Media Troubleshooter
- Windows NT 4.0 Video Troubleshooter
- Word Mail Merge Troubleshooter
5. Representative Applications (cont'd)
- US Army: SAIP (Battalion Detection from SAR, IR, etc.)
- NASA: Vista (DSS for the Space Shuttle)
- GE: Gems (real-time monitor for utility generators)
- Intel: infers possible processing problems
6. Definitions and Notation
- A belief net is usually a Directed Acyclic Graph (DAG).
- Each node represents one of the system variables.
- Each variable can assume certain values (i.e., states), and each state is associated with a probability (discrete or continuous); a minimal code sketch of this structure follows.
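As a concrete illustration of these definitions, here is a minimal sketch (in Python) of one possible representation: each node records its parents, its states, and a probability table indexed by parent states. The node names, states, and numbers are hypothetical, not taken from the slides.

    # Minimal belief-net representation sketch (all names/values hypothetical).
    # Each node lists its parents, its possible states, and a table mapping a
    # tuple of parent states to a probability distribution over its own states.
    net = {
        "A": {"parents": [], "states": ["a1", "a2"],
              "cpt": {(): {"a1": 0.6, "a2": 0.4}}},
        "X": {"parents": ["A"], "states": ["x1", "x2"],
              "cpt": {("a1",): {"x1": 0.3, "x2": 0.7},
                      ("a2",): {"x1": 0.8, "x2": 0.2}}},
    }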
7. Relationships Between Nodes
- A link joining two nodes is directional and represents a causal influence (e.g., X depends on A, or A influences X).
- Influences could be direct or indirect (e.g., A influences X directly, and A influences C indirectly through X).
8. Parent/Children Nodes
- Parent nodes P of X: the nodes before X (connected to X).
- Children nodes C of X: the nodes after X (X is connected to them).
9. Conditional Probability Tables
- Every node is associated with a set of weights which represent the prior/conditional probabilities (e.g., P(xi | aj), i = 1, 2, j = 1, 2, 3, 4).
- For each parent state aj, the probabilities over the xi sum to 1 (a hypothetical example table follows).
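For concreteness, a CPT for a node X with states x1, x2 and a single parent A with states a1..a4 might look like the table below. The values are hypothetical, not from the slides; each column is a distribution and sums to 1:

                 a1     a2     a3     a4
    P(x1 | aj)   0.9    0.7    0.6    0.2
    P(x2 | aj)   0.1    0.3    0.4    0.8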
10. Learning
- There exist algorithms for learning these probabilities from data.
11. Computing Joint Probabilities
- We can compute the probability of any configuration of variables in the joint density distribution.
- e.g., P(a3, b1, x2, c3, d2) = P(a3) P(b1) P(x2 | a3, b1) P(c3 | x2) P(d2 | x2) = 0.25 x 0.6 x 0.4 x 0.5 x 0.4 = 0.012 (spelled out in code below)
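The chain of multiplications is easy to check in code; the numbers are the ones from the slide, and the network structure (A, B -> X -> C, D) is assumed from the form of the factorization above:

    # Joint probability P(a3, b1, x2, c3, d2) via the factorization above.
    # Values are from the slide; the structure A, B -> X -> C, D is implied
    # by the form P(a) P(b) P(x|a,b) P(c|x) P(d|x).
    p_a3      = 0.25   # P(a3)
    p_b1      = 0.6    # P(b1)
    p_x2_a3b1 = 0.4    # P(x2 | a3, b1)
    p_c3_x2   = 0.5    # P(c3 | x2)
    p_d2_x2   = 0.4    # P(d2 | x2)

    joint = p_a3 * p_b1 * p_x2_a3b1 * p_c3_x2 * p_d2_x2
    print(round(joint, 3))  # 0.012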
12. Computing the Probability at a Node
- E.g., determine the probability at D (the derivation is sketched below).
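The slide's figure did not survive extraction, but with the network A, B -> X -> C, D used on the previous slide, the probability at D is obtained by marginalizing over D's ancestors:

    P(d) = \sum_{x} P(d \mid x)\, P(x)
         = \sum_{x} \sum_{a} \sum_{b} P(d \mid x)\, P(x \mid a, b)\, P(a)\, P(b)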
13. Computing the Probability at a Node (cont'd)
- E.g., determine the probability at H (the general form is given below).
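The network around H is not shown in this extraction; in general, the probability at any node h follows from the joint (factorized as above) by summing out all the other variables v_1, ..., v_k:

    P(h) = \sum_{v_1} \cdots \sum_{v_k} P(h, v_1, \ldots, v_k)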
14. Computing Probability Given Evidence (Bayesian Inference)
- Determine the probability of some particular configuration of variables given the values of some other variables (evidence).
- e.g., compute P(b1 | a2, x1, c1)
15. Computing Probability Given Evidence (Bayesian Inference) (cont'd)
- In general, if X denotes the query variables and e denotes the evidence, then P(X | e) = α P(X, e), where α = 1/P(e) is a constant of proportionality.
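A minimal inference-by-enumeration sketch in Python, assuming the same A, B -> X -> C, D structure; all CPT values here are made up for illustration and are not the slides' numbers. It computes P(b | a2, x1, c1) exactly as the formula above prescribes: sum the joint over the unobserved variable (D), then normalize by α = 1/P(e).

    # Inference by enumeration on A, B -> X -> C, D (hypothetical CPTs).
    P_a = {"a1": 0.5, "a2": 0.5}
    P_b = {"b1": 0.6, "b2": 0.4}
    P_x = {("a1", "b1"): {"x1": 0.9, "x2": 0.1},
           ("a1", "b2"): {"x1": 0.7, "x2": 0.3},
           ("a2", "b1"): {"x1": 0.4, "x2": 0.6},
           ("a2", "b2"): {"x1": 0.2, "x2": 0.8}}
    P_c = {"x1": {"c1": 0.6, "c2": 0.4}, "x2": {"c1": 0.2, "c2": 0.8}}
    P_d = {"x1": {"d1": 0.3, "d2": 0.7}, "x2": {"d1": 0.6, "d2": 0.4}}

    def joint(a, b, x, c, d):
        """Full joint via the belief-net factorization."""
        return P_a[a] * P_b[b] * P_x[(a, b)][x] * P_c[x][c] * P_d[x][d]

    # P(b | a2, x1, c1): sum the joint over the unobserved variable D,
    # then normalize over the query variable B (alpha = 1 / P(e)).
    unnormalized = {b: sum(joint("a2", b, "x1", "c1", d) for d in P_d["x1"])
                    for b in P_b}
    alpha = 1.0 / sum(unnormalized.values())
    posterior = {b: alpha * p for b, p in unnormalized.items()}
    print(posterior)  # with these made-up CPTs: ≈ {'b1': 0.75, 'b2': 0.25}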
16. An Example
- Classify a fish given only the evidence that the fish is light (c1) and was caught in the south Atlantic (b2) -- no evidence about what time of year the fish was caught nor about its thickness.
17. An Example (cont'd)
18. An Example (cont'd)
19. An Example (cont'd)
- Similarly, P(x2 | c1, b2) = α x 0.066
- Normalize the probabilities (not strictly necessary): P(x1 | c1, b2) + P(x2 | c1, b2) = 1 gives α = 1/0.18 (so the unnormalized value for x1 is 0.18 - 0.066 = 0.114).
- P(x1 | c1, b2) = 0.114/0.18 ≈ 0.63
- P(x2 | c1, b2) = 0.066/0.18 ≈ 0.37
- Since P(x1 | c1, b2) > P(x2 | c1, b2), classify the fish as salmon.
20. Another Example: Medical Diagnosis
- Uppermost nodes: biological agents (bacteria, virus)
- Intermediate nodes: diseases
- Lowermost nodes: symptoms
- Given some evidence (biological agents, symptoms), find the most likely disease.
21. Naïve Bayes Rule
- When the dependency relationships among the features are unknown, we can assume that the features are conditionally independent given the category: P(a, b | x) = P(a | x) P(b | x).
- This is the naïve Bayes rule (written out below).
- It is a simple assumption, but it usually works well in practice.
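Combining the conditional-independence assumption with Bayes' rule gives the usual decision form for a category x given features a and b, and its generalization to n features f_1, ..., f_n:

    P(x \mid a, b) = \alpha\, P(x)\, P(a \mid x)\, P(b \mid x),
    \qquad
    P(x \mid f_1, \ldots, f_n) \propto P(x) \prod_{i=1}^{n} P(f_i \mid x)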