Bayesian Networks - PowerPoint PPT Presentation

Slides: 49
Provided by: Allan165
1
Uncertainty & Bayesian Belief Networks
2
Data-Mining with Bayesian Networks on the Internet
  • The Internet can be seen as a massive repository of
    data.
  • Data is often updated.
  • Once meaningful data has been collected from the
    Internet, a model is needed that can
    • be learnt from the vast amount of available data
    • enable the user to reason about the data
    • be easily updated given new data

3
Section 1 - Bayesian Networks: An Introduction
  • Brief Summary of Expert Systems
  • Causal Reasoning
  • Probability Theory
  • Bayesian Networks - Definition, inference
  • Current issues in Bayesian Networks
  • Other Approaches to Uncertainty

4
Expert Systems 1: Rule-Based Systems
  • 1960s - Rule Based Systems
  • Model human Expertise using IF .. THEN rules or
    Production Rules.
  • Combines the rules (or Knowledge Base) with an
    inference engine to reason about the world.
  • Given certain observations, produces conclusions.
  • Relatively successful but limited.

5
2: Uncertainty
  • Rule based systems failed to handle uncertainty
  • Only dealt with true or false facts
  • Partly overcome using Certainty factors
  • However, other problems remained: no differentiation
    between causal rules and diagnostic rules.

6
3: Normative Expert Systems
  • Model Domain rather than Expert
  • Classical probability used rather than ad-hoc
    calculus
  • Expert support rather than Expert Model
  • 1980s - More Powerful Computers make complex
    probability calculations feasible
  • Bayesian Networks introduced (Pearl 1986) e.g.
    MUNIN.

7
Causality - 1: Icy Roads
[Figure: network with nodes Icy Roads, Holmes Crashes, Watson Crashes]
8
Causality - 2: Wet Grass
[Figure: network with nodes Rain, Sprinkler, Watson's Grass Wet, Holmes's Grass Wet]
9
Causality - 3: Earthquake or Burglar
[Figure: network with nodes Burglary, Earthquake, Alarm, Mary Calls, John Calls]
10
Tour through Probability
  • All probabilities are between 0 and 1
  • Necessarily true propositions have probability 1
    and necessarily false propositions have
    probability 0

11
Conjunctions and Disjunctions
Venn Diagrams
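  • $P(A \wedge B) = P(A) \times P(B)$
  • (A and B independent)
  • $P(A \vee B) = P(A) + P(B)$
  • (mutually exclusive)
  • $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$
  • (not mutually exclusive)

[Figure: Venn diagrams of events A and B for the three cases]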
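The addition rules on this slide can be checked by brute-force enumeration. The sketch below uses one roll of a fair six-sided die, which is an illustrative assumption and not an example from the slides; the event names are likewise made up for the illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
OUTCOMES = range(1, 7)

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    favourable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favourable, len(OUTCOMES))

def is_even(o):
    return o % 2 == 0

def is_high(o):
    return o >= 4      # overlaps with is_even on 4 and 6

def is_one(o):
    return o == 1      # never overlaps with is_even

# Not mutually exclusive: P(A v B) = P(A) + P(B) - P(A ^ B)
p_or = prob(lambda o: is_even(o) or is_high(o))
p_and = prob(lambda o: is_even(o) and is_high(o))

# Mutually exclusive: P(A v B) = P(A) + P(B)
p_or_exclusive = prob(lambda o: is_even(o) or is_one(o))

print(p_or, prob(is_even) + prob(is_high) - p_and)
print(p_or_exclusive, prob(is_even) + prob(is_one))
```

Exact fractions are used so that both sides of each rule can be compared without rounding error.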
12
Conditional Probability & Independence
  • Probability of B given A
    $P(B \mid A) = \frac{P(A \wedge B)}{P(A)}$
    E.g. $P(\text{Hearts} \mid \text{Hearts last time})$
  • Independence
    $P(B \mid A) = P(B)$
    E.g. $P(\text{Heads} \mid \text{Even}) = P(\text{Heads})$
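A small sketch of the Heads/Even example follows, assuming a fair coin and a fair die thrown together (the slide does not say which experiment is meant, so this setup is an assumption). It computes conditional probabilities directly from the definition and contrasts an independent pair of events with a dependent one.

```python
from fractions import Fraction
from itertools import product

# Joint sample space: a fair coin and a fair die thrown together (assumption).
SPACE = list(product(["Heads", "Tails"], range(1, 7)))

def prob(event):
    """Probability of an event given as a predicate over (coin, die) pairs."""
    return Fraction(sum(1 for s in SPACE if event(s)), len(SPACE))

def conditional(event_b, event_a):
    """P(B | A) = P(A and B) / P(A)."""
    return prob(lambda s: event_a(s) and event_b(s)) / prob(event_a)

def heads(s):
    return s[0] == "Heads"

def even(s):
    return s[1] % 2 == 0

def six(s):
    return s[1] == 6

# Independent: knowing the die is even tells us nothing about the coin.
p_heads_given_even = conditional(heads, even)
# Dependent: knowing the die is even changes the probability of a six.
p_six_given_even = conditional(six, even)

print(p_heads_given_even, prob(heads))
print(p_six_given_even, prob(six))
```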
13
Probability Distributions
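  • Probability Distribution
  • $p(\text{Weather} = \text{Sunny}) = 0.5$
  • $p(\text{Weather} = \text{Rain}) = 0.2$
  • $p(\text{Weather} = \text{Cloud}) = 0.2$
  • $p(\text{Weather} = \text{Snow}) = 0.1$
  • NB: Distribution sums to 1.

[Figure: bar chart of the distribution over Sunny, Rain, Cloud, Snow]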
14
Joint Probability
  • Completely specifies all beliefs in a problem
    domain.
  • The joint probability distribution is an
    n-dimensional table in which each cell holds the
    probability of that state occurring.
  • Written as $P(X_1, X_2, X_3, \ldots, X_n)$
  • When instantiated, written as $P(x_1, x_2, \ldots, x_n)$

15
Joint Distribution Example
  • Domain with 2 variables each of which can take on
    2 states.

P(Toothache, Cavity)
            Toothache   ¬Toothache
Cavity        0.04        0.06
¬Cavity       0.01        0.89
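Because the joint distribution specifies every belief in the domain, any marginal or conditional probability can be read off it by summing cells. The sketch below does this for the Toothache/Cavity table; the function names are illustrative, not from the slides.

```python
# Joint distribution P(Toothache, Cavity) from the table above,
# keyed by (toothache, cavity).
JOINT = {
    (True, True): 0.04,
    (False, True): 0.06,
    (True, False): 0.01,
    (False, False): 0.89,
}

def marginal_toothache(value):
    """P(Toothache = value), summing out Cavity."""
    return sum(p for (t, _), p in JOINT.items() if t == value)

def marginal_cavity(value):
    """P(Cavity = value), summing out Toothache."""
    return sum(p for (_, c), p in JOINT.items() if c == value)

def cavity_given_toothache(cavity, toothache):
    """P(Cavity = cavity | Toothache = toothache) from the joint."""
    return JOINT[(toothache, cavity)] / marginal_toothache(toothache)

p_cavity = marginal_cavity(True)
p_toothache = marginal_toothache(True)
p_cavity_given_toothache = cavity_given_toothache(True, True)

print(round(p_cavity, 4), round(p_toothache, 4), round(p_cavity_given_toothache, 4))
```

The table grows exponentially with the number of variables, which is the motivation for the factored representation introduced later in the deck.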
16
Bayes Theorem
  • Simple
  • $P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)}$
  • General
  • $P(Y \mid X, E) = \frac{P(X \mid Y, E)\,P(Y \mid E)}{P(X \mid E)}$
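A minimal sketch of the simple form of Bayes' theorem follows. The input numbers are derived from the Toothache/Cavity joint table earlier in the deck (P(Toothache | Cavity) = 0.04 / 0.10, P(Cavity) = 0.10, P(Toothache) = 0.05); the function name `bayes` is an illustrative choice.

```python
def bayes(p_x_given_y, p_y, p_x):
    """Return P(Y | X) = P(X | Y) * P(Y) / P(X)."""
    if p_x <= 0:
        raise ValueError("P(X) must be positive to condition on X")
    return p_x_given_y * p_y / p_x

# Values derived from the Toothache/Cavity joint table.
p_toothache_given_cavity = 0.04 / 0.10
p_cavity = 0.10
p_toothache = 0.05

p_cavity_given_toothache = bayes(p_toothache_given_cavity, p_cavity, p_toothache)
print(round(p_cavity_given_toothache, 4))
```

The general form is the same calculation with every term additionally conditioned on the background evidence E.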

17
Bayesian Probability
  • No need for repeated Trials
  • Appear to follow rules of Classical Probability
  • How well do we assign probabilities?

[Figure: The Probability Wheel: A Tool for Assessing Probabilities]
18
Bayesian Network - Definition
  • Causal Structure
  • Interconnected Nodes
  • Directed Acyclic Links
  • Joint Distribution formed from conditional
    distributions at each node.

19
Earthquake or Burglar
[Figure: network with nodes Burglary, Earthquake, Alarm, Mary Calls, John Calls]
20
Bayesian Network for Alarm Domain
[Figure: Burglary and Earthquake are the parents of Alarm; Alarm is the parent of John Calls and Mary Calls]

P(B) = .001
P(E) = .002

B  E  P(A)
T  T  .95
T  F  .94
F  T  .29
F  F  .001

A  P(J)
T  .90
F  .05

A  P(M)
T  .70
F  .01
21
Retrieving Probabilities from the Conditional
Distributions
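  • $P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Parents}(x_i))$
  • E.g.
  • $P(J \wedge M \wedge A \wedge \neg B \wedge \neg E)$
  • $= P(J \mid A)\,P(M \mid A)\,P(A \mid \neg B, \neg E)\,P(\neg B)\,P(\neg E)$
  • $= 0.9 \times 0.7 \times 0.001 \times 0.999 \times 0.998$
  • $\approx 0.000628$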
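The product formula can be sketched directly in code. The tables below are the conditional probabilities from the alarm-domain slide; the variable and function names are illustrative choices, not part of the original deck.

```python
# Conditional probability tables for the alarm network (values from the slides).
P_B = 0.001                      # P(Burglary = T)
P_E = 0.002                      # P(Earthquake = T)
P_A = {                          # P(Alarm = T | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls = T | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls = T | Alarm)

def pr(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(var = T)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as the product of each node's P(x_i | Parents(x_i))."""
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

# Both neighbours call, the alarm rang, but no burglary and no earthquake.
p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"{p:.6f}")
```

Summing `joint` over all 32 assignments gives 1, which is a quick sanity check that the tables define a proper distribution.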
22
Constructing a Network - Node Ordering and
Compactness
  • Mary Calls
  • John Calls
  • Alarm
  • Burglary
  • Earthquake

[Figure: network built by adding the nodes in the order Mary Calls, John Calls, Alarm, Burglary, Earthquake]
23
Node Ordering and Compactness contd.
  • Mary Calls
  • John Calls
  • Earthquake
  • Burglary
  • Alarm

24
Node Ordering and Compactness contd.
  • Mary Calls
  • John Calls
  • Earthquake
  • Burglary
  • Alarm

[Figure: network built by adding the nodes in the order Mary Calls, John Calls, Earthquake, Burglary, Alarm]
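One way to see why node ordering matters for compactness is to count conditional probability table entries: a Boolean node with k Boolean parents needs 2^k numbers. The sketch below compares three structures. The causal parent sets come from the alarm-domain tables earlier in the deck. The parent sets for the two alternative orderings are not given in the transcript; those used here follow the usual textbook treatment of this example and should be read as an assumption.

```python
def cpt_entries(parents):
    """Total CPT entries for a network of Boolean nodes: 2**k per node with k parents."""
    return sum(2 ** len(ps) for ps in parents.values())

# Causal ordering (Burglary, Earthquake, Alarm, John Calls, Mary Calls).
CAUSAL = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

# Ordering Mary Calls, John Calls, Alarm, Burglary, Earthquake (assumed parent sets).
ORDER_MJABE = {
    "MaryCalls": [],
    "JohnCalls": ["MaryCalls"],
    "Alarm": ["MaryCalls", "JohnCalls"],
    "Burglary": ["Alarm"],
    "Earthquake": ["Alarm", "Burglary"],
}

# Ordering Mary Calls, John Calls, Earthquake, Burglary, Alarm (assumed parent sets).
ORDER_MJEBA = {
    "MaryCalls": [],
    "JohnCalls": ["MaryCalls"],
    "Earthquake": ["MaryCalls", "JohnCalls"],
    "Burglary": ["MaryCalls", "JohnCalls", "Earthquake"],
    "Alarm": ["MaryCalls", "JohnCalls", "Earthquake", "Burglary"],
}

FULL_JOINT = 2 ** 5 - 1  # independent entries in the full joint over 5 Booleans

for name, net in [("causal", CAUSAL), ("M,J,A,B,E", ORDER_MJABE), ("M,J,E,B,A", ORDER_MJEBA)]:
    print(name, cpt_entries(net), "vs full joint", FULL_JOINT)
```

Under these assumptions the worst ordering needs as many numbers as the full joint table, so nothing is gained from the network representation.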
25
Conditional Independence revisited - D-Separation
  • To do inference in a Belief Network we have to
    know if two sets of variables are conditionally
    independent given a set of evidence.
  • Method to do this is called Direction-Dependent
    Separation or D-Separation.

26
D-Separation contd.
  • If every undirected path from a node in X to a
    node in Y is d-separated by E, then X and Y are
    conditionally independent given E.
  • X is a set of variables with unknown values
  • Y is a set of variables with unknown values
  • E is a set of variables with known values.

27
D-Separation contd.
  • A set of nodes, E, d-separates two sets of nodes,
    X and Y, if every undirected path from a node in
    X to a node in Y is blocked given E.
  • A path is blocked given a set of nodes, E, if
    there is a node Z on the path for which one of
    the following holds:
  • 1) Z is in E and Z has one path arrow leading in
    and one leading out.
  • 2) Z is in E and has both path arrows leading out.
  • 3) Neither Z nor any descendant of Z is in E and
    both path arrows lead in to Z.

28
Blocking
[Figure: the three ways a node Z on a path between X and Y can block the path given evidence E]
29
D-Separation - Example
  • Moves and Battery are independent given that
    Ignition is known
  • Moves and Radio are independent if it is known
    that Battery works
  • Petrol and Radio are independent given no
    evidence, but are dependent given evidence of
    Starts

[Figure: car network with nodes Battery, Radio, Ignition, Petrol, Starts, Moves]
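The three blocking rules can be turned into a small path-based check. The sketch below is one straightforward way to do it, by enumerating undirected paths, and is not an efficient algorithm. The transcript lists only the node names of the car network, so the edge set used here is an assumption chosen to be consistent with the statements on this slide.

```python
# Assumed edges (parent, child) for the car network; not given in the transcript.
EDGES = [
    ("Battery", "Radio"),
    ("Battery", "Ignition"),
    ("Ignition", "Starts"),
    ("Petrol", "Starts"),
    ("Starts", "Moves"),
]

def children(node):
    return {c for p, c in EDGES if p == node}

def parents(node):
    return {p for p, c in EDGES if c == node}

def descendants(node):
    found, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

def undirected_paths(goal, path):
    """Yield every simple path from path[0] to goal, ignoring edge direction."""
    if path[-1] == goal:
        yield path
        return
    for nxt in children(path[-1]) | parents(path[-1]):
        if nxt not in path:
            yield from undirected_paths(goal, path + [nxt])

def is_blocked(path, evidence):
    """Apply the three blocking rules to each interior node Z of the path."""
    for prev, z, nxt in zip(path, path[1:], path[2:]):
        arrows_in = (prev in parents(z)) + (nxt in parents(z))
        if arrows_in == 2:
            # Rule 3: both arrows lead in; blocks unless Z or a descendant is observed.
            if z not in evidence and not (descendants(z) & evidence):
                return True
        elif z in evidence:
            # Rules 1 and 2: chain or common cause with Z observed.
            return True
    return False

def d_separated(x, y, evidence=()):
    evidence = set(evidence)
    return all(is_blocked(p, evidence) for p in undirected_paths(y, [x]))

print(d_separated("Moves", "Battery", {"Ignition"}))
print(d_separated("Moves", "Radio", {"Battery"}))
print(d_separated("Petrol", "Radio"))
print(d_separated("Petrol", "Radio", {"Starts"}))
```

With the assumed edges, the first three queries come out as d-separated and the last does not, matching the three statements on the slide.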
30
Inference
  • Diagnostic Inferences (effects to causes)
  • Causal Inferences (causes to effects)
  • Intercausal Inferences - or Explaining Away
    (between causes of common effect)
  • Mixed Inferences (combination of two or more of
    the above)

31
Inference contd.
[Figure: arrangements of query nodes (Q) and evidence nodes (E) for diagnostic, causal, intercausal and mixed inference]
32
Inference contd.
[Figure: network with nodes Burglary, Earthquake, Alarm, Mary Calls, John Calls]
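As a concrete instance of diagnostic inference (effects to causes), the sketch below computes the probability of a burglary given that both John and Mary call. It uses plain enumeration over the hidden variables, which is a simple illustrative method and not necessarily the algorithm the presenter had in mind. The tables are the ones from the alarm-domain slide; the names are illustrative.

```python
from itertools import product

P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pr(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(var = T)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Joint probability as the product of the conditional tables."""
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

def posterior_burglary(j, m):
    """P(Burglary = T | JohnCalls = j, MaryCalls = m), summing out Earthquake and Alarm."""
    score = {}
    for b in (True, False):
        score[b] = sum(joint(b, e, a, j, m)
                       for e, a in product((True, False), repeat=2))
    return score[True] / (score[True] + score[False])

p_burglary = posterior_burglary(j=True, m=True)
print(round(p_burglary, 3))
```

Even with both neighbours calling, the posterior is only about 0.284, because burglaries are rare and the alarm has other explanations.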
33
Inference in Singly Connected Networks
  • E.g. computing $P(X \mid E)$
  • Involves computing two values
  • Causal Support (evidence variables above X,
    connected through its parents)
  • Evidential Support (evidence variables below X,
    connected through its children)
  • Algorithm can perform in Linear Time.

34
Inference Algorithm
  • Spreads out from Q to evidence nodes, root
    nodes and leaf nodes.
  • Each recursive call excludes the node from which
    it was called.

[Figure: query node Q with causal support from evidence nodes E on one side and evidential support from evidence nodes E on the other]
35
Inference in Multiply Connected Networks
  • Exact Inference is known to be NP-Hard
  • Approaches include
  • Clustering
  • Conditioning
  • Stochastic Simulation
  • Stochastic Simulation is most often used,
    particularly on large networks.

36
Clustering
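[Figure: the network over Cloudy, Sprinkler, Rain and Wet Grass, shown with separate Sprinkler and Rain nodes (tables P(S) and P(R) given C) and with the two merged into a single Sprinkler Rain node]

Table for the merged node, P(S, R | C); columns give the values of (S, R):

C   TT    TF    FT    FF
T   .08   .02   .72   .18
F   .40   .10   .40   .10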
37
Conditioning
[Figure: conditioning; copies of the network over Cloudy, Sprinkler, Rain and Wet Grass]
38
Stochastic Simulation - Example
P(A=1) = 0.2

A  P(B=1)
0  0.2
1  0.8

A  P(C=1)
0  0.05
1  0.2

[Figure: network with nodes A, B, C, D, E]
39
Stochastic Simulation: Run repeated simulations
to estimate the probability distribution
  • Let $W_x$ be the states of all variables other than
    x.
  • Let the Markov blanket of a node be all of its
    parents, children and parents of children.
  • The distribution of each node x, conditioned on
    $W_x$, can be computed locally from its own
    conditional probability together with those of
    its children ($\alpha$ is a normalising constant):
  • $P(a \mid W_a) = \alpha \, P(a) \, P(b \mid a) \, P(c \mid a)$
  • $P(b \mid W_b) = \alpha \, P(b \mid a) \, P(d \mid b, c)$
  • $P(c \mid W_c) = \alpha \, P(c \mid a) \, P(d \mid b, c) \, P(e \mid c)$
  • Therefore, only the Markov blanket of a node is
    required to compute the distribution

40
The Algorithm
  • Set all observed nodes to their values
  • Set all other nodes to random values
  • STEP 1
  • Select an unobserved node randomly from the network
  • According to the states of the node's Markov
    blanket, compute $P(x = \text{state} \mid W_x)$ for all states
  • STEP 2
  • Use a random number generator that is biased
    according to the distribution computed in step 1
    to select the next value of the node
  • Repeat

41
Algorithm contd.
  • The final probability distribution of each
    unobserved node is calculated from either
  • 1) the number of times each node took a
    particular state
  • 2) the average conditional probability of each
    node taking a particular state given the other
    variables' states.
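The steps above can be sketched as a short sampler. This is a minimal illustration that uses the first estimator (counting how often a node takes a state). It is applied to the alarm network from earlier in the deck, because the slides give complete tables for it, whereas the tables for D and E in the five-node example are not in the transcript. The function names, step count and seed are arbitrary choices.

```python
import random

# Alarm network from the earlier slides: parents and P(node = T | parents).
PARENTS = {"B": (), "E": (), "A": ("B", "E"), "J": ("A",), "M": ("A",)}
P_TRUE = {
    "B": {(): 0.001},
    "E": {(): 0.002},
    "A": {(True, True): 0.95, (True, False): 0.94,
          (False, True): 0.29, (False, False): 0.001},
    "J": {(True,): 0.90, (False,): 0.05},
    "M": {(True,): 0.70, (False,): 0.01},
}
CHILDREN = {n: [c for c, ps in PARENTS.items() if n in ps] for n in PARENTS}

def local_prob(node, state):
    """P(node = state[node] | parents of node), read from the node's table."""
    p_true = P_TRUE[node][tuple(state[par] for par in PARENTS[node])]
    return p_true if state[node] else 1.0 - p_true

def prob_true_given_blanket(node, state):
    """P(node = T | Markov blanket): own table times the tables of its children."""
    weight = {}
    for value in (True, False):
        trial = dict(state)
        trial[node] = value
        w = local_prob(node, trial)
        for child in CHILDREN[node]:
            w *= local_prob(child, trial)
        weight[value] = w
    return weight[True] / (weight[True] + weight[False])

def simulate(query, evidence, n_steps, seed=0):
    """Estimate P(query = T | evidence) by counting visited states."""
    rng = random.Random(seed)
    state = dict(evidence)                       # observed nodes stay fixed
    hidden = [n for n in PARENTS if n not in evidence]
    for n in hidden:                             # random initial values
        state[n] = rng.random() < 0.5
    hits = 0
    for _ in range(n_steps):
        node = rng.choice(hidden)                # step 1: pick a node
        state[node] = rng.random() < prob_true_given_blanket(node, state)  # step 2
        hits += state[query]
    return hits / n_steps

estimate = simulate("B", {"J": True, "M": True}, n_steps=200_000)
print(round(estimate, 3))
```

The exact posterior for this query is roughly 0.284, so the estimate should land close to that value; the gap shrinks as `n_steps` grows.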

42
Case Study - Pathfinder
  • Diagnostic Expert System for Lymph-Node Diseases
  • 4 Versions of Pathfinder
  • 1) Rule Based
  • 2) Experimented with Certainty Factors /
    Dempster-Shafer theory / Bayesian Models
  • 3) Refined Probabilities
  • 4) Refined dependencies

43
Section 2 - Research Issues in Uncertainty
1: Learning Belief Networks from Data
  • Assume no knowledge of probability
    distributions or causal structure.
  • Is it possible to infer both of these from data?

Case  Fraud  Gas  Jewellery  Age    Sex
1     No     No   No         30-50  F
2     No     No   No         30-50  M
3     Yes    Yes  Yes        >50    M
4     No     No   No         30-50  M
5     No     Yes  No         <30    F
6     No     No   No         <30    F
7     No     No   No         >50    M
8     No     No   Yes        30-50  F
9     No     Yes  No         <30    M
10    No     No   No         <30    F
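Learning the structure is the hard part of this question. The probabilities, by contrast, can be estimated by counting once a structure is fixed. The sketch below shows only that simpler half, using relative frequencies over the ten cases. It assumes, purely for illustration, that Fraud is a parent of Gas and of Jewellery; the slides do not state this structure.

```python
from fractions import Fraction

COLUMNS = ("Fraud", "Gas", "Jewellery", "Age", "Sex")

# The ten cases from the table above.
CASES = [
    ("No", "No", "No", "30-50", "F"),
    ("No", "No", "No", "30-50", "M"),
    ("Yes", "Yes", "Yes", ">50", "M"),
    ("No", "No", "No", "30-50", "M"),
    ("No", "Yes", "No", "<30", "F"),
    ("No", "No", "No", "<30", "F"),
    ("No", "No", "No", ">50", "M"),
    ("No", "No", "Yes", "30-50", "F"),
    ("No", "Yes", "No", "<30", "M"),
    ("No", "No", "No", "<30", "F"),
]

def estimate(child, child_value, parent, parent_value):
    """Relative-frequency estimate of P(child = child_value | parent = parent_value)."""
    ci, pi = COLUMNS.index(child), COLUMNS.index(parent)
    matching = [row for row in CASES if row[pi] == parent_value]
    if not matching:
        raise ValueError("no cases with this parent value")
    hits = sum(1 for row in matching if row[ci] == child_value)
    return Fraction(hits, len(matching))

# Assumed structure for illustration: Fraud -> Gas, Fraud -> Jewellery.
p_gas_given_no_fraud = estimate("Gas", "Yes", "Fraud", "No")
p_gas_given_fraud = estimate("Gas", "Yes", "Fraud", "Yes")
p_jewellery_given_no_fraud = estimate("Jewellery", "Yes", "Fraud", "No")

print(p_gas_given_no_fraud, p_gas_given_fraud, p_jewellery_given_no_fraud)
```

With only one fraud case in the data, the estimate for the fraud row rests on a single observation, which illustrates why priors or much more data are needed in practice.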
44
Some Methods
  • Bayesian (Cooper & Herskovits 1991)
  • Minimum Description Length (Lam & Bacchus 1994)
  • Bound and Collapse (Ramoni 1996)

[Figure: network with nodes Fraud, Age, Sex, Gas, Jewellery]
45
2: Dynamics - Markov Models
[Figure: a chain of states, State t-2 to State t+2, linked by the state transition model; each state has a percept, Percept t-2 to Percept t+2, linked by the sensor model]
46
Updating over time
[Figure: updating over time across State t-1, State t and State t+1 with their percepts Percept t-1, Percept t and Percept t+1]
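One common way to update a belief over time with a state transition model and a sensor model is a predict-then-correct step. The sketch below shows that step for a two-state model. All numbers, the percept names and the two-state setup are hypothetical, chosen only to make the example runnable; none of them come from the slides.

```python
def forward_step(belief, transition, sensor, percept):
    """One update: predict with the transition model, then weight by the sensor model."""
    n = len(belief)
    # Prediction: P(State t+1 = j) = sum_i P(State t = i) * P(j | i)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction: multiply by P(percept | state) and renormalise.
    unnormalised = [sensor[j][percept] * predicted[j] for j in range(n)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical two-state model: state 0 = "rain", state 1 = "dry".
TRANSITION = [[0.7, 0.3],
              [0.3, 0.7]]
SENSOR = [{"umbrella": 0.9, "none": 0.1},   # P(percept | rain)
          {"umbrella": 0.2, "none": 0.8}]   # P(percept | dry)

belief_0 = [0.5, 0.5]
belief_1 = forward_step(belief_0, TRANSITION, SENSOR, "umbrella")
belief_2 = forward_step(belief_1, TRANSITION, SENSOR, "umbrella")

print([round(b, 3) for b in belief_1])
print([round(b, 3) for b in belief_2])
```

Only the current belief is carried forward, so earlier states and percepts can be discarded after each step.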
47
Dynamic Belief Networks - Forecasting Car sales
[Figure: two time slices, t-1 and t, each containing the nodes Price, Demand, Health and Supply]
48
3: Other Approaches to Modeling Uncertainty
  • Default Reasoning
  • Dempster-Shafer Theory
  • Fuzzy Logic
