Title: Bayesian Networks and Markov Models: User Modeling and Natural Language Processing
1. Bayesian Networks and Markov Models: User Modeling and Natural Language Processing
- Bayesian networks and Markov models
- Applications in User Modeling and Natural Language Processing
2. Bayesian Networks and Markov Models
- Bayesian AI
- Bayesian networks
- Decision networks
- Reasoning about changes over time
- Dynamic Bayesian Networks
- Markov models
3. Introduction to Bayesian AI
- Reasoning under uncertainty
- Probabilities
- Bayesian approach
- Bayes' Theorem: conditionalization
- Bayesian decision theory
4. Reasoning under Uncertainty
- Uncertainty: the quality or state of being not clearly known
- Distinguishes deductive knowledge from inductive belief
- Sources of uncertainty
- Ignorance
- Complexity
- Physical randomness
- Vagueness
5. Probability Calculus
- Classic approach to reasoning under uncertainty (origin: Pascal and Fermat)
- Kolmogorov's axioms
- Conditional probability
- Independence
6. Rev. Thomas Bayes (1702-1761)
7. Bayes' Theorem: Conditionalization
- Due to Rev. Thomas Bayes (1764)
- Conditionalization (see the formula after this list)
- Also read as: posterior = prior × likelihood / Pr(evidence)
- Assumptions
- Joint priors over hi and e exist
- Total evidence e is observed
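In standard notation, Bayes' theorem (conditionalization over hypotheses h_i given total evidence e) reads:

$$\Pr(h_i \mid e) \;=\; \frac{\Pr(e \mid h_i)\,\Pr(h_i)}{\Pr(e)} \;=\; \frac{\Pr(e \mid h_i)\,\Pr(h_i)}{\sum_j \Pr(e \mid h_j)\,\Pr(h_j)}$$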
8. Example: Breast Cancer
- Let Pr(h) = 0.01, Pr(e|h) = 0.8 and Pr(e|¬h) = 0.1
- Bayes' theorem yields:
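Plugging the numbers above into Bayes' theorem gives the posterior:

$$\Pr(h \mid e) = \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.1 \times 0.99} = \frac{0.008}{0.107} \approx 0.075$$

So even after a positive test, the probability of cancer is only about 7.5%, because the prior is so low.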
9. Bayesian Decision Theory
- Frank Ramsey (1926)
- Decision making under uncertainty: what action to take when the state of the world is unknown
- Bayesian answer: find the utility of each possible outcome (action-state pair), and take the action that maximizes expected utility
10. Bayesian Decision Theory: Example
- Expected utilities
- E(Take umbrella) = 30 × 0.4 + 10 × 0.6 = 18
- E(Leave umbrella) = -100 × 0.4 + 50 × 0.6 = -10
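A minimal sketch of that calculation in Python, assuming (as the numbers suggest) a 0.4 probability of rain and the four outcome utilities above; the variable names are illustrative, not from the slides.

    # Expected utility of each action, assuming Pr(rain) = 0.4.
    # Utilities (taken from the slide's numbers):
    #   take umbrella:  rain -> 30, no rain -> 10
    #   leave umbrella: rain -> -100, no rain -> 50
    p_rain = 0.4
    utility = {
        "take":  {"rain": 30,   "no_rain": 10},
        "leave": {"rain": -100, "no_rain": 50},
    }

    for action, u in utility.items():
        eu = p_rain * u["rain"] + (1 - p_rain) * u["no_rain"]
        print(f"E({action} umbrella) = {eu}")
    # Prints 18.0 for take and -10.0 for leave, so taking the
    # umbrella maximizes expected utility.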
11. Bayesian Conception of an AI
- An autonomous agent that
- has a utility structure (preferences)
- can learn about its world and the relationship (probabilities) between its actions and future states
- maximizes its expected utility
- The techniques used to learn about the world are mainly statistical → data mining
12. Bayesian Networks and Markov Models
- Bayesian AI
- Bayesian networks
- Decision networks
- Reasoning about changes over time
- Dynamic Bayesian Networks
- Markov models
13. Bayesian Networks (BNs): Overview
- Introduction to BNs
- Nodes, structure and probabilities
- Reasoning with BNs
- Understanding BNs
- Extensions of BNs
- Decision Networks
- Dynamic Bayesian Networks (DBNs)
14. Bayesian Networks
- A data structure that represents the dependence between variables
- Gives a concise specification of the joint probability distribution
- A Bayesian Network is a directed acyclic graph (DAG) in which the following holds
- A set of random variables makes up the nodes in the network
- A set of directed links connects pairs of nodes
- Each node has a probability distribution that quantifies the effects of its parents
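In standard form, this concise specification is the factorization of the joint distribution over the nodes X_1, ..., X_n:

$$\Pr(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} \Pr\bigl(X_i \mid \mathrm{Parents}(X_i)\bigr)$$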
15. Example: Lung Cancer Diagnosis
- A patient has been suffering from shortness of breath (called dyspnoea) and visits the doctor, worried that he has lung cancer.
- The doctor knows that other diseases, such as tuberculosis and bronchitis, are possible causes, as well as lung cancer. She also knows that other relevant information includes whether or not the patient is a smoker (increasing the chances of cancer and bronchitis) and what sort of air pollution he has been exposed to. A positive X-ray would indicate either TB or lung cancer.
16. Nodes and Values
- Q: What are the nodes to represent and what values can they take?
- A: Nodes can be discrete or continuous
- Boolean nodes represent propositions taking binary values. Example: the Cancer node represents the proposition "the patient has cancer"
- Ordered values. Example: a Pollution node with values low, medium, high
- Integral values. Example: Age with possible values 1-120
17. Lung Cancer Example: Nodes and Values
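A plausible listing of the nodes and their value sets for this example, based on the problem description (the exact value sets are assumptions; Pollution could equally be modelled with three ordered values):

    # Assumed node/value sets for the lung cancer example
    # (not the slide's exact table).
    nodes = {
        "Pollution": ["low", "high"],          # ordered node
        "Smoker":    [True, False],            # Boolean
        "Cancer":    [True, False],            # Boolean
        "Xray":      ["positive", "negative"],
        "Dyspnoea":  [True, False],            # Boolean
    }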
18. Lung Cancer Example: Network Structure
(Figure: DAG with arcs Pollution → Cancer, Smoker → Cancer, Cancer → Xray, Cancer → Dyspnoea)
19. Conditional Probability Tables (CPTs)
- After specifying topology, must specify
- the CPT for each discrete node
- Each row contains the conditional probability of each node value for each possible combination of values in its parent nodes
- Each row must sum to 1
- A CPT for a Boolean variable with n Boolean parents contains 2^(n+1) probabilities
- A node with no parents has one row (its prior probabilities)
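A sketch of how the CPT for the Boolean Cancer node (parents Pollution and Smoker) might be laid out; the probability values below are illustrative placeholders, not the numbers from the slides.

    # CPT for Cancer given its parents (Pollution, Smoker).
    # One row per combination of parent values; each row sums to 1.
    # The numbers are illustrative placeholders, NOT the slide's values.
    cancer_cpt = {
        # (pollution, smoker): (P(cancer=True), P(cancer=False))
        ("high", True):  (0.05,  0.95),
        ("high", False): (0.02,  0.98),
        ("low",  True):  (0.03,  0.97),
        ("low",  False): (0.001, 0.999),
    }

    # A Boolean node with n Boolean parents has 2**n rows of 2 entries,
    # i.e. 2**(n + 1) probabilities in total.
    assert all(abs(sum(row) - 1.0) < 1e-9 for row in cancer_cpt.values())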
20. Lung Cancer Example: CPTs
21. The Markov Property
- Modeling with BNs requires assuming the Markov Property
- There are no direct dependencies in the system being modelled which are not already explicitly shown via arcs
- Example: smoking can influence dyspnoea only through causing cancer
22. Reasoning with Bayesian Networks
- Basic task for any probabilistic inference system: compute the posterior probability distribution for a set of query variables, given new information about some evidence variables
- Also called conditioning, belief updating, or inference
23. Types of Reasoning
24. Reasoning with Numbers: Using Netica software
25. Understanding Bayesian Networks
- A (more compact) representation of the joint probability distribution
- understand how to construct a network
- Encoding of a collection of conditional independence statements
- understand how to design inference procedures
- via the Markov property: each conditional independence implied by the graph is present in the probability distribution
26. Representing the Joint Probability Distribution: Example
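Using the network structure from slide 18 (Pollution and Smoker are parents of Cancer, which is a parent of Xray and Dyspnoea), the joint distribution factorizes as:

$$\Pr(P, S, C, X, D) \;=\; \Pr(P)\,\Pr(S)\,\Pr(C \mid P, S)\,\Pr(X \mid C)\,\Pr(D \mid C)$$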
27. Conditional Independence
- The relationship between conditional independence and BN structure is important for understanding how BNs work
28. Conditional Independence: Causal Chains
- Causal chains give rise to conditional independence. Example: smoking causes cancer, which causes dyspnoea
29. Conditional Independence: Common Causes
- Common causes (or ancestors) also give rise to conditional independence. Example: cancer is a common cause of the two symptoms, a positive X-ray and dyspnoea
(Figure: generic common-cause structure over nodes A, B, C)
30. Conditional Dependence: Common Effects
- Common effects (or their descendants) give rise to conditional dependence. Example: cancer is a common effect of pollution and smoking. Given cancer, smoking explains away pollution
(Figure: generic common-effect structure over nodes A, B, C)
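Written with the slides' own examples, the three patterns amount to:

- Chain (Smoking → Cancer → Dyspnoea): Pr(Dyspnoea | Cancer, Smoking) = Pr(Dyspnoea | Cancer)
- Common cause (Xray ← Cancer → Dyspnoea): Pr(Xray | Cancer, Dyspnoea) = Pr(Xray | Cancer)
- Common effect (Pollution → Cancer ← Smoking): Pollution and Smoking are marginally independent, but become dependent once Cancer (or a descendant of it) is observed, which is the "explaining away" effect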
31. D-separation
- Graphical criterion of conditional independence
- We can determine whether a set of nodes X is independent of another set Y, given a set of evidence nodes E, via the Markov property
- If every undirected path from a node in X to a node in Y is d-separated by E, then X and Y are conditionally independent given E
32. Determining D-separation
- A set of nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E
- A path is blocked given a set of nodes E if there is a node Z on the path for which one of three conditions holds
- Z is in E and Z has one arrow on the path leading in and one arrow out (chain)
- Z is in E and Z has both path arrows leading out (common cause)
- Neither Z nor any descendant of Z is in E, and both path arrows lead into Z (common effect)
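These conditions can also be checked programmatically. The sketch below builds the lung cancer DAG from slide 18 and queries d-separation with networkx, assuming a networkx version that provides d_separated.

    import networkx as nx

    # Lung cancer network structure (slide 18).
    G = nx.DiGraph([
        ("Pollution", "Cancer"),
        ("Smoker", "Cancer"),
        ("Cancer", "Xray"),
        ("Cancer", "Dyspnoea"),
    ])

    # Common cause: Xray and Dyspnoea are d-separated given Cancer.
    print(nx.d_separated(G, {"Xray"}, {"Dyspnoea"}, {"Cancer"}))    # True

    # Common effect: Pollution and Smoker are d-separated given nothing...
    print(nx.d_separated(G, {"Pollution"}, {"Smoker"}, set()))      # True
    # ...but not once the common effect Cancer is observed.
    print(nx.d_separated(G, {"Pollution"}, {"Smoker"}, {"Cancer"})) # False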
33. Determining D-separation (cont.)
(Figure: the three blocking patterns: chain, common cause, common effect)
34. Bayesian Networks: Summary
- Bayes' rule allows unknown probabilities to be computed from known ones
- Conditional independence (due to causal relationships) allows efficient updating
- BNs are a natural way to represent conditional independence info
- qualitative: links between nodes
- quantitative: conditional probability tables (CPTs)
- BN inference
- computes the probability of query variables given evidence variables
- is flexible: we can enter evidence about any node and update beliefs in other nodes
35. Bayesian Networks and Markov Models
- Bayesian AI
- Bayesian networks
- Decision networks
- Reasoning about changes over time
- Dynamic Bayesian Networks
- Markov models
36. Decision Networks
- Extension of BNs to support making decisions
- Utility theory represents preferences between different outcomes of various plans
- Decision theory = utility theory + probability theory
37. Expected Utility
- E: available evidence
- A: a non-deterministic action
- Oi: a possible outcome state
- U: utility
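In one standard form, the expected utility of action A given evidence E is:

$$EU(A \mid E) \;=\; \sum_{i} \Pr(O_i \mid E, A)\; U(O_i)$$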
38. Decision Networks
- A decision network represents
- information about
- the agent's current state
- its possible actions
- the state that will result from the agent's action
- the utility of that state
- Also called Influence Diagrams
- (Howard & Matheson, 1981)
39. Types of Nodes
- Chance nodes (ovals): random variables (same as BNs)
- Have an associated CPT
- Parents can be decision nodes and other chance nodes
- Decision nodes (rectangles): points where the decision maker has a choice of actions
- Utility nodes (value nodes) (diamonds): the agent's utility function
- Have an associated table representing a multi-attribute utility function
- Parents are variables describing the outcome states that directly affect utility
40. Types of Links
- Informational links indicate when a chance node needs to be observed before a decision is made
- Conditioning links indicate the variables on which the probability assignment to a chance node will be conditioned
41. Fever Problem: Description
- Suppose that you know that a fever can be caused by the flu. You can use a thermometer, which is fairly reliable, to test whether or not you have a fever. Suppose you also know that if you take aspirin it will almost certainly lower a fever to normal. Some people (about 5% of the population) have a negative reaction to aspirin. You'll be happy to get rid of your fever, so long as you don't suffer an adverse reaction if you take aspirin.
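One plausible structure for the fever decision problem, read off the description above; the node names and arcs here are assumptions for illustration, not taken from the following slides.

    # Hypothetical structure for the fever decision problem.
    # Chance nodes, a decision node, and a utility node, with the
    # arcs suggested by the story (not the slide's figure).
    chance_nodes   = ["Flu", "Fever", "Thermometer", "Reaction", "FeverLater"]
    decision_nodes = ["TakeAspirin"]
    utility_nodes  = ["U"]

    arcs = [
        ("Flu", "Fever"),               # flu can cause a fever
        ("Fever", "Thermometer"),       # thermometer (fairly reliably) reads the fever
        ("Fever", "FeverLater"),        # a fever may persist
        ("TakeAspirin", "FeverLater"),  # aspirin almost certainly lowers the fever
        ("TakeAspirin", "Reaction"),    # about 5% of people react badly to aspirin
        ("Thermometer", "TakeAspirin"), # informational link: reading observed before deciding
        ("FeverLater", "U"),            # utility depends on still having a fever...
        ("Reaction", "U"),              # ...and on suffering an adverse reaction
    ]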
42. Fever Decision Network
43. Fever Decision Table
44. Bayesian Networks and Markov Models
- Bayesian AI
- Bayesian networks
- Decision networks
- Reasoning about changes over time
- Dynamic Bayesian networks
- Markov models
45. Dynamic Bayesian Networks (DBNs)
- One node for each variable for each time step
- Intra-slice arcs: Xi^t → Xj^t
- Inter-slice (temporal) arcs:
- Xi^t → Xi^(t+1)
- Xi^t → Xj^(t+1)
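With slices indexed by t, these arcs give the usual DBN factorization (standard form):

$$\Pr(X^{t+1} \mid X^{t}) \;=\; \prod_i \Pr\bigl(X_i^{t+1} \mid \mathrm{Parents}(X_i^{t+1})\bigr), \qquad \Pr(X^{0:T}) \;=\; \Pr(X^{0}) \prod_{t=0}^{T-1} \Pr(X^{t+1} \mid X^{t})$$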
46. Fever DBN
47. DBN Reasoning
- Can calculate distributions for events at time t+1 and further: probabilistic projection
- Reasoning can be done using standard BN updating algorithms
- This type of DBN gets very large, very quickly
- Usually keep only two time slices of the network
48. Dynamic Decision Networks
- Decision networks can be extended to include temporal aspects
- Sequence of decisions taken = plan
49. Fever DDN
50. Bayesian Networks and Markov Models
- Bayesian AI
- Bayesian networks
- Decision networks
- Reasoning about changes over time
- Dynamic Bayesian networks
- Markov models
51. Markov Models: Assumptions
- Stationary process: a process of change that is governed by laws that don't change over time
- Markov assumption: the current state depends only on a finite history of the previous states
- First-order MM
- Second-order MM
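In symbols (standard definitions), with X_t the state at time t:

- First-order: Pr(X_t | X_1, ..., X_{t-1}) = Pr(X_t | X_{t-1})
- Second-order: Pr(X_t | X_1, ..., X_{t-1}) = Pr(X_t | X_{t-1}, X_{t-2})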
52. Markov Prediction Models: Example
- Observation: sequence of document requests arriving at a Web site: D1, D2, D3, D2, D1, D4, D2, D3, ...
- Task: predict the next requested document
- First-order MM: calculate Pr(Di | Dj)
- Second-order MM: D1,D2 → D3; D2,D3 → D2; D3,D2 → D1; D2,D1 → D4; D1,D4 → D2; D4,D2 → D3; ... Calculate Pr(Di | Dj,Dk)
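A minimal sketch of fitting the first-order model from the request stream above (variable names are illustrative):

    from collections import Counter, defaultdict

    # Observed request stream from the slide.
    requests = ["D1", "D2", "D3", "D2", "D1", "D4", "D2", "D3"]

    # Count first-order transitions Dj -> Di.
    counts = defaultdict(Counter)
    for prev, cur in zip(requests, requests[1:]):
        counts[prev][cur] += 1

    # Normalize into Pr(Di | Dj) and predict the most likely next request.
    def predict_next(prev):
        total = sum(counts[prev].values())
        probs = {doc: c / total for doc, c in counts[prev].items()}
        return max(probs, key=probs.get), probs

    print(predict_next("D2"))  # ('D3', {'D3': 0.67, 'D1': 0.33}) approximately

The second-order model is built the same way, keyed on the previous two requests instead of one.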
53. Hidden Markov Models (HMMs)
- An HMM is a temporal probabilistic model for a process, where the state of the process is described by a single discrete random variable
- The possible values of the variable are the possible states of the world
- Additional state variables are added by combining them into one mega-variable
54. Hidden Markov Models (cont.)
- State transitions in an HMM
- x: hidden states; y: observable outputs; a: transition probabilities; b: output probabilities
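With these symbols, the parameters are a_ij = Pr(x_{t+1} = j | x_t = i) and b_ik = Pr(y_t = k | x_t = i), and the joint distribution factorizes as (standard HMM form):

$$\Pr(x_{0:T}, y_{0:T}) \;=\; \Pr(x_0)\,\prod_{t=1}^{T} \Pr(x_t \mid x_{t-1})\,\prod_{t=0}^{T} \Pr(y_t \mid x_t)$$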
55. Hidden Markov Models: Example
(Figure: unrolled HMM with hidden states Rain(t-1), Rain(t), Rain(t+1) and observations Umbrella(t-1), Umbrella(t), Umbrella(t+1))
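A small filtering sketch for this model; the parameter values below (0.7, 0.9, 0.2) are commonly used illustrative numbers, not taken from the slides.

    # Forward (filtering) step for the rain/umbrella HMM.
    # States: rain / no rain; observation: umbrella seen or not.
    # Parameters are illustrative values, not from the slides.
    P_RAIN_GIVEN_RAIN = 0.7      # Pr(Rain_t+1 | Rain_t)
    P_RAIN_GIVEN_NO   = 0.3      # Pr(Rain_t+1 | not Rain_t)
    P_UMB_GIVEN_RAIN  = 0.9      # Pr(Umbrella | Rain)
    P_UMB_GIVEN_NO    = 0.2      # Pr(Umbrella | not Rain)

    def forward_step(belief_rain, umbrella_seen):
        """One step of filtering: predict, then weight by the observation."""
        # Predict Pr(Rain_t+1 | evidence so far).
        predicted = (P_RAIN_GIVEN_RAIN * belief_rain
                     + P_RAIN_GIVEN_NO * (1 - belief_rain))
        # Update with the new observation and renormalize.
        like_rain = P_UMB_GIVEN_RAIN if umbrella_seen else 1 - P_UMB_GIVEN_RAIN
        like_no   = P_UMB_GIVEN_NO   if umbrella_seen else 1 - P_UMB_GIVEN_NO
        unnorm_rain = like_rain * predicted
        unnorm_no   = like_no * (1 - predicted)
        return unnorm_rain / (unnorm_rain + unnorm_no)

    belief = 0.5                       # uniform prior on day 0
    for obs in [True, True]:           # umbrella seen on two consecutive days
        belief = forward_step(belief, obs)
        print(round(belief, 3))        # ~0.818, then ~0.883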
56. Summary (I): Bayesian Networks and Markov Models
- BNs are graphical probabilistic models that express causal and evidential relations between propositions
- Dynamic Bayesian Networks (DBNs): the BN is replicated for each time slice
- Markov models are graphical probabilistic models that represent transitions between states
57. Summary (II): Static versus Temporal Reasoning
- Static reasoning
- Bayesian networks
- Decision networks
- Temporal reasoning
- Markov models and HMMs
- Dynamic Bayesian networks