Bayesian Networks and Markov Models: User Modeling and Natural Language Processing

1
Bayesian Networks and Markov Models User
Modeling and Natural Language Processing
  • Bayesian networks and Markov models
  • Applications in User Modeling and Natural
    Language Processing

2
Bayesian Networks and Markov Models
  • Bayesian AI
  • Bayesian networks
  • Decision networks
  • Reasoning about changes over time
  • Dynamic Bayesian Networks
  • Markov models

3
Introduction to Bayesian AI
  • Reasoning under uncertainty
  • Probabilities
  • Bayesian approach
  • Bayes' theorem and conditionalization
  • Bayesian decision theory

4
Reasoning under Uncertainty
  • Uncertainty: the quality or state of being not
    clearly known
  • Distinguishes deductive knowledge from inductive
    belief
  • Sources of uncertainty
  • Ignorance
  • Complexity
  • Physical randomness
  • Vagueness

5
Probability Calculus
  • Classic approach to reasoning under uncertainty
    (origin Pascal and Fermat)
  • Kolmogorov's axioms
  • Conditional probability
  • Independence

6
Rev. Thomas Bayes (1702-1761)
7
Bayes' Theorem and Conditionalization
  • Due to Rev. Thomas Bayes (1764)
  • Bayes' theorem: Pr(h|e) = Pr(e|h) Pr(h) / Pr(e)
  • Also read as: posterior ∝ likelihood x prior
  • Assumptions
  • Joint priors over hi and e exist
  • Total evidence e is observed

8
Example: Breast Cancer
  • Let Pr(h) = 0.01, Pr(e|h) = 0.8 and Pr(e|¬h) = 0.1
  • Bayes' theorem yields Pr(h|e) = (0.8 x 0.01) /
    (0.8 x 0.01 + 0.1 x 0.99) ≈ 0.075
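The arithmetic above can be checked in a few lines; this is a minimal sketch (function and argument names are our own):

```python
# Bayes' theorem: Pr(h|e) = Pr(e|h) Pr(h) / Pr(e), where the evidence
# term is Pr(e) = Pr(e|h) Pr(h) + Pr(e|~h) Pr(~h).
def posterior(prior, likelihood, false_alarm):
    evidence = likelihood * prior + false_alarm * (1 - prior)
    return likelihood * prior / evidence

# Breast cancer numbers: Pr(h) = 0.01, Pr(e|h) = 0.8, Pr(e|~h) = 0.1
p = posterior(prior=0.01, likelihood=0.8, false_alarm=0.1)
print(round(p, 4))  # 0.0748 -- a positive test still leaves cancer unlikely
```

Note how the small prior dominates: even a fairly reliable test yields a posterior under 8%.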

9
Bayesian Decision Theory
  • Frank Ramsey (1926)
  • Decision making under uncertainty: what action
    to take when the state of the world is unknown
  • Bayesian answer: find the utility of each
    possible outcome (action-state pair), and take
    the action that maximizes expected utility

10
Bayesian Decision Theory: Example
  • Expected utilities (with Pr(rain) = 0.4):
  • EU(Take umbrella) = 30 x 0.4 + 10 x 0.6 = 18
  • EU(Leave umbrella) = -100 x 0.4 + 50 x 0.6 = -10
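The same calculation in code, assuming Pr(rain) = 0.4 as implied by the slide's numbers (names are illustrative):

```python
# Expected utility: EU(A) = sum over outcomes of Pr(outcome) * U(outcome).
p_rain = 0.4
utilities = {
    "take_umbrella":  {"rain": 30,   "no_rain": 10},
    "leave_umbrella": {"rain": -100, "no_rain": 50},
}
eu = {a: u["rain"] * p_rain + u["no_rain"] * (1 - p_rain)
      for a, u in utilities.items()}
best = max(eu, key=eu.get)
print(eu)   # EU(take) = 18, EU(leave) = -10
print(best) # take_umbrella
```

Taking the umbrella maximizes expected utility even though leaving it is better in the most likely (no-rain) state.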

11
Bayesian Conception of an AI
  • An autonomous agent that
  • has a utility structure (preferences)
  • can learn about its world and the relationships
    (probabilities) between its actions and future
    states
  • maximizes its expected utility
  • The techniques used to learn about the world are
    mainly statistical (data mining)

12
Bayesian Networks and Markov Models
  • Bayesian AI
  • Bayesian networks
  • Decision networks
  • Reasoning about changes over time
  • Dynamic Bayesian Networks
  • Markov models

13
Bayesian Networks (BNs) Overview
  • Introduction to BNs
  • Nodes, structure and probabilities
  • Reasoning with BNs
  • Understanding BNs
  • Extensions of BNs
  • Decision Networks
  • Dynamic Bayesian Networks (DBNs)

14
Bayesian Networks
  • A data structure that represents the dependence
    between variables
  • Gives a concise specification of the joint
    probability distribution
  • A Bayesian network is a directed acyclic graph
    (DAG) in which the following holds:
  • A set of random variables makes up the nodes in
    the network
  • A set of directed links connects pairs of nodes
  • Each node has a probability distribution that
    quantifies the effects of its parents

15
Example Lung Cancer Diagnosis
  • A patient has been suffering from shortness of
    breath (called dyspnoea) and visits the doctor,
    worried that he has lung cancer.
  • The doctor knows that other diseases, such as
    tuberculosis and bronchitis, are possible causes,
    as well as lung cancer. She also knows that other
    relevant information includes whether or not the
    patient is a smoker (increasing the chances of
    cancer and bronchitis) and what sort of air
    pollution he has been exposed to. A positive Xray
    would indicate either TB or lung cancer.

16
Nodes and Values
  • Q: What are the nodes to represent, and what
    values can they take?
  • A: Nodes can be discrete or continuous
  • Boolean nodes represent propositions taking
    binary values. Example: the Cancer node represents
    the proposition "the patient has cancer"
  • Ordered values. Example: Pollution node with
    values low, medium, high
  • Integral values. Example: Age with possible
    values 1-120

17
Lung Cancer Example: Nodes and Values
18
Lung Cancer Example: Network Structure
(Network structure: Pollution → Cancer ← Smoker; Cancer → Xray; Cancer → Dyspnoea)
19
Conditional Probability Tables (CPTs)
  • After specifying the topology, one must specify
    the CPT for each discrete node
  • Each row contains the conditional probability of
    each node value for each possible combination of
    values in its parent nodes
  • Each row must sum to 1
  • A CPT for a Boolean variable with n Boolean
    parents contains 2^(n+1) probabilities
  • A node with no parents has one row (its prior
    probabilities)
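One way such a CPT might be stored in practice; the probabilities below are illustrative placeholders, not values from the slides:

```python
# A CPT maps each combination of parent values to a distribution over
# the node's values. For Boolean Cancer with parents Pollution and
# Smoker, it suffices to store Pr(Cancer=True) per row.
cancer_cpt = {
    # (pollution, smoker) -> Pr(Cancer = True); illustrative numbers
    ("high", True):  0.05,
    ("high", False): 0.02,
    ("low",  True):  0.03,
    ("low",  False): 0.001,
}

def row(cpt, parent_values):
    """Expand one stored entry into a full row over {True, False}."""
    p = cpt[parent_values]
    return {True: p, False: 1 - p}

r = row(cancer_cpt, ("high", True))
print(r, sum(r.values()))  # each row sums to 1
```

With 2 Boolean parents there are 2^2 = 4 rows of 2 values each, i.e. 2^(2+1) = 8 probabilities in the full table.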

20
Lung Cancer Example: CPTs
21
The Markov Property
  • Modeling with BNs requires assuming the Markov
    Property
  • There are no direct dependencies in the system
    being modelled which are not already explicitly
    shown via arcs
  • Example: smoking can influence dyspnoea only
    through causing cancer

22
Reasoning with Bayesian Networks
  • Basic task for any probabilistic inference
    system: compute the posterior probability
    distribution for a set of query variables, given
    new information about some evidence variables
  • Also called conditioning, belief updating, or
    inference

23
Types of Reasoning
24
Reasoning with Numbers Using Netica software
25
Understanding Bayesian Networks
  • A (more compact) representation of the joint
    probability distribution
  • understand how to construct a network
  • Encoding of a collection of conditional
    independence statements
  • understand how to design inference procedures
  • Via the Markov property, each conditional
    independence implied by the graph is present in
    the probability distribution

26
Representing the Joint Probability Distribution
Example: Pr(P, S, C, X, D) = Pr(P) Pr(S) Pr(C|P,S) Pr(X|C) Pr(D|C)
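For the lung cancer network, the joint factors as Pr(P,S,C,X,D) = Pr(P) Pr(S) Pr(C|P,S) Pr(X|C) Pr(D|C). A sketch of that factorization in code; all probability values are made-up placeholders:

```python
# Each factor is one node's (C)PT; all numbers are illustrative.
def pr_P(p):
    return 0.9 if p == "low" else 0.1
def pr_S(s):
    return 0.3 if s else 0.7
def pr_C(c, p, s):
    base = {("low", False): 0.001, ("low", True): 0.03,
            ("high", False): 0.02, ("high", True): 0.05}[(p, s)]
    return base if c else 1 - base
def pr_X(x, c):
    return (0.9 if c else 0.2) if x else (0.1 if c else 0.8)
def pr_D(d, c):
    return (0.65 if c else 0.3) if d else (0.35 if c else 0.7)

def joint(p, s, c, x, d):
    return pr_P(p) * pr_S(s) * pr_C(c, p, s) * pr_X(x, c) * pr_D(d, c)

# Sanity check: a proper joint distribution sums to 1.
total = sum(joint(p, s, c, x, d)
            for p in ("low", "high") for s in (True, False)
            for c in (True, False) for x in (True, False)
            for d in (True, False))
print(round(total, 10))  # 1.0
```

The network stores 2 + 2 + 8 + 4 + 4 numbers instead of the 2^5 = 32 entries of the explicit joint table, which is the sense in which a BN is "more compact".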
27
Conditional Independence
  • The relationship between conditional independence
    and BN structure is important for understanding
    how BNs work

28
Conditional Independence Causal Chains
  • Causal chains give rise to conditional
    independence. Example: smoking causes cancer,
    which causes dyspnoea

29
Conditional Independence Common Causes
  • Common causes (or ancestors) also give rise to
    conditional independence. Example: Cancer is a
    common cause of the two symptoms, a positive
    Xray and dyspnoea

30
Conditional Dependence Common Effects
  • Common effects (or their descendants) give rise
    to conditional dependence. Example: Cancer is a
    common effect of pollution and smoking. Given
    cancer, smoking "explains away" pollution

31
D-separation
  • Graphical criterion of conditional independence
  • We can determine whether a set of nodes X is
    independent of another set Y, given a set of
    evidence nodes E, via the Markov property
  • If every undirected path from a node in X to a
    node in Y is d-separated by E, then X and Y are
    conditionally independent given E

32
Determining D-separation
  • A set of nodes E d-separates two sets of nodes X
    and Y, if every undirected path from a node in X
    to a node in Y is blocked given E
  • A path is blocked given a set of nodes E if
    there is a node Z on the path for which one of
    three conditions holds:
  • Z is in E and Z has one arrow on the path leading
    in and one arrow out (chain)
  • Z is in E and Z has both path arrows leading out
    (common cause)
  • Neither Z nor any descendant of Z is in E, and
    both path arrows lead into Z (common effect)
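The three blocking conditions can be checked mechanically. A minimal sketch over the lung cancer network (function names are our own; the graph is stored as node → parents):

```python
# Lung cancer network, stored as node -> set of parents.
parents = {"Pollution": set(), "Smoker": set(),
           "Cancer": {"Pollution", "Smoker"},
           "Xray": {"Cancer"}, "Dyspnoea": {"Cancer"}}

def descendants(node):
    out = set()
    for child, ps in parents.items():
        if node in ps:
            out |= {child} | descendants(child)
    return out

def z_blocks(a, z, b, evidence):
    if a in parents[z] and b in parents[z]:
        # Common effect: blocked unless Z or a descendant of Z is in E
        return z not in evidence and not (descendants(z) & evidence)
    # Chain or common cause: blocked exactly when Z is in E
    return z in evidence

def path_blocked(path, evidence):
    return any(z_blocks(path[i - 1], path[i], path[i + 1], evidence)
               for i in range(1, len(path) - 1))

# Xray and Dyspnoea are independent given their common cause Cancer:
print(path_blocked(["Xray", "Cancer", "Dyspnoea"], {"Cancer"}))    # True
# Observing Cancer makes Pollution and Smoker dependent (explaining away):
print(path_blocked(["Pollution", "Cancer", "Smoker"], {"Cancer"}))  # False
```

Chain and common-cause nodes block when observed; a collider blocks only when neither it nor any of its descendants is observed.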

33
Determining D-separation (cont)
(Diagrams: chain, common cause, common effect)
34
Bayesian Networks Summary
  • Bayes rule allows unknown probabilities to be
    computed from known ones
  • Conditional independence (due to causal
    relationships) allows efficient updating
  • BNs are a natural way to represent conditional
    independence info
  • qualitative: links between nodes
  • quantitative: conditional probability tables
    (CPTs)
  • BN inference
  • computes the probability of query variables given
    evidence variables
  • is flexible: we can enter evidence about any
    node and update beliefs in other nodes

35
Bayesian Networks and Markov Models
  • Bayesian AI
  • Bayesian networks
  • Decision networks
  • Reasoning about changes over time
  • Dynamic Bayesian Networks
  • Markov models

36
Decision Networks
  • Extension of BNs to support making decisions
  • Utility theory represents preferences between
    different outcomes of various plans
  • Decision theory = Utility theory + Probability
    theory

37
Expected Utility
  • EU(A|E) = Σi Pr(Oi | E, A) x U(Oi)
  • E: available evidence
  • A: a non-deterministic action
  • Oi: a possible outcome state
  • U: utility

38
Decision Networks
  • A decision network represents information about
  • the agent's current state
  • its possible actions
  • the state that will result from the agent's
    action
  • the utility of that state
  • Also called Influence Diagrams
    (Howard & Matheson, 1981)

39
Types of Nodes
  • Chance nodes (ovals): random variables (same as
    in BNs)
  • Have an associated CPT
  • Parents can be decision nodes and other chance
    nodes
  • Decision nodes (rectangles): points where the
    decision maker has a choice of actions
  • Utility nodes (value nodes) (diamonds): the
    agent's utility function
  • Have an associated table representing a
    multi-attribute utility function
  • Parents are variables describing the outcome
    states that directly affect utility

40
Types of Links
  • Informational links: indicate when a chance node
    needs to be observed before a decision is made
  • Conditioning links: indicate the variables on
    which the probability assignment to a chance node
    will be conditioned

41
Fever Problem Description
  • Suppose that you know that a fever can be caused
    by the flu. You can use a thermometer, which is
    fairly reliable, to test whether or not you have
    a fever. Suppose you also know that if you take
    aspirin it will almost certainly lower a fever to
    normal. Some people (about 5% of the population)
    have a negative reaction to aspirin. You'll be
    happy to get rid of your fever, so long as you
    don't suffer an adverse reaction if you take
    aspirin.

42
Fever Decision Network
43
Fever Decision Table
44
Bayesian Networks and Markov Models
  • Bayesian AI
  • Bayesian networks
  • Decision networks
  • Reasoning about changes over time
  • Dynamic Bayesian networks
  • Markov models

45
Dynamic Bayesian Networks (DBNs)
  • One node for each variable for each time step
  • Intra-slice arcs: Xi(T) → Xj(T)
  • Inter-slice (temporal) arcs:
  • Xi(T) → Xi(T+1)
  • Xi(T) → Xj(T+1)
46
Fever DBN
47
DBN Reasoning
  • Can calculate distributions for events at time
    t+1 and further into the future (probabilistic
    projection)
  • Reasoning can be done using standard BN updating
    algorithms
  • This type of DBN gets very large, very quickly
  • Usually keep only two time slices of the network

48
Dynamic Decision Networks
  • Decision networks can be extended to include
    temporal aspects
  • A sequence of decisions taken is a plan

49
Fever DDN
50
Bayesian Networks and Markov Models
  • Bayesian AI
  • Bayesian networks
  • Decision networks
  • Reasoning about changes over time
  • Dynamic Bayesian networks
  • Markov models

51
Markov Models Assumptions
  • Stationary process: a process of change that is
    governed by laws that don't change over time
  • Markov assumption: the current state depends
    only on a finite history of the previous states
  • First-order MM: the current state depends only
    on the previous state
  • Second-order MM: the current state depends only
    on the previous two states

52
Markov Prediction Models: Example
  • Observation: a sequence of document requests
    arriving at a Web site: D1, D2, D3, D2, D1, D4,
    D2, D3, ...
  • Task: predict the next requested document
  • First-order MM: calculate Pr(Di | Dj)
  • Second-order MM: (D1,D2)→D3, (D2,D3)→D2,
    (D3,D2)→D1, (D2,D1)→D4, (D1,D4)→D2,
    (D4,D2)→D3, ...; calculate Pr(Di | Dj, Dk)
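A first-order model for this request stream can be estimated by counting transitions; a minimal sketch:

```python
# First-order Markov prediction: count transitions D_j -> D_i in the
# observed request sequence, then predict the most frequent successor.
from collections import Counter, defaultdict

requests = ["D1", "D2", "D3", "D2", "D1", "D4", "D2", "D3"]
trans = defaultdict(Counter)
for prev, cur in zip(requests, requests[1:]):
    trans[prev][cur] += 1

def predict(last):
    # Pr(Di | Dj) estimated as count(Dj -> Di) / count(Dj -> anything)
    return trans[last].most_common(1)[0][0]

print(predict("D2"))  # D3: after D2 the site has seen D3 twice, D1 once
```

A second-order model would key the counters on pairs `(requests[i-2], requests[i-1])` instead, at the cost of many more parameters.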

53
Hidden Markov Models (HMMs)
  • An HMM is a temporal probabilistic model for a
    process whose state is described by a single
    discrete random variable
  • The possible values of the variable are the
    possible states of the world
  • Additional state variables are handled by
    combining them into one mega-variable

54
Hidden Markov Models (cont)
  • State transitions in an HMM:
  • x: hidden states
  • y: observable outputs
  • a: transition probabilities
  • b: output probabilities

55
Hidden Markov Models: Example
(Network: Rain(t-1) → Rain(t) → Rain(t+1), with Rain(t) → Umbrella(t) at each time slice)
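Filtering in the rain/umbrella model uses the standard forward recursion. The sketch below assumes the familiar textbook parameters (Pr(Rain_t | Rain_{t-1}) = 0.7, Pr(Umbrella | Rain) = 0.9, Pr(Umbrella | ¬Rain) = 0.2); these numbers are assumptions, not taken from the slides:

```python
# Forward (filtering) recursion for the rain/umbrella HMM:
#   predict: Pr(Rain_t) = sum_s Pr(Rain_t | Rain_{t-1}=s) Pr(Rain_{t-1}=s)
#   update:  multiply by Pr(Umbrella_t | Rain_t) and normalize.
P_RR = 0.7   # a: Pr(Rain_t | Rain_{t-1})
P_RD = 0.3   #    Pr(Rain_t | ~Rain_{t-1})
P_UR = 0.9   # b: Pr(Umbrella | Rain)
P_UD = 0.2   #    Pr(Umbrella | ~Rain)

def forward(belief_rain, umbrella_seen):
    rain = P_RR * belief_rain + P_RD * (1 - belief_rain)   # predict
    dry = 1 - rain
    if umbrella_seen:                                      # update
        rain, dry = rain * P_UR, dry * P_UD
    else:
        rain, dry = rain * (1 - P_UR), dry * (1 - P_UD)
    return rain / (rain + dry)                             # normalize

b = 0.5                    # uniform prior over Rain_0
for obs in (True, True):   # umbrella observed on days 1 and 2
    b = forward(b, obs)
print(round(b, 3))  # 0.883
```

Because only the previous belief state is needed, this is exactly the "keep two time slices" strategy mentioned for DBN reasoning.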
56
Summary (I) Bayesian Networks and Markov Models
  • BNs are graphical probabilistic models that
    express causal and evidential relations between
    propositions
  • Dynamic Bayesian Networks (DBNs): the BN is
    replicated for each time slice
  • Markov models are graphical probabilistic models
    that represent transitions between states

57
Summary (II) Static versus Temporal Reasoning
  • Static reasoning
  • Bayesian networks
  • Decision networks
  • Temporal reasoning
  • Markov models and HMMs
  • Dynamic Bayesian networks