Bayesian Network - PowerPoint PPT Presentation

Transcript and Presenter's Notes



1
Bayesian Network
  • David Grannen
  • Mathieu Robin
  • Micheal Lynch
  • Sohail Akram
  • Tolu Aina

2
Bayesianism is a controversial but increasingly
popular approach to statistics that offers many
benefits, although not everyone is persuaded of
its validity.
3
  • Bayesian networks are based on a statistical
    approach presented by the mathematician Thomas
    Bayes in 1763.
  • This is an approach for calculating
    probabilities among several variables that are
    causally related but for which the relationships
    can't easily be derived by experimentation.
  • Bayes' formula provides the mathematical tool
    that combines prior knowledge with current data
    to produce a posterior distribution.

4
It most likely seemed to be a complicated
formula that looked something like this:

P(a|b) = L(b|a)P(a) / [L(b|a)P(a) + L(b|not a)P(not a)]

Following a medical example, we have a patient
who is concerned about his/her chances of
experiencing a heart attack. Historical data that
we have:

Population experiencing heart attacks: 20%
Smokers among those who experienced a heart attack: 90%
Smokers among those without a heart attack: 60%

P(heart attack | smoker) =
  L(smoker | heart attack) × Prior(heart attack) /
  [L(smoker | heart attack) × Prior(heart attack) +
   L(smoker | no heart attack) × Prior(no heart attack)]

or: P(heart attack | smoker) = (90% × 20%) /
    (90% × 20% + 60% × 80%) = 27%
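The arithmetic above can be checked directly. This short Python sketch is an added illustration, not part of the original slides; the variable names are mine:

```python
# Bayes' rule for the heart-attack example on this slide.
# Priors and likelihoods taken from the historical data quoted above.
p_ha = 0.20                 # P(heart attack) in the population
p_smoker_given_ha = 0.90    # P(smoker | heart attack)
p_smoker_given_no = 0.60    # P(smoker | no heart attack)

numerator = p_smoker_given_ha * p_ha
denominator = numerator + p_smoker_given_no * (1 - p_ha)
p_ha_given_smoker = numerator / denominator

print(round(p_ha_given_smoker, 2))  # 0.27
```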
5
  • Bayesian networks are complex diagrams that
    organize the body of knowledge in any given area
    by mapping out cause-and-effect relationships
    among key variables and encoding them with
    numbers that represent the extent to which one
    variable is likely to affect another.
  • This approach allows scientists to combine new
    data with their existing knowledge or expertise.

6
  • In the late 1980s, building on the work of Judea
    Pearl, a professor of computer science at UCLA,
    AI researchers discovered that Bayesian networks
    offered an efficient way to deal with the lack or
    ambiguity of information that had hampered
    previous systems.
  • Bayesian networks provide "an overarching
    graphical framework" that brings together diverse
    elements of AI and increases the range of its
    likely application to the real world.

7
Bayesian applications
8
  • Decision-making using Bayesian methods has many
    applications in software. The best-known
    example is Microsoft's Office Assistant. When a
    user calls up the assistant, Bayesian methods are
    used to analyse recent actions in order to
    work out what the user is attempting to do, with
    this calculation constantly being modified in the
    light of new actions.
  • Microsoft is the most aggressive in exploiting
    the Bayesian approach. The company offers a free Web
    service that helps customers diagnose printing
    problems with their computers and recommends the
    quickest way to resolve them. Another Web service
    helps parents diagnose their children's health
    problems.

9
  • Scott Musman, a computer consultant in Arlington,
    Va., recently designed a Bayesian network for the
    Navy that can identify enemy missiles, aircraft
    or vessels and recommend which weapons could be
    used most advantageously against incoming
    targets.
  • General Electric is using Bayesian techniques to
    develop a system that will take information from
    sensors attached to an engine and, based on
    expert opinion built into the system as well as
    vast amounts of data on past engine performance,
    pinpoint emerging problems

10
Representation of Graphical Models
  • Graphical models are graphs in which nodes
    represent random variables.
  • A Bayesian network is a kind of directed graphical
    model, which takes into account the
    directionality of the arcs (arrows between
    nodes).
  • An advantage of a directed graphical model is that one
    can regard an arc from A to B as indicating that
    A "causes" B.

[Figure: node A with an arc to node B]
11
Graphical Models 2
  • Along with the graph, it is necessary to specify the
    parameters of the model.
  • For a directed model, we must specify the
    Conditional Probability Distribution (CPD) at
    each node.
  • If the variables are discrete, this can be
    represented as a table (CPT), which lists the
    probability that the child node takes on each of
    its different values for each combination of
    values of its parents.
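As an added sketch (not from the slides), a discrete CPT can be stored as a mapping from parent-value combinations to a distribution over the child's values. The variable names below are illustrative, chosen to match the wet-grass example that follows:

```python
# A CPT for a binary child W with binary parents S and R,
# stored as {(s, r): {w: probability}}; each row sums to 1.0.
cpt_w = {
    (False, False): {True: 0.0,  False: 1.0},
    (True,  False): {True: 0.9,  False: 0.1},
    (False, True):  {True: 0.9,  False: 0.1},
    (True,  True):  {True: 0.99, False: 0.01},
}

# Look up Pr(W = true | S = true, R = false):
print(cpt_w[(True, False)][True])  # 0.9

# Sanity check: every row is a proper distribution.
for row in cpt_w.values():
    assert abs(sum(row.values()) - 1.0) < 1e-9
```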

12
Example: Wet Grass
13
Example: Wet Grass
  • Event: grass is wet. Two possible causes: rain or sprinkler.
  • From the table, Pr(W = true | S = true, R = false) = 0.9;
    each row sums to 1.0, so Pr(W = false | S = true,
    R = false) = 0.1.
  • Developing inference from the Bayesian network.

14
Inference
We observe the grass is wet. There are two possible causes, sprinkler
or rain. Which is more likely?

Pr(S=1 | W=1) = Σ Pr(C=c, S=1, R=r, W=1) / Pr(W=1) = 0.2781 / 0.6471
Pr(R=1 | W=1) = Σ Pr(C=c, S=s, R=1, W=1) / Pr(W=1) = 0.4581 / 0.6471

Normalizing constant: Pr(W=1) = 0.6471
15
Inference 2
  • Pr(S=1 | W=1) = 0.2781 / 0.6471 = 0.429
  • Pr(R=1 | W=1) = 0.4581 / 0.6471 = 0.7079
  • More likely the grass is wet because it's raining!
  • The example given is bottom-up reasoning in a Bayes network, from
    effects to causes. Top-down reasoning is also
    possible: using the example above we can deduce the
    probability the grass is wet given that it's cloudy.
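These posteriors can be reproduced by brute-force enumeration of the joint distribution. The CPTs below are the standard ones from Kevin Murphy's sprinkler example, which match the numbers quoted on these slides; this code is an added sketch, not part of the original presentation:

```python
from itertools import product

# CPTs for the cloudy/sprinkler/rain/wet-grass network.
p_c = {True: 0.5, False: 0.5}          # P(C)
p_s = {True: 0.1, False: 0.5}          # P(S=true | C)
p_r = {True: 0.8, False: 0.2}          # P(R=true | C)
p_w = {(False, False): 0.0, (True, False): 0.9,
       (False, True): 0.9, (True, True): 0.99}  # P(W=true | S, R)

def joint(c, s, r, w):
    # Joint probability factorises along the arcs of the network.
    ps = p_s[c] if s else 1 - p_s[c]
    pr = p_r[c] if r else 1 - p_r[c]
    pw = p_w[(s, r)] if w else 1 - p_w[(s, r)]
    return p_c[c] * ps * pr * pw

# Marginalise out the hidden variables by summing the joint.
p_w1 = sum(joint(c, s, r, True) for c, s, r in product([True, False], repeat=3))
p_s1w1 = sum(joint(c, True, r, True) for c, r in product([True, False], repeat=2))
p_r1w1 = sum(joint(c, s, True, True) for c, s in product([True, False], repeat=2))

print(round(p_w1, 4))           # 0.6471
print(round(p_s1w1 / p_w1, 3))  # 0.43  -> Pr(S=1 | W=1)
print(round(p_r1w1 / p_w1, 3))  # 0.708 -> Pr(R=1 | W=1)
```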

16
Inference (cont.)
  • Inference is concerned with how we can use
    graphical models to efficiently answer
    probabilistic queries.
  • Uses Bayes' theorem
  • P(B|A) = odds(B|A) / (1 + odds(B|A))
  • A prior probability is based on previously
    observed data
  • Conditional probabilities have the form P(B|A)

17
Scenario
  • Apartment with a smoke detector
  • Smoke detector near bathroom
  • Taking a shower often triggers the detector (smoke
    detectors detect steam)

18
Scenario (2)
  • B (burn dinner)
  • O (plan to go out)
  • A (smoke alarm)
  • S (take shower)
  • F (electrical fire)

19
Bayes theorem
  • P(B|A) = odds(B|A) / (1 + odds(B|A))
  • = Likelihood(A|B) × odds(B) /
    (1 + Likelihood(A|B) × odds(B))
  • where Likelihood(A|B) = P(A|B) / P(A|B')
    and odds(B) = P(B) / P(B')
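The odds form above is algebraically equivalent to the usual ratio form of Bayes' theorem. The following quick numeric check is an added sketch with arbitrary illustrative numbers, not part of the slides:

```python
def posterior_via_odds(p_a_given_b, p_a_given_not_b, p_b):
    # P(B|A) through the odds form on this slide:
    # odds(B|A) = Likelihood(A|B) * odds(B);  P(B|A) = odds / (1 + odds)
    likelihood = p_a_given_b / p_a_given_not_b
    odds = likelihood * (p_b / (1 - p_b))
    return odds / (1 + odds)

def posterior_direct(p_a_given_b, p_a_given_not_b, p_b):
    # The familiar form: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')]
    num = p_a_given_b * p_b
    return num / (num + p_a_given_not_b * (1 - p_b))

# Arbitrary illustrative numbers: both forms agree.
print(posterior_via_odds(0.8, 0.3, 0.25))
print(posterior_direct(0.8, 0.3, 0.25))
```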

20
Bayes theorem (2)
  • Conditional probabilities specify the degree of
    belief in some proposition or propositions based
    on the assumption that some other propositions
    are true.
  • Therefore the theory has no meaning without prior
    resolution of the probability of these antecedent
    propositions.

21
Approach
  • Top-down
  • The probability an event will occur, given its
    prior probability
  • Bottom-up
  • Reasoning which starts from an effect and tries
    to determine the causes

22
types of inference
  • (a) Predictive - a can cause b
  • (b) Diagnostic - b is evidence of a
  • (c) Intercausal - a and b can cause c;
    a explains c, so it is evidence
    against b
  • (explaining away, Berkson's paradox, or
    "selection bias")

[Figure: node diagrams for the three inference types, with arcs among a, b, and c]
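Explaining away can be demonstrated numerically on a small v-structure a → c ← b. The priors and CPT below are illustrative numbers I have chosen, not from the slides: learning that a occurred makes b much less probable as an explanation of c.

```python
from itertools import product

# Hypothetical v-structure: independent causes a, b; common effect c.
p_a, p_b = 0.1, 0.1
p_c = {(False, False): 0.001, (True, False): 0.9,
       (False, True): 0.9, (True, True): 0.99}   # P(c=true | a, b)

def joint(a, b, c):
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    pc = p_c[(a, b)] if c else 1 - p_c[(a, b)]
    return pa * pb * pc

p_c1 = sum(joint(a, b, True) for a, b in product([True, False], repeat=2))
p_b1_c1 = sum(joint(a, True, True) for a in [True, False]) / p_c1
p_b1_c1a1 = joint(True, True, True) / sum(joint(True, b, True) for b in [True, False])

print(round(p_b1_c1, 2))    # 0.53 - b is a likely cause of c ...
print(round(p_b1_c1a1, 2))  # 0.11 - ... but observing a "explains away" c
```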
23
Example
  • The a priori probability of a burglary B is
    0.0001.
  • The conditional probability of an alarm A given a
    burglary is Pr(A|B):

24
Example (2)
               Burglary   No Burglary
   Alarm         0.95        0.01
   No Alarm      0.05        0.99

  • What is the value of Pr(B|A)?

25
Example (3)
  • Pr(B|A) = odds(B|A) / (1 + odds(B|A))
  • where odds(B|A) = Likelihood(A|B) × odds(B)
  • = [P(A|B) / P(A|B')] × [P(B) / P(B')]
  • = (0.95 / 0.01) × (0.0001 / 0.9999) ≈ 0.0095
  • Pr(B|A) = 0.0095 / 1.0095 ≈ 0.00941
  • An alarm implies that a burglary is about 94 times
    more likely than a priori
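As an added Python check (not on the slides), the same numbers run through the odds form used here:

```python
# Burglary/alarm example: prior and likelihoods from the table above.
p_b = 0.0001          # a priori probability of a burglary
p_a_given_b = 0.95    # P(alarm | burglary)
p_a_given_nb = 0.01   # P(alarm | no burglary)

# Odds form of Bayes' theorem, as on this slide.
likelihood = p_a_given_b / p_a_given_nb          # = 95
odds_b_given_a = likelihood * p_b / (1 - p_b)    # ~0.0095
p_b_given_a = odds_b_given_a / (1 + odds_b_given_a)

print(round(p_b_given_a, 5))  # 0.00941
print(p_b_given_a / p_b)      # ~94: how much likelier than a priori
```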

26
Bayesian Learning
Sources: "A Tutorial on Learning Bayesian Networks" by David
Heckerman (MSR-TR-95-06); "Learning Bayesian Networks from Data"
by Nir Friedman and Moises Goldszmidt (Berkeley and SRI
International)
27
The easier side to Bayesian Learning
Chorus:
In the Theory we can build a sample,
With Convergence surely guaranteed,
But beware of autocorrelations,
Or it will take forever to succeed!

Verse 4:
When it runs ain't it thrillin'
To the last Iteration.
It frolics and plays throughout n-space
Walkin' in a Bayesian Wonderland

Ending:
Random walkin' in a Bayesian Wonderland.
28
In perspective
29
Where Learning enters the arena
  • Bayesian networks can be summarised as follows:
  • Efficient representations of probability
    distributions
  • Local models
  • Independence
  • Effective representations of probability
    distributions for
  • Computing posterior probabilities
  • Computing the most probable instantiation
  • Decision making
  • But there is more, i.e. statistical induction ->
    Learning

30
The Learning Process
  • Done by:
  • Encoding existing expert knowledge in a Bayesian
    network
  • Using a database to update this knowledge,
    creating one or more new Bayesian networks
  • Results in:
  • Refinement of the original knowledge
  • Sometimes the identification of new distinctions
    and relationships
  • Robustness to errors in the experts' knowledge

31
Similar to Neural Net Learning
  • But with the following advantages:
  • We can easily encode expert knowledge,
    increasing the efficiency and accuracy of learning
  • Nodes and arcs in learned Bayesian networks often
    correspond to recognizable distinctions and
    causal relationships
  • Thus it is easier to understand and interpret
    the knowledge encoded in the representation

32
Bayesian Learning The Problem
                   Known Structure                     Unknown Structure
 Complete Data     Statistical parametric estimation   Discrete optimization over structures
 Incomplete Data   Parametric optimization             Combined
33
Why Learning
  • Feasibility of learning
  • Availability of data and computational power
  • Need for learning
  • Characteristics of current systems and processes:
  • Defy closed-form analysis
  • => need a data-driven approach for characterisation
  • Scale and change fast
  • => need continuous automatic adaptation
  • Examples
  • Communications networks, illegal activities, the
    brain, economic markets

34
Why Learn a Bayesian Network
  • Combines knowledge engineering and statistical
    induction
  • Covers the whole spectrum from knowledge-intensive
    model construction to data-intensive
    model induction
  • More than a learning black-box
  • Explanation of outputs
  • Interpretability and modifiability
  • Algorithms for decision making, value of
    information, diagnosis and repair
  • Causal representation, reasoning and discovery
  • e.g. does smoking cause cancer?

35
A Simple Example
  • Wang presents a simple example in [2] using only
    the first four operations, which I reproduce in
    abbreviated form here. He begins with the
    following 8 statements:
  • 1. robin → feathered-creature <1.00, 0.90>
  • 2. bird → feathered-creature <1.00, 0.90>
  • 3. swan → bird <1.00, 0.90>
  • 4. swan → swimmer <1.00, 0.90>
  • 5. gull → bird <1.00, 0.90>
  • 6. gull → swimmer <1.00, 0.90>
  • 7. crow → bird <1.00, 0.90>
  • 8. crow → swimmer <0.00, 0.90>
  • (Note that giving a statement a frequency of
    0.00 simply means that it is not true.) The
    system is then asked to evaluate the truth value
    of "robin → swimmer". It comes to the following
    conclusions, in this order:
  • 9. robin → bird <1.00, 0.45> (1 and 2, abduction)
  • 10. bird → swimmer <1.00, 0.45> (3 and 4,
    induction)
  • 11. robin → swimmer <1.00, 0.20> (9 and 10,
    deduction)
  • 12. bird → swimmer <1.00, 0.45> (5 and 6,
    induction)
  • 13. bird → swimmer <1.00, 0.62> (10 and 12,
    revision)
  • 14. bird → swimmer <0.00, 0.45> (7 and 8, induction)
  • 15. bird → swimmer <0.67, 0.71> (13 and 14,
    revision)
  • 16. robin → swimmer <0.67, 0.32> (9 and 15,
    deduction)
  • Note that NARS actually comes to a great many
    more conclusions than this, but the ones shown
    are the ones that actually lead toward the
    conclusion. Also, NARS reports conclusions at
    both lines 11 and 16, since the guesswork
    involved necessarily means it needs to be able to
    change its mind, as it were. The final
    conclusion, given at line 16, means that two
    thirds of the relevant evidence indicates that a
    robin can swim, but that this conclusion has
    somewhat less than one third of the possible
    degree of confidence; both of these items, of
    course, indicate the need for more information.
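The truth-value arithmetic behind these steps can be reproduced with the first-order NAL truth functions from Wang's NARS papers, using evidential horizon k = 1. This Python sketch is an addition of mine (function names and argument conventions are my own), and it reproduces the rounded values listed above:

```python
K = 1.0  # evidential horizon

def deduction(f1, c1, f2, c2):
    # f = f1*f2 ; c = c1*c2*(f1 or f2), "or" being the probabilistic sum
    return f1 * f2, c1 * c2 * (f1 + f2 - f1 * f2)

def abduction(f1, c1, f2, c2):
    # premises: P -> M <f1,c1>, S -> M <f2,c2>  =>  S -> P
    w_plus, w = f1 * f2 * c1 * c2, f1 * c1 * c2
    return (w_plus / w if w > 0 else 0.0), w / (w + K)

def induction(f1, c1, f2, c2):
    # premises: M -> P <f1,c1>, M -> S <f2,c2>  =>  S -> P
    w_plus, w = f1 * f2 * c1 * c2, f2 * c1 * c2
    return (w_plus / w if w > 0 else 0.0), w / (w + K)

def revision(f1, c1, f2, c2):
    # pool the evidence behind two judgements of the same statement
    w1, w2 = c1 / (1 - c1), c2 / (1 - c2)
    w = w1 + w2
    return (f1 * w1 + f2 * w2) / w, w / (w + K)

s9  = abduction(1.0, 0.9, 1.0, 0.9)  # robin -> bird    <1.00, 0.45>
s10 = induction(1.0, 0.9, 1.0, 0.9)  # bird -> swimmer  <1.00, 0.45>
s11 = deduction(*s9, *s10)           # robin -> swimmer <1.00, 0.20>
s12 = induction(1.0, 0.9, 1.0, 0.9)  # bird -> swimmer  <1.00, 0.45>
s13 = revision(*s10, *s12)           # bird -> swimmer  <1.00, 0.62>
s14 = induction(0.0, 0.9, 1.0, 0.9)  # bird -> swimmer  <0.00, 0.45>
s15 = revision(*s13, *s14)           # bird -> swimmer  <0.67, 0.71>
s16 = deduction(*s9, *s15)           # robin -> swimmer <0.67, 0.32>
print(round(s16[0], 2), round(s16[1], 2))  # 0.67 0.32
```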

36
A Comparison with another Learning Technique
37
Current Topics
  • Time
  • Beyond discrete time and beyond fixed rate
  • Causality
  • Removing the assumptions
  • Hidden Variables
  • Where to place them and how many
  • Model Evaluation and active learning
  • What parts of it are suspect, and what data, and
    how much, is needed

38
Decision Theory (1)
  • What happens when it is time to convert beliefs
    into actions?
  • Decision Theory = Probability Theory + Utility
    Theory

39
Decision Theory (2)
  • Decompose a multi-attribute utility function into
    a sum of local utilities
  • Each term is a node, which has as parents
  • The random variables on which it depends
  • The action (control) nodes
  • The resulting graph is an influence diagram
  • Finally, compute the optimal sequence of actions
    to perform to maximize expected utility

40
Applications (1)
  • QMR-DT: a decision-theoretic reformulation of the
    Quick Medical Reference model

41
Some Applications
  • Biostatistics: Medical Research Council (Bayesian
    Inference Using Gibbs Sampling - BUGS)
  • Data analysis: NASA (AutoClass)
  • Collaborative filtering: Microsoft (Microsoft
    Belief Networks - MSBN)
  • Fraud detection: AT&T
  • Speech recognition: UC Berkeley

42
Applications (2)
  • Real-time decisions: NASA's Vista system
  • Genetics: linkage analysis
  • Speech recognition
  • Data compression: density estimation
  • Coding: turbocodes

43
Applications MS Office
  • MS Office Assistant: The Lumière Project
  • Source: "The Lumière Project: Bayesian User
    Modeling for Inferring the Goals and Needs of
    Software Users", by E. Horvitz, J. Breese, D.
    Heckerman, D. Hovel, K. Rommelse (Microsoft
    Research)

44
MS Office (2)
  • User behaviour is monitored to determine
    Assistant actions. Examples:
  • Search
  • Focus of attention
  • Introspection
  • Undesired effects
  • Inefficient command sequences
  • Domain-specific syntactic and semantic content

45
MS Office (3)
  • Portion of a Bayesian net for inferring the
    likelihood that a user needs assistance,
    considering profile information and recent activity