Title: Graphical models. Tom Griffiths, UC Berkeley
Challenges of probabilistic models
- Specifying well-defined probabilistic models with many variables is hard (for modelers)
- Representing probability distributions over those variables is hard (for computers/learners)
- Computing quantities using those distributions is hard (for computers/learners)
Representing structured distributions
- Four random variables
- X1: coin toss produces heads
- X2: pencil levitates
- X3: friend has psychic powers
- X4: friend has two-headed coin
- Each variable has domain {0, 1}
Joint distribution
- All 16 joint outcomes: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
- Requires 15 numbers to specify the probability of all values (x1, x2, x3, x4)
- N binary variables require 2^N - 1 numbers
- Similar cost when computing conditional probabilities (see the sketch below)
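To make the cost concrete, here is a minimal Python sketch (not from the lecture; the joint table is filled with arbitrary normalized numbers) showing that a full joint over four binary variables has 15 free parameters, and that a conditional probability requires summing over the table:

```python
# Sketch: a full joint table over N binary variables has 2**N entries,
# of which 2**N - 1 are free parameters (they must sum to 1).
# The probability values below are arbitrary placeholders.
import itertools
import random

N = 4
random.seed(0)

outcomes = list(itertools.product([0, 1], repeat=N))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {x: w / total for x, w in zip(outcomes, weights)}

print(len(joint) - 1)   # 15 free parameters for N = 4

# Computing a conditional such as P(x3 = 1 | x1 = 1) means summing over the table.
num = sum(p for x, p in joint.items() if x[0] == 1 and x[2] == 1)
den = sum(p for x, p in joint.items() if x[0] == 1)
print(num / den)
```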
How can we use fewer numbers?
- Four random variables
- X1: coin toss produces heads
- X2: coin toss produces heads
- X3: coin toss produces heads
- X4: coin toss produces heads
- Each variable has domain {0, 1}
Statistical independence
- Two random variables X1 and X2 are independent if P(x1 | x2) = P(x1)
- e.g., coin flips: P(x1 = H | x2 = H) = P(x1 = H) = 0.5
- Independence makes it easier to represent and work with probability distributions
- We can exploit the product rule: if x1, x2, x3, and x4 are all independent, then P(x1, x2, x3, x4) = P(x1) P(x2) P(x3) P(x4) (see the sketch below)
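A small sketch of this point, with assumed coin probabilities: under independence the joint is just the product of the four marginals, so four numbers suffice.

```python
# Sketch: with four independent coin flips, four numbers P(xi = 1) suffice,
# and the joint is the product of the marginals (product rule under independence).
# The probabilities are illustrative, not taken from the slides.
p_heads = [0.5, 0.5, 0.5, 0.5]   # P(xi = 1) for each coin

def joint(x):
    """P(x1, x2, x3, x4) for a tuple of 0/1 values, assuming independence."""
    prob = 1.0
    for xi, p in zip(x, p_heads):
        prob *= p if xi == 1 else 1 - p
    return prob

print(joint((1, 1, 0, 1)))   # 0.0625: each flip contributes a factor of 0.5
```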
Expressing independence
- Statistical independence is the key to efficient probabilistic representation and computation
- This has led to the development of languages for indicating dependencies among variables
- Some of the most popular languages are based on graphical models
Graphical models
- Introduction to graphical models
- definitions
- efficient representation and inference
- explaining away
- Graphical models and cognitive science
- uses of graphical models
Graphical models
- Express the probabilistic dependency structure among a set of variables (Pearl, 1988)
- Consist of
- a set of nodes, corresponding to variables
- a set of edges, indicating dependency
- a set of functions defined on the graph that specify a probability distribution
Undirected graphical models
[Figure: undirected graph over nodes X1, X2, X3, X4, X5]
- Consist of
- a set of nodes
- a set of edges
- a potential for each clique, multiplied together to yield the distribution over variables
- Examples
- statistical physics: Ising models, spin glasses
- neural networks (e.g., Boltzmann machines)
Ising models
[Figure: undirected graph over nodes X1, X2, X3, X4]
- Consist of
- a set of nodes
- a set of edges
- a potential for each clique, multiplied together to yield the distribution over variables
- Distribution is specified as a product of these potentials (see the sketch below)
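A minimal sketch of such a distribution; the edge set and the coupling strength beta are assumptions made for illustration, not values from the lecture.

```python
# Sketch of an Ising model over four spins: each edge contributes a potential
# exp(beta * xi * xj), and the normalized product of potentials gives the
# distribution over spin configurations. Edges and beta are assumed.
import itertools
import math

edges = [(0, 1), (1, 3), (3, 2), (2, 0)]   # assumed 4-node cycle
beta = 0.5                                  # assumed coupling strength

def unnormalized(x):
    """Product of edge potentials for spins x in {-1, +1}^4."""
    return math.exp(beta * sum(x[i] * x[j] for i, j in edges))

configs = list(itertools.product([-1, 1], repeat=4))
Z = sum(unnormalized(x) for x in configs)          # partition function
prob = {x: unnormalized(x) / Z for x in configs}

print(prob[(1, 1, 1, 1)], prob[(1, -1, 1, -1)])    # aligned spins are more likely
```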
Boltzmann machines
[Figure: undirected graph over nodes X1, X2, X3, X4, X5]
- Consist of
- a set of nodes
- a set of edges
- a potential for each clique, multiplied together to yield the distribution over variables
- Distribution is specified as a product of these potentials (see the sketch below)
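A similar sketch for a small Boltzmann machine over binary units; the weight and bias values are illustrative assumptions.

```python
# Sketch of a Boltzmann machine over five binary units:
# P(x) is proportional to exp(sum_ij w_ij x_i x_j + sum_i b_i x_i).
# The edges, weights, and biases below are assumed for illustration.
import itertools
import math

n = 5
weights = {(0, 2): 1.0, (1, 2): -0.5, (2, 3): 0.8, (2, 4): 0.3}  # assumed edges
biases = [0.0, 0.2, -0.1, 0.0, 0.1]                              # assumed biases

def unnormalized(x):
    pair = sum(w * x[i] * x[j] for (i, j), w in weights.items())
    single = sum(b * xi for b, xi in zip(biases, x))
    return math.exp(pair + single)

configs = list(itertools.product([0, 1], repeat=n))
Z = sum(unnormalized(x) for x in configs)
p = {x: unnormalized(x) / Z for x in configs}
print(max(p, key=p.get))   # most probable configuration under these parameters
```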
Boltzmann machines
[Figure: true images alongside reconstructions by a Boltzmann machine and by PCA (Hinton & Salakhutdinov, 2006)]
Directed graphical models
[Figure: directed acyclic graph over nodes X1, X2, X3, X4, X5]
- Consist of
- a set of nodes
- a set of edges
- a conditional probability distribution for each node, conditioned on its parents, multiplied together to yield the distribution over variables
- Constrained to directed acyclic graphs (DAGs)
- Called Bayesian networks or Bayes nets
Bayesian networks and Bayes
- Two different problems
- Bayesian statistics is a method of inference
- Bayesian networks are a form of representation
- There is no necessary connection
- many users of Bayesian networks rely upon frequentist statistical methods
- many Bayesian inferences cannot be easily represented using Bayesian networks
Graphical models
- Introduction to graphical models
- definitions
- efficient representation and inference
- explaining away
- Graphical models and cognitive science
- uses of graphical models
Efficient representation and inference
- Four random variables
- X1: coin toss produces heads
- X2: pencil levitates
- X3: friend has psychic powers
- X4: friend has two-headed coin
The Markov assumption
- Every node is conditionally independent of its non-descendants, given its parents:
P(xi | x1, ..., xi-1) = P(xi | Pa(Xi))
where Pa(Xi) is the set of parents of Xi
- The joint distribution therefore factorizes as
P(x1, ..., xn) = P(x1 | Pa(X1)) P(x2 | Pa(X2)) ... P(xn | Pa(Xn))
(via the product rule)
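For reference, the factorization can be written out in two steps, assuming the variables are numbered so that parents precede children (a topological ordering):

```latex
\begin{align}
P(x_1, \ldots, x_n)
  &= \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})
     && \text{(product rule)} \\
  &= \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{Pa}(X_i)\bigr)
     && \text{(Markov assumption)}
\end{align}
```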
Efficient representation and inference
- Four random variables
- X1: coin toss produces heads
- X2: pencil levitates
- X3: friend has psychic powers
- X4: friend has two-headed coin
- Numbers required: 4 for P(x1 | x3, x4), 2 for P(x2 | x3), 1 each for P(x3) and P(x4); total 8 (vs 15), as sketched below
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
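A sketch of this network in Python; the conditional probability values are placeholders chosen for illustration, but the parameter count (4 + 2 + 1 + 1 = 8) matches the slide.

```python
# Sketch of the four-variable network with the factorization
# P(x1,x2,x3,x4) = P(x1|x3,x4) P(x2|x3) P(x3) P(x4).
# The probability values are placeholders, not taken from the lecture.
p_x3 = 0.01                                 # P(psychic powers)        : 1 number
p_x4 = 0.05                                 # P(two-headed coin)       : 1 number
p_x2_given_x3 = {1: 0.9, 0: 0.0}            # P(pencil levitates | x3) : 2 numbers
p_x1_given_x3_x4 = {                        # P(heads | x3, x4)        : 4 numbers
    (0, 0): 0.5, (0, 1): 1.0, (1, 0): 0.99, (1, 1): 1.0,
}

def bernoulli(p, value):
    return p if value == 1 else 1 - p

def joint(x1, x2, x3, x4):
    return (bernoulli(p_x1_given_x3_x4[(x3, x4)], x1)
            * bernoulli(p_x2_given_x3[x3], x2)
            * bernoulli(p_x3, x3)
            * bernoulli(p_x4, x4))

print(joint(1, 0, 0, 0))   # heads, no levitation, no powers, fair coin
```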
Reading a Bayesian network
- The structure of a Bayes net can be read as the generative process behind a distribution
- Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents
Reading a Bayesian network
- Four random variables
- X1: coin toss produces heads
- X2: pencil levitates
- X3: friend has psychic powers
- X4: friend has two-headed coin
[Figure: Bayes net with edges X3 → X1, X4 → X1, X3 → X2]
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
Reading a Bayesian network
- The structure of a Bayes net can be read as the generative process behind a distribution
- Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents
- Simple rules for determining whether two variables are dependent or independent
Identifying independence
[Figure: network structures over X1, X2, and X3 showing when X1 and X3 are dependent and when they are independent; shaded variables are observed]
Identifying independence
- Four random variables
- X1: coin toss produces heads
- X2: pencil levitates
- X3: friend has psychic powers
- X4: friend has two-headed coin
[Figure: the network shown with and without X1 observed; X4 and X2 are independent a priori, but become dependent once X1 is observed]
Reading a Bayesian network
- The structure of a Bayes net can be read as the generative process behind a distribution
- Gives the joint probability distribution over variables obtained by sampling each variable conditioned on its parents
- Simple rules for determining whether two variables are dependent or independent
- Independence makes inference more efficient
Computing with Bayes nets
[Figure: Bayes net with edges X3 → X1, X4 → X1, X3 → X2]
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
Computing with Bayes nets
- Computing a marginal from the full joint requires a sum over 8 values (all settings of the remaining variables)
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
Computing with Bayes nets
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
Computing with Bayes nets
- Exploiting the factorization, the same computation reduces to a sum over 4 values, as sketched below
P(x1, x2, x3, x4) = P(x1 | x3, x4) P(x2 | x3) P(x3) P(x4)
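A sketch of the saving, reusing the placeholder probabilities from the earlier sketch: the naive marginal P(x1 = 1) sums the joint over 8 settings, while the structured computation sums over only 4.

```python
# Sketch contrasting naive marginalization (sum over all 8 settings of the
# other three variables) with a structured computation that exploits the
# factorization. CPT values are the same illustrative placeholders as above.
import itertools

p_x3, p_x4 = 0.01, 0.05
p_x2_given_x3 = {1: 0.9, 0: 0.0}
p_x1_given_x3_x4 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 0.99, (1, 1): 1.0}
b = lambda p, v: p if v == 1 else 1 - p

def joint(x1, x2, x3, x4):
    return (b(p_x1_given_x3_x4[(x3, x4)], x1) * b(p_x2_given_x3[x3], x2)
            * b(p_x3, x3) * b(p_x4, x4))

# Naive: P(x1 = 1) as a sum over the 8 settings of (x2, x3, x4).
naive = sum(joint(1, x2, x3, x4)
            for x2, x3, x4 in itertools.product([0, 1], repeat=3))

# Structured: x2 sums out of its own factor, leaving a sum over the
# 4 settings of (x3, x4).
structured = sum(b(p_x1_given_x3_x4[(x3, x4)], 1) * b(p_x3, x3) * b(p_x4, x4)
                 for x3, x4 in itertools.product([0, 1], repeat=2))

print(naive, structured)   # the two answers agree
```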
Computing with Bayes nets
- Inference algorithms for Bayesian networks exploit dependency structure
- Message-passing algorithms
- belief propagation passes simple messages between nodes, exact for tree-structured networks
- More general inference algorithms
- exact: junction-tree
- approximate: Monte Carlo schemes
Logic and probability
- Bayesian networks are equivalent to a probabilistic propositional logic
- Associate variables with atomic propositions
- A Bayes net specifies a distribution over possible worlds; the probability of a proposition is a sum over worlds
- More efficient than simply enumerating worlds
- Developing similarly efficient schemes for working with other probabilistic logics is a major topic of current research
Graphical models
- Introduction to graphical models
- definitions
- efficient representation and inference
- explaining away
- Graphical models and cognitive science
- uses of graphical models
Identifying independence
[Figure: network structures over X1, X2, and X3 showing when X1 and X3 are dependent and when they are independent; shaded variables are observed]
Explaining away
- Assume the grass will be wet if and only if it rained last night, or if the sprinklers were left on
Explaining away
Compute the probability it rained last night, given that the grass is wet:
P(rain | wet) = P(wet | rain) P(rain) / P(wet)
Because the grass is wet if and only if it rained or the sprinklers were on, P(wet | rain) = 1 and P(wet) = P(rain) + P(sprinkler) - P(rain) P(sprinkler), so observing wet grass raises the probability of rain above its prior.
Explaining away
Compute the probability it rained last night, given that the grass is wet and the sprinklers were left on:
P(rain | wet, sprinkler) = P(rain)
Once the sprinklers are known to have been on, the wet grass is fully explained and provides no further evidence for rain.
Explaining away
The probability of rain is discounted back to its prior probability (see the sketch below).
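A sketch of the whole example by enumeration; the priors for rain and sprinkler are assumed values, and the "wet if and only if rain or sprinkler" rule comes from the earlier slide.

```python
# Sketch of explaining away with the deterministic "wet iff rain or sprinkler"
# model; the priors for rain and sprinkler are assumed, and the two causes
# are assumed independent a priori.
import itertools

p_rain, p_sprinkler = 0.2, 0.3   # assumed priors

def joint(rain, sprinkler, wet):
    p = ((p_rain if rain else 1 - p_rain)
         * (p_sprinkler if sprinkler else 1 - p_sprinkler))
    consistent = (wet == (rain or sprinkler))   # grass wet iff rain or sprinkler
    return p if consistent else 0.0

def conditional(query, evidence):
    """P(query | evidence); each argument is a dict over 'rain'/'sprinkler'/'wet'."""
    worlds = [dict(zip(("rain", "sprinkler", "wet"), w))
              for w in itertools.product([0, 1], repeat=3)]
    match = lambda w, d: all(w[k] == v for k, v in d.items())
    num = sum(joint(w["rain"], w["sprinkler"], w["wet"])
              for w in worlds if match(w, evidence) and match(w, query))
    den = sum(joint(w["rain"], w["sprinkler"], w["wet"])
              for w in worlds if match(w, evidence))
    return num / den

print(conditional({"rain": 1}, {}))                          # prior: 0.2
print(conditional({"rain": 1}, {"wet": 1}))                  # raised above the prior
print(conditional({"rain": 1}, {"wet": 1, "sprinkler": 1}))  # back to the prior: 0.2
```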
Contrast w/ production system
[Figure: Rain and Grass Wet]
- Formulate IF-THEN rules
- IF Rain THEN Wet
- IF Wet THEN Rain
- Rules do not distinguish directions of inference
- Requires a combinatorial explosion of rules
Contrast w/ spreading activation
[Figure: Rain, Sprinkler, Grass Wet]
- Excitatory links: Rain → Wet, Sprinkler → Wet
- Observing rain, Wet becomes more active
- Observing grass wet, Rain and Sprinkler become more active
- Observing grass wet and sprinkler, Rain cannot become less active. No explaining away!
Contrast w/ spreading activation
[Figure: Rain, Sprinkler, Grass Wet]
- Excitatory links: Rain → Wet, Sprinkler → Wet
- Inhibitory link between Rain and Sprinkler
- Observing grass wet, Rain and Sprinkler become more active
- Observing grass wet and sprinkler, Rain becomes less active: explaining away
Contrast w/ spreading activation
[Figure: Rain, Burst pipe, Sprinkler, Grass Wet]
- Each new variable requires more inhibitory connections
- Not modular
- whether a connection exists depends on what others exist
- big holism problem
- combinatorial explosion
Contrast w/ spreading activation
[Figure from McClelland & Rumelhart, 1981]
Graphical models
- Capture dependency structure in distributions
- Provide an efficient means of representing and reasoning with probabilities
- Support kinds of inference that are problematic for other cognitive models: explaining away
- hard to capture in a production system
- more natural than with spreading activation
Graphical models
- Introduction to graphical models
- definitions
- efficient representation and inference
- explaining away
- Graphical models and cognitive science
- uses of graphical models
Uses of graphical models
- Understanding existing cognitive models
- e.g., neural network models
Sigmoid belief networks
[Figure: network over nodes x1, x2, z1, z2, and y]
- We can view multilayer perceptrons as Bayes nets with specific probabilities (e.g., Neal, 1992)
- Makes it possible to use Bayesian tools with existing neural network models (e.g., MacKay, 1992)
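A minimal sketch of a sigmoid belief network in the spirit of Neal (1992); the layer structure (a top-level cause, two hidden units, two observed units) and the weights are illustrative assumptions.

```python
# Sketch of a tiny sigmoid belief network: each binary unit is 1 with
# probability sigmoid(weighted sum of its parents). Structure and weights
# below are assumed for illustration.
import math
import random

random.seed(0)
sigmoid = lambda a: 1 / (1 + math.exp(-a))

def sample_bernoulli(p):
    return 1 if random.random() < p else 0

def sample_network():
    y = sample_bernoulli(sigmoid(0.0))                     # top-level cause
    z1 = sample_bernoulli(sigmoid(1.5 * y - 0.5))          # hidden units
    z2 = sample_bernoulli(sigmoid(-1.0 * y + 0.5))
    x1 = sample_bernoulli(sigmoid(2.0 * z1 - 1.0 * z2))    # observed units
    x2 = sample_bernoulli(sigmoid(-0.5 * z1 + 1.5 * z2))
    return y, z1, z2, x1, x2

print([sample_network() for _ in range(3)])
```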
Uses of graphical models
- Understanding existing cognitive models
- e.g., neural network models
- Representation and reasoning
- a way to address holism in induction (cf. Fodor)
The holism of confirmation
- If everything we know is one big probability distribution, then discovering one small fact requires changing all of our beliefs
- Used by Fodor (2001) as an argument against the possibility of inductive logic
- With Bayes nets, everything can be connected to everything, but inference can still be efficient
Uses of graphical models
- Understanding existing cognitive models
- e.g., neural network models
- Representation and reasoning
- a way to address holism in induction (cf. Fodor)
- Defining generative models
- mixture models, language models, ...
Graphical models and coinflipping
[Figure: three generative models for coin-flip data d1, d2, d3, d4: a fair coin with P(H) = 0.5, a coin of unknown weight q with P(H) = q, and a hidden Markov model in which a latent state si (fair coin vs. trick coin) determines the probability of heads for each flip]
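A sketch of the three generative processes; the trick-coin bias and the transition probability in the hidden Markov model are assumed values.

```python
# Sketch of three generative models for four coin flips:
# (a) fair coin, (b) coin of unknown weight q, (c) hidden Markov model in
# which a latent state s_i (fair vs. trick coin) sets P(heads) for each flip.
import random

random.seed(0)
flip = lambda p: "H" if random.random() < p else "T"

# (a) fair coin: P(H) = 0.5 for every flip
fair = [flip(0.5) for _ in range(4)]

# (b) weighted coin: P(H) = q, with q itself drawn from a prior
q = random.random()
weighted = [flip(q) for _ in range(4)]

# (c) HMM: the coin in use can switch between fair and trick across flips
p_heads = {"fair": 0.5, "trick": 0.9}   # assumed trick-coin bias
p_stay = 0.8                            # assumed probability of keeping the same coin
state, hmm = "fair", []
for _ in range(4):
    hmm.append(flip(p_heads[state]))
    if random.random() > p_stay:
        state = "trick" if state == "fair" else "fair"

print(fair, weighted, hmm)
```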
A hierarchical Bayesian model
[Figure: physical knowledge about coins determines hyperparameters FH, FT; each of Coin 1, Coin 2, ..., Coin 200 has a weight qi drawn as q ~ Beta(FH, FT); each coin generates flip data d1, d2, d3, d4]
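A sketch of the hierarchical generative process; the Beta hyperparameters standing in for physical knowledge are assumed values.

```python
# Sketch of the hierarchical model: hyperparameters FH, FT (standing in for
# physical knowledge about coins) generate a weight q_i for each of 200 coins
# via a Beta distribution, and each coin generates its own flips.
import random

random.seed(0)
FH, FT = 20.0, 20.0    # assumed: coins are strongly expected to be near-fair

def flips_for_coin(q, n=4):
    return ["H" if random.random() < q else "T" for _ in range(n)]

coins = []
for i in range(200):
    q_i = random.betavariate(FH, FT)        # q_i ~ Beta(FH, FT)
    coins.append((q_i, flips_for_coin(q_i)))

q_1, data_1 = coins[0]
print(round(q_1, 3), data_1)   # e.g. a weight near 0.5 and its four flips
```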
Uses of graphical models
- Understanding existing cognitive models
- e.g., neural network models
- Representation and reasoning
- a way to address holism in induction (cf. Fodor)
- Defining generative models
- mixture models, language models, ...
- Modeling human causal reasoning
- more on Friday!