An Introduction to Bayesian Inference
1
An Introduction to Bayesian Inference
  • Robert J. Mislevy
  • University of Maryland
  • March 19, 2002

2
A quote from Glenn Shafer
  • "Probability is not really about numbers; it is about the structure of reasoning."
  • (Glenn Shafer, quoted in Pearl, 1988, p. 77)

3
Views of Probability
  • Two conceptions of probability:
  • Aleatory (chance)
  • Long-run frequencies, mechanisms
  • Probability is a property of the world
  • Degree of belief (subjective)
  • Probability is a property of Your state of
    knowledge (de Finetti)
  • Same formal definitions and machinery
  • Aleatory paradigms as analogical basis for degree
    of belief (Glenn Shafer)

4
Frames of discernment
  • A frame of discernment is all the possible
    combinations of values of the variables you are
    working with. (Shafer, 1976)
  • Discern: detect, recognize, distinguish
  • Property of you as much as property of the world
  • Depends on what you know and what your purpose is
    (e.g., expert vs. novice)
  • A frame of discernment can evolve over time

5
Frames of Discernment in Assessment
  • In the Student Model: determining what aspects of
    skill and knowledge to use as explicit SM
    variables (psychological perspective, grain size,
    reporting requirements).
  • In the Evidence Model: evidence identification
    (task scoring); evaluation rules map from a
    unique work product to common observed variables.
  • In the Task Model: which aspects of situations are
    important in task design to keep track of and
    manipulate, to achieve the assessment's purpose?
  • Task features versus values of Task Model
    variables

6
Random Variables
  • We will concentrate on variables with a finite
    number of possible values.
  • Denote a random variable by upper case, say X.
  • Denote particular values and generic values by
    lower case, x.
  • Y is the outcome of a coin flip: y ∈ {h, t}.
  • Xi is the answer to Item i: xi ∈ {0, 1}.
  • Zjk is the rating of Judge k on Essay j:
    zjk ∈ {0, 1, ..., 5}.

7
Finite Probability Distributions
  • Finite set of possible values {x1, ..., xn}
  • P(X = xj), or more simply p(xj), is the probability
    that X takes the value xj.
  • 0 ≤ p(xj) ≤ 1.
  • P(X = xj or X = xm) = p(xj) + p(xm) for j ≠ m.
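These axioms are easy to check directly in code. A minimal sketch in Python; the value names and probabilities are illustrative only, not from the slides:

```python
# Minimal sketch: a finite distribution as a dict from values to probabilities.
# Values x1..x3 and their probabilities are illustrative only.
p = {"x1": 0.2, "x2": 0.5, "x3": 0.3}

# Axiom: 0 <= p(xj) <= 1 for every value.
assert all(0.0 <= prob <= 1.0 for prob in p.values())

# Axiom: the probabilities over all possible values sum to 1.
assert abs(sum(p.values()) - 1.0) < 1e-9

# P(X = xj or X = xm) = p(xj) + p(xm) for distinct values xj, xm.
p_x1_or_x2 = p["x1"] + p["x2"]
```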

8
MSBNx representation
MSBNx (Microsoft Bayesian Network editor).
http//research.microsoft.com/adapt/MSBNx/default.
asp
9
Hypergraph representation
[Diagram: node X (the variable) linked to p(x) (the probability distribution)]
10
Conditional probability distributions
  • Two random variables, X and Y
  • P(X = xj | Y = yk), or p(xj | yk), is the probability
    that X takes the value xj given that Y takes the
    value yk.
  • This is how we express relationships among
    real-world phenomena:
  • P(heart attack | age, family history, blood
    pressure)
  • P(tomorrow's high temperature | geographical
    location, date, today's high)
  • IRT: P(Xj = 1) vs. P(Xj = 1 | θ)
  • Coin flip: p(heads) vs. p(heads | RJM_strategy)

11
Conditional probability distributions
  • Two random variables, X and Y
  • P(X = xj | Y = yk), or p(xj | yk), is the probability
    that X takes the value xj given that Y takes the
    value yk.
  • 0 ≤ p(xj | yk) ≤ 1 for a given yk.
  • Σj p(xj | yk) = 1 for a given yk.
  • P(X = xj or X = xm | Y = yk) = p(xj | yk) + p(xm | yk).
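A conditional probability table can be sketched as a nested mapping, one proper distribution over x for each conditioning value yk. The numbers below are the hygiene-example values used later in the talk (P(adequate | expert) = .8, P(adequate | novice) = .4):

```python
# Conditional probability table p(x | y) as a nested dict: outer key is the
# conditioning value y_k, inner dict is a distribution over x.
# Numbers are the hygiene-example values from the MSBNx slides.
p_history_given_prof = {
    "expert": {"adequate": 0.8, "inadequate": 0.2},
    "novice": {"adequate": 0.4, "inadequate": 0.6},
}

# For each fixed y_k, the conditional probabilities form a proper
# distribution over x: each in [0, 1] and summing to 1.
for dist in p_history_given_prof.values():
    assert all(0.0 <= v <= 1.0 for v in dist.values())
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```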

12
Hygiene Example: Variables
One observable variable: Quality of Patient History
Taken {Adequate, Inadequate}. One unobservable
ability: Level of Proficiency {Expert, Novice}.
Want to model the probability of taking an adequate
history as depending on level of proficiency.
13
Hygiene Example: Conditional Probabilities
One observable: Adequate Patient History Taken,
Yes or No. One unobservable ability: Expert or Novice.
p(inadequate | expert)
What kind of reasoning is this direction?
p(adequate | expert)
14
Hygiene Example: Conditional Probabilities
One observable: Adequate Patient History Taken,
Yes or No. One unobservable ability: Expert or Novice.
p(inadequate | novice)
p(adequate | novice)
15
Tie-in with Toulmin diagram
[Toulmin diagram. Claim: Harry is an expert hygienist.
Warrant (since): Experts tend to take adequate patient histories.
Alternative explanation (unless): Novices sometimes take adequate patient histories.
Datum (so): Harry took an adequate patient history.]
16
Tie-in with Toulmin diagram
[Toulmin diagram, warrant and alternative explanation elided.
Claim: Harry is an expert hygienist.
Datum: Harry took an adequate patient history.]
17
Tie-in with Toulmin diagram
[Toulmin diagram with the claim quantified.
Claim: Pr(Harry is an expert hygienist | adequate history) = .82.
Datum: Harry took an adequate patient history.]
18
Bayes nets: MSBNx Setup
History (i.e., quality of patient history
procedure) is modeled as being conditionally
dependent on Proficiency. That is, Proficiency is
a "parent" of History; equivalently, History is a
"child" of Proficiency. Note that Proficiency has
no parents.
19
Bayes nets: MSBNx Setup
This is the probability table expressing initial
belief about Proficiency.
20
Bayes nets: MSBNx Setup
This is the probability table expressing belief
about History. Note that there are two different
conditional distributions for the values of
History: one is relevant if Proficiency = Expert
and a different one if Proficiency = Novice.
21
Bayes nets: MSBNx Setup
Belief if you know Proficiency = Expert. You know
that with certainty (probability 1), and what you
expect about History is P(History | Proficiency =
Expert): .8 for adequate and .2 for inadequate.
22
Bayes nets: MSBNx Setup
Click here to bring up the evaluation window.
23
Bayes nets: MSBNx Setup
Belief if you know Proficiency = Expert. You know
that with certainty (probability 1), and what you
expect about History is P(History | Proficiency =
Expert): .8 for adequate and .2 for inadequate.
24
Bayes nets: MSBNx Setup
Belief if you know Proficiency = Novice. You know
that with certainty (probability 1), and what you
expect about History is P(History | Proficiency =
Novice): .4 for adequate and .6 for inadequate.
25
Bayes nets: MSBNx Setup
Depiction if you don't know the value of either
Proficiency or History. Starting with 70-30
belief about Expert/Novice. The conditional
distributions for History are averaged together
to get the marginal (overall) distribution for
History, which is appropriate if you don't know
the Proficiency level.
26
Bayes Theorem
  • The setup, with two random variables, X and Y:
  • You know conditional probabilities, p(xj | yk),
    which tell you what to believe about X if you
    knew the value of Y.
  • You learn X = x; what should you believe about Y?
  • You combine two things:
  • Relative conditional probabilities (the
    likelihood)
  • Previous probabilities about Y values (the prior)

p(yk | x) = p(x | yk) p(yk) / Σm p(x | ym) p(ym)
(posterior ∝ likelihood × prior)
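A sketch of this computation in Python, using the hygiene-example numbers from the surrounding slides (70-30 prior over Expert/Novice, likelihoods .8 and .4 for an adequate history):

```python
# Hygiene-example numbers from the surrounding slides.
prior = {"expert": 0.7, "novice": 0.3}        # initial belief about Proficiency
likelihood = {"expert": 0.8, "novice": 0.4}   # p(adequate | proficiency)

# Posterior is proportional to likelihood times prior;
# rescale the products so they sum to one.
unnorm = {s: likelihood[s] * prior[s] for s in prior}
total = sum(unnorm.values())
posterior = {s: u / total for s, u in unnorm.items()}
# posterior["expert"] is about .82, matching the Toulmin-diagram slide.
```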
27
Hygiene Example: Likelihoods
One observable: Adequate Patient History Taken,
Yes or No. One unobservable proficiency: Expert or Novice.
What kind of reasoning is this direction?
p(adequate | expert) = .8
p(adequate | novice) = .4
Note 2:1 ratio
28
Hygiene Example: Likelihoods
One observable: Adequate Patient History Taken,
Yes or No. One unobservable proficiency: Expert or Novice.
p(inadequate | expert) = .2
p(inadequate | novice) = .6
Note 1:3 ratio
29
Bayes nets: Posterior Probabilities
Posterior distribution for Proficiency if
History = Adequate is observed. The 2:1 likelihood
ratio favoring Expert multiplies the 70:30 prior we
started with. The products are rescaled so they
sum to one.
30
Bayes nets: Posterior Probabilities
Posterior distribution for Proficiency if
History = Inadequate is observed. The 1:3
Expert/Novice ratio from the likelihood has
multiplied the 70:30 prior.
31
Conditional independence
  • "Conditional independence is not a grace of
    nature for which we must wait passively, but
    rather a psychological necessity which we satisfy
    actively by organizing our knowledge in a
    specific way.
  • An important tool in such organization is the
    identification of intermediate variables that
    induce conditional independence among
    observables; if such variables are not in our
    vocabulary, we create them.
  • In medical diagnosis, for instance, when some
    symptoms directly influence one another, the
    medical profession invents a name for that
    interaction (e.g., 'syndrome,' 'complication,'
    'pathological state') and treats it as a new
    auxiliary variable that induces conditional
    independence; dependency between any two
    interacting systems is fully attributed to the
    dependencies of each on the auxiliary variable."
    (Pearl, 1988, p. 44)

32
Conditional independence
  • Independence:
  • The probability of the joint occurrence of
    values of two variables is always equal to the
    product of the probabilities individually:
  • P(X = x, Y = y) = P(X = x) P(Y = y).
  • Conditional independence:
  • The conditional probability of the joint
    occurrence given the value of another variable is
    always equal to the product of the conditional
    probabilities:
  • P(X = x, Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z).
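The distinction can be checked numerically. The sketch below builds a joint distribution from the factorization p(x, y, z) = p(z) p(x | z) p(y | z), with made-up numbers, and verifies that X and Y are conditionally independent given Z even though they are not marginally independent:

```python
# Joint distribution built from p(x, y, z) = p(z) p(x|z) p(y|z).
# All numbers are made up for illustration.
p_z = {"z0": 0.6, "z1": 0.4}
p_x_z = {"z0": {"x0": 0.9, "x1": 0.1}, "z1": {"x0": 0.3, "x1": 0.7}}
p_y_z = {"z0": {"y0": 0.5, "y1": 0.5}, "z1": {"y0": 0.2, "y1": 0.8}}

joint = {(x, y, z): p_z[z] * p_x_z[z][x] * p_y_z[z][y]
         for z in p_z for x in ("x0", "x1") for y in ("y0", "y1")}

# Conditional independence: P(X=x, Y=y | Z=z) = P(X=x | Z=z) P(Y=y | Z=z).
for (x, y, z), p_xyz in joint.items():
    assert abs(p_xyz / p_z[z] - p_x_z[z][x] * p_y_z[z][y]) < 1e-12

# Marginally, though, X and Y are dependent: P(X=x, Y=y) != P(X=x) P(Y=y).
p_x0 = sum(p for (x, _, _), p in joint.items() if x == "x0")
p_y0 = sum(p for (_, y, _), p in joint.items() if y == "y0")
p_x0y0 = sum(p for (x, y, _), p in joint.items()
             if x == "x0" and y == "y0")
assert abs(p_x0y0 - p_x0 * p_y0) > 1e-6
```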

33
Conditional independence
  • MSBNx diagram with Independence
  • MSBNx diagram with Conditional independence

Entering a value for one variable, X or Y,
doesn't change the probabilities for the other.
Entering a value for one variable, X or Y,
doesn't change the probabilities for the other IF
the value for Z has already been fixed at some
value.
34
Example: Classical Test Theory
  • One true score, 5 identically distributed,
    conditionally independent observable test scores.

TrueScore is probably about such-and-such (a
probability distribution). The distribution of
each observable test score is true score + noise.
[Diagram: TrueScore with arrows to Test1 score,
Test2 score, Test3 score, Test4 score, Test5 score]
35
Example: Classical Test Theory
  • One true score, 5 identically distributed,
    conditionally independent observable test scores.
  • Distribution of observable is true score + noise.
36
Example: Classical Test Theory
  • Conditional distributions of observables are
    identical, and don't depend on other observables.
37
Conditional distributions of observables given
TrueScore = 2
(What kind of reasoning is this?)
38
  • Prior (sort of normal) distribution for TrueScore

39
Example: Classical Test Theory
  • Posterior distribution for TrueScore given Test1 = 4

(What kind of reasoning is this?)
40
Example: Classical Test Theory
  • Posterior for TrueScore given Test1 = 4, Test2 = 4,
    Test3 = 5
41
Example: Classical Test Theory
  • Note the shift in what we expect for Test 4.
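The updating shown on these slides can be sketched in Python. The slides do not give the actual conditional probability tables, so the noise model and prior below are assumptions for illustration only: the observed score equals TrueScore with probability .6 and is off by one point with probability .2 on each side.

```python
# Hedged sketch only: the slides' actual tables are not shown, so this
# assumes a simple "true score + noise" observation model.
SCORES = range(7)  # possible true/observed scores 0..6 (assumed)

def p_obs_given_true(obs, true):
    """Assumed noise model: .6 exact, .2 one point off on each side;
    out-of-range mass is folded back onto the boundary score."""
    probs = {true: 0.6, true - 1: 0.2, true + 1: 0.2}
    p = probs.get(obs, 0.0)
    if obs == 0:
        p += probs.get(-1, 0.0)
    if obs == 6:
        p += probs.get(7, 0.0)
    return p

# Roughly normal prior over TrueScore, as on the slides (values assumed).
prior = [0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.05]

# Observing Test1 = 4, Test2 = 4, Test3 = 5: the observables are
# conditionally independent given TrueScore, so likelihoods multiply.
post = [prior[t] * p_obs_given_true(4, t) ** 2 * p_obs_given_true(5, t)
        for t in SCORES]
total = sum(post)
post = [p / total for p in post]
# Belief now concentrates on TrueScore = 4.
```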
42
Building up complex networks
  • Extending Wigmore's ideas to probability-based
    reasoning:
  • Interrelationships among many variables are modeled
    in terms of important relationships among smaller
    subsets of variables,
  • sometimes unobservable ones.

43
Building up complex networks
  • Recursive representation of probability
    distributions
  • terms drop out when there is conditional
    independence.
  • The relationship between recursive
    representations and acyclic directed graphs
  • Edges (arrows) represent explicit dependence
    relationships

44
Example: The Asia Network
45
Mixed-Number Subtraction
Based on cognitive analyses, task construction,
and data collection by Dr. Kikumi Tatsuoka. See
the Mislevy (1994, 1995) readings for details of
the Bayes net.
46
Computation in Bayes nets
  • Very brief conceptual overview
  • For a bit more detail, see:
  • Mislevy, R.J. (1995). Probability-based
    inference in cognitive diagnosis. In P. Nichols,
    S. Chipman, & R. Brennan (Eds.), Cognitively
    diagnostic assessment (pp. 43-71). Hillsdale,
    NJ: Erlbaum.
  • For a lot more detail, see:
  • Jensen, F.V. (1996). An introduction to Bayesian
    networks. New York: Springer-Verlag.

47
Inference in a chain
Recursive representation
p(u,v,x,y,z) = p(z | y,x,v,u) p(y | x,v,u) p(x | v,u) p(v | u) p(u)
             = p(z | y) p(y | x) p(x | v) p(v | u) p(u).
[Chain diagram: U → V → X → Y → Z, with edge
probabilities p(v | u), p(x | v), p(y | x), p(z | y)]
48
Inference in a chain
Suppose we learn the value of X.
Start here, by revising belief about X.
[Chain diagram: U → V → X → Y → Z]
49
Inference in a chain
Propagate information down the chain using
conditional probabilities.
From updated belief about X, use conditional
probability to revise belief about Y.
[Chain diagram: U → V → X → Y → Z]
50
Inference in a chain
Propagate information down the chain using
conditional probabilities.
From updated belief about Y, use conditional
probability to revise belief about Z.
[Chain diagram: U → V → X → Y → Z]
51
Inference in a chain
Propagate information up the chain using Bayes
Theorem.
From updated belief about X, use Bayes Theorem to
revise belief about V.
[Chain diagram: U → V → X → Y → Z]
52
Inference in a chain
Propagate information up the chain using Bayes
Theorem.
From updated belief about V, use Bayes Theorem to
revise belief about U.
[Chain diagram: U → V → X → Y → Z]
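The down-chain and up-chain propagation described above can be sketched with matrix operations. The chain is truncated to U → V → X for brevity, all variables are binary, and every number below is hypothetical:

```python
import numpy as np

# Sketch of propagation in a truncated piece of the chain: U -> V -> X.
# All variables binary; all tables are hypothetical numbers.
p_u = np.array([0.5, 0.5])         # prior belief about U
P_v_u = np.array([[0.9, 0.1],      # p(v | u): rows index u, cols index v
                  [0.2, 0.8]])
P_x_v = np.array([[0.7, 0.3],      # p(x | v): rows index v, cols index x
                  [0.4, 0.6]])

# Down the chain: marginal beliefs via conditional probabilities.
p_v = p_u @ P_v_u                  # p(v) = sum_u p(u) p(v | u)
p_x = p_v @ P_x_v                  # p(x) = sum_v p(v) p(x | v)

# Suppose we learn X = 0. Up the chain: Bayes Theorem revises belief
# about V, then about U.
lik_v = P_x_v[:, 0]                # p(X = 0 | v) for each v
p_v_post = p_v * lik_v / (p_v * lik_v).sum()

lik_u = P_v_u @ lik_v              # p(X = 0 | u), marginalizing over V
p_u_post = p_u * lik_u / (p_u * lik_u).sum()
```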
53
Inference in a tree
In a tree, each variable has no more than one
parent. Suppose we learn the value of X. We can
update every variable in the tree using either
the conditional probability relationship or Bayes
Theorem.
[Tree diagram over U, V, X, Y, Z]
54
Inference in multiply-connected nets
In a multiply-connected graph, in at least one
instance there is more than one path from one
variable to another variable. Repeated
application of Bayes Theorem and conditional
probability at the level of individual variables
doesn't work.
[Multiply-connected diagram over U, V, W, X, Y, Z]
55
Inference in multiply-connected nets
  • Key idea: Group variables into subsets
    (cliques) such that the subsets form a tree.

[Diagram: clique {U, V, W} in the network over
U, V, W, X, Y, Z]
56
Inference in multiply-connected nets
  • Key idea: Group variables into subsets
    (cliques) such that the subsets form a tree.

[Diagram: cliques {U, V, W} and {U, V, X} in the
network over U, V, W, X, Y, Z]
57
Inference in multiply-connected nets
  • Key idea: Group variables into subsets
    (cliques) such that the subsets form a tree.

[Diagram: cliques {U, V, W}, {U, V, X}, and
{U, X, Y} in the network over U, V, W, X, Y, Z]
58
Inference in multiply-connected nets
  • Key idea: Group variables into subsets
    (cliques) such that the subsets form a tree.
  • Cliques can then be updated with a generalized
    version of the procedure for updating individual
    variables.

[Diagram: cliques {U, V, W}, {U, V, X}, {U, X, Y},
and {X, Z} forming a tree over the network]