1 KSU Math Department Colloquium
Graphical Models of Probability for Causal Reasoning
Thursday 07 November 2002
William H. Hsu
Laboratory for Knowledge Discovery in Databases
Department of Computing and Information Sciences
Kansas State University
http://www.kddresearch.org
This presentation is http://www.kddresearch.org/KSU/CIS/Math-20021107.ppt
2 Overview
- Graphical Models of Probability
  - Markov graphs
  - Bayesian (belief) networks
  - Causal semantics
  - Direction-dependent separation (d-separation) property
- Learning and Reasoning: Problems, Algorithms
  - Inference: exact and approximate
    - Junction tree: Lauritzen and Spiegelhalter (1988)
    - (Bounded) loop cutset conditioning: Horvitz and Cooper (1989)
    - Variable elimination: Dechter (1996)
  - Structure learning
    - K2 algorithm: Cooper and Herskovits (1992)
    - Variable ordering problem: Larrañaga (1996), Hsu et al. (2002)
- Probabilistic Reasoning in Machine Learning, Data Mining
- Current Research and Open Problems
3 Stages of Data Mining and Knowledge Discovery in Databases
Adapted from Fayyad, Piatetsky-Shapiro, and Smyth (1996)
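4 Graphical Models Overview 1: Bayesian Networks
$P(\text{20s}, \text{Female}, \text{Low}, \text{Non-Smoker}, \text{No-Cancer}, \text{Negative}, \text{Negative}) = P(T) \cdot P(F) \cdot P(L \mid T) \cdot P(N \mid T, F) \cdot P(N \mid L, N) \cdot P(N \mid N) \cdot P(N \mid N)$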
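The factorization on this slide multiplies one conditional probability per node, each conditioned on that node's parents. A minimal sketch of that computation follows. The node names, the parent sets (read off the factors on the slide), and every probability value are assumptions made for illustration; they are not taken from the presentation.

```python
# Hypothetical structure: each node maps to its tuple of parents.
parents = {
    "Age": (),
    "Gender": (),
    "Exposure": ("Age",),
    "Smoking": ("Age", "Gender"),
    "Cancer": ("Exposure", "Smoking"),
    "SerumCalcium": ("Cancer",),
    "LungTumor": ("Cancer",),
}

# Hypothetical CPT rows, keyed by (node value, tuple of parent values).
# Only the rows needed for the single assignment below are listed.
cpt = {
    "Age": {("20s", ()): 0.30},
    "Gender": {("Female", ()): 0.50},
    "Exposure": {("Low", ("20s",)): 0.60},
    "Smoking": {("Non-Smoker", ("20s", "Female")): 0.80},
    "Cancer": {("No-Cancer", ("Low", "Non-Smoker")): 0.95},
    "SerumCalcium": {("Negative", ("No-Cancer",)): 0.90},
    "LungTumor": {("Negative", ("No-Cancer",)): 0.85},
}


def joint_probability(assignment, parents, cpt):
    """Multiply P(node | parents(node)) over all nodes (chain rule for BBNs)."""
    prob = 1.0
    for node, pa in parents.items():
        key = (assignment[node], tuple(assignment[p] for p in pa))
        prob *= cpt[node][key]
    return prob


assignment = {
    "Age": "20s",
    "Gender": "Female",
    "Exposure": "Low",
    "Smoking": "Non-Smoker",
    "Cancer": "No-Cancer",
    "SerumCalcium": "Negative",
    "LungTumor": "Negative",
}
p = joint_probability(assignment, parents, cpt)
print(p)
```

The point of the sketch is the storage saving: seven small tables stand in for a full joint table over all seven variables.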
5 Graphical Models Overview 2: Markov Blankets and d-Separation Property
Motivation: The conditional independence status of nodes within a BBN might change as the availability of evidence E changes. Direction-dependent separation (d-separation) is a technique used to determine conditional independence of nodes as evidence changes.
Definition: A set of evidence nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E. A path is blocked if there is a node Z on the path for which one of three conditions holds:
1. Z is in E and Z has one arrow on the path leading in and one arrow leading out
2. Z is in E and Z has both path arrows leading out
3. Neither Z nor any descendant of Z is in E, and both path arrows lead in to Z
From S. Russell and P. Norvig (1995)
Adapted from J. Schlabach (1996)
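The definition above can be checked mechanically on a small network. The sketch below enumerates every undirected path between two nodes and tests each interior node against the standard blocking rules: a chain or common-cause node blocks when it is in E, and a common-effect (collider) node blocks when neither it nor any of its descendants is in E. The edge list is a hypothetical network assumed for illustration, and path enumeration is only practical for small graphs.

```python
def descendants(node, children):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def undirected_paths(x, y, edges):
    """Yield every simple path from x to y, ignoring edge direction."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)

    def walk(path):
        if path[-1] == y:
            yield path
            return
        for n in nbrs.get(path[-1], ()):
            if n not in path:
                yield from walk(path + [n])

    yield from walk([x])


def path_blocked(path, edges, evidence):
    """True if some interior node of the path blocks it given the evidence."""
    edge_set = set(edges)
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    for prev, z, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, z) in edge_set and (nxt, z) in edge_set
        if collider:
            # Common effect: blocks unless z or one of its descendants is observed.
            if z not in evidence and not (descendants(z, children) & evidence):
                return True
        elif z in evidence:
            # Chain or common cause: blocks when z is observed.
            return True
    return False


def d_separated(x, y, evidence, edges):
    """True if every undirected path between x and y is blocked given evidence."""
    evidence = set(evidence)
    return all(path_blocked(p, edges, evidence)
               for p in undirected_paths(x, y, edges))


# Hypothetical directed edges (parent, child), assumed for illustration.
edges = [
    ("Age", "Exposure"), ("Age", "Smoking"), ("Gender", "Smoking"),
    ("Exposure", "Cancer"), ("Smoking", "Cancer"),
    ("Cancer", "SerumCalcium"), ("Cancer", "LungTumor"),
]
print(d_separated("SerumCalcium", "LungTumor", {"Cancer"}, edges))
print(d_separated("Age", "Gender", {"Smoking"}, edges))
```

For sets X and Y, the same function can be applied to every pair of nodes drawn from the two sets.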
6 Graphical Models Overview 3: Inference Problem
Multiply-connected case: exact and approximate inference are #P-complete
Adapted from slides by S. Russell, UC Berkeley, http://aima.cs.berkeley.edu/
7 Other Topics in Graphical Models 1: Temporal Probabilistic Reasoning
- Goal: Estimate the hidden state at time t given observations through time r
- Filtering: r = t
  - Intuition: infer current state from observations
  - Applications: signal identification
  - Variation: Viterbi algorithm
- Prediction: r < t
  - Intuition: infer future state
  - Applications: prognostics
- Smoothing: r > t
  - Intuition: infer past hidden state
  - Applications: signal enhancement
- Collaborative Filtering (CF) Tasks
  - Plan recognition by smoothing
  - Prediction: cf. WebCANVAS, Cadez et al. (2000)
Adapted from Murphy (2001), Guo (2002)
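The filtering task listed above can be sketched for a discrete hidden Markov model as the usual predict-then-update recursion. Everything specific here is an assumption made for illustration: the two-state model, its transition and emission tables, and the convention that the prior describes the state just before the first observation.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief over the hidden state after each observation.

    prior[i]          : probability of state i before the first observation
    transition[i][j]  : probability of moving from state i to state j
    emission[j][y]    : probability of observing symbol y in state j
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for y in observations:
        # Predict: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the observation likelihood, then renormalize.
        unnormalized = [predicted[j] * emission[j][y] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Hypothetical two-state model: state 0 = "rain", state 1 = "dry";
# observation 0 = "umbrella seen", observation 1 = "no umbrella".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]

history = forward_filter(prior, transition, emission, observations=[0, 0])
for t, belief in enumerate(history, start=1):
    print(t, belief)
```

Prediction can reuse the predict step alone, with no update, for each future time step; smoothing additionally needs a backward pass, which this sketch omits.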
8 Other Topics in Graphical Models 2: Learning Structure from Data
- General-Case BBN Structure Learning: Use Inference to Compute Scores
- Optimal Strategy: Bayesian Model Averaging
  - Assumption: models $h \in H$ are mutually exclusive and exhaustive
  - Combine predictions of models in proportion to marginal likelihood
    - Compute conditional probability of hypothesis h given observed data D
    - i.e., compute expectation over unknown h for unseen cases
    - Let $h \equiv$ structure, parameters $\Theta \equiv$ CPTs
[Equation annotations: Posterior Score, Marginal Likelihood, Prior over Parameters, Prior over Structures, Likelihood]
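To make the scoring idea concrete: in the standard Bayesian formulation, the posterior score of a structure h is proportional to its prior times the marginal likelihood $P(D \mid h) = \int P(D \mid h, \Theta) P(\Theta \mid h) \, d\Theta$. With complete discrete data and uniform Dirichlet priors on the CPTs, that integral has the closed form used by the K2 metric of Cooper and Herskovits (1992). The sketch below scores two hypothetical structures over two binary variables and normalizes the scores into a posterior over structures, assuming a uniform prior over structures. The data set, the structure names, and the helper names are all assumptions for illustration, and the sketch stops short of averaging predictions.

```python
from collections import Counter
from math import exp, lgamma


def log_family_score(counts, arity):
    """log[(r-1)! / (N + r-1)! * prod_k N_k!] for one parent configuration."""
    n = sum(counts)
    return (lgamma(arity) - lgamma(n + arity)
            + sum(lgamma(c + 1) for c in counts))


def log_marginal_likelihood(data, parents, arity=2):
    """Log P(D | h) under uniform Dirichlet priors (K2 metric).

    data    : list of dicts mapping variable name to value
    parents : dict mapping each variable to a tuple of its parents
    """
    total = 0.0
    for var, pa in parents.items():
        by_config = {}
        for row in data:
            config = tuple(row[p] for p in pa)
            by_config.setdefault(config, Counter())[row[var]] += 1
        total += sum(log_family_score(list(c.values()), arity)
                     for c in by_config.values())
    return total


def structure_posterior(data, structures):
    """Posterior over candidate structures, assuming a uniform structure prior."""
    names = list(structures)
    logs = [log_marginal_likelihood(data, structures[n]) for n in names]
    top = max(logs)
    weights = [exp(value - top) for value in logs]  # subtract max for stability
    z = sum(weights)
    return {n: w / z for n, w in zip(names, weights)}


# Hypothetical data in which Y always copies X.
data = [{"X": 0, "Y": 0}] * 10 + [{"X": 1, "Y": 1}] * 10

# Two hypothetical candidate structures.
structures = {
    "independent": {"X": (), "Y": ()},
    "X->Y": {"X": (), "Y": ("X",)},
}

posterior = structure_posterior(data, structures)
print(posterior)
```

Bayesian model averaging would then weight each structure's prediction for an unseen case by these posterior values.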
9 Propagation Algorithm in Singly-Connected Bayesian Networks, Pearl (1983)
Multiply-connected case: exact and approximate inference are #P-complete (the counting version of an NP-complete decision problem is typically #P-complete)
Adapted from Neapolitan (1990), Guo (2000)
10 Inference by Clustering 1: Graph Operations (Moralization, Triangulation, Maximal Cliques)
Adapted from Neapolitan (1990), Guo (2000)
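Of the three graph operations named on this slide, moralization is the simplest to sketch: connect ("marry") every pair of parents that share a child, then drop edge directions. The parent sets below are a hypothetical network assumed for illustration; triangulation and maximal clique extraction are not shown.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edges of the moral graph as a set of frozensets."""
    edges = set()
    for child, pa in parents.items():
        # Keep every original edge, without its direction.
        for p in pa:
            edges.add(frozenset((p, child)))
        # Marry each pair of parents of the same child.
        for a, b in combinations(pa, 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical parent sets, assumed for illustration.
parents = {
    "Age": (),
    "Gender": (),
    "Exposure": ("Age",),
    "Smoking": ("Age", "Gender"),
    "Cancer": ("Exposure", "Smoking"),
    "SerumCalcium": ("Cancer",),
    "LungTumor": ("Cancer",),
}

moral_edges = moralize(parents)
for edge in sorted(tuple(sorted(e)) for e in moral_edges):
    print(edge)
```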
11 Inference by Clustering 2: Junction Tree, Lauritzen and Spiegelhalter (1988)
Input: list of cliques of the triangulated, moralized graph $G_u$
Output: tree of cliques; separator nodes $S_i$, residual nodes $R_i$, and potential probability $\psi(Clq_i)$ for all cliques
Algorithm:
1. $S_i = Clq_i \cap (Clq_1 \cup Clq_2 \cup \dots \cup Clq_{i-1})$
2. $R_i = Clq_i - S_i$
3. If $i > 1$ then identify a $j < i$ such that $Clq_j$ is a parent of $Clq_i$
4. Assign each node $v$ to a unique clique $Clq_i$ such that $\{v\} \cup c(v) \subseteq Clq_i$
5. Compute $\psi(Clq_i) = \prod_{f(v) = Clq_i} P(v \mid c(v))$ (1 if no $v$ is assigned to $Clq_i$)
6. Store $Clq_i$, $R_i$, $S_i$, and $\psi(Clq_i)$ at each vertex in the tree of cliques
Adapted from Neapolitan (1990), Guo (2000)
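Steps 1 through 3 of the algorithm above can be sketched with plain set operations. Two things are assumptions here rather than statements from the slide: the parent rule (a clique's parent is taken to be the first earlier clique that contains its separator, which exists when the clique ordering satisfies the running intersection property) and the ordered clique list itself, which is hypothetical. Potentials (steps 4 and 5) are omitted.

```python
def build_clique_tree(cliques):
    """Compute separator, residual, and parent index for each ordered clique."""
    tree = []
    seen = set()  # union of all earlier cliques
    for i, clique in enumerate(cliques):
        separator = clique & seen          # step 1
        residual = clique - separator      # step 2
        parent = None
        if i > 0:                          # step 3 (assumed parent rule)
            parent = next((j for j in range(i) if separator <= cliques[j]),
                          None)
            if parent is None:
                raise ValueError("ordering violates running intersection")
        tree.append({"clique": clique, "separator": separator,
                     "residual": residual, "parent": parent})
        seen |= clique
    return tree


# Hypothetical ordered cliques of a triangulated moral graph.
cliques = [
    {"Age", "Gender", "Smoking"},
    {"Age", "Exposure", "Smoking"},
    {"Exposure", "Smoking", "Cancer"},
    {"Cancer", "SerumCalcium"},
    {"Cancer", "LungTumor"},
]

tree = build_clique_tree(cliques)
for i, vertex in enumerate(tree):
    print(i, sorted(vertex["separator"]), sorted(vertex["residual"]),
          vertex["parent"])
```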
12 Inference by Clustering 3: Clique-Tree Operations
Adapted from Neapolitan (1990), Guo (2000)
13 Inference by Loop Cutset Conditioning
- Deciding Optimal Cutset: NP-hard
- Current Open Problems
  - Bounded cutset conditioning: ordering heuristics
  - Finding randomized algorithms for loop cutset optimization
Split vertex in undirected cycle; condition upon each of its state values
Number of network instantiations: product of arity of nodes in minimal loop cutset
Posterior: marginal conditioned upon cutset variable values
[Figure: example network with nodes Gender, Smoking, Exposure-To-Toxins, Cancer, Serum Calcium, Lung Tumor; node labels X2 through X7]
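The instantiation count stated on this slide is what makes the size of the cutset matter: one singly-connected inference pass is needed per joint assignment of the cutset variables. A minimal sketch of that enumeration follows, assuming a hypothetical two-variable cutset and hypothetical state lists.

```python
from itertools import product
from math import prod


def cutset_instantiations(cutset, states):
    """Yield every joint assignment of the cutset variables."""
    names = list(cutset)
    for values in product(*(states[v] for v in names)):
        yield dict(zip(names, values))


# Hypothetical cutset and state spaces, assumed for illustration.
states = {
    "Age": ["20s", "30s", "40s+"],
    "Smoking": ["Smoker", "Non-Smoker"],
}
cutset = ["Age", "Smoking"]

instantiations = list(cutset_instantiations(cutset, states))
expected_count = prod(len(states[v]) for v in cutset)
print(len(instantiations), expected_count)
```

The posterior for a query is then the sum over these instantiations of the conditioned posterior, each weighted by the probability of that cutset assignment given the evidence.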
14 Inference by Variable Elimination 1: Intuition
Adapted from slides by S. Russell, UC Berkeley, http://aima.cs.berkeley.edu/
15 Inference by Variable Elimination 2: Factoring Operations
Adapted from slides by S. Russell, UC Berkeley, http://aima.cs.berkeley.edu/
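16 Inference by Variable Elimination 3: Example
Factors: $P(A)$, $P(B \mid A)$, $P(C \mid A)$, $P(D \mid B, A)$, $P(F \mid B, C)$, $P(G \mid F)$
Query: $P(A \mid G = 1) = ?$ with ordering $d = \langle A, C, B, F, D, G \rangle$
Buckets:
- G: $P(G \mid F)$, $G = 1$
- D: $P(D \mid B, A)$
- F: $P(F \mid B, C)$
- B: $P(B \mid A)$
- C: $P(C \mid A)$
- A: $P(A)$
$\lambda_G(f) = \sum_{G = 1} P(G \mid F)$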
Adapted from Dechter (1996), Joehanes (2002)
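The example above can be run end to end on binary variables. The sketch below represents a factor as a pair (variable names, table), restricts the G bucket to the evidence G = 1, and then sums out D, F, B, and C in that order, which follows the slide's ordering processed from last to first. All CPT numbers and helper names are assumptions for illustration, and this is a plain sum-product elimination rather than a full bucket-elimination implementation.

```python
from itertools import product


def cpt(child, parents, p_true):
    """Factor for P(child | parents); p_true maps parent values to P(child=1)."""
    table = {}
    for pa in product((0, 1), repeat=len(parents)):
        table[(1,) + pa] = p_true[pa]
        table[(0,) + pa] = 1.0 - p_true[pa]
    return (child,) + parents, table


def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    names = tuple(dict.fromkeys(fv + gv))
    table = {}
    for vals in product((0, 1), repeat=len(names)):
        a = dict(zip(names, vals))
        table[vals] = (ft[tuple(a[v] for v in fv)]
                       * gt[tuple(a[v] for v in gv)])
    return names, table


def sum_out(var, f):
    """Marginalize `var` out of factor f."""
    fv, ft = f
    i = fv.index(var)
    table = {}
    for vals, p in ft.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fv[:i] + fv[i + 1:], table


def restrict(f, var, value):
    """Fix `var` to an observed value and drop it from the factor."""
    fv, ft = f
    if var not in fv:
        return f
    i = fv.index(var)
    return fv[:i] + fv[i + 1:], {vals[:i] + vals[i + 1:]: p
                                 for vals, p in ft.items() if vals[i] == value}


def eliminate(factors, order, evidence):
    """Return the normalized posterior over the single remaining variable."""
    for var, val in evidence.items():
        factors = [restrict(f, var, val) for f in factors]
    for var in order:
        bucket = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        combined = bucket[0]
        for f in bucket[1:]:
            combined = multiply(combined, f)
        factors = rest + [sum_out(var, combined)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    z = sum(result[1].values())
    return {vals[0]: p / z for vals, p in result[1].items()}


# Hypothetical CPT values for the six factors in the example.
factors = [
    cpt("A", (), {(): 0.3}),
    cpt("B", ("A",), {(0,): 0.2, (1,): 0.7}),
    cpt("C", ("A",), {(0,): 0.4, (1,): 0.9}),
    cpt("D", ("B", "A"), {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.8}),
    cpt("F", ("B", "C"), {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.95}),
    cpt("G", ("F",), {(0,): 0.1, (1,): 0.8}),
]

posterior = eliminate(factors, order=["D", "F", "B", "C"], evidence={"G": 1})
print(posterior)
```

Summing out D first is harmless but uninformative: D is neither evidence nor an ancestor of the query or evidence, so its bucket produces a factor of all ones.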
17 Genetic Algorithms for Parameter Tuning in Bayesian Network Structure Learning
18 Supervised and Unsupervised Learning: Decision Support in Insurance Pricing
Hsu, Welge, Redman, Clutter (2002). Data Mining and Knowledge Discovery, 6(4):361-391
19 Computational Genomics and Microarray Gene Expression Modeling
Learning Environment
Adapted from Friedman et al. (2000), http://www.cs.huji.ac.il/labs/compbio/
20 DESCRIBER: An Experimental Intelligent Filter
- Example Queries
  - What experiments have found cell cycle-regulated metabolic pathways in Saccharomyces?
  - What codes and microarray data were used, and why?
[Figure: system architecture. Components: Users of Scientific Document Repository; Data Entity and Source Code Repository Index for Bioinformatics Experimental Research; Learning and Inference Components; Historical Use Case and Query Data; Personalized Interface; New Queries; Domain-Specific Collaborative Filtering; Decision Support Models; Interface(s) to Distributed Repository; Domain-Specific Repositories (Experimental Data, Source Codes and Specifications, Data Models, Ontologies, Models)]
21 Tools for Building Graphical Models
- Commercial Tools: Ergo, Netica, TETRAD, Hugin
- Bayes Net Toolbox (BNT): Murphy (1997-present)
  - Distribution page: http://http.cs.berkeley.edu/murphyk/Bayes/bnt.html
  - Development group: http://groups.yahoo.com/group/BayesNetToolbox
- Bayesian Network tools in Java (BNJ): Hsu et al. (1999-present)
  - Distribution page: http://bndev.sourceforge.net
  - Development group: http://groups.yahoo.com/group/bndev
  - Current (re)implementation projects for KSU KDD Lab
    - Continuous state: Minka (2002); Hsu, Guo, Perry, Boddhireddy
    - Formats: XML BNIF (MSBN), Netica; Guo, Hsu
    - Space-efficient DBN inference: Joehanes
    - Bounded cutset conditioning: Chandak
22 References 1: Graphical Models and Inference Algorithms
- Graphical Models
  - Bayesian (Belief) Networks tutorial: Murphy (2001), http://www.cs.berkeley.edu/murphyk/Bayes/bayes.html
  - Learning Bayesian Networks: Heckerman (1996, 1999), http://research.microsoft.com/heckerman
- Inference Algorithms
  - Junction Tree (Join Tree, L-S, Hugin): Lauritzen and Spiegelhalter (1988), http://citeseer.nj.nec.com/huang94inference.html
  - (Bounded) Loop Cutset Conditioning: Horvitz and Cooper (1989), http://citeseer.nj.nec.com/shachter94global.html
  - Variable Elimination (Bucket Elimination, ElimBel): Dechter (1996), http://citeseer.nj.nec.com/dechter96bucket.html
- Recommended Books
  - Neapolitan (1990), out of print; see Pearl (1988), Jensen (2001)
  - Castillo, Gutierrez, Hadi (1997)
  - Cowell, Dawid, Lauritzen, Spiegelhalter (1999)
- Stochastic Approximation: http://citeseer.nj.nec.com/cheng00aisbn.html
23 References 2: Machine Learning, KDD, and Bioinformatics
- Machine Learning, Data Mining, and Knowledge Discovery
  - K-State KDD Lab literature survey and resource catalog (2002): http://www.kddresearch.org/Resources
  - Bayesian Network tools in Java (BNJ): Hsu, Guo, Joehanes, Perry, Thornton (2002), http://bndev.sourceforge.net
  - Machine Learning in Java (MLJ): Hsu, Louis, Plummer (2002), http://mldev.sourceforge.net
  - NCSA Data to Knowledge (D2K): Welge, Redman, Auvil, Tcheng, Hsu, http://www.ncsa.uiuc.edu/STI/ALG
- Bioinformatics
  - European Bioinformatics Institute Tutorial: Brazma et al. (2001), http://www.ebi.ac.uk/microarray/biology_intro.htm
  - Hebrew University: Friedman, Pe'er, et al. (1999, 2000, 2002), http://www.cs.huji.ac.il/labs/compbio/
  - K-State BMI Group literature survey and resource catalog (2002): http://www.kddresearch.org/Groups/Bioinformatics
24 Acknowledgements
- Kansas State University Lab for Knowledge Discovery in Databases
  - Graduate research assistants: Haipeng Guo (hpguo_at_cis.ksu.edu), Roby Joehanes (robbyjo_at_cis.ksu.edu)
  - Other grad students: Prashanth Boddhireddy, Siddharth Chandak, Ben B. Perry, Rengakrishnan Subramanian
  - Undergraduate programmers: James W. Plummer, Julie A. Thornton
- Joint Work with
  - KSU Bioinformatics and Medical Informatics (BMI) group: Sanjoy Das (EECE), Judith L. Roe (Biology), Stephen M. Welch (Agronomy)
  - KSU Microarray group: Scot Hulbert (Plant Pathology), J. Clare Nelson (Plant Pathology), Jan Leach (Plant Pathology)
  - Kansas Geological Survey, Kansas Biological Survey, KU EECS
- Other Research Partners
  - NCSA Automated Learning Group (Michael Welge, Tom Redman, David Clutter, Lisa Gatzke)
  - University of Manchester (Carole Goble, Robert Stevens)
  - The Institute for Genomic Research (John Quackenbush, Alex Saeed)
  - International Rice Research Institute (Richard Bruskiewich)