Title: Knowledge Engineering for Bayesian Networks
1Knowledge Engineering for Bayesian Networks
School of Computer Science and Software Engineering, Monash University
2Overview
- Representing uncertainty
- Introduction to Bayesian Networks
- Syntax, semantics, examples
- The knowledge engineering process
- Case Study: Intelligent Tutoring
- Summary of other BN research
- Open research questions
3Sources of Uncertainty
- Ignorance
- Inexact observations
- Non-determinism
- AI representations
- Probability theory
- Dempster-Shafer
- Fuzzy logic
4Probability theory for representing uncertainty
- Assigns a numerical degree of belief between 0 and 1 to facts
- e.g. "it will rain today" is T/F
- Prior probability (unconditional): P(it will rain today) = 0.2
- Posterior probability (conditional): P(it will rain today | rain is forecast) = 0.8
- Bayes' Rule: P(H | E) = P(E | H) x P(H) / P(E)
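As a worked illustration of Bayes' Rule, the sketch below recomputes the rain example. The slide gives only the prior (0.2) and the posterior (0.8); the two likelihood values are assumptions picked for illustration so that the numbers agree, and they do not come from the slides.

```python
def bayes_posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) for a binary hypothesis H using Bayes' Rule."""
    # Total probability of the evidence:
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


# Prior from the slide: P(it will rain today) = 0.2
prior_rain = 0.2
# Assumed likelihoods (hypothetical, not from the slides):
p_forecast_given_rain = 0.8   # P(rain is forecast | it rains)
p_forecast_given_dry = 0.05   # P(rain is forecast | it stays dry)

posterior_rain = bayes_posterior(
    prior_rain, p_forecast_given_rain, p_forecast_given_dry
)
print(round(posterior_rain, 3))
```

Under these assumed likelihoods the posterior comes out at about 0.8, matching the slide.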
5Bayesian networks
- Directed acyclic graphs
- Nodes: random variables
- R: it is raining, discrete values T/F
- T: temperature, continuous or discrete variable
- C: colour, discrete values red, blue, green
- Arcs indicate dependencies (can have a causal interpretation)
6Bayesian networks
- Conditional Probability Distribution (CPD)
- Associated with each variable
- probability of each state given parent states
Node Flu: Jane has the flu
P(Flu=T) = 0.05
Arc Flu -> Te: models causal relationship
Node Te: Jane has a high temp
P(Te=High | Flu=T) = 0.4, P(Te=High | Flu=F) = 0.01
Arc Te -> Th: models possible sensor error
Node Th: thermometer temp reading
P(Th=High | Te=H) = 0.95, P(Th=High | Te=L) = 0.1
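One way to see how these CPDs define a full model is to multiply them along the chain Flu -> Te -> Th. The sketch below stores the slide's numbers in plain dictionaries and builds the joint distribution; the variable and function names are illustrative choices, not part of any BN package.

```python
from itertools import product

# CPDs from the slide. Te = Jane's temperature, Th = thermometer reading.
p_flu = {True: 0.05, False: 0.95}
# P(Te=High | Flu), keyed by the value of Flu
p_te_high = {True: 0.4, False: 0.01}
# P(Th=High | Te), keyed by whether Te is High
p_th_high = {True: 0.95, False: 0.1}


def joint(flu, te_high, th_high):
    """P(Flu, Te, Th) = P(Flu) * P(Te | Flu) * P(Th | Te)."""
    p_te = p_te_high[flu] if te_high else 1.0 - p_te_high[flu]
    p_th = p_th_high[te_high] if th_high else 1.0 - p_th_high[te_high]
    return p_flu[flu] * p_te * p_th


# The eight joint entries should sum to 1.
total = sum(joint(f, te, th) for f, te, th in product([True, False], repeat=3))
print(round(total, 10))

# Probability that Jane has the flu, a high temperature and a high reading.
p_all_high = joint(True, True, True)
print(round(p_all_high, 4))
```

Three small tables (1 + 2 + 2 free parameters) are enough to specify all eight joint probabilities, which is the compactness that the network structure buys.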
7BN inference
- Evidence: observation of a specific state
- Task: compute the posterior probabilities for query node(s) given evidence
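To make the inference task concrete, the sketch below computes the posterior for the query node Flu given the evidence Th = High, using the CPD values from the flu example and assuming the chain structure Flu -> Te -> Th. It uses brute-force summation over the hidden node Te, which is only a toy method; BN packages use more efficient algorithms.

```python
# CPD values from the flu example.
P_FLU = 0.05
P_TE_HIGH_GIVEN_FLU = {True: 0.4, False: 0.01}
P_TH_HIGH_GIVEN_TE_HIGH = {True: 0.95, False: 0.1}


def p_th_high_given_flu(flu):
    """P(Th=High | Flu), summing out the hidden temperature node Te."""
    p_te = P_TE_HIGH_GIVEN_FLU[flu]
    return (p_te * P_TH_HIGH_GIVEN_TE_HIGH[True]
            + (1.0 - p_te) * P_TH_HIGH_GIVEN_TE_HIGH[False])


def posterior_flu_given_th_high():
    """P(Flu=T | Th=High) by Bayes' Rule."""
    num = P_FLU * p_th_high_given_flu(True)
    den = num + (1.0 - P_FLU) * p_th_high_given_flu(False)
    return num / den


result = posterior_flu_given_th_high()
print(round(result, 3))
```

With these numbers a high thermometer reading raises the belief in flu from the prior of 0.05 to roughly 0.176.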
8BN software
- Commercial packages: Netica, Hugin, Analytica (all with demo versions)
- Free software: Smile, Genie, JavaBayes
- http://HTTP.CS.Berkeley.EDU/murphyk/Bayes/bnsoft.html
- Example: running Netica software
9Decision networks
- Extension to basic BN for decision making
- Decision nodes
- Utility nodes
- EU(Action) = Σ_o P(o | Action, E) U(o), summing over outcomes o
- Choose the action with the highest expected utility
- Example
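The expected-utility rule can be sketched in a few lines. Everything in the example below (the umbrella decision, the outcome names, the utilities) is hypothetical and chosen only to exercise the formula; the 0.8 probability of rain reuses the posterior from the earlier rain example.

```python
def expected_utility(outcome_probs, utility):
    """EU(Action) = sum over outcomes o of P(o | Action, E) * U(o)."""
    return sum(p * utility[o] for o, p in outcome_probs.items())


# Hypothetical utilities for each outcome.
utility = {"dry_unburdened": 100.0, "dry_burdened": 70.0, "wet": 0.0}

# Hypothetical P(o | Action, E), with E = "rain is forecast" (P(rain) = 0.8).
outcomes_given_action = {
    "take_umbrella": {"dry_burdened": 1.0},
    "leave_umbrella": {"dry_unburdened": 0.2, "wet": 0.8},
}

eu = {
    action: expected_utility(probs, utility)
    for action, probs in outcomes_given_action.items()
}
best_action = max(eu, key=eu.get)
print(eu, best_action)
```

In a decision network the outcome probabilities would come from BN inference given the evidence and the chosen action, and the utilities from a utility node, instead of being typed in by hand.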
10Elicitation from experts
- Variables
- important variables? values/states?
- Structure
- causal relationships?
- dependencies/independencies?
- Parameters (probabilities)
- quantify relationships and interactions?
- Preferences (utilities)
11Expert Elicitation Process
- These stages are done iteratively
- Stops when further expert input is no longer cost effective
- Process is difficult and time consuming.
- Current BN tools
- inference engine
- GUI
- Next generation of BN tools?
12Knowledge discovery
- There is much interest in automated methods for learning BNs from data
- parameters, structure (causal discovery)
- Computationally complex problem, so current methods have practical limitations
- e.g. limit number of states, require variable ordering constraints, do not specify all arc directions
- Evaluation methods
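Of the two learning tasks, parameter learning with a known structure is the simpler one. A minimal sketch, assuming complete data and a single parent, is to estimate each CPD row from counts with add-one (Laplace) smoothing. This is a standard textbook estimator used here for illustration, not the method of any particular causal discovery program, and the toy records are invented. Structure learning is considerably harder and is not shown.

```python
from collections import Counter


def learn_cpt(records, child, parent, child_states, alpha=1.0):
    """Estimate P(child | parent) from records using add-alpha smoothing."""
    pair_counts = Counter((r[parent], r[child]) for r in records)
    parent_counts = Counter(r[parent] for r in records)
    cpt = {}
    for pa, n_pa in parent_counts.items():
        denom = n_pa + alpha * len(child_states)
        cpt[pa] = {
            s: (pair_counts[(pa, s)] + alpha) / denom for s in child_states
        }
    return cpt


# Hypothetical toy data set: 100 patient records.
records = (
    [{"flu": True, "temp": "high"}] * 4
    + [{"flu": True, "temp": "low"}] * 6
    + [{"flu": False, "temp": "high"}] * 1
    + [{"flu": False, "temp": "low"}] * 89
)

cpt = learn_cpt(records, child="temp", parent="flu", child_states=["high", "low"])
for parent_value, row in cpt.items():
    print(parent_value, {state: round(p, 3) for state, p in row.items()})
```

The smoothing constant `alpha` keeps unseen combinations from receiving probability zero, which matters when some parent states are rare in the data.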
13The knowledge engineering process
- 1. Building the BN
- variables, structure, parameters, preferences
- combination of expert elicitation and knowledge discovery
- 2. Validation/Evaluation
- case-based, sensitivity analysis, accuracy testing
- 3. Field Testing
- alpha/beta testing, acceptance testing
- 4. Industrial Use
- collection of statistics
- 5. Refinement
- Updating procedures, regression testing
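As one example of the validation step, a simple one-way sensitivity analysis varies a single parameter and watches how a posterior of interest responds. The sketch below does this for the flu example from earlier in the talk (chain Flu -> Te -> Th, evidence Th = High), sweeping the prior P(Flu=T); the sweep values are arbitrary illustrative choices.

```python
def posterior_flu(prior_flu, p_te_high_flu=0.4, p_te_high_noflu=0.01,
                  p_th_high_te_high=0.95, p_th_high_te_low=0.1):
    """P(Flu=T | Th=High) for the chain Flu -> Te -> Th."""
    like_flu = (p_te_high_flu * p_th_high_te_high
                + (1 - p_te_high_flu) * p_th_high_te_low)
    like_noflu = (p_te_high_noflu * p_th_high_te_high
                  + (1 - p_te_high_noflu) * p_th_high_te_low)
    num = prior_flu * like_flu
    return num / (num + (1 - prior_flu) * like_noflu)


# Sweep the elicited prior and record the resulting posterior.
priors = [0.01, 0.05, 0.10, 0.20]
sweep = {p: posterior_flu(p) for p in priors}
for p, post in sweep.items():
    print(f"prior={p:.2f} -> posterior={post:.3f}")
```

If small changes in an elicited parameter move the posterior a lot, that parameter is a good candidate for further expert attention or for estimation from data.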
14Case Study: Intelligent tutoring
- Tutoring domain: primary and secondary school students' misconceptions about decimals
- Based on Decimal Comparison Test (DCT)
- student asked to choose the larger of pairs of decimals
- different types of pairs reveal different misconceptions
- ITS: system involves computer games involving decimals
- This research also looks at a combination of expert elicitation and automated methods (UAI2001)
15Expert classification of Decimal Comparison Test (DCT) results
H: high (all correct or only one wrong)
L: low (all wrong or only one correct)
16The ITS architecture
[Figure: ITS architecture diagram. Recoverable labels, grouped by component:]
- Inputs: Decimal comparison test (optional); Information about student, e.g. age (optional); Classroom diagnostic test results (optional)
- Adaptive Bayesian Network: Generic BN model of student; Diagnose misconception; Predict outcomes; Identify most useful information
- System Controller Module: Sequencing tactics; Select next item type; Decide to present help; Decide change to new game; Identify when expertise gained
- Computer Games: Hidden number; Flying photographer; Decimaliens; Number between
- Arc labels: Item; Answer; Answers; Feedback; Help; Item type; New game
- Other labels: Student; Teacher; Report on student; Classroom Teaching Activities
17Expert Elicitation
- Variables
- two classification nodes: fine and coarse (mutually exclusive)
- item types: (i) H/M/L (ii) 0-N
- Structure
- arcs from classification to item type
- item types independent given classification
- Parameters
- careless mistake (3 different values)
- expert ignorance: "-" in table (uniform distribution)
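The elicited structure (item types independent given the classification) behaves like a naive Bayes classifier, which the sketch below illustrates. The class names, item types, prior, expert table and careless-mistake value are all hypothetical, and the binary correct/incorrect item model is a simplifying assumption that differs from the H/M/L and 0-N item nodes on the slide. An "H" entry is read as "answers correctly unless careless", an "L" entry as the reverse with the same small probability, and a "-" entry as a uniform distribution.

```python
CARELESS = 0.1  # hypothetical probability of a careless mistake

# Hypothetical expert table: expected performance of each student class on
# each item type ("H" high, "L" low, "-" expert does not know).
EXPERT_TABLE = {
    "class_a": {"type1": "H", "type2": "H"},
    "class_b": {"type1": "H", "type2": "L"},
    "class_c": {"type1": "L", "type2": "-"},
}
PRIOR = {"class_a": 0.5, "class_b": 0.3, "class_c": 0.2}  # hypothetical


def p_correct(entry, careless=CARELESS):
    """P(item answered correctly | class), derived from the expert entry."""
    if entry == "H":
        return 1.0 - careless
    if entry == "L":
        return careless
    return 0.5  # "-": expert ignorance, uniform distribution


def classify(answers, prior=PRIOR, table=EXPERT_TABLE):
    """Posterior over classes given {item_type: answered_correctly}."""
    scores = {}
    for cls, p in prior.items():
        for item, correct in answers.items():
            pc = p_correct(table[cls][item])
            p *= pc if correct else 1.0 - pc
        scores[cls] = p
    z = sum(scores.values())
    return {cls: s / z for cls, s in scores.items()}


posterior = classify({"type1": True, "type2": False})
print({cls: round(p, 3) for cls, p in posterior.items()})
```

A student who gets the first item type right and the second wrong is assigned mostly to the hypothetical `class_b`, whose expert-table row predicts exactly that pattern.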
18Expert Elicited BN
19Evaluation process
- Case-based evaluation
- experts checked individual cases
- sometimes, if prior was low, true classification did not have highest posterior (but usually had biggest change in ratio)
- Adaptiveness evaluation
- priors change after each set of evidence
- Comparison evaluation
- Differences in classification between BN and expert rule
- Differences in predictions between different BNs
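The case-based observation about low priors can be illustrated numerically. In the sketch below, with made-up priors and posteriors for three hypothetical classes, the class with the highest posterior differs from the class whose belief grew the most. The ratio is taken here as posterior divided by prior, which is one plausible reading of "change in ratio" and is an assumption.

```python
def rank_by_posterior_and_ratio(prior, posterior):
    """Return (class with highest posterior, class with highest ratio, ratios)."""
    best_posterior = max(posterior, key=posterior.get)
    ratio = {c: posterior[c] / prior[c] for c in prior}
    best_ratio = max(ratio, key=ratio.get)
    return best_posterior, best_ratio, ratio


# Hypothetical beliefs before and after entering a student's answers.
prior = {"class_a": 0.70, "class_b": 0.25, "class_c": 0.05}
posterior = {"class_a": 0.50, "class_b": 0.20, "class_c": 0.30}

best_posterior, best_ratio, ratio = rank_by_posterior_and_ratio(prior, posterior)
print(best_posterior, best_ratio)
```

Here `class_a` still has the highest posterior, but only because it started with a large prior; the evidence actually favours `class_c`, whose belief rose sixfold.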
20Comparison evaluation
- Development of measure: same classification, desirable and undesirable re-classification
- Use item type predictions
- Investigation of effect of item type granularity and probability of careless mistake
21Comparison: expert BN vs rule
[Figure: comparison chart with categories Same, Desirable, Undesirable]
22Results
[Figure: results charts showing Same, Desirable (Desir.) and Undesirable (Undes.) categories, for varying prob. of careless mistake and for varying granularity of item type (0-N and H/M/L)]
23Investigation by Automated methods
- Classification (using SNOB program, based on MML)
- Parameters
- Structure (using CaMML)
24Results
25Another Case Study: Seabreeze prediction
- 2000 Honours project, joint with Bureau of Meteorology (with Russell Kennett and Kevin Korb, PAKDD2001 paper, TR)
- BN built based on existing simple expert rule
- Several years of data available for Sydney seabreezes
- CaMML (Wallace and Korb, 1999) and Tetrad-II (Spirtes et al. 1993) programs used to learn BNs from data
- Comparative analysis showed automated methods gave improved predictions.
26Other BN-related projects
- DBNs for discrete monitoring (PhD, 1992)
- Approximate BN inference algorithms based on a mutual information measure for relevance (with Nathalie Jitnah, ICONIP97, ECSQARU97, PRICAI98, AI99)
- Plan recognition: DBNs for predicting users' actions and goals in an adventure game (with David Albrecht, Ingrid Zukerman, UM97, UMUAI1999, PRICAI2000)
- Bayesian Poker (with Kevin Korb, UAI99, honours students)
27Other BN-related projects (cont.)
- DBNs for ambulation monitoring and fall diagnosis (with biomedical engineering, PRICAI96)
- Autonomous aircraft monitoring and replanning (with Ph.D. student Tim Wilkin, PRICAI2000)
- Ecological risk assessment (2003 honours project with Water Studies Centre)
- Writing a textbook! (with Kevin Korb)
- Bayesian Artificial Intelligence
28Open Research Questions
- Methodology for combining expert elicitation and automated methods
- expert knowledge used to guide search
- automated methods provide alternatives to be presented to experts
- Evaluation measures and methods
- may be domain dependent
- Improved tools to support elicitation
- e.g. visualisation of d-separation
- Industry adoption of BN technology
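Since d-separation is named as something an elicitation tool might visualise, a minimal sketch of the underlying test may help. It uses the standard moralised ancestral graph criterion: restrict the DAG to the ancestors of the nodes involved, connect parents that share a child, drop arc directions, remove the conditioning nodes, and check whether any path remains. The function names and the dictionary-of-parents representation are illustrative choices, and no visualisation is attempted.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated given zs.

    parents maps each node to a list of its parents in the DAG.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)
    # Moralise the ancestral graph: link each node to its parents and link
    # parents that share a child; arc directions are then ignored.
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = [p for p in parents.get(child, ()) if p in keep]
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)
    # Search from xs without passing through the conditioning nodes.
    seen, stack = set(), [x for x in xs if x not in zs]
    while stack:
        n = stack.pop()
        if n in ys:
            return False
        if n in seen:
            continue
        seen.add(n)
        stack.extend(m for m in nbrs[n] if m not in zs and m not in seen)
    return True


# Chain from the flu example: Flu -> Te -> Th.
chain = {"Te": ["Flu"], "Th": ["Te"]}
print(d_separated(chain, {"Flu"}, {"Th"}, {"Te"}))
print(d_separated(chain, {"Flu"}, {"Th"}, set()))
```

In the chain, observing the true temperature Te blocks the path between Flu and the thermometer reading Th, while with no evidence the two remain dependent. A common-effect structure behaves the other way round: two causes are separated until their shared effect is observed.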