Title: CIS 732 Lecture 13 (2001-10-04)
1 Ben Perry M.S. Thesis Defense
A Genetic Algorithm for Learning Bayesian Network Adjacency Matrices from Data
Benjamin B. Perry, Laboratory for Knowledge Discovery in Databases, Kansas State University
http://www.kddresearch.org
http://www.cis.ksu.edu/bbp9857
2 Overview
- Bayesian Networks
  - Definitions and examples
  - Inference and learning
- Genetic Algorithms
- Structure Learning Background
  - Problem
  - K2 algorithm
  - Sparse Candidate
- Improving K2: Permutation Genetic Algorithm (GASLEAK)
  - Shortcomings of K2: greedy, sensitive to ordering
  - Permutation GA
- Master's Thesis: Adjacency Matrix GA (SLAM GA)
  - Rationale
  - Evaluation with known Bayesian networks
- Summary
3 Bayesian Belief Networks (BBNs): Definition
- Bayesian Network
  - Directed acyclic graph
  - Vertices (nodes) denote events or states of affairs (each a random variable)
  - Edges (arcs, links) denote conditional dependencies, causalities
- Model of conditional dependence assertions (or CI assumptions)
- Example: Ben's Presentation BBN (in the style of the classic sprinkler network)
  [Figure: five-node BBN X1-X5 over Sleep {Narcoleptic, Well, Bad, All-nighter}, Memory {Elephant, Good, Bad, None}, Appearance {Good, Bad}, Ben is nervous {Extremely, Yes, No}, Ben's presentation {Good, Not so good, Failed miserably}]
- General Product (Chain) Rule for BBNs
  P(X1, ..., Xn) = ∏_i P(Xi | Parents(Xi))
  P(Well, Good, Good, No, Good) = P(Well) · P(Good | Well) · P(Good | Well) · P(No | Good, Good) · P(Good | No)
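A worked sketch of the product rule in Python, assuming the structure implied by the factorization above (Sleep → Memory, Sleep → Appearance, {Memory, Appearance} → Nervous, Nervous → Presentation); the CPT numbers are invented for illustration and are not from the slides:

```python
# Worked chain-rule example. Structure is assumed from the factorization
# above; CPT numbers are invented, and only the entries needed for this
# one assignment are listed.
parents = {
    "Sleep": (),
    "Memory": ("Sleep",),
    "Appearance": ("Sleep",),
    "Nervous": ("Memory", "Appearance"),
    "Presentation": ("Nervous",),
}
cpt = {
    ("Sleep", (), "Well"): 0.7,
    ("Memory", ("Well",), "Good"): 0.8,
    ("Appearance", ("Well",), "Good"): 0.9,
    ("Nervous", ("Good", "Good"), "No"): 0.75,
    ("Presentation", ("No",), "Good"): 0.85,
}

def joint(x):
    # product rule: multiply P(v | parents(v)) over every vertex
    p = 1.0
    for v, ps in parents.items():
        p *= cpt[(v, tuple(x[q] for q in ps), x[v])]
    return p

print(joint({"Sleep": "Well", "Memory": "Good", "Appearance": "Good",
             "Nervous": "No", "Presentation": "Good"}))  # ~0.321
```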
4 Graphical Models of Probability Distributions
- Idea
  - Want a model that can be used to perform inference
- Desired properties
  - Correlations among variables
  - Ability to represent functional, logical, stochastic relationships
  - Probability of certain events
- Inference: Decision Support Problems
  - Diagnosis (medical, equipment)
  - Pattern recognition (image, speech)
  - Prediction
- Want to Learn the Most Likely Model that Generates the Observed Data
  - Under certain assumptions (causal Markovity), it has been shown that we can do it
  - Given: data D (tuples or vectors containing observed values of variables)
  - Return: directed graph (V, E) expressing target CPTs
- NEXT: Genetic algorithms
5 Genetic Algorithms
- Idea
  - Emulate the natural process of survival of the fittest (example: roaches adapt)
  - Each generation has many diverse individuals
  - Each individual competes for the chance to survive
  - Most common approach: the best individuals live on to the next generation and mate
  - Produce children with traits from both parents
  - If the parents are strong, the children might be stronger
- Major components (operators)
  - Fitness function
  - Chromosome manipulation
  - Crossover (not the John Edward type!), mutation
- From (Educated?) Guess to Gold
  - Initial population is typically random or not much better than random (bad scores)
  - Performs well with a non-deceptive search space and good genetic operators
  - Ability to escape local optima with mutations
  - Not guaranteed to find the best answer, but usually gets close (see the loop sketch below)
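A minimal sketch of the elitist GA loop described above; the fitness function and operators are stand-in parameters here, and the thesis-specific versions appear on later slides:

```python
# Generic elitist GA loop: the best `elite` individuals survive unchanged,
# the rest of the next generation comes from crossover + mutation of
# parents drawn from the fitter half of the population.
import random

def evolve(init_pop, fitness, crossover, mutate, generations=100, elite=2):
    pop = list(init_pop)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_gen = pop[:elite]                            # elitism
        while len(next_gen) < len(pop):
            mom, dad = random.sample(pop[:len(pop) // 2], 2)  # fitter half mates
            next_gen.append(mutate(crossover(mom, dad)))
        pop = next_gen
    return max(pop, key=fitness)
```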
6 Learning Structure: K2 Algorithm
- Algorithm Learn-BBN-Structure-K2 (D, Max-Parents)
  - FOR i ← 1 to n DO  // arbitrary (given) ordering of variables x1, x2, ..., xn
    - WHILE Parents[xi].Size < Max-Parents DO  // find best candidate parent
      - Best ← argmax_{j<i} P(D | Parents[xi] ∪ {xj})  // max Dirichlet score
      - IF (Parents[xi] ∪ {Best}).Score > Parents[xi].Score THEN Parents[xi] ← Parents[xi] ∪ {Best}
  - RETURN ({Parents[xi] : i = 1, 2, ..., n})
- ALARM (A Logical Alarm Reduction Mechanism) [Beinlich et al., 1989]
  - BBN model for patient monitoring in surgical anesthesia
  - Vertices (37): findings (e.g., esophageal intubation), intermediates, observables
  - K2 found a BBN differing in only 1 edge from the gold standard (elicited from an expert)
7 Learning Structure: K2 Downfalls
- Greedy (may fall into local maxima)
- Highly dependent upon node ordering
  - Optimal node ordering must be given
  - If the optimal order were already known, an expert could probably create the network
- Number of candidate node orderings grows factorially (n!)
8 Learning Structure: Sparse Candidate
- General Idea
  - Inspect the k best parent candidates at a time (K2 only inspects one)
  - k is typically very small: 5 ≤ k ≤ 15
  - Exponential in k
- Algorithm
  - Loop until no improvement or the iteration limit is exceeded:
    - Restrict phase: for each node, select the top k parent candidates (mutual information or M_disc); see the sketch below
    - Maximize phase: build a network by manipulating parents (add, remove, reverse edges from each node's candidate set); only accept changes that improve the network score (Minimum Description Length)
  - Must handle cycles... expensive
    - K2 gives this to us for free
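A sketch of the Restrict phase only, assuming plain pairwise mutual information as the relevance measure; the Maximize phase and cycle handling are omitted:

```python
# Restrict phase: rank each node's possible parents by empirical mutual
# information and keep the top k. `data` is a list of complete tuples.
from collections import Counter
from math import log

def mutual_info(data, a, b):
    n = len(data)
    pa = Counter(r[a] for r in data)
    pb = Counter(r[b] for r in data)
    pab = Counter((r[a], r[b]) for r in data)
    return sum((c / n) * log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def restrict(data, n_vars, k):
    return {i: sorted((j for j in range(n_vars) if j != i),
                      key=lambda j: mutual_info(data, i, j),
                      reverse=True)[:k]
            for i in range(n_vars)}
```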
- Next: Improving K2
9 GASLEAK: A Permutation GA for Variable Ordering
10 Properties of the Genetic Algorithm
- Elitist
- Chromosome representation
  - Integer permutation ordering
  - A sample chromosome in a BBN of 5 nodes might look like: 3 1 2 0 4
- Seeding
  - Random shuffle
- Operators
  - Order crossover
  - Swap mutation (both sketched below)
- Fitness
  - RMSE
- Job farm
  - Java-based; utilizes many machines regardless of OS
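Sketches of the two permutation operators named above (order crossover and swap mutation) for integer-ordering chromosomes like 3 1 2 0 4; these are standard textbook versions, not necessarily the thesis's exact implementations:

```python
import random

def order_crossover(mom, dad):
    # keep a random slice of mom, fill the remaining positions with
    # dad's genes in the order they appear in dad
    n = len(mom)
    lo, hi = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[lo:hi] = mom[lo:hi]
    rest = [g for g in dad if g not in child]
    for i in range(n):
        if child[i] is None:
            child[i] = rest.pop(0)
    return child

def swap_mutation(perm):
    # exchange two randomly chosen positions
    i, j = random.sample(range(len(perm)), 2)
    perm = list(perm)
    perm[i], perm[j] = perm[j], perm[i]
    return perm
```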
11 GASLEAK Results
- Not encouraging
- Bad fitness function, or bad evidence b.v.
- Many graph errors
12 Master's Thesis: SLAM GA
- SLAM GA: Structure Learning Adjacency Matrix Genetic Algorithm
- Initial population: tried several approaches
  - Completely random Bayesian networks (Box-Muller, max parents)
    - Many illegal structures; wrote the fixCycles algorithm (sketched below)
  - Random networks generated from parents pre-selected by the Restrict phase of Sparse Candidate
    - Performed better than random
  - Aggregate of k networks learned by K2 from random orderings (cycles eliminated): best approach
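The slides do not spell out fixCycles; the following sketch assumes the simplest repair policy, deleting one edge of each DFS-detected cycle until the adjacency matrix describes a DAG:

```python
# Hypothetical cycle repair for an n x n 0/1 adjacency matrix, where
# adj[u][v] == 1 means there is an edge u -> v. Repeatedly find a DFS
# back edge (which closes a cycle) and delete it.
def fix_cycles(adj):
    n = len(adj)

    def find_back_edge():
        color = [0] * n                     # 0 unvisited, 1 on stack, 2 done
        def dfs(u):
            color[u] = 1
            for v in range(n):
                if adj[u][v]:
                    if color[v] == 1:
                        return (u, v)       # back edge: v is on the stack
                    if color[v] == 0:
                        e = dfs(v)
                        if e:
                            return e
            color[u] = 2
            return None
        for s in range(n):
            if color[s] == 0:
                e = dfs(s)
                if e:
                    return e
        return None

    e = find_back_edge()
    while e:
        adj[e[0]][e[1]] = 0                 # break the cycle
        e = find_back_edge()
    return adj
```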
13 Aggregator / Instantiator
[Diagram: training data D is fed to a K2 manager, which runs K2 on k random orderings to produce BBN 1, BBN 2, ..., BBN k; the Aggregator combines these into one aggregate BBN.]
- For small networks, k = 1 is best. For larger networks, k = 2 is best.
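The slides do not give the exact combination rule; this sketch assumes a simple majority vote over the k learned adjacency matrices, reusing the fix_cycles sketch from slide 12:

```python
# Hypothetical Aggregator stage: an edge survives if it appears in more
# than half of the k K2-learned adjacency matrices; cycles are then
# repaired with the fix_cycles sketch above.
def aggregate(matrices):
    k, n = len(matrices), len(matrices[0])
    agg = [[1 if 2 * sum(m[i][j] for m in matrices) > k else 0
            for j in range(n)] for i in range(n)]
    return fix_cycles(agg)
```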
14 SLAM GA
- Chromosome representation
  - Edge matrix: n² bits
  - Each bit represents a parent edge to a node
  - 1 = parent, 0 = not parent
- Operators
  - Crossover: swap parents, fix cycles
15 SLAM GA Crossover
16 SLAM GA
- Chromosome representation
  - Edge matrix: n² bits
  - Each bit represents a parent edge to a node
  - 1 = parent, 0 = not parent
- Operators
  - Crossover: swap parents, fix cycles
  - Mutation: reverse, delete, or add a random number of edges; fix cycles (both sketched below)
- Fitness
  - Total Bayesian Dirichlet equivalence score over all nodes
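A sketch of both operators, assuming crossover takes each node's whole parent set (a column of the adjacency matrix, with adj[parent][child] indexing) from one of the two parent chromosomes, and reusing the fix_cycles sketch from slide 12; the details are illustrative, not the thesis code:

```python
import random

def slam_crossover(mom, dad):
    # each node inherits its entire parent set from one parent chromosome,
    # chosen by coin flip; cycles introduced by mixing are then repaired
    n = len(mom)
    child = [[0] * n for _ in range(n)]
    for node in range(n):
        src = mom if random.random() < 0.5 else dad
        for p in range(n):
            child[p][node] = src[p][node]
    return fix_cycles(child)

def slam_mutation(adj, n_edits=1):
    # reverse, delete, or add random edges, then repair cycles
    n = len(adj)
    for _ in range(n_edits):
        i, j = random.sample(range(n), 2)
        op = random.choice(("add", "delete", "reverse"))
        if op == "add":
            adj[i][j] = 1
        elif op == "delete":
            adj[i][j] = 0
        else:
            adj[i][j], adj[j][i] = adj[j][i], adj[i][j]
    return fix_cycles(adj)
```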
17 Results: Asia
[Figures: best network of the first generation vs. the actual network; 15 graph errors]
18 Results: Asia
19 Results: Poker
[Figures: best network of the first generation vs. the actual network; 11 graph errors]
20 Results: Poker
21 Results: Golf
[Figures: best network of the first generation vs. the actual network; 11 graph errors]
22 Results: Golf
23 Results: Boerlage92
[Figures: initial network vs. the actual network]
24 Results: Boerlage92
25 Results: Alarm
26 Final Fitness Values
27 K2 vs. SLAM GA
- K2
  - Very good if the ordering is known
  - The ordering is often not known
  - Greedy; very dependent on ordering
- SLAM GA
  - Stochastic: escapes the local-optima trap
  - Can improve on bad structures learned by K2
  - Takes much longer than K2
28 GASLEAK vs. SLAM GA
- GASLEAK
  - Gold network never recovered
  - Much more computationally expensive
    - K2 is run on each new individual each generation
    - Each chromosome must be scored
  - Final network has many graph errors
- SLAM GA
  - For small networks, the gold-standard network is often recovered
  - Relatively few graph errors in the final network
  - Less computationally intensive
    - Initial population is the most expensive step
    - Each chromosome must be scored
29 SLAM GA Ramifications
- Effective structure learning algorithm
  - Ideal for small networks
- Improvement over GASLEAK
  - SLAM GA is faster in spite of the same GA parameters
  - SLAM GA is more accurate
- Improvement over K2
  - Aggregate algorithm produces a better initial population
- Parent-swapping crossover technique is effective
  - Diversifies the search space while retaining past information
30 SLAM GA Future Work
- Parameter tweaking
- Better fitness function
  - Several bad structures score better than the gold standard
  - The GA itself works fine
- Intelligent mutation operator
  - Add edges from a pre-qualified set of candidate parents
- New instantiation methods
  - Use GASLEAK
  - Other structure-learning algorithms
- Scalability
  - Job farm
31 Summary
- Bayesian Networks
- Genetic Algorithms
- Learning Structure: K2, Sparse Candidate
- GASLEAK
- SLAM GA