Title: Integrating Probabilistic Modeling and Representation-Building
Integrating Probabilistic Modeling and Representation-Building
- Doctoral Thesis Proposal
- Moshe Looks
- March 2nd, 2006
Outline
- Background
- Thesis
- Proposed Approach
- Proposed Goals
Problems and Problem-Solving
- Levels of Analysis
- Pre-representational - how to describe the problem as formalized input?
- Post-representational - how to solve the formal problem?
Problems and Problem-Solving
- Hofstadter, 1985
- Knob creation - discovering novel values to parameterize
- Knob twiddling - adjusting the values of existing parameters
General Optimization
- Formal Representation (sketched below)
- Solution space S (e.g., {0,1}^n)
- Scoring function maps solutions to reals
- Solving the problem means maximizing the score
- To outperform enumeration and random sampling, some knowledge of the space must be assumed
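A minimal sketch of this formal setup, using OneMax (count the ones in a bit string) as a hypothetical stand-in scoring function and pure random sampling as the baseline any informed method must beat:

import random

n = 20

def score(solution):
    # scoring function: maps {0,1}^n to the reals (here, OneMax)
    return sum(solution)

# Random-sampling baseline: draw 1000 solutions, keep the best.
samples = [[random.randint(0, 1) for _ in range(n)] for _ in range(1000)]
best = max(samples, key=score)
print(score(best))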
6What Knowledge?
-
- Complete separability would be nice
- Near-decomposability (Simon, 1969) is more
realistic
Weaker Interactions
Stronger Interactions
How to Exploit This?
- Separability → Independence Assumptions
- Given a prior over the solution space
- Represented as a probability vector
- Sample solutions from the model (distribution)
- Update the model toward higher-scoring points
- Iterate... (sketched below)
- Baluja, 1994
- Works surprisingly well, even when the assumptions don't hold completely
- when the interactions are weak
- or there is little deception
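A minimal sketch of this loop in the style of population-based incremental learning (Baluja, 1994); the learning rate and sample counts here are illustrative choices, not values from the source:

import random

def pbil(score, n, generations=100, samples=50, lr=0.1):
    p = [0.5] * n  # prior: an independent probability for each bit
    for _ in range(generations):
        # sample solutions from the model (a probability vector)
        pop = [[1 if random.random() < p[i] else 0 for i in range(n)]
               for _ in range(samples)]
        best = max(pop, key=score)  # highest-scoring sampled point
        # update the model toward the higher-scoring point
        p = [(1 - lr) * p[i] + lr * best[i] for i in range(n)]
    return p

model = pbil(lambda s: sum(s), n=20)  # OneMax again as the stand-in score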
How to Exploit This?
- A known correct problem decomposition may be incorporated into the model
- Mühlenbein & Mahnig, 1998
- An unknown decomposition may be learned
- algorithms that adaptively learn such linkages are termed competent
- Optimization via probabilistic modeling is surveyed in
- Pelikan, Goldberg, & Lobo, 1999
The Bayesian Optimization Algorithm
- Represents the problem decomposition as a Bayes net
- learned greedily, via a network scoring metric
- Augmented in the hierarchical BOA (hBOA)
- uses Bayes nets with local structure
- allows smaller model-building steps
- leads to more accurate models
- restricted tournament replacement
- promotes diversity
- Robust and scalable results on problems with both known and unknown decompositions (see the sketch below)
- Pelikan & Goldberg, 2003
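A compact, heavily simplified sketch of BOA-style model building, assuming a fixed variable ordering (K2-style) so the learned network stays acyclic, and using BIC-penalized log-likelihood as a stand-in for the BDe-type metrics the BOA actually uses; local structure and restricted tournament replacement are omitted:

import math, random

def col_loglik(data, child, parents):
    # log-likelihood of column `child` given its parent columns (Laplace-smoothed)
    counts = {}
    for row in data:
        counts.setdefault(tuple(row[p] for p in parents), [1, 1])[row[child]] += 1
    return sum(math.log(counts[tuple(row[p] for p in parents)][row[child]]
                        / sum(counts[tuple(row[p] for p in parents)]))
               for row in data)

def learn_network(data, n, max_parents=2):
    penalty = 0.5 * math.log(len(data))  # BIC complexity penalty per added edge
    parents = {i: [] for i in range(n)}
    for child in range(1, n):
        while len(parents[child]) < max_parents:
            base = col_loglik(data, child, parents[child])
            candidates = [c for c in range(child) if c not in parents[child]]
            scored = [(col_loglik(data, child, parents[child] + [c]) - penalty, c)
                      for c in candidates]
            if not scored or max(scored)[0] <= base:
                break  # no edge improves the penalized network score
            parents[child].append(max(scored)[1])
    return parents

def sample_one(parents, data, n):
    # draw a new solution; index order respects the parent restriction
    x = [0] * n
    for i in range(n):
        key = tuple(x[p] for p in parents[i])
        c = [1, 1]  # Laplace-smoothed conditional counts
        for row in data:
            if tuple(row[p] for p in parents[i]) == key:
                c[row[i]] += 1
        x[i] = 1 if random.random() < c[1] / (c[0] + c[1]) else 0
    return x

data = [[random.randint(0, 1) for _ in range(8)] for _ in range(60)]
net = learn_network(data, 8)
print(sample_one(net, data, 8))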
Decompositions and Representations
- Competent adaptive optimization algorithms
- can overcome a poor choice of representation
- via problem decomposition
- Requires the existence of a problem decomposition that is
- compact
- satisficing
- in the model space searched by the algorithm
Decompositions and Representations
- I propose extending methods such as hBOA to domains where a compact decomposition does not exist directly in the user-specified problem
Representation-Building: An Example
- Optimizing over strings (x1, x2, ..., xn)
- A separate distribution is maintained for each xi
- What if there is positional ambiguity?
- Some features refer to absolute position, some do not
- E.g., DNA - a gene's position is sometimes critical, and sometimes irrelevant
- Consider abstracted features, defined in terms of "base-level variables" (the xi)
- E.g., contains a prime number of ones
- E.g., does not contain the substring AATGC
- Model-based instance generation (sampling) must be generalized to accommodate features (see the sketch below)
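One minimal way to realize this generalization (an illustrative sketch, not the method from the source) is rejection sampling: draw strings from the per-position model, then keep only those consistent with the active features:

import random

ALPHABET = "ACGT"

def sample_string(probs):
    # probs[i] maps each symbol to its probability at position i
    return "".join(random.choices(ALPHABET, weights=[p[a] for a in ALPHABET])[0]
                   for p in probs)

def avoids_motif(s, motif="AATGC"):
    # an example feature: does not contain the substring AATGC
    return motif not in s

def sample_with_features(probs, features, max_tries=10000):
    for _ in range(max_tries):
        s = sample_string(probs)
        if all(f(s) for f in features):  # accept only feature-consistent strings
            return s
    raise RuntimeError("features too restrictive for rejection sampling")

uniform = [{a: 0.25 for a in ALPHABET} for _ in range(30)]
print(sample_with_features(uniform, [avoids_motif]))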
Representation-Building: An Example
- Exploit background knowledge to choose effective feature classes
- E.g., motifs (variable-position substrings)
- motifs may be prespecified
- or learned via information-theoretic criteria
- Demonstrated performance gains with learned motifs (with respect to the BOA)
- Looks, 2006 (in submission)
Representation-Building - Observations
- A superior decomposition may exist that cannot be compactly represented
- Generalize the representational language?
- Computationally intractable!
- Representation-building mechanisms
- tractable if they incorporate inductive bias
- the goal is to provide salient parameters to the optimization algorithm
Learning Open-Ended Hierarchical Structures
- User selects (pre-representationally)
- a set of functions
- E.g., +, -, ×, log, sin
- a set of terminals
- E.g., x, y, z, 0, 1
- a scoring function over trees
- Decreases pre-representational effort
- Solution structure and content must both be learned
- Claim
- Representation-building is thus correspondingly more instrumental in finding a compact problem decomposition
Current Evolutionary Approaches
- Genetic Programming (GP)
- Koza, 1992
- Many variants
- Population-based search with new instances generated via (sketched below)
- swapping of subtrees (crossover)
- random insertions/deletions/modifications (mutation)
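A tiny illustrative sketch of subtree crossover on expressions represented as nested lists; the representation and helper names here are hypothetical, chosen only to make the operation concrete:

import copy, random

def subtree_paths(tree, path=()):
    # yield the path to every subtree; element 0 of a list is the operator
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_at(tree, path, sub):
    if not path:
        return sub
    get(tree, path[:-1])[path[-1]] = sub
    return tree

def crossover(a, b):
    # swap a random subtree of a with a random subtree of b
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    pa, pb = (random.choice(list(subtree_paths(t))) for t in (a, b))
    sa, sb = copy.deepcopy(get(a, pa)), copy.deepcopy(get(b, pb))
    return set_at(a, pa, sb), set_at(b, pb, sa)

print(crossover(["+", "x", ["*", "y", "z"]], ["sin", ["-", "x", "1"]]))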
Current Evolutionary Approaches
- Probabilistic model-building approaches without decomposition-learning
- Probabilistic Incremental Program Evolution
- Salustowicz & Schmidhuber, 1997
- hierarchical generalization, 1998
- based on absolute tree position (address from the root)
- assumes complete independence
- Estimation-of-Distribution Programming
- Yanai & Iba, 2003
- assumes a fixed network of dependency relationships
Current Evolutionary Approaches
- Probabilistic model-building approaches with decomposition-learning
- Grammar-learning methods
- Shan et al., 2004
- Bosman & de Jong, 2004
- based on relative tree position
- Methods from competent optimization algorithms
- Extended Compact Genetic Programming
- Sastry & Goldberg, 2003
- Bayesian-Optimization-Algorithm Programming
- Looks, Goertzel, & Pennachin, 2005
Claim
- Compact problem decompositions rarely exist for non-trivial problems with a generic representation of general expressions
- generic representations
- E.g., trees
- E.g., grammars
- general expressions
- E.g., Boolean formulae
- E.g., symbolic equations
- E.g., finite automata
Justification
- Solution scores are assumed to vary based only on semantics
- Determining (semantic) equivalence of general expressions is NP-hard!
- this says nothing about approximate decompositions
- However, a compact decomposition derived from a generic representation is still implausible
- assuming no knowledge of semantics
- and no explicit computational effort toward specialized representational reduction
Thesis
- General expressions may be organized so that compact decompositions may often be found for non-trivial problems, via representation-building
- Representation-building will require
- knowledge of semantics (i.e., domain knowledge)
- explicit computational effort toward representational reduction
- Comparable to the notion of a heuristic solver for an NP-hard problem
Meta-Adaptive Programming (MAP)
1. Generate a random population of trees.
2. Select promising trees from the population for modeling.
3. Build a parameterized representation of these trees, and transform them into parameter assignments.
4. Model these assignments using a Bayesian network with local structure to discover the problem decomposition.
5. Sample the model to generate new parameter assignments, apply the inverse transformation to convert them into trees, and integrate them into the population.
6. Go to step 2. (A control-flow sketch follows below.)
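A control-flow sketch of the MAP loop above; the helpers are hypothetical placeholders for the components described on the slides that follow (simplification/normalization, alignment, parameterization, hBOA-style modeling), reduced here to trivial stand-ins so the skeleton executes:

import random

simplify_and_normalize = lambda t: t      # stand-in: rewrite rules + normal form
align = lambda trees: trees               # stand-in: incremental tree alignment
parameterize = lambda aligned: aligned    # stand-in: trees -> parameter assignments
learn_model = lambda params: params       # stand-in: Bayes net with local structure
sample_model = lambda model, k: random.choices(model, k=k)  # stand-in: sampling
deparameterize = lambda params: params    # stand-in: inverse transformation

def map_loop(score, random_tree, generations=50, pop_size=100, top_frac=0.3):
    population = [random_tree() for _ in range(pop_size)]            # step 1
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        promising = population[:int(top_frac * pop_size)]            # step 2
        params = parameterize(align([simplify_and_normalize(t)
                                     for t in promising]))           # step 3
        model = learn_model(params)                                  # step 4
        children = deparameterize(sample_model(model, pop_size))     # step 5
        population = promising + children[:pop_size - len(promising)]  # integrate
    return max(population, key=score)                                # (loop = step 6)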
Constructing a Parameterized Representation
Simplification & Normalization
Alignment
Parameterization
Constructing a Parameterized Representation
- Simplify trees via rewrite rules and convert them into a normal form
- Incrementally align all trees
- based on an alignment scoring function
- may be solved optimally via dynamic programming (see the sketch below)
- unfortunately, this is NP-hard for
- unordered operators (e.g., +)
- multiple trees
- Pairwise greedy alignment (agglomerative clustering)
- quadratic in the number of trees
- Feng & Doolittle, 1987
- For unordered operators, do greedy alignment of children
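A minimal sketch of the dynamic-programming base case: pairwise alignment of two leaf sequences in the style of Needleman-Wunsch. The scoring values (+1 match, -1 mismatch/gap) are illustrative, not the thesis's alignment scoring function:

def align_pair(a, b, match=1, mismatch=-1, gap=-1):
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]  # dp[i][j]: best score, a[:i] vs b[:j]
    for i in range(1, m + 1):
        dp[i][0] = i * gap
    for j in range(1, n + 1):
        dp[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                           dp[i - 1][j] + gap,      # gap in b
                           dp[i][j - 1] + gap)      # gap in a
    return dp[m][n]

# Greedy multiple alignment would repeatedly merge the closest pair of trees
# (agglomerative clustering), which is quadratic in the number of trees.
print(align_pair("AATGC", "ATGC"))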
Proposed Goals
- Theoretical
- Modeling tree growth
- GP schema theory
- Experimental (and Implementational)
- Adversarial problems
- Normal forms
- Challenge problems
- Conceptual
- The role of representation-building in AI
Theoretical Goals
- Modeling Tree Growth
- How does the average/maximal tree size change over time?
- GP is prone to bloat
- Cf. Langdon & Poli, 2002
- Probabilistic modeling approaches may avoid this
- pressure toward solutions that are easy to model
Theoretical Goals
- Tree growth in meta-adaptive programming
- is constrained by the size of the representations
- which is in turn constrained by the alignment scoring function
- The alignment scoring function
- may lead to a completely bounded space
- or may lead to unbounded growth
- subject to the fitness function
- Goal is to analyze this theoretically
- leading to speed-limit results for scoring functions
Theoretical Goals
- Exact GP schema theory
- recently developed
- Cf. Poli & Langdon, 2002
- equivalent to Markov chain models
- provides exact distributional data for the next generation, based on fitness
- intractable for real problems!
- Goal is to analyze the differences in schema processing between GP and MAP
- crossover (subcomponent mixing) is not random
- controlled by alignment and probabilistic modeling
- GP has no notion of problem semantics
- in GP, the schemata (2,a) and (a,a) are completely separate
Theoretical Goals - Checklist

Goal                                     Status
Modeling Tree Growth
  Content-Free Binary Trees              ?
  Binary Trees With Content              ?
  Effects of Rewrite Rules               ?
Schema-Processing Comparative Analysis   ?
Experimental Goals
- Design Benchmarking on Adversarial Problems
- Decomposition should be known to the user, not the algorithm
- Dimensions of Deceptiveness for Trees
- Relative-position (subtree) deceptiveness
- Absolute-position deceptiveness
- Operator deceptiveness
Experimental Goals
- Normal Forms
- heuristically remove redundancy
- preserve hierarchical structure
- Domains
- Simple Agent Control (Artificial Ants)
- E.g., progn(turn-left, turn-right, move) → move (see the sketch below)
- Boolean Formulae
- CNF doesn't preserve hierarchical structure
- Holman's normal form does
- Advanced Agent Control
- including general programmatic constructs
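A toy sketch of redundancy removal via rewrite rules in the artificial-ant domain; the rule set is hypothetical, mirroring only the progn example above, where adjacent opposite turns cancel:

CANCELING = {("turn-left", "turn-right"), ("turn-right", "turn-left")}

def simplify_progn(actions):
    # delete adjacent pairs of opposite turns (the stack re-checks new adjacencies)
    out = []
    for a in actions:
        if out and (out[-1], a) in CANCELING:
            out.pop()  # the two turns cancel; drop both
        else:
            out.append(a)
    return out

print(simplify_progn(["turn-left", "turn-right", "move"]))  # -> ['move']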
Experimental Goals - Checklist

Goal                                         Status
Modeling and Sampling with Features          ?
Adversarial Problems                         ?
Domains
  Simple Agent Control                       ?
  Boolean Formulae (CNF)                     ?
  Boolean Formulae (Hierarchical)            ?
  Advanced Agent Control                     ?
Tree Alignment and Representation-Building   75%
Conceptual Goals
- A central challenge of AI: create systems with representations that
- are dynamic
- are informed by background knowledge
- are built by the system, not by humans
- facilitate effective problem decomposition for learning