1
Approximate Inference for Complex Stochastic
Processes: Parametric and Nonparametric Approaches
  • Brenda Ng and John Bevilacqua
  • 16.412J/6.834J Intelligent Embedded Systems
  • 10/24/2001

2
Overview
  • Problem statement
  • Given a complex system with many state variables
    that evolve over time, how do we monitor and
    reason about its state?
  • Examples: robot localization and map building; a
    network for monitoring freeway traffic
  • Approach
  • Representation: model the problem as a Dynamic
    Bayesian Network
  • Inference: approximate inference techniques
  • Exact inference on an approximate model
  • → Parametric approach: Boyen-Koller Projection
  • Approximate inference on the exact model
  • → Nonparametric approach: Particle Sampling
  • Contribution

Reduce the complexity of the problem via
approximate methods, rendering the monitoring of a
complex dynamic system tractable.
3
What is a Bayesian Network?
  • A Bayesian network, or belief network, is a
    graph in which the following holds:
  • A set of random variables makes up the nodes of
    the network.
  • A set of directed links connects pairs of nodes
    to denote causal relations between variables.
  • Each node has a conditional probability
    distribution that quantifies the effects that the
    parents have on the node.
  • The graph is directed and acyclic.

Courtesy of Russell & Norvig
4
Why Bayesian Networks?
(Figure: an example Bayesian network over five
variables x1-x5: Season, Sprinkler, Rain, Wet,
Slippery.)
  • Bayesian networks achieve compactness by
    factoring the joint distribution into local,
    conditional distributions for each variable given
    its parents (a toy sketch of this factorization
    follows below).
  • Bayesian networks lend themselves easily to
    evidential reasoning.
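As a toy illustration of this factorization (my own sketch, assuming the classic Season/Sprinkler/Rain/Wet/Slippery structure from Russell & Norvig, with made-up numbers), the full joint is recovered by multiplying one small conditional table per node:

```python
# Hypothetical CPTs: each number is P(variable is true | parents),
# except p_season, which is a prior over its two values.
p_season = {"dry": 0.5, "rainy": 0.5}
p_sprinkler_given_season = {"dry": 0.5, "rainy": 0.1}
p_rain_given_season = {"dry": 0.1, "rainy": 0.6}
p_wet_given = {(True, True): 0.99, (True, False): 0.9,
               (False, True): 0.9, (False, False): 0.0}
p_slippery_given_wet = {True: 0.7, False: 0.0}

def pr(p_true, value):
    """Probability of a boolean outcome given P(true)."""
    return p_true if value else 1.0 - p_true

def joint(season, sprinkler, rain, wet, slippery):
    """P(season, sprinkler, rain, wet, slippery) as a product of local CPTs."""
    return (p_season[season]
            * pr(p_sprinkler_given_season[season], sprinkler)
            * pr(p_rain_given_season[season], rain)
            * pr(p_wet_given[(sprinkler, rain)], wet)
            * pr(p_slippery_given_wet[wet], slippery))

print(joint("rainy", False, True, True, True))  # one of the 32 joint entries
```

Five small tables replace a single table over all 32 joint configurations, and the saving grows rapidly with the number of variables.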

5
Dynamic Bayesian Networks
  • Dynamic Bayesian networks capture the process of
    variables changing over time by representing
    multiple copies of state variables, one for each
    time step.
  • A set of variables Xt denotes the world state at
    time t and a set of sensor variables Et denotes
    the observations available at time t.
  • Keeping track of the world means computing the
    current probability distribution over world
    states given all past observations,
    P(Xt | E1, ..., Et).
  • Observation model P(Et | Xt) and transition model
    P(Xt+1 | Xt) (a minimal filtering sketch follows
    below)

Courtesy of Koller & Lerner
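A minimal sketch of the monitoring recursion these two models define (my own illustration, not from the slides; the two-state traffic example and the matrices T and O are hypothetical):

```python
import numpy as np

def filter_step(belief, T, O, evidence):
    """One step of exact monitoring: from P(Xt | E1..Et) to P(Xt+1 | E1..Et+1).

    belief   -- P(Xt | E1..Et), shape (n_states,)
    T        -- transition model, T[i, j] = P(Xt+1 = j | Xt = i)
    O        -- observation model, O[j, e] = P(E = e | X = j)
    evidence -- index of the observation received at time t+1
    """
    prior = belief @ T                  # propagate through the transition model
    posterior = prior * O[:, evidence]  # condition on the new evidence
    return posterior / posterior.sum()  # renormalize to a distribution

# Hypothetical example: states {clear, jam}, observations {fast, slow}.
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])
O = np.array([[0.8, 0.2],
              [0.1, 0.9]])
belief = np.array([0.5, 0.5])
for e in [1, 1, 0]:                     # a short stream of observations
    belief = filter_step(belief, T, O, e)
print(belief)                           # current distribution over world states
```

For a DBN with many state variables, the joint state space behind T grows exponentially, which is exactly why the approximate methods below are needed.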
6
Why Approximate Inference?
The variables in a DBN can become correlated very
quickly as the network is unrolled.
Consequently, no decomposition of the belief
state is possible and exact probabilistic
reasoning becomes infeasible.
7
Monitoring Task I
(Diagram: one step of exact monitoring)
  • Belief state at time t-1: σ(t-1)(si)
  • Propagated through the state evolution model T to
    give the prior distribution for time t
  • Conditioned on the observation at time t via the
    observation model O to give the posterior
    distribution
  • Belief state at time t: σ(t)(si)
8
Monitoring Task II
(Diagram: the same step starting from an
approximate belief state)
  • Belief state at time t: σ(t)(si), approximated by
    σ̃(t)
  • Propagated through the state evolution model T to
    give the prior distribution for time t+1
  • Conditioned on the observation at time t+1 via
    the observation model O to give the posterior
    distribution
  • Belief state at time t+1: σ(t+1)(si)
9
Boyen-Koller Projection
  • Algorithm (a minimal sketch follows after this
    list)
  • Decide on some computationally tractable
    representation for an approximate belief state,
    e.g. one that decomposes into independent
    factors.
  • Propagate the approximate belief state at time t
    through the transition model and condition it on
    the evidence at time t+1.
  • In general, the resulting belief state for time
    t+1 will not fall into the class we have chosen
    to maintain.
  • Approximate it with a belief state that does fall
    into that class, and continue.
  • Assumptions
  • The transition model T must be ergodic for the
    error to be bounded.
  • The approximate belief state must be
    decomposable.
  • Approximation Error
  • Gradual accumulation of approximation errors
  • Spontaneous amplification of error due to
    instability
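The following sketch (my own, not from the slides) shows the propagate-condition-project cycle for the simplest case: a fully factored approximation over two binary state variables. The joint transition and observation models T and O are hypothetical placeholders.

```python
import numpy as np

def bk_step(marginals, T, O, evidence):
    """One Boyen-Koller step with a fully factored belief over two binary variables.

    marginals -- [m1, m2], each a length-2 array; the approximate belief is m1 x m2
    T         -- joint transition model, T[s, s2] = P(X_{t+1} = s2 | X_t = s),
                 where s indexes the 4 joint states (x1, x2)
    O         -- joint observation model, O[s, e] = P(E = e | X = s)
    evidence  -- observed value at time t+1
    """
    # Rebuild the approximate joint from the factored marginals.
    joint = np.outer(marginals[0], marginals[1]).ravel()
    # Exact propagation through T and conditioning on the evidence via O.
    joint = joint @ T
    joint = joint * O[:, evidence]
    joint = joint / joint.sum()
    # Projection: keep only the marginals, discarding the new correlations,
    # so the belief state stays in the chosen (factored) class.
    joint = joint.reshape(2, 2)
    return [joint.sum(axis=1), joint.sum(axis=0)]
```

The projection step is what throws information away; the contraction argument on the following slides is what keeps that loss from accumulating without bound.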

10
Monitoring Task (Revisited)
(Diagram: the approximate belief state σ̃(t) is
propagated through the state evolution model T and
conditioned on the observation at time t+1 via the
observation model O.)
11
Approximation Error
  • Approximation error results from two sources:
  • old error inherited from the previous
    approximation σ̃(t)
  • new error introduced by approximating the exact
    belief state σ(t+1) with σ̃(t+1)
  • Suppose that each approximation introduces an
    error of ε, increasing the distance between the
    exact belief state and our approximation to it.
  • How is the error going to be bounded?

12
Idea of Contraction
  • To ensure that the error stays bounded, T and O
    must reduce the distance between the two belief
    states by a constant factor (a worked bound
    follows below).
  • Distance Measure
  • If φ and ψ are two distributions over the same
    space Ω, the relative entropy of φ relative to ψ
    is D(φ ‖ ψ) = Σ_{ω ∈ Ω} φ(ω) ln( φ(ω) / ψ(ω) ).
(Diagram: the exact belief state σ(t) and the
approximate belief state σ̃(t) are driven toward a
point of convergence; the error between them
shrinks at each step.)
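One way to see why contraction keeps the error bounded (a standard geometric-series argument filled in here, not taken from the slides): if each projection adds at most ε of new error, and each propagation through T and O multiplies the existing error by a factor (1 − γ) for some contraction rate γ ∈ (0, 1], then

```latex
\varepsilon_{t+1} \;\le\; (1-\gamma)\,\varepsilon_t + \varepsilon
\quad\Longrightarrow\quad
\varepsilon_t \;\le\; \varepsilon \sum_{k=0}^{t-1} (1-\gamma)^k \;\le\; \frac{\varepsilon}{\gamma},
```

so the accumulated error never exceeds ε/γ, no matter how long the process is monitored.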
13
Contraction Results
These contraction results show that the
approximate belief state and the true belief
state are driven closer to each other at each
propagation of the stochastic dynamics.
Thus, the BK approximation method never
introduces unbounded error! BK is a stable
approximation method!
14
Rao-Blackwellised Particle Filters
  • Sampling-based inference/learning algorithms for
    Dynamic Bayesian Networks
  • Exploit the structure of a DBN by sampling some
    of the variables and marginalizing out the rest
    in order to increase the efficiency of particle
    filtering
  • Lead to more accurate estimates than standard
    Particle Filters

15
Particle Filtering
  • 1) Resample particles
  • 2) Propagate according to action at+1
  • 3) Reweight according to observation zt+1
    (a minimal sketch of these steps follows below)
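A minimal sketch of the three steps for a discrete state space (my own illustration; the transition and observation matrices, and the use of NumPy, are assumptions rather than anything specified in the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, T, O, evidence):
    """One particle-filtering step: resample, propagate, reweight.

    particles -- sampled state indices, shape (n_particles,)
    weights   -- importance weights, shape (n_particles,)
    T         -- T[s, s2] = P(Xt+1 = s2 | Xt = s), possibly action-dependent
    O         -- O[s, e]  = P(E = e | X = s)
    evidence  -- the observation received at time t+1
    """
    n = len(particles)
    # 1) Resample particles in proportion to their current weights.
    particles = rng.choice(particles, size=n, p=weights / weights.sum())
    # 2) Propagate each particle through the transition model.
    particles = np.array([rng.choice(len(T), p=T[s]) for s in particles])
    # 3) Reweight each particle by the likelihood of the new observation.
    weights = O[particles, evidence]
    return particles, weights
```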

16
Rao-Blackwellization
  • Conditions for using RBPFs
  • The system contains discrete states and discrete
    observations.
  • The system contains variables which can be
    integrated out analytically.
  • Algorithm
  • Decompose dynamic network via factorization.
  • Identify variables whose value can be discerned
    by observation of the system and other variables.
  • Perform particle filtering on these variables and
    compute relevant sufficient statistics.

17
Example of Rao-Blackwellization
  • Consider a system with variables time, position,
    and velocity
  • Realize that, given the current and previous
    position and the time step, velocity can be
    determined
  • Remove velocity from the system model and perform
    particle filtering on position
  • Based on the sampled value for position,
    calculate the distribution for the velocity
    variable (a schematic sketch follows below)
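A schematic sketch of this example (my own; the 1D motion model, the Gaussian noise, and all parameter values are hypothetical assumptions). Each particle carries only a position; its velocity is derived analytically from the sampled positions instead of being sampled, which is the Rao-Blackwellisation:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbpf_step(pos, prev_pos, weights, observed_pos, dt=1.0, noise=0.5):
    """Rao-Blackwellised step: sample position, derive velocity analytically."""
    n = len(pos)
    # Resample particle indices by weight so each position keeps its history.
    idx = rng.choice(n, size=n, p=weights / weights.sum())
    pos, prev_pos = pos[idx], prev_pos[idx]
    # Velocity is determined by the two sampled positions and the time step,
    # so it is computed (marginalised out) rather than sampled.
    velocity = (pos - prev_pos) / dt
    # Sample only the new position, from a noisy constant-velocity motion model.
    new_pos = pos + velocity * dt + rng.normal(0.0, noise, size=n)
    # Reweight by the likelihood of the position observation.
    weights = np.exp(-0.5 * ((new_pos - observed_pos) / noise) ** 2)
    return new_pos, pos, weights, velocity
```

In this toy version the per-particle velocity "distribution" collapses to a point; in general the marginalised variables are tracked with closed-form sufficient statistics (e.g. a Kalman filter) attached to each particle.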

18
Application: Robot Localization I
  • Robot Localization / Map-Building Scenario
  • A robot can move on a discrete, 1D grid.
  • Objective
  • To learn a map of the environment, represented
    as a matrix that stores the color of each grid
    cell.
  • Sources of Stochasticity
  • Imperfect sensors
  • Mechanical faults (motor failure, wheel
    slippage)
  • The robot needs to know where it is to learn the
    map, but needs to know the map to figure out
    where it is.

19
Application: Robot Localization II
  • (Figure panels: A. Exact inference; B. RBPF with
    50 samples; C. Fully factorized BK)
  • Summary of results
  • RBPF is able to provide nearly the same accuracy
    of estimation as exact inference.
  • BK gets confused because it ignores correlations
    between the map cells.

20
Application: BAT Network I
  • Analysis procedures
  • Process is monitored using the BK/sampling method
    with some fixed decomposition or some fixed
    number of particles.
  • Performance is compared with the result derived
    from exact inference.

BAT network used for monitoring freeway traffic
21
BAT Results: BK
  • The errors obtained in practice are significantly
    lower than those predicted by the theoretical
    analysis.
  • The algorithm works best when clusters are chosen
    to correspond to the structure of weakly
    correlated subprocesses.
  • Accuracy can be further improved by using
    conditionally independent approximations.
  • Evidence boosts contraction and significantly
    reduces the overall error of the approximation
    (good for likely evidence, bad for unlikely
    evidence).
  • By exploiting the structure of a process, this
    approach can achieve orders-of-magnitude faster
    inference with only a small degradation in
    accuracy.

22
BAT Results: Sampling
  • Error drops sharply initially and then the
    improvements become smaller and smaller.
  • The average KL-error changes dramatically over
    the sequence. The spikes correspond to unlikely
    evidence, in which case the samples become less
    reliable. This suggests that a more adaptive
    sampling scheme may be advantageous.
  • Particle filtering gives a very good estimate of
    the true distribution over the variable.

23
BK vs. RBPF
  • Boyen-Koller projection
  • Applicable for inference on networks with
    structure that lends easily to decomposition
  • Requires the transition model to be ergodic
  • Rao-Blackwellized particle filters
  • Applicable for inference on networks with
    redundant information
  • Applicable for difficult distributions

24
Contributions
  • Introduced Dynamic Bayesian Networks and
    explained their role in inference of complex
    stochastic processes
  • Examined two different approaches to analyzing
    DBNs
  • Exact computation on an approximate model: BK
  • Approximate computation on an exact model: RBPF
  • Compared performance and applicability of these
    two approaches