Title: Variational Inference for the Indian Buffet Process
1. Variational Inference for the Indian Buffet Process
- Finale Doshi-Velez, Kurt T. Miller, Jurgen Van Gael and Yee Whye Teh - AISTATS 2009
- Presented by John Paisley, Duke University, Dept. of ECE
2. Introduction
- This paper provides variational inference equations for the stick-breaking construction of the Indian buffet process (IBP). In addition, bounds are given on how closely a truncated stick-breaking approximation matches the infinite stick-breaking IBP.
- Outline of Presentation
- Review of the IBP and its stick-breaking construction
- Variational inference for the IBP
- Truncation error bounds for variational inference
- Results on a linear-Gaussian model for toy and real data
3. Indian Buffet Process
- The first customer selects Poisson(α) dishes (features).
- The ith customer selects a previously sampled dish k with probability m_k / i, the fraction of all previous customers who selected that dish (m_k counts those customers).
- The ith customer then selects Poisson(α / i) new dishes; a minimal simulation sketch follows this list.
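As a concrete illustration of this generative process (an illustrative sketch, not code from the paper; sample_ibp is an invented name), a minimal NumPy simulation might look like:

import numpy as np

def sample_ibp(N, alpha, seed=0):
    # Sample a binary feature matrix Z from the Indian buffet process.
    rng = np.random.default_rng(seed)
    counts = []                # counts[k] = number of customers who chose dish k
    rows = []                  # one list of dish indicators per customer
    for i in range(1, N + 1):
        # Choose each existing dish k with probability m_k / i.
        row = [rng.random() < m_k / i for m_k in counts]
        counts = [m + int(z) for m, z in zip(counts, row)]
        # Then sample Poisson(alpha / i) brand-new dishes.
        k_new = rng.poisson(alpha / i)
        counts.extend([1] * k_new)
        row.extend([True] * k_new)
        rows.append(row)
    K = len(counts)            # pad earlier customers' rows with zeros
    return np.array([r + [False] * (K - len(r)) for r in rows], dtype=int)

Z = sample_ibp(N=10, alpha=2.0)

The function returns an N x K binary matrix Z whose rows are customers and whose columns are dishes, following the buffet dynamics above.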
Below is the probability of the binary matrix Z. The top term is the probability of the K sampled dishes; the bottom term corrects for permutations of equivalent dish orderings.
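The standard expression, reproduced from Griffiths and Ghahramani's IBP derivation as a best-effort reconstruction of the slide's equation, is

P(Z) = \frac{\alpha^{K_+}}{\prod_{h=1}^{2^N - 1} K_h!} \exp(-\alpha H_N) \prod_{k=1}^{K_+} \frac{(N - m_k)! \, (m_k - 1)!}{N!}, \qquad H_N = \sum_{n=1}^{N} \frac{1}{n},

where K_+ is the number of sampled dishes, m_k is the number of customers who chose dish k, and K_h counts dishes with identical binary history h, the factor accounting for equivalent orderings.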
4. The Stick-Breaking Construction of the IBP
- Rather than marginalizing out π_k, the probability of selecting dish k, a stick-breaking construction can be used.
- (Note: the generative process above is written by the presenter. In the paper, the probability values are presented in decreasing order, as below.)
- This stick-breaking representation holds for this specific parameterization of the beta distribution.
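Concretely, in the notation of Teh, Görür and Ghahramani, the construction is

v_k \sim \mathrm{Beta}(\alpha, 1) \;\text{i.i.d.}, \qquad \pi_k = \prod_{j=1}^{k} v_j = v_k \pi_{k-1}, \qquad z_{nk} \sim \mathrm{Bernoulli}(\pi_k),

so the stick lengths decrease monotonically, \pi_1 \ge \pi_2 \ge \dots, with \mathbb{E}[\pi_k] = (\alpha / (1 + \alpha))^k.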
Y.W. Teh, D. Görür and Z. Ghahramani (2007). Stick-breaking construction for the Indian buffet process. AISTATS.
5. VB Inference for the Stick-Breaking Construction
Focus on inference for the stick-breaking variables v and the binary matrix Z. A lower-bound approximation needs to be made for one of the terms. This is given at right, where the authors introduce a multinomial distribution, q, and optimize over this parameter (lower right). This handles the likelihood of z; the posterior of v is more complicated. Using the multinomial lower bound, the terms decompose independently for each v_m, and we get a closed-form exponential-family update.
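A hedged reconstruction of that bound: the troublesome term is \mathbb{E}[\ln(1 - \prod_{m=1}^{k} v_m)], which appears whenever z_{nk} = 0 and has no closed form under Beta variational posteriors on the v_m. Writing

1 - \prod_{m=1}^{k} v_m = \sum_{y=1}^{k} (1 - v_y) \prod_{m=1}^{y-1} v_m

and applying Jensen's inequality with an auxiliary multinomial q(y) on \{1, \dots, k\} gives

\mathbb{E}\Big[\ln\Big(1 - \prod_{m=1}^{k} v_m\Big)\Big] \ge \sum_{y=1}^{k} q_y \Big( \mathbb{E}[\ln(1 - v_y)] + \sum_{m=1}^{y-1} \mathbb{E}[\ln v_m] - \ln q_y \Big),

which is tightest at q_y \propto \exp\big( \mathbb{E}[\ln(1 - v_y)] + \sum_{m<y} \mathbb{E}[\ln v_m] \big); the expectations are digamma functions of the Beta parameters.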
6. Truncation Error for VB Inference
Given a truncation of the stick-breaking construction at level K, how close are we to the infinite model? A bound is given using the same motivation as Ishwaran and James used in their calculation for the Dirichlet process.
H. Ishwaran and L.F. James (2001). Gibbs sampling methods for stick-breaking priors. JASA.
After deriving approximations, an upper bound is obtained; a hedged reconstruction follows below. At right is a comparison of this bound with an estimate of the true truncation error from 1000 Monte Carlo simulations, for N = 30 and α = 5.
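One simple version of such a bound (a union-bound sketch consistent with, but possibly looser than, the paper's exact derivation) uses \mathbb{E}[\pi_k] = (\alpha/(1+\alpha))^k:

\frac{1}{4} \int \big| m_K(X) - m_\infty(X) \big| \, dX \le \Pr(\exists\, n \le N, \, k > K : z_{nk} = 1) \le N \sum_{k=K+1}^{\infty} \mathbb{E}[\pi_k] = N \alpha \Big( \frac{\alpha}{1+\alpha} \Big)^{K},

where m_K and m_\infty are the marginal densities of the data under the truncated and infinite models. The Monte Carlo comparison can be reproduced with a short sketch (illustrative code, not from the paper; truncation_error_mc and K_max are invented names):

import numpy as np

def truncation_error_mc(N=30, alpha=5.0, K=50, K_max=2000, n_sims=1000, seed=0):
    # Estimate Pr(some customer uses a feature beyond level K) by sampling sticks.
    rng = np.random.default_rng(seed)
    est = 0.0
    for _ in range(n_sims):
        v = rng.beta(alpha, 1.0, size=K_max)   # v_j ~ Beta(alpha, 1)
        pi = np.cumprod(v)                     # pi_k = prod_{j<=k} v_j
        # Given the sticks, Pr(no one uses any feature k > K) = prod_k (1 - pi_k)^N.
        est += 1.0 - np.exp(N * np.log1p(-pi[K:]).sum())
    return est / n_sims

print(truncation_error_mc())                   # compare with N*alpha*(alpha/(1+alpha))**K

K_max merely truncates the infinite product at a level where the remaining stick mass is negligible.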
7. Results: Synthetic Data
(lower left) Data were randomly generated, and log-likelihoods of held-out test data under the inferred models were computed as a function of run time. This indicates that variational inference is both more accurate and faster here. (right) Additional timing information for the toy problem.
8. Results: Two Real Datasets
- Yale Faces: 721 images (32 x 32 pixels) of 14 people with different expressions and lighting.
- Speech data: 245 observations from 10 microphones recording 5 speakers.
- At right, we can see that the variational inference methods outperform and are faster than Gibbs sampling on the Yale Faces data.
- Performance and speed are worse on the speech dataset. One reason is that this dataset is only 10-dimensional, while Yale Faces is 1024-dimensional. In this low-dimensional case, MCMC inference is fast and the cost of the VB approximation becomes apparent.