Title: Variational Bayesian Image Processing on Stochastic Factor Graphs
1 Variational Bayesian Image Processing on Stochastic Factor Graphs
- Xin Li
- Lane Dept. of CSEE
- West Virginia University
2 Outline
- Statistical modeling of natural images
  - From old-fashioned local models to newly proposed nonlocal models
- Factor graph based image modeling
  - A powerful framework unifying local and nonlocal approaches
- EM-based inference on stochastic factor graphs
- Applications and experimental results
  - Denoising, inpainting, interpolation, post-processing, inverse halftoning, deblurring, ...
3 Cast Signal/Image Processing Under a Bayesian Framework
- Image restoration (Besag et al. 1991)
- Image denoising (Simoncelli & Adelson 1996)
- Interpolation (MacKay 1992) and super-resolution (Schultz & Stevenson 1996)
- Inverse halftoning (Wong 1995)
- Image segmentation (Bouman & Shapiro 1994)
Likelihood: varies from application to application
Image prior: the focus of this talk
x: unobservable data
y: observation data
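The labels above annotate Bayes' rule. The formula itself did not survive in this transcript, so the following is a reconstruction under the standard reading, with x the unobservable image and y the observation:

```latex
\[
P(x \mid y) \;=\; \frac{P(y \mid x)\, P(x)}{P(y)}
\;\propto\;
\underbrace{P(y \mid x)}_{\text{likelihood}}\;
\underbrace{P(x)}_{\text{image prior}}
\]
```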
4 Statistical Modeling of Natural Images: the Pursuit of a Good Prior
- Local models
  - Markov Random Field (MRF) and its extensions (e.g., 2D Kalman filtering, Fields of Experts)
  - Sparsity-based: DCT, wavelets, steerable pyramids, geometric wavelets (edgelets, curvelets, ridgelets, bandelets)
- Nonlocal models
  - Bilateral filtering (Tomasi et al., ICCV 1998)
  - Texture synthesis (Efros & Leung, ICCV 1999)
  - Exemplar-based inpainting (Criminisi et al., TIP 2004)
  - Nonlocal means denoising (Buades et al., CVPR 2005)
  - Total least-squares denoising (Hirakawa & Parks, TIP 2006)
  - Block-matching 3D denoising (Dabov et al., TIP 2007)
5 Introducing a New Language of Factor Graphs
- Why Factor Graphs?
  - The most general form of graphical probability models (both MRFs and Bayesian networks can be converted to FGs)
  - Widely used in computer science and engineering (forward-backward algorithm, Viterbi algorithm, turbo decoding algorithm, Pearl's belief propagation algorithm, Kalman filter [1])
- What is a Factor Graph?
  - A bipartite graph that expresses which variables are arguments of which local functions
  - Factor/function nodes (solid squares) vs. variable nodes (empty circles)
[Figure: factor graph with factor nodes f1, f2, f3, f4 and variable nodes B1 to B8; the mapping L: F → V assigns f1 → {1,2,4}, f2 → {3,6}, f3 → {5,7}, f4 → {7,8}]
[1] F.R. Kschischang, B.J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498-519, Feb. 2001
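As a minimal sketch of the bipartite structure described on this slide, the example mapping (f1: 1,2,4; f2: 3,6; f3: 5,7; f4: 7,8) can be stored as a factor-to-variable adjacency and inverted. The names `factor_to_vars` and `invert` are illustrative and not taken from the talk.

```python
from collections import defaultdict

# Factor -> variable adjacency, copied from the slide's example.
# Variable i stands for patch Bi.
factor_to_vars = {
    "f1": [1, 2, 4],
    "f2": [3, 6],
    "f3": [5, 7],
    "f4": [7, 8],
}


def invert(mapping):
    """Build the variable -> factors view of the same bipartite graph."""
    var_to_factors = defaultdict(list)
    for factor, variables in mapping.items():
        for v in variables:
            var_to_factors[v].append(factor)
    return dict(var_to_factors)


var_to_factors = invert(factor_to_vars)

# A variable that is an argument of more than one factor couples those factors.
shared = {v: fs for v, fs in var_to_factors.items() if len(fs) > 1}
print(shared)  # {7: ['f3', 'f4']}
```

In this example only B7 is shared, so f3 and f4 are the only factors that constrain each other.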
6 Variable Nodes: Image Patches
- Neuroscience: receptive fields of neighboring cells in the human visual system overlap heavily
- Engineering: the patch has appeared under many different names, such as windows in digital filters, blocks in JPEG, and the support of wavelet bases
Cited from D. Hubel, Eye, Brain, and Vision, 1988
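A minimal sketch of cutting overlapping patches (the variable nodes) out of an image, assuming NumPy. The function name `extract_patches`, the patch size, and the stride are illustrative choices, not values from the talk; patches overlap whenever the stride is smaller than the patch size.

```python
import numpy as np


def extract_patches(image, size, stride):
    """Return all size x size patches taken every `stride` pixels,
    together with the top-left corner of each patch."""
    rows, cols = image.shape
    patches, corners = [], []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patches.append(image[r:r + size, c:c + size])
            corners.append((r, c))
    return np.stack(patches), corners


# Toy 6 x 6 image; 4 x 4 patches with stride 2 overlap by two pixels.
image = np.arange(36, dtype=float).reshape(6, 6)
patches, corners = extract_patches(image, size=4, stride=2)
print(patches.shape)  # (4, 4, 4)
print(corners)        # [(0, 0), (0, 2), (2, 0), (2, 2)]
```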
7 Factorization: the Art of Statistical Image Modeling
- ML / Range-Markovian: locally linear embedding [1] (perceptual similarity defines the neighborhood)
- SP / Domain-Markovian: wavelet-based statistical models (geometric proximity defines the neighborhood)
[1] S.T. Roweis and L.K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding," Science 290 (5500), 2323, 22 December 2000
8 Unification Using Factor Graphs
[Figure: three factor graphs over patches B0 to B4 with factor nodes f1 to f4: naive Bayesian (DCT/wavelet-based models), MRF-based, and kNN/k-means clustering (nonlocal image models)]
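To make the kNN variant concrete, here is a minimal sketch of grouping a reference patch with its k most similar patches, assuming NumPy and a squared Euclidean patch distance. The distance measure, the function name `knn_patches`, and the toy data are assumptions for illustration; the slide only names kNN/k-means clustering.

```python
import numpy as np


def knn_patches(patches, ref_index, k):
    """Indices of the k patches closest (squared Euclidean distance)
    to patches[ref_index]; the reference itself is included."""
    flat = patches.reshape(len(patches), -1)
    dists = np.sum((flat - flat[ref_index]) ** 2, axis=1)
    order = np.argsort(dists, kind="stable")
    return order[:k]


rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))

# Hypothetical toy data: three near-copies of `base` hidden among
# unrelated, much larger-amplitude patches.
patches = rng.normal(size=(10, 4, 4)) * 5.0
for i in (2, 5, 7):
    patches[i] = base + 0.01 * rng.normal(size=(4, 4))

neighbours = knn_patches(patches, ref_index=2, k=3)
print(sorted(neighbours.tolist()))  # [2, 5, 7]
```

The returned index set plays the role of one factor node: it names the patches, wherever they sit in the image, that are modeled jointly.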
9 A Manifold Interpretation of Nonlocal Image Prior
[Figure: patches B0, B1, ..., Bk lying on a manifold M ⊂ R^N]
How to maximize the sparsity of a representation?
- Conventional wisdom: adapt basis to signal (e.g., basis pursuit, matching pursuit)
- New proposal: adapt signal to basis (by probing its underlying organization principle)
10 Organizing Principle: Latent Variable L
[Figure: factor nodes fA, fB, fC group patches B11 to B44 under the latent variable L; x and y are linked by P(y|x); labels shown: image denoising, image inpainting, image coding, image halftoning, image deblurring, sparsifying transform]
Nature is not economical of structures but
organizing principles. - Stanislaw M. Ulam
11 Maximum-Likelihood Estimation of Graph Structure L
[Flowchart: loop over every factor node fj: patches B0, B1, ..., Bk → pack into 3D array D → forward transform → coring → inverse transform → unpack into 2D patches; update the estimate of x using P(y|x); update the estimate of L]
For a variational interpretation of this EM-based inference on FGs, see the paper.
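The pack / forward transform / coring / inverse transform / unpack chain for a single factor node can be sketched as follows, assuming NumPy. The slide does not say which transform or coring rule is used, so the separable orthonormal 3D DCT, the hard threshold, and the names `dct_matrix` and `core_group` are assumptions; the updates of x and L are not covered here.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m


def core_group(group, threshold):
    """Forward 3D transform, hard-threshold coring, inverse 3D transform.

    group: array of shape (k, p, p) holding k similar p x p patches,
    i.e. the 3D array D packed for one factor node.
    """
    k, p, _ = group.shape
    tk, tp = dct_matrix(k), dct_matrix(p)
    # Separable 3D DCT: along the patch index, then rows, then columns.
    coeffs = np.einsum("ai,bj,cl,ijl->abc", tk, tp, tp, group)
    # Coring (assumed here to be hard thresholding of small coefficients).
    coeffs[np.abs(coeffs) < threshold] = 0.0
    # The transform is orthonormal, so the inverse uses the transposes.
    return np.einsum("ai,bj,cl,abc->ijl", tk, tp, tp, coeffs)


rng = np.random.default_rng(0)
ramp = np.add.outer(np.arange(4.0), np.arange(4.0))  # smooth 4 x 4 patch
clean = np.repeat(ramp[None, :, :], 8, axis=0)       # 8 identical patches
sigma = 0.1
noisy = clean + sigma * rng.normal(size=clean.shape)

denoised = core_group(noisy, threshold=3.0 * sigma)
err_in = np.sum((noisy - clean) ** 2)
err_out = np.sum((denoised - clean) ** 2)
print(err_in, err_out)
```

Because the grouped patches are similar, the signal concentrates in a few large coefficients while the noise stays spread out, which is why thresholding lowers the squared error in this toy case.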
12 Problem 1: Image Denoising
PSNR (dB) performance comparison among different schemes for 12 test images at σw = 100
SSIM performance comparison among different schemes for 12 test images at σw = 100
[Figure: BM3D (kNN, iter = 2) vs. SFG (k-means, iter = 20); σw axis labels: org., 200, 400, 600, 800, 1000]
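For reference, the PSNR figure of merit used in the comparisons on this and the following slides can be computed as in the sketch below. It assumes 8-bit images with a peak value of 255, which the slides do not state, and the function name `psnr` is illustrative.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    images given as nested lists of pixel values."""
    diffs = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


reference = [[100, 100], [100, 100]]
estimate = [[110, 90], [100, 100]]
print(round(psnr(reference, estimate), 2))  # 31.14
```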
13 Problem 2: Image Recovery
[Figure: x, y, and recovery results of DCT, FoE, EXP, BM3D, LSP, and SFG; top-down: test1, test3, test5]
[Figure: top-down: test2, test4, test6]
PSNR (dB) performance comparison
SSIM performance comparison
Local models: DCT, FoE, and LSP
Nonlocal models: EXP, BM3D [1], and SFG
[1] Our own extension into image recovery
14 Problem 3: Resolution Enhancement
[Figure: x, y, and results of FG, bicubic, and NEDI [1], with the PSNR of each result]
31.76 dB  32.36 dB  32.63 dB
34.71 dB  34.45 dB  37.35 dB
28.70 dB  27.34 dB  28.19 dB
18.81 dB  15.37 dB  16.45 dB
[1] X. Li and M. Orchard, "New edge directed interpolation," IEEE TIP, 2001
15 Problem 4: Irregular Interpolation
[Figure: x, y, and results of KR, FG [1], and DT, with the PSNR of each result; label: 25 kept]
29.06 dB  31.56 dB  34.96 dB
28.46 dB  31.16 dB  36.51 dB
26.04 dB  24.63 dB  29.91 dB
17.90 dB  18.49 dB  29.25 dB
DT: Delaunay triangle-based (griddata under MATLAB)
KR: kernel regression-based (Takeda et al., IEEE TIP 2007, w/o parameter optimization)
[1] X. Li, "Patch-based image interpolation: algorithms and applications," Inter. Workshop on Local and Non-Local Approximation (LNLA), 2008
16 Problem 5: Post-processing
SFG-enhanced at rate of 0.32 bpp (PSNR = 33.22 dB)
JPEG-decoded at rate of 0.32 bpp (PSNR = 32.07 dB)
SPIHT-decoded at rate of 0.20 bpp (PSNR = 26.18 dB)
SFG-enhanced at rate of 0.20 bpp (PSNR = 27.33 dB)
Maximum-Likelihood (ML) Decoding
Maximum a Posteriori (MAP) Decoding
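The two decoding labels differ only in whether an image prior enters the estimate. Under the standard definitions, with x the image and y the decoder's observation (notation assumed from the Bayesian framing earlier in the talk):

```latex
\[
\hat{x}_{\mathrm{ML}} = \arg\max_{x} P(y \mid x),
\qquad
\hat{x}_{\mathrm{MAP}} = \arg\max_{x} P(y \mid x)\, P(x)
\]
```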
17 Problem 6: Inverse Halftoning
without nonlocal prior [1] (PSNR = 31.84 dB, SSIM = 0.8390)
with nonlocal prior (PSNR = 32.82 dB, SSIM = 0.8515)
[1] Available from the Image Halftoning Toolbox released by UT-Austin researchers
18 Conclusions and Perspectives
- Despite the rich structures in natural images, the underlying organization principle is simple (self-similarity)
- We have shown how similarity can lead to sparsity in a nonlinear representation of images
- FG represents only one mathematical language for interpreting this principle (multifractal formalism is another)
- Image processing (low-level vision) could benefit from data clustering (higher-level vision): how does the human visual cortex learn to decode the latent variable L through unsupervised learning?
Reproducible Research: MATLAB codes accompanying this work are available at http://www.csee.wvu.edu/xinl/sfg.html (more will be added)