Title: Inspiration from Information Theory
1. Inspiration from Information Theory
- The Shannon challenge
- Record-breaking progress: Turbo, LDPC
- Distributed inference in factor graphs
- The sum-product algorithm
- Connections to control and some thoughts
Material from McEliece, Richardson and many other sources
2. A communication channel
3. Shannon's channel capacity theorem (1948)
- There exist codes that achieve arbitrarily reliable transmission at any rate R < C.
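Restating the theorem in formulas, assuming the standard discrete memoryless channel setting (which the slide itself does not spell out):

```latex
% Channel capacity: the maximum mutual information between input X and
% output Y over all input distributions p(x):
C = \max_{p(x)} I(X;Y)
% Shannon (1948): for any rate R < C there exist codes with arbitrarily
% small error probability; for R > C, reliable transmission is impossible.
```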
5. Decades of slow progress
- 1948-1993: Algebraic codes (Reed-Muller, BCH, Reed-Solomon etc.), convolutional codes, concatenated codes.
- Almost all codes are good, except the ones we know of. Efficient decoding is the bottleneck.
- Early 90s: convolutional coding with the Viterbi algorithm for decoding.
- By 1992, within 3 dB of the Shannon limit.
6. Record-breaking codes (1993-)
- Turbo codes
- Rediscovery of Gallager's Low Density Parity Check codes (1962)
- Iterative decoding by heuristic message-passing algorithms
- Within tenths of a dB of the Shannon limit!
7. Turbo and LDPC performance
Well done! What's next?
8. Performance
9. So how is it done?
- And what inspiration can we get for our field?
- Iterative methods that use feedback ideas, distributed computing, message passing.
- No guarantee of optimality or of convergence to Maximum Likelihood estimates.
- Some analysis is possible anyway.
- The Sum-Product algorithm (special cases: Viterbi, BCJR, the Kalman filter, belief propagation in Bayesian networks, ...)
10. Lesson learned
14. Graph representations of codes
16. Idea: Use graphs to define a sparse parity check matrix
- Low Density Parity Check (LDPC) codes
- Easy to decode iteratively
- World record performance
- The most astonishing thing about them is perhaps that they were invented in 1962 (Gallager) but hardly studied for 30 years (some exceptions: Tanner, Wiberg, ...). After they were rediscovered they became very popular, even attracting venture capital money to create new companies.
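The slide gives no matrix, so as a hedged illustration (the matrix and function name below are mine), here is a tiny parity check matrix and the membership test it defines. A real LDPC H is much larger and sparser, defined by a random bipartite (Tanner) graph:

```python
import numpy as np

# Toy parity check matrix (illustrative only; rows = checks, columns =
# variables, an edge in the Tanner graph per nonzero entry).
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)

def is_codeword(x, H):
    """x is a codeword iff every parity check holds: H x = 0 (mod 2)."""
    return not np.any(H @ x % 2)

print(is_codeword(np.array([1, 0, 1, 1, 0, 1, 0]), H))  # True
print(is_codeword(np.array([1, 0, 1, 1, 0, 1, 1]), H))  # False: one bit flipped
```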
26. Exact inference on cycle-free graphs
27. Factor graphs: the sum-product algorithm
Say we want to calculate a marginal function, and let us assume the global function f has a local (factored) representation. We can then organize the calculation efficiently, as sketched below.
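The slide's formulas were images and did not survive extraction; below is the standard form of this computation (as in Kschischang et al. from the reading list), with an illustrative three-variable factorization:

```latex
% Marginal of x_1 obtained by summing the global function f over the rest:
g_1(x_1) = \sum_{x_2} \sum_{x_3} f(x_1, x_2, x_3)
% If f has the local (factored) representation
f(x_1, x_2, x_3) = f_A(x_1, x_2)\, f_B(x_2, x_3),
% the distributive law replaces one big sum with nested small ones:
g_1(x_1) = \sum_{x_2} f_A(x_1, x_2) \sum_{x_3} f_B(x_2, x_3)
```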
28. Factor graphs
29. Simultaneous marginalization
Bipartite graph: variable nodes and factor nodes.
30. The sum-product algorithm
Variable node n:
q(m,n,x) = Prob(x_n = x | all constraints except m)
Constraint node m:
r(m,n,x) = Prob(constraint m ok | x_n = x)
31. Send q-messages and update r
Variable node n sends q(m,n,x) = Prob(x_n = x | all constraints except m).
Constraint node m updates
r(m,n,0) = sum( prod( q(m,n',x_n') ) ), where the sum runs over all configurations of the x_n' for which constraint m is satisfied when x_n = 0;
r(m,n,1) = the same, but with x_n = 1.
As before, r(m,n,x) = Prob(constraint m ok | x_n = x).
32. Send r-messages and update q
Constraint node m sends r(m,n,x) = Prob(constraint m ok | x_n = x).
Variable node n updates
q(m,n,0) = p(n,0) * prod( r(m',n,0) ),
q(m,n,1) = p(n,1) * prod( r(m',n,1) ),
where the product is taken over all m' except m, and p(n,x) is the a priori probability that x_n = x.
q(m,n,0) and q(m,n,1) are then normalized so that they sum to 1.
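A minimal sketch of these update rules in code, for the toy parity check matrix from slide 16. All names are mine, and this is the plain probability-domain form of the updates; practical decoders use log-likelihood ratios for numerical stability:

```python
import numpy as np
from itertools import product

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)
M, N = H.shape
nbr = [np.flatnonzero(H[m]) for m in range(M)]     # variables in check m
chk = [np.flatnonzero(H[:, n]) for n in range(N)]  # checks touching var n

def decode(p1, iters=20):
    """p1[n] = a priori Prob(x_n = 1); returns hard-decision bits."""
    p = np.stack([1.0 - p1, p1])                           # p[x, n]
    q = np.broadcast_to(p[:, None, :], (2, M, N)).copy()   # q[x, m, n]
    r = np.ones((2, M, N))                                 # r[x, m, n]
    for _ in range(iters):
        for m in range(M):                                 # r-update (slide 31)
            for n in nbr[m]:
                others = [n2 for n2 in nbr[m] if n2 != n]
                for x in (0, 1):
                    # sum over configurations of the other variables that
                    # satisfy check m (even parity) given x_n = x
                    r[x, m, n] = sum(
                        np.prod([q[b, m, n2] for b, n2 in zip(cfg, others)])
                        for cfg in product((0, 1), repeat=len(others))
                        if (sum(cfg) + x) % 2 == 0)
        for n in range(N):                                 # q-update (slide 32)
            for m in chk[n]:
                for x in (0, 1):
                    q[x, m, n] = p[x, n] * np.prod(
                        [r[x, m2, n] for m2 in chk[n] if m2 != m])
                q[:, m, n] /= q[0, m, n] + q[1, m, n]      # normalize
    # final beliefs combine the prior with r-messages from *all* checks
    b = np.array([[p[x, n] * np.prod(r[x, chk[n], n]) for n in range(N)]
                  for x in (0, 1)])
    return np.argmax(b, axis=0)

# noisy observation of the codeword [1,0,1,1,0,1,0]; bit 3 is uncertain
p1 = np.array([0.9, 0.1, 0.9, 0.6, 0.1, 0.9, 0.1])
print(decode(p1))   # -> [1 0 1 1 0 1 0]
```

Note the design choice: the r-update here enumerates configurations exactly as the slide defines it; real decoders replace the enumeration with Gallager's closed-form identity for parity checks.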
33. The sum-product algorithm
- On cycle-free graphs the algorithm converges in a finite number of steps, giving the correct a posteriori probabilities for all variables.
- All calculations are local.
- On graphs with cycles (such as the decoding problems of Turbo and LDPC codes), convergence is not guaranteed (NP-complete problems lurk nearby).
- For coding applications the method seems to work well, giving excellent performance.
34. Linear Gaussian models
35. Kalman filtering and smoothing
36. Tools
37. Tools
38. Dualization
39. Dualization
40. Kalman filtering and smoothing
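Slides 34-40 are image-only; as a hedged illustration of the earlier claim (slide 9) that the Kalman filter is a special case of the sum-product algorithm, here is a minimal scalar sketch. The model and all names below are mine: on a linear Gaussian state-space chain the factor graph is cycle-free, the sum-product messages are Gaussian, and propagating their means and variances forward is exactly Kalman filtering.

```python
import numpy as np

# Linear Gaussian chain: x_{k+1} = a x_k + w,  y_k = c x_k + v.
# Forward sum-product messages stay Gaussian, so each message is just a
# (mean, variance) pair; updating them gives the Kalman filter.
def kalman_filter(y, a=0.9, c=1.0, q=0.1, r=0.5, m0=0.0, p0=1.0):
    m, p = m0, p0
    means = []
    for yk in y:
        # time update: message through the factor N(x_k; a x_{k-1}, q)
        m, p = a * m, a * a * p + q
        # measurement update: multiply in the factor N(y_k; c x_k, r)
        k = p * c / (c * c * p + r)        # Kalman gain
        m = m + k * (yk - c * m)
        p = (1 - k * c) * p
        means.append(m)
    return np.array(means)

rng = np.random.default_rng(0)
x, xs = 0.0, []
for _ in range(50):
    x = 0.9 * x + rng.normal(scale=np.sqrt(0.1))
    xs.append(x)
y = np.array(xs) + rng.normal(scale=np.sqrt(0.5), size=50)
est = kalman_filter(y)
print(np.mean((est - np.array(xs)) ** 2))  # well below the raw noise 0.5
```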
41. Some observations
- The concept of a Lyapunov function does not seem to be used in the area.
- There is work trying to connect the fundamental limitations of feedback (Bode etc.) with Shannon channel capacity. It has been noted that the standard channel capacity concept should then be extended (I(X;Y) vs I(X→Y)); see the formulas below and Causality, Feedback and Directed Information, Massey.
- The feedback capacity of a channel with memory can be computed using the sum-product algorithm.
- It looks like there could be more connections.
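For the I(X;Y) vs I(X→Y) distinction, Massey's directed information replaces the symmetric mutual information with a causally conditioned sum (restated here from the cited paper for completeness):

```latex
% Ordinary mutual information over n channel uses is symmetric in X and Y:
I(X^n; Y^n) = H(Y^n) - H(Y^n \mid X^n)
% Massey's directed information credits Y_i only to inputs up to time i,
% which is the quantity that matters when there is feedback:
I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1})
```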
42. Recommended reading
- An Introduction to Factor Graphs, Hans-Andrea Loeliger, IEEE Signal Processing Magazine, Jan 2004
- Factor Graphs and the Sum-Product Algorithm, Kschischang, Frey, Loeliger, IEEE Trans. on Information Theory, pp. 498-519, Feb 2001
- Sum-product Algorithm and Feedback Capacity, Yang and Kavcic
- Information Theory, Inference and Learning Algorithms, David MacKay, available at www.inference.phy.cam.ac.uk/mackay, chapters 26, 47-50
- Fundamental Limitations in the Presence of Finite Capacity Networks, Martins and Dahleh, ACC 2005
- Causality, Feedback and Directed Information, Massey
43. Homework Problems
- Design a Sudoku solver, using belief propagation on a suitable constraint graph.
- Use the concept of Lyapunov functions to describe/extend the density evolution analysis by Richardson and Urbanke.
- Design a near-optimal distributed LQG control scheme using message passing.