Title: Information Content
1. Information Content
Tristan L'Ecuyer
2. Historical Perspective
Claude Shannon (1948), A Mathematical Theory of Communication, Bell System Technical Journal, 27, pp. 379-423 and 623-656.
- Information theory has its roots in telecommunications, specifically in the engineering problem of transmitting signals over noisy channels.
- Papers in 1924 and 1928 by Harry Nyquist and Ralph Hartley, respectively, introduced the notion of information as a measurable quantity representing the ability of a receiver to distinguish different sequences of symbols.
- The formal theory begins with Shannon (1948), the first to establish the connection between information content and entropy.
- Since this seminal work, information theory has grown into a broad and deep mathematical field with applications in data communication, data compression, error correction, and cryptographic algorithms (codes and ciphers).
3. Link to Remote Sensing
- Shannon (1948): "The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."
- Similarly, the fundamental goal of remote sensing is to use measurements to reproduce a set of geophysical parameters, the "message", that are defined or selected in the atmosphere at the remote point of observation (e.g. a satellite).
- Information theory makes it possible to examine the capacity of transmission channels (usually in bits), accounting for noise, signal gaps, and other forms of signal degradation.
- Likewise, in remote sensing we can use information theory to examine the capacity of a combination of measurements to convey information about the geophysical parameters of interest, accounting for noise due to measurement error and model error.
4. Corrupting the Message: Noise and Non-uniqueness
[Figure: fits of the same data with linear, quadratic, and cubic models]
- Measurement and model error, as well as the character of the forward model, all introduce non-uniqueness in the solution.
5. Forward Model Errors (Δy)
[Diagram: the forward problem and the inverse problem, with errors in inversion arising from influence parameters, measurement error, uncertainty in influence parameters, and forward model errors]
- Uncertainty due to unknown influence parameters that impact forward model calculations but are not directly retrieved often represents the largest source of retrieval error.
- Errors in these parameters introduce non-uniqueness in the solution space by broadening the effective measurement PDF.
6. Error Propagation in Inversion
[Figure: bi-variate PDF of (simulated) observed measurements; its width is dictated by measurement error and uncertainty in forward model assumptions]
7. Visible Ice Cloud Retrievals
- The Nakajima and King (1990) technique is based on a conservative-scattering visible channel for optical depth and an absorbing near-IR channel for reff.
- Influence parameters are crystal habit, particle size distribution, and surface albedo.
[Figure: due to these assumptions, retrieved τ spans 16-50 and reff spans 9-21]
8. CloudSat Snowfall Retrievals
- Snowfall retrievals relate reflectivity, Z, to snowfall rate, S.
- This relationship depends on snow crystal shape, density, size distribution, and fall speed.
- Since few, if any, of these factors can be retrieved from reflectivity alone, they all broaden the Z-S relationship and lead to uncertainty in the retrieved snowfall rate.
9. Impacts of Crystal Shape (2-7 dBZ)
10. Impacts of PSD (3-6 dBZ)
11. Implications for Retrieval
- Given a perfect forward model, 1 dB measurement errors lead to errors in retrieved snowfall rate of less than 10%.
12. Quantitative Retrieval Metrics
- Four useful metrics for assessing how well formulated a retrieval problem is:
- Sx, the error covariance matrix, provides a useful diagnostic of retrieval performance, measuring the uncertainty in the products.
- A, the averaging kernel, describes, among other things, the amount of information that comes from the measurements as opposed to a priori information.
- Degrees of freedom
- Information content
- All require accurate specification of uncertainties in all inputs, including errors due to forward model assumptions, measurements, and any mathematical approximations required to map geophysical parameters into measurement space.
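These four diagnostics can be sketched numerically for a linear Gaussian problem. The Jacobian K and the covariances Sa and Sy below are illustrative placeholders, not values from the lecture:

```python
# Sketch: the four retrieval diagnostics for an assumed linear Gaussian problem.
# K, Sa, and Sy are illustrative, not taken from any real instrument.
import numpy as np

K = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.8, 0.3]])           # Jacobian: 3 measurements x 2 parameters
Sa = np.diag([1.0, 1.0])             # a priori covariance
Sy = np.diag([0.1, 0.1, 0.1])        # measurement + forward-model error covariance

# Posterior error covariance: Sx = (K^T Sy^-1 K + Sa^-1)^-1
Sx = np.linalg.inv(K.T @ np.linalg.inv(Sy) @ K + np.linalg.inv(Sa))

# Averaging kernel: A = Sx K^T Sy^-1 K
A = Sx @ K.T @ np.linalg.inv(Sy) @ K

ds = np.trace(A)                     # degrees of freedom for signal
H = 0.5 * np.log2(np.linalg.det(Sa) / np.linalg.det(Sx))  # information content, bits

print(f"ds = {ds:.3f}, H = {H:.2f} bits")
```

Note that ds approaches the number of retrieved parameters (here 2) as the measurements become more informative, and H grows as the posterior uncertainty volume shrinks relative to the prior.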
13. Degrees of Freedom
Clive Rodgers (2000), Inverse Methods for Atmospheric Sounding: Theory and Practice, World Scientific, 238 pp.
- The cost function can be used to define two very useful measures of the quality of a retrieval: the number of degrees of freedom for signal and noise, denoted ds and dn, respectively, where Sa is the covariance matrix describing the prior state space and K represents the Jacobian of the measurements with respect to the parameters of interest.
- ds specifies the number of observations that are actually used to constrain retrieval parameters, while dn is the corresponding number that are lost due to noise.
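In terms of the Sa, Sy, and K just defined, the standard Rodgers (2000) expressions (a reconstruction; the slide's own equations are not reproduced here) are:

```latex
d_s = \mathrm{tr}\!\left[\mathbf{K}\mathbf{S}_a\mathbf{K}^T
      \left(\mathbf{K}\mathbf{S}_a\mathbf{K}^T + \mathbf{S}_y\right)^{-1}\right],
\qquad
d_n = \mathrm{tr}\!\left[\mathbf{S}_y
      \left(\mathbf{K}\mathbf{S}_a\mathbf{K}^T + \mathbf{S}_y\right)^{-1}\right]
```

The two traces sum to m, the number of measurements, so every measurement is accounted for either as signal or as noise.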
14. Degrees of Freedom
- Using the expression for the state vector that minimizes the cost function, it is relatively straightforward to show that ds and dn can be written in terms of the averaging kernel, where Im is the m x m identity matrix and A is the averaging kernel.
- NOTE: Even if the number of retrieval parameters is equal to or less than the number of measurements, a retrieval can still be under-constrained if noise and redundancy are such that the number of degrees of freedom for signal is less than the number of parameters to be retrieved.
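The result referred to above, in standard Rodgers (2000) notation (a reconstruction, with G the gain matrix of the optimal estimation solution):

```latex
d_s = \mathrm{tr}(\mathbf{A}), \qquad
d_n = \mathrm{tr}\!\left(\mathbf{I}_m - \mathbf{K}\mathbf{G}\right),
\qquad \mathbf{A} = \mathbf{G}\mathbf{K}, \quad
\mathbf{G} = \mathbf{S}_x\mathbf{K}^T\mathbf{S}_y^{-1}
```

Since tr(GK) = tr(KG), these satisfy d_s + d_n = m, consistent with the definitions on the previous slide.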
15. Entropy-based Information Content
- The Gibbs entropy is the logarithm of the number of discrete internal states of a thermodynamic system, where pi is the probability of the system being in state i and k is the Boltzmann constant.
- The information theory analogue has k = 1, with the pi representing the probabilities of all possible combinations of retrieval parameters.
- More generally, the entropy can be defined for a continuous distribution (e.g. Gaussian).
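Written out, with the pi and k defined above (standard forms; base-2 logarithms give entropy in bits):

```latex
S_{\mathrm{Gibbs}} = -k \sum_i p_i \ln p_i
\qquad\longrightarrow\qquad
S = -\sum_i p_i \log_2 p_i
\qquad\longrightarrow\qquad
S = -\int p(x)\,\log_2 p(x)\,dx
```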
16. Entropy of a Gaussian Distribution
- For the Gaussian distributions typically used in optimal estimation, the entropy follows directly from the continuous definition.
- For an m-variable Gaussian distribution, the entropy depends on the determinant of the covariance matrix.
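In standard form (a reconstruction consistent with the continuous entropy definition), for a single Gaussian of variance σ² and for m variables with covariance matrix S:

```latex
S = \tfrac{1}{2}\log_2\!\left(2\pi e\,\sigma^2\right),
\qquad
S = \tfrac{m}{2}\log_2(2\pi e) + \tfrac{1}{2}\log_2\left|\mathbf{S}\right|
```

Only the covariance term differs between two such distributions, which is why entropy differences reduce to ratios of covariance determinants.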
17. Information Content of a Retrieval
- The information content of an observing system is defined as the difference in entropy between an a priori set of possible solutions, S(P1), and the subset of these solutions that also satisfies the measurements, S(P2).
- If Gaussian distributions are assumed for the prior and posterior state spaces, as in the optimal estimation approach, this difference can be written in terms of the prior and posterior covariance matrices, since, after minimizing the cost function, the covariance of the posterior state space is the retrieval error covariance Sx.
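Under the Gaussian assumption, the entropy difference reduces to (a reconstruction in the notation of the preceding slides):

```latex
H = S(P_1) - S(P_2)
  = \tfrac{1}{2}\log_2\left|\mathbf{S}_a\right| - \tfrac{1}{2}\log_2\left|\mathbf{S}_x\right|
  = \tfrac{1}{2}\log_2\left|\mathbf{S}_a\mathbf{S}_x^{-1}\right|,
\qquad
\mathbf{S}_x = \left(\mathbf{K}^T\mathbf{S}_y^{-1}\mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1}
```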
18. Interpretation
- Qualitatively, information content describes the factor by which knowledge of a quantity is improved by making a measurement.
- Using Gaussian statistics, we see that the information content provides a measure of how much the "volume" of uncertainty represented by the a priori state space is reduced after measurements are made.
- Essentially, this is a generalization of the scalar concept of signal-to-noise ratio.
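The scalar case makes the signal-to-noise connection explicit. For a single measurement y = kx + ε with measurement noise σy and prior uncertainty σa (an illustrative worked case, not from the slides):

```latex
\sigma_x^{-2} = \frac{k^2}{\sigma_y^2} + \frac{1}{\sigma_a^2}
\quad\Longrightarrow\quad
H = \tfrac{1}{2}\log_2\!\frac{\sigma_a^2}{\sigma_x^2}
  = \tfrac{1}{2}\log_2\!\left(1 + \frac{k^2\sigma_a^2}{\sigma_y^2}\right)
```

H depends only on the ratio kσa/σy, i.e. the strength of the signal admitted by the prior relative to the measurement noise.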
19. Measuring Stick Analogy
- Information content measures the resolution of the observing system for resolving solution space.
- Analogous to the divisions on a measuring stick, the higher the information content, the finer the scale that can be resolved.
[Figure: a measuring stick spanning the full range of a priori solutions; the biggest scale, 2 divisions, corresponds to H = 1]
20. Liquid Cloud Retrievals
- Blue: a priori state space
- Green: state space that also matches the MODIS visible channel (0.64 µm)
- Red: state space that matches both the 0.64 and 2.13 µm channels
- Yellow: state space that matches all 17 MODIS channels
21. Snowfall Retrieval Revisited
- With a 140 GHz brightness temperature accurate to 5 K as a constraint, the range of solutions is significantly narrowed, by up to a factor of 4, implying an information content of 2 bits.
22. Return to Polynomial Functions
True parameters: X1 = X2 = 2; a priori: X1a = X2a = 1.

σy = 10, σa = 100:
Order, N | X1    | X2    | Error (%) | ds    | H
1        | 1.984 | 1.988 | 18        | 1.933 | 1.45
2        | 1.996 | 1.998 | 9         | 1.985 | 2.19
5        | 1.999 | 2.000 | 3         | 1.998 | 3.16

σy = 25, σa = 100:
Order, N | X1    | X2    | Error (%) | ds    | H
1        | 1.909 | 1.929 | 41        | 1.659 | 0.65
2        | 1.976 | 1.986 | 21        | 1.911 | 1.29
5        | 1.996 | 1.998 | 8         | 1.987 | 2.25

σy = 10, σa = 10:
Order, N | X1    | X2    | Error (%) | ds    | H
1        | 1.401 | 1.432 | 8         | 0.568 | 0.07
2        | 1.682 | 1.771 | 7         | 1.099 | 0.21
5        | 1.927 | 1.976 | 3         | 1.784 | 0.83
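The qualitative behaviour in the tables above, ds and H falling as measurement noise grows relative to the prior, can be reproduced with a small sketch. The quadratic forward model and noise values below are assumed for illustration, not the lecture's exact configuration:

```python
# Illustrative sketch: degrees of freedom for signal (ds) and information
# content (H, bits) for a quadratic retrieval y = X1*t + X2*t^2, showing how
# both degrade as measurement noise sigma_y grows. Values are assumed.
import numpy as np

t = np.linspace(0.0, 1.0, 10)
K = np.column_stack([t, t**2])       # Jacobian of the quadratic forward model

def diagnostics(sigma_y, sigma_a):
    """Return (ds, H) for uncorrelated noise sigma_y and prior width sigma_a."""
    Sy_inv = np.eye(len(t)) / sigma_y**2
    Sa = np.eye(2) * sigma_a**2
    Sx = np.linalg.inv(K.T @ Sy_inv @ K + np.linalg.inv(Sa))
    A = Sx @ K.T @ Sy_inv @ K        # averaging kernel
    ds = np.trace(A)
    H = 0.5 * np.log2(np.linalg.det(Sa) / np.linalg.det(Sx))
    return ds, H

for sigma_y in (0.1, 0.25):
    ds, H = diagnostics(sigma_y, sigma_a=1.0)
    print(f"sigma_y = {sigma_y}: ds = {ds:.3f}, H = {H:.2f} bits")
```

Tightening the prior (smaller sigma_a) has the same effect as raising the noise: the measurements add less on top of what is already known, so H drops.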
23. Application: MODIS Cloud Retrievals
- L'Ecuyer et al. (2006), J. Appl. Meteor., 45, 20-41.
- Cooper et al. (2006), J. Appl. Meteor., 45, 42-62.
- The concept of information content provides a useful tool for analyzing the properties of observing systems within the constraints of realistic error assumptions.
- As an example, consider the problem of assessing the information content of the channels on the MODIS instrument for retrieving cloud microphysical properties.
- Application of information theory requires:
  - Characterizing the expected uncertainty in modeled radiances due to assumed temperature, humidity, ice crystal shape/density, particle size distribution, etc. (i.e. evaluating Sy)
  - Determining the sensitivity of each radiance to the microphysical properties of interest (i.e. computing K)
  - Establishing error bounds provided by any available a priori information (e.g. cloud height from CloudSat)
  - Evaluating diagnostics such as Sx, A, ds, and H
24. Error Analyses
- Fractional errors reveal a strong scene-dependence that varies from channel to channel.
- LW channels are typically better at lower optical depths, while SW channels improve at higher values.
25. Sensitivity Analyses
[Figure panels: sensitivities at 0.646 µm, 2.130 µm, and 11.00 µm]
- The sensitivity matrices also illustrate a strong scene dependence that varies from channel to channel.
- The SW channels have the best sensitivity to number concentration in optically thick clouds and to effective radius in thin clouds.
- LW channels exhibit the most sensitivity to cloud height for thick clouds and to number concentration for clouds with optical depths between 0.5-4.
26. Information Content
- Information content is related to the ratio of the sensitivity to the uncertainty, i.e. the signal-to-noise ratio.
27. The Importance of Uncertainties
- Rigorous specification of forward model
uncertainties is critical for an accurate
assessment of the information content of any set
of measurements.
28. The Role of A Priori
- Information content measures the amount by which the state space is reduced relative to prior information.
- As prior information improves, the information content of the measurements decreases.
- The presence of cloud height information from CloudSat, for example, constrains the a priori state space and reduces the information content of the MODIS observations.