Title: Module 4: Information, Entropy, Thermodynamics, and Computing
1. Module 4: Information, Entropy, Thermodynamics, and Computing
2. Information and Entropy
- A unified foundation for both physics and computation
3. What is information?
- Information, most generally, is simply that which distinguishes one thing from another.
  - It is the identity of a thing itself, or part or all of an identification or description of the thing.
- But we must take care to distinguish between the following:
  - A body of info.: The complete identity of a thing.
  - A piece of info.: An incomplete description; part of a thing's identity.
  - A pattern of info.: Many separate pieces of information contained in separate things may have identical patterns, or content.
    - Those pieces are perfectly correlated; we may call them copies of each other.
  - An amount of info.: A quantification of how large a given body or piece of information is. Measured in logarithmic units.
  - A subject of info.: The thing that is identified or described by a given body or piece of information. May be abstract, mathematical, or physical.
  - An embodiment of info.: A physical subject of information.
    - We can say that a body, piece, or pattern of information is contained or embodied in its embodiment.
  - A meaning of info.: A semantic interpretation, tying the pattern to meaningful/useful characteristics or properties of the thing described.
  - A representation of info.: An encoding of some information within some other (possibly larger) piece of info contained in something else.
4. Information Concept Map
[Concept-map figure relating: a pattern of information, quantified by an amount of information; a piece of information and a body of information (instances of a pattern, quantified by an amount, possibly represented by another piece or body of information); a piece describes / is contained in a thing; a body completely describes / is embodied by a physical thing, which is a kind of thing.]
5. What is knowledge?
- A physical entity A can be said to know a piece (or body) of information I about a thing T, and that piece is considered part of A's knowledge K, if and only if:
  - A has ready, immediate access to a physical system S that contains some physical information P which A can observe and that includes a representation of I,
    - e.g., S may be part of A's brain, wallet card, or laptop;
  - A can readily and immediately decode P's representation of I and manipulate it into an explicit form; and
  - A understands the meaning of I (how it relates to meaningful properties of T) and can apply I purposefully.
6. Physical Information
- Physical information is simply information that is contained in a physical system.
- We may speak of a body, piece, pattern, amount, subject, embodiment, meaning, or representation of physical information, as with information in general.
- Note that all information that we can manipulate ultimately must be (or be represented by) physical information!
- In our quantum-mechanical universe, there are two very different categories of physical information:
  - Quantum information is all the info. embodied in the quantum state of a physical system. It can't all be measured or copied!
  - Classical information is just a piece of info. that picks out a particular basis state, once a basis is already given.
7. Amount of Information
- An amount of information can be conveniently quantified as a logarithmic quantity.
- This measures the number of independent, fixed-capacity physical systems needed to encode the information.
- Logarithmically defined values are inherently dimensional (not dimensionless, i.e. pure-number) quantities.
- The pure-number result must be paired with a unit which is associated with the base of the logarithm that was used:
  log a = (log_b a) log-b-units = (log_c a) log-c-units, where 1 log-c-unit / 1 log-b-unit = log_b c.
- The log-2-unit is called the bit, the log-10-unit the decade, the log-16-unit the nibble, and the log-256-unit the byte.
- Whereas the log-e-unit (widely used in physics) is called the nat.
  - The nat is also known as Boltzmann's constant k_B (e.g. in Joules/K),
  - a.k.a. the ideal gas constant R (may be expressed in kcal/mol/K).
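As a quick illustration of the unit conversions above, here is a minimal Python sketch; the unit table and helper names are my own choices, not part of the original slides:

```python
import math

# Sizes of the logarithmic units from the slide, keyed by the base of the log:
# an amount of n log-b-units equals n * log2(b) bits.
UNIT_BASE = {"bit": 2, "nat": math.e, "decade": 10, "nibble": 16, "byte": 256}

def amount_of_info(n_states, unit="bit"):
    """Unknown information content of n equally likely states, in the given unit."""
    return math.log(n_states, UNIT_BASE[unit])

def convert(amount, from_unit, to_unit):
    """Convert an amount of information between logarithmic units."""
    # 1 from_unit = log_{to_base}(from_base) to_units
    return amount * math.log(UNIT_BASE[from_unit], UNIT_BASE[to_unit])

print(amount_of_info(256, "bit"))      # 8.0 bits
print(amount_of_info(256, "byte"))     # 1.0 byte
print(convert(1, "bit", "nat"))        # ln 2 = ~0.693 nats
print(convert(1, "byte", "nibble"))    # 2.0 nibbles
```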
8. Defining Logarithmic Units
9. Forms of Information
- Many alternative mathematical forms may be used to represent patterns of information about a thing.
- Some important examples we will visit:
  - A string of text describing some or all properties or characteristics possessed by the thing.
  - A set or ensemble of alternative possible states (or consistent, complete descriptions) of the thing.
  - A probability distribution or probability density function over a set of possible states of the thing.
  - A quantum state vector, i.e., wavefunction, giving a complex-valued amplitude for each possible quantum state of the thing.
  - A mixed state (a probability distribution over orthogonal states).
  - (Some string theorists suggest octonions may be needed!)
10. Confusing Terminology Alert
- Be aware that in the following discussion I will often shift around quickly, as needed, between the following related concepts:
  - A subsystem B of a given abstract system A.
  - A state space S of all possible states of B.
  - A state variable X (a statistical random variable) of A representing the state of subsystem B within its state space S.
  - A set T ⊆ S of some of the possible states of B.
  - A statistical event E that the subsystem state is one of those in the set T.
  - A specific state s ∈ S of the subsystem.
  - A value x = s of the random variable X indicating that the specific state is s.
11. Preview: Some Symbology
- U: Unknown information
- K: Known information
- I: total Information (of any kind)
- N: iNcompressible and/or Non-uncomputable iNformation
- S: physical Entropy
12. Unknown Info. Content of a Set
- A.k.a. the amount of unknown information content.
- The amount of information required to specify or pick out an element of the set, assuming that its members are all equally likely to be selected.
  - An assumption we will see how to justify later.
- The unknown information content U(S) associated with a set S is defined as U(S) ≡ log |S|.
- Since U(S) is defined logarithmically, it always comes with attached logarithmic units such as bits, nats, decades, etc.
- E.g., the set {a, b, c, d} has an unknown information content of 2 bits.
13. Probability and Improbability
- I assume you already know a bit about probability theory!
- Given any probability P ∈ (0,1], the associated improbability I(P) is defined as I(P) ≡ 1/P.
  - There is a "1 in I(P)" chance of an event occurring which has probability P.
  - E.g., a probability of 0.01 implies an improbability of 100, i.e., a 1-in-100 chance of the event.
- We can naturally extend this to also define the improbability I(E) of an event E having probability P(E), by I(E) ≡ I(P(E)).
14. Information Gain from an Event
- We define the information gain G_I(E) from an event E having improbability I(E) as G_I(E) ≡ log I(E) = log 1/P(E) = -log P(E).
- Why? Consider the following argument:
  - Imagine picking event E from a set S which has |S| = I(E) equally-likely members.
  - Then E's improbability of being picked is I(E),
  - while the unknown information content of S was U(S) = log |S| = log I(E).
  - Thus, log I(E) unknown information must have become known when we found out that E was actually picked.
15. Unknown Information Content (Entropy) of a Probability Distribution
- Given a probability distribution P: S → [0,1], define the unknown information content of P as the expected information gain over all the singleton events E = {s}, for s ∈ S:
  U(P) ≡ E_s[G_I({s})] = -Σ_{s∈S} P(s) log P(s).  (Note the minus sign.)
- It is therefore the average information needed to pick out a single element.
- This formula for the entropy of a probability distribution was known to the thermodynamicists Boltzmann and Gibbs in the 1800s!
  - Claude Shannon rediscovered/rederived it many decades later.
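A minimal Python sketch of these two definitions (function names and example distributions are my own, for illustration only):

```python
import math

def info_gain(p, base=2):
    """Information gained from an event of probability p: log(1/p) = -log p."""
    return -math.log(p, base)

def entropy(dist, base=2):
    """Unknown information content U(P): expected info gain over singleton events."""
    return sum(p * info_gain(p, base) for p in dist.values() if p > 0)

# Four equally likely states: U = 2 bits, matching log2 |{a, b, c, d}|.
print(entropy({"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}))   # 2.0

# A biased distribution has less unknown information.
print(entropy({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))  # 1.75
```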
16. Visualizing Boltzmann-Gibbs-Shannon Entropy
17. Information Content of a Physical System
- The (total amount of) information content I(A) of an abstract physical system A is the unknown information content of the mathematical object D used to define A.
  - If D is (or implies) only a set S of (assumed equiprobable) states, then we have I(A) = U(S) = log |S|.
  - If D implies a probability distribution P over a set S (of distinguishable states), then I(A) = U(P) = -Σ_i P_i log P_i.
- We would expect to gain I(A) information if we measured A (using basis set S) to find its exact actual state s ∈ S.
  - ⇒ we say that amount I(A) of information is contained in A.
- Note that the information content depends on how broad (how abstract) the system's description D is!
18. Information Capacity = Entropy
- The information capacity of a system is also the amount of information about the actual state of the system that we do not know, given only the system's definition.
- It is the amount of physical information that we can say is in the state of the system.
- It is the amount of uncertainty we have about the state of the system, if we know only the system's definition.
- It is also the quantity that is traditionally known as the (maximum) entropy S of the system.
  - Entropy was originally defined as the ratio of heat to temperature.
  - The importance of this quantity in thermodynamics (the observed fact that it never decreases) was first noticed by Rudolf Clausius in 1850.
- Today we know that entropy is, physically, really nothing other than (unknown, incompressible) information!
19. Known vs. Unknown Information
- We, as modelers, define what we mean by "the system" in question using some abstract description D.
- This implies some information content I(A) for the abstract system A described by D.
- But we will often wish to model a scenario in which some entity E (perhaps ourselves) has more knowledge about the system A than is implied by its definition.
  - E.g., scenarios in which E has prepared A more specifically, or has measured some of its properties.
- Such an E will generally have a more specific description of A, and thus would quote a lower resulting I(A) or entropy.
- We can capture this by distinguishing the information in A that is known by E from that which is unknown.
- Let us now see how to do this a little more formally.
20. Subsystems (More Generally)
- For a system A defined by a state set S,
  - any partition P of S into subsets can be considered a subsystem B of A.
  - The subsets in the partition P can be considered the states of the subsystem B.
[Figure: the state set S overlaid with two partitions, "one subsystem of A" and "another subsystem of A." In this example, the product of the two partitions forms a partition of S into singleton sets; we say that this is a complete set of subsystems of A. In this example, the two subsystems are also independent.]
21. Pieces of Information
- For an abstract system A defined by a state set S, any subset T ⊆ S is a possible piece of information about A.
  - Namely, it is the information "The actual state of A is some member of this set T."
- For an abstract system A defined by a probability distribution P over S, any probability distribution P' over S such that P(s) = 0 ⇒ P'(s) = 0 and U(P') < U(P) is another possible piece of information about A.
  - That is, any distribution that is consistent with, and more informative than, A's very definition.
22. Known Physical Information
- Within any universe (closed physical system) W described by distribution P, we say entity E (a subsystem of W) knows a piece P of the physical information contained in system A (another subsystem of W) iff P implies a correlation between the state of E and the state of A, and this correlation is meaningfully accessible to E.
- Let us now see how to make this definition more precise.
[Figure: the universe W containing the entity (knower) E and the physical system A, with a correlation between them.]
23. What is a correlation, anyway?
- A concept from statistics:
- Two abstract systems A and B are correlated or interdependent when the entropy of the combined system, S(AB), is less than S(A) + S(B).
- I.e., something is known about the combined state of AB that cannot be represented as knowledge about the state of either A or B by itself.
- E.g., A and B each have 2 possible states, {0, 1}:
  - They each have 1 bit of entropy.
  - But we might also know that A = B, so the entropy of AB is 1 bit, not 2. (States 00 and 11.)
24. Marginal Probability
- Given a joint probability distribution P_XY over a sample space S that is a Cartesian product S = X × Y, we define the projection of P_XY onto X, or the marginal probability of X (under the distribution P_XY), written P_X, as P_X(x ∈ X) ≡ Σ_{y∈Y} P_XY(x, y).
- Similarly define the marginal probability of Y.
- We may often just write P(x) or P_x to mean P_X(x).
25. Conditional Probability
- Given a distribution P_XY over X × Y, we define the conditional probability of X given Y (under P_XY), written P_X|Y, as the relative probability of x∧y versus y. That is, P_X|Y(x, y) ≡ P(x∧y)/P(y) = P_XY(x, y) / P_Y(y), and similarly for P_Y|X.
- We may also write P(x|y), or P_y(x), or even just P_x|y to mean P_X|Y(x, y).
- Bayes' rule is the observation that, with this definition, P_x|y = P_y|x P_x / P_y.
26. Mutual Probability
- Given a distribution P_XY over X × Y as above, the mutual probability ratio R_XY(x, y), or just R_xy, is defined as R_xy ≡ P_xy / (P_x P_y).
- It represents the factor by which the probability of either outcome (X = x or Y = y) gets boosted when we learn the other.
- Notice that R_xy = P_x|y / P_x = P_y|x / P_y; that is, it is the relative probability of x|y versus x, or of y|x versus y.
- If the two variables represent independent subsystems, then the mutual probability ratio is always 1.
  - No change in one distribution from measuring the other.
- WARNING: Some authors define something they call "mutual probability" as the reciprocal of the definition given here.
  - This seems somewhat inappropriate, given the name.
  - In my definition, if the mutual probability ratio is greater than 1, then the probability of x increases when we learn y.
  - In theirs, the opposite is true.
  - The traditional definition should perhaps instead be called the mutual improbability ratio: R_I,xy ≡ I_xy / (I_x I_y) = P_x P_y / P_xy.
27. Marginal, Conditional, & Mutual Entropies
- For each of the derived probabilities defined previously, we can define a corresponding informational quantity:
  - Joint probability P_XY → joint entropy S(X,Y) ≡ S(P_XY)
  - Marginal probability P_X → marginal entropy S(X) ≡ S(P_X)
  - Conditional probability P_X|Y → conditional entropy S(X|Y) ≡ E_y[S(P_y(X))]
  - Mutual probability ratio R_XY → mutual information I(X:Y) ≡ E_{x,y}[log R_xy]
    - The expected reduction in entropy of X from finding out Y (see the sketch below).
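The following Python sketch computes these quantities for a small joint distribution P_XY; the example probabilities are my own, chosen only for illustration, and the conditional entropy is computed via the equivalent identity S(X|Y) = S(X,Y) - S(Y):

```python
import math
from collections import defaultdict

def H(probs, base=2):
    """Shannon entropy of an iterable of probabilities."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A joint distribution P_XY over X = {0,1}, Y = {0,1} (illustrative values).
P_XY = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginals: P_X(x) = sum_y P_XY(x, y), and similarly for Y.
P_X, P_Y = defaultdict(float), defaultdict(float)
for (x, y), p in P_XY.items():
    P_X[x] += p
    P_Y[y] += p

S_XY = H(P_XY.values())                  # joint entropy S(X,Y)
S_X, S_Y = H(P_X.values()), H(P_Y.values())
S_X_given_Y = S_XY - S_Y                 # conditional entropy S(X|Y)
I_XY = S_X + S_Y - S_XY                  # mutual information I(X:Y)

print(S_X, S_Y, S_XY)       # 1.0 1.0 ~1.722
print(S_X_given_Y, I_XY)    # ~0.722 ~0.278
```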
28. More on Mutual Information
- Demonstration that the reduction in entropy of one variable, given the other, is the same as the expected log mutual probability ratio, E[log R_xy].
29. Known Information, More Formally
- For a system defined by probability distribution P that includes two subsystems A, B with respective state variables X, Y having mutual information I_P(X:Y):
  - The total information content of B is I(B) = U(P_Y).
  - The amount of information in B that is known by A is K_A(B) ≡ I_P(X:Y).
  - The amount of information in B that is unknown by A is U_A(B) ≡ U(P_Y) - K_A(B) = S(Y) - I(X:Y) = S(Y|X).
  - The amount of entropy in B, from A's perspective, is S_A(B) ≡ U_A(B) = S(Y|X).
- These definitions are based on all the correlations that are present between A and B according to our global knowledge P.
  - However, a real entity A may not know, understand, or be able to utilize all the correlations that are actually present between him and B.
  - Therefore, generally more of B's physical information will be effectively entropy, from A's perspective, than is implied by this definition.
  - We will explore some corrections to this definition later.
- Later, we will also see how to sensibly extend this definition to the quantum context.
30. Maximum Entropy vs. Entropy
- Total information content I = maximum entropy S_max = logarithm of the number of states consistent with the system's definition.
- Unknown information U_A = entropy S_A (as seen by observer A).
- Known information K_A = I - U_A = S_max - S_A (as seen by observer A).
- Unknown information U_B = entropy S_B (as seen by observer B).
31. A Simple Example
- A spin is a type of simple quantum system having only 2 distinguishable states.
  - In the z basis, the basis states are called "up" (↑) and "down" (↓).
- In the example to the right, we have a compound system composed of 3 spins.
  - ⇒ it has 8 distinguishable states.
- Suppose we know that the 4 crossed-out states have 0 amplitude (0 probability),
  - due to prior preparation or measurement of the system.
- Then the system contains:
  - one bit of known information (in spin 2),
  - and two bits of entropy (in spins 1 & 3).
32. Entropy, as Seen from the Inside
- One problem with our previous definition of knowledge-dependent entropy, based on mutual information, is that it is only well-defined for an ensemble or probability distribution of observer states, not for a single observer state.
  - However, as observers, we always find ourselves in a particular state, not in an ensemble!
- Can we obtain an alternative definition of entropy that works for (and can be used by) observers who are in individual states also?
  - While still obeying the 2nd law of thermodynamics?
- Zurek proposed that entropy S should be defined to include not only unknown information U, but also incompressible information N.
  - By definition, incompressible information (even if it is known) cannot be reduced; therefore the validity of the 2nd law can be maintained.
- Zurek proposed using a quantity called Kolmogorov complexity to measure the amount of incompressible information.
  - The size of the shortest program that computes the information; intractable to find!
- However, we can instead use effective (practical) incompressibility, from the point of view of a particular observer, to yield a definition of the effective entropy for that observer, for all practical purposes.
33. Two Views of Entropy
- Global view: A probability distribution, from outside, over the observer+observee system leads to the expected entropy of B as seen by A, and to the total system entropy.
- Local view: The entropy of B according to A's specific knowledge of it, plus the incompressible size of A's representation of that knowledge, yields the total entropy associated with B, from A's perspective.
[Figure, global view: entity (knower) A and the physical system B; the joint distribution P_AB → total entropy S(A,B); conditional entropy S(B|A) = expected entropy of B, from A's perspective; mutual information I(A:B).]
[Figure, local view: entity (knower) A and physical system B; a single actual distribution P_B over the states of B gives U(P_B), the amount of unknown info in B from A's perspective, plus N_B, the amount of incompressible info about B represented within A; S_A(B) = U(P_B) + N_B.]
34. Example Comparing the Two Views
- Example:
  - Suppose object B contains 1,000 randomly-generated bits of information. (Initial entropy S_B = 1,000 b.)
  - Suppose observer A reversibly measures and stores (within itself) a copy of one-fourth (250 b) of the information in B.
- Global view:
  - The total information content of B is I(B) = 1,000 b.
  - The mutual information I(A:B) = 250 b. (Shared by both systems.)
  - B's entropy conditioned on A: S(B|A) = I(B) - I(A:B) = 750 b.
  - Total entropy of the joint distribution: S(A,B) = 1,000 b.
- Local view:
  - A's specific new distribution over B implies entropy S(P_B) = 750 b of unknown info.
  - A also contains 250 b of known but incompressible information about B.
  - There is a total of S_A(B) = 750 b + 250 b = 1,000 b of unknown or incompressible information (entropy) still in the combined system.
  - 750 b of this info is only in B, whereas 250 b of it is shared between A & B.
[Figure: observer A and system B; 750 b unknown by A, 250 b known by A, 250 b of incompressible information re B.]
35. Objective Entropy?
- In all of this, we have defined entropy as a somewhat subjective or relative quantity:
  - The entropy of a subsystem depends on an observer's state of knowledge about that subsystem, such as a probability distribution.
- Wait a minute: Doesn't physics have a more objective, observer-independent definition of entropy?
  - Only insofar as there are preferred states of knowledge that are most readily achieved in the lab.
  - E.g., knowing of a gas only its chemical composition, temperature, pressure, volume, and number of molecules.
  - Since such knowledge is practically difficult to improve upon using present-day macroscale tools, it serves as a uniform standard.
- However, in nanoscale systems, a significant fraction of the physical information that is present in one subsystem is subject to being known, or not, by another subsystem (depending on design).
- ⇒ How a nanosystem is designed, and how we deal with information recorded at the nanoscale, may vastly affect how much of the nanosystem's internal physical information effectively is or is not entropy (for practical purposes).
36. Conservation of Information
- Theorem: The total physical information capacity (maximum entropy) of any closed, constant-volume physical system (with a fixed definition) is unchanging in time.
- This follows from quantum calculations yielding definite, fixed numbers of distinguishable states for all systems of a given size and total energy.
  - We will learn about these bounds later.
- Before we can do this, let us first see how to properly define entropy for quantum systems.
37. Some Categories of Information
- Relative to any given entity, we can make the following distinctions (among others).
- A particular piece of information may be:
  - Known vs. unknown
    - Known information vs. entropy
  - Accessible vs. inaccessible
  - Measurable vs. unmeasurable
  - Controllable vs. uncontrollable
  - Stable vs. unstable
    - Against degradation to entropy
  - Correlated vs. uncorrelated
    - Also, the fact of the correlation can be known or unknown
    - The details of the correlation can be known or unknown
    - The details can be easy or difficult to discover
  - Wanted vs. unwanted
    - Entropy is usually unwanted
      - Except when you're chilly!
    - Information may often be unwanted, too
      - E.g., if it's in the way, and not useful
- A particular pattern of information may be:
  - Standard vs. nonstandard
    - With respect to some given coding convention
  - Compressible vs. incompressible
    - Either absolutely, or effectively
    - Zurek's definition of entropy: unknown or incompressible info.
- We will be using these various distinctions throughout the later material.
38. Quantum Information
- Generalizing classical information theory concepts to fit quantum reality
39. Density Operators
- For any given state |ψ⟩, the probabilities of all the basis states s_i are determined by a Hermitian operator or matrix ρ (called the density matrix).
  - Note that the diagonal elements ρ_{i,i} are just the probabilities of the basis states i.
  - The off-diagonal elements are called "coherences."
    - They describe the entanglements that exist between basis states.
- The density matrix describes the state |ψ⟩ exactly!
  - It (redundantly) expresses all of the quantum info. in |ψ⟩.
40. Mixed States
- Suppose the only thing one knows about the true state of a system is that it is chosen from a statistical ensemble or mixture of state vectors v_i (called pure states), each with a derived density matrix ρ_i and a probability P_i.
- In such a situation, in which one's knowledge about the true state is expressed as a probability distribution over pure states, we say the system is in a mixed state.
- Such a situation turns out to be completely described, for all physical purposes, by simply the expectation value (weighted average) of the v_i's density matrices: ρ = Σ_i P_i ρ_i.
- Note: Even if there were uncountably many v_i going into the calculation, the situation remains fully described by O(n²) complex numbers, where n is the number of basis states!
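A small NumPy sketch of forming a mixed-state density matrix as the weighted average of pure-state density matrices; the particular pure states and weights below are illustrative choices of mine:

```python
import numpy as np

# Two illustrative pure states of a single qubit (column vectors).
v0 = np.array([1, 0], dtype=complex)                 # |0>
vp = np.array([1, 1], dtype=complex) / np.sqrt(2)    # (|0> + |1>)/sqrt(2)

# Their pure-state density matrices rho_i = |v_i><v_i|.
rho0 = np.outer(v0, v0.conj())
rhop = np.outer(vp, vp.conj())

# A mixture: |0> with probability 0.5, |+> with probability 0.5.
P = [0.5, 0.5]
rho = P[0] * rho0 + P[1] * rhop        # expectation value of the rho_i

print(np.allclose(rho, rho.conj().T))  # Hermitian: True
print(np.trace(rho).real)              # probabilities sum to 1: 1.0
print(rho.round(3))                    # diagonal = basis-state probabilities
```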
41. Von Neumann Entropy
- Suppose our probability distribution over states comes from the diagonal of a density matrix ρ.
- But we will generally also have additional information about the state hidden in the coherences,
  - the off-diagonal elements of the density matrix.
- The Shannon entropy of the distribution along the diagonal will generally depend on the basis used to index the matrix.
- However, any density matrix can be (unitarily) rotated into another basis in which it is perfectly diagonal!
  - This means all its off-diagonal elements are zero.
- The Shannon entropy of the diagonal distribution is always minimized in the diagonal basis, and so this minimum is selected as being the true basis-independent entropy of the mixed quantum state ρ.
  - It is called the von Neumann entropy.
42. V.N. Entropy, More Formally
- S(ρ) ≡ -Tr(ρ log ρ): taking the log base 2 gives the entropy in bits (the Shannon S); taking the natural log, in units of k_B, gives the physical entropy (the Boltzmann S).
- The trace Tr M just means the sum of M's diagonal elements.
- The ln of a matrix M just denotes the inverse function to e^M. (See the logm function in Matlab.)
- The exponential e^M of a matrix M is defined via the Taylor-series expansion Σ_{i≥0} M^i / i!.
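A sketch of the von Neumann entropy in NumPy. Instead of a matrix logarithm, it uses the equivalent eigenvalue form S(ρ) = -Σ_i λ_i log λ_i, i.e. the Shannon entropy of the distribution in the diagonalizing basis:

```python
import numpy as np

def von_neumann_entropy(rho, base=2):
    """S(rho) = -Tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)      # rho is Hermitian
    evals = evals[evals > 1e-12]         # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

# A pure state has zero entropy; the maximally mixed qubit state has 1 bit.
pure = np.array([[1, 0], [0, 0]], dtype=complex)
mixed = np.eye(2) / 2
print(von_neumann_entropy(pure))     # 0.0
print(von_neumann_entropy(mixed))    # 1.0

# A state with off-diagonal coherences: its diagonal alone suggests 1 bit,
# but in its own diagonal basis it is pure, so the true entropy is 0.
coherent = np.full((2, 2), 0.5, dtype=complex)
print(von_neumann_entropy(coherent)) # 0.0
```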
43. Quantum Information & Subsystems
- A density matrix for a particular subsystem may be obtained by "tracing out" the other subsystems.
  - Meaning: summing over the state indices for all subsystems not selected.
- This process discards information about any quantum correlations that may be present between the subsystems!
  - The entropies of the density matrices so obtained will generally sum to more than that of the original system. (Even if the original state was pure!)
- Keeping this in mind, we may make these definitions:
  - The unconditioned or marginal quantum entropy S(A) of subsystem A is the entropy of the reduced density matrix ρ_A.
  - The conditional quantum entropy S(A|B) ≡ S(A,B) - S(B).
    - Note this may be negative! (In contrast to the classical case.)
  - The quantum mutual information I(A:B) ≡ S(A) + S(B) - S(A,B).
    - As in the classical case, this measures the amount of quantum information that is shared between the subsystems.
    - Each subsystem "knows" this much information about the other.
44. Tensors and Index Notation
- A tensor is nothing but a generalized matrix that may have more than one row and/or column index. It can also be defined recursively as a matrix of tensors.
- Tensor signature: An (r,c) tensor has r row indices and c column indices.
  - Convention: row indices are shown as subscripts, and column indices as superscripts.
- Tensor product: An (l,k) tensor T times an (n,m) tensor U is an (l+n, k+m) tensor V formed from all products of an element of T times an element of U.
- Tensor trace: The trace of an (r,c) tensor T with respect to index k (where 1 ≤ k ≤ min(r,c)) is given by contracting (summing over) the kth row index together with the kth column index.
  - (The sum ranges over I, the set of legal values of the indices r_k and c_k.)
- Example: a (2,2) tensor T in which all 4 indices take on values from the set {0,1}. A small numerical sketch of these operations follows below.
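Here is a brief NumPy illustration of the tensor product and a tensor trace, expressed with einsum index notation; the shapes and values are arbitrary and chosen only for illustration:

```python
import numpy as np

# Two (1,1) tensors (ordinary matrices): one row index, one column index each.
T = np.arange(4).reshape(2, 2)
U = np.arange(4, 8).reshape(2, 2)

# Tensor product: a (1,1) tensor times a (1,1) tensor gives a (2,2) tensor,
# V[r1, r2, c1, c2] = T[r1, c1] * U[r2, c2].
V = np.einsum('ac,bd->abcd', T, U)
print(V.shape)   # (2, 2, 2, 2)

# Trace with respect to index 1: contract the 1st row index with the 1st
# column index, leaving a (1,1) tensor.
W = np.einsum('abad->bd', V)
print(W)                    # equals trace(T) * U in this case
print(np.trace(T) * U)
```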
45. Quantum Information Example
- Consider the state v_AB = (|00⟩ + |11⟩)/√2 of the compound system AB (basis states |00⟩, |01⟩, |10⟩, |11⟩).
- Let ρ_AB = |v⟩⟨v|.
- Note that the reduced density matrices ρ_A & ρ_B are fully classical.
- Let's look at the quantum entropies:
  - The joint entropy S(A,B) = S(ρ_AB) = 0 bits. (Because v_AB is a pure state.)
  - The unconditioned entropy of subsystem A is S(A) = S(ρ_A) = 1 bit.
  - The entropy of A conditioned on B is S(A|B) = S(A,B) - S(B) = -1 bit!
  - The mutual information between them is I(A:B) = S(A) + S(B) - S(A,B) = 2 bits!
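The NumPy sketch below reproduces these numbers: it builds ρ_AB for the state above, traces out each subsystem (here done with an einsum-based partial trace, one of several equivalent implementations), and evaluates the quantum entropies:

```python
import numpy as np

def S(rho, base=2):
    """Von Neumann entropy, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

# v_AB = (|00> + |11>)/sqrt(2), basis order |00>, |01>, |10>, |11>.
v = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_AB = np.outer(v, v.conj())

# Reduced density matrices, by tracing out the other subsystem.
rho4 = rho_AB.reshape(2, 2, 2, 2)       # indices: a, b, a', b'
rho_A = np.einsum('abcb->ac', rho4)     # trace out B
rho_B = np.einsum('abac->bc', rho4)     # trace out A

S_AB, S_A, S_B = S(rho_AB), S(rho_A), S(rho_B)
print(S_AB)                # 0.0   (the joint state is pure)
print(S_A, S_B)            # 1.0 1.0  (each half looks maximally mixed)
print(S_AB - S_B)          # -1.0  conditional entropy S(A|B)
print(S_A + S_B - S_AB)    # 2.0   mutual information I(A:B)
```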
46. Quantum vs. Classical Mutual Info.
- 2 classical bit-systems have a mutual information of at most one bit.
  - This occurs if they are perfectly correlated, e.g., states {00, 11}.
  - Each bit considered by itself appears to have 1 bit of entropy.
  - But taken together, there is really only 1 bit of entropy shared between them.
  - A measurement of either extracts that one bit of entropy,
    - leaving it in the form of 1 bit of incompressible information (to the measurer).
  - The real joint entropy is 1 bit less than the apparent total entropy.
  - Thus, the mutual information is 1 bit.
- 2 quantum bit-systems (qubits) can have a mutual info. of two bits!
  - This occurs in maximally entangled states, such as (|00⟩ + |11⟩)/√2.
  - Again, each qubit considered by itself appears to have 1 bit of entropy.
  - But taken together, there is no entropy in this pure state.
  - A measurement of either qubit leaves us with no entropy, rather than 1 bit!
    - If done right; see the next slide.
  - The real joint entropy is thus 2 bits less than the apparent total entropy.
  - Thus the mutual information is (by definition) 2 bits.
  - Both of the apparent bits of entropy vanish if either qubit is measured.
  - This is used in a communication technique called quantum superdense coding:
    - 1 qubit's worth of prior entanglement between two parties can be used to pass 2 bits of classical information between them using only 1 qubit!
47. Why the Difference?
- Entity A hasn't yet measured B and C, which (A knows) are initially correlated with each other, quantumly or classically.
- A has measured B and is now correlated with both B and C.
- A can use his new knowledge to uncompute (compress away) the bits from both B and C, restoring them to a standard state.
[Figure: the sequence of configurations of A, B, C in the classical case and in the quantum case.]
- Quantum case: Knowing he is in state |0⟩ + |1⟩, A can unitarily rotate himself back to state |0⟩. Look ma, no entropy!
- Classical case: A, being in a mixed state, still holds a bit of information that is either unknown (external view) or incompressible (A's internal view), and thus is entropy, and can never go away (by the 2nd law of thermo.).
48. Thermodynamics and Computing
49. Proving the 2nd Law of Thermodynamics
- Closed systems evolve via unitary transforms U_{t1→t2}.
- Unitary transforms just change the basis, so they do not change the system's true (von Neumann) entropy.
- ⇒ Theorem: Entropy is constant in all closed systems undergoing an exactly-known unitary evolution. (See the numerical check below.)
- However, if U_{t1→t2} is ever at all uncertain, or we disregard some of our information about the state, we get a mixture of possible resulting states, whose effective entropy is provably at least as great.
- ⇒ Theorem (2nd law of thermodynamics): Entropy may increase but never decreases in closed systems.
  - It can increase if the system undergoes interactions whose details are not completely known, or if the observer discards some of his knowledge.
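As a quick numerical illustration of the first theorem (not a proof): applying an arbitrary unitary to a density matrix leaves its von Neumann entropy unchanged. The random state and unitary below are my own illustrative choices.

```python
import numpy as np

def S(rho, base=2):
    """Von Neumann entropy from eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)) / np.log(base))

rng = np.random.default_rng(0)

# An arbitrary mixed state of a 4-state system (random probabilities).
p = rng.random(4); p /= p.sum()
rho = np.diag(p).astype(complex)

# A random unitary U, from the QR decomposition of a random complex matrix.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(M)

rho2 = U @ rho @ U.conj().T   # closed-system evolution rho -> U rho U^dagger
print(S(rho))                 # same value...
print(S(rho2))                # ...entropy is unchanged by unitary evolution
```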
50. Maxwell's Demon
- A longstanding paradox in thermodynamics:
  - Why exactly can't you beat the 2nd law, reducing the entropy of a system via measurements?
- There were many attempted resolutions, all with flaws, until...
- Bennett @ IBM ('82) noted:
  - The information resulting from the measurement must be disposed of somewhere.
  - The entropy is still present in the demon's memory, until he expels it into the environment.
51. Entropy & Measurement
- To clarify a widespread misconception:
- The entropy (when defined as just unknown information) in an otherwise-closed system B can decrease (from the point of view of another entity A) if A performs a reversible or non-demolition measurement of B's state.
  - Actual quantum non-demolition measurements have been empirically demonstrated in carefully controlled experiments.
- But such a decrease does not violate the 2nd law!
- There are several alternative viewpoints as to why:
  - (1) System B isn't perfectly closed; the measurement requires an interaction! B's entropy has been moved away, not deleted.
  - (2) The entropy of the combined, closed AB system does not decrease, from the point of view of an outside entity C not measuring A or B.
  - (3) From A's point of view, entropy defined as unknown + incompressible information (Zurek) has not decreased.
52. Standard States
- A certain state (or state set) of a system may be declared by convention to be "standard" within some context.
  - E.g., gas at standard temperature & pressure in physics experiments.
  - Another example: Newly allocated regions of computer memory are often standardly initialized to all 0s.
- The information that a system is just in the/a standard state can be considered null information.
  - It is not very informative.
  - There are more nonstandard states than standard ones.
    - Except in the case of isolated 2-state systems.
- However, pieces of information that are in standard states can still be useful as "clean slates" on which newly measured or computed information can be recorded.
53. Computing Information
- Computing, in the most general sense, is just the time-evolution of any physical system.
- Interactions between subsystems may cause correlations to exist that didn't exist previously.
  - E.g., bits a=0 and b interact, assigning a := b.
  - a changes from a known, standard value (null information with zero entropy) to a value that correlates with b.
- When systems A and B interact in such a way that the state of A is changed in a way that depends on the state of B,
  - we say that the information in A is being computed.
54. Uncomputing Information
- When some piece of information has been computed using a series of known interactions,
  - it will often be possible to perform another series of interactions that will
    - undo the effects of some or all of the earlier interactions,
    - and uncompute the pattern of information,
      - restoring it to a standard state, if desired.
- E.g., if the original interactions that took place were thermodynamically reversible (did not increase entropy), then
  - performing the original series of interactions, inverted, is one way to restore the original state.
  - There will generally be other ways also.
55. Effective Entropy
- For any given entity A, the effective entropy S_eff in a given system B is that part of the information in B that A cannot reversibly uncompute (for whatever reason).
- Effective entropy also obeys a 2nd law.
  - It never decreases; it's the incompressible info.
- The law of increase of effective entropy remains true for a combined system AB in which entity A measures system B, even from A's own point of view!
  - No outside entity C need be postulated, unlike the case for normal unknown-info entropy.
[Figure: before measurement, A is in the standard state 0 and B holds an unknown bit 0/1; after measurement, A and B both hold the (now correlated) bit 0/1.]
56. Advantages of Effective Entropy
- (Effective) entropy, defined as non-reversibly-uncomputable information, subsumes the following:
  - Unknown information: Can't be reversibly uncomputed, because we don't know what its pattern is.
    - We don't have any other info that is correlated with it.
  - Known but incompressible information: Can't be reversibly uncomputed because it's incompressible.
    - To reversibly uncompute it would be to compress it.
  - Inaccessible information: Also can't be uncomputed, because we can't get to it!
    - E.g., a signal of known info. sent away into space at c.
- This simple yet powerful definition is, I submit, the right way to understand entropy.
57. Reversibility of Physics
- The universe is (apparently) a closed system.
  - Closed systems always evolve via unitary transforms!
  - Apparent wavefunction collapse doesn't contradict this (established by the work of Everett, Zurek, etc.).
- The time-evolution of the concrete state of the universe (or any closed subsystem) is therefore reversible,
  - by which (here) we mean invertible (bijective),
  - i.e., deterministic looking backwards in time.
- The total info. content I (the log of the # of possible states) does not decrease.
  - It can increase, though, if the volume is increasing.
- Thus, information cannot be destroyed!
  - It can only be invertibly manipulated / transformed!
  - However, it can be mixed up with other info, lost track of, sent away into space, etc.
  - Originally-uncomputable information can thereby become (effective) entropy.
58. Arrow of Time "Paradox"
- An apparent but false paradox, asking:
  - "If physics is reversible, how is it possible that entropy can increase only in one time direction?"
- This question results from misunderstandings of the meaning & implications of "reversible" in this context.
- First, reversibility (here meaning reverse-determinism) does not imply time-reversal symmetry,
  - which would mean that physics is unchanged under negation of the time coordinate.
  - In a reversible system, the time-reversed dynamics does not have to be identical to the forward-time dynamics, just deterministic.
- However, it happens that the Standard Model is essentially time-reversal symmetric,
  - if we simultaneously negate charges and reflect one space coordinate.
  - This is more precisely called CPT (charge-parity-time) symmetry.
  - I have heard that General Relativity is too, but I'm not quite sure yet.
- But anyway, even when time-reversal symmetry is present, if the initial state is defined to have a low max. entropy (# of possible states), there is only room for entropy to increase in one time direction away from the initial state.
  - As the universe expands, the volume and maximum entropy of a given region of space increase. Thus, entropy increases in that time direction.
- If you simulate a reversible and time-reversal-symmetric dynamics on a computer, state complexity (practically-incompressible info., thus entropy) still empirically increases only in one direction (away from a simple initial state).
  - There is a simple combinatorial explanation for this behavior, namely:
    - There is always a greater number of more-complex than less-complex states to go to!
59. CRITTERS Cellular Automaton
- Movie at http://www.ai.mit.edu/people/nhm/crit.AVI
- A cellular automaton (CA) is a discrete, local dynamical system.
- The CRITTERS CA uses the Margolus neighborhood technique:
  - On even steps, the black 2×2 blocks are updated.
  - On odd steps, the red blocks are updated.
- CRITTERS update rules:
  - A block with 2 1s is unchanged.
  - A block with 3 1s is rotated 180° and complemented.
  - Other blocks are complemented.
- This rule, as given, is not time-reversal symmetric,
  - but if you complement all cells after each step, it becomes so.
[Figure: the Margolus neighborhood and the update-rule cases (plus all rotated versions of these cases).]
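A minimal, unofficial Python sketch of a Margolus-neighborhood update implementing the rule exactly as stated above (2×2 blocks, alternating block offsets on even/odd steps); the periodic-boundary handling and grid size are my own choices:

```python
import numpy as np

def critters_step(grid, odd):
    """One CRITTERS step on a 0/1 grid with even dimensions (toroidal)."""
    # On odd steps, shift the grid so the offset ("red") blocks line up with
    # the even 2x2 partition; undo the shift afterwards.
    g = np.roll(grid, (-1, -1), axis=(0, 1)) if odd else grid.copy()
    n, m = g.shape
    for i in range(0, n, 2):
        for j in range(0, m, 2):
            block = g[i:i+2, j:j+2]
            ones = int(block.sum())
            if ones == 2:
                continue                   # 2 ones: block unchanged
            new = 1 - block                # otherwise: complement
            if ones == 3:
                new = new[::-1, ::-1]      # 3 ones: also rotate 180 degrees
            g[i:i+2, j:j+2] = new
    return np.roll(g, (1, 1), axis=(0, 1)) if odd else g

# Run a few steps on a random grid. Each step is a bijection on block states,
# so the dynamics is exactly reversible.
rng = np.random.default_rng(1)
grid = rng.integers(0, 2, size=(16, 16))
for t in range(10):
    grid = critters_step(grid, odd=t % 2)
```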
60. Equilibrium
- Due to the 2nd law, the entropy of any closed, constant-volume system (with not-precisely-known interactions) increases until it approaches its maximum entropy I = log N.
  - But the rate of approach to equilibrium varies greatly, depending on the precise scenario being modeled.
- Maximum-entropy states are called equilibrium states.
- We saw earlier that entropy is maximized by uniform probability distributions.
- ⇒ Theorem (Fundamental assumption of statistical mechanics): Systems at equilibrium have an equal probability of being in each of their possible states.
  - Proof: The Boltzmann distribution is the one with the maximum entropy! Thus, it is the equilibrium state.
  - This holds for states of equal total energy.
61. Other Boltzmann Distributions
- Consider a system A described in a basis in which not all basis states are assigned the same energy.
  - E.g., choose a basis consisting of energy eigenstates.
- Suppose we know of a system A (in addition to its basis set) only that the expectation of its average energy E, if measured, has a certain value E0.
  - Due to conservation of energy, if E = E0 initially, this must remain true, so long as A is a closed system.
- Jaynes (1957) showed that for a system at temperature T, the maximum-entropy probability distribution P that is consistent with this constraint is one in which P_i ∝ exp(-E_i / k_B T).
  - This same distribution was derived earlier, but in a less general scenario, by Boltzmann.
- Thus, at equilibrium, systems will have this distribution over state sets that do not all have the same energy.
  - This does not contradict the uniform Boltzmann distribution from earlier, because that was a distribution over specific distinguishable states that are all individually consistent with our description (in this case, that all have energy E0).
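A small Python sketch computing this maximum-entropy (Boltzmann) distribution for a handful of energy levels and evaluating its entropy; the energy values are arbitrary illustrations of mine:

```python
import numpy as np

k_B = 8.617e-5          # Boltzmann's constant in eV/K
T = 300.0               # temperature in kelvins
E = np.array([0.0, 0.01, 0.02, 0.05])   # illustrative energy levels (eV)

# Boltzmann (maximum-entropy) distribution subject to a fixed expected energy.
w = np.exp(-E / (k_B * T))
P = w / w.sum()

entropy_nats = -np.sum(P * np.log(P))    # in nats (log-e units)
print(P.round(3))
print("mean energy:", float(P @ E), "eV")
print("entropy:", entropy_nats, "nats =", entropy_nats / np.log(2), "bits")
```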
62. What is energy, anyway?
- It is related to the constancy of physical law.
- Noether's theorem (1918) relates conservation laws to physical symmetries.
  - The conservation of energy (1st law of thermo.) can be shown to be a direct consequence of the time-symmetry of the laws of physics.
- We saw that energy eigenstates are those state vectors that remain constant (except for a phase rotation) over time. (Eigenvectors of the U(Δt) matrix.)
  - Equilibrium states are statistical mixtures of these.
  - The eigenvalue gives the energy of the eigenstate,
    - i.e., the rate of phase-angle accumulation of that state.
- Later, we will see that energy can also be viewed as the rate of (quantum) computing that is occurring within a physical system.
("Noether" rhymes with "mother.")
63. Aside on Noether's Theorem
- (Of no particular use in this course, but fun to know anyway.)
- Virtually all of physical law can be reconstructed as a necessary consequence of various fundamental symmetries of the dynamics.
  - These exemplify the general principle that the dynamical behavior itself should naturally be independent of all the arbitrary choices that we make in setting up our mathematical representations of states.
- Translational symmetry (arbitrariness of the position of the origin) implies:
  - conservation of momentum!
- Symmetry under rotations in space (no preferred direction) implies:
  - conservation of angular momentum!
- Symmetry of the laws under Lorentz boosts, and accelerated motions:
  - implies special & general relativity!
- Symmetry of electron wavefunctions (state vectors, or density matrices) under rotations in the complex plane (arbitrariness of phase angles) implies:
  - For uniform rotations over all spatial points:
    - we can derive the conservation of electric charge!
  - For spatially nonuniform (gauge) rotations:
    - we can derive the existence of photons, and all of Maxwell's equations!!
- Add relativistic gauge symmetries for other types of particles and interactions:
  - we can get QED, QCD, and the Standard Model!
- Discrete symmetries have various implications as well...
64. Temperature at Equilibrium
- Recall that the # of states of a compound system AB is the product of the # of states of A and of B.
  - ⇒ the total information I(AB) = I(A) + I(B).
- Combining this with the 1st law of thermo. (conservation of energy), one can show (Stowe sec. 9A) that two subsystems at equilibrium with each other (so that I = S) share a property ∂S/∂E.
  - Assuming no mechanical or diffusive interactions.
- Temperature is then defined as the reciprocal of this quantity: T ≡ ∂E/∂S. (Units: energy/entropy.)
  - The energy needed per unit increase in entropy.
65. Generalized Temperature
- Any increase in the entropy of a system at maximum entropy implies an increase in that system's total information content,
  - since total information content is the same thing as maximum entropy.
- But a system that is not at its maximum entropy is nothing other than just the very same system,
  - only in a situation where some of its state information just happens to be known by the observer!
- And note that the total information content itself does not depend on the observer's knowledge about the system's state,
  - only on the very definition of the system.
- ⇒ Adding ΔE of energy, even to a non-equilibrium system, must increase its total information content I by the very same amount, ΔS!
- So ΔI/ΔE in any non-equilibrium system equals ΔS/ΔE of the same system, if it were at equilibrium. So, redefine T ≡ ΔE/ΔI.
[Figure: ΔE energy added to a system at temperature T carries ΔI = ΔE/T information.]
66. Information Erasure
- Suppose we have access to a subsystem containing one bit of information (which may or may not be entropy).
- Suppose we now want to erase that bit:
  - i.e., restore it unconditionally to a standard state, e.g. 0,
  - so we can later compute some new information in that location.
- But the information/entropy in that bit physically cannot just be irreversibly destroyed.
- We can only ever do physically reversible actions, e.g.:
  - Move/swap the information out of the bit.
    - Store it elsewhere, or let it dissipate away.
    - If you lose track of the information, it becomes entropy!
      - If it wasn't already entropy. (Important to remember!)
  - Or, reversibly transform the bit to the desired value.
    - This requires uncomputing the old value,
    - based on other knowledge redundant with that old value.
67. Energy Cost of Info. Erasure
- Suppose you wish to erase (get rid of) 1 bit of unwanted ("garbage") information by disposing of it in an external system at temperature T.
  - T ≈ 300 K terrestrially, 2.73 K in the cosmic microwave background.
- Adding that much information to the external system will require adding at least E = (1 bit)·T energy to your garbage dump.
  - This is true by the very definition of temperature!
- In natural log units, this is k_B T ln 2 energy.
  - @ room temperature: ≈18 meV; @ 2.73 K: ≈0.16 meV.
- Landauer @ IBM (1961) first proved this relation between bit-erasure and energy.
  - Though a similar claim was made by von Neumann in 1949.
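A quick check of the numbers quoted above, using E = k_B T ln 2:

```python
import math

k_B = 8.617333e-5   # Boltzmann's constant in eV/K

def landauer_cost_eV(T):
    """Minimum energy (eV) to dispose of 1 bit into an environment at T kelvins."""
    return k_B * T * math.log(2)

print(landauer_cost_eV(300) * 1e3, "meV")    # ~17.9 meV at room temperature
print(landauer_cost_eV(2.73) * 1e3, "meV")   # ~0.16 meV at the CMB temperature
```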
68. Landauer's 1961 Principle from Basic Quantum Theory
[Figure: before bit erasure, the bit may be 0 (states s_0 ... s_{N-1}) or 1 (states s'_0 ... s'_{N-1}), i.e. 2N distinct states in all; after bit erasure, the bit is 0 in every case, so under unitary (1-1) evolution the rest of the system must end up in one of 2N distinct states (s''_0 ... s''_{2N-1}).]
Increase in entropy: S = log 2 = k ln 2. Energy lost to heat: S·T = kT ln 2.
69. Bistable Potential-Energy Wells
- Consider any system having an adjustable, bistable potential energy surface (PES) or "well" in its configuration space.
- The two stable states form a natural bit.
  - One state represents 0, the other 1.
- Consider now the P.E. well having two adjustable parameters:
  - (1) the height of the potential energy barrier relative to the well bottom, and
  - (2) the relative height of the left and right states in the well (the bias).
[Figure: a bistable potential-energy well with states 0 and 1 (after Landauer '61).]
70. Possible Parameter Settings
- We will distinguish six qualitatively different settings of the well's parameters, as follows.
[Figure: the six settings, arranged by barrier height and direction of bias force.]
71. One Mechanical Implementation
[Figure: a mechanical realization with a state knob, a barrier wedge (barrier up / barrier down), and springs providing leftward or rightward bias.]
72. Possible Reversible Transitions
(Ignoring superposition states.)
- A catalog of all the possible transitions between known states in these wells, both thermodynamically reversible & not...
[Figure: transition diagram over the six parameter settings (axes: barrier height and direction of bias force), showing the "0" states, the "1" states, the neutral (N) setting, and "leak" transitions.]
73. Erasing Digital Entropy
- Note that if the information in a bit-system is already entropy,
  - then erasing it just moves this entropy to the surroundings.
  - This can be done with a thermodynamically reversible process, and does not necessarily increase total entropy!
- However, if/when we take a bit that is known, and irrevocably commit ourselves to thereafter treating it as if it were unknown,
  - that is the true irreversible step,
  - and that is when the entropy is effectively generated!!
[Figure: bistable-well states. One state contains 1 bit of uncomputable information, in a stable, digital form; another contains 1 bit of physical entropy, but in a stable, digital form; in the remaining 3 states there is no entropy in the digital state; it has all been pushed out into the environment.]
74. Extropy
- Rather than repeatedly saying "uncomputable (i.e., compressible) information,"
  - a cumbersome phrase,
- let us coin the term extropy (and sometimes use the symbol X) for this concept.
  - The name is chosen to connote the opposite of entropy.
  - It is sometimes also called negentropy.
- Since a system's total information content is I = X + S,
  - we have X = I - S.
- We ignore previous meanings of the word "extropy," promoted by the Extropians,
  - a certain trans-humanist organization.
75. Work vs. Heat
- The total energy E of a system (in a given frame) can be determined from its total inertial-gravitational mass m (in that frame) using E = mc².
- We can define the heat content H of the system as that part of