Heisenberg’s principle, Shannon’s information, and nuclear (research) reactors

The information-theoretic formulation of Heisenberg’s Uncertainty Principle that Michael J.W. Hall (Griffith U, Brisbane), Masanao Ozawa (Nagoya U), Mark M. Wilde (Louisiana State U, Baton Rouge), and I proposed a while ago has been experimentally tested and verified by Georg Sulyok, Stephan Sponar, Bülent Demirel, and Yuji Hasegawa (all at the Vienna Atominstitut) using a very precise measurement on the neutrons emitted by their research nuclear reactor (a TRIGA Mark II). The results have been published in Physical Review Letters as an Editors’ Suggestion (the paper is also freely available on the arXiv).

Heisenberg (left), the father of the uncertainty principle, meets Shannon (right), the father of information theory, above the core of a TRIGA research reactor. The blue glow is caused by Cherenkov radiation. (Source: Wikipedia)

Heisenberg’s Uncertainty Principles

Heisenberg’s Uncertainty Principle (HUP) is often summarized as the statement that any act of measurement inevitably causes an uncontrollable disturbance of the measured system. Put in a more spectacular way, HUP would dictate that we can learn about the present, but at the cost of being unable to fully predict the future. In fact, Heisenberg, in his original paper, never made such a general, all-encompassing claim. Instead, his intention was to construct a scenario, physically plausible for the scientific community of that time (1927), in which the mathematical property of non-commutativity of quantum observables would have measurable consequences.

We can learn about the present, but at the cost of being unable to fully predict the future.

I think it is fair to put Heisenberg’s original work into perspective: though rigorous (at least by the standards of that time), it undoubtedly relies on over-idealized measurement models, like the famous and much-debated gamma-ray microscope thought experiment for measuring the position of an electron by photon scattering. This, of course, can hardly lead to any statement of general validity, and I believe that neither Heisenberg nor his contemporaries would have thought otherwise.

How to tame the general case then? Starting from the axioms of quantum theory (those about states, observables, and the ‘Born rule’) and proceeding in a purely geometric way, Robertson derived, in 1929, a relation that is usually presented as the mathematical formulation of HUP, namely,

\Delta {A} \Delta {B} \ge \frac12|\left<\psi|(AB-BA)|\psi\right>|

where \Delta {A}^2:=\left<\psi|(A-\left<\psi|A|\psi\right>)^2|\psi\right> and \Delta {B}^2:=\left<\psi|(B-\left<\psi|B|\psi\right>)^2|\psi\right> are the mean-square deviations of the two observables in state \psi.
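Robertson’s relation is easy to check numerically. The following is only a minimal sketch, assuming a single qubit with A and B taken to be the Pauli operators X and Y (this choice, and all the function names, are illustrative and not taken from the papers discussed here). It evaluates both sides of the inequality for a few pure states.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(op, psi):
    """Expectation value <psi|op|psi> for a normalized qubit state psi."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(2))

def std_dev(op, psi):
    """Delta(op): square root of <op^2> - <op>^2 in state psi."""
    mean = expectation(op, psi).real
    mean_sq = expectation(matmul(op, op), psi).real
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))

def robertson_sides(a, b, psi):
    """Return (Delta A * Delta B, |<psi|[A,B]|psi>| / 2)."""
    ab, ba = matmul(a, b), matmul(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    lhs = std_dev(a, psi) * std_dev(b, psi)
    rhs = 0.5 * abs(expectation(comm, psi))
    return lhs, rhs

# Illustrative choice of observables: Pauli X and Y.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def qubit(theta, phi):
    """Pure qubit state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return [complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2)]

if __name__ == "__main__":
    for theta, phi in [(0.0, 0.0), (0.8, 0.3), (1.5, 2.0), (2.7, 4.1)]:
        lhs, rhs = robertson_sides(X, Y, qubit(theta, phi))
        print(f"theta={theta:.1f} phi={phi:.1f}  lhs={lhs:.4f}  rhs={rhs:.4f}")
```

In every case the left-hand side should be no smaller than the right-hand side. Note that nothing in the computation refers to the state after a measurement: only the statistics of \psi enter.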

At this point, however, the orthodox textbook (for example, the still excellent Nielsen & Chuang) will rightly note that Robertson’s relation has nothing to do with a noise-disturbance relation: \Delta{A} and \Delta{B} cannot be interpreted as measures of ‘accuracy with which A is measured’ and ‘disturbance caused on the value of observable B’ without soon running into some sort of nonsense. The correct interpretation is the following: suppose that we have a very large number of particles, all in the same state \psi, and that we measure A on half of them and B on the remaining half; we would then observe that the statistics of the measurement outcomes obey Robertson’s inequality. Since no mention is made of the state of the particles after the measurements, it is clear that Robertson’s relation is not about the disturbance caused by the act of measurement, but rather about the limitations that quantum theory places on the preparation of quantum states, which cannot be simultaneously sharp with respect to two incompatible observables.

It hence seems clear to me that we are dealing with two uncertainty principles:

  1. static uncertainty principle, namely, Robertson’s inequality, and
  2. dynamical uncertainty principle, namely, a statement that should establish a tradeoff between the accuracy with which an observable (A) is measured and the disturbance consequently caused on another non-commuting observable (B).

Should one then give up the search for a noise-disturbance relation à la Heisenberg, i.e., one involving mean-square deviations and commutators? The answer is no: as Masanao Ozawa showed some time ago, with careful definitions of a ‘noise operator’ and a ‘disturbance operator,’ it is indeed possible to generalize Robertson’s relation, turning it into a tradeoff relation between accuracy (with which one observable, A, is measured) and disturbance (that said measurement introduces in the other observable, B). There has been some (heated) debate on this particular approach, but discussing it here would take us too far afield.

The Information-Theoretic Formulation

Another static uncertainty principle is the one discovered by Hans Maassen and Jos Uffink (in 1988), generalizing a proposal first made by David Deutsch (in 1983). Their relation looks like this:

H(A) + H(B) \ge k

where H(A) and H(B) denote the entropies of the statistical distributions of the outcomes of the measurements of A and B, and k=k(A,B) is a number that is strictly positive whenever A and B are incompatible. Whenever this is the case, the Deutsch-Maassen-Uffink relation prevents H(A) and H(B) from both being zero at the same time.
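To make the relation concrete, here is a minimal sketch for a single qubit. It assumes the standard explicit form of the Maassen-Uffink bound for non-degenerate observables, k = -2 \log_2 c, where c is the largest overlap |\left<a|b\right>| between an eigenvector of A and an eigenvector of B (this expression is not spelled out above, and the function names are illustrative). For the Pauli observables Z and X this gives k = 1 bit.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def inner(u, v):
    """Inner product <u|v> of two vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def outcome_probs(basis, psi):
    """Born-rule probabilities of measuring psi in an orthonormal basis."""
    return [abs(inner(b, psi)) ** 2 for b in basis]

def maassen_uffink_bound(basis_a, basis_b):
    """k = -2 log2(c), with c the maximum overlap |<a|b>| (assumed form)."""
    c = max(abs(inner(a, b)) for a in basis_a for b in basis_b)
    return -2 * math.log2(c)

S = 1 / math.sqrt(2)
Z_BASIS = [[1, 0], [0, 1]]      # eigenvectors of Pauli Z
X_BASIS = [[S, S], [S, -S]]     # eigenvectors of Pauli X

def entropy_sum(psi):
    """H(Z) + H(X) for the normalized qubit state psi."""
    return (shannon_entropy(outcome_probs(Z_BASIS, psi))
            + shannon_entropy(outcome_probs(X_BASIS, psi)))

if __name__ == "__main__":
    k = maassen_uffink_bound(Z_BASIS, X_BASIS)
    for theta in [0.0, 0.3, 0.8, math.pi / 4]:
        psi = [math.cos(theta), math.sin(theta)]
        print(f"theta={theta:.2f}  H(Z)+H(X)={entropy_sum(psi):.4f}  k={k:.4f}")
```

For theta = 0 the state is an eigenstate of Z, so H(Z) vanishes, but H(X) is then one full bit: the two entropies cannot be zero together.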

The entropic formulation of the uncertainty principle has some features making it preferable, in some situations, to the usual formulation in terms of mean-square deviations. The two main reasons are the following:

  1. the lower bound k(A,B) does not depend on the state of the system being measured, while the lower bound in Robertson’s inequality becomes trivial whenever \psi is, for example, an eigenstate of either A or B;
  2. the entropies H(A) and H(B) do not depend on the numerical values of the possible outcomes (i.e., the eigenvalues of A and B) but only on their statistical distribution; the mean-square deviations \Delta{A} and \Delta{B}, on the contrary, do depend on the numerical values of the eigenvalues of the two observables (for example, a simple relabeling of outcomes can lead to very different values for the mean-square deviations).
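The second point can be illustrated with a toy calculation. The sketch below (the outcome labels are arbitrary, chosen only for illustration) takes one and the same two-outcome distribution, attaches two different sets of numerical labels to the outcomes, and compares the entropy with the standard deviation.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits; depends only on the probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def std_dev(values, probs):
    """Standard deviation of a random variable with the given outcome values."""
    mean = sum(v * p for v, p in zip(values, probs))
    return math.sqrt(sum(p * (v - mean) ** 2 for v, p in zip(values, probs)))

probs = [0.5, 0.5]        # same outcome statistics in both cases
labels_a = [+1, -1]       # e.g. eigenvalues of a Pauli operator
labels_b = [0, 100]       # the same two outcomes, relabeled

if __name__ == "__main__":
    print("entropy (both labelings):", shannon_entropy(probs))
    print("std dev with labels_a:   ", std_dev(labels_a, probs))
    print("std dev with labels_b:   ", std_dev(labels_b, probs))
```

The entropy is one bit in both cases, while the standard deviation jumps from 1 to 50 just because the outcomes were renamed.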

Even though the entropic formulation of the uncertainty principle is quite different from the traditional one given in terms of mean-square deviations, it falls in the same category, in the sense that it only captures the ‘static’ part of Heisenberg’s principle. Indeed, the Deutsch-Maassen-Uffink relation refers to the outcome statistics collected from many independent measurements of observables A and B on a very large number of particles all prepared in the same state, but the states of the particles after the measurements never enter the analysis.

Again, one may wonder whether it is possible to prove an entropic tradeoff relation that captures the dynamical uncertainty principle. Indeed it is possible to do so, and we did that in a recent collaboration. Our formula looks as follows:

N(\mathfrak{m},A)+ D(\mathfrak{m},B)\ge k

where
  1. N(\mathfrak{m},A) measures the noise with which the measuring apparatus \mathfrak{m} measures the observable A,
  2. D(\mathfrak{m},B) measures the disturbance that the measuring apparatus \mathfrak{m} causes on the value of the observable B, and
  3. k=k(A,B) is the same number appearing also in the Maassen-Uffink relation (it is hence strictly positive whenever A and B are incompatible).

In information-theoretic terms (“as Shannon would say”) the above relation essentially describes the tradeoff between knowledge about A and predictability of B. It thus proves the statement that I wrote at the beginning, namely:

we can learn about the present (the value of A), but at the cost of being unable to fully predict the future (the value of B).

Read the full paper on the arXiv: http://arxiv.org/abs/1504.04200

Note added: I was tempted to title this post “Heisenberg meets Shannon in a reactor core,” just to add an item to the list of people that have met Shannon in titles.

Saint Lucy’s Day: the longest night of the year

In Piacenza, the city where I was born, children receive their presents not for Christmas, but on Saint Lucy’s day (Santa Lucia). They say that Saint Lucy’s Day, the 13th day of December, has the longest night of the year:

“Santa Lucia, la notte più lunga che ci sia.”

This, in a way, makes perfect sense, as the Saint Little Girl needs time to go around, house by house, delivering the presents.

However, everyone knows that the longest night of the year (for the Northern Hemisphere) is not the one between the 12th and the 13th of December, but the one between the 21st and the 22nd, aka the Winter Solstice! That’s why I always assumed that the saying was meant as a sort of ‘poetic license,’ possibly suggested by the rhyming words ‘Lucia’ and ‘sia.’

Today, however, I discovered another interesting, plausible reason for the saying. The discrepancy between Saint Lucy’s day and the Winter Solstice could also be due to the introduction of the Gregorian calendar, in AD 1582, which shifted the calendar by about ten days. And so, everything fits together again: how nice!

Happy Saint Lucy’s Day!

Edit 2014-12-16: Richard Gill on Google+ points out that, as a matter of fact, the current St Lucy’s Day has the earliest (though not the longest) night.

Correlations that enable anomalous ‘backward’ flows of heat/information

As everyone knows, when two objects at different temperatures come into contact, heat will flow from the hotter to the colder object until the temperatures equilibrate. This fact constitutes the second law of thermodynamics. The same holds for information: it can only go from the ‘informed’ party (i.e., where the information is stored) to the ‘uninformed’ one. This intuition can be formalized as a data-processing principle.
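In its simplest classical form, the data-processing principle says that if Z is obtained from Y alone, and Y from X, then Z cannot carry more information about X than Y does: I(X;Z) \le I(X;Y). The following is a minimal sketch of this classical statement only, assuming two binary noisy channels with made-up flip probabilities (it does not reproduce the quantum setting of the paper).

```python
import math

def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[i][j]."""
    p_row = [sum(row) for row in joint]
    p_col = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (p_row[i] * p_col[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

def joint_through_channel(p_in, channel):
    """Joint distribution of (input, output); channel[i][j] = P(out=j | in=i)."""
    return [[p_in[i] * channel[i][j] for j in range(len(channel[i]))]
            for i in range(len(p_in))]

def compose(first, second):
    """Channel obtained by applying `first` and then `second`."""
    return [[sum(first[i][k] * second[k][j] for k in range(len(second)))
             for j in range(len(second[0]))]
            for i in range(len(first))]

# Hypothetical example: X -> Y -> Z through two binary symmetric channels.
p_x = [0.5, 0.5]
x_to_y = [[0.9, 0.1], [0.1, 0.9]]
y_to_z = [[0.8, 0.2], [0.2, 0.8]]

i_xy = mutual_information(joint_through_channel(p_x, x_to_y))
i_xz = mutual_information(joint_through_channel(p_x, compose(x_to_y, y_to_z)))

if __name__ == "__main__":
    print(f"I(X;Y) = {i_xy:.4f} bits")
    print(f"I(X;Z) = {i_xz:.4f} bits")
```

Here the second channel acts on Y only and shares no correlations with X, so information about X can only be lost along the way.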

The above arguments hide, however, an implicit assumption — that the two objects (or information carriers) never met in the past, i.e., are uncorrelated. Indeed, in the presence of initial correlations, anomalous backward flows of heat/information have been predicted and observed, in violation of the data-processing principle.

However, not all correlations enable such anomalous flows. For example, purely classical correlations have no such ability. Hence the question naturally arises: which correlations allow the data-processing principle to be broken?

In this paper I present a general characterization of such correlations from an information-theoretic viewpoint. The main discovery is that the situation is much richer than previously thought: not only the quality but also the quantity of correlations matters — the delicate tradeoff between them being given by the condition of complete positivity, a central concept in quantum mechanics.

The hope is that the approach I propose here, unifying a number of previous works and thus simplifying the global picture, will contribute to the understanding of the deep (though, in my opinion, not as straightforward as is sometimes claimed) connections between information theory, quantum theory, and thermodynamics.

This work will appear in Physical Review Letters. Pre-print available at http://arxiv.org/abs/1307.0363