Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Information At A Distance Is Mediated By Deterministic Constraints, published by johnswentworth on the AI Alignment Forum.
You know the game of Telephone? That one where a bunch of people line up, and the first person whispers some message in the second person’s ear, then the second person whispers it to the third, and so on down the line, and then at the end some ridiculous message comes out from all the errors which have compounded along the way.
This post is about modelling the whole world as a game of Telephone.
Information Not Perfectly Conserved Is Completely Lost
Information is not binary; it’s not something we either know or don’t. Information is uncertain, and that uncertainty lies on a continuum - 10% confidence is very different from 50% confidence which is very different from 99.9% confidence.
In Telephone, somebody whispers a word, and I’m not sure if it’s “dish” or “fish”. I’m uncertain, but not maximally uncertain; I’m still pretty sure it’s one of those two words, and I might even think it’s more likely one than the other. Information is lost, but it’s not completely lost.
But if you’ve played Telephone, you probably noticed that by the time the message reaches the end, information is more-or-less completely lost, unless basically-everyone managed to pass the message basically-perfectly. You might still be able to guess the length of the original message (since the length is passed on reasonably reliably), but the contents tend to be completely scrambled.
Main takeaway: when information is passed through many layers, one after another, any information not nearly-perfectly conserved through nearly-all the “messages” is lost. Unlike the single-message case, this sort of “long-range” information passing is roughly binary.
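As a toy illustration (my own, not from the post), consider whispering a single bit down the line through a chain of binary symmetric channels. Composing n channels that each flip the bit with probability p gives an effective flip probability of (1 - (1-2p)^n)/2, so the mutual information with the original bit decays toward zero unless p is essentially zero:

```python
import math

def h(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi_after_n_steps(p, n):
    """Mutual information (bits) between a uniform random bit and its copy
    after n passes through a binary symmetric channel with flip prob p.
    Composing n such channels gives flip prob q = (1 - (1-2p)^n) / 2."""
    q = (1 - (1 - 2 * p) ** n) / 2
    return 1 - h(q)

for n in [1, 5, 20, 100]:
    print(n, round(mi_after_n_steps(0.05, n), 4))
```

Even a modest 5% error rate per whisper leaves essentially zero information after a hundred steps, while a perfect channel (p = 0) conserves all of it forever; that's the binary character of long-range information passing.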
In fact, we can prove this mathematically. Here’s the rough argument:
At each step, information about the original message (as measured by mutual information) can only decrease (by the Data Processing Inequality).
But mutual information can never go below zero.
The mutual information is decreasing and bounded below, which means it eventually approaches some limit (assuming it was finite to start).
Once the mutual information is very close to that limit, it must decrease by very little at each step - i.e. information is almost-perfectly conserved.
This does allow a little bit of wiggle room - non-binary uncertainty can enter the picture early on. But in the long run, any information not nearly-perfectly conserved by the later steps will be lost.
Of course, the limit which the mutual information approaches could still be zero - meaning that all the information is lost. Any information not completely lost must be perfectly conserved in the long run.
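The argument above can be checked numerically. Here is a sketch (the channel matrix is my own toy choice, not from the post) that tracks the mutual information between the original message and the n-th message along a small Markov chain, and confirms the sequence is non-increasing, as the Data Processing Inequality requires:

```python
import numpy as np

def mutual_info(joint):
    """Mutual information (bits) of a 2-D joint distribution."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# A noisy "whisper" channel: T[i, j] = P(next message = j | current = i).
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])

p0 = np.array([1/3, 1/3, 1/3])   # uniform original message M_0
cond = np.eye(3)                 # P(M_n | M_0); starts as the identity
mis = []
for n in range(30):
    joint = np.diag(p0) @ cond   # joint distribution of (M_0, M_n)
    mis.append(mutual_info(joint))
    cond = cond @ T              # one more whisper down the line

# Data Processing Inequality: mutual information never increases...
assert all(a >= b - 1e-12 for a, b in zip(mis, mis[1:]))
# ...and here nothing is perfectly conserved, so the limit is zero.
print(mis[0], mis[-1])
```

In this example the chain mixes, so the limit is zero: every bit of information about M_0 is eventually lost, exactly the all-is-lost case mentioned above.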
Deterministic Constraints
It turns out that information can only be perfectly conserved when carried by deterministic constraints. For instance, in Telephone, information about the length of the original message will only be perfectly conserved if the length of each message is always (i.e. deterministically) equal to the length of the preceding message. There must be a deterministic constraint between the lengths of adjacent messages, and our perfectly-conserved information must be carried by that constraint.
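A minimal sketch of the length example (the alphabet and error rate are my own choices, not from the post): each whisper garbles individual letters at random, but always preserves the message's length, so len is a deterministic constraint between adjacent messages and survives the whole game even as the contents are scrambled:

```python
import random

random.seed(0)

def whisper(msg, p_err=0.2):
    """Pass a message on, garbling each letter with probability p_err,
    but always (i.e. deterministically) preserving the message length."""
    return "".join(c if random.random() > p_err else random.choice("abcdefgh")
                   for c in msg)

m0 = "fish and chips"
msg = m0
for _ in range(50):
    msg = whisper(msg)

print(msg)                  # contents are scrambled...
assert len(msg) == len(m0)  # ...but length is perfectly conserved:
                            # len(M_n) = len(M_{n+1}) with probability 1
```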
More formally: suppose our game of Telephone starts with the message M_0. Some time later, the message M_n is whispered into my ear, and I turn around to whisper M_{n+1} into the next person's ear. The only way that M_{n+1} can contain exactly the same information as M_n about M_0 is if:
There are some functions f_n, f_{n+1} for which f_n(M_n) = f_{n+1}(M_{n+1}) with probability 1; that's the deterministic constraint.
The deterministic constraint carries all the information about M_0 - i.e. P[M_0 | M_n] = P[M_0 | f_n(M_n)]. (Or, equivalently, the mutual information of M_0 and M_n equals the mutual information of M_0 and f_n(M_n).)