Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Thinking About Filtered Evidence Is (Very!) Hard, published by Abram Demski on the AI Alignment Forum.
The content of this post would not exist if not for conversations with Zack Davis, and owes something to conversations with Sam Eisenstat.
There's been some talk about filtered evidence recently. I want to make a mathematical observation which causes some trouble for the Bayesian treatment of filtered evidence. [OK, when I started writing this post, it was "recently". It's been on the back burner for a while.]
This is also a continuation of the line of research about trolling mathematicians, and hence, relevant to logical uncertainty.
I'm going to be making a mathematical argument, but I'm going to keep things rather informal. I think this increases the clarity of the argument for most readers. I'll make some comments on proper formalization at the end.
Alright, here's my argument.
According to the Bayesian treatment of filtered evidence, you need to update on the fact that a fact was presented to you, rather than on the raw fact itself. This involves reasoning about the algorithm which decided which facts to show you. The point I want to make is that this can be incredibly computationally difficult, even if the algorithm is so simple that you can predict what it will say next. I.e., I don't need to rely on anything like "humans are too complex for humans to really treat as well-specified evidence-filtering algorithms".
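To make the difference between updating on the raw fact and updating on the fact that it was presented concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration and is not part of the argument: the two urn hypotheses, the 50/50 prior, and a speaker who draws two balls with replacement and shows a red one whenever at least one draw is red.

```python
from fractions import Fraction

def posterior(prior, likelihoods):
    """Bayes' rule over a finite set of hypotheses."""
    joint = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

# Hypothetical setup: the urn is either mostly red or mostly blue.
prior = {"mostly_red": Fraction(1, 2), "mostly_blue": Fraction(1, 2)}
p_red = {"mostly_red": Fraction(2, 3), "mostly_blue": Fraction(1, 3)}

# Naive listener: treats "here is a red ball" as one unfiltered draw.
naive = posterior(prior, p_red)

# Filter-aware listener: models the (assumed) speaker algorithm, which
# draws twice and shows red if at least one of the two draws is red.
p_shown_red = {h: 1 - (1 - p) ** 2 for h, p in p_red.items()}
filtered = posterior(prior, p_shown_red)

print(naive["mostly_red"])     # 2/3
print(filtered["mostly_red"])  # 8/13
```

The filter-aware posterior is weaker than the naive one because a speaker who hunts for red balls will usually find one under either hypothesis. Here the filter is trivial to model; the claim of this post is that modeling it can be very hard even when the speaker is simple.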
For my result, we imagine that a Bayesian reasoner (the "listener") is listening to a series of statements made by another agent (the "speaker").
First, I need to establish some terminology:
Assumption 1. A listener will be said to have a rich hypothesis space if the listener assigns some probability to the speaker enumerating any computably enumerable set of statements.
The intuition behind this assumption is supposed to be: due to computational limitations, the listener may need to restrict to some set $H$ of easily computed hypotheses; for example, the hypotheses might be poly-time or even log-poly. This prevents hypotheses such as "the speaker is giving us the bits of a halting oracle in order", as well as "the speaker has a little more processing power than the listener". However, the hypothesis space is not so restricted as to limit the world to being a finite-state machine. The listener can imagine the speaker proving complicated theorems, so long as it is done sufficiently slowly for the listener to keep up. In such a model, the listener might imagine the speaker staying quiet for quite a long time (observing the null string over and over, or some simple sentence such as 1=1) while a long computation completes; and only then making a complicated claim.
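As a rough illustration of a speaker the listener can keep up with, here is a minimal Python sketch. The function name `slow_speaker` and the choice of primality testing as the "long computation" are assumptions made for the example, and the sketch assumes n is at least 2. The speaker does one trial division per time step, says the filler sentence "1=1" while the work is unfinished, and only then makes its claim.

```python
def slow_speaker(n):
    """Yield one statement per time step, doing one trial division per step."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            yield f"{n} is composite"
            return
        yield "1=1"  # filler statement while the computation is still running
        d += 1
    yield f"{n} is prime"

print(list(slow_speaker(29)))  # ['1=1', '1=1', '1=1', '1=1', '29 is prime']
print(list(slow_speaker(15)))  # ['1=1', '15 is composite']
```

Each step is cheap, so a listener with modest resources can simulate this speaker exactly, yet what the speaker eventually says depends on a computation that takes many steps to finish.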
This is also not to say that I assume my listener considers only hypotheses in which it can 100% keep up with the speaker's reasoning. The listener can also have probabilistic hypotheses which recognize its inability to perfectly anticipate the speaker. I'm only pointing out that my result does not rely on a speaker which the listener can't keep up with.
What it does rely on is that there are not too many restrictions on what the speaker eventually says.
Assumption 2. A listener believes a speaker to be honest if the listener distinguishes between "X" and "the speaker claims X at time t" (aka "$\text{claims}_t\text{-}X$"), and also has beliefs such that $P(X \mid \text{claims}_t\text{-}X) = 1$ when $P(\text{claims}_t\text{-}X) > 0$.
This assumption is, basically, saying that the agent trusts its observations; the speaker can filter evidence, but the speaker cannot falsify evidence.
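A small sketch may help separate "cannot falsify" from "cannot filter". The joint distribution below is made up for illustration: it covers a single proposition X and a single time t, and its numbers are arbitrary apart from the zero on the world where the speaker claims X although X is false.

```python
from fractions import Fraction

# Hypothetical joint beliefs over (X is true, speaker claims X at time t).
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(2, 10),   # X is true but the speaker stays silent
    (False, True): Fraction(0),       # honesty: no false claims
    (False, False): Fraction(5, 10),
}

def prob(event):
    """Probability of the set of worlds satisfying `event`."""
    return sum(p for world, p in joint.items() if event(world))

def cond(event, given):
    """Conditional probability P(event | given); assumes P(given) > 0."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

def x_true(world):
    return world[0]

def claimed(world):
    return world[1]

p_x_given_claim = cond(x_true, claimed)
p_claim_given_x = cond(claimed, x_true)

print(p_x_given_claim)  # 1
print(p_claim_given_x)  # 3/5
```

The first number is what the honesty assumption demands. The second is below 1, which is the room the speaker has to filter: a true X need not be mentioned.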
Maybe this assumption seems quite strong. I'll talk about relaxing it after I sketch the central result.
Assumption 3. A listener is said to have minimally consistent beliefs if each proposition X has a negation $\neg X$, and $P(X) + P(\neg X) \leq 1$.
The idea behind minimally consistent beliefs...