Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Local Validity as a Key to Sanity and Civilization, published by Eliezer Yudkowsky on the AI Alignment Forum.
(Cross-posted from Facebook.)
0.
Tl;dr: There's a similarity between these three concepts:
A locally valid proof step in mathematics is one that, in general, produces only true statements from true statements. This is a property of a single step, irrespective of whether the final conclusion is true or false.
There's such a thing as a bad argument even for a good conclusion. In order to arrive at sane answers to questions of fact and policy, we need to be curious about whether arguments are good or bad, independently of their conclusions. The rules against fallacies must be enforced even against arguments for conclusions we like.
For civilization to hold together, we need to make coordinated steps away from Nash equilibria in lockstep. This requires general rules that are allowed to impose penalties on people we like or reward people we don't like. When people stop believing the general rules are being evaluated sufficiently fairly, they go back to the Nash equilibrium and civilization falls.
i.
The notion of a locally evaluated argument step is simplest in mathematics, where it is a formalizable idea in model theory. In math, a general type of step is 'valid' if it only produces semantically true statements from other semantically true statements, relative to a given model. If x = y in some set of variable assignments, then 2x = 2y in the same model. Maybe x doesn't equal y, in some model, but even if it doesn't, the local step from "x = y" to "2x = 2y" is a locally valid step of argument. It won't introduce any new problems.
Conversely, xy = xz does not imply y = z. It happens to work when x = 2, y = 3, and z = 3, in which case the two statements say "6 = 6" and "3 = 3" respectively. But if x = 0, y = 4, z = 17, then we have "0 = 0" on one side and "4 = 17" on the other. We can feed in a true statement and get a false statement out the other end. This argument is not locally okay.
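The two steps above can be checked mechanically. What follows is only a rough sketch: the helper name `find_counterexample` and the small integer range are illustrative assumptions, not anything from the original post. A finite search like this can refute a step by exhibiting a counterexample, but finding none within the range is not a proof that the step is valid.

```python
from itertools import product

def find_counterexample(premise, conclusion, assignments):
    """Return the first assignment where the premise holds but the
    conclusion fails, or None if no such assignment is found."""
    for values in assignments:
        if premise(*values) and not conclusion(*values):
            return values
    return None

# Hypothetical search range, chosen only to keep the example small.
domain = range(-3, 4)

# Step 1: from "x = y" infer "2x = 2y".
doubling_cx = find_counterexample(
    lambda x, y: x == y,
    lambda x, y: 2 * x == 2 * y,
    product(domain, repeat=2),
)

# Step 2: from "xy = xz" infer "y = z".
cancel_cx = find_counterexample(
    lambda x, y, z: x * y == x * z,
    lambda x, y, z: y == z,
    product(domain, repeat=3),
)

print("doubling counterexample:", doubling_cx)
print("cancellation counterexample:", cancel_cx)
```

The doubling step turns up no counterexample in the range searched, while the cancellation step fails as soon as x = 0. Note that the check never asks whether "x = y" is actually true in a given assignment; it only asks whether the step can turn a true statement into a false one.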
You can't get the concept of a "mathematical proof" unless on some level—though often an intuitive level rather than an explicit one—you understand the notion of a single step of argument that is locally okay or locally not okay, independent of whether you globally agreed with the final conclusion. There's a kind of approval you give to the pieces of the argument, rather than looking the whole thing over and deciding whether you like what came out the other end.
Once you've grasped that, it may even be possible to convince you of mathematical results that sound counterintuitive. When your understanding of the rules governing allowable argument steps has become stronger than your faith in your ability to judge whole intuitive conclusions, you may be convinced of truths you would not otherwise have grasped.
ii.
More generally in life, even outside of mathematics, there are such things as bad arguments for good conclusions.
There are even such things as genuinely good arguments for false conclusions, though of course those are much rarer. By the Bayesian definition of evidence, "strong evidence" is exactly that kind of evidence which we very rarely expect to find supporting a false conclusion. Lord Kelvin's careful and multiply-supported lines of reasoning arguing that the Earth could not possibly be so much as a hundred million years old, all failed simultaneously in a surprising way because that era didn't know about nuclear reactions. But most of the time this does not happen.
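To make the Bayesian point concrete, here is a small sketch of Bayes' rule. The function name `posterior` and all of the probabilities are made-up numbers for illustration, not figures from the original post. Evidence counts as strong when it is far more probable if the conclusion is true than if it is false.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability of the conclusion after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# Hypothetical strong evidence: seen 90% of the time when the conclusion
# is true, but only 1% of the time when it is false.
strong = posterior(0.5, 0.9, 0.01)

# Hypothetical useless evidence: equally likely either way.
useless = posterior(0.5, 0.9, 0.9)

print(f"after strong evidence:  {strong:.3f}")
print(f"after useless evidence: {useless:.3f}")
```

Under these assumed numbers, the strong evidence moves a 50% prior to about 99%, while the useless evidence leaves it at 50%. The same 1% figure is also how often, in this toy setup, such strong evidence would show up in support of a false conclusion: rarely, but not never.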
On the other hand, bad arguments for true conclusions are extremely easy to come by, because there are tiny elves that whisper them to people. There isn't anything the least bit more difficult in making an argument terrible when it leads to a good conclusion, since the tiny elves own lawnmowers.
One of the mar...