Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Science in a High-Dimensional World, published by johnswentworth on the AI Alignment Forum.
Claim: the usual explanation of the Scientific Method is missing some key pieces about how to make science work well in a high-dimensional world (e.g. our world). Updating our picture of science to account for the challenges of dimensionality gives a different model for how to do science and how to recognize high-value research. This post will sketch out that model, and explain what problems it solves.
The Dimensionality Problem
Imagine that we are early scientists, investigating the mechanics of a sled sliding down a slope. What determines how fast the sled goes? Any number of factors could conceivably matter: angle of the hill, weight and shape and material of the sled, blessings or curses laid upon the sled or the hill, the weather, wetness, phase of the moon, latitude and/or longitude and/or altitude, etc. For all the early scientists know, there may be some deep mathematical structure to the world which links the sled’s speed to the astrological motions of stars and planets, or the flaps of the wings of butterflies across the ocean, or vibrations from the feet of foxes running through the woods.
Takeaway: there are literally billions of variables which could influence the speed of a sled on a hill, as far as an early scientist knows.
So, the early scientists try to control as much as they can. They use a standardized sled, with standardized weights, on a flat smooth piece of wood treated in a standardized manner, at a standardized angle. Playing around, they find that they need to carefully control a dozen different variables to get reproducible results. With those dozen pieces carefully kept the same every time. the sled consistently reaches the same speed (within reasonable precision).
At first glance, this does not sound very useful. They had to exercise unrealistic levels of standardization and control over a dozen different variables. Presumably their results will not generalize to real sleds on real hills in the wild.
But stop for a moment to consider the implications of the result. A consistent sled-speed can be achieved while controlling only a dozen variables. Out of literally billions. Planetary motions? Irrelevant, after controlling for those dozen variables. Flaps of butterfly wings on the other side of the ocean? Irrelevant, after controlling for those dozen variables. Vibrations from foxes’ feet? Irrelevant, after controlling for those dozen variables.
The amazing power of achieving a consistent sled-speed is not that other sleds on other hills will reach the same predictable speed. Rather, it’s knowing which variables are needed to predict the sled’s speed. Hopefully, those same variables will be sufficient to determine the speeds of other sleds on other hills - even if some experimentation is required to find the speed for any particular variable-combination.
Determinism
How can we know that all other variables in the universe are irrelevant after controlling for a handful? Couldn’t there always be some other variable which is relevant, no matter what empirical results we see?
The key to answering that question is determinism. If the system’s behavior can be predicted perfectly, then there is no mystery left to explain, no information left which some unknown variable could provide. Mathematically, information theorists use the mutual information
I
X
Y
to measure the information which
X
contains about
Y
. If
Y
is deterministic - i.e. we can predict
Y
perfectly - then
I
X
Y
is zero no matter what variable
X
we look at. Or, in terms of correlations: a deterministic variable always has zero correlation with everything else. If we can perfectly predict
Y
, then there is no further information to gain about it.
In this case, we’re saying that sled speed is deter...
view more