Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Standard Analogy, published by Zack M. Davis on June 3, 2024 on LessWrong.
[Scene: a suburban house, a minute after the conclusion of "And All the Shoggoths Merely Players". Doomimir returns with his package, which he places by the door, and turns his attention to Simplicia, who has been waiting for him.]
Simplicia: Right. To recap for [coughs] no one in particular, when we left off [pointedly, to the audience] one minute ago, Doomimir Doomovitch, you were expressing confidence that approaches to aligning artificial general intelligence within the current paradigm were almost certain to fail. You don't think that the apparent tractability of getting contemporary generative AI techniques to do what humans want bears on that question.
But you did say you have empirical evidence for your view, which I'm excited to hear about!
Doomimir: Indeed, Simplicia Optimistovna. My empirical evidence is the example of the evolution of human intelligence. You see, humans were optimized for one thing only: inclusive genetic fitness ...
[Simplicia turns to the audience and makes a face.]
Doomimir: [annoyed] What?
Simplicia: When you said you had empirical evidence, I thought you meant empirical evidence about AI, not the same analogy to an unrelated field that I've been hearing for the last fifteen years. I was hoping for, you know, arXiv papers about SGD's inductive biases, or online regret bounds, or singular learning theory ... something, anything at all, from this century, that engages with what we've learned from the experience of actually building artificial minds.
Doomimir: That's one of the many things you Earthlings refuse to understand. You didn't build that.
Simplicia: What?
Doomimir: The capabilities advances that your civilization's AI guys have been turning out these days haven't come from a deeper understanding of cognition, but from improvements to generic optimization methods, fueled with ever-increasing investments in compute. Deep learning not only isn't a science, it isn't even an engineering discipline in the traditional sense: the opacity of the artifacts it produces has no analogue among bridge or engine designs.
In effect, all the object-level engineering work is being done by gradient descent.
The autogenocidal maniac Richard Sutton calls this the bitter lesson, and attributes the field's slowness to embrace it to ego and recalcitrance on the part of practitioners. But in accordance with the dictum to feel fully the emotion that fits the facts, I think bitterness is appropriate.
It makes sense to be bitter about the shortsighted adoption of a fundamentally unalignable paradigm on the basis of its immediate performance, when a saner world would notice the glaring foreseeable difficulties and coordinate on doing Something Else Which Is Not That.
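[To make Doomimir's claim concrete: a minimal sketch, not from the post itself, of what "all the object-level engineering work is being done by gradient descent" looks like. The human supplies only the training distribution, the architecture, and the loss; the toy task and all numbers here are illustrative assumptions.]

```python
# Minimal sketch: the human specifies data, architecture, and loss;
# gradient descent finds the weights, which remain opaque artifacts.
import numpy as np

rng = np.random.default_rng(0)

# Training distribution: the one thing we specify directly.
X = rng.uniform(-1, 1, size=(256, 1))
y = np.sin(3 * X)  # target function the network must discover

# Architecture: a one-hidden-layer tanh network (a human design choice).
W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

lr = 0.05
for step in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y                    # gradient of (MSE / 2) w.r.t. pred

    # Backward pass: gradients of the loss.
    n = len(X)
    dW2 = h.T @ err / n
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)    # tanh derivative
    dW1 = X.T @ dh / n
    db1 = dh.mean(axis=0)

    # Gradient descent: the "object-level engineering" happens here.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# The fitted weights solve the task, but inspecting them says little
# about *how* -- the opacity Doomimir is complaining about.
print("final mse:", float(((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2).mean()))
```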
Simplicia: I don't think that's quite the correct reading of the bitter lesson. Sutton is advocating general methods that scale with compute, as contrasted to hand-coding human domain knowledge, but that doesn't mean that we're ignorant of what those general methods are doing. One of the examples Sutton gives is computer chess, where minimax search with optimizations like α-β pruning prevailed over trying to explicitly encode what human grandmasters know about the game. But that seems fine.
Writing a program that thinks about tactics the way humans do rather than letting tactical play emerge from searching the game tree would be a lot more work for less than no benefit.
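[For readers unfamiliar with Simplicia's example: a minimal sketch of minimax search with α-β pruning. The `moves`, `apply`, and `evaluate` callbacks are hypothetical stand-ins for a particular game's rules; nothing here comes from the post itself.]

```python
# Minimax with alpha-beta pruning: tactical play emerges from searching
# the game tree, not from hand-coded grandmaster knowledge.
def alphabeta(state, depth, alpha, beta, maximizing, moves, apply, evaluate):
    """Return the minimax value of `state`, skipping branches that
    cannot change the final decision."""
    children = moves(state)
    if depth == 0 or not children:
        return evaluate(state)        # leaf: static evaluation only
    if maximizing:
        value = float("-inf")
        for m in children:
            value = max(value, alphabeta(apply(state, m), depth - 1,
                                         alpha, beta, False,
                                         moves, apply, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:         # beta cutoff: opponent avoids this line
                break
        return value
    else:
        value = float("inf")
        for m in children:
            value = min(value, alphabeta(apply(state, m), depth - 1,
                                         alpha, beta, True,
                                         moves, apply, evaluate))
            beta = min(beta, value)
            if beta <= alpha:         # alpha cutoff
                break
        return value
```

[The caller supplies the game's legal-move generator, transition function, and a static evaluation; none of the tactical knowledge a human grandmaster would articulate appears in the search itself, which is Simplicia's point.]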
A broadly similar moral could apply to using deep learning to approximate complicated functions between data distributions: we specify the training distribution, and the details of fitting it are delegated to a network architecture with the appropriate invariances: convolutional nets for processing image data, transformers for variable-length sequences. Ther...