Quotes from Leopold Aschenbrenner's Situational Awareness Paper, published by Zvi on June 7, 2024 on LessWrong.
This post is different.
Usually I offer commentary and analysis. I share what others think, then respond.
This is the second time I am importantly not doing that. The work speaks for itself. It offers a different perspective, a window and a worldview. It is self-consistent. This is what a highly intelligent, highly knowledgeable person actually believes after much thought.
So rather than say where I agree and disagree and argue back (and I do both strongly in many places), this is only quotes and graphs from the paper, selected to tell the central story while cutting length by ~80%, so others can more easily absorb it. I recommend asking which assumptions and claims are load-bearing, and what changes to them would alter the key conclusions.
The first time I used this format was years ago, when I offered Quotes from Moral Mazes. I think it is time to use it again.
Then there will be one or more other posts, where I do respond.
Introduction
(1) Page 1: The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the willful blindness of "it's just predicting the next word". They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them.
Section 1: From GPT-4 to AGI: Counting the OOMs
(2) Page 7: AGI by 2027 is strikingly plausible. GPT-2 to GPT-4 took us from ~preschooler to ~smart high-schooler abilities in 4 years. Tracing trendlines in compute (~0.5 orders of magnitude or OOMs/year), algorithmic efficiencies (~0.5 OOMs/year), and "unhobbling" gains (from chatbot to agent), we should expect another preschooler-to-high-schooler-sized qualitative jump by 2027.
(3) Page 8: I make the following claim: it is strikingly plausible that by 2027, models will be able to do the work of an AI researcher/engineer. That doesn't require believing in sci-fi; it just requires believing in straight lines on a graph.
(4) Page 9: We are racing through the OOMs extremely rapidly, and the numbers indicate we should expect another ~100,000x effective compute scaleup - resulting in another GPT-2-to-GPT-4-sized qualitative jump - over four years.
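To make the quoted arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the ~0.5 OOM/year rates for compute and algorithmic efficiency quoted above, plus roughly ~1 OOM of total "unhobbling" gains; that last figure is my rough reading of how the numbers compose, not a rate the quotes state.

```python
# Back-of-the-envelope OOM counting, using the rates quoted above.
# Assumptions: compute and algorithmic efficiency each contribute
# ~0.5 OOM/year; unhobbling adds ~1 OOM in total (a rough reading,
# not a per-year rate from the paper).
years = 4
compute_oom = 0.5 * years        # ~2 OOMs from larger training clusters
algo_oom = 0.5 * years           # ~2 OOMs from algorithmic efficiencies
unhobbling_oom = 1.0             # assumed rough total across the period
total_oom = compute_oom + algo_oom + unhobbling_oom
print(f"~{total_oom:.0f} OOMs -> ~{10**total_oom:,.0f}x effective compute")
# -> ~5 OOMs -> ~100,000x, the GPT-2-to-GPT-4-sized jump cited above
```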
(5) Page 14: Of course, even GPT-4 is still somewhat uneven; for some tasks it's much better than smart high-schoolers, while there are other tasks it can't yet do. That said, I tend to think most of these limitations come down to obvious ways models are still hobbled, as I'll discuss in-depth later.
The raw intelligence is (mostly) there, even if the models are still artificially constrained; it'll take extra work to unlock models being able to fully apply that raw intelligence across applications.
(6) Page 19: How did this happen? The magic of deep learning is that it just works - and the trendlines have been astonishingly consistent, despite naysayers at every turn.
(7) Page 21: An additional 2 OOMs of compute (a cluster in the $10s of billions) seems very likely to happen by the end of 2027; even a cluster closer to +3 OOMs of compute ($100 billion+) seems plausible (and is rumored to be in the works at Microsoft/OpenAI).
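For intuition on how those dollar figures line up with the OOMs, here is a hedged sketch. Both numbers it relies on are my assumptions for illustration, not figures from the paper: a ~$500M GPT-4-class cluster as the baseline, and hardware price-performance improving ~1.35x per year.

```python
# Illustrative only: rough cluster cost for +N OOMs of compute.
# Baseline (~$500M GPT-4-class cluster) and price-performance growth
# (~1.35x/year) are assumptions of this sketch, not the paper's figures.
def cluster_cost(base_cost_usd, extra_ooms, years_out, perf_growth=1.35):
    # Cost scales with raw compute, discounted by cheaper FLOPs over time.
    return base_cost_usd * 10**extra_ooms / perf_growth**years_out

for ooms, years in [(2, 3.5), (3, 4)]:
    cost = cluster_cost(5e8, ooms, years)
    print(f"+{ooms} OOMs in ~{years} yr: ~${cost / 1e9:.0f}B")
# -> roughly "$10s of billions" for +2 OOMs and "$100B+" for +3 OOMs,
#    consistent with the figures quoted above
```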
(8) Page 23: In this piece, I'll separate out two kinds of algorithmic progress. Here, I'll start by covering "within-paradigm" algorithmic improvements - those that simply result in b...