Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: LLMs seem (relatively) safe, published by JustisMills on April 26, 2024 on LessWrong.
Post for a somewhat more general audience than the modal LessWrong reader, but gets at my actual thoughts on the topic.
In 2018 OpenAI defeated the world champions of Dota 2, a major esports game. This was hot on the heels of DeepMind's AlphaGo performance against Lee Sedol in 2016, achieving superhuman Go performance way before anyone thought that might happen. AI benchmarks were being cleared at a pace which felt breathtaking at the time, papers were proudly published, and ML tools like Tensorflow (released in 2015) were coming online. To people already interested in AI, it was an exciting era. To everyone else, the world was unchanged.
Now Saturday Night Live sketches use sober discussions of AI risk as the backdrop for their actual jokes, there are hundreds of AI bills moving through the world's legislatures, and Eliezer Yudkowsky is featured in Time Magazine.
For people who have been predicting, since well before AI was cool (and now passe), that it could spell doom for humanity, this explosion of mainstream attention is a dark portent. Billion dollar AI companies keep springing up and allying with the largest tech companies in the world, and bottlenecks like money, energy, and talent are widening considerably. If current approaches can get us to superhuman AI in principle, it seems like they will in practice, and soon.
But what if large language models, the vanguard of the AI movement, are actually safer than what came before? What if the path we're on is less perilous than what we might have hoped for, back in 2017? It seems that way to me.
LLMs are self-limiting
To train a large language model, you need an absolutely massive amount of data. The core thing these models are doing is predicting the next few letters of text, over and over again, and they need to be trained on billions and billions of words of human-generated text to get good at it.
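To make that concrete, here's a deliberately tiny sketch of the next-token-prediction objective in Python. Real LLMs use giant neural networks rather than a counting table, and the corpus string here is invented for the example; it's only meant to show the shape of the task: given the text so far, guess what comes next.

```python
# Toy illustration of the core LLM training objective: predict what comes
# next given the text so far. Production models learn this with neural
# networks over billions of words; this sketch just counts character
# bigrams in a made-up string.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat. the cat sat on the hat."

# Count how often each character follows each preceding character.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(prev_char: str) -> str:
    """Return the character most often seen after prev_char."""
    counts = follow_counts.get(prev_char)
    return counts.most_common(1)[0][0] if counts else " "

print(predict_next("t"))  # prints 'h' for this toy corpus
```

The crucial point for the argument is the input: every single training example comes from text that some human already wrote.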
Compare this process to AlphaZero, DeepMind's algorithm that superhumanly masters Chess, Go, and Shogi. AlphaZero trains by playing against itself. While older chess engines bootstrap themselves by observing the records of countless human games, AlphaZero simply learns by doing. Which means that the only bottleneck for training it is computation - given enough energy, it can just play itself forever, and keep getting new data.
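As a rough illustration of why self-play sidesteps the data bottleneck, here's a toy Python loop that manufactures its own training records. The "game" is invented for the example and has nothing to do with AlphaZero's actual training setup; the point is just that the data supply scales with compute, not with how much anyone else has played.

```python
# Sketch of why self-play removes the data bottleneck: the training data
# is generated by the system itself, so more compute means more data.
# The game here is a trivial stand-in (first player to 3 points wins),
# purely illustrative and not AlphaZero's algorithm.
import random

def self_play_game():
    """Play one random game and return its (moves, winner) record."""
    scores = [0, 0]
    moves = []
    player = 0
    while max(scores) < 3:
        scored = random.random() < 0.5
        scores[player] += int(scored)
        moves.append((player, scored))
        player = 1 - player
    return moves, scores.index(max(scores))

# As long as compute is available, fresh training examples keep coming.
dataset = [self_play_game() for _ in range(10_000)]
print(len(dataset), "self-generated game records")
```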
Not so with LLMs: their source of data is human-produced text, and human-produced text is a finite resource.
The precise datasets used to train cutting-edge LLMs are secret, but let's suppose that they include a fair bit of the low-hanging fruit: maybe 5% of the text that is in principle available and not garbage. You can schlep your way to a 20x bigger dataset in that case, though you'll hit diminishing returns as you have to, for example, generate transcripts of random videos and filter old mailing list threads for metadata and spam.
But nothing you do is going to get you 1,000x the training data, at least not in the short run.
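A quick back-of-envelope check, using the hypothetical 5% figure from above (the post's assumption, not a measured number):

```python
# If current datasets already use ~5% of the usable human-written text,
# the ceiling on growth from collecting more of it is about 1 / 0.05 = 20x,
# nowhere near 1,000x. The 5% figure is a hypothetical from the post.
current_fraction_used = 0.05
max_multiplier = 1.0 / current_fraction_used
print(f"Best case from more scraping: ~{max_multiplier:.0f}x more data")
```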
Scaling laws are among the watershed discoveries of ML research in the last decade; basically, these are equations that project how much oomph you get out of increasing the size, training time, and dataset that go into a model. And as it turns out, the amount of high quality data is extremely important, and often becomes the bottleneck.
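One concrete example is the parametric loss fit reported in DeepMind's Chinchilla paper (Hoffmann et al., 2022). The sketch below uses the paper's approximate fitted constants, quoted from memory and rounded; treat it as an illustration of how the data term comes to dominate, not a definitive reproduction.

```python
# Hedged sketch of one well-known scaling law: the Chinchilla parametric
# loss fit, roughly L(N, D) = E + A / N**alpha + B / D**beta, where N is
# parameter count and D is training tokens. Constants are the paper's
# reported fits, quoted approximately.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted training loss for a model of n_params trained on n_tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Grow the model 100x while the dataset stays fixed at 1e12 tokens:
# the gains shrink fast, because the data term becomes the floor.
for n in (1e9, 1e10, 1e11):
    print(f"{n:.0e} params, 1e12 tokens -> loss ~{chinchilla_loss(n, 1e12):.2f}")
```

With the dataset held fixed, no amount of extra model size pushes the loss below the contribution of the data term, which is the sense in which data becomes the bottleneck.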
It's easy to take this fact for granted now, but it wasn't always obvious! If computational power or model size was usually the bottleneck, we could just make bigger and bigger computers and reliably get smarter and smarter AIs. But that only works to a point, because it turns out we need high quality data too, and high quality data is finite (and, as the political apparatus wakes up to what's going on, legally fraught).
There are rumbling...