Why I take short timelines seriously, published by NicholasKees on January 29, 2024 on LessWrong.
I originally started writing this as a message to a friend, to offer my personal timeline takes. It ended up getting kind of long, so I decided to pivot toward making this into a post.
These are my personal impressions gathered while doing a bachelor's and a master's degree in artificial intelligence, as well as working for about a year and a half in the alignment space.
AI (and AI alignment) has been the center of my attention for a little over 8 years now. For most of that time, if you asked me about timelines I'd gesture at an FHI survey that suggested a median timeline of 2045-2050, and say "good chance it happens in my lifetime." When I thought about my future in AI safety, I imagined that I'd do a PhD, become a serious academic, and by the time we were getting close to general intelligence I would already have a long tenure of working in AI (and be well placed to help).
I also imagined that building AI would involve developing a real "science of intelligence," and I saw the work that people at my university (University of Groningen) were doing as pursuing this great project. People there were working on a wide range of machine learning methods (of which neural networks were just one idea), logic, knowledge systems, theory of mind, psychology, robotics, linguistics, social choice, argumentation theory, etc. I heard very often that "neural networks are not magic," and was encouraged to embrace an interdisciplinary approach to understanding how intelligence worked (which I did).
At the time, there was one big event that caused a lot of controversy: the success of AlphaGo (2016). To a lot of people, including myself, this seemed like "artificial intuition." People had not been very impressed with the success of DeepBlue in chess, because that was "just brute force" and would obviously not scale; real intelligence was about doing more than brute force. AlphaGo was clearly very different, though everyone disagreed on what the implications were. Many of my professors bet really hard against deep learning continuing to succeed, but over and over again they were proven wrong. In particular, I remember OpenAI Five (2017/2018) as being an extremely big deal in my circles, and people were starting to look at OpenAI as potentially changing everything.
There was this other idea I had embraced, something adjacent to Moravec's paradox: AI would be good at the things humans are bad at, and vice versa. It would first learn to do a range of specialized tasks (which would be individually very impressive), gradually move toward more human-like systems, and the very last thing it would learn to do would be to master human language. This particular idea about language has been around since the Turing test: mastering language would require general, human-level intelligence. If you had told me there would be powerful language models in less than a decade, I would have been quite skeptical.
When GPT happened, it dramatically changed my future plans. GPT-2 and especially GPT-3 were both extremely unnerving to me (though mostly exciting to all my peers). This was, in my view:
- "mastering language," which was not supposed to happen until we were very close to human level.
- demonstrating general abilities. I can't overstate how big of a deal this was. GPT-2 could correctly use newly invented words, do some basic math, and a wide range of unusual things that we now call in-context learning. There was nothing even remotely close to this anywhere else in AI, and people around me struggled to understand how this was even possible.
- a result of scaling. When GPT-3 came out, this was especially scary, because they hadn't really done anything to improve upon the design of GPT-2; they just made it bigger...