Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Scale Was All We Needed, At First, published by Gabriel Mukobi on December 18, 2023 on LessWrong.
This is a hasty speculative fiction vignette of one way I expect we might get AGI by January 2025 (within about one year of writing this). Like similar works by others, I expect most of the guesses herein to turn out incorrect. However, this was still useful for expanding my imagination about what could happen to enable very short timelines, and I hope it's also useful to you.
The assistant opened the door, and I walked into Director Yarden's austere office. For the Director of a major new federal institute, her working space was surprisingly devoid of possessions. But I suppose the DHS's Superintelligence Defense Institute was only created last week.
"You're Doctor Browning?" Yarden asked from her desk.
"Yes, Director," I replied.
"Take a seat," she said, gesturing. I complied as the lights flickered ominously. "Happy New Year, thanks for coming," she said. "I called you in today to brief me on how the hell we got here, and to help me figure out what we should do next."
"Happy New Year. Have you read my team's Report?" I questioned.
"Yes," she said, "and I found all 118 pages absolutely riveting. But I want to hear it from you straight, all together."
"Well, okay," I said. The Report was all I'd been thinking about lately, but it was quite a lot to go over all at once. "Where should I start?"
"Start at the beginning, last year in June, when this all started to get weird."
"All right, Director," I began, recalling the events of the past year. "June 2024 was when it really started to sink in, but the actual changes began a year ago in January. And the groundwork for all that had been paved for a few years before then. You see, with generative AI systems, which are a type of AI that - "
"Spare the lay explanations, doctor," Yarden interrupted. "I have a PhD in machine learning from MIT."
"Right. Anyway, it turned out that transformers were even more compute-efficient architectures than we originally thought they were. They were nearly the perfect model for representing and manipulating information; it's just that we didn't have the right learning algorithms yet. Last January, that changed when QStar-2 began to work. Causal language model pretraining was already plenty successful for imbuing a lot of general world knowledge in models, a lot of raw cognitive power.
"RLHF started to steer language models, no?"
"Yes, RLHF partially helped, and the GPT-4-era models were decent at following instructions and not saying naughty words and all that. But there's a big difference between increasing the likelihood of noisy human preference signals and actually being a high-performing, goal-optimizing agent. QStar-2 was the first big difference."
"What was the big insight, in your opinion?" asked Yarden.
"We think it was Noam Brown's team at OpenAI that first made it, but soon after, a convergent similar discovery was made at Google DeepMind."
"MuTokenZero?"
"MuTokenZero. The crux of both of these algorithms was finding a way to efficiently fine-tune language models on arbitrary online POMDP environments using a variant of Monte-Carlo Tree Search. They took slightly different approaches to handle the branch pruning problem - it doesn't especially matter now.
"What kinds of tasks did they first try it on?"
"For OpenAI from February through March, it was mostly boring product things: Marketing agents that could drive 40% higher click-through rates. Personal assistants that helped plan the perfect day. Stock traders better than any of the quant firms. "Laundry Buddy" kinds of things. DeepMind had some of this too, but they were the first to actively deploy a goal-optimizing language model for the task of science. They got some initial wins in genomic sequencing with AlphaFold 3, other simp...