Download - Developmental Stages of GPTs by orthonormal | Podbean

The Nonlinear Library: LessWrong Top Posts

Education

Developmental Stages of GPTs by orthonormal

2021-12-11


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Developmental Stages of GPTs, published by orthonormal on the AI Alignment Forum.
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
Epistemic Status: I only know as much as anyone else in my reference class (I build ML models, I can grok the GPT papers, and I don't work for OpenAI or a similar lab). But I think my thesis is original.
Related: Gwern on GPT-3
For the last several years, I've gone around saying that I'm worried about transformative AI, meaning an AI capable of making an Industrial Revolution-sized impact (the concept is agnostic on whether it has to be AGI or self-improving), because I think we might be one or two cognitive breakthroughs away from building one.
GPT-3 has made me move up my timelines, because it makes me think we might need zero more cognitive breakthroughs, just more refinement / efficiency / computing power: basically, GPT-6 or GPT-7 might do it. My reason for thinking this is comparing GPT-3 to GPT-2, and reflecting on what the differences say about the "missing pieces" for transformative AI.
My Thesis:
The difference between GPT-2 and GPT-3 has made me suspect that there's a legitimate comparison to be made between the scale of a network architecture like the GPTs, and some analogue of "developmental stages" of the resulting network. Furthermore, it's plausible to me that the functions needed to be a transformative AI are covered by a moderate number of such developmental stages, without requiring additional structure. Thus GPT-N would be a transformative AI, for some not-too-large N, and we need to redouble our efforts on ways to align such AIs.
The thesis doesn't strongly imply that we'll reach transformative AI via GPT-N especially soon; I have wide uncertainty, even given the thesis, about how large we should expect N to be, and whether the scaling of training and of computation slows down progress before then. But it's also plausible to me now that the timeline is only a few years, and that no fundamentally different approach will succeed before then. And that scares me.
Architecture and Scaling
GPT, GPT-2, and GPT-3 use nearly the same architecture; each paper says as much, with a sentence or two about minor improvements to the individual transformers. Model size (and the amount of training computation) is really the only difference.
GPT took 1 petaflop/s-day to train 117M parameters, GPT-2 took 10 petaflop/s-days to train 1.5B parameters, and the largest version of GPT-3 took 3,000 petaflop/s-days to train 175B parameters. By contrast, AlphaStar seems to have taken about 30,000 petaflop/s-days of training in mid-2019; extrapolating the pace at which the computing power used in AI research has grown, a project today could plausibly use about 10x that. The upshot is that OpenAI may not be able to afford it, but if Google really wanted to make GPT-4 this year, they could afford to do so.
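To make the jumps between generations easier to see, here is a minimal sketch that computes the growth factors implied by the figures quoted in the paragraph above. The names `MODELS`, `ALPHASTAR_COMPUTE`, and `growth_factors` are made up for illustration, and the numbers are only the rough estimates given in the text, not independent measurements.

```python
# Rough figures quoted in the text:
# (training compute in petaflop/s-days, parameter count).
MODELS = {
    "GPT": (1, 117e6),
    "GPT-2": (10, 1.5e9),
    "GPT-3": (3000, 175e9),
}

# AlphaStar's estimated training compute (petaflop/s-days), as quoted in the text.
ALPHASTAR_COMPUTE = 30_000


def growth_factors(models):
    """Return (previous, current, compute ratio, parameter ratio) per generation."""
    names = list(models)
    rows = []
    for prev, cur in zip(names, names[1:]):
        prev_compute, prev_params = models[prev]
        cur_compute, cur_params = models[cur]
        rows.append((prev, cur, cur_compute / prev_compute, cur_params / prev_params))
    return rows


if __name__ == "__main__":
    for prev, cur, compute_x, params_x in growth_factors(MODELS):
        print(f"{prev} -> {cur}: compute x{compute_x:.0f}, parameters x{params_x:.1f}")
    # How AlphaStar's estimated compute compares with the largest GPT-3 run.
    ratio = ALPHASTAR_COMPUTE / MODELS["GPT-3"][0]
    print(f"AlphaStar vs GPT-3 compute: x{ratio:.0f}")
```

On these numbers, the step from GPT-2 to GPT-3 is about 300x in compute and a bit over 100x in parameters, while AlphaStar's estimated training run was about 10x larger than GPT-3's.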
Analogues to Developmental Stages
There are all sorts of (more or less well-defined) developmental stages for human beings: image tracking, object permanence, vocabulary and grammar, theory of mind, size and volume, emotional awareness, executive functioning, et cetera.
I was first reminded of developmental stages a few years ago, when I saw the layers of abstraction generated in this feature visualization tool for GoogLeNet.
We don't have feature visualization for language models, but we do have generative outputs. And as you scale up an architecture like GPT, you see higher levels of abstraction. Grammar gets mastered, then content (removing absurd but grammatical responses), then tone (first rough genre, then spookily accurate authorial voice). Topic coherence is mastered first on the phrase level, then the sentence level, then the paragraph level. So too with narrative flow.
Gwern's poetry experiments (GPT-2, GPT-3) are good examples. GPT-2 could more ...
