Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MATS AI Safety Strategy Curriculum, published by Ryan Kidd on March 8, 2024 on LessWrong.
As part of the MATS Winter 2023-24 Program, scholars were invited to take part in a series of weekly discussion groups on AI safety strategy. Each strategy discussion focused on a specific crux we deemed relevant to prioritizing AI safety interventions and was accompanied by a reading list and suggested discussion questions. The discussion groups were facilitated by several MATS alumni and other AI safety community members and generally ran for 1-1.5 hours.
As assessed by our alumni reviewers, scholars in our Summer 2023 Program were much better at writing concrete plans for their research than they were at explaining their research's theory of change. We think it is generally important for researchers, even those early in their careers, to critically evaluate the impact of their work, in order to:
Choose high-impact research directions and career pathways;
Conduct adequate risk analyses to mitigate unnecessary safety hazards and avoid research with a poor safety-capabilities advancement ratio;
Discover blind spots and biases in their research strategy.
We expect that most improvement in the above areas occurs through repeated practice, ideally with high-quality feedback from a mentor or research peers. However, we also think that engaging with some core literature and discussing it with peers is beneficial. This is our attempt to create a list of core literature on AI safety strategy appropriate for the average MATS scholar, who should have completed the AISF Alignment Course.
We are not confident that the reading lists and discussion questions below are the best possible version of this project, but we thought they were worth publishing anyway. MATS welcomes feedback and suggestions for improvement.
Week 1: How will AGI arise?
What is AGI?
Karnofsky - Forecasting Transformative AI, Part 1: What Kind of AI? (13 min)
Metaculus - When will the first general AI system be devised, tested, and publicly announced? (read Resolution Criteria) (5 min)
How large will models need to be and when will they be that large?
Alexander - Biological Anchors: The Trick that Might or Might Not Work (read Parts I-II) (27 min)
Optional: Davidson - What a compute-centric framework says about AI takeoff speeds (20 min)
Optional: Habryka et al. - AI Timelines (dialogue between Ajeya Cotra, Daniel Kokotajlo, and Ege Erdil) (61 min)
Optional: Halperin, Chow, Mazlish - AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years (31 min)
How far can current architectures scale?
Patel - Will Scaling Work? (16 min)
Epoch - AI Trends (5 min)
Optional: Nostalgebraist - Chinchilla's Wild Implications (13 min)
Optional: Porby - Why I think strong general AI is coming soon (40 min)
What observations might make us update?
Ngo - Clarifying and predicting AGI (5 min)
Optional: Berglund et al. - Taken out of context: On measuring situational awareness in LLMs (33 min)
Optional: Cremer, Whittlestone - Artificial Canaries: Early Warning Signs for Anticipatory and Democratic Governance of AI (34 min)
Suggested discussion questions
If you look at any of the outside view models linked in "Biological Anchors: The Trick that Might or Might Not Work" (e.g., Ajeya Cotra's and Tom Davidson's models), which of their quantitative estimates do you agree or disagree with? Do your disagreements make your timelines longer or shorter?
Do you disagree with the models used to forecast AGI? That is, rather than disagree with their estimates of particular variables, do you disagree with any more fundamental assumptions of the model? How does that change your timelines, if at all?
If you had to make a probabilistic model to forecast AGI, what quantitative variables would you use and what fundamental assumptions would ...