The Nonlinear Library: EA Forum Top Posts

Education

My personal cruxes for working on AI safety by Buck

2021-12-12

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My personal cruxes for working on AI safety, published by Buck on the Effective Altruism Forum.

The following is a heavily edited transcript of a talk I gave for the Stanford Effective Altruism club on 19 Jan 2020. I had rev.com transcribe it, and then Linchuan Zhang, Rob Bensinger and I edited it for style and clarity, and also to occasionally have me say smarter things than I actually said. Linch and I both added a few notes throughout. Thanks also to Bill Zito, Ben Weinstein-Raun, and Howie Lempel for comments.

I feel slightly weird about posting something so long, but this is the natural place to put it.

Over the last year my beliefs about AI risk have shifted moderately; I expect that in a year I'll think that many of the things I said here were dumb. Also, very few of the ideas here are original to me.

After all those caveats, here's the talk:

Introduction

It's great to be here. I used to hang out at Stanford a lot, fun fact. I moved to America six years ago, and then in 2015, I came to Stanford EA every Sunday, and there was, obviously, a totally different crop of people there. It was really fun. I think we were a lot less successful than the current Stanford EA iteration at attracting new people. We just liked having weird conversations about weird stuff every week. It was really fun, but it's really great to come back and see a Stanford EA which is shaped differently.

Today I'm going to be talking about the argument for working on AI safety that compels me to work on AI safety, rather than the argument that should compel you or anyone else. I'm going to try to spell out how the arguments are actually shaped in my head.

Logistically, we're going to try to talk for about an hour with a bunch of back and forth and you guys arguing with me as we go. And at the end, I'm going to do miscellaneous Q&A for questions you might have. And I'll probably make everyone stand up and sit down again because it's unreasonable to sit in the same place for 90 minutes.

Meta level thoughts

I want to first very briefly talk about some concepts I have that are about how you want to think about questions like AI risk, before we actually talk about AI risk.

Heuristic arguments

When I was a confused 15-year-old browsing the internet around 10 years ago, I ran across arguments about AI risk, and I thought they were pretty compelling. The arguments went something like, "Well, sure seems like if you had these powerful AI systems, that would make the world be really different. And we don't know how to align them, and it sure seems like almost all goals they could have would lead them to kill everyone, so I guess some people should probably research how to align these things." This argument was about as sophisticated as my understanding went until a few years ago, when I was pretty involved with the AI safety community.

I in fact think this kind of argument leaves a lot of questions unanswered. It's not the kind of argument that is solid enough that you'd want to use it for mechanical engineering and then build a car. It's suggestive and heuristic, but it's not trying to cross all the T's and dot all the I's. And it's not even telling you all the places where there's a hole in that argument.

Ways heuristic arguments are insufficient

The thing which I think is good to do sometimes is, instead of just thinking really loosely and heuristically, you should try to have end-to-end stories of what you believe about a particular topic. And then if there are parts that you don't have answers to, you should write them down explicitly with question marks. I guess I'm basically arguing to do that instead of just saying, "Oh, well, an AI would be dangerous here."
And if there's all these other steps as well, then you should write them down, even if you're just going to have your just...