Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: My personal cruxes for working on AI safety, published by Buck on the Effective Altruism Forum.
The following is a heavily edited transcript of a talk I gave for the Stanford Effective Altruism club on 19 Jan 2020. I had rev.com transcribe it, and then Linchuan Zhang, Rob Bensinger and I edited it for style and clarity, and also to occasionally have me say smarter things than I actually said. Linch and I both added a few notes throughout. Thanks also to Bill Zito, Ben Weinstein-Raun, and Howie Lempel for comments.
I feel slightly weird about posting something so long, but this is the natural place to put it.
Over the last year my beliefs about AI risk have shifted moderately; I expect that in a year I'll think that many of the things I said here were dumb. Also, very few of the ideas here are original to me.
After all those caveats, here's the talk:
Introduction
It's great to be here. I used to hang out at Stanford a lot, fun fact. I moved to America six years ago, and then in 2015, I came to Stanford EA every Sunday, and there was, obviously, a totally different crop of people there. It was really fun. I think we were a lot less successful than the current Stanford EA iteration at attracting new people. We just liked having weird conversations about weird stuff every week. It was really fun, but it's really great to come back and see a Stanford EA which is shaped differently.
Today I'm going to be talking about the argument for working on AI safety that compels me to work on AI safety, rather than the argument that should compel you or anyone else. I'm going to try to spell out how the arguments are actually shaped in my head. Logistically, we're going to try to talk for about an hour with a bunch of back and forth and you guys arguing with me as we go. And at the end, I'm going to do miscellaneous Q and A for questions you might have.
And I'll probably make everyone stand up and sit down again because it's unreasonable to sit in the same place for 90 minutes.
Meta level thoughts
I want to first very briefly talk about some concepts I have for how to think about questions like AI risk, before we actually talk about AI risk.
Heuristic arguments
When I was a confused 15-year-old browsing the internet around 10 years ago, I ran across arguments about AI risk, and I thought they were pretty compelling. The arguments went something like, "Well, sure seems like if you had these powerful AI systems, that would make the world be really different. And we don't know how to align them, and it sure seems like almost all goals they could have would lead them to kill everyone, so I guess some people should probably research how to align these things." This argument was about as sophisticated as my understanding went until a few years ago, when I was pretty involved with the AI safety community.
I in fact think this kind of argument leaves a lot of questions unanswered. It's not the kind of argument that is solid enough that you'd want to use it for mechanical engineering and then build a car. It's suggestive and heuristic, but it's not trying to cross all the T's and dot all the I's. And it's not even telling you all the places where there's a hole in that argument.
Ways heuristic arguments are insufficient
The thing which I think is good to do sometimes is, instead of just thinking really loosely and heuristically, to try to have end-to-end stories of what you believe about a particular topic. And then if there are parts that you don't have answers to, you should write them down explicitly with question marks. I guess I'm basically arguing to do that instead of just saying, "Oh, well, an AI would be dangerous here." And if there are all these other steps as well, then you should write them down, even if you're just going to have your just...