Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: 2020 AI Alignment Literature Review and Charity Comparison, published by Larks on the AI Alignment Forum.
Cross-posted to the EA forum here.
Introduction
As in 2016, 2017, 2018, and 2019, I have attempted to review the research that has been produced by various organisations working on AI safety, to help potential donors gain a better understanding of the landscape. This is a similar role to that which GiveWell performs for global health charities, and somewhat similar to a securities analyst with regards to possible investments.
My aim is basically to judge the output of each organisation in 2020 and compare it to their budget. This should give a sense of the organisations' average cost-effectiveness. We can also compare their financial reserves to their 2020 budgets to get a sense of urgency.
I’d like to apologize in advance to everyone doing useful AI Safety work whose contributions I have overlooked or misconstrued. As ever I am painfully aware of the various corners I have had to cut due to time constraints from my job, as well as being distracted by 1) other projects, 2) the miracle of life and 3) computer games.
This article focuses on AI risk work. If you think other causes are important too, your priorities might differ. This particularly affects GCRI, FHI and CSER, who all do a lot of work on other issues, which I attempt to cover but only very cursorily.
How to read this document
This document is fairly extensive, and some parts (particularly the methodology section) are largely the same as last year, so I don’t recommend reading from start to finish. Instead, I recommend navigating to the sections of most interest to you.
If you are interested in a specific research organisation, you can use the table of contents to navigate to the appropriate section. You might then also want to Ctrl+F for the organisation acronym in case they are mentioned elsewhere as well. Papers listed as ‘X researchers contributed to the following research led by other organisations’ are included in the section corresponding to their first author, and you can Ctrl+F to find them.
If you are interested in a specific topic, I have added a tag to each paper, so you can Ctrl+F for a tag to find associated work. The tags were chosen somewhat informally so you might want to search more than one, especially as a piece might seem to fit in multiple categories.
Here are the un-scientifically-chosen hashtags:
AgentFoundations
Amplification
Capabilities
Corrigibility
DecisionTheory
Ethics
Forecasting
GPT-3
IRL
Misc
NearAI
OtherXrisk
Overview
Politics
RL
Strategy
Textbook
Transparency
ValueLearning
New to Artificial Intelligence as an existential risk?
If you are new to the idea of General Artificial Intelligence as presenting a major risk to the survival of human value, I recommend this Vox piece by Kelsey Piper, or, for a more technical version, this piece by Richard Ngo.
If you are already convinced and are interested in contributing technically, I recommend this piece by Jacob Steinhardt, which, unlike this document, covers pre-2019 research and organises by topic rather than organisation; alternatively, see this from Critch & Krueger, or this from Everitt et al., though the latter is a few years old now.
Research Organisations
FHI: The Future of Humanity Institute
FHI is an Oxford-based Existential Risk Research organisation founded in 2005 by Nick Bostrom. They are affiliated with Oxford University. They cover a wide variety of existential risks, including artificial intelligence, and do political outreach. Their research can be found here.
Their research is more varied than MIRI's, including strategic work, work directly addressing the value-learning problem, and corrigibility work - as well as work on other Xrisks.
They run a Research Scholars Program, where people can join them to do research at FHI. There is...