The Nonlinear Library: EA Forum
Education
EA - Center on Long-Term Risk: Annual review and fundraiser 2023 by Center on Long-Term Risk
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Center on Long-Term Risk: Annual review and fundraiser 2023, published by Center on Long-Term Risk on December 13, 2023 on The Effective Altruism Forum.

Jesse Clifton

Crossposted to LessWrong here.

This is a brief overview of the Center on Long-Term Risk (CLR)'s activities in 2023 and our plans for 2024. We are hoping to fundraise $770,000 to fulfill our target budget in 2024.

About us

CLR works on addressing the worst-case risks from the development and deployment of advanced AI systems in order to reduce s-risks. Our research primarily involves thinking about how to reduce conflict and promote cooperation in interactions involving powerful AI systems. In addition to research, we do a range of activities aimed at building a community of people interested in s-risk reduction, and we support efforts that contribute to s-risk reduction via the CLR Fund.

Review of 2023

Research

Our research in 2023 primarily fell into a few buckets:

Commitment races and safe Pareto improvements deconfusion. Many researchers in the area consider commitment races a potentially important driver of conflict involving AI systems, but we have been missing a precise understanding of the mechanisms by which they could lead to conflict. We believe we made significant progress on this over the last year.
This includes progress on understanding the conditions under which an approach to bargaining called "safe Pareto improvements" (SPIs) can prevent catastrophic conflict. Most of this work is non-public, but public documents that came out of this line of work include Open-minded updatelessness, Responses to apparent rationalist confusions about game / decision theory, and a forthcoming paper (see draft) and post on SPIs for expected utility maximizers.

Paths to implementing surrogate goals. Surrogate goals are a special case of SPIs, and we consider them a promising route to reducing the downsides of conflict. We (along with CLR-external researchers Nathaniel Sauerberg and Caspar Oesterheld) thought about how implementing surrogate goals could be made both credible and counterfactual (i.e., not done by AIs by default), e.g., using compute monitoring schemes. CLR researchers, in collaboration with Caspar Oesterheld and Filip Sondej, are also working on a project to "implement" surrogate goals/SPIs in contemporary language models.

Conflict-prone dispositions. We thought about the kinds of dispositions that could exacerbate conflict, and how they might arise in AI systems. The primary motivation for this line of work is that, even if alignment does not fully succeed, we may be able to shape AI systems' dispositions in coarse-grained ways that reduce the risks of worse-than-extinction outcomes. See our post on making AIs less likely to be spiteful.

Evaluations of LLMs. We continued our earlier work on evaluating cooperation-relevant properties in LLMs. Part of this involved cheap exploratory work with GPT-4 and Claude (e.g., looking at behavior in scenarios from the Machiavelli dataset) to see if there were particularly interesting behaviors worth investing more time in. We also worked with external collaborators to develop "Welfare Diplomacy", a variant of the Diplomacy game environment designed to be better for facilitating Cooperative AI research.
We wrote a paper introducing the benchmark and using it to evaluate several LLMs.

Community building

Progress on s-risk community building was slow, due to the departures of our community-building staff and funding uncertainties that prevented us from immediately hiring another Community Manager.

We continued having career calls;
We ran our fourth Summer Research Fellowship, with 10 fellows;
We have now hired a new Community Manager, Winston Oswald-Drummond, who has just started.

Staff & leadership changes

We saw some substantial staff changes this year, with three staff m...
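As a stylized aside, the surrogate-goal mechanism mentioned in the Research section above can be sketched as a toy threat game. All payoff numbers and the `expected_true_loss` helper below are illustrative assumptions for exposition only, not taken from CLR's work:

```python
# Toy illustration (hypothetical numbers) of the surrogate-goal idea:
# a threatener demands a concession; the target either concedes or
# refuses, and refusal triggers the threat being carried out.

# Costs to the target under its TRUE utility function.
CONCEDE = -10            # cost of giving in to the demand
THREAT_EXECUTED = -100   # cost if a threat against the true goal is executed

# With a surrogate goal, the target commits to treat threats against a
# harmless surrogate exactly as it treats threats against its true goal.
# The threatener's incentives are unchanged (same chance of concession),
# so it targets the surrogate instead -- and an executed threat then
# destroys only the surrogate, not the true goal.
SURROGATE_EXECUTED = -1  # true-utility cost when only the surrogate is hit

def expected_true_loss(p_concede: float, executed_cost: float) -> float:
    """Expected loss to the target, given the chance it concedes."""
    return p_concede * CONCEDE + (1 - p_concede) * executed_cost

p = 0.5  # assume the target concedes half the time in either case
without_surrogate = expected_true_loss(p, THREAT_EXECUTED)   # -55.0
with_surrogate = expected_true_loss(p, SURROGATE_EXECUTED)   # -5.5
```

In this stylized setup the threatener's expected payoff is unchanged while the target's is strictly higher, which is the sense in which a surrogate goal is a (safe) Pareto improvement over the default threat game.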