Non-Disparagement Canaries for OpenAI, published by aysja on May 30, 2024 on LessWrong.
Since at least 2017, OpenAI has asked departing employees to sign offboarding agreements which legally bind them to permanently - that is, for the rest of their lives - refrain from criticizing OpenAI, or from otherwise taking any actions which might damage its finances or reputation.[1]
If they refused to sign, OpenAI threatened to take back (or make unsellable) all of their already-vested equity - a huge portion of their overall compensation, which often amounted to millions of dollars. Given this immense pressure, it seems likely that most employees signed.
If they did sign, they became personally liable forevermore for any financial or reputational harm they later caused. This liability was unbounded, so had the potential to be financially ruinous - if, say, they later wrote a blog post critical of OpenAI, they might in principle be found liable for damages far in excess of their net worth.
These extreme provisions allowed OpenAI to systematically silence criticism from its former employees, of whom there are now hundreds working throughout the tech industry. And since the agreement also prevented signatories from even disclosing that they had signed it, their silence was easy to misinterpret as evidence that they had no notable criticisms to voice.
We were curious about who may have been silenced in this way, and where they work now, so we assembled an (incomplete) list of former OpenAI employees.[2] From what we were able to find, it appears that over 500 people may have signed these agreements, of whom only 3 have publicly reported being released so far.[3]
We were especially alarmed to notice that the list contains at least 12 former employees currently working on AI policy, and 6 working on safety evaluations.[4] This includes some in leadership positions, for example:
Beth Barnes (Head of Research, METR)
Bilva Chandra (Senior AI Policy Advisor, NIST)
Charlotte Stix (Head of Governance, Apollo Research)
Chris Painter (Head of Policy, METR)
Geoffrey Irving (Research Director, AI Safety Institute)
Jack Clark (Co-Founder [focused on policy and evals], Anthropic)
Jade Leung (CTO, AI Safety Institute)
Paul Christiano (Head of Safety, AI Safety Institute)
Remco Zwetsloot (Executive Director, Horizon Institute for Public Service)
In our view, it seems hard to trust that people could effectively evaluate or regulate AI while under a strict legal obligation not to share critical evaluations of a top AI lab, or to take any other action which might make the company less valuable (as many regulations presumably would). So if any of these people are not subject to these agreements, we encourage them to say so publicly.
It is rare for company offboarding agreements to contain provisions this extreme - especially those which prevent people from even disclosing that the agreement itself exists. But such provisions are relatively common in the American intelligence industry. The NSA periodically forces telecommunications providers to reveal their clients' data, for example, and when they do the providers are typically prohibited from disclosing that this ever happened.
In response, some companies began listing warrant canaries on their websites - sentences stating that they had never yet been forced to reveal any client data. If at some point they did receive such a warrant, they could then remove the canary without violating their legal non-disclosure obligation, thereby allowing the public to gain indirect evidence about this otherwise-invisible surveillance.
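To make the mechanism concrete, here is a minimal sketch (not from the original post) of how an outside observer might monitor a warrant canary. The URL and canary phrase are hypothetical placeholders; the point is only that the observer checks for the continued presence of the statement and treats its removal as a signal.

```python
# Minimal sketch of a warrant-canary monitor.
# Assumptions: the provider publishes a statement like "We have never
# received a warrant" at a known page (both URL and phrase below are
# hypothetical). An observer periodically checks that the statement is
# still present, and treats its disappearance as indirect evidence that
# a warrant was served and the provider is legally barred from saying so.

import urllib.request

CANARY_URL = "https://example.com/canary"          # hypothetical canary page
CANARY_TEXT = "We have never received a warrant"   # hypothetical canary phrase


def canary_present(url: str = CANARY_URL, text: str = CANARY_TEXT) -> bool:
    """Return True if the canary statement is still published at `url`."""
    with urllib.request.urlopen(url, timeout=10) as response:
        page = response.read().decode("utf-8", errors="replace")
    return text in page


if __name__ == "__main__":
    if canary_present():
        print("Canary intact: no warrant has (apparently) been received.")
    else:
        print("Canary missing: the provider may have been served and silenced.")
```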
Until recently, OpenAI succeeded in preventing hundreds of its former employees from ever being able to criticize it, and prevented most others - including many of its current employees! - from...