Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some for-profit AI alignment org ideas, published by Eric Ho on December 14, 2023 on LessWrong.
Summary
This is a brain dump of some for-profit AI alignment organization ideas, along with context for why I believe a for-profit alignment organization can make a big contribution to AI safety. This is far from a complete list, and I welcome ideas and feedback. Also, if anyone wants to or is working on any of these ideas, I'd be happy to support in any way I can!
Context
I'm Eric, formerly co-founder of RippleMatch, an AI recruiting company with ~$80M raised, millions of users, and ~10% of the Fortune 500 as customers. I made the difficult decision to leave RippleMatch this year because I'm concerned about catastrophic risk from AI, and have been spending the last year thinking about ways to help. Given my background, I've been thinking a lot about for-profit ideas to help with alignment - many that can be VC-backed. Some of these ideas speak more directly to reducing catastrophic risk than others, but I think that all can put a founder in a strong position to help in the future.
Why I believe for-profit alignment orgs are valuable
I don't think for-profit approaches are inherently better than building non-profits, pursuing government regulation, or other approaches, but I think that for-profit orgs can make a substantial impact while attracting a different pool of talent eager to work on the problem.
With VC dollars, a for-profit organization can potentially scale far more quickly than a non-profit. It could make a huge impact and not have its growth capped by donor generosity. As a result, there can be far more organizations working on safety in the ecosystem tapping into a different pool of resources. That said, any VC-backed company has a relatively low chance of success, so it's a riskier approach.
Fundamentally, I believe that risk and compliance spend will grow extremely quickly over the coming decade, scaling with generative AI revenue. With comps in finance and cybersecurity, I'd guess that mid to high single digit percentages of overall AI spend will be on risk and compliance, which would suggest big businesses can be built here. Many startups tackling alignment will need to start by addressing short term safety concerns, but in doing so will position themselves to tackle long-term risks over time.
Onto the actual ideas!
Robustness approaches
Testing / benchmarking software
Test case management needs to look very different for LLMs compared to typical software. The idea is to sell companies deploying LLMs a SaaS platform with the ability to generate and manage test cases for their LLMs to make sure they are performing properly and ensure that performance doesn't drift from version to version. This startup would also incorporate a marketplace of common benchmarks that companies can pull off the shelf if relevant to their use case (e.g. common adversarial prompts).
Currently, my impression is that most companies don't use any software to manage their language model test suites, which is a problem given how often an LLM can fail to produce a good result.
Red-teaming as a service
Just as software companies penetration test their software, companies that use LLMs as well as companies who build frontier models will need to red-team their models with a wide variety of adversarial prompts. This would mostly test models for how they handle misuse and make them more robust against jailbreaking.
Just as a proper penetration test employs both manual and automated penetration testing, this startup would require building / fine-tuning the best automated red-teaming LLM that likely draws on multiple frontier models, as well as employ the best manual red-teamers in the space. Enterprises would likely pay a subscription depending on their usage, which would likely be spiky.
The...
view more