The Nonlinear Library: EA Forum Podcast - EA - Clarifying two uses of "alignment" by Matthew Barnett


The Nonlinear Library: EA Forum

Education

EA - Clarifying two uses of "alignment" by Matthew Barnett

2024-03-10


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Clarifying two uses of "alignment", published by Matthew Barnett on March 10, 2024 on The Effective Altruism Forum.

Paul Christiano once clarified AI alignment as follows:

"When I say an AI A is aligned with an operator H, I mean: A is trying to do what H wants it to do."

This definition is clear enough for many purposes, but it leads to confusion when one wants to make a point about two different types of alignment:

1. A is trying to do what H wants it to do because A is trading or cooperating with H on a mutually beneficial outcome for both of them. For example, H could hire A to perform a task, and offer a wage as compensation.

2. A is trying to do what H wants it to do because A has the same values as H - i.e. its "utility function" overlaps with H's utility function - and thus A intrinsically wants to pursue what H wants it to do.

These cases are important to distinguish because they have dramatically different consequences for the difficulty and scope of alignment.

To solve alignment in the sense of (1), A and H don't necessarily need to share the same values in any strong sense. Instead, the essential prerequisite seems to be for A and H to operate in an environment in which it's mutually beneficial for them to enter contracts, trade, or cooperate in some respect.

For example, one can imagine a human hiring a paperclip maximizer AI to perform work, paying it a wage. In return, the paperclip maximizer could use its wages to buy more paperclips. In this example, the AI performs its duties satisfactorily, without any major negative side effects resulting from the two parties' differing values, and both parties are made better off as a result.

By contrast, alignment in the sense of (2) seems far more challenging to solve. In the most challenging case, this form of alignment would require solving extremal Goodhart, in the sense that A's utility function would need to be almost perfectly matched with H's utility function. Here, the idea is that even slight differences in values yield very large differences when subject to extreme optimization pressure.

Because it is presumably easy to make slight mistakes when engineering AI systems, by assumption, these mistakes could translate into catastrophic losses of value.

Effect on alignment difficulty

My impression is that people's opinions about AI alignment difficulty often come down to differences in how much they think we need to solve the second problem relative to the first problem, in order to get AI systems that generate net-positive value for humans.

If you're inclined towards thinking that trade and compromise are either impossible or inefficient between agents at greatly different levels of intelligence, then you might think that we need to solve the second problem with AI, since "trading with the AIs" won't be an option. My understanding is that this is Eliezer Yudkowsky's view, and the view of most others who are relatively doomy about AI.

In this frame, a common thought is that AIs would have no need to trade with humans, as humans would be like ants to them.

On the other hand, you could be inclined - as I am - towards thinking that agents at greatly different levels of intelligence can still find positive-sum compromises when they are socially integrated with each other, operating under a system of law, and capable of making mutual agreements. In this case, you might be a lot more optimistic about the prospects of alignment.

To sketch one plausible scenario here, if AIs can own property and earn income by selling their labor on an open market, then they can simply work a job and use their income to purchase whatever it is they want, without any need to violently "take over the world" to satisfy their goals. At the same time, humans could retain power in this system through capital ownership and other gran...

More Episodes

EA - AI things that are perhaps as important as human-controlled AI (Chi version) by Chi

2024-03-03

EA - How to Speedrun a New Drug Application (Interview with Alvea's former CEO) by Aaron Gertler

2024-03-03

EA - Running 200 miles for New Incentives by Emma Cameron

2024-03-02

EA - Review of EA Global Bay Area 2024 (Global Catastrophic Risks) by frances lorenz

2024-03-01

EA - Forum feature updates: add buttons, see your stats, send DMs easier, and more (March '24) by tobytrem

2024-03-01

EA - Creative video ads significantly increase GWWC's social media engagement and web traffic to pledge page by James Odene [User-Friendly]

2024-03-01

EA - What are the biggest misconceptions about biosecurity and pandemic risk? by 80000 Hours

2024-02-29

EA - Wholesomeness and Effective Altruism by Owen Cotton-Barratt

2024-02-29

EA - Evidential Cooperation in Large Worlds: Potential Objections & FAQ by Chi

2024-02-28

EA - This is why we can't have nice laws by LewisBollard

2024-02-28

EA - Results of my 2024 r/Vegan Survey on what influences people to go Vegan. by PreciousPig

2024-02-28

EA - Announcing Draft Amnesty Week (March 11-17) by tobytrem

2024-02-27

EA - What posts would you like someone to write? by tobytrem

2024-02-27

EA - Meta Charity Funders: Launching the 2nd round by Vilhelm Skoglund

2024-02-27

EA - Nuclear war tail risk has been exaggerated? by Vasco Grilo

2024-02-26

EA - How much parenting harms productivity and how you can reduce it by Nicholas Kruus

2024-02-26

EA - How we started our own EA charity (and why we decided to wrap up) by KvPelt

2024-02-26

EA - Bloomberg: Unacknowledged problems with LLINs are causing a rise in malaria. by Ian Turner

2024-02-25

EA - Cooperating with aliens and (distant) AGIs: An ECL explainer by Chi

2024-02-25

EA - My favorite articles by Brian Tomasik and what they are about by Timothy Chan

2024-02-25
