Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Analyzing the moral value of unaligned AIs, published by Matthew Barnett on April 8, 2024 on The Effective Altruism Forum.
A crucial consideration in assessing the risks of advanced AI is the moral value we place on "unaligned" AIs - systems that do not share human preferences - which could emerge if we fail to make enough progress on technical alignment.
In this post I'll consider three potential moral perspectives, and analyze what each of them has to say about the normative value of the so-called "default" unaligned AIs that humans might eventually create:
Standard total utilitarianism combined with longtermism: the view that what matters most is making sure the cosmos is eventually filled with numerous happy beings.
Human species preservationism: the view that what matters most is making sure the human species continues to exist into the future, independently from impartial utilitarian imperatives.
Near-termism or present-person affecting views: the view that what matters most is improving the lives of those who currently exist, or will exist in the near future.
I argue that from the first perspective, unaligned AIs don't seem clearly bad in expectation relative to their alternatives, since total utilitarianism is impartial to whether AIs share human preferences or not. A key consideration here is whether unaligned AIs are less likely to be conscious, or less likely to bring about consciousness, compared to alternative aligned AIs. On this question, I argue that there are considerations both ways, and no clear answers.
Therefore, it tentatively appears that the normative value of alignment work is very uncertain, and plausibly neutral, from a total utilitarian perspective.
However, technical alignment work is much more clearly beneficial from the second and third perspectives. This is because AIs that share human preferences are likely to both preserve the human species and improve the lives of those who currently exist. That said, on the third perspective, pausing or slowing down AI is far less valuable than on the second, since it forces existing humans to forgo benefits from advanced AI, which I argue will likely be very large.
I personally find moral perspectives (1) and (3) most compelling, and by contrast find view (2) to be uncompelling as a moral view. Yet it is only from perspective (2) that significantly delaying advanced AI for alignment reasons seems clearly beneficial, in my opinion. This is a big reason why I'm not very sympathetic to pausing or slowing down AI as a policy proposal.
While these perspectives do not exhaust the scope of potential moral views, I think this analysis can help to sharpen what goals we intend to pursue by promoting particular forms of AI safety work.
Unaligned AIs from a total utilitarian point of view
Let's first consider the normative value of unaligned AIs from the first perspective. From a standard total utilitarian perspective, entities matter morally if they are conscious (under hedonistic utilitarianism) or if they have preferences (under preference utilitarianism). From this perspective, it doesn't actually matter much intrinsically if AIs don't share human preferences, so long as they are moral patients and have their preferences satisfied.
The following is a prima facie argument that utilitarians shouldn't care much about technical AI alignment work. Utilitarianism is typically not seen as partial to human preferences in particular. Therefore, efforts to align AI systems with human preferences - the core aim of technical alignment work - may be considered morally neutral from a utilitarian perspective.
The reasoning here is that changing the preferences of AIs to better align them with the preferences of humans doesn't by itself clearly seem to advance the aims of utilitarianism, in the sense of filling the cosmos with happy beings.