Download - Why Subagents? by johnswentworth

The Nonlinear Library: LessWrong Top Posts

Education

Why Subagents? by johnswentworth

2021-12-11


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why Subagents?, published by johnswentworth on the AI Alignment Forum.

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The justification for modelling real-world systems as “agents” - i.e. choosing actions to maximize some utility function - usually rests on various coherence theorems. They say things like “either the system’s behavior maximizes some utility function, or it is throwing away resources” or “either the system’s behavior maximizes some utility function, or it can be exploited” or things like that. Different theorems use slightly different assumptions and prove slightly different things, e.g. deterministic vs probabilistic utility function, unique vs non-unique utility function, whether the agent can ignore a possible action, etc.

One theme in these theorems is how they handle “incomplete preferences”: situations where an agent does not prefer one world-state over another. For instance, imagine an agent which prefers pepperoni over mushroom pizza when it has pepperoni, but mushroom over pepperoni when it has mushroom; it’s simply never willing to trade in either direction. There’s nothing inherently “wrong” with this; the agent is not necessarily executing a dominated strategy, cannot necessarily be exploited, or any of the other bad things we associate with inconsistent preferences. But the preferences can’t be described by a utility function over pizza toppings.

In this post, we’ll see that these kinds of preferences are very naturally described using subagents. In particular, when preferences are allowed to be path-dependent, subagents are important for representing consistent preferences. This gives a theoretical grounding for multi-agent models of human cognition.

Preference Representation and Weak Utility

Let’s expand our pizza example.
We’ll consider an agent who:

- Prefers pepperoni, mushroom, or both over plain cheese pizza
- Prefers both over pepperoni or mushroom alone
- Does not have a stable preference between mushroom and pepperoni - they prefer whichever they currently have

We can represent this using a directed graph. The arrows show preference: our agent prefers B over A if (and only if) there is a directed path from A to B along the arrows. There is no path from pepperoni to mushroom or from mushroom to pepperoni, so the agent has no preference between them.

In this case, we’re interpreting “no preference” as “agent prefers to keep whatever they have already”. Note that this is NOT the same as “the agent is indifferent”, in which case the agent is willing to switch back and forth between the two options as long as the switch doesn’t cost anything.

Key point: there is no cycle in this graph. If the agent’s preferences are cyclic, that’s when they provably throw away resources, paying to go in circles. As long as the preferences are acyclic, we call them “consistent”.

Now, at this point we can still define a “weak” utility function by ignoring the “missing” preference between pepperoni and mushroom. Here’s the idea: a normal utility function says “the agent always prefers the option with higher utility”. A weak utility function says: “if the agent has a preference, then they always prefer the option with higher utility”. The missing preference means we can’t build a normal utility function, but we can still build a weak utility function.

Here’s how: since our graph has no cycles, we can always order the nodes so that the arrows only go forward along the sorted nodes - a technique called topological sorting. Each node’s position in the topological sort order is its utility. A small tweak to this method also handles indifference.

(Note: I’m using the term “weak utility” here because it seems natural; I don’t know of any standard term for this in the literature.
Most people don’t distinguish between these two ...
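
The weak utility construction described above can be sketched in a few lines of Python. This is only an illustration, not code from the post. The dictionary `PREFERS` and the helpers `prefers` and `weak_utility` are hypothetical names. The edges (cheese to pepperoni, cheese to mushroom, pepperoni to both, mushroom to both) are read off the three stated preferences, since the original graph image is not available here. The sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical encoding of the pizza example: each key maps to the set of
# options the agent strictly prefers over it (arrows point worse -> better).
PREFERS = {
    "cheese": {"pepperoni", "mushroom"},
    "pepperoni": {"both"},
    "mushroom": {"both"},
    "both": set(),
}


def prefers(graph, better, worse):
    """Return True iff a directed path leads from `worse` to `better`."""
    stack, seen = [worse], set()
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt == better:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def weak_utility(graph):
    """Assign each option its position in a topological order.

    Raises graphlib.CycleError if the preferences are cyclic, i.e. not
    "consistent" in the sense used in the text.
    """
    # TopologicalSorter expects node -> predecessors, so invert the arrows:
    # the predecessors of an option are the options it is preferred over.
    preds = {node: set() for node in graph}
    for worse, betters in graph.items():
        for better in betters:
            preds[better].add(worse)
    order = list(TopologicalSorter(preds).static_order())
    return {node: rank for rank, node in enumerate(order)}


utility = weak_utility(PREFERS)

# Weak utility property: wherever a preference exists, utility agrees with it.
for a in PREFERS:
    for b in PREFERS:
        if prefers(PREFERS, b, a):
            assert utility[b] > utility[a]
```

Note that the sort still gives pepperoni and mushroom different numbers even though the agent has no preference between them. That is why this is only a weak utility function: a higher number implies nothing unless a preference already exists.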