Download - What counts as defection? by Alex Turner

The Nonlinear Library: Alignment Forum Top Posts

Education

What counts as defection? by Alex Turner

2021-12-04


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: What counts as defection?, published by Alex Turner on the AI Alignment Forum.
Thanks to Michael Dennis for proposing the formal definition; to Andrew Critch for pointing me in this direction; to Abram Demski for proposing non-negative weighting; and to Alex Appel, Scott Emmons, Evan Hubinger, philh, Rohin Shah, and Carroll Wainwright for their feedback and ideas.
There's a good chance I'd like to publish this at some point as part of a larger work. However, I wanted to make the work available now, in case that doesn't happen soon.
They can't prove the conspiracy... But they could, if Steve runs his mouth.
The police chief stares at you.
You stare at the table. You'd agreed (sworn!) to stay quiet. You'd even studied game theory together. But, you hadn't understood what an extra year of jail meant.
The police chief stares at you.
Let Steve be the gullible idealist. You have a family waiting for you.
Sunlight stretches across the valley, dappling the grass and warming your bow. Your hand anxiously runs along the bowstring. A distant figure darts between trees, and your stomach rumbles. The day is near spent.
The stags run strong and free in this land. Carla should meet you there. Shouldn't she? Who wants to live like a beggar, subsisting on scraps of lean rabbit meat?
In your mind's eye, you reach the stags, alone. You find one, and your arrow pierces its barrow. The beast shoots away; the rest of the herd follows. You slump against the tree, exhausted, and never open your eyes again.
You can't risk it.
People talk about 'defection' in social dilemma games, from the prisoner's dilemma to stag hunt to chicken. In the tragedy of the commons, we talk about defection. The concept has become a regular part of LessWrong discourse.
Informal definition. A player defects when they increase their personal payoff at the expense of the group.
This informal definition is no secret, being echoed from the ancient Formal Models of Dilemmas in Social Decision-Making to the recent Classifying games like the Prisoner's Dilemma:
you can model the "defect" action as "take some value for yourself, but destroy value in the process".
Given that the prisoner's dilemma is the bread and butter of game theory and of many parts of economics, evolutionary biology, and psychology, you might think that someone had already formalized this. However, to my knowledge, no one has.
Formalism
Consider a finite $n$-player normal-form game, with player $i$ having pure action set $A_i$ and payoff function $P_i : A_1 \times \cdots \times A_n \to \mathbb{R}$. Each player $i$ chooses a strategy $s_i \in \Delta(A_i)$ (a distribution over $A_i$). Together, the strategies form a strategy profile $s := (s_1, \ldots, s_n)$. $s_{-i} := (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$ is the strategy profile, excluding player $i$'s strategy. A payoff profile contains the payoffs for all players under a given strategy profile.
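To make the notation concrete, here is a minimal Python sketch of a two-player game and its payoff profile. It assumes the standard convention that payoffs under mixed strategies are expected values over the players' independent randomization. The payoff numbers and the names `PAYOFFS`, `payoff_profile`, and `without_player` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs (illustrative numbers only).
# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def payoff_profile(strategies):
    """Expected payoff of every player under a mixed strategy profile.

    strategies: list with one dict per player, mapping action -> probability.
    Assumes players randomize independently.
    """
    n = len(strategies)
    totals = [0.0] * n
    # Enumerate every pure action profile and weight it by its probability.
    for actions in product(*(s.keys() for s in strategies)):
        prob = 1.0
        for strategy, action in zip(strategies, actions):
            prob *= strategy[action]
        for j in range(n):
            totals[j] += prob * PAYOFFS[actions][j]
    return totals


def without_player(strategies, i):
    """The profile s_{-i}: every strategy except player i's (0-indexed)."""
    return strategies[:i] + strategies[i + 1:]


# Player 1 always cooperates; player 2 mixes evenly between C and D.
s = [{"C": 1.0, "D": 0.0}, {"C": 0.5, "D": 0.5}]
print(payoff_profile(s))     # expected payoffs for both players
print(without_player(s, 0))  # only player 2's strategy remains
```

Players are 0-indexed in the code, so player $i$ in the text corresponds to index $i - 1$.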
A utility weighting $(\alpha_j)_{j=1,\ldots,n}$ is a set of $n$ non-negative weights (as in Harsanyi's utilitarian theorem). You can consider the weights as quantifying each player's contribution; they might represent a perceived social agreement or be the explicit result of a bargaining process.
When all $\alpha_j$ are equal, we'll call that an equal weighting. However, if there are "utility monsters", we can downweight them accordingly.
We're implicitly assuming that payoffs are comparable across players. We want to investigate: given a utility weighting, which actions are defections?
Definition. Player $i$'s action $a \in A_i$ is a defection against strategy profile $s$ and weighting $(\alpha_j)_{j=1,\ldots,n}$ if
1. Personal gain: $P_i(a, s_{-i}) > P_i(s_i, s_{-i})$
2. Social loss: $\sum_j \alpha_j P_j(a, s_{-i}) < \sum_j \alpha_j P_j(s_i, s_{-i})$
If such an action exists for some player $i$, strategy profile $s$, and weighting, then we say that there is an opportunity for defection in the game.
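As a rough illustration, the two conditions in the definition can be checked mechanically. The sketch below is not taken from the original post: it uses a hypothetical prisoner's dilemma payoff table, treats payoffs under mixed strategies as expected values (an assumption), and the names `expected_payoffs` and `is_defection` are made up for this example.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs (illustrative numbers only).
# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def expected_payoffs(strategies):
    """Expected payoff per player; strategies maps action -> probability."""
    totals = [0.0] * len(strategies)
    for actions in product(*(s.keys() for s in strategies)):
        prob = 1.0
        for strategy, action in zip(strategies, actions):
            prob *= strategy[action]
        for j, payoff in enumerate(PAYOFFS[actions]):
            totals[j] += prob * payoff
    return totals


def is_defection(i, action, strategies, weights):
    """Check both conditions of the definition for player i (0-indexed)."""
    # Replace player i's strategy with the pure action; s_{-i} is unchanged.
    deviated = list(strategies)
    deviated[i] = {action: 1.0}
    before = expected_payoffs(strategies)
    after = expected_payoffs(deviated)
    # Condition 1: player i strictly gains.
    personal_gain = after[i] > before[i]
    # Condition 2: the weighted sum of payoffs strictly falls.
    welfare_before = sum(w * p for w, p in zip(weights, before))
    welfare_after = sum(w * p for w, p in zip(weights, after))
    social_loss = welfare_after < welfare_before
    return personal_gain and social_loss


both_cooperate = [{"C": 1.0}, {"C": 1.0}]
print(is_defection(0, "D", both_cooperate, weights=[1, 1]))  # equal weighting
print(is_defection(0, "D", both_cooperate, weights=[1, 0]))  # player 2 ignored
```

With these illustrative payoffs, switching to D from mutual cooperation satisfies both conditions under an equal weighting. When the other player's weight is zero, the same action no longer causes a weighted social loss, so whether an action counts as a defection depends on the weighting.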
Remark. For an equal weighting, condition (2) is equivalent to demanding that the action n...