Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: The True Prisoner's Dilemma, published by Eliezer Yudkowsky on the AI Alignment Forum.
It occurred to me one day that the standard visualization of the Prisoner's Dilemma is fake.
The core of the Prisoner's Dilemma is this symmetric payoff matrix:
          1: C      1: D
2: C     (3, 3)    (5, 0)
2: D     (0, 5)    (2, 2)
Player 1 and Player 2 can each choose C or D. Player 1's and Player 2's utilities for the final outcome are given by the first and second numbers in the pair, respectively. For reasons that will become apparent, "C" stands for "cooperate" and "D" stands for "defect".
Observe that a player in this game (regarding themselves as the first player) has this preference ordering over outcomes: (D, C) > (C, C) > (D, D) > (C, D).
D, it would seem, dominates C: If the other player chooses C, you prefer (D, C) to (C, C); and if the other player chooses D, you prefer (D, D) to (C, D). So you wisely choose D, and as the payoff table is symmetric, the other player likewise chooses D.
If only you'd both been less wise! You both prefer (C, C) to (D, D). That is, you both prefer mutual cooperation to mutual defection.
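The dominance argument can be checked mechanically. The following is a minimal sketch in Python, assuming the payoff matrix above is keyed by (Player 1's move, Player 2's move); the names `PAYOFFS`, `strictly_dominates`, and `pareto_improves` are illustrative and not part of the original essay.

```python
# Payoffs from the matrix above, keyed by (player 1's move, player 2's move).
# Each value is (player 1's utility, player 2's utility).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (2, 2),
}
MOVES = ("C", "D")


def strictly_dominates(a, b):
    """True if move a pays player 1 more than move b against every reply."""
    return all(PAYOFFS[(a, other)][0] > PAYOFFS[(b, other)][0] for other in MOVES)


def pareto_improves(x, y):
    """True if outcome x is strictly better than outcome y for both players."""
    return all(px > py for px, py in zip(PAYOFFS[x], PAYOFFS[y]))


# D beats C for player 1 whatever player 2 does ...
d_dominates_c = strictly_dominates("D", "C")
# ... yet both players would rather land on (C, C) than on (D, D).
both_prefer_cc = pareto_improves(("C", "C"), ("D", "D"))
# Player 1's preference ordering over outcomes, best first.
ranking = sorted(PAYOFFS, key=lambda outcome: PAYOFFS[outcome][0], reverse=True)
```

Here `d_dominates_c` and `both_prefer_cc` both come out true, and `ranking` reproduces the ordering (D, C) > (C, C) > (D, D) > (C, D): each player's individually best move leads to an outcome both consider worse.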
The Prisoner's Dilemma is one of the great foundational issues in decision theory, and enormous volumes of material have been written about it. Which makes it an audacious assertion of mine, that the usual way of visualizing the Prisoner's Dilemma has a severe flaw, at least if you happen to be human.
The classic visualization of the Prisoner's Dilemma is as follows: you are a criminal, and you and your confederate in crime have both been captured by the authorities.
Independently, without communicating, and without being able to change your mind afterward, you have to decide whether to give testimony against your confederate (D) or remain silent (C).
Both of you, right now, are facing one-year prison sentences; testifying (D) takes one year off your prison sentence, and adds two years to your confederate's sentence.
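The sentence arithmetic can be tabulated to confirm that it yields the same preference ordering as the payoff matrix. This is a rough sketch in Python that assumes the two effects of testifying simply add up; the constant and function names are made up for illustration.

```python
BASE_SENTENCE = 1        # years each prisoner faces before anyone testifies
SELF_REDUCTION = 1       # years that testifying takes off your own sentence
CONFEDERATE_PENALTY = 2  # years your testimony adds to your confederate's sentence


def sentence(my_move, their_move):
    """Years in prison for me, assuming the two effects are additive."""
    years = BASE_SENTENCE
    if my_move == "D":
        years -= SELF_REDUCTION
    if their_move == "D":
        years += CONFEDERATE_PENALTY
    return years


# Keyed by (my move, confederate's move); the value is my sentence in years.
sentences = {(m, t): sentence(m, t) for m in "CD" for t in "CD"}
# Fewer years is better, so sort ascending to get my preference ordering.
ordering = sorted(sentences, key=sentences.get)
```

Under that assumption the sentences are 0 years for (D, C), 1 for (C, C), 2 for (D, D), and 3 for (C, D), which matches the ordering of the abstract matrix.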
Or maybe you and some stranger are, only once, and without knowing the other player's history, or finding out who the player was afterward, deciding whether to play C or D, for a payoff in dollars matching the standard chart.
And, oh yes - in the classic visualization you're supposed to pretend that you're entirely selfish, that you don't care about your confederate criminal, or the player in the other room.
It's this last specification that makes the classic visualization, in my view, fake.
You can't avoid hindsight bias by instructing a jury to pretend not to know the real outcome of a set of events. And without a complicated effort backed up by considerable knowledge, a neurologically intact human being cannot pretend to be genuinely, truly selfish.
We're born with a sense of fairness, honor, empathy, sympathy, and even altruism - the result of our ancestors adapting to play the iterated Prisoner's Dilemma. We don't really, truly, absolutely and entirely prefer (D, C) to (C, C), though we may entirely prefer (C, C) to (D, D) and (D, D) to (C, D). The thought of our confederate spending three years in prison does not entirely fail to move us.
In that locked cell where we play a simple game under the supervision of economic psychologists, we are not entirely and absolutely unsympathetic toward the stranger who might cooperate. We aren't entirely happy to think that we might defect and the stranger cooperate, getting five dollars while the stranger gets nothing.
We fixate instinctively on the (C, C) outcome and search for ways to argue that it should be the mutual decision: "How can we ensure mutual cooperation?" is the instinctive thought. Not "How can I trick the other player into playing C while I play D for the maximum payoff?"
For someone with an impulse toward altruism, or honor, or fairness, the Prisoner's Dilemma doesn't really have the critical payoff matrix - whatever the financial payoff to ...