Download - LW - Logical Line-Of-Sight Makes Games Sequential or Loopy by StrivingForLegibility

The Nonlinear Library: LessWrong


LW - Logical Line-Of-Sight Makes Games Sequential or Loopy by StrivingForLegibility

2024-01-19



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Logical Line-Of-Sight Makes Games Sequential or Loopy, published by StrivingForLegibility on January 19, 2024 on LessWrong.
In the last post, we talked about strategic time and the strategic time loops studied in open-source game theory. In that context, agents have logical line-of-sight to each other and the situation they're both facing, which creates a two-way information flow at the time each is making their decision. In this post I'll describe how agents in one context can use this logical line-of-sight to condition their behavior on how they behave in other contexts. This in turn makes those contexts strategically sequential or loopy, in a way that a purely causal decision theory doesn't pick up on.
Sequential Games and Leverage
As an intuition pump, consider the following ordinary game: Alice and Bob are going to play a Prisoners' Dilemma, and then an Ultimatum game. My favorite framing of the Prisoners' Dilemma is by Nicky Case: each player stands in front of a machine which accepts a certain amount of money, e.g. $100.[1] Both players choose simultaneously whether to put some of their own money into the machine. If Alice places $100 into the machine in front of her, $200 comes out of Bob's machine, and vice versa.
If a player withholds their money, nothing comes out of the other player's machine. We call putting money in Cooperate, and withholding it Defect.
Since neither player can cause money to come out of their own machine, Causal Decision Theory (CDT) identifies Defect as a dominant strategy for both players. Dissatisfaction with this answer has motivated many to dig into the foundations of decision theory, and coming up with different conditions that enable Cooperation in the Prisoners' Dilemma has become a cottage industry for the field.
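The dominance claim can be checked directly. Below is a minimal sketch of the machine framing using the $100 and $200 figures from the text; the function names `pd_payoff` and `defect_dominates` are hypothetical, chosen only for illustration.

```python
# Machine framing: paying COST into your machine makes PAYOUT come out of the other one.
COST = 100
PAYOUT = 200


def pd_payoff(my_move: str, their_move: str) -> int:
    """Net dollars for one player. Moves are "C" (Cooperate) or "D" (Defect)."""
    paid = COST if my_move == "C" else 0
    received = PAYOUT if their_move == "C" else 0
    return received - paid


def defect_dominates() -> bool:
    """True if Defect pays strictly more than Cooperate against every opposing move."""
    return all(pd_payoff("D", their) > pd_payoff("C", their) for their in ("C", "D"))


if __name__ == "__main__":
    for mine in ("C", "D"):
        for theirs in ("C", "D"):
            print(mine, theirs, pd_payoff(mine, theirs))
    print("Defect dominant:", defect_dominates())
```

Whatever the other player does, withholding saves the $100 stake, which is why a purely causal analysis lands on Defect even though mutual Cooperation leaves both players $100 better off than mutual Defection.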
I myself keep calling it the Prisoners' Dilemma (rather than the Prisoner's Dilemma) because I want to frame it as a dilemma they're facing together, where they can collaboratively implement mechanisms that incentivize mutual Cooperation. The mechanism I want to describe today is leverage: having something the other player wants, and giving it to them if and only if they do what you want.
Suppose that the subsequent Ultimatum game is about how to split $1,000. After the Prisoners' Dilemma, a fair coin is flipped to determine Alice and Bob's roles in the Ultimatum game: the proposer offers a split, and the evaluator either accepts it or rejects it, in which case both players get nothing. The evaluator can employ probabilistic rejection to shape the incentives of the proposer, so that the proposer's unique best response is to offer a fair split (according to the evaluator's notion of fairness). And both players might have common knowledge that "a fair split" depends on what both players did in the Prisoners' Dilemma.
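One way probabilistic rejection could work is sketched below. This is an illustration under stated assumptions, not necessarily the scheme the author has in mind: the acceptance rule, the `eps` margin, the whole-dollar offers, and all function names are hypothetical. The evaluator always accepts offers at or above their fair share, and accepts lower offers with a probability chosen so that the proposer's expected take stays slightly below what a fair offer would give them.

```python
def accept_probability(offer: float, fair: float, total: float, eps: float = 0.01) -> float:
    """Chance the evaluator accepts `offer` dollars out of `total` (assumed rule)."""
    if offer >= fair:
        return 1.0
    # Below fair: hold the proposer's expected take just under the fair-offer payoff.
    return (1 - eps) * (total - fair) / (total - offer)


def proposer_expected(offer: float, fair: float, total: float, eps: float = 0.01) -> float:
    """Proposer's expected dollars: what they keep, times the chance of acceptance."""
    return (total - offer) * accept_probability(offer, fair, total, eps)


def best_offer(fair: int, total: int, eps: float = 0.01) -> int:
    """Whole-dollar offer that maximizes the proposer's expected dollars."""
    return max(range(total + 1), key=lambda o: proposer_expected(o, fair, total, eps))


if __name__ == "__main__":
    print(best_offer(fair=500, total=1000))
    print(best_offer(fair=700, total=1000))
```

Under this rule, lowballing never pays in expectation and overpaying simply gives money away, so offering exactly the evaluator's fair amount is the proposer's unique best response, for whatever value of `fair` the evaluator has in mind.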
If Alice is the evaluator, and she Cooperated in the first round but Bob Defected, then she is $200 worse-off than if Bob had Cooperated, and she can demand that Bob compensate her for this loss. Similarly, if Alice is the proposer, she might offer Bob $500 if he Cooperated but $300 if he Defected. Since Bob only gained $100 compared to Cooperating, his best-response is to Cooperate if he believes Alice will follow this policy. And Bob can employ the same policy, stabilizing the socially optimal payoff of ($600, $600) as a Nash equilibrium where neither has an incentive to change their policy.
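The arithmetic above can be sketched as a small payoff calculation. The assumptions here go beyond the text: the Ultimatum game is taken to settle at the fair split regardless of who proposes, a Defector's share is reduced by $200 and paid to the other player, and the mutual-Defection case (which the text does not specify) nets out to an even split. The names `fair_share`, `total_payoff`, and `is_nash` are hypothetical.

```python
STAKES = 1000   # Ultimatum pot from the text
PENALTY = 200   # assumed compensation moved from a Defector to the other player


def pd_payoff(my_move: str, their_move: str) -> int:
    """Net dollars from the Prisoners' Dilemma: pay $100 to give the other player $200."""
    return (200 if their_move == "C" else 0) - (100 if my_move == "C" else 0)


def fair_share(my_move: str, their_move: str) -> int:
    """Assumed Ultimatum outcome, conditioned on both first-round moves."""
    share = STAKES // 2
    if my_move == "D":
        share -= PENALTY
    if their_move == "D":
        share += PENALTY
    return share


def total_payoff(my_move: str, their_move: str) -> int:
    """Dollars across both games for one player."""
    return pd_payoff(my_move, their_move) + fair_share(my_move, their_move)


def is_nash(a: str, b: str) -> bool:
    """True if neither player gains by unilaterally switching their first-round move."""
    return all(total_payoff(a, b) >= total_payoff(alt, b) for alt in "CD") and \
           all(total_payoff(b, a) >= total_payoff(alt, a) for alt in "CD")


if __name__ == "__main__":
    for mine in "CD":
        for theirs in "CD":
            print(mine, theirs, total_payoff(mine, theirs))
    print("(C, C) is Nash:", is_nash("C", "C"))
```

With these numbers, mutual Cooperation pays $600 each, and defecting against a Cooperator pays $200 + $300 = $500, so the $200 penalty outweighs the $100 saved by withholding.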
Crucially, this enforcement mechanism relies on each player having enough leverage in the subsequent game to incentivize Cooperation in the first round. If the Ultimatum game had been for stakes of less than $200, the entire pot would be worth less than a Defector can obtain for themselves if the other player Cooperates. Knowing that neither can incentivize Cooperation, both players might fall back into mutual Defection.
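One hedged way to see where a $200 threshold can come from: assume the fair baseline is an even split, and that the harshest available punishment is to leave a Defector with none of their half. Then the largest penalty is half the stakes, and it has to exceed the $100 a player saves by Defecting. This is a sketch under those assumptions, with hypothetical names, and it is not necessarily the author's own derivation.

```python
DEFECTION_GAIN = 100  # dollars a player saves by withholding, per the text


def max_penalty(stakes: float) -> float:
    """Largest loss the other player can impose, assuming an even-split baseline
    and that a Defector's share cannot be pushed below zero."""
    return stakes / 2


def has_enough_leverage(stakes: float) -> bool:
    """True if the harshest penalty strictly outweighs the gain from Defecting."""
    return max_penalty(stakes) > DEFECTION_GAIN


if __name__ == "__main__":
    for stakes in (150, 200, 1000):
        print(stakes, has_enough_leverage(stakes))
```

Under these assumptions, a $1,000 pot gives ample leverage, while any pot of $200 or less cannot make Defection strictly unprofitable.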
Bets vs Unexploitability
Even if Alice knows she has enough leverage that she can incentivize Bob to Cooperate, she might be uncert...