Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: The Power of Reinforcement, published by The Power of Reinforcement on LessWrong.
Part of the sequence: The Science of Winning at Life
Also see: Basics of Animal Reinforcement, Basics of Human Reinforcement, Physical and Mental Behavior, Wanting vs. Liking Revisited, Approving reinforces low-effort behaviors, Applying Behavioral Psychology on Myself.
Story 1:
On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
Story 2:
I once witnessed a worker who hated keeping a work log because it was only used "against" him. His supervisor would call to say "Why did you spend so much time on that?" or "Why isn't this done yet?" but never "I saw you handled X, great job!" Not surprisingly, he often "forgot" to fill out his work log.
Ever since I got everyone at the Singularity Institute to keep work logs, I've tried to avoid connections between "concerned" feedback and staff work logs, and instead take time to comment positively on things I see in those work logs.
Story 3:
Chatting with Eliezer, I said, "Eliezer, I get the sense that I've inadvertently caused you to be slightly averse to talking to me. Maybe because we disagree on so many things, or something?"
Eliezer's reply was: "No, it's much simpler. Our conversations usually run longer than our previously set deadline, so whenever I finish talking with you I feel drained and slightly cranky."
Now I finish our conversations on time.
Story 4:
A major Singularity Institute donor recently said to me: "By the way, I decided that every time I donate to the Singularity Institute, I'll set aside an additional 5% for myself to do fun things with, as a motivation to donate."
The power of reinforcement
It's amazing to me how consistently we fail to take advantage of the power of reinforcement.
Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.
You are not an agenty homunculus "corrupted" by heuristics and biases. You just are heuristics and biases. And you respond to reinforcement, because most of your motivation systems still work like the motivation systems of other animals.
A quick reminder of what you learned in high school
A reinforcer is anything that, when it occurs in conjunction with an act, increases the probability that the act will occur again.
A positive reinforcer is something the subject wants, such as food, petting, or praise. Positive reinforcement occurs when a target behavior is followed by something the subject wants, and this increases the probability that the behavior will occur again.
A negative reinforcer is something the subject wants to avoid, such as a blow, a frown, or an unpleasant sound. Negative reinforcement occurs when a target behavior is followed by some relief from something the subject doesn't want, and this increases the probability that the behavior will happen again.
What works
Small reinforcers are fine, as long as there is a strong correlation between the behavior and the reinforcer (Schneider 1973; Todorov et al. 1984). All else equal, a large reinforcer is more effective than a small one (Christopher 1988; Ludvig et al. 2007; Wolfe 1936), but the more you increase the reinforcer magnitude, the less benefit you get from the increase (Frisch & Dickinson 1990).
The reinforcer should immediately follow the target behavior (Escobar & Bruner 2007; Schlinger & Blakely 1994; Schneider 1990). Pryor...