Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Conditionals All The Way Down, published by lunatic at large on October 2, 2023 on LessWrong.
(I thought about this idea on my own before Googling to see if anyone had already written it up. I found something very similar in an existing paper, so all credit for this line of thinking should go to its authors. Still, I think this concept deserves a writeup on LessWrong, and I also want to write a series of posts on this kind of topic, so I need to start somewhere. If this idea has already been written up on LessWrong, please let me know!)
Alice and Bob are driving in a car and Alice wants to know whether the driver in front of them will turn at the next light.
Alice asks Bob, "What's the probability that the driver will turn at the next light?" Unfortunately, Bob doesn't know how to estimate that. However, Bob does know that there are cherry blossoms which might be in bloom off the next exit. Bob is able to use his predictive talent to determine that there's a 50% chance that the driver will turn if there are cherry blossoms on display and that there's a 25% chance that the driver will turn if there aren't any cherry blossoms on display. Bob tells Alice that no other variables will interfere with these conditional probabilities.
Alice then asks Bob, "What's the probability that there will be cherry blossoms on display?" Again, Bob is unable to determine this probability. However, Bob does know that the city government was considering chopping the cherry trees down. Bob tells Alice that if the city chopped them down then there's a 5% chance of finding cherry blossoms, and that if the city didn't chop them down then there's a 70% chance of finding cherry blossoms. Bob knows that no other variables can impact these conditional probabilities.
Alice now asks Bob, "What's the probability that the city cut down the cherry trees?" Predictably, Bob doesn't know how to answer that. However, Bob again uses his magical powers of perception to deduce that there's an 80% chance the city chopped the trees down if the construction company that was lobbying for their removal won its appeal, and a 10% chance the city chopped them down if that company lost its appeal.
Now imagine that this conversation goes on forever: whether the construction company won is determined by whether the pro-business judge was installed which is determined by whether the governor was under pressure and so on. At the end we get an infinite Bayesian network that's a single chain extending infinitely far in one direction. Importantly, there's no "starting" node we can assign an outright probability to.
So Alice will never be able to get an answer, right? If there's no "starting" node we have an outright probability for then how can Alice hope to propagate forward to determine the probability that the driver will turn at the light?
I claim that Alice can actually do pretty well. Let's draw a picture to see why:
I'm using $A_0$ to denote the event where the driver turns at the light, $A_1$ to denote the event where the cherry blossoms are on display, and so on. If we know $P(A_i)$ for a positive integer $i$ then we can compute $P(A_{i-1})$ via
$$\begin{aligned}
P(A_{i-1}) &= P(A_{i-1}\mid A_i)\,P(A_i) + P(A_{i-1}\mid A_i^C)\,P(A_i^C) \\
&= P(A_{i-1}\mid A_i)\,P(A_i) + P(A_{i-1}\mid A_i^C)\,\big(1 - P(A_i)\big) \\
&= P(A_{i-1}\mid A_i^C) + \big(P(A_{i-1}\mid A_i) - P(A_{i-1}\mid A_i^C)\big)\,P(A_i)
\end{aligned}$$
where $P(A_{i-1}\mid A_i)$ and $P(A_{i-1}\mid A_i^C)$ are the constants which Bob has provided to Alice. Let's think of these as functions $f_i\colon [0,1]\to[0,1]$ defined by $f_i(x) = P(A_{i-1}\mid A_i^C) + \big(P(A_{i-1}\mid A_i) - P(A_{i-1}\mid A_i^C)\big)\,x$, so that $P(A_{i-1}) = f_i(P(A_i))$. I've illustrated the behavior of these functions with black arrows in the diagram above.
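As a concrete sketch, each $f_i$ is an affine map built from the two conditional probabilities Bob supplies. The probabilities below are the ones from the story; the helper names (`make_f`, `image`) are mine, not the post's:

```python
def make_f(p_given, p_given_not):
    """Affine map f_i(x) = P(A_{i-1}|A_i^C) + (P(A_{i-1}|A_i) - P(A_{i-1}|A_i^C)) * x."""
    return lambda x: p_given_not + (p_given - p_given_not) * x

# Bob's numbers from the story:
f1 = make_f(0.50, 0.25)  # P(turn | blossoms),    P(turn | no blossoms)
f2 = make_f(0.05, 0.70)  # P(blossoms | chopped), P(blossoms | not chopped)
f3 = make_f(0.80, 0.10)  # P(chopped | won),      P(chopped | lost)

def image(f, lo, hi):
    """Image of the interval [lo, hi] under an affine map, endpoints sorted."""
    a, b = f(lo), f(hi)
    return (min(a, b), max(a, b))

print(image(f1, 0.0, 1.0))              # P(A0) lies in [0.25, 0.50]
print(image(f1, *image(f2, 0.0, 1.0)))  # P(A0) lies in f1(f2([0,1])) = [0.2625, 0.425]
```

Already after two links of the chain, Alice has narrowed $P(A_0)$ from $[0,1]$ down to an interval of width about $0.16$, without ever learning an unconditional probability.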
Alice wants to find $P(A_0)$. What can she do? Well, she knows that $P(A_0)$ must be an output of $f_1$, i.e. $P(A_0)\in f_1([0,1])$. Visually:
Alice also knows that $P(A_1)$ is an output of $f_2$, so actually $P(A_0)\in f_1(f_2([0,1]))$:
Alice can keep going: for every $n$, $P(A_0)\in f_1(f_2(\cdots f_n([0,1])\cdots))$, a nested sequence of intervals that pins down $P(A_0)$ more and more tightly.
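To see why composing more links keeps helping, here is a toy numerical sketch. The assumption below is purely hypothetical (Bob only gave three links in the story): suppose every deeper link uses the same map $f(x)=0.10+0.70x$. Then the images of $[0,1]$ contract toward a single point:

```python
# Hypothetical: every link deeper in the chain uses the same affine map.
def f(x):
    return 0.10 + 0.70 * x

lo, hi = 0.0, 1.0          # start with total ignorance about the deepest node
for _ in range(60):        # apply 60 links of the chain
    lo, hi = f(lo), f(hi)  # f is increasing, so the endpoints stay ordered

# Each application shrinks the interval's width by the slope 0.70,
# so the nested images converge to the fixed point 0.10 / (1 - 0.70) = 1/3.
print(lo, hi)
```

The same contraction happens whenever the slopes $\big|P(A_{i-1}\mid A_i) - P(A_{i-1}\mid A_i^C)\big|$ are bounded below $1$: each extra link multiplies the interval's width by that slope, so Alice can bound $P(A_0)$ as tightly as she likes even though there is no starting node to propagate from.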