Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: UDT1.01: Plannable and Unplanned Observations (3/10), published by Diffractor on April 12, 2024 on LessWrong.
The Omnipresence of Unplanned Observations
Time to introduce some more concepts. If an observation is "any data you can receive which affects your actions", then there seem to be two sorts of observations. A plannable observation is the sort of observation where you could plan ahead of time how to react to it. An unplanned observation is the sort which you can't (or didn't) write a lookup-table style policy for.
Put another way, if a policy tells you how to map histories of observations to actions, those "histories" are the plannables.
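To make "lookup-table style policy" concrete, here's a minimal Python sketch (the names and the toy parity rule are my own illustration, not anything from the post): a complete table mapping every possible history of binary plannable observations to an action, built entirely in advance of seeing anything.

```python
from itertools import product

def make_policy(n_bits, choose_action):
    """Build a complete lookup table: one action for every possible
    history of plannable observations, up to n_bits bits long."""
    policy = {}
    for length in range(n_bits + 1):
        for history in product((0, 1), repeat=length):
            policy[history] = choose_action(history)
    return policy

# Toy rule: act on the parity of the observed bits.
policy = make_policy(2, lambda h: "act" if sum(h) % 2 == 0 else "wait")
print(policy[()])      # action before any observation: "act"
print(policy[(1, 0)])  # action after observing 1 then 0: "wait"
```

Note that the `choose_action` function is called once per history, up front; nothing is computed at observation time. That up-front computation is exactly where the unplanned observations discussed below come in.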
However, to select that policy in the first place, over its competitors, you probably had to do some big computation to find some numbers like "expected utility if I prepare a sandwich when I'm in the kitchen but not hungry", or "the influence of my decisions in times of war on the probability of war in the first place", or "the probability distribution on what the weather will be if I step outside", or "my own default policy about revealing secret information".
These quantities affect your choice of action. If they were different, your action would be different. In some sense you're observing these numbers, in order to pick your action. And yet, the lookup-table style policies which UDT produces are phrased entirely in terms of environmental observations; those are the only sort of observation you can write a lookup-table style policy about.
However, these beliefs about the environment aren't the sort of observation that's present in our lookup table. You aren't planning in advance how to react to these observations, you're just reacting to them, so they're unplanned.
Yeah, you could shove everything in your prior. But to have a sufficiently rich prior, which catches on to highly complex patterns, including patterns in what your own policy ends up being... well, unfolding that prior probably requires a bunch of computational work, and observing the outputs of long computations. These outputs of long computations that you see when you're working out your prior would, again, be unplanned observations.
If you do something like "how about we run a logical inductor for a while, and then ask the logical inductor to estimate these numbers, and freeze our policy going forward from there?", then the observations from the environment would be the plannables, and the observations from the logical inductor state would be the unplanned observations.
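That freeze maneuver can be sketched in a few lines. To be clear about the assumptions: the iterative estimator below is a toy stand-in, not a real logical inductor, and the quantity it estimates is an arbitrary placeholder.

```python
def refine_beliefs(steps):
    """Stand-in for running a logical inductor for a while: each step
    sharpens a numeric estimate of some quantity (here, a toy expected
    utility that the estimator converges toward)."""
    estimate = 0.5  # ignorant starting point
    for t in range(1, steps + 1):
        estimate += (0.8 - estimate) / (t + 1)  # creep toward 0.8
    return estimate

def freeze_policy(belief):
    """Pick a lookup-table policy on the strength of the belief, then
    never revise it. The belief is the unplanned observation; the
    single environmental bit is the plannable one."""
    act = "prepare" if belief > 0.6 else "abstain"
    return {(): act, (0,): act, (1,): act}

belief = refine_beliefs(steps=100)  # unplanned observation
policy = freeze_policy(belief)      # frozen from here on out
```

The point of the sketch is the division of labor: everything the estimator emits along the way is an unplanned observation, and only the environmental bits in the frozen table are plannables.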
The fundamental obstacle of trying to make updatelessness work with logical uncertainty (being unsure about the outputs of long computations), is this general pattern. In order to have decent beliefs about long computations, you have to think for a while. The outputs of that thinking also count as observations. You could try being updateless about them and treat them as plannable observations, but then you'd end up with an even bigger lookup table to write.
Going back to our original problem, where we'll be seeing n observations/binary bits, and have to come up with a plan for how to react to the bitstrings... Those bitstrings are our plannable observations. However, in the computation for how to react to all those situations, we see a bunch of other data in the process. Maybe these observations come from a logical inductor or something.
We could internalize these as additional plannable observations, to go from "we can plan over environmental observations" to "we can plan over environmental observations, and math observations". But then that would make our tree of (plannable) observations dramatically larger and more complex.
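A quick back-of-the-envelope calculation shows how dramatic that blowup is (the branching factors here are made-up assumptions for illustration):

```python
def table_size(n_steps, math_outcomes_per_step=1):
    """Number of complete observation histories of length n_steps when
    each step carries one environmental bit, optionally paired with a
    math observation that can come out k different ways."""
    return (2 * math_outcomes_per_step) ** n_steps

# Environmental bits only: 2^10 histories to plan over.
print(table_size(10))
# Pair each bit with a 4-outcome math observation: 8^10 histories.
print(table_size(10, math_outcomes_per_step=4))
```

So internalizing even a coarse math observation at each step multiplies the branching factor, and the lookup table grows exponentially faster than before.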
And doing that would introduce even more unplanned observations, like "what's the influence of action A in the world where I observe that I think the influence of action A..."