Support ongoing human narrations of LessWrong's curated posts:
www.patreon.com/LWCurated
The goal of this post is to clarify a few concepts relating to AI Alignment under a common framework. The main concepts to be clarified:
- Optimization. Specifically, this will be a type of Vingean agency. It will split into Selection vs Control variants.
- Reference (the relationship which holds between map and territory; aka semantics, aka meaning). Specifically, this will be a teleosemantic theory.
The main new concepts employed will be endorsement and legitimacy.
TLDR:
- Endorsement of a process means you would take its conclusions as your own, if you knew them. (A brief formal sketch follows this list.)
- Legitimacy relates to endorsement in the same way that good relates to utility. (I.e., utility/endorsement are generic mathematical theories of agency; good/legitimate refer to the specific thing we care about.)
- We perceive agency when something is better at doing something than we are; we endorse some aspect of its reasoning or activity. (We endorse it as a way of achieving its goals, if not necessarily our own.)
- We perceive meaning (semantics/reference) in cases where something has been optimized for accuracy -- that is, the goal we endorse a conclusion with respect to is some notion of accuracy of representation.
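One way to make the endorsement bullet more concrete is a reflection-style condition on beliefs; this formalization is an editorial sketch in standard notation, not a definition spelled out in the summary above. Writing $P_A$ for agent A's beliefs and $P_B$ for the conclusions of process B, A endorses B on a question $X$ when learning B's verdict would move A's belief to match it:

$P_A(X \mid P_B(X) = p) = p$ for all $p \in [0,1]$.

In words: if A knew that B assigns probability $p$ to $X$, then A would assign probability $p$ as well, taking B's conclusion as its own.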
This write-up owes a large debt to many conversations with Sahil, although the views expressed here are my own.
Source:
https://www.lesswrong.com/posts/bnnhypM5MXBHAATLw/meaning-and-agency
Narrated for LessWrong by Perrin Walker.
Share feedback on this narration.