Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Optimization Concepts in the Game of Life, published by Vika, Ramana Kumar on the AI Alignment Forum.
Abstract: We define robustness and retargetability (two of Flint’s measures of optimization) in Conway’s Game of Life and apply the definitions to a few examples. The same approach likely works in most embedded settings, and provides a frame for conceptualizing and quantifying these aspects of agency. We speculate on the relationship between robustness and retargetability, and identify various directions for future work.
Motivation
We would like to better understand the fundamental principles of agency (and related phenomena including optimization and goal-directedness). We focus on agency because we believe agency is a core source of risk from AI systems, especially in worlds with one (or few) most-capable systems. The goals of the most competent consequence-driven systems are more likely to be achieved, because trying outperforms not trying or less competent trying. We do not want to create a world where such systems are working against us. By better understanding agency, we hope to improve our ability to avoid mistakenly building systems working capably against us, and to correct course if we do.
A rich source of confusions about agency comes from attending to the fact that goal-directed systems are part of – embedded in – the environment that their goals are about. Most practical work on AI avoids the confusions of embedded agency by constructing and enforcing a Cartesian boundary between agent and environment, using frameworks such as reinforcement learning (RL) that define an interaction protocol. We focus on embedded agency because we expect not to be able to enforce a Cartesian boundary for highly capable agents in general domains, and, as a particularly strong instance of this, because agents may emerge unexpectedly in systems where we did not design how they interface with the rest of the world.
Our approach to deconfusion in this post is to identify concepts that seem relevant to embedded agency but do not have technical definitions, to propose some definitions, and to see how they fare on some examples. More generally, we are interested in analyzing small examples of agency-related phenomena in the hope that some examples will be simple enough to yield insight while retaining essential features of the phenomenon.
Optimization in the Game of Life
Concepts
We draw two concepts from Alex Flint’s essay The Ground of Optimization. Flint defines an optimizing system as a system that evolves towards a small set of target configurations from a broad basin of attraction, despite perturbations. The essay introduces measures for quantifying optimizing systems. One is robustness: how robust the process of reaching the target set is to perturbations, e.g. the number of dimensions on which perturbations can be made or the magnitude of the perturbations. Another is retargetability: whether the system can be transformed, via a small change, into another optimizing system with a different target configuration set.
Here, we develop more precise definitions of these concepts by concentrating on a particular concrete domain: Conway’s Game of Life. This is a natural setting for studying embedded agency because it is a deterministic environment with no pre-specified Cartesian boundaries, which is rich enough to support emergent goal-directed behavior, yet simple enough to define the concepts above explicitly.
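To make the setting concrete, here is a minimal sketch of the Game of Life update rule in Python. It is only an illustration, not code from the post: the names step, run, shift, BLOCK and GLIDER are our own, and the board is assumed to be unbounded and represented as a set of live (row, column) cells. Note that nothing in the state marks out an agent or an environment; any structure is just a pattern of cells.

```python
from collections import Counter

def step(live):
    """Advance one generation. `live` is a set of (row, col) live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3 neighbors.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, steps):
    """Apply `step` repeatedly; the dynamics are fully deterministic."""
    for _ in range(steps):
        live = step(live)
    return live

def shift(cells, dr, dc):
    """Translate a pattern by (dr, dc)."""
    return {(r + dr, c + dc) for (r, c) in cells}

# Two standard patterns (illustrative constants, not taken from the post).
BLOCK = {(0, 0), (0, 1), (1, 0), (1, 1)}
GLIDER = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

print(step(BLOCK) == BLOCK)                       # block is a still life
print(run(GLIDER, 4) == shift(GLIDER, 1, 1))      # glider moves one cell diagonally every 4 steps
```

A perturbation, in this representation, could be modeled as adding or removing cells from the set before calling run, though how perturbations should be chosen is exactly what the definitions need to pin down.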
Examples
Before getting to the definitions, let’s look at how we might draw analogies between some of the examples of systems (including optimizing systems) from the Ground of Optimization post and structures in the Game of Life.
The Ground of Optimization | Game of Life | Optimizing system?
Bottle cap | Block | No
Satellite in orbit | Glider | No
Ball in a valley | Eater | Yes
Ball...