Produced while an affiliate at PIBBSS[1]. The work was initially done with funding from a Lightspeed Grant and then continued at PIBBSS. Work done in collaboration with @Paul Riechers, @Lucas Teixeira, @Alexander Gietelink Oldenziel, and Sarah Marzen. Paul was a MATS scholar during part of this work. Thanks to Paul, Lucas, Alexander, and @Guillaume Corlouer for suggestions on this writeup.
Introduction.
What computational structure are we building into LLMs when we train them on next-token prediction? In this post we present evidence that this structure is given by the meta-dynamics of belief updating over hidden states of the data-generating process (a concrete sketch of this belief-updating picture follows the list below). We'll explain exactly what this means in the post. We are excited by these results because:
- We have a formalism that relates training data to internal structures in LLMs.
- Conceptually, our results mean that LLMs synchronize to their internal world model as they move [...]
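To make "belief updating over hidden states of the data-generating process" concrete, here is a minimal sketch, not taken from the post itself, of Bayesian filtering over the states of a toy hidden Markov model. The transition matrix `T`, emission matrix `E`, the `update_belief` function, and the token sequence are all illustrative assumptions rather than the post's actual setup.

```python
import numpy as np

def update_belief(belief, obs, T, E):
    """One step of Bayesian filtering over hidden states.

    belief : current distribution over hidden states, shape (n_states,)
    obs    : index of the observed token
    T      : T[i, j] = P(next state j | current state i)
    E      : E[j, k] = P(emit token k | state j)
    """
    # Predict: propagate the belief through the hidden-state dynamics.
    predicted = belief @ T
    # Update: condition on the token that was actually observed.
    unnormalized = predicted * E[:, obs]
    return unnormalized / unnormalized.sum()

# Toy 2-state process emitting tokens 0/1 (hypothetical numbers).
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
E = np.array([[0.7, 0.3],
              [0.1, 0.9]])

belief = np.array([0.5, 0.5])      # start maximally uncertain
for token in [1, 1, 0, 1]:         # a short observed sequence
    belief = update_belief(belief, token, T, E)
    print(belief)
```

Each updated belief is a point in the probability simplex over hidden states, so a stream of tokens traces out a trajectory of belief states; the geometry of these belief states is what the post argues transformers come to represent in their residual stream.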
The original text contained 10 footnotes which were omitted from this narration.
---
First published: April 16th, 2024
Source: https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transformers-represent-belief-state-geometry-in-their
---
Narrated by TYPE III AUDIO.