LW - On the CrowdStrike Incident by Zvi
EA - Peter Singer AMA (July 30th) by Toby Tremlett
AF - Coalitional agency by Richard Ngo
LW - A simple model of math skill by Alex Altair
AF - aimless ace analyzes active amateur: a micro-aaaaalignment proposal by Luke H Miles
EA - animal welfare outside of EA - numbers and thoughts on UFAW 2024 by ARM
EA - New 80k problem profile: Nuclear weapons by Benjamin Hilton
LW - Why Georgism Lost Its Popularity by Zero Contradictions
LW - (Approximately) Deterministic Natural Latents by johnswentworth
AF - A more systematic case for inner misalignment by Richard Ngo
AF - BatchTopK: A Simple Improvement for TopK-SAEs by Bart Bussmann
EA - New Book: "Minimalist Axiologies: Alternatives to 'Good Minus Bad' Views of Value" by Teo Ajantaival
LW - Feature Targeted LLC Estimation Distinguishes SAE Features from Random Directions by Lidor Banuel Dabbah
AF - Feature Targeted LLC Estimation Distinguishes SAE Features from Random Directions by Lidor Banuel Dabbah
AF - Truth is Universal: Robust Detection of Lies in LLMs by Lennart Buerger
AF - JumpReLU SAEs + Early Access to Gemma 2 SAEs by Neel Nanda
EA - CEEALAR: 2024 Update by CEEALAR
EA - Taking Uncertainty Seriously (or, Why Tools Matter) by Bob Fischer
EA - AI companies are not on track to secure model weights by Jeffrey Ladish
LW - How do we know that "good research" is good? (aka "direct evaluation" vs "eigen-evaluation") by Ruby