Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #54: Clauding Along, published by Zvi on March 8, 2024 on LessWrong.
The big news this week was of course the release of Claude 3.0 Opus, likely in some ways the best available model right now. Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier. We will learn more about its ultimate full capabilities over time.
We also got quite the conversation about big questions of one's role in events, which I immortalized as Read the Roon. Since publication Roon has responded, which I have edited into the post along with some additional notes.
That still leaves plenty of fun for the full roundup. We have spies. We have accusations of covert racism. We have Elon Musk suing OpenAI. We have a new summary of simulator theory. We have NIST, tasked with AI regulation, literally struggling to keep a roof over their head. And more.
Table of Contents
Introduction.
Table of Contents.
Language Models Offer Mundane Utility. Predict the future.
Language Models Don't Offer Mundane Utility. Provide basic info.
LLMs: How Do They Work? Emmett Shear rederives simulators, summarizes.
Copyright Confrontation. China finds a copyright violation. Curious.
Oh Elon. He sues OpenAI to… force it to change its name? Kind of, yeah.
DNA Is All You Need. Was I not sufficiently impressed with Evo last week?
GPT-4 Real This Time. A question of intelligence.
Fun With Image Generation. Be careful not to have too much fun.
Deepfaketown and Botpocalypse Soon. This will not give you a hand.
They Took Our Jobs. They gave us a few back. For now, at least.
Get Involved. Davidad will have a direct report; it could be you.
Introducing. An AI-based RPG will never work, until one does.
In Other AI News. The fallout continues, also other stuff.
More on Self-Awareness. Not the main thing to worry about.
Racism Remains a Problem for LLMs. Covert is a generous word for this.
Project Maven. Yes, we are putting the AIs in charge of weapon targeting.
Quiet Speculations. Claimed portents of various forms of doom.
The Quest for Sane Regulation. NIST might need a little help.
The Week in Audio. Sergey Brin Q&A.
Rhetorical Innovation. It is not progress. We still keep trying.
Another Open Letter. Also not really progress. We still keep trying.
Aligning a Smarter Than Human Intelligence is Difficult. Recent roundup.
Security is Also Difficult. This too is not so covert, it turns out.
The Lighter Side. It's me, would you like fries with that?
Language Models Offer Mundane Utility
Forecast almost as well as, and sometimes better than, the wisdom of crowds using GPT-4? Paper says yes. The prompt they used is here.
This does require an intensive process.
First, we generate search queries that are used to invoke news APIs to retrieve historical articles. We initially implement a straightforward query expansion prompt (Figure 12a), instructing the model to create queries based on the question and its background. However, we find that this overlooks sub-considerations that often contribute to accurate forecasting.
To achieve broader coverage, we prompt the model to decompose the forecasting question into sub-questions and use each to generate a search query (Min et al., 2019); see Figure 12b for the prompt. For instance, when forecasting election outcomes, the first approach searches directly for polling data, while the latter creates sub-questions that cover campaign finances, economic indicators, and geopolitical events. We combine both approaches for comprehensive coverage.
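The two-stage query generation described above can be sketched as follows. This is a minimal illustration, not the paper's actual code: `ask_llm` is a stub standing in for a real GPT-4 call, and the prompts and function names are assumptions for demonstration.

```python
def ask_llm(prompt: str) -> list[str]:
    # Stub standing in for a language model call; a real system would
    # send the prompt to GPT-4 and parse the returned queries.
    if "sub-questions" in prompt:
        return [
            "campaign finance reports 2024",
            "latest economic indicators",
            "recent geopolitical events",
        ]
    return ["2024 election polling data"]

def direct_queries(question: str) -> list[str]:
    # Straightforward query expansion: queries generated directly
    # from the question and its background.
    return ask_llm(f"Generate news search queries for: {question}")

def decomposed_queries(question: str) -> list[str]:
    # Decompose the question into sub-questions, then turn each
    # sub-question into its own search query.
    subs = ask_llm(f"Decompose into sub-questions: {question}")
    return [f"news: {s}" for s in subs]

def all_queries(question: str) -> list[str]:
    # Combine both approaches and deduplicate, preserving order,
    # for broader coverage of relevant considerations.
    seen, out = set(), []
    for q in direct_queries(question) + decomposed_queries(question):
        if q not in seen:
            seen.add(q)
            out.append(q)
    return out

queries = all_queries("Who will win the 2024 US election?")
print(queries)
```

With the stub above, the direct approach yields one polling-focused query while decomposition adds queries about finances, the economy, and geopolitics, matching the election example in the text.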
Next, the system retrieves articles from news APIs using the LM-generated search queries. We evaluate 5 APIs on the relevance of the articles retrieved and select NewsCatcher and Google News (Section E.2). Our initial retrieval provides wide covera...