Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #54: Clauding Along, published by Zvi on March 8, 2024 on LessWrong.
The big news this week was of course the release of Claude 3.0 Opus, likely in some ways the best available model right now. Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier. We will learn more about its ultimate full capabilities over time.
We also got quite the conversation about big questions of one's role in events, which I immortalized as Read the Roon. Since publication Roon has responded, which I have edited into the post along with some additional notes.
That still leaves plenty of fun for the full roundup. We have spies. We have accusations of covert racism. We have Elon Musk suing OpenAI. We have a new summary of simulator theory. We have NIST, tasked with AI regulation, literally struggling to keep a roof over their head. And more.
Table of Contents
Introduction.
Table of Contents.
Language Models Offer Mundane Utility. Predict the future.
Language Models Don't Offer Mundane Utility. Provide basic info.
LLMs: How Do They Work? Emmett Shear rederives simulators, summarizes.
Copyright Confrontation. China finds a copyright violation. Curious.
Oh Elon. He sues OpenAI to… force it to change its name? Kind of, yeah.
DNA Is All You Need. Was I not sufficiently impressed with Evo last week?
GPT-4 Real This Time. A question of intelligence.
Fun With Image Generation. Be careful not to have too much fun.
Deepfaketown and Botpocalypse Soon. This will not give you a hand.
They Took Our Jobs. They gave us a few back. For now, at least.
Get Involved. Davidad will have a direct report; it could be you.
Introducing. An AI-based RPG will never work, until one does.
In Other AI News. The fallout continues, also other stuff.
More on Self-Awareness. Not the main thing to worry about.
Racism Remains a Problem for LLMs. Covert is a generous word for this.
Project Maven. Yes, we are putting the AIs in charge of weapon targeting.
Quiet Speculations. Claimed portents of various forms of doom.
The Quest for Sane Regulation. NIST might need a little help.
The Week in Audio. Sergey Brin Q&A.
Rhetorical Innovation. It is not progress. We still keep trying.
Another Open Letter. Also not really progress. We still keep trying.
Aligning a Smarter Than Human Intelligence is Difficult. Recent roundup.
Security is Also Difficult. This too is not so covert, it turns out.
The Lighter Side. It's me, would you like fries with that?
Language Models Offer Mundane Utility
Forecast almost as well as, and sometimes better than, the wisdom of crowds using GPT-4? Paper says yes. The prompt they used is here.
This does require an intensive process.
First, we generate search queries that are used to invoke news APIs to retrieve historical articles. We initially implement a straightforward query expansion prompt (Figure 12a), instructing the model to create queries based on the question and its background. However, we find that this overlooks sub-considerations that often contribute to accurate forecasting.
To achieve broader coverage, we prompt the model to decompose the forecasting question into sub-questions and use each to generate a search query (Min et al., 2019); see Figure 12b for the prompt. For instance, when forecasting election outcomes, the first approach searches directly for polling data, while the latter creates sub-questions that cover campaign finances, economic indicators, and geopolitical events. We combine both approaches for comprehensive coverage.
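The two-stage query generation described above can be sketched as follows. This is a minimal illustration, not the paper's actual code: `ask_llm` is a stub standing in for a real GPT-4 call, and the prompts and function names are assumptions for demonstration.

```python
def ask_llm(prompt: str) -> list[str]:
    # Stub standing in for a language model call; a real system would
    # send the prompt to GPT-4 and parse the returned queries.
    if "sub-questions" in prompt:
        return [
            "campaign finance reports 2024",
            "latest economic indicators",
            "recent geopolitical events",
        ]
    return ["2024 election polling data"]

def direct_queries(question: str) -> list[str]:
    # Straightforward query expansion: queries generated directly
    # from the question and its background.
    return ask_llm(f"Generate news search queries for: {question}")

def decomposed_queries(question: str) -> list[str]:
    # Decompose the question into sub-questions, then turn each
    # sub-question into its own search query.
    subs = ask_llm(f"Decompose into sub-questions: {question}")
    return [f"news: {s}" for s in subs]

def all_queries(question: str) -> list[str]:
    # Combine both approaches and deduplicate, preserving order,
    # for broader coverage of relevant considerations.
    seen, out = set(), []
    for q in direct_queries(question) + decomposed_queries(question):
        if q not in seen:
            seen.add(q)
            out.append(q)
    return out

queries = all_queries("Who will win the 2024 US election?")
print(queries)
```

With the stub above, the direct approach yields one polling-focused query while decomposition adds queries about finances, the economy, and geopolitics, matching the election example in the text.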
Next, the system retrieves articles from news APIs using the LM-generated search queries. We evaluate 5 APIs on the relevance of the articles retrieved and select NewsCatcher and Google News (Section E.2). Our initial retrieval provides wide covera...