Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #69: Nice, published by Zvi on June 20, 2024 on LessWrong.
Nice job breaking it, hero, unfortunately. Ilya Sutskever, despite what I sincerely believe are the best of intentions, has decided to be the latest to do The Worst Possible Thing, founding a new AI company explicitly looking to build ASI (superintelligence). The twists are zero products with a 'cracked' small team, which I suppose is an improvement, and calling it Safe Superintelligence, which I do not suppose is an improvement.
How is he going to make it safe? His statements tell us nothing meaningful about that.
There were also changes to SB 1047. Most of them can be safely ignored. The big change is getting rid of the limited duty exemption, because it seems I was one of about five people who understood it, and everyone kept thinking it was a requirement for companies instead of an opportunity. And the literal chamber of commerce fought hard to kill the opportunity. So now that opportunity is gone.
Donald Trump talked about AI. He has thoughts.
Finally, if it is broken, and perhaps the 'it' is 'your cybersecurity,' how about fixing it? Thus, a former NSA director joins the board of OpenAI. A bunch of people are not happy about this development, and yes I can imagine why. There is a history, perhaps.
Remaining backlog update: I still owe updates on the OpenAI Model spec, Rand report and Seoul conference, and eventually The Vault. We'll definitely get the model spec next week, probably on Monday, and hopefully more. Definitely making progress.
Table of Contents
Other AI posts this week: On DeepMind's Frontier Safety Framework, OpenAI #8: The Right to Warn, and The Leopold Model: Analysis and Reactions.
1. Introduction.
2. Table of Contents.
3. Language Models Offer Mundane Utility. DeepSeek could be for real.
4. Language Models Don't Offer Mundane Utility. Careful who you talk to about AI.
5. Fun With Image Generation. His full story can finally be told.
6. Deepfaketown and Botpocalypse Soon. Every system will get what it deserves.
7. The Art of the Jailbreak. Automatic red teaming. Requires moderation.
8. Copyright Confrontation. Perplexity might have some issues.
9. A Matter of the National Security Agency. Paul Nakasone joins OpenAI board.
10. Get Involved. GovAI is hiring. Your comments on SB 1047 could help.
11. Introducing. Be the Golden Gate Bridge, or anything you want to be.
12. In Other AI News. Is it time to resign?
13. Quiet Speculations. The quest to be situationally aware shall continue.
14. AI Is Going to Be Huuuuuuuuuuge. So sayeth The Donald.
15. SB 1047 Updated Again. No more limited duty exemption. Democracy, ya know?
16. The Quest for Sane Regulation. Pope speaks truth. Mistral CEO does not.
17. The Week in Audio. A few new options.
18. The ARC of Progress. Francois Chollet goes on Dwarkesh, offers $1mm prize.
19. Put Your Thing In a Box. Do not open the box. I repeat. Do not open the box.
20. What Will Ilya Do? Alas, create another company trying to create ASI.
21. Actual Rhetorical Innovation. Better names might be helpful.
22. Rhetorical Innovation. If at first you don't succeed.
23. Aligning a Smarter Than Human Intelligence is Difficult. How it breaks down.
24. People Are Worried About AI Killing Everyone. But not maximally worried.
25. Other People Are Not As Worried About AI Killing Everyone. Here they are.
26. The Lighter Side. It cannot hurt to ask.
Language Models Offer Mundane Utility
Coding rankings dropped from the new BigCodeBench (blog) (leaderboard).
Three things jump out.
1. GPT-4o is dominating by an amount that doesn't match people's reports of practical edge. I saw a claim that it is overtrained on vanilla Python, causing it to test better than it plays in practice. I don't know.
2. The gap from Gemini 1.5 Flash to Gemini 1.5 Pro and GPT-4-Turbo is very small. Gemini ...