Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Humanity vs. AGI" Will Never Look Like "Humanity vs. AGI" to Humanity, published by Thane Ruthenis on December 16, 2023 on LessWrong.
When discussing AGI Risk, people often talk about it in terms of a war between humanity and an AGI. Comparisons between the amounts of resources at both sides' disposal are brought up and factored in, big impressive nuclear stockpiles are sometimes waved around, etc.
I'm pretty sure it's not how that'd look like, on several levels.
1. Threat Ambiguity
I think what people imagine, when they imagine a war, is Terminator-style movie scenarios where the obviously evil AGI becomes obviously evil in a way that's obvious to everyone, and then it's a neatly arranged white-and-black humanity vs. machines all-out fight. Everyone sees the problem, and knows everyone else sees it too, the problem is common knowledge, and we can all decisively act against it.[1]
But in real life, such unambiguity is rare. The monsters don't look obviously evil, the signs of fatal issues are rarely blatant.
And if you're not that sure, well...
Better not act up. Better not look like you're panicking. Act very concerned, sure, but in a calm, high-status manner. Provide a measured response. Definitely don't take any drastic, unilateral actions. After all, what if you do, but the threat turns out not to be real? Depending on what you've done, the punishment inflicted might range from embarrassment to complete social ostracization, and the fear of those is much more acute in our minds, compared to some vague concerns about death.
And the AGI, if it's worth the name, would not fail to exploit this. Even when it starts acting to amass power, there would always be a prosocial, plausible-sounding justification for why it's doing that. It'd never stop making pleasant noises about having people's best interests at heart. It'd never stop being genuinely useful to someone. It'd ensure that there's always clear, unambiguous harm in shutting it down. It would ensure that the society as a whole is always doubtful regarding its intentions - and thus, that no-one would feel safe outright attacking it.
Much like there's no fire alarm for AGI, there would be no fire alarm for the treacherous turn. There would never be a moment, except maybe right before the end, where "we must stop the malign AGI from killing us all!" would sound obviously right to everyone. This sort of message would always appear a bit histrionic, an extremist stance that no respectable person would shout out. There would always be fear that if we act now, we'll then turn around and realize that we jumped at shadows. Right until the end, humans will fight using slow, ineffectual, "measured" responses.
The status-quo bias, asymmetric justice, the Copenhagen Interpretation of Ethics, threat ambiguity - all of that would be acting to ensure this.
There's a world of difference between 90% confidence and 99% confidence, when it comes to collective action. And the AGI would need to screw up very badly indeed, for the whole society to become 99% certain it's malign.
2. Who Are "We"?
Another error is thinking about a unitary response from some ephemeral "us". "We" would fight the AGI, "we" would shut it down, "we" would not give it power over the society / the economy / weapons / factories.
But who are "we"? Humanity is not a hivemind; we don't even have a world government. Humans are, in fact, notoriously bad at coordination. So if you're imagining "us" naturally responding to the threat in some manner that, it seems, is guaranteed to prevail against any AGI adversary incapable of literal mind-hacking...
Are you really, really sure that "we", i. e. the dysfunctional mess of the human civilization, are going to respond in this manner? Are you sure you're not falling prey to the Typical Mind Fallacy, when you're imagining a...
view more