Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Cooperative AI: Three things that confused me as a beginner (and my current understanding), published by C Tilli on April 17, 2024 on The Effective Altruism Forum.
I started working in cooperative AI almost a year ago, and because it is such a new field I found it quite confusing at times: there is very little introductory material aimed at beginners. My hope with this post is that, by summing up my own confusions and my current understanding of them, I might speed up the process for others who want to get a grasp of what cooperative AI is.
I work at the Cooperative AI Foundation (CAIF), and there will be much more polished and official material coming from there, so this is just a quick personal write-up to get something out in the meantime. We're working on a cooperative AI curriculum that should be published within the next couple of months, and we're also organising a summer school in June for people new to the area (application deadline April 26th).
Contradictory definitions
When I started to learn about cooperative AI I came across a lot of different definitions of the concept. While drafting this post I dug up my old interview preparation doc for my current job, where I had listed different descriptions of cooperative AI that I had found while reading up:
"the objective of this research would be to study the many aspects of the problems of cooperation and to innovate in AI to contribute to solving these problems"
"AI research trying to help individuals, humans and machines, to find ways to improve their joint welfare"
"AI research which can help contribute to solving problems of cooperation"
"building machine agents with the capabilities needed for cooperation"
"building tools to foster cooperation in populations of (machine and/or human) agents"
"conducting AI research for insight relevant to problems of cooperation"
To me, this did not paint a very clear picture, and I was pretty frustrated at being unable to find a concise answer to the most basic question: what is cooperative AI, and what is it not?
At this point I still don't have a clear, final definition, but I am less frustrated by that, because I no longer think it reflects a failure of understanding or of communication. The situation is simply that the field is so new that there is no single definition that people working in it agree on, and there is still an ongoing discussion about where the boundaries should be drawn.
That said, my current favourite explanation of cooperative AI is this: while AI alignment deals with the question of how to make one powerful AI system behave in a way that is aligned with (good) human values, cooperative AI is about making things go well with powerful AI systems in a messy world where there may be many different AI systems, many different humans and human groups, and different sets of (sometimes contradictory) values.
Another recurring framing is that cooperative AI is about improving the cooperative intelligence of advanced AI, which raises the question of what cooperative intelligence is. Here, too, many different versions are in circulation, but the one I find most useful so far is this:
Cooperative intelligence is an agent's ability to achieve their goals in ways that also promote social welfare, in a wide range of environments and with a wide range of other agents.
Is this really different from alignment?
The second major issue I had was figuring out how cooperative AI really differs from AI alignment. The description of "cooperative intelligence" seemed like it could be understood as just a certain framing of alignment: "achieve the goals in a way that is also good for everyone".
As I have been learning more about cooperative AI, it seems to me that the term "cooperative intelligence" is best understood in the context of social dilemmas...