November 23, 2022
Meta has unveiled Cicero, an AI model that has learned to collaborate with people and build rapport by playing the strategy game Diplomacy.
In Diplomacy, players control the armed forces of various European powers in the years leading up to World War I. Cicero learned to communicate with multiple players at once to build rapport and achieve their shared goals. Meta claims it is the first AI to play Diplomacy at a human level.
AI systems learning to play games is nothing new. This year marks 25 years since IBM’s Deep Blue defeated chess grandmaster Garry Kasparov.
Cicero’s task, however, differs from games like chess. Diplomacy requires players to understand people’s motivations and perspectives – as well as make plans, adjust strategies and use language to convince people to form alliances – something AI has previously found difficult.
According to Meta’s AI team, Cicero achieved human-level performance playing the web version of Diplomacy. It racked up more than double the average score of the human players, ranking in the top 10% of participants who played more than one game.
Ultimately, the team hopes that Cicero's skills can be translated into practical applications for the real world, something that is not feasible for computers that play chess or Go.
“What’s truly exciting about technology like Cicero is how it could play a bit part in our lives someday – helping us collaborate, connect and learn, in the physical world and in the metaverse,” said Joe Spisak, director of product management at Meta AI.
“Unlike games like Chess and Go, Diplomacy is a game about people rather than pieces,” according to Meta. “If an agent can't recognize that someone is likely bluffing or that another player would see a certain move as aggressive, it will quickly lose the game. Likewise, if it doesn't talk like a real person - showing empathy, building relationships, and speaking knowledgeably about the game - it won't find other players willing to work with it.”
To get the AI model to play the game, Meta’s AI researchers developed new techniques for strategic reasoning and natural language processing. Using those techniques, Cicero can infer that later in the game it will need the support of one particular player, and then craft a strategy to win that person’s favor.
Powering Cicero is a controllable dialogue model coupled with a strategic reasoning engine.
Meta explains how the model operates: “At each point in the game, Cicero looks at the game board and its conversation history, and models how the other players are likely to act. It then uses this plan to control a language model that can generate free-form dialogue, informing other players of its plans and proposing reasonable actions for the other players that coordinate well with them.”
How it works
Using the board state and current dialogue, Cicero makes an initial prediction of what everyone will do. The model then refines that prediction and uses the result to form an intent for itself and its partner.
It generates several candidate messages based on the board state, dialogue and its intents. It then filters the candidate messages to reduce nonsense, maximize value and ensure consistency with its intents.
Meta used several filtering mechanisms, including classifiers trained to distinguish between human-written and model-generated text, to “ensure that our dialogue is sensible, consistent with the current game state and previous messages, and strategically sound.”
A paper outlining Cicero from Meta’s researchers states that the company deployed additional filters to “detect toxic language and heuristics to curb bad behaviors including repetition and off-topic messages.”
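The steps described in this section (predict, refine, form an intent, generate candidates, filter) can be illustrated with a small Python sketch. This is not Meta's code: every function name, the toy board and the three simple filters are hypothetical stand-ins, and a real system would use learned models where this sketch uses string rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Intent:
    """Planned move for the agent and the move it hopes its partner makes."""
    own_move: str
    partner_move: str


def predict_moves(board: Dict[str, str]) -> Dict[str, str]:
    # Toy initial prediction (assumption): every power holds its position.
    return {power: f"hold {region}" for power, region in board.items()}


def refine_prediction(prediction: Dict[str, str], dialogue: List[str]) -> Dict[str, str]:
    # Toy refinement (assumption): trust any announcement of the form
    # "POWER: I will <order>" found in the conversation history.
    refined = dict(prediction)
    for line in dialogue:
        speaker, _, text = line.partition(": ")
        if speaker in refined and text.startswith("I will "):
            refined[speaker] = text[len("I will "):]
    return refined


def form_intent(prediction: Dict[str, str], me: str, partner: str) -> Intent:
    return Intent(own_move=prediction[me], partner_move=prediction[partner])


def generate_candidates(intent: Intent, me: str) -> List[str]:
    # Stand-in for a dialogue model: two poor candidates and one good one.
    return [
        "",
        f"{me}: Nice weather today.",
        f"{me}: I plan to {intent.own_move}; could you {intent.partner_move}?",
    ]


Filter = Callable[[str, Intent, List[str]], bool]


def is_non_empty(message: str, intent: Intent, dialogue: List[str]) -> bool:
    return bool(message.strip())


def is_not_repeated(message: str, intent: Intent, dialogue: List[str]) -> bool:
    return message not in dialogue


def matches_intent(message: str, intent: Intent, dialogue: List[str]) -> bool:
    return intent.own_move in message and intent.partner_move in message


def choose_message(board: Dict[str, str], dialogue: List[str], me: str,
                   partner: str, filters: List[Filter]) -> Optional[str]:
    prediction = refine_prediction(predict_moves(board), dialogue)
    intent = form_intent(prediction, me, partner)
    for candidate in generate_candidates(intent, me):
        # Keep the first candidate that passes every filter.
        if all(check(candidate, intent, dialogue) for check in filters):
            return candidate
    return None


FILTERS = [is_non_empty, is_not_repeated, matches_intent]
board = {"AUSTRIA": "Vienna", "ITALY": "Venice"}
dialogue = ["ITALY: I will move to Tyrolia"]
print(choose_message(board, dialogue, "AUSTRIA", "ITALY", FILTERS))
```

In this toy run, the empty message and the off-topic message are rejected, and the only candidate that mentions both planned moves is sent. If every candidate fails a filter, the function returns None rather than sending a poor message.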
To be sure, the model – and its filtering – isn't perfect. Meta says that Cicero sometimes generates inconsistent dialogue that can undermine its objectives.
For example, while playing as the Austro-Hungarian Empire, the AI agent contradicts itself by asking Italy to take Venice.
“While our suite of filters aims to detect these sorts of mistakes, it is not perfect,” Meta said.
Meta has open-sourced the model, with the code for Cicero available on GitHub.