
Meta's 'Cicero' AI Trounced Humans at Diplomacy Without Revealing Its True Identity

Cicero doubled the average score of human players across 40 online games, and ranked in the top 10% of players who played more than one game.

Photo: Alberto Pizzoli (Getty Images)

Ever since IBM’s Deep Blue artificial intelligence system defeated world chess champion Garry Kasparov at his own game in 1997, humanity has watched, haplessly, year after year as our code-based underlings vanquish us in ever more complicated games. There’s a catch, though. While AI bots increasingly excel at trumping humans in head-to-head adversarial board and video game matches, the systems typically fare far worse when instructed to cooperate with humans to accomplish a shared task. Researchers at Meta believe their new “Cicero” AI may have something to say about that.

For the first time, according to a new study shared with Gizmodo by Meta’s Fundamental AI Research Diplomacy Team, researchers trained an AI to attain “human level performance” in the war strategy board game Diplomacy. The new AI agent, named after the classical statesman and scholar who witnessed the fall of the Roman Republic, was able to effectively communicate and strategize with other human players, plan best methods for victory, and in some cases, even pass as a human. Now, the researchers say Cicero, which accomplished its tasks by combining dialogue and strategic reasoning models, stands as a “benchmark” for multi-AI agent learning.


Over the past twenty or so years, an increasingly impressive assortment of AI systems from a variety of companies have trounced human players in everything from Chess and Go to more modern games like Starcraft. While different in content, those games all share one fundamental similarity: they are all zero-sum, winner-takes-all competitions.


Diplomacy is different. In Diplomacy, seven players compete to control a majority of the board’s supply centers. Players constantly interact with one another, and each round begins with a series of pre-round negotiations. Crucially, Diplomacy players may attempt to deceive others and must also judge when others are lying to them. Researchers said Diplomacy is particularly challenging because it requires building trust with others “in an environment that encourages players to not trust anyone.” In other words, for an AI to “win” at Diplomacy, it needs not only to understand the rules of the game but also to fundamentally understand human interaction, deception, and cooperation, and to know how to string together sentences without sounding like a malfunctioning dishwasher.


Cicero more or less did that. Meta says Cicero more than doubled the average score of human players across 40 anonymous online Diplomacy games and ranked in the top 10% of players who played more than one game. Cicero even placed first in an eight-game tournament involving 21 participants. At every stage of the game, Cicero would model how the other competing players were likely to act based on their game performance and text conversations.
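The general idea behind that last step — predict what the other players will do, then plan your own move against that prediction — can be illustrated with a toy expected-value calculation. Everything below (the move names, payoffs, and probabilities) is invented for illustration; Cicero’s actual system uses learned neural models over a vastly richer game state.

```python
# Toy sketch of "model the other players, then plan against the model."
# Payoffs and probabilities here are hypothetical, not from Meta's study.

def expected_value(my_move, opponent_dist, payoff):
    """Average payoff of my_move over a predicted opponent-move distribution."""
    return sum(p * payoff[(my_move, opp_move)]
               for opp_move, p in opponent_dist.items())

def best_move(my_moves, opponent_dist, payoff):
    """Pick the move with the highest expected payoff against the prediction."""
    return max(my_moves, key=lambda m: expected_value(m, opponent_dist, payoff))

# Hypothetical mini-scenario: hold or attack, given a model that predicts
# the opponent defends 70% of the time.
payoff = {
    ("hold", "defend"): 1, ("hold", "attack"): 0,
    ("attack", "defend"): -1, ("attack", "attack"): 2,
}
prediction = {"defend": 0.7, "attack": 0.3}
print(best_move(["hold", "attack"], prediction, payoff))  # prints "hold"
```

The point of the sketch is that the quality of the plan depends entirely on the quality of the prediction — which is why modeling the other players from their past moves and messages matters so much in a negotiation game.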

The researchers conducted their experiment between August 19 and October 13, 2022, over 40 anonymized online webDiplomacy.net games totaling 72 hours of play. The researchers say they saw no in-game messages suggesting that human players believed they were playing against an AI. Across those 40 games, Cicero apparently “passed as a human player” against 82 unique human participants. In one case highlighted in the study, Cicero successfully changed a human player’s mind by proposing a mutually beneficial move.


Cicero was trained on a hefty chunk of previous Diplomacy data to prepare it to communicate properly with other players. The researchers say the AI was trained on a dataset of 125,261 anonymized Diplomacy games, around 40,000 of which contained dialogue. In total, that dataset contained over 12 million messages exchanged between human players.

Cicero wasn’t perfect, though. The AI’s dialogue was mostly limited to actions in its current turn, and Cicero wasn’t great at modeling how its dialogue with one player could affect its relationships with others over the long run. It also occasionally sent messages that contained “grounding errors,” or messages that contradicted its own plans. (It’s worth noting that humans often make those same mistakes.) Still, despite those caveats, the researchers said Cicero should take its place in the AI board-gaming hall of fame thanks to its unique ability to cooperate with humans.


While this is just one study from one board game, Meta’s new findings offer a potentially novel, and less apocalyptic, lens through which to view incremental AI successes. Rather than feeling creeped out by AI systems gradually beating humans at games we once held dear, Cicero hints at a future where humans and AIs could work side by side as partners, or at the very least, mutual acquaintances, to solve problems.