Man walks past a billboard depicting Boris Johnson waving Russian national flags reading 'Thank you Boris'.
The Brexit referendum in 2016 is thought to have been targeted by foreign activity on social media. Photograph: Daniel Sorabji/AFP/Getty Images

AI system detects posts by foreign ‘trolls’ on Facebook and Twitter


Researchers train a machine learning system to pick up patterns in social media posts aimed at manipulating political events

Foreign manipulation campaigns on social media can be spotted by looking at clues in the timing and length of posts and the URLs they contain, researchers have found.
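The paper does not publish its pipeline, but as a rough illustration of the kind of signals described – timing, post length and shared URLs – a minimal Python sketch might look like this. The field names 'text', 'timestamp' and 'urls' are hypothetical, not the researchers' own schema:

```python
from urllib.parse import urlparse

def extract_features(post):
    """Reduce one social media post to a few simple signals.

    `post` is assumed to be a dict with hypothetical keys:
    'text' (str), 'timestamp' (datetime) and 'urls' (list of str).
    The study's real feature set is far richer than this sketch.
    """
    domains = [urlparse(u).netloc for u in post.get("urls", [])]
    return {
        "length": len(post["text"]),            # how long the post is
        "hour_of_day": post["timestamp"].hour,  # when it was posted
        "num_urls": len(domains),               # how many links it shares
        "domains": sorted(set(domains)),        # which sites it points to
    }
```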

From the 2016 US presidential election to Brexit, a growing number of political events are thought to have been targeted by foreign activity on social media platforms such as Facebook, Twitter and Reddit.

Now researchers say they have developed an automated machine learning system – a type of artificial intelligence – that can spot such posts, based on their content.

“We can use machine learning to automatically identify the content of troll postings and track an online information operation without human intervention,” said Dr Meysam Alizadeh, of Princeton University, a co-author of the research.

The team say the approach differs from simply detecting bots, an important distinction since such campaigns often include posts written by humans. The sheer volume of posts in so-called troll campaigns, they argue, means the operators probably rely on standard templates and operating procedures.

Writing in the journal Science Advances, the team report how they carried out their work using posts from four known social media campaigns that targeted the US, attributed to China, Russia and Venezuela.

The team focused on troll posts made on Twitter between December 2015 and December 2018 and on Reddit between July 2015 and December 2016.

For comparison, they also used data from thousands of US Twitter accounts, from users engaged with US politics and average users, as well as posts from thousands of Reddit accounts not linked to foreign manipulation campaigns.

After training the system on a subset of the data, the team explored five different questions. These included whether the machine learning system could tell apart posts from trolls and those linked to normal activity, and whether feeding the system with troll posts from one month would allow it to spot posts made during the following month by new troll accounts.
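The month-to-month question amounts to a temporal hold-out test: fit on one month's posts, evaluate on the next month's. A minimal sketch of that setup, assuming a pandas DataFrame `posts` with hypothetical `month`, feature and `is_troll` columns (the paper's actual models and features differ):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def month_ahead_scores(posts: pd.DataFrame, feature_cols: list[str]) -> pd.Series:
    """Train on each month's posts, then test on the following month's.

    `posts` is a hypothetical DataFrame: one row per post, numeric
    feature columns, a 'month' column and a binary 'is_troll' label.
    """
    scores = {}
    months = sorted(posts["month"].unique())
    for train_m, test_m in zip(months, months[1:]):
        train = posts[posts["month"] == train_m]
        test = posts[posts["month"] == test_m]
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(train[feature_cols], train["is_troll"])
        scores[test_m] = clf.score(test[feature_cols], test["is_troll"])
    return pd.Series(scores)
```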

The results show the approach worked well: the posts flagged by the system generally did come from trolls. However, the system did not identify every troll post.
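In machine learning terms, that is a trade-off between precision (how many flagged posts really came from trolls) and recall (how many troll posts got flagged). A toy calculation with made-up counts, purely to illustrate the distinction – the paper reports its own metrics:

```python
# Made-up confusion counts, purely to illustrate precision vs recall.
true_positives = 850    # troll posts correctly flagged
false_positives = 50    # ordinary posts wrongly flagged
false_negatives = 150   # troll posts the system missed

precision = true_positives / (true_positives + false_positives)  # ~0.94
recall = true_positives / (true_positives + false_negatives)     # 0.85

print(f"precision={precision:.2f}, recall={recall:.2f}")
```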

“This is an important, interesting and sometimes intriguing piece of analysis,” said Prof Martin Innes, the director of Cardiff University’s Crime and Security Research Institute. “That machine learning algorithms should be able to identify similar content from within bounded datasets is perhaps to be expected, as after all there were already signals in the data that enabled them to be connected. But as the authors quite correctly clarify, there is a gap still to be bridged in terms of applying these approaches ‘in the wild’ to identify ‘live’ operations.”

The team also found differences in the system’s performance depending on the country behind the campaign, with Chinese activity easier to spot than Russian activity. “In terms of the Venezuelan campaigns, our performance is near-perfect, close to 99% accurate,” Alizadeh said.

“In terms of the Chinese one, our performance is around 90%, 91%. The Russian was the most complicated and most sophisticated campaign: our performance was around 85%.”

Alizadeh said that did not mean Russian information actors were necessarily better at blending in with regular US users. “They are good at mimicking political American users on Twitter,” he said. “But there are other reasons they aren’t discoverable. For example, the Venezuelans always talk about politics. The Russian trolls, some of them never talk about politics – they engage in hashtag games or share links to download music. Why are Russian trolls doing that? One answer could be to build their own audience.”

The authors say the work also highlights how foreign campaigns change tactics over time: for example, in the case of Russian activity, hashtag use fell after a peak in late 2015 and early 2016.

“We have had this idea from the beginning to develop a public dashboard that journalists and people can check on a daily basis, to understand what’s going on in social media with regards to foreign and domestic information operations,” Alizadeh said.

Something like that “would be a first step in securing our democracies. And then we should also extend it to other countries, other democracies and other languages.”
