How to Detect AI-Generated Text, According to Researchers

ChatGPT is not as random as a human—for now.
Illustration: James Marshall

AI-generated text, from tools like ChatGPT, is starting to impact daily life. Teachers are testing it out as part of classroom lessons. Marketers are champing at the bit to replace their interns. Memers are going buck wild. Me? It would be a lie to say I’m not a little anxious about the robots coming for my writing gig. (ChatGPT, luckily, can’t hop on Zoom calls and conduct interviews just yet.)

With generative AI tools now publicly accessible, you’ll likely encounter more synthetic content while surfing the web. Some instances might be benign, like an auto-generated BuzzFeed quiz about which deep-fried dessert matches your political beliefs. (Are you a Democratic beignet or a Republican zeppole?) Other instances could be more sinister, like a sophisticated propaganda campaign from a foreign government.

Academic researchers are looking into ways to detect whether a string of words was generated by a program like ChatGPT. Right now, what’s a decisive indicator that whatever you’re reading was spun up with AI assistance?

A lack of surprise.

Entropy, Evaluated

Algorithms with the ability to mimic the patterns of natural writing have been around for a few more years than you might realize. In 2019, Harvard and the MIT-IBM Watson AI Lab released an experimental tool that scans text and highlights words based on their level of randomness.

Why would this be helpful? An AI text generator is fundamentally a mystical pattern machine: superb at mimicry, weak at throwing curve balls. Sure, when you type an email to your boss or send a group text to some friends, your tone and cadence may feel predictable, but there's an underlying capricious quality to our human style of communication.

Edward Tian, a student at Princeton, went viral earlier this year with a similar experimental tool, called GPTZero, targeted at educators. It gauges the likelihood that a piece of content was generated by ChatGPT based on its “perplexity” (aka randomness) and “burstiness” (aka variance). OpenAI, which is behind ChatGPT, dropped another tool made to scan text that’s over 1,000 characters long and make a judgment call. The company is up-front about the tool’s limitations, like false positives and limited efficacy outside English. Just as English-language data is often of the highest priority to those behind AI text generators, most tools for AI-text detection are currently best suited to benefit English speakers.
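To make those two terms concrete, here is a minimal sketch in Python. It is not GPTZero's code, which this article does not describe. It assumes a toy word-frequency model built from a reference text where a real detector would use a large language model, and it assumes that burstiness means the spread of perplexity from one sentence to the next. Every function name is made up for illustration.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_unigram_model(reference_text):
    """Return a function giving a smoothed probability for any word.

    Assumption: a toy stand-in for the language model a real detector uses.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        # Add-one smoothing so unseen words get a small, nonzero probability.
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def perplexity(words, prob):
    """exp of the average negative log-probability: higher means more surprise."""
    if not words:
        raise ValueError("need at least one word")
    avg_neg_log = -sum(math.log(prob(w)) for w in words) / len(words)
    return math.exp(avg_neg_log)


def burstiness(text, prob):
    """Spread (standard deviation) of per-sentence perplexity."""
    sentences = [tokenize(s) for s in re.split(r"[.!?]+", text)]
    scores = [perplexity(s, prob) for s in sentences if s]
    return pstdev(scores) if len(scores) > 1 else 0.0
```

In this toy setup, a passage whose sentences are all equally predictable scores a burstiness of zero, while one that mixes familiar and unfamiliar words scores higher. Where to draw the line between human and machine is a separate question, and the sketch does not try to answer it.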

Could you sense if a news article was composed, at least in part, by AI? “These AI generative texts, they can never do the job of a journalist like you, Reece,” says Tian. It’s a kind-hearted sentiment. CNET, a tech-focused website, published multiple articles written by algorithms and dragged across the finish line by a human. ChatGPT, for the moment, lacks a certain chutzpah, and it occasionally hallucinates, which could be an issue for reliable reporting. Everyone knows qualified journalists save the psychedelics for after-hours.

Entropy, Imitated

While these detection tools are helpful for now, Tom Goldstein, a computer science professor at the University of Maryland, sees a future where they become less effective, as natural language processing grows more sophisticated. “These kinds of detectors rely on the fact that there are systematic differences between human text and machine text,” says Goldstein. “But the goal of these companies is to make machine text that is as close as possible to human text.” Does this mean all hope of synthetic media detection is lost? Absolutely not.

Goldstein worked on a recent paper researching possible watermark methods that could be built into the large language models powering AI text generators. It’s not foolproof, but it’s a fascinating idea. Remember, ChatGPT tries to predict the next likely word in a sentence and compares multiple options during the process. A watermark might be able to designate certain word patterns to be off-limits for the AI text generator. So, when the text is scanned and the watermark rules are broken multiple times, it indicates a human being likely banged out that masterpiece.
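Here is one way the scanning step could look, as a rough Python sketch rather than the method from Goldstein's paper. The assumptions are mine: the off-limits rule hashes each pair of neighboring words and bans roughly half of all pairs, the "generator" is a toy that simply picks the first allowed word, and all function names are hypothetical.

```python
import hashlib
import math


def is_off_limits(prev_word, word):
    """Hypothetical rule: hash the word pair; about half of all pairs are banned."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode("utf-8")).digest()
    return digest[0] % 2 == 1


def count_violations(words):
    """Count neighboring word pairs that break the watermark rule."""
    return sum(is_off_limits(a, b) for a, b in zip(words, words[1:]))


def watermark_z_score(words):
    """How far the share of allowed pairs sits above the 50 percent expected by chance.

    A large positive score hints at a watermarked generator; a score near
    zero means the rules were broken about as often as chance predicts.
    """
    pairs = len(words) - 1
    if pairs < 1:
        raise ValueError("need at least two words")
    allowed = pairs - count_violations(words)
    return (allowed - 0.5 * pairs) / math.sqrt(0.25 * pairs)


def generate_watermarked(vocab, start, length):
    """Toy generator that always picks the first allowed next word."""
    words = [start]
    for _ in range(length):
        options = [w for w in vocab if not is_off_limits(words[-1], w)]
        # Fall back to any word in the unlikely case that every option is banned.
        words.append(options[0] if options else vocab[0])
    return words
```

Text from the toy generator should break the rules almost never, so its score climbs as the passage gets longer. A writer who has never heard of the rules breaks them about half the time and lands near zero. That also shows why short snippets are hard to judge: with only a handful of word pairs, chance alone can produce either result.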

Micah Musser, a research analyst at Georgetown University’s Center for Security and Emerging Technology, expresses skepticism about whether this watermarking style will actually work as intended. Wouldn’t a bad actor try to get their hands on a non-watermarked version of the generator? Musser contributed to a paper studying mitigation tactics to counteract AI-fueled propaganda. OpenAI and the Stanford Internet Observatory were also part of the research, laying out key examples of potential misuse as well as detection opportunities.

One of the paper’s core ideas for synthetic-text spotting builds off Meta’s 2020 look into the detection of AI-generated images. Instead of relying on changes made by those in charge of the model, developers and publishers could flick a few drops of poison into their online data and wait for it to be scraped up as part of the big ole data set that AI models are trained on. Then, a computer could attempt to find trace elements of the poisoned, planted content in a model’s output.

The paper acknowledges that the best way to avoid misuse would be to not create these large language models in the first place. And in lieu of going down that path, it posits AI-text detection as a unique predicament: “It seems likely that, even with the use of radioactive training data, detecting synthetic text will remain far more difficult than detecting synthetic image or video content.” Radioactive data is a difficult concept to transpose from images to word combinations. A picture brims with pixels; a tweet can be five words.

What unique qualities are left to human-composed writing? Noah Smith, a professor at the University of Washington and NLP researcher at the Allen Institute for AI, points out that while the models may appear to be fluent in English, they still lack intentionality. “It really messes with our heads, I think,” Smith says. “Because we've never conceived of what it would mean to have fluency without the rest. Now we know.” In the future, you may need to rely on new tools to determine whether a piece of media is synthetic, but the advice for not writing like a robot will remain the same.

Avoid the rote, and keep it random.