How Do AI Text Detectors Work and Are They Accurate?

Janie S.

ChatGPT has led to the creation of a whole new ecosystem of products and services built around AI text generation. Matching this, many AI content detectors have emerged as counterparts with bold claims.

Today, we get to the bottom of it. Is it true? Can they tell the difference between human-written and AI-generated content? If so, how does that work?

We will tell you all you need to know about these software tools and whether or not they are worth using.

How Do AI Text Detectors Work?

AI text detectors work much in the same way as AI text generators do: based on probabilities.

AI generators complete phrases based on the most likely outcome. Here are three examples:

  • "That's one small step for a man, one giant leap for ____" – AI would likely fill in "mankind."
  • "The love of money is the root of all ____" – The Bible and the countless other publications AI was trained on would indicate that "evil" is the right answer.
  • "I'm gonna make him an offer he can't ____" – Anybody who's seen The Godfather knows the answer is "refuse." However, that's also what makes the most sense from any perspective.

Now, here's where it gets interesting. AI detectors use those probabilities as well. If your sentences keep using the most probable next word, your text may get tagged as AI-generated. That is our first red flag, because you wouldn't want to change the meaning of your sentence just to avoid being flagged as AI-generated.
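To make the idea of a "most likely next word" concrete, here is a minimal Python sketch. It is a toy word-pair counter, not how ChatGPT or any real detector is built, and the tiny corpus and the function names (build_bigram_counts, next_word_probs) are made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models are trained on vastly more text.
CORPUS = (
    "an offer he can't refuse . "
    "an offer he can't refuse . "
    "an offer he can't ignore . "
    "the root of all evil . "
    "the root of all evil . "
    "the root of all problems ."
)

def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Return {word: probability} for the words seen after `prev`."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}

counts = build_bigram_counts(CORPUS)
probs = next_word_probs(counts, "can't")
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 2))
```

In this toy corpus, "refuse" follows "can't" two times out of three, so it wins. A detector built on the same principle would treat a writer who picks "refuse" as more machine-like than one who picks "ignore", even though "refuse" is simply the natural choice.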

AI detectors also look at two factors when assessing a text: perplexity and burstiness.

Perplexity

Perplexity boils down to unpredictability. If the detector is "surprised" by the word that comes next, it will rate the text as more likely to be human-written. AI language models are trained to produce texts with low perplexity. The words used are what you'd expect, but that also makes them more predictable.
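For the curious, perplexity is commonly defined as the exponential of the average negative log-probability per word. The sketch below applies that textbook formula in Python. The probability lists are hypothetical numbers chosen for illustration; a real detector would get them from a language model, and vendors do not publish their exact scoring.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical per-word probabilities a language model might assign.
predictable = [0.9, 0.8, 0.9, 0.85]  # every word is the expected one
surprising = [0.2, 0.05, 0.1, 0.02]  # unusual word choices

print(perplexity(predictable) < perplexity(surprising))
```

The predictable list scores a perplexity close to 1, while the surprising list scores far higher. Under this measure, plain and clear wording is exactly what looks "machine-like".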

That means a regular sentence any audience can understand would be more likely to get flagged. To evade perplexity accusations, you have to write something with high perplexity like: "The loquacious feline perched atop the plush, velvety furnishing, its countenance exuding an air of insouciant contentment."

Yikes.

AI detectors also work on the premise that human writing uses more creative language choices but also commits more typos. But what about Grammarly? Any writer worth their salt will run their text through software to correct errors.

Companies behind AI detectors have admitted that texts are flagged for being grammatically correct, especially when corrected with a software tool like Grammarly. The reasoning behind this is that the text has been edited with AI, so it becomes unnatural and inhuman.

Burstiness

Burstiness refers to the variation in the number of words used and the structure of the sentences in a text. AI content often has lower burstiness than human writing, so detectors want some very short sentences and other very long sentences to be seen as human-written. However, long, complicated sentences are not recommended when writing content that aims to be readable for a broad audience.
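One simple way to put a number on burstiness is the spread of sentence lengths relative to their average. The Python sketch below uses that measure (the coefficient of variation). This is an assumption made for illustration: detector vendors do not publish their formulas, and the naive sentence splitter here would stumble on abbreviations like "e.g."

```python
import re
import statistics

def sentence_lengths(text):
    """Split on ., ! or ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Standard deviation of sentence length divided by the mean length."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # a single sentence has no variation to measure
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = (
    "Stop. The cat, which had been asleep on the sofa all afternoon, "
    "finally woke up and stretched. Quiet again."
)

print(burstiness(uniform), burstiness(varied))
```

The uniform sample scores 0.0 because every sentence has four words, while the varied sample scores well above that. Notice that the first sample is also the easier one to read.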

This raises the question of whether AI content detectors are looking for things that go against many good writing practices.

Are AI Detectors Accurate?

In a word, no. They are not accurate. Critics see AI text detector companies as capitalizing on our need for control, giving the illusion of having control over AI text without any concrete proof that the software works reliably.

In one incident, a user ran the US Constitution and the biblical book of Genesis through an AI detector, and both came back flagged as AI-generated. By that logic, there were robot writers in 1787 to help write the Constitution.

Would you trust or pay an advisor who got their facts wrong like this?

OpenAI Statement

If you need more proof, perhaps you'll be more trusting of the pioneers of generative AI themselves: OpenAI. They are the team behind ChatGPT and even had their own AI detector at one point, which they quickly sunsetted.

In a statement issued especially to teachers using these tools to unjustly fail students, OpenAI admitted AI detectors do not work.

Dependent on Industry

It's important to note that some industries naturally use more formal language than others, along with a steadier, more serious tone. You won't worry about perplexity or burstiness when writing a technical text. You worry about factuality.

Likewise, typos are unacceptable in academic papers, so you'll remove all of them and take no liberties, following English language rules to the letter. That will also get your text flagged more easily.

Originality AI Fake Claims

Then, we have what many would consider to be outright lies by some companies, such as Originality AI. Their approach to selling the product is based on using statistics to prove reliability.

More specifically, they use percentages to represent false positives and successful AI detection. These figures seem to be drawn out of thin air with very small datasets (which are exclusively owned and controlled by the company).
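To see why dataset size matters for such percentages, here is a minimal Python sketch of how a false positive rate and a detection rate are typically computed from labeled samples. The data is invented for illustration and has nothing to do with Originality AI's actual tests; the function name detection_rates is hypothetical.

```python
def detection_rates(is_ai, flagged):
    """Return (false_positive_rate, detection_rate) for paired lists.

    is_ai[i] is True if text i really was AI-written.
    flagged[i] is True if the detector flagged text i as AI.
    """
    if len(is_ai) != len(flagged):
        raise ValueError("lists must have the same length")
    human_flags = [f for truth, f in zip(is_ai, flagged) if not truth]
    ai_flags = [f for truth, f in zip(is_ai, flagged) if truth]
    if not human_flags or not ai_flags:
        raise ValueError("need at least one human and one AI sample")
    false_positive_rate = sum(human_flags) / len(human_flags)
    detection_rate = sum(ai_flags) / len(ai_flags)
    return false_positive_rate, detection_rate

# Invented test set: 10 human texts followed by 10 AI texts.
is_ai = [False] * 10 + [True] * 10
# The detector wrongly flags 1 human text and misses 1 AI text.
flagged = [True] + [False] * 9 + [True] * 9 + [False]

fpr, rate = detection_rates(is_ai, flagged)
print(fpr, rate)
```

With only ten texts per group, a single mistake moves the reported rate by ten percentage points, so a headline figure means little unless you know how many samples are behind it and who chose them.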

When Model 2.0 of their software came out, they proudly announced it had a 99% success rate. Then, less than a year later, they launched a new model: Turbo 3.0, which should somehow be better. Yet, it also has a “99% success rate”, and they now admit that version 2 has a 90% success rate.

Since the math doesn't add up, it can only mean that the company has been inflating its success rate and misleading customers.

Topcontent Research

Finally, Topcontent is a content creation platform founded in 2013. They conducted research in the hopes of finding a good "AI detection" tool. To do so, they used different versions of texts written on their platform in 2016 (well before AI text generators like ChatGPT were available).

What they found was really interesting. The better the text was, the higher the "AI score" it got from the detectors. A first draft written by a human would get a low "AI probability" score. But after an editor polished it up and proofread it, the detector would flag it as more likely to be AI-generated.

It's contradictory, but if we return to our previous section about how AI content detectors work, it makes sense. AI text generators are trained on human text and grammar rules. So, when an editor comes around and applies those rules, the detector is only following its training by flagging it as AI-generated.

Consequences of Relying on AI Detectors

What happens when you rely on AI detectors? Well, for one, you are wasting money on something that doesn't work. However, the consequences can be much more severe. There have been many stories online of students being falsely accused by teachers who rely on these tools, putting the students' futures at risk.

Thankfully, some schools like Vanderbilt University are disabling their use of Turnitin's AI detector.

Finally, from a business standpoint, if you start trying to change your text just to avoid AI detection, you can end up with something that sounds unnatural, is littered with mistakes, and is of overall lower quality. Since Google and other search engines prioritize quality above all, your rankings would suffer.

Speaking of Google, even they are all in on the AI game with Gemini, and there are no SEO penalties for using AI in your web content as long as it provides value to the end reader.

Bottom Line

As for the reliability of AI detectors, let's recap: No, AI detectors do not work accurately.

An actual detective looks for concrete clues. AI detectors are more like psychics putting people in jail over a hunch or intuition (probabilities). There are no watermarks, thumbprints, or definite clues to "detect".

Regardless of whether you use AI content and what you use it for, make sure that your text always comes across in the desired way. Just as you can make a text sound unnatural, AI can do the same right from the start if your prompts are not precise.


I make sure that companies don't forget who they are and what they stand for. Social media marketing is more than posting a vivid picture or video every now and then. Consistency and strategy are the drivers of any successful brand.