What is an AI detector false positive?

Definition

An AI detector false positive is human-written text wrongly flagged as machine-generated. Documented rates are highest for non-native English speakers: a 2023 Stanford study (Liang et al.) found that detectors flagged, on average, 61% of TOEFL essays written by non-native speakers as AI-written.
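As a rough illustration of what a false positive rate measures, the sketch below divides the number of human-written texts that were flagged by the total number of human-written texts tested. The function name and the counts are hypothetical, chosen only to show the arithmetic; they are not taken from any study.

```python
def false_positive_rate(flagged_human_texts, cleared_human_texts):
    """Share of human-written texts that a detector wrongly flagged as AI."""
    total_human_texts = flagged_human_texts + cleared_human_texts
    if total_human_texts == 0:
        raise ValueError("need at least one human-written text")
    return flagged_human_texts / total_human_texts


# Hypothetical counts for illustration only: 50 human-written essays,
# 3 of which the detector flagged as machine-generated.
rate = false_positive_rate(flagged_human_texts=3, cleared_human_texts=47)
print(f"False positive rate: {rate:.0%}")
```

Only human-written texts enter the calculation, so the rate says nothing about how well the detector catches text that really was machine-generated.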

False positives are not occasional glitches; they are structural. Detectors flag text that looks statistically predictable, and predictability correlates with things that have nothing to do with AI: writing in a second language, following genre conventions, or simply writing cleanly.

The consequences land asymmetrically. A student flagged by a detector faces an integrity process with no way to disprove the score; a freelancer loses a client. The accused party bears the burden of proving a negative, and finished text alone carries no evidence of how it was written.

This asymmetry is the core argument for certification: evidence collected during writing means the writer never has to argue against a black box.

Related terms

Perplexity

Perplexity is a measure of how predictable a piece of text is to a language model. Low perplexity means the model finds each next word unsurprising; AI detectors treat low perplexity as a hint that text was machine-generated.
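A minimal sketch of the usual calculation, assuming you already have the probability a language model assigned to each token that actually appears in the text: perplexity is the exponential of the average negative log-probability per token. The function name and the sample probabilities are made up for illustration, and real detectors may compute or normalize the score differently.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probability a model gave each actual token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-probability per token (cross-entropy in nats).
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical per-token probabilities, invented for illustration.
predictable_text = [0.9, 0.8, 0.85, 0.9]   # model expected almost every word
surprising_text = [0.2, 0.05, 0.1, 0.3]    # model was often caught off guard

print(perplexity(predictable_text))
print(perplexity(surprising_text))
```

The first list yields the lower perplexity, which is the pattern a detector would read as a hint of machine generation, even though plain or formulaic human writing can produce it too.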

Burstiness

Burstiness is the variation in sentence length and structure across a text. Human writing tends to alternate short and long sentences unpredictably, while AI-generated text is often more uniform — so detectors use low burstiness as a machine signal.
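There is no single agreed formula for burstiness. One simple proxy, sketched here as an assumption rather than as any detector's actual method, is the spread of sentence lengths relative to their mean (the coefficient of variation). The function names and the naive sentence splitting are illustrative only, and this version ignores variation in sentence structure.

```python
import re
import statistics


def sentence_lengths(text):
    """Word count of each sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Standard deviation of sentence length divided by mean length."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The rain kept falling long after everyone had gone home for the night."

print(burstiness(uniform))
print(burstiness(varied))
```

The uniform sample scores zero because every sentence has the same length, while the varied sample scores much higher. A human writer who favors evenly sized sentences would also score low, which is one way this signal produces false positives.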
