Table 1. Performance metrics for various purported AI detectors compared with human evaluators. TP is true positive, FP is false positive, TN is true negative, and FN is false negative.

From: ChatGPT versus human essayists: an exploration of the impact of artificial intelligence for authorship and academic integrity in the humanities

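| Detector | TP | FP | TN | FN | Accuracy | Precision |
|----------|----|----|----|----|----------|-----------|
| GPTZero  | 33 | 2  | 26 | 7  | 86.76%   | 0.94      |
| Quillbot | 37 | 0  | 28 | 3  | 95.59%   | 1.0       |
| ZeroGPT  | 16 | 0  | 28 | 24 | 64.71%   | 1.0       |
| Human    | 30 | 4  | 24 | 10 | 79.41%   | 0.88      |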
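
The Accuracy and Precision columns in Table 1 can be recomputed from the four counts. The sketch below is illustrative only. It assumes the usual definitions, accuracy = (TP + TN) / (TP + FP + TN + FN) and precision = TP / (TP + FP), which the table itself does not spell out, and it assumes that "positive" means an essay judged to be AI-generated. The names `counts`, `accuracy`, and `precision` are made up for this example.

```python
# Confusion-matrix counts copied from Table 1, in the order (TP, FP, TN, FN).
counts = {
    "GPTZero": (33, 2, 26, 7),
    "Quillbot": (37, 0, 28, 3),
    "ZeroGPT": (16, 0, 28, 24),
    "Human": (30, 4, 24, 10),
}


def accuracy(tp, fp, tn, fn):
    """Share of all essays classified correctly (assumed standard definition)."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp, tn, fn):
    """Share of positive calls that were correct (assumed standard definition).

    The tn and fn arguments are unused; they are accepted so that both
    functions take the same four counts.
    """
    return tp / (tp + fp)


for name, c in counts.items():
    print(f"{name:<9} accuracy={accuracy(*c):.2%} precision={precision(*c):.2f}")
```

Under these assumptions the script reproduces the published figures for all four rows, for example 59/68 = 86.76% accuracy and 33/35 ≈ 0.94 precision for GPTZero. Every row sums to 68 essays.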