International Journal for Educational Integrity

Table 1 Performance metrics for various purported AI detectors compared with human evaluators. TP is true positive, FP is false positive, TN is true negative, and FN is false negative.

From: ChatGPT versus human essayists: an exploration of the impact of artificial intelligence for authorship and academic integrity in the humanities

Detector	TP	FP	TN	FN	Accuracy	Precision
GPTZero	33	2	26	7	86.76%	0.94
Quillbot	37	0	28	3	95.59%	1.0
ZeroGPT	16	0	28	24	64.71%	1.0
Human	30	4	24	10	79.41%	0.88
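
The Accuracy and Precision columns can be rechecked from the four counts in each row. The sketch below assumes the standard definitions, accuracy = (TP + TN) / (TP + FP + TN + FN) and precision = TP / (TP + FP), which the table does not state explicitly. The function and variable names are illustrative only and do not come from the article.

```python
# Confusion-matrix counts copied from Table 1, in the order (TP, FP, TN, FN).
counts = {
    "GPTZero": (33, 2, 26, 7),
    "Quillbot": (37, 0, 28, 3),
    "ZeroGPT": (16, 0, 28, 24),
    "Human": (30, 4, 24, 10),
}


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly (assumed standard definition)."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of positive calls that were correct (assumed standard definition)."""
    return tp / (tp + fp)


def summarize(rows):
    """Return {name: (accuracy in percent, precision)}, both rounded to 2 places."""
    result = {}
    for name, (tp, fp, tn, fn) in rows.items():
        acc_pct = round(100 * accuracy(tp, fp, tn, fn), 2)
        prec = round(precision(tp, fp), 2)
        result[name] = (acc_pct, prec)
    return result


metrics = summarize(counts)
for name, (acc_pct, prec) in metrics.items():
    print(f"{name}: accuracy {acc_pct:.2f}%, precision {prec:.2f}")
```

Under these assumed definitions, the recomputed values agree with the figures reported in the table for all four rows, each of which totals 68 cases.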


ISSN: 1833-2595
