Detection of GPT-4 generated text in higher education: Combining academic judgement and software to identify generative AI tool misuse

Perkins, Mike; Roe, Jasper; Postma, Darius; McGaughran, James; Hickerson, Don

doi:10.1007/s10805-023-09492-6

Authors

Mike Perkins

Dr Jasper Roe jasper.j.roe@durham.ac.uk
Assistant Professor

Darius Postma

James McGaughran

Don Hickerson

Abstract

This study explores the capability of academic staff, assisted by the Turnitin Artificial Intelligence (AI) detection tool, to identify the use of AI-generated content in university assessments. Twenty-two experimental submissions were produced using OpenAI’s ChatGPT tool, with prompting techniques used to reduce the likelihood of AI detectors identifying AI-generated content. These submissions were marked by 15 academic staff members alongside genuine student submissions. Although the AI detection tool identified 91% of the experimental submissions as containing AI-generated content, it flagged only 54.8% of the content as AI-generated, underscoring the challenges of detecting AI content when advanced prompting techniques are used. When academic staff members marked the experimental submissions, only 54.5% were reported to the academic misconduct process, emphasising the need for greater awareness of how the results of AI detectors may be interpreted. Student submissions and AI-generated content received similar grades (AI mean grade: 52.3; student mean grade: 54.4), showing the capabilities of AI tools in producing human-like responses in real-life assessment situations. Recommendations include adjusting the overall strategies for assessing university students in light of the availability of new Generative AI tools. This may include reducing the overall reliance on assessments where AI tools may be used to mimic human writing, or using AI-inclusive assessments. Comprehensive training must be provided for both academic staff and students so that academic integrity may be preserved.

Citation

Perkins, M., Roe, J., Postma, D., McGaughran, J., & Hickerson, D. (2024). Detection of GPT-4 generated text in higher education: Combining academic judgement and software to identify generative AI tool misuse. Journal of Academic Ethics, 22, 89-113. https://doi.org/10.1007/s10805-023-09492-6

Journal Article Type	Article
Acceptance Date	Oct 24, 2023
Online Publication Date	Oct 31, 2023
Publication Date	2024-03
Deposit Date	Jan 29, 2025
Journal	Journal of Academic Ethics
Print ISSN	1570-1727
Electronic ISSN	1572-8544
Publisher	Springer
Peer Reviewed	Peer Reviewed
Volume	22
Pages	89-113
DOI	https://doi.org/10.1007/s10805-023-09492-6
Public URL	https://durham-repository.worktribe.com/output/3355804

You might also like

Is GenAI the Future of Feedback? Understanding Student and Staff Perspectives on AI in Assessment (2024)
Journal Article

Perspectives on academic integrity in the ASEAN region (2024)
Book Chapter

Deepfakes and higher education: A research agenda and scoping review of synthetic media (2024)
Journal Article

Simple techniques to bypass GenAI text detectors: implications for inclusive education (2024)
Journal Article

Paraphrase or plagiarism? Exploring EAP students’ use of source material in a transnational university context (2024)
Journal Article
