Evaluating AI and human authorship quality in academic writing through physics essays
Yeadon, Will; Agra, Elise; Inyang, Oto-Obong; Mackay, Paul; Mizouri, Arin
Authors
Dr Will Yeadon will.yeadon@durham.ac.uk
Career Development Fellow
Dr Elise Agra elise.s.agra@durham.ac.uk
Career Development Fellow
Dr Oto Obong Inyang o.o.a.inyang@durham.ac.uk
Assistant Professor
Dr Paul Mackay paul.t.mackay@durham.ac.uk
Career Development Fellow
Dr Arin Mizouri arin.mizouri@durham.ac.uk
Assistant Professor
Abstract
This study compares the academic writing quality and detectability of authorship of human and AI-generated texts by evaluating n = 300 short-form physics essay submissions, equally divided between student work submitted before the introduction of ChatGPT and essays generated by OpenAI’s GPT-4. In blinded evaluations by five independent markers who were unaware of the origin of the essays, we observed no statistically significant difference in scores between essays authored by humans and those produced by AI (p-value = 0.107, α = 0.05). When the markers subsequently attempted to identify the authorship of the essays on a 4-point Likert scale (from ‘Definitely AI’ to ‘Definitely Human’), their performance was only marginally better than random chance. This outcome not only underscores the convergence of AI and human authorship quality but also highlights the difficulty of discerning AI-generated content through human judgment alone. Furthermore, the effectiveness of five commercially available software tools for identifying essay authorship was evaluated. Among these, ZeroGPT was the most accurate, achieving a 98% accuracy rate and a precision score of 1.0 when its classifications were reduced to binary outcomes. This result is a source of potential optimism for maintaining assessment integrity. Finally, we propose that 50% AI-generated content should be considered the upper limit for a text to be classified as human-authored, a boundary that accommodates a future of ubiquitous AI assistance whilst still respecting human authorship.
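As an illustration of the metrics named in the abstract, the following minimal sketch shows one way a 4-point Likert rating could be reduced to a binary outcome and then scored for accuracy and precision. It is not the authors' analysis code. The two middle labels (`Probably AI`, `Probably Human`), the choice of AI as the positive class, the function names, and the toy data are all assumptions made for illustration; only the two endpoint labels appear in the abstract.

```python
from typing import List

# Only the two endpoint labels appear in the abstract; the two middle
# labels are hypothetical placeholders for this sketch.
LIKERT_TO_BINARY = {
    "Definitely AI": "AI",
    "Probably AI": "AI",
    "Probably Human": "Human",
    "Definitely Human": "Human",
}


def to_binary(ratings: List[str]) -> List[str]:
    """Collapse 4-point Likert ratings into 'AI' / 'Human' labels."""
    return [LIKERT_TO_BINARY[r] for r in ratings]


def accuracy(truth: List[str], predicted: List[str]) -> float:
    """Fraction of essays whose predicted authorship matches the truth."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)


def precision(truth: List[str], predicted: List[str], positive: str = "AI") -> float:
    """Of the essays flagged as `positive`, the fraction that truly are.

    Treating 'AI' as the positive class is an assumption of this sketch.
    """
    flagged = [t for t, p in zip(truth, predicted) if p == positive]
    if not flagged:
        raise ValueError("no essays were classified as the positive class")
    return sum(t == positive for t in flagged) / len(flagged)


# Toy data, invented for illustration only (not results from the study).
truth = ["AI", "AI", "AI", "Human", "Human", "Human"]
ratings = [
    "Definitely AI",
    "Probably AI",
    "Probably Human",    # an AI essay mistaken for human work
    "Definitely Human",
    "Probably Human",
    "Definitely Human",
]

predicted = to_binary(ratings)
print("accuracy:", accuracy(truth, predicted))
print("precision:", precision(truth, predicted))
```

On this toy data the precision is 1.0 while the accuracy is 5/6: every essay flagged as AI really is AI-generated, yet one AI essay goes undetected. This is why a precision of 1.0 can coexist with an accuracy below 100%.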
Citation
Yeadon, W., Agra, E., Inyang, O.-O., Mackay, P., & Mizouri, A. (2024). Evaluating AI and human authorship quality in academic writing through physics essays. European Journal of Physics, 45(5), Article 055703. https://doi.org/10.1088/1361-6404/ad669d
Journal Article Type | Article |
---|---|
Acceptance Date | Jul 23, 2024 |
Online Publication Date | Sep 2, 2024 |
Publication Date | Sep 1, 2024 |
Deposit Date | Sep 13, 2024 |
Publicly Available Date | Sep 13, 2024 |
Journal | European Journal of Physics |
Print ISSN | 0143-0807 |
Electronic ISSN | 1361-6404 |
Publisher | IOP Publishing |
Peer Reviewed | Peer Reviewed |
Volume | 45 |
Issue | 5 |
Article Number | 055703 |
DOI | https://doi.org/10.1088/1361-6404/ad669d |
Keywords | benchmark, ChatGPT, AI, academic writing |
Public URL | https://durham-repository.worktribe.com/output/2800164 |
Files
Published Journal Article (PDF, 940 Kb)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/