
ChatGPT versus human essayists: an exploration of the impact of artificial intelligence for authorship and academic integrity in the humanities

Revell, T.; Yeadon, W.; Cahilly-Bretzin, G.; Clarke, I.; Manning, G.; Jones, J.; Mulley, C.; Pascual, R. J.; Bradley, N.; Thomas, D.; Leneghan, F.


Authors

T. Revell

W. Yeadon

G. Cahilly-Bretzin

I. Clarke

G. Manning

J. Jones

C. Mulley

R. J. Pascual

N. Bradley

D. Thomas

F. Leneghan



Abstract

Generative AI has prompted educators to reevaluate traditional teaching and assessment methods. This study examines AI’s ability to write essays analysing Old English poetry; human markers assessed and attempted to distinguish them from authentic analyses of poetry by first-year undergraduate students in English at the University of Oxford. Using the standard UK University grading system, AI-written essays averaged a score of 60.46, whilst human essays achieved 63.57, a margin of difference not statistically significant (p = 0.10). Notably, student submissions applied a nuanced understanding of cultural context and secondary criticism to their close reading, while AI essays often described rather than analysed, lacking depth in the evaluation of poetic features, and sometimes failing to properly recognise key aspects of passages. Distinguishing features of human essays included detailed and sustained analysis of poetic style, as well as spelling errors and lack of structural cohesion. AI essays, on the other hand, exhibited a more formal structure and tone but sometimes fell short in incisive critique of poetic form and effect. Human markers correctly identified the origin of essays 79.41% of the time. Additionally, we compare three purported AI detectors, finding that the best, ‘Quillbot’, correctly identified the origin of essays 95.59% of the time. However, given the high threshold for academic misconduct, conclusively determining origin remains challenging. The research also highlights the potential benefits of generative AI’s ability to advise on structuring essays and suggesting avenues for research. We advocate for transparency regarding AI’s capabilities and limitations, and this study underscores the importance of human critical engagement in teaching and learning in Higher Education. As AI’s proficiency grows, educators must reevaluate what authentic assessment is, and consider implementing dynamic, holistic methods to ensure academic integrity.

Citation

Revell, T., Yeadon, W., Cahilly-Bretzin, G., Clarke, I., Manning, G., Jones, J., Mulley, C., Pascual, R. J., Bradley, N., Thomas, D., & Leneghan, F. (2024). ChatGPT versus human essayists: an exploration of the impact of artificial intelligence for authorship and academic integrity in the humanities. International Journal for Educational Integrity, 20(1), Article 18. https://doi.org/10.1007/s40979-024-00161-8

Journal Article Type Article
Acceptance Date Aug 12, 2024
Online Publication Date Oct 21, 2024
Publication Date Oct 21, 2024
Deposit Date Oct 30, 2024
Publicly Available Date Oct 30, 2024
Journal International Journal for Educational Integrity
Electronic ISSN 1833-2595
Publisher BioMed Central
Peer Reviewed Yes
Volume 20
Issue 1
Article Number 18
DOI https://doi.org/10.1007/s40979-024-00161-8
Keywords AI text detection, Artificial intelligence, ChatGPT, Assessment, Higher education
Public URL https://durham-repository.worktribe.com/output/2988762
