A Deep Learning Approach for Paragraph-Level Paraphrase Generation for Plagiarism Detection

Saqaabi, Arwa Al; Stewart, Craig; Akrida, Eleni; Cristea, Alexandra I.

Abstract

Expressing information in different forms is an important skill that students should develop in school. This skill positively impacts academic reading and writing. However, it can also lead to negative consequences, such as plagiarism: students may paraphrase original texts and present them as their own work. The need for effective approaches to detect plagiarism and identify paraphrasing has therefore become increasingly important in academia, journalism, publishing, and other fields where innovation, novelty, and originality are highly valued. The incidence of plagiarism in these areas is rising because of easy access to information on the internet and the capabilities of large language models. Most published detection methods analyse plagiarism at the sentence level. We have developed approaches for generating and detecting paraphrased paragraphs by considering inter-sentence and intra-sentence relations, which enables the identification of paraphrased text at the paragraph level. This includes joining, splitting, and/or shifting sentences within a paragraph, as students often plagiarise paragraphs. In the generation stage, we create the ALECS dataset by developing three algorithms and applying a masking approach to tackle the paragraph’s syntactic and lexical layers while maintaining the paragraph’s semantics. ALECS can contribute to developing students’ abilities in paraphrasing, as there are more than six different forms for each source paragraph. In addition, as in this study, ALECS can be employed to train deep learning models for the purpose of generating or detecting plagiarised paragraphs. In the detection stage, our method shows robust results and outperforms existing work in detecting paragraph-level paraphrases, achieving a 90.1 F1 score with Longformer and reaching 96 when using a fine-tuned GPT-3.5.

Citation

Saqaabi, A. A., Stewart, C., Akrida, E., & Cristea, A. I. (2025). A Deep Learning Approach for Paragraph-Level Paraphrase Generation for Plagiarism Detection. Neural Processing Letters, 57, 59. https://doi.org/10.1007/s11063-025-11771-9

Journal Article Type: Article
Acceptance Date: May 10, 2025
Online Publication Date: Jun 12, 2025
Publication Date: Jun 12, 2025
Deposit Date: Jun 19, 2025
Publicly Available Date: Jun 23, 2025
Journal: Neural Processing Letters
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
Publisher: Springer
Peer Reviewed: Peer Reviewed
Volume: 57
Pages: 59
DOI: https://doi.org/10.1007/s11063-025-11771-9
Keywords: Plagiarism detection, Paraphrase identification, Artificial intelligence, Paragraph-level, Academic writing, Natural language processing, Large language models
Public URL: https://durham-repository.worktribe.com/output/4104818
