Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation

Yu, Jialin; Cristea, Alexandra I.; Harit, Anoushka; Sun, Zhongtian; Aduragba, Olanrewaju Tahir; Shi, Lei; Al Moubayed, Noura

doi:10.1016/j.aiopen.2023.05.001

Authors

Jialin Yu jialin.yu@durham.ac.uk
Academic Visitor

Professor Alexandra Cristea alexandra.i.cristea@durham.ac.uk
Professor

Anoushka Harit anoushka.harit@durham.ac.uk
PGR Student Master of Science

Zhongtian Sun zhongtian.sun@durham.ac.uk
PGR Student Doctor of Philosophy

Tahir Olanrewaju Aduragba olanrewaju.m.aduragba@durham.ac.uk
PGR Student Doctor of Philosophy

Lei Shi

Dr Noura Al Moubayed noura.al-moubayed@durham.ac.uk
Associate Professor

Abstract

This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p < .05; Wilcoxon test).
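
The semi-supervised setup described in the abstract can be read, as a rough sketch and not as the paper's exact formulation, in terms of a standard evidence lower bound for unlabelled text plus a two-direction likelihood for labelled pairs. All notation below is an assumption introduced for illustration: x is an observed text, y its (latent or observed) paraphrase, q_phi and p_theta are hypothetical names for the inference and generation networks, p(y) is an assumed prior over paraphrase sequences, and D_L and D_U denote the labelled and unlabelled sets.

```latex
% Sketch only: notation is assumed here, not taken from the paper.

% Unlabelled text x with latent paraphrase sequence y (standard ELBO):
\mathcal{L}_{U}(x) = \mathbb{E}_{q_\phi(y \mid x)}\big[\log p_\theta(x \mid y)\big]
                     - \mathrm{KL}\big(q_\phi(y \mid x) \,\|\, p(y)\big)
                     \;\le\; \log p_\theta(x)

% Labelled pair (x, y), trained in both directions
% (an assumed reading of "dual directional learning"):
\mathcal{L}_{L}(x, y) = \log q_\phi(y \mid x) + \log p_\theta(x \mid y)

% Combined semi-supervised objective (assumed form), to be maximised:
\mathcal{J}(\theta, \phi) = \sum_{(x, y) \in \mathcal{D}_L} \mathcal{L}_{L}(x, y)
                          + \sum_{x \in \mathcal{D}_U} \mathcal{L}_{U}(x)
```

Under this reading, sharing the same two networks between the labelled and unlabelled terms is what would let text pairs inform latent sequence inference; the published article should be consulted for the actual objective and for how the two-stage KRL scheme initialises the weights.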

Citation

Yu, J., Cristea, A. I., Harit, A., Sun, Z., Aduragba, O. T., Shi, L., & Al Moubayed, N. (2023). Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation. AI Open, 4, 19-32. https://doi.org/10.1016/j.aiopen.2023.05.001

Journal Article Type	Article
Acceptance Date	May 18, 2023
Online Publication Date	May 26, 2023
Publication Date	2023
Deposit Date	May 30, 2023
Publicly Available Date	May 30, 2023
Journal	AI Open
Electronic ISSN	2666-6510
Peer Reviewed	Peer Reviewed
Volume	4
Pages	19-32
DOI	https://doi.org/10.1016/j.aiopen.2023.05.001
Public URL	https://durham-repository.worktribe.com/output/1172628

Files

Published Journal Article (997 Kb)
PDF

Publisher Licence URL
http://creativecommons.org/licenses/by-nc-nd/4.0/

Copyright Statement
© 2023 The Authors. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
