Improving Health Mention Classification Through Emphasising Literal Meanings: A Study Towards Diversity and Generalisation for Public Health Surveillance
Aduragba, Tahir Olanrewaju; Yu, Jialin; Cristea, Alexandra I.; Long, Yang
Authors
Tahir Olanrewaju Aduragba olanrewaju.m.aduragba@durham.ac.uk
PGR Student Doctor of Philosophy
Jialin Yu jialin.yu@durham.ac.uk
Academic Visitor
Professor Alexandra Cristea alexandra.i.cristea@durham.ac.uk
Professor
Dr Yang Long yang.long@durham.ac.uk
Associate Professor
Abstract
People often use disease or symptom terms on social media and online forums in ways other than to describe their health. Thus, the NLP health mention classification (HMC) task aims to identify posts where users are discussing health conditions literally, not figuratively. Existing computational research typically only studies health mentions within well-represented groups in developed nations. Developing countries with limited health surveillance abilities fail to benefit from such data to manage public health crises. To advance HMC research and benefit more diverse populations, we present the Nairaland health mention dataset (NHMD), a new dataset collected from a dedicated web forum for Nigerians. NHMD consists of 7,763 manually labelled posts extracted based on four diseases prevalent in Nigeria (HIV/AIDS, Malaria, Stroke and Tuberculosis). With NHMD, we conduct extensive experiments using current state-of-the-art models for HMC and find that, compared to existing public datasets, NHMD contains out-of-distribution examples. Hence, it is well suited for domain adaptation studies. The introduction of NHMD provides better diversity coverage of vulnerable populations and better generalisation for HMC tasks in a global public health surveillance setting. Additionally, we present a novel multi-task learning approach for HMC tasks that adds literal word meaning prediction as an auxiliary task. Experimental results demonstrate that the proposed approach achieves a statistically significantly higher F1 score than state-of-the-art methods (p < 0.01, Wilcoxon test) and show that our new dataset poses a strong challenge to existing HMC methods.
Citation
Aduragba, T. O., Yu, J., Cristea, A. I., & Long, Y. (2023, April). Improving Health Mention Classification Through Emphasising Literal Meanings: A Study Towards Diversity and Generalisation for Public Health Surveillance. Presented at WWW '23: The ACM Web Conference 2023, Austin, Texas
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | WWW '23: The ACM Web Conference 2023 |
Start Date | Apr 30, 2023 |
End Date | May 4, 2023 |
Acceptance Date | Feb 7, 2023 |
Online Publication Date | Apr 30, 2023 |
Publication Date | Apr 30, 2023 |
Deposit Date | Feb 8, 2023 |
Publisher | Association for Computing Machinery (ACM) |
Pages | 3928-3936 |
Book Title | WWW '23: Proceedings of the ACM Web Conference 2023 |
DOI | https://doi.org/10.1145/3543507.3583877 |
Public URL | https://durham-repository.worktribe.com/output/1134115 |