Explainable text-tabular models for predicting mortality risk in companion animals

Burton, James; Farrell, Sean; Mäntylä Noble, Peter-John; Al Moubayed, Noura

doi:10.1038/s41598-024-64551-1

Authors

James Burton james.burton@durham.ac.uk
Demonstrator (Ptt)

Sean Farrell sean.farrell2@durham.ac.uk
PGR Student Doctor of Philosophy

Peter-John Mäntylä Noble

Dr Noura Al Moubayed noura.al-moubayed@durham.ac.uk
Associate Professor

Abstract

As interest in using machine learning models to support clinical decision-making increases, explainability is an unequivocal priority for clinicians, researchers and regulators, who need to comprehend and trust model outputs. With many clinical datasets containing a range of modalities, from the free text of clinician notes to structured tabular data entries, there is a need for frameworks capable of providing comprehensive explanation values across diverse modalities. Here, we present a multimodal masking framework that extends SHapley Additive exPlanations (SHAP) to combined text and tabular datasets, and we use it to identify risk factors for companion animal mortality in first-opinion veterinary electronic health records (EHRs) from across the United Kingdom. The framework is designed to treat each modality consistently, so that features are handled uniformly and explanations remain predictable in both unimodal and multimodal contexts. We present five multimodality approaches, with the best-performing method utilising PetBERT, a language model pre-trained on a veterinary dataset. Utilising our framework, we shed light for the first time on the reasons behind each model's decisions and find that PetBERT engages more strongly with free-text narratives, whereas BERT-base places its predominant emphasis on tabular data. The investigation also explores the important features at a more granular level, identifying distinct words and phrases that substantially influenced an animal's life status prediction. PetBERT showed a heightened ability to grasp phrases associated with veterinary clinical nomenclature, signalling the value of additional pre-training of language models.

Citation

Burton, J., Farrell, S., Mäntylä Noble, P.-J., & Al Moubayed, N. (2024). Explainable text-tabular models for predicting mortality risk in companion animals. Scientific Reports, 14(1), Article 14217. https://doi.org/10.1038/s41598-024-64551-1

Journal Article Type	Article
Acceptance Date	Jun 10, 2024
Online Publication Date	Jun 20, 2024
Publication Date	Jun 20, 2024
Deposit Date	Jun 25, 2024
Publicly Available Date	Jun 25, 2024
Journal	Scientific Reports
Electronic ISSN	2045-2322
Publisher	Nature Research
Peer Reviewed	Peer Reviewed
Volume	14
Issue	1
Article Number	14217
DOI	https://doi.org/10.1038/s41598-024-64551-1
Public URL	https://durham-repository.worktribe.com/output/2493209