Heather Davies
Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text
Davies, Heather; Nenadic, Goran; Alfattni, Ghada; Arguello Casteleiro, Mercedes; Al Moubayed, Noura; Farrell, Sean; Radford, Alan D.; Noble, P.-J. M.
Authors
Goran Nenadic
Ghada Alfattni
Mercedes Arguello Casteleiro
Dr Noura Al Moubayed noura.al-moubayed@durham.ac.uk
Associate Professor
Sean Farrell sean.farrell2@durham.ac.uk
PGR Student Doctor of Philosophy
Alan D. Radford
P.-J. M. Noble
Abstract
In part two of this mini-series, we evaluate the range of machine-learning tools now available for application to veterinary clinical text-mining. These tools will be vital to automate extraction of information from large datasets of veterinary clinical narratives curated by projects such as the Small Animal Veterinary Surveillance Network (SAVSNET) and VetCompass, where volumes of millions of records preclude reading records manually and the complexities of clinical notes limit the usefulness of more “traditional” text-mining approaches. We discuss the application of various machine-learning techniques, ranging from simple models that identify words and phrases with similar meanings in order to expand lexicons for keyword searching, to more complex language models. Specifically, we describe the use of language models for record annotation and unsupervised approaches for identifying topics within large datasets, and we discuss more recent developments in the area of generative models (such as ChatGPT). As these models become increasingly complex, it is important that researchers and clinicians work together to ensure that the outputs of these models are explainable, in order to instill confidence in any conclusions drawn from them.
Citation
Davies, H., Nenadic, G., Alfattni, G., Arguello Casteleiro, M., Al Moubayed, N., Farrell, S., Radford, A. D., & Noble, P.-J. M. (2024). Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text. Frontiers in Veterinary Science, 11, Article 1352726. https://doi.org/10.3389/fvets.2024.1352726
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jul 17, 2024 |
Online Publication Date | Aug 22, 2024 |
Publication Date | Aug 22, 2024 |
Deposit Date | Sep 13, 2024 |
Publicly Available Date | Sep 13, 2024 |
Journal | Frontiers in Veterinary Science |
Electronic ISSN | 2297-1769 |
Publisher | Frontiers Media |
Peer Reviewed | Peer Reviewed |
Volume | 11 |
Article Number | 1352726 |
DOI | https://doi.org/10.3389/fvets.2024.1352726 |
Keywords | clinical records, neural language modeling, machine learning, companion animals, big data |
Public URL | https://durham-repository.worktribe.com/output/2820409 |
Files
Published Journal Article (PDF, 565 KB)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/