Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text

Davies, Heather; Nenadic, Goran; Alfattni, Ghada; Arguello Casteleiro, Mercedes; Al Moubayed, Noura; Farrell, Sean; Radford, Alan D.; Noble, P.-J. M.

Authors

Heather Davies

Goran Nenadic

Ghada Alfattni

Mercedes Arguello Casteleiro

Sean Farrell sean.farrell2@durham.ac.uk
PGR Student Doctor of Philosophy

Alan D. Radford

P.-J. M. Noble



Abstract

In part two of this mini-series, we evaluate the range of machine-learning tools now available for veterinary clinical text mining. These tools will be vital for automating the extraction of information from large datasets of veterinary clinical narratives curated by projects such as the Small Animal Veterinary Surveillance Network (SAVSNET) and VetCompass, where the volume of records, numbering in the millions, precludes manual reading and the complexity of clinical notes limits the usefulness of more “traditional” text-mining approaches. We discuss the application of various machine-learning techniques, ranging from simple models that identify words and phrases with similar meanings, in order to expand lexicons for keyword searching, to more complex language models. Specifically, we describe the use of language models for record annotation and unsupervised approaches for identifying topics within large datasets, and we discuss more recent developments in generative models (such as ChatGPT). As these models become increasingly complex, it is important that researchers and clinicians work together to ensure that model outputs are explainable, in order to instill confidence in any conclusions drawn from them.

Citation

Davies, H., Nenadic, G., Alfattni, G., Arguello Casteleiro, M., Al Moubayed, N., Farrell, S., Radford, A. D., & Noble, P.-J. M. (2024). Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text. Frontiers in Veterinary Science, 11, Article 1352726. https://doi.org/10.3389/fvets.2024.1352726

Journal Article Type: Article
Acceptance Date: Jul 17, 2024
Online Publication Date: Aug 22, 2024
Publication Date: Aug 22, 2024
Deposit Date: Sep 13, 2024
Publicly Available Date: Sep 13, 2024
Journal: Frontiers in Veterinary Science
Electronic ISSN: 2297-1769
Publisher: Frontiers Media
Peer Reviewed: Yes
Volume: 11
Article Number: 1352726
DOI: https://doi.org/10.3389/fvets.2024.1352726
Keywords: clinical records, neural language modeling, machine learning, companion animals, big data
Public URL: https://durham-repository.worktribe.com/output/2820409
