Research Repository

Outputs (80)

Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models (2025)
Presentation / Conference Contribution
Leask, P., & Al Moubayed, N. (2025, July). Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models. Presented at International Conference on Machine Learning (ICML 2025), Vancouver, Canada

Sparse Autoencoders (SAEs) are a popular method for decomposing Large Language Model (LLM) activations into interpretable latents; however, they have a substantial training cost, and SAEs learned on different models are not directly comparable. Motivat...

Sparse Autoencoders Do Not Find Canonical Units of Analysis (2025)
Presentation / Conference Contribution
Leask, P., Bussmann, B., Pearce, M., Bloom, J., Tigges, C., Al Moubayed, N., Sharkey, L., & Nanda, N. (2025, April). Sparse Autoencoders Do Not Find Canonical Units of Analysis. Presented at ICLR2025: The Thirteenth International Conference on Learning Representations, Singapore

A common goal of mechanistic interpretability is to decompose the activations of neural networks into features: interpretable properties of the input computed by the model. Sparse autoencoders (SAEs) are a popular method for finding these features in...

Sparse Autoencoders Do Not Find Canonical Units of Analysis (2025)
Presentation / Conference Contribution
Leask, P., Bussmann, B., Pearce, M. T., Isaac Bloom, J., Tigges, C., Al Moubayed, N., Sharkey, L., & Nanda, N. (2025, April). Sparse Autoencoders Do Not Find Canonical Units of Analysis. Presented at The Thirteenth International Conference on Learning Representations, Singapore

A common goal of mechanistic interpretability is to decompose the activations of neural networks into features: interpretable properties of the input computed by the model. Sparse autoencoders (SAEs) are a popular method for finding these features in...

The variable relationship between the National Early Warning Score on admission to hospital, the primary discharge diagnosis and in-hospital mortality (2025)
Journal Article
Holland, M., Kellett, J., Boulitsakis-Logothetis, S., Watson, M., Al Moubayed, N., & Green, D. (online). The variable relationship between the National Early Warning Score on admission to hospital, the primary discharge diagnosis and in-hospital mortality. Internal and Emergency Medicine, https://doi.org/10.1007/s11739-024-03828-9

Background: Patients with an elevated admission National Early Warning Score (NEWS) are more likely to die while in hospital. However, it is not known if this increased mortality risk is the same for all diagnoses. The aim of this study was to determ...

Racial Bias within Face Recognition: A Survey (2024)
Journal Article
Yucer, S., Tektas, F., Al Moubayed, N., & Breckon, T. P. (2025). Racial Bias within Face Recognition: A Survey. ACM Computing Surveys, 57(4), 1-39. https://doi.org/10.1145/3705295

Facial recognition is one of the most academically studied and industrially developed areas within computer vision where we readily find associated applications deployed globally. This widespread adoption has uncovered significant performance variati...

Performance of machine learning versus the national early warning score for predicting patient deterioration risk: a single-site study of emergency admissions (2024)
Journal Article
Watson, M., Boulitsakis Logothetis, S., Green, D., Holland, M., Chambers, P., & Al Moubayed, N. (2024). Performance of machine learning versus the national early warning score for predicting patient deterioration risk: a single-site study of emergency admissions. BMJ Health & Care Informatics, 31(1), Article e101088. https://doi.org/10.1136/bmjhci-2024-101088

Objectives: Increasing operational pressures on emergency departments (ED) make it imperative to quickly and accurately identify patients requiring urgent clinical intervention. The widespread adoption of electronic health records (EHR) makes rich fe...

Premature mortality analysis of 52,000 deceased cats and dogs exposes socioeconomic disparities (2024)
Journal Article
Farrell, S., Anderson, K., Noble, P.-J. M., & Al Moubayed, N. (2024). Premature mortality analysis of 52,000 deceased cats and dogs exposes socioeconomic disparities. Scientific Reports, 14(1), Article 28763. https://doi.org/10.1038/s41598-024-77385-8

Monitoring mortality rates offers crucial insights into public health by uncovering the hidden impacts of diseases, identifying emerging trends, optimising resource allocation, and informing effective policy decisions. Here, we present a novel approa...

From prediction to practice: mitigating bias and data shift in machine-learning models for chemotherapy-induced organ dysfunction across unseen cancers (2024)
Journal Article
Watson, M., Chambers, P., Steventon, L., Harmsworth King, J., Ercia, A., Shaw, H., & Al Moubayed, N. (2024). From prediction to practice: mitigating bias and data shift in machine-learning models for chemotherapy-induced organ dysfunction across unseen cancers. BMJ Oncology, 3(1), Article e000430. https://doi.org/10.1136/bmjonc-2024-000430

Objectives: Routine monitoring of renal and hepatic function during chemotherapy ensures that treatment-related organ damage has not occurred and clearance of subsequent treatment is not hindered; however, frequency and timing are not optimal. Model...

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval (2024)
Presentation / Conference Contribution
Wu, S., Li, Y., Zhu, K., Zhang, G., Liang, Y., Ma, K., Xiao, C., Zhang, H., Yang, B., Chen, W., Huang, W., Al Moubayed, N., Fu, J., & Lin, C. (2024, August). SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval. Presented at ACL 2024: Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand

Multi-modal information retrieval (MMIR) is a rapidly evolving field where significant progress has been made through advanced representation learning and cross-modality alignment research, particularly in image-text pairs. However, current benchmark...

Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text (2024)
Journal Article
Davies, H., Nenadic, G., Alfattni, G., Arguello Casteleiro, M., Al Moubayed, N., Farrell, S., Radford, A. D., & Noble, P.-J. M. (2024). Text mining for disease surveillance in veterinary clinical data: part two, training computers to identify features in clinical text. Frontiers in Veterinary Science, 11, Article 1352726. https://doi.org/10.3389/fvets.2024.1352726

In part two of this mini-series, we evaluate the range of machine-learning tools now available for application to veterinary clinical text-mining. These tools will be vital to automate extraction of information from large datasets of veterinary clini...