A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant

Hannaford, N.E.; Heaps, S.E.; Nye, T.M.W.; Curtis, T.P.; Allen, B.; Golightly, A.; Wilkinson, D.J.

Authors

N.E. Hannaford

T.M.W. Nye

T.P. Curtis

B. Allen



Abstract

Proper function of a wastewater treatment plant (WWTP) relies on maintaining a delicate balance between a multitude of competing microorganisms. Gaining a detailed understanding of the complex network of interactions therein is essential not only for maximising current operational efficiencies but also for the effective design of new treatment technologies. Metagenomics offers an insight into these dynamic systems through the analysis of the microbial DNA sequences present. Unique taxa are deduced through sequence clustering to form operational taxonomic units (OTUs), with per-taxon abundance estimates obtained from the corresponding sequence counts. The data in this study comprise weekly OTU counts from an activated sludge (AS) tank of a WWTP along with corresponding measurements of chemical and environmental (CE) covariates. Directly fitting a model to the OTU data is challenging because of the high dimensionality and sparsity of the observations. The first step is therefore to aggregate the OTUs into twelve microbial communities or “bins” using a seasonal phase-based clustering approach. The mean abundances in the twelve bins are assumed to vary over time according to a multivariate linear regression on the CE covariates. Deviations from the mean are then modelled using a vector autoregressive (VAR) model of order one, which is a linear approximation to the commonly used generalised Lotka-Volterra (gLV) model. Interactions between microbial communities are assumed to be sparse; inference is carried out in a hierarchical Bayesian framework that places a shrinkage prior on the autoregressive coefficient matrix of the VAR model. Different shrinkage priors are explored by analysing simulated data sets before the regularised horseshoe prior is selected for the biological application. It is found that ammonia and chemical oxygen demand have a positive relationship with several bins and that pH has a positive relationship with one bin. These results are supported by findings in the biological literature. Several negative interactions are also identified. These novel biological findings suggest that OTUs in different bins may be competing for resources and that these relationships are complex. Although simpler than a gLV model, the VAR model is still able to offer valuable insight into the microbial dynamics of the WWTP.
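
The model structure described in the abstract (a regression mean on the CE covariates plus first-order VAR deviations with a sparse coefficient matrix) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' code: the function name `simulate_sparse_var1`, the symbols `X`, `B`, `A` and `eta`, the number of covariates, the noise scales and the way sparsity is imposed (randomly zeroing entries) are all hypothetical choices made for illustration. It does not perform the Bayesian inference or use the regularised horseshoe prior discussed in the paper.

```python
import numpy as np


def simulate_sparse_var1(n_steps=200, n_bins=12, n_covariates=3,
                         sparsity=0.8, seed=0):
    """Simulate Y[t] = B @ X[t] + eta[t], with eta[t] = A @ eta[t-1] + noise.

    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical stand-ins for the chemical and environmental covariates.
    X = rng.normal(size=(n_steps, n_covariates))
    # Regression coefficients linking covariates to the mean of each bin.
    B = rng.normal(scale=0.5, size=(n_bins, n_covariates))

    # Autoregressive coefficient matrix; most entries are set to zero so
    # that only a few bin-to-bin interactions remain.
    A = rng.normal(scale=0.3, size=(n_bins, n_bins))
    A[rng.random((n_bins, n_bins)) < sparsity] = 0.0

    # Rescale A if needed so its spectral radius is below one, which keeps
    # the VAR(1) process stationary.
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    if radius >= 0.9:
        A *= 0.9 / radius

    # Deviations from the mean follow a VAR(1) process.
    eta = np.zeros((n_steps, n_bins))
    for t in range(1, n_steps):
        eta[t] = A @ eta[t - 1] + rng.normal(scale=0.1, size=n_bins)

    # Observed series: regression mean plus autoregressive deviations.
    Y = X @ B.T + eta
    return Y, X, A, B
```

Calling `Y, X, A, B = simulate_sparse_var1()` returns a series with one row per time step and one column per bin. A negative off-diagonal entry `A[i, j]` in this sketch plays the role of a negative interaction: a high deviation in bin `j` at one time step pushes bin `i` down at the next.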

Citation

Hannaford, N., Heaps, S., Nye, T., Curtis, T., Allen, B., Golightly, A., & Wilkinson, D. (2023). A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant. Computational Statistics & Data Analysis, 179, https://doi.org/10.1016/j.csda.2022.107659

Journal Article Type: Article
Acceptance Date: Oct 17, 2022
Online Publication Date: Nov 22, 2022
Publication Date: 2023-03
Deposit Date: Oct 27, 2022
Publicly Available Date: Jan 9, 2023
Journal: Computational Statistics and Data Analysis
Print ISSN: 0167-9473
Electronic ISSN: 1872-7352
Publisher: Elsevier
Peer Reviewed: Peer Reviewed
Volume: 179
DOI: https://doi.org/10.1016/j.csda.2022.107659
Public URL: https://durham-repository.worktribe.com/output/1190380
