A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant
Hannaford, N.E.; Heaps, S.E.; Nye, T.M.W.; Curtis, T.P.; Allen, B.; Golightly, A.; Wilkinson, D.J.
Dr Sarah Heaps
Professor Andrew Golightly
Professor Darren Wilkinson
Proper function of a wastewater treatment plant (WWTP) relies on maintaining a delicate balance between a multitude of competing microorganisms. A detailed understanding of the complex network of interactions among them is essential not only for maximising current operational efficiency but also for designing new treatment technologies effectively. Metagenomics offers an insight into these dynamic systems through the analysis of the microbial DNA sequences present. Unique taxa are deduced through sequence clustering to form operational taxonomic units (OTUs), with per-taxon abundance estimates obtained from the corresponding sequence counts. The data in this study comprise weekly OTU counts from an activated sludge (AS) tank of a WWTP, along with corresponding measurements of chemical and environmental (CE) covariates. Fitting a model directly to the OTU data is very challenging because of the high dimensionality and sparsity of the observations. The first step is therefore to aggregate the OTUs into twelve microbial communities, or “bins”, using a seasonal phase-based clustering approach. The mean abundances in the twelve bins are assumed to vary over time according to a multivariate linear regression on the CE covariates. Deviations from the mean are then modelled using a vector autoregressive (VAR) model of order one, which is a linear approximation to the commonly used generalised Lotka-Volterra (gLV) model. Sparsity is assumed in the interactions between microbial communities by carrying out inference in a hierarchical Bayesian framework that uses a shrinkage prior for the autoregressive coefficient matrix of the VAR model. Different shrinkage priors are explored by analysing simulated data sets before the regularised horseshoe prior is selected for the biological application. It is found that ammonia and chemical oxygen demand have a positive relationship with several bins and that pH has a positive relationship with one bin. These results are supported by findings in the biological literature. Several negative interactions are also identified. These novel biological findings suggest that OTUs in different bins may be competing for resources and that these relationships are complex. Although simpler than a gLV model, the VAR model is still able to offer valuable insight into the microbial dynamics of the WWTP.
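The model structure described above (a regression mean on the CE covariates, VAR(1) deviations, and a shrinkage prior on the autoregressive matrix) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `simulate_var1` and `draw_regularised_horseshoe`, the three-bin example, the single seasonal covariate, the independent Gaussian noise, and all numerical values are assumptions chosen for illustration. The prior draw uses the common regularised horseshoe form with half-Cauchy local scales applied to every coefficient, which may differ from the exact parameterisation used in the paper.

```python
import math
import random


def draw_regularised_horseshoe(k, tau=0.1, slab_scale=1.0, seed=1):
    """Draw a k x k coefficient matrix from a regularised horseshoe prior.

    Assumed form: a_ij ~ N(0, tau^2 * lt_ij^2), where
    lt_ij^2 = c^2 * lam_ij^2 / (c^2 + tau^2 * lam_ij^2) and lam_ij ~ half-Cauchy(0, 1).
    """
    rng = random.Random(seed)
    c2, tau2 = slab_scale ** 2, tau ** 2
    A = []
    for _ in range(k):
        row = []
        for _ in range(k):
            # Squared local scale: square of a standard Cauchy draw (inverse-CDF method).
            lam2 = math.tan(math.pi * (rng.random() - 0.5)) ** 2
            # Regularisation keeps the prior variance of each coefficient below slab_scale^2.
            shrunk2 = c2 * lam2 / (c2 + tau2 * lam2)
            row.append(rng.gauss(0.0, tau * math.sqrt(shrunk2)))
        A.append(row)
    return A


def simulate_var1(A, B, covariates, noise_sd=0.1, seed=0):
    """Simulate y_t = B x_t + eta_t with eta_t = A eta_{t-1} + eps_t."""
    rng = random.Random(seed)
    k = len(A)
    eta = [0.0] * k  # deviations from the mean start at zero (an assumption)
    series = []
    for x in covariates:
        # VAR(1) update of the deviations; the old eta is used for every row.
        eta = [
            sum(A[i][j] * eta[j] for j in range(k)) + rng.gauss(0.0, noise_sd)
            for i in range(k)
        ]
        # Time-varying mean: linear regression on the covariates.
        mean = [sum(B[i][p] * x[p] for p in range(len(x))) for i in range(k)]
        series.append([m + e for m, e in zip(mean, eta)])
    return series


# Hypothetical sparse interaction matrix: bin 0 has a negative effect on bin 1.
A = [[0.5, 0.0, 0.0],
     [-0.3, 0.4, 0.0],
     [0.0, 0.0, 0.6]]
# Hypothetical regression coefficients: an intercept and one seasonal covariate.
B = [[1.0, 0.2], [0.5, 0.0], [2.0, -0.1]]
# Two years of weekly covariate values (intercept, annual sine wave).
covariates = [[1.0, math.sin(2 * math.pi * t / 52)] for t in range(104)]

series = simulate_var1(A, B, covariates)
prior_draw = draw_regularised_horseshoe(3)
```

In this sketch a zero off-diagonal entry of `A` means one bin has no direct lagged effect on another, which is the kind of sparsity the shrinkage prior is meant to encourage. A matrix drawn from the prior is not guaranteed to give a stationary process, so the simulation uses a fixed, stable `A` instead.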
Hannaford, N., Heaps, S., Nye, T., Curtis, T., Allen, B., Golightly, A., & Wilkinson, D. (2023). A sparse Bayesian hierarchical vector autoregressive model for microbial dynamics in a wastewater treatment plant. Computational Statistics & Data Analysis, 179. https://doi.org/10.1016/j.csda.2022.107659
|Field|Value|
|---|---|
|Journal Article Type|Article|
|Acceptance Date|Oct 17, 2022|
|Online Publication Date|Nov 22, 2022|
|Deposit Date|Oct 27, 2022|
|Publicly Available Date|Jan 9, 2023|
|Journal|Computational Statistics and Data Analysis|
|Peer Reviewed|Peer Reviewed|
Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).