Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning

Li, Xuan; Liang, Guohua; Wang, Lei; Yang, Yuesuo; Li, Yuanyin; Li, Zhongguo; He, Bin; Wang, Guoli

Authors

Xuan Li

Guohua Liang

Lei Wang

Yuesuo Yang

Yuanyin Li yuanyin.li@durham.ac.uk
PGR Student Doctor of Philosophy

Zhongguo Li

Bin He

Guoli Wang



Abstract

Groundwater nitrate contamination poses a potential threat to human health and environmental safety globally. This study proposes an interpretable stacking ensemble learning (SEL) framework for enhancing and interpreting spatial predictions of groundwater nitrate by integrating a two-level heterogeneous SEL model with SHapley Additive exPlanations (SHAP). In the SEL model, five commonly used machine learning models served as base models (gradient boosting decision tree, extreme gradient boosting, random forest, extremely randomized trees, and k-nearest neighbor), and their outputs were used as input data for the meta-model. When applied to an agriculturally intensive area, the Eden Valley in the UK, the SEL model outperformed the individual models in predictive performance and generalization ability. It reveals a mean groundwater nitrate level of 2.22 mg/L-N, with 2.46% of sandstone aquifers exceeding the drinking standard of 11.3 mg/L-N. Alarmingly, 8.74% of areas with high groundwater nitrate remain outside the designated nitrate vulnerable zones. Moreover, SHAP identified transmissivity, baseflow index, hydraulic conductivity, the percentage of arable land, and the soil C:N ratio as the top five driving factors of groundwater nitrate. With nitrate threatening groundwater globally, this study presents a high-accuracy, interpretable, and flexible modeling framework that enhances our understanding of the mechanisms behind groundwater nitrate contamination. The results suggest that the interpretable SEL framework has great promise for providing valuable evidence for environmental management, water resource protection, and sustainable development, particularly in data-scarce areas.
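The two-level stacking structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `StackingRegressor`, uses synthetic data in place of the Eden Valley dataset, picks `Ridge` as a hypothetical meta-model (the abstract does not name one), and omits extreme gradient boosting, which lives in the separate `xgboost` package. All hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption): 8 generic predictors, one target.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Level 1: heterogeneous base models. Extreme gradient boosting is left out
# here to keep the sketch dependent on scikit-learn only.
base_models = [
    ("gbdt", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=50, random_state=0)),
    ("knn", KNeighborsRegressor(n_neighbors=5)),
]

# Level 2: the meta-model is trained on cross-validated base-model
# predictions. Ridge is an assumed choice, not taken from the paper.
stack = StackingRegressor(estimators=base_models, final_estimator=Ridge(), cv=5)
stack.fit(X_train, y_train)

# Base-model outputs that feed the meta-model: one column per base model.
meta_features = stack.transform(X_test)
pred = stack.predict(X_test)

print("meta-feature matrix shape:", meta_features.shape)
print("held-out R^2:", round(stack.score(X_test, y_test), 3))
```

Training the meta-model on out-of-fold predictions (`cv=5`) keeps it from simply rewarding base models that overfit the training data. The fitted `stack` could then be passed to a model-agnostic explainer from the `shap` package to rank driving factors, which is not shown here.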

Citation

Li, X., Liang, G., Wang, L., Yang, Y., Li, Y., Li, Z., He, B., & Wang, G. (2024). Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning. Environmental Geochemistry and Health, 46(11), Article 482. https://doi.org/10.1007/s10653-024-02201-1

Journal Article Type Article
Acceptance Date Aug 27, 2024
Online Publication Date Oct 29, 2024
Publication Date Nov 1, 2024
Deposit Date Nov 7, 2024
Publicly Available Date Nov 7, 2024
Journal Environmental Geochemistry and Health
Print ISSN 0269-4042
Electronic ISSN 1573-2983
Publisher Springer
Peer Reviewed Peer Reviewed
Volume 46
Issue 11
Article Number 482
DOI https://doi.org/10.1007/s10653-024-02201-1
Keywords Ensemble learning, Groundwater, Interpretable machine learning, Spatial distribution, Driving factors, Water quality
Public URL https://durham-repository.worktribe.com/output/3043792
