Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning
Li, Xuan; Liang, Guohua; Wang, Lei; Yang, Yuesuo; Li, Yuanyin; Li, Zhongguo; He, Bin; Wang, Guoli
Authors
Xuan Li
Guohua Liang
Lei Wang
Yuesuo Yang
Yuanyin Li (yuanyin.li@durham.ac.uk), PGR Student, Doctor of Philosophy
Zhongguo Li
Bin He
Guoli Wang
Abstract
Groundwater nitrate contamination poses a potential threat to human health and environmental safety worldwide. This study proposes an interpretable stacking ensemble learning (SEL) framework that improves and explains spatial predictions of groundwater nitrate by combining a two-level heterogeneous SEL model with SHapley Additive exPlanations (SHAP). In the SEL model, five commonly used machine learning models serve as base models (gradient boosting decision tree, extreme gradient boosting, random forest, extremely randomized trees, and k-nearest neighbor), and their outputs are the input data for the meta-model. When applied to the Eden Valley in the UK, an area of intensive agriculture, the SEL model outperformed the individual models in both predictive performance and generalization ability. The model indicates a mean groundwater nitrate level of 2.22 mg/L-N, with 2.46% of sandstone aquifers exceeding the drinking standard of 11.3 mg/L-N. Alarmingly, 8.74% of areas with high groundwater nitrate remain outside the designated nitrate vulnerable zones. Moreover, SHAP identified transmissivity, baseflow index, hydraulic conductivity, the percentage of arable land, and the soil C:N ratio as the five most important driving factors of groundwater nitrate. With nitrate threatening groundwater globally, this study presents a high-accuracy, interpretable, and flexible modeling framework that improves understanding of the mechanisms behind groundwater nitrate contamination. The results suggest that the interpretable SEL framework holds promise for providing evidence for environmental management, water resource protection, and sustainable development, particularly in data-scarce areas.
Citation
Li, X., Liang, G., Wang, L., Yang, Y., Li, Y., Li, Z., He, B., & Wang, G. (2024). Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning. Environmental Geochemistry and Health, 46(11), Article 482. https://doi.org/10.1007/s10653-024-02201-1
Journal Article Type | Article |
---|---|
Acceptance Date | Aug 27, 2024 |
Online Publication Date | Oct 29, 2024 |
Publication Date | Nov 1, 2024 |
Deposit Date | Nov 7, 2024 |
Publicly Available Date | Nov 7, 2024 |
Journal | Environmental Geochemistry and Health |
Print ISSN | 0269-4042 |
Electronic ISSN | 1573-2983 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 46 |
Issue | 11 |
Article Number | 482 |
DOI | https://doi.org/10.1007/s10653-024-02201-1 |
Keywords | Ensemble learning, Groundwater, Interpretable machine learning, Spatial distribution, Driving factors, Water quality |
Public URL | https://durham-repository.worktribe.com/output/3043792 |
Files
Published Journal Article (PDF, 4.7 Mb)
Publisher Licence URL: http://creativecommons.org/licenses/by-nc-nd/4.0/