Modeling streamflow in non-gauged watersheds with sparse data considering physiographic, dynamic climate, and anthropogenic factors using explainable soft computing techniques

Madhushani, Charuni, Dananjaya, Kusal, Ekanayake, I.U., Meddage, D.P.P., Kantamaneni, Komali (ORCID: 0000-0002-3852-4374) and Rathnayake, Upaka (2024) Modeling streamflow in non-gauged watersheds with sparse data considering physiographic, dynamic climate, and anthropogenic factors using explainable soft computing techniques. Journal of Hydrology, 631. ISSN 0022-1694

PDF (AAM) - Accepted Version, 3MB
Restricted to Repository staff only until 8 February 2026.
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://doi.org/10.1016/j.jhydrol.2024.130846

Abstract

Streamflow forecasting is essential for effective water resource planning and early warning systems. Streamflow and its related parameters are often characterized by uncertainty and complex behavior. Recent studies have turned to machine learning (ML) to predict streamflow. However, many of these methods have overlooked the interpretability and causality of their predictions, which undermines end-users' confidence in the reliability of ML. In addition, non-gauged basins have been receiving growing attention because of the inherent risks of predicting streamflow without observed records. This study addresses these limitations by using ML to model streamflow in a non-gauged basin from anthropogenic, static physiographic, and dynamic climate variables, while providing interpretability through Shapley Additive Explanations (SHAP). Four ML algorithms were employed to forecast streamflow: Histogram Gradient Boosting (HGB), Extreme Gradient Boosting (XGB), a Deep Neural Network (DNN), and a Convolutional Neural Network (CNN). XGB outperformed the other models, with a correlation coefficient (R) of 0.91 for training and 0.884 for testing, and mean absolute errors (MAE) of 0.02 for training and 0.023 for testing. Significantly, SHAP provided insight into the inner workings of the XGB predictions, revealing feature importance, interactions among features, and feature dependencies. This explainability technique is an invaluable addition to ML-based streamflow prediction and early warning systems, offering human-comprehensible interpretations. The findings of this study are especially important for managing flood risk in urban areas.
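The abstract's pipeline (gradient-boosted regression followed by SHAP attribution) can be illustrated with the standard xgboost and shap Python libraries. The sketch below is illustrative only and is not the authors' code: the input file, feature columns, train/test split, and hyperparameters are all hypothetical placeholders standing in for the paper's anthropogenic, physiographic, and climate predictors.

```python
# Minimal sketch of an XGB + SHAP streamflow workflow, assuming the
# public xgboost and shap packages. "watershed_data.csv" and the
# "streamflow" target column are hypothetical placeholders.
import pandas as pd
import shap
import xgboost as xgb
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical input: one row per time step with predictor columns
# (climate, physiographic, anthropogenic) and observed streamflow.
df = pd.read_csv("watershed_data.csv")
X = df.drop(columns=["streamflow"])
y = df["streamflow"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Gradient-boosted tree regressor (XGB), the best-performing model in
# the abstract. Hyperparameters here are illustrative, not the paper's.
model = xgb.XGBRegressor(n_estimators=500, learning_rate=0.05, max_depth=6)
model.fit(X_train, y_train)

print("Test MAE:", mean_absolute_error(y_test, model.predict(X_test)))

# SHAP's TreeExplainer attributes each prediction to the input features,
# producing the feature-importance, interaction, and dependence views
# the abstract describes.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)        # global feature importance
shap.dependence_plot(0, shap_values, X_test)  # dependence for one feature
```

In practice the summary plot ranks predictors by mean absolute SHAP value, while dependence plots show how a single predictor's value drives the model output, including interaction effects with a second feature.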

