A Comparative Study of Machine Learning Algorithms for Water Quality Prediction Using SHAP-based Explainability
Abstract: Accurate and interpretable water quality prediction is crucial for environmental monitoring and public health. This study evaluates six machine learning models (Random Forest, Long Short-Term Memory (LSTM), K-Nearest Neighbors (KNN), Linear Regression, Ridge Regression, and Support Vector Regression (SVR)) on real-world groundwater data from ARPAE. Model performance was assessed via Mean Absolute Error (MAE) and Mean Squared Error (MSE), while SHAP values were used for feature-level interpretability. Results indicate that Random Forest outperforms all other models in both accuracy and explainability, whereas SVR shows poor predictive capability and lacks meaningful interpretability. The study highlights the trade-offs between predictive power and transparency, offering insights for selecting appropriate models in water quality monitoring systems.
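For readers unfamiliar with the two error metrics named in the abstract, the following is a minimal sketch of how MAE and MSE are conventionally computed. It is not the authors' code: the function names and the sample values are hypothetical and chosen only for illustration, and the study's actual evaluation pipeline is not described here.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; penalizes large errors more than MAE."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted values (not data from the study).
observed = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 11.0]

mae = mean_absolute_error(observed, predicted)
mse = mean_squared_error(observed, predicted)
print(f"MAE = {mae:.3f}, MSE = {mse:.3f}")
```

Because MSE squares each residual, a single large miss (here the third value) raises it far more than it raises MAE, which is one reason the two metrics are often reported together.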
External IDs: dblp:conf/wetice/CabriR25