Hybrid machine learning forecasting of aquatic ecosystem dynamics using sensor-based monitoring systems
Abstract
This study develops an integrated hybrid machine learning model that uses sensor-based environmental monitoring to predict aquatic ecosystem dynamics, focusing on key water quality indicators. A hybrid framework integrating CatBoost and XGBoost regressors was constructed, with LASSO used for feature selection and SHAP analysis used for interpretability. Real-time data on dissolved oxygen, hardness, transparency, and nutrient content were collected with IoT-enabled multi-parameter water sensors in lakes in Northern Kazakhstan. The hybrid model outperformed the individual algorithms, achieving an RMSE of 0.362 for dissolved oxygen predictions. SHAP analysis identified nitrate nitrogen, total phosphorus, pH, and suspended solids as the most influential factors. The system also forecast impacts on biota, indicating a potential reduction in phytoplankton and an increase in zooplankton populations in 2024. Combining hybrid machine learning with real-time monitoring improves both prediction accuracy and interpretability, and provides a practical decision-support tool that environmental agencies and water resource managers can use to monitor and manage water bodies proactively under the pressures of climate change and anthropogenic influences.
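The abstract reports model quality as RMSE and compares the hybrid against its component regressors, but does not state how the CatBoost and XGBoost outputs are combined. The sketch below assumes one common option, a weighted average of the two models' predictions, and shows how RMSE would be computed for each. The function names `rmse` and `blend`, the weight, and all numeric values are hypothetical illustrations, not the study's data or implementation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def blend(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two models' predictions.

    Assumption: the source does not specify the combination rule;
    simple weighted averaging is used here purely for illustration.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have equal length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical dissolved oxygen values, NOT taken from the study.
observed = [8.0, 7.5, 9.0, 6.5]
model_a = [8.4, 7.1, 9.6, 6.5]  # stand-in for one regressor's predictions
model_b = [7.6, 7.7, 8.8, 6.9]  # stand-in for the other regressor's predictions

hybrid = blend(model_a, model_b, weight_a=0.5)

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
rmse_hybrid = rmse(observed, hybrid)

print(f"model A RMSE: {rmse_a:.3f}")
print(f"model B RMSE: {rmse_b:.3f}")
print(f"hybrid  RMSE: {rmse_hybrid:.3f}")
```

In this toy example the two models err in partly opposite directions, so averaging cancels some of the error. That is one reason a blend can score a lower RMSE than either component, although it is not guaranteed in general.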
Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.