An Alternative to Laboratory Testing: Random Forest-Based Water Quality Prediction Framework for Inland and Nearshore Water Bodies
Water quality monitoring plays a vital role in water environment management: efficient monitoring both guides water management and verifies its effectiveness. Traditional monitoring of multiple water parameters requires deploying multiple sensors, and some water quality data (e.g., total nitrogen (TN)) can only be obtained with testing instruments or laboratory analysis, which takes longer than sensor measurement. In this paper, we design a water quality prediction framework that uses readily available water quality variables (e.g., temperature, pH, and conductivity) to predict total nitrogen concentrations in inland water bodies. The framework is also used to predict nearshore seawater salinity and temperature from remote sensing bands. We conducted experiments on real water quality datasets and, after comparing the performance of different machine learning algorithms, chose random forest as the core algorithm of the framework because it performed best among all tested models. The prediction error rate of the random forest model for total nitrogen concentration in inland rivers was 4.9%. Moreover, to explore how the random forest algorithm performs when the independent variables are not water quality data, we took the reflectance of remote sensing bands as the independent variables and successfully inverted the salinity distribution of Shenzhen Bay on the Google Earth Engine (GEE) platform. According to the experimental results, the random forest-based water quality prediction framework achieves 92.94% accuracy in predicting the salinity of nearshore waters.
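The abstract reports an error rate and an accuracy without defining either. As a minimal sketch, assuming the error rate is a mean relative error and the accuracy is its complement (both are assumptions, not definitions taken from the paper), candidate models could be compared as follows. The function names, model labels, and all numeric values are hypothetical and purely illustrative.

```python
def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|, returned as a percentage.

    Assumed metric: the paper may define its error rate differently.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def accuracy_from_error(error_pct):
    """Assumed relation: accuracy (%) = 100 - mean relative error (%)."""
    return 100.0 - error_pct


# Made-up total nitrogen values in mg/L; NOT data from the paper.
observed_tn = [2.0, 4.0, 5.0, 8.0]

# Hypothetical predictions from two candidate models.
predictions = {
    "random_forest": [2.1, 3.8, 5.0, 8.4],
    "other_model": [2.4, 3.2, 5.5, 9.6],
}

# Score every candidate and keep the one with the lowest error.
errors = {
    name: mean_relative_error(observed_tn, predicted)
    for name, predicted in predictions.items()
}
best_model = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: error {err:.2f}%, accuracy {accuracy_from_error(err):.2f}%")
print("selected:", best_model)
```

Under this assumed relation, the reported 92.94% salinity accuracy would correspond to a mean relative error of about 7.06%, though the paper's own metric definitions take precedence.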