Lučin, I., Družeta, S., Mauša, G., Alvir, M., Grbčić, L., Vukić Lušić, D., Sikirica, A., Kranjčević, L.
Predictive modeling of microbiological seawater quality in karst region using cascade model
Science of The Total Environment, (2022), doi.org/10.1016/j.scitotenv.2022.158009, (quartile Q1)
Abstract: This paper presents an in-depth analysis of seawater quality measurements during the bathing seasons from 2009 to 2020 in the city of Rijeka, Croatia. Due to rare occurrences of measurements with less than excellent water quality, the considered dataset is deeply imbalanced. Additionally, it incorporates measurements under the influence of submerged groundwater discharges (SGD), which were observed in some bathing locations. These discharges were previously thought to dry up during the summer season and are now suspected to be one of the causes of increased Escherichia coli values. Consequently, and in view of the fact that the accuracy of prediction models can be significantly influenced by temporal and spatial variation of the input data, a novel cascade prediction modeling strategy was proposed. It consists of a sequence of prediction models that aim to identify general environmental conditions which confidently lead to excellent bathing water quality. The proposed model uses environmental features which can rather easily be estimated or obtained from the weather forecast. The model was trained on a highly biased dataset, consisting of data from locations with and without SGD influence, and for the time period spanning extremely dry and warm seasons, extremely wet seasons, as well as normal seasons. To simulate realistic application, the model was tested using temporal and spatial stratification of data. The cascade strategy was shown to be a good approach for reliably detecting environmental parameters which produce excellent water quality. The proposed model is designed as a filter method, where instances classified as less-than-excellent water quality require further analysis. The cascade model provides great flexibility, as it can be customized to the particular needs of the investigated area and dataset specifics.
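The filter idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the stage names, the feature names (`rain_mm_72h`, `sgd_influence`), and all thresholds are hypothetical and chosen only to show the control flow. Each stage either confidently accepts a sample as "excellent" or passes it on; whatever no stage accepts is set aside for further analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

# A sample is a mapping from feature name to value (hypothetical features).
Sample = Dict[str, float]


@dataclass
class Stage:
    """One model in the cascade, reduced here to a simple predicate."""
    name: str
    is_confidently_excellent: Callable[[Sample], bool]


def cascade_predict(stages: Sequence[Stage], sample: Sample) -> Optional[str]:
    """Return the name of the first stage that accepts the sample.

    None means no stage was confident, so the sample needs further analysis.
    """
    for stage in stages:
        if stage.is_confidently_excellent(sample):
            return stage.name
    return None


# Hypothetical stages; the rules and thresholds are invented for illustration
# and are not results from the paper.
stages: List[Stage] = [
    Stage("dry_spell", lambda s: s["rain_mm_72h"] == 0.0),
    Stage(
        "light_rain_no_sgd",
        lambda s: s["rain_mm_72h"] < 5.0 and s["sgd_influence"] == 0.0,
    ),
]

# Made-up samples, not measurements from the study.
samples: List[Sample] = [
    {"rain_mm_72h": 0.0, "sgd_influence": 1.0},
    {"rain_mm_72h": 3.0, "sgd_influence": 0.0},
    {"rain_mm_72h": 3.0, "sgd_influence": 1.0},
    {"rain_mm_72h": 40.0, "sgd_influence": 0.0},
]

labels = [cascade_predict(stages, s) for s in samples]
# Indices of samples that no stage accepted as confidently excellent.
needs_analysis = [i for i, label in enumerate(labels) if label is None]

print(labels)
print(needs_analysis)
```

In a real application each predicate would presumably be replaced by a trained classifier with a confidence threshold, and the order and number of stages would be adapted to the site and dataset, which is the flexibility the abstract refers to.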