Evaluation of the Efficiency of Artificial Neural Network and Random Forest Models in Predicting Groundwater Quality Parameters of Yazd-Ardakan Plain
Groundwater is one of the main natural resources supporting the socio-economic development of countries (Siebert et al., 2010). Managing and protecting groundwater is especially important in countries such as Iran, which lie in arid and semi-arid regions that lack surface water resources. Conservation, together with short-term and long-term planning, is therefore essential to optimizing the productivity of these resources (Abu-Khalaf et al., 2013). Their sustainability is threatened by several factors, including climate change, human activities and natural processes (Alabjah et al., 2018; Baghvand et al., 2010; Burri et al., 2019; El Asri et al., 2019; Houemenou et al., 2020; Mountadar et al., 2018). Given the importance of characterizing groundwater quality in desert areas and the need for its proper management, predicting groundwater quality is an urgent requirement for managing and exploiting water resources in the Yazd-Ardakan plain. The aim of this study was therefore to evaluate and compare the efficiency of artificial neural network and random forest models in predicting electrical conductivity (EC), sodium adsorption ratio (SAR), sulfate (SO4²⁻) and total dissolved solids (TDS). The models are built on the relationship between environmental (auxiliary) data and groundwater quality parameters.
The study area covers 482900 ha in the central plateau of Iran, in the central part of Yazd province, between 53° 08´ 36˝ and 54° 85´ 32˝ E longitude. Elevation ranges from 997 m to 2677 m above sea level. Groundwater quality was assessed using EC, SAR, SO4²⁻ and TDS values measured in 2016 at 201 wells of the monitoring network of the Yazd Regional Water Company. The environmental (auxiliary) data include geological data, land use, vegetation indices, soil salinity indices, derivatives of the digital elevation model (DEM), distance from mines, distance from roads, distance from rivers, distance from residential areas, rainfall and population. After the groundwater quality parameters and environmental data had been prepared, the values of EC, SAR, SO4²⁻ and TDS were predicted using artificial neural network and random forest models. The random forest model was validated by 10-fold cross-validation in the R statistical software. In this method, the data are divided into 10 parts; 9 parts are used to fit the model and the remaining part is used to validate it, with each part held out in turn. The mean absolute error (MAE), root mean square error (RMSE) and coefficient of determination (R2) were used to evaluate the efficiency of the artificial neural network and random forest models for the groundwater parameters.
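The validation procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' R code: the function names, the `fit_predict` callable and the mean-value baseline are all hypothetical, the data in the demonstration are synthetic, and R2 is computed here as 1 - SS_res/SS_tot, which may differ from the definition used in the study (for example, the squared correlation coefficient).

```python
import math
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cross_validate(features, targets, fit_predict, k=10, seed=0):
    """Hold out each fold once, fit on the other k-1 folds, score all held-out predictions."""
    predictions = [None] * len(targets)
    for held_out in k_fold_indices(len(targets), k, seed):
        held = set(held_out)
        train = [i for i in range(len(targets)) if i not in held]
        fold_predictions = fit_predict(
            [features[i] for i in train],
            [targets[i] for i in train],
            [features[i] for i in held_out],
        )
        for i, p in zip(held_out, fold_predictions):
            predictions[i] = p
    return {
        "MAE": mae(targets, predictions),
        "RMSE": rmse(targets, predictions),
        "R2": r2(targets, predictions),
    }


def mean_baseline(train_x, train_y, test_x):
    """Hypothetical stand-in for a real model: always predicts the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return [mean_y] * len(test_x)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic values for 201 wells; these are NOT data from the study.
    x = [[rng.random()] for _ in range(201)]
    y = [3.0 * row[0] + rng.gauss(0.0, 0.1) for row in x]
    print(cross_validate(x, y, mean_baseline, k=10))
```

In practice `mean_baseline` would be replaced by a function that fits a random forest or neural network on the training folds and returns its predictions for the held-out wells; the splitting and scoring logic stays the same.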
In simulating the EC, SAR, SO4²⁻ and TDS parameters, the best structure obtained from 100 repetitions of artificial neural network training has 2 hidden layers with 10 hidden neurons in each layer. The random forest sensitivity analysis shows the high importance of the parameters extracted from the digital elevation model (DEM). Satellite image data also offer a convenient and cost-effective way to investigate groundwater quality parameters, and combining satellite data with the DEM to investigate and zone these parameters improves the efficiency and accuracy of the results. Notably, distance from roads and distance from mines also play an important role, which indicates the direct effect of man-made environmental factors on the groundwater quality of the study area. The results of modeling EC, SAR, SO4²⁻ and TDS with the artificial neural network and random forest models show that the artificial neural network model predicts these parameters with suitable accuracy. The neural network model predicted the EC, SAR, SO4²⁻ and TDS of groundwater with coefficients of determination of 0.82, 0.92, 0.92 and 0.97, respectively, whereas the random forest model predicted them with coefficients of determination of 0.37, 0.65, 0.80 and 0.68, respectively. Based on these results, it can be concluded that the artificial neural network model is more accurate than the random forest model in predicting groundwater quality parameters. EC, SAR, SO4²⁻ and TDS have their highest values in the north, center and southwest, and their lowest values in the southeast and southwest.
The technique is therefore highly efficient for estimating the EC, SAR, SO4²⁻ and TDS parameters in the Yazd-Ardakan plain, provided that suitable input variables in sufficient number, a suitable and compatible artificial neural network, and appropriate calibration are used.