Rainfall-Runoff Modeling of Khormazard and Bonab Hydrometric Stations Using Support Vector Machine and Random Forest Algorithms

Article Type:
Research/Original Article (accredited rank)
Abstract:
Introduction

Water plays a crucial role in the sustainable development of any region. Iran consists primarily of arid and semi-arid regions, where most of its rivers are found. Given this, along with the critical state of groundwater extraction and the growing importance of surface water, it is crucial to understand the future condition of water resources in the country's watersheds (Fathollahi et al., 2015). Intelligent models make it feasible to represent inherent relationships in data that conventional mathematical methods cannot capture. Support vector machine (SVM) and random forest (RF) are two machine learning methods used to make repeatable and accurate predictions (Kisi & Parmar, 2016). A recent study by Zarei et al. (2022) evaluated flood risk using SVM and RF data mining models (case study: Frizi watershed). Their results showed that both the SVM algorithm and the random forest algorithm predicted flood risk with high accuracy, in terms of both the training data and algorithm performance. The purpose of this study is to simulate the rainfall-runoff process at the hydrometric stations at the end of the Maragheh plain (Khormazard station on the Mahpari chai river and Bonab station on the Sufichai river) in East Azerbaijan province using support vector machine and random forest modeling algorithms. The study covers a period of 43 years, making it one of the few such studies in this area.

Materials and Methods

The Maragheh Sufichai basin is situated east of Lake Urmia, within East Azerbaijan province. It covers an area of 611.89 square kilometers and lies between longitudes 45°40′ and 46°25′ east and latitudes 37°15′ and 37°55′ north. The average elevation of the basin is 1767 meters above sea level (Sharmod et al., 2015).
Because the runoff trend in the data changed substantially after 1994 (without any noticeable change in the precipitation trend), the available data was divided into two distinct periods. The first period spans 1976 to 1994, and the second covers 1995 to 2019. To simulate rainfall-runoff, the average rainfall of the Maragheh plain was first calculated by the Thiessen polygon method. This data was then combined with the discharge recorded at the Bonab and Khormazard stations, with a one-day time lag, and used as input to two models, SVM (with a kernel function) and RF. For this purpose, 70% of the data was used for the training stage and 30% for the validation stage. The rainfall and runoff training sets from one day before were chosen as the predictor variables, while the runoff training set was designated as the target variable. Several combinations of runoff and rainfall inputs were evaluated for modeling. The inputs consist of previously recorded monthly Q and P values (Pt, Qt-1), while the output is the current runoff (Qt), with the subscript t indicating the time step. As a result, two input combinations were constructed from the Q and P data (see Table 3), and the SVM and RF models were run on each to determine the optimal input combination.
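The input construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example numbers, and the choice of a chronological (rather than random) 70/30 split are all assumptions, since the abstract does not say how the split was drawn. Only the (Pt, Qt-1) combination is shown.

```python
import numpy as np

def make_lagged_dataset(p, q):
    """Build inputs (P_t, Q_{t-1}) and target Q_t from two aligned series."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.ndim != 1 or p.shape != q.shape:
        raise ValueError("p and q must be 1-D series of equal length")
    X = np.column_stack([p[1:], q[:-1]])  # rainfall at t, runoff at t-1
    y = q[1:]                             # runoff at t
    return X, y

def chronological_split(X, y, train_fraction=0.7):
    """Assumed split: first 70% of time steps for training, the rest for validation."""
    n_train = int(round(train_fraction * len(y)))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Made-up values for illustration only (not data from the study).
p = [10.0, 0.0, 5.0, 20.0, 2.0, 0.0, 8.0, 12.0, 1.0, 3.0, 6.0]
q = [1.0, 1.2, 0.9, 1.1, 2.5, 1.4, 1.0, 1.6, 2.0, 1.3, 1.2]

X, y = make_lagged_dataset(p, q)
X_train, y_train, X_val, y_val = chronological_split(X, y)
print(X.shape, len(y_train), len(y_val))
```

The first time step is dropped because it has no preceding runoff value to serve as Qt-1.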
Calculating Average Rainfall through the Thiessen Polygons Method
Thiessen polygons, which are Voronoi cells, are used to define the surface area (Ai) associated with each rain gauge. These areas are used to weight the rainfall measured by each gauge (ri). Consequently, the area-weighted rainfall is equivalent to:
$$\bar{r} = \frac{\sum_{i} r_i A_i}{\sum_{i} A_i} \tag{1}$$
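As a small worked illustration of area weighting, the sketch below computes the mean rainfall as the sum of ri times Ai divided by the sum of Ai. The function name and the gauge values are hypothetical and are not taken from the study.

```python
import numpy as np

def thiessen_mean(rainfall, areas):
    """Area-weighted mean rainfall: sum(r_i * A_i) / sum(A_i)."""
    r = np.asarray(rainfall, dtype=float)
    a = np.asarray(areas, dtype=float)
    if r.shape != a.shape:
        raise ValueError("rainfall and areas must have the same length")
    if np.any(a < 0) or a.sum() <= 0:
        raise ValueError("areas must be non-negative with a positive total")
    return float(np.sum(r * a) / np.sum(a))

# Hypothetical gauges: rainfall in mm, polygon areas in square kilometers.
gauge_rain = [12.0, 8.0, 20.0]
polygon_area = [200.0, 300.0, 100.0]
mean_rain = thiessen_mean(gauge_rain, polygon_area)
print(round(mean_rain, 2))
```

With equal polygon areas the result reduces to the ordinary arithmetic mean of the gauges.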
Random Forest Algorithm
Random forest is a modern tree-based method that combines a multitude of classification and regression trees. It is one of the most widely used machine learning algorithms because it is simple and applicable to both classification and regression tasks.
Support Vector Machine (SVM) Algorithm
Like other artificial intelligence methods, the support vector machine is based on a data mining algorithm. Its most important functions are classification and regression of data.
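A rough sketch of fitting both model types to lagged rainfall-runoff inputs is given below, using scikit-learn. Everything specific here is an assumption: the abstract does not name the software, the kernel, or any hyperparameters, so the RBF kernel, the values of C and epsilon, the number of trees, and the synthetic series are placeholders chosen only to make the example run.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic series for illustration only (not data from the study).
rng = np.random.default_rng(0)
n = 200
p = rng.gamma(shape=2.0, scale=10.0, size=n)  # stand-in for rainfall
q = np.empty(n)
q[0] = 5.0
for t in range(1, n):
    q[t] = 0.6 * q[t - 1] + 0.1 * p[t] + rng.normal(0.0, 0.2)

# Inputs (P_t, Q_{t-1}) and target Q_t, with an assumed chronological 70/30 split.
X = np.column_stack([p[1:], q[:-1]])
y = q[1:]
n_train = int(0.7 * len(y))
X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

models = {
    # SVR is scale-sensitive, so inputs are standardized first (assumed setup).
    "SVM": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1)),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_val)
    print(name, predictions[name][:3])
```

The validation predictions can then be compared with the observed runoff using whichever evaluation criteria are of interest.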
Evaluation Criteria
To evaluate the models and compare their effectiveness, this research uses the root mean square error (RMSE), the correlation coefficient (r), the coefficient of determination (R2), and the Nash-Sutcliffe efficiency coefficient (NS). These criteria are defined as follows:
(2)
(3)                                                     
(4)
(5)
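The four criteria can be computed as in the sketch below, which uses their standard textbook definitions. The abstract does not reproduce the formulas, so two points are assumptions: R2 is taken here as the square of the correlation coefficient, and NS is taken as one minus the ratio of the squared-error sum to the variance sum of the observations. The function name and sample values are illustrative.

```python
import numpy as np

def evaluate(observed, simulated):
    """Return RMSE, r, R2 (assumed to be r squared) and Nash-Sutcliffe efficiency."""
    o = np.asarray(observed, dtype=float)
    s = np.asarray(simulated, dtype=float)
    rmse = float(np.sqrt(np.mean((s - o) ** 2)))
    r = float(np.corrcoef(o, s)[0, 1])
    ns = float(1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2))
    return {"RMSE": rmse, "r": r, "R2": r ** 2, "NS": ns}

# Illustrative values only.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = evaluate(obs, obs)
biased = evaluate(obs, [2.0, 3.0, 4.0, 5.0, 6.0])  # every value too high by 1
print(perfect)
print(biased)
```

In the biased case the correlation stays at 1 while RMSE rises to 1 and NS drops to 0.5, which is one reason to report error-based and efficiency-based criteria alongside r.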

Results and Discussion

Figure 6 displays the rainfall and runoff time series for the two study periods, before and after 1994. For Bonab station, the Kendall statistic for precipitation was 0.044 and 0.028 in the first and second periods, respectively. For Khormazard station, the corresponding values were 0.030 and 0.028. None of these values is significant at the 95% level, so annual rainfall at the two stations showed no statistically significant trend between 1976 and 2019. The variations observed during this period were deemed normal: the rainfall time series fluctuated, with increases in some years and decreases in others.
The discharge time series show different behavior. The outflow from Bonab station (panels a and b) initially fluctuated, then passed through periods of both decreasing and increasing trends, and has increased in recent years. The Mann-Kendall test statistic for this station is 0.325 and 0.512 for the two study periods, respectively. Both values are significant at the 95% level, indicating a statistically significant increasing trend in discharge in both periods. The reason for this trend at the Bonab station, compared to other stations on rivers entering Lake Urmia, is the lower demand for water for agricultural and industrial purposes in the Sufichai basin than in other river basins. To establish the root cause, studies should examine both groundwater and surface water sources, as well as water use in the agricultural and industrial sectors of this region.
In contrast, the discharge at Khormazard station (panels c and d) showed a consistently downward trend. The Mann-Kendall test statistic for discharge was -0.269 and -0.412 for the two study periods, respectively, and the decreasing trend was significant at the 95% level. The volume of discharge at this hydrometric station has also decreased drastically since 1976 (panel d). Apart from 2007, when discharge volume increased suddenly, the water inflow into Lake Urmia has remained at its lowest level throughout these years. To compare the Bonab and Khormazard stations over the two periods, rainfall and runoff statistics (average, minimum, maximum) for the first period (1976-1994) and the second period (1995-2019) are presented in Tables 4 and 5. According to the total data column of both tables, the Bonab station has the higher average rainfall and runoff values, while the Khormazard station has the lower ones.
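For readers unfamiliar with the trend test used above, the following is a minimal sketch of the Mann-Kendall test in its standard form, with the usual tie correction and normal approximation. It is not the authors' implementation. It is also an assumption that the statistics reported in the text correspond to Kendall's tau as computed here; the function name and sample series are illustrative.

```python
import math
import numpy as np

def mann_kendall(x):
    """Mann-Kendall trend test: returns S, Kendall's tau, Z and a two-sided p-value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    # S counts later-minus-earlier pairs that increase, minus pairs that decrease.
    s = 0
    for i in range(n - 1):
        s += int(np.sum(np.sign(x[i + 1:] - x[i])))
    # Variance of S under the no-trend hypothesis, corrected for tied values.
    _, counts = np.unique(x, return_counts=True)
    tie_term = float(sum(int(c) * (int(c) - 1) * (2 * int(c) + 5) for c in counts))
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    tau = s / (0.5 * n * (n - 1))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return {"S": s, "tau": tau, "Z": z, "p": p_value}

# Illustrative series only (not data from the study).
rising = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
falling = mann_kendall([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
print(rising)
print(falling)
```

A trend is called significant at the 95% level when the two-sided p-value is below 0.05, which is the criterion the text applies to the rainfall and discharge series.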
As mentioned, to model the rainfall-runoff data with the SVM and RF models, one portion of the data was used for training and another for validation. Tables 6 and 7 present the statistical indicators calculated for the training and validation results of both models. According to these tables, the SVM model outperformed the RF model at the Bonab station in both study periods, simulating both flow rate and monthly rainfall more accurately. Conversely, at the Khormazard station the RF model performed better than the SVM model in both periods. For the test set, the correlation values at the Bonab station were 0.85 and 0.84 for the first and second study periods, respectively. For the Khormazard station, these values were 0.79 and 0.75, respectively.

Conclusion

The results indicate that the SVM model was more efficient than the RF model at the Bonab station in both periods, whereas the RF model outperformed the SVM model at the Khormazard station in both periods. For the SVM model at the Bonab station, the test-set correlation values were 0.85 and 0.84 for the first and second study periods, respectively. For the RF model at the Khormazard station, these values were 0.79 and 0.75, respectively. Other notable findings come from the analysis of the rainfall and runoff time series over 43 years. The graphs for both stations, together with the Mann-Kendall statistic for precipitation and flow, revealed no discernible trend in precipitation during the two study periods. Instead, precipitation in these areas fluctuated. The time series and statistics for the discharge of the Sufichai and Mahpari chai rivers at the Bonab and Khormazard stations showed different results. At the Bonab station, the discharge fluctuated, with an increase observed in the second period. Conversely, at the Khormazard station, the discharge trend was downward in both study periods. The outflow volume of the Mahpari chai river has decreased notably in recent years, as confirmed by the decreasing trend in the Mann-Kendall statistic.

Language:
Persian
Published:
Journal of Water and Soil, Volume 37, Issue 6, 2024
Pages:
971 to 989
https://magiran.com/p2698334  