Development of Reinforcement Learning Algorithm for Automation of Slide Gate Check Structure in Canals

Author(s):

K. Shahverdi, M.J. Monem

Abstract:

Introduction

Given water shortages and weak management in the agricultural water sector, the performance of irrigation networks needs to be improved to make optimal use of water. Recently, intelligent management of water conveyance and delivery and better control technologies have been considered for improving the performance and operation of irrigation networks. To this end, mathematical models of automatic control systems and their related structures, coupled with hydrodynamic models, are necessary. The main objective of this research is to develop a mathematical model of a reinforcement learning (RL) upstream control algorithm as a subroutine inside the ICSS hydrodynamic model.

Materials And Methods

In learning systems, a set of state-action rules called classifiers compete to control the system based on what the system receives from the environment. Five main elements of RL can be identified: an agent, an environment, a policy, a reward function, and a simulator. The learner (decision-maker) is called the agent. The thing it interacts with, comprising everything outside the agent, is called the environment. The agent selects an action based on the current state of the environment. When the agent applies an action to the environment, the environment moves to a new state and a reward is assigned based on that state. The agent and the environment interact continually to maximize the reward. The policy is the set of state-action pairs that have the highest rewards. It defines the agent's behavior by specifying which action must be taken in which state. The reward function defines the goal of an RL problem, that is, which events are good and which are bad for the agent. The higher the reward, the better the action. The simulator provides information about the environment.

In irrigation canals, the agent is the check structure. The action and the state are the check structure adjustment and the water depth, respectively. The environment comprises the hydraulic information available in the canal. The policy is a map of water depth-check structure adjustment pairs. The reward function is defined based on the difference between the water depth and the target depth, and the simulator is a hydrodynamic model, which in the present study was Irrigation Conveyance System Simulation (ICSS). The developed RL begins with the required initializations, and then the canal structures are operated. At each time step t, until the maximum reward is reached, the agent receives some representation of the state of the environment and, on that basis, selects an action. The simulator performs the action and, by simulating the canal system, provides information on the new state of the environment. Finally, the reward is assigned. Once the reward is maximized, the RL goes on to the next time step. This process continues until the final simulation time step is reached. The learning process is similar for all operations.

The ICSS hydrodynamic model was used to simulate the canal system and provide the environmental information. The input to ICSS was the newly selected action. ICSS performed the action and simulated the canal system. The output from ICSS was information on the new environment for use in the next time step. Two scenarios, flow increase and flow decrease, were simulated with an initial flow of 25 l/s. The MAE (Maximum Absolute Error), IAE (Integral of Absolute magnitude of Error) and SRT (System Response Time) indicators were used to assess the developed model. For the flow decrease scenario, all indicator values were zero.
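The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_simulator` is a hypothetical stand-in for ICSS that uses a steady free-flow gate equation instead of a hydrodynamic simulation, and the gate width, discharge coefficient, target depth, tolerance, candidate range, number of candidates per population, and flow values are all assumed for illustration. A new population of candidate gate openings is generated only when no candidate in the current one reaches the maximum reward (here, a depth error within the tolerance).

```python
import math
import random

G = 9.81  # gravitational acceleration, m/s^2


def toy_simulator(flow, opening, width=0.3, cd=0.6):
    """Hypothetical stand-in for ICSS (assumption, not the real model).

    Solves the steady free-flow gate equation
    Q = cd * width * opening * sqrt(2 * g * y) for the upstream depth y.
    """
    return (flow / (cd * width * opening)) ** 2 / (2 * G)


def reward(depth, target):
    """Reward grows (towards zero) as the depth approaches the target."""
    return -abs(depth - target)


def new_population(rng, pop_size):
    """Random candidate gate openings in metres (assumed range)."""
    return [rng.uniform(0.01, 0.30) for _ in range(pop_size)]


def run_control(flows, target, tolerance=0.005, pop_size=1000,
                max_generations=50, seed=0):
    """Run one control decision per entry of `flows` (one per time step).

    Returns the resulting depths and the number of populations that had
    to be regenerated after the initial one.
    """
    rng = random.Random(seed)
    population = new_population(rng, pop_size)  # initial population
    depths, generations = [], 0
    for flow in flows:
        for _ in range(max_generations):
            # Test every candidate action with the simulator, keep the best.
            best = max(population,
                       key=lambda w: reward(toy_simulator(flow, w), target))
            if abs(toy_simulator(flow, best) - target) <= tolerance:
                break  # maximum reward reached, move to next time step
            # No candidate is good enough: generate a new population.
            population = new_population(rng, pop_size)
            generations += 1
        depths.append(toy_simulator(flow, best))
    return depths, generations


if __name__ == "__main__":
    # Initial flow of 25 l/s, then an assumed increase and decrease.
    depths, generations = run_control([0.025, 0.030, 0.020], target=0.3)
    print(depths, generations)
```

Keeping the population between time steps mirrors the idea that learned state-action pairs are reused, so regeneration becomes rarer as the run proceeds.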

Results And Discussion

Results were obtained for both scenarios. In the flow increase scenario, water depth variations stayed inside the dead band, so the SRT indicator was zero. The MAE and IAE indicators were 3.5% and 2.57%, respectively, which shows that the water depth deviated very little from the target depth. In the flow decrease scenario, all indicator values were zero. At time zero in both scenarios, 1000 populations were generated and tested. As the RL controlled the water depth, it generated new populations as well, because the RL generates a new population whenever the current population contains no classifier with maximum reward. No new populations were generated after 0.03 hr and 0.16 hr in the flow increase and flow decrease scenarios, respectively. Based on these results, it can be concluded that the developed control system is a powerful technique for water depth control in terms of accuracy and response time.
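The abstract does not give formulas for the three indicators, so the sketch below assumes forms commonly used for canal control evaluation: MAE as the largest absolute depth deviation relative to the target, IAE as the time-averaged absolute deviation relative to the target, and SRT as the time after which the depth stays inside the dead band. The function name, the uniform sampling, and the dead band expressed as a fraction of the target depth are all assumptions made for illustration.

```python
def performance_indicators(times, depths, target, dead_band):
    """Return (MAE, IAE, SRT) for a sampled water-depth record.

    Assumptions: samples are uniformly spaced, MAE and IAE are returned
    as fractions of the target depth, and `dead_band` is a fraction of
    the target depth (e.g. 0.05 for a +/-5% band).
    """
    errors = [abs(y - target) for y in depths]

    # MAE: largest deviation relative to the target depth.
    mae = max(errors) / target

    # IAE: (dt / T) * sum(|error|) / target, which for uniform sampling
    # over T = n * dt reduces to the mean absolute error over the target.
    iae = (sum(errors) / len(errors)) / target

    # SRT: time of the first sample after the last excursion outside
    # the dead band, measured from the start of the record.
    outside = [i for i, e in enumerate(errors) if e > dead_band * target]
    if not outside:
        srt = 0.0  # depth never left the dead band
    elif outside[-1] + 1 < len(times):
        srt = times[outside[-1] + 1] - times[0]
    else:
        srt = None  # depth had not settled by the end of the record
    return mae, iae, srt


if __name__ == "__main__":
    # Hypothetical record: times in minutes, depths in metres.
    print(performance_indicators([0, 1, 2, 3], [1.0, 1.1, 1.02, 1.0],
                                 target=1.0, dead_band=0.05))
```

Under these definitions, a record that never leaves the dead band yields an SRT of zero, which is consistent with the zero SRT reported for the flow increase scenario.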

Conclusion

In this research, an RL upstream control system was developed, coupled with the ICSS hydrodynamic model, and evaluated in two scenarios, flow increase and flow decrease. The results showed that the developed RL control system can limit water depth deviations, responds quickly, and performs accurately, and that it could be used for further study in irrigation canals.

Keywords:

Learning, Model development, Performance improvement

Language:

Persian

Published:

Journal of Water and Soil, Volume 29, Issue 4, 2015

Pages:

828 to 837

magiran.com/p1498566
