A Data-driven Model for Predicting the Yield of Recoverable Sugar from Sugarcane

Article Type:
Research/Original Article (accredited rank)
Abstract:
Introduction

Sugarcane is a strategic agricultural product, and increasing productivity and self-sufficiency in its production is of particular importance. Its most important product is sugar. Various factors, such as climatic and management conditions, affect the yield of both sugarcane and recoverable sugar. Crop yield forecasting is one of the most important topics in precision agriculture; it is used to estimate yield, match supply with demand, and manage the crop to increase productivity. The purpose of this study is to model and predict the sugar extracted from sugarcane (recoverable sugar), and to identify the factors affecting it, on the farms of the Amir-Kabir Sugarcane Agro-Industry Company in Khuzestan province using machine learning methods.

Materials and Methods

This study used data from the Amir-Kabir Sugarcane Agro-Industry Company in Khuzestan province covering 2010 to 2017. The dataset has 3223 records drawn from four groups of data: climate, soil, crop, and farm management. It contains both discrete and continuous variables. The discrete variables are production management, soil type, farm, variety, age (cane class), month of harvest, and number of irrigations. The continuous variables are area, chemical fertilizer consumption, water consumption per hectare, total water consumption, drain, crop season duration, yield (cane yield), soil EC, purity, time interval from drying off to harvest, precipitation, minimum and maximum temperature, minimum and maximum relative humidity, wind speed, and evaporation. Recoverable sugar is the target variable and is divided into two classes: values greater than or equal to 9 fall in the optimal class, and values less than 9 fall in the undesirable class. All other variables are treated as predictors. For modeling, the holdout method was used to divide the data randomly into two independent sets: a training set with 70% of the data (2256 records) and a test set with 30% of the data (967 records). Modeling was performed in Python 3.8.6 in the Jupyter Notebook environment, using the Random Forest, AdaBoost, XGBoost, and SVM (support vector machine) algorithms.
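The class definition and the 70/30 holdout split described above can be illustrated with a minimal sketch. The abstract does not say which libraries or random seed the authors used, so this version relies only on the Python standard library. The function names, the seed, and the stand-in records are assumptions made for illustration, not the study's data or code.

```python
import random

# From the abstract: values >= 9 are "optimal", values < 9 are "undesirable".
THRESHOLD = 9.0


def label_recoverable_sugar(value, threshold=THRESHOLD):
    """Map a recoverable sugar value to one of the two target classes."""
    return "optimal" if value >= threshold else "undesirable"


def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-in records: (record id, recoverable sugar value).
records = [(i, 7.0 + (i % 50) * 0.1) for i in range(3223)]
train, test = holdout_split(records)
labels = [label_recoverable_sugar(value) for _, value in train]

print(len(train), len(test))  # 2256 967
```

With 3223 records and a test fraction of 0.3, this split reproduces the set sizes reported in the abstract (2256 training records and 967 test records).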

Results and Discussion

The models were evaluated using accuracy, precision, recall, F1 score, and k-fold cross-validation. XGBoost achieved the highest accuracy on the training set (94.8%), and AdaBoost achieved the highest accuracy on the test set (92.4%). In terms of precision and recall, AdaBoost had the best precision (87%) and SVM had the best recall (87%). Under repeated stratified 10-fold cross-validation with two repeats, SVM was the best model, with 92.3% accuracy. Purity, the time interval from drying off to harvest, and crop season duration were the most important variables for predicting recoverable sugar.
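For readers unfamiliar with these evaluation tools, the following sketch shows how the four metrics are computed from predictions and how repeated stratified k-fold splits can be generated. It is a standard-library illustration under stated assumptions, not the authors' code. The function names are hypothetical, the example labels are made up, and treating "optimal" as the positive class is an assumption that the abstract does not confirm.

```python
import random
from collections import defaultdict


def classification_metrics(y_true, y_pred, positive="optimal"):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def repeated_stratified_folds(labels, k=10, repeats=2, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold keeps the class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold gets its share.
            for pos, idx in enumerate(shuffled):
                folds[pos % k].append(idx)
        for i in range(k):
            train_idx = sorted(idx for j, fold in enumerate(folds)
                               if j != i for idx in fold)
            yield train_idx, sorted(folds[i])


# Made-up labels, only to exercise the functions.
y_true = ["optimal", "optimal", "optimal", "undesirable",
          "undesirable", "undesirable", "undesirable", "optimal"]
y_pred = ["optimal", "optimal", "undesirable", "undesirable",
          "undesirable", "optimal", "undesirable", "optimal"]
metrics = classification_metrics(y_true, y_pred)

example_labels = ["optimal"] * 60 + ["undesirable"] * 40
splits = list(repeated_stratified_folds(example_labels, k=10, repeats=2))
print(metrics, len(splits))
```

With k=10 and two repeats, the generator yields 20 train/test splits. The accuracy reported for a model under this scheme is typically the mean of its accuracy across those splits.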

Conclusion

This study presented a new approach, based on machine learning methods, for predicting recoverable sugar from sugarcane. Its most important innovation is the simultaneous consideration of management and climatic factors, along with other factors such as soil and crop characteristics, for modeling and classifying the recoverable sugar percentage of sugarcane. The results show that all models perform acceptably and that machine learning methods, including ensemble learning algorithms, can be used to predict crop yield. The results of this study, together with an analysis of the rules extracted from the decision trees in the random forest model, can help managers of agro-industries determine appropriate strategies and prepare the conditions for optimal production. For future research, and for policy making and decision making at the Amir-Kabir Sugarcane Agro-Industry Company, the following suggestions are offered. More samples can be used to obtain more reliable results. Deep learning methods, time series analysis, and image processing can also be applied. IoT equipment can be used to collect and process data in real time on the Amir-Kabir sugarcane agro-industry farms.

Language:
Persian
Published:
Journal of Agricultural Machinery, Volume:12 Issue: 4, 2022
Pages:
543 to 558
https://magiran.com/p2513988  