Magiran | Keyword search: "random forest algorithm"

کاربرد الگوریتم جنگل تصادفی در برآورد آثار نشانگرها و تعیین ژن های کاندیدا برای صفات تولیدمثلی در گاو شیری هلشتاین ایران

جیران جباری تورچی، صادق علیجانی*، سید عباس رافت، مختارعلی عباسی

فصلنامه تحقیقات تولیدات دامی، سال سیزدهم شماره 1 (بهار 1403)، صص 95 -109

روش یادگیری ماشین، رویکرد قدرتمندی برای مطالعات ژنومی است. هدف تحقیق حاضر، استفاده از روش یادگیری ماشین (جنگل تصادفی) برای پویش ژنومی پیشنهادی صفات تولیدمثلی شامل سن در زمان اولین زایش (AFC)، روزهای باز (DO)، فاصله گوساله زایی (CI) و نرخ آبستنی دختران (DPR) در گاوهای هلشتاین ایران بود. اطلاعات لازم از مرکز اصلاح نژاد و بهبود تولیدات دامی کشور اخذ شد. اطلاعات ژنوتیپی شامل نشانگرهای چند شکلی تک نوکلئوتیدی (SNP) مربوط به 2419 راس گاو هلشتاین نر بود. فایل داده مشتمل بر رکوردهای ثبت شده سال های 1360 تا 1398 شامل 2774183 راس دام بود. با توجه به تفاوت تراکم در اطلاعات ژنومی گاوهای نر، تعداد نشانگرهای آن ها نیز با یکدیگر متفاوت بود. برای یکسان سازی نشانگرها از نرم افزار FImpute برای جانهی ژنوتیپ استفاده شد. در این تحقیق با استفاده از الگوریتم جنگل تصادفی که نمونه ای از الگوریتم های با نظارت و از نوع رگرسیونی است، در مجموع، 21 نشانگر با میزان اهمیت بالا برای صفات مختلف تولید مثلی مشخص شد. سپس، با استفاده از روش هستی شناسی ژن، ژن های پیشنهادی مهمی برای این صفات شناسایی شدند. ژن های MPZL1 و CD247 شناسایی شده روی کروموزوم 3 در ارتباط با صفت AFC و ژن های RPS6KC1 و FAM170A در ارتباط با صفت DPR برای بهبود عملکرد تولید مثلی گاوهای شیری، مهم بوده و می توانند مورد استفاده قرار گیرند. نشانگرها و ژن های شناسایی شده در این تحقیق می توانند اطلاعات جدیدی را در مورد معماری ژنتیکی صفات تولید مثلی برای بهبود ژنومی آن ها ارائه دهد و در طراحی تراشه ها برای ارزیابی صفات تولید مثلی مورد استفاده قرار گیرد.

کلید واژگان: الگوریتم جنگل تصادفی, ژنوتیپ, گاو شیری, نشانگر, یادگیری ماشین

Application of a random forest algorithm to estimate marker effects and identify candidate genes for reproductive traits in Iranian Holstein dairy cattle

J. Jabbari Tourchi, S. Alijani *, S. A. Rafat, M. A. Abbasi

Animal Production Research, Volume:13 Issue: 1, 2024, PP 95 -109

Introduction

The genome-wide association study (GWAS) is a powerful approach to identify genomic regions associated with fertility traits that explain a significant portion of the genetic variance associated with these traits and identify the relevant causal mutations. Evaluating the correlation between each genotyped marker and trait is an essential strategy for GWAS studies that examine the effects of all markers by considering their possible interactions, environmental factors, and even mutual effects between markers. Recently, machine learning methods have been introduced to genomic topics, and the basis of these methods is different from the common methods of genomic evaluation. The machine learning method is used to estimate the genomic breeding values of the candidate animals by considering the training data (genotypic and phenotypic information of the reference population). One of the key advantages of this method is the ability to analyze large data. Machine learning is a branch of artificial intelligence whose goal is to achieve machines that can extract knowledge (learning) from the environment. A variety of machine learning methods (random forest, boosting, and deep learning) are used to model genetic variance and environmental factors, study gene networks, GWAS, study epistasis effects, and genomic evaluation. Random forest is one of the machine learning methods that has been successfully used in various fields of science. This research was conducted to identify markers and genes related to reproductive traits such as calving interval (CI), days open (DO), daughter pregnancy rate (DPR), and age at first calving (AFC) in Iranian Holstein dairy cattle. These traits have already been investigated with the ssGBLUP method and using a smaller sample size. However, in the present research, by using more genotyped animals, a random forest algorithm was used to identify markers and genes related to reproductive traits.

Materials and methods

The records used in this research were provided by the National Animal Breeding Center and Promotion of Animal Products of Iran and included AFC, DO, CI, and DPR records of the genotyped bulls' daughters. The pedigree information of 2774183 animals and the marker genotypes of 2419 Holstein bulls were used. Genomic data quality control was based on the number of genotyped SNPs per animal (ACR), the number of genotyped animals per SNP (CR), Hardy-Weinberg equilibrium (HWE), and minor allele frequency (MAF). First, markers with a MAF below 5% were removed. Next, samples with a genotyping rate below 90% were identified and removed. Then, markers with a genotyping rate below 95% across the samples were identified and removed. Finally, SNPs that deviated from HWE (P < 10⁻⁶) were excluded from the analysis as a safeguard against genotyping error. Quality control was performed with PLINK 1.9 software, and the random forest analysis was run with Ranfog software in a Linux environment.
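The filtering order described above can be sketched in plain Python. This is an illustrative sketch only, not the PLINK 1.9 procedure the authors ran: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the chi-square HWE test (PLINK itself uses an exact test) are all assumptions made for the example.

```python
import math

def call_rate(values):
    """Fraction of non-missing genotypes (None marks a missing call)."""
    return sum(v is not None for v in values) / len(values) if values else 0.0

def maf(col):
    """Minor allele frequency of one SNP column coded 0/1/2."""
    called = [g for g in col if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def hwe_pvalue(col):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (simplified)."""
    called = [g for g in col if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: nothing to test
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def quality_control(geno, maf_min=0.05, sample_cr_min=0.90,
                    snp_cr_min=0.95, hwe_p_min=1e-6):
    """Apply the four filters in the order given in the abstract.

    geno is a list of animals, each a list of SNP genotypes.
    Returns the indices of the animals and SNPs that survive.
    """
    n_snps = len(geno[0])
    # 1) drop markers with low minor allele frequency
    snps = [j for j in range(n_snps)
            if maf([row[j] for row in geno]) >= maf_min]
    # 2) drop samples with a low genotyping rate
    animals = [i for i, row in enumerate(geno)
               if call_rate([row[j] for j in snps]) >= sample_cr_min]
    # 3) drop markers with a low genotyping rate across remaining samples
    snps = [j for j in snps
            if call_rate([geno[i][j] for i in animals]) >= snp_cr_min]
    # 4) drop markers that deviate strongly from HWE
    snps = [j for j in snps
            if hwe_pvalue([geno[i][j] for i in animals]) >= hwe_p_min]
    return animals, snps
```

Because each filter runs on the output of the previous one, the order matters: a sample's genotyping rate is computed only over the markers that passed the MAF filter.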

Results and discussion

Using the random forest algorithm, a total of 21 important SNPs were identified. Candidate genes for fertility traits were then identified by the gene ontology method; 62 genes lay within 250 Kb of these SNPs. The most significant SNP was observed for AFC. The main SNP for AFC was ARS-BFGL-NGS-22647 (BTA3), for CI ARS-BFGL-NGS-114194 (BTA11), for DO BTA-74076-no-rs (BTA5), and for DPR ARS-BFGL-NGS-32553 (BTA26). Researchers who studied fertility traits in Nellore cattle using machine learning methods identified the MPZL1 and CD247 genes on chromosome 3, and these genes were associated with age at first calving. Many cell biology pathways affect reproductive performance. The CD247 gene has been reported to be related to biological pathways including cell development and function. The IFFO2 gene has been shown to play an important role in the molecular structure of cells, as well as in blastocyst formation, embryo development, and gestation length in cattle. A study of flagellum structure and sperm maturation in mice reported a role for the ALDH4A1 gene in the sperm maturation process. An association of the RPS6KC1 gene with pregnancy rate and antral follicle number in Nellore heifers has been reported. The KAT2B gene is a transcriptional activator that plays an essential role in regulating histone acetylation and an important role in carcass quality, muscle and fat development, and metabolism in native Chinese cattle. In addition, these genes play a key role in regulating biological processes and are related to cell growth, metabolism, and immune system function.

Conclusions

According to the objectives of this research, new information on markers and candidate genes related to reproductive traits in Iranian Holstein dairy cattle was reported. The markers and candidate genes identified in the present research can be used in genomic selection to improve the reproductive traits of Holstein dairy cattle.

Keywords: Random Forest Algorithm, Genotype, Dairy Cow, Marker, Machine Learning

پیش بینی مکانی حساسیت زمین لغزش با استفاده از الگوریتم های پیشرفته یادگیری ماشین (مطالعه موردی: شهرستان سروآباد، استان کردستان)

بهارک معتمدوزیری *، هیمن راست خدیو، سیداکبر جوادی، حسن احمدی

نشریه تحقیقات منابع طبیعی تجدید شونده، سال چهاردهم شماره 2 (پیاپی 40، پاییز و زمستان 1402)، صص 87 -100

وقوع رخداد زمین لغزش در مناطق کوهستانی ممکن است به زیرساخت ها از جمله جاده ها آسیب جدی وارد کند، همچنین ممکن است به مرگ ومیر انسان ها منجر شود. هدف از انجام این مطالعه، پیش بینی مکانی خطر زمین لغزش با استفاده از الگوریتم های پیشرفته داده کاوی در شهرستان سروآباد (استان...) است. در این مطالعه، پتانسیل یابی خطر زمین لغزش با استفاده از دو الگوریتم پیشرفته داده کاوی شامل جنگل تصادفی (RF) و درخت تصمیم (DT) انجام شد. ابتدا، فایل نقطه ای 166 زمین لغزش رخ داده در شهرستان سروآباد به عنوان نقشه موجودی زمین لغزش در نظر گرفته شد. به منظور تهیه مدل و اعتبار سنجی آن، نقاط زمین لغزش به دو بخش داده های آموزشی (70 درصد) و داده های اعتبارسنجی (30 درصد) تقسیم می شوند. در مجموع 16 پارامتر شامل شیب، جهت جغرافیایی، ارتفاع از سطح دریا، فاصله از آبراهه، فاصله از جاده، تراکم رودخانه، فاصله از گسل، تراکم گسل، تراکم جاده، بارندگی، کاربری و پوشش اراضی، شاخص NDVI، لیتولوژی، زمین لرزه، شاخص توان آبراهه (SPI) و شاخص رطوبت توپوگرافی (TWI) به منظور پهنه بندی خطر زمین لغزش استفاده شدند. در نهایت، عملکرد مدل ها با استفاده از منحنی مشخصه عملکرد سیستم (ROC) مورد بررسی قرار گرفت. نتایج تحلیل منحنی ویژگی عملگر نسبی نشان داد که مدل های درخت تصمیم و جنگل تصادفی به ترتیب دارای مقدار AUC برابر 942/0 و 951/0 می باشند؛ بنابراین مدل جنگل تصادفی نسبت به درخت تصمیم دارای بالاترین مقدار AUC بوده و بهترین مدل برای پیش بینی خطر زمین لغزش در آینده در منطقه مورد مطالعه می باشد. نقشه های پتانسیل وقوع زمین لغزش، ابزارهای کارآمدی بوده؛ به طوری که می توان آن ها را برای مدیریت زیست محیطی، برنامه ریزی کاربری زمین و توسعه زیرساخت ها مورد استفاده قرار داد.

کلید واژگان: مخاطرات طبیعی, الگوریتم جنگل تصادفی, الگوریتم درخت تصمیم, استان کردستان

Landslide susceptibility mapping using advanced machine learning algorithms (Case study: Sarovabad city, Kurdistan province)

Hemen Rastkhadiv, Baharak Motamedvaziri*, Seied Akbar Javadi, Hasan Ahmadi

Journal of Renewable Natural Resources Research, Volume:14 Issue: 2, 2024, PP 87 -100

Landslides in mountainous areas may cause serious damage to infrastructure such as roads and may also lead to human deaths. The purpose of this study is therefore to map landslide susceptibility in Sarovabad city using advanced machine learning algorithms. Landslide susceptibility was determined using two advanced data mining algorithms, random forest (RF) and decision tree (DT). First, a point file of 166 landslides that occurred in Sarovabad city was taken as the landslide inventory map. The landslide points were divided into training data (70%) and validation data (30%). A total of 16 parameters, including slope, aspect, elevation, river proximity, road proximity, river density, fault proximity, fault density, road density, precipitation, land use, NDVI, lithology, earthquake, stream power index (SPI), and topographic wetness index (TWI), were used for landslide susceptibility mapping. Finally, the performance of the models was evaluated using the receiver operating characteristic (ROC) curve. The ROC results showed that the decision tree and random forest models have AUC values of 0.942 and 0.951, respectively. The random forest model therefore has the higher AUC value and was the best model for predicting future landslide risk in the study area. Landslide potential maps are efficient tools that can be used for environmental management, land use planning, and infrastructure development.
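The 70/30 split and the AUC comparison can be illustrated with a small standard-library sketch. It does not reproduce the authors' workflow: the function names, the fixed random seed, and the use of non-landslide points as the negative class are assumptions for illustration (the abstract mentions only the landslide points).

```python
import random

def split_points(points, train_fraction=0.7, seed=0):
    """Randomly split inventory points into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive
    (landslide, label 1) receives a higher score than a randomly
    chosen negative (label 0); ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With 166 inventory points, `split_points` yields 116 training and 50 validation points. Two models are then compared by calling `roc_auc` on the validation labels with each model's predicted susceptibility scores.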

Keywords: Decision tree algorithm, Kurdistan province, Landslide, Natural hazards, Random forest algorithm

ارزیابی تغییرات مکانی- زمانی کاربری/ پوشش زمین و شوری خاک و تاثیر آن بر مدیریت مناطق خشک (مطالعه موردی: بخشی از حوضه سیستان، جنوب شرقی ایران)

سجاد کربلایی صالح، سولماز عموشاهی*، اکرم سنایی

نشریه محیط زیست طبیعی، سال هفتاد و هفتم شماره 1 (بهار 1403)، صص 107 -121

اثرات منفی شوری خاک بر محیط های طبیعی و انسانی این پدیده را به یکی از تهدیدات جدی در مدیریت پایدار مناطق خشک و نیمه خشک تبدیل کرده است. بنابراین، در مطالعه حاضر تغییرات مکانی- زمانی شوری خاک و تغییرات کاربری/ پوشش زمین در بخشی از حوضه سیستان واقع در مناطق خشک جنوب شرقی ایران که در سال های اخیر در معرض پدیده شوری خاک قرار گرفته است مورد بررسی قرار گرفت. در این مطالعه با استفاده از اندازه گیری های حاصل از نمونه برداری های زمینی و ابزارهایی نظیر سنجش از دور (RS) و سامانه اطلاعات جغرافیایی (GIS) نقشه های کاربری/ پوشش زمین و شوری خاک برای سال های 1989 و 2019 تهیه شد. بر اساس نتایج، میزان میانگین نرمال شده شوری خاک در سال 1989 برابر 322/0 بوده و در سال 2019 این میزان با رشد حدود 188/0 به 52/0رسیده است. همچنین، نتایج حاصل از مقایسه روند افزایش شوری و تغییرات کاربری/ پوشش زمین در منطقه نشان می دهد که این دو عامل متقابلا تاثیر به سزایی بر یکدیگر دارند. از سوی دیگر، تبدیل کاربری/ پوشش زمین از کاربری های کشاورزی آبی و بسترهای آبی به کاربری های مناطق بایر، کشاورزی دیم و مناطق انسان ساخت موجب کاهش پوشش گیاهی و مناطق آبی در منطقه شده که به دلیل ایجاد فرسایش حاصل از بادهای 120 روزه و نشست ذرات نمک در کل منطقه موجب افزایش شوری خاک می شود. اگرچه احداث چاه نیمه ها در این منطقه، اندکی از مشکلات محیط زیستی آن کاسته است. با این حال، بر طبق نتایج، این چاه نیمه ها نتوانسته اند اثرات منفی حاصل از تخریب زیستگاه ها و نیز خشکی بخشی از دریاچه هامون و رودخانه هیرمند را به طور کامل جبران کنند.

کلید واژگان: تغییرات شوری خاک, تغییرات کاربری, پوشش زمین, مناطق خشک, الگوریتم جنگل تصادفی, گوگل ارث انجین (GEE)

Assessing Spatio-temporal Variations in Land Use/Land Cover and Soil Salinity and their Impact on Managing Dry Areas (Case Study: A Part of Sistan Basin, Southeast of Iran)

Sajjad Karbalaei Saleh, Solmaz Amoushahi *, Akram Sanaei

Journal of Natural Environment, Volume:77 Issue: 1, 2024, PP 107 -121

The negative effects of soil salinity on natural and human environments have turned this phenomenon into one of the serious threats to the sustainable management of arid and semi-arid areas. The present study evaluates the spatio-temporal variations in land use/land cover and soil salinity in a part of the Sistan basin, located in the arid regions of southeastern Iran, which has been exposed to soil salinization in recent years. To this end, land use/land cover and soil salinity maps were prepared for 1989 and 2019 using ground sampling measurements together with remote sensing (RS) and geographic information system (GIS) tools. Based on the results, the normalized average soil salinity was 0.322 in 1989 and reached 0.52 in 2019, a reported growth of about 0.188. In addition, comparing the trend of salinity increase with the land use/land cover changes in the region indicates that the two factors affect each other significantly. Further, the conversion of irrigated agriculture and water bodies to bare lands, rainfed agriculture, and man-made areas has decreased vegetation cover and water areas, leading to an increase in soil salinity due to erosion by the 120-day wind of Sistan and the deposition of salt particles across the whole region. Although the Chah-nimehs have slightly reduced the region's environmental problems, they have not been able to fully compensate for the adverse effects of habitat destruction and the drying up of part of Hamun Lake and the Hirmand River.

Keywords: Soil salinity changes, Land use, land cover changes, Arid areas, random forest algorithm, Google Earth Engine (GEE)

ارزیابی نقشه حساسیت زمین لغزش با استفاده از الگوریتم یادگیری ماشین و اولویت بندی عوامل موثر بر وقوع زمین لغزش

علی دسترنج*، حمزه نور، فرزانه وکیلی تجره

مجله علوم و مهندسی آبخیزداری ایران، پیاپی 63 (زمستان 1402)، صص 71 -83

هدف از مطالعه پیش رو، مدل سازی مکانی حساسیت وقوع زمین لغزش با استفاده از الگوریتم یادگیری ماشین جنگل تصادفی و اولویت بندی عوامل موثر بر وقوع زمین لغزش در حوزه آبخیز بار نیشابور، استان خراسان رضوی است. الگوریتم جنگل تصادفی مبتنی بر دسته ای از درخت های تصمیم است و در حال حاضر یکی از بهترین الگوریتم های یادگیری ماشین است. برای این منظور، لایه نقشه پراکنش زمین لغزش های منطقه شامل 73 زمین لغزش تهیه و به دو دسته برای آموزش مدل (70 درصد) و اعتبارسنجی مدل (30 درصد) به صورت تصادفی تقسیم شدند. همچنین، 16 عامل موثر بر وقوع زمین لغزش در منطقه موردمطالعه با توجه به مرور منابع گسترده شناسایی و لایه های رقومی در سامانه اطلاعات جغرافیایی تهیه شد. به منظور ارزیابی قدرت پیش بینی مدل از مساحت زیر منحنی تشخیص عملکرد نسبی (ROC) برای دو مرحله آموزش و اعتبارسنجی مدل استفاده شد. نتایج ارزیابی مدل نشان داد که مدل جنگل تصادفی با مقادیر سطح زیر منحنی 0/9 دارای دقت عالی در مرحله آموزش و 0/89 دارای دقت خیلی خوب در مرحله اعتبارسنجی است. نتایج اولویت بندی عوامل موثر بر وقوع زمین لغزش در منطقه موردمطالعه نشان داد که عوامل طول شیب و شیب دارای بیشترین اهمیت هستند. بر اساس نتایج مدل جنگل تصادفی، 23/7 درصد منطقه موردمطالعه در پهنه حساسیت خیلی زیاد و زیاد واقع شده است.

کلید واژگان: الگوریتم جنگل تصادفی, حوزه بار, ROC

Landslide Susceptibility Mapping Using Machine Learning Algorithms and Prioritization of Factors Affecting the Occurrence of Landslides

Ali Dastranj*, Hamzeh Noor, Farzaneh Vakili

Iranian Journal of Watershed Management Science and Engineering, Volume:17 Issue: 63, 2024, PP 71 -83

The aim of this study was to model landslide susceptibility using the random forest machine learning technique and to prioritize the factors affecting landslide occurrence in the Bar watershed in Khorasan Razavi province. The random forest algorithm is based on an ensemble of decision trees and is currently one of the best machine learning algorithms. For this purpose, a landslide inventory map was created with 73 historical landslides, which were randomly divided into two datasets for model training (70%) and model testing (30%). A total of 16 landslide-conditioning factors were considered for landslide susceptibility mapping. The random forest algorithm was run and a landslide susceptibility map was prepared. The RF-based model was validated using the area under the receiver operating characteristic (ROC) curve. The evaluation indicated that the area under the curve was 0.90 in training (success rate) and 0.89 in validation (prediction rate). These results confirm the ability of the random forest method to predict landslide susceptibility. Prioritization of the effective factors showed that slope length and slope had the greatest effect on landslide occurrence. Based on the results of the random forest model, 23.7% of the study area lies in the very high and high susceptibility zones.

Keywords: Random Forest algorithm, Bar watershed, ROC

برآورد زی توده روی زمینی توده های جنگلی دست کاشت عرب داغ استان گلستان با استفاده از داده های ماهواره ای سنتینل 2

حسان علی، جهانگیر محمدی*

فصلنامه پژوهشهای علوم و فناوری چوب و جنگل، سال سی‌ام شماره 4 (زمستان 1402)، صص 93 -110

سابقه و هدف

جنگل های دست کاشت امروزه یکی از مهم ترین منابع ذخیره کربن جنگلی و از عوامل کاهش دهنده روند تخریبی عرصه های طبیعی هستند. زی توده روی زمینی درختان نقش اساسی در مدیریت پایدار جنگل و در کاهش روند گرم شدن کره زمین و یک منبع اطلاعاتی مهم محسوب می شود. معادلات آلومتریک ابزاری مهم برای کمی کردن زی توده روی زمینی درختان در جنگل ها هستند. در سال های اخیر، فنون سنجش ازدور با استفاده از روش های ناپارامتریک مانند الگوریتم رندوم فارست به طور گسترده برای برآورد زی توده درختان جنگل مورد استفاده قرارگرفته است. در این تحقیق قابلیت داده های سنتینل 2 با استفاده از الگوریتم رندوم فارست در برآورد زی توده روی زمینی توده های جنگلی دست کاشت عرب داغ استان گلستان مورد ارزیابی قرار گرفت.

مواد و روش ها

در این مطالعه از اطلاعات زی توده 180 قطعه نمونه دایره ای به مساحت 400 مترمربع حاصل از روش نمونه برداری خوشه ای استفاده شد. همچنین مختصات مراکز قطعات نمونه با استفاده از DGPS ثبت شد. زی توده روی زمینی قطعات نمونه از معادلات آلومتریک تهیه شده است. در این بررسی از داده های سنتینل 2 که پیش پردازش رادیومتری و هندسی شده بودند استفاده شد و براساس آن، شاخص های مختلف پوشش گیاهی تهیه شد. در اجرای الگوریتم رندوم فارست ارتباط بین مشخصه ی زی توده به عنوان متغیر وابسته و ارزش های طیفی شاخص های گیاهی تهیه شده به عنوان متغیرهای مستقل مورد بررسی قرار گرفت. مدل سازی با استفاده از 75 درصد قطعات نمونه (135 قطعه نمونه) با استفاده الگوریتم رندوم فارست صورت گرفت وارزیابی برآوردها با استفاده از 25 درصد قطعات نمونه (45 قطعه نمونه) انجام شد.

یافته ها

نتایج نشان داد که در بین متغیرهای مستقل مورد استفاده شاخص ها NDVI و GNDVI دارای بیشترین همبستگی در برآورد زی توده روی زمینی را داشتند و الگوریتم رندوم فارست با 310 درخت و 5 پیش بینی کننده و درصد مجذور میانگین مربعات خطا 83/35 درصد و ضریب تبیین 51/0 توانسته است که زی توده روی زمینی توده های دست کاشت عرب داغ را برآورد نمایند. همچنین نتایج نشان داد که الگوریتم رندوم فارست با استفاده از داده های سنتینل 2، مقادیر زی توده روی زمینی درختان را بیشتر از مقدار واقعی برآورد نموده اند. بین مقادیر زی توده روی زمینی برآورد شده و واقعی تفاوت معنی داری در سطح احتمال 95 درصد وجود ندارد (p-value > 0.05).

نتیجه گیری

نتایج این تحقیق نشان داد که داده های سنتینل 2 با دقت قابل قبول توانسته اند زی توده روی زمینی توده های دست کاشت عرب داغ را برآورد نمایند و باتوجه به نتایج حاصل شده در این مقاله می توان گفت که اطلاعات باندهای اصلی و شاخص های طیفی نقش مهم در برآورد زی توده روی زمینی داشتند.

کلید واژگان: الگوریتم رندوم فارست, مادون قرمز نزدیک, معادلات آلومتریک, نمونه برداری خوشه ای, سنجش از دور

Estimation of above-ground biomass of Arabdagh reforested stands, Golestan province using Sentinel-2 satellite data

Hassan Ali, Jahangir Mohammadi *

Wood & Forest Science and Technology, Volume:30 Issue: 4, 2023, PP 93 -110

Background and objectives

Today, reforested stands are one of the most important sources of forest carbon storage and one of the factors that reduce the process of destruction of natural areas. Above-ground biomass (AGB) plays an essential role in sustainable forest management and reducing global warming and is an important source of information. Allometric equations are an important tool for quantifying above-ground biomass in forests. In recent years, remote sensing techniques using non-parametric methods such as the Random Forest algorithm have been widely used to estimate forest tree biomass. In this research, the ability of Sentinel 2 data using the random forest algorithm to estimate the above-ground biomass of Arabdagh reforested stands in Golestan province was evaluated.

Materials and methods

In this study, 180 circular sample plots with an area of 400 square meters were laid out using the cluster sampling method, and the diameter at breast height (DBH) and tree height (H) were measured. The exact coordinates of the plot centers were recorded using DGPS. The above-ground biomass of the trees was then calculated using allometric equations. Sentinel 2 data that had been radiometrically and geometrically pre-processed were used, and different vegetation indices were derived from them. In the random forest algorithm, the relationship between biomass as the dependent variable and the spectral values of the vegetation indices as independent variables was investigated. Modeling was done with 75% of the sample plots (135 plots) using the random forest algorithm, and the estimates were validated with the remaining 25% (45 plots).

Results

The results showed that, among the independent variables used, the NDVI and GNDVI indices had the highest correlation with above-ground biomass. The random forest algorithm, with 310 trees and 5 predictors, estimated the above-ground biomass of the Arabdagh reforested stands with a percentage root mean square error of 35.83% and a coefficient of determination of 0.51. The results also showed that, using Sentinel 2 data, the random forest algorithm overestimated the above-ground biomass of trees relative to the actual values. There is no significant difference at the 95% probability level between the estimated and actual above-ground biomass values (p-value > 0.05).
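The two accuracy measures reported above can be computed as in the sketch below. The definitions are assumptions, since the abstract does not state its formulas: percentage RMSE is taken here as RMSE divided by the mean observed biomass, and the coefficient of determination as one minus the ratio of residual to total sum of squares.

```python
import math

def rmse_percent(observed, predicted):
    """Root mean square error as a percentage of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In a validation like the one described, `observed` would hold the allometric biomass of the 45 hold-out plots and `predicted` the corresponding random forest estimates.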

Conclusion

The results of this research showed that Sentinel 2 data has been able to estimate the above-ground biomass of Arabdagh reforested stands with acceptable accuracy. According to the results of this article, it can be said that the information of the main bands and spectral indices played an important role in the estimation of above-ground biomass.

Keywords: Random Forest Algorithm, Near-infrared band, Allometric equation, Remote sensing

ارزیابی مکانی مناطق در معرض ریسک فرونشست زمین در روستای فدافن، شهرستان کاشمر

احسان حسین نژاد مکی، مهدی بشیری*، حمیدرضا مرادی

نشریه راهبردهای توسعه روستایی، سال نهم شماره 3 (پیاپی 35، پاییز 1401)، صص 411 -426

رشد جمعیت همراه با توسعه صنعت و کشاورزی، افزایش مصرف آب را به دنبال داشته است. محدودیت منابع آب های سطحی، باعث برداشت بیش ازحد از سفره های آب زیرزمینی گردیده و پیامدهای جبران ناپذیری را بر منابع آب و محیط زیست کشور وارد کرده است؛ از جمله پدیده فرونشست، که اغلب دشت های کشور را فراگرفته است. هدف این پژوهش، شناسایی عوامل موثر و مناطق در معرض ریسک فرونشست در روستای فدافن کاشمر است. جهت پهنه بندی ریسک، طی سال 1398،عوامل سنگ شناسی، کاربری اراضی، خاک شناسی، میزان برداشت از آبخوان، فاصله از آبراهه، گسل، چاه های بهره برداری، چشمه و قنات ها و نیز عوامل ژیومورفولوژی شامل شیب، جهت و ارتفاع بررسی و هر یک از عوامل، تبدیل به یک لایه اطلاعاتی شد و با الگوریتم جنگل تصادفی در نرم افزار R، مدل سازی و ارزیابی انجام گردید. سپس جهت تعیین نواحی مستعد فرونشست، نقشه های پهنه بندی ریسک در پنج کلاس با دو روش ارزش اطلاعاتی و تراکم سطح در محیط ArcGIS استخراج شدند. نتایج نشان داد در روش های تراکم سطح و ارزش اطلاعات به ترتیب 01/97 و 04/91 درصد فرونشست ها درکلاس خطر خیلی زیاد و زیاد قرار گرفته است. بنابراین هر دو روش در پهنه بندی مناطق در معرض ریسک، موفق عمل کرده اند و عوامل برداشت از آبخوان و کاربری اراضی بیشترین اهمیت در وقوع فرونشست را دارند. همچنین براساس منحنی ROC، الگوریتم جنگل تصادفی با دقت بسیار بالا (93 درصد)، نتایج خوبی در اولویت بندی و اهمیت عوامل موثر در فرونشست ارایه کرده است و بخش جنوبی منطقه با کاربری مرتع، بیشترین ریسک و زراعت آبی در منطقه، کمترین ریسک در توسعه مکانی فرونشست ها را دارد.در نتیجه مدیریت تغذیه آبخوان با پخش سیلاب ها و کاهش استحصال آب در جنوب منطقه می تواند در کاهش ریسک وقوع و توسعه فرونشست ها موثر و کاربردی باشد.

کلید واژگان: آبخوان, الگوریتم جنگل تصادفی, حساسیت به فرونشست, سیستم اطلاعات جغرافیایی, مدل سازی

Spatial assessment of areas at risk of land subsidence in the Fadafan village, Kashmar County

Ehsan Hosein-Nezhad-Makki, Mehdi Bashiri *, Hamid-Reza Moradi

Rural Development Strategies, Volume:9 Issue: 3, 2022, PP 411 -426

Population growth, along with the development of industry and agriculture, has led to an increase in water consumption. Limited surface water resources have led to over-extraction from groundwater aquifers, with irreparable consequences for the country's water resources and environment, including land subsidence, which has affected most of the country's plains. The present research aims to identify the effective factors and the areas at risk of subsidence in Fadafan village of Kashmar. For risk zoning, during 2019, lithology, land use, pedology, aquifer extraction rate, and distance from streams, faults, exploitation wells, springs, and aqueducts, as well as geomorphological factors including slope, aspect, and elevation, were studied and each factor was converted into an information layer; modeling and evaluation were then performed using the random forest algorithm in R software. Then, to determine the areas prone to subsidence, risk zoning maps in five classes were extracted using two methods, information value and area density, in the ArcGIS environment. The results showed that with the area density and information value methods, 97.01% and 91.04% of subsidence occurrences, respectively, fell in the very high and high risk classes. Both methods were therefore successful in risk zoning, and aquifer extraction and land use were the most important factors in subsidence. Based on the ROC curve, the random forest algorithm, with very high accuracy (93%), gave good results in prioritizing the factors affecting subsidence and ranking their importance. The southern part of the region, under rangeland use, has the highest risk of spatial development of land subsidence, and irrigated agriculture has the lowest. As a result, managing aquifer recharge by spreading floodwater and reducing water extraction in the southern part of the region can be effective and practical in reducing the risk of occurrence and development of subsidence.

Keywords: aquifer, Random Forest Algorithm, Geographic information system, Modeling, Susceptibility to subsidence

آشکارسازی درختان پرتقال و تشخیص تنش گیاهی بر اساس داده های طیفی اخذ شده از پهپاد

مژده میرکی، هرمز سهرابی*، پرویز فاتحی

نشریه پژوهش در علوم باغبانی، سال یکم شماره 1 (پیاپی 0، بهار و تابستان 1401)، صص 27 -40

یکی از الزامات برای رسیدن به کارایی و حداکثر بهره وری، پایش و بررسی مداوم محصولات باغی و کشاورزی است. اما روش های سنتی پایش باغ ها بسیار وقت گیر، پرکار و پرهزینه است. هواپیماهای بدون سرنشین (پهپادها) با استفاده از تصویربرداری رنگی واقعی (قرمز/ سبز/آبی) ، ممکن است یک گزینه اقتصادی مناسب برای تشخیص تنش و بیماری ها باشند. این تحقیق با دو هدف متفاوت، توانایی داده های پهپاد در 1) آشکارسازی درختان و 2) ارزیابی سلامت مرکبات انجام گرفته است. به همین منظور تصویربرداری و برداشت زمینی درختان در تیر ماه 1398 انجام شد. پس از پردازش تصاویر و تولید مدل ارتفاع تاج در پنج اندازه پیکسل مختلف، آشکارسازی درختان پرتقال با استفاده از الگوریتم رشد ناحیه ای انجام شد. مدل ارتفاع تاج با اندازه پیکسل 50 سانتی متر، بالاترین مقدار صحت کلی (63/0) را ارائه داد. در گام بعدی، اورتوموزاییک جنگل مورد مطالعه با انداره پیکسل دو و نیم سانتی متر با استفاده از الگوریتم ساختار حرکت مبنا تولید شد. سپس باندها و شاخص های پوشش گیاهی به دست آمده از اورتوموزاییک و فضای رنگی به عنوان داده ورودی در الگوریتم طبقه بندی جنگل تصادفی مورد استفاده قرار گرفت. نتایج طبقه بندی تشخیص درختان دارای تنش را با صحت کلی 69 درصد نشان داد که کارآیی پهپادها را برای آگاهی دادن به ذینفعان نشان می دهد.

کلید واژگان: پهپاد, فضای رنگی, استرس, الگوریتم جنگل تصادفی, الگوریتم رشد ناحیه ای

Citrus trees identification and trees stress detection based on spectral data derived from UAVs

Mojdeh Miraki, Hormoz Sohrabi *, Parviz Fatehi

Journal of Research in Horticultural Sciences, Volume:1 Issue: 1, 2022, PP 27 -40

One of the requirements for achieving efficiency and maximum productivity is the continuous monitoring and inspection of horticultural and agricultural products. Traditional plant monitoring and evaluation methods are time-consuming, labor-intensive, and costly. Unmanned aerial vehicles (UAVs) using true color imaging (red/green/blue) may be an economically viable option for recognizing stress and disease. In this paper, the ability of UAV images was evaluated for identifying citrus trees and determining their health using a simple method. For this purpose, the study area was photographed and surveyed in June 2019. The region growing algorithm was tested on a series of canopy height models (CHMs) generated from point clouds across a range of spatial resolutions. The highest overall accuracy for individual tree crown delineation was achieved at a spatial resolution of 50 cm (F-score = 0.63). In the next step, an orthomosaic with a pixel size of 2.5 cm was generated by the structure from motion algorithm. Then the bands and vegetation indices obtained from the orthomosaic and the CIE L* a* b* color space were used as input data in a random forest classification algorithm. The trees were classified into two classes, healthy and unhealthy, and the random forest algorithm was applied using R software. The classification accuracy for the identified trees was assessed using 10-fold cross-validation. The classification reached an overall accuracy of 69%, which demonstrates the effectiveness of UAVs for informing stakeholders.
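The two evaluation steps mentioned above, the F-score for crown detection and 10-fold cross-validation for the classifier, can be sketched as follows. The authors worked in R; this Python sketch, its function names, and the example counts are hypothetical and only illustrate the calculations.

```python
def f_score(true_pos, false_pos, false_neg):
    """F-score of tree detection: harmonic mean of precision and recall."""
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)

def k_fold_indices(n_samples, k=10):
    """Partition sample indices 0..n_samples-1 into k disjoint folds.

    Each fold serves once as the held-out set while the remaining
    k-1 folds are used for training. Shuffle the samples first if
    their order is not already random.
    """
    return [list(range(start, n_samples, k)) for start in range(k)]

def overall_accuracy(reference, predicted):
    """Share of trees whose predicted class matches the reference class."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)
```

As an illustration with made-up counts, 63 correctly detected crowns with 37 false detections and 37 missed trees give `f_score(63, 37, 37) == 0.63`.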

Keywords: UAV, Color Space, Stress, Random Forest Algorithm, Region-Growing Algorithm

استفاده از تصاویر سنتینل-1 جهت پایش خسارت سیلاب فروردین 1399، جنوب استان کرمان براساس الگوریتم جنگل تصادفی

فرشاد سلیمانی ساردو، الهام رفیعی ساردوئی*، طیبه مصباح زاده، علی آذره

مجله علوم و مهندسی آبخیزداری ایران، پیاپی 53 (تابستان 1400)، صص 23 -32

ارزیابی خسارت سیل، جهت مدیریت زود هنگام سیل امری ضروریست. در این مقاله چارچوبی جهت برآورد سریع خسارات سیلاب و شناسایی مناطق سیل زده در فروردین 1399، با استفاده از داده های ماهواره ای Sentinel-1 ارایه شده است. در پژوهش حاضر بعد از اعمال پیش پردازش های لازم در نرم افزار SNAP 6 ضریب پراکنش سیگما صفر هر دو تصویر مربوط به قبل و بعد از وقوع سیل استخراج شد. جهت تفکیک تصویر به دو طبقه آب و غیر آب، از هیستوگرام ضریب پراکنش تصویر استفاده و حدآستانه 01/ 0 به دست آمد. سپس با اعمال عملیات ریاضی روی هر دو تصویر ضریب پراکنش، تصویر باینری آب و غیر آب به صورت صفر و یک تهیه و براساس اختلاف دو تصویر، منطقه سیل زده مشخص گردید. پس از آشکار سازی مناطق سیل زده، تصاویر سنتینل با استفاده از الگوریتم های طبقه بندی نظارت شده به سه کلاس پهنه آبی قبل از سیل، مناطق سیل زده و سایر اراضی طبقه بندی شد. نتایج حاکی از صحت بالای روش طبقه بندی جنگل تصادفی (ضریب کاپا=92/ 0) نسبت به سایر الگوریتم ها بود. با روی هم گذاری نقشه کاربری اراضی و مناطق سیل زده، درصد آب گرفتگی هریک از کاربری ها مشخص شد. بر طبق نتایج، اراضی بایر، مسکونی و مرتع به ترتیب با میزان 9/ 27، 16 و 12 درصد دارای بیش ترین درصد آب گرفتگی بودند.

Utilizing Sentinel 1 Images for Monitoring Damage of Flood Event in March 2020, the South of Kerman Province Based on Random Forest Algorithm

Farshad Soleimani Sardoo, Elham Rafiei Sarooi*, Tayyebeh Mesbahzadeh, Ali Azareh

Iranian Journal of Watershed Management Science and Engineering, Volume:15 Issue: 53, 2021, PP 23 -32

Flood damage assessment is often necessary for early flood management. This paper provides a framework for rapid estimation of flood damage and identification of the flooded areas in March 2020 using Sentinel-1 satellite data. After the necessary pre-processing in SNAP 6 software, the backscattering coefficient (sigma naught) was extracted for two images acquired before and after the flood. The histogram of the backscattering coefficient was used to separate each image into two classes, water and non-water, and a threshold of 0.01 was obtained from it. Then, by applying mathematical operations to both backscattering images, binary water/non-water images were prepared, and the flooded areas were determined from the difference between the two images. After the flooded areas were detected, the Sentinel images were classified into three classes (water body before the flood, flooded area, and other land) using supervised classification algorithms. The results indicated that the random forest algorithm, with a kappa of 0.92, was more accurate than the other algorithms. By overlaying the land use and flooded area maps, the inundation percentage for each land use was determined. According to the results, bare land (27.9 percent), residential land (16 percent), and rangeland (12 percent) had the highest inundation percentages.

Keywords: Radar images, Flood, Damage, Sentinel-1, Random Forest algorithm
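
The thresholding and image-differencing steps described in the flood abstract above can be sketched as follows. This is a minimal illustration on small nested lists rather than real Sentinel-1 rasters. It assumes that water corresponds to backscatter values below the 0.01 threshold, which the abstract does not state explicitly; the function names and sample values are hypothetical.

```python
WATER_THRESHOLD = 0.01  # threshold reported in the abstract; units assumed to match the input

def water_mask(sigma0, threshold=WATER_THRESHOLD):
    """Return a 0/1 grid: 1 where backscatter is below the threshold (assumed water)."""
    return [[1 if value < threshold else 0 for value in row] for row in sigma0]

def flooded_mask(sigma0_before, sigma0_after, threshold=WATER_THRESHOLD):
    """Return a 0/1 grid: 1 where a pixel is water after the flood but not before."""
    before = water_mask(sigma0_before, threshold)
    after = water_mask(sigma0_after, threshold)
    # The difference after - before equals 1 only for newly inundated pixels.
    return [[1 if a - b == 1 else 0 for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]

# Hypothetical 2x2 backscatter grids (not data from the study).
before = [[0.005, 0.08], [0.12, 0.20]]
after = [[0.004, 0.006], [0.009, 0.18]]
flooded = flooded_mask(before, after)
```

In this toy grid the top-left pixel is permanent water (below the threshold in both images), so it is excluded from the flooded mask, while the two pixels that drop below the threshold only after the event are flagged.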

Investigating the capability of Sentinel-2A satellite images on the Google Earth Engine platform for land cover mapping

Naser Ahmadi Sani*

Journal of Renewable Natural Resources Research, Year 11, Issue 2 (Serial No. 34, Autumn and Winter 1399 SH), pp. 89-99

A land cover map shows the spatial distribution of different landscapes, including agriculture, natural resources, water, and man-made resources, and is a valuable tool for management and risk reduction in challenging issues such as drought and its effects, food security, flood control, and urban planning. To overcome the limitations of field work in land cover mapping, satellite images appear suitable because they provide wide-coverage, multispectral, and up-to-date data. In the study area, spatially heterogeneous features also make classification difficult. The main aim of this study is to produce a high-resolution land cover map using Sentinel-2A images on the Google Earth Engine platform. To this end, three classification algorithms (random forest, support vector machine, and decision tree) were evaluated and compared. Numerous indices were derived using spectral transformation and ratioing methods. The accuracy of the classified maps was evaluated against ground reference maps. In the evaluation of single bands, the best overall accuracy, 49 percent, was obtained with the CVI index. The best overall accuracy and kappa coefficient, 86 percent and 0.82 respectively, were obtained with the random forest algorithm. Therefore, while emphasizing the advantages of GEE, including easy access to data and the ability to process and compare them quickly, it can be claimed that Sentinel-2A images are highly efficient for land cover mapping in terms of cost, time, and accuracy, and that this map can be very useful for managing and planning various natural and man-made resources in pursuit of sustainable development.

Keywords: random forest algorithm, remote sensing, vegetation indices, kappa coefficient

Investigation on land cover mapping using Sentinel-2A images in the Google Earth Engine Platform

Naser Ahmadi Sani *

Journal of Renewable Natural Resources Research, Volume:11 Issue: 2, 2021, PP 89 -99

Land cover maps show the spatial distribution of different landscapes such as agriculture, natural resources, water, and man-made areas. They are a valuable tool for managing and reducing risk in challenging issues such as drought and its effects, food security, flood control, and urban planning. To overcome the limitations of field work in land cover mapping, satellite images seem suitable because they provide wide-coverage, multispectral, and up-to-date data. In the study area, spatially heterogeneous landscapes also make it difficult to classify features. Therefore, the main purpose of the study is accurate, high-resolution land cover mapping using Sentinel-2A images on the Google Earth Engine platform. To this end, three classification algorithms (RF, SVM, and CART) were evaluated and compared. Various indices were prepared using ratioing and transformation methods. The accuracy of the classifications was evaluated against ground reference data. Evaluation of individual bands showed that the best overall accuracy (49%) was obtained using the CVI index. The best overall accuracy and kappa coefficient, 86% and 0.82 respectively, were obtained with the RF algorithm. Therefore, in addition to the advantages of GEE, including easily accessible data and the ability to process and compare data quickly, it can be claimed that Sentinel-2A images are highly efficient for land cover mapping in terms of cost, time, and accuracy, and that the resulting map can be very useful for management and decision making on natural and man-made resources in support of sustainable development.

Keywords: Kappa coefficient, Random Forest Algorithm, Remote sensing, vegetation indices
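
Overall accuracy and the kappa coefficient, the two scores reported in the land cover abstract above, are both computed from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation in plain Python; the function names and the example matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what row/column totals give by chance."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical two-class matrix: rows are reference classes, columns are predictions.
example = [[45, 5],
           [10, 40]]
accuracy = overall_accuracy(example)  # 85 of 100 samples are on the diagonal
kappa = cohen_kappa(example)
```

Because kappa subtracts the agreement expected by chance, it is lower than overall accuracy for the same matrix, which is why abstracts such as the one above report both values.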

Investigating the capability and sensitivity of satellite spectral indices for mapping fire severity in forest areas (case study: Arabdagh reforestation, Golestan)

Mhd. Wathek Alhaj Khalaf, Shaban Shataee*, Roghaye Jahdi

Journal of Forest and Wood Products, Year 73, Issue 1 (Spring 1399 SH), pp. 97-110

Producing an accurate fire severity map is important for fire risk management in forest ecosystems. Spectral indices from optical sensors are recognized as acceptable inputs for classification and for showing the spectral differences between vegetation cover classes. In this study, the capability of a set of indices extracted from Sentinel-2 and Landsat-8 satellite images, which have different spatial resolutions, to produce an accurate fire severity map using the random forest algorithm was examined in the part of the Arabdagh reforestations, Golestan Province, that burned in 1397 SH. After the necessary preprocessing, appropriate mono- and bi-temporal indices were created from the images of the sensors under study. Optimal index values for the bands in the two-dimensional pre-/post-fire space were calculated to examine the sensitivity of these bands to the changes occurring within the fire classes. The best result was obtained for the NIR-SWIR2 bands, with an optimal index value of 0.77 for Sentinel-2 and 0.68 for Landsat8-OLI. Based on the optimal index values, the best indices were selected, and the post-fire values of these indices as well as the bi-temporal (pre- and post-fire) indices were extracted. A sample ground truth map of fire severity classes was prepared using selective sampling with field visits to the burned severity classes in the area. Classification with different indices was carried out using the random forest algorithm, and the results were evaluated against the sample ground truth map. The best result was obtained by combining indices from all bands extracted from the Landsat8-OLI sensor using the bi-temporal index method, with a kappa coefficient of 0.96.

Keywords: random forest algorithm, satellite images, bi-temporal spectral index, optimal index, fire severity

Ability and Sensitivity Study of Spectral Indices for Wildfire Severity Mapping (Case Study: Arabdagh-Golestan Reforestations)

Mhd.Wathek Alhaj Khalaf, Shaban Shataee *, Roghaye Jahdi

Journal of Forest and Wood Products, Volume:73 Issue: 1, 2020, PP 97 -110

Fire severity mapping is very important for managing fires in forest ecosystems. Spectral indices extracted from optical sensors are recognized as among the most effective inputs for classifying vegetation classes. In this study, the ability and sensitivity of several spectral indices extracted from Sentinel-2 and Landsat 8-OLI images, which have different spatial resolutions, were investigated for fire severity mapping using the random forest algorithm in a burned part of the reforested area of Arabdagh, Golestan province. After the necessary preprocessing of the bands, appropriate mono- and bi-temporal spectral vegetation indices were created. Optimal index values for the bands in the pre-/post-fire bi-spectral spaces were calculated to evaluate the sensitivity of the bands to the changes occurring within the fire classes. The best results were obtained for the NIR-SWIR2 bands, with an optimal index value of 0.77 for Sentinel-2 and 0.67 for Landsat8-OLI. The best indices were selected based on their optimality index values. The values of these indices were calculated after the fire, as were the differential (pre-/post-fire) indices. The ground truth map of fire severity classes was prepared by selective sampling through field surveying. Classification was performed with different indices using the random forest (RF) algorithm, and the results were assessed against the ground truth points. The best results were obtained with a combination of many differential indices from all band pairs of Landsat 8-OLI, with a kappa coefficient of 0.96.

Keywords: Bi-Spectral indices, Optimality, random forest algorithm, Satellite images, wildfire severity
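
The idea of a differential (pre-/post-fire) index built from the NIR and SWIR2 bands, as mentioned in the abstract above, can be illustrated with a normalized-difference form. The abstract does not list the exact indices used, so the formula below (the same form as the widely used normalized burn ratio) is an assumption, and the function names and reflectance values are hypothetical.

```python
def normalized_difference(band_a, band_b):
    """Normalized difference of two reflectance values: (a - b) / (a + b)."""
    return (band_a - band_b) / (band_a + band_b)

def bitemporal_index(nir_pre, swir2_pre, nir_post, swir2_post):
    """Pre-fire index minus post-fire index; larger values suggest stronger change."""
    pre = normalized_difference(nir_pre, swir2_pre)
    post = normalized_difference(nir_post, swir2_post)
    return pre - post

# Hypothetical reflectances for one pixel: healthy canopy before, burned after.
change = bitemporal_index(nir_pre=0.40, swir2_pre=0.10, nir_post=0.15, swir2_post=0.25)
unchanged = bitemporal_index(nir_pre=0.40, swir2_pre=0.10, nir_post=0.40, swir2_post=0.10)
```

A pixel whose reflectance is the same before and after the fire yields a difference of zero, while burning typically lowers NIR and raises SWIR2 reflectance, which pushes the differenced value upward.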

Spatial estimation of winter rapeseed yield based on non-parametric methods (application in agricultural spatial planning)

Hamed Adab*, Azadeh Atabati, S. Mahdi Pourbagher Kordi, Mohammad Armin, Hasan Zabihi

Journal of Plant Production Research, Year 26, Issue 3 (Autumn 1398 SH), pp. 199-217

Background and objectives

Because of its particular climatic conditions, Khorasan Razavi Province has the capacity needed to cultivate and produce rapeseed, and its northern and central cities are highly suitable for rapeseed cultivation. Before new rapeseed cultivation is expanded in different regions of Iran, the natural and climatic parameters that affect rapeseed yield must first be examined so that the ecological potential of each region for rapeseed cultivation can be identified. Spatial modeling in a geographic information system is one of the most important approaches that, by combining statistical methods and spatial data, can provide a basis for assessing environmental factors and land suitability for cultivating a particular crop. In this study, the spatial relationship between winter rapeseed yield and water, soil, and meteorological factors during the growing period was examined in sample farms.

Materials and methods

In this study, 24 winter rapeseed farms were sampled using a Global Positioning System (GPS) device, and their actual yield was calculated. The values of ten environmental factors (elevation, slope, aspect, groundwater EC and pH, mean temperature, total received direct and diffuse radiation, potential evapotranspiration, wind exposition index, and soil texture) were then extracted for the selected farms using the nearest neighbor method. Next, after normalizing the variables and taking the range of values into account, the samples were randomly divided into training (60 percent, 14 farms) and test (40 percent, 10 farms) sets. Two non-parametric methods, K-nearest neighbors and random forest, were then used to estimate the environmental potential for rapeseed yield, and a rapeseed yield estimation map was produced in a geographic information system environment.

Results

The mean absolute percentage error of the methods was 26 percent for K-nearest neighbors and 11 percent for random forest. The Nash-Sutcliffe efficiency for the test data was 0.65 for K-nearest neighbors and 0.82 for random forest. Overall, the results show that random forest has lower error than K-nearest neighbors in estimating rapeseed production in the study area. Based on the random forest model, the wind exposition index and mean temperature had the greatest influence, while the topographic factors of aspect and elevation had the least. Groundwater pH and EC were also among the important factors for modeled rapeseed grain yield in this study.

Conclusion

The first step toward success in the country's comprehensive oilseed production plan is to identify the country's agro-ecological potential in order to determine areas suitable for cultivation. The results showed that areas suitable for growing winter rapeseed are located in the northern and northwestern parts of the Sabzevar region. The central areas are characterized by low yield, mainly because of gypsum and salt formations in these areas, as well as the Kal-e Shur drainage and salt flats. It is therefore recommended that the northern and northwestern parts of the Sabzevar region be prioritized for expanding cultivation of this crop. Crop yield results from a combination of the plant's genetic makeup and the environmental conditions of cultivation; this study focused on the environmental factors.

Keywords: winter rapeseed yield, K-nearest neighbors method, random forest method, regional agricultural planning, Sabzevar region

Spatial yield prediction of winter rapeseed based on non-parametric methods (Application in spatial agricultural planning)

Hamed Adab *, Azadeh Atabati, S.Mahdi Pourbagher kordi, Mohammad Armin, Hasan Zabihi

Journal of Plant Production, Volume:26 Issue: 3, 2019, PP 199 -217

Background and objectives

Khorasan Razavi province has the potential for growing and producing rapeseed because of its favorable environmental conditions; the northern and central cities of the province are especially well suited to rapeseed cultivation. Modeling the relationship between environmental conditions and yield correctly is a critical step in guiding crop-planting choices in different regions of Iran. Spatial modeling in GIS is one of the most important strategies for this purpose: by combining statistical methods and spatial data, it can provide a basis for assessing environmental factors and land suitability for the cultivation of a particular crop. In this research, the link between yield and water, soil, and meteorological factors during the growing season was modeled in sample farms.

Materials and methods

In this research, the positions of 24 sample rapeseed fields were recorded with the Global Positioning System (GPS), and actual yield was then calculated. To explore how the relationship between environmental conditions and yield varies over space, we used ten environmental parameters that influence rapeseed yield during the growing season: elevation, slope, aspect, groundwater EC and pH, mean air temperature, incoming solar radiation, potential evapotranspiration, wind exposition index, and soil texture. The value of each independent variable was extracted at the sample locations using the nearest neighbor method. Then, after normalizing the variables and taking the range of values into account, the samples were randomly divided into two subsets: training (60%, 14 farms) and testing (40%, 10 farms). Two non-parametric methods, K-nearest neighbors and random forest, were then used to estimate rapeseed yield over the study area.

Results

The mean absolute percentage error was 26% for K-nearest neighbors and 11% for random forest. The Nash–Sutcliffe efficiency index for the validation data set was 0.65 for K-nearest neighbors and 0.82 for random forest. In general, the results indicate that the random forest method has a lower error than the K-nearest neighbors method in estimating rapeseed yield for the study area.

Conclusion

Based on the results of this research, it can be concluded that, among the variables used, the wind exposition index and mean temperature had the greatest effect on rapeseed yield. According to the final map, the areas suitable for rapeseed cultivation in the Sabzevar region are located in its northern and northwestern parts. Low yield in the central parts of the region is mainly due to excessive water salinity and gypsum formations. Crop yield results from a combination of genetic factors and the environmental conditions of cultivation; this study focused on the environmental factors.

Keywords: Winter rapeseed yield, K-Nearest Neighbors Algorithm, Random Forest Algorithm, Regional Sustainable Agriculture, Sabzevar region
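
The two error measures reported in the rapeseed abstract above, mean absolute percentage error (MAPE) and the Nash–Sutcliffe efficiency, can be computed as sketched below. The definitions are the standard ones; the function names and the yield values are hypothetical and are not data from the study.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be non-zero."""
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100 * sum(ratios) / len(ratios)

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - residual / variance

# Hypothetical test-set yields (e.g. tonnes per hectare) and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.8, 3.3, 4.0]
error_percent = mape(observed, predicted)
efficiency = nash_sutcliffe(observed, predicted)
```

A lower MAPE and a Nash–Sutcliffe value closer to 1 both indicate a better fit, which is the sense in which the abstract ranks random forest above K-nearest neighbors.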

Modeling land subsidence hazard using the random forest algorithm (case study: Tasuj plain catchment)

Davoud Mokhtari, Hamid Ebrahimy*, Saeed Salmani

Journal of Remote Sensing and Geographic Information Systems in Natural Resources, Year 10, Issue 3 (Autumn 1398 SH), pp. 93-105

The occurrence of land subsidence and its potential hazards in the Tasuj plain catchment, East Azerbaijan Province, has increased markedly in recent years because of the water crisis and the drought period in the region. To plan for reducing the hazards of land subsidence, it is necessary to identify high-risk areas prone to this phenomenon. This study models land subsidence hazard with the random forest algorithm, using recorded subsidence points and eleven environmental variables that affect subsidence (elevation, slope, aspect, topographic wetness index, distance from streams, drainage density, distance from faults, lithology, land use, groundwater level, and groundwater level decline). The predictive capability and accuracy of the model results were evaluated using the receiver operating characteristic (ROC) curve and the area under this curve (AUC). The evaluation indicates a very suitable model accuracy of 0.86. Based on the model results with the mean decrease in accuracy method, groundwater level, distance from faults, and groundwater level decline had the greatest influence on subsidence potential in the study area. The results also show that 18 and 11 percent of the study area fall in the high-risk and very-high-risk classes for subsidence, respectively, which indicates hazardous conditions in the region. Use of these results by managers and planners can play an effective role in reducing the hazards of land subsidence and can facilitate the proposal and implementation of operational measures.

Keywords: land subsidence, groundwater level, random forest algorithm, Tasuj plain

Land subsidence susceptibility modeling using random forest approach (Case study: Tasuj plane catchment)

Davoud Mokhtari, Hamid Ebrahimy *, Saeed Salmani

Journal of Rs and Gis for natural Resources, Volume:10 Issue: 3, 2019, PP 93 -105

Land subsidence in the Tasuj plane may become more frequent and hazardous in the near future because of its relationship with the water crisis and drought periods. To mitigate the damage caused by land subsidence, it is necessary to identify susceptible areas. The purpose of this study is to produce a land subsidence susceptibility map using the random forest approach. Land subsidence occurrence data and eleven environmental variables that significantly influence land subsidence (altitude, slope, aspect, distance to drainage line, drainage density, distance from the fault, topographic wetness index, land cover, lithology, groundwater level, and decline in groundwater level) were used as inputs to the random forest model. The performance of the model was assessed using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The model results indicate an accuracy of 0.86. Based on the mean decrease accuracy method, the most important conditioning factors were groundwater level, distance from the fault, and decline in groundwater level, in that order. According to the results, about 18% and 11% of the study area fell within the high and very high susceptibility classes, respectively. The results of this study can be used by stakeholders and local authorities to mitigate the hazards of land subsidence in the study area.

Keywords: Land subsidence, Groundwater level, Random forest algorithm, Tasuj plane
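
The area under the ROC curve used to assess the subsidence model above can be computed directly from predicted susceptibility scores, because the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below uses that pairwise definition; the function name, labels, and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for an observed subsidence point, 0 for a non-subsidence point.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical susceptibility scores for three subsidence and three stable points.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation; in this toy example eight of the nine positive-negative pairs are ordered correctly.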

Performance evaluation of three image classification methods (random forest, support vector machine, and maximum likelihood) in land use mapping

Farshid Jahanbakhshi, Mohammadreza Ekhtesasi*

Journal of Water and Soil Science (Science and Technology of Agriculture and Natural Resources), Year 22, Issue 4 (Serial No. 86, Winter 1397 SH), pp. 235-247

Land use/land cover maps are a basic input for many models that simulate the natural environment; therefore, accurate maps derived from satellite image classification reduce uncertainty in modeling. This study aimed to evaluate the accuracy of land use maps produced by machine-learning-based classification methods (the random forest algorithm and support vector machine) and to compare them with the conventional maximum likelihood method. For this purpose, an image from the OLI sensor of the Landsat 8 satellite covering the study area (the Sattarkhan Dam basin in East Azerbaijan) was used after initial corrections. Five land uses were considered: urban, irrigated agriculture, rain-fed agriculture, rangeland, and water body. Ground truth data, split into a training set (70 percent of the samples) and a test set (30 percent), were used for supervised classification. The accuracy of the maps from the three algorithms was compared using accuracy assessment indices. The McNemar test was also used to check for statistically significant differences between the classification results. The results showed that overall accuracy for support vector machine, random forest, and maximum likelihood was 96.6, 90.8, and 90.8 percent, respectively, and the kappa coefficient was 0.934, 0.813, and 0.834, respectively. The McNemar test also confirmed that the performance of support vector machine differed significantly from that of the other two methods at the five percent level.

Keywords: machine learning, non-parametric classifier, McNemar test, random forest algorithm

Performance Evaluation of Three Image Classification Methods (Random Forest, Support Vector Machine and the Maximum Likelihood) in Land Use Mapping

F. Jahanbakhshi, M. R. Ekhtesasi *

Journal of Hydrology and Soil Science, Volume:22 Issue: 4, 2019, PP 235 -247

Land use/cover maps are basic inputs for most environmental simulation models; hence, accurate maps derived from the classification of satellite images reduce the uncertainty in modeling. The aim of this study was to assess the accuracy of maps produced by machine-learning-based classification methods (Random Forest and Support Vector Machine) and to compare them with a common classification method (Maximum Likelihood). For this purpose, an image from the OLI sensor of Landsat 8 covering the study area (the Sattarkhan Dam basin in East Azerbaijan) was used after initial corrections. Five land uses were considered: urban, irrigated agriculture, rain-fed agriculture, range, and water body. For the supervised classification, ground truth data were split into a training set (70% of the total) and a test set (30%). Accuracy indexes were used to compare the maps, and the McNemar test was employed to check for statistically significant differences between the performances of the methods. The results indicate that the overall accuracy of the Support Vector Machine, Random Forest, and Maximum Likelihood methods was 96.6, 90.8, and 90.8%, respectively, and the Kappa coefficient for these methods was 0.93, 0.81, and 0.83, respectively. The McNemar test confirmed a statistically significant difference, at the 95% confidence level, between the performance of the Support Vector Machine algorithm and that of the other two algorithms.

Keywords: Machine learning, Non-parametric classifier, McNemar test, Random forest algorithm
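
The McNemar test used in the abstract above compares two classifiers on the same test samples by counting only the cases where they disagree. The sketch below uses the chi-square form with continuity correction, which is one common variant; the abstract does not say which variant the authors applied, and the function name and sample data are hypothetical.

```python
CHI2_CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

def mcnemar_statistic(truth, pred_a, pred_b):
    """McNemar chi-square statistic with continuity correction.

    b counts samples classifier A gets right and B gets wrong; c is the reverse.
    Samples where both are right or both are wrong do not affect the statistic.
    """
    b = sum(a == t and p != t for t, a, p in zip(truth, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(truth, pred_a, pred_b))
    if b + c == 0:
        return 0.0  # the classifiers never disagree
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical test set of 30 samples: A alone is right on 12, B alone on 2.
truth = [1] * 30
pred_a = [1] * 12 + [0] * 2 + [1] * 16
pred_b = [0] * 12 + [1] * 2 + [1] * 16
statistic = mcnemar_statistic(truth, pred_a, pred_b)
significant = statistic > CHI2_CRITICAL_5PCT_DF1
```

When the statistic exceeds the critical value, the difference between the two classifiers' error patterns is treated as significant at the five percent level, which is the kind of conclusion the abstract draws for Support Vector Machine against the other two methods.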
