Application of a random forest algorithm to estimate marker effects and identify candidate genes for reproductive traits in Iranian Holstein dairy cattle

Article Type:
Research/Original Article (accredited rank)
Abstract:
Introduction
Genome-wide association studies (GWAS) are a powerful approach for identifying genomic regions that are associated with fertility traits and explain a significant portion of their genetic variance, and for locating the relevant causal mutations. An essential strategy in GWAS is to evaluate the association between each genotyped marker and the trait, examining the effects of all markers while accounting for their possible interactions, environmental factors, and mutual effects between markers. Machine learning methods have recently been introduced into genomics, and they rest on a different basis from the conventional methods of genomic evaluation. Machine learning is a branch of artificial intelligence that aims to build machines able to extract knowledge from (learn from) their environment. In genomic evaluation, these methods estimate the genomic breeding values of candidate animals from training data (genotypic and phenotypic information of the reference population), and one of their key advantages is the ability to analyze large data sets. A variety of machine learning methods (random forest, boosting, and deep learning) are used to model genetic variance and environmental factors, to study gene networks and epistatic effects, and to perform GWAS and genomic evaluation. Random forest is one such method that has been applied successfully in various fields of science. This research was conducted to identify markers and genes related to reproductive traits such as calving interval (CI), days open (DO), daughter pregnancy rate (DPR), and age at first calving (AFC) in Iranian Holstein dairy cattle. These traits have previously been investigated with the ssGBLUP method using a smaller sample. In the present research, a random forest algorithm was applied to a larger number of genotyped animals to identify markers and genes related to these traits.
Materials and methods
The records used in this research were provided by the National Animal Breeding Center and Promotion of Animal Products of Iran and included AFC, DO, CI, and DPR records of the genotyped bulls' daughters. Pedigree information on 2,774,183 animals and marker genotypes of 2,419 Holstein bulls were used. Genomic data quality control was based on the number of genotyped SNPs per animal (ACR), the number of genotyped animals per SNP (CR), Hardy-Weinberg equilibrium (HWE), and minor allele frequency (MAF). First, markers with a MAF below 5% were removed. Next, samples with a genotype call rate below 90% were identified and removed, followed by markers genotyped in fewer than 95% of the samples. Finally, SNPs that deviated from HWE (P < 10⁻⁶) were excluded from the analysis as a measure of genotyping error. Quality control was performed with PLINK 1.9, and the random forest analysis was then run with the Ranfog software in a Linux environment.
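The filtering order described above can be sketched in plain Python. This is only an illustrative sketch, not the PLINK 1.9 implementation: the function names, the 0/1/2 genotype coding with None for a missing call, and the dictionary input layout are all assumptions made for the example, and the HWE check uses a chi-square approximation with one degree of freedom, whereas PLINK uses an exact test.

```python
import math


def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(g is not None for g in calls) / len(calls) if calls else 0.0


def maf(calls):
    """Minor allele frequency from 0/1/2 allele counts, ignoring missing calls."""
    obs = [g for g in calls if g is not None]
    if not obs:
        return 0.0
    freq = sum(obs) / (2 * len(obs))
    return min(freq, 1 - freq)


def hwe_pvalue(calls):
    """Chi-square (1 df) approximation to the HWE test; not PLINK's exact test."""
    obs = [g for g in calls if g is not None]
    n = len(obs)
    if n == 0:
        return 1.0
    p = sum(obs) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic marker: no deviation can be tested
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [obs.count(0), obs.count(1), obs.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def quality_control(genotypes, maf_min=0.05, sample_cr_min=0.90,
                    snp_cr_min=0.95, hwe_p_min=1e-6):
    """genotypes: dict mapping animal id -> list of calls, one per SNP.

    Returns (kept animal ids, kept SNP indices), filtering in the order
    MAF, sample call rate, SNP call rate, HWE.
    """
    animals = list(genotypes)
    snps = list(range(len(next(iter(genotypes.values())))))

    def column(j):
        return [genotypes[a][j] for a in animals]

    snps = [j for j in snps if maf(column(j)) >= maf_min]
    animals = [a for a in animals
               if call_rate([genotypes[a][j] for j in snps]) >= sample_cr_min]
    snps = [j for j in snps if call_rate(column(j)) >= snp_cr_min]
    snps = [j for j in snps if hwe_pvalue(column(j)) >= hwe_p_min]
    return animals, snps
```

Because each step works on what the previous step left, the order matters: removing low-MAF markers first changes which samples fall below the 90% call-rate threshold.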
Results and discussion
The random forest algorithm identified a total of 21 important SNPs. Candidate genes for fertility traits were then identified using gene ontology, and 62 genes were located within 250 kb of these SNPs. The most significant SNP was observed for AFC. The main SNP was ARS-BFGL-NGS-22647 (BTA3) for AFC, ARS-BFGL-NGS-114194 (BTA11) for CI, BTA-74076-no-rs (BTA5) for DO, and ARS-BFGL-NGS-32553 (BTA26) for DPR. Researchers who studied fertility traits in Nellore cattle using machine learning methods identified the MPZL1 and CD247 genes on chromosome 3 in association with age at first calving. Many cell biology pathways affect the performance of reproductive traits. The CD247 gene has been linked to biological pathways including cell development and function. The IFFO2 gene has been shown to play an important role in the molecular structure of cells, in the mechanisms of blastocyst and embryo formation, and in gestation length in cattle. A study in mice on flagellum structure and sperm maturation reported a role for the ALDH4A1 gene in the sperm maturation process. An association of the RPS6KC1 gene with pregnancy rate and antral follicle number has been reported in Nellore heifers. The KAT2B gene is a transcriptional activator that plays an essential role in regulating histone acetylation and an important role in improving carcass quality, muscle and fat development, and metabolism in native Chinese cattle. In addition, these genes play a key role in regulating biological processes and are related to cell growth, metabolism, and immune system function.
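The 250 kb search around each important SNP can be illustrated with a short Python sketch. The function name, the tuple layout, and the gene names and coordinates below are hypothetical placeholders, not data from the study, and counting a gene as "within 250 kb" when any part of it overlaps the window is an assumption about how the distance was measured.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window=250_000):
    """Return names of genes overlapping [snp_pos - window, snp_pos + window].

    genes: iterable of (name, chromosome, start_bp, end_bp) tuples.
    """
    lo, hi = snp_pos - window, snp_pos + window
    return [name for name, chrom, start, end in genes
            if chrom == snp_chrom and start <= hi and end >= lo]


# Hypothetical annotation records, for illustration only.
example_genes = [
    ("GENE_A", 3, 1_000_000, 1_050_000),
    ("GENE_B", 3, 1_400_000, 1_450_000),
    ("GENE_C", 5, 1_000_000, 1_050_000),
]

nearby = genes_near_snp(3, 1_100_000, example_genes)
```

Here only GENE_A is returned: GENE_B starts beyond the upper edge of the window and GENE_C lies on a different chromosome.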
Conclusions
In line with its objectives, this research reports new information on markers and candidate genes related to reproductive traits in Iranian Holstein dairy cattle. The markers and candidate genes identified here can be used in genomic selection to improve the reproductive traits of Holstein dairy cattle.
Language:
Persian
Published:
Animal Production Research, Volume:13 Issue: 1, 2024
Pages:
95 to 109
magiran.com/p2734192  