Penalized logistic regression models for phenotype prediction based on Single Nucleotide Polymorphisms

Author(s):

Seyedeh Rezwan Hosseini, Farnaz Ghassemi *, Mohammad Hasan Moradi

Article Type:

Research/Original Article (accredited ranking)

Abstract:

Most studies of phenotype differences, including susceptibility to some diseases, focus on genomic sites known as Single Nucleotide Polymorphisms (SNPs). Some SNPs play an important role in a phenotype or a specific disease on their own, and others do so by interacting with other SNPs. Various models, including regression models, have been designed and implemented to predict these diseases. Because phenotypes can be either quantitative or binary, linear regression, which is based only on the number of minor alleles per SNP, is used to predict quantitative phenotypes, and logistic regression is used for binary ones such as complex diseases. Complex diseases are caused not only by independent SNPs but also by the interaction of a large number of SNPs, a number that usually exceeds the number of samples, so penalized logistic regression is considered a better choice: it can overcome the limitations of ordinary logistic regression on high-dimensional SNP datasets. In this paper, three penalized regression models, Ridge, Lasso and Elastic Net (EN), were applied to 10,000 samples from the SNP datasets of the OWKIN-Inserm Institute to predict the risk of a specific disease (undisclosed for confidentiality reasons). Of the three, the Lasso model with the minimizer lambda achieved the highest accuracy (73.73%) and AUC (83.54%). This model is also less complex, because it eliminates weakly related features as far as possible and keeps only the most informative ones. In addition, the better results obtained with Lasso indicate that multicollinearity among the variables is either absent or low enough to be neglected.
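The difference between the three penalties named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation and does not use their data: the genotypes are synthetic (minor-allele counts 0/1/2), the objective follows the common elastic-net convention (mean log-loss plus lam * [alpha * L1 + (1 - alpha)/2 * L2^2]), and the solver is a plain proximal gradient loop. The names fit_penalized_logistic, lam, alpha, maf and all numeric settings are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward 0 by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_penalized_logistic(X, y, lam, alpha, lr=0.5, n_iter=200):
    """Proximal gradient descent for elastic-net logistic regression.

    alpha = 0 gives Ridge, alpha = 1 gives Lasso, values in between
    give Elastic Net. The intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(p):
            # Gradient step on the smooth part (log-loss plus ridge term).
            z = w[j] - lr * (grad_w[j] / n + lam * (1.0 - alpha) * w[j])
            # Proximal step for the L1 term; this is what yields exact zeros.
            w[j] = soft_threshold(z, lr * lam * alpha)
    return w, b


# Synthetic data (assumption): 200 samples, 15 SNPs, minor allele frequency 0.3.
rng = random.Random(0)
n, p, maf = 200, 15, 0.3
G = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
     for _ in range(n)]
means = [sum(row[j] for row in G) / n for j in range(p)]
X = [[row[j] - means[j] for j in range(p)] for row in G]  # centered genotypes
# Only the first two SNPs influence the simulated binary phenotype.
y = [1 if rng.random() < sigmoid(2.0 * xi[0] + 2.0 * xi[1]) else 0 for xi in X]

lam = 0.05  # hypothetical penalty strength, not tuned by cross-validation
fits = {
    "ridge": fit_penalized_logistic(X, y, lam, alpha=0.0),
    "lasso": fit_penalized_logistic(X, y, lam, alpha=1.0),
    "elastic net": fit_penalized_logistic(X, y, lam, alpha=0.5),
}
for name, (w, _) in fits.items():
    kept = sum(1 for wj in w if wj != 0.0)
    print(f"{name}: {kept} of {p} SNP coefficients are nonzero")
```

In this sketch the L1 term drives some coefficients exactly to zero, which mirrors the abstract's remark that Lasso keeps only the most informative features, whereas Ridge only shrinks coefficients. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand as it is here.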

Keywords:

Complex diseases prediction, genotype-phenotype associations, SNP, Regression, penalized logistic regression

Language:

English

Published:

Amirkabir International Journal of Electrical & Electronics Engineering, Volume:53 Issue: 1, Winter-Spring 2021

Page:

https://magiran.com/p2283210


Approved scientific journal

Amirkabir International Journal of Electrical & Electronics Engineering

Biannual technical and engineering journal, published in English

ISSN: 2588-2910 eISSN: 2588-2929

License holder:

Amirkabir University of Technology

Managing Director:

Dr. Hossein Hosseini Toudeshki

Editor-in-Chief:

Dr. Hesamoddin Sadeghi

Journal phone: 021-66491123
