Articles by author:

mohammad ali sadrnia


Determining the optimal drug dose for controlling the cancer cell population in a melanoma patient, considering the harmful effects of the drug, using the eligibility traces method

Elnaz Kalhor, Amin Noori*, Sara Saboori Rad, Mohammad Ali Sadrnia

Journal of Soft Computing and Information Technology, Volume 10, Issue 1 (Spring 1400 SH), pp. 72-92

The main goal of this paper is to determine the optimal drug dose for reducing the cancer cell population in patients with melanoma. To this end, the eligibility traces method, one of the methods for solving reinforcement learning problems, is used. This method combines the advantages of two conventional reinforcement learning methods, temporal-difference learning and Monte Carlo. Another advantage is that it does not need a mathematical model; however, because implementation on a real system was not possible, a nonlinear delayed mathematical model is used to simulate the behavior of the environment and to evaluate the proposed controller. According to the surveys conducted so far, no control method has been implemented on this mathematical model, and this is the first time that control of the cancer cell population has been carried out for it. In optimal dose control, the drug amount must be such that harmful effects of the drug on healthy cells are avoided as far as possible. The simulation results show that the chosen method, by injecting a sub-optimal drug dose, was able to control the cancer cell population, reduce it, and bring it to zero, while the body's immune cells increased. Finally, to show the advantage of the chosen method in speeding up the reduction of cancer cells, it is compared with the Q-learning algorithm, which is another method for solving reinforcement learning problems, and with optimal control. A fault was also applied to the system's sensor, and the performance of the proposed controller in reducing cancer cells in the presence of the fault was examined. To examine one advantage of reinforcement learning, namely its adaptability to the environment, cancer cell population control was carried out in five melanoma patients under uncertainty in the system parameters and initial conditions. The convergence speed of both the eligibility traces method and the Q-learning algorithm in reducing cancer cells was also examined for different learning rates.

Keywords: Harmful drug effects, Q-learning algorithm, Cancer cell population control, Melanoma, Reinforcement learning, Eligibility traces, Optimal control

Research/Original Article. Language: Persian

Using Eligibility Traces Algorithm to Specify the Optimal Dosage for the Purpose of Cancer Cell Population Control in Melanoma Patients with a Consideration of the Side Effects

Elnaz Kalhor, Amin Noori *, Sara Saboori Rad, Mohammad Ali Sadrnia

Journal of Soft Computing and Information Technology, Volume 10, Issue 1, 2021, pp. 72-92

This paper aims to determine the optimal drug dosage for reducing the cancer cell population in melanoma patients. To do so, it employs the eligibility traces algorithm, a reinforcement learning method that offers a compromise between two standard reinforcement learning approaches, Monte Carlo and temporal-difference learning. A further advantage is that the approach does not require a mathematical model. However, because implementation on a real system was not possible, a delayed nonlinear mathematical model is used to simulate the behavior of the environment and to investigate the performance of the proposed controller. To the authors' knowledge, no control method has previously been applied to this mathematical model, and this is the first time that cancer cell population control has been applied and tested on it. In optimal dosage control, the drug amount must be chosen so that side effects on healthy cells are avoided as far as possible. According to the simulation results, the eligibility traces algorithm is able to control and reduce the cancer cell population by injecting a sub-optimal drug dose, alongside an increase in the body's immune cells. To demonstrate the advantage of the selected method in speeding up the reduction of cancer cells, it is compared with the Q-learning algorithm and with optimal control. A fault was also applied to the sensor, and the performance of the proposed controller in reducing cancer cells in the presence of this fault was investigated. The adaptability of the proposed method to changes in the environment was then checked: uncertainty was applied to the system parameters and initial conditions, and the cancer cell population was controlled in five melanoma patients. Moreover, with noise added to the system, the eligibility traces algorithm was shown to control the cancer cell population and drive it to zero. Finally, the convergence speed of both the eligibility traces algorithm and the Q-learning algorithm in reducing the number of cancer cells was investigated for different learning rates.
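The compromise between Monte Carlo and temporal-difference learning mentioned in the abstract can be illustrated with a minimal sketch of tabular TD(lambda) value prediction using accumulating eligibility traces. This is not the authors' controller: the state names, rewards, and parameter values below are hypothetical and chosen only to show the mechanism.

```python
def td_lambda_update(values, transitions, alpha=0.1, gamma=1.0, lam=0.8):
    """Run one episode of tabular TD(lambda) with accumulating traces.

    transitions is a list of (state, reward, next_state) tuples;
    next_state is None when the episode terminates.
    """
    values = dict(values)                  # work on a copy
    traces = {s: 0.0 for s in values}      # one eligibility trace per state
    for state, reward, next_state in transitions:
        next_value = 0.0 if next_state is None else values[next_state]
        td_error = reward + gamma * next_value - values[state]
        traces[state] += 1.0               # mark the visited state as eligible
        for s in values:
            values[s] += alpha * td_error * traces[s]
            traces[s] *= gamma * lam       # older visits fade geometrically
    return values


# Hypothetical three-state episode: reward arrives only at the end.
episode = [("high", 0.0, "mid"), ("mid", 0.0, "low"), ("low", 1.0, None)]
start = {"high": 0.0, "mid": 0.0, "low": 0.0}

td0 = td_lambda_update(start, episode, lam=0.0)      # one-step TD behavior
mc_like = td_lambda_update(start, episode, lam=1.0)  # Monte Carlo-like credit
print(td0)
print(mc_like)
```

With lam=0.0 only the last state receives credit for the final reward, as in one-step temporal-difference learning; with lam=1.0 every visited state is updated, as in a Monte Carlo method. Intermediate values of lam trade off between the two.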

Keywords: Side effects of drugs, Q-learning algorithm, cancer cells population control, Melanoma, Reinforcement Learning, Eligibility Traces, Optimal control method

Research/Original Article. Original language: Persian

Controlling the cancer cell population in a nonlinear melanoma model under uncertainty using the Q-learning algorithm with a case-based reasoning (CBR) policy

Amin Noori*, Elnaz Kalhor, Mohammad Ali Sadrnia, Sara Saboori Rad

Journal of Iranian Association of Electrical and Electronics Engineers, Volume 17, Issue 3 (Autumn 1399 SH), pp. 25-37

Skin cancer is one of the most dangerous cancers and affects many people every year. For this reason, rapid diagnosis and treatment of this cancer are very important to physicians, and in recent decades the use of intelligent methods to improve its diagnosis and treatment has received considerable attention. The main goal of this paper is to determine the optimal amount of drug for eliminating cancer cells such that adverse effects of the drug on healthy cells are avoided. The Q-learning algorithm is used for this purpose. To select actions, a case-based reasoning (CBR) policy, which is a kind of accelerated heuristic policy, is used; it increases the learning speed and reduces the time needed to reach the optimal policy. The paper also accounts for the drug's half-life in order to obtain the effect of the drug in the patient's body at each instant. To better show the performance of the reinforcement learning method in controlling cancer cells and determining the optimal drug dose, it is compared with an optimal control method, the Hamiltonian method, and with fixed-dose injection. Finally, it is shown that the total drug dose injected into the patient using reinforcement learning is greatly reduced compared with optimal control and with a fixed dose applied at all times, while the cancer cell population is also controlled. With noise and uncertainty applied to the system parameters and initial conditions, the chosen method is still able to control the cancer cells.

Keywords: Melanoma, Q-learning algorithm, Case-based reasoning policy, Adverse drug effects, Drug half-life, Optimal control

Research/Original Article. Language: Persian

Controlling the Cancer Cells in a Nonlinear Model of Melanoma by Considering the Uncertainty Using Q-learning Algorithm Under the Case Based Reasoning Policy

Amin Noori*, Elnaz Kalhor, MohammadAli Sadrnia, Sara Saboori Rad

Journal of Iranian Association of Electrical and Electronics Engineers, Volume 17, Issue 3, 2020, pp. 25-37

Melanoma is one of the most dangerous types of cancer, and many people suffer from it every year. Hence, quick diagnosis and treatment are very important to physicians. In recent decades, intelligent methods have attracted considerable attention for diagnosing and treating melanoma. The main objective of this paper is to determine the optimal drug dosage for eliminating the cancer cells while avoiding side effects of the drug on normal cells. To this aim, the Q-learning algorithm is employed. To select actions, a Case-Based Reasoning (CBR) policy is used, which is an accelerated heuristic policy. This policy increases the learning speed and reduces the overall time needed to reach the optimal policy. The half-life of the drug is also taken into account to obtain the effect of the drug on the patient's body at each time step. To demonstrate the performance of Q-learning in cancer cell control and optimal dosage determination, it is compared with two methods: fixed-dosage injection and the Hamiltonian method, an optimal control method. Finally, it is shown that the total dosage injected over the whole treatment period using the reinforcement learning method (Q-learning) is significantly lower than with optimal control or fixed-dosage injection, while the number of cancer cells is also controlled. When noise and uncertainty are applied to the system parameters and the initial conditions, the proposed method can still successfully control the cancer cells.
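Two ingredients named in the abstract, the drug half-life and the Q-learning update, can be sketched as follows. This is only an illustration under stated assumptions: the exponential-decay form of the half-life, the state and action names, and all numeric values are hypothetical, and the CBR action-selection policy is not modeled here.

```python
def decay_drug_level(level, dose, dt, half_life):
    """Assumed model: the previous level halves every half_life time units,
    then the newly injected dose is added."""
    return level * 0.5 ** (dt / half_life) + dose


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; q maps (state, action) to a value."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical dosing: one unit per step, step length equal to the half-life.
level = 0.0
for _ in range(3):
    level = decay_drug_level(level, dose=1.0, dt=1.0, half_life=1.0)
print(level)

# Hypothetical single update with made-up states, actions, and reward.
q_table = {}
new_value = q_learning_update(q_table, "many_cells", "dose", -1.0,
                              "few_cells", ["dose", "rest"])
print(new_value)
```

Tracking the residual drug level in this way lets the reward penalize accumulated drug in the body rather than only the most recent injection, which is one plausible reason to include the half-life in the model.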

Keywords: melanoma cancer, Q-learning algorithm, case based reasoning, side effect of the drug, half-life of drug, optimal control

Research/Original Article. Original language: Persian

Fault Tolerant Control of Blood Glucose Concentration Using Reinforcement Learning

Amin Noori *, MohammadAli Sadrnia, MohammadBagher Naghibi-Sistani

International Journal of Industrial Electronics, Control and Optimization, Volume 3, Issue 3, Summer 2020, pp. 353-364

This paper focuses on blood glucose level control in the presence of possible sensor and actuator faults. To this aim, the eligibility traces algorithm (a reinforcement learning method) is combined with a sliding mode controller to determine the injection dosage. This method determines the optimal dosage to inject into the patient so as to decrease the side effects of the drug. To detect faults in the system, residual calculation techniques are used. Calculating the residual requires predicting the states of the fault-free system at each time step, for which a Radial Basis Function neural network is used. The proposed method is compared with another reinforcement learning method, the actor-critic method, also combined with the sliding mode controller. Both RL-based methods are then compared with a combined neural network and sliding mode control method. Simulation results show that both the eligibility traces algorithm and the actor-critic method can control the blood glucose concentration and reach the desired value in the presence of the fault. However, in addition to a reduced injected dosage, the eligibility traces algorithm yields smaller variations about the desired value. The reduced injected dosage mitigates side effects, which is a considerable advantage for diabetic patients.
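The residual-based fault detection described in the abstract can be sketched as a comparison between measured values and the values predicted for the fault-free system. In this minimal example the predictions are supplied as a plain list standing in for the output of the Radial Basis Function network; the threshold and all glucose readings are hypothetical.

```python
def detect_faults(measured, predicted, threshold):
    """Return per-step residuals and fault flags.

    A fault is flagged when the absolute difference between the measured
    value and the predicted fault-free value exceeds the threshold.
    """
    residuals = [abs(m - p) for m, p in zip(measured, predicted)]
    flags = [r > threshold for r in residuals]
    return residuals, flags


# Hypothetical readings: the third sensor value deviates strongly.
measured = [100.0, 102.0, 130.0]
predicted = [101.0, 101.0, 102.0]  # stand-in for a neural network prediction
residuals, flags = detect_faults(measured, predicted, threshold=10.0)
print(residuals)
print(flags)
```

Only the third step is flagged here, since its residual is the only one above the threshold. In practice the threshold would have to be tuned against normal prediction error and measurement noise.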

Keywords: Fault Tolerant Control, Reinforcement Learning, Eligibility Traces, Actor Critic, Diabetic Model

Research/Original Article. Original language: English
