Discovering the Clinical Knowledge about Breast Cancer Diagnosis Using Rule-Based Machine Learning Algorithms
Breast cancer represents one of the most prevalent cancers and is also the main cause of cancer-related deaths in women globally. Thus, this study was aimed to construct and compare the performance of several rule-based machine learning algorithms in predicting breast cancer.
The data were collected from the Breast Cancer Registry database in the Ayatollah Taleghani Hospital, Abadan, Iran, from December 2017 to January 2021 and had information from 949 non-breast cancer and 554 breast cancer cases. Then the mean values and K-nearest neighborhood algorithm were used for replacing the lost quantitative and qualitative data fields, respectively. In the next step, the Chi-square test and binary logistic regression were used for feature selection. Finally, the best rule-based machine learning algorithm was obtained based on comparing different evaluation criteria. The Rapid Miner Studio 7.1.1 and Weka 3.9 software were utilized.
As a result of feature selection the nine variables were considered as the most important variables for data mining. Generally, the results of comparing rule-based machine learning demonstrated that the J-48 algorithm with an accuracy of 0.991, F-measure of 0.987, and also AUC of 0.9997 had a better performance than others.
It’s found that J-48 facilitates a reasonable level of accuracy for correct BC risk prediction. We believe it would be beneficial for designing intelligent decision support systems for the early detection of high-risk patients that will be used to inform proper interventions by the clinicians.
- حق عضویت دریافتی صرف حمایت از نشریات عضو و نگهداری، تکمیل و توسعه مگیران میشود.
- پرداخت حق اشتراک و دانلود مقالات اجازه بازنشر آن در سایر رسانههای چاپی و دیجیتال را به کاربر نمیدهد.