Ensemble clustering with maximized diversity using evolutionary optimization algorithms

Article Type:
Research/Original Article (accredited ranking)
Abstract:

Data clustering is one of the main steps in data mining and is responsible for discovering hidden patterns in unlabeled data. Because the problem is complex and basic clustering methods are weak, most current studies rely on clustering ensemble methods. Diversity among the primary results is one of the most important factors affecting the quality of the final result, and the quality of the primary results is another. Both factors have been considered in recent research on ensemble clustering. This paper proposes a new framework for improving clustering efficiency that builds the ensemble from a subset of the primary clusters and addresses both of these factors. The selection of this subset plays a vital role in the performance of the ensemble. Since evolutionary intelligent algorithms have been able to solve the majority of complex engineering problems, this paper uses three of them (genetic algorithm, simulated annealing, and particle swarm optimization) to select the subset of primary clusters. The main idea is to use the more stable clusters in the ensemble. Stability serves as a goodness measure for the clusters, and the clusters that satisfy a threshold on this measure are selected to participate in the ensemble. The chosen clusters are combined with a co-association-based consensus function. A new method based on Evidence Accumulation Clustering (EAC), called Extended Evidence Accumulation Clustering (EEAC), is proposed for constructing the co-association matrix from the subset of clusters.
Experimental results on several standard datasets, evaluated with the normalized mutual information, Fisher, and accuracy criteria, show that the proposed method significantly improves on the Alizadeh, Azimi, Berikov, CLWGC, RCESCC, KME, CFSFDP, DBSCAN, NSC, and Chen methods.
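
The pipeline described in the abstract (score each primary cluster for stability, keep those above a threshold, build a co-association matrix from the kept clusters, then derive a consensus partition) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the stability score (mean best Jaccard match against the other partitions), the normalization of the co-association entries by `max(count[i], count[j])`, the connected-components consensus step, and all function names. The evolutionary subset selection (genetic algorithm, simulated annealing, particle swarm optimization) is not modeled; a fixed threshold stands in for it.

```python
from itertools import combinations


def clusters_of(partition):
    """Return the clusters of a label list as frozensets of point indices."""
    groups = {}
    for idx, label in enumerate(partition):
        groups.setdefault(label, set()).add(idx)
    return [frozenset(g) for g in groups.values()]


def jaccard(a, b):
    return len(a & b) / len(a | b)


def stability(cluster, other_partitions):
    """Hypothetical stability score: mean best Jaccard match in the other partitions."""
    best = [max(jaccard(cluster, c) for c in clusters_of(p)) for p in other_partitions]
    return sum(best) / len(best)


def select_clusters(partitions, threshold):
    """Keep every primary cluster whose stability reaches the threshold."""
    selected = []
    for k, p in enumerate(partitions):
        others = partitions[:k] + partitions[k + 1:]
        for c in clusters_of(p):
            if stability(c, others) >= threshold:
                selected.append(c)
    return selected


def co_association(selected, n_points):
    """Co-association matrix built only from the selected clusters.

    Entry (i, j) is the number of selected clusters containing both points,
    divided by max(count[i], count[j]); this normalization is an assumption.
    """
    shared = [[0] * n_points for _ in range(n_points)]
    count = [0] * n_points
    for c in selected:
        for i in c:
            count[i] += 1
        for i, j in combinations(sorted(c), 2):
            shared[i][j] += 1
            shared[j][i] += 1
    matrix = [[0.0] * n_points for _ in range(n_points)]
    for i in range(n_points):
        matrix[i][i] = 1.0
        for j in range(i + 1, n_points):
            denom = max(count[i], count[j])
            if denom:
                matrix[i][j] = matrix[j][i] = shared[i][j] / denom
    return matrix


def consensus(matrix, cut=0.5):
    """Connected components of the graph linking pairs with co-association > cut."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > cut:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy ensemble: two clean primary partitions and one noisy one over six points.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
selected = select_clusters(partitions, threshold=0.7)
matrix = co_association(selected, n_points=6)
labels = consensus(matrix)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy run the three clusters of the noisy partition score below 0.7 and are dropped, so the co-association matrix is built from the four remaining clusters only. A real ensemble would use many more primary partitions, and the threshold and consensus step would need tuning.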

Language:
Persian
Published:
Signal and Data Processing, Volume 19, Issue 4, 2023
Pages:
95 to 120
https://magiran.com/p2562935  