Online Learning for Imbalanced Data Streams with Concept Drift by Belief Theory and Chaotic Function
Continual learning from data streams is a central problem in machine learning: algorithms must adapt as new data arrive. Because streams evolve over time, previously acquired knowledge can become outdated. This phenomenon, known as concept drift, must be detected promptly for learning models to adapt effectively. Various drift detectors have been proposed, but they often assume a relatively balanced class distribution. On imbalanced data streams, these detectors can be biased toward majority classes and overlook shifts in minority classes. Moreover, the imbalance among classes can itself change over time, with majority and minority classes exchanging roles, especially when overlapping regions make the relationships among classes complex. This paper introduces a novel classification method for imbalanced streaming data affected by concept drift. The proposed method continuously monitors the arriving stream to detect and adapt to both imbalance and concept drift. When a new block of data arrives, the method applies k-means clustering to identify non-dense regions and oversamples the minority classes. Cluster centers are selected using the belief function to address overlap between majority and minority classes. Using a chaotic approach, each new sample is added based on its neighborhood and the size of that neighborhood. Concept drift is then detected using three predefined thresholds that cover time intervals and classification errors. Finally, labels are predicted by ensemble learning with weighted majority voting. Experiments on benchmark datasets from the UCI database evaluate the proposed method using Leave-One-Out (LOO) validation and compare it with state-of-the-art methods. The results demonstrate the superiority of the proposed method across various evaluation criteria, highlighting its effectiveness in addressing imbalanced streaming data with concept drift.
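The final prediction step named in the abstract, weighted majority voting over an ensemble, can be illustrated with a minimal sketch. The abstract does not say how the member weights are computed, so the weights below, the function name weighted_majority_vote, and the tie-breaking rule are assumptions made for illustration only and are not the authors' implementation.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting members have the largest total weight.

    predictions: one predicted label per ensemble member.
    weights: one non-negative weight per member (hypothetical here; a common
             choice is each member's accuracy on recent data blocks).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # Assumed tie-breaking rule: the label that was seen first wins.
    return max(totals, key=totals.get)


# Three ensemble members vote; the single high-weight member outweighs
# the two low-weight members that agree with each other.
votes = ["minority", "majority", "majority"]
member_weights = [0.9, 0.3, 0.4]
result = weighted_majority_vote(votes, member_weights)
```

With weights tied to recent performance, a member that still classifies the minority class well after a drift can override a larger number of outdated members, which is the usual motivation for weighting the vote rather than counting heads.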