The GC3 framework: grid density based clustering for classification of streaming data with concept drift.

Deep learning framework for handling concept drift and class imbalanced complex decision-making on streaming data

Complex & Intelligent Systems ◽

10.1007/s40747-021-00456-0 ◽

2021 ◽

Author(s):

S. Priya ◽

R. Annie Uthra

Keyword(s):

Decision Making ◽

Deep Learning ◽

Concept Drift ◽

Class Imbalance ◽

Streaming Data ◽

Superior Performance ◽

Data Streaming ◽

Minority Class ◽

Concept Drift Detection

Abstract: Data science has become a popular means of supporting and improving decision-making. As data streaming finds ever wider application, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have proved useful for classification under concept drift in data streaming applications. This paper presents a class imbalance with concept drift detection (CIDD) approach using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The model involves four processes: preprocessing, class imbalance handling, concept drift detection, and classification. It uses the adaptive synthetic (ADASYN) technique to handle class imbalance, which applies a weighted distribution over minority class examples according to how difficult each is to learn. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect concept drift. The ADODNN model then performs the classification. To improve classifier performance, ADO-based hyperparameter tuning determines the optimal parameters of the DNN model. The model is evaluated on three streaming datasets: the intrusion detection (NSL KDDCup) dataset, the Spam dataset, and the Chess dataset. A detailed comparative analysis is carried out, and the simulation results indicate superior performance, with maximum accuracies of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
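The ADASYN weighting step mentioned in the abstract can be illustrated with a minimal sketch. It follows the commonly described ADASYN allocation rule (share of majority-class points among each minority point's k nearest neighbours, normalised and scaled by the class-size gap); it is not the authors' code. The function name `adasyn_allocation`, the toy points, and the parameter defaults are assumptions made for illustration, and the interpolation step that actually creates synthetic samples is omitted.

```python
import math


def adasyn_allocation(minority, majority, k=5, beta=1.0):
    """Return how many synthetic samples each minority point would receive.

    Hypothetical helper: only the allocation step of ADASYN is sketched here.
    """
    labelled = [(p, "min") for p in minority] + [(p, "maj") for p in majority]
    # Total number of synthetic samples to create (beta=1.0 means full balance).
    total_to_generate = (len(majority) - len(minority)) * beta

    ratios = []
    for i, point in enumerate(minority):
        # All other points, sorted by Euclidean distance to the current one.
        others = [item for j, item in enumerate(labelled) if j != i]
        others.sort(key=lambda item: math.dist(point, item[0]))
        nearest = others[:k]
        # Learning difficulty: share of majority points among the k neighbours.
        ratios.append(sum(1 for _, tag in nearest if tag == "maj") / k)

    norm = sum(ratios)
    if norm == 0:
        return [0] * len(minority)
    # Normalise the ratios into a distribution and scale by the total.
    return [round(r / norm * total_to_generate) for r in ratios]


# Toy data (assumed): two easy minority points and one surrounded by majority.
minority = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
majority = [(5.0, 6.0), (6.0, 5.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0), (20.0, 20.0)]
allocation = adasyn_allocation(minority, majority, k=2)
print(allocation)
```

In this toy case the whole budget goes to the minority point that sits among majority points, which is the "level of difficulty" idea in miniature.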

Classification of the drifting data streams using heterogeneous diversified dynamic class-weighted ensemble

PeerJ Computer Science ◽

10.7717/peerj-cs.459 ◽

2021 ◽

Vol 7 ◽

pp. e459

Author(s):

Martin Sarnovsky ◽

Michal Kolarik

Keyword(s):

Data Streams ◽

Concept Drift ◽

Ensemble Methods ◽

Predictive Performance ◽

Streaming Data ◽

Underlying Structure ◽

Adaptive Models ◽

Resource Requirements ◽

Continuous Stream

Data streams can be defined as continuous flows of data arriving from different sources and in different forms. Streams are often very dynamic, and their underlying structure usually changes over time, which may result in a phenomenon called concept drift. When solving predictive problems on streaming data, traditional machine learning models trained on historical data may become invalid when such changes occur. Adaptive models equipped with mechanisms to reflect changes in the data have proved suitable for handling drifting streams. Adaptive ensemble models are a popular group of these methods for classifying drifting data streams. In this paper, we present a heterogeneous adaptive ensemble model for data stream classification, which uses a dynamic class weighting scheme and a mechanism to maintain the diversity of the ensemble members. Our main objective was to design a model consisting of a heterogeneous group of base learners (Naive Bayes, k-NN, decision trees) with an adaptive mechanism that takes into account both the performance of the members and the diversity of the ensemble. The model was experimentally evaluated on both real-world and synthetic datasets. We compared the presented model with other existing adaptive ensemble methods in terms of both predictive performance and computational resource requirements.
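The general idea of a class-weighted ensemble vote can be sketched as follows. The abstract does not give the weighting formula, so the multiplicative penalty used here is an assumption, as are the class name `ClassWeightedEnsemble` and the stand-in members (plain functions rather than Naive Bayes, k-NN, or decision trees). The diversity-maintenance mechanism is not modelled.

```python
from collections import defaultdict


class ClassWeightedEnsemble:
    """Weighted vote in which each member keeps one weight per class label."""

    def __init__(self, members, penalty=0.5):
        self.members = members  # callables mapping a sample to a label
        self.penalty = penalty  # assumed multiplicative penalty for a mistake
        # Every (member, class) weight starts at 1.0.
        self.weights = [defaultdict(lambda: 1.0) for _ in members]

    def predict(self, x):
        votes = defaultdict(float)
        for member, weight in zip(self.members, self.weights):
            label = member(x)
            votes[label] += weight[label]
        return max(votes, key=votes.get)

    def update(self, x, y):
        """Once the true label y arrives, shrink the weight of each wrong vote."""
        for member, weight in zip(self.members, self.weights):
            label = member(x)
            if label != y:
                weight[label] *= self.penalty


# Stand-in members (assumed): one threshold rule and two constant predictors.
members = [
    lambda x: "pos" if x > 0 else "neg",
    lambda x: "pos",
    lambda x: "pos",
]
ensemble = ClassWeightedEnsemble(members)
before = ensemble.predict(-1)   # the two constant members outvote the rule
ensemble.update(-1, "neg")
ensemble.update(-1, "neg")
after = ensemble.predict(-1)    # their "pos" weights have shrunk to 0.25 each
print(before, after)
```

Keeping weights per class rather than per member means a learner that is unreliable for one label can still contribute fully to votes for another.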

Decision Tree Classification Algorithm within Concept Similarity

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.235.9 ◽

2012 ◽

Vol 235 ◽

pp. 9-14

Author(s):

Chun Hua Ju ◽

Li Li Mao

Keyword(s):

Data Streams ◽

Data Stream ◽

Concept Drift ◽

Classification Algorithm ◽

Streaming Data ◽

Decision Tree Classification ◽

The Cost ◽

Prediction Efficiency ◽

Concept Similarity

Data stream mining has been applied in many domains, but concept drift in data streams poses a major obstacle to mining them. Current research on classification algorithms for streaming data with concept drift has achieved many successes, but it pays little attention to the recurrence of concepts in data streams, that is, the situation in which a historical concept reappears. To exploit this characteristic, this paper proposes computing concept similarity and reusing the classifier model of a historical or highly similar concept to classify and predict, so that no retraining is needed. This also reduces the cost of updating the model, speeds up classification, and improves prediction efficiency.
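Reusing a stored classifier when a past concept reappears can be sketched as a lookup over concept signatures. The abstract does not specify the similarity measure or how a concept is summarised, so cosine similarity over a numeric signature vector, the `ConceptPool` class, and the 0.95 threshold are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ConceptPool:
    """Stores (signature, model) pairs for concepts seen so far."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed cut-off for "similar enough"
        self.concepts = []

    def store(self, signature, model):
        self.concepts.append((signature, model))

    def lookup(self, signature):
        """Return the model of the most similar stored concept, or None."""
        best_model, best_score = None, -1.0
        for stored_signature, model in self.concepts:
            score = cosine(stored_signature, signature)
            if score > best_score:
                best_model, best_score = model, score
        return best_model if best_score >= self.threshold else None


pool = ConceptPool()
pool.store([1.0, 0.0], "model_A")   # placeholder strings stand in for classifiers
pool.store([0.0, 1.0], "model_B")
reused = pool.lookup([0.9, 0.1])    # close to the first stored concept
unseen = pool.lookup([1.0, 1.0])    # equally far from both stored concepts
print(reused, unseen)
```

When `lookup` returns `None`, a caller would train a new classifier on the current window and `store` it, which is the only case that incurs training cost.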

Streaming Data Classification using Hybrid Classifiers to tackle Stability-Plasticity Dilemma and Concept Drift

2020 IEEE 4th Conference on Information & Communication Technology (CICT) ◽

10.1109/cict51604.2020.9312077 ◽

2020 ◽

Author(s):

A L Amutha ◽

R Annie Uthra ◽

J Preetha Roselyn ◽

R Golda Brunet

Keyword(s):

Concept Drift ◽

Data Classification ◽

Streaming Data ◽

Hybrid Classifiers

Handling adversarial concept drift in streaming data

Expert Systems with Applications ◽

10.1016/j.eswa.2017.12.022 ◽

2018 ◽

Vol 97 ◽

pp. 18-40 ◽

Cited By ~ 14

Author(s):

Tegjyot Singh Sethi ◽

Mehmed Kantardzic

Keyword(s):

Concept Drift ◽

Streaming Data

A reduced labeled samples (RLS) framework for classification of imbalanced concept-drifting streaming data.

10.18297/etd/2602 ◽

2016 ◽

Author(s):

Elaheh Arabmakki

Keyword(s):

Streaming Data

Instance-Based Classification of Streaming Data Using Emerging Patterns

Information and Communication Technologies - Communications in Computer and Information Science ◽

10.1007/978-3-642-15766-0_33 ◽

2010 ◽

pp. 228-236 ◽

Cited By ~ 3

Author(s):

Mohd. Amir ◽

Durga Toshniwal

Keyword(s):

Streaming Data ◽

Emerging Patterns

Concept Drift Detection on Streaming Data with Dynamic Outlier Aggregation

Lecture Notes in Business Information Processing - Process Mining Workshops ◽

10.1007/978-3-030-72693-5_16 ◽

2021 ◽

pp. 206-217

Author(s):

Ludwig Zellner ◽

Florian Richter ◽

Janina Sontheim ◽

Andrea Maldonado ◽

Thomas Seidl

Keyword(s):

Concept Drift ◽

Streaming Data ◽

Concept Drift Detection

CLASSIFICATION OF CONCEPT DRIFT IN EVOLVING DATA STREAM

Emerging Extended Reality Technologies For Industry 4.0 ◽

10.1002/9781119654674.ch11 ◽

2020 ◽

pp. 189-205

Author(s):

Mashail Althabiti ◽

Manal Abdullah

Keyword(s):

Data Stream ◽

Concept Drift ◽

Evolving Data

Data-driven decision support under concept drift in streamed big data

Complex & Intelligent Systems ◽

10.1007/s40747-019-00124-4 ◽

2019 ◽

Vol 6 (1) ◽

pp. 157-163 ◽

Cited By ~ 2

Author(s):

Jie Lu ◽

Anjin Liu ◽

Yiliao Song ◽

Guangquan Zhang

Keyword(s):

Decision Making ◽

Big Data ◽

Real Time ◽

Concept Drift ◽

High Volume ◽

Streaming Data ◽

Data Driven ◽

Research Directions ◽

Decision Outcomes ◽

Past Data

Abstract: Data-driven decision-making (D³M) is often confronted by the problem of uncertainty or unknown dynamics in streaming data. To provide real-time accurate decision solutions, the systems have to promptly address changes in data distribution in streaming data, a phenomenon known as concept drift. Past data patterns may not be relevant to new data when a data stream experiences significant drift, thus to continue using models based on past data will lead to poor prediction and poor decision outcomes. This position paper discusses the basic framework and prevailing techniques in streaming type big data and concept drift for D³M. The study first establishes a technical framework for real-time D³M under concept drift and details the characteristics of high-volume streaming data. The main methodologies and approaches for detecting concept drift and supporting D³M are highlighted and presented. Lastly, further research directions, related methods and procedures for using streaming data to support decision-making in concept drift environments are identified. We hope the observations in this paper could support researchers and professionals to better understand the fundamentals and research directions of D³M in streamed big data environments.