A reduced labeled samples (RLS) framework for classification of imbalanced concept-drifting streaming data.

2016 ◽  
Author(s):  
Elaheh Arabmakki
Keyword(s):  
Author(s):  
S. Priya ◽  
R. Annie Uthra

Data science has become a popular means of supporting and improving decision-making. As data streaming finds a wide range of applications, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models are useful for classification under concept drift in data streaming applications. This paper presents a class imbalance with concept drift detection (CIDD) approach using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The presented model involves four processes: preprocessing, class imbalance handling, concept drift detection, and classification. It uses the adaptive synthetic (ADASYN) technique to handle class imbalance; ADASYN applies a weighted distribution over minority class examples according to how difficult they are to learn. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect concept drift. The ADODNN model then performs the classification. To increase the classification performance of the DNN, ADO-based hyperparameter tuning determines its optimal parameters. The model is evaluated on three streaming datasets: an intrusion detection (NSL KDDCup) dataset, a Spam dataset, and a Chess dataset. A detailed comparative analysis was carried out, and the simulation results verified the superior performance of the presented model, which obtained maximum accuracies of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
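The ADASYN weighting mentioned in this abstract can be illustrated with a minimal sketch. It follows the commonly published form of ADASYN, in which each minority point is weighted by the share of majority points among its k nearest neighbours. The function name `adasyn_allocation`, its parameters, and the toy points are hypothetical and are not taken from the paper; the sketch only computes how many synthetic samples each minority point would receive and does not generate them.

```python
import math


def adasyn_allocation(minority, majority, k=5, beta=1.0):
    """Return the (fractional) number of synthetic samples per minority point."""
    # Total synthetic samples needed; beta=1.0 targets a fully balanced set.
    total = (len(majority) - len(minority)) * beta

    # Label 1 marks majority points, label 0 marks minority points.
    labeled = [(p, 0) for p in minority] + [(p, 1) for p in majority]

    ratios = []
    for i, x in enumerate(minority):
        # Distances from x to every other point (index i is x itself).
        others = [(math.dist(x, p), lab)
                  for j, (p, lab) in enumerate(labeled) if j != i]
        others.sort(key=lambda pair: pair[0])
        # Share of majority points among the k nearest neighbours.
        ratios.append(sum(lab for _, lab in others[:k]) / k)

    norm = sum(ratios)
    if norm == 0:
        # Assumption of this sketch: with no majority neighbours anywhere,
        # no point is "hard", so nothing is allocated.
        return [0.0] * len(minority)
    # Normalise the ratios into a distribution and scale by the total.
    return [r / norm * total for r in ratios]


# Hypothetical toy data: two "easy" minority points and one surrounded
# by majority points.
minority = [(0.0, 0.0), (0.0, 0.1), (5.0, 5.0)]
majority = [(5.0, 5.1), (5.1, 5.0), (4.9, 5.0), (5.0, 4.9),
            (9.0, 9.0), (9.0, 9.1)]
allocation = adasyn_allocation(minority, majority, k=1)
```

In this toy case the whole budget goes to the minority point that sits among majority points, which is the sense in which the weighting follows learning difficulty.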


Algorithms ◽  
2018 ◽  
Vol 11 (10) ◽  
pp. 158 ◽  
Author(s):  
Sathya Madhusudhanan ◽  
Suresh Jaganathan ◽  
Jayashree L S

Unstructured data are irregular information with no predefined data model. Streaming data, which arrive continuously over time, are unstructured, and classifying them is a tedious task because they lack class labels and accumulate over time. As the data keep growing, it becomes difficult to train a model from scratch each time. Incremental learning, a self-adaptive approach, reuses the previously learned model information and then learns and accommodates new information from newly arrived data to produce a new model, which avoids retraining. The incrementally learned knowledge helps to classify the unstructured data. In this paper, we propose a framework, CUIL (Classification of Unstructured data using Incremental Learning), which clusters the metadata, assigns a label to each cluster, and then incrementally creates a model using an Extreme Learning Machine (ELM), a feed-forward neural network, for each arriving batch of data. The proposed framework trains the batches separately, which significantly reduces memory usage and training time, and is tested with metadata created for standard image datasets such as MNIST, STL-10, CIFAR-10, Caltech101, and Caltech256. Based on the tabulated results, the proposed work shows greater accuracy and efficiency.
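The idea of updating a model batch by batch instead of retraining can be sketched as follows. This is not the ELM-based CUIL method: a nearest-centroid classifier is swapped in because its state (per-class sums and counts) can be updated exactly from each new batch. The class name, method names, and data are hypothetical.

```python
import math


class IncrementalCentroidClassifier:
    """Nearest-centroid classifier updated one batch at a time."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sum
        self.counts = {}  # label -> number of samples seen

    def partial_fit(self, batch_x, batch_y):
        # Fold the new batch into the running statistics; earlier
        # batches are never revisited or stored.
        for x, y in zip(batch_x, batch_y):
            if y not in self.sums:
                self.sums[y] = [0.0] * len(x)
                self.counts[y] = 0
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1
        return self

    def predict(self, x):
        def centroid(label):
            return [s / self.counts[label] for s in self.sums[label]]

        # Pick the label whose centroid is closest to x.
        return min(self.sums, key=lambda label: math.dist(x, centroid(label)))


# Two batches arrive one after another (hypothetical data).
model = IncrementalCentroidClassifier()
model.partial_fit([(0.0, 0.0), (1.0, 1.0)], ["a", "a"])
model.partial_fit([(10.0, 10.0), (11.0, 11.0)], ["b", "b"])
```

Only the running sums and counts are kept between batches, which is where the memory saving of batch-wise training comes from in this simplified setting.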


2017 ◽  
Vol 20 (6) ◽  
pp. 1507-1525 ◽  
Author(s):  
Hu Li ◽  
Ye Wang ◽  
Hua Wang ◽  
Bin Zhou

2020 ◽  
Vol 18 (1) ◽  
pp. 103-113

One of the noteworthy difficulties in classifying nonstationary data is handling class imbalance. Imbalanced data have many more samples of one class than of the other, which biases a classifier's accuracy in favour of the majority class. Streaming data may have inherent imbalance resulting from the nature of the data space, or extrinsic imbalance due to the nonstationary environment. In streaming data, time-varying class priors may lead to a shift in the imbalance ratio. Researchers have studied ensemble learning, online learning, class imbalance, and cost-sensitive algorithms separately, but have rarely addressed all of these issues jointly to deal with imbalance shift in nonstationary data. This paper presents a novel approach that combines these perspectives to improve G-mean in nonstationary data with Recurrent Imbalance Shifts (RIS). It modifies two state-of-the-art boosting algorithms: (1) AdaC2, to obtain G-mean based Online AdaC2 for Recurrent Imbalance Shifts (GOA-RIS) and Ageing and G-mean based Online AdaC2 for Recurrent Imbalance Shifts (AGOA-RIS); and (2) CSB2, to obtain G-mean based Online CSB2 for Recurrent Imbalance Shifts (GOC-RIS) and Ageing and G-mean based Online CSB2 for Recurrent Imbalance Shifts (AGOC-RIS). The study empirically and statistically analyses the performance of the proposed algorithms against the Online AdaC2 (OA) and Online CSB2 (OC) algorithms using benchmark datasets. The results demonstrate that the proposed algorithms outperform OA and OC overall.
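To see why G-mean rather than accuracy is the target under imbalance, here is a minimal sketch using the usual definition of G-mean as the geometric mean of per-class recalls. The function name and the toy labels are hypothetical and do not come from the paper.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of the recall obtained on each class in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        # Samples that truly belong to this class.
        total = sum(1 for t in y_true if t == label)
        # Those of them that were predicted correctly.
        hits = sum(1 for t, p in zip(y_true, y_pred)
                   if t == label and p == label)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1 / len(recalls))


# Hypothetical stream chunk: 2 minority (1) and 8 majority (0) samples.
y_true = [1, 1] + [0] * 8

# A classifier that always predicts the majority class.
always_majority = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
score_majority = g_mean(y_true, always_majority)

# A classifier that finds one of the two minority samples.
one_hit = [1, 0] + [0] * 8
score_one_hit = g_mean(y_true, one_hit)
```

The always-majority classifier reaches 80% accuracy on this chunk while its G-mean is zero, because a single class with zero recall drives the product to zero.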


Author(s):  
Christine Steinmeier ◽  
Jan Budke ◽  
Dominic Becking

2013 ◽  
Vol 07 (02) ◽  
pp. 173-183 ◽  
Author(s):  
TAMARA SIPES ◽  
NATASHA BALAC ◽  
HOMA KARIMABADI ◽  
NICOLE WOLTER ◽  
KENNETH NUNES ◽  
...  

In this paper we demonstrate a new approach to the classification of multivariate time series streaming data that uses a temporal metafeature abstraction method. The technique extracts global features and metafeatures in order to capture the necessary time-lapse information in the streams of data. The features are then used to create a static, intermediate stream representation that includes all the important time-varying information and is suitable for analysis with standard supervised data mining techniques. The capability of the new algorithm, called MineTool-TS2, was demonstrated through its application to three datasets: UCSD Microgrid energy usage data, a space physics dataset, and synthetic data.
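The step of turning a window of multivariate time series into one static row can be sketched as below. The abstract does not list the actual features used by MineTool-TS2, so the choice here (mean, min, max, and a least-squares slope per channel) and all names are assumptions made purely for illustration.

```python
from statistics import mean


def window_features(series):
    """Summarise one channel of a window with a few global features."""
    n = len(series)
    t_mean = (n - 1) / 2
    x_mean = mean(series)
    # Least-squares slope of the values against their time index.
    cov = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    return {
        "mean": x_mean,
        "min": min(series),
        "max": max(series),
        "slope": cov / var if var else 0.0,
    }


def static_row(channels):
    """Flatten per-channel features into one row for a standard classifier."""
    row = {}
    for name, series in channels.items():
        for feature, value in window_features(series).items():
            row[f"{name}_{feature}"] = value
    return row


# Hypothetical two-channel window.
row = static_row({"load": [1, 2, 3, 4], "temp": [5.0, 5.0, 5.0, 5.0]})
```

Each window becomes a fixed-length row, so any ordinary supervised learner can consume it; the slope is one simple way to keep some of the time ordering that a plain mean would discard.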


2016 ◽  
Vol 80 ◽  
pp. 1724-1733 ◽  
Author(s):  
Michał Woźniak ◽  
Paweł Ksieniewicz ◽  
Bogusław Cyganek ◽  
Andrzej Kasprzak ◽  
Krzysztof Walkowiak

Author(s):  
Ioannis Kontopoulos ◽  
Konstantinos Chatzikokolakis ◽  
Konstantinos Tserpes ◽  
Dimitris Zissis
