Time-Decaying Bloom Filters for Data Streams with Skewed Distributions

Author(s):  
Kai Cheng ◽  
Limin Xiang ◽  
M. Iwaihara ◽  
Haiyan Xu ◽  
M.M. Mohania
Author(s):  
Yi Wang

In recent years, there have been several interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams and cannot handle skewed distributions (e.g., few positives but many negatives) well, even though such distributions are typical in many data stream applications. In this paper, we propose an ensemble and cluster-based sampling method to deal with this situation. The study shows that this method is effective for mining skewed data streams.


2020 ◽  
Vol 13 (12) ◽  
pp. 2355-2367
Author(s):  
Qiyu Liu ◽  
Libin Zheng ◽  
Yanyan Shen ◽  
Lei Chen

Author(s):  
LAKSHMI PRANEETHA

Nowadays, data streams are massive and change quickly. Their uses range from basic scientific applications to critical business and financial ones. In the online phase, useful information is abstracted from the stream and represented in the form of micro-clusters. In the offline phase, micro-clusters are merged to form macro-clusters. The DBSTREAM technique captures the density between micro-clusters by means of a shared density graph in the online phase. The density data in this graph is then used in reclustering to improve the formation of clusters, but DBSTREAM takes more time when handling corrupted data points. In this paper, an early pruning algorithm is applied before pre-processing, and a Bloom filter is used to recognize corrupted data. Our experiments on real-time datasets show that this approach improves the efficiency of macro-clusters by 90% and generates more micro-clusters in a short time.
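
The abstract does not say how the Bloom filter is built or queried, so the following is only a minimal sketch of the general idea: record keys already known to be corrupted are inserted into a Bloom filter, and incoming points are pruned when the filter reports a (probable) match. The class `BloomFilter`, the helper `prune_stream`, the string keys, and the parameter values are all hypothetical and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def prune_stream(points, known_bad: BloomFilter):
    """Drop points whose key the filter reports as probably corrupted.

    Assumption: each point is identified by a string key.
    """
    return [p for p in points if p not in known_bad]
```

Because a Bloom filter can return false positives, a pruning step like this may occasionally discard a valid point; it never lets through a key that was explicitly added.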

