Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations

A False Negative Maximal Frequent Itemsets Mining Algorithm over Stream

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.135-136.21 ◽

2011 ◽

Vol 135-136 ◽

pp. 21-25

Author(s):

Hai Feng Li ◽

Ning Zhang

Keyword(s):

Real World ◽

False Negative ◽

Frequent Itemsets ◽

Experimental Results ◽

Mining Algorithm ◽

Chernoff Bound ◽

Frequent Itemsets Mining ◽

Condensed Representations ◽

Maximal Frequent Itemsets ◽

Landmark Model

Maximal frequent itemsets are one of several condensed representations of frequent itemsets. They store most of the information contained in the frequent itemsets in less space, which makes them better suited to stream mining. This paper focuses on approximate mining of maximal frequent itemsets over a stream under the landmark model. A false-negative method based on the Chernoff bound is proposed to reduce computing and memory cost. Experimental results on a real-world dataset show that the algorithm is effective and efficient.
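To make the notion of a condensed representation concrete, the following is a minimal Python sketch of how maximal frequent itemsets relate to the full set of frequent itemsets. It is not the streaming algorithm from the paper and does not use the Chernoff bound: it is a brute-force batch illustration, and the function names (`frequent_itemsets`, `maximal_itemsets`, `is_frequent`) and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force search: map each frequent itemset to its support count."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                frequent[candidate] = count
                found_any = True
        if not found_any:
            break  # downward closure: no larger itemset can be frequent
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


def is_frequent(itemset, maximal):
    """An itemset is frequent iff it is contained in some maximal frequent itemset."""
    query = frozenset(itemset)
    return any(query <= m for m in maximal)


# Hypothetical toy data, chosen only for illustration.
transactions = [
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
]
frequent = frequent_itemsets(transactions, min_count=3)
maximal = maximal_itemsets(frequent)
```

On this toy input, six frequent itemsets condense to three maximal ones. The maximal set still answers whether an itemset is frequent, but exact supports (for example, that {a} occurs in four transactions) cannot be recovered from it alone, which is the sense in which it stores "most" rather than all of the information.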


Transaction Databases, Frequent Itemsets, and Their Condensed Representations

Lecture Notes in Computer Science - Knowledge Discovery in Inductive Databases ◽

10.1007/11733492_9 ◽

2006 ◽

pp. 139-164 ◽

Cited By ~ 3

Author(s):

Taneli Mielikäinen

Keyword(s):

Frequent Itemsets ◽

Condensed Representations


An Efficient Approach of Extracting Frequent Itemsets from Large Data Using HDFS Framework

International Journal on Communications Antenna and Propagation (IRECAP) ◽

10.15866/irecap.v7i6.13354 ◽

2017 ◽

Vol 7 (6) ◽

pp. 529

Author(s):

Prajakta G. Kulkarni ◽

S. R. Khonde

Keyword(s):

Large Data ◽

Frequent Itemsets ◽

Efficient Approach


Predicting Heart-Diseases from Medical Dataset Through Frequent Itemsets Using Improved Algorithm

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i8.325331 ◽

2018 ◽

Vol 6 (8) ◽

pp. 325-331

Author(s):

V. Vijayalakshmi

Keyword(s):

Heart Diseases ◽

Frequent Itemsets ◽

Medical Dataset ◽

Improved Algorithm


Frequent itemsets grouping algorithm based on Hash list

Journal of Computer Applications ◽

10.3724/sp.j.1087.2013.03045 ◽

2013 ◽

Vol 33 (11) ◽

pp. 3045-3048

Author(s):

Hongmei WANG ◽

Ming HU

Keyword(s):

Frequent Itemsets ◽

Grouping Algorithm


A Synopsis Based Approach for Itemset Frequency Estimation over Massive Multi-Transaction Stream

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3465238 ◽

2021 ◽

Vol 16 (2) ◽

pp. 1-30

Author(s):

Guangtao Wang ◽

Gao Cong ◽

Ying Zhang ◽

Zhen Hai ◽

Jieping Ye

Keyword(s):

Frequency Estimation ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Experimental Results ◽

Closure Property ◽

Frequent Itemset Mining ◽

Itemset Mining ◽

Minimum Value ◽

Downward Closure ◽

Bounded Size

Streams in which multiple transactions are associated with the same key are prevalent in practice; e.g., a customer has multiple shopping records arriving at different times. Itemset frequency estimation on such streams is very challenging, since sampling-based methods, such as the widely used reservoir sampling, cannot be used. In this article, we propose a novel k-Minimum Value (KMV) synopsis-based method to estimate the frequency of itemsets over multi-transaction streams. First, we extract a KMV synopsis for each item from the stream. Then, we propose a novel estimator to estimate the frequency of an itemset over the KMV synopses. Compared to the existing estimator, our method is not only more accurate and more efficient to calculate but also follows the downward-closure property. These properties enable the incorporation of our new estimator into existing frequent itemset mining (FIM) algorithms (e.g., FP-Growth) to mine frequent itemsets over multi-transaction streams. To demonstrate this, we implement a KMV synopsis-based FIM algorithm by integrating our estimator into existing FIM algorithms, and we prove that it is capable of guaranteeing the accuracy of FIM with a bounded size of KMV synopsis. Experimental results on massive streams show that our estimator can significantly improve the accuracy of both itemset frequency estimation and FIM compared to the existing estimators.
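As a rough illustration of the KMV idea, the Python sketch below builds one synopsis per item over the keys that carry that item and then estimates an itemset's frequency with the classic KMV intersection estimate. This is not the new estimator proposed in the article, whose details the abstract does not give. The helper names, the SHA-256 based hash, and the reading that an itemset counts for a key when each of its items appears in some transaction under that key are all assumptions made for this example.

```python
import hashlib
import heapq
from collections import defaultdict


def unit_hash(key):
    """Map a key to a pseudo-uniform float in [0, 1) via SHA-256."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_synopsis(keys, k):
    """Sorted list of the k smallest distinct hash values of the keys."""
    return heapq.nsmallest(k, {unit_hash(key) for key in keys})


def estimate_distinct(synopsis, k):
    """Basic KMV estimate (k - 1) / (k-th smallest hash); exact if under-full."""
    if len(synopsis) < k:
        return float(len(synopsis))
    return (k - 1) / synopsis[-1]


def estimate_intersection(synopses, k):
    """Classic KMV intersection estimate for synopses built with the same k."""
    merged = sorted(set().union(*synopses))[:k]
    as_sets = [set(s) for s in synopses]
    shared = sum(1 for v in merged if all(v in s for s in as_sets))
    if len(merged) < k:
        return float(shared)  # every hash is retained, so the count is exact
    return (shared / k) * (k - 1) / merged[-1]


def item_synopses(stream, k):
    """One synopsis per item over the keys whose transactions contain it.

    For brevity this collects all keys first; a streaming version would
    update each synopsis incrementally instead.
    """
    keys_by_item = defaultdict(set)
    for key, transaction in stream:
        for item in transaction:
            keys_by_item[item].add(key)
    return {item: kmv_synopsis(keys, k) for item, keys in keys_by_item.items()}


# Hypothetical stream of (key, transaction) pairs; key "u1" appears twice.
stream = [("u1", {"a", "b"}), ("u2", {"a"}), ("u1", {"c"}), ("u3", {"a", "b"})]
synopses = item_synopses(stream, k=8)
freq_ab = estimate_intersection([synopses["a"], synopses["b"]], k=8)
```

With so few keys every synopsis is under-full, so the estimate for {a, b} is the exact count of keys carrying both items. On larger streams the synopsis size k trades memory for accuracy.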


A Mining Frequent Itemsets Algorithm in Stream Data Based on Sliding Time Decay Window

Proceedings of the 2020 3rd International Conference on Artificial Intelligence and Pattern Recognition ◽

10.1145/3430199.3430226 ◽

2020 ◽

Author(s):

Xin Lu ◽

Shaonan Jin ◽

Xun Wang ◽

Jiao Yuan ◽

Kun Fu ◽

...

Keyword(s):

Frequent Itemsets ◽

Time Decay ◽

Stream Data ◽

Mining Frequent Itemsets


An Efficient Approach for Interactive Mining of Frequent Itemsets

Advances in Web-Age Information Management - Lecture Notes in Computer Science ◽

10.1007/11563952_13 ◽

2005 ◽

pp. 138-149 ◽

Cited By ~ 1

Author(s):

Zhi-Hong Deng ◽

Xin Li ◽

Shi-Wei Tang

Keyword(s):

Frequent Itemsets ◽

Efficient Approach ◽

Interactive Mining


Frequent Itemsets Mining of SCADA Data Based on FP-Growth Algorithm

2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2) ◽

10.1109/ei250167.2020.9346885 ◽

2020 ◽

Author(s):

Xiaolei Ma ◽

Yongguang Li ◽

Ran Liu ◽

Yanjun Zhang ◽

Liya Ma ◽

...

Keyword(s):

Frequent Itemsets ◽

Frequent Itemsets Mining


Towards Faster Mining of Disjunction-Based Concise Representations of Frequent Patterns

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213014500018 ◽

2014 ◽

Vol 23 (02) ◽

pp. 1450001

Author(s):

T. Hamrouni ◽

S. Ben Yahia ◽

E. Mephu Nguifo

Keyword(s):

Empirical Study ◽

Real Life ◽

Search Space ◽

Frequent Patterns ◽

Memory Consumption ◽

Efficient Tool ◽

Condensed Representation ◽

Benchmark Datasets ◽

Condensed Representations ◽

Amount Of Knowledge

In many real-life datasets, the number of extracted frequent patterns has been shown to be huge, hampering the effective exploitation of such an amount of knowledge by human experts. To overcome this limitation, exact condensed representations were introduced to offer a small set of elements from which all frequent patterns can be faithfully retrieved. In this paper, we introduce a new exact condensed representation based only on particular elements of the disjunctive search space. In this space, a pattern is characterized by its disjunctive support, i.e., the frequency of complementary occurrences of its items (instead of the ubiquitous co-occurrence link). For several benchmark datasets, this representation has been shown to be interesting in terms of compactness compared to the pioneering approaches in the literature. In this respect, we mainly focus here on proposing an efficient tool for mining this representation. For this purpose, we introduce an algorithm, called DSSRM, dedicated to this task. We also propose several techniques to optimize its mining time as well as its memory consumption. The empirical study carried out on benchmark datasets shows that DSSRM is faster by several orders of magnitude than the MEP algorithm.
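The link between disjunctive and ordinary (conjunctive) support can be illustrated with a short Python sketch. It is not DSSRM and does not build the condensed representation. It only shows, under the assumption that disjunctive support counts the transactions containing at least one item of the pattern, how a conjunctive support follows from disjunctive supports by inclusion-exclusion. The function names and toy transactions are hypothetical.

```python
from itertools import combinations


def disjunctive_support(pattern, transactions):
    """Number of transactions containing at least one item of the pattern."""
    return sum(1 for t in transactions if pattern & t)


def conjunctive_support(pattern, transactions):
    """Number of transactions containing every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def conjunctive_from_disjunctive(pattern, transactions):
    """Inclusion-exclusion over the disjunctive supports of non-empty subsets."""
    items = sorted(pattern)
    total = 0
    for size in range(1, len(items) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(items, size):
            total += sign * disjunctive_support(frozenset(subset), transactions)
    return total


# Hypothetical toy data, chosen only for illustration.
transactions = [
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
]
pattern = frozenset({"a", "b"})
derived = conjunctive_from_disjunctive(pattern, transactions)
direct = conjunctive_support(pattern, transactions)
```

For the pattern {a, b}, the disjunctive supports are 4 for {a}, 4 for {b}, and 5 for {a, b}, so inclusion-exclusion gives 4 + 4 - 5 = 3, matching the direct count.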
