condensed representations

2020 ◽  
Vol 94 ◽  
pp. 103830
Author(s):  
Angelo Impedovo ◽  
Corrado Loglisci ◽  
Michelangelo Ceci ◽  
Donato Malerba

2019 ◽  
Vol 19 (04) ◽  
pp. 505-535
Author(s):  
Sergey Paramonov ◽  
Daria Stepanova ◽  
Pauli Miettinen

Detecting small sets of relevant patterns from a given data set is a central challenge in data mining. The relevance of a pattern is based on user-provided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like answer set programming (ASP) seem well suited for specifying such criteria in the form of constraints. Although progress has been made, on the one hand, on solving individual mining problems and, on the other hand, on developing generic mining systems, the existing methods focus either on scalability or on generality. In this paper, we make steps toward combining local (frequency, size, and cost) and global (various condensed representations like maximal, closed, and skyline) constraints in a generic and efficient way. We present a hybrid approach for itemset, sequence, and graph mining which exploits dedicated highly optimized mining systems to detect frequent patterns and then filters the results using declarative ASP. To further demonstrate the generic nature of our hybrid framework, we apply it to the problem of approximately tiling a database. Experiments on real-world data sets show the effectiveness of the proposed method and computational gains for itemset, sequence, and graph mining, as well as approximate tiling. Under consideration in Theory and Practice of Logic Programming.
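
To make the "mine first, then filter" idea concrete, here is a minimal sketch of the filtering step for two of the global constraints named above (closed and maximal). The paper performs this step declaratively in ASP; the Python below is only an illustrative stand-in, and the function name `closed_and_maximal` and the toy data are assumptions, not taken from the paper.

```python
def closed_and_maximal(frequent):
    """Filter a complete set of frequent itemsets.

    frequent: dict mapping frozenset(items) -> support count.
    It is assumed to hold ALL frequent itemsets, as produced by a
    dedicated miner (hypothetical input, not the paper's format).
    """
    closed, maximal = set(), set()
    for itemset, support in frequent.items():
        # Proper supersets that are themselves frequent.
        supersets = [other for other in frequent if itemset < other]
        # Maximal: no frequent proper superset at all.
        if not supersets:
            maximal.add(itemset)
        # Closed: every proper superset has strictly lower support.
        # A superset with equal support would also be frequent, so
        # checking frequent supersets is sufficient.
        if all(frequent[other] < support for other in supersets):
            closed.add(itemset)
    return closed, maximal


# Toy data (made up): frequent itemsets of the transactions
# {a,b,c}, {a,b}, {a,c}, {a} at minimum support 2.
toy_frequent = {
    frozenset("a"): 4,
    frozenset("b"): 2,
    frozenset("c"): 2,
    frozenset("ab"): 2,
    frozenset("ac"): 2,
}
closed, maximal = closed_and_maximal(toy_frequent)
```

On this toy input, `{a}`, `{a,b}`, and `{a,c}` are closed, and only `{a,b}` and `{a,c}` are maximal, which illustrates that every maximal itemset is closed but not the reverse.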


2017 ◽  
Vol 244 ◽  
pp. 48-69 ◽  
Author(s):  
Willy Ugarte ◽  
Patrice Boizumault ◽  
Bruno Crémilleux ◽  
Alban Lepailleur ◽  
Samir Loudni ◽  
...  

2014 ◽  
Vol 23 (02) ◽  
pp. 1450001
Author(s):  
T. Hamrouni ◽  
S. Ben Yahia ◽  
E. Mephu Nguifo

In many real-life datasets, the number of extracted frequent patterns has been shown to be huge, which hampers the effective exploitation of this knowledge by human experts. To overcome this limitation, exact condensed representations were introduced to offer a small set of elements from which all frequent patterns can be faithfully retrieved. In this paper, we introduce a new exact condensed representation based only on particular elements of the disjunctive search space. In this space, a pattern is characterized by its disjunctive support, i.e., the frequency of complementary occurrences of its items, instead of the ubiquitous co-occurrence link. For several benchmark datasets, this representation has been shown to be more compact than the pioneering approaches in the literature. We therefore focus here on proposing an efficient tool for mining this representation. For this purpose, we introduce a dedicated algorithm called DSSRM. We also propose several techniques to optimize its mining time as well as its memory consumption. An empirical study on benchmark datasets shows that DSSRM is faster than the MEP algorithm by several orders of magnitude.
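
As a rough illustration of why disjunctive supports can serve as an exact representation, the sketch below assumes the usual reading of disjunctive support (the number of transactions containing at least one item of the pattern) and recovers the ordinary co-occurrence support from it by inclusion-exclusion. This is not the DSSRM algorithm; the function names and the toy transactions are hypothetical.

```python
from itertools import combinations


def disjunctive_support(pattern, transactions):
    """Count transactions containing AT LEAST ONE item of the pattern."""
    return sum(1 for t in transactions if pattern & t)


def conjunctive_support(pattern, transactions):
    """Count transactions containing ALL items of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def conjunctive_from_disjunctive(pattern, transactions):
    """Recover the conjunctive support from disjunctive supports only.

    Uses the inclusion-exclusion identity: the size of an intersection
    equals the alternating sum, over non-empty sub-patterns, of the
    sizes of the corresponding unions.
    """
    total = 0
    for size in range(1, len(pattern) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(sorted(pattern), size):
            total += sign * disjunctive_support(frozenset(subset), transactions)
    return total


# Toy transactions (made up for illustration).
transactions = [
    frozenset("ab"),
    frozenset("a"),
    frozenset("bc"),
    frozenset("c"),
]
```

For the pattern `{a, b}` on this toy data, the disjunctive support is 3 and the recovered conjunctive support is 2 + 2 - 3 = 1, matching a direct count.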


Author(s):  
Alain Casali ◽  
Sébastien Nedjar ◽  
Rosine Cicchetti ◽  
Lotfi Lakhal

In multidimensional database mining, constrained multidimensional patterns differ from the well-known frequent patterns from both conceptual and logical points of view because of a common structure and the ability to support various types of constraints. Classical data mining techniques are based on the power set lattice of binary attribute values and, even when adapted, are not suitable for discovering constrained multidimensional patterns. In this chapter, the authors propose a foundation for various multidimensional database mining problems by introducing a new algebraic structure called the cube lattice, which characterizes the search space to be explored. This chapter takes into consideration monotone and/or anti-monotone constraints enforced when mining multidimensional patterns. The authors propose condensed representations of the constrained cube lattice, which is a convex space, and present a generalized levelwise algorithm for computing them. Additionally, the authors consider the formalization of existing data cubes and the discovery of frequent multidimensional patterns, while introducing a perfect concise representation from which any solution can be derived together with its conjunction, disjunction, and negation frequencies. Finally, the advantages of the cube lattice over the power set lattice of binary attributes in multidimensional database mining are emphasized.


2011 ◽  
Vol 135-136 ◽  
pp. 21-25
Author(s):  
Hai Feng Li ◽  
Ning Zhang

Maximal frequent itemsets are one of several condensed representations of frequent itemsets. They retain most of the information contained in the frequent itemsets while using less space, which makes them well suited to stream mining. This paper focuses on approximately mining maximal frequent itemsets over a data stream under the landmark model. A false-negative method based on the Chernoff bound is proposed to reduce computation and memory costs. Our experimental results on a real-world dataset show that our algorithm is effective and efficient.
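
To give a feel for how a Chernoff bound can control the error of an approximate stream miner, the sketch below uses the textbook two-sided multiplicative form, P(|X - mu| >= d*mu) <= 2*exp(-d^2 * mu / 3) for 0 < d <= 1, and solves it for the support error. The exact bound and parameters used in the paper are not given in the abstract, so the function name `chernoff_epsilon` and its arguments are assumptions.

```python
import math


def chernoff_epsilon(support, n_transactions, failure_prob):
    """Support error guaranteed with probability >= 1 - failure_prob.

    Treats each of the n transactions as an independent trial that
    contains the itemset with probability `support`, so the expected
    count is mu = n * support. Setting 2*exp(-d^2 * mu / 3) equal to
    failure_prob and converting the relative error d into an absolute
    support error (epsilon = d * support) gives the formula below.
    The bound is only valid while epsilon <= support (that is, d <= 1).
    """
    if not 0 < support <= 1:
        raise ValueError("support must be in (0, 1]")
    if n_transactions <= 0:
        raise ValueError("n_transactions must be positive")
    if not 0 < failure_prob < 1:
        raise ValueError("failure_prob must be in (0, 1)")
    return math.sqrt(3.0 * support * math.log(2.0 / failure_prob) / n_transactions)


# Hypothetical settings: 10% minimum support, 1% failure probability.
eps_small_stream = chernoff_epsilon(0.1, 10_000, 0.01)
eps_large_stream = chernoff_epsilon(0.1, 40_000, 0.01)
```

Because the error shrinks with the square root of the number of transactions seen, quadrupling the stream length halves the error, which is why such bounds let a miner keep fewer candidates as the stream grows.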


Author(s):  
Jean-Francois Boulicaut

Condensed representations were proposed by Mannila and Toivonen (1996) as a useful concept for the optimization of typical data-mining tasks. They are a key concept within the inductive database framework (Boulicaut et al., 1999; de Raedt, 2002; Imielinski & Mannila, 1996). This article introduces this research domain, its achievements in the context of frequent itemset mining (FIM) from transactional data, and its future trends.


2010 ◽  
Vol 6 (3) ◽  
pp. 43-72 ◽  
Author(s):  
Alain Casali ◽  
Sébastien Nedjar ◽  
Rosine Cicchetti ◽  
Lotfi Lakhal

In multidimensional database mining, constrained multidimensional patterns differ from the well-known frequent patterns from both conceptual and logical points of view because of a common structure and the ability to support various types of constraints. Classical data mining techniques are based on the power set lattice of binary attribute values and, even when adapted, are not suitable for discovering constrained multidimensional patterns. In this paper, the authors propose a foundation for various multidimensional database mining problems by introducing a new algebraic structure called the cube lattice, which characterizes the search space to be explored. This paper takes into consideration monotone and/or anti-monotone constraints enforced when mining multidimensional patterns. The authors propose condensed representations of the constrained cube lattice, which is a convex space, and present a generalized levelwise algorithm for computing them. Additionally, the authors consider the formalization of existing data cubes and the discovery of frequent multidimensional patterns, while introducing a perfect concise representation from which any solution can be derived together with its conjunction, disjunction, and negation frequencies. Finally, the advantages of the cube lattice over the power set lattice of binary attributes in multidimensional database mining are emphasized.


Author(s):  
Nicolas Pasquier

After more than a decade of research on association rule mining, efficient and scalable techniques for discovering relevant association rules from large high-dimensional datasets are now available. Most initial studies focused on the development of theoretical frameworks and of efficient algorithms and data structures for association rule mining. However, many applications of association rules to data from different domains have shown that techniques for filtering irrelevant and useless association rules are required to simplify their interpretation by the end user. Solutions proposed to address this problem fall into four main trends: constraint-based mining, interestingness measures, association rule structure analysis, and condensed representations. This chapter focuses on condensed representations that are characterized in the frequent closed itemset framework, and exposes their advantages and drawbacks.

