scholarly journals An Incremental High-Utility Mining Algorithm with Transaction Insertion

2015 ◽  
Vol 2015 ◽  
pp. 1-15 ◽  
Author(s):  
Jerry Chun-Wei Lin ◽  
Wensheng Gan ◽  
Tzung-Pei Hong ◽  
Binbin Zhang

Association-rule mining is commonly used to discover useful and meaningful patterns from a very large database. It only considers the occurrence frequencies of items to reveal the relationships among itemsets. Traditional association-rule mining is, however, not suitable in real-world applications since the purchased items from a customer may have various factors, such as profit or quantity. High-utility mining was designed to solve the limitations of association-rule mining by considering both the quantity and profit measures. Most algorithms of high-utility mining are designed to handle the static database. Fewer researches handle the dynamic high-utility mining with transaction insertion, thus requiring the computations of database rescan and combination explosion of pattern-growth mechanism. In this paper, an efficient incremental algorithm with transaction insertion is designed to reduce computations without candidate generation based on the utility-list structures. The enumeration tree and the relationships between 2-itemsets are also adopted in the proposed algorithm to speed up the computations. Several experiments are conducted to show the performance of the proposed algorithm in terms of runtime, memory consumption, and number of generated patterns.

2019 ◽  
Vol 9 (24) ◽  
pp. 5398 ◽  
Author(s):  
Iyad Aqra ◽  
Norjihan Abdul Ghani ◽  
Carsten Maple ◽  
José Machado ◽  
Nader Sohrabi Safa

Data mining is essentially applied to discover new knowledge from a database through an iterative process. The mining process may be time consuming for massive datasets. A widely used method related to knowledge discovery domain refers to association rule mining (ARM) approach, despite its shortcomings in mining large databases. As such, several approaches have been prescribed to unravel knowledge. Most of the proposed algorithms addressed data incremental issues, especially when a hefty amount of data are added to the database after the latest mining process. Three basic manipulation operations performed in a database include add, delete, and update. Any method devised in light of data incremental issues is bound to embed these three operations. The changing threshold is a long-standing problem within the data mining field. Since decision making refers to an active process, the threshold is indeed changeable. Accordingly, the present study proposes an algorithm that resolves the issue of rescanning a database that had been mined previously and allows retrieval of knowledge that satisfies several thresholds without the need to learn the process from scratch. The proposed approach displayed high accuracy in experimentation, as well as reduction in processing time by almost two-thirds of the original mining execution time.


Author(s):  
K Rajendra Prasad

Association rule mining is intently used for determining the frequent itemsets of transactional database; however, it is needed to consider the utility of itemsets in market behavioral applications. Apriori or FP-growth methods generate the association rules without utility factor of items. High-utility itemset mining (HUIM) is a well-known method that effectively determines the itemsets based on high-utility value and the resulting itemsets are known as high-utility itemsets. Fastest high-utility mining method (FHM) is an enhanced version of HUIM. FHM reduces the number of join operations during itemsets generation, so it is faster than HUIM. For large datasets, both methods are very expenisve. Proposed method addressed this issue by building pruning based utility co-occurrence structure (PEUCS) for elimatination of low-profit itemsets, thus, obviously it process only optimal number of high-utility itemsets, so it is called as optimal FHM (OFHM). Experimental results show that OFHM takes less computational runtime, therefore it is more efficient when compared to other existing methods for benchmarked large datasets.


2021 ◽  
Vol 12 (3) ◽  
pp. 58-77
Author(s):  
Sivamathi Abarajithan ◽  
S. Vijayarani Mohan

Association rule mining is an important and widely used data mining technique. It is used to retrieve highly related objects in a database based on the occurrence. Recently, utility-based association rules were proposed to consider significant factors of the object. The main objective of this research work is to retrieve high utility association rules from a database using cockroach swarm optimization algorithm. So far, in the literature, no optimization algorithm was proposed in utility-based association rule mining. In this research work, CSOUAR (cockroach swarm optimization for high utility association rule mining) algorithm was proposed to generate utility association rules. CSOUAR algorithm is based on three behaviours of cockroach: chase-swarming, dispersing, and ruthless. To analyse the performance of CSOUAR, an improved particle swarm optimization (PSO-UAR), animal migration optimization (AMO-UAR), bees swarm optimisation (BSO-UAR), and penguins search optimisation (peSO-UAR) are also proposed in this work.


2015 ◽  
Vol 6 (2) ◽  
Author(s):  
Rizal Setya Perdana ◽  
Umi Laili Yuhana

Kualitas perangkat lunak merupakan salah satu penelitian pada bidangrekayasa perangkat lunak yang memiliki peranan yang cukup besar dalamterbangunnya sistem perangkat lunak yang berkualitas baik. Prediksi defectperangkat lunak yang disebabkan karena terdapat penyimpangan dari prosesspesifikasi atau sesuatu yang mungkin menyebabkan kegagalan dalam operasionaltelah lebih dari 30 tahun menjadi topik riset penelitian. Makalah ini akandifokuskan pada prediksi defect yang terjadi pada kode program (code defect).Metode penanganan permasalahan defect pada kode program akan memanfaatkanpola-pola kode perangkat lunak yang berpotensi menimbulkan defect pada data setNASA untuk memprediksi defect. Metode yang digunakan dalam pencarian polaadalah memanfaatkan Association Rule Mining dengan Cumulative SupportThresholds yang secara otomatis menghasilkan nilai support dan nilai confidencepaling optimal tanpa membutuhkan masukan dari pengguna. Hasil pengujian darihasil pemrediksian defect kode perangkat lunak secara otomatis memiliki nilaiakurasi 82,35%.


Sign in / Sign up

Export Citation Format

Share Document