An Empirical Seed Initialization Idea for K-Means Algorithm Inspired by CLIQUE Algorithm

Scalable mining of maximal quasi-cliques

Proceedings of the VLDB Endowment ◽

10.14778/3436905.3436916 ◽

2020 ◽

Vol 14 (4) ◽

pp. 573-585

Author(s):

Guimu Guo ◽

Da Yan ◽

M. Tamer Özsu ◽

Zhe Jiang ◽

Jalal Khalil

Keyword(s):

Graph Mining ◽

State Of The Art ◽

Minimum Degree ◽

The Other ◽

Dense Subgraph ◽

Load Imbalance ◽

Subgraph Mining ◽

Execution Engine ◽

Clique Algorithm ◽

Dense Subgraph Mining

Given a user-specified minimum degree threshold γ, a γ-quasi-clique is a subgraph g = (V_g, E_g) where each vertex v ∈ V_g connects to at least a γ fraction of the other vertices (i.e., ⌈γ · (|V_g| − 1)⌉ vertices) in g. Quasi-clique is one of the most natural definitions for dense structures useful in finding communities in social networks and discovering significant biomolecule structures and pathways. However, mining maximal quasi-cliques is notoriously expensive. In this paper, we design parallel algorithms for mining maximal quasi-cliques on G-thinker, a distributed graph mining framework that decomposes mining into compute-intensive tasks to fully utilize CPU cores. We found that directly using G-thinker results in the straggler problem due to (i) the drastic load imbalance among different tasks and (ii) the difficulty of predicting the task running time. We address these challenges by redesigning G-thinker's execution engine to prioritize long-running tasks for execution, and by utilizing a novel timeout strategy to effectively decompose long-running tasks to improve load balancing. While this system redesign applies to many other expensive dense subgraph mining problems, this paper verifies the idea by adapting the state-of-the-art quasi-clique algorithm, Quick, to our redesigned G-thinker. Extensive experiments verify that our new solution scales well with the number of CPU cores, achieving 201× runtime speedup when mining a graph with 3.77M vertices and 16.5M edges in a 16-node cluster.
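The degree condition in this definition can be checked directly. The following is a minimal sketch of that check only, not the paper's Quick algorithm or anything from G-thinker; the function name `is_quasi_clique` and the adjacency-dict representation are assumptions made for illustration.

```python
from math import ceil


def is_quasi_clique(adj, vertices, gamma):
    """Check the gamma-quasi-clique degree condition on an induced subgraph.

    adj      -- hypothetical representation: dict mapping vertex -> set of neighbours
    vertices -- the candidate vertex set V_g
    gamma    -- minimum degree threshold, 0 < gamma <= 1
    """
    vs = set(vertices)
    if not vs:
        return False
    # Each vertex must reach at least ceil(gamma * (|V_g| - 1)) other vertices.
    # The small epsilon guards against float results such as 7.000000000000001.
    need = ceil(gamma * (len(vs) - 1) - 1e-9)
    return all(len(adj.get(v, set()) & (vs - {v})) >= need for v in vs)


# A 4-cycle: every vertex is adjacent to 2 of the 3 other vertices.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_quasi_clique(cycle, [0, 1, 2, 3], 0.6))  # needs ceil(1.8) = 2 neighbours
print(is_quasi_clique(cycle, [0, 1, 2, 3], 0.9))  # needs ceil(2.7) = 3 neighbours
```

This only verifies a given vertex set; enumerating all maximal quasi-cliques is the expensive part that the abstract addresses.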


Reversed Search Maximum Clique Algorithm Based on Recoloring

Advances in Intelligent Systems and Computing - Optimization of Complex Systems: Theory, Models, Algorithms and Applications ◽

10.1007/978-3-030-21803-4_46 ◽

2019 ◽

pp. 458-467

Author(s):

Deniss Kumlander ◽

Aleksandr Porošin

Keyword(s):

Maximum Clique ◽

Clique Algorithm


A finding maximal clique algorithm for predicting loop of protein structure

Applied Mathematics and Computation ◽

10.1016/j.amc.2006.01.009 ◽

2006 ◽

Vol 180 (2) ◽

pp. 676-682 ◽

Cited By ~ 2

Author(s):

Xiaohong Shi ◽

LuoLiang ◽

Yan Wan ◽

Jin Xu

Keyword(s):

Protein Structure ◽

Maximal Clique ◽

Clique Algorithm


An Improved Analysis for a Greedy Remote-Clique Algorithm Using Factor-Revealing LPs

Algorithmica ◽

10.1007/s00453-007-9142-2 ◽

2007 ◽

Vol 55 (1) ◽

pp. 42-59 ◽

Cited By ~ 14

Author(s):

Benjamin Birnbaum ◽

Kenneth J. Goldman

Keyword(s):

Clique Algorithm


K-Anonymity Algorithm Based on CLIQUE for Green Manufacturing

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.312.714 ◽

2013 ◽

Vol 312 ◽

pp. 714-718

Author(s):

Zi Qi Zhao ◽

Xiao Jun Ye ◽

Chun Ping Li

Keyword(s):

Data Processing ◽

Processing Speed ◽

Clustering Analysis ◽

Time Complexity ◽

Clustering Algorithm ◽

Green Manufacturing ◽

Multidimensional Data ◽

Clustering Method ◽

Analysis Algorithm ◽

Clique Algorithm

Cell-based multidimensional clustering methods, of which CLIQUE is the main representative, offer fast processing and good time efficiency. Building on the time-efficient CLIQUE clustering algorithm, a multidimensional k-anonymity algorithm, KLIQUE, can be obtained. Because KLIQUE is based on CLIQUE, it retains CLIQUE's time-complexity characteristics and keeps CLIQUE's advantage in processing large volumes of multidimensional data.
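For readers unfamiliar with the cell-based idea behind CLIQUE, the sketch below illustrates its first bottom-up steps: partition each dimension into equal-width intervals, keep the 1-D units whose share of points exceeds a density threshold, and build 2-D candidates only from dense 1-D units. It is a simplified illustration, not KLIQUE and not a full CLIQUE implementation; the function name `dense_units`, the parameter names `xi` and `tau`, and the assumption that data lie in [lo, hi] on every dimension are choices made here.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, xi, tau, lo=0.0, hi=1.0):
    """Return dense 1-D and 2-D grid units (simplified CLIQUE-style sketch).

    points -- list of equal-length tuples, assumed to lie in [lo, hi] per dimension
    xi     -- number of equal-width intervals per dimension
    tau    -- density threshold as a fraction of all points
    """
    n = len(points)
    dims = range(len(points[0]))
    width = (hi - lo) / xi

    def cell(x):
        # Clamp so that x == hi falls in the last interval.
        return min(int((x - lo) / width), xi - 1)

    # Count points in every 1-D unit, identified as (dimension, interval index).
    one_d = Counter((d, cell(p[d])) for p in points for d in dims)
    dense1 = {u for u, c in one_d.items() if c / n > tau}

    # A 2-D unit can only be dense if both of its 1-D projections are dense,
    # so candidates are formed from dense 1-D units only.
    two_d = Counter()
    for p in points:
        units = [(d, cell(p[d])) for d in dims if (d, cell(p[d])) in dense1]
        for a, b in combinations(units, 2):
            two_d[(a, b)] += 1
    dense2 = {u for u, c in two_d.items() if c / n > tau}
    return dense1, dense2


pts = [(0.10, 0.10), (0.15, 0.12), (0.12, 0.18), (0.90, 0.50)]
d1, d2 = dense_units(pts, xi=5, tau=0.5)
print(sorted(d1))
print(sorted(d2))
```

A full implementation would continue to higher-dimensional subspaces and then merge adjacent dense units into clusters; those steps are omitted here.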


A clique algorithm for standard quadratic programming

Discrete Applied Mathematics ◽

10.1016/j.dam.2007.09.020 ◽

2008 ◽

Vol 156 (13) ◽

pp. 2439-2448 ◽

Cited By ~ 24

Author(s):

Andrea Scozzari ◽

Fabio Tardella

Keyword(s):

Quadratic Programming ◽

Clique Algorithm


Exact Maximum Clique Algorithm for Different Graph Types Using Machine Learning

Mathematics ◽

10.3390/math10010097 ◽

2021 ◽

Vol 10 (1) ◽

pp. 97

Author(s):

Kristjan Reba ◽

Matej Guid ◽

Kati Rozman ◽

Dušanka Janežič ◽

Janez Konc

Keyword(s):

Machine Learning ◽

Maximum Clique ◽

Dynamic Algorithm ◽

Graph Theoretic ◽

Research Areas ◽

Novel Approach ◽

Search Speed ◽

Speed Up ◽

Clique Algorithm ◽

And Function

Finding a maximum clique is important in research areas such as computational chemistry, social network analysis, and bioinformatics. It is possible to compare the maximum clique size between protein graphs to determine their similarity and function. In this paper, improvements based on machine learning (ML) are added to a dynamic algorithm for finding the maximum clique in a protein graph, Maximum Clique Dynamic (MaxCliqueDyn; short: MCQD). This algorithm was published in 2007 and has been widely used in bioinformatics since then. It uses an empirically determined parameter, Tlimit, that determines the algorithm’s flow. We have extended the MCQD algorithm with an initial phase of a machine learning-based prediction of the Tlimit parameter that is best suited for each input graph. Such adaptability to graph types based on state-of-the-art machine learning is a novel approach that has not been used in most graph-theoretic algorithms. We show empirically that the resulting new algorithm MCQD-ML improves search speed on certain types of graphs, in particular molecular docking graphs used in drug design where they determine energetically favorable conformations of small molecules in a protein binding site. In such cases, the speed-up is twofold.
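To make the underlying problem concrete, the sketch below finds a maximum clique in a small graph with a plain Bron-Kerbosch search with pivoting plus a size bound. It is not MCQD or MCQD-ML, has no Tlimit parameter, and uses none of the colouring-based bounds that such algorithms rely on; the function name `max_clique` and the adjacency-dict input are assumptions for illustration.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph as a list of vertices.

    adj -- hypothetical representation: dict mapping vertex -> set of neighbours
           (symmetric, no self-loops)
    """
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest seen so far.
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # even taking every candidate cannot beat the current best
        # Pivot on the vertex with most neighbours among the candidates.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(max_clique(g)))
```

Comparing `len(max_clique(...))` across graphs gives the kind of size comparison the abstract mentions, though practical protein-graph work would need a far more heavily optimised solver.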


The Improvement of the CLIQUE Algorithm Based on High Dimensional Data Cleansing

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.452-453.381 ◽

2012 ◽

Vol 452-453 ◽

pp. 381-385

Author(s):

Shao Peng Sun ◽

Kai Hu Hou ◽

Li Hua Chen

Keyword(s):

Data Warehouse ◽

High Dimensional Data ◽

High Dimensional ◽

Incremental Algorithms ◽

Data Cleansing ◽

Pruning Algorithm ◽

Testing Data ◽

Clique Algorithm ◽

Abnormal Points ◽

Low Dimensional

Many current data cleansing algorithms are designed for low-dimensional data and cannot meet the accuracy requirements of enterprise data warehouses that process high-dimensional data. This paper adopts the CLIQUE algorithm to process high-dimensional data. To address the algorithm's insufficient processing precision, the meshing and pruning steps are improved using dynamic incremental algorithms. Results on test data show that the improved algorithm increases the accuracy of the clustering result and can effectively identify similar clusters and abnormal points, which supports high-dimensional data cleansing.


Data Mining for Social Network Analysis Using a CLIQUE Algorithm

Advances in Social Networking and Online Communities - Cognitive Social Mining Applications in Data Analytics and Forensics ◽

10.4018/978-1-5225-7522-1.ch009 ◽

2019 ◽

pp. 160-187

Author(s):

Phu Ngoc Vo ◽

Tran Vo Thi Ngoc

Keyword(s):

Data Mining ◽

Social Networks ◽

Social Network ◽

Social Network Analysis ◽

Network Analysis ◽

The Social ◽

The World ◽

Commercial Applications ◽

Clique Algorithm

Many areas of computer science have been developed over the years. Data mining is one field in which many algorithms, methods, and models have been built and applied successfully in commercial applications and research. Social networks have also seen heavy investment and rapid development in recent years, because they are used by very many people worldwide and have been applied successfully in many business fields. As a result, many techniques for social networks have been proposed, and social network analysis is now crucial. To support this process, this book chapter presents basic concepts of data mining and social networking, and introduces a novel data mining model for social network analysis using a CLIQUE algorithm.


On Importance of a Special Sorting in the Maximum-Weight Clique Algorithm Based on Colour Classes

Communications in Computer and Information Science - Modelling, Computation and Optimization in Information Systems and Management Sciences ◽

10.1007/978-3-540-87477-5_18 ◽

2008 ◽

pp. 165-174 ◽

Cited By ~ 3

Author(s):

Deniss Kumlander

Keyword(s):

Maximum Weight ◽

Clique Algorithm ◽

Colour Classes ◽

Maximum Weight Clique
