Detection of Complexes in Biological Networks Through Diversified Dense Subgraph Mining

Xiuli Ma; Guangyu Zhou; Jingbo Shang; Jingjing Wang; Jian Peng; Jiawei Han

doi:10.1089/cmb.2017.0037

Journal of Computational Biology, 2017, Vol 24 (9), pp. 923-941

doi:10.1089/cmb.2017.0037

Cited By ~ 4

Author(s): Xiuli Ma; Guangyu Zhou; Jingbo Shang; Jingjing Wang; Jian Peng; ...

Keyword(s): Biological Networks; Dense Subgraph; Subgraph Mining; Dense Subgraph Mining

Scalable mining of maximal quasi-cliques

Proceedings of the VLDB Endowment, 2020, Vol 14 (4), pp. 573-585

doi:10.14778/3436905.3436916

Author(s): Guimu Guo; Da Yan; M. Tamer Özsu; Zhe Jiang; Jalal Khalil

Keyword(s): Graph Mining; State Of The Art; Minimum Degree; The Other; Dense Subgraph; Load Imbalance; Subgraph Mining; Execution Engine; Clique Algorithm; Dense Subgraph Mining

Given a user-specified minimum degree threshold $\gamma$, a $\gamma$-quasi-clique is a subgraph $g = (V_g, E_g)$ where each vertex $v \in V_g$ connects to at least $\gamma$ fraction of the other vertices (i.e., $\lceil \gamma \cdot (|V_g| - 1) \rceil$ vertices) in $g$. Quasi-clique is one of the most natural definitions for dense structures useful in finding communities in social networks and discovering significant biomolecule structures and pathways. However, mining maximal quasi-cliques is notoriously expensive. In this paper, we design parallel algorithms for mining maximal quasi-cliques on G-thinker, a distributed graph mining framework that decomposes mining into compute-intensive tasks to fully utilize CPU cores. We found that directly using G-thinker results in the straggler problem due to (i) the drastic load imbalance among different tasks and (ii) the difficulty of predicting the task running time. We address these challenges by redesigning G-thinker's execution engine to prioritize long-running tasks for execution, and by utilizing a novel timeout strategy to effectively decompose long-running tasks to improve load balancing. While this system redesign applies to many other expensive dense subgraph mining problems, this paper verifies the idea by adapting the state-of-the-art quasi-clique algorithm, Quick, to our redesigned G-thinker. Extensive experiments verify that our new solution scales well with the number of CPU cores, achieving 201× runtime speedup when mining a graph with 3.77M vertices and 16.5M edges in a 16-node cluster.
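The quasi-clique definition in the abstract above can be illustrated with a minimal Python sketch. It only checks whether a given vertex set satisfies the degree condition; it is not the Quick algorithm or anything from G-thinker, and the function name `is_quasi_clique`, the adjacency-dict representation, and the small tolerance used to guard against floating-point rounding are all assumptions made for illustration.

```python
import math


def is_quasi_clique(adj, vertices, gamma):
    """Return True if `vertices` induces a gamma-quasi-clique.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected graph; hypothetical representation).
    gamma: minimum degree threshold, assumed to lie in [0, 1].
    """
    vs = set(vertices)
    if not vs:
        return False
    # Each vertex needs at least ceil(gamma * (|V_g| - 1)) neighbours
    # inside the subgraph. The tiny tolerance (an assumption) avoids
    # rounding up when the product is an integer in exact arithmetic.
    need = math.ceil(gamma * (len(vs) - 1) - 1e-9)
    return all(len(adj.get(v, set()) & (vs - {v})) >= need for v in vs)


# Toy graph: a 4-cycle a-b-c-d plus the chord a-c (edge b-d is missing).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

print(is_quasi_clique(adj, ["a", "b", "c", "d"], 0.6))  # True
print(is_quasi_clique(adj, ["a", "b", "c", "d"], 1.0))  # False
print(is_quasi_clique(adj, ["a", "b", "c"], 1.0))       # True
```

With threshold 0.6 on four vertices, each vertex needs at least 2 neighbours inside the set, which holds here; with threshold 1.0 the set would have to be a clique, and the missing edge b-d rules that out.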

Dense subgraph mining with a mixed graph model

Pattern Recognition Letters, 2013, Vol 34 (11), pp. 1252-1262

doi:10.1016/j.patrec.2013.03.035

Cited By ~ 3

Author(s): Anita Keszler; Tamás Szirányi; Zsolt Tuza

Keyword(s): Graph Model; Dense Subgraph; Mixed Graph; Subgraph Mining; Dense Subgraph Mining

STeller: An approach for context-aware story detection using different similarity metrics and dense subgraph mining

2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 2016

doi:10.1109/cscwd.2016.7565980

Cited By ~ 2

Author(s): Meng Zhao; Chen Zhang; Siyu Lu; Hui Zhang

Keyword(s): Similarity Metrics; Context Aware; Dense Subgraph; Subgraph Mining; Dense Subgraph Mining; Story Detection

Group attack detection in recommender systems based on triangle dense subgraph mining

2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2021

doi:10.1109/icaica52286.2021.9497958

Author(s): Hongtao Yu; Shengyu Yuan; Yishu Xu; Ru Ma; Dingli Gao; ...

Keyword(s): Recommender Systems; Attack Detection; Dense Subgraph; Subgraph Mining; Dense Subgraph Mining

Subspace Clustering Meets Dense Subgraph Mining: A Synthesis of Two Paradigms

2010 IEEE International Conference on Data Mining, 2010

doi:10.1109/icdm.2010.95

Cited By ~ 53

Author(s): Stephan Günnemann; Ines Färber; Brigitte Boden; Thomas Seidl

Keyword(s): Subspace Clustering; Dense Subgraph; Subgraph Mining; Dense Subgraph Mining

GAMer: a synthesis of subspace clustering and dense subgraph mining

Knowledge and Information Systems, 2013, Vol 40 (2), pp. 243-278

doi:10.1007/s10115-013-0640-z

Cited By ~ 14

Author(s): Stephan Günnemann; Ines Färber; Brigitte Boden; Thomas Seidl

Keyword(s): Subspace Clustering; Dense Subgraph; Subgraph Mining; Dense Subgraph Mining

An Enhancement of Grami Based on Threshold Policy for Pattern Big Graphs

International Journal of Innovative Technology and Exploring Engineering - Special Issue, 2020, Vol 9 (3), pp. 2943-2949

doi:10.35940/ijitee.c9106.019320

Keyword(s): Biological Networks; Interconnection Networks; Large Scale; Pattern Mining; Constraint Satisfaction Problem; Interaction Network; Frequent Subgraph Mining; Threshold Policy; Subgraph Mining; Frequent Subgraph

Pattern mining is a key mechanism for managing large-scale data. Frequent subgraph mining (FSM), a subprocess of pattern mining that relies on isomorphism testing, is a well-studied problem in data mining. Graphs are a standard structure in many domains, such as protein-protein interaction networks in biology, wired or wireless interconnection networks, and web data. FSM is the task of finding all frequent subgraphs in a given database (either a single big graph or a database of many graphs) whose support is greater than a given threshold value. Many databases consider small graphs for solving complex problems. The classification of a graph depends on the application requirements. A good mining architecture can save considerable memory and time. This paper follows the Grami structure for the analysis of frequent subgraph mining and introduces a 20% threshold policy for the enhancement of directed pattern graphs. The constraint satisfaction problem (CSP) is discussed and analyzed using the Grami approach. The proposed model is compared to Grami on a Twitter dataset in terms of time and memory consumed. The proposed algorithm shows an improvement of 3-4% for both parameters. The results show that the performance of the Grami approach has been improved, with a 6.6% reduction in time and a 21% improvement in memory consumption using the proposed approach.
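The support threshold mentioned in the abstract above can be illustrated with a deliberately small Python sketch. It handles only the simplest case: single-edge patterns in a database of many small graphs, where the support of a pattern is the number of graphs containing it. It is not the Grami method or the paper's threshold policy, and single-graph support as used in that setting is more involved and not shown. The function name `frequent_edges`, the representation of each graph as a list of vertex-label pairs, and the use of undirected edges are assumptions made for illustration.

```python
from collections import Counter


def frequent_edges(graph_db, min_support):
    """Return single-edge patterns whose support exceeds min_support.

    graph_db: list of graphs, each a list of (label_u, label_v) pairs
              describing undirected labelled edges (hypothetical format).
    Support of a pattern = number of graphs that contain it.
    """
    support = Counter()
    for edges in graph_db:
        # Normalise edge direction and count each pattern once per graph.
        patterns = {tuple(sorted(edge)) for edge in edges}
        support.update(patterns)
    # Strictly greater than the threshold, as stated in the abstract.
    return {p: s for p, s in support.items() if s > min_support}


db = [
    [("A", "B"), ("B", "C")],
    [("B", "A"), ("C", "D")],
    [("A", "B"), ("B", "C"), ("C", "D")],
]

print(frequent_edges(db, 2))  # {('A', 'B'): 3}
```

Larger patterns would require subgraph isomorphism tests to decide whether a graph contains a pattern, which is where most of the cost of frequent subgraph mining comes from.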
