K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks

Genes ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 87
Author(s):  
Jie Hou ◽  
Xiufen Ye ◽  
Chuanlong Li ◽  
Yixing Wang

Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, the decision cannot be reversed; thus, an ill-fitting assignment cannot be re-adjusted. In this paper, we calculate the similarity matrix with the distance correlation for WGCNA to construct a gene co-expression network, and present a new approach called the k-module algorithm to improve the WGCNA clustering results. The method reassigns each gene to the module with which it has the highest mean connectivity. This algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified using six datasets from microarray and RNA-seq data. The k-module algorithm requires fewer iterations, which leads to lower computational complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering and can be applied to any clustering algorithm based on a similarity matrix; it is not limited to gene co-expression network analysis.
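The paper's exact update rule is not reproduced in the abstract, but the reassignment step it describes can be sketched as follows, assuming a symmetric gene–gene adjacency (similarity) matrix and an initial labelling from dynamic tree cut; the function name and signature are illustrative:

```python
import numpy as np

def k_module(adjacency, labels, max_iter=100):
    """Reassign each gene to the module with which it has the highest
    mean connectivity, iterating until the labels stabilize."""
    labels = np.asarray(labels).copy()
    modules = np.unique(labels)
    for _ in range(max_iter):
        changed = False
        for g in range(len(labels)):
            # mean connectivity of gene g to each module's members
            means = [adjacency[g, labels == m].mean()
                     if np.any(labels == m) else -np.inf
                     for m in modules]
            best = modules[int(np.argmax(means))]
            if best != labels[g]:
                labels[g] = best
                changed = True
        if not changed:
            break
    return labels
```

With a block-structured similarity matrix, a gene mislabeled by the initial tree cut migrates to the module it is most connected to within a few sweeps.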

2019 ◽  
Vol 30 (07) ◽  
pp. 1940003
Author(s):  
Junhai Luo ◽  
Lei Ye ◽  
Xiaoting He

Hierarchical analysis of network structure can indicate which communities constitute a larger group, or which reasonable smaller groups exist within a community. Numerous methods for discovering communities in networks divide them at only one granularity, which does not benefit hierarchical analysis of network structure. Hierarchical clustering algorithms are the common technique for revealing multilevel structure in network analysis. In this work, we define scores for edges based on the basic idea of k-means clustering. Based on this definition, a neighbors-based divisive algorithm named neighbor-means (NM) is proposed to detect communities in networks, especially for hierarchical analysis. The divisive algorithm repeatedly removes the edge with the highest score to obtain hierarchical partitions, but can recalculate edge scores quickly using a local recalculation strategy and crucial change rules, which makes its complexity much lower than that of many divisive algorithms. In addition, when the community structure is ambiguous, benefiting from the superiority of the defined scores, our method achieves better results than many divisive and agglomerative algorithms. Experiments with artificial and real-world networks demonstrate the superiority of neighbor-means in detecting community structure.
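The abstract does not give the neighbor-based edge score itself, so the sketch below shows only the generic divisive loop it plugs into: repeatedly remove the highest-scoring edge and record each partition in which the number of components grows. The `score` argument stands in for the NM score, the local recalculation speedup is omitted, and all names are illustrative:

```python
from collections import deque

def components(nodes, edges):
    """Connected components of an undirected edge list, via BFS."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        comp, queue = set(), deque([n])
        while queue:
            x = queue.popleft()
            if x in comp:
                continue
            comp.add(x)
            queue.extend(adj[x] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def divisive_partitions(nodes, edges, score):
    """Remove the highest-scoring edge until none remain, recording the
    partition each time the number of components increases."""
    edges = list(edges)
    history = [components(nodes, edges)]
    while edges:
        edges.remove(max(edges, key=score))
        comps = components(nodes, edges)
        if len(comps) > len(history[-1]):
            history.append(comps)
    return history
```

On a graph of two triangles joined by a bridge, any score that ranks the bridge highest yields the two triangles as the first two-community partition.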


Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes and is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the most effective clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging and stemming. The WordNet dictionary is used to determine similarity by invoking the Jiang–Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value above a mean threshold. This paper incorporates many parameters for finding similarity between words. To identify disambiguated words, sense identification is performed for adjectives and a comparison is made. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows considerable improvement, achieving 91.2% accuracy.


Mathematics ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 370
Author(s):  
Shuangsheng Wu ◽  
Jie Lin ◽  
Zhenyu Zhang ◽  
Yushu Yang

The fuzzy clustering algorithm has become a research hotspot in many fields because of its strong clustering performance and data expression ability. However, little research focuses on clustering hesitant fuzzy linguistic term sets (HFLTSs). To fill this research gap, we extend the data type of clustering to hesitant fuzzy linguistic information. An agglomerative hierarchical clustering algorithm for hesitant fuzzy linguistic information is proposed. Furthermore, we propose a hesitant fuzzy linguistic Boolean matrix clustering algorithm and compare the two clustering algorithms. The proposed clustering algorithms are applied in the field of judicial execution, where they provide decision support for the executive judge in determining the focus of investigation and control. A clustering example verifies the algorithms' effectiveness in the context of hesitant fuzzy linguistic decision information.
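The HFLTS distance measure itself is beyond this sketch, but the agglomerative step such an algorithm builds on can be illustrated with average linkage over a precomputed distance matrix (which could hold hesitant fuzzy linguistic distances); the function and its signature are illustrative:

```python
def agglomerative(dist, k):
    """Average-linkage agglomerative clustering on a precomputed
    distance matrix `dist`, merging until `k` clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # average pairwise distance between clusters a and b
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # find and merge the closest pair of clusters
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Swapping `linkage` for a single- or complete-linkage rule, or for an HFLTS-specific distance aggregation, changes only that inner function.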


2021 ◽  
Vol 9 (2) ◽  
pp. 416
Author(s):  
Charles Dumolin ◽  
Charlotte Peeters ◽  
Evelien De Canck ◽  
Nico Boon ◽  
Peter Vandamme

Culturomics-based bacterial diversity studies benefit from the implementation of MALDI-TOF MS to remove genomically redundant isolates from isolate collections. We previously introduced SPeDE, a novel tool designed to dereplicate spectral datasets at an infraspecific level into operational isolation units (OIUs) based on unique spectral features. However, biological and technical variation may result in methodology-induced differences in MALDI-TOF mass spectra and hence provoke the detection of genomically redundant OIUs. In the present study, we used three datasets to analyze to what extent hierarchical clustering and network analysis allow the elimination of redundant OIUs arising from biological and technical sample variation, and to describe the diversity within a set of spectra obtained from 134 unknown soil isolates. Overall, network analysis based on unique spectral features in MALDI-TOF mass spectra enabled a superior selection of genomically diverse OIUs compared to hierarchical clustering analysis and provided a better understanding of the inter-OIU relationships.


2020 ◽  
Vol 161 ◽  
pp. 104884
Author(s):  
Jihwan Park ◽  
Keon Vin Park ◽  
Soohyun Yoo ◽  
Sang Ok Choi ◽  
Sung Won Han

2021 ◽  
Author(s):  
Shikha Suman ◽  
Ashutosh Karna ◽  
Karina Gibert

Hierarchical clustering is one of the most preferred choices for understanding the underlying structure of a dataset and defining typologies, with multiple applications in real life. Among existing clustering algorithms, the hierarchical family is one of the most popular, as it reveals the inner structure of the dataset and yields the number of clusters as an output, unlike popular methods such as k-means; the granularity of the final clustering can also be adjusted to the goals of the analysis. The number of clusters in a hierarchical method relies on analysis of the resulting dendrogram: experts have criteria for visually inspecting a dendrogram and determining the number of clusters, but finding automatic criteria that imitate experts in this task is still an open problem. Dependence on an expert to cut the tree is a limitation in real applications such as Industry 4.0 and additive manufacturing. This paper analyses several cluster validity indexes in the context of determining a suitable number of clusters in hierarchical clustering. A new Cluster Validity Index (CVI) is proposed that properly captures the implicit criteria used by experts when analyzing dendrograms. The proposal has been applied to a range of datasets and validated against expert ground truth, outperforming the state of the art while significantly reducing computational cost.
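The proposed CVI is not specified in the abstract; as a stand-in, the sketch below scores candidate dendrogram cuts with the classical Dunn index and keeps the best one. Degenerate cuts (a single cluster, or all singletons) should be excluded from the candidates; names are illustrative:

```python
def dunn_index(dist, clusters):
    """Dunn index: minimum inter-cluster distance divided by maximum
    intra-cluster diameter; higher is better. Assumes >= 2 clusters."""
    inter = min(dist[i][j]
                for a in range(len(clusters))
                for b in range(a + 1, len(clusters))
                for i in clusters[a] for j in clusters[b])
    intra = max(dist[i][j] for c in clusters for i in c for j in c if i != j)
    return inter / intra

def best_cut(dist, partitions):
    """Pick the candidate dendrogram cut maximizing the validity index."""
    return max(partitions, key=lambda p: dunn_index(dist, p))
```

Any other CVI can replace `dunn_index` in `best_cut` without changing the selection loop.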


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xin Wang ◽  
Xinzheng Niu ◽  
Jiahui Zhu ◽  
Zuoyan Liu

Nowadays, large volumes of multimodal data are collected for analysis. An important type is trajectory data, which contains both time and space information. Trajectory analysis and clustering are essential to learning the patterns of moving objects. Computing trajectory similarity is a key step of trajectory analysis, but it is very time-consuming. To address this issue, this paper presents an improved branch-and-bound strategy based on time-slice segmentation, which reduces the time needed to obtain the similarity matrix by decreasing the number of distance calculations required to compute similarity. The similarity matrix is then transformed into a trajectory graph, and a community detection algorithm is applied to it for clustering. Extensive experiments compared the proposed algorithms with existing similarity measures and clustering algorithms. The results show that the proposed method can effectively mine cluster information from spatiotemporal trajectories.
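The paper's time-slice lower bound is not reproduced in the abstract; the sketch below illustrates the same pruning idea with a simpler bound. The gap between two trajectories' bounding boxes is a lower bound on their mean closest-point distance, so pairs whose gap already exceeds the threshold skip the expensive computation. All names are illustrative:

```python
import math

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def full_distance(t1, t2):
    """Mean closest-point distance (symmetrized): the expensive step."""
    d12 = sum(min(euclid(p, q) for q in t2) for p in t1) / len(t1)
    d21 = sum(min(euclid(p, q) for q in t1) for p in t2) / len(t2)
    return max(d12, d21)

def bbox_gap(t1, t2):
    """Gap between bounding boxes: a lower bound on every
    point-to-point distance, hence on the mean closest-point distance."""
    gx = max(0.0, max(min(p[0] for p in t1) - max(p[0] for p in t2),
                      min(p[0] for p in t2) - max(p[0] for p in t1)))
    gy = max(0.0, max(min(p[1] for p in t1) - max(p[1] for p in t2),
                      min(p[1] for p in t2) - max(p[1] for p in t1)))
    return math.hypot(gx, gy)

def pruned_distance_matrix(trajs, threshold):
    """Fill the pairwise distance matrix, skipping the expensive
    computation when the cheap bound already exceeds `threshold`."""
    n = len(trajs)
    dist = [[float("inf")] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(i + 1, n):
            if bbox_gap(trajs[i], trajs[j]) > threshold:
                continue  # pruned: the pair cannot be within threshold
            dist[i][j] = dist[j][i] = full_distance(trajs[i], trajs[j])
    return dist
```

A time-slice segmentation would apply the same bound per slice, pruning even more pairs before any full distance is evaluated.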


2021 ◽  
Vol 8 (10) ◽  
pp. 43-50
Author(s):  
Truong et al. ◽  

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One successful and pioneering clustering algorithm is the Minimum-Minimum Roughness algorithm (MMR), a top-down hierarchical clustering algorithm that can handle uncertainty in clustering categorical data. However, MMR tends to favor attributes with fewer values and leaf nodes with more objects, leading to undesirable clustering results. To overcome these shortcomings, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real datasets from the UCI repository show that the IMMR algorithm outperforms MMR in clustering categorical data.
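Using the standard rough-set definitions MMR builds on, its attribute-selection step can be sketched as follows: the roughness of one attribute's value sets with respect to another attribute's equivalence classes is averaged, and the splitting attribute is the one minimizing the minimum mean roughness. Rows are tuples of categorical values; all names are illustrative:

```python
from collections import defaultdict

def blocks(rows, attr):
    """Equivalence classes (sets of row indices) induced by one attribute."""
    groups = defaultdict(set)
    for idx, row in enumerate(rows):
        groups[row[attr]].add(idx)
    return list(groups.values())

def mean_roughness(rows, a, b):
    """Mean roughness of attribute `a` w.r.t. attribute `b`:
    1 - |lower approximation| / |upper approximation|, averaged over
    the value sets of `a`."""
    parts_b = blocks(rows, b)
    sets_a = blocks(rows, a)
    total = 0.0
    for x in sets_a:
        lower = sum(len(p) for p in parts_b if p <= x)   # blocks inside x
        upper = sum(len(p) for p in parts_b if p & x)    # blocks meeting x
        total += 1 - lower / upper
    return total / len(sets_a)

def mmr_attribute(rows):
    """Splitting attribute with minimum-minimum mean roughness."""
    attrs = range(len(rows[0]))
    return min(attrs, key=lambda a: min(mean_roughness(rows, a, b)
                                        for b in attrs if b != a))
```

A roughness of 0 means the other attribute's classes define `a`'s value sets exactly; MMR splits on the attribute whose best such score is lowest.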

