Adaptive partitioning by local density-peaks: An efficient density-based clustering algorithm for analyzing molecular dynamics trajectories

Song Liu; Lizhe Zhu; Fu Kit Sheong; Wei Wang; Xuhui Huang

doi:10.1002/jcc.24664

Accelerating Density Peak Clustering Algorithm

Symmetry, 2019, Vol 11 (7), pp. 859

doi:10.3390/sym11070859

Cited By ~ 1

Author(s): Lin

Keyword(s): Clustering Algorithm; Local Density; Early Stage; Separation Distance; Density Peak; Density Peaks; Density Based Clustering; Data Points; Data Point; Density Peak Clustering

The Density Peak Clustering (DPC) algorithm is a new density-based clustering method. It spends most of its execution time on calculating the local density and the separation distance for each data point in a dataset. The purpose of this study is to accelerate its computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of each data point. We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point. Additionally, the purpose of the separation distance is to assist in choosing the density peaks, which are the data points with both high local density and high separation distance. We propose an approach to identify non-peak data points at an early stage to avoid calculating their separation distances. Our experimental results show that most of the data points in a dataset can benefit from the proposed approaches to accelerate the DPC algorithm.
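
To make the two quantities named in this abstract concrete, the following is a minimal brute-force sketch in Python. It assumes the commonly used DPC definitions, which the abstract does not spell out: local density as the number of other points within a cutoff distance d_c, and separation distance as the distance to the nearest point of higher density (or the largest distance when no denser point exists). The function name dpc_scores, the sample points, the cutoff value, and the ranking by the product of the two scores are illustrative assumptions. The acceleration techniques proposed in the paper are not implemented here.

```python
import math


def dpc_scores(points, d_c):
    """Brute-force local density (rho) and separation distance (delta).

    Assumed definitions (not taken from the abstract):
      rho[i]   = number of other points closer than d_c to point i
      delta[i] = distance from i to the nearest point with higher rho,
                 or the largest distance from i if no such point exists
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical toy data: two small groups, each with a central point.
points = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (4.9, 5.0),
]
rho, delta = dpc_scores(points, d_c=0.12)

# Candidate peaks: high rho AND high delta. Ranking by the product
# is one simple heuristic, used here only for illustration.
peaks = sorted(range(len(points)),
               key=lambda i: rho[i] * delta[i], reverse=True)[:2]
print(rho, peaks)
```

In this toy data the central point of each group has both a high local density and a large separation distance, so the two central points rank first. Every other point has a denser neighbor close by, which keeps its separation distance small.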


IMPROVED DENSITY BASED ALGORITHM FOR DATA STREAM CLUSTERING

Jurnal Teknologi, 2015, Vol 77 (18)

doi:10.11113/jt.v77.6492

Cited By ~ 2

Author(s): Maryam Mousavi; Azuraliza Abu Bakar

Keyword(s): Data Streams; Data Stream; Clustering Algorithm; Local Density; Clustering Methods; Clustering Techniques; Stream Clustering; Density Based Clustering; Clustering Quality; Data Stream Clustering

In recent years, clustering methods have attracted increasing attention for analysing and monitoring data streams. Density-based techniques are a notable category of clustering techniques because they can detect clusters of arbitrary shape and handle noise. However, finding clusters with varying local densities is a difficult task. To address this problem, this paper proposes a new density-based clustering algorithm for data streams. The algorithm improves the offline phase of the density-based algorithm based on the MinPts parameter. The experimental results show that the proposed technique can improve clustering quality in data streams with different densities.


Density Peaks Clustering based on Nature Nearest Neighbor and Multi-cluster Mergers

2021

doi:10.21203/rs.3.rs-825405/v1

Author(s): Hui Ma; Ruiqin Wang; Shuai Yang

Keyword(s): Clustering Algorithm; Nearest Neighbor; Local Density; Density Peaks; Density Peaks Clustering; Assignment Strategy; Cutoff Distance; Clustering Effect; Cluster Mergers; Selection Of

Clustering by fast search and find of density peaks (DPC) has the advantages of being simple, efficient, and capable of detecting clusters of arbitrary shape. However, it still has some shortcomings: 1) the cutoff distance must be specified in advance, and the choice of local density formula affects the final clustering result; 2) after the cluster centers are found, the assignment strategy for the remaining points may produce a “domino effect”, that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbor and multi-cluster mergers. In this algorithm, a weighted local density calculation method is designed using the natural nearest neighbor, which avoids having to select a cutoff distance and a local density formula. The algorithm uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. Experiments were carried out on several artificial and real-world datasets. The results show that the clustering effect of this algorithm is better than that of other related algorithms.


Clustering Mixed Data Based on Density Peaks and Stacked Denoising Autoencoders

Symmetry, 2019, Vol 11 (2), pp. 163

doi:10.3390/sym11020163

Author(s): Baobin Duan; Lixin Han; Zhinan Gou; Yi Yang; Shuangshuang Chen

Keyword(s): Clustering Algorithm; Local Density; Clustering Algorithms; Feature Space; Original Data; Mixed Data; Feature Representations; Density Peaks; Categorical Attributes; Data Objects

Because mixed data with numerical and categorical attributes is ubiquitous in the real world, a variety of clustering algorithms have been developed to discover the information hidden in such data. Most existing clustering algorithms compute the distances or similarities between data objects from the original data, which may make the clustering results unstable because of noise. In this paper, a clustering framework is proposed to explore the grouping structure of mixed data. First, the categorical attributes, transformed by one-hot encoding, and the normalized numerical attributes are fed into a stacked denoising autoencoder to learn internal feature representations. Secondly, based on these feature representations, the distances between all data objects in feature space are calculated, and the local density and relative distance of each data object are computed. Thirdly, the density peaks clustering algorithm is improved and employed to allocate the data objects to different clusters. Finally, experiments conducted on several UCI datasets demonstrate that the proposed algorithm for clustering mixed data outperforms three baseline algorithms in terms of clustering accuracy and the Rand index.


A Novel Hierarchical Clustering Algorithm Based on Density Peaks for Complex Datasets

Complexity, 2018, Vol 2018, pp. 1-8

doi:10.1155/2018/2032461

Cited By ~ 5

Author(s): Rong Zhou; Yong Zhang; Shengzhong Feng; Nurbol Luktarhan

Keyword(s): Hierarchical Clustering; Clustering Algorithm; Nearest Neighbor; Local Density; Clustering Algorithms; Complex Structure; Density Peak; Global Parameter; Density Peaks; Complex Datasets

Clustering aims to differentiate objects from different groups (clusters) by similarities or distances between pairs of objects. Numerous clustering algorithms have been proposed to investigate what factors constitute a cluster and how to find clusters efficiently. The clustering by fast search and find of density peaks algorithm was proposed to intuitively determine cluster centers and assign points to the corresponding partitions for complex datasets. This method has a simple structure because of its noniterative logic and few parameters; however, the guidelines for parameter selection and center determination are not explicit. To tackle these problems, we propose an improved hierarchical clustering method, HCDP, which aims to represent the complex structure of the dataset. A k-nearest neighbor strategy is integrated to compute the local density of each point, which avoids having to select the unnecessary global parameter dc and enables cluster smoothing and condensing. In addition, a new clustering evaluation approach is introduced to extract a “flat” and “optimal” partition solution from the structure by adaptively computing the clustering stability. The proposed approach is applied to several applications with complex datasets, where the results demonstrate that the novel method outperforms its counterparts to a large extent.
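
The idea of replacing a global cutoff distance with a k-nearest neighbor density estimate can be illustrated with a short Python sketch. The abstract does not give HCDP's actual density formula, so the estimate below (k divided by the summed distance to the k nearest neighbors) is a stand-in chosen only for illustration. The function name knn_density and the sample points are likewise hypothetical.

```python
import math


def knn_density(points, k):
    """Illustrative kNN density: k / (sum of distances to k nearest neighbors).

    This is an assumed stand-in formula, not the one used by HCDP.
    It requires k < len(points) and no more than k - 1 exact duplicates
    of any point (otherwise the distance sum would be zero).
    """
    densities = []
    for i, p in enumerate(points):
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        densities.append(k / sum(dists[:k]))
    return densities


# Hypothetical 2-D data: three points close together and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
densities = knn_density(points, k=2)
print(densities)
```

The only parameter is k, a neighbor count, so no distance threshold has to be tuned to the scale of the data. The isolated point at (10, 0) receives a much lower density than the three points that lie close together.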


Clustering by Detecting Density Peaks and Assigning Points by Similarity-First Search Based on Weighted K-Nearest Neighbors Graph

Complexity, 2020, Vol 2020, pp. 1-17

doi:10.1155/2020/1731075

Author(s): Qi Diao; Yaping Dai; Qichao An; Weixing Li; Xiaoxue Feng; ...

Keyword(s): Clustering Algorithm; Spatial Clustering; Local Density; Search Algorithm; Real Data; Nearest Neighbors; Adjusted Rand Index; Clustering Methods; K Nearest Neighbors; Density Peaks

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most conventional clustering approaches work only with round-shaped clusters. Clustering by fast search and find of density peaks (DPC) can handle arbitrary shapes, but in some cases it is limited by how it identifies density peaks and by its allocation strategy. To overcome these limitations, two improvements are proposed in this paper. To describe the cluster center more comprehensively, the definitions of local density and relative distance are fused with multiple distances, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). A similarity-first search algorithm is designed to find the best-matching cluster centers for noncenter points in a weighted KNN graph. An extensive comparison has been carried out with several existing clustering methods, namely the traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means. Experiments on synthetic and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of clustering accuracy (ACC), adjusted mutual information (AMI), and the adjusted Rand index (ARI).
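
Of the three evaluation measures named in this abstract, the adjusted Rand index has a compact closed form based on pair counts, sketched below in Python. This is the standard textbook definition, not code from the paper. The function name and the example labelings are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from pair counts; assumes at least two items."""
    n = len(labels_true)
    # Pairs of items placed together in both labelings.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs placed together in each labeling separately.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical labelings of four items.
same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
fully_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_up_to_renaming, fully_crossed)
```

The index is invariant to renaming cluster labels, so the first comparison scores 1.0 even though the label values are swapped. A score near 0 indicates chance-level agreement, and negative values, as in the second comparison, indicate agreement below chance.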


Quantum algorithm for MMNG-based DBSCAN

Scientific Reports, 2021, Vol 11 (1)

doi:10.1038/s41598-021-95156-7

Author(s): Xuming Xie; Longzhen Duan; Taorong Qiu; Junru Li

Keyword(s): Domain Knowledge; Clustering Algorithm; Nearest Neighbor; Local Density; Quantum Algorithm; Neighbor Graph; Dbscan Algorithm; Density Based Clustering; Input Parameters; Nearest Neighbor Graph

DBSCAN is a well-known density-based clustering algorithm that can discover clusters with arbitrary shapes with minimal requirements of domain knowledge to determine the input parameters. However, DBSCAN is not suitable for databases containing clusters with different local densities, and it is also very time-consuming. In this paper, we present a quantum mutual MinPts-nearest neighbor graph (MMNG)-based DBSCAN algorithm. The proposed algorithm performs better on databases containing clusters with different local densities. Furthermore, the proposed algorithm offers a dramatic increase in speed compared to its classical counterpart.


An Adaptive Clustering Algorithm Based on Local-Density Peaks for Imbalanced Data without Parameters

IEEE Transactions on Knowledge and Data Engineering, 2021, pp. 1-1

doi:10.1109/tkde.2021.3138962

Author(s): Wuning Tong; Yuping Wang; Delong Liu

Keyword(s): Clustering Algorithm; Local Density; Imbalanced Data; Adaptive Clustering; Density Peaks


A novel bidirectional clustering algorithm based on local density

Scientific Reports, 2021, Vol 11 (1)

doi:10.1038/s41598-021-93244-2

Author(s): Baicheng Lyu; Wenhua Wu; Zhiqiang Hu

Keyword(s): Clustering Algorithm; Local Density; Clustering Algorithms; Cluster Number; Denoising Method; Number Of Clusters; Data Points; Cutoff Distance; Large Clusters; Small Clusters

With the wide application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty of selecting indicators for judging the number of clusters. Also, small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, automatically determines the number of clusters, is more sensitive to small clusters, and reduces the number of adjustable parameters to a minimum. Based on the robustness of the cluster number to noise, a denoising method suitable for BCALoD is proposed. A different cutoff distance and cutoff density are assigned to each data cluster, which results in improved clustering performance. The clustering ability of BCALoD is verified on randomly generated datasets and city light satellite images.


Adaptive Density-Based Clustering Algorithm with Shared KNN Conflict Game

Information Sciences, 2021

doi:10.1016/j.ins.2021.02.017

Author(s): Rui Zhang; Tao Du; Shouning Qu; Hongwei Sun

Keyword(s): Clustering Algorithm; Density Based Clustering
