Adaptive partitioning by local density-peaks: An efficient density-based clustering algorithm for analyzing molecular dynamics trajectories

2016 ◽  
Vol 38 (3) ◽  
pp. 152-160 ◽  
Author(s):  
Song Liu ◽  
Lizhe Zhu ◽  
Fu Kit Sheong ◽  
Wei Wang ◽  
Xuhui Huang
Symmetry ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 859 ◽  
Author(s):  
Lin

The Density Peak Clustering (DPC) algorithm is a new density-based clustering method. It spends most of its execution time on calculating the local density and the separation distance for each data point in a dataset. The purpose of this study is to accelerate its computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of each data point. We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point. Additionally, the purpose of the separation distance is to assist in choosing the density peaks, which are the data points with both high local density and high separation distance. We propose an approach to identify non-peak data points at an early stage to avoid calculating their separation distances. Our experimental results show that most of the data points in a dataset can benefit from the proposed approaches to accelerate the DPC algorithm.


2015 ◽  
Vol 77 (18) ◽  
Author(s):  
Maryam Mousavi ◽  
Azuraliza Abu Bakar

In recent years, clustering methods have attracted more attention in analysing and monitoring data streams. Density-based techniques are the remarkable category of clustering techniques that are able to detect the clusters with arbitrary shapes and noises. However, finding the clusters with local density varieties is a difficult task. For handling this problem, in this paper, a new density-based clustering algorithm for data streams is proposed. This algorithm can improve the offline phase of density-based algorithm based on MinPts parameter. The experimental results show that the proposed technique can improve the clustering quality in data streams with different densities.


2021 ◽  
Author(s):  
Hui Ma ◽  
Ruiqin Wang ◽  
Shuai Yang

Abstract Clustering by fast search and find of Density Peaks (DPC) has the advantages of being simple, efficient, and capable of detecting arbitrary shapes, etc. However, there are still some shortcomings: 1) the cutoff distance is specified in advance, and the selection of local density formula will affect the final clustering effect; 2) after the cluster centers are found, the assignment strategy of the remaining points may produce “Domino effect”, that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbor and multi-cluster mergers. In this algorithm, a weighted local density calculation method is designed by the natural nearest neighbor, which avoids the selection of cutoff distance and the selection of the local density formula. This algorithm uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. The experiment was carried out on some artificial and real-world datasets. The experimental results show that the clustering effect of this algorithm is better than those other related algorithms.


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 163
Author(s):  
Baobin Duan ◽  
Lixin Han ◽  
Zhinan Gou ◽  
Yi Yang ◽  
Shuangshuang Chen

With the universal existence of mixed data with numerical and categorical attributes in real world, a variety of clustering algorithms have been developed to discover the potential information hidden in mixed data. Most existing clustering algorithms often compute the distances or similarities between data objects based on original data, which may cause the instability of clustering results because of noise. In this paper, a clustering framework is proposed to explore the grouping structure of the mixed data. First, the transformed categorical attributes by one-hot encoding technique and normalized numerical attributes are input to a stacked denoising autoencoders to learn the internal feature representations. Secondly, based on these feature representations, all the distances between data objects in feature space can be calculated and the local density and relative distance of each data object can be also computed. Thirdly, the density peaks clustering algorithm is improved and employed to allocate all the data objects into different clusters. Finally, experiments conducted on some UCI datasets have demonstrated that our proposed algorithm for clustering mixed data outperforms three baseline algorithms in terms of the clustering accuracy and the rand index.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Rong Zhou ◽  
Yong Zhang ◽  
Shengzhong Feng ◽  
Nurbol Luktarhan

Clustering aims to differentiate objects from different groups (clusters) by similarities or distances between pairs of objects. Numerous clustering algorithms have been proposed to investigate what factors constitute a cluster and how to efficiently find them. The clustering by fast search and find of density peak algorithm is proposed to intuitively determine cluster centers and assign points to corresponding partitions for complex datasets. This method incorporates simple structure due to the noniterative logic and less few parameters; however, the guidelines for parameter selection and center determination are not explicit. To tackle these problems, we propose an improved hierarchical clustering method HCDP aiming to represent the complex structure of the dataset. A k-nearest neighbor strategy is integrated to compute the local density of each point, avoiding to select the nonnecessary global parameter dc and enables cluster smoothing and condensing. In addition, a new clustering evaluation approach is also introduced to extract a “flat” and “optimal” partition solution from the structure by adaptively computing the clustering stability. The proposed approach is conducted on some applications with complex datasets, where the results demonstrate that the novel method outperforms its counterparts to a large extent.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Qi Diao ◽  
Yaping Dai ◽  
Qichao An ◽  
Weixing Li ◽  
Xiaoxue Feng ◽  
...  

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most of the conventional clustering approaches work only with round-shaped clusters. This task can be accomplished by quickly searching and finding clustering methods for density peaks (DPC), but in some cases, it is limited by density peaks and allocation strategy. To overcome these limitations, two improvements are proposed in this paper. To describe the clustering center more comprehensively, the definitions of local density and relative distance are fused with multiple distances, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). A similarity-first search algorithm is designed to search the most matching cluster centers for noncenter points in a weighted KNN graph. Extensive comparison with several existing DPC methods, e.g., traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means methods, has been carried out. Experiments based on synthetic data and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of the clustering accuracy (ACC), the adjusted mutual information (AMI), and the adjusted Rand index (ARI).


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xuming Xie ◽  
Longzhen Duan ◽  
Taorong Qiu ◽  
Junru Li

AbstractDBSCAN is a famous density-based clustering algorithm that can discover clusters with arbitrary shapes without the minimal requirements of domain knowledge to determine the input parameters. However, DBSCAN is not suitable for databases with different local-density clusters and is also a very time-consuming clustering algorithm. In this paper, we present a quantum mutual MinPts-nearest neighbor graph (MMNG)-based DBSCAN algorithm. The proposed algorithm performs better on databases with different local-density clusters. Furthermore, the proposed algorithm has a dramatic increase in speed compared to its classic counterpart.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Baicheng Lyu ◽  
Wenhua Wu ◽  
Zhiqiang Hu

AbstractWith the widely application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty in selecting the judgment indicators of cluster numbers. Also, small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, can automatically determine the number of clusters, is more sensitive to small clusters, and can reduce the adjusted parameters to a minimum. On the basis of the robustness of cluster number to noise, a denoising method suitable for BCALoD is proposed. Different cutoff distance and cutoff density are assigned to each data cluster, which results in improved clustering performance. Clustering ability of BCALoD is verified by randomly generated datasets and city light satellite images.


Sign in / Sign up

Export Citation Format

Share Document