Improved Density Peaks Clustering Based on Natural Neighbor Expanded Group

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Lin Ding ◽  
Weihong Xu ◽  
Yuantao Chen

Density peaks clustering (DPC) is an advanced clustering technique with several advantages: it determines cluster centers efficiently, has few parameters, requires no iterations, and produces no border noise. However, it suffers from the following defects: (1) a suitable value of its crucial cutoff distance parameter is difficult to determine, (2) the local density metric is too simple to find the proper center(s) of sparse cluster(s), and (3) it is not robust, in that parts of prominent density peaks are remotely assigned. This paper proposes improved density peaks clustering based on natural neighbor expanded group (DPC-NNEG). The core of the proposed algorithm has two parts: (1) defining the natural neighbor expanded (NNE) and the natural neighbor expanded group (NNEG), and (2) dividing all NNEGs into a target number of sets, which form the final clustering result, according to the closeness degree of NNEGs. The paper also provides a measure of the closeness degree. We compared our proposal with state-of-the-art methods on public datasets, including several complex and real datasets. Experiments show the effectiveness and robustness of the proposed algorithm.
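For orientation, the baseline DPC procedure that this abstract criticizes can be sketched as follows. This is a minimal illustration of the original Rodriguez and Laio scheme with a hard cutoff kernel, not the DPC-NNEG method. The function names `dpc_rho_delta` and `dpc_cluster`, the tie-breaking by index, and the choice of centers by the largest product of density and distance are assumptions made for this sketch.

```python
import math


def dpc_rho_delta(points, dc):
    """Return (rho, delta, nearest_higher) for every point.

    rho[i]   : number of other points closer than the cutoff distance dc
    delta[i] : distance to the nearest point that ranks higher in density
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    # Rank points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # By convention the densest point gets its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher


def dpc_cluster(points, dc, n_clusters):
    """Pick centers by largest rho * delta, then propagate labels."""
    rho, delta, nearest_higher = dpc_rho_delta(points, dc)
    n = len(points)
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))
    labels = [-1] * n
    for label, i in enumerate(centers[:n_clusters]):
        labels[i] = label
    # Visit points in decreasing density so the nearest higher-density
    # neighbor is always labeled before the point that inherits from it.
    for i in sorted(range(n), key=lambda i: (-rho[i], i)):
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels
```

The single-pass label inheritance in `dpc_cluster` is what makes the result sensitive to `dc` and to one wrong assignment: every point simply copies the label of its nearest denser neighbor.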

2020 ◽  
Vol 2020 ◽  
pp. 1-15 ◽  
Author(s):  
Lin Ding ◽  
Weihong Xu ◽  
Yuantao Chen

The density peaks clustering algorithm (DPC) has attracted the attention of many scholars because of its multiple advantages, including efficient determination of cluster centers, a small number of parameters, no iterations, and no border noise. However, DPC provides neither a reliable and specific method for selecting the threshold (cutoff distance) nor an automatic strategy for selecting cluster centers. In this paper, we propose density peaks clustering by zero-pointed samples (DPC-ZPS) of regional group borders. DPC-ZPS finds the subclusters and the cluster borders by zero-pointed samples (ZPSs). Subclusters are then merged into individual clusters by comparing the density of edge samples. Iterating the merger yields a suitable cutoff distance dc and suitable cluster centers. Finally, we compared our proposal with state-of-the-art methods on public datasets. Experiments show that our algorithm automatically and accurately determines the cutoff distance and the cluster centers.


2021 ◽  
Author(s):  
Hui Ma ◽  
Ruiqin Wang ◽  
Shuai Yang

Clustering by fast search and find of Density Peaks (DPC) has the advantages of being simple, efficient, and capable of detecting clusters of arbitrary shape. However, it still has some shortcomings: 1) the cutoff distance must be specified in advance, and the choice of local density formula affects the final clustering result; 2) after the cluster centers are found, the assignment strategy for the remaining points may produce a “domino effect”, that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbor and multi-cluster mergers. In this algorithm, a weighted local density calculation method is designed using the natural nearest neighbor, which avoids the selection of both the cutoff distance and the local density formula. The algorithm uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. Experiments were carried out on several artificial and real-world datasets. The experimental results show that the clustering performance of this algorithm is better than that of other related algorithms.
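The natural nearest neighbor idea mentioned above can be illustrated with a short sketch. It follows one commonly described formulation of natural neighbor search: the neighborhood size grows round by round until the number of points that nobody has selected as a neighbor stops changing. It is an assumption of this sketch, not necessarily the exact procedure or weighting used in the paper. The function name `natural_neighbors` and the tie-breaking by index are hypothetical.

```python
import math


def natural_neighbors(points):
    """Return (r, nat): the final neighborhood size and, for each point,
    the set of its mutual (natural) neighbors at that size."""
    n = len(points)
    # For each point, all other points ordered by distance (ties by index).
    ranked = [sorted((j for j in range(n) if j != i),
                     key=lambda j: (math.dist(points[i], points[j]), j))
              for i in range(n)]
    knn = [set() for _ in range(n)]   # neighbors collected so far
    reverse_count = [0] * n           # how often each point was selected
    prev_orphans = None
    r = 0
    while r < n - 1:
        # Extend every neighborhood by the next-nearest point.
        for i in range(n):
            j = ranked[i][r]
            knn[i].add(j)
            reverse_count[j] += 1
        r += 1
        # Orphans are points that no other point has selected yet.
        orphans = sum(1 for c in reverse_count if c == 0)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Natural neighbors are the mutual members of the r-neighborhoods.
    nat = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return r, nat
```

Because the neighborhood size is derived from the data, no cutoff distance has to be supplied, and an isolated point ends up with an empty natural neighbor set.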


Author(s):  
Träger Sylvain ◽  
Tamò Giorgio ◽  
Aydin Deniz ◽  
Fonti Giulia ◽  
Audagnotto Martina ◽  
...  

Motivation: Proteins are intrinsically dynamic entities. Flexibility sampling methods, such as molecular dynamics or those arising from integrative modeling strategies, are now commonplace and enable the study of molecular conformational landscapes in many contexts. Resulting structural ensembles increase in size as technological and algorithmic advancements take place, making their analysis increasingly demanding. In this regard, cluster analysis remains a go-to approach for their classification. However, many state-of-the-art algorithms are restricted to specific cluster properties. Combined with tedious parameter fine-tuning, cluster analysis of protein structural ensembles suffers from the lack of a generally applicable and easy-to-use clustering scheme. Results: We present CLoNe, an original Python-based clustering scheme that builds on the Density Peaks algorithm of Rodriguez and Laio. CLoNe relies on a probabilistic analysis of local density distributions derived from nearest neighbors to find relevant clusters regardless of cluster shape, size, distribution and amount. We show its capabilities on many toy datasets with properties otherwise dividing state-of-the-art approaches and improve on the original algorithm in key aspects. Applied to structural ensembles, CLoNe was able to extract meaningful conformations from membrane binding events and ligand-binding pocket opening as well as identify dominant dimerization motifs or inter-domain organization. CLoNe additionally saves clusters as individual trajectories for further analysis and provides scripts for automated use with molecular visualization software. Availability: www.epfl.ch/labs/lbm/resources, github.com/LBM-EPFL/CLoNe Supplementary information: Supplementary data are available at Bioinformatics online.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 34301-34317 ◽  
Author(s):  
Donghua Yu ◽  
Guojun Liu ◽  
Maozu Guo ◽  
Xiaoyan Liu ◽  
Shuang Yao

Author(s):  
M. Peng ◽  
W. Wan ◽  
Z. Liu ◽  
K. Di

The multi-source DEMs generated using the images acquired in the descent and landing phase and after landing contain supplementary information, and this makes it possible and beneficial to produce a higher-quality DEM through fusing the multi-scale DEMs. The proposed fusion method consists of three steps. First, source DEMs are split into small DEM patches, then the DEM patches are classified into a few groups by local density peaks clustering. Next, the grouped DEM patches are used for sub-dictionary learning by stochastic coordinate coding. The trained sub-dictionaries are combined into a dictionary for sparse representation. Finally, the simultaneous orthogonal matching pursuit (SOMP) algorithm is used to achieve sparse representation. We use the real DEMs generated from Chang’e-3 descent images and navigation camera (Navcam) stereo images to validate the proposed method. Through the experiments, we have reconstructed a seamless DEM with the highest resolution and the largest spatial coverage among the input data. The experimental results demonstrated the feasibility of the proposed method.


Author(s):  
Wenke Zang ◽  
Liyan Ren ◽  
Wenqian Zhang ◽  
Xiyu Liu

Clustering by fast search and find of Density Peaks (DPC), introduced by Alex Rodríguez and Alessandro Laio, has attracted much attention in the fields of pattern recognition and artificial intelligence. However, DPC still has several unresolved defects. Firstly, the local density $\rho_i$ of point $i$ is affected by the cutoff distance $d_c$, which can influence the clustering result, especially for small real-world cases. Secondly, the number of clusters is still determined intuitively, by using the decision diagram to select the cluster centers. To overcome these defects, this paper proposes an automatic density peaks clustering approach using a DNA genetic algorithm optimized data field and Gaussian process (referred to as ADPC-DNAGA). ADPC-DNAGA can extract the optimal threshold value from the potential entropy of the data field and automatically determine the cluster centers by a Gaussian method. For any data set to be clustered, the threshold can be calculated objectively from the data set rather than estimated empirically. The proposed clustering algorithm is benchmarked on publicly available synthetic and real-world datasets that are commonly used for testing the performance of clustering algorithms. The clustering results are compared not only with those of DPC but also with those of several well-known clustering algorithms such as Affinity Propagation, DBSCAN and Spectral Clustering. The experimental results demonstrate that the proposed algorithm can find the optimal cutoff distance $d_c$, automatically identify clusters regardless of their shape and the dimension of the embedding space, and often outperform the compared algorithms.


2020 ◽  
Author(s):  
Xiaoning Yuan ◽  
Hang Yu ◽  
Jun Liang ◽  
Bing Xu

Recently, the density peaks clustering algorithm (DPC) has attracted a lot of attention. DPC can quickly find cluster centers and complete clustering tasks, and it is suitable for many kinds of clustering tasks. However, the cutoff distance $d_c$ depends on human experience, which greatly affects the clustering results. In addition, the selection of cluster centers requires manual participation, which affects the clustering efficiency. To solve these problems, we propose a density peaks clustering algorithm based on K nearest neighbors with an adaptive merging strategy (KNN-ADPC). We propose a cluster merging strategy to automatically aggregate over-segmented clusters. Additionally, the K nearest neighbors are used to divide points more reasonably. KNN-ADPC has only one parameter, and the clustering task can be conducted automatically without human involvement. The experimental results on artificial and real-world datasets show that KNN-ADPC achieves higher accuracy than DBSCAN, K-means++, DPC and DPC-KNN.


Author(s):  
Xiaoning Yuan ◽  
Hang Yu ◽  
Jun Liang ◽  
Bing Xu

Recently, the density peaks clustering algorithm (DPC) has received a lot of attention from researchers. The DPC algorithm is able to find cluster centers and complete clustering tasks quickly. It is also suitable for different kinds of clustering tasks. However, the choice of the cutoff distance $d_c$ largely depends on human experience, which greatly affects clustering results. In addition, the selection of cluster centers requires manual participation, which affects the efficiency of the algorithm. To solve these problems, we propose a density peaks clustering algorithm based on K nearest neighbors with an adaptive merging strategy (KNN-ADPC). A cluster merging strategy is proposed to automatically aggregate over-segmented clusters. Additionally, the K nearest neighbors are used to divide data points more reasonably. The KNN-ADPC algorithm has only one parameter, and the clustering task can be conducted automatically without human involvement. The experimental results on artificial and real-world datasets show that KNN-ADPC achieves higher accuracy than DBSCAN, K-means++, DPC, and DPC-KNN.
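To make concrete how K nearest neighbors can replace the cutoff distance in the density estimate, here is a minimal sketch. The density form used, the exponential of the negative mean squared distance to the k nearest neighbors, is one form found in kNN-based DPC variants. It is assumed here for illustration and is not confirmed as the formula used by KNN-ADPC. The function name `knn_density` is hypothetical.

```python
import math


def knn_density(points, k):
    """Local density from the k nearest neighbors of each point.

    rho[i] = exp(-(1/k) * sum of squared distances to the k nearest
    neighbors of point i). The only parameter is k; no cutoff distance.
    """
    n = len(points)
    rho = []
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        d = sorted(math.dist(points[i], points[j])
                   for j in range(n) if j != i)
        rho.append(math.exp(-sum(x * x for x in d[:k]) / k))
    return rho
```

Points in tight neighborhoods get a density close to 1, while isolated points get a density close to 0, so the ranking needed by DPC is available without tuning $d_c$.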

