scholarly journals Fast Searching Density Peak Clustering Algorithm Based on Shared Nearest Neighbor and Adaptive Clustering Center

Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2014
Author(s):  
Yi Lv ◽  
Mandan Liu ◽  
Yue Xiang

The clustering analysis algorithm is used to reveal the internal relationships among the data without prior knowledge and to further gather some data with common attributes into a group. In order to solve the problem that the existing algorithms always need prior knowledge, we proposed a fast searching density peak clustering algorithm based on the shared nearest neighbor and adaptive clustering center (DPC-SNNACC) algorithm. It can automatically ascertain the number of knee points in the decision graph according to the characteristics of different datasets, and further determine the number of clustering centers without human intervention. First, an improved calculation method of local density based on the symmetric distance matrix was proposed. Then, the position of knee point was obtained by calculating the change in the difference between decision values. Finally, the experimental and comparative evaluation of several datasets from diverse domains established the viability of the DPC-SNNACC algorithm.

Symmetry ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 859 ◽  
Author(s):  
Lin

The Density Peak Clustering (DPC) algorithm is a new density-based clustering method. It spends most of its execution time on calculating the local density and the separation distance for each data point in a dataset. The purpose of this study is to accelerate its computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of each data point. We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point. Additionally, the purpose of the separation distance is to assist in choosing the density peaks, which are the data points with both high local density and high separation distance. We propose an approach to identify non-peak data points at an early stage to avoid calculating their separation distances. Our experimental results show that most of the data points in a dataset can benefit from the proposed approaches to accelerate the DPC algorithm.


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Yaohui Liu ◽  
Dong Liu ◽  
Fang Yu ◽  
Zhengming Ma

Clustering is widely used in data analysis, and density-based methods are developed rapidly in the recent 10 years. Although the state-of-art density peak clustering algorithms are efficient and can detect arbitrary shape clusters, they are nonsphere type of centroid-based methods essentially. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are found out as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is classified to one of the initial clusters or noise. Results of experiments on synthetic and real data sets show that RNN-LDH outperforms the current clustering methods based on density peak or reverse nearest neighbors.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Rong Zhou ◽  
Yong Zhang ◽  
Shengzhong Feng ◽  
Nurbol Luktarhan

Clustering aims to differentiate objects from different groups (clusters) by similarities or distances between pairs of objects. Numerous clustering algorithms have been proposed to investigate what factors constitute a cluster and how to efficiently find them. The clustering by fast search and find of density peak algorithm is proposed to intuitively determine cluster centers and assign points to corresponding partitions for complex datasets. This method incorporates simple structure due to the noniterative logic and less few parameters; however, the guidelines for parameter selection and center determination are not explicit. To tackle these problems, we propose an improved hierarchical clustering method HCDP aiming to represent the complex structure of the dataset. A k-nearest neighbor strategy is integrated to compute the local density of each point, avoiding to select the nonnecessary global parameter dc and enables cluster smoothing and condensing. In addition, a new clustering evaluation approach is also introduced to extract a “flat” and “optimal” partition solution from the structure by adaptively computing the clustering stability. The proposed approach is conducted on some applications with complex datasets, where the results demonstrate that the novel method outperforms its counterparts to a large extent.


2020 ◽  
Vol 49 (3) ◽  
pp. 395-411
Author(s):  
Qiannan Wu ◽  
Qianqian Zhang ◽  
Ruizhi Sun ◽  
Li Li ◽  
Huiyu Mu ◽  
...  

Cluster analysis plays a crucial component in consumer behavior segment. The density peak clustering algorithm (DPC) is a novel density-based clustering method. However, it performs poorly in high-dimension datasets and the local density for boundary points. In addition, its fault tolerance is affected by one-step allocation strategy. To overcome these disadvantages, an adaptive density peak clustering algorithm based on dimensional-free and reverse k-nearest neighbors (ERK-DPC) is proposed in this paper. First, we compute Euler cosine distance to obtain the similarity of sample points in high-dimension datasets. Then, the adaptive local density formula is used to measure the local density of each point. Finally, the reverse k-nearest neighbor idea is added on two-step allocation strategy, which assigns the remaining points accurately and effectively. The proposed clustering algorithm is experiments on several benchmark datasets and real-world datasets. By comparing the benchmarks, the results demonstrate that the ERK-DPC algorithm superior to some state-of- the-art methods.


2019 ◽  
Vol 1229 ◽  
pp. 012024 ◽  
Author(s):  
Fan Hong ◽  
Yang Jing ◽  
Hou Cun-cun ◽  
Zhang Ke-zhen ◽  
Yao Ruo-xia

Author(s):  
Xiaoyu Qin ◽  
Kai Ming Ting ◽  
Ye Zhu ◽  
Vincent CS Lee

A recent proposal of data dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity; and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on densitybased clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of clusters called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters, when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot.


Author(s):  
Liping Sun ◽  
Shang Ci ◽  
Xiaoqing Liu ◽  
Xiaoyao Zheng ◽  
Qingying Yu ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document