Fast Searching Density Peak Clustering Algorithm Based on Shared Nearest Neighbor and Adaptive Clustering Center

Yi Lv; Mandan Liu; Yue Xiang

doi:10.3390/sym12122014

Symmetry ◽

10.3390/sym12122014 ◽

2020 ◽

Vol 12 (12) ◽

pp. 2014

Author(s):

Yi Lv ◽

Mandan Liu ◽

Yue Xiang

Keyword(s):

Prior Knowledge ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Density Peak ◽

Adaptive Clustering ◽

Clustering Center ◽

Density Peak Clustering ◽

Shared Nearest Neighbor ◽

Fast Searching

Clustering analysis is used to reveal the internal relationships among data without prior knowledge and to gather data with common attributes into groups. To address the problem that existing algorithms always need prior knowledge, we propose a fast searching density peak clustering algorithm based on the shared nearest neighbor and adaptive clustering center (DPC-SNNACC). It automatically ascertains the number of knee points in the decision graph according to the characteristics of each dataset, and from that determines the number of clustering centers without human intervention. First, an improved method for calculating local density based on the symmetric distance matrix is proposed. Then, the position of the knee point is obtained by calculating the change in the difference between decision values. Finally, experimental and comparative evaluation on several datasets from diverse domains establishes the viability of the DPC-SNNACC algorithm.
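The knee-point step described in this abstract can be illustrated with a minimal sketch. It assumes the decision value of each point is the usual DPC product of local density and separation distance, and it takes the knee to be where the drop between consecutive sorted decision values shrinks most sharply. The function name `count_centers_by_knee` and this exact rule are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def count_centers_by_knee(gamma):
    """Estimate the number of cluster centers from decision values.

    gamma: 1-D array of decision values (assumed here to be rho * delta).
    Returns how many of the largest values lie before the knee.
    Requires at least three values.
    """
    g = np.sort(np.asarray(gamma, dtype=float))[::-1]  # descending order
    first = g[:-1] - g[1:]             # drop between consecutive values
    change = first[:-1] - first[1:]    # change in that drop
    # The knee is taken where a large drop is followed by a small one,
    # so every value up to and including that position counts as a center.
    return int(np.argmax(change)) + 1

# Three large decision values followed by a flat tail.
gamma = [10.0, 9.0, 8.5, 1.0, 0.9, 0.8, 0.7]
n_centers = count_centers_by_knee(gamma)
```

With the sample values above, the largest change occurs at the drop from 8.5 to 1.0, so the sketch reports three centers.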

Accelerating Density Peak Clustering Algorithm

Symmetry ◽

10.3390/sym11070859 ◽

2019 ◽

Vol 11 (7) ◽

pp. 859 ◽

Cited By ~ 1

Author(s):

Lin

Keyword(s):

Clustering Algorithm ◽

Local Density ◽

Early Stage ◽

Separation Distance ◽

Density Peak ◽

Density Peaks ◽

Density Based Clustering ◽

Data Points ◽

Data Point ◽

Density Peak Clustering

The Density Peak Clustering (DPC) algorithm is a new density-based clustering method. It spends most of its execution time on calculating the local density and the separation distance for each data point in a dataset. The purpose of this study is to accelerate its computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of each data point. We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point. Additionally, the purpose of the separation distance is to assist in choosing the density peaks, which are the data points with both high local density and high separation distance. We propose an approach to identify non-peak data points at an early stage to avoid calculating their separation distances. Our experimental results show that most of the data points in a dataset can benefit from the proposed approaches to accelerate the DPC algorithm.
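For reference, the two quantities this abstract sets out to accelerate can be sketched in their baseline form. The code below assumes the standard DPC definitions: local density as the number of points within a cutoff distance, and separation distance as the distance to the nearest point of higher density. It is the unaccelerated full-scan version, not the neighbor-only approach the paper proposes; the name `dpc_rho_delta` and the cutoff parameter `d_c` are illustrative.

```python
import numpy as np

def dpc_rho_delta(points, d_c):
    """Return (rho, delta) for baseline DPC using a full distance matrix."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, O(n^2) time and memory.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Local density: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1  # subtract the point itself
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            # Distance to the nearest point with strictly higher density.
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: use the largest distance by convention.
            delta[i] = dist[i].max()
    return rho, delta

# Five points on a line: a group near 0 and a pair near 10.
rho, delta = dpc_rho_delta([[0.0], [1.0], [2.0], [10.0], [11.0]], d_c=1.5)
```

Points with both high `rho` and high `delta` are the density peak candidates. The inner loop over all denser points is the scan that the abstract aims to shorten.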

A Novel Local Density Hierarchical Clustering Algorithm Based on Reverse Nearest Neighbors

Mathematical Problems in Engineering ◽

10.1155/2019/2959017 ◽

2019 ◽

Vol 2019 ◽

pp. 1-10

Author(s):

Yaohui Liu ◽

Dong Liu ◽

Fang Yu ◽

Zhengming Ma

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Clustering Algorithms ◽

Real Data ◽

Nearest Neighbors ◽

Clustering Methods ◽

Density Peak ◽

Hierarchical Clustering Algorithm

Clustering is widely used in data analysis, and density-based methods have developed rapidly over the past 10 years. Although the state-of-the-art density peak clustering algorithms are efficient and can detect clusters of arbitrary shape, they are essentially non-spherical centroid-based methods. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are identified as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is assigned to one of the initial clusters or labeled as noise. Results of experiments on synthetic and real data sets show that RNN-LDH outperforms current clustering methods based on density peaks or reverse nearest neighbors.
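The reverse nearest neighbor relation that this abstract builds on can be sketched as follows. The code assumes the usual definition: point j is a reverse neighbor of point i when i is among the k nearest neighbors of j. It only builds the neighbor lists; the extended core regions and density hierarchy of RNN-LDH are not reproduced. The function name `reverse_knn` is illustrative.

```python
import numpy as np

def reverse_knn(points, k):
    """Return (knn, rnn): k-nearest-neighbor indices and reverse-neighbor lists."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # Row j holds the indices of the k points closest to point j.
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # rnn[i] lists every point j that has i among its k nearest neighbors.
    rnn = [[] for _ in range(n)]
    for j in range(n):
        for i in knn[j]:
            rnn[int(i)].append(j)
    return knn, rnn

# Four points on a line; the isolated point at 10 has no reverse neighbors.
knn, rnn = reverse_knn([[0.0], [1.0], [2.5], [10.0]], k=1)
```

A point with many reverse neighbors sits in a dense region, while a point with none (here the one at 10) is a natural noise candidate.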

A Novel Hierarchical Clustering Algorithm Based on Density Peaks for Complex Datasets

Complexity ◽

10.1155/2018/2032461 ◽

2018 ◽

Vol 2018 ◽

pp. 1-8 ◽

Cited By ~ 5

Author(s):

Rong Zhou ◽

Yong Zhang ◽

Shengzhong Feng ◽

Nurbol Luktarhan

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Clustering Algorithms ◽

Complex Structure ◽

Density Peak ◽

Global Parameter ◽

Density Peaks ◽

Complex Datasets

Clustering aims to separate objects into different groups (clusters) by similarities or distances between pairs of objects. Numerous clustering algorithms have been proposed to investigate what factors constitute a cluster and how to find clusters efficiently. The clustering by fast search and find of density peaks algorithm was proposed to intuitively determine cluster centers and assign points to corresponding partitions for complex datasets. This method has a simple structure owing to its noniterative logic and few parameters; however, the guidelines for parameter selection and center determination are not explicit. To tackle these problems, we propose an improved hierarchical clustering method, HCDP, that aims to represent the complex structure of the dataset. A k-nearest neighbor strategy is integrated to compute the local density of each point, which avoids selecting the unnecessary global parameter dc and enables cluster smoothing and condensing. In addition, a new clustering evaluation approach is introduced to extract a “flat” and “optimal” partition solution from the structure by adaptively computing the clustering stability. The proposed approach is applied to several applications with complex datasets, where the results demonstrate that the novel method outperforms its counterparts to a large extent.
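To show how a k-nearest neighbor strategy can replace the global cutoff dc, here is a minimal sketch of a kNN-based local density. The formula used, the exponential of the negative mean squared distance to the k nearest neighbors, is an assumed form chosen for illustration; the abstract does not state the exact density definition used by HCDP. The name `knn_density` is hypothetical.

```python
import numpy as np

def knn_density(points, k):
    """Local density from the k nearest neighbors, with no cutoff distance.

    Assumed form: rho_i = exp(-mean squared distance to the k nearest points).
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    sq = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)            # exclude the point itself
    nearest_sq = np.sort(sq, axis=1)[:, :k]  # k smallest squared distances
    return np.exp(-nearest_sq.mean(axis=1))

# The middle point of the left group is densest; the outlier is sparsest.
rho = knn_density([[0.0], [1.0], [2.0], [10.0]], k=2)
```

Because the density depends only on each point's own neighborhood, the single remaining parameter is k rather than a dataset-wide distance threshold.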

Adaptive density peak clustering based on dimensional-free and reverse k-nearest neighbors

Information Technology And Control ◽

10.5755/j01.itc.49.3.23405 ◽

2020 ◽

Vol 49 (3) ◽

pp. 395-411

Author(s):

Qiannan Wu ◽

Qianqian Zhang ◽

Ruizhi Sun ◽

Li Li ◽

Huiyu Mu ◽

...

Keyword(s):

High Dimension ◽

Clustering Algorithm ◽

Local Density ◽

Nearest Neighbors ◽

Allocation Strategy ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Density Peak ◽

Real World Datasets ◽

Density Peak Clustering

Cluster analysis plays a crucial role in consumer behavior segmentation. The density peak clustering algorithm (DPC) is a novel density-based clustering method. However, it performs poorly on high-dimensional datasets and in estimating the local density of boundary points. In addition, its fault tolerance is affected by its one-step allocation strategy. To overcome these disadvantages, an adaptive density peak clustering algorithm based on dimensional-free and reverse k-nearest neighbors (ERK-DPC) is proposed in this paper. First, we compute the Euler cosine distance to obtain the similarity of sample points in high-dimensional datasets. Then, the adaptive local density formula is used to measure the local density of each point. Finally, the reverse k-nearest neighbor idea is added to a two-step allocation strategy, which assigns the remaining points accurately and effectively. The proposed clustering algorithm is evaluated on several benchmark datasets and real-world datasets. By comparison with the benchmarks, the results demonstrate that the ERK-DPC algorithm is superior to some state-of-the-art methods.

Density Peak Clustering Algorithm based on the Nearest Neighbor

Proceedings of the 3rd International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2019) ◽

10.2991/icmeit-19.2019.106 ◽

2019 ◽

Cited By ~ 1

Author(s):

Bangyu Tong

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Density Peak ◽

Density Peak Clustering

Density Peak Clustering algorithm using knowledge learning-based fruit fly optimization

International Journal of Computers and Applications ◽

10.1080/1206212x.2018.1440340 ◽

2018 ◽

Vol 40 (3) ◽

pp. 1-10

Author(s):

Ruihong Zhou ◽

Qiaoming Liu ◽

Xuming Han ◽

Limin Wang

Keyword(s):

Clustering Algorithm ◽

Fruit Fly ◽

Density Peak ◽

Fruit Fly Optimization ◽

Density Peak Clustering ◽

Knowledge Learning

A Fast Density Peak Clustering Algorithm Optimized by Uncertain Number Neighbors for Breast MR Image

Journal of Physics Conference Series ◽

10.1088/1742-6596/1229/1/012024 ◽

2019 ◽

Vol 1229 ◽

pp. 012024 ◽

Cited By ~ 1

Author(s):

Fan Hong ◽

Yang Jing ◽

Hou Cun-cun ◽

Zhang Ke-zhen ◽

Yao Ruo-xia

Keyword(s):

Clustering Algorithm ◽

Mr Image ◽

Density Peak ◽

Breast Mr ◽

Density Peak Clustering

Nearest-Neighbour-Induced Isolation Similarity and Its Impact on Density-Based Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014755 ◽

2019 ◽

Vol 33 ◽

pp. 4755-4762 ◽

Cited By ~ 3

Author(s):

Xiaoyu Qin ◽

Kai Ming Ting ◽

Ye Zhu ◽

Vincent CS Lee

Keyword(s):

Clustering Algorithm ◽

Distance Measure ◽

Nearest Neighbour ◽

Density Peak ◽

Density Based Clustering ◽

New Type ◽

Density Peak Clustering ◽

The Impact ◽

First Time ◽

Tree Method

A recent proposal of data-dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of cluster called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected while density-connected clusters cannot.
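The idea of a nearest-neighbour-induced similarity can be sketched roughly as follows. The code assumes this construction: in each of t rounds, psi points are sampled as cell centers, every point is assigned to its nearest center, and the similarity of two points is the fraction of rounds in which they share a cell. The function name and the parameter names `psi`, `t`, and `seed` are illustrative assumptions, and the sketch should not be read as the authors' exact implementation.

```python
import numpy as np

def nn_isolation_similarity(points, psi=4, t=100, seed=0):
    """Pairwise similarity in [0, 1] from random nearest-neighbour partitions.

    psi: number of sampled centers per round (must not exceed len(points)).
    t:   number of rounds averaged over.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    rng = np.random.default_rng(seed)
    sim = np.zeros((n, n))
    for _ in range(t):
        # Sample psi distinct points to act as cell centers for this round.
        centers = points[rng.choice(n, size=psi, replace=False)]
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        cell = d.argmin(axis=1)  # nearest center defines each point's cell
        # Count the pairs of points that landed in the same cell.
        sim += cell[:, None] == cell[None, :]
    return sim / t

# Two tight, well-separated groups of four points each.
pts = [[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
       [5, 5], [5.1, 5], [5, 5.1], [5.1, 5.1]]
sim = nn_isolation_similarity(pts, psi=2, t=200)
```

To follow the abstract's recipe of swapping the distance measure while leaving the rest of DBSCAN unchanged, one option is to pass `1 - sim` as a precomputed dissimilarity matrix to a DBSCAN implementation, for example scikit-learn's `DBSCAN(metric="precomputed")`.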

A privacy‐preserving density peak clustering algorithm in cloud computing

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5641 ◽

2020 ◽

Vol 32 (11) ◽

Cited By ~ 1

Author(s):

Liping Sun ◽

Shang Ci ◽

Xiaoqing Liu ◽

Xiaoyao Zheng ◽

Qingying Yu ◽

...

Keyword(s):

Cloud Computing ◽

Clustering Algorithm ◽

Privacy Preserving ◽

Density Peak ◽

Density Peak Clustering

A Multi-Relational Hierarchical Clustering Algorithm Based on Shared Nearest Neighbor Similarity

2007 International Conference on Machine Learning and Cybernetics ◽

10.1109/icmlc.2007.4370836 ◽

2007 ◽

Cited By ~ 1

Author(s):

Jing-Feng Guo ◽

Yu-Yan Zhao ◽

Jing Li

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Hierarchical Clustering Algorithm ◽

Shared Nearest Neighbor
