High-Dimensional Shared Nearest Neighbor Clustering Algorithm

Shared nearest neighbor (SNN) similarity is a similarity measure that addresses two difficulties: low similarity between samples and differing densities among classes. Two SNN-similarity-based clustering methods are currently popular: JP clustering and SNN density-based clustering. Their results depend heavily on the weight of a single edge, which makes them fragile. Motivated by the idea of smooth splicing in computational geometry, the authors design a new SNN-similarity-based clustering algorithm within a graph-theoretic framework. Because it inherits the complementary intensity-smoothness principle, it generalizes better than the two methods above. Experiments on text datasets show its effectiveness.
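For readers unfamiliar with the baseline, the following is a minimal sketch of SNN similarity and of JP (Jarvis-Patrick) clustering, not the algorithm proposed in the paper. It assumes Euclidean distance on numeric tuples; the names `knn_lists`, `snn_similarity`, `jarvis_patrick`, `k`, and `min_shared` are illustrative choices, not taken from the abstract.

```python
from itertools import combinations
from math import dist


def knn_lists(points, k):
    """Return, for each point index, the set of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(neighbours, i, j):
    """Shared-neighbour count, counted only when i and j list each other."""
    if i in neighbours[j] and j in neighbours[i]:
        return len(neighbours[i] & neighbours[j])
    return 0


def jarvis_patrick(points, k, min_shared):
    """Link pairs with at least min_shared common neighbours, then treat
    the connected components of the resulting graph as clusters."""
    neighbours = knn_lists(points, k)
    parent = list(range(len(points)))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(len(points)), 2):
        if snn_similarity(neighbours, i, j) >= min_shared:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


# Two well-separated groups of four points each (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = jarvis_patrick(points, k=3, min_shared=2)
```

Because one edge that reaches `min_shared` is enough to merge two components, the sketch also illustrates the single-edge sensitivity that the abstract describes.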

Research and Application of Clustering Algorithm Based on Shared Nearest Neighbor

2017 International Conference on Green Informatics (ICGI) ◽

10.1109/icgi.2017.10 ◽

2017 ◽

Author(s):

Hanmin Ye ◽

Xue Bai ◽

Hao Lv

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Shared Nearest Neighbor

Unsupervised Building Instance Segmentation of Airborne LiDAR Point Clouds for Parallel Reconstruction Analysis

Remote Sensing ◽

10.3390/rs13061136 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1136

Author(s):

Yongjun Zhang ◽

Wangshan Yang ◽

Xinyi Liu ◽

Yi Wan ◽

Xianzhang Zhu ◽

...

Keyword(s):

Clustering Algorithm ◽

Evaluation Method ◽

Nearest Neighbor ◽

Point Clouds ◽

Airborne Lidar ◽

Model Consistency ◽

Parallel Reconstruction ◽

Shared Nearest Neighbor ◽

Consistency Evaluation ◽

Instance Segmentation

Efficient building instance segmentation is necessary for many applications such as parallel reconstruction, management and analysis. However, most of the existing instance segmentation methods still suffer from low completeness, low correctness and low quality for building instance segmentation, which are especially obvious for complex building scenes. This paper proposes a novel unsupervised building instance segmentation (UBIS) method of airborne Light Detection and Ranging (LiDAR) point clouds for parallel reconstruction analysis, which combines a clustering algorithm and a novel model consistency evaluation method. The proposed method first divides building point clouds into building instances by the improved kd tree 2D shared nearest neighbor clustering algorithm (Ikd-2DSNN). Then, the geometric feature of the building instance is obtained using the model consistency evaluation method, which is used to determine whether the building instance is a single building instance or a multi-building instance. Finally, for multiple building instances, the improved kd tree 3D shared nearest neighbor clustering algorithm (Ikd-3DSNN) is used to divide multi-building instances again to improve the accuracy of building instance segmentation. Our experimental results demonstrate that the proposed UBIS method obtained good performances for various buildings in different scenes such as high-rise building, podium buildings and a residential area with detached houses. A comparative analysis confirms that the proposed UBIS method performed better than state-of-the-art methods.

Fast Searching Density Peak Clustering Algorithm Based on Shared Nearest Neighbor and Adaptive Clustering Center

Symmetry ◽

10.3390/sym12122014 ◽

2020 ◽

Vol 12 (12) ◽

pp. 2014

Author(s):

Yi Lv ◽

Mandan Liu ◽

Yue Xiang

Keyword(s):

Prior Knowledge ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Density Peak ◽

Adaptive Clustering ◽

Clustering Center ◽

Density Peak Clustering ◽

Shared Nearest Neighbor ◽

Fast Searching

The clustering analysis algorithm is used to reveal the internal relationships among the data without prior knowledge and to further gather some data with common attributes into a group. In order to solve the problem that the existing algorithms always need prior knowledge, we proposed a fast searching density peak clustering algorithm based on the shared nearest neighbor and adaptive clustering center (DPC-SNNACC) algorithm. It can automatically ascertain the number of knee points in the decision graph according to the characteristics of different datasets, and further determine the number of clustering centers without human intervention. First, an improved calculation method of local density based on the symmetric distance matrix was proposed. Then, the position of knee point was obtained by calculating the change in the difference between decision values. Finally, the experimental and comparative evaluation of several datasets from diverse domains established the viability of the DPC-SNNACC algorithm.
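The knee-point step can be illustrated with a small sketch. The abstract does not give the exact rule, so this assumes one plausible reading: sort the decision values in descending order, take successive differences, and place the knee where the difference changes most sharply. The function name `knee_index` and the sample values are hypothetical.

```python
def knee_index(decision_values):
    """Return how many sorted decision values lie before the knee."""
    gamma = sorted(decision_values, reverse=True)
    if len(gamma) < 3:
        return len(gamma)
    # First differences: drop between consecutive sorted decision values.
    drops = [gamma[i] - gamma[i + 1] for i in range(len(gamma) - 1)]
    # Change in the difference: how much each drop exceeds the next one.
    changes = [drops[i] - drops[i + 1] for i in range(len(drops) - 1)]
    knee = max(range(len(changes)), key=lambda i: changes[i])
    return knee + 1


# Hypothetical decision values: three stand out, the rest are small.
n_centers = knee_index([0.4, 9.0, 0.5, 8.0, 0.3, 8.5])
```

Under this reading, the points whose decision values lie before the knee would be taken as clustering centers, so the count is chosen from the data and not supplied by the user.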

A Hybrid Clustering Algorithm Based on Rough Set and Shared Nearest Neighbors

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.145.189 ◽

2011 ◽

Vol 145 ◽

pp. 189-193 ◽

Cited By ~ 3

Author(s):

Horng Lin Shieh

Keyword(s):

Rough Set ◽

Hybrid Method ◽

Data Clustering ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Nearest Neighbor Algorithm ◽

Data Set ◽

Lower And Upper Approximations ◽

Shared Nearest Neighbor

In this paper, a hybrid method combining rough set and shared nearest neighbor algorithms is proposed for clustering data with non-globular shapes. The rough k-means algorithm is based on the distances between data and cluster centers. It partitions a data set with globular shapes well, but when the data have non-globular shapes, the results obtained by a rough k-means algorithm are not very satisfactory. To resolve this problem, a combined rough set and shared nearest neighbor algorithm is proposed. The proposed algorithm first adopts a shared nearest neighbor algorithm to evaluate the similarity among data, then uses the lower and upper approximations of a rough set algorithm to partition the data set into clusters.

Shared Nearest Neighbor clustering in a Locality Sensitive Hashing framework

10.1101/093898 ◽

2016 ◽

Author(s):

Sawsan Kanj ◽

Thomas Brüls ◽

Stéphane Gazut

Keyword(s):

Nearest Neighbor ◽

Reference Data ◽

Sequence Data ◽

Scale Up ◽

Nearest Neighbors ◽

High Accuracy ◽

Locality Sensitive Hashing ◽

High Dimensional ◽

Nearest Neighbor Rule ◽

Shared Nearest Neighbor

We present a new algorithm to cluster high dimensional sequence data, and its application to the field of metagenomics, which aims to reconstruct individual genomes from a mixture of genomes sampled from an environmental site, without any prior knowledge of reference data (genomes) or the shape of clusters. Such problems typically cannot be solved directly with classical approaches seeking to estimate the density of clusters, e.g., using the shared nearest neighbors rule, due to the prohibitive size of contemporary sequence datasets. We explore here a new method based on combining the shared nearest neighbor (SNN) rule with the concept of Locality Sensitive Hashing (LSH). The proposed method, called LSH-SNN, works by randomly splitting the input data into smaller-sized subsets (buckets) and employing the shared nearest neighbor rule on each of these buckets. Links can be created among neighbors sharing a sufficient number of elements, hence allowing clusters to be grown from linked elements. LSH-SNN can scale up to larger datasets consisting of millions of sequences, while achieving high accuracy across a variety of sample sizes and complexities.
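The bucket-then-link idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: sequences are assumed to be already encoded as numeric vectors, buckets come from the sign pattern of random projections (one common LSH family), and links require mutual k-nearest neighbours. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict
from math import dist


def lsh_buckets(vectors, n_planes, seed=0):
    """Group vector indices by the sign pattern of random projections."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(n_planes)]
    buckets = defaultdict(list)
    for idx, v in enumerate(vectors):
        key = tuple(sum(a * b for a, b in zip(plane, v)) >= 0
                    for plane in planes)
        buckets[key].append(idx)
    return list(buckets.values())


def snn_links(vectors, bucket, k, min_shared):
    """Within one bucket, link mutual k-nearest neighbours whose
    neighbour lists share at least min_shared members."""
    knn = {}
    for i in bucket:
        ranked = sorted((j for j in bucket if j != i),
                        key=lambda j: dist(vectors[i], vectors[j]))
        knn[i] = set(ranked[:k])
    return [(i, j) for i in bucket for j in knn[i]
            if i < j and i in knn[j]
            and len(knn[i] & knn[j]) >= min_shared]


def lsh_snn(vectors, n_planes=2, k=3, min_shared=1, seed=0):
    """Return one cluster label per vector, grown from in-bucket links."""
    parent = list(range(len(vectors)))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for bucket in lsh_buckets(vectors, n_planes, seed):
        for i, j in snn_links(vectors, bucket, k, min_shared):
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(vectors))]
```

Neighbour search is restricted to each bucket, so its cost depends on bucket size and not on the size of the full dataset, which is the scaling argument the abstract makes. In this sketch, clusters never span buckets.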

An improved clustering algorithm based on density and shared nearest neighbor

2016 IEEE Information Technology, Networking, Electronic and Automation Control Conference ◽

10.1109/itnec.2016.7560314 ◽

2016 ◽

Author(s):

Hanmin Ye ◽

Hao Lv ◽

Qianting Sun

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Shared Nearest Neighbor
