Research and Application of Clustering Algorithm Based on Shared Nearest Neighbor

Author(s):  
Hanmin Ye ◽  
Xue Bai ◽  
Hao Lv
2021 ◽  
Vol 13 (6) ◽  
pp. 1136
Author(s):  
Yongjun Zhang ◽  
Wangshan Yang ◽  
Xinyi Liu ◽  
Yi Wan ◽  
Xianzhang Zhu ◽  
...  

Efficient building instance segmentation is necessary for many applications such as parallel reconstruction, management and analysis. However, most existing instance segmentation methods still suffer from low completeness, low correctness and low quality when applied to buildings, and these problems are especially evident in complex building scenes. This paper proposes a novel unsupervised building instance segmentation (UBIS) method for airborne Light Detection and Ranging (LiDAR) point clouds, intended for parallel reconstruction analysis, which combines a clustering algorithm with a novel model consistency evaluation method. The proposed method first divides building point clouds into building instances using the improved kd tree 2D shared nearest neighbor clustering algorithm (Ikd-2DSNN). Then, the geometric feature of each building instance is obtained using the model consistency evaluation method and is used to determine whether the instance is a single building instance or a multi-building instance. Finally, each multi-building instance is divided again using the improved kd tree 3D shared nearest neighbor clustering algorithm (Ikd-3DSNN) to improve the accuracy of building instance segmentation. Our experimental results demonstrate that the proposed UBIS method performed well for various buildings in different scenes, such as high-rise buildings, podium buildings and a residential area with detached houses. A comparative analysis confirms that the proposed UBIS method performed better than state-of-the-art methods.


Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2014
Author(s):  
Yi Lv ◽  
Mandan Liu ◽  
Yue Xiang

Clustering analysis is used to reveal the internal relationships among data without prior knowledge and to gather data with common attributes into groups. To address the fact that existing algorithms always need prior knowledge, we propose a fast searching density peak clustering algorithm based on the shared nearest neighbor and adaptive clustering center (DPC-SNNACC). It automatically ascertains the number of knee points in the decision graph according to the characteristics of each dataset, and from that determines the number of clustering centers without human intervention. First, an improved method for calculating local density based on the symmetric distance matrix is proposed. Then, the position of the knee point is obtained by calculating the change in the difference between decision values. Finally, an experimental and comparative evaluation on several datasets from diverse domains establishes the viability of the DPC-SNNACC algorithm.
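The knee-point step described above can be illustrated with a small sketch. It is only one plausible reading of "the change in the difference between decision values", not the DPC-SNNACC rule itself: the function name knee_index and the choice of the largest drop between consecutive gaps are assumptions made for illustration.

```python
def knee_index(decision_values):
    """Return how many of the largest decision values lie before the knee.

    Hypothetical heuristic: sort the values in descending order, take the
    gaps between consecutive values, and pick the position where the gap
    shrinks the most from one step to the next.
    """
    g = sorted(decision_values, reverse=True)
    if len(g) < 3:
        raise ValueError("need at least three decision values")
    # Gaps between consecutive sorted decision values.
    gaps = [g[i] - g[i + 1] for i in range(len(g) - 1)]
    # Change in the gap from one position to the next.
    changes = [gaps[i] - gaps[i + 1] for i in range(len(gaps) - 1)]
    best = max(range(len(changes)), key=lambda i: changes[i])
    # The values g[0..best] sit above the knee, so best + 1 candidates remain.
    return best + 1


# Three large decision values followed by a flat tail.
centers = knee_index([9.0, 8.5, 8.0, 0.5, 0.4, 0.3])
print(centers)
```

Under this reading, the returned count would be used as the number of cluster centers; a real decision graph may call for smoothing or a different criterion.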


2011 ◽  
Vol 145 ◽  
pp. 189-193 ◽  
Author(s):  
Horng Lin Shieh

In this paper, a hybrid method combining rough set and shared nearest neighbor algorithms is proposed for clustering data with non-globular shapes. The rough k-means algorithm is based on the distances between data and cluster centers. It partitions data sets with globular shapes well, but when the data have non-globular shapes, the results obtained by a rough k-means algorithm are not very satisfactory. To resolve this problem, a combined rough set and shared nearest neighbor algorithm is proposed. The proposed algorithm first adopts a shared nearest neighbor algorithm to evaluate the similarity among data, then uses the lower and upper approximations of a rough set algorithm to partition the data set into clusters.
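As a minimal sketch of the similarity step mentioned above, shared nearest neighbor similarity is commonly taken as the number of k-nearest neighbors two points have in common. The helper names knn_sets and snn_similarity, the Euclidean distance and the brute-force neighbor search are assumptions for illustration; the paper's exact definition may differ.

```python
import math


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Brute-force search: rank all other points by Euclidean distance.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Count the neighbors that points i and j have in common."""
    return len(neighbors[i] & neighbors[j])


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
nbrs = knn_sets(points, k=2)
print(snn_similarity(nbrs, 0, 1))  # two points in the same group
print(snn_similarity(nbrs, 0, 3))  # two points in different groups
```

Because the measure depends on neighbor rankings rather than raw distances, it is less sensitive to cluster shape than a distance to a cluster center.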


2015 ◽  
Vol 11 (3) ◽  
pp. 26-48 ◽  
Author(s):  
Guilherme Moreira ◽  
Maribel Yasmina Santos ◽  
João Moura Pires ◽  
João Galvão

Huge amounts of data are available for analysis in today's organizations, which face several challenges when trying to analyze the generated data with the aim of extracting useful information. This analytical capability needs to be enhanced with tools capable of dealing with big data sets without making the analytical process an arduous task. Clustering is often used in the data analysis process, as this technique does not require any prior knowledge about the data. However, clustering algorithms usually require one or more input parameters that influence the clustering process and the results that can be obtained. This work analyses the relations between k, Eps and MinPts, the three input parameters of the SNN (Shared Nearest Neighbor) clustering algorithm, and provides a comprehensive understanding of the relationships that were identified. Moreover, this work proposes specific guidelines for choosing appropriate input parameters, which reduces processing time because the number of trials needed to achieve appropriate results can be substantially reduced.
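To show where k, Eps and MinPts enter, the following is a simplified sketch of a density-based SNN clustering pass. It is an assumption-laden illustration, not the implementation studied in the paper: the function name snn_cluster, the brute-force neighbor search, the rule that similarity is counted only between mutual neighbors, and the label -1 for noise are all choices made here.

```python
import math


def snn_cluster(points, k, eps, min_pts):
    """Label points with cluster ids; -1 marks noise (simplified sketch)."""
    n = len(points)
    # k: size of each point's nearest-neighbor list.
    knn = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        knn.append(set(order[:k]))

    def sim(i, j):
        # Shared neighbors are counted only when i and j list each other.
        if i in knn[j] and j in knn[i]:
            return len(knn[i] & knn[j])
        return 0

    # eps: minimum number of shared neighbors for a strong link.
    strong = [[j for j in knn[i] if sim(i, j) >= eps] for i in range(n)]
    # min_pts: minimum number of strong links for a core point.
    core = [len(strong[i]) >= min_pts for i in range(n)]

    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        # Grow a cluster from this core point along strong links.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in strong[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:
                        stack.append(q)
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 50)]
print(snn_cluster(points, k=3, eps=2, min_pts=2))
```

In this sketch eps can never usefully exceed k, and min_pts can never exceed k either, since strong links are drawn from the k-neighbor list; this is one simple way the three parameters constrain each other.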


2015 ◽  
Vol 719-720 ◽  
pp. 1160-1165 ◽  
Author(s):  
Ya Ran Su ◽  
Xi Xian Niu

Clustering analysis has long been regarded as a hot field in data mining. For different types of data sets and application purposes, researchers focus on various aspects, such as adaptability to density and shape, noise detection, outlier identification, determination of the number of clusters, accuracy and optimization. Much related work focuses on the Shared Nearest Neighbor similarity measure because of its wide adaptability to data sets with complex distributions. Based on Shared Nearest Neighbor, this paper proposes an improved algorithm that mainly targets the problems of naturally varying density, arbitrary shape and determination of the number of clusters. The new algorithm starts from a randomly selected seed, follows the direction of its nearest neighbors, searches for the neighbors with the most similar features, and forms a local maximum cluster; at the same time it dynamically adjusts the data objects' affiliation to achieve local optimization, and the clustering procedure ends once all data objects have been identified. Experiments verify that the new algorithm copes well with differences in density, shape, noise and number of clusters, and can perform fast optimization searching.

