Incremental Hierarchical Clustering for Data Insertion and Its Evaluation
Clustering is employed in various fields. However, conventional methods do not account for changing data, so whenever the data changes, the entire dataset must be re-clustered. This article proposes a method that updates the result of a hierarchical clustering method when a point is inserted, without re-clustering. It defines the center and the radius of a cluster and uses them to determine the cluster into which the point is inserted. The insertion location is determined by similarity, based on the conventional clustering method. The method also introduces the concept of outliers and considers the creation of a new cluster caused by an insertion. Clusters are divided based on an examination of their multimodality. In addition, when the number of clusters increases, previously inserted data points are updated by re-insertion. Experimental results demonstrate that, compared with the conventional method, the execution time of the proposed method is significantly shorter and the clustering accuracy is comparable for some data.
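
The insertion step summarized above can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the center is taken as the mean of the member points, the radius as the largest distance from the center to a member, and a point counts as an outlier (and starts a new cluster) when it lies farther than a hypothetical `slack` factor times the radius from the nearest center. The names `center`, `radius`, `insert_point`, and `slack` are hypothetical and not taken from the article. The multimodality-based division and the re-insertion step are not shown.

```python
import math


def center(points):
    """Assumed definition: the coordinate-wise mean of the cluster's points."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def radius(points):
    """Assumed definition: the largest distance from the center to a member."""
    c = center(points)
    return max(math.dist(c, p) for p in points)


def insert_point(clusters, point, slack=1.5):
    """Insert `point` into `clusters` (a list of lists of tuples) in place.

    The point joins the cluster with the nearest center if it lies within
    `slack` times that cluster's radius. Otherwise it is treated as an
    outlier and starts a new cluster. Returns the index of the cluster
    that received the point.
    """
    best_idx, best_dist = None, math.inf
    for idx, members in enumerate(clusters):
        d = math.dist(center(members), point)
        if d < best_dist:
            best_idx, best_dist = idx, d

    # Note: a singleton cluster has radius 0 under this definition, so only
    # an exact duplicate of its point would join it.
    if best_idx is not None and best_dist <= slack * radius(clusters[best_idx]):
        clusters[best_idx].append(point)
        return best_idx

    clusters.append([point])  # outlier: create a new cluster
    return len(clusters) - 1
```

In this sketch, only the centers and radii of existing clusters are consulted when a point arrives, which is what lets an insertion avoid re-clustering the entire dataset. A practical version would cache the center and radius of each cluster instead of recomputing them on every call.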