Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Yang Lei ◽  
Dai Yu ◽  
Zhang Bin ◽  
Yang Yang

Clustering algorithms are a basis of data analysis and are widely used in analysis systems. However, when the data are high-dimensional, a clustering algorithm may overlook the business relations between the dimensions, especially in medical fields. As a result, the clustering result often fails to meet the users’ business goals. If the clustering process can incorporate the users’ knowledge, that is, the doctor’s knowledge or the analysis intent, the result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve the user’s satisfaction with the result. The core of the method is to collect the user’s feedback on the clustering result and use it to optimize that result. A particle swarm optimization algorithm is then used to optimize the parameters, especially the weight settings in the clustering algorithm, so that they reflect the user’s business preference as closely as possible. Based on this parameter optimization and adjustment, the clustering result can move closer to the user’s requirements. Finally, we use a breast cancer example to test our method. The experiments show that our algorithm performs better.
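To illustrate how per-dimension weights can steer a K-means result, here is a minimal sketch. It assumes a feature-weighted Euclidean distance inside an ordinary Lloyd-style loop; the abstract does not give the paper's exact distance or update rule, and the function names `weighted_distance` and `weighted_kmeans` are hypothetical. In the paper's setting the weights would be tuned from user feedback (by particle swarm optimization); here they are simply passed in.

```python
import math

def weighted_distance(a, b, weights):
    """Feature-weighted Euclidean distance (an assumed form, not the paper's)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def weighted_kmeans(points, centers, weights, iterations=20):
    """Lloyd-style K-means that uses the weighted distance above."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: weighted_distance(p, centers[k], weights))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

points = [(0, 0), (0, 10), (10, 0), (10, 10)]
start = [(0, 0), (10, 10)]
# Emphasizing the first dimension groups points by x; the second groups by y.
labels_x, _ = weighted_kmeans(points, start, weights=(1, 0))
labels_y, _ = weighted_kmeans(points, start, weights=(0, 1))
```

The same four points split differently depending on which dimension the weights emphasize, which is the lever an interactive method can adjust to match the analyst's intent.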

2013 ◽  
Vol 321-324 ◽  
pp. 1939-1942
Author(s):  
Lei Gu

The locality sensitive k-means clustering method was presented recently. Although this approach can improve clustering accuracy, it often yields unstable clustering results because random samples are used as the initial centers. In this paper, an initialization method based on core clusters is used for locality sensitive k-means clustering. The core clusters are formed by constructing the σ-neighborhood graph, and their centers are taken as the initial centers of the locality sensitive k-means clustering. To investigate the effectiveness of our approach, several experiments are carried out on three datasets. Experimental results show that the proposed method improves clustering performance compared to the previous locality sensitive k-means clustering.
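The initialization idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: it treats a "core cluster" as a connected component of the graph that links any two points at distance at most σ, and takes the centroids of the k largest components as initial centers. The names `sigma_neighborhood_components` and `core_cluster_centers` are hypothetical.

```python
import math

def sigma_neighborhood_components(points, sigma):
    """Connected components of the graph linking points within distance sigma."""
    n = len(points)
    adjacency = [
        [j for j in range(n)
         if j != i and math.dist(points[i], points[j]) <= sigma]
        for i in range(n)
    ]
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            component.append(i)
            for j in adjacency[i]:
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
        components.append(component)
    return components

def core_cluster_centers(points, sigma, k):
    """Centroids of the k largest components, used as initial centers."""
    components = sigma_neighborhood_components(points, sigma)
    largest = sorted(components, key=len, reverse=True)[:k]
    dim = len(points[0])
    return [
        [sum(points[i][d] for i in comp) / len(comp) for d in range(dim)]
        for comp in largest
    ]

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
initial_centers = core_cluster_centers(data, sigma=1.5, k=2)
```

Because the centers come from dense groups rather than random samples, repeated runs start from the same place, and the isolated point at (50, 50) is not chosen as a center.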


Author(s):  
Zhang Xiaodan ◽  
Hu Xiaohua ◽  
Xia Jiali ◽  
Zhou Xiaohua ◽  
Achananuparp Palakorn

In this article, we present a graph-based knowledge representation for biomedical digital library literature clustering. An efficient clustering method is developed to identify the ontology-enriched k-highest density term subgraphs that capture the core semantic relationship information about each document cluster. The distance between each document and the k term graph clusters is calculated. A document is then assigned to the closest term cluster. The extensive experimental results on two PubMed document sets (Disease10 and OHSUMED23) show that our approach is comparable to spherical k-means. The contributions of our approach are the following: (1) we provide two corpus-level graph representations to improve document clustering, a term co-occurrence graph and an abstract-title graph; (2) we develop an efficient and effective document clustering algorithm by identifying k distinguishable class-specific core term subgraphs using terms’ global and local importance information; and (3) the identified term clusters give a meaningful explanation for the document clustering results.
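The assignment step described above (compute the distance from each document to the k term clusters, then pick the closest) can be sketched as below. The abstract does not specify the distance, so this sketch assumes cosine distance between sparse term-weight dictionaries; the function names and the toy term weights are hypothetical.

```python
import math

def cosine_distance(doc_terms, cluster_terms):
    """1 - cosine similarity between two sparse term-weight dicts (assumed metric)."""
    shared = set(doc_terms) & set(cluster_terms)
    dot = sum(doc_terms[t] * cluster_terms[t] for t in shared)
    norm = (math.sqrt(sum(v * v for v in doc_terms.values()))
            * math.sqrt(sum(v * v for v in cluster_terms.values())))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def assign_documents(docs, term_clusters):
    """Index of the closest term cluster for each document."""
    return [
        min(range(len(term_clusters)),
            key=lambda k: cosine_distance(doc, term_clusters[k]))
        for doc in docs
    ]

# Toy data: two term clusters and two documents.
term_clusters = [
    {"tumor": 1.0, "biopsy": 1.0},
    {"insulin": 1.0, "glucose": 1.0},
]
docs = [
    {"tumor": 2.0, "glucose": 0.5},
    {"insulin": 1.0},
]
assignments = assign_documents(docs, term_clusters)
```

Since each cluster is itself a set of weighted terms, the terms shared between a document and its assigned cluster give a direct explanation for the assignment.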


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yaping Li

The main objective of this paper is to present a new clustering algorithm for metadata trees based on the K-prototypes algorithm, the GSO (glowworm swarm optimization) algorithm, and the maximal frequent path (MFP). Metadata tree clustering involves computing the feature vector of the metadata tree and then clustering the feature vectors. Therefore, traditional data clustering methods are not directly suitable for metadata trees. As the main method to calculate eigenvectors, the MFP method also faces the difficulties of high computational complexity and loss of key information. Generally, the K-prototypes algorithm is suitable for clustering mixed-attribute data such as feature vectors, but it is sensitive to the initial cluster centers. Compared with other swarm intelligence algorithms, the GSO algorithm has a more efficient global search, which makes it suitable for solving multimodal problems and useful for optimizing the K-prototypes algorithm. To address the clustering accuracy and high data dimensionality of metadata tree structures, this paper combines the GSO algorithm, the K-prototypes algorithm, and MFP to design a new metadata structure clustering method. First, MFP is used to describe metadata tree features, and the key parameter of categorical data is introduced into the MFP feature vector to describe the metadata tree more accurately; second, GSO is combined with K-prototypes to design GSOKP, which clusters feature vectors containing both numeric and categorical data so as to improve clustering accuracy; finally, tests are conducted on a set of metadata trees. The experimental results show that the designed metadata tree clustering method GSOKP-FP has certain advantages with respect to clustering accuracy and time complexity.
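For readers unfamiliar with K-prototypes, the sketch below shows the standard mixed dissimilarity it relies on: squared Euclidean distance on the numeric attributes plus a weight `gamma` times the number of mismatched categorical attributes, with prototypes updated by mean and mode. It does not cover the GSO or MFP parts of the paper, and the function names are hypothetical.

```python
from collections import Counter

def k_prototypes_distance(x_num, x_cat, proto_num, proto_cat, gamma):
    """Squared Euclidean on numeric parts plus gamma * categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    mismatches = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric + gamma * mismatches

def update_prototype(rows_num, rows_cat):
    """New prototype: per-column mean (numeric) and mode (categorical)."""
    count = len(rows_num)
    mean = [sum(col) / count for col in zip(*rows_num)]
    mode = [Counter(col).most_common(1)[0][0] for col in zip(*rows_cat)]
    return mean, mode

d = k_prototypes_distance([1.0, 2.0], ["a", "b"], [0.0, 0.0], ["a", "c"], gamma=0.5)
prototype = update_prototype([[1.0], [2.0], [3.0]], [["a"], ["b"], ["a"]])
```

The result depends on the starting prototypes, which is the sensitivity the abstract mentions and the reason a global search such as GSO is paired with it.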


2013 ◽  
Vol 325-326 ◽  
pp. 1632-1636
Author(s):  
Chao Wang ◽  
Ke Luo

As a relatively novel clustering approach, Particle Swarm Optimization (PSO) effectively prevents the k-means algorithm from falling into local optima and has achieved notable success in clustering. However, most existing PSO-based methods require the Hard C-Means algorithm when randomly obtaining the initial cluster centers, even though the samples actually have no definite boundaries. Based on this, we use an improved PSO and, together with the effective handling of boundary objects in Rough Set Theory, propose a new rough clustering algorithm based on PSO. It can dynamically adjust the weighting factors of the upper and lower approximations and coordinate the proportions of the upper and lower approximations across generations. Finally, we compare it with several common clustering methods on the Iris dataset from UCI. The results show that the algorithm has higher accuracy and stability, along with better overall performance.
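The role of lower and upper approximations can be illustrated with a rough k-means step in the style commonly attributed to Lingras and West. This is an assumption about the general technique, not the paper's algorithm: a point whose two nearest centers are almost equally close goes to the boundary region of both clusters, otherwise it goes to the lower approximation of the nearest one. The weight `w_lower` is fixed here, whereas the paper adjusts its weighting factors dynamically. All names and the threshold `epsilon` are hypothetical.

```python
import math

def rough_assign(points, centers, epsilon):
    """Split points into lower approximations and boundary regions."""
    k = len(centers)
    lower = [[] for _ in range(k)]
    boundary = [[] for _ in range(k)]
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        nearest = min(range(k), key=dists.__getitem__)
        close = [j for j in range(k)
                 if j != nearest and dists[j] - dists[nearest] <= epsilon]
        if close:  # ambiguous point: belongs to several upper approximations
            for j in [nearest] + close:
                boundary[j].append(p)
        else:      # unambiguous point: lower approximation of one cluster
            lower[nearest].append(p)
    return lower, boundary

def rough_center(lower, boundary, w_lower, old_center):
    """Weighted mix of the lower-approximation mean and the boundary mean."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    if lower and boundary:
        return [w_lower * a + (1 - w_lower) * b
                for a, b in zip(mean(lower), mean(boundary))]
    if lower:
        return mean(lower)
    if boundary:
        return mean(boundary)
    return list(old_center)

pts = [(0.0,), (1.0,), (5.0,), (9.0,), (10.0,)]
ctrs = [(0.0,), (10.0,)]
lower, boundary = rough_assign(pts, ctrs, epsilon=1.0)
new_center = rough_center(lower[0], boundary[0], w_lower=0.8, old_center=ctrs[0])
```

The point at 5.0 is equidistant from both centers, so it lands in both boundary regions and pulls each center only slightly, in proportion to `1 - w_lower`.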


2011 ◽  
Vol 268-270 ◽  
pp. 10-15
Author(s):  
Jun Yan Chen

This paper presents a hybrid-clustering algorithm that is a stochastic disturbance of particle swarm optimization (PSO) for K-means clustering method (SDPSO-K). The proposed algorithm can improve the particle global searching ability in PSO to avoid the K-means disadvantage of being easily trapped in a local optimal solution and to save the expensive computational cost of PSO clustering. The performance of the SDPSO-K, compared with three recently developed modified PSO techniques and related clustering algorithms for six datasets, indicates that the SDPSO-K algorithm is clearly and consistently superior in terms of precision and robustness.
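As background for PSO-based clustering, the following is a minimal sketch of a plain global-best PSO searching for cluster centers, with each particle encoding k centers and fitness given by the sum of squared errors. It does not implement the stochastic disturbance of SDPSO-K, whose details the abstract does not give. The parameter values (`inertia`, `c1`, `c2`, swarm size) and function names are illustrative assumptions.

```python
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(p, ctr)) for ctr in centers)
        for p in points
    )

def pso_centers(points, k, particles=10, iterations=30,
                inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO; each particle is k centers flattened into one list."""
    rng = random.Random(seed)
    dim = len(points[0])
    size = k * dim

    def unflatten(pos):
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    # Initialize each particle from k distinct data points, with zero velocity.
    swarm = [[float(x) for p in rng.sample(points, k) for x in p]
             for _ in range(particles)]
    velocity = [[0.0] * size for _ in range(particles)]
    best_pos = [list(p) for p in swarm]
    best_val = [sse(points, unflatten(p)) for p in swarm]
    g = min(range(particles), key=best_val.__getitem__)
    gbest, gbest_val = list(best_pos[g]), best_val[g]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                velocity[i][d] = (inertia * velocity[i][d]
                                  + c1 * r1 * (best_pos[i][d] - swarm[i][d])
                                  + c2 * r2 * (gbest[d] - swarm[i][d]))
                swarm[i][d] += velocity[i][d]
            value = sse(points, unflatten(swarm[i]))
            if value < best_val[i]:      # update personal best
                best_val[i], best_pos[i] = value, list(swarm[i])
                if value < gbest_val:    # update global best
                    gbest_val, gbest = value, list(swarm[i])
    return unflatten(gbest), gbest_val

samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, value = pso_centers(samples, k=2)
```

In a hybrid scheme, the centers found this way would typically seed a few K-means iterations, which refine them cheaply once the swarm has located a good region.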

