A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters

Computational Intelligence and Neuroscience ◽

10.1155/2016/2647389 ◽

2016 ◽

Vol 2016 ◽

pp. 1-12 ◽

Cited By ~ 13

Author(s):

Min Ren ◽

Peiyu Liu ◽

Zhihao Wang ◽

Jing Yi

Keyword(s):

Adaptive Method ◽

Convergence Result ◽

Decision Function ◽

Optimal Number ◽

Validity Index ◽

Number Of Clusters ◽

Clustering Validity Index ◽

Clustering Validity ◽

Optimal Number Of Clusters ◽

Self Adaptive

To address the shortcoming that the fuzzy c-means algorithm (FCM) needs to know the number of clusters in advance, this paper proposed a new self-adaptive method to determine the optimal number of clusters. First, a density-based algorithm was put forward. Based on the characteristics of the dataset, it automatically determined the maximum possible number of clusters instead of using the empirical rule √n, and it obtained the optimal initial cluster centroids, mitigating the limitation of FCM that randomly selected cluster centroids can lead the convergence result to a local minimum. Second, by introducing a penalty function, the paper proposed a new fuzzy clustering validity index based on fuzzy compactness and separation. The penalty ensured that, as the number of clusters approached the number of objects in the dataset, the value of the index did not decrease monotonically toward zero, a behavior that would otherwise cost the index its robustness and its decision function for the optimal number of clusters. Then, based on these results, a self-adaptive FCM algorithm was put forward to estimate the optimal number of clusters through an iterative trial-and-error process. Finally, experiments on the UCI, KDD Cup 1999, and synthetic datasets showed that the method not only effectively determined the optimal number of clusters but also reduced the number of FCM iterations while giving a stable clustering result.
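The trial-and-error search over the number of clusters described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the classic Xie-Beni index stands in for their penalized validity index, a deterministic farthest-point ("maximin") seeding stands in for their density-based initialization, and the upper bound on c is the plain √n rule that the paper replaces. All function names (`fcm`, `xie_beni`, `choose_c`, `toy_blobs`) are hypothetical.

```python
import math
import random

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def maximin_init(points, c):
    """Assumed stand-in seeding: repeatedly pick the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sqdist(p, v) for v in centers)))
    return [list(v) for v in centers]

def memberships(points, centers, m):
    """Standard FCM membership update: u_ik proportional to d_ik^(-2/(m-1))."""
    power = 1.0 / (m - 1.0)
    U = []
    for p in points:
        inv = [(1.0 / max(sqdist(p, v), 1e-12)) ** power for v in centers]
        total = sum(inv)
        U.append([w / total for w in inv])
    return U

def fcm(points, c, m=2.0, max_iter=200, tol=1e-10):
    """Alternate membership and centroid updates; tol bounds the squared centroid shift."""
    centers = maximin_init(points, c)
    dim = len(points[0])
    for _ in range(max_iter):
        U = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in U]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        shift = max(sqdist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(points, centers, m), centers

def xie_beni(points, U, centers, m=2.0):
    """Xie-Beni index: fuzzy compactness over (n * minimum squared center separation)."""
    compact = sum(
        U[i][k] ** m * sqdist(p, v)
        for i, p in enumerate(points)
        for k, v in enumerate(centers)
    )
    sep = min(
        sqdist(centers[j], centers[k])
        for j in range(len(centers))
        for k in range(j + 1, len(centers))
    )
    return compact / (len(points) * max(sep, 1e-12))

def choose_c(points, m=2.0):
    """Try c = 2 .. floor(sqrt(n)) and keep the c with the smallest index value."""
    c_max = max(2, math.isqrt(len(points)))
    scores = {}
    for c in range(2, c_max + 1):
        U, centers = fcm(points, c, m)
        scores[c] = xie_beni(points, U, centers, m)
    return min(scores, key=scores.get), scores

def toy_blobs(per_blob=30, seed=0):
    """Three tight, well-separated 2-D Gaussian blobs for illustration only."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [[rng.gauss(mx, 0.3), rng.gauss(my, 0.3)] for mx, my in means for _ in range(per_blob)]

if __name__ == "__main__":
    best_c, scores = choose_c(toy_blobs())
    print("selected number of clusters:", best_c)
```

Lower Xie-Beni values indicate compact, well-separated clusters, so the scan keeps the minimizer. The deterministic seeding is only there to make repeated runs agree; any index or initialization could be swapped in without changing the outer loop.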

A Validity Index for Fuzzy Clustering Based on Bipartite Modularity

Journal of Electrical and Computer Engineering ◽

10.1155/2019/2719617 ◽

2019 ◽

Vol 2019 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Yongli Liu ◽

Xiaoyang Zhang ◽

Jingli Chen ◽

Hao Chao

Keyword(s):

Fuzzy Clustering ◽

Optimal Number ◽

Experimental Results ◽

Validity Index ◽

Number Of Clusters ◽

Validity Indices ◽

Noise Data ◽

Clustering Validity ◽

Optimal Number Of Clusters

Because traditional fuzzy clustering validity indices need the number of clusters to be specified and are sensitive to noisy data, we propose a validity index for fuzzy clustering, named CSBM (compactness separateness bipartite modularity), based on bipartite modularity. CSBM improves robustness by combining intraclass compactness and interclass separateness, and it can automatically determine the optimal number of clusters. To evaluate the performance of CSBM, we carried out experiments on six real datasets and compared CSBM with six other prominent indices. Experimental results show that the CSBM index performs best in terms of robustness while accurately detecting the number of clusters.

Estimating the Optimal Number of Clusters Via Internal Validity Index

Neural Processing Letters ◽

10.1007/s11063-021-10427-8 ◽

2021 ◽

Author(s):

Shibing Zhou ◽

Fei Liu ◽

Wei Song

Keyword(s):

Internal Validity ◽

Optimal Number ◽

Validity Index ◽

Number Of Clusters ◽

Optimal Number Of Clusters

Improved Self-Adaptive ACS Algorithm to Determine the Optimal Number of Clusters

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.11.3.11723 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1092

Author(s):

Ayad Mohammed Jabbar ◽

Ku Ruhana Ku-Mahamud ◽

Rafid Sagban

Keyword(s):

Optimal Number ◽

Number Of Clusters ◽

ACS Algorithm ◽

Optimal Number Of Clusters ◽

Self Adaptive

Fast Search Algorithm for Determining the Optimal Number of Clusters using Cluster Validity Index

The Journal of the Korea Contents Association ◽

10.5392/jkca.2009.9.9.080 ◽

2009 ◽

Vol 9 (9) ◽

pp. 80-89 ◽

Cited By ~ 1

Author(s):

Sang-Wook Lee

Keyword(s):

Search Algorithm ◽

Optimal Number ◽

Cluster Validity ◽

Cluster Validity Index ◽

Validity Index ◽

Number Of Clusters ◽

Fast Search ◽

Fast Search Algorithm ◽

Optimal Number Of Clusters

Enhanced cluster validity index for the evaluation of optimal number of clusters for Fuzzy C-Means algorithm

2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) ◽

10.1109/fuzz-ieee.2014.6891591 ◽

2014 ◽

Cited By ~ 10

Author(s):

Neha Bharill ◽

Aruna Tiwari

Keyword(s):

Optimal Number ◽

Cluster Validity ◽

Cluster Validity Index ◽

Validity Index ◽

Number Of Clusters ◽

Fuzzy C Means ◽

Fuzzy C Means Algorithm ◽

Optimal Number Of Clusters

A new cluster validity index using maximum cluster spread based compactness measure

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-02-2016-0006 ◽

2016 ◽

Vol 9 (2) ◽

pp. 179-204 ◽

Cited By ~ 10

Author(s):

M. Arif Wani ◽

Romana Riyaz

Keyword(s):

Optimal Number ◽

Data Sets ◽

Cluster Validity ◽

Cluster Validity Index ◽

Validity Index ◽

Data Set ◽

Number Of Clusters ◽

Content Type ◽

Validity Indices ◽

Optimal Number Of Clusters

Purpose – The most commonly used approaches for cluster validation are based on indices, but the majority of existing cluster validity indices do not work well on data sets of different complexities. The purpose of this paper is to propose a new cluster validity index (ARSD index) that works well on all types of data sets. Design/methodology/approach – The authors introduce a new compactness measure that reflects the typical behaviour of a cluster, where more points are located around the centre and fewer points towards the outer edge. A novel penalty function is proposed for determining the distinctness measure of clusters. A random linear search algorithm is employed to evaluate and compare the performance of five commonly known validity indices and the proposed validity index. The values of the six indices are computed for all nc in the range (nc min, nc max) to obtain the optimal number of clusters present in a data set. The data sets used in the experiments include shaped, Gaussian-like and real data sets. Findings – An extensive experimental study shows that the proposed validity index is more consistent and reliable in indicating the correct number of clusters than the other validity indices. This is demonstrated experimentally on 11 data sets, where the proposed index achieved better results. Originality/value – The paper proposes a novel cluster validity index for determining the optimal number of clusters present in data sets of different complexities.

Research on Fuzzy Clustering Validity

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.40-41.174 ◽

2010 ◽

Vol 40-41 ◽

pp. 174-182

Author(s):

Wei Jin Chen ◽

Huai Lin Dong ◽

Qing Feng Wu ◽

Ling Lin

Keyword(s):

Cluster Analysis ◽

Fuzzy Clustering ◽

Clustering Analysis ◽

Optimal Number ◽

Number Of Clusters ◽

Fuzzy Partition ◽

Geometry Structure ◽

Clustering Validity ◽

Optimal Number Of Clusters

The evaluation of clustering validity is important for clustering analysis and is one of its most active research topics. At its core, evaluating a clustering means judging whether the chosen number of clusters is reasonable. For fuzzy clustering, the paper surveys the widely known validity evaluation methods based on fuzzy partition, geometric structure and statistics.

Method for determining optimal number of clusters in K-means clustering algorithm

Journal of Computer Applications ◽

10.3724/sp.j.1087.2010.01995 ◽

2010 ◽

Vol 30 (8) ◽

pp. 1995-1998 ◽

Cited By ~ 18

Author(s):

Shi-bing ZHOU ◽

Zhen-yuan XU ◽

Xu-qing TANG

Keyword(s):

Clustering Algorithm ◽

Optimal Number ◽

Number Of Clusters ◽

Optimal Number Of Clusters

Clustering Count-based RNA Methylation Data Using a Nonparametric Generative Model

Current Bioinformatics ◽

10.2174/1574893613666180601080008 ◽

2018 ◽

Vol 14 (1) ◽

pp. 11-23 ◽

Cited By ~ 3

Author(s):

Lin Zhang ◽

Yanling He ◽

Huaizhi Wang ◽

Hui Liu ◽

Yufei Huang ◽

...

Keyword(s):

Clustering Analysis ◽

Methylation Level ◽

Optimal Number ◽

Generative Model ◽

Methylation Data ◽

Sequencing Data ◽

Number Of Clusters ◽

RNA Methylation ◽

Clustering Effect ◽

Optimal Number Of Clusters

Background: The RNA methylome has been discovered as an important layer of gene regulation and can be profiled directly with count-based measurements from high-throughput sequencing data. Although the detailed regulatory circuit of the epitranscriptome remains uncharted, a clustering effect in methylation status among different RNA methylation sites can be identified from transcriptome-wide RNA methylation profiles and may reflect epitranscriptomic regulation. Count-based RNA methylation sequencing data has unique features, such as low read coverage, which call for novel clustering approaches. Objective: Besides handling low read coverage, a clustering analysis of count-based RNA methylation sequencing data also needs to preserve the integer nature of the measurements. Method: We proposed a nonparametric generative model together with its Gibbs sampling solution for clustering analysis. The proposed approach implements a beta-binomial mixture model to capture the clustering effect in methylation level using the original count-based measurements rather than an estimated continuous methylation level. In addition, it adopts a nonparametric Dirichlet process to automatically determine an optimal number of clusters, avoiding the common model selection problem in clustering analysis. Results: When tested on the simulated system, the method demonstrated improved clustering performance over hierarchical clustering, K-means, MClust, NMF and EMclust. On a real dataset, it also revealed two novel RNA N6-methyladenosine (m6A) co-methylation patterns that may be induced directly by METTL14 and WTAP, two known regulatory components of the RNA m6A methyltransferase complex. Conclusion: Our proposed DPBBM method not only properly handles the count-based measurements of RNA methylation data from sites of very low read coverage, but also learns an optimal number of clusters adaptively from the data analyzed.
Availability: The source code and documents of DPBBM R package are freely available through the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/DPBBM/.
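The two ingredients named in the Method part of this abstract, a beta-binomial likelihood on raw counts and a Dirichlet process prior over cluster assignments, can be sketched in a few lines. This is an illustrative sketch only and does not reproduce the DPBBM package or its Gibbs sampler. The function names are hypothetical, and the weight for opening a new cluster is simplified by scoring it with a fixed, assumed pair of base parameters instead of integrating over a base measure.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_logpmf(k, n, a, b):
    """log P(k methylated reads out of n | methylation level ~ Beta(a, b))."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def crp_prior(cluster_sizes, alpha):
    """Chinese restaurant process prior: one weight per existing cluster, then a new cluster."""
    total = sum(cluster_sizes) + alpha
    return [size / total for size in cluster_sizes] + [alpha / total]

def assignment_weights(k, n, cluster_params, cluster_sizes, alpha, base_params):
    """Normalized weights for assigning one site; the last entry means 'open a new cluster'.

    Simplifying assumption: the new cluster is scored with fixed base_params = (a, b).
    """
    prior = crp_prior(cluster_sizes, alpha)
    params = list(cluster_params) + [base_params]
    logw = [
        math.log(p) + beta_binomial_logpmf(k, n, a, b)
        for p, (a, b) in zip(prior, params)
    ]
    top = max(logw)  # subtract the maximum before exponentiating for numerical stability
    w = [math.exp(v - top) for v in logw]
    s = sum(w)
    return [v / s for v in w]

if __name__ == "__main__":
    # A site with 9 methylated reads out of 10, two existing clusters of equal size.
    weights = assignment_weights(9, 10, [(9.0, 1.0), (1.0, 9.0)], [5, 5], 1.0, (1.0, 1.0))
    print(weights)
```

Because the number of clusters grows only when the "new cluster" weight wins a draw, a sampler built on weights like these lets the data decide how many clusters to use, which is the model selection point the abstract makes.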

Effective and Optimal Clustering Based on New Clustering Validity Index

2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design (CSCWD) ◽

10.1109/cscwd.2018.8465344 ◽

2018 ◽

Author(s):

Erzhou Zhu ◽

Peng Li ◽

Zhujuan Ma ◽

Xuejun Li ◽

Feng Liu

Keyword(s):

Validity Index ◽

Clustering Validity Index ◽

Clustering Validity
