Application of grid-based k-means clustering algorithm for optimal image processing

Tingna Shi; Penglong Wang; Jeenshing Wang; Shihong Yue

doi:10.2298/csis120126052s

Computer Science and Information Systems ◽

10.2298/csis120126052s ◽

2012 ◽

Vol 9 (4) ◽

pp. 1679-1696 ◽

Cited By ~ 3

Author(s):

Tingna Shi ◽

Penglong Wang ◽

Jeenshing Wang ◽

Shihong Yue

Keyword(s):

Image Processing ◽

Image Segmentation ◽

Clustering Algorithm ◽

Optimal Number ◽

Number Of Clusters ◽

General Index ◽

Benchmark Datasets ◽

Basic Characteristics ◽

Grid Based

The effectiveness of the K-means clustering algorithm for image segmentation has been demonstrated in many studies, but it is limited by three problems: 1) determining a proper number of clusters, since a good-quality segmented image cannot be guaranteed if the number of clusters is chosen incorrectly; 2) the poor typicality of clustering prototypes; and 3) determining an optimal number of pixels. The number of pixels plays an important role in any image processing task, but so far there is no general and efficient method for determining the optimal number. In this paper, a grid-based K-means algorithm is proposed for image segmentation. The advantages of the proposed algorithm over the existing K-means algorithm have been validated on benchmark datasets. In addition, we analyze the basic characteristics of the algorithm and propose a general index based on maximizing the grey-level differences between the investigated objects and the background. Without any additional condition, the proposed index is robust in identifying an optimal number of pixels. Our experiments validate the effectiveness of the proposed index: the resulting images are consistent with the visual perception of the datasets.
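
For orientation, the baseline that the abstract compares against can be sketched as plain K-means applied to grey levels. This is a minimal illustration only: it is not the grid-based variant proposed in the paper, and the function names, the random choice of starting centres, and the iteration cap are assumptions made for the sketch.

```python
import random

def assign(pixels, centers):
    """Label each pixel with the index of its nearest centre."""
    return [min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            for p in pixels]

def kmeans_gray(pixels, k, iters=50, seed=0):
    """Plain Lloyd's K-means on a flat list of grey levels (hypothetical helper).

    Assumes k is no larger than the number of distinct grey levels.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct grey levels picked at random.
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iters):
        labels = assign(pixels, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(pixels, labels) if l == j]
            # Keep the old centre if a cluster happens to be empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, assign(pixels, centers)
```

The number of clusters `k` has to be supplied by the caller, which is exactly the first limitation the abstract lists.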

Method for determining optimal number of clusters in K-means clustering algorithm

Journal of Computer Applications ◽

10.3724/sp.j.1087.2010.01995 ◽

2010 ◽

Vol 30 (8) ◽

pp. 1995-1998 ◽

Cited By ~ 18

Author(s):

Shi-bing ZHOU ◽

Zhen-yuan XU ◽

Xu-qing TANG

Keyword(s):

Clustering Algorithm ◽

Optimal Number ◽

Number Of Clusters ◽

Optimal Number Of Clusters

Finding the Number of Clusters in Data and Better Initial Centers for K-means Algorithm

International Journal of Intelligent Systems and Applications ◽

10.5815/ijisa.2020.06.01 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1-20

Author(s):

Ahmed Fahim ◽

Keyword(s):

Data Clustering ◽

Linear Time ◽

Original Data ◽

Local Minima ◽

Expected Number ◽

Open Problems ◽

Number Of Clusters ◽

Benchmark Datasets ◽

Selection Of

K-means is the best-known algorithm for data clustering in data mining. Its most important advantages are its simplicity, its fast convergence to local minima, and its linear time complexity. The most important open problems with this algorithm are the selection of initial centers and the determination of the exact number of clusters in advance. This paper proposes a solution to both problems together by adding a preprocessing step that obtains the expected number of clusters in the data and better initial centers. Many studies address each of these problems separately, but none solves both together. The preprocessing step requires O(n log n) time, where n is the size of the dataset. It aims to obtain an initial partitioning of the data without determining the number of clusters in advance, and then computes the means of the initial clusters. We then apply k-means to the original data, using the information from the preprocessing step, to obtain the final clusters. We test the proposed method on many benchmark datasets. The experimental results show the efficiency of the proposed method.

Determination of the optimal number of clusters in harmonic data classification

2008 13th International Conference on Harmonics and Quality of Power ◽

10.1109/ichqp.2008.4668773 ◽

2008 ◽

Cited By ~ 2

Author(s):

A. Asheibi ◽

D. Stirling ◽

D. Sutanto

Keyword(s):

Data Classification ◽

Optimal Number ◽

Number Of Clusters ◽

Optimal Number Of Clusters

Faults Detection for Photovoltaic Field Based on K-Means, Elbow, and Average Silhouette Techniques through the Segmentation of a Thermal Image

International Journal of Photoenergy ◽

10.1155/2020/6617597 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Abdelilah Et-taleby ◽

Mohammed Boussetta ◽

Mohamed Benslimane

Keyword(s):

Image Processing ◽

Excellent Result ◽

Clustering Algorithms ◽

R Package ◽

Thermal Image ◽

Optimal Number ◽

Number Of Clusters ◽

Processing Methods ◽

Clustering Quality

Clustering, or grouping, is among the most important image processing methods; it aims to split an image into different groups. Many clustering algorithms have been proposed in the literature, and the K-means algorithm is considered among the simplest and most widely used for classifying an image into regions. In this context, the main objective of this work is to precisely detect and locate the damaged area in photovoltaic (PV) fields by clustering a thermal image with the K-means algorithm. The clustering quality depends on the number of clusters chosen; hence, the elbow method, the average silhouette method, and the NbClust R package are used to find the optimal number K. The simulations show that the K-means algorithm detects the faults in PV panels precisely. An excellent result is obtained with three clusters, as suggested by the elbow method.
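
The two selection criteria named in the abstract can be illustrated with a small sketch that computes the elbow curve (within-cluster sum of squares) and the average silhouette for several values of K. It is not the authors' pipeline: the 1-D toy data stands in for pixel intensities, the helper names are hypothetical, and the deterministic quantile-based starting centres are an assumption chosen only to keep the run reproducible.

```python
def kmeans_1d(xs, k, iters=100):
    """Lloyd's K-means on 1-D data (assumption: quantile-based starting centres)."""
    s = sorted(xs)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]

    def assign(cs):
        return [min(range(k), key=lambda j: (x - cs[j]) ** 2) for x in xs]

    for _ in range(iters):
        labels = assign(centers)
        new = []
        for j in range(k):
            members = [x for x, l in zip(xs, labels) if l == j]
            new.append(sum(members) / len(members) if members else centers[j])
        if new == centers:
            break
        centers = new
    return centers, assign(centers)

def wcss(xs, centers, labels):
    """Within-cluster sum of squares, the quantity plotted by the elbow method."""
    return sum((x - centers[l]) ** 2 for x, l in zip(xs, labels))

def mean_silhouette(xs, labels):
    """Average silhouette width; singleton clusters score 0 by convention."""
    clusters = set(labels)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, (x, l) in enumerate(zip(xs, labels)):
        own = [abs(x - y) for j, (y, m) in enumerate(zip(xs, labels))
               if m == l and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance inside the own cluster
        b = min(sum(abs(x - y) for y, m in zip(xs, labels) if m == o) / labels.count(o)
                for o in clusters if o != l)  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy intensities forming three well-separated groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
elbow_curve, silhouettes = {}, {}
for k in range(1, 5):
    centers, labels = kmeans_1d(data, k)
    elbow_curve[k] = wcss(data, centers, labels)
    if k >= 2:
        silhouettes[k] = mean_silhouette(data, labels)
best_k = max(silhouettes, key=silhouettes.get)
print(elbow_curve, silhouettes, best_k)
```

The elbow is read off `elbow_curve` as the K after which the decrease flattens, while the silhouette criterion simply picks the K with the highest average score.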

Application of grid-based C-means clustering algorithm for image segmentation

2012 International Conference on Systems and Informatics (ICSAI2012) ◽

10.1109/icsai.2012.6223585 ◽

2012 ◽

Author(s):

Shihong Yue ◽

Jian Pan ◽

Lijun Cui

Keyword(s):

Image Segmentation ◽

Clustering Algorithm ◽

Grid Based

A differential evolution algorithm based automatic determination of optimal number of clusters validated by fuzzy intercluster hostility index

2009 First International Conference on Advanced Computing ◽

10.1109/icadvc.2009.5378262 ◽

2009 ◽

Cited By ~ 4

Author(s):

Sourav De ◽

Siddhartha Bhattacharyya ◽

Paramartha Dutta

Keyword(s):

Differential Evolution ◽

Differential Evolution Algorithm ◽

Optimal Number ◽

Automatic Determination ◽

Number Of Clusters ◽

Evolution Algorithm ◽

Optimal Number Of Clusters

A Quantitative Discriminant Method of Elbow Point for the Optimal Number of Clusters in Clustering Algorithm

10.21203/rs.3.rs-58011/v3 ◽

2021 ◽

Author(s):

Congming Shi ◽

Bingtao Wei ◽

Shoulin Wei ◽

Wen Wang ◽

Hai Liu ◽

...

Keyword(s):

Clustering Algorithm ◽

Clustering Algorithms ◽

Optimal Number ◽

Machine Learning Method ◽

Cluster Number ◽

Number Of Clusters ◽

Public Dataset ◽

Optimal Cluster ◽

Better Than ◽

Optimal Number Of Clusters

Clustering, a traditional machine learning method, plays a significant role in data analysis. Most clustering algorithms depend on a predetermined exact number of clusters, whereas in practice the number of clusters is usually unpredictable. Although the Elbow method is one of the most commonly used methods for determining the optimal cluster number, it depends on manually identifying the elbow points on the visualization curve. Thus, even experienced analysts cannot clearly identify the elbow point when the plotted curve is fairly smooth. To solve this problem, a new elbow point discriminant method is proposed that yields a statistical metric estimating an optimal cluster number for a dataset. First, the average degree of distortion obtained by the Elbow method is normalized to the range 0 to 10. Second, the normalized results are used to calculate the cosines of the intersection angles between elbow points. Third, the intersection angles between elbow points are computed from these cosines using the arccosine. Finally, the index of the minimal intersection angle is used as the estimated potential optimal cluster number. Experimental results on simulated datasets and a well-known public dataset (the Iris dataset) demonstrate that the optimal cluster number estimated by the proposed method is better than that given by the widely used Silhouette method.
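
The four steps listed in the abstract can be sketched roughly as follows. This is one possible reading, not the authors' exact formulation: it assumes that each candidate elbow point is an interior point (k, normalized distortion) of the curve, that the intersection angle is the angle between the segments joining it to its two neighbours, and that consecutive k values are one unit apart. The function name and the `first_k` parameter are hypothetical.

```python
import math

def elbow_by_min_angle(distortions, first_k=1):
    """Estimate the cluster number as the index of the sharpest bend.

    distortions[i] is the average distortion for k = first_k + i.
    Returns (estimated_k, angle_in_degrees).
    """
    if len(distortions) < 3:
        raise ValueError("need at least three points to form an angle")
    lo, hi = min(distortions), max(distortions)
    if hi == lo:
        raise ValueError("a flat curve has no elbow")
    # Step 1: normalize the distortions to the range 0 to 10.
    ys = [10.0 * (d - lo) / (hi - lo) for d in distortions]
    best_k, best_angle = None, math.inf
    for i in range(1, len(ys) - 1):
        # Vectors from the current point to its left and right neighbours.
        v1 = (-1.0, ys[i - 1] - ys[i])
        v2 = (1.0, ys[i + 1] - ys[i])
        # Step 2: cosine of the angle between the two vectors.
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        # Step 3: arccosine, clamped against rounding error.
        angle = math.acos(max(-1.0, min(1.0, cos_t)))
        # Step 4: keep the index with the minimal angle.
        if angle < best_angle:
            best_k, best_angle = first_k + i, angle
    return best_k, math.degrees(best_angle)
```

Under these assumptions, a curve that drops steeply up to some k and is nearly flat afterwards yields its smallest angle at that k, so no visual inspection of the plot is needed.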

Analyzing the Effect of Variation in Number of Clusters on the Color Image Segmentation Using K-Means Clustering Algorithm

Journal of Technological Advances and Scientific Research ◽

10.14260/jtasr/2015/1 ◽

2015 ◽

Vol 1 (1) ◽

pp. 1-5

Author(s):

Gunjan Mathur ◽

Hemant Purohit

Keyword(s):

Image Segmentation ◽

Clustering Algorithm ◽

Color Image ◽

Color Image Segmentation ◽

Number Of Clusters

Determination of Optimal Number of Clusters in Wireless Sensor Networks

International Journal of Computer Networks & Communications ◽

10.5121/ijcnc.2012.4415 ◽

2012 ◽

Vol 4 (4) ◽

pp. 235-249 ◽

Cited By ~ 2

Author(s):

Ravi Tandon

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Optimal Number ◽

Wireless Sensor ◽

Number Of Clusters ◽

Optimal Number Of Clusters

Analisis Kinerja Fuzzy C-Means (FCM) dan Fuzzy Subtractive (FS) dalam Clustering Data Alumni STMIK STIKOM Indonesia

INFORMAL: Informatics Journal ◽

10.19184/isj.v6i1.22077 ◽

2021 ◽

Vol 6 (1) ◽

pp. 41

Author(s):

I Kadek Dwi Gandika Supartha ◽

Adi Panca Saputra Iskandar

Keyword(s):

Partition Coefficient ◽

Clustering Algorithm ◽

Large Data ◽

Optimal Number ◽

Subtractive Clustering ◽

Data Set ◽

Number Of Clusters ◽

Fuzzy C Means ◽

Fitness Value ◽

Clustering Data

This study clusters data on STMIK STIKOM Indonesia alumni using the Fuzzy C-Means and Fuzzy Subtractive methods. Cluster validity is tested with the Modified Partition Coefficient (MPC) and Classification Entropy (CE) indices. Clustering is carried out to find hidden patterns or information in a fairly large data set, given that the alumni data at STMIK STIKOM Indonesia have not previously undergone a data mining process. According to the MPC and CE indices, the Fuzzy C-Means clustering algorithm has a higher level of validity than the Fuzzy Subtractive clustering algorithm, so Fuzzy C-Means can be said to cluster the alumni data better than the Fuzzy Subtractive method. The number of clusters with the best fitness value, that is, the optimal number of clusters according to the CE and MPC validity indices, is 5. The cluster with the best characteristics is the 1st cluster, which has 514 members (36.82% of the total alumni), an average GPA of 3.3617, an average study period of 7.8102 semesters, and an average TA work period of 4.9596 months.
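
The two validity indices named in the abstract can be sketched as below, assuming their standard textbook definitions: the partition coefficient is the mean of the squared memberships, the MPC rescales it so that a uniform partition scores 0 and a crisp one scores 1, and the CE is the mean entropy of the memberships. The abstract does not give the formulas, so these definitions and the function names are assumptions, and `u` is a hypothetical N-by-c membership matrix whose rows sum to 1.

```python
import math

def partition_coefficient(u):
    """Mean of squared memberships over all N data points."""
    return sum(m * m for row in u for m in row) / len(u)

def modified_partition_coefficient(u):
    """MPC = 1 - c/(c-1) * (1 - PC); requires at least two clusters."""
    c = len(u[0])
    return 1.0 - c / (c - 1) * (1.0 - partition_coefficient(u))

def classification_entropy(u):
    """CE = -(1/N) * sum of u*ln(u); terms with u == 0 contribute 0."""
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / len(u)
```

With these definitions, a higher MPC and a lower CE both indicate a crisper partition, so candidate cluster counts can be compared by evaluating the indices on the membership matrix produced for each count.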
