Feature reduction fuzzy C-Means algorithm leveraging the marginal kurtosis measure

2020 ◽  
Vol 39 (5) ◽  
pp. 7259-7279
Author(s):  
Xingguang Pan ◽  
Shitong Wang

The feature reduction fuzzy c-means (FRFCM) algorithm has been proven to be effective for clustering data with redundant/unimportant feature(s). However, the FRFCM algorithm still has the following disadvantages. 1) The FRFCM uses the mean-to-variance-ratio (MVR) index to measure the feature importance of a dataset, but this index is affected by data normalization, i.e., a large MVR value of original feature(s) may become small if the data are normalized, and vice versa. Moreover, the MVR value(s) of the important feature(s) of a dataset may not necessarily be large. 2) The feature weights obtained by the FRFCM are sensitive to the initial cluster centers and initial feature weights. 3) The FRFCM algorithm may be unable to assign the proper weights to the features of a dataset. Thus, in the feature reduction learning process, important features may be discarded, but unimportant features may be retained. These disadvantages can cause the FRFCM algorithm to discard important feature components. In addition, the threshold for the selection of the important feature(s) of the FRFCM may not be easy to determine. To mitigate the disadvantages of the FRFCM algorithm, we first devise a new index, named the marginal kurtosis measure (MKM), to measure the importance of each feature in a dataset. Then, a novel and robust feature reduction fuzzy c-means clustering algorithm called the FRFCM-MKM, which incorporates the marginal kurtosis measure into the FRFCM, is proposed. Furthermore, an accurate threshold is introduced to select important feature(s) and discard unimportant feature(s). Experiments on synthetic and real-world datasets demonstrate that the FRFCM-MKM is effective and efficient.

2021 ◽  
Vol 5 (4) ◽  
pp. 415
Author(s):  
Yessica Nataliani

One of the best-known clustering methods is the fuzzy c-means clustering algorithm, besides k-means and hierarchical clustering. Since FCM treats all data features as equally important, it may obtain a poor clustering result. To solve the problem, feature selection with feature weighting is needed. Besides feature selection by assigning feature weights, there is also feature selection by assigning feature weights and eliminating the unrelated feature(s). THE Feature-reduction FCM (FRFCM) clustering algorithm can improve the FCM clustering result by weighting the features and discarding the unrelated feature(s) during the clustering process. Basketball is one of the famous sports, both international and national. There are five players in basketball, each with a different position. A player can generally be in guard, forward, or center position. Those three general positions need different characteristics of players’ physical conditions. In this paper, FRFCM is used to select the related physical feature(s) for basketball players, consisting of height, weight, age, and body mass index. to determine the basketball players’ position. The result shows that FRFCM can be applied to determine the basketball players’ position, where the most related physical feature is the player’s height. FRFCM gets one incorrect player’s position, so the error rate is 0.0435. As a comparison, FCM gets five incorrect player’s positions, with an error rate of 0.2174. This method can help the coach decide the basketball new player’s position.


2014 ◽  
Vol 998-999 ◽  
pp. 873-877
Author(s):  
Zhen Bo Wang ◽  
Bao Zhi Qiu

To reduce the impact of irrelevant attributes on clustering results, and improve the importance of relevant attributes to clustering, this paper proposes fuzzy C-means clustering algorithm based on coefficient of variation (CV-FCM). In the algorithm, coefficient of variation is used to weigh attributes so as to assign different weights to each attribute in the data set, and the magnitude of weight is used to express the importance of different attributes to clusters. In addition, for the characteristic of fuzzy C-means clustering algorithm that it is susceptible to initial cluster center value, the method for the selection of initial cluster center based on maximum distance is introduced on the basis of weighted coefficient of variation. The result of the experiment based on real data sets shows that this algorithm can select cluster center effectively, with the clustering result superior to general fuzzy C-means clustering algorithms.


2021 ◽  
Vol 40 (1) ◽  
pp. 1017-1024
Author(s):  
Ziheng Wu ◽  
Cong Li ◽  
Fang Zhou ◽  
Lei Liu

Fuzzy C-means clustering algorithm (FCM) is an effective approach for clustering. However, in most existing FCM type frameworks, only in-cluster compactness is taken into account, whereas the between-cluster separability is overlooked. In this paper, to enhance the clustering, by incorporating the feature weighting and data weighting method, we put forward a new weighted fuzzy C-means clustering approach considering between-cluster separability, in which for achieving good compactness and separability, making the in-cluster distances as small as possible and making the between-cluster distances as large as possible, the in-cluster distances and between-cluster distances are taken into account; To achieve the optimal clustering result, the iterative formulas of the feature weights, membership degrees, data weights and cluster centers are obtained by maximizing the in-cluster compactness and the between-cluster separability. Experiments on real-world datasets were carried out, the results showed that the new approach could obtain promising performance.


2013 ◽  
Vol 392 ◽  
pp. 803-807 ◽  
Author(s):  
Xue Bo Feng ◽  
Fang Yao ◽  
Zhi Gang Li ◽  
Xiao Jing Yang

According to the number of cluster centers, initial cluster centers, fuzzy factor, iterations and threshold, Fuzzy C-means clustering algorithm (FCM) clusters the data set. FCM will encounter the initialization problem of clustering prototype. Firstly, the article combines the maximum and minimum distance algorithm and K-means algorithm to determine the number of clusters and the initial cluster centers. Secondly, the article determines the optimal number of clusters with Silhouette indicators. Finally, the article improves the convergence rate of FCM by revising membership constantly. The improved FCM has good clustering effect, enhances the optimized capability, and improves the efficiency and effectiveness of the clustering. It has better tightness in the class, scatter among classes and cluster stability and faster convergence rate than the traditional FCM clustering method.


2018 ◽  
Vol 3 (1) ◽  
pp. 001
Author(s):  
Zulhendra Zulhendra ◽  
Gunadi Widi Nurcahyo ◽  
Julius Santony

In this study using Data Mining, namely K-Means Clustering. Data Mining can be used in searching for a large enough data analysis that aims to enable Indocomputer to know and classify service data based on customer complaints using Weka Software. In this study using the algorithm K-Means Clustering to predict or classify complaints about hardware damage on Payakumbuh Indocomputer. And can find out the data of Laptop brands most do service on Indocomputer Payakumbuh as one of the recommendations to consumers for the selection of Laptops.


2020 ◽  
Vol 15 ◽  
pp. 155892502097832
Author(s):  
Jiaqin Zhang ◽  
Jingan Wang ◽  
Le Xing ◽  
Hui’e Liang

As the precious cultural heritage of the Chinese nation, traditional costumes are in urgent need of scientific research and protection. In particular, there are scanty studies on costume silhouettes, due to the reasons of the need for cultural relic protection, and the strong subjectivity of manual measurement, which limit the accuracy of quantitative research. This paper presents an automatic measurement method for traditional Chinese costume dimensions based on fuzzy C-means clustering and silhouette feature point location. The method is consisted of six steps: (1) costume image acquisition; (2) costume image preprocessing; (3) color space transformation; (4) object clustering segmentation; (5) costume silhouette feature point location; and (6) costume measurement. First, the relative total variation model was used to obtain the environmental robustness and costume color adaptability. Second, the FCM clustering algorithm was used to implement image segmentation to extract the outer silhouette of the costume. Finally, automatic measurement of costume silhouette was achieved by locating its feature points. The experimental results demonstrated that the proposed method could effectively segment the outer silhouette of a costume image and locate the feature points of the silhouette. The measurement accuracy could meet the requirements of industrial application, thus providing the dual value of costume culture research and industrial application.


Sign in / Sign up

Export Citation Format

Share Document