Categorical Data Clustering: Latest Research Papers

Soft Set Multivariate Distribution for Categorical Data Clustering

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.11.5.15420 ◽

2021 ◽

Vol 11 (5) ◽

pp. 1841

Author(s):

Iwan Tri Riyadi Yanto ◽

Rohmat Saedudin ◽

Sely Novita Sari ◽

Mustafa Mat Deris ◽

Norhalina Senan

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Soft Set ◽

Multivariate Distribution ◽

Categorical Data Clustering


Context-Based Geodesic Dissimilarity Measure for Clustering Categorical Data

Applied Sciences ◽

10.3390/app11188416 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8416

Author(s):

Changki Lee ◽

Uk Jung

Keyword(s):

Learning Outcomes ◽

Categorical Data ◽

Dissimilarity Measure ◽

Machine Learning Algorithms ◽

Distance Measures ◽

Categorical Variables ◽

Continuous Data ◽

Clustering Problem ◽

Data Clusters ◽

Categorical Data Clustering

Measuring the dissimilarity between two observations is the basis of many data mining and machine learning algorithms, and its effectiveness has a significant impact on learning outcomes. The dissimilarity or distance computation has been a manageable problem for continuous data because many numerical operations can be successfully applied. However, unlike continuous data, defining a dissimilarity between pairs of observations with categorical variables is not straightforward. This study proposes a new method to measure the dissimilarity between two categorical observations, called a context-based geodesic dissimilarity measure, for the categorical data clustering problem. The proposed method considers the relationships between categorical variables and discovers the implicit topological structures in categorical data. In other words, it can effectively reflect the nonlinear patterns of arbitrarily shaped categorical data clusters. Our experimental results confirm that the proposed measure that considers both nonlinear data patterns and relationships among the categorical variables yields better clustering performance than other distance measures.
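For context on why categorical dissimilarity is not straightforward, the following is a minimal sketch of the simple matching (Hamming-style) dissimilarity, a common baseline for categorical observations. It is not the context-based geodesic measure proposed in the paper; the function name and the toy observations are assumptions made for illustration only.

```python
from typing import Sequence


def simple_matching_dissimilarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of attributes on which two categorical observations differ.

    Baseline only: every mismatch counts equally, and relationships between
    attributes are ignored.
    """
    if len(a) != len(b):
        raise ValueError("observations must have the same number of attributes")
    if not a:
        return 0.0
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


# Hypothetical observations with three categorical attributes each.
x = ("red", "small", "round")
y = ("red", "large", "round")
z = ("blue", "large", "square")

d_xy = simple_matching_dissimilarity(x, y)  # differs on one of three attributes
d_xz = simple_matching_dissimilarity(x, z)  # differs on all three attributes
```

Because this baseline treats every attribute independently and every mismatch as equally large, it cannot reflect relationships among variables or nonlinear cluster shapes, which are the limitations the abstract says the proposed measure addresses.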


Categorical Data Clustering

Clustering ◽

10.1142/9789811241208_0010 ◽

2021 ◽

pp. 527-548

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Categorical Data Clustering


Qualitative Data Clustering to Detect Outliers

Entropy ◽

10.3390/e23070869 ◽

2021 ◽

Vol 23 (7) ◽

pp. 869

Author(s):

Agnieszka Nowak-Brzezińska ◽

Weronika Łazarz

Keyword(s):

Machine Learning ◽

Categorical Data ◽

Data Clustering ◽

Qualitative Data ◽

Clustering Algorithms ◽

Unusual Behavior ◽

Data Set ◽

Qualitative Variable ◽

Qualitative Variables ◽

Categorical Data Clustering

Detecting outliers is a widely studied problem in many disciplines, including statistics, data mining, and machine learning. All anomaly detection activities aim to identify cases of unusual behavior compared to most observations. There are many methods to deal with this issue, and their applicability depends on the size of the data set, the way it is stored, and the type of attributes and their values. Most of them focus on traditional datasets with a large number of quantitative attributes. Although there is a multitude of solutions for detecting outliers in quantitative sets, detecting outliers in data containing only qualitative variables remains a problem with only a small number of research solutions. This article compares three categorical data clustering algorithms: the K-modes algorithm, derived from MacQueen’s K-means algorithm, and the STIRR and ROCK algorithms. The comparison concerns the method of dividing the set into clusters and, in particular, the outliers detected by the algorithms. During the research, the authors analyzed the clusters detected by the indicated algorithms using several datasets that differ in the number of objects and variables, and conducted experiments on the parameters of the algorithms. The presented study made it possible to check whether the algorithms detect outliers in the data similarly and how much they depend on individual parameters and on parameters of the set, such as the number of variables, tuples, and categories of a qualitative variable.
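As a rough illustration of the K-modes idea named in the abstract (cluster centers are per-attribute modes, and distance is the number of mismatched attributes), here is a minimal sketch. It is not the implementation evaluated in the article: the function names, the deterministic seeding with the first k distinct rows, and the toy data are all assumptions chosen to keep the example small and reproducible.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[str, ...]


def mismatches(a: Row, b: Row) -> int:
    """Number of attributes on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows: Sequence[Row], k: int, max_iter: int = 20) -> Tuple[List[int], List[Row]]:
    """Minimal K-modes sketch: assign rows to the nearest mode, then recompute modes."""
    # Assumption: seed with the first k distinct rows (simple and deterministic,
    # not a recommended initialization strategy).
    modes: List[Row] = []
    for r in rows:
        if r not in modes:
            modes.append(r)
        if len(modes) == k:
            break
    if len(modes) < k:
        raise ValueError("need at least k distinct rows")

    labels = [-1] * len(rows)
    for _ in range(max_iter):
        # Assignment step: nearest mode by mismatch count (ties go to the lower index).
        new_labels = [min(range(k), key=lambda j: mismatches(r, modes[j])) for r in rows]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: each mode becomes the most frequent value per attribute.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
    return labels, modes


# Hypothetical data set: six objects, three qualitative variables.
data = [
    ("a", "x", "p"),
    ("b", "y", "q"),
    ("a", "x", "q"),
    ("b", "y", "p"),
    ("a", "z", "p"),
    ("b", "y", "q"),
]
labels, modes = k_modes(data, k=2)
```

In a sketch like this, one simple way to look for outlier candidates would be to rank objects by their mismatch count to the assigned mode, though the article's own outlier criteria may differ.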


Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering

10.1007/978-3-030-85529-1_17 ◽

2021 ◽

pp. 205-217

Author(s):

Toan Nguyen Mau ◽

Van-Nam Huynh

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Attribute Weighting ◽

Categorical Data Clustering


Improving Quality of Ensemble Technique for Categorical Data Clustering Using Granule Computing

10.1007/978-3-030-86472-9_24 ◽

2021 ◽

pp. 261-272

Author(s):

Rahmah Brnawy ◽

Nematollaah Shiri

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Ensemble Technique ◽

Categorical Data Clustering


Learnable Weighting of Intra-attribute Distances for Categorical Data Clustering with Nominal and Ordinal Attributes

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2021.3056510 ◽

2021 ◽

pp. 1-1

Author(s):

Yiqun Zhang ◽

Yiu-ming Cheung

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Categorical Data Clustering


K-modes Based Categorical Data Clustering Algorithms Satisfying Differential Privacy

2020 International Conference on Networking and Network Applications (NaNA) ◽

10.1109/nana51271.2020.00022 ◽

2020 ◽

Author(s):

Mingshuang Li ◽

Yihui Zhou ◽

Wenru Tang ◽

LaiFeng Lu

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Differential Privacy ◽

Clustering Algorithms ◽

Categorical Data Clustering


Metaheuristic-based possibilistic fuzzy k-modes algorithms for categorical data clustering

Information Sciences ◽

10.1016/j.ins.2020.12.051 ◽

2020 ◽

Author(s):

R.J. Kuo ◽

Y.R. Zheng ◽

Thi Phuong Quyen Nguyen

Keyword(s):

Categorical Data ◽

Data Clustering ◽

Categorical Data Clustering


A Comparative Study of Centroid and Medoid based Categorical Data Clustering Methods for Solving Cold-start Recommendation Problem

2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM) ◽

10.1109/cenim51130.2020.9297960 ◽

2020 ◽

Author(s):

Noor Ifada ◽

M. Eko Ariyanto ◽

Mochammad Kautsar Sophan ◽

Moh Nikmat

Keyword(s):

Comparative Study ◽

Categorical Data ◽

Data Clustering ◽

Cold Start ◽

Clustering Methods ◽

Categorical Data Clustering

