Clustering for Probability Density Functions by New k-Medoids Method

Scientific Programming ◽

10.1155/2018/2764016 ◽

2018 ◽

Vol 2018 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

D. Ho-Kieu ◽

T. Vo-Van ◽

T. Nguyen-Trang

Keyword(s):

Probability Density ◽

Clustering Algorithm ◽

Real Life ◽

Probability Density Functions ◽

Rand Index ◽

Computational Time ◽

Adjusted Rand Index ◽

Density Functions ◽

Potential Applications ◽

Iteration Number

This paper proposes a novel and efficient clustering algorithm for probability density functions based on k-medoids. A scheme for selecting good initial medoids is also suggested, which significantly reduces computational time, and a general proof of convergence of the proposed algorithm is presented. The effectiveness and feasibility of the proposed algorithm are verified on both artificial and real datasets and compared with various existing algorithms in terms of adjusted Rand index, computational time, and iteration number. The numerical results show that the proposed algorithm performs very well and has potential real-life applications.
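The abstract gives no algorithmic detail, so the following is only a minimal sketch of generic k-medoids clustering applied to PDFs, not the authors' method. It assumes PDFs discretized on a common grid, an L1 distance between them, and a simple farthest-first initialization; the paper's own initialization scheme is not described here. The names `normal_pdf`, `l1_distance`, `k_medoids` and the sample means are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x (used only to build sample PDFs)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def l1_distance(f, g, dx):
    """Riemann-sum approximation of the L1 distance between two gridded PDFs."""
    return dx * sum(abs(a - b) for a, b in zip(f, g))

def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Initialization is farthest-first (an assumption, not the paper's scheme).
    Returns (medoid indices, cluster label of each object).
    """
    n = len(dist)
    # First medoid: the most central object; then repeatedly add the object
    # farthest from the medoids chosen so far.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    labels = []
    for _ in range(max_iter):
        # Assignment step: each PDF joins the cluster of its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:  # converged: medoids no longer change
            break
        medoids = new_medoids
    return medoids, labels

# Hypothetical data: six unit-variance normal PDFs forming two groups.
dx = 0.05
grid = [-5.0 + i * dx for i in range(321)]  # covers [-5, 11]
means = [0.0, 0.3, 0.6, 5.0, 5.3, 5.6]
pdfs = [[normal_pdf(x, m) for x in grid] for m in means]
dist = [[l1_distance(f, g, dx) for g in pdfs] for f in pdfs]
medoids, labels = k_medoids(dist, k=2)
print(medoids, labels)
```

Because medoids are always actual members of the data set, the method only needs pairwise distances, which is why a precomputed distance matrix is used here.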


An R Code for Implementing Non-hierarchical Algorithm for Clustering of Probability Density Functions

Journal of Advanced Engineering and Computation ◽

10.25073/jaec.201823.194 ◽

2018 ◽

Vol 2 (3) ◽

pp. 174

Author(s):

Diem Ngoc Tran ◽

Tom Vinant ◽

Théo Marc Colombani ◽

Diem Ho-Kieu

Keyword(s):

Probability Density ◽

Clustering Algorithm ◽

Simulated Data ◽

Probability Density Functions ◽

Computational Time ◽

Density Functions ◽

Data Set ◽

Hierarchical Algorithm ◽

Creative Commons ◽

Clustering Quality

This paper presents, for the first time in the R environment, code implementing a non-hierarchical algorithm for clustering one-dimensional probability density functions. The code consists of two primary steps: executing the main clustering algorithm and evaluating the clustering quality. The code is validated on one simulated data set and two applications. The numerical results are highly compatible with those obtained in MATLAB in terms of computational time. The code is intended mainly for educational purposes and aims to make the algorithm available in more environments, so that those interested in clustering have multiple choices. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
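The abstract does not say which measure is used in the quality-evaluation step. As an illustration only, the sketch below computes the adjusted Rand index, a common external measure that the k-medoids entry earlier in this listing names as an evaluation criterion. It is written in Python rather than R, and the function name `adjusted_rand_index` and the sample labelings are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Pairs of objects grouped together in both labelings.
    together = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each labeling separately.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (together - expected) / (maximum - expected)

# Hypothetical labelings: the same partition with swapped cluster names.
truth = [0, 0, 0, 1, 1, 1]
found = [1, 1, 1, 0, 0, 0]
print(adjusted_rand_index(truth, found))
```

The index depends only on which objects are grouped together, so renaming clusters does not change it; values near zero indicate agreement no better than chance.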


Extraction of Correlated Sparse Sources from Signal Mixtures

ISRN Signal Processing ◽

10.1155/2013/218651 ◽

2013 ◽

Vol 2013 ◽

pp. 1-17

Author(s):

M. S. Woolfson ◽

C. Bigan ◽

J. A. Crowe ◽

B. R. Hayes-Gill

Keyword(s):

Probability Density ◽

Blind Source Separation ◽

Source Separation ◽

Probability Density Functions ◽

Separation Method ◽

Density Functions ◽

Strongly Correlated ◽

Zero Correlation ◽

Input Parameters ◽

Potential Applications

A blind source separation method is described to extract sources from data mixtures where the underlying sources are sparse and correlated. The approach used is to detect and analyze segments of time where one source exists on its own. The method does not assume independence of sources and probability density functions are not assumed for any of the sources. A comparison is made between the proposed method and the Fast-ICA and Clusterwise PCA methods. It is shown that the proposed method works best for cases where the underlying sources are strongly correlated because Fast-ICA assumes zero correlation between sources and Clusterwise PCA can be sensitive to overlap between sources. However, for cases of sources that are sparse and weakly correlated with each other, there is a tendency for Fast-ICA and Clusterwise PCA to have better performances than the proposed method, the reason being that these methods appear to be more robust to changes in input parameters to the algorithms. In addition, because of the deflationary nature of the proposed method, there is a tendency for estimates to be more affected by noise than Fast-ICA when the number of sources increases. The paper concludes with a discussion concerning potential applications for the proposed method.


A New Binary Adaptive Elitist Differential Evolution Based Automatic k-Medoids Clustering for Probability Density Functions

Mathematical Problems in Engineering ◽

10.1155/2019/6380568 ◽

2019 ◽

Vol 2019 ◽

pp. 1-16

Author(s):

D. Pham-Toan ◽

T. Vo-Van ◽

A. T. Pham-Chau ◽

T. Nguyen-Trang ◽

D. Ho-Kieu

Keyword(s):

Differential Evolution ◽

Probability Density ◽

Probability Density Functions ◽

Computational Time ◽

Density Functions ◽

Number Of Clusters ◽

Computational Burden ◽

Clustering Problem ◽

Chromosome Representation

This paper proposes an evolutionary-computing-based automatic partitional clustering of probability density functions, the so-called binary adaptive elitist differential evolution for clustering of probability density functions (baeDE-CDFs). Herein, k-medoids-based representative probability density functions (PDFs) are preferred to k-means-based ones because they handle outliers effectively. Moreover, treating the clustering problem as an evolutionary optimization problem allows the number of clusters to be determined “on the run”. Notably, applying the adaptive elitist differential evolution (aeDE) algorithm with a binary chromosome representation not only decreases the computational burden remarkably but also increases the quality of the solution significantly. Multiple numerical examples are designed and examined to verify the proposed algorithm’s performance, and the numerical results are evaluated using numerous criteria to give a comprehensive conclusion. In comparisons with other algorithms in the literature, the proposed algorithm shows statistically significant improvements in both solution quality and computational time.


Improving fuzzy clustering algorithm for probability density functions and applying in image recognition

Model Assisted Statistics and Applications ◽

10.3233/mas-200492 ◽

2020 ◽

Vol 15 (3) ◽

pp. 249-261

Author(s):

Dinh Phamtoan ◽

Tai Vovan

Keyword(s):

Probability Density ◽

Image Recognition ◽

Fuzzy Clustering ◽

Clustering Algorithm ◽

Probability Density Functions ◽

Density Functions ◽

Number Of Clusters ◽

Fuzzy Clustering Algorithm ◽

Different Types ◽

Specific Cluster

This study introduces a measure called the coefficient of within-cluster proximity (CWP) to evaluate the similarity of probability density functions (DFs) within clusters. After examining the lower and upper bounds of CWP and the problems of computing it, a fuzzy clustering algorithm for DFs is proposed. This algorithm can determine a suitable number of clusters and find the probability that each DF belongs to a specific cluster. The convergence of the algorithm is considered in theory and illustrated by numerical examples. The algorithm is applied to image recognition. The results show its strong advantages in comparison with other algorithms. They also indicate the potential of the proposed approach for application to data of different types.


An automatic clustering algorithm for probability density functions

Journal of Statistical Computation and Simulation ◽

10.1080/00949655.2014.949715 ◽

2014 ◽

Vol 85 (15) ◽

pp. 3047-3063 ◽

Cited By ~ 16

Author(s):

Jen-Hao Chen ◽

Wen-Liang Hung

Keyword(s):

Probability Density ◽

Clustering Algorithm ◽

Probability Density Functions ◽

Density Functions ◽

Automatic Clustering


A jackknife entropy-based clustering algorithm for probability density functions

Journal of Statistical Computation and Simulation ◽

10.1080/00949655.2020.1832490 ◽

2020 ◽

pp. 1-15

Author(s):

Jen-Hao Chen ◽

Wen-Liang Hung

Keyword(s):

Probability Density ◽

Clustering Algorithm ◽

Probability Density Functions ◽

Density Functions


A robust automatic clustering algorithm for probability density functions with application to categorizing color images

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2017.1337137 ◽

2017 ◽

Vol 47 (7) ◽

pp. 2152-2168 ◽

Cited By ~ 3

Author(s):

J. H. Chen ◽

Y. C. Chang ◽

W. L. Hung

Keyword(s):

Probability Density ◽

Clustering Algorithm ◽

Probability Density Functions ◽

Color Images ◽

Density Functions ◽

Automatic Clustering


Mie scattering measurements of scalar probability density functions in compressible mixing layers

10.2514/6.1991-1686 ◽

1991 ◽

Cited By ~ 3

Author(s):

N. MESSERSMITH ◽

J. DUTTON ◽

H. KRIER

Keyword(s):

Probability Density ◽

Mie Scattering ◽

Probability Density Functions ◽

Mixing Layers ◽

Density Functions ◽

Scattering Measurements


Summary Statistics of Implied Probability Density Functions and their Properties

SSRN Electronic Journal ◽

10.2139/ssrn.314392 ◽

2002 ◽

Cited By ~ 1

Author(s):

Damien P.G. Lynch ◽

Nikolaos Panigirtzoglou

Keyword(s):

Probability Density ◽

Probability Density Functions ◽

Density Functions ◽

Summary Statistics


Modeling Diameter Distributions with Six Probability Density Functions in Pinus halepensis Mill. Plantations Using Low-Density Airborne Laser Scanning Data in Aragón (Northeast Spain)

Remote Sensing ◽

10.3390/rs13122307 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2307

Author(s):

J. Javier Gorgoso-Varela ◽

Rafael Alonso Ponce ◽

Francisco Rodríguez-Puerta

Keyword(s):

Probability Density ◽

Linear Models ◽

Pinus Halepensis ◽

Beta Function ◽

Probability Density Functions ◽

Lidar Data ◽

Density Functions ◽

Minimum Diameter ◽

Location Parameters ◽

Diameter Distributions

The diameter distributions of trees in 50 temporary sample plots (TSPs) established in Pinus halepensis Mill. stands were recovered from LiDAR metrics by using six probability density functions (PDFs): the Weibull (2P and 3P), Johnson’s SB, beta, generalized beta and gamma-2P functions. The parameters were recovered from the first and the second moments of the distributions (mean and variance, respectively) by using parameter recovery models (PRM). Linear models were used to predict both moments from LiDAR data. In recovering the functions, the location parameters of the distributions were predetermined as the minimum diameter inventoried, and scale parameters were established as the maximum diameters predicted from LiDAR metrics. The Kolmogorov–Smirnov (KS) statistic (Dn), the number of acceptances by the KS test, the Cramér–von Mises (W²) statistic, bias and mean square error (MSE) were used to evaluate the goodness of fit. The fits for the six recovered functions were compared with the fits to all measured data from 58 TSPs (LiDAR metrics could only be extracted from 50 of the plots). In the fitting phase, the location parameters were fixed at a suitable value determined according to the forestry literature (0.75·dmin). The linear models used to recover the two moments of the distributions and the maximum diameters determined from LiDAR data were accurate, with R² values of 0.750, 0.724 and 0.873 for dg, dmed and dmax. Reasonable results were obtained with all six recovered functions. The goodness-of-fit statistics indicated that the beta function was the most accurate, followed by the generalized beta function. The Weibull-3P function provided the poorest fits, and the Weibull-2P and Johnson’s SB functions also yielded poor fits to the data.
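To illustrate what recovering parameters from the first two moments can look like, here is a minimal method-of-moments sketch for the two-parameter Weibull function. It is a generic textbook construction under stated assumptions, not the authors' parameter recovery models: the LiDAR-based regressions are not reproduced, and the function names and the sample mean and variance are hypothetical. It uses the standard relations mean = scale·Γ(1 + 1/shape) and variance = scale²·[Γ(1 + 2/shape) − Γ(1 + 1/shape)²].

```python
import math

def weibull_cv2(shape):
    """Squared coefficient of variation of a Weibull-2P with the given shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return g2 / (g1 * g1) - 1.0

def weibull_from_moments(mean, variance, lo=0.1, hi=100.0, tol=1e-10):
    """Recover (shape, scale) of a Weibull-2P from its mean and variance.

    The squared coefficient of variation depends only on the shape and
    decreases as the shape grows, so the shape is found by bisection on
    [lo, hi]; the scale then follows from the mean.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    target = variance / (mean * mean)
    if not (weibull_cv2(hi) < target < weibull_cv2(lo)):
        raise ValueError("moments fall outside the bracketed shape range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv2(mid) > target:
            lo = mid  # spread still too large: the shape must be larger
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale

# Hypothetical plot-level moments: mean diameter 20 cm, variance 25 cm^2.
shape, scale = weibull_from_moments(20.0, 25.0)
print(shape, scale)
```

In a workflow like the one described, the mean and variance passed in would be the values predicted from LiDAR metrics rather than measured ones.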
