GeneticTKM

2013 ◽  
Vol 4 (1) ◽  
pp. 67-77 ◽  
Author(s):  
Masoud Yaghini ◽  
Nasim Gereilinia

The clustering problem under the minimum sum-of-squared-errors criterion is a non-convex, non-linear problem with many local optima, so solutions often become trapped at a locally optimal point. In this paper, a hybrid genetic, tabu search and k-means algorithm, called GeneticTKM, is proposed for the clustering problem. A new mutation operator based on the tabu search algorithm is presented for the proposed hybrid genetic method. The key idea of the new operator is to create a tabu space that lets the search escape the trap of local optima and find better solutions. The results of the proposed algorithm are compared with other clustering algorithms, such as the genetic algorithm, tabu search and particle swarm optimization, by implementing them on standard and simulated data sets. The authors also compare the results of the proposed algorithm with other researchers' results in clustering the standard data sets. The results show that the proposed algorithm is an effective and efficient way to find better solutions for the clustering problem.
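The abstract does not reproduce the operator's pseudocode; the following is a minimal sketch of the ingredients it names, assuming a label-based chromosome encoding: the sum-of-squared-errors criterion, the two k-means refinement steps, and a mutation that reassigns a point while a tabu list forbids undoing recent moves. All function names are illustrative, not the paper's.

```python
import random

def sse(points, centers, labels):
    # Sum of squared errors: the minimised clustering criterion.
    return sum(sum((a - b) ** 2 for a, b in zip(p, centers[l]))
               for p, l in zip(points, labels))

def assign(points, centers):
    # k-means step 1: each point joins its nearest centre.
    def d2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centers)), key=lambda j: d2(p, centers[j]))
            for p in points]

def update_centers(points, labels, k):
    # k-means step 2: each centre moves to its cluster mean.
    centers = []
    for j in range(k):
        grp = [p for p, l in zip(points, labels) if l == j]
        centers.append(tuple(sum(c) / len(grp) for c in zip(*grp)) if grp
                       else random.choice(points))
    return centers

def tabu_mutate(labels, k, tabu, rng, tenure=10):
    # Mutation: reassign one point to another cluster, but never via a move
    # on the tabu list, pushing the search out of the current local optimum.
    for _ in range(20):  # bounded retries in case moves keep being tabu
        i = rng.randrange(len(labels))
        j = rng.randrange(k)
        if j != labels[i] and (i, j) not in tabu:
            child = list(labels)
            tabu.append((i, labels[i]))  # forbid undoing this move for a while
            if len(tabu) > tenure:
                tabu.pop(0)
            child[i] = j
            return child
    return list(labels)
```

A full GeneticTKM-style loop would alternate these k-means steps with selection, crossover and this tabu-guided mutation; the sketch shows only the operator-level pieces.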


2018 ◽  
Vol 33 (3) ◽  
pp. 387 ◽  
Author(s):  
Sudha Khambhampati ◽  
Prasad Calyam ◽  
Xinhui Zhang

Author(s):  
Hongkang Yang ◽  
Esteban G Tabak

Abstract The clustering problem, and more generally latent factor discovery or latent space inference, is formulated in terms of the Wasserstein barycenter problem from optimal transport. The objective proposed is the maximization of the variability attributable to class, further characterized as the minimization of the variance of the Wasserstein barycenter. Existing theory, which constrains the transport maps to rigid translations, is extended to affine transformations. The resulting non-parametric clustering algorithms include k-means as a special case and exhibit more robust performance. A continuous version of these algorithms discovers continuous latent variables and generalizes principal curves. The strength of these algorithms is demonstrated by tests on both artificial and real-world data sets.
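The k-means connection can be checked concretely in the translation-only case: recentring every cluster at its mean and pooling the shifted points gives a barycenter whose variance is exactly the within-cluster sum of squares divided by n, so minimizing barycenter variance over assignments is minimizing the k-means objective. A small 1-D verification (function names are ours, for illustration):

```python
def within_cluster_ss(xs, labels, k):
    # The k-means objective: sum of squared deviations from cluster means.
    total = 0.0
    for j in range(k):
        grp = [x for x, l in zip(xs, labels) if l == j]
        mu = sum(grp) / len(grp)
        total += sum((x - mu) ** 2 for x in grp)
    return total

def barycenter_variance(xs, labels, k):
    # Transport maps restricted to translations: shift every cluster so its
    # mean is zero, pool the shifted points, take the variance of the pool.
    pooled = []
    for j in range(k):
        grp = [x for x, l in zip(xs, labels) if l == j]
        mu = sum(grp) / len(grp)
        pooled.extend(x - mu for x in grp)
    m = sum(pooled) / len(pooled)   # zero by construction
    return sum((x - m) ** 2 for x in pooled) / len(pooled)
```

The affine extension in the paper additionally rescales each cluster; the identity above is the rigid-translation special case.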


2020 ◽  
Vol 34 (04) ◽  
pp. 3211-3218
Author(s):  
Liang Bai ◽  
Jiye Liang

Due to the complex structure of real-world data, nonlinearly separable clustering is one of the most popular and widely studied clustering problems. Currently, various types of algorithms, such as kernel k-means, spectral clustering and density clustering, have been developed to solve this problem. However, it is difficult for them to balance the efficiency and effectiveness of clustering, which limits their real-world application. To overcome this deficiency, we propose a three-level optimization model for nonlinearly separable clustering which divides the clustering problem into three sub-problems: a linearly separable clustering on the object set, a nonlinearly separable clustering on the cluster set and an ensemble clustering on the partition set. An iterative algorithm is proposed to solve the optimization problem. The proposed algorithm recognizes nonlinearly separable clusters effectively at low computational cost. The performance of this algorithm has been studied on synthetic and real data sets. Comparisons with other nonlinearly separable clustering algorithms illustrate the efficiency and effectiveness of the proposed algorithm.
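The first two levels of the decomposition can be pictured with simple stand-ins: cheap k-means compresses the objects into micro-clusters (level 1), then a nonlinear method runs only on the small set of centres, here single-link merging purely as an illustration (level 2). The ensemble level is omitted for brevity, and all choices below are ours, not the paper's.

```python
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def micro_kmeans(points, m, iters=10):
    # Level 1: plain (linearly separable) k-means over the object set,
    # producing m small micro-clusters at low cost.
    centers = [list(p) for p in points[::max(1, len(points) // m)][:m]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        for j in range(len(centers)):
            grp = [p for p, l in zip(points, labels) if l == j]
            if grp:
                centers[j] = [sum(c) / len(grp) for c in zip(*grp)]
    return centers, labels

def single_link_merge(centers, threshold):
    # Level 2: nonlinear clustering on the (much smaller) cluster set;
    # here, union-find over centre pairs closer than the threshold.
    parent = list(range(len(centers)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a in range(len(centers)):
        for b in range(a + 1, len(centers)):
            if dist2(centers[a], centers[b]) <= threshold ** 2:
                parent[find(a)] = find(b)
    return [find(i) for i in range(len(centers))]
```

Because level 2 operates on m centres rather than n objects, the expensive nonlinear step costs O(m²) instead of O(n²), which is the efficiency argument behind the decomposition.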


2008 ◽  
Vol 18 (03) ◽  
pp. 185-194 ◽  
Author(s):  
WESAM BARBAKH ◽  
COLIN FYFE

We introduce a set of clustering algorithms whose performance function overcomes one of the weaknesses of K-means: its sensitivity to initial conditions, which causes it to converge to a local rather than the global optimum. We derive online learning algorithms and illustrate their convergence to optimal solutions which K-means fails to find. We then extend the algorithm by underpinning it with a latent space which enables a topology preserving mapping to be found. We show visualisation results on some standard data sets.
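The abstract does not reproduce the performance functions themselves; the sketch below uses a harmonic-style inverse-distance weighting as a stand-in with the same flavour: every centre is attracted by every point, so a centre stranded far from the data still receives updates instead of going dead, which is the failure mode of batch K-means under a bad initialisation. This is illustrative, not the paper's exact function.

```python
def soft_update(points, centers, eps=1e-9):
    # One batch update: each centre moves to an inverse-squared-distance
    # weighted mean of ALL points (1-D for simplicity). Far centres still
    # feel a pull from the data, unlike the hard K-means assignment.
    new = []
    for c in centers:
        total, s = 0.0, 0.0
        for x in points:
            w = 1.0 / ((x - c) ** 2 + eps)
            total += w
            s += w * x
        new.append(s / total)
    return new

def cluster(points, centers, iters=40):
    for _ in range(iters):
        centers = soft_update(points, centers)
    return centers
```

With hard K-means, a centre initialised at 30.0 for this data would capture no points at all and never move; under the soft weighting both centres are drawn into the two blobs.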


2018 ◽  
Author(s):  
Elisabetta Mereu ◽  
Giovanni Iacono ◽  
Amy Guillaumet-Adkins ◽  
Catia Moutinho ◽  
Giulia Lunazzi ◽  
...  

Abstract Single-cell transcriptomics allows the identification of cellular types, subtypes and states through cell clustering. In this process, similar cells are grouped before determining co-expressed marker genes for phenotype inference. The performance of computational tools is directly associated with their marker identification accuracy, but the lack of an optimal solution challenges a systematic method comparison. Moreover, phenotypes from different studies are challenging to integrate, due to varying resolution, methodology and experimental design. In this work we introduce matchSCore (https://github.com/elimereu/matchSCore), an approach for the fast matching of cell populations across tools, experiments and technologies. We compared 14 computational methods and evaluated their accuracy in clustering and gene marker identification in simulated data sets. We further used matchSCore to project cell type identities across mouse and human cell atlas projects. Despite originating from different technologies, cell populations could be matched across data sets, allowing the assignment of clusters to reference maps and their annotation.
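Matching cell populations by their marker genes can be sketched as follows: score every pair of clusters by the Jaccard overlap of their top marker-gene sets and pair each test cluster with its best-scoring reference cluster. The exact scoring in the matchSCore package may differ; the gene symbols below are only a toy example.

```python
def jaccard(a, b):
    # Overlap of two marker-gene sets: |A ∩ B| / |A ∪ B|.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def match_clusters(markers_ref, markers_test):
    # For every cluster in the test run, report the reference cluster whose
    # marker set has the highest Jaccard overlap, plus that score.
    matches = {}
    for t, genes_t in markers_test.items():
        best = max(markers_ref, key=lambda r: jaccard(markers_ref[r], genes_t))
        matches[t] = (best, jaccard(markers_ref[best], genes_t))
    return matches
```

Because the comparison uses gene sets rather than per-cell labels, the same matching works across tools, experiments and technologies that never saw the same cells.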


2013 ◽  
Vol 441 ◽  
pp. 762-767
Author(s):  
Ning Wang ◽  
Shi You Yang

To find the globally optimal solution of a multimodal function with both continuous and discrete variables, an improved tabu search algorithm is proposed. The improvements include new generating mechanisms for initial and neighborhood solutions, the exclusive use of the tabu list, a restarting methodology for different cycles of iterations, and a mechanism for shifting away from the worst solutions. Numerical results on two examples are reported to demonstrate the feasibility and merit of the proposed algorithm.
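A compact sketch of the ingredients the abstract names: a neighborhood over a mixed continuous/discrete pair, a tabu list of recently visited states, acceptance of the best non-tabu neighbor even when it is worse, and a restart from the incumbent after stagnation. The toy objective and every parameter below are ours, for illustration only.

```python
import random

def tabu_search(f, x0, n0, n_lo, n_hi, iters=500, tenure=50,
                stall_limit=60, seed=0):
    rng = random.Random(seed)
    x, n = x0, n0
    best_x, best_n, best_f = x, n, f(x, n)
    tabu, stall = [], 0
    for _ in range(iters):
        # Neighborhood: perturb the continuous variable, step the discrete one.
        neigh = []
        for _ in range(8):
            xn = x + rng.uniform(-1.0, 1.0)
            nn = min(n_hi, max(n_lo, n + rng.choice([-1, 0, 1])))
            if (nn, round(xn, 1)) not in tabu:
                neigh.append((xn, nn))
        if not neigh:
            continue
        # Tabu search accepts the best non-tabu neighbor, even if worse.
        x, n = min(neigh, key=lambda s: f(*s))
        tabu.append((n, round(x, 1)))
        if len(tabu) > tenure:
            tabu.pop(0)
        if f(x, n) < best_f:
            best_x, best_n, best_f = x, n, f(x, n)
            stall = 0
        else:
            stall += 1
            if stall >= stall_limit:   # start a new cycle from the incumbent
                x, n, stall = best_x, best_n, 0
    return best_x, best_n, best_f
```

Accepting the best non-tabu neighbor, rather than only improving moves, is what lets the method climb out of local minima of the multimodal objective.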


Author(s):  
D T Pham ◽  
A A Afify

Clustering is an important data exploration technique with many applications in different areas of engineering, including engineering design, manufacturing system design, quality assurance, production planning and process planning, modelling, monitoring, and control. The clustering problem has been addressed by researchers from many disciplines. However, efforts to perform effective and efficient clustering on large data sets only started in recent years with the emergence of data mining. The current paper presents an overview of clustering algorithms from a data mining perspective. Attention is paid to techniques of scaling up these algorithms to handle large data sets. The paper also describes a number of engineering applications to illustrate the potential of clustering algorithms as a tool for handling complex real-world problems.


2014 ◽  
Vol 24 (1) ◽  
pp. 151-163 ◽  
Author(s):  
Kristian Sabo

Abstract In this paper, we consider the l1-clustering problem for a finite data-point set which should be partitioned into k disjoint nonempty subsets. In that case, the objective function need be neither convex nor differentiable, and generally it may have many local or global minima. Therefore, it becomes a complex global optimization problem. A method of searching for a locally optimal solution is proposed in the paper, the convergence of the corresponding iterative process is proved and the corresponding algorithm is given. The method is illustrated and compared with some other clustering methods, especially the l2-clustering method, also known in the literature as the smooth k-means method, in a few typical situations such as the presence of outliers among the data and the clustering of incomplete data. Numerical experiments show that in this case the proposed l1-clustering algorithm is faster and gives significantly better results than the l2-clustering algorithm.
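The paper's algorithm is not spelled out in the abstract, but the key property of the l1 objective is that the optimal centre of a cluster is its component-wise median rather than its mean, which is what confers robustness to outliers. A minimal 1-D Lloyd-style sketch (ours, not the paper's method):

```python
def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

def k_medians_1d(xs, centers, iters=10):
    # Alternation for the l1 objective: assign each point to the nearest
    # centre in absolute distance, then move each centre to the median of
    # its cluster (the l1-optimal centre), not the mean.
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: abs(x - centers[j]))
                  for x in xs]
        for j in range(len(centers)):
            grp = [x for x, l in zip(xs, labels) if l == j]
            if grp:
                centers[j] = median(grp)
    return centers, labels
```

With the l2 (mean-based) update on the same data, the outlier at 100.0 would drag the second centre to 33.25; the median update leaves it essentially unmoved, illustrating the robustness the abstract reports.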

