Application of optimization tasks in cluster analysis

10.12737/7483 ◽  
2014 ◽  
Vol 8 (7) ◽  
pp. 0-0
Author(s):  
Олег Сдвижков ◽  
Oleg Sdvizhkov

Cluster analysis [3] is a relatively new branch of mathematics that studies methods for partitioning a set of objects, described by a finite set of attributes, into homogeneous groups (clusters). Cluster analysis is widely used in psychology, sociology, economics (market segmentation), and many other areas that require classifying objects according to their characteristics. Clustering methods are implemented in the STATISTICA [1] and SPSS [2] packages, which return the partition into clusters, clustering and dispersion statistics, and dendrograms for hierarchical clustering algorithms. MS Excel macros for the main clustering methods, with application examples, are given in the monograph [5]. One of the central problems of cluster analysis is to define a criterion for the number of clusters, denoted K, into which a given set of objects is divided. There are several dozen approaches [4] to determining K. In particular, according to [6], K is the smallest number satisfying a condition stated in terms of the minimum total dispersion over partitions into K clusters and the number of objects N. The number of clusters can also be obtained automatically by successive extraction of anomalous clusters [4]. In 2010, a method for obtaining K by means of a density function was proposed and experimentally validated [4]. This article offers two simple approaches to determining K under the requirement that each cluster contains at least two objects. In the first, K is determined from shortest Hamiltonian cycles; in the second, from the minimum spanning tree. Clustering examples with detailed step-by-step solutions and graphic illustrations are provided. The use of an Excel VBA macro that returns the minimum spanning tree is demonstrated on clustering problems. The article includes the macro code, with comments on its main block.
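The abstract does not state how the minimum spanning tree is turned into clusters, so the following Python sketch only illustrates one plausible reading and is not the article's VBA macro. It assumes Euclidean distances, builds the tree with Prim's algorithm, and then removes the longest tree edges one at a time, skipping any cut that would leave a cluster with fewer than two objects. The function names and the cutting rule are assumptions made for this illustration.

```python
import math

def prim_mst(points):
    """Return MST edges (weight, i, j) of the complete Euclidean graph (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost to the growing tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def components(n, edges):
    """Connected components (sorted index lists) of a forest given as (weight, i, j) edges."""
    adj = {i: [] for i in range(n)}
    for _, i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps

def mst_clusters(points, k, min_size=2):
    """Assumed rule: cut the longest MST edges, skipping cuts that create a cluster below min_size."""
    n = len(points)
    kept = prim_mst(points)
    for edge in sorted(kept, reverse=True):
        if len(components(n, kept)) >= k:
            break
        trial = [e for e in kept if e is not edge]
        if all(len(c) >= min_size for c in components(n, trial)):
            kept = trial
    return components(n, kept)

points = [(0, 0), (0, 1), (5, 0), (5, 1), (10, 0), (10, 1), (10, 2)]
clusters = mst_clusters(points, k=3)
```

With the size constraint, the requested k acts as an upper limit: if a single distant object can only be separated as a singleton, the cut is skipped and fewer clusters are returned.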

2016 ◽  
Vol 11 (2) ◽  
pp. 197-215 ◽  
Author(s):  
Qingyuan Wu ◽  
Changchen Zhan ◽  
Fu Lee Wang ◽  
Siyang Wang ◽  
Zeping Tang

Purpose: The rapid growth of web-based and mobile e-learning applications such as massive open online courses has created a large volume of online learning resources. Given such a large amount of learning data, it is important to develop effective clustering approaches for user group modeling and intelligent tutoring. The paper aims to discuss these issues. Design/methodology/approach: In this paper, a minimum spanning tree based approach is proposed for clustering online learning resources. The novel clustering approach has two main stages, namely an elimination stage and a construction stage. During the elimination stage, the Euclidean distance is adopted as the metric for measuring the density of learning resources. Resources with very low densities are identified as outliers and removed. During the construction stage, a minimum spanning tree is built by initializing the centroids according to the degree of freedom of the resources. Online learning resources are subsequently partitioned into clusters by exploiting the structure of the minimum spanning tree. Findings: Conventional clustering algorithms have a number of shortcomings that prevent them from handling online learning resources effectively. On the one hand, existing partitional clustering methods use a randomly assigned centroid for each cluster, which usually leads to ineffective clustering results. On the other hand, classical density-based clustering methods are computationally expensive and time-consuming. Experimental results indicate that the proposed algorithm outperforms traditional clustering algorithms on online learning resources. Originality/value: The effectiveness of the proposed algorithm has been validated on several data sets. Moreover, the proposed clustering algorithm has great potential in e-learning applications. It has been demonstrated how the novel technique can be integrated into various e-learning systems. For example, the clustering technique can classify learners into groups so that homogeneous grouping can improve the effectiveness of learning. Moreover, clustering of online learning resources is valuable for decision making on tutorial strategies and instructional design for intelligent tutoring. Lastly, a number of directions for future research have been identified in the study.
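The elimination stage described above can be sketched roughly as follows. The abstract says only that Euclidean distance is used to measure density, so the density definition here (the number of other resources within a fixed radius) and the parameter names `radius` and `min_neighbors` are assumptions for illustration, not the authors' formula. The construction stage is not sketched, because the abstract does not define the "degree of freedom" used to initialize centroids.

```python
import math

def eliminate_outliers(points, radius, min_neighbors):
    """Split points into (kept, outliers) using an assumed neighbor-count density."""
    kept, outliers = [], []
    for i, p in enumerate(points):
        # Density of p: how many other points lie within `radius` (Euclidean distance).
        density = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if density >= min_neighbors:
            kept.append(p)
        else:
            outliers.append(p)
    return kept, outliers

resources = [(0, 0), (0, 1), (1, 0), (9, 9)]
kept, outliers = eliminate_outliers(resources, radius=1.5, min_neighbors=1)
```

The surviving points would then be passed to whatever spanning-tree construction the method uses.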


2011 ◽  
Vol 03 (04) ◽  
pp. 473-489
Author(s):  
HAI DU ◽  
WEILI WU ◽  
ZAIXIN LU ◽  
YINFENG XU

The Steiner minimum tree and the minimum spanning tree are two important problems in combinatorial optimization. Let P denote a finite set of points, called terminals, in the Euclidean space. A Steiner minimum tree of P, denoted by SMT(P), is a network of minimum length that interconnects all terminals. A minimum spanning tree of P, denoted by MST(P), is also a minimum-length network interconnecting all the points in P, but subject to the constraint that every line segment in it terminates at terminals. Therefore, SMT(P) may contain points not in P, whereas MST(P) cannot contain such points. Let [Formula: see text] denote the n-dimensional Euclidean space. The Steiner ratio in [Formula: see text] is defined to be [Formula: see text], where Ls(P) and Lm(P), respectively, denote the lengths of a Steiner minimum tree and a minimum spanning tree of P. The best previously known lower bound for [Formula: see text] in the literature is 0.615. In this paper, we show that [Formula: see text] for any n ≥ 2.
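A small worked example may help make the two lengths concrete. The sketch below takes P to be the three corners of a unit equilateral triangle (a choice made here for illustration, not taken from the paper). For this P the Steiner minimum tree joins the corners through one extra point at the centroid, while the minimum spanning tree uses two sides of the triangle. Assuming the usual definition of the Steiner ratio as an infimum of Ls(P)/Lm(P) over all P, the value for any single P is only an upper bound on that ratio.

```python
import math

# Terminals: corners of a unit equilateral triangle (illustrative choice).
P = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

# Lm(P): for three points, the MST is all pairwise distances minus the longest one.
sides = [math.dist(P[i], P[j]) for i in range(3) for j in range(i + 1, 3)]
mst_length = sum(sides) - max(sides)

# Ls(P): for an equilateral triangle the single Steiner point is the centroid.
steiner_point = (sum(x for x, _ in P) / 3, sum(y for _, y in P) / 3)
smt_length = sum(math.dist(steiner_point, p) for p in P)

ratio = smt_length / mst_length   # equals sqrt(3) / 2, roughly 0.866
```

Here SMT(P) contains a point that is not in P, which is exactly what MST(P) is not allowed to do.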


2018 ◽  
Vol 27 (2) ◽  
pp. 163-182 ◽  
Author(s):  
Ilanthenral Kandasamy

Neutrosophy (neutrosophic logic) is used to represent uncertain, indeterminate, and inconsistent information available in the real world. This article proposes a method that gives more sensitivity and precision to indeterminacy by splitting the indeterminate concept/value into two according to membership: indeterminacy leaning towards truth membership and indeterminacy leaning towards false membership. The paper introduces a modified form of the neutrosophic set, called the Double-Valued Neutrosophic Set (DVNS), which has these two distinct indeterminate values. Its related properties and axioms are defined and illustrated. Clustering plays an important role in several fields of research, including data mining, pattern recognition, and machine learning. DVNS is better equipped than the Single-Valued Neutrosophic Set to deal with indeterminate and inconsistent information accurately; fuzzy sets and intuitionistic fuzzy sets are incapable of handling such information. A generalised distance measure between DVNSs and the related distance matrix are defined, and a clustering algorithm is constructed on that basis. This article proposes a Double-Valued Neutrosophic Minimum Spanning Tree (DVN-MST) clustering algorithm to cluster data represented by double-valued neutrosophic information. Illustrative examples are given to demonstrate the applications and effectiveness of this clustering algorithm. A comparative study of the DVN-MST clustering algorithm with other clustering algorithms, namely Single-Valued Neutrosophic Minimum Spanning Tree, Intuitionistic Fuzzy Minimum Spanning Tree, and Fuzzy Minimum Spanning Tree, is carried out.


2007 ◽  
Vol 38 (3) ◽  
pp. 303-314 ◽  
Author(s):  
K. Srinivasa Raju ◽  
D. Nagesh Kumar

The present study deals with the application of cluster analysis, Fuzzy Cluster Analysis (FCA) and Kohonen Artificial Neural Networks (KANN) methods for classification of 159 meteorological stations in India into meteorologically homogeneous groups. Eight parameters, namely latitude, longitude, elevation, average temperature, humidity, wind speed, sunshine hours and solar radiation, are considered as the classification criteria for grouping. The optimal number of groups is determined as 14 based on the Davies–Bouldin index approach. It is observed that the FCA approach performed better than the other two methodologies for the present study.
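For readers unfamiliar with the Davies–Bouldin index, the sketch below implements its common textbook form: for each cluster, take the worst ratio of combined within-cluster scatter to between-centroid distance, then average over clusters. Lower values indicate more compact, better separated groups, so the number of groups is typically chosen where the index is smallest. The abstract does not say which variant the study used, and the toy points below are invented for illustration, so this is an assumption-laden sketch rather than the authors' procedure.

```python
import math

def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a partition given as a list of point lists (needs >= 2 clusters)."""
    cents = [centroid(c) for c in clusters]
    # Scatter: mean Euclidean distance from each point to its cluster centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i
        )
    return total / k

good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]   # compact, well separated groups
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]    # stretched, overlapping groups
db_good = davies_bouldin(good)
db_bad = davies_bouldin(bad)
```

In practice the index would be evaluated for each candidate number of groups and the partitions compared.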


1998 ◽  
Vol 52 (9) ◽  
pp. 1210-1221 ◽  
Author(s):  
Eric Laloum ◽  
Nguyen Quy Dao ◽  
Michel Daudon

Sixty-four combination spectra of three major gallstone components [i.e., cholesterol, calcium bilirubinate, and calcium carbonate (aragonite)] were simulated in accordance with a “fractal” ternary diagram. Comparison between the original pattern of composition and factorial maps of pretreated spectra makes it possible to show the effects of different normalization procedures (Euclidean norm, spectrum maximum, and area under spectrum set to 1). Cluster analysis of these spectra, depending on different agglomerative links (single linkage, complete linkage, average linkage, and Ward's criterion), was carried out. All the resultant trees yield the same groups, but Ward's criterion best preserves the pattern of the data. More than 100 gallstones from France and Vietnam were classified by using cluster analysis of their FT-IR spectra with Ward's criterion. Seven homogeneous groups of spectra were extracted, which have been significantly correlated to the four morphological types of gallstones: pure cholesterol, mixed cholesterol, brown pigment, and black pigment stones. This analysis also reveals that the morphological groups are not homogeneous in composition, in particular for black pigment stones.
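The four agglomerative links named above differ only in how the distance between two groups is scored before the closest pair is merged. The following sketch writes out the standard definitions for groups of points with Euclidean distance; treating spectra as plain coordinate tuples and the function names are assumptions for illustration. Ward's criterion is expressed as the increase in within-group sum of squares caused by the merge.

```python
import math

def centroid(group):
    """Coordinate-wise mean of a list of points."""
    dim = len(group[0])
    return tuple(sum(p[d] for p in group) / len(group) for d in range(dim))

def single_linkage(a, b):
    """Distance between the closest pair of points, one from each group."""
    return min(math.dist(p, q) for p in a for q in b)

def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each group."""
    return max(math.dist(p, q) for p in a for q in b)

def average_linkage(a, b):
    """Mean distance over all cross-group pairs."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def ward_cost(a, b):
    """Increase in total within-group sum of squares if a and b are merged."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * math.dist(centroid(a), centroid(b)) ** 2

a = [(0, 0), (0, 2)]
b = [(3, 0), (3, 2)]
scores = {
    "single": single_linkage(a, b),
    "complete": complete_linkage(a, b),
    "average": average_linkage(a, b),
    "ward": ward_cost(a, b),
}
```

An agglomerative procedure repeatedly merges the pair of groups with the smallest score under the chosen link, which is why different links can produce different trees from the same data.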


Author(s):  
Kevin E. Voges

Cluster analysis is a fundamental data reduction technique used in the physical and social sciences. It is of potential interest to managers in Information Science, as it can be used to identify user needs through segmenting users such as Web site visitors. In addition, the theory of rough sets is the subject of intense interest in computational intelligence research. The extension of this theory into rough clustering provides an important and potentially useful addition to the range of cluster analysis techniques available to the manager. Cluster analysis is defined as the grouping of “individuals or objects into clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters” (Hair, Black, Babin, Anderson, & Tatham, 2006). There are a number of comprehensive introductions to cluster analysis (Abonyi & Feil, 2007; Arabie, Hubert, & De Soete, 1994; Cramer, 2003; Everitt, Landau, & Leese, 2001; Gan, Ma, & Wu, 2007; Härdle & Hlávka, 2007). Techniques are often classified as hierarchical or nonhierarchical (Hair et al., 2006), and the most commonly used nonhierarchical technique is the k-means approach developed by MacQueen (1967). Recently, techniques based on developments in computational intelligence have also been used as clustering algorithms. For example, the theory of fuzzy sets developed by Zadeh (1965), which introduced the concept of partial set membership, has been applied to clustering (Abonyi & Feil, 2007; Dumitrescu, Lazzerini, & Jain, 2000). Another technique receiving considerable attention is the theory of rough sets (Pawlak, 1982), which has led to clustering algorithms referred to as rough clustering (do Prado, Engel, & Filho, 2002; Kumar, Krishna, Bapi, & De, 2007; Parmar, Wu, & Blackhurst, 2007; Voges, Pope, & Brown, 2002). This article provides brief introductions to k-means cluster analysis, rough sets theory, and rough clustering, and compares k-means clustering and rough clustering. It shows that rough clustering provides a more flexible solution to the clustering problem, and can be conceptualized as extracting concepts from the data, rather than strictly delineated subgroupings (Pawlak, 1991). Traditional clustering methods generate extensional descriptions of groups (i.e., which objects are members of each cluster), whereas clustering techniques based on rough sets theory generate intensional descriptions (i.e., what are the main characteristics of each cluster) (do Prado et al., 2002). These different goals suggest that both k-means clustering and rough clustering have their place in the data analyst’s and the information manager’s toolbox.


Author(s):  
Anay Majee ◽  
Souradeep Nanda ◽  
Gnana Swathika O.V

Microgrids are a solution to the growing demand for energy in recent times. They have the potential to improve local reliability, reduce cost, and increase penetration rates for distributed renewable energy generation. The inclusion of Renewable Energy Systems (RES), which have become a topic of discussion in recent times because of the acute energy crisis, makes the power flow in the microgrid bi-directional. The presence of RES in the microgrid also makes the grid reconfigurable; reconfiguration may additionally occur when loads or the utility grid are connected or disconnected. Conventional protection strategies are therefore not applicable to microgrids, and protecting the grid under a fault condition is challenging for engineers. In this paper, various Minimum Spanning Tree (MST) algorithms are applied to microgrids to identify the active nodes of the current network topology heuristically and to generate a tree from the given network, so that a minimum number of nodes has to be disconnected from the network during fault clearance. The IEEE-39 and IEEE-69 bus networks are used as sample test systems.
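The abstract does not say which MST algorithms were compared or how edge weights were assigned, so the sketch below shows just one common choice, Kruskal's algorithm with a union-find structure, on a small invented five-bus network. The bus numbering and line weights are hypothetical and are not taken from the IEEE-39 or IEEE-69 test systems.

```python
def kruskal(n, edges):
    """Return MST edges (u, v, weight) for n nodes and (weight, u, v) edges (Kruskal's algorithm)."""
    parent = list(range(n))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree

# Hypothetical 5-bus network: (line weight, from bus, to bus).
lines = [
    (1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3),
    (3, 2, 3), (7, 3, 4), (6, 2, 4),
]
tree = kruskal(5, lines)
total_weight = sum(w for _, _, w in tree)
```

When the topology changes, for example after a line is disconnected, the tree can simply be recomputed from the updated edge list.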

