Evolution of SOMs’ Structure and Learning Algorithm: From Visualization of High-Dimensional Data to Clustering of Complex Data

Marian B. Gorzałczany; Filip Rudziński

doi:10.3390/a13050109

Algorithms ◽

10.3390/a13050109 ◽

2020 ◽

Vol 13 (5) ◽

pp. 109 ◽

Cited By ~ 1

Author(s):

Marian B. Gorzałczany ◽

Filip Rudziński

Keyword(s):

Data Visualization ◽

Data Clustering ◽

Learning Algorithm ◽

High Dimensional Data ◽

High Dimensional ◽

Data Sets ◽

Complex Data ◽

Self Organizing Maps ◽

Grid Networks ◽

Self Organizing

In this paper, we briefly present several modifications and generalizations of the concept of self-organizing neural networks, usually referred to as self-organizing maps (SOMs), to illustrate their advantages in applications that range from high-dimensional data visualization to complex data clustering. Starting from conventional SOMs, we discuss Growing SOMs (GSOMs), Growing Grid Networks (GGNs), the Incremental Grid Growing (IGG) approach, and the Growing Neural Gas (GNG) method, as well as our two original solutions: Generalized SOMs with 1-Dimensional Neighborhood (GeSOMs with 1DN, also referred to as Dynamic SOMs (DSOMs)) and Generalized SOMs with Tree-Like Structures (GeSOMs with T-LSs). They are characterized in terms of (i) the modification mechanisms used, (ii) the range of network modifications introduced, (iii) the structure regularity, and (iv) the data-visualization/data-clustering effectiveness. The performance of particular solutions is illustrated and compared by means of selected data sets. We also show that the proposed original solutions, GeSOMs with 1DN (DSOMs) and GeSOMs with T-LSs, outperform alternative approaches in various complex clustering tasks, providing up to a 20% increase in clustering accuracy. The contribution of this work is threefold. First, algorithm-oriented original computer implementations of particular SOM generalizations are developed. Second, their detailed simulation results are presented and discussed. Third, the advantages of our earlier-mentioned original solutions are demonstrated.
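For orientation, the conventional SOM that the surveyed variants start from can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`train_som`, `best_matching_unit`), the linear decay schedules, the Gaussian neighborhood, and all default parameter values are assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the neuron whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a fixed-size rectangular SOM (hypothetical helper, illustrative defaults)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One weight vector per grid position, initialized uniformly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.1)   # neighborhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])    # pull the neuron toward the sample
            step += 1
    return weights
```

The growing variants named in the abstract differ from this baseline mainly in that the set of neurons and their connections changes during training, whereas here the grid is fixed at `rows` by `cols`.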

MIGSOM: Multilevel Interior Growing Self-Organizing Maps for High Dimensional Data Clustering

Neural Processing Letters ◽

10.1007/s11063-012-9233-1 ◽

2012 ◽

Vol 36 (3) ◽

pp. 235-256 ◽

Cited By ~ 7

Author(s):

Thouraya Ayadi ◽

Tarek M. Hamdani ◽

Adel M. Alimi

Keyword(s):

Data Clustering ◽

High Dimensional Data ◽

High Dimensional ◽

Self Organizing Maps ◽

Self Organizing

Data Visualization and High-Dimensional Data Clustering

Clustering ◽

10.1002/9780470382776.ch9 ◽

2009 ◽

pp. 237-261

Keyword(s):

Data Visualization ◽

Data Clustering ◽

High Dimensional Data ◽

High Dimensional

HDGSOM: A Modified Growing Self-Organizing Map for High Dimensional Data Clustering

Fourth International Conference on Hybrid Intelligent Systems (HIS'04) ◽

10.1109/ichis.2004.52 ◽

2005 ◽

Cited By ~ 19

Author(s):

R. Amarasiri ◽

D. Alahakoon ◽

K.A. Smith

Keyword(s):

Data Clustering ◽

High Dimensional Data ◽

High Dimensional ◽

Self Organizing Map ◽

Self Organizing

Self-Organizing Map Learning Nonlinearly Embedded Manifolds

Information Visualization ◽

10.1057/palgrave.ivs.9500088 ◽

2005 ◽

Vol 4 (1) ◽

pp. 22-31 ◽

Cited By ~ 8

Author(s):

Timo Similä

Keyword(s):

Learning Algorithm ◽

Image Data ◽

High Dimensional ◽

Locally Linear Embedding ◽

Complex Data ◽

Dimensional Manifold ◽

Self Organizing Map ◽

Training Strategy ◽

Low Dimensional ◽

Self Organizing

One of the main tasks in exploratory data analysis is to create an appropriate representation for complex data. In this paper, we consider the problem of creating a representation for observations that lie on a low-dimensional manifold embedded in high-dimensional coordinates. We propose a modification of the self-organizing map (SOM) algorithm that is able to learn the manifold structure in the high-dimensional observation coordinates. Any manifold learning algorithm may be incorporated into the proposed training strategy to guide the map onto the manifold surface instead of letting it become trapped in local minima. In this paper, the locally linear embedding algorithm is adopted. We apply the proposed method successfully to several data sets with manifold geometry, including an illustrative example of a surface as well as image data. We also show with other experiments that the advantage of the method over the basic SOM is restricted to this specific type of data.

Visual analysis of self-organizing maps

Nonlinear Analysis Modelling and Control ◽

10.15388/na.16.4.14091 ◽

2011 ◽

Vol 16 (4) ◽

pp. 488-504 ◽

Cited By ~ 27

Author(s):

Pavel Stefanovič ◽

Olga Kurasova

Keyword(s):

Data Clustering ◽

Visual Analysis ◽

Data Sets ◽

The European Union ◽

Self Organizing Maps ◽

Data Set ◽

Economic Indices ◽

Graphical Presentation ◽

Iris Data ◽

Self Organizing

In the article, an additional visualization of self-organizing maps (SOMs) is investigated. The main objectives of self-organizing maps are data clustering and the graphical presentation of the clustered data. The SOM visualization capabilities of four systems (NeNet, SOM-Toolbox, Databionic ESOM and Viscovery SOMine) are examined; each system has its own additional tools for visualizing SOMs. A comparative analysis is made for two data sets: Fisher’s iris data set and the economic indices of the European Union countries. A new SOM system is also introduced and studied. It has a specific visualization tool that is missing in other SOM systems: the tool helps to see the proportion of neurons that correspond to data items of different classes and fall in the same SOM cell.

Visualization of Very Large High-Dimensional Data Sets as Minimum Spanning Trees

10.26434/chemrxiv.9698861.v1 ◽

2019 ◽

Author(s):

Daniel Probst ◽

Jean-Louis Reymond

Keyword(s):

Data Visualization ◽

Particle Physics ◽

Cancer Biology ◽

Spanning Trees ◽

Minimum Spanning Tree ◽

High Dimensional Data ◽

Locality Sensitive Hashing ◽

High Dimensional ◽

Data Sets ◽

Data Set

Here, we introduce a new data visualization and exploration method, TMAP (tree-map), which exploits locality sensitive hashing, Kruskal’s minimum-spanning-tree algorithm, and a multilevel multipole-based graph layout algorithm to represent large and high dimensional data sets as a tree structure, which is readily understandable and explorable. Compared to other data visualization methods such as t-SNE or UMAP, TMAP increases the size of data sets that can be visualized due to its significantly lower memory requirements and running time and should find broad applicability in the age of big data. We exemplify TMAP in the area of cheminformatics with interactive maps for 1.16 million drug-like molecules from ChEMBL, 10.1 million small molecule fragments from FDB17, and 131 thousand 3D-structures of biomolecules from the PDB Databank, and to visualize data from literature (GUTENBERG data set), cancer biology (PANSCAN data set) and particle physics (MiniBooNE data set). TMAP is available as a Python package. Installation, usage instructions and application examples can be found at http://tmap.gdb.tools.
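Of the three building blocks named in the abstract, Kruskal's minimum-spanning-tree algorithm is the easiest to show in isolation. The sketch below is a generic textbook version with a union-find structure. It does not use the TMAP package or its API, and the function name `kruskal_mst` and the hand-written edge list are illustrative assumptions; in TMAP the candidate edges would come from the locality-sensitive-hashing step rather than from a literal list.

```python
def kruskal_mst(num_nodes, edges):
    """Return the minimum-spanning-tree (or forest) edges of an undirected graph.

    edges is an iterable of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))

    def find(a):
        # Follow parent links to the set representative, halving the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for weight, u, v in sorted(edges):  # consider the cheapest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # the edge joins two components: keep it
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Usage on a small hypothetical graph with four nodes.
example_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
example_tree = kruskal_mst(4, example_edges)
```

The resulting tree has one edge fewer than the number of connected nodes, which is what lets a layout algorithm draw a large data set without the clutter of a full nearest-neighbor graph.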

Comparative Analysis of Self-Organizing Map Systems

Informacijos mokslai ◽

10.15388/im.2009.0.3216 ◽

2009 ◽

Vol 50 ◽

pp. 334-339

Author(s):

Pavel Stefanovič ◽

Olga Kurasova

Keyword(s):

Data Clustering ◽

The Self ◽

Data Sets ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Learning Rules ◽

Main Target ◽

Similarities And Differences ◽

Graphical Presentation ◽

Self Organizing

In the article, we compare three systems of self-organizing maps: NeNet, SOM-Toolbox and Databionic ESOM. The main purpose of these systems is to cluster data by similarity and to present the clusters graphically on a self-organizing map (SOM). Self-organizing maps are one type of artificial neural network. The SOM systems differ from one another in their interfaces, data pre-processing, learning rules, visualization capabilities, etc., so the similarities and differences of the systems are highlighted here. The experiments have been carried out with two data sets: iris and glass. Quantization and topographic errors of the SOMs have been estimated, too.
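The two quality measures mentioned at the end of the abstract can be sketched as follows, using their usual definitions: the quantization error is the mean distance from each data vector to its best-matching neuron, and the topographic error is the fraction of data vectors whose best and second-best neurons are not adjacent on the map. The abstract does not say how the compared systems compute them, so the Euclidean distance, the rectangular grid with 4-neighborhood adjacency, and the function names are assumptions.

```python
import math


def _units_by_distance(weights, x):
    """Grid positions sorted by Euclidean distance of their weight vector to x."""
    return sorted(weights, key=lambda pos: math.dist(weights[pos], x))


def quantization_error(weights, data):
    """Mean distance between each data vector and its best-matching unit."""
    total = 0.0
    for x in data:
        bmu = _units_by_distance(weights, x)[0]
        total += math.dist(weights[bmu], x)
    return total / len(data)


def topographic_error(weights, data):
    """Fraction of data vectors whose two closest units are not grid neighbors."""
    errors = 0
    for x in data:
        (r1, c1), (r2, c2) = _units_by_distance(weights, x)[:2]
        # Assumed adjacency: rectangular grid, 4-neighborhood (Manhattan distance 1).
        if abs(r1 - r2) + abs(c1 - c2) > 1:
            errors += 1
    return errors / len(data)


# Usage on a hypothetical 1 x 3 map with two-dimensional weights.
demo_weights = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 0.0], (0, 2): [2.0, 0.0]}
demo_data = [[0.1, 0.0], [1.9, 0.0]]
demo_qe = quantization_error(demo_weights, demo_data)
demo_te = topographic_error(demo_weights, demo_data)
```

A low quantization error indicates that the neurons represent the data closely, while a low topographic error indicates that neighboring data are mapped to neighboring cells.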

A SOM Projection Technique with the Growing Structure for Visualizing High-Dimensional Data

International Journal of Neural Systems ◽

10.1142/s0129065703001662 ◽

2003 ◽

Vol 13 (05) ◽

pp. 353-365 ◽

Cited By ~ 8

Author(s):

Zheng Wu ◽

Gary G. Yen

Keyword(s):

Projection Method ◽

Cell Structure ◽

High Dimensional Data ◽

Network Size ◽

High Dimensional ◽

Data Sets ◽

Self Organizing Map ◽

Data Set ◽

Growing Cell ◽

Self Organizing

The Self-Organizing Map (SOM) is an efficient tool for visualizing high-dimensional data. In this paper, an intuitive and effective SOM projection method is proposed for mapping high-dimensional data onto a two-dimensional grid structure with a growing self-organizing mechanism. In the learning phase, a growing SOM is trained, with the growing cell structure used as the baseline framework. In the ordination phase, the new projection method maps each input vector onto the structure of the SOM without having to plot the weight values, which makes the data easy to visualize. The projection method is demonstrated on four different data sets, including a 118-patent data set and a 399-chemical-abstract data set related to polymer cements, with promising results and a significantly reduced network size.

Using self-organizing maps to visualize high-dimensional data

Computers & Geosciences ◽

10.1016/j.cageo.2004.10.009 ◽

2005 ◽

Vol 31 (5) ◽

pp. 531-544 ◽

Cited By ~ 49

Author(s):

Brian S. Penn

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Self Organizing Maps ◽

Self Organizing

An improved Kohonen self-organizing map clustering algorithm for high-dimensional data sets

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i1.pp600-610 ◽

2021 ◽

Vol 24 (1) ◽

pp. 600-610

Author(s):

Momotaz Begum ◽

Bimal Chandra Das ◽

Md. Zakir Hossain ◽

Antu Saha ◽

Khaleda Akther Papry

Keyword(s):

Clustering Algorithm ◽

Clustering Algorithms ◽

High Dimensional Data ◽

Predictive Performance ◽

High Dimensional ◽

Data Sets ◽

Self Organizing Map ◽

Distance Measurements ◽

Cancer Data ◽

Self Organizing

Handling high-dimensional data has been a major research challenge in computer science in recent years. Many clustering algorithms have already been proposed to classify such data, and the Kohonen self-organizing map (KSOM) is one of them. However, this algorithm has drawbacks such as overlapping clusters and non-linear separability problems. In this paper, we therefore propose an improved KSOM (I-KSOM) that reduces these problems by measuring distances among objects with the EISEN Cosine correlation formula. As far as we know, no previous work has used EISEN Cosine correlation distance measurements to classify high-dimensional data sets. To assess the robustness of the proposed I-KSOM, we carry out experiments on several popular data sets: Iris, Seeds, Glass, Vertebral column, and Wisconsin breast cancer. The proposed algorithm shows better results than the original KSOM and another modified KSOM in terms of predictive performance with topographic and quantization error.
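The abstract does not state the distance formula itself. A commonly cited form of the Eisen cosine correlation distance is one minus the absolute value of the uncentered (cosine) correlation, and the sketch below assumes that form; the function name is hypothetical, and the paper's exact definition may differ.

```python
import math


def eisen_cosine_distance(x, y):
    """Assumed form: 1 - |sum(x_i * y_i)| / sqrt(sum(x_i^2) * sum(y_i^2))."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("distance is undefined when either vector is all zeros")
    return 1.0 - abs(dot) / norm
```

Under this assumed definition the distance depends only on the angle between two vectors, not on their lengths, and vectors pointing in exactly opposite directions are treated as identical. Replacing the Euclidean distance with such a measure in the best-matching-unit search of a SOM is one way to read the modification the abstract describes.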