Colored Mosaic Matrix: Visualization Technique for High-Dimensional Data

Author(s):  
Hiroaki Kobayashi ◽  
Kazuo Misue ◽  
Jiro Tanaka
Author(s):  
Michael C. Thrun ◽  
Alfred Ultsch

The Databionic swarm (DBS) is a flexible and robust clustering framework that consists of three independent modules: swarm-based projection, high-dimensional data visualization, and representation-guided clustering. The first module is the parameter-free projection method Pswarm, which exploits concepts of self-organization and emergence, game theory, and swarm intelligence. The second module is a parameter-free high-dimensional data visualization technique called the topographic map. It uses the generalized U-matrix, which makes it possible to estimate, first, whether any cluster tendency exists and, second, the number of clusters. The third module offers a clustering method that can be verified by the visualization and vice versa. Benchmarking against conventional algorithms demonstrated that DBS can outperform them. Several applications showed that the cluster structures provided by DBS are meaningful. As an example, a clustering of worldwide country-related data on the COVID-19 pandemic is presented here. Code and data are made available as open source.


2017 ◽  
Vol 18 (1) ◽  
pp. 94-109 ◽  
Author(s):  
Junpeng Wang ◽  
Xiaotong Liu ◽  
Han-Wei Shen

Due to the intricate relationships between the dimensions of high-dimensional data, subspace analysis is often conducted to decompose the dimensions and give prominence to certain subsets of dimensions, i.e. subspaces. Exploring and comparing subspaces is important for revealing their underlying features, as well as for portraying the characteristics of individual dimensions. To date, most existing high-dimensional data exploration and analysis approaches rely on dimensionality reduction algorithms (e.g. principal component analysis and multi-dimensional scaling) to project high-dimensional data, or their subspaces, to two-dimensional space and employ scatterplots for visualization. However, these algorithms are sometimes difficult to fine-tune, and scatterplots are not effective for comparative visualization, which makes subspace comparison hard to perform. In this article, we aggregate high-dimensional data or their subspaces by computing pair-wise distances between all data items and showing the distances with matrix visualizations to present the original high-dimensional data or subspaces. Our approach enables effective visual comparisons among subspaces, which allows users to further investigate the characteristics of individual dimensions by studying their behaviors in similar subspaces. Through subspace comparisons, we identify dominant, similar, and conforming dimensions in different subspace contexts of synthetic and real-world high-dimensional data sets. Additionally, we present a prototype that integrates a parallel coordinates plot with matrix visualization for high-dimensional data exploration and incremental dimensionality analysis, which also allows users to further validate the dimension characterization results derived from the subspace comparisons.
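
The pair-wise distance aggregation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean metric, the function names `pairwise_distance_matrix` and `matrix_similarity`, the correlation-based comparison, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def pairwise_distance_matrix(data, dims):
    """Euclidean distances between all items, restricted to the subspace `dims`.

    The metric is an assumption; the abstract does not name one.
    """
    sub = data[:, dims]                          # keep only the chosen dimensions
    diff = sub[:, None, :] - sub[None, :, :]     # all item-to-item differences
    return np.sqrt((diff ** 2).sum(axis=-1))     # (n_items, n_items) matrix

def matrix_similarity(a, b):
    """Hypothetical comparison score: Pearson correlation of the
    upper-triangular entries of two distance matrices."""
    iu = np.triu_indices_from(a, k=1)
    return float(np.corrcoef(a[iu], b[iu])[0, 1])

# Synthetic data: 50 items, 4 dimensions; dimension 1 nearly duplicates dimension 0.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))
data[:, 1] = data[:, 0] + 0.01 * rng.normal(size=50)

d0 = pairwise_distance_matrix(data, [0])
d1 = pairwise_distance_matrix(data, [1])
d2 = pairwise_distance_matrix(data, [2])

sim_01 = matrix_similarity(d0, d1)   # near-duplicate dimensions
sim_02 = matrix_similarity(d0, d2)   # independent dimensions
```

Each matrix has one row and one column per data item regardless of how many dimensions the subspace contains, so any two subspaces can be rendered side by side as images of the same size (for example as heatmaps). Here `sim_01` comes out much higher than `sim_02`, reflecting that dimensions 0 and 1 induce almost the same distance structure.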


2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU
