Hierarchical Hexagonal Clustering and Indexing

Symmetry ◽  
2019 ◽  
Vol 11 (6) ◽  
pp. 731 ◽  
Author(s):  
Vojtěch Uher ◽  
Petr Gajdoš ◽  
Václav Snášel ◽  
Yu-Chi Lai ◽  
Michal Radecký

Space-filling curves (SFCs) represent an efficient and straightforward method for sparse-space indexing to transform an n-dimensional space into a one-dimensional representation. This is often applied for multidimensional point indexing which brings a better perspective for data analysis, visualization and queries. SFCs are involved in many areas such as big data analysis and visualization, image decomposition, computer graphics and geographic information systems (GISs). The indexing methods subdivide the space into logic clusters of close points and they differ in various parameters including the cluster order, the distance metrics, and the pattern shape. Besides the simple and widely preferred triangular and square uniform grids, hexagonal uniform grids have gained high interest, especially in areas such as GISs, image processing and data visualization, owing to the uniform distance between cells and the high effectiveness of circle coverage. While the linearization of hexagons is an obvious approach for memory representation, it seems there is no hexagonal SFC indexing method generally used in practice. The main limitation of hexagons lies in the lack of an infinite decomposition into sub-hexagons and of self-similarity between tiles on different levels of the hierarchy. Our research aims at defining a fast and robust hexagonal SFC method. The Gosper fractal is utilized to preserve the benefits of hexagonal grids and to efficiently and hierarchically linearize points in a hexagonal grid while solving the non-convex shape and recursive transformation issues of the fractal. A comparison to other SFCs and grids is conducted to verify the robustness and effectiveness of our hexagonal method.
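The Gosper (flowsnake) fractal the abstract refers to can be generated with a standard L-system. The sketch below is only an illustration of that construction plus a brute-force linearization of a 2-D point against the curve's vertices; it is not the authors' indexing method, and `gosper_index` is a hypothetical helper name.

```python
import math

# L-system for the Gosper (flowsnake) curve; 'A' and 'B' both mean
# "step forward", '+' turns left 60 degrees, '-' turns right 60 degrees.
RULES = {"A": "A-B--B+A++AA+B-", "B": "+A-BB--B-A++A+B"}

def gosper_points(level):
    """Return the vertices of a level-`level` Gosper curve as (x, y) tuples."""
    s = "A"
    for _ in range(level):
        s = "".join(RULES.get(c, c) for c in s)
    x = y = angle = 0.0
    pts = [(x, y)]
    for c in s:
        if c in "AB":
            x += math.cos(angle)
            y += math.sin(angle)
            pts.append((x, y))
        elif c == "+":
            angle += math.pi / 3
        elif c == "-":
            angle -= math.pi / 3
    return pts

def gosper_index(p, pts):
    """Linearize a 2-D point: index of the nearest curve vertex (brute force)."""
    return min(range(len(pts)),
               key=lambda i: (pts[i][0] - p[0]) ** 2 + (pts[i][1] - p[1]) ** 2)
```

Each rewriting level replaces one segment with seven, which is what gives the hexagonal hierarchy its sevenfold cluster structure.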

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Hirokazu Tanaka

Abstract EEG is known to contain considerable inter-trial and inter-subject variability, which poses a challenge for any group-level EEG analysis. A true experimental effect must be reproducible even with variabilities in trials, sessions, and subjects. Extracting components that are reproducible across trials and subjects benefits both understanding common mechanisms in neural processing of cognitive functions and building robust brain-computer interfaces. This study extends our previous method (task-related component analysis, TRCA) by maximizing not only trial-by-trial reproducibility within single subjects but also similarity across a group of subjects, hence referred to as group TRCA (gTRCA). The problem of maximizing reproducibility of time series across trials and subjects is formulated as a generalized eigenvalue problem. We applied gTRCA to EEG data recorded from 35 subjects during a steady-state visual-evoked potential (SSVEP) experiment. The results revealed: (1) the group-representative data computed by gTRCA showed higher and more consistent spectral peaks than other conventional methods; (2) scalp maps obtained by gTRCA showed estimated source locations consistently within the occipital lobe; and (3) the high-dimensional features extracted by gTRCA are consistently mapped to a low-dimensional space. We conclude that gTRCA offers a framework for group-level EEG data analysis and brain-computer interfaces, as an alternative or complement to grand averaging.
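The TRCA-style formulation behind this abstract can be sketched as a generalized eigenvalue problem S w = λ Q w, with S the cross-trial covariance and Q the total covariance. The code below is a simplified single-subject illustration of that formulation (solved by Cholesky whitening), not the authors' gTRCA implementation; the function name and shapes are assumptions.

```python
import numpy as np

def trca_weights(trials):
    """Spatial filter w maximizing trial-to-trial reproducibility, posed as
    the generalized eigenvalue problem S w = lambda * Q w.
    trials: array of shape (n_trials, n_channels, n_samples)."""
    trials = trials - trials.mean(axis=2, keepdims=True)   # center each trial
    X = np.concatenate(trials, axis=1)                     # channels x all samples
    Q = X @ X.T                                            # total covariance
    s = trials.sum(axis=0)
    S = s @ s.T - sum(t @ t.T for t in trials)             # cross-trial covariance
    L = np.linalg.cholesky(Q)                              # whiten with Q = L L^T
    A = np.linalg.solve(L, S)                              # inv(L) S
    B = np.linalg.solve(L, A.T).T                          # inv(L) S inv(L)^T, symmetric
    vals, vecs = np.linalg.eigh(B)
    return np.linalg.solve(L.T, vecs[:, -1])               # top generalized eigenvector
```

Whitening reduces the generalized problem to an ordinary symmetric one, so `eigh` can be used; the returned vector corresponds to the largest eigenvalue, i.e. the most reproducible component.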


Author(s):  
N. Puviarasan ◽  
R. Bhavani

In content-based image retrieval (CBIR) applications, the idea of indexing is to map the descriptors extracted from images into a high-dimensional space. In this paper, visual features such as color, texture and shape are considered. The color features are extracted using a color coherence vector (CCV), and the texture features are obtained from Segmentation-based Fractal Texture Analysis (SFTA). The shape features of an image are extracted using Fourier Descriptors (FD), a contour-based feature extraction method. The color, texture and shape features are then combined using appropriate weights, and a quadtree is used to index the images. It is experimentally found that the proposed indexing method using a quadtree gives better performance than other existing indexing methods.
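The weighted combination step described above can be sketched as follows; the normalization scheme, weights and function name are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def fuse_features(color, texture, shape, weights=(0.4, 0.3, 0.3)):
    """Hypothetical weighted fusion: L2-normalize each descriptor (CCV,
    SFTA, FD), scale by its weight, and concatenate into one vector
    suitable for quadtree indexing."""
    parts = []
    for vec, w in zip((color, texture, shape), weights):
        v = np.asarray(vec, dtype=float)
        n = np.linalg.norm(v)
        parts.append(w * v / n if n > 0 else w * v)
    return np.concatenate(parts)
```

Normalizing each descriptor before weighting keeps one feature family (e.g. a long texture vector) from dominating the combined distance metric.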


1999 ◽  
Vol 31 (03) ◽  
pp. 625-631
Author(s):  
Tilmann Gneiting

A popular procedure in spatial data analysis is to fit a line segment of the form c(x) = 1 - α ||x||, ||x|| < 1, to observed correlations at (appropriately scaled) spatial lag x in d-dimensional space. We show that such an approach is permissible if and only if the slope α does not exceed an upper bound that depends on the spatial dimension d. The proof relies on Matheron's turning bands operator and an extension theorem for positive definite functions due to Rudin. Side results and examples include a general discussion of isotropic correlation functions defined on d-dimensional balls.
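The fitting step itself is a one-parameter least-squares problem with a closed form. The sketch below shows only that estimation step; it does not implement the paper's permissibility bound, which must still be checked against the dimension-dependent limit on α.

```python
import numpy as np

def fit_slope(lags, correlations):
    """Least-squares fit of c(x) = 1 - alpha * ||x|| to observed
    correlations at scaled lags with ||x|| < 1 (alpha is the only free
    parameter).  Minimizing sum((1 - alpha*h - c)^2) over alpha gives
    alpha = sum(h * (1 - c)) / sum(h^2)."""
    h = np.asarray(lags, dtype=float)
    c = np.asarray(correlations, dtype=float)
    mask = h < 1
    h, c = h[mask], c[mask]
    return float(np.sum(h * (1 - c)) / np.sum(h * h))
```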


Author(s):  
Eric Bonabeau ◽  
Marco Dorigo ◽  
Guy Theraulaz

In the previous two chapters, foraging and division of labor were shown to be useful metaphors to design optimization and resource allocation algorithms. In this chapter, we will see that the clustering and sorting behavior of ants has stimulated researchers to design new algorithms for data analysis and graph partitioning. Several species of ants cluster corpses to form a “cemetery,” or sort their larvae into several piles. This behavior is still not fully understood, but a simple model, in which agents move randomly in space and pick up and deposit items on the basis of local information, may account for some of the characteristic features of clustering and sorting in ants. The model can also be applied to data analysis and graph partitioning: objects with different attributes or the nodes of a graph can be considered items to be sorted. Objects placed next to each other by the sorting algorithm have similar attributes, and nodes placed next to each other by the sorting algorithm are tightly connected in the graph. The sorting algorithm takes place in a two-dimensional space, thereby offering a low-dimensional representation of the objects or of the graph. Distributed clustering, and more recently sorting, by a swarm of robots have served as benchmarks for swarm-based robotics. In all cases, the robots exhibit extremely simple behavior, act on the basis of purely local information, and communicate indirectly except for collision avoidance. In several species of ants, workers have been reported to form piles of corpses—literally cemeteries—to clean up their nests. Chretien [72] has performed experiments with the ant Lasius niger to study the organization of cemeteries. Other experiments on the ant Pheidole pallidula are also reported in Deneubourg et al. [88], and many species actually organize a cemetery. Figure 4.1 shows the dynamics of cemetery organization in another ant, Messor sancta. 
If corpses, or, more precisely, sufficiently large parts of corpses are randomly distributed in space at the beginning of the experiment, the workers form cemetery clusters within a few hours.
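The pick-up-and-deposit model described above can be sketched as follows. This is a minimal Deneubourg-style illustration, assuming the usual threshold form of the probabilities; the constants k1, k2 and the neighborhood radius are illustrative, not values from the chapter.

```python
import random

# Illustrative thresholds: small k1 makes isolated items easy to pick up,
# small k2 makes dense neighborhoods attractive for dropping.
K1, K2 = 0.1, 0.15

def local_density(grid, x, y, r=1):
    """Fraction of occupied cells in the (2r+1)^2 neighborhood of (x, y),
    excluding the cell itself, on a toroidal grid."""
    size, occupied, cells = len(grid), 0, 0
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            if dx == 0 and dy == 0:
                continue
            cells += 1
            if grid[(x + dx) % size][(y + dy) % size]:
                occupied += 1
    return occupied / cells

def step(grid, agent, rng=random):
    """One move of a single agent; agent = [x, y, carrying_flag]."""
    size = len(grid)
    x, y, carrying = agent
    f = local_density(grid, x, y)
    if not carrying and grid[x][y]:
        if rng.random() < (K1 / (K1 + f)) ** 2:   # likely pick-up when isolated
            grid[x][y] = 0
            agent[2] = True
    elif carrying and not grid[x][y]:
        if rng.random() < (f / (K2 + f)) ** 2:    # likely drop near dense spots
            grid[x][y] = 1
            agent[2] = False
    agent[0] = (x + rng.choice((-1, 0, 1))) % size   # random walk
    agent[1] = (y + rng.choice((-1, 0, 1))) % size
```

Run with many agents over many steps, this positive feedback (drops attract further drops) produces the cluster growth seen in the ant experiments.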


Medicina ◽  
2010 ◽  
Vol 46 (6) ◽  
pp. 408 ◽  
Author(s):  
Ricardo Duarte ◽  
Duarte Araújo ◽  
Orlando Fernandes ◽  
Cristina Fonseca ◽  
Vanda Correia ◽  
...  

Background and objective. In recent years, several motion analysis methods have been developed without considering representative contexts for sports performance. The purpose of this paper was to explain and underscore a straightforward method to measure human behavior in these contexts. Material and methods. Procedures combining manual video tracking (with the TACTO device) and bidimensional reconstruction (through direct linear transformation) using a single camera were used to capture the kinematic data required to compute collective variable(s) and control parameter(s). These procedures were applied to a 1-vs-1 association football task as an illustrative subphase of team sports and are presented in a tutorial fashion. Results. Preliminary analysis of distance and velocity data identified a collective variable (the difference between the distances of the attacker and the defender to a target defensive area) and two nested control parameters (interpersonal distance and relative velocity). Conclusions. The findings demonstrated that the complementary use of the TACTO software and direct linear transformation permits capturing and reconstructing complex human actions in their context in a low-dimensional space (information reduction).
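The planar direct linear transformation used for the bidimensional reconstruction can be sketched as a homography estimated from at least four known image-to-pitch correspondences. This is a generic textbook DLT, not the TACTO pipeline; the function names are assumptions.

```python
import numpy as np

def dlt_homography(image_pts, world_pts):
    """Planar direct linear transformation: estimate the 3x3 homography H
    mapping image coordinates to world (pitch) coordinates from at least
    four correspondences, via SVD of the stacked constraint matrix."""
    A = []
    for (u, v), (x, y) in zip(image_pts, world_pts):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)          # null vector = flattened homography
    return H / H[2, 2]

def apply_h(H, pt):
    """Map one image point to world coordinates (homogeneous divide)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Once H is calibrated from pitch landmarks, every tracked player position can be projected to metric pitch coordinates with `apply_h`.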


Author(s):  
Muhammad Amjad

Advances in manifold learning have proven to be of great benefit in reducing the dimensionality of large, complex datasets. Elements of an intricate dataset typically lie in a high-dimensional space, as the number of individual features or independent variables is extensive. However, these elements can often be integrated into a low-dimensional manifold with well-defined parameters. By constructing a low-dimensional manifold embedded in the high-dimensional feature space, the dataset can be simplified for easier interpretation. Despite this dimensionality reduction, the dataset's constituents lose no information; rather, the reduction filters it in the hope of elucidating the relevant structure. This paper will explore the importance of this method of data analysis, its applications, and its extensions into topological data analysis.
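One of the simplest concrete instances of such an embedding is classical multidimensional scaling, which recovers low-dimensional coordinates from pairwise distances alone. The sketch below is a generic illustration of that idea, not a method from the paper.

```python
import numpy as np

def classical_mds(d2, k=2):
    """Classical multidimensional scaling.
    d2: (n, n) matrix of squared pairwise distances.
    Returns an (n, k) embedding whose pairwise distances approximate
    the originals (exactly, when the points truly lie in k dimensions)."""
    n = d2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ d2 @ J                        # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]             # top-k eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
```

Nonlinear manifold learners (Isomap, LLE, etc.) generalize this step by first replacing Euclidean distances with distances measured along the manifold.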


Author(s):  
Firas A. Khasawneh ◽  
Elizabeth Munch

This paper explores the possibility of using techniques from topological data analysis for studying datasets generated from dynamical systems described by stochastic delay equations. The dataset is generated using Euler–Maruyama simulation for two first-order systems with stochastic parameters drawn from a normal distribution. The first system contains additive noise, whereas the second one contains parametric or multiplicative noise. Using Takens' embedding, the dataset is converted into a point cloud in a high-dimensional space. Persistent homology is then employed to analyze the structure of the point cloud in order to study equilibria and periodic solutions of the underlying system. Our results show that persistent homology successfully differentiates between different types of equilibria. Therefore, we believe this approach will prove useful for automatic data analysis of vibration measurements. For example, our approach can be used in machining processes for chatter detection and prevention.
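The delay-coordinate (Takens) embedding step that turns a scalar time series into a point cloud is standard and can be sketched directly; the embedding dimension and lag below are illustrative parameters, and the persistent homology stage is not reproduced here.

```python
import numpy as np

def takens_embedding(series, dim, delay):
    """Delay-coordinate (Takens) embedding: map a scalar time series to a
    point cloud in `dim`-dimensional space, where point i is
    (s[i], s[i + delay], ..., s[i + (dim-1)*delay])."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this (dim, delay)")
    return np.column_stack([series[i * delay: i * delay + n]
                            for i in range(dim)])
```

The resulting (n, dim) point cloud is what a persistence computation (e.g. a Vietoris–Rips filtration) would then consume to detect loops corresponding to periodic solutions.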


2019 ◽  
Vol 20 (S19) ◽  
Author(s):  
Thomas A. Geddes ◽  
Taiyun Kim ◽  
Lihao Nan ◽  
James G. Burchfield ◽  
Jean Y. H. Yang ◽  
...  

Abstract Background Single-cell RNA-sequencing (scRNA-seq) is a transformative technology, allowing global transcriptomes of individual cells to be profiled with high accuracy. An essential task in scRNA-seq data analysis is the identification of cell types from complex samples or tissues profiled in an experiment. To this end, clustering has become a key computational technique for grouping cells based on their transcriptome profiles, enabling subsequent cell type identification from each cluster of cells. Due to the high feature-dimensionality of the transcriptome (i.e. the large number of measured genes in each cell) and because only a small fraction of genes are cell type-specific and therefore informative for generating cell type-specific clusters, clustering directly on the original feature/gene dimension may lead to uninformative clusters and hinder correct cell type identification. Results Here, we propose an autoencoder-based cluster ensemble framework in which we first take random subspace projections from the data, then compress each random projection to a low-dimensional space using an autoencoder artificial neural network, and finally apply ensemble clustering across all encoded datasets to generate clusters of cells. We employ four evaluation metrics to benchmark clustering performance and our experiments demonstrate that the proposed autoencoder-based cluster ensemble can lead to substantially improved cell type-specific clusters when applied with both the standard k-means clustering algorithm and a state-of-the-art kernel-based clustering algorithm (SIMLR) designed specifically for scRNA-seq data. Compared to directly using these clustering algorithms on the original datasets, the performance improvement in some cases is up to 100%, depending on the evaluation metric used. Conclusions Our results suggest that the proposed framework can facilitate more accurate cell type identification as well as other downstream analyses. 
The code for creating the proposed autoencoder-based cluster ensemble framework is freely available from https://github.com/gedcom/scCCESS
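The ensemble idea in this abstract (random subspace views, per-view compression and clustering, then consensus) can be sketched in simplified form. The sketch below substitutes random Gaussian projections for the autoencoder encodings and uses a co-association matrix for the consensus step; it is an illustration of the framework's shape, not the scCCESS implementation.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Tiny Lloyd's k-means with deterministic farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])          # farthest point from chosen centers
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def cluster_ensemble(X, k, n_views=10, sub_dim=5, seed=0):
    """Simplified cluster-ensemble sketch: cluster several randomly
    projected views, accumulate a cell-by-cell co-association matrix,
    and cluster that matrix once more for the consensus labels."""
    rng = np.random.default_rng(seed)
    n = len(X)
    co = np.zeros((n, n))
    for v in range(n_views):
        P = rng.standard_normal((X.shape[1], sub_dim))   # random projection
        labels = kmeans(X @ P, k, seed=v)
        co += labels[:, None] == labels[None, :]
    return kmeans(co / n_views, k)
```

The co-association matrix averages agreement across views, so cells that cluster together only by chance in one projection are down-weighted in the consensus.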


Author(s):  
Ronnie Alves ◽  
Joel Ribeiro ◽  
Orlando Belo ◽  
Jiawei Han

Business organizations must pay attention to interesting changes in customer behavior in order to anticipate their needs and act accordingly with appropriate business actions. Tracking customers' commercial paths through the products they are interested in is an essential technique to improve business and increase customer satisfaction. Data warehousing (DW) allows us to do so, giving the basic means to record every customer transaction based on the different business strategies established. Although managing such huge amounts of records may imply business advantage, their exploration, especially in a multi-dimensional space (MDS), is a nontrivial task. The more dimensions we want to explore, the higher the computational costs involved in multi-dimensional data analysis (MDA). To make MDA practical in real-world business problems, DW researchers have been working on combining data cubing and mining techniques to detect interesting changes in an MDS. Such changes can also be detected through gradient queries. While those studies have provided the basis for future research in MDA, only a few of them point to preference query selection in an MDS. Thus, not only is the exploration of changes in an MDS an essential task, but even more important is ranking the most interesting gradients. In this chapter, the authors investigate how to mine and rank the most interesting changes in an MDS applying a TOP-K gradient strategy. Additionally, the authors also propose a gradient-based cubing method to evaluate interesting gradient regions in an MDS. So, the challenge is to find maximum gradient regions (MGRs) that maximize the task of ranking gradients in an MDS. The authors' evaluation study demonstrates that the proposed method presents a promising strategy for ranking gradients in an MDS.
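The core notion of a gradient query (comparing the measure of cube cells that differ in one dimension and ranking the largest changes) can be sketched as follows. This is a brute-force toy illustration of TOP-K gradient ranking, not the authors' gradient-based cubing method; the cell encoding is an assumption.

```python
import heapq
from itertools import combinations

def topk_gradients(cells, k=3):
    """Toy TOP-K gradient query. `cells` maps attribute tuples to an
    aggregate measure; a gradient pair is two cells differing in exactly
    one attribute. Returns the k pairs with the largest measure ratio,
    as (ratio, cell_a, cell_b) tuples."""
    grads = []
    for (a, ma), (b, mb) in combinations(cells.items(), 2):
        if sum(x != y for x, y in zip(a, b)) == 1 and min(ma, mb) > 0:
            grads.append((max(ma, mb) / min(ma, mb), a, b))
    return heapq.nlargest(k, grads)
```

A realistic cubing method would prune this quadratic pair enumeration by restricting the search to promising gradient regions, which is exactly the role the chapter's MGRs play.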

