Gene Coexpression Network Comparison via Persistent Homology

International Journal of Genomics ◽

10.1155/2018/7329576 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11 ◽

Cited By ~ 3

Author(s):

Ali Nabi Duman ◽

Harun Pirim

Keyword(s):

Microarray Data ◽

Clustering Algorithm ◽

Persistent Homology ◽

Distance Matrix ◽

Stress Factors ◽

Topological Data Analysis ◽

Data Sets ◽

Computationally Efficient ◽

Topological Features ◽

Gene Coexpression

Persistent homology, a topological data analysis (TDA) method, is applied to microarray data sets. Although a few papers refer to TDA methods in microarray analysis, to the best of our knowledge persistent homology has not previously been used to compare several weighted gene coexpression networks (WGCNs). We calculate the persistent homology of weighted networks constructed from 38 Arabidopsis microarray data sets to test the relevance and success of this approach in distinguishing stress factors. We quantify multiscale topological features of each network using persistent homology and apply a hierarchical clustering algorithm to the distance matrix whose entries are the pairwise bottleneck distances between the networks. The immunoresponses to different stress factors are distinguishable by our method. The networks of similar immunoresponses are found to be close with respect to bottleneck distance, indicating that the WGCNs have similar topological features. This computationally efficient technique for analyzing networks provides a quick test for advanced studies.
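
The pipeline described in this abstract (pairwise bottleneck distances between persistence diagrams, then hierarchical clustering of the resulting distance matrix) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names `bottleneck_bruteforce` and `single_linkage` are hypothetical, the brute-force matching is only practical for very small diagrams, and single linkage is chosen arbitrarily because the abstract does not name a linkage rule.

```python
from itertools import permutations


def bottleneck_bruteforce(d1, d2):
    """Bottleneck distance between two tiny persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other (L-infinity cost) or to the diagonal (cost = persistence / 2).
    Brute force over all matchings: illustration only, factorial running time.
    """
    n, m = len(d1), len(d2)
    size = n + m  # each side is padded with diagonal slots for the other side
    if size == 0:
        return 0.0

    def cost(i, j):
        if i < n and j < m:  # point matched to point
            return max(abs(d1[i][0] - d2[j][0]), abs(d1[i][1] - d2[j][1]))
        if i < n:            # point of d1 matched to the diagonal
            return (d1[i][1] - d1[i][0]) / 2
        if j < m:            # point of d2 matched to the diagonal
            return (d2[j][1] - d2[j][0]) / 2
        return 0.0           # diagonal matched to diagonal

    return min(
        max(cost(i, perm[i]) for i in range(size))
        for perm in permutations(range(size))
    )


def single_linkage(dist, labels):
    """Agglomerative clustering on a precomputed distance matrix.

    Returns the merges in order as (sorted member labels, merge distance).
    """
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(labels[i] for i in clusters[a] + clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical toy diagrams standing in for three networks.
diagrams = {"A": [(0, 4)], "B": [(0, 5)], "C": [(0, 10)]}
names = sorted(diagrams)
dist = [[bottleneck_bruteforce(diagrams[p], diagrams[q]) for q in names] for p in names]
print(single_linkage(dist, names))
```

For diagrams of realistic size, a dedicated TDA library with an efficient matching algorithm would be the practical choice; the brute-force version only makes the definition concrete.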

Cluster Analysis of Haze Episodes Based on Topological Features

Sustainability ◽

10.3390/su12103985 ◽

2020 ◽

Vol 12 (10) ◽

pp. 3985

Author(s):

Nur Fariha Syaqina Zulkepli ◽

Mohd Salmi Md Noorani ◽

Fatimah Abdul Razak ◽

Munira Ismail ◽

Mohd Almie Alias

Keyword(s):

Cluster Analysis ◽

Persistent Homology ◽

Topological Data Analysis ◽

Data Sets ◽

Air Quality Monitoring ◽

Hierarchical Agglomerative Cluster ◽

Topological Features ◽

Hidden Patterns ◽

Air Quality Monitoring Stations ◽

Hierarchical Agglomerative Cluster Analysis

Severe haze episodes have periodically occurred in Southeast Asia, with particularly adverse effects on Malaysia. A technique called cluster analysis was used to analyze these occurrences. Traditional cluster analysis, in particular hierarchical agglomerative cluster analysis (HACA), was applied directly to data sets. The data sets may contain hidden patterns that can be explored. In this paper, this underlying information was captured via persistent homology, a topological data analysis (TDA) tool, which extracts topological features including components, holes, and cavities in the data sets. In particular, an improved version of HACA was proposed by combining HACA and persistent homology. Additionally, a comparative study between traditional HACA and improved HACA was done using particulate matter data, which was the major pollutant found during haze episodes by the Klang, Petaling Jaya, and Shah Alam air quality monitoring stations. The effectiveness of these two clustering approaches was evaluated based on their ability to cluster the months according to the haze condition. The results showed that clustering based on topological features via the improved HACA approach was able to correctly group the months with severe haze, in contrast to clustering without such features, and these results were consistent for all three locations.
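
As a rough illustration of the simplest kind of topological feature mentioned above (connected components), the sketch below computes 0-dimensional persistence of a point cloud. It assumes a Vietoris-Rips style filtration in which two points are connected once the scale reaches their Euclidean distance; the function name `h0_persistence` and the sample points are hypothetical, and this is not the feature-extraction procedure of the paper.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components in a distance-threshold filtration.

    Every point is born as its own component at scale 0. Components merge
    when the scale reaches the length of an edge joining them (Kruskal-style
    union-find), and each merge kills one component. One component never dies.
    """
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite bars (0, death); the last bar is infinite


# Hypothetical sample: two nearby points and one far away.
print(h0_persistence([(0, 0), (1, 0), (5, 0)]))
```

A long-lived bar indicates a well-separated group of points; vectors built from such bars are one plausible input to a clustering step.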

Using Topological Data Analysis (TDA) and Persistent Homology to Analyze the Stock Markets in Singapore and Taiwan

Frontiers in Physics ◽

10.3389/fphy.2021.572216 ◽

2021 ◽

Vol 9 ◽

Author(s):

Peter Tsung-Wen Yen ◽

Siew Ann Cheong

Keyword(s):

Data Analysis ◽

Stock Markets ◽

Persistent Homology ◽

Betti Numbers ◽

Stock Index ◽

Topological Data Analysis ◽

Series Data ◽

Topological Features ◽

Market Crashes ◽

Topological Data

In recent years, persistent homology (PH) and topological data analysis (TDA) have gained increasing attention in the fields of shape recognition, image analysis, data analysis, machine learning, computer vision, computational biology, brain functional networks, financial networks, haze detection, etc. In this article, we focus on stock markets and demonstrate how TDA can be useful in this regard. We first explain signatures that can be detected using TDA for three toy models of topological changes. We then show how to go beyond network concepts like nodes (0-simplices) and links (1-simplices), and the standard minimal spanning tree or planar maximally filtered graph picture of the cross correlations in stock markets, to work with faces (2-simplices) or any k-dimensional simplex in TDA. By scanning through a full range of correlation thresholds in a procedure called filtration, we were able to examine robust topological features (i.e., features less susceptible to random noise) in higher dimensions. To demonstrate the advantages of TDA, we collected time-series data from the Straits Times Index and the Taiwan Capitalization Weighted Stock Index (TAIEX), and then computed barcodes, persistence diagrams, persistent entropy, the bottleneck distance, Betti numbers, and the Euler characteristic. We found that during periods of market crashes, the homology groups become less persistent as we vary the characteristic correlation. For both markets, we found consistent signatures associated with market crashes in the Betti numbers, Euler characteristic, and persistent entropy, in agreement with our theoretical expectations.
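
Of the summary statistics listed above, persistent entropy is the easiest to write down. The sketch below uses the usual definition (Shannon entropy of the normalized bar lengths of a barcode); the function name `persistent_entropy` and the toy barcodes are hypothetical, and the sketch assumes all bars are finite, which may differ from how the article treats infinite bars.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of normalized bar lengths.

    barcode: list of finite (birth, death) intervals.
    With l_i = death_i - birth_i and L = sum of l_i, the entropy is
    E = -sum((l_i / L) * log(l_i / L)). Zero-length bars are ignored.
    """
    lengths = [d - b for b, d in barcode if d > b]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Hypothetical barcodes: equal bars give maximal entropy for their count,
# while one dominant bar gives entropy close to zero.
print(persistent_entropy([(0.0, 1.0), (0.0, 1.0)]))
print(persistent_entropy([(0.0, 10.0), (0.0, 0.1)]))
```

Entropy is largest when all bars have the same length and drops when a few bars dominate, which is why it can serve as a single-number summary of a barcode.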

Cubical homology-based Image Classification - A Comparative Study

10.36939/ir.202112231202 ◽

2021 ◽

Author(s):

Seungho Choe

Keyword(s):

Machine Learning ◽

Image Classification ◽

Digital Image ◽

Persistent Homology ◽

Topological Data Analysis ◽

Connected Components ◽

Gradient Boosting ◽

Topological Features ◽

Light Gradient ◽

Cubical Homology

Persistent homology is a powerful tool in topological data analysis (TDA) for efficiently computing, studying, and encoding multi-scale topological features, and it is increasingly used in digital image classification. The topological features represent the number of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image and uses a collection of cubes to compute the homology, which fits the grid structure of digital images. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a score that measures the significance of each of the sub-simplices in terms of persistence. Also, the gray level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are used as supplementary methods for extracting features. Machine learning techniques are then employed to classify images using the topological signatures. Among the eight tested algorithms with six published image datasets with varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with a deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.
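
To make the filtration idea concrete for a grid of pixels, the following sketch counts connected components (Betti-0) of the sublevel set of a grayscale image at a given threshold. It is only an illustration under stated assumptions: pixels are treated as connected through shared edges (4-connectivity), the function name `betti0_sublevel` and the toy image are hypothetical, and the paper's cubical homology algorithm and scoring are not reproduced here.

```python
def betti0_sublevel(image, threshold):
    """Number of connected components among pixels with value <= threshold.

    image: list of equal-length rows of numbers.
    Connectivity: 4-neighbourhood (up, down, left, right), an assumption.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold or seen[r][c]:
                continue
            count += 1  # found a pixel of a component not visited yet
            seen[r][c] = True
            stack = [(r, c)]
            while stack:  # iterative flood fill
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


# Hypothetical 3x3 image: four dark corners separated by a bright cross.
toy = [
    [0, 9, 1],
    [9, 9, 9],
    [2, 9, 5],
]
for t in (0, 1, 2, 5, 9):
    print(t, betti0_sublevel(toy, t))
```

Sweeping the threshold from dark to bright shows components being born as new corners appear and dying when the bright cross joins them, which is the birth and death behaviour a persistence diagram records.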

Topological Data Analysis with Applications

10.1017/9781108975704 ◽

2021 ◽

Author(s):

Gunnar Carlsson ◽

Mikael Vejdemo-Johansson

Keyword(s):

Data Analysis ◽

Data Science ◽

Persistent Homology ◽

Topological Data Analysis ◽

Data Sets ◽

Advanced Learners ◽

Computer Scientists ◽

Dramatic Rise ◽

Modeling Data ◽

Topological Data

The continued and dramatic rise in the size of data sets has meant that new methods are required to model and analyze them. This timely account introduces topological data analysis (TDA), a method for modeling data by geometric objects, namely graphs and their higher-dimensional versions: simplicial complexes. The authors outline the necessary background material on topology and data philosophy for newcomers, while more complex concepts are highlighted for advanced learners. The book covers all the main TDA techniques, including persistent homology, cohomology, and Mapper. The final section focuses on the diverse applications of TDA, examining a number of case studies drawn from monitoring the progression of infectious diseases to the study of motion capture data. Mathematicians moving into data science, as well as data scientists or computer scientists seeking to understand this new area, will appreciate this self-contained resource which explains the underlying technology and how it can be used.

Stochastic Gradient Descent Based K-Means Algorithm on Large Scale Data Clustering

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.687-691.1342 ◽

2014 ◽

Vol 687-691 ◽

pp. 1342-1345 ◽

Cited By ~ 1

Author(s):

Jie Ding ◽

Li Peng Zhu ◽

Bin Hu ◽

Ren Long Hang ◽

Yu Bao Sun

Keyword(s):

Gradient Descent ◽

Large Scale ◽

Clustering Algorithm ◽

Distance Matrix ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Data Sets ◽

Human Beings ◽

Large Scale Data ◽

Scale Data

With the rapid advance of data collection and storage techniques, it is easy to acquire tens of millions or even billions of data sets. How to explore and exploit the useful or interesting information for human beings from these data sets has become an urgent issue. The traditional k-means clustering algorithm has been widely used in the data mining community. First, k clustering centres are randomly initialized. Then, all instances are classified into k different classes according to their distances to the clustering centres. Lastly, each clustering centre is updated to the mean of its constituent instances. This whole process is iterated until convergence. At each iteration, the distance matrix from all instances to the k clustering centres must be calculated, which is very time-consuming for large scale data sets. To address this issue, in this paper, we propose a fast optimization algorithm based on stochastic gradient descent (SGD). At each iteration, an instance is chosen at random, its corresponding clustering centre is found, and that centre is updated immediately. Experimental results show that our proposed method achieves competitive clustering results at lower time cost.
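
The per-instance update described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `sgd_kmeans` is hypothetical, the step size 1/n_j (n_j being the number of instances assigned to centre j so far) is an assumed schedule that the abstract does not specify, and the initial centres are passed in explicitly rather than drawn at random so that the example is reproducible.

```python
import random


def sgd_kmeans(points, init_centres, n_steps, seed=0):
    """Online k-means: one randomly chosen instance updates one centre per step.

    The SGD step on the loss 0.5 * ||x - c||^2 with respect to c is
    c <- c + rate * (x - c). With rate = 1 / n_j, each centre equals the
    running mean of the instances assigned to it so far.
    """
    rng = random.Random(seed)
    centres = [list(c) for c in init_centres]
    counts = [0] * len(centres)
    for _ in range(n_steps):
        x = rng.choice(points)
        # Only k distances are computed per step, not a full distance matrix.
        j = min(
            range(len(centres)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centres[k])),
        )
        counts[j] += 1
        rate = 1.0 / counts[j]  # assumed learning-rate schedule
        centres[j] = [c + rate * (a - c) for a, c in zip(x, centres[j])]
    return centres


# Hypothetical data: two well-separated groups of four points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
print(sgd_kmeans(data, [(1, 1), (9, 9)], n_steps=200))
```

Each step costs k distance evaluations regardless of the number of instances, which is the source of the time saving the abstract refers to.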

A persistent homological analysis of network data flow malfunctions

Journal of Complex Networks ◽

10.1093/comnet/cnx038 ◽

2017 ◽

Vol 5 (6) ◽

pp. 884-892 ◽

Cited By ~ 1

Author(s):

Nicholas A Scoville ◽

Karthik Yegnesh

Keyword(s):

Data Flow ◽

Persistent Homology ◽

Topological Data Analysis ◽

Network Data ◽

Packet Delivery ◽

Topological Features ◽

Algorithmic Construction ◽

The Stability ◽

Novel Applications ◽

Persistence Diagrams

Persistent homology has recently emerged as a powerful technique in topological data analysis for analysing the emergence and disappearance of topological features throughout a filtered space, shown via persistence diagrams. In this article, we develop an application of ideas from the theory of persistent homology and persistence diagrams to the study of data flow malfunctions in networks with a certain hierarchical structure. In particular, we formulate an algorithmic construction of persistence diagrams that parameterize network data flow errors, thus enabling novel applications of statistical methods that are traditionally used to assess the stability of persistence diagrams corresponding to homological data to the study of data flow malfunctions. We conclude with an application to network packet delivery systems.

Feasibility of topological data analysis for event-related fMRI

Network Neuroscience ◽

10.1162/netn_a_00095 ◽

2019 ◽

Vol 3 (3) ◽

pp. 695-706 ◽

Cited By ~ 4

Author(s):

Cameron T. Ellis ◽

Michael Lesnick ◽

Gregory Henselman-Petrusek ◽

Bryn Keller ◽

Jonathan D. Cohen

Keyword(s):

Data Analysis ◽

Persistent Homology ◽

Time Frame ◽

Topological Data Analysis ◽

Fmri Data ◽

Cognitive Representations ◽

New Approach ◽

Neural Data ◽

Topological Features ◽

Topological Data

Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multivoxel patterns in the brain. However, the methods for detecting these representations are limited. Topological data analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology—a popular TDA tool that identifies topological features in data and quantifies their robustness—can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.

Topological measurement of deep neural networks using persistent homology

Annals of Mathematics and Artificial Intelligence ◽

10.1007/s10472-021-09761-3 ◽

2021 ◽

Author(s):

Satoru Watanabe ◽

Hayato Yamana

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Persistent Homology ◽

Topological Data Analysis ◽

Data Sets ◽

One Dimensional ◽

Novel Approach ◽

The One ◽

Fully Connected ◽

Fully Connected Networks

The inner representation of deep neural networks (DNNs) is indecipherable, which makes it difficult to tune DNN models, control their training process, and interpret their outputs. In this paper, we propose a novel approach to investigate the inner representation of DNNs through topological data analysis (TDA). Persistent homology (PH), one of the outstanding methods in TDA, was employed for investigating the complexities of trained DNNs. We constructed clique complexes on trained DNNs and calculated the one-dimensional PH of DNNs. The PH reveals the combinational effects of multiple neurons in DNNs at different resolutions, which are difficult to capture without using PH. Evaluations were conducted using fully connected networks (FCNs) and networks combining FCNs and convolutional neural networks (CNNs) trained on the MNIST and CIFAR-10 data sets. Evaluation results demonstrate that the PH of DNNs reflects both the excess of neurons and problem difficulty, making PH one of the prominent methods for investigating the inner representation of DNNs.
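
A small sketch can show what "clique complex" and "one-dimensional" homology mean on a weighted graph. The code below keeps edges whose absolute weight is at least a threshold, fills in every triangle (3-clique), and returns the first Betti number over GF(2), i.e. the number of independent loops not filled by triangles. The function name `betti1_clique_complex`, the thresholding direction, and the toy graph are assumptions made for illustration; how the paper derives a weighted graph from a trained DNN is not reproduced here, and the input is assumed to contain no duplicate edges.

```python
from itertools import combinations


def betti1_clique_complex(n_vertices, weighted_edges, threshold):
    """First Betti number (over GF(2)) of a thresholded clique complex.

    weighted_edges: iterable of (u, v, weight) with 0 <= u, v < n_vertices.
    beta_1 = #edges - rank(d1) - rank(d2), where rank(d1) = #vertices - #components.
    """
    edges = sorted(
        (min(u, v), max(u, v)) for u, v, w in weighted_edges if abs(w) >= threshold
    )
    edge_index = {e: i for i, e in enumerate(edges)}

    # rank(d1) from the number of connected components (union-find).
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    rank_d1 = n_vertices - components

    # rank(d2): each triangle is a bitmask over its three boundary edges.
    basis = {}  # leading bit -> reduced vector
    rank_d2 = 0
    for a, b, c in combinations(range(n_vertices), 3):
        if (a, b) in edge_index and (a, c) in edge_index and (b, c) in edge_index:
            row = ((1 << edge_index[(a, b)])
                   | (1 << edge_index[(a, c)])
                   | (1 << edge_index[(b, c)]))
            while row:  # Gaussian elimination over GF(2) via XOR
                lead = row.bit_length() - 1
                if lead not in basis:
                    basis[lead] = row
                    rank_d2 += 1
                    break
                row ^= basis[lead]
    return len(edges) - rank_d1 - rank_d2


# Hypothetical graph: a strong 4-cycle plus one weak diagonal.
graph = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 0.5)]
print(betti1_clique_complex(4, graph, threshold=0.8))
print(betti1_clique_complex(4, graph, threshold=0.4))
```

Tracking how such loops appear and disappear as the threshold varies is what one-dimensional persistent homology records; a full persistence computation would pair each loop's birth with its death rather than recounting at every threshold.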

An algorithm for matching spatial objects of different-scale maps based on topological data analysis

Computer Optics ◽

10.18287/2412-6179-2019-43-6-1021-1029 ◽

2019 ◽

Vol 43 (6) ◽

pp. 1021-1029 ◽

Cited By ~ 1

Author(s):

S.V. Eremeev ◽

D.E. Andrianov ◽

V.S. Titov

Keyword(s):

Data Analysis ◽

Spatial Data ◽

General Structure ◽

Persistent Homology ◽

Topological Data Analysis ◽

Spatial Objects ◽

Topological Features ◽

Definition Of ◽

Topological Data

The article considers the problem of automatically comparing spatial objects on maps of the same locality at different scales. It is proposed that this problem be solved using methods of topological data analysis. The input data of the algorithm are spatial objects that can be obtained from maps with different scales and subjected to deformations and distortions. Persistent homology allows us to identify the general structure of such objects in the form of topological features. The main topological features in the study are the connected components and holes in objects. The paper gives a mathematical description of the persistent homology method for representing spatial objects. A definition of a barcode for spatial data, which contains a description of the object in the form of topological features, is given. An algorithm for comparing feature barcodes was developed. It allows us to find the general structure of objects. The algorithm is based on the analysis of data from the barcode. An index of object similarity in terms of topological features is introduced. Results of applying the algorithm to compare maps of natural and municipal objects with different scales, generalization, and deformation are shown. The experiments confirm the high quality of the proposed algorithm. The percentage of similarity in the comparison of natural objects, taking into account the scale and deformation, ranges from 85 to 92, and for municipal objects, after stretching and distortion of their parts, from 74 to 87. Advantages of the proposed approach over analogues for the comparison of objects with significant deformation at different scales and after distortion are demonstrated.

Feasibility of Topological Data Analysis for event-related fMRI

10.1101/457747 ◽

2018 ◽

Author(s):

Cameron T. Ellis ◽

Michael Lesnick ◽

Gregory Henselman-Petrusek ◽

Bryn Keller ◽

Jonathan D. Cohen

Keyword(s):

Data Analysis ◽

Persistent Homology ◽

Time Frame ◽

Topological Data Analysis ◽

Fmri Data ◽

Cognitive Representations ◽

New Approach ◽

Neural Data ◽

Topological Features ◽

Topological Data

Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multi-voxel patterns in the brain. However, the methods for detecting these representations are limited. Topological Data Analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology, a popular TDA tool that identifies topological features in data and quantifies their robustness, can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.