Identification of Stem Cells from Large Cell Populations with Topological Scoring

Mapping Intimacies ◽

10.1101/2020.04.08.032102 ◽

2020 ◽

Author(s):

Mihaela E. Sardiu ◽

Andrew C. Box ◽

Jeff Haug ◽

Michael P. Washburn

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Data Analysis ◽

Large Scale ◽

Topological Analysis ◽

Topological Data Analysis ◽

Cell Populations ◽

Hematopoietic Stem ◽

Flow Cytometry Data ◽

Genomics And Proteomics

Abstract: Machine learning and topological analysis methods are increasingly used on large-scale omics datasets. Modern high-dimensional flow cytometry datasets share many features with other omics datasets such as genomics and proteomics: they too can be sparse and high-dimensional. This makes flow cytometry data a potentially suitable candidate for machine learning and topological scoring strategies, for example to gain novel insights into patterns within the data. We have previously developed the Topological Score (TopS) and implemented it for the analysis of quantitative protein interaction network datasets. Here we show that the TopS approach for large-scale data analysis is applicable to a previously described flow cytometry sorted human hematopoietic stem cell dataset. We demonstrate that TopS can effectively sort this dataset into cell populations and identify rare cell populations. We demonstrate the utility of TopS when coupled with multiple approaches including topological data analysis, X-shift clustering, and t-Distributed Stochastic Neighbor Embedding (t-SNE). Our results suggest that TopS could be effectively used to analyze large-scale flow cytometry datasets to find rare cell populations.

An open-source solution for advanced imaging flow cytometry data analysis using machine learning

Methods ◽

10.1016/j.ymeth.2016.08.018 ◽

2017 ◽

Vol 112 ◽

pp. 201-210 ◽

Cited By ~ 42

Author(s):

Holger Hennig ◽

Paul Rees ◽

Thomas Blasi ◽

Lee Kamentsky ◽

Jane Hung ◽

...

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Data Analysis ◽

Open Source ◽

Advanced Imaging ◽

Flow Cytometry Data ◽

Imaging Flow Cytometry

A Survey of Flow Cytometry Data Analysis Methods

Advances in Bioinformatics ◽

10.1155/2009/584603 ◽

2009 ◽

Vol 2009 ◽

pp. 1-19 ◽

Cited By ~ 53

Author(s):

Ali Bashashati ◽

Ryan R. Brinkman

Keyword(s):

Flow Cytometry ◽

Data Analysis ◽

State Of The Art ◽

Analysis Pipeline ◽

Future Directions ◽

Hematopoietic Stem ◽

Flow Cytometry Data ◽

Helper T Lymphocytes ◽

Data Analysis Pipeline ◽

Data Analysis Methods

Flow cytometry (FCM) is widely used in health research and in treatment for a variety of tasks, such as the diagnosis and monitoring of leukemia and lymphoma patients, providing the counts of helper-T lymphocytes needed to monitor the course and treatment of HIV infection, and the evaluation of peripheral blood hematopoietic stem cell grafts, among many others. In practice, FCM data analysis is performed manually, a process that requires an inordinate amount of time and is error-prone, nonreproducible, nonstandardized, and not open for re-evaluation, making it the most limiting aspect of this technology. This paper reviews state-of-the-art FCM data analysis approaches using a framework introduced to report each of the components in a data analysis pipeline. Current challenges and possible future directions in developing fully automated FCM data analysis tools are also outlined.

De novo identification and visualization of important cell populations for classic Hodgkin lymphoma using flow cytometry and machine learning

10.1101/2020.12.18.20248526 ◽

2020 ◽

Author(s):

Paul D. Simonson ◽

Yue Wu ◽

David Wu ◽

Jonathan R. Fromm ◽

Aaron Y. Lee

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Hodgkin Lymphoma ◽

De Novo ◽

Cell Populations ◽

Automated Classification ◽

Flow Cytometry Data ◽

Classic Hodgkin Lymphoma ◽

Standard Flow Cytometry ◽

Potential Applications

Abstract. Objectives: Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We desired a machine learning approach that is accurate, intuitively easy to understand, and highlights the cells that are most important in the algorithm’s prediction for a given case. Methods: We developed an ensemble of convolutional neural networks (CNNs) for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma, using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. 78 non-gated 2D histograms were created per flow cytometry file. SHAP values were calculated to determine the most impactful 2D histograms and regions within the histograms. The SHAP values from all 78 histograms were then projected back to the original cell data for gating and visualization using standard flow cytometry software. Results: The algorithm achieved 67.7% recall (sensitivity), 82.4% precision, and 0.92 AUROC. Visualization of the important cell populations in making individual predictions demonstrated correlations with known biology. Conclusions: The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
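The "non-gated 2D histograms" step in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 32-bin grid, the [0, 1] scaling, and the choice of 13 channels are all assumptions made for illustration. Thirteen channels are used only because every unordered pair of 13 channels gives 78 histograms, which matches the count in the abstract; whether the authors arrived at 78 this way is not stated there.

```python
from itertools import combinations
import random


def histogram_2d(xs, ys, bins=32):
    """Count events into a bins x bins grid; inputs are assumed scaled to [0, 1]."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        bx = min(int(x * bins), bins - 1)
        by = min(int(y * bins), bins - 1)
        grid[bx][by] += 1
    return grid


def pairwise_histograms(events, bins=32):
    """Map each unordered channel pair (i, j) to its 2D histogram.

    events: list of per-cell tuples, one value per channel (hypothetical layout).
    """
    n_channels = len(events[0])
    # Transpose the per-cell rows into per-channel columns once.
    columns = [[cell[c] for cell in events] for c in range(n_channels)]
    return {
        (i, j): histogram_2d(columns[i], columns[j], bins)
        for i, j in combinations(range(n_channels), 2)
    }


# Synthetic stand-in for one flow cytometry file: 2000 cells, 13 channels.
random.seed(0)
events = [tuple(random.random() for _ in range(13)) for _ in range(2000)]
hists = pairwise_histograms(events)
print(len(hists))  # 78 channel pairs for 13 channels
```

Each resulting grid could then be treated as a single-channel image for a CNN, which is the general idea the abstract describes; real data would first need compensation and scaling, which this sketch skips.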

Spectral and topological analysis of the cortical representation of the head position: does hypnotizability matter?

10.1101/442053 ◽

2018 ◽

Author(s):

Esther Ibanez-Marcelo ◽

Lisa Campioni ◽

Diego Manzoni ◽

Enrica L Santarcangelo ◽

Giovanni Petri

Keyword(s):

Spectral Analysis ◽

Data Analysis ◽

Large Scale ◽

Topological Analysis ◽

Head Position ◽

Topological Data Analysis ◽

Proprioceptive Information ◽

Head Positions ◽

First Time ◽

Topological Data

The aim of the study was to assess the EEG correlates of head positions, which have never been studied in humans, in participants with different psychophysiological characteristics, as encoded by their hypnotizability scores. This choice is motivated by earlier studies suggesting different processing of the vestibular/neck proprioceptive information in subjects with high (highs) and low (lows) hypnotizability scores maintaining their head rotated toward one side (RH). We analysed EEG signals recorded in 20 highs and 19 lows in basal conditions (head forward) and during RH, using spectral analysis, which captures changes localized to specific recording sites, and Topological Data Analysis (TDA), which instead describes large-scale differences in processing and representing sensorimotor information. Spectral analysis revealed significant differences related to head position for the alpha1, beta2, beta3, and gamma bands, but not to hypnotizability. TDA instead revealed global hypnotizability-related differences in the strengths of the correlations among recording sites during RH. Significant changes were observed in lows on the left parieto-occipital side and in highs in the right fronto-parietal region. Significant differences between the two groups were found in the occipital region, where changes were larger in lows than in highs. The study reports the EEG correlates of head posture for the first time, indicates that hypnotizability modulates its representation/processing on a large scale, and shows that spectral and topological data analysis provide complementary results.

Identification of stem cells from large cell populations with topological scoring

Molecular Omics ◽

10.1039/d0mo00039f ◽

2021 ◽

Author(s):

Mihaela E. Sardiu ◽

Andrew C. Box ◽

Jeffrey S. Haug ◽

Michael P. Washburn

Keyword(s):

Machine Learning ◽

Stem Cells ◽

Large Scale ◽

Topological Analysis ◽

Large Cell ◽

Cell Populations ◽

Analysis Methods

Machine learning and topological analysis methods are becoming increasingly used on various large-scale omics datasets.

tmap: topological analysis of population-scale microbiome data

10.1101/396960 ◽

2018 ◽

Cited By ~ 1

Author(s):

Tianhua Liao ◽

Yuchen Wei ◽

Mingjing Luo ◽

Guoping Zhao ◽

Haokui Zhou

Keyword(s):

Data Analysis ◽

Large Scale ◽

Topological Analysis ◽

Topological Data Analysis ◽

Link Type ◽

Online Documentation ◽

Microbiome Research ◽

Population Scale ◽

Microbiome Data ◽

Complex Dataset

Abstract: Population-scale microbiome studies pose specific challenges in data analysis, from enterotype analysis and identification of driver species to microbiome-wide association of host covariates. Application of advanced data mining techniques to high-dimensional complex datasets is expected to meet the rapid advancement in large-scale and integrative microbiome research. Here, we present tmap, a topological data analysis framework for population-scale microbiome study. This framework can capture the complex shape of large-scale microbiome data in a compressive network representation. We also develop network-based statistical analysis for driver species identification and microbiome-wide association analysis. tmap can be used for exploring variations in a population-scale microbiome landscape to study host-microbiome association. Availability and implementation: tmap is available at GitHub (https://github.com/GPZ-Bioinfo/tmap), accompanied by online documentation and a tutorial (http://tmap.readthedocs.io). Contact: [email protected]

Classification of apatite structures via topological data analysis: a framework for a ‘Materials Barcode’ representation of structure maps

Scientific Reports ◽

10.1038/s41598-021-90070-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Scott Broderick ◽

Ruhil Dongol ◽

Tianmu Zhang ◽

Krishna Rajan

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Crystal Chemistry ◽

Persistent Homology ◽

Hierarchical Classification ◽

Topological Data Analysis ◽

Learning Tool ◽

Coordination Polyhedra ◽

Machine Learning Tool ◽

Topological Data

Abstract: This paper introduces the use of topological data analysis (TDA) as an unsupervised machine learning tool to uncover classification criteria in complex inorganic crystal chemistries. Using the apatite chemistry as a template, we use persistent homology to track the topological connectivity of input crystal chemistry descriptors in defining similarity between different stoichiometries of apatites. It is shown that TDA automatically identifies a hierarchical classification scheme within apatites based on the commonality of the number of discrete coordination polyhedra that constitute the structural building units common among the compounds. This information is presented in the form of a visualization scheme of a barcode of homology classifications, where the persistence of similarity between compounds is tracked. Unlike traditional perspectives of structure maps, this new “Materials Barcode” schema serves as an automated exploratory machine learning tool that can uncover structural associations from crystal chemistry databases and provide a more nuanced insight into what defines similarity among homologous compounds.

On the Application of Topological Data Analysis and Machine Learning to Flood Incidents, and Decision Making

SSRN Electronic Journal ◽

10.2139/ssrn.3981505 ◽

2021 ◽

Author(s):

Felix Obi Ohanuba ◽

Mohd Tahir Ismail ◽

Majid Khan Ali

Keyword(s):

Machine Learning ◽

Decision Making ◽

Data Analysis ◽

Topological Data Analysis ◽

Topological Data

A non-negative multilinear block tensor decomposition approach to flow cytometry data analysis

2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP) ◽

10.1109/mlsp.2014.6958907 ◽

2014 ◽

Author(s):

David Brie ◽

Sebastian Miron ◽

Philippe Becuwe ◽

Stephanie Grandemange

Keyword(s):

Flow Cytometry ◽

Data Analysis ◽

Tensor Decomposition ◽

Decomposition Approach ◽

Flow Cytometry Data

Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure

Cytometry Part A ◽

10.1002/cyto.a.22735 ◽

2015 ◽

Vol 89 (1) ◽

pp. 71-88 ◽

Cited By ~ 17

Author(s):

Chiaowen Hsiao ◽

Mengya Liu ◽

Rick Stanton ◽

Monnie McGee ◽

Yu Qian ◽

...

Keyword(s):

Flow Cytometry ◽

Distance Measure ◽

Cell Populations ◽

Test Statistic ◽

Flow Cytometry Data