Identification of Stem Cells from Large Cell Populations with Topological Scoring

2020 ◽  
Author(s):  
Mihaela E. Sardiu ◽  
Andrew C. Box ◽  
Jeff Haug ◽  
Michael P. Washburn

Abstract: Machine learning and topological analysis methods are increasingly used on large-scale omics datasets. Modern high-dimensional flow cytometry datasets share many features with other omics datasets, such as those from genomics and proteomics. For example, genomics and proteomics datasets can be sparse and high-dimensional, and flow cytometry datasets can be as well. This makes flow cytometry data a potentially suitable candidate for machine learning and topological scoring strategies, for example, to gain novel insights into patterns within the data. We have previously developed the Topological Score (TopS) and implemented it for the analysis of quantitative protein interaction network datasets. Here we show that the TopS approach for large-scale data analysis is applicable to a previously described flow cytometry sorted human hematopoietic stem cell dataset. We demonstrate that TopS is capable of effectively sorting this dataset into cell populations and identifying rare cell populations. We demonstrate the utility of TopS when coupled with multiple approaches, including topological data analysis, X-shift clustering, and t-Distributed Stochastic Neighbor Embedding (t-SNE). Our results suggest that TopS could be effectively used to analyze large-scale flow cytometry datasets to find rare cell populations.
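The abstract above does not state how the score is computed. As a rough illustration only, the sketch below assumes a likelihood-ratio-style score of the form Q_ij · ln(Q_ij / E_ij), where E_ij is the value expected from the row and column totals. This form, the function name `likelihood_ratio_scores`, and the demo matrix are assumptions made for illustration and should not be read as the authors' exact definition.

```python
import numpy as np


def likelihood_ratio_scores(counts):
    """Score each entry of a non-negative matrix against its marginal expectation.

    Assumed form (not taken from the abstract):
        score_ij = Q_ij * ln(Q_ij / E_ij)
        E_ij     = row_sum_i * col_sum_j / total
    Entries equal to zero receive a score of 0.
    """
    q = np.asarray(counts, dtype=float)
    row = q.sum(axis=1, keepdims=True)   # per-row totals
    col = q.sum(axis=0, keepdims=True)   # per-column totals
    total = q.sum()
    if total == 0:
        return np.zeros_like(q)
    expected = row * col / total
    scores = np.zeros_like(q)
    mask = q > 0                          # skip zeros to avoid log(0)
    scores[mask] = q[mask] * np.log(q[mask] / expected[mask])
    return scores


# Hypothetical data: rows are cells, columns are markers.
demo = np.array([[10.0, 1.0],
                 [1.0, 10.0],
                 [5.0, 5.0]])
scores = likelihood_ratio_scores(demo)
```

Under this assumed form, entries above their expectation score positive, entries below it score negative, and a cell whose markers are evenly expressed (the third row) scores zero.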

Methods ◽  
2017 ◽  
Vol 112 ◽  
pp. 201-210 ◽  
Author(s):  
Holger Hennig ◽  
Paul Rees ◽  
Thomas Blasi ◽  
Lee Kamentsky ◽  
Jane Hung ◽  
...  

2009 ◽  
Vol 2009 ◽  
pp. 1-19 ◽  
Author(s):  
Ali Bashashati ◽  
Ryan R. Brinkman

Flow cytometry (FCM) is widely used in health research and in treatment for a variety of tasks, such as in the diagnosis and monitoring of leukemia and lymphoma patients, providing the counts of helper-T lymphocytes needed to monitor the course and treatment of HIV infection, the evaluation of peripheral blood hematopoietic stem cell grafts, and many other diseases. In practice, FCM data analysis is performed manually, a process that requires an inordinate amount of time and is error-prone, nonreproducible, nonstandardized, and not open for re-evaluation, making it the most limiting aspect of this technology. This paper reviews state-of-the-art FCM data analysis approaches using a framework introduced to report each of the components in a data analysis pipeline. Current challenges and possible future directions in developing fully automated FCM data analysis tools are also outlined.


2020 ◽  
Author(s):  
Paul D. Simonson ◽  
Yue Wu ◽  
David Wu ◽  
Jonathan R. Fromm ◽  
Aaron Y. Lee

Abstract
Objectives: Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We desired a machine learning approach that is accurate, intuitively easy to understand, and highlights the cells that are most important in the algorithm’s prediction for a given case.
Methods: We developed an ensemble of convolutional neural networks (CNNs) for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma, using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. 78 non-gated 2D histograms were created per flow cytometry file. SHAP values were calculated to determine the most impactful 2D histograms and regions within the histograms. The SHAP values from all 78 histograms were then projected back to the original cells data for gating and visualization using standard flow cytometry software.
Results: The algorithm achieved 67.7% recall (sensitivity), 82.4% precision, and 0.92 AUROC. Visualization of the important cell populations in making individual predictions demonstrated correlations with known biology.
Conclusions: The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
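The step of building non-gated 2D histograms per file can be sketched as follows. The abstract does not give the number of channels or the bin size; 13 channels is assumed here only because 13 channels yield exactly 78 unordered pairs, and the 32-bin resolution, the function name `pairwise_histograms`, and the random event matrix are likewise hypothetical.

```python
from itertools import combinations

import numpy as np


def pairwise_histograms(events, bins=32):
    """Return {(i, j): 2D histogram} for every unordered pair of channels.

    `events` is an (n_cells, n_channels) array. No gating is applied.
    """
    n_channels = events.shape[1]
    hists = {}
    for i, j in combinations(range(n_channels), 2):
        # np.histogram2d returns (counts, x_edges, y_edges); keep the counts.
        h, _, _ = np.histogram2d(events[:, i], events[:, j], bins=bins)
        hists[(i, j)] = h
    return hists


# Hypothetical stand-in for one flow cytometry file: 1000 cells, 13 channels.
rng = np.random.default_rng(0)
events = rng.normal(size=(1000, 13))
hists = pairwise_histograms(events)
```

Each histogram could then be treated as a single-channel image for a CNN; how the original work normalized or stacked the histograms is not stated in the abstract.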


2018 ◽  
Author(s):  
Esther Ibanez-Marcelo ◽  
Lisa Campioni ◽  
Diego Manzoni ◽  
Enrica L Santarcangelo ◽  
Giovanni Petri

The aim of the study was to assess the EEG correlates of head positions, which have never been studied in humans, in participants with different psychophysiological characteristics, as encoded by their hypnotizability scores. This choice is motivated by earlier studies suggesting different processing of the vestibular/neck proprioceptive information in subjects with high (highs) and low (lows) hypnotizability scores maintaining their head rotated toward one side (RH). We analysed EEG signals recorded in 20 highs and 19 lows in basal conditions (head forward) and during RH, using spectral analysis, which captures changes localized to specific recording sites, and Topological Data Analysis (TDA), which instead describes large-scale differences in processing and representing sensorimotor information. Spectral analysis revealed significant differences related to head position, but not to hypnotizability, for the alpha1, beta2, beta3, and gamma bands. TDA instead revealed global hypnotizability-related differences in the strengths of the correlations among recording sites during RH. Significant changes were observed in lows on the left parieto-occipital side and in highs in the right fronto-parietal region. Significant differences between the two groups were found in the occipital region, where changes were larger in lows than in highs. The study reports findings on the EEG correlates of head posture for the first time, indicates that hypnotizability modulates its representation/processing on a large scale, and shows that spectral and topological data analysis provide complementary results.


2021 ◽  
Author(s):  
Mihaela E. Sardiu ◽  
Andrew C. Box ◽  
Jeffrey S. Haug ◽  
Michael P. Washburn

Machine learning and topological analysis methods are becoming increasingly used on various large-scale omics datasets.


2018 ◽  
Author(s):  
Tianhua Liao ◽  
Yuchen Wei ◽  
Mingjing Luo ◽  
Guoping Zhao ◽  
Haokui Zhou

Abstract: Population-scale microbiome study poses specific challenges in data analysis, from enterotype analysis, identification of driver species, to microbiome-wide association of host covariates. Application of advanced data mining techniques to high-dimensional complex dataset is expected to meet the rapid advancement in large scale and integrative microbiome research. Here, we present tmap, a topological data analysis framework for population-scale microbiome study. This framework can capture complex shape of large scale microbiome data into a compressive network representation. We also develop network-based statistical analysis for driver species identification and microbiome-wide association analysis. tmap can be used for exploring variations in a population-scale microbiome landscape to study host-microbiome association.
Availability and implementation: tmap is available at GitHub (https://github.com/GPZ-Bioinfo/tmap), accompanied with online documentation and tutorial (http://tmap.readthedocs.io).


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Scott Broderick ◽  
Ruhil Dongol ◽  
Tianmu Zhang ◽  
Krishna Rajan

Abstract: This paper introduces the use of topological data analysis (TDA) as an unsupervised machine learning tool to uncover classification criteria in complex inorganic crystal chemistries. Using the apatite chemistry as a template, we track through the use of persistent homology the topological connectivity of input crystal chemistry descriptors on defining similarity between different stoichiometries of apatites. It is shown that TDA automatically identifies a hierarchical classification scheme within apatites based on the commonality of the number of discrete coordination polyhedra that constitute the structural building units common among the compounds. This information is presented in the form of a visualization scheme of a barcode of homology classifications, where the persistence of similarity between compounds is tracked. Unlike traditional perspectives of structure maps, this new “Materials Barcode” schema serves as an automated exploratory machine learning tool that can uncover structural associations from crystal chemistry databases, as well as to achieve a more nuanced insight into what defines similarity among homologous compounds.
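To make the barcode idea concrete, the sketch below computes only the 0-dimensional persistence bars (connected components) of a Vietoris-Rips filtration, which coincide with single-linkage merge distances. It is a generic illustration, not the paper's pipeline: the function name `h0_barcode` and the four sample points are hypothetical, and real descriptor vectors would replace them.

```python
from itertools import combinations
import math


def h0_barcode(points):
    """Return 0-dimensional persistence bars as (birth, death) pairs.

    Every point is born at scale 0. A component dies at the distance at
    which it merges into another; one component never dies (death = inf).
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # this edge merges two components
            parent[ri] = rj
            bars.append((0.0, d))
    bars.append((0.0, math.inf))  # the component that survives
    return bars


# Hypothetical descriptors: two tight pairs that are far from each other.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(points)
```

Short bars correspond to points that become similar at a small scale, while the long finite bar reflects the gap between the two groups; this is the sense in which persistence tracks similarity across scales.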


2015 ◽  
Vol 89 (1) ◽  
pp. 71-88 ◽  
Author(s):  
Chiaowen Hsiao ◽  
Mengya Liu ◽  
Rick Stanton ◽  
Monnie McGee ◽  
Yu Qian ◽  
...  
