Identification of stem cells from large cell populations with topological scoring

2021 ◽  
Author(s):  
Mihaela E. Sardiu ◽  
Andrew C. Box ◽  
Jeffrey S. Haug ◽  
Michael P. Washburn

Machine learning and topological analysis methods are increasingly applied to various large-scale omics datasets.

2020 ◽  
Author(s):  
Mihaela E. Sardiu ◽  
Box C. Andrew ◽  
Jeff Haug ◽  
Michael P. Washburn

Abstract: Machine learning and topological analysis methods are increasingly applied to various large-scale omics datasets. Modern high-dimensional flow cytometry datasets share many features with other omics datasets, such as those from genomics and proteomics. For example, genomics and proteomics datasets can be sparse and high-dimensional, and flow cytometry datasets can share these properties. This makes flow cytometry data a potentially suitable candidate for machine learning and topological scoring strategies, for example to gain novel insights into patterns within the data. We have previously developed the Topological Score (TopS) and implemented it for the analysis of quantitative protein interaction network datasets. Here we show that the TopS approach to large-scale data analysis is applicable to a previously described flow cytometry sorted human hematopoietic stem cell dataset. We demonstrate that TopS is capable of effectively sorting this dataset into cell populations and identifying rare cell populations. We also demonstrate the utility of TopS when coupled with multiple approaches, including topological data analysis, X-shift clustering, and t-Distributed Stochastic Neighbor Embedding (t-SNE). Our results suggest that TopS could be used effectively to analyze large-scale flow cytometry datasets to find rare cell populations.


Lab on a Chip ◽  
2010 ◽  
Vol 10 (21) ◽  
pp. 2952 ◽  
Author(s):  
Won Chul Lee ◽  
Sara Rigante ◽  
Albert P. Pisano ◽  
Frans A. Kuypers

2020 ◽  
Author(s):  
M. Réda Zellag ◽  
Yifan Zhao ◽  
Vincent Poupart ◽  
Ramya Singh ◽  
Jean-Claude Labbé ◽  
...  

Abstract: Investigating the complex interactions between stem cells and their native environment requires an efficient means to image them in situ. Caenorhabditis elegans germline stem cells (GSCs) are distinctly accessible for intravital imaging; however, long-term image acquisition and analysis of dividing GSCs can be technically challenging. Here we present a systematic investigation into the technical factors impacting GSC physiology during live imaging and provide an optimized method for monitoring GSC mitosis under minimally disruptive conditions. We describe CentTracker, an automated and generalizable image analysis tool that uses machine learning to pair mitotic centrosomes and which can extract a variety of mitotic parameters rapidly from large-scale datasets. We employ CentTracker to assess a range of mitotic features in GSCs and show that subpopulations with distinct mitotic profiles are unlikely to exist within the stem cell pool. We further find evidence for spatial clustering of GSC mitoses within the germline tissue and for biases in mitotic spindle orientation relative to the germline’s distal-proximal axis, and thus the niche. The technical and analytical tools provided herein pave the way for large-scale screening studies of multiple mitotic processes in GSCs dividing in situ, in an intact tissue, in a living animal, under seemingly physiological conditions.


2016 ◽  
Author(s):  
Tonia Korves ◽  
Christopher Garay ◽  
Heather A. Carleton ◽  
Ashley Sabol ◽  
Eija Trees ◽  
...  

Abstract: Pathogen genomic data is increasingly important in investigations of infectious disease outbreaks. The objective of this study is to develop methods for using large-scale genomic data to determine the type of environment an outbreak pathogen came from. Specifically, this study focuses on assessing whether an outbreak strain came from a natural environment or experienced substantial laboratory culturing. The approach uses phylogenetic analyses and machine learning to identify DNA changes that are characteristic of laboratory culturing. The analysis methods include parallelized sequence read alignment, variant identification, phylogenetic tree construction, ancestral state reconstruction, semi-supervised classification, and random forests. These methods were applied to 902 Salmonella enterica serovar Typhimurium genomes from the NCBI Sequence Read Archive database. The analyses identified candidate signatures of laboratory culturing that are highly consistent with genes identified in published laboratory passage studies. In particular, the analysis identified mutations in rpoS, hfq, rfb genes, acrB, and rbsR as strong signatures of laboratory culturing. In leave-one-out cross-validation, the classifier had an area under the receiver operating characteristic (ROC) curve of 0.89 for strains from two laboratory reference sets collected in the 1940s and 1980s. The classifier was also used to assess laboratory culturing in foodborne and laboratory-acquired outbreak strains closely related to laboratory reference strain serovar Typhimurium 14028. The classifier detected some evidence of laboratory culturing on the phylogeny branch leading to this clade, suggesting that all of these strains may have a common ancestor that experienced laboratory culturing.
Together, these results suggest that phylogenetic analysis and machine learning could be used to assess whether pathogens collected from patients are naturally occurring or have been extensively cultured in laboratories. The data analysis methods can be applied to any bacterial pathogen species, and could be adapted to assess viral pathogens and other types of source environments.
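
The evaluation metric mentioned in this abstract, area under the ROC curve from leave-one-out cross-validation, can be illustrated with a minimal sketch. This is not the study's pipeline: the nearest-centroid scorer stands in for the semi-supervised classifier and random forests, and the feature vectors, labels, and function names below are all hypothetical.

```python
from typing import Callable, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_score(train_x: List[List[float]], train_y: List[int],
                   x: List[float]) -> float:
    """Toy scorer (an assumption, not the paper's classifier): distance to
    the class-0 centroid minus distance to the class-1 centroid.
    Higher values mean the sample looks more like class 1."""
    def centroid(label: int) -> List[float]:
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a: List[float], b: List[float]) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    return dist(x, centroid(0)) - dist(x, centroid(1))


def leave_one_out_scores(
    xs: List[List[float]], ys: List[int],
    scorer: Callable[[List[List[float]], List[int], List[float]], float],
) -> List[float]:
    """Score each sample with a model fitted on all the other samples."""
    scores = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        scores.append(scorer(train_x, train_y, xs[i]))
    return scores


# Hypothetical data: 1 = laboratory-cultured, 0 = natural isolate.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [0.9, 1.0]]
ys = [0, 0, 0, 1, 1, 1]

loo_scores = leave_one_out_scores(xs, ys, centroid_score)
auc = roc_auc(ys, loo_scores)
print(f"leave-one-out ROC AUC: {auc:.2f}")
```

Because every sample is scored by a model that never saw it, the resulting AUC estimates performance on unseen strains rather than fit to the training data.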


2021 ◽  
pp. mbc.E20-11-0716
Author(s):  
Réda M. Zellag ◽  
Yifan Zhao ◽  
Vincent Poupart ◽  
Ramya Singh ◽  
Jean-Claude Labbé ◽  
...  

Investigating the complex interactions between stem cells and their native environment requires an efficient means to image them in situ. Caenorhabditis elegans germline stem cells (GSCs) are distinctly accessible for intravital imaging; however, long-term image acquisition and analysis of dividing GSCs can be technically challenging. Here we present a systematic investigation into the technical factors impacting GSC physiology during live imaging and provide an optimized method for monitoring GSC mitosis under minimally disruptive conditions. We describe CentTracker, an automated and generalizable image analysis tool that uses machine learning to pair mitotic centrosomes and which can extract a variety of mitotic parameters rapidly from large-scale datasets. We employ CentTracker to assess a range of mitotic features in a large GSC data set. We observe spatial clustering of mitoses within the germline tissue, but no evidence that subpopulations with distinct mitotic profiles exist within the stem cell pool. We further find biases in GSC spindle orientation relative to the germline's distal-proximal axis, and thus the niche. The technical and analytical tools provided herein pave the way for large-scale screening studies of multiple mitotic processes in GSCs dividing in situ, in an intact tissue, in a living animal, under seemingly physiological conditions.


2020 ◽  
Author(s):  
Jin Soo Lim ◽  
Jonathan Vandermause ◽  
Matthijs A. van Spronsen ◽  
Albert Musaelian ◽  
Christopher R. O’Connor ◽  
...  

Restructuring of interfaces plays a crucial role in materials science and heterogeneous catalysis. Bimetallic systems, in particular, often adopt very different compositions and morphologies at surfaces compared to the bulk. For the first time, we reveal a detailed atomistic picture of the long-timescale restructuring of Pd deposited on Ag, using microscopy, spectroscopy, and novel simulation methods. Encapsulation of Pd by Ag always precedes layer-by-layer dissolution of Pd, resulting in significant Ag migration out of the surface and extensive vacancy pits. These metastable structures are of vital catalytic importance, as Ag-encapsulated Pd remains much more accessible to reactants than bulk-dissolved Pd. The underlying mechanisms are uncovered by performing fast and large-scale machine-learning molecular dynamics, followed by our newly developed method for complete characterization of atomic surface restructuring events. Our approach is broadly applicable to other multimetallic systems of interest and enables the previously impractical mechanistic investigation of restructuring dynamics.


2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26,318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.

