Single-Cell Data Analysis Using MMD Variational Autoencoder for a More Informative Latent Representation

Mapping Intimacies ◽

10.1101/613414 ◽

2019 ◽

Cited By ~ 1

Author(s):

Chao Zhang

Keyword(s):

Data Analysis ◽

Single Cell ◽

Image Data ◽

Mass Cytometry ◽

Multiple Perspectives ◽

Original Dataset ◽

Latent Space ◽

Variational Autoencoder ◽

Set Up ◽

Cell Data

Variational Autoencoder (VAE) is a generative model from the computer vision community; it learns a latent representation of images and generates new images in an unsupervised way. Recently, Vanilla VAE has been applied to single-cell data analysis in the hope of harnessing the representation power of the latent space to evade the “curse of dimensionality” of the original dataset. However, Vanilla VAE suffers from a less informative latent space, which raises a question about how reliably the Vanilla VAE latent space represents high-dimensional single-cell datasets. I therefore set up a study to examine this issue from multiple perspectives. This paper confirms the issue by comparing Vanilla VAE with MMD-VAE, a variant of VAE that claims to have overcome this issue on image data, across a series of single-cell RNA-seq and mass cytometry datasets. The results indicate that MMD-VAE is superior to Vanilla VAE in retaining information not only in the latent space but also in the reconstruction space, which suggests that MMD-VAE is a better option for single-cell data analysis than Vanilla VAE.
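To make the MMD idea in the abstract above concrete, here is a minimal sketch of a maximum mean discrepancy (MMD) estimate between two sets of samples, such as encoded latent vectors and draws from a prior. The Gaussian kernel, the bandwidth of 1.0, the biased estimator, and all function names are assumptions chosen for illustration; the abstract does not specify the paper's implementation.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    # Gaussian (RBF) kernel between two vectors; the bandwidth is an assumed value.
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    # Average kernel value over all pairs (x, y).
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    # Biased (V-statistic) estimate of the squared MMD between two sample sets.
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def sample_gaussian(rng, n, dim, loc=0.0):
    # Hypothetical stand-in for latent codes: n draws from N(loc, I) in dim dimensions.
    return [[rng.gauss(loc, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
prior = sample_gaussian(rng, 100, 2)             # samples from the prior
matched = sample_gaussian(rng, 100, 2)           # "latent codes" that match the prior
shifted = sample_gaussian(rng, 100, 2, loc=3.0)  # "latent codes" far from the prior

mmd_matched = mmd_squared(matched, prior)
mmd_shifted = mmd_squared(shifted, prior)
print(mmd_matched, mmd_shifted)
```

In an MMD-regularized autoencoder, a term of this kind is typically added to the reconstruction loss so that the aggregate distribution of latent codes is pulled toward the prior; the value is small when the two sample sets look alike and grows as they diverge.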

SCOUT: Single-cell outlier analysis in cancer

10.1101/2020.03.25.007518 ◽

2020 ◽

Author(s):

Giovana Ravizzoni Onzi ◽

Juliano Luiz Faccioni ◽

Alvaro G. Alvarado ◽

Paula Andreghetto Bracco ◽

Harley I. Kornblum ◽

...

Keyword(s):

Data Analysis ◽

Single Cell ◽

Biological Markers ◽

Rna Seq ◽

Outlier Analysis ◽

Mass Cytometry ◽

Wide Range ◽

Cell Data

Outliers are often ignored or even removed from data analysis. In cancer, however, single outlier cells can be of major importance, since they have uncommon characteristics that may confer the capacity to invade, metastasize, or resist therapy. Here we present the Single-Cell OUTlier analysis (SCOUT), a resource for single-cell data analysis focusing on outlier cells, and the SCOUT Selector (SCOUTS), an application that systematically applies SCOUT to a dataset over a wide range of biological markers. Using publicly available datasets of cancer samples obtained from mass cytometry and single-cell RNA-seq platforms, outlier cells for the expression of proteins or RNAs were identified and compared to their non-outlier counterparts among different samples. Our results show that analyzing single-cell data using SCOUT can uncover key information not easily observed in the analysis of the whole population.
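As an illustration of splitting cells into outliers and non-outliers for one marker, the sketch below uses an upper Tukey fence (third quartile plus 1.5 times the interquartile range). The cut-off rule, the function names, and the expression values are assumptions made for this example; the abstract above does not state which criterion SCOUT applies.

```python
import statistics


def upper_outlier_cutoff(values, k=1.5):
    # Upper Tukey fence: Q3 + k * IQR (an assumed rule, not taken from the abstract).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def split_outliers(values, cutoff):
    # Cells strictly above the cut-off are treated as outliers for this marker.
    outliers = [v for v in values if v > cutoff]
    non_outliers = [v for v in values if v <= cutoff]
    return outliers, non_outliers


# Hypothetical expression values of one marker across ten cells.
expression = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.2, 9.5, 1.1]

cutoff = upper_outlier_cutoff(expression)
outliers, non_outliers = split_outliers(expression, cutoff)
print(cutoff, outliers)
```

Repeating the split for every marker, and then comparing the outlier and non-outlier groups across samples, mirrors the kind of systematic per-marker analysis the abstract describes.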

Meeting the challenges of high-dimensional data analysis in immunology

10.1101/473215 ◽

2018 ◽

Cited By ~ 1

Author(s):

Subarna Palit ◽

Fabian J. Theis ◽

Christina E. Zielinski

Keyword(s):

Data Analysis ◽

Single Cell ◽

Immune Regulation ◽

Cellular Heterogeneity ◽

High Dimensionality ◽

High Dimensional ◽

Mass Cytometry ◽

Analysis Techniques ◽

The Impact ◽

Cell Data

Recent advances in cytometry have radically altered the fate of single-cell proteomics by allowing a more accurate understanding of complex biological systems. Mass cytometry (CyTOF) provides simultaneous single-cell measurements that are crucial for understanding cellular heterogeneity and identifying novel cellular subsets. High-dimensional CyTOF data were traditionally analyzed by gating on bivariate dot plots, which is not only laborious, given the quadratic increase of complexity with dimension, but also biased by manual gating. This review discusses the impact of new analysis techniques that give in-depth insights into the dynamics of immune regulation from static snapshot data, and provides immunologists with tools to address the high dimensionality of their single-cell data.
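The quadratic growth mentioned in the abstract above follows from counting marker pairs: n markers give n(n-1)/2 distinct bivariate plots. A small sketch, with a hypothetical panel size of 40 markers and an illustrative function name:

```python
import math


def bivariate_plot_count(n_markers):
    # Number of distinct unordered marker pairs, i.e. n * (n - 1) / 2.
    return math.comb(n_markers, 2)


# Hypothetical panel sizes, for illustration only.
for n_markers in (10, 20, 40):
    print(n_markers, bivariate_plot_count(n_markers))
```

Doubling the panel size roughly quadruples the number of plots to inspect, which is why exhaustive manual gating becomes impractical for high-dimensional panels.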

High-throughput single cell data analysis – A tutorial

Analytica Chimica Acta ◽

10.1016/j.aca.2021.338872 ◽

2021 ◽

pp. 338872

Author(s):

Gerjen H. Tinnevelt ◽

Kristiaan Wouters ◽

Geert J. Postma ◽

Rita Folcarelli ◽

Jeroen J. Jansen

Keyword(s):

Data Analysis ◽

Single Cell ◽

High Throughput ◽

Cell Data

Distinguishing different modes of growth using single-cell data

eLife ◽

10.7554/elife.72565 ◽

2021 ◽

Vol 10 ◽

Author(s):

Prathitha Kar ◽

Sriram Tiruvadi-Krishnan ◽

Jaana Männik ◽

Jaan Männik ◽

Ariel Amir

Keyword(s):

Data Analysis ◽

Single Cell ◽

Statistical Methods ◽

Synthetic Data ◽

Cellular Growth ◽

Biological Mechanisms ◽

E Coli ◽

High Throughput Data ◽

Consistent Method ◽

Cell Data

Collection of high-throughput data has become prevalent in biology. Large datasets allow the use of statistical constructs such as binning and linear regression to quantify relationships between variables and to hypothesize underlying biological mechanisms based on them. We discuss several such examples in relation to single-cell data and cellular growth. In particular, we show instances where what appears to be ordinary use of these statistical methods leads to incorrect conclusions, such as growth being non-exponential as opposed to exponential and vice versa. We propose that data analysis and its interpretation should be done in the context of a generative model, if possible. In this way, the statistical methods can be validated either analytically or against synthetic data generated by the model, leading to a consistent method for inferring biological mechanisms from data. On applying the validated methods of data analysis to infer cellular growth from our experimental data, we find the growth of length in E. coli to be non-exponential. Our analysis shows that in the later stages of the cell cycle the growth rate is faster than exponential.
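The validation workflow described in the abstract above can be sketched as follows: generate synthetic data from a known generative model, apply the statistical method, and check that it recovers what was put in. The model here (pure exponential growth with multiplicative noise), the parameter values, and all names are assumptions chosen for illustration, not the authors' model.

```python
import math
import random


def fit_line(xs, ys):
    # Ordinary least-squares fit of y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def simulate_exponential_growth(rng, rate, n_cells=500, noise_sd=0.02):
    # Synthetic cells: length at division = length at birth * exp(rate * time + noise).
    times, log_ratios = [], []
    for _ in range(n_cells):
        t = rng.uniform(20.0, 60.0)  # hypothetical generation times, in minutes
        length_birth = 2.0           # hypothetical birth length
        length_div = length_birth * math.exp(rate * t + rng.gauss(0.0, noise_sd))
        times.append(t)
        log_ratios.append(math.log(length_div / length_birth))
    return times, log_ratios


rng = random.Random(1)
true_rate = 0.02  # assumed growth rate per minute
times, log_ratios = simulate_exponential_growth(rng, true_rate)

# Under exponential growth, ln(L_div / L_birth) = rate * time,
# so the fitted slope should match the rate and the intercept should be near zero.
slope, intercept = fit_line(times, log_ratios)
print(slope, intercept)
```

In this simple case the noise enters only the dependent variable, so the regression behaves as expected. Under other noise structures the same regression may be biased, and checking the method against synthetic data in this way is how such a failure would be exposed.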

Computational Methods for Single-Cell Data Analysis

10.1007/978-1-4939-9057-3 ◽

2019 ◽

Keyword(s):

Data Analysis ◽

Single Cell ◽

Computational Methods ◽

Cell Data

Continuous visualization of differences between biological conditions in single-cell data

10.1101/337485 ◽

2018 ◽

Cited By ~ 1

Author(s):

Tyler J. Burns ◽

Garry P. Nolan ◽

Nikolay Samusik

Keyword(s):

Single Cell ◽

Nearest Neighbor ◽

Developmental Trajectory ◽

Functional Markers ◽

Mass Cytometry ◽

K Nearest Neighbor ◽

Cell Frequency ◽

Low Dimensional ◽

Marker Shift ◽

Cell Data

In high-dimensional single-cell data, comparing changes in functional markers between conditions is typically done across manual or algorithm-derived partitions based on population-defining markers. Visualization of these partitions is commonly done on low-dimensional embeddings (e.g., t-SNE), colored by per-partition changes. Here, we provide an analysis and visualization tool that performs these comparisons across overlapping k-nearest neighbor (KNN) groupings. This allows one to color low-dimensional embeddings by marker changes without the hard boundaries imposed by partitioning. We devised an objective optimization of k based on minimizing functional marker KNN imputation error. Proof-of-concept work visualized the exact location of an IL-7 responsive subset in a B cell developmental trajectory on a t-SNE map, independent of clustering. Per-condition cell frequency analysis revealed that KNN is sensitive to artifacts due to marker shift, and can therefore also be valuable in a quality control pipeline. Overall, we found that KNN groupings lead to useful multiple-condition visualizations and efficiently extract a large amount of information from mass cytometry data. Our software is publicly available through the Bioconductor package Sconify.
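The per-neighborhood comparison described in the abstract above can be sketched in a few lines: for each cell, find its k nearest neighbors in the space of population-defining markers, then compare a functional marker between conditions within that neighborhood. This is an illustrative brute-force version with made-up data and hypothetical names; it is not the Sconify API.

```python
import math


def knn_indices(points, query_index, k):
    # Brute-force k nearest neighbors by Euclidean distance.
    # The query cell itself (distance 0) is included in its own neighborhood.
    query = points[query_index]
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


def knn_marker_shift(points, conditions, marker, k):
    # For each cell: mean functional marker in "treated" neighbors
    # minus the mean in "control" neighbors, within its KNN grouping.
    shifts = []
    for i in range(len(points)):
        neighbors = knn_indices(points, i, k)
        treated = [marker[j] for j in neighbors if conditions[j] == "treated"]
        control = [marker[j] for j in neighbors if conditions[j] == "control"]
        if treated and control:
            shifts.append(sum(treated) / len(treated) - sum(control) / len(control))
        else:
            shifts.append(None)  # neighborhood lacks one of the conditions
    return shifts


# Hypothetical data: two cell populations in a 2-D surface-marker space.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
conditions = ["control", "treated"] * 4
# The functional marker responds to treatment only in the first population.
marker = [1.0, 3.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0]

shifts = knn_marker_shift(points, conditions, marker, k=4)
print(shifts)
```

Because neighborhoods overlap, the resulting per-cell values vary smoothly across an embedding and can be used to color it directly, without first drawing cluster boundaries.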

Peer Review #1 of "SCelVis: exploratory single cell data analysis on the desktop and in the cloud (v0.2)"

10.7287/peerj.8607v0.2/reviews/1 ◽

2020 ◽

Author(s):

A Olsen

Keyword(s):

Data Analysis ◽

Single Cell ◽

Peer Review ◽

Cell Data

Dice-XMBD: Deep Learning-Based Cell Segmentation for Imaging Mass Cytometry

Frontiers in Genetics ◽

10.3389/fgene.2021.721229 ◽

2021 ◽

Vol 12 ◽

Author(s):

Xu Xiao ◽

Ying Qiao ◽

Yudi Jiao ◽

Na Fu ◽

Wenxian Yang ◽

...

Keyword(s):

Deep Learning ◽

High Resolution ◽

Single Cell ◽

Image Data ◽

Basic Research ◽

Cell Segmentation ◽

Imaging Method ◽

Imaging Data ◽

Mass Cytometry ◽

Multiplexed Imaging

Highly multiplexed imaging technology is a powerful tool for understanding the composition and interactions of cells in tumor microenvironments at subcellular resolution, which is crucial for both basic research and clinical applications. Imaging mass cytometry (IMC), a recently introduced multiplex imaging method, can measure up to 100 markers simultaneously in one tissue section by using a high-resolution laser with a mass cytometer. However, due to its high resolution and large number of channels, processing and interpreting the image data from IMC remains a key challenge to its further application. Accurate and reliable single-cell segmentation is the first and a critical step in processing IMC image data. Unfortunately, existing segmentation pipelines either produce inaccurate cell segmentation results or require manual annotation, which is very time-consuming. Here, we developed Dice-XMBD, a Deep learnIng-based Cell sEgmentation algorithm for tissue multiplexed imaging data. In comparison with other state-of-the-art cell segmentation methods currently used for IMC images, Dice-XMBD efficiently generates more accurate single-cell masks on IMC images produced with different nuclear, membrane, and cytoplasm markers. All code and datasets are available at https://github.com/xmuyulab/Dice-XMBD.

SISUA: Semi-Supervised Generative Autoencoder for Single Cell Data

10.1101/631382 ◽

2019 ◽

Cited By ~ 1

Author(s):

Trung Ngo Trong ◽

Roger Kramer ◽

Juha Mehtonen ◽

Gerardo González ◽

Ville Hautamäki ◽

...

Keyword(s):

Single Cell ◽

Network Architecture ◽

Surface Protein ◽

Protein Quantification ◽

Additional Information ◽

Protein Levels ◽

Variational Autoencoder ◽

Cell Gene Expression ◽

Cell Phenotypes ◽

Cell Data

Single-cell transcriptomics offers a tool to study the diversity of cell phenotypes through snapshots of the abundance of mRNA in individual cells. Often there is additional information available besides the single-cell gene expression counts, such as bulk transcriptome data from the same tissue, or quantification of surface protein levels from the same cells. In this study, we propose models based on the Bayesian generative approach, in which protein quantification available as CITE-seq counts from the same cells is used to constrain the learning process, thus forming a semi-supervised model. The generative model is based on the deep variational autoencoder (VAE) neural network architecture.

Computational approaches for high‐throughput single‐cell data analysis

FEBS Journal ◽

10.1111/febs.14613 ◽

2018 ◽

Vol 286 (8) ◽

pp. 1451-1467 ◽

Cited By ~ 6

Author(s):

Helena Todorov ◽

Yvan Saeys

Keyword(s):

Data Analysis ◽

Single Cell ◽

High Throughput ◽

Computational Approaches ◽

Cell Data