A Quantitative Framework for Evaluating Single-Cell Data Structure Preservation by Dimensionality Reduction Techniques

Cell Reports ◽  
2020 ◽  
Vol 31 (5) ◽  
pp. 107576 ◽  
Author(s):  
Cody N. Heiser ◽  
Ken S. Lau
2019 ◽  
Author(s):  
Cody N. Heiser ◽  
Ken S. Lau

Summary: High-dimensional data, such as those generated using single-cell RNA sequencing, present challenges in interpretation and visualization. Numerical and computational methods for dimensionality reduction allow for low-dimensional representation of genome-scale expression data for downstream clustering, trajectory reconstruction, and biological interpretation. However, a comprehensive and quantitative evaluation of the performance of these techniques has not been established. We present an unbiased framework that defines metrics of global and local structure preservation in dimensionality reduction transformations. Using discrete and continuous scRNA-seq datasets, we find that input cell distribution and method parameters are largely determinant of global, local, and organizational data structure preservation by eleven published dimensionality reduction methods. Code available at github.com/KenLauLab/DR-structure-preservation allows for rapid evaluation of further datasets and methods.
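The idea of scoring global and local structure preservation can be illustrated with a minimal sketch. It assumes two common stand-in metrics that are not taken from the summary above: the Pearson correlation between pairwise distances before and after reduction (global), and the fraction of each cell's k nearest neighbours that survive the reduction (local). The function names, the toy Gaussian data, and the "keep the first two coordinates" reduction are all hypothetical; the authors' repository may define its metrics differently.

```python
import math
import random
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def global_preservation(high, low):
    """Assumed global metric: correlation of pairwise distances in both spaces."""
    return pearson(pairwise_distances(high), pairwise_distances(low))


def knn_sets(points, k):
    """For each point, the index set of its k nearest neighbours (self excluded)."""
    neighbours = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        ranked = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(ranked[:k]))
    return neighbours


def knn_preservation(high, low, k):
    """Assumed local metric: mean fraction of k nearest neighbours shared by both spaces."""
    shared = sum(len(a & b) for a, b in zip(knn_sets(high, k), knn_sets(low, k)))
    return shared / (k * len(high))


def make_toy_data(n=60, dims=10, seed=0):
    """Hypothetical stand-in for an expression matrix: n cells, dims features."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dims)] for _ in range(n)]


high = make_toy_data()
identity_low = [list(p) for p in high]   # a "reduction" that changes nothing
truncated_low = [p[:2] for p in high]    # crude reduction: keep two coordinates

print("identity :", global_preservation(high, identity_low),
      knn_preservation(high, identity_low, k=5))
print("truncated:", global_preservation(high, truncated_low),
      knn_preservation(high, truncated_low, k=5))
```

An embedding that changes nothing scores perfectly on both measures, while the crude truncation scores lower, which is the kind of comparison such a framework is meant to make across methods and datasets.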


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Van Hoan Do ◽  
Stefan Canzar

Abstract: Emerging single-cell technologies profile multiple types of molecules within individual cells. A fundamental step in the analysis of the produced high-dimensional data is their visualization using dimensionality reduction techniques such as t-SNE and UMAP. We introduce j-SNE and j-UMAP as their natural generalizations to the joint visualization of multimodal omics data. Our approach automatically learns the relative contribution of each modality to a concise representation of cellular identity that promotes discriminative features but suppresses noise. On eight datasets, j-SNE and j-UMAP produce unified embeddings that better agree with known cell types and that harmonize RNA and protein velocity landscapes.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Akram Vasighizaker ◽  
Saiteja Danda ◽  
Luis Rueda

Abstract: Identifying relevant disease modules such as target cell types is a significant step for studying diseases. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Computational techniques such as clustering are the most suitable approach in scRNA-seq data analysis when the cell types have not been well characterized. These techniques can be used to identify a group of genes that belong to a specific cell type based on their similar gene expression patterns. However, due to the sparsity and high dimensionality of scRNA-seq data, classical clustering methods are not efficient. Therefore, the use of non-linear dimensionality reduction techniques to improve clustering results is crucial. We introduce a method that identifies representative clusters of different cell types by combining non-linear dimensionality reduction techniques and clustering algorithms. We assess the impact of different dimensionality reduction techniques combined with the clustering of thirteen publicly available scRNA-seq datasets of different tissues, sizes, and technologies. We further perform gene set enrichment analysis to evaluate the proposed method’s performance. Our results show that modified locally linear embedding combined with independent component analysis yields overall the best performance relative to the existing unsupervised methods across different datasets.


2015 ◽  
Vol 294 ◽  
pp. 553-564 ◽  
Author(s):  
Manuel Domínguez ◽  
Serafín Alonso ◽  
Antonio Morán ◽  
Miguel A. Prada ◽  
Juan J. Fuertes

2019 ◽  
Vol 165 ◽  
pp. 104-111 ◽  
Author(s):  
S. Velliangiri ◽  
S. Alagumuthukrishnan ◽  
S. Iwin Thankumar Joseph
