diffloop: a computational framework for identifying and analyzing differential DNA loops from sequencing data

2017 ◽  
Vol 34 (4) ◽  
pp. 672-674 ◽  
Author(s):  
Caleb A Lareau ◽  
Martin J Aryee
2016 ◽  
Author(s):  
Caleb A. Lareau ◽  
Martin J. Aryee

ABSTRACT The three-dimensional architecture of DNA within the nucleus is a key determinant of interactions between genes, regulatory elements, and transcriptional machinery. As a result, differences in loop structure are associated with differences in gene expression and cell state. Here, we introduce diffloop, an R/Bioconductor package for identifying differential DNA looping between samples. The package additionally provides a suite of functions for the quality control, statistical testing, annotation and visualization of DNA loops. We demonstrate this functionality by detecting differences in DNA loops between ENCODE ChIA-PET datasets and relate looping to differences in epigenetic state and gene expression.


2018 ◽  
Author(s):  
Xianwen Ren ◽  
Liangtao Zheng ◽  
Zemin Zhang

ABSTRACT Clustering is a prevalent means of analyzing single-cell RNA sequencing data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose a new clustering framework based on random projection and feature construction for large-scale single-cell RNA sequencing data, which greatly improves the clustering accuracy, robustness and computational efficiency of various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset of 68,578 human blood cells, our method achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while using only 66% of the memory required by the widely used software package SC3. Compared to k-means, the accuracy improvement can reach 3-fold, depending on the dataset. An R implementation of the framework is available from https://github.com/Japrin/sscClust.


2020 ◽  
Author(s):  
Jiguang Wang ◽  
Biaobin Jiang ◽  
Quanhua Mu ◽  
Fufang Qiu ◽  
Weiqi Xu

Abstract Metastasis leads to most cancer deaths, but its spatiotemporal behavior remains unpredictable at an early stage. Here, we developed MetaNet, a computational framework that integrates clinical and sequencing data from 32,176 primary and metastatic cancer cases to assess the metastatic risk of primary tumors. MetaNet achieved high accuracy in distinguishing metastatic from primary tumors in breast and prostate cancers. From the prediction, we identified Metastasis-Featuring Primary (MFP) tumors, a subset of primary tumors with genomic features enriched in metastasis, and demonstrated their high metastatic risk, with significantly shorter disease-free survival and higher migratory potential. In addition, we identified genomic alterations associated with organ-specific metastases and used them to stratify patients into risk groups with propensities toward different metastatic organs. Remarkably, this organotropic stratification achieved better prognostic value than the standard histological grading system in prostate cancer, especially between the Bone-MFP and Liver-MFP subtypes, providing organotropic insights to inform organ-specific examinations during follow-up.


2021 ◽  
Author(s):  
Michael E Nelson ◽  
Simone G Riva ◽  
Ann Cvejic

Spatial transcriptomics is revolutionising the study of single-cell RNA and tissue-wide cell heterogeneity, but few robust methods exist for connecting spatially resolved cells to the so-called marker genes from single-cell RNA sequencing, which generate much of the insight gleaned from spatial methods. Here we present SMaSH, a general computational framework for extracting key marker genes from single-cell RNA sequencing data for spatial transcriptomics approaches. SMaSH extracts robust and biologically well-motivated marker genes, which characterise the given dataset better than existing, more limited computational approaches for global marker gene calculation.


2019 ◽  
Vol 48 (2) ◽  
pp. e7-e7 ◽  
Author(s):  
Carine Legrand ◽  
Francesca Tuorto

Abstract Newly developed ribosome profiling methods based on high-throughput sequencing of ribosome-protected mRNA footprints allow genome-wide translational changes to be studied in detail. However, computational analysis of the sequencing data still represents a bottleneck for many laboratories. Further, specific pipelines for quality control and statistical analysis of ribosome profiling data, providing high levels of both accuracy and confidence, are currently lacking. In this study, we describe automated bioinformatic and statistical diagnoses to perform robust quality control of ribosome profiling data (RiboQC), to efficiently visualize ribosome positions and to estimate ribosome speed (RiboMine) in an unbiased way. We present an R pipeline to set up and undertake the analyses, which offers the user an HTML page for examining their own data with regard to the following aspects: periodicity, ligation and digestion of footprints; reproducibility and batch effects of replicates; drug-related artifacts; unbiased codon enrichment, including variability between mRNAs, for A, P and E sites; and mining of some causal or confounding factors. We expect our pipeline to allow optimal use of the wealth of information provided by ribosome profiling experiments.


2019 ◽  
Author(s):  
Qiaoxing Liang ◽  
Paul W. Bible ◽  
Yu Liu ◽  
Bin Zou ◽  
Lai Wei

Abstract Taxonomic classification is a crucial step for metagenomics applications including disease diagnostics, microbiome analyses, and outbreak tracing. Yet it is unknown what deep learning architecture can capture microbial genome-wide features relevant to this task. We report DeepMicrobes (https://github.com/MicrobeLab/DeepMicrobes), a computational framework that can perform large-scale training on >10,000 RefSeq complete microbial genomes and accurately predict the species-of-origin of whole metagenome shotgun sequencing reads. We show the advantage of DeepMicrobes over state-of-the-art tools in precisely identifying species from microbial community sequencing data. Therefore, DeepMicrobes expands the toolbox of taxonomic classification for metagenomics and enables the development of further deep learning-based bioinformatics algorithms for microbial genomic sequence analysis.


2021 ◽  
Author(s):  
Stephan Preibisch ◽  
Nikos Karaiskos ◽  
Nikolaus Rajewsky

We present STIM, an imaging-based computational framework for exploring, visualizing, and processing high-throughput spatial sequencing datasets. STIM is built on the powerful ImgLib2, N5 and BigDataViewer (BDV) frameworks enabling transfer of computer vision techniques to datasets with irregular measurement-spacing and arbitrary spatial resolution, such as spatial transcriptomics data generated by multiplexed targeted hybridization or spatial sequencing technologies. We illustrate STIM's capabilities by representing, visualizing, and automatically registering publicly available spatial sequencing data from 14 serial sections of mouse brain tissue.


2017 ◽  
Author(s):  
Kyle S. Smith ◽  
Debashis Ghosh ◽  
Katherine S. Pollard ◽  
Subhajyoti De

ABSTRACT Through the accumulation of somatic mutations, cancer genomes evolve, diverging from the genome of the host. It remains unclear to what extent somatic evolutionary divergence is comparable across different regions of the cancer genome versus concentrated in specific genomic elements. We present a novel computational framework, SASE-mapper, to identify genomic regions that show signatures of accelerated somatic evolution (SASE) in a subset of samples in a cohort, marked by the accumulation of an excess of somatic mutations compared to that expected based on local, context-aware background mutation rates in the cancer genomes. Analyzing tumor whole-genome sequencing data for 365 samples from 5 cohorts, we detect recurrent SASE at a genome-wide scale. The SASEs were enriched for genomic elements associated with active chromatin, and regulatory regions of several known cancer genes had SASE in multiple cohorts. Regions with SASE carried specific mutagenic signatures and often co-localized within the 3D nuclear space, suggesting a common basis. A subset of SASEs was frequently associated with regulatory changes in key cancer pathways and also with poor clinical outcome. While the SASE-associated mutations were not necessarily recurrent at base-pair resolution, the SASEs recurrently targeted the same functional regions, with similar consequences. It is likely that regulatory redundancy and plasticity promote the prevalence of SASE-like patterns in cancer genomes.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Tobias Heinen ◽  
Stefano Secchia ◽  
James P. Reddington ◽  
Bingqing Zhao ◽  
Eileen E. M. Furlong ◽  
...  

Abstract While it is established that the functional impact of genetic variation can vary across cell types and states, capturing this diversity remains challenging. Current studies using bulk sequencing either ignore this heterogeneity or use sorted cell populations, reducing discovery and explanatory power. Here, we develop scDALI, a versatile computational framework that integrates information on cellular states with allelic quantifications of single-cell sequencing data to characterize cell-state-specific genetic effects. We apply scDALI to scATAC-seq profiles from developing F1 Drosophila embryos and scRNA-seq from differentiating human iPSCs, uncovering heterogeneous genetic effects in specific lineages, developmental stages, or cell types.

