Single-nucleotide variant proportion in genes: a new concept to explore major depression based on DNA sequencing data

2017 ◽  
Vol 62 (5) ◽  
pp. 577-580 ◽  
Author(s):  
Chenglong Yu ◽  
Bernhard T Baune ◽  
Julio Licinio ◽  
Ma-Li Wong
2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Andrew Currin ◽  
Neil Swainston ◽  
Mark S Dunstan ◽  
Adrian J Jervis ◽  
Paul Mulherin ◽  
...  

Abstract Synthetic biology utilizes the Design–Build–Test–Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, which generate a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, because of expensive hardware and bespoke data analysis requirements, few laboratories utilize this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel sample barcoding system using polymerase chain reaction is introduced, and sequencing data are analyzed through a bespoke analysis algorithm. Crucially, this algorithm overcomes the problem of high-error-rate nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96 well plates) were processed in a single workflow in 72 h (from Escherichia coli colonies to analyzed data). Given its low hardware costs and highly multiplexed capability, this procedure provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, single nucleotide polymorphism analysis and gene synthesis.
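
The abstract above does not say which statistic its algorithm uses for strand bias. As a minimal sketch, assuming a two-sided Fisher's exact test on forward/reverse read counts (a common choice, not confirmed as the authors' method), a candidate variant could be screened as follows. The function names, the count layout and the `alpha` threshold are hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def is_strand_biased(fwd_ref, rev_ref, fwd_alt, rev_alt, alpha=0.01):
    """Flag a candidate variant whose alternate-allele reads favour one strand.

    alpha is an illustrative threshold, not a value taken from the paper.
    """
    return fisher_exact_two_sided(fwd_ref, rev_ref, fwd_alt, rev_alt) < alpha
```

Under this assumption, a candidate whose alternate reads come from both strands in similar proportion to the reference reads is kept, while one supported by a single strand is treated as a likely systematic error.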


2021 ◽  
Author(s):  
Aaron Wing Cheung Kwok ◽  
Chen Qiao ◽  
Rongting Huang ◽  
Mai-Har Sham ◽  
Joshua W. K. Ho ◽  
...  

Abstract Mitochondrial mutations are increasingly recognised as informative endogenous genetic markers that can be used to reconstruct cellular clonal structure using single-cell RNA or DNA sequencing data. However, there is a lack of effective computational methods to identify informative mtDNA variants in noisy and sparse single-cell sequencing data. Here we present an open source computational tool, MQuad, that accurately calls clonally informative mtDNA variants in a population of single cells, and an analysis suite for complete clonality inference, based on single cell RNA or DNA sequencing data. Through a variety of simulated and experimental single cell sequencing data, we showed that MQuad can identify mitochondrial variants with both high sensitivity and specificity, outperforming existing methods by a large margin. Furthermore, we demonstrated its wide applicability in different single cell sequencing protocols, particularly in complementing single-nucleotide and copy-number variations to extract finer clonal resolution. MQuad is a Python package available via https://github.com/single-cell-genetics/MQuad.


2017 ◽  
Vol 19 (5) ◽  
pp. 651-658 ◽  
Author(s):  
Chung Lee ◽  
Joon S. Bae ◽  
Gyu H. Ryu ◽  
Nayoung K.D. Kim ◽  
Donghyun Park ◽  
...  

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David Lähnemann ◽  
Johannes Köster ◽  
Ute Fischer ◽  
Arndt Borkhardt ◽  
Alice C. McHardy ◽  
...  

Abstract Accurate single cell mutational profiles can reveal genomic cell-to-cell heterogeneity. However, sequencing libraries suitable for genotyping require whole genome amplification, which introduces allelic bias and copy errors. The resulting data violates assumptions of variant callers developed for bulk sequencing. Thus, only dedicated models accounting for amplification bias and errors can provide accurate calls. We present ProSolo for calling single nucleotide variants from multiple displacement amplified (MDA) single cell DNA sequencing data. ProSolo probabilistically models a single cell jointly with a bulk sequencing sample and integrates all relevant MDA biases in a site-specific and scalable (because computationally efficient) manner. This achieves a higher accuracy in calling and genotyping single nucleotide variants in single cells in comparison to state-of-the-art tools and supports imputation of insufficiently covered genotypes, when downstream tools cannot handle missing data. Moreover, ProSolo implements the first approach to control the false discovery rate reliably and flexibly. ProSolo is implemented in an extendable framework, with code and usage at: https://github.com/prosolo/prosolo
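
The abstract above does not describe how the false discovery rate is controlled. As a rough sketch of one generic Bayesian approach (an assumption, not ProSolo's documented procedure), calls can be ranked by posterior error probability and accepted for as long as the mean error probability of the accepted set stays at or below the target rate. The function name and inputs are hypothetical.

```python
def select_calls_at_fdr(posterior_error_probs, alpha=0.05):
    """Return indices of calls kept so that the expected FDR is at most alpha.

    posterior_error_probs[i] is the (assumed given) posterior probability
    that call i is wrong. The expected FDR of a set of calls is estimated
    as the mean of these probabilities over the set.
    """
    # Consider the most confident calls first.
    order = sorted(
        range(len(posterior_error_probs)),
        key=lambda i: posterior_error_probs[i],
    )
    kept = []
    running_sum = 0.0
    for i in order:
        running_sum += posterior_error_probs[i]
        # The running mean never decreases in sorted order, so the first
        # violation marks the end of the largest admissible set.
        if running_sum / (len(kept) + 1) > alpha:
            break
        kept.append(i)
    return kept
```

For example, with error probabilities `[0.01, 0.5, 0.02, 0.03, 0.9]` and `alpha=0.05`, the three most confident calls are kept, because adding the fourth would raise the mean error probability to 0.14.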


2020 ◽  
Author(s):  
David Lähnemann ◽  
Johannes Köster ◽  
Ute Fischer ◽  
Arndt Borkhardt ◽  
Alice C. McHardy ◽  
...  

Abstract Obtaining accurate mutational profiles from single cell DNA is essential for the analysis of genomic cell-to-cell heterogeneity at the finest level of resolution. However, sequencing libraries suitable for genotyping require whole genome amplification, which introduces allelic bias and copy errors. As a result, single cell DNA sequencing data violates the assumptions of variant callers developed for bulk sequencing, which when applied to single cells generate significant numbers of false positives and false negatives. Only dedicated models accounting for amplification bias and errors will be able to provide more accurate calls. We present ProSolo, a probabilistic model for calling single nucleotide variants from multiple displacement amplified single cell DNA sequencing data. It introduces a mechanistically motivated empirical model of amplification bias that improves the quantification of genotyping uncertainty. To account for amplification errors, it jointly models the single cell sample with a bulk sequencing sample from the same cell population, also enabling a biologically relevant imputation of missing genotypes for the single cell. Through these innovations, ProSolo achieves substantially higher performance in calling and genotyping single nucleotide variants in single cells in comparison to all state-of-the-art tools. Moreover, ProSolo implements the first approach to control the false discovery rate reliably and flexibly; not only for single nucleotide variant calls, but also for artefacts of single cell methodology that one may wish to identify, such as allele dropout. ProSolo’s model is implemented into a flexible framework, encouraging extensions. The source code and usage instructions are available at: https://github.com/prosolo/prosolo


2021 ◽  
Author(s):  
Hana Rozhoňová ◽  
Daniel Danciu ◽  
Stefan Stark ◽  
Gunnar Rätsch ◽  
André Kahles ◽  
...  

Recently developed single-cell DNA sequencing technologies enable whole-genome, amplification-free sequencing of thousands of cells at the cost of ultra-low coverage of the sequenced data (<0.05x per cell), which mostly limits their usage to the identification of copy number alterations (CNAs) in multi-megabase segments. Aside from CNA-based subclone detection, single-nucleotide variant (SNV)-based subclone detection may contribute to a more comprehensive view on intra-tumor heterogeneity. Due to the low coverage of the data, the identification of SNVs is only possible when superimposing the sequenced genomes of hundreds of genetically similar cells. Here we present Single Cell Data Tumor Clusterer (SECEDO, lat. 'to separate'), a new method to cluster tumor cells based solely on SNVs, inferred on ultra-low coverage single-cell DNA sequencing data. The core aspects of the method are an efficient Bayesian filtering of relevant loci and the exploitation of read overlaps and phasing information. We applied SECEDO to a synthetic dataset simulating 7,250 cells and eight tumor subclones from a single patient, and were able to accurately reconstruct the clonal composition, detecting 92.11% of the somatic SNVs, with the smallest clusters representing only 6.9% of the total population. When applied to four real single-cell sequencing datasets from a breast cancer patient, SECEDO was able to recover the major clonal composition in each dataset at the original sequencing depth of 0.03x per cell, an 8-fold improvement relative to the state of the art. Variant calling on the resulting clusters recovered more than twice as many SNVs with double the allelic ratio compared to calling on all cells together, demonstrating the utility of SECEDO. SECEDO is implemented in C++ and is publicly available at https://github.com/ratschlab/secedo.

