Single-nucleotide variant proportion in genes: a new concept to explore major depression based on DNA sequencing data

Chenglong Yu; Bernhard T Baune; Julio Licinio; Ma-Li Wong

doi:10.1038/jhg.2017.2

To aggregate or not, that is the question. A commentary on single-nucleotide variant proportion in genes: a new concept to explore major depression based on DNA sequencing data

Journal of Human Genetics ◽

10.1038/jhg.2017.7 ◽

2017 ◽

Vol 62 (5) ◽

pp. 523-523

Author(s):

Jurg Ott

Keyword(s):

Major Depression ◽

DNA Sequencing ◽

Single Nucleotide Variant ◽

Sequencing Data ◽

Single Nucleotide

Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries

Synthetic Biology ◽

10.1093/synbio/ysz025 ◽

2019 ◽

Vol 4 (1) ◽

Cited By ~ 4

Author(s):

Andrew Currin ◽

Neil Swainston ◽

Mark S Dunstan ◽

Adrian J Jervis ◽

Paul Mulherin ◽

...

Keyword(s):

Synthetic Biology ◽

DNA Sequencing ◽

Cost Effective ◽

Polymorphism Analysis ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Synthetic DNA ◽

Design Build ◽

Hardware Costs

Abstract: Synthetic biology utilizes the Design–Build–Test–Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, which generate a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, with expensive hardware costs and bespoke data analysis requirements, few laboratories utilize this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel sample barcoding system using polymerase chain reaction is introduced, and sequencing data are analyzed through a bespoke analysis algorithm. Crucially, this algorithm overcomes the problem of high-error-rate nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96 well plates) were processed in a single workflow in 72 h (from Escherichia coli colonies to analyzed data). Given our procedure’s low hardware costs and highly multiplexed capability, this provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, single nucleotide polymorphism analysis and gene synthesis.
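
The abstract above mentions statistical analysis of strand bias without giving details. As a rough illustration only, the sketch below applies a two-sided Fisher exact test to forward/reverse read counts at one candidate position. The function names, the 2x2 table layout and the `alpha` threshold are assumptions made for this example; this is not the authors' algorithm.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def pmf(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def is_strand_biased(fwd_alt, fwd_ref, rev_alt, rev_ref, alpha=0.01):
    """Hypothetical filter: flag a candidate variant whose alternate-allele
    support differs significantly between forward and reverse reads."""
    return fisher_exact_two_sided(fwd_alt, fwd_ref, rev_alt, rev_ref) < alpha


# A variant seen on both strands at similar rates is kept; one supported
# by forward reads only is flagged as a likely systematic error.
balanced = is_strand_biased(20, 30, 22, 28)
one_sided = is_strand_biased(25, 25, 0, 50)
```

The design assumption here is that a true variant should appear on both strands at similar frequency, whereas a systematic sequencing error tends to be strand-specific.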

MQuad enables clonal substructure discovery using single cell mitochondrial variants

10.1101/2021.03.27.437331 ◽

2021 ◽

Author(s):

Aaron Wing Cheung Kwok ◽

Chen Qiao ◽

Rongting Huang ◽

Mai-Har Sham ◽

Joshua W. K. Ho ◽

...

Keyword(s):

DNA Sequencing ◽

Single Cell ◽

Single Cells ◽

High Sensitivity ◽

Copy Number Variations ◽

Sequencing Data ◽

Single Nucleotide ◽

Single Cell Sequencing ◽

mtDNA Variants ◽

Python Package

Abstract: Mitochondrial mutations are increasingly recognised as informative endogenous genetic markers that can be used to reconstruct cellular clonal structure using single-cell RNA or DNA sequencing data. However, there is a lack of effective computational methods to identify informative mtDNA variants in noisy and sparse single-cell sequencing data. Here we present an open source computational tool MQuad that accurately calls clonally informative mtDNA variants in a population of single cells, and an analysis suite for complete clonality inference, based on single cell RNA or DNA sequencing data. Through a variety of simulated and experimental single cell sequencing data, we showed that MQuad can identify mitochondrial variants with both high sensitivity and specificity, outperforming existing methods by a large margin. Furthermore, we demonstrated its wide applicability in different single cell sequencing protocols, particularly in complementing single-nucleotide and copy-number variations to extract finer clonal resolution. MQuad is a Python package available via https://github.com/single-cell-genetics/MQuad.

A Method to Evaluate the Quality of Clinical Gene-Panel Sequencing Data for Single-Nucleotide Variant Detection

Journal of Molecular Diagnostics ◽

10.1016/j.jmoldx.2017.06.001 ◽

2017 ◽

Vol 19 (5) ◽

pp. 651-658 ◽

Cited By ~ 11

Author(s):

Chung Lee ◽

Joon S. Bae ◽

Gyu H. Ryu ◽

Nayoung K.D. Kim ◽

Donghyun Park ◽

...

Keyword(s):

Single Nucleotide Variant ◽

Gene Panel ◽

Sequencing Data ◽

Single Nucleotide ◽

Variant Detection ◽

Gene Panel Sequencing ◽

Panel Sequencing

Whole-genome single nucleotide variant distribution on genomic regions and its relationship to major depression

Psychiatry Research ◽

10.1016/j.psychres.2017.02.041 ◽

2017 ◽

Vol 252 ◽

pp. 75-79 ◽

Cited By ~ 9

Author(s):

Chenglong Yu ◽

Bernhard T. Baune ◽

Julio Licinio ◽

Ma-Li Wong

Keyword(s):

Major Depression ◽

Single Nucleotide Variant ◽

Whole Genome ◽

Single Nucleotide ◽

Genomic Regions

Accurate and scalable variant calling from single cell DNA sequencing data with ProSolo

Nature Communications ◽

10.1038/s41467-021-26938-w ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

David Lähnemann ◽

Johannes Köster ◽

Ute Fischer ◽

Arndt Borkhardt ◽

Alice C. McHardy ◽

...

Keyword(s):

DNA Sequencing ◽

Single Cell ◽

Single Cells ◽

Variant Calling ◽

Sequencing Data ◽

Computationally Efficient ◽

Single Nucleotide Variants ◽

Efficient Manner ◽

Single Nucleotide ◽

Amplification Bias

Abstract: Accurate single cell mutational profiles can reveal genomic cell-to-cell heterogeneity. However, sequencing libraries suitable for genotyping require whole genome amplification, which introduces allelic bias and copy errors. The resulting data violate assumptions of variant callers developed for bulk sequencing. Thus, only dedicated models accounting for amplification bias and errors can provide accurate calls. We present ProSolo for calling single nucleotide variants from multiple displacement amplified (MDA) single cell DNA sequencing data. ProSolo probabilistically models a single cell jointly with a bulk sequencing sample and integrates all relevant MDA biases in a site-specific and scalable (because computationally efficient) manner. This achieves a higher accuracy in calling and genotyping single nucleotide variants in single cells in comparison to state-of-the-art tools and supports imputation of insufficiently covered genotypes, when downstream tools cannot handle missing data. Moreover, ProSolo implements the first approach to control the false discovery rate reliably and flexibly. ProSolo is implemented in an extendable framework, with code and usage at: https://github.com/prosolo/prosolo
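
The abstract above refers to controlling the false discovery rate over variant calls. One generic way to do this with a probabilistic caller is to rank calls by posterior error probability and accept the largest prefix whose mean error probability stays at or below the target. The sketch below illustrates that general idea under stated assumptions: the function name, the input format and the example values are hypothetical, and the abstract does not confirm that ProSolo uses this exact procedure.

```python
def control_fdr(calls, alpha=0.05):
    """Return the names of accepted calls at expected FDR <= alpha.

    calls: list of (name, posterior_error_probability) pairs (assumed format).
    """
    ranked = sorted(calls, key=lambda call: call[1])
    accepted, error_sum = [], 0.0
    for name, error_prob in ranked:
        # The mean error probability of the accepted set estimates its FDR.
        # Calls are sorted ascending, so the running mean never decreases
        # and stopping at the first violation is safe.
        if (error_sum + error_prob) / (len(accepted) + 1) > alpha:
            break
        error_sum += error_prob
        accepted.append(name)
    return accepted


# Hypothetical calls with made-up posterior error probabilities.
example_calls = [("chr1:100", 0.01), ("chr1:250", 0.02),
                 ("chr2:40", 0.30), ("chr3:7", 0.90)]
strict = control_fdr(example_calls, alpha=0.05)
relaxed = control_fdr(example_calls, alpha=0.12)
```

With these made-up values, the stricter threshold accepts the two most confident calls and the relaxed one admits a third.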

A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data

Computational and Structural Biotechnology Journal ◽

10.1016/j.csbj.2018.01.003 ◽

2018 ◽

Vol 16 ◽

pp. 15-24 ◽

Cited By ~ 96

Author(s):

Chang Xu

Keyword(s):

Next Generation Sequencing ◽

Variant Calling ◽

Single Nucleotide Variant ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Single Nucleotide ◽

Generation Sequencing

ProSolo: Accurate Variant Calling from Single Cell DNA Sequencing Data

10.1101/2020.04.27.064071 ◽

2020 ◽

Author(s):

David Lähnemann ◽

Johannes Köster ◽

Ute Fischer ◽

Arndt Borkhardt ◽

Alice C. McHardy ◽

...

Keyword(s):

DNA Sequencing ◽

Single Cell ◽

Single Cells ◽

Variant Calling ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Biologically Relevant ◽

Amplification Bias ◽

Missing Genotypes

Abstract: Obtaining accurate mutational profiles from single cell DNA is essential for the analysis of genomic cell-to-cell heterogeneity at the finest level of resolution. However, sequencing libraries suitable for genotyping require whole genome amplification, which introduces allelic bias and copy errors. As a result, single cell DNA sequencing data violate the assumptions of variant callers developed for bulk sequencing, which when applied to single cells generate significant numbers of false positives and false negatives. Only dedicated models accounting for amplification bias and errors will be able to provide more accurate calls. We present ProSolo, a probabilistic model for calling single nucleotide variants from multiple displacement amplified single cell DNA sequencing data. It introduces a mechanistically motivated empirical model of amplification bias that improves the quantification of genotyping uncertainty. To account for amplification errors, it jointly models the single cell sample with a bulk sequencing sample from the same cell population, which also enables a biologically relevant imputation of missing genotypes for the single cell. Through these innovations, ProSolo achieves substantially higher performance in calling and genotyping single nucleotide variants in single cells in comparison to all state-of-the-art tools. Moreover, ProSolo implements the first approach to control the false discovery rate reliably and flexibly; not only for single nucleotide variant calls, but also for artefacts of single cell methodology that one may wish to identify, such as allele dropout. ProSolo’s model is implemented in a flexible framework, encouraging extensions. The source code and usage instructions are available at: https://github.com/prosolo/prosolo

SECEDO: SNV-based subclone detection using ultra-low coverage single-cell DNA sequencing

10.1101/2021.11.08.467510 ◽

2021 ◽

Author(s):

Hana Rozhoňová ◽

Daniel Danciu ◽

Stefan Stark ◽

Gunnar Rätsch ◽

André Kahles ◽

...

Keyword(s):

DNA Sequencing ◽

Single Cell ◽

Variant Calling ◽

Bayesian Filtering ◽

Sequencing Data ◽

Single Nucleotide ◽

Sequencing Technologies ◽

The Cost ◽

Low Coverage ◽

Clonal Composition

Recently developed single-cell DNA sequencing technologies enable whole-genome, amplification-free sequencing of thousands of cells at the cost of ultra-low coverage of the sequenced data (<0.05x per cell), which mostly limits their usage to the identification of copy number alterations (CNAs) in multi-megabase segments. Aside from CNA-based subclone detection, single-nucleotide variant (SNV)-based subclone detection may contribute to a more comprehensive view on intra-tumor heterogeneity. Due to the low coverage of the data, the identification of SNVs is only possible when superimposing the sequenced genomes of hundreds of genetically similar cells. Here we present Single Cell Data Tumor Clusterer (SECEDO, lat. 'to separate'), a new method to cluster tumor cells based solely on SNVs, inferred on ultra-low coverage single-cell DNA sequencing data. The core aspects of the method are an efficient Bayesian filtering of relevant loci and the exploitation of read overlaps and phasing information. We applied SECEDO to a synthetic dataset simulating 7,250 cells and eight tumor subclones from a single patient, and were able to accurately reconstruct the clonal composition, detecting 92.11% of the somatic SNVs, with the smallest clusters representing only 6.9% of the total population. When applied to four real single-cell sequencing datasets from a breast cancer patient, SECEDO was able to recover the major clonal composition in each dataset at the original sequencing depth of 0.03x per cell, an 8-fold improvement relative to the state of the art. Variant calling on the resulting clusters recovered more than twice as many SNVs with double the allelic ratio compared to calling on all cells together, demonstrating the utility of SECEDO. SECEDO is implemented in C++ and is publicly available at https://github.com/ratschlab/secedo.
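
The abstract above notes that SNVs only become identifiable when the genomes of many genetically similar cells are superimposed. The sketch below illustrates that pooling step in isolation, summing per-locus read counts over the cells assigned to one cluster. The data layout, function names and cluster labels are assumptions for illustration; it does not reproduce SECEDO's Bayesian filtering or clustering.

```python
from collections import defaultdict


def pool_allele_counts(per_cell_counts, cluster_of, cluster):
    """Sum (ref, alt) read counts per locus over all cells in one cluster.

    per_cell_counts: {cell_id: {locus: (ref_reads, alt_reads)}} (assumed layout)
    cluster_of:      {cell_id: cluster_label}
    """
    pooled = defaultdict(lambda: [0, 0])
    for cell, loci in per_cell_counts.items():
        if cluster_of.get(cell) != cluster:
            continue
        for locus, (ref, alt) in loci.items():
            pooled[locus][0] += ref
            pooled[locus][1] += alt
    return {locus: tuple(counts) for locus, counts in pooled.items()}


def expected_pooled_depth(per_cell_depth, n_cells):
    """Approximate pseudo-bulk depth when n_cells cells are superimposed."""
    return per_cell_depth * n_cells


# Toy input: three cells, two clusters, mostly one read per covered locus.
counts = {
    "c1": {("chr1", 100): (1, 0)},
    "c2": {("chr1", 100): (0, 1), ("chr1", 200): (1, 0)},
    "c3": {("chr1", 100): (0, 1)},
}
clusters = {"c1": 0, "c2": 0, "c3": 1}
pooled_cluster0 = pool_allele_counts(counts, clusters, cluster=0)
depth = expected_pooled_depth(0.03, 500)
```

At the 0.03x per-cell depth quoted in the abstract, a hypothetical cluster of 500 cells would pool to roughly 15x, which is the range where conventional variant calling becomes feasible.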

Assessment of software for somatic single nucleotide variant identification using simulated whole-genome sequencing data of cancer

10.18699/mm-hpc-bbb-2018-25 ◽

2018 ◽

pp. 34-34

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Single Nucleotide Variant ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Single Nucleotide ◽

Variant Identification
