A systematic comparison of error correction enzymes by next-generation sequencing

2017 ◽  
Author(s):  
Nathan B. Lubock ◽  
Di Zhang ◽  
George M. Church ◽  
Sriram Kosuri

Abstract: Gene synthesis, the process of assembling gene-length fragments from shorter groups of oligonucleotides (oligos), is becoming an increasingly important tool in molecular and synthetic biology. The length, quality, and cost of gene synthesis are limited by errors produced during oligo synthesis and subsequent assembly. Enzymatic error correction methods are a cost-effective means of ameliorating errors in gene synthesis. Previous analyses of these methods relied on cloning and Sanger sequencing to evaluate their efficiencies, limiting quantitative assessment and throughput. Here we develop a method to quantify errors in synthetic DNA by next-generation sequencing. We analyzed errors in a model gene assembly and systematically compared six different error correction enzymes across 11 conditions. We find that ErrASE and T7 Endonuclease I are the most effective at decreasing average error rates (up to 5.8-fold relative to the input), whereas MutS is the best for increasing the number of perfect assemblies (up to 25.2-fold). We are able to quantify differential specificities: for example, ErrASE preferentially corrects C/G → G/C transversions, whereas T7 Endonuclease I preferentially corrects A/T → T/A transversions. More generally, this experimental and computational pipeline is a fast, scalable, and extensible way to analyze errors in gene assemblies, to profile error correction methods, and to benchmark DNA synthesis methods.
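The two metrics named in this abstract, average error rate and fraction of perfect assemblies, can be illustrated with a minimal sketch. It assumes reads are already aligned to the reference, have the same length, and differ only by substitutions. The function names and toy sequences are hypothetical and are not taken from the paper's pipeline or data.

```python
def error_rate(reads, reference):
    """Per-base error rate: mismatched bases divided by total bases.

    Assumes every read is aligned and the same length as the reference
    (substitutions only), which is a simplification for illustration.
    """
    mismatches = sum(
        sum(1 for r, ref in zip(read, reference) if r != ref)
        for read in reads
    )
    return mismatches / (len(reads) * len(reference))


def perfect_fraction(reads, reference):
    """Fraction of reads that match the reference exactly."""
    return sum(1 for read in reads if read == reference) / len(reads)


# Hypothetical toy data, not results from the study.
reference = "ACGT"
input_reads = ["ACGT", "ACGA", "TCGA", "ACGT"]
treated_reads = ["ACGT", "ACGT", "ACGT", "ACGA"]

# Fold decrease in error rate: input rate divided by treated rate.
fold_error_decrease = (
    error_rate(input_reads, reference) / error_rate(treated_reads, reference)
)
# Fold increase in perfect assemblies: treated fraction divided by input fraction.
fold_perfect_increase = (
    perfect_fraction(treated_reads, reference)
    / perfect_fraction(input_reads, reference)
)
```

Here the two fold changes are computed relative to the untreated input, mirroring how the abstract reports them; a real analysis would also need to handle insertions, deletions, and sequencing error.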

2013 ◽  
Vol 2 (2) ◽  
pp. 104-111 ◽  
Author(s):  
Joakim Crona ◽  
Alberto Delgado Verdugo ◽  
Dan Granberg ◽  
Staffan Welin ◽  
Peter Stålberg ◽  
...  

Background: Recent findings have shown that up to 60% of pheochromocytomas (PCCs) and paragangliomas (PGLs) are caused by germline or somatic mutations in one of the 11 hitherto known susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2, VHL, HIF2A (EPAS1), RET, NF1, TMEM127 and MAX. This list of genes is constantly growing, and the 11 genes together consist of 144 exons. A genetic screening test is extremely time-consuming and expensive. Hence, we introduce next-generation sequencing (NGS) as a time-efficient and cost-effective alternative. Methods: Tumour lesions from three patients with apparently sporadic PCC were subjected to whole exome sequencing using the Agilent SureSelect target enrichment system and the Illumina HiSeq platform. Bioinformatics analysis was performed in-house using commercially available software. Variants in PCC and PGL susceptibility genes were identified. Results: We identified 16 unique genetic variants in PCC susceptibility loci in three different PCCs, with less than 30 min of hands-on, in-house time. Two patients had one unique variant each that was classified as probably and possibly pathogenic: NF1 Arg304Ter and RET Tyr791Phe. The RET variant was verified by Sanger sequencing. Conclusions: NGS can serve as a fast and cost-effective method in the clinical genetic screening of PCC. The bioinformatics analysis may be performed without expert skills. We identified process optimization, characterization of unknown variants and determination of additive effects of multiple variants as key issues to be addressed by future studies.


2016 ◽  
Author(s):  
Peizhou Liao ◽  
Glen A. Satten ◽  
Yi-juan Hu

Abstract: A fundamental challenge in analyzing next-generation sequencing data is to determine an individual’s genotype correctly as the accuracy of the inferred genotype is essential to downstream analyses. Some genotype callers, such as GATK and SAMtools, directly calculate the base-calling error rates from phred scores or recalibrated base quality scores. Others, such as SeqEM, estimate error rates from the read data without using any quality scores. It is also a common quality control procedure to filter out reads with low phred scores. However, choosing an appropriate phred score threshold is problematic as a too-high threshold may lose data while a too-low threshold may introduce errors. We propose a new likelihood-based genotype-calling approach that exploits all reads and estimates the per-base error rates by incorporating phred scores through a logistic regression model. The algorithm, which we call PhredEM, uses the Expectation-Maximization (EM) algorithm to obtain consistent estimates of genotype frequencies and logistic regression parameters. We also develop a simple, computationally efficient screening algorithm to identify loci that are estimated to be monomorphic, so that only loci estimated to be non-monomorphic require application of the EM algorithm. We evaluate the performance of PhredEM using both simulated data and real sequencing data from the UK10K project. The results demonstrate that PhredEM is an improved, robust and widely applicable genotype-calling approach for next-generation sequencing studies. The relevant software is freely available.
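For orientation, the simpler approach that this abstract contrasts against, taking error rates directly from phred scores, can be sketched as below. It is not PhredEM: there is no logistic regression and no EM step. The sketch uses the standard phred definition Q = -10 log10(p) and a basic diploid, biallelic read model; the function names and the example reads are hypothetical, and phred scores are assumed to be greater than zero.

```python
import math


def phred_to_error(q):
    """Standard phred definition: Q = -10 * log10(error probability)."""
    return 10 ** (-q / 10)


def genotype_log_likelihoods(reads):
    """Log-likelihoods for genotypes carrying 0, 1, or 2 alternate alleles.

    reads: list of (is_alt, phred) tuples at one biallelic locus.
    Assumes independent reads and phred > 0 (illustrative model only).
    """
    logliks = []
    for g in (0, 1, 2):
        alt_frac = g / 2  # chance a read was sampled from an alt chromosome
        total = 0.0
        for is_alt, q in reads:
            e = phred_to_error(q)
            # A read shows alt if it came from alt and was read correctly,
            # or came from ref and was misread.
            p_alt = alt_frac * (1 - e) + (1 - alt_frac) * e
            total += math.log(p_alt if is_alt else 1 - p_alt)
        logliks.append(total)
    return logliks


def call_genotype(reads):
    """Return the alt-allele count (0, 1, or 2) with the highest likelihood."""
    ll = genotype_log_likelihoods(reads)
    return max(range(3), key=lambda g: ll[g])


# Hypothetical reads: three alt and three ref observations, all at Q30.
example_reads = [(True, 30), (False, 30)] * 3
example_call = call_genotype(example_reads)
```

Because every read contributes with its own error probability, low-quality reads are down-weighted rather than discarded, which is the motivation the abstract gives for avoiding a hard phred threshold.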


2020 ◽  
Vol 79 (2) ◽  
pp. 105-113
Author(s):  
Abdul Bari Muneera Parveen ◽  
Divya Lakshmanan ◽  
Modhumita Ghosh Dasgupta

The advent of next-generation sequencing has facilitated large-scale discovery and mapping of genomic variants for high-throughput genotyping. Several research groups working on tree species are presently employing next-generation sequencing (NGS) platforms for marker discovery, since it is a cost-effective and time-saving strategy. However, most trees lack a chromosome-level genome map, and validation of variants for downstream application becomes obligatory. The cost associated with identifying potential variants from the enormous amount of sequence data is a major limitation. In the present study, high resolution melting (HRM) analysis was optimized for rapid validation of single nucleotide polymorphisms (SNPs), insertions or deletions (InDels) and simple sequence repeats (SSRs) predicted from exome sequencing of parents and hybrids of Eucalyptus tereticornis Sm. × Eucalyptus grandis Hill ex Maiden generated from controlled hybridization. The cost per data point was less than 0.5 USD, providing great flexibility in terms of cost and sensitivity when compared to other validation methods. The sensitivity of this technology in variant detection can be extended to other applications, including Bar-HRM for species authentication and TILLING for detection of mutants.


2019 ◽  
Vol 47 (1) ◽  
pp. 4-13 ◽  
Author(s):  
Daniel Fürst ◽  
Chrysanthi Tsamadou ◽  
Christine Neuchel ◽  
Hubert Schrezenmeier ◽  
Joannis Mytilineos ◽  
...  

Sequencing of the human genome has led to the definition of the genes for most of the relevant blood group systems, and the polymorphisms responsible for most of the clinically relevant blood group antigens are characterized. Molecular blood group typing is used in situations where erythrocytes are not available or where serological testing was inconclusive or not possible due to the lack of antisera. Also, molecular testing may be more cost-effective in certain situations. Molecular typing approaches are mostly based on either PCR with specific primers, DNA hybridization, or DNA sequencing. Particularly the transition of sequencing techniques from Sanger-based sequencing to next-generation sequencing (NGS) technologies has led to exciting new possibilities in blood group genotyping. We describe briefly the currently available NGS platforms and their specifications, depict the genetic background of blood group polymorphisms, and discuss applications for NGS approaches in immunohematology. As an example, we delineate a protocol for large-scale donor blood group screening established and in use at our institution. Furthermore, we discuss technical challenges and limitations as well as the prospect for future developments, including long-read sequencing technologies.


Author(s):  
Noah A. Brown ◽  
Kojo S.J. Elenitoba-Johnson

Genomic testing enables clinical management to be tailored to individual cancer patients based on the molecular alterations present within cancer cells. Genomic sequencing results can be applied to detect and classify cancer, predict prognosis, and target therapies. Next-generation sequencing has revolutionized the field of cancer genomics by enabling rapid and cost-effective sequencing of large portions of the genome. With this technology, precision oncology is quickly becoming a realized paradigm for managing the treatment of cancer patients. However, many challenges must be overcome to efficiently implement the transition of next-generation sequencing from research applications to routine clinical practice, including using specimens commonly available in the clinical setting; determining how to process, store, and manage large amounts of sequencing data; determining how to interpret and prioritize molecular findings; and coordinating health professionals from multiple disciplines.

