A systematic comparison of error correction enzymes by next-generation sequencing

Mapping Intimacies ◽

10.1101/100685 ◽

2017 ◽

Author(s):

Nathan B. Lubock ◽

Di Zhang ◽

George M. Church ◽

Sriram Kosuri

Keyword(s):

Next Generation Sequencing ◽

Error Correction ◽

Effective Means ◽

Cost Effective ◽

Error Rates ◽

Gene Synthesis ◽

Average Error ◽

Synthesis Methods ◽

Next Generation ◽

Generation Sequencing

Abstract: Gene synthesis, the process of assembling gene-length fragments from shorter groups of oligonucleotides (oligos), is becoming an increasingly important tool in molecular and synthetic biology. The length, quality, and cost of gene synthesis are limited by errors produced during oligo synthesis and subsequent assembly. Enzymatic error correction methods are a cost-effective means to ameliorate errors in gene synthesis. Previous analyses of these methods relied on cloning and Sanger sequencing to evaluate their efficiencies, limiting quantitative assessment and throughput. Here we develop a method to quantify errors in synthetic DNA by next-generation sequencing. We analyzed errors in a model gene assembly and systematically compared six different error correction enzymes across 11 conditions. We find that ErrASE and T7 Endonuclease I are the most effective at decreasing average error rates (up to 5.8-fold relative to the input), whereas MutS is the best for increasing the number of perfect assemblies (up to 25.2-fold). We are able to quantify differential specificities: for example, ErrASE preferentially corrects C/G → G/C transversions, whereas T7 Endonuclease I preferentially corrects A/T → T/A transversions. More generally, this experimental and computational pipeline is a fast, scalable, and extensible way to analyze errors in gene assemblies, to profile error correction methods, and to benchmark DNA synthesis methods.
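The two headline metrics in this abstract, the fold decrease in average error rate and the fold increase in perfect assemblies, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `error_stats`, the toy reads, and the simplification that every read is already aligned to the reference, has the same length, and contains only substitutions are all assumptions made for illustration.

```python
def error_stats(reads, reference):
    """Return (per-base error rate, fraction of perfect reads).

    Assumes each read is pre-aligned and the same length as the reference,
    so only substitutions are counted (a simplification for illustration).
    """
    total_errors = 0
    perfect = 0
    for read in reads:
        mismatches = sum(1 for a, b in zip(read, reference) if a != b)
        total_errors += mismatches
        if mismatches == 0:
            perfect += 1
    total_bases = len(reads) * len(reference)
    return total_errors / total_bases, perfect / len(reads)


# Hypothetical toy data: reads before and after an error correction step.
reference = "ACGT"
input_reads = ["ACGT", "ACGA", "TCGA", "ACGT"]
treated_reads = ["ACGT", "ACGT", "ACGA", "ACGT"]

input_rate, input_perfect = error_stats(input_reads, reference)
treated_rate, treated_perfect = error_stats(treated_reads, reference)

# Fold changes relative to the uncorrected input.
fold_error_decrease = input_rate / treated_rate
fold_perfect_increase = treated_perfect / input_perfect
```

The two fold changes need not move together, since one treatment may lower the average error rate while another yields more error-free molecules. That is consistent with the abstract reporting different best enzymes for each metric.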


Next-generation sequencing in the clinical genetic screening of patients with pheochromocytoma and paraganglioma

Endocrine Connections ◽

10.1530/ec-13-0009 ◽

2013 ◽

Vol 2 (2) ◽

pp. 104-111 ◽

Cited By ~ 32

Author(s):

Joakim Crona ◽

Alberto Delgado Verdugo ◽

Dan Granberg ◽

Staffan Welin ◽

Peter Stålberg ◽

...

Keyword(s):

Next Generation Sequencing ◽

Genetic Screening ◽

Bioinformatics Analysis ◽

Cost Effective ◽

Susceptibility Genes ◽

Next Generation ◽

Clinical Genetic ◽

Cost Effective Method ◽

Agilent Sureselect ◽

Generation Sequencing

Background: Recent findings have shown that up to 60% of pheochromocytomas (PCCs) and paragangliomas (PGLs) are caused by germline or somatic mutations in one of the 11 hitherto known susceptibility genes: SDHA, SDHB, SDHC, SDHD, SDHAF2, VHL, HIF2A (EPAS1), RET, NF1, TMEM127 and MAX. This list of genes is constantly growing, and the 11 genes together consist of 144 exons. A genetic screening test is extremely time-consuming and expensive. Hence, we introduce next-generation sequencing (NGS) as a time-efficient and cost-effective alternative.

Methods: Tumour lesions from three patients with apparently sporadic PCC were subjected to whole exome sequencing utilizing the Agilent SureSelect target enrichment system and the Illumina HiSeq platform. Bioinformatics analysis was performed in-house using commercially available software. Variants in PCC and PGL susceptibility genes were identified.

Results: We identified 16 unique genetic variants in PCC susceptibility loci in three different PCCs, with less than 30 min of hands-on, in-house time. Two patients had one unique variant each that was classified as probably and possibly pathogenic: NF1 Arg304Ter and RET Tyr791Phe. The RET variant was verified by Sanger sequencing.

Conclusions: NGS can serve as a fast and cost-effective method in the clinical genetic screening of PCC. The bioinformatics analysis may be performed without expert skills. We identified process optimization, characterization of unknown variants and determination of additive effects of multiple variants as key issues to be addressed by future studies.


An Empirical Evaluation of Error Correction Methods and Tools for Next Generation Sequencing Data

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2018.090158 ◽

2018 ◽

Vol 9 (1) ◽

Author(s):

Atif Mehmood ◽

Javed Ferzund ◽

Muhammad Usman ◽

Abbas Rehman ◽

Shahzad Ahmed ◽

...

Keyword(s):

Next Generation Sequencing ◽

Error Correction ◽

Empirical Evaluation ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing


PhredEM: A Phred-Score-Informed Genotype-Calling Approach for Next-Generation Sequencing Studies

10.1101/046136 ◽

2016 ◽

Author(s):

Peizhou Liao ◽

Glen A. Satten ◽

Yi-juan Hu

Keyword(s):

Logistic Regression ◽

Next Generation Sequencing ◽

Em Algorithm ◽

Error Rates ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Genotype Calling ◽

Sequencing Studies ◽

Generation Sequencing

ABSTRACT: A fundamental challenge in analyzing next-generation sequencing data is to determine an individual’s genotype correctly as the accuracy of the inferred genotype is essential to downstream analyses. Some genotype callers, such as GATK and SAMtools, directly calculate the base-calling error rates from phred scores or recalibrated base quality scores. Others, such as SeqEM, estimate error rates from the read data without using any quality scores. It is also a common quality control procedure to filter out reads with low phred scores. However, choosing an appropriate phred score threshold is problematic as a too-high threshold may lose data while a too-low threshold may introduce errors. We propose a new likelihood-based genotype-calling approach that exploits all reads and estimates the per-base error rates by incorporating phred scores through a logistic regression model. The algorithm, which we call PhredEM, uses the Expectation-Maximization (EM) algorithm to obtain consistent estimates of genotype frequencies and logistic regression parameters. We also develop a simple, computationally efficient screening algorithm to identify loci that are estimated to be monomorphic, so that only loci estimated to be non-monomorphic require application of the EM algorithm. We evaluate the performance of PhredEM using both simulated data and real sequencing data from the UK10K project. The results demonstrate that PhredEM is an improved, robust and widely applicable genotype-calling approach for next-generation sequencing studies. The relevant software is freely available.
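As a hedged illustration of the two ways of relating phred scores to error rates mentioned in this abstract, the sketch below contrasts the standard Phred definition, Q = -10·log10(P), with a generic logistic link. The function names and the coefficients `beta0` and `beta1` are hypothetical; the abstract does not give the exact form of the PhredEM model, so the logistic function here only shows the general idea of letting the data rescale the score.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Nominal error probability implied by the Phred definition Q = -10*log10(P)."""
    return 10 ** (-q / 10)


def logistic_error_prob(q: float, beta0: float, beta1: float) -> float:
    """Error probability from a logistic model in the Phred score.

    beta0 and beta1 are hypothetical coefficients; in an EM-style method
    they would be estimated from the reads rather than fixed in advance.
    """
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * q)))


# Nominal error probabilities for a few common quality scores.
nominal = {q: phred_to_error_prob(q) for q in (10, 20, 30)}

# With a negative slope, higher quality scores map to lower error probabilities.
fitted_low_q = logistic_error_prob(10, beta0=-1.0, beta1=-0.2)
fitted_high_q = logistic_error_prob(30, beta0=-1.0, beta1=-0.2)
```

Under the nominal definition a score of 20 corresponds to a 1% error probability. A fitted logistic curve can depart from that mapping when the reported scores are miscalibrated, without requiring a hard filtering threshold.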


Error filtering, pair assembly and error correction for next-generation sequencing reads

Bioinformatics ◽

10.1093/bioinformatics/btv401 ◽

2015 ◽

Vol 31 (21) ◽

pp. 3476-3482 ◽

Cited By ~ 465

Author(s):

Robert C. Edgar ◽

Henrik Flyvbjerg

Keyword(s):

Next Generation Sequencing ◽

Error Correction ◽

Next Generation ◽

Generation Sequencing ◽

Error Filtering


Validation of variants using cost effective high resolution melting (HRM) analysis predicted from target re-sequencing in Eucalyptus

Acta Botanica Croatica ◽

10.37427/botcro-2020-019 ◽

2020 ◽

Vol 79 (2) ◽

pp. 105-113

Author(s):

Abdul Bari Muneera Parveen ◽

Divya Lakshmanan ◽

Modhumita Ghosh Dasgupta

Keyword(s):

Next Generation Sequencing ◽

Large Scale ◽

Sequence Data ◽

Cost Effective ◽

Nucleotide Polymorphisms ◽

Next Generation ◽

Time Saving ◽

Hrm Analysis ◽

The Cost ◽

Generation Sequencing

The advent of next-generation sequencing has facilitated large-scale discovery and mapping of genomic variants for high-throughput genotyping. Several research groups working in tree species are presently employing next-generation sequencing (NGS) platforms for marker discovery, since it is a cost-effective and time-saving strategy. However, most trees lack a chromosome-level genome map, and validation of variants for downstream application becomes obligatory. The cost associated with identifying potential variants from the enormous amount of sequence data is a major limitation. In the present study, high resolution melting (HRM) analysis was optimized for rapid validation of single nucleotide polymorphisms (SNPs), insertions or deletions (InDels) and simple sequence repeats (SSRs) predicted from exome sequencing of parents and hybrids of Eucalyptus tereticornis Sm. × Eucalyptus grandis Hill ex Maiden generated from controlled hybridization. The cost per data point was less than 0.5 USD, providing great flexibility in terms of cost and sensitivity, when compared to other validation methods. The sensitivity of this technology in variant detection can be extended to other applications including Bar-HRM for species authentication and TILLING for detection of mutants.


A Rapid, High-Quality, Cost-Effective, Comprehensive and Expandable Targeted Next-Generation Sequencing Assay for Inherited Heart Diseases

Circulation Research ◽

10.1161/circresaha.115.306723 ◽

2015 ◽

Vol 117 (7) ◽

pp. 603-611 ◽

Cited By ~ 20

Author(s):

Kitchener D. Wilson ◽

Peidong Shen ◽

Eula Fung ◽

Ioannis Karakikes ◽

Angela Zhang ◽

...

Keyword(s):

Next Generation Sequencing ◽

Heart Diseases ◽

Cost Effective ◽

Next Generation ◽

High Quality ◽

Quality Cost ◽

Targeted Next Generation Sequencing ◽

Generation Sequencing


Next-Generation Sequencing Technologies in Blood Group Typing

Transfusion Medicine and Hemotherapy ◽

10.1159/000504765 ◽

2019 ◽

Vol 47 (1) ◽

pp. 4-13 ◽

Cited By ~ 1

Author(s):

Daniel Fürst ◽

Chrysanthi Tsamadou ◽

Christine Neuchel ◽

Hubert Schrezenmeier ◽

Joannis Mytilineos ◽

...

Keyword(s):

Next Generation Sequencing ◽

Blood Group ◽

Large Scale ◽

Cost Effective ◽

Molecular Testing ◽

Blood Group Antigens ◽

Next Generation ◽

Sequencing Technologies ◽

Blood Group Typing ◽

Generation Sequencing

Sequencing of the human genome has led to the definition of the genes for most of the relevant blood group systems, and the polymorphisms responsible for most of the clinically relevant blood group antigens are characterized. Molecular blood group typing is used in situations where erythrocytes are not available or where serological testing was inconclusive or not possible due to the lack of antisera. Also, molecular testing may be more cost-effective in certain situations. Molecular typing approaches are mostly based on either PCR with specific primers, DNA hybridization, or DNA sequencing. Particularly the transition of sequencing techniques from Sanger-based sequencing to next-generation sequencing (NGS) technologies has led to exciting new possibilities in blood group genotyping. We describe briefly the currently available NGS platforms and their specifications, depict the genetic background of blood group polymorphisms, and discuss applications for NGS approaches in immunohematology. As an example, we delineate a protocol for large-scale donor blood group screening established and in use at our institution. Furthermore, we discuss technical challenges and limitations as well as the prospect for future developments, including long-read sequencing technologies.


Enabling Precision Oncology Through Precision Diagnostics

Annual Review of Pathology Mechanisms of Disease ◽

10.1146/annurev-pathmechdis-012418-012735 ◽

2020 ◽

Vol 15 (1) ◽

pp. 97-121 ◽

Cited By ~ 4

Author(s):

Noah A. Brown ◽

Kojo S.J. Elenitoba-Johnson

Keyword(s):

Next Generation Sequencing ◽

Cancer Patients ◽

Cancer Genomics ◽

Cost Effective ◽

Routine Clinical Practice ◽

Precision Oncology ◽

Next Generation ◽

Sequencing Data ◽

Molecular Alterations ◽

Generation Sequencing

Genomic testing enables clinical management to be tailored to individual cancer patients based on the molecular alterations present within cancer cells. Genomic sequencing results can be applied to detect and classify cancer, predict prognosis, and target therapies. Next-generation sequencing has revolutionized the field of cancer genomics by enabling rapid and cost-effective sequencing of large portions of the genome. With this technology, precision oncology is quickly becoming a realized paradigm for managing the treatment of cancer patients. However, many challenges must be overcome to efficiently implement the transition of next-generation sequencing from research applications to routine clinical practice, including using specimens commonly available in the clinical setting; determining how to process, store, and manage large amounts of sequencing data; determining how to interpret and prioritize molecular findings; and coordinating health professionals from multiple disciplines.
