DNA Sequencing Error

Author(s):  
Rodger Staden
Viruses ◽  
2020 ◽  
Vol 12 (8) ◽  
pp. 801
Author(s):  
Deborah M. Leigh ◽  
Christopher Schefer ◽  
Carolina Cornejo

The MinION sequencer is increasingly being used for the detection and outbreak surveillance of pathogens due to its rapid throughput. For RNA viruses, MinION’s new direct RNA sequencing is the next significant development. Direct RNA sequencing studies are currently limited and comparisons of its diagnostic performance relative to different DNA sequencing approaches are lacking as a result. We sought to address this gap and sequenced six subtypes from the mycovirus CHV-1 using MinION’s direct RNA sequencing and DNA sequencing based on a targeted viral amplicon. Reads from both techniques could correctly identify viral presence and species using BLAST, though direct RNA reads were more frequently misassigned to closely related CHV species. De novo consensus sequences were error prone but suitable for viral species identification. However, subtype identification was less accurate from both reads and consensus sequences. This is due to the high sequencing error rate and the limited sequence divergence between some CHV-1 subtypes. Importantly, neither RNA nor amplicon sequencing reads could be used to obtain reliable intra-host variants. Overall, both sequencing techniques were suitable for virus detection, though limitations are present due to the error rate of MinION reads.


2017 ◽  
Author(s):  
Fanny Perraudeau ◽  
Sandrine Dudoit ◽  
James H. Bullard

DNA sequencing of PCR-amplified marker genes, especially but not limited to the 16S rRNA gene, is perhaps the most common approach for profiling microbial communities. Due to technological constraints of commonly available DNA sequencing, these approaches usually take the form of short reads sequenced from a narrow, targeted variable region, with a corresponding loss of taxonomic resolution relative to the full-length marker gene. We use Pacific Biosciences single-molecule, real-time circular consensus sequencing to sequence amplicons spanning the entire length of the 16S rRNA gene. However, this sequencing technology suffers from a high sequencing error rate that needs to be addressed in order to take full advantage of the longer sequence. Here, we present a method to model the sequencing error process using a generalized pair hidden Markov chain model and estimate bacterial abundances in microbial samples. We demonstrate, with simulated and real data, that our model and its associated estimation procedure are able to give accurate estimates at the species (or subspecies) level, and are more flexible than existing methods like SImple Non-Bayesian TAXonomy (SINTAX).


Author(s):  
Patrick D Schloss ◽  
Sarah L Westcott ◽  
Matthew L Jenior ◽  
Sarah K Highlander

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high-quality but short sequences and to significantly improve the design of their experiments. The tradeoff has been a decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The synthetic mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 2.16% to 0.32%. Unfortunately, this error rate was still 16 times higher than the error rate that has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the longer reads frequently provided better classification, the wider adoption of this approach for 16S rRNA gene sequencing is likely limited by its high sequencing error and low yield of sequencing data relative to the other available platforms.
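
To put the reported error rates in per-read terms, the following is a minimal sketch rather than anything taken from the study. The 1,500 bp read length is an assumption (a typical full-length 16S rRNA gene), errors are assumed to be independent and uniform across positions, and the reported 16-fold gap is read as a simple factor of 16.

```python
def expected_errors(error_rate: float, read_length: int) -> float:
    """Expected number of erroneous bases in a single read."""
    return error_rate * read_length


def error_free_fraction(error_rate: float, read_length: int) -> float:
    """Probability that a read has no errors, assuming independent errors."""
    return (1.0 - error_rate) ** read_length


READ_LENGTH = 1500  # assumption: approximate V1-V9 amplicon length in bp

raw_rate = 0.0216      # 2.16% before curation (from the abstract)
curated_rate = 0.0032  # 0.32% after curation (from the abstract)
# Implied short-read rate, taking the reported gap as a factor of 16.
short_read_rate = curated_rate / 16

for label, rate in [
    ("raw PacBio", raw_rate),
    ("curated PacBio", curated_rate),
    ("implied short-read", short_read_rate),
]:
    print(
        f"{label}: {rate:.4%} per base, "
        f"{expected_errors(rate, READ_LENGTH):.1f} expected errors per read, "
        f"{error_free_fraction(rate, READ_LENGTH):.2%} of reads error-free"
    )
```

Under these assumptions, even the curated rate leaves several expected errors in each full-length read, which is consistent with the abstract's caution about wider adoption.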


2021 ◽  
Author(s):  
Daniel Cooke ◽  
Gerton Lunter ◽  
David Wedge

We describe an extension to our variant calling tool, Octopus (https://github.com/luntergroup/octopus), for single-cell DNA sequencing data. Octopus jointly genotypes cells from a lineage, accounting for amplification stochasticity and sequencing error with a haplotype-based Bayesian model. Octopus is considerably more accurate at genotyping single cells than existing methods.


2013 ◽  
Vol 31 (15_suppl) ◽  
pp. e22041-e22041
Author(s):  
Andre Marziali

Background: Next Generation DNA Sequencing (NGS) is becoming the new standard for mutational profiling of tumour tissue, due to its flexibility, speed, and decreasing cost. While generally exceptional in performance, NGS suffers from a sequencing error rate of 0.1% – 1%, largely due to amplification-induced artifacts in its workflow. While this does not constitute a significant problem in application of NGS to sequencing of tumour tissue, it makes NGS impractical as a method to search for low abundance mutation signatures in plasma samples. Numerous publications have shown the presence of tumour signatures in the cell-free DNA (cfDNA) circulating in plasma, but concordance between the tumour signature and the plasma signature has been limited. This is likely due to limitations in the detection technologies used to search for cfDNA in plasma. To maximize concordance between plasma and tissue, it will be essential that sensitivities reaching 0.01% and below (as little as a single tumour mutant allele per sample) be achieved, and ideally that multiple mutational hot spots be analysed to maximize the chance of detection. Current technologies are incapable of such sensitivity over a large number of mutation loci. Methods: We have developed a novel electrophoretic method that can enrich nucleic acid samples over 1,000,000-fold for up to 100 somatic mutations, enabling reliable profiling of samples containing as little as 0.01% mutant. By enriching nucleic acid samples for specific targets prior to amplification and sequencing, we enable the use of NGS in plasma-based mutation detection and profiling. Results: We present technical and clinical data demonstrating highly sensitive multiplexed mutation detection in plasma and tissue samples, demonstrating 0.01% sensitivity over 45 somatic mutations per sample.
Conclusions: We have demonstrated a novel somatic mutation enrichment methodology that allows DNA sequencing to work beyond its usual limit of detection to accurately profile solid tumours by detecting their mutation signature in plasma, even when the tumour DNA is present in plasma at abundances below 0.01%.
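
The arithmetic behind this detection limit can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes that "N-fold enrichment" multiplies the mutant-to-wild-type ratio by N, and it treats the sequencing error rate as a flat background that a true mutant fraction must exceed to be called. The function names are hypothetical.

```python
def enriched_fraction(mutant_fraction: float, fold: float) -> float:
    """Mutant fraction after enrichment.

    Assumption: enrichment multiplies the mutant:wild-type ratio by `fold`.
    """
    if not 0.0 <= mutant_fraction < 1.0:
        raise ValueError("mutant_fraction must be in [0, 1)")
    ratio = mutant_fraction / (1.0 - mutant_fraction)
    enriched_ratio = ratio * fold
    return enriched_ratio / (1.0 + enriched_ratio)


def detectable(mutant_fraction: float, error_rate: float) -> bool:
    """Crude criterion: the mutant signal must exceed the error background."""
    return mutant_fraction > error_rate


plasma_fraction = 1e-4  # 0.01% mutant, the abundance named in the abstract
error_floor = 1e-3      # 0.1%, the low end of the quoted NGS error range
error_ceiling = 1e-2    # 1%, the high end of the quoted NGS error range

# Without enrichment the mutant signal sits below the error background.
print(detectable(plasma_fraction, error_floor))

# After a 1,000,000-fold enrichment the mutant allele dominates the sample.
after = enriched_fraction(plasma_fraction, 1_000_000)
print(f"{after:.4f}", detectable(after, error_ceiling))
```

Under this model, a 0.01% mutant is below even the most favourable error rate before enrichment, while after enrichment it makes up most of the sample, so the sequencing error rate no longer limits detection.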


2006 ◽  
Vol 17 (1) ◽  
pp. 193 ◽  
Author(s):  
Wei-Min ZHENG

Author(s):  
S.A.C. Gould ◽  
B. Drake ◽  
C.B. Prater ◽  
A.L. Weisenhorn ◽  
S.M. Lindsay ◽  
...  

The atomic force microscope (AFM) is an instrument that can be used to image many samples of interest in biology and medicine. Images of polymerized amino acids, polyalanine and polyphenylalanine demonstrate the potential of the AFM for revealing the structure of molecules. Images of the protein fibrinogen which agree with TEM images demonstrate that the AFM can provide topographical data on larger molecules. Finally, images of DNA suggest the AFM may soon provide an easier and faster technique for DNA sequencing. The AFM consists of a microfabricated SiO2 triangular shaped cantilever with a diamond tip affixed at the elbow to act as a probe. The sample is mounted on an electronically driven piezoelectric crystal. It is then placed in contact with the tip and scanned. The topography of the surface causes minute deflections in the 100 μm long cantilever which are detected using an optical lever.


2001 ◽  
Vol 28 (10) ◽  
pp. 549-554
Author(s):  
Ryan N. Cole ◽  
Stewart W. West ◽  
Christine L. Terrell ◽  
Glenn D. Roberts ◽  
Iftikhar Ahmed
