Clustering of Circular Consensus Sequences: Accurate Error Correction and Assembly of Single Molecule Real-Time Reads from Multiplexed Amplicon Libraries

2017 ◽  
Author(s):  
Felix Francis ◽  
Michael D. Dumas ◽  
Scott B. Davis ◽  
Randall J. Wisser

BACKGROUND: Targeted resequencing with high-throughput sequencing (HTS) platforms can be used to efficiently interrogate the genomes of large numbers of individuals. A critical challenge for research and applications using HTS data, especially from long-read platforms, is errors arising from technological limits and bioinformatic algorithms. RESULTS: A single molecule real-time (SMRT) sequencing-error correction and assembly pipeline, C3S-LAA, was developed for libraries of pooled amplicons. By uniquely leveraging the structure of SMRT sequence data (composed of multiple low quality subreads from which higher quality circular consensus sequences are formed) to cluster raw reads, C3S-LAA produced accurate consensus sequences and assemblies of overlapping amplicons from single sample and multiplexed libraries. In contrast, despite read depths in excess of 100X per amplicon, the standard long amplicon analysis module from Pacific Biosciences generated unexpected numbers of amplicon sequences with substantial inaccuracies in the consensus sequences. A bootstrap analysis showed that the C3S-LAA pipeline per se was effective at removing bioinformatic sources of error, but in rare cases a read depth of nearly 400X was not sufficient to overcome minor but systematic errors inherent to amplification or sequencing. CONCLUSIONS: C3S-LAA uses a novel processing algorithm for SMRT amplicon-sequence data that produces accurate consensus sequences and local sequence assemblies. The community standard long amplicon analysis module from Pacific Biosciences is prone to substantial errors that raise concerns about findings based on this pipeline. The method developed here removed this confounding bioinformatics source of error, allowing for the identification of limited instances of errors due to DNA amplification or sequencing.
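
The distinction the abstract above draws between random errors (which depth can average out) and systematic errors (which it cannot) can be illustrated with a toy per-column majority vote. This is only a sketch under strong assumptions: the reads are already aligned to equal length, and the function name `column_consensus` and the example reads are hypothetical. It is not the C3S-LAA algorithm or the Pacific Biosciences module.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads (toy model)."""
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to equal length")
    consensus = []
    for column in zip(*aligned_reads):
        # Pick the most frequent base in this alignment column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Random errors: each error hits a different position in a different read,
# so the majority at every column is still the true base.
random_error_reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]

# Systematic error: most reads carry the same wrong base at position 3,
# so adding more reads of this kind never restores the true base.
systematic_error_reads = ["ACCTAC", "ACCTAC", "ACCTAC", "ACGTAC", "ACGTAC"]

if __name__ == "__main__":
    print(column_consensus(random_error_reads))
    print(column_consensus(systematic_error_reads))
```

In this toy model the first call recovers the true sequence while the second reproduces the shared error, which is the qualitative behaviour the bootstrap analysis describes for rare cases at nearly 400X depth.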

Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 3752-3752 ◽  
Author(s):  
Catherine C. Smith ◽  
Michael Brown ◽  
Jason Chin ◽  
Corynn Kasap ◽  
Sara Salerno ◽  
...  

Abstract 3752. Background: Secondary kinase domain (KD) mutations are the most well-recognized mechanism of resistance to tyrosine kinase inhibitors (TKIs) in chronic myeloid leukemia (CML) and other cancers. In some cases, multiple drug resistant KD mutations can coexist in an individual patient (“polyclonality”). Alternatively, more than one mutation can occur in tandem on a single allele (“compound mutations”) following response and relapse to sequentially administered TKI therapy. Distinguishing between these two scenarios can inform the clinical choice of subsequent TKI treatment. There is currently no clinically adaptable methodology that offers the ability to distinguish polyclonal from compound mutations. Due to the size of the BCR-ABL KD where TKI-resistant mutations are detected, next-generation platforms are unable to generate reads of sufficient length to determine if two mutations separated by 500 nt reside on the same allele. Pacific Biosciences RS Single Molecule Real Time (SMRT) circular consensus sequencing technology is a novel third generation deep sequencing technology capable of rapidly and reliably achieving average read lengths of ∼1000bp (Travers et al, 2010) and frequently beyond 3000bp, allowing sequencing of the entire ABL KD on a single strand of DNA. We sought to address the ability of SMRT sequencing technology to distinguish polyclonal from compound mutations using clinical samples obtained from patients who have relapsed on BCR-ABL TKI treatment. Results: We analyzed an 863bp area of the BCR-ABL KD in 6 patients who had clinically relapsed on ABL kinase inhibitor therapy. SMRT sequencing detected mutations at a sensitivity of ∼1–2% of the total sequenced population, and successfully distinguished polyclonal from compound BCR-ABL KD mutations in several patient samples.
Results were largely consistent with those obtained by PCR subcloning and sequencing, although SMRT sequencing detected additional mutations and/or mutation combinations. In the most complex case, 7 distinct mutation-bearing alleles were detected in an individual patient after sequential relapse on imatinib and dasatinib. Mutant clones contained single and compound mutations combining distinct mutations (Y253H, T315F, T315A, T315I, T319A, E355G). Three distinct substitutions at residue T315 were detected: T315A, T315I and T315F. Notably, these findings are clinically important as the T315A mutation confers resistance to dasatinib but not imatinib, while the T315F and T315I mutations are resistant to all three clinically approved BCR/ABL inhibitors (imatinib, dasatinib, and nilotinib). Phospho-flow analysis for p-Crkl, a direct substrate of BCR-ABL, was conducted following ex vivo exposure of patient cells from the same time point to all three BCR-ABL inhibitors, and demonstrated the existence of distinct populations of cells with varying sensitivity to each drug (i.e. polyclonal drug sensitivity), underscoring the potential clinical importance of distinguishing polyclonal from compound mutations. Additionally, SMRT sequencing routinely detected alleles harboring compound mutations not detectable by conventional direct sequencing. Data analysis of samples from additional patients is ongoing and will be presented. Conclusions: Pacific Biosciences RS SMRT sequencing sensitively detects KD mutations in patient samples and can distinguish TKI-resistant clones containing compound mutations to reveal a complex mutational landscape in an individual patient not detectable by conventional sequencing. SMRT sequencing of the BCR-ABL KD can feasibly be developed into a rapid and economical clinical test with the additional advantages of increased sensitivity and reliability over current methods. 
Given the growing numbers of patients exposed to multiple TKIs in a sequential manner, the ability to accurately and sensitively characterize drug-resistant alleles promises to further facilitate a personalized approach to patient management. Disclosures: Brown: Pacific Biosciences: Employment. Chin: Pacific Biosciences: Employment. Travers: Pacific Biosciences: Employment. Wang: Pacific Biosciences: Employment. Kasarskis: Pacific Biosciences: Employment, Equity Ownership. Schadt: Pacific Biosciences: Employment, Equity Ownership.


2021 ◽  
Author(s):  
Evan J. Kipp ◽  
Laramie L. Lindsey ◽  
Benedict S. Khoo ◽  
Christopher Faulk ◽  
Jonathan D. Oliver ◽  
...  

Technological and computational advancements in the fields of genomics and bioinformatics are providing exciting new opportunities for pathogen discovery and surveillance. In particular, single-molecule nucleotide sequence data originating from Oxford Nanopore Technologies (ONT) sequencing platforms can be bioinformatically leveraged, in real-time, for enhanced biosurveillance of a vast array of zoonoses. The recently released nanopore adaptive sampling (NAS) pipeline facilitates immediate mapping of individual nucleotide molecules (i.e., DNA, cDNA, and RNA) to a given reference as each molecule is sequenced. User-defined thresholds then allow for the retention or rejection of specific molecules, informed by the real-time reference mapping results, as they are physically passing through a given sequencing nanopore. Here, we show how NAS can be used to selectively sequence entire genomes of bacterial tick-borne pathogens circulating in wild populations of the blacklegged tick vector, Ixodes scapularis. The NAS method provided a two-fold increase in targeted pathogen sequences, successfully enriching for Borrelia (Borreliella) burgdorferi s.s.; Borrelia (Borrelia) miyamotoi; Anaplasma phagocytophilum; and Ehrlichia muris eauclairensis genomic DNA within our I. scapularis samples. Our results indicate that NAS has strong potential for real-time sequence-based pathogen surveillance.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Thiruni N. Adikari ◽  
Nasir Riaz ◽  
Chathurani Sigera ◽  
Preston Leung ◽  
Braulio M. Valencia ◽  
...  

Current methods for dengue virus (DENV) genome amplification amplify parts of the genome in at least 5 overlapping segments and then combine the output to characterize a full genome. This process is laborious, costly and requires at least 10 primers per serotype, thus increasing the likelihood of PCR bias. We introduce an assay to amplify near full-length dengue virus genomes as intact molecules, sequence these amplicons with third generation “nanopore” technology without fragmenting and use the sequence data to differentiate within-host viral variants with a bioinformatics tool (Nano-Q). The new assay successfully generated near full-length amplicons from DENV serotypes 1, 2 and 3 samples which were sequenced with nanopore technology. Consensus DENV sequences generated by nanopore sequencing had over 99.5% pairwise sequence similarity to Illumina generated counterparts provided the coverage was > 100 with both platforms. Maximum likelihood phylogenetic trees generated from nanopore consensus sequences were able to reproduce the exact trees made from Illumina sequencing with a conservative 99% bootstrapping threshold (after 1000 replicates and 10% burn-in). Pairwise genetic distances of within-host variants identified with the Nano-Q tool were smaller than those of between-host variants, thus enabling the phylogenetic segregation of variants from the same host.
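
As a rough illustration of what a pairwise similarity figure such as the 99.5% quoted above measures, the sketch below computes percent identity between two sequences that are already aligned. The abstract does not state how similarity was calculated, so the gap handling here (columns where both sequences have a gap are skipped, a gap against a base counts as a mismatch) and the name `pairwise_identity` are assumptions for illustration only.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            # Column carries no information for this pair; skip it.
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    print(pairwise_identity("ACGTACGT", "ACGTACGA"))
```

At genome scale the same ratio applies: for an alignment of roughly 10,000 compared columns, a 99.5% threshold tolerates about 50 differing positions.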


2016 ◽  
Vol 113 (19) ◽  
pp. 5233-5238 ◽  
Author(s):  
Carl W. Fuller ◽  
Shiv Kumar ◽  
Mintu Porel ◽  
Minchen Chien ◽  
Arek Bibillo ◽  
...  

DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods.


2020 ◽  
Vol 48 (7) ◽  
pp. e42-e42 ◽  
Author(s):  
Justin C Rolando ◽  
Erik Jue ◽  
Jacob T Barlow ◽  
Rustem F Ismagilov

Isothermal amplification assays, such as loop-mediated isothermal amplification (LAMP), show great utility for the development of rapid diagnostics for infectious diseases because they have high sensitivity, pathogen-specificity and potential for implementation at the point of care. However, elimination of non-specific amplification remains a key challenge for the optimization of LAMP assays. Here, using chlamydia DNA as a clinically relevant target and high-throughput sequencing as an analytical tool, we investigate a potential mechanism of non-specific amplification. We then develop a real-time digital LAMP (dLAMP) with high-resolution melting temperature (HRM) analysis and use this single-molecule approach to analyze approximately 1.2 million amplification events. We show that single-molecule HRM provides insight into specific and non-specific amplification in LAMP that is difficult to deduce from bulk measurements. We use real-time dLAMP with HRM to evaluate differences between polymerase enzymes, the impact of assay parameters (e.g. time, rate or fluorescence intensity), and the effect of background human DNA. By differentiating true and false positives, HRM enables determination of the optimal assay and analysis parameters that lead to the lowest limit of detection (LOD) in a digital isothermal amplification assay.


2015 ◽  
Author(s):  
John F Mulley ◽  
Adam D Hargreaves

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.

