Accurate assembly of minority viral haplotypes from next-generation sequencing through efficient noise reduction

2018 ◽  
Author(s):  
Sergey Knyazev ◽  
Viachaslau Tsyvina ◽  
Anupama Shankar ◽  
Andrew Melnyk ◽  
Alexander Artyomenko ◽  
...  

ABSTRACT Rapidly evolving RNA viruses continuously produce minority haplotypes that can become dominant if they are drug-resistant or can better evade the immune system. Therefore, early detection and identification of minority viral haplotypes may help to promptly adjust the patient's treatment plan, preventing potential disease complications. Minority haplotypes can be identified using next-generation sequencing (NGS), but sequencing noise hinders accurate identification. The elimination of sequencing noise is a non-trivial task that remains open. Here we propose CliqueSNV, a method based on extracting pairs of statistically linked mutations from noisy reads. This effectively reduces sequencing noise and enables the identification of minority haplotypes with frequencies below the sequencing error rate. We comparatively assess the performance of CliqueSNV using an in vitro mixture of nine haplotypes that were derived from the mutation profile of an existing HIV patient. We show that CliqueSNV can accurately assemble viral haplotypes with frequencies as low as 0.1% and maintains consistent performance across short- and long-read sequencing platforms.

2019 ◽  
Author(s):  
Xinyue You ◽  
Suresh Thiruppathi ◽  
Weiying Liu ◽  
Yiyi Cao ◽  
Mikihiko Naito ◽  
...  

ABSTRACT To improve the accuracy and the cost-efficiency of next-generation sequencing in ultralow-frequency mutation detection, we developed the Paired-End and Complementary Consensus Sequencing (PECC-Seq), a PCR-free duplex consensus sequencing approach. PECC-Seq employed shear points as endogenous barcodes to identify consensus sequences from the overlap in the shortened, complementary DNA strands-derived paired-end reads for sequencing error correction. With the high accuracy of PECC-Seq, we identified the characteristic base substitution errors introduced by the end-repair process of mechanical fragmentation-based library preparations, which were prominent at the terminal 6 bp of the library fragments in the 5’-NpCpA-3’ or 5’-NpCpT-3’ trinucleotide context. As demonstrated at the human genome scale (TK6 cells), after removing these potential end-repair artifacts from the terminal 6 bp, PECC-Seq could reduce the sequencing error frequency to mid-10⁻⁷ with a relatively low sequencing depth. For TA base pairs, the background error rate could be suppressed to mid-10⁻⁸. In mutagen-treated TK6, slight increases in mutagen treatment-related mutant frequencies could be detected, indicating the potential of PECC-Seq in detecting genome-wide ultra-rare mutations. In addition, our finding on the patterns of end-repair artifacts may provide new insights in further reducing technical errors not only for PECC-Seq, but also for other next-generation sequencing techniques.


GigaScience ◽  
2020 ◽  
Vol 9 (8) ◽  
Author(s):  
Marcela Sandoval-Velasco ◽  
Juan Antonio Rodríguez ◽  
Cynthia Perez Estrada ◽  
Guojie Zhang ◽  
Erez Lieberman Aiden ◽  
...  

Abstract Background: Hi-C experiments couple DNA-DNA proximity with next-generation sequencing to yield an unbiased description of genome-wide interactions. Previous methods describing Hi-C experiments have focused on the industry-standard Illumina sequencing. With new next-generation sequencing platforms such as BGISEQ-500 becoming more widely available, protocol adaptations to fit platform-specific requirements are useful to give increased choice to researchers who routinely generate sequencing data. Results: We describe an in situ Hi-C protocol adapted to be compatible with the BGISEQ-500 high-throughput sequencing platform. Using zebra finch (Taeniopygia guttata) as a biological sample, we demonstrate how Hi-C libraries can be constructed to generate informative data using the BGISEQ-500 platform, following circularization and DNA nanoball generation. Our protocol is a modification of an Illumina-compatible method, based around blunt-end ligations in library construction, using un-barcoded, distally overhanging double-stranded adapters, followed by amplification using indexed primers. The resulting libraries are ready for circularization and subsequent sequencing on the BGISEQ series of platforms and yield data similar to what can be expected using Illumina-compatible approaches. Conclusions: Our straightforward modification to an Illumina-compatible in situ Hi-C protocol enables data generation on the BGISEQ series of platforms, thus expanding the options available for researchers who wish to utilize the powerful Hi-C techniques in their research.


2016 ◽  
Vol 77 ◽  
pp. 139
Author(s):  
Zahra Kashi ◽  
Meagan Barner ◽  
Jenefer Dekoning ◽  
Gabriel Caceres ◽  
RaeAnna Neville ◽  
...  

2018 ◽  
Vol 56 (7) ◽  
pp. 1046-1053 ◽  
Author(s):  
Anne Bergougnoux ◽  
Valeria D’Argenio ◽  
Stefanie Sollfrank ◽  
Fanny Verneau ◽  
Antonella Telese ◽  
...  

Abstract Background: Many European laboratories offer molecular genetic analysis of the CFTR gene using a wide range of methods to identify mutations causative of cystic fibrosis (CF) and CFTR-related disorders (CFTR-RDs). Next-generation sequencing (NGS) strategies are widely used in diagnostic practice, and CE marking is now required for most in vitro diagnostic (IVD) tests in Europe. The aim of this multicenter study, which involved three European laboratories specialized in CF molecular analysis, was to evaluate the performance of Multiplicom’s CFTR MASTR Dx kit to obtain CE-IVD certification. Methods: A total of 164 samples, previously analyzed with well-established “reference” methods for the molecular diagnosis of the CFTR gene, were selected and re-sequenced using the Illumina MiSeq benchtop NGS platform. Sequencing data were analyzed using two different bioinformatic pipelines. Annotated variants were then compared to the previously obtained reference data. Results and conclusions: The analytical sensitivity, specificity and accuracy rates of the Multiplicom CFTR MASTR assay exceeded 99%. Because different types of CFTR mutations can be detected in a single workflow, the CFTR MASTR assay simplifies the overall process and is consequently well suited for routine diagnostics.


2020 ◽  
Vol 94 (9) ◽  
Author(s):  
Marilia Rita Pinzone ◽  
Maria Paola Bertuccio ◽  
D. Jake VanBelzen ◽  
Ryan Zurakowski ◽  
Una O’Doherty

ABSTRACT Next-generation sequencing (NGS) represents a powerful tool to unravel the genetic make-up of the HIV reservoir, but limited data exist on its use in vitro. Moreover, most NGS studies do not separate integrated from unintegrated DNA, even though selection pressures on these two forms should be distinct. We reasoned we could use NGS to compare the infection of resting and activated CD4 T cells in vitro to address how the metabolic state affects reservoir formation and dynamics. To address these questions, we obtained HIV sequences 2, 4, and 8 days after NL4-3 infection of metabolically activated and quiescent CD4 T cells (cultured with 2 ng/ml interleukin-7). We compared the composition of integrated and total HIV DNA by isolating integrated HIV DNA using pulsed-field electrophoresis before performing sequencing. After a single-round infection, the majority of integrated HIV DNA was intact in both resting and activated T cells. The decay of integrated intact proviruses was rapid and similar in both quiescent and activated T cells. Defective forms accumulated relative to intact ones analogously to what is observed in vivo. Massively deleted viral sequences formed more frequently in resting cells, likely due to lower deoxynucleoside triphosphate (dNTP) levels and the presence of multiple restriction factors. To our surprise, the majority of these deleted sequences did not integrate into the human genome. The use of NGS to study reservoir dynamics in vitro provides a model that recapitulates important aspects of reservoir dynamics. Moreover, separating integrated from unintegrated HIV DNA is important in some clinical settings to properly study selection pressures.
IMPORTANCE The major implication of our work is that the decay of intact proviruses in vitro is extremely rapid, perhaps as a result of enhanced expression. Gaining a better understanding of why intact proviruses decay faster in vitro might help the field identify strategies to purge the reservoir in vivo. When used wisely, in vitro models are a powerful tool to study the selective pressures shaping the viral landscape. Our finding that massively deleted sequences rarely succeed in integrating has several ramifications. It demonstrates that the total HIV DNA can differ substantially in character from the integrated HIV DNA under certain circumstances. The presence of unintegrated HIV DNA has the potential to obscure selection pressures and confound the interpretation of clinical studies, especially in the case of trials involving treatment interruptions.

