Assembly-free single-molecule nanopore sequencing recovers complete virus genomes from natural microbial communities

2019 ◽  
Author(s):  
John Beaulaurier ◽  
Elaine Luo ◽  
John Eppley ◽  
Paul Den Uyl ◽  
Xiaoguang Dai ◽  
...  

Abstract Viruses are the most abundant biological entities on Earth, and play key roles in host ecology, evolution, and horizontal gene transfer. Despite recent progress in viral metagenomics, the inherent genetic complexity of virus populations still poses technical difficulties for recovering complete virus genomes from natural assemblages. To address these challenges, we developed an assembly-free, single-molecule nanopore sequencing approach enabling direct recovery of high-quality viral genome sequences from environmental samples. Our method yielded over a thousand high-quality, full-length draft virus genome sequences that could not be fully recovered using short-read assembly approaches applied to the same samples. Additionally, novel DNA sequences were discovered whose repeat structures, gene contents and concatemer lengths suggested that they represent phage-inducible chromosomal islands that were packaged as concatemers within phage particles. Our new approach provided novel insight into genome structures, population biology, and ecology of naturally occurring viruses and viral parasites.

2017 ◽  
Author(s):  
Devang Mehta ◽  
Matthias Hirsch-Hoffmann ◽  
Mariam Were ◽  
Andrea Patrignani ◽  
Hassan Were ◽  
...  

Abstract Deep-sequencing of virus isolates using short-read sequencing technologies is problematic since viruses are often present in complexes sharing a high degree of sequence identity. The full-length genomes of such highly similar viruses cannot be assembled accurately from short sequencing reads. We present a new method, CIDER-Seq (Circular DNA Enrichment Sequencing), which successfully generates accurate full-length virus genomes from individual sequencing reads with no sequence assembly required. CIDER-Seq operates by combining a PCR-free, circular DNA enrichment protocol with Single Molecule Real Time sequencing and a new sequence deconcatenation algorithm. We apply our technique to produce more than 1,200 full-length, highly accurate geminivirus genomes from RNAi-transgenic and control plants in a field trial in Kenya. Using CIDER-Seq we can demonstrate for the first time that the expression of antiviral double-stranded RNA (dsRNA) in transgenic plants causes a consistent shift in virus populations towards species sharing low homology to the transgene-derived dsRNA. Our results show that CIDER-Seq is a powerful, cost-effective tool for accurately sequencing circular DNA viruses, with future applications in deep-sequencing other forms of circular DNA such as transposons and plasmids.
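The abstract above mentions a sequence deconcatenation step but does not describe it. The following is only a minimal illustrative sketch of the general idea, not the CIDER-Seq algorithm: it assumes a read is a tandem concatemer of one genome unit and looks for the smallest shift at which the read matches itself. The function names, the `min_unit` and `max_mismatch` parameters, and the toy sequence are all hypothetical.

```python
from typing import Optional


def estimate_unit_length(read: str, min_unit: int = 4,
                         max_mismatch: float = 0.05) -> Optional[int]:
    """Return the smallest shift at which the read matches itself.

    Hypothetical helper: a tandem concatemer of a unit of length p
    satisfies read[i] == read[i + p] at (almost) every position.
    """
    n = len(read)
    for period in range(min_unit, n // 2 + 1):
        overlap = n - period
        mismatches = sum(
            1 for i in range(overlap) if read[i] != read[i + period]
        )
        if mismatches / overlap <= max_mismatch:
            return period
    return None  # no repeat structure found


def deconcatenate(read: str) -> str:
    """Return one copy of the repeated unit, or the read unchanged."""
    period = estimate_unit_length(read)
    return read if period is None else read[:period]


# Toy example (made-up sequence): three full copies plus a partial one.
unit = "ACGTTGCAAGCT"
concatemer = unit * 3 + unit[:5]
print(deconcatenate(concatemer))
```

A real implementation would need alignment-based comparison to tolerate insertions and deletions, which this position-by-position check does not handle.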


2012 ◽  
Vol 2012 ◽  
pp. 1-14 ◽  
Author(s):  
B. Karsten Tischer ◽  
Benedikt B. Kaufer

Maintenance and manipulation of large DNA and RNA virus genomes had presented an obstacle for virological research. BAC vectors provided a solution to both problems as they can harbor large DNA sequences and can efficiently be modified using well-established mutagenesis techniques in Escherichia coli. Numerous DNA virus genomes of herpesviruses and poxviruses were cloned into mini-F vectors. In addition, several reverse genetic systems for RNA viruses such as members of Coronaviridae and Flaviviridae could be established based on BAC constructs. Transfection into susceptible eukaryotic cells of virus DNA cloned as a BAC allows reconstitution of recombinant viruses. In this paper, we provide an overview of the strategies that can be used for the generation of virus BAC vectors and also of systems that are currently available for various virus species. Furthermore, we address common mutagenesis techniques that allow modification of BACs from single-nucleotide substitutions to deletion of viral genes or insertion of foreign sequences. Finally, we review the reconstitution of viruses from BAC vectors and the removal of the bacterial sequences from the virus genome during this process.


Author(s):  
Leho Tedersoo ◽  
Mads Albertsen ◽  
Sten Anslan ◽  
Benjamin Callahan

Short-read, high-throughput sequencing (HTS) methods have yielded numerous important insights into microbial ecology and function. Yet, in many instances short-read HTS techniques are suboptimal, for example by providing insufficient phylogenetic resolution or low integrity of assembled genomes. Single-molecule and synthetic long-read (SLR) HTS methods have successfully ameliorated these limitations. In addition, nanopore sequencing has generated a number of unique analysis opportunities such as rapid molecular diagnostics and direct RNA sequencing, and both PacBio and nanopore sequencing support detection of epigenetic modifications. Although initially suffering from relatively low sequence quality, recent advances have greatly improved the accuracy of long read sequencing technologies. In spite of great technological progress in recent years, the long-read HTS methods (PacBio and nanopore sequencing) are still relatively costly, require large amounts of high-quality starting material, and commonly need specific solutions in various analysis steps. Despite these challenges, long-read sequencing technologies offer high-quality, cutting-edge alternatives for testing hypotheses about microbiome structure and functioning as well as assembly of eukaryote genomes from complex environmental DNA samples.


GigaScience ◽  
2019 ◽  
Vol 8 (7) ◽  
Author(s):  
Jing Yang ◽  
Hafiz Muhammad Wariss ◽  
Lidan Tao ◽  
Rengang Zhang ◽  
Quanzheng Yun ◽  
...  

Abstract Background Acer yangbiense is a newly described critically endangered endemic maple tree confined to Yangbi County in Yunnan Province in Southwest China. It was included in a programme for rescuing the most threatened species in China, focusing on “plant species with extremely small populations (PSESP)”. Findings We generated 64, 94, and 110 Gb of raw DNA sequences and obtained a chromosome-level genome assembly of A. yangbiense through a combination of Pacific Biosciences Single-molecule Real-time, Illumina HiSeq X, and Hi-C mapping, respectively. The final genome assembly is ∼666 Mb, with 13 chromosomes covering ∼97% of the genome and scaffold N50 sizes of 45 Mb. Further, BUSCO analysis recovered 95.5% complete BUSCO genes. The total number of repetitive elements account for 68.0% of the A. yangbiense genome. Genome annotation generated 28,320 protein-coding genes, assisted by a combination of prediction and transcriptome sequencing. In addition, a nearly 1:1 orthology ratio of dot plots of longer syntenic blocks revealed a similar evolutionary history between A. yangbiense and grape, indicating that the genome has not undergone a whole-genome duplication event after the core eudicot common hexaploidization. Conclusion Here, we report a high-quality de novo genome assembly of A. yangbiense, the first genome for the genus Acer and the family Aceraceae. This will provide fundamental conservation genomics resources, as well as representing a new high-quality reference genome for the economically important Acer lineage and the wider order of Sapindales.
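The scaffold N50 quoted above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. As a minimal sketch of how it is usually computed (the function name and the toy lengths are illustrative and not taken from the A. yangbiense assembly):

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of positive sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for positive lengths")


# Toy lengths: total 20, so the walk stops once 10 bases are covered.
print(n50([2, 3, 4, 5, 6]))
```

With the toy input, the two longest sequences (6 and 5) already cover 11 of 20 bases, so the statistic is the length of the second one.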


2021 ◽  
Author(s):  
Kazuharu Misawa

SARS-CoV-2 is the cause of the worldwide epidemic of severe acute respiratory syndrome. Evolutionary studies of the virus genome will provide a predictor of the fate of COVID-19 in the near future. Recent studies of the virus genomes have shown that C to U substitutions are overrepresented in the genome sequences of SARS-CoV-2. Traditional time-reversible substitution models cannot be applied to the evolution of SARS-CoV-2 sequences. Therefore, in this study, I propose a new time-irreversible model and a new method for estimating the nucleotide substitution rate of SARS-CoV-2. Computer simulations showed that the new method gives good estimates. I applied the new method to estimate nucleotide substitution rates of SARS-CoV-2 sequences. The result suggests that the rate of C to U substitution of SARS-CoV-2 is ten times higher than other types of substitutions.
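For context, time-reversibility of a substitution model is conventionally stated as the detailed-balance condition below. This is the textbook definition, written with assumed notation (rate matrix entries q and stationary frequencies π); the abstract does not give the parameterization of the author's model, so nothing here should be read as that model.

```latex
% Assumed notation: q_{ij} is the instantaneous rate of substitution
% from nucleotide i to nucleotide j, and \pi_i is the stationary
% frequency of nucleotide i.
\[
  \pi_i \, q_{ij} \;=\; \pi_j \, q_{ji}
  \qquad \text{for all } i \neq j .
\]
```

A time-irreversible model drops this constraint, so that, for example, the C to U rate and the U to C rate are not tied to each other through the stationary frequencies.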


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii406-iii406
Author(s):  
Julien Masliah-Planchon ◽  
Elodie Girard ◽  
Philipp Euskirchen ◽  
Christine Bourneix ◽  
Delphine Lequin ◽  
...  

Abstract Medulloblastoma (MB) can be classified into four molecular subgroups (WNT group, SHH group, group 3, and group 4). The gold standard for assigning molecular subgroup through DNA methylation profiling uses the Illumina EPIC array. However, this tool has limitations in cost and turnaround time when results are needed soon enough for clinical use. We present an alternative DNA methylation assay based on nanopore sequencing for rapid, cheaper, and reliable subgrouping of clinical MB samples. Low-depth whole-genome long-read single-molecule nanopore sequencing was used to simultaneously assess copy number profile and MB subgrouping based on DNA methylation. The DNA methylation data generated by nanopore sequencing were compared to a publicly available reference cohort comprising over 2,800 brain tumors including the four subgroups of MB (Capper et al. Nature; 2018) to generate a score that estimates the confidence of a tumor group assignment. Among the 24 MB analyzed with nanopore sequencing (six WNT, nine SHH, five group 3, and four group 4), all were classified in the appropriate subgroup established by expression-based Nanostring subgrouping. In addition to the subgrouping, we also examined the genomic profile: all previously identified clinically relevant genomic rearrangements (mostly MYC and MYCN amplifications) were also detected with our assay. In conclusion, we confirm the full reliability of nanopore sequencing as a novel rapid and cheap assay for methylation-based MB subgrouping. We now plan to extend this technology to other embryonal tumors of the central nervous system.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Jean-Marc Aury ◽  
Benjamin Istace

Abstract Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (on the order of kilobases to megabases) and then, using efficient algorithms, providing high-quality assemblies in terms of contiguity and completeness of repetitive regions. However, the error rate of long-read technologies is higher than that of short-read technologies. This has a direct consequence on the base quality of genome assemblies, particularly in coding regions where sequencing errors can disrupt the coding frame of genes. In the case of diploid genomes, the consensus of a given gene can be a mixture of the two haplotypes and can lead to premature stop codons. Several methods have been developed to polish genome assemblies using short reads; generally, they inspect nucleotides one by one and provide a correction for each nucleotide of the input assembly. As a result, these algorithms are not able to properly process diploid genomes and they typically switch from one haplotype to another. Herein we propose Hapo-G (Haplotype-Aware Polishing Of Genomes), a new algorithm capable of incorporating phasing information from high-quality reads (short or long reads) to polish genome assemblies, in particular assemblies of diploid and heterozygous genomes.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ratanond Koonchanok ◽  
Swapna Vidhur Daulatabad ◽  
Quoseena Mir ◽  
Khairi Reda ◽  
Sarath Chandra Janga

Abstract Background Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. Result Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in a Fast5 format, cluster sequences based on electric-current similarities, and drill-down onto signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~ 500k reads from direct RNA sequencing data of human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that characterize these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. Conclusions Sequoia’s interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sten Ilmjärv ◽  
Fabien Abdul ◽  
Silvia Acosta-Gutiérrez ◽  
Carolina Estarellas ◽  
Ioannis Galdadas ◽  
...  

AbstractThe D614G mutation in the Spike protein of the SARS-CoV-2 has effectively replaced the early pandemic-causing variant. Using pseudotyped lentivectors, we confirmed that the aspartate replacement by glycine in position 614 is markedly more infectious. Molecular modelling suggests that the G614 mutation facilitates transition towards an open state of the Spike protein. To explain the epidemiological success of D614G, we analysed the evolution of 27,086 high-quality SARS-CoV-2 genome sequences from GISAID. We observed striking coevolution of D614G with the P323L mutation in the viral polymerase. Importantly, the exclusive presence of G614 or L323 did not become epidemiologically relevant. In contrast, the combination of the two mutations gave rise to a viral G/L variant that has all but replaced the initial D/P variant. Our results suggest that the P323L mutation, located in the interface domain of the RNA-dependent RNA polymerase, is a necessary alteration that led to the epidemiological success of the present variant of SARS-CoV-2. However, we did not observe a significant correlation between reported COVID-19 mortality in different countries and the prevalence of the Wuhan versus G/L variant. Nevertheless, when comparing the speed of emergence and the ultimate predominance in individual countries, it is clear that the G/L variant displays major epidemiological supremacy over the original variant.


Author(s):  
Hsin-Chih Yeh ◽  
Christopher M. Puleo ◽  
Yi-Ping Ho ◽  
Tza-Huei Wang

In this report, we review several single-molecule detection (SMD) methods and newly developed nanocrystal-mediated single-fluorophore strategies for ultrasensitive and specific analysis of genomic sequences. These include techniques, such as quantum dot (QD)-mediated fluorescence resonance energy transfer (FRET) technology and dual-color fluorescence coincidence and colocalization analysis, which allow separation-free detection of low-abundance DNA sequences and mutational analysis of oncogenes. Microfluidic approaches developed for use with single-molecule detection to achieve rapid, low-volume, and quantitative analysis of nucleic acids, such as electrokinetic manipulation of single molecules and confinement of sub-nanoliter samples using microfluidic networks integrated with valves, are also discussed.

