A new full-length virus genome sequencing method reveals that antiviral RNAi changes geminivirus populations in field-grown cassava


10.1101/168724 ◽

2017 ◽

Cited By ~ 1

Author(s):

Devang Mehta ◽

Matthias Hirsch-Hoffmann ◽

Mariam Were ◽

Andrea Patrignani ◽

Hassan Were ◽

...

Keyword(s):

Single Molecule ◽

Deep Sequencing ◽

Cost Effective ◽

Virus Genome ◽

Full Length ◽

DNA Viruses ◽

Circular DNA ◽

Sequencing Technologies ◽

Virus Genomes ◽

And Control

ABSTRACT Deep-sequencing of virus isolates using short-read sequencing technologies is problematic since viruses are often present in complexes sharing a high degree of sequence identity. The full-length genomes of such highly similar viruses cannot be assembled accurately from short sequencing reads. We present a new method, CIDER-Seq (Circular DNA Enrichment Sequencing), which successfully generates accurate full-length virus genomes from individual sequencing reads with no sequence assembly required. CIDER-Seq operates by combining a PCR-free, circular DNA enrichment protocol with Single Molecule Real Time sequencing and a new sequence deconcatenation algorithm. We apply our technique to produce more than 1,200 full-length, highly accurate geminivirus genomes from RNAi-transgenic and control plants in a field trial in Kenya. Using CIDER-Seq, we can demonstrate for the first time that the expression of antiviral double-stranded RNA (dsRNA) in transgenic plants causes a consistent shift in virus populations towards species sharing low homology to the transgene-derived dsRNA. Our results show that CIDER-Seq is a powerful, cost-effective tool for accurately sequencing circular DNA viruses, with future applications in deep-sequencing other forms of circular DNA such as transposons and plasmids.
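The abstract does not describe how the deconcatenation step works. As a rough illustration of the general idea only, the sketch below assumes a read that is an error-free concatemer of one circular genome and recovers a single genome unit by finding the smallest repeat period. The function names `smallest_period` and `deconcatenate` are hypothetical, and this is not the CIDER-Seq algorithm itself, which would have to tolerate sequencing errors.

```python
def smallest_period(read: str) -> int:
    """Return the smallest p such that read[i] == read[i + p] for every valid i.

    Assumption: the read is an exact (error-free) concatemer. Real long reads
    contain errors and would need an alignment-based comparison instead.
    """
    n = len(read)
    for p in range(1, n + 1):
        if all(read[i] == read[i + p] for i in range(n - p)):
            return p
    return n  # only reached for an empty read


def deconcatenate(read: str) -> str:
    """Return one repeat unit of the read (a rotation of the circular genome)."""
    return read[:smallest_period(read)]


if __name__ == "__main__":
    genome = "ATGCCGTA"                # toy circular genome (hypothetical)
    read = (genome * 3)[2:21]          # concatemer starting mid-genome
    unit = deconcatenate(read)
    print(unit, len(unit))
```

Because the genome is circular, the recovered unit may start at any position; checking that it occurs in the genome written twice in a row confirms it is a rotation of the original.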


ISOdb: A Comprehensive Database of Full-Length Isoforms Generated by Iso-Seq

International Journal of Genomics ◽

10.1155/2018/9207637 ◽

2018 ◽

Vol 2018 ◽

pp. 1-6 ◽

Cited By ~ 1

Author(s):

Shang-Qian Xie ◽

Yue Han ◽

Xiao-Zhou Chen ◽

Tai-Yu Cao ◽

Kai-Kai Ji ◽

...

Keyword(s):

Single Molecule ◽

Full Length ◽

Public Access ◽

Transcript Isoforms ◽

Sequencing Technologies ◽

Long Reads ◽

Depth Analysis ◽

Gene Level ◽

Long Read ◽

Full Length Transcript

The accurate landscape of transcript isoforms plays an important role in the understanding of gene function and gene regulation. However, building complete transcripts is very challenging for short reads generated using next-generation sequencing. Fortunately, isoform sequencing (Iso-Seq) using single-molecule sequencing technologies, such as PacBio SMRT, provides long reads spanning entire transcript isoforms which do not require assembly. Therefore, we have developed ISOdb, a comprehensive resource database for hosting and carrying out an in-depth analysis of Iso-Seq datasets and visualising the full-length transcript isoforms. The current version of ISOdb has collected 93 publicly available Iso-Seq samples from eight species and presents the samples at two levels: (1) sample level, including metainformation, long read distribution, isoform numbers, and alternative splicing (AS) events of each sample; (2) gene level, including the total isoforms, novel isoform number, novel AS number, and isoform visualisation of each gene. In addition, ISOdb provides a user interface on the website for uploading sample information to facilitate the collection and analysis of researchers’ datasets. Currently, ISOdb is the first repository that offers comprehensive resources and convenient public access for hosting, analysing, and visualising Iso-Seq data, and it is freely available.


Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data

GigaScience ◽

10.1093/gigascience/giaa061 ◽

2020 ◽

Vol 9 (6) ◽

Cited By ~ 1

Author(s):

Saber Hafezqorani ◽

Chen Yang ◽

Theodora Lo ◽

Ka Ming Nip ◽

René L Warren ◽

...

Keyword(s):

RNA Sequencing ◽

Single Molecule ◽

Rapid Development ◽

Cost Effective ◽

Third Generation ◽

Sequencing Data ◽

Complementary DNA ◽

Sequencing Technologies ◽

Analytical Tools ◽

Generation Sequencing

Abstract Background: Compared with second-generation sequencing technologies, third-generation single-molecule RNA sequencing has unprecedented advantages; the long reads it generates facilitate isoform-level transcript characterization. In particular, the Oxford Nanopore Technology sequencing platforms have become more popular in recent years owing to their relatively high affordability and portability compared with other third-generation sequencing technologies. To aid the development of analytical tools that leverage the power of this technology, simulated data provide a cost-effective solution with ground truth. However, a nanopore sequence simulator targeting transcriptomic data is not available yet. Findings: We introduce Trans-NanoSim, a tool that simulates reads with technical and transcriptome-specific features learnt from nanopore RNA-sequencing data. We comprehensively benchmarked Trans-NanoSim on direct RNA and complementary DNA datasets describing human and mouse transcriptomes. Through comparison against other nanopore read simulators, we show the unique advantage and robustness of Trans-NanoSim in capturing the characteristics of nanopore complementary DNA and direct RNA reads. Conclusions: As a cost-effective alternative to sequencing real transcriptomes, Trans-NanoSim will facilitate the rapid development of analytical tools for nanopore RNA-sequencing data. Trans-NanoSim and its pre-trained models are freely accessible at https://github.com/bcgsc/NanoSim.


Fitness-associated substitutions following failure of direct-acting antivirals assessed by deep sequencing of full-length hepatitis C virus genomes

Alimentary Pharmacology & Therapeutics ◽

10.1111/apt.16054 ◽

2020 ◽

Author(s):

Slim Fourati ◽

Christophe Rodriguez ◽

Alexandre Soulier ◽

Flora Donati ◽

Sabah Hamadat ◽

...

Keyword(s):

Hepatitis C Virus ◽

Hepatitis C ◽

Deep Sequencing ◽

Full Length ◽

Direct Acting Antivirals ◽

Direct Acting ◽

Virus Genomes


Full-length sequencing of circular DNA viruses and extrachromosomal circular DNA using CIDER-Seq

Nature Protocols ◽

10.1038/s41596-020-0301-0 ◽

2020 ◽

Vol 15 (5) ◽

pp. 1673-1689

Author(s):

Devang Mehta ◽

Luc Cornet ◽

Matthias Hirsch-Hoffmann ◽

Syed Shan-e-Ali Zaidi ◽

Hervé Vanderschuren

Keyword(s):

Full Length ◽

DNA Viruses ◽

Circular DNA


Single-molecule DNA sequencing of widely varying GC-content using nucleotide release, capture and detection in microdroplets

Nucleic Acids Research ◽

10.1093/nar/gkaa987 ◽

2020 ◽

Vol 48 (22) ◽

pp. e132-e132

Author(s):

Tim J Puchtler ◽

Kerr Johnson ◽

Rebecca N Palmer ◽

Emma L Talbot ◽

Lindsey A Ibbotson ◽

...

Keyword(s):

DNA Sequencing ◽

Single Molecule ◽

Direct Detection ◽

GC Content ◽

Cost Effective ◽

Epigenetic Modifications ◽

Fluorescence Signal ◽

Sequencing Platform ◽

Sequencing Technologies ◽

Lower Accuracy

Abstract Despite remarkable progress in DNA sequencing technologies there remains a trade-off between short-read platforms, having limited ability to sequence homopolymers, repeated motifs or long-range structural variation, and long-read platforms, which tend to have lower accuracy and/or throughput. Moreover, current methods do not allow direct readout of epigenetic modifications from a single read. With the aim of addressing these limitations, we have developed an optical electrowetting sequencing platform that uses step-wise nucleotide triphosphate (dNTP) release, capture and detection in microdroplets from single DNA molecules. Each microdroplet serves as a reaction vessel that identifies an individual dNTP based on a robust fluorescence signal, with the detection chemistry extended to enable detection of 5-methylcytosine. Our platform uses small reagent volumes and inexpensive equipment, paving the way to cost-effective single-molecule DNA sequencing, capable of handling widely varying GC-bias, and demonstrating direct detection of epigenetic modifications.


Analysis and comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome

10.21203/rs.2.19252/v2 ◽

2020 ◽

Author(s):

Jiawen Cui ◽

Nan Shen ◽

Zhaogeng Lu ◽

Guolu Xu ◽

Yuyao Wang ◽

...

Keyword(s):

RNA Sequencing ◽

Single Molecule ◽

Transcriptome Analysis ◽

Cost Effective ◽

Full Length ◽

Transcript Expression ◽

Raw Data ◽

Research Areas ◽

Simple Sequence Repeat Analysis ◽

Comprehensive Comparison

Abstract Background: The number of studies using third-generation sequencing from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) is rapidly increasing in many different research areas. Among them, plant full-length single-molecule transcriptome studies have mostly used PacBio sequencing, whereas ONT is rarely used. Therefore, in this study, we examined ONT RNA sequencing methods in plants. We performed a detailed evaluation of reads from PacBio, Nanopore direct cDNA (ONT Dc), and Nanopore PCR cDNA (ONT Pc) sequencing, including characteristics of raw data and identification of transcripts. In addition, matched Illumina data were generated for comparison. Results: ONT Pc showed overall better raw data quality, whereas PacBio generated longer read lengths. In the transcriptome analysis, PacBio and ONT Pc performed similarly in transcript identification, simple sequence repeat analysis, and long non-coding RNA prediction. PacBio was superior in identifying alternative splicing events, whereas ONT Pc could estimate transcript expression levels. Conclusions: This paper made a comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome; the results indicate that ONT Pc is more cost-effective for generating extremely long reads and can characterise the transcriptome as well as quantify transcript expression. Therefore, ONT Pc is a new cost-effective and worthwhile method for full-length single-molecule transcriptome analysis in plants.


Reference-free reconstruction and quantification of transcriptomes from Nanopore long-read sequencing

10.1101/2020.02.08.939942 ◽

2020 ◽

Author(s):

Ivan de la Rubia ◽

Joel A. Indi ◽

Silvia Carbonell-Sala ◽

Julien Lagarde ◽

M Mar Albà ◽

...

Keyword(s):

Single Molecule ◽

Reference Genome ◽

Simulated Data ◽

Cost Effective ◽

DNA Assembly ◽

Sequencing Data ◽

Consensus Sequences ◽

Sequencing Technologies ◽

Long Reads ◽

Long Read

Abstract Single-molecule long-read sequencing with Nanopore provides an unprecedented opportunity to measure transcriptomes from any sample [1–3]. However, current analysis methods rely on the comparison with a reference genome or transcriptome [2,4,5], or the use of multiple sequencing technologies [6,7], thereby precluding cost-effective studies in species with no genome assembly available, in individuals underrepresented in the existing reference, and for the discovery of disease-specific transcripts not directly identifiable from a reference genome. Methods for DNA assembly [8–10] cannot be directly transferred to transcriptomes since their consensus sequences lack the required interpretability for genes with multiple transcript isoforms. To address these challenges, we have developed RATTLE, the first tool to perform reference-free reconstruction and quantification of transcripts from Nanopore long reads. Using simulated data, isoform spike-ins, and sequencing data from tissues and cell lines, we demonstrate that RATTLE accurately determines transcript sequence and abundance, is comparable to reference-based methods, and shows saturation in the number of predicted transcripts with increasing number of input reads.


Analysis and comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome

10.21203/rs.2.19252/v1 ◽

2019 ◽

Author(s):

Jiawen Cui ◽

Nan Shen ◽

Zhaogeng Lu ◽

Guolu Xu ◽

Yuyao Wang ◽

...

Keyword(s):

RNA Sequencing ◽

Single Molecule ◽

Transcriptome Analysis ◽

Cost Effective ◽

Full Length ◽

Transcript Expression ◽

Raw Data ◽

Research Areas ◽

Simple Sequence Repeat Analysis ◽

Comprehensive Comparison

Abstract Background: The number of studies using third-generation sequencing from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) is rapidly increasing in many different research areas. Among them, plant full-length single-molecule transcriptome studies have mostly used PacBio sequencing, whereas ONT is rarely used. Therefore, in this study, we examined ONT RNA sequencing methods in plants. We performed a detailed evaluation of reads from PacBio, Nanopore direct cDNA (ONT Dc), and Nanopore PCR cDNA (ONT Pc) sequencing, including characteristics of raw data and identification of transcripts. In addition, matched Illumina data were generated for comparison. Results: ONT Pc showed overall better raw data quality, whereas PacBio generated longer read lengths. In the transcriptome analysis, PacBio and ONT Pc performed similarly in transcript identification, simple sequence repeat analysis, and long non-coding RNA prediction. PacBio was superior in identifying alternative splicing events, whereas ONT Pc could estimate transcript expression levels. Conclusions: This paper made a comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome; the results indicate that ONT Pc is more cost-effective for generating extremely long reads and can characterise the transcriptome as well as quantify transcript expression. Therefore, ONT Pc is a new cost-effective and worthwhile method for full-length single-molecule transcriptome analysis in plants.


HySA: A Hybrid Structural variant Assembly approach using next generation and single-molecule sequencing technologies

10.1101/069815 ◽

2016 ◽

Cited By ~ 2

Author(s):

Xian Fan ◽

Mark Chaisson ◽

Luay Nakhleh ◽

Ken Chen

Keyword(s):

Human Genome ◽

Single Molecule ◽

Clustering Algorithm ◽

Hydatidiform Mole ◽

Cost Effective ◽

Next Generation ◽

Structural Variations ◽

Single Molecule Sequencing ◽

Structural Variant ◽

Sequencing Technologies

Abstract Achieving complete, accurate and cost-effective assembly of the human genome is of great importance for realizing the promises of precision medicine. The abundance of repeats and genetic variations in the human genome and the limitations of existing sequencing technologies call for the development of novel assembly methods that could leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates sequencing reads from next generation sequencing (NGS) and single-molecule sequencing (SMS) technologies to accurately assemble and detect structural variations (SV) in the human genome. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole genome assembly problem into a set of independent SV assembly problems, each of which can be effectively solved to enhance assembly of structurally altered regions in the human genome. In testing our approach using data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878), we found that our approach substantially improved the detection of many types of SVs, particularly novel large insertions, small INDELs (10-50 bp) and short tandem repeat expansions and contractions, over existing approaches, with a low false discovery rate. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery.
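The abstract names a bipartite-graph-based clustering step but gives no details. The sketch below shows one simple way such a step could look, under stated assumptions: NGS reads and SMS reads form the two node sets, an edge links a pair of reads judged homologous, and each connected component becomes one independent cluster. The function name `cluster_reads`, the read identifiers, and the choice of connected components are illustrative assumptions, not HySA's actual procedure.

```python
from collections import defaultdict


def cluster_reads(edges):
    """Group reads into clusters given (ngs_read_id, sms_read_id) homology edges.

    Assumption: a cluster is a connected component of the bipartite graph.
    Nodes are tagged with their technology so identical IDs cannot collide.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for ngs_id, sms_id in edges:
        union(("ngs", ngs_id), ("sms", sms_id))

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical edges: n1 and n2 both match s1; n3 matches s2.
    edges = [("n1", "s1"), ("n2", "s1"), ("n3", "s2")]
    for cluster in cluster_reads(edges):
        print(cluster)
```

Each returned cluster could then be handed to a separate local assembly, which mirrors the abstract's idea of turning one whole-genome problem into a set of independent ones.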


Extensive Homologous Recombination among Widely Divergent TT Viruses

Journal of Virology ◽

10.1128/jvi.74.16.7666-7670.2000 ◽

2000 ◽

Vol 74 (16) ◽

pp. 7666-7670 ◽

Cited By ~ 35

Author(s):

Michael Worobey

Keyword(s):

Homologous Recombination ◽

Virus Genome ◽

Full Length ◽

Noncoding Region ◽

Coding Region ◽

TT Virus ◽

Virus Genomes ◽

Recombination Breakpoints

ABSTRACT Analyses of a collection of full-length TT virus genomes showed nearly half of them to be recombinant. The results were highly significant and revealed homologous recombination both within and among genotypes, often involving extremely divergent lineages. Recombination breakpoints were significantly more common in the noncoding region of the TT virus genome than in the coding region.
