Improved Annotation with de novo Transcriptome Assembly in Four Social Amoeba Species


10.1101/054536 ◽

2016 ◽

Author(s):

Reema Singh ◽

Hajara M. Lawal ◽

Christina Schilde ◽

Gernot Glöeckner ◽

Geoff J. Barton ◽

...

Keyword(s):

Genome Sequencing ◽

Empirical Data ◽

Genome Annotation ◽

De Novo ◽

Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome ◽

Novel Genes ◽

Gene Models ◽

First Time

ABSTRACT

Background: Annotation of gene models and transcripts is a fundamental step in genome sequencing projects. This is often performed with automated prediction pipelines, which can miss complex and atypical genes or transcripts. RNA-seq data can aid annotation by supplying empirical evidence. Here we present de novo transcriptome assemblies generated from RNA-seq data in four Dictyostelid species: D. discoideum, P. pallidum, D. fasciculatum and D. lacteum. The assemblies were integrated with existing gene models to determine corrections and improvements on a whole-genome scale. This is the first time this has been performed in these eukaryotic species.

Results: An initial de novo transcriptome assembly was generated by Trinity for each species and then refined with the Program to Assemble Spliced Alignments (PASA). Completeness and quality were assessed with the Core Eukaryotic Genes Mapping Approach (CEGMA) and Transrate tools at each stage of the assemblies. The final datasets of 11,315-12,849 transcripts contained 5,610-7,712 updates and corrections to >50% of existing gene models, including changes to hundreds or thousands of protein products. Putative novel genes were also identified, and alternative splice isoforms were observed for the first time in P. pallidum, D. lacteum and D. fasciculatum.

Conclusions: By taking a whole-transcriptome approach to genome annotation with empirical data, we have been able to enrich the annotations of four existing genome sequencing projects. In doing so we have identified updates to the majority of the gene annotations across all four species under study and found putative novel genes and transcripts that could be worthy of follow-up. The new transcriptome data we present here will be a valuable resource for genome curators in the Dictyostelia, and we propose this effective methodology for use in other genome annotation projects.


First de-novo transcriptome assembly of a South American frog, Oreobates cruralis, enables population genomic studies of Neotropical amphibians

10.7287/peerj.preprints.2980v1 ◽

2017 ◽

Author(s):

Santiago Montero-Mendieta ◽

Manfred Grabherr ◽

Henrik Lantz ◽

Ignacio De la Riva ◽

Jennifer A Leonard ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Transcriptome Assembly ◽

Cost Effective ◽

Model Organisms ◽

South American ◽

Whole Genome ◽

Rna Seq ◽

De Novo Transcriptome

Whole-genome sequencing is opening the door to novel insights into the population structure and evolutionary history of poorly known species. In organisms with large genomes, which include most amphibians, whole-genome sequencing is excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome to facilitate assembly, so the transcriptome sequence must be assembled de-novo. We used RNA-seq to obtain the transcriptome profile for Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to the immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a workflow to assist with pre-processing, assembling, evaluating and functionally annotating a de-novo transcriptome from RNA-seq data of non-model organisms. Our workflow guides the inexperienced user in an intuitive way through all the necessary steps to build de-novo transcriptome assemblies using readily available software and is freely available at: https://github.com/biomendi/PRACTICAL-GUIDE-TO-BUILD-DE-NOVO-TRANSCRIPTOME-ASSEMBLIES-FOR-NON-MODEL-ORGANISMS/wiki
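To put the counts in this abstract in proportion, the short sketch below computes the share of putative genes annotated in each database and the average number of transcripts per putative gene. It is only illustrative arithmetic on the quoted figures: the variable and function names are hypothetical, and it assumes the 422,999 putative genes are the denominator for every database, which the abstract implies ("Of those") but does not state outright.

```python
# Counts quoted in the abstract; names below are illustrative, not from the paper.
TRANSCRIPTS = 550_871
PUTATIVE_GENES = 422_999
ANNOTATED = {"Pfam": 23_500, "EggNOG": 37_349, "KEGG": 38_120, "GO": 45_885}


def annotation_fractions(annotated, total):
    """Return the fraction of `total` genes annotated in each database."""
    return {db: n / total for db, n in annotated.items()}


# Assumption: every database count is a subset of the putative genes.
fractions = annotation_fractions(ANNOTATED, PUTATIVE_GENES)
transcripts_per_gene = TRANSCRIPTS / PUTATIVE_GENES

for db, frac in fractions.items():
    print(f"{db}: {frac:.1%} of putative genes")
print(f"transcripts per putative gene: {transcripts_per_gene:.2f}")
```

The per-database shares cannot simply be added together, since a single gene may be annotated in more than one database.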


RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem

PLoS ONE ◽

10.1371/journal.pone.0125722 ◽

2015 ◽

Vol 10 (5) ◽

pp. e0125722 ◽

Cited By ~ 8

Author(s):

Yuli Li ◽

Xiliang Wang ◽

Tingting Chen ◽

Fuwen Yao ◽

Cuiping Li ◽

...

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

Gene Discovery ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome ◽

Cistanche Deserticola


A consensus approach to vertebrate de novo transcriptome assembly from RNA-seq data: assembly of the duck (Anas platyrhynchos) transcriptome

Frontiers in Genetics ◽

10.3389/fgene.2014.00190 ◽

2014 ◽

Vol 5 ◽

Cited By ~ 24

Author(s):

Joanna Moreton ◽

Stephen P. Dunham ◽

Richard D. Emes

Keyword(s):

De Novo ◽

Anas Platyrhynchos ◽

Transcriptome Assembly ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

Consensus Approach ◽

De Novo Transcriptome


De novo transcriptome assembly of RNA-Seq reads with different strategies

Science China Life Sciences ◽

10.1007/s11427-011-4256-9 ◽

2011 ◽

Vol 54 (12) ◽

pp. 1129-1133 ◽

Cited By ~ 11

Author(s):

Geng Chen ◽

KangPing Yin ◽

Charles Wang ◽

TieLiu Shi

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome


rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data

10.1101/420208 ◽

2018 ◽

Cited By ~ 13

Author(s):

Elena Bushmanova ◽

Dmitry Antipov ◽

Alla Lapidus ◽

Andrey D. Prjibelski

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

Model Organisms ◽

Challenging Problem ◽

Rna Seq ◽

De Novo Transcriptome ◽

Weak Points ◽

Transcriptome Reconstruction ◽

Evaluation Approaches ◽

Genome Assembler

Abstract

Summary: The ability to generate large RNA-seq datasets has led to the development of various reference-based and de novo transcriptome assemblers, each with its own strengths and limitations. While reference-based tools are widely used in transcriptomic studies, their application is limited to model organisms with finished and annotated genomes. De novo transcriptome reconstruction from short reads remains an open and challenging problem, complicated by varying expression levels across genes, alternative splicing and paralogous genes. In this paper we describe a novel transcriptome assembler called rnaSPAdes, which is developed on top of the SPAdes genome assembler and explores surprising computational parallels between the assembly of transcriptomes and single-cell genomes. We also present quality assessment reports for rnaSPAdes assemblies, compare it with modern transcriptome assembly tools using several evaluation approaches on various RNA-Seq datasets, and briefly highlight the strong and weak points of different assemblers.

Availability and implementation: rnaSPAdes is implemented in C++ and Python and is freely available at cab.spbu.ru/software/rnaspades/.


Corrections to “IsoTree: A New Framework for de novo Transcriptome Assembly from RNA-seq Reads”

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2020.3005267 ◽

2020 ◽

Vol 17 (6) ◽

pp. 2197-2197

Author(s):

Jin Zhao ◽

Haodi Feng ◽

Daming Zhu ◽

Chi Zhang ◽

Ying Xu

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome ◽

New Framework


RNA-Seq analysis and de novo transcriptome assembly of Cry toxin susceptible and tolerant Achaea janata larvae

Scientific Data ◽

10.1038/s41597-019-0160-0 ◽

2019 ◽

Vol 6 (1) ◽

Author(s):

Narender K. Dhania ◽

Vinod K. Chauhan ◽

R. K. Chaitanya ◽

Aparna Dutta-Gupta

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome ◽

Cry Toxin ◽

Achaea Janata


Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies

10.7287/peerj.preprints.2284 ◽

2016 ◽

Author(s):

Cédric Cabau ◽

Frédéric Escudié ◽

Anis Djari ◽

Yann Guiguen ◽

Julien Bobe ◽

...

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

Error Rates ◽

Rna Seq ◽

De Novo Transcriptome ◽

Software Packages ◽

Redundancy Reduction ◽

Assembly Pipeline ◽

Free Open Source

Background: De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even when their results are of good quality, they can still be improved in several ways, including redundancy reduction and error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (the number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved.

Results: We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results with respect to the above-mentioned criteria. DRAP reduces the number of resulting contigs by 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts.

Conclusion: Transcriptome assembly is a challenging computational task. Although good solutions are already available to end-users, they can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces a compact and corrected transcript set. DRAP is free, open-source and available at http://www.sigenae.org/drap .
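The compaction measure described in this abstract, the number of contigs needed to represent the transcriptome, can be illustrated with a minimal sketch that counts contigs in two FASTA files and reports the fold reduction between them. This is not DRAP's code: the function names are hypothetical, the sequences are toy data, and the sketch assumes one `>` header line per contig.

```python
import io


def count_contigs(fasta_handle):
    """Count sequences in a FASTA stream by counting '>' header lines."""
    return sum(1 for line in fasta_handle if line.startswith(">"))


def fold_reduction(n_before, n_after):
    """Return how many times fewer contigs remain after compaction."""
    if n_after <= 0:
        raise ValueError("compacted assembly must contain at least one contig")
    return n_before / n_after


# Toy assemblies standing in for a raw and a compacted contig set (hypothetical data).
raw = io.StringIO(">c1\nACGT\n>c2\nACGA\n>c3\nTTGA\n>c4\nGGCA\n>c5\nGGCC\n>c6\nTTTT\n")
compacted = io.StringIO(">c1\nACGT\n>c3\nTTGA\n>c4\nGGCA\n>c6\nTTTT\n")

n_raw = count_contigs(raw)
n_compacted = count_contigs(compacted)
fold = fold_reduction(n_raw, n_compacted)
print(f"{n_raw} -> {n_compacted} contigs ({fold:.1f}-fold reduction)")
```

With real assemblies, the `io.StringIO` objects would be replaced by opened FASTA files. A fold reduction on its own says nothing about quality, which is why the abstract pairs it with read realignment rate and protein reconstruction counts.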


RNA-Seq Analysis Using De Novo Transcriptome Assembly as a Reference for the Salmon Louse Caligus rogercresseyi

PLoS ONE ◽

10.1371/journal.pone.0092239 ◽

2014 ◽

Vol 9 (4) ◽

pp. e92239 ◽

Cited By ~ 47

Author(s):

Cristian Gallardo-Escárate ◽

Valentina Valenzuela-Muñoz ◽

Gustavo Nuñez-Acuña

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

De Novo Transcriptome Assembly ◽

Rna Seq ◽

De Novo Transcriptome ◽

Salmon Louse
