De novo transcriptome analysis of dermal tissue from the rough-skinned newt, Taricha granulosa, enables investigation of tetrodotoxin expression

2019 ◽  
Author(s):  
Haley C. Glass ◽  
Amanda D. Melin ◽  
Steven M. Vamosi

Abstract Background: Tetrodotoxin (TTX) is a potent neurotoxin used in anti-predator defense by several aquatic species, including the rough-skinned newt, Taricha granulosa. While several possible biological sources of newt TTX have been investigated, mounting evidence suggests a genetic, endogenous origin. We present here a de novo transcriptome assembly and annotation of dorsal skin samples from the tetrodotoxin-bearing species T. granulosa, to facilitate the study of putative genetic mechanisms of TTX expression. Findings: Approximately 211 million read pairs were assembled into 245,734 transcripts using the Trinity de novo assembly method. Of the assembled transcripts, we were able to annotate 34% by comparing them to databases of sequences with known functions, suggesting that many transcripts are unique to the rough-skinned newt. Our assembly has near-complete sequence information for an estimated 83% of genes based on Benchmarking Universal Single Copy Orthologs. We also utilized other comparative methods to assess the quality of our assembly. The T. granulosa assembly was compared with that of the Japanese fire-belly newt, Cynops pyrrhogaster, and they were found to share a total of 30,556 orthologous sequences (12.9% gene set). Conclusions: We provide a reference assembly for Taricha granulosa that will enable downstream differential expression and comparative transcriptomics analyses. This publicly available transcriptome assembly and annotation dataset will facilitate the investigation of a wide range of questions concerning amphibian adaptive radiation, and the elucidation of mechanisms of tetrodotoxin defense in Taricha granulosa and other TTX-bearing species.

2020 ◽  
Author(s):  
Maxim Ivanov ◽  
Albin Sandelin ◽  
Sebastian Marquardt

Abstract Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing amount of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data. Results: We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5' and 3' tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts. We reconstructed de novo the transcriptional landscape of wild type Arabidopsis thaliana seedlings as a proof-of-principle. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A. thaliana genome research. Conclusions: Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge of the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.


Author(s):  
Masanao Sato ◽  
Masahide Seki ◽  
Yutaka Suzuki ◽  
Shoko Ueki

Heterosigma akashiwo is a eukaryotic, cosmopolitan, unicellular alga (class: Raphidophyceae) that produces fish-killing blooms. There is substantial scientific and practical interest in the ecophysiological characteristics that determine its bloom dynamics and its adaptation to broad climate zones. Well-annotated genomic and genetic sequence information enables researchers to characterize organisms using modern molecular technology. As of December 2021, the chloroplast and mitochondrial genome sequences and transcriptome sequence assembly (TSA) datasets of limited size for H. akashiwo are available in the NCBI nucleotide database; more genetic information on the species will greatly enhance the progress of its biological characterization. Here, we conducted H. akashiwo RNA sequencing, a de novo transcriptome assembly (NCBI TSA ICRV01) of a large number of high-quality short-read sequences, and the functional annotation of predicted genes. Based on our transcriptome, we confirmed that the organism possesses genes that were predicted to function in phagocytosis, supporting the earlier observations of H. akashiwo bacterivory. Along with its capability for photosynthesis, the mixotrophy of H. akashiwo may partially explain its high adaptability to various environmental conditions. Our study will provide an important toehold for deciphering H. akashiwo ecophysiology at a molecular level.


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0247180
Author(s):  
Fu-Jin Wei ◽  
Saneyoshi Ueno ◽  
Tokuko Ujino-Ihara ◽  
Maki Saito ◽  
Yoshihiko Tsumura ◽  
...  

Sugi (Cryptomeria japonica D. Don) is an important conifer used for afforestation in Japan. As the genome of this species is 11 Gbp, it is too large to assemble within a short timeframe. Transcriptomics is one approach that can address this deficiency. Here we designed a workflow consisting of three stages to assemble the transcriptome de novo using Oases and Trinity. The three stages used were independent assembly, automatic and semi-manual integration, and refinement by filtering out potential contamination. We identified a set of 49,795 cDNAs and an equal number of translated proteins. According to the benchmark set by BUSCO, 87.01% of cDNAs identified were complete genes, and 78.47% were complete and single-copy genes. Compared to other full-length cDNA resources collected by Sanger and PacBio sequencers, the extent of the coverage in our dataset was the highest, indicating that these data can be safely used for further studies. When two tissue-specific libraries were compared, there were significant expression differences between male strobili and leaf and bark sets. Moreover, subtle expression differences between male-fertile and sterile libraries were detected. Orthologous genes from other model plants and conifer species were identified. We demonstrated that our transcriptome assembly output (CJ3006NRE) can serve as a reference transcriptome for future functional genomics and evolutionary biology studies.


2020 ◽  
Author(s):  
Michal Levin ◽  
Marion Scheibe ◽  
Falk Butter

Abstract Background: The process of identifying all coding regions in a genome is crucial for any study at the level of molecular biology, ranging from single-gene cloning to genome-wide measurements using RNA-Seq or mass spectrometry. While satisfactory annotation has been made feasible for well-studied model organisms through great efforts of big consortia, for most systems this kind of data is either absent or not adequately precise. Results: Combining in-depth transcriptome sequencing and high resolution mass spectrometry, we here use proteotranscriptomics to improve gene annotation of protein-coding genes in the Bombyx mori cell line BmN4 which is an increasingly used tool for the analysis of piRNA biogenesis and function. Using this approach we provide the exact coding sequence and evidence for more than 6,200 genes on the protein level. Furthermore using spatial proteomics, we establish the subcellular localization of thousands of these proteins. We show that our approach outperforms current Bombyx mori annotation attempts in terms of accuracy and coverage. Conclusions: We show that proteotranscriptomics is an efficient, cost-effective and accurate approach to improve previous annotations or generate new gene models. As this technique is based on de-novo transcriptome assembly, it provides the possibility to study any species also in the absence of genome sequence information for which proteogenomics would be impossible.



2020 ◽  
Author(s):  
Xi Liu ◽  
Frédéric Hérault ◽  
Christian Diot ◽  
Erwan Corre

Abstract Background: Common Pekin and Muscovy ducks and their intergeneric hinny and mule hybrids have different abilities for fatty liver production. RNA-Seq analyses of the liver of these different genetic types, fed ad libitum or overfed, would help to identify genes that respond differently to overfeeding. However, performing and comparing RNA-Seq analyses across different species is challenging. The goal of this study was to develop a relevant strategy for transcriptome analysis and comparison between different species. Results: Transcriptomes were first assembled with a reference-based approach. Important mapping biases were observed when heterologous mapping was conducted on the common duck reference genome, suggesting that this reference-based strategy was not suited to comparing the four different genetic types. De novo transcriptome assemblies were then performed using Trinity and Oases. Assemblies of transcriptomes were not relevant when more than a single genetic type was considered. Finally, single genetic type transcriptomes were assembled with DRAP into a mega-transcriptome. No bias was observed when reads from the different genetic types were mapped on this mega-transcriptome, and differences in gene expression between the four genetic types could be identified. Conclusions: Analyses using both reference-based and de novo transcriptome assemblies indicate good performance of the de novo approach for the analysis of gene expression in different species. It also allowed the identification of differences in responses to overfeeding between Pekin and Muscovy ducks and hinny and mule hybrids.


2016 ◽  
Author(s):  
Cédric Cabau ◽  
Frédéric Escudié ◽  
Anis Djari ◽  
Yann Guiguen ◽  
Julien Bobe ◽  
...  

Background: De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even if their results are of good quality, it is still possible to improve them in several ways, including redundancy reduction and error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved. Results: We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results regarding the above-mentioned criteria. DRAP reduces the number of resulting contigs 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts. Conclusion: Transcriptome assembly is a challenging computational task. Even though good solutions are already available to end users, they can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package for producing compact and corrected transcript sets. DRAP is free, open-source and available at http://www.sigenae.org/drap.


2017 ◽  
Author(s):  
Mariana B. Grizante ◽  
Marc Tollis ◽  
Juan J. Rodriguez ◽  
Ofir Levy ◽  
Michael J. Angilletta ◽  
...  

Abstract Background: The eastern fence lizard (Sceloporus undulatus) has been a model species for ecological and evolutionary research. Genomic and transcriptomic resources for this species would promote investigation of genetic mechanisms that underpin plastic responses to environmental stress, such as climate warming. Moreover, such resources would aid comparative studies of complex traits at the molecular level, such as the transition from oviparous to viviparous reproduction, which happened at least four times within Sceloporus. Findings: A de novo transcriptome assembly for Sceloporus undulatus, Sund_v1.0, was generated using over 179 million Illumina reads obtained from three tissues (whole brain, skeletal muscle, and embryo) as well as previously reported liver sequences. The Sund_v1.0 assembly had an average contig length of 782 nucleotides and an E90N50 statistic of 2,550 nucleotides. Comparing S. undulatus transcripts with the benchmarking universal single-copy orthologs (BUSCO) for tetrapod species yielded 97.2% gene representation. A total of 13,422 protein-coding orthologs were identified in comparison to the genome of the green anole lizard, Anolis carolinensis, which is the closest related species with genomic data available. Conclusions: The multi-tissue transcriptome of S. undulatus is the first for a member of the family Phrynosomatidae, offering an important resource to advance studies of adaptation in this species and genomic research in reptiles.
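
For readers unfamiliar with the contig statistics quoted in the abstract above, the following is a minimal sketch of how N50 and an expression-filtered N50 (the idea behind E90N50) could be computed. It is not the authors' pipeline: the function names `n50` and `exn50`, the toy lengths, and the toy expression values are all hypothetical, and the sketch assumes E90N50 means the N50 of the most highly expressed transcripts that together account for 90% of total expression.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def exn50(lengths, expression, fraction=0.9):
    """Return the N50 of the most highly expressed transcripts that
    together account for `fraction` of total expression (assumed
    definition; fraction=0.9 corresponds to E90N50)."""
    if len(lengths) != len(expression):
        raise ValueError("lengths and expression must have equal size")
    total_expr = sum(expression)
    kept = []
    running = 0.0
    # Walk transcripts from most to least expressed.
    for expr, length in sorted(zip(expression, lengths), reverse=True):
        kept.append(length)
        running += expr
        if running >= fraction * total_expr:
            break
    return n50(kept)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any assembly in this listing.
    toy_lengths = [100, 200, 300, 400]
    toy_expression = [60, 35, 3, 2]
    print("N50:", n50(toy_lengths))
    print("E90N50:", exn50(toy_lengths, toy_expression))
```

Because the expression filter discards weakly expressed transcripts before the N50 is taken, the two statistics can differ substantially on the same assembly, which is why a mean contig length and an E90N50 are often reported side by side.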


