The neurotranscriptome of the Aedes aegypti mosquito


10.1101/026823 ◽

2015 ◽

Author(s):

Benjamin J Matthews ◽

Carolyn S McBride ◽

Matthew DeGennaro ◽

Orion Despo ◽

Leslie B Vosshall

Keyword(s):

Gene Expression ◽

Aedes Aegypti ◽

Vector Control ◽

De Novo ◽

Transcriptome Assembly ◽

Molecular Genetic ◽

Blood Feeding ◽

Model Organisms ◽

Protein Coding ◽

Male And Female

Background A complete genome sequence and the advent of genome editing open up non-traditional model organisms to mechanistic genetic studies. The mosquito Aedes aegypti is an important vector of infectious diseases such as dengue, chikungunya, and yellow fever, and has a large and complex genome, which has slowed annotation efforts. We used comprehensive transcriptomic analysis of adult gene expression to improve the genome annotation and to provide a detailed tissue-specific catalogue of neural gene expression at different adult behavioral states. Results We carried out deep RNA sequencing across all major peripheral male and female sensory tissues, the brain, and (female) ovary. Furthermore, we examined gene expression across three important phases of the female reproductive cycle, a remarkable example of behavioral switching in which a female mosquito alternates between obtaining blood-meals from humans and laying eggs. Using genome-guided alignments and de novo transcriptome assembly, our re-annotation includes 572 new putative protein-coding genes and updates to 13.5% and 50.3% of existing transcripts within coding sequences and untranslated regions, respectively. Using this updated annotation, we detail gene expression in each tissue, identifying large numbers of transcripts regulated by blood-feeding and sexually dimorphic transcripts that may provide clues to the biology of male- and female-specific behaviors, such as mating and blood-feeding, which are areas of intensive study for those interested in vector control. Conclusions This neurotranscriptome forms a strong foundation for the study of genes in the mosquito nervous system and investigation of sensory-driven behaviors and their regulation. Furthermore, understanding the molecular genetic basis of mosquito chemosensory behavior has important implications for vector control.


The brain transcriptome of the wolf spider, Schizocosa ocreata

BMC Research Notes ◽

10.1186/s13104-021-05648-y ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Daniel Stribling ◽

Peter L. Chang ◽

Justin E. Dalton ◽

Christopher A. Conow ◽

Malcolm Rosenthal ◽

...

Keyword(s):

Gene Expression ◽

De Novo ◽

Transcriptome Assembly ◽

Model Organisms ◽

De Novo Transcriptome Assembly ◽

De Novo Transcriptome ◽

Wolf Spiders ◽

Schizocosa Ocreata ◽

Genomic Studies ◽

The Brain

Abstract Objectives Arachnids have fascinating and unique biology, particularly for questions on sex differences and behavior, creating the potential for development of powerful emerging models in this group. Recent advances in genomic techniques have paved the way for a significant increase in the breadth of genomic studies in non-model organisms. One growing area of research is comparative transcriptomics. When phylogenetic relationships to model organisms are known, comparative genomic studies provide context for analysis of homologous genes and pathways. The goal of this study was to lay the groundwork for comparative transcriptomics of sex differences in the brain of wolf spiders, a non-model organism of the phylum Euarthropoda, by generating transcriptomes and analyzing gene expression. Data description To examine sex-differential gene expression, short read transcript sequencing and de novo transcriptome assembly were performed. Messenger RNA was isolated from brain tissue of male and female subadult and mature wolf spiders (Schizocosa ocreata). The raw data consist of sequences for the two different life stages in each sex. Computational analyses on these data include de novo transcriptome assembly and differential expression analyses. Sample-specific and combined transcriptomes, gene annotations, and differential expression results are described in this data note and are available from publicly available databases.


A consensus-based ensemble approach to improve de novo transcriptome assembly

10.1101/2020.06.08.139964 ◽

2020 ◽

Cited By ~ 1

Author(s):

Adam Voshall ◽

Sairam Behera ◽

Xiangjun Li ◽

Xiao-Hong Yu ◽

Kushagra Kapil ◽

...

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Reference Genome ◽

De Novo ◽

Transcriptome Assembly ◽

Model Organisms ◽

Pathway Reconstruction ◽

Rnaseq Data ◽

Metabolic Pathway Reconstruction ◽

Assembly Performance

Abstract Systems-level analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction, depend on the accuracy of the transcriptome. Multiple tools exist to perform transcriptome assembly from RNAseq data. However, assembling high quality transcriptomes is still not a trivial problem. This is especially the case for non-model organisms where adequate reference genomes are often not available. Different methods produce different transcriptome models and there is no easy way to determine which are more accurate. Furthermore, having alternative splicing events could exacerbate such difficult assembly problems. While benchmarking transcriptome assemblies is critical, this is also not trivial due to the general lack of true reference transcriptomes. In this study, we provide a pipeline to generate a set of the benchmark transcriptome and corresponding RNAseq data. Using the simulated benchmarking datasets, we compared the performance of various transcriptome assembly approaches including genome-guided, de novo, and ensemble methods. The results showed that the assembly performance deteriorates significantly when the reference is not available from the same genome (for genome-guided methods) or when alternative transcripts (isoforms) exist. We demonstrated the value of consensus between de novo assemblers in transcriptome assembly. Leveraging the overlapping predictions between the four de novo assemblers, we further present ConSemble, a consensus-based de novo ensemble transcriptome assembly pipeline. Without using a reference genome, ConSemble achieved an accuracy up to twice as high as any de novo assemblers we compared. It matched or exceeded the best performing genome-guided assemblers even when the transcriptomes included isoforms. The RNAseq simulation pipeline, the benchmark transcriptome datasets, and the ConSemble pipeline are all freely available from: http://bioinfolab.unl.edu/emlab/consemble/.

Author summary Obtaining the accurate representation of the gene expression is critical in many analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction. The state of the art high-throughput RNA-sequencing (RNAseq) technologies can be used to sequence the set of all transcripts in a cell, the transcriptome. Although many computational tools are available for transcriptome assembly from RNAseq data, assembling high-quality transcriptomes is difficult especially for non-model organisms. Different methods often produce different transcriptome models and there is no easy way to determine which are more accurate. In this study, we present an approach to evaluate transcriptome assembly performance using simulated benchmarking read sets. The results showed that the assembly performance of genome-guided assembly methods deteriorates significantly when the adequate reference genome is not available. The assembly performance of all methods is affected when alternative transcripts (isoforms) exist. We further demonstrated the value of consensus among assemblers in improving transcriptome assembly. Leveraging the overlapping predictions between the four de novo assemblers, we present ConSemble. Without using a reference genome, ConSemble achieved a much higher accuracy than any de novo assemblers we compared. It matched or exceeded the best performing genome-guided assemblers even when the transcriptomes included isoforms.
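The idea of keeping only the predictions on which several de novo assemblers agree can be illustrated with a small voting sketch. This is not the ConSemble implementation: the function name, the assembler labels, the toy sequences, and the default support threshold of three out of four are all assumptions made for illustration, and real pipelines compare sequences in more robust ways than exact string equality.

```python
from collections import Counter


def consensus_transcripts(assemblies, min_support=3):
    """Return the sequences reported by at least `min_support` assemblers.

    `assemblies` maps a (hypothetical) assembler name to the list of
    sequences it produced. Sequences are matched by exact equality.
    """
    support = Counter()
    for contigs in assemblies.values():
        # Converting to a set lets each assembler vote once per sequence.
        for seq in set(contigs):
            support[seq] += 1
    return {seq for seq, votes in support.items() if votes >= min_support}


# Toy input: four hypothetical assemblers with partly overlapping output.
assemblies = {
    "assembler_a": ["ATGGCC", "ATGTTT", "ATGAAA"],
    "assembler_b": ["ATGGCC", "ATGTTT"],
    "assembler_c": ["ATGGCC", "ATGCCC"],
    "assembler_d": ["ATGGCC", "ATGTTT", "ATGCCC"],
}

print(sorted(consensus_transcripts(assemblies)))
```

Lowering `min_support` trades precision for recall: sequences seen by only one or two assemblers are more likely to be assembler-specific artifacts, but some of them may be genuine transcripts.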


Assembly-free rapid differential gene expression analysis in non-model organisms using DNA-protein alignment

10.1101/2021.04.23.441097 ◽

2021 ◽

Author(s):

Anish M.S. Shrestha ◽

Joyce Emlyn B. Guiao ◽

Kyle Christian R. Santiago

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Expression Analysis ◽

De Novo ◽

Transcriptome Assembly ◽

Differential Expression Analysis ◽

Homology Search ◽

Model Organisms ◽

Rna Seq ◽

Protein Database

Abstract RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. Conventional differential expression analysis for organisms without reference sequences requires performing computationally expensive and error-prone de-novo transcriptome assembly, followed by homology search against a high-confidence protein database for functional annotation. We propose a shortcut, where we obtain counts for differential expression analysis by directly aligning RNA-seq reads to the protein database. Through experiments on simulated and real data, we show drastic reductions in run-time and memory usage, with no loss in accuracy. A Snakemake implementation of our workflow is available at: https://bitbucket.org/project_samar/samar
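The counting step implied by this shortcut, turning read-to-protein alignments into per-protein counts, can be sketched as follows. This is an illustrative assumption rather than the authors' workflow: the hit tuples `(read_id, protein_id, score)`, the function name, and the rule of assigning each read to its single best-scoring protein are hypothetical simplifications.

```python
from collections import Counter


def count_reads_per_protein(hits):
    """Assign each read to its best-scoring protein and count reads per protein.

    `hits` is an iterable of (read_id, protein_id, score) tuples, a
    hypothetical stand-in for parsed DNA-protein alignment output.
    """
    best = {}
    for read_id, protein_id, score in hits:
        # Keep only the highest-scoring hit seen so far for each read.
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (protein_id, score)
    return Counter(protein_id for protein_id, _ in best.values())


# Toy alignments: read r1 hits two proteins, P2 with the higher score.
hits = [
    ("r1", "P1", 50.0),
    ("r1", "P2", 80.0),
    ("r2", "P1", 60.0),
    ("r3", "P2", 40.0),
]

counts = count_reads_per_protein(hits)
print(dict(counts))
```

A count table of this shape, one row per protein and one column per sample, is the usual input to differential expression tools.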


Gray whale transcriptome reveals longevity adaptations associated with DNA repair and ubiquitination

10.1101/754218 ◽

2019 ◽

Author(s):

Dmitri Toren ◽

Anton Kulaga ◽

Mineshbhai Jethva ◽

Eitan Rubin ◽

Anastasia V Snezhkina ◽

...

Keyword(s):

Gene Expression ◽

De Novo ◽

Transcriptome Assembly ◽

Expression Patterns ◽

Model Organisms ◽

Gray Whale ◽

Aging Research ◽

Arctic Water ◽

Gray Whales ◽

Longevity Genes

Abstract One important question in aging research is how differences in genomics and transcriptomics determine the maximum lifespan in various species. Despite recent progress, much is still unclear on the topic, partly due to the lack of samples in non-model organisms and due to challenges in direct comparisons of transcriptomes from different species. The novel ranking-based method that we employ here is used to analyze gene expression in the gray whale and compare its de novo assembled transcriptome with that of other long- and short-lived mammals. Gray whales are among the top 1% longest-lived mammals. Despite the extreme environment, or maybe due to a remarkable adaptation to its habitat (intermittent hypoxia, Arctic water, and high pressure), gray whales reach at least the age of 77 years. In this work, we show that long-lived mammals share common gene expression patterns between themselves, including high expression of DNA maintenance and repair, ubiquitination, apoptosis, and immune responses. Additionally, the level of expression for gray whale orthologs of pro- and anti-longevity genes found in model organisms is in support of their alleged role and direction in lifespan determination. Remarkably, among highly expressed pro-longevity genes many are stress-related, reflecting an adaptation to extreme environmental conditions. The conducted analysis suggests that the gray whale potentially possesses high resistance to cancer and stress, at least in part ensuring its longevity. This new transcriptome assembly also provides important resources to support the efforts of maintaining the endangered population of gray whales.


Error, noise and bias in de novo transcriptome assemblies

10.1101/585745 ◽

2019 ◽

Cited By ~ 3

Author(s):

Adam H. Freedman ◽

Michele Clamp ◽

Timothy B. Sackton

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

Error Rates ◽

Computational Method ◽

Model Organisms ◽

Data Sets ◽

Protein Coding ◽

Gene Level ◽

Assembly Algorithms ◽

The Impact

Abstract De novo transcriptome assembly is a powerful tool, widely used over the last decade for making evolutionary inferences. However, it relies on two implicit assumptions: that the assembled transcriptome is an unbiased representation of the underlying expressed transcriptome, and that expression estimates from the assembly are good, if noisy, approximations of the relative abundance of expressed transcripts. Using publicly available data for model organisms, we demonstrate that, across assembly algorithms and data sets, these assumptions are consistently violated. Bias exists at the nucleotide level, with genotyping error rates ranging from 30-83%. As a result, diversity is underestimated in transcriptome assemblies, with consistent under-estimation of heterozygosity in all but the most inbred samples. Even at the gene level, expression estimates show wide deviations from map-to-reference estimates, and positive bias at lower expression levels. Standard filtering of transcriptome assemblies improves the robustness of gene expression estimates but leads to the loss of a meaningful number of protein-coding genes, including many that are highly expressed. We demonstrate a computational method, length-rescaled CPM, to partly alleviate noise and bias in expression estimates. Researchers should consider ways to minimize the impact of bias in transcriptome assemblies.


De Novo Transcriptome Assembly and Analyses of Gene Expression during Photomorphogenesis in Diploid Wheat Triticum monococcum

PLoS ONE ◽

10.1371/journal.pone.0096855 ◽

2014 ◽

Vol 9 (5) ◽

pp. e96855 ◽

Cited By ~ 36

Author(s):

Samuel E. Fox ◽

Matthew Geniza ◽

Mamatha Hanumappa ◽

Sushma Naithani ◽

Chris Sullivan ◽

...

Keyword(s):

Gene Expression ◽

De Novo ◽

Transcriptome Assembly ◽

Triticum Monococcum ◽

Diploid Wheat ◽

De Novo Transcriptome Assembly ◽

De Novo Transcriptome


De novo Transcriptome Assembly and Dynamic Spatial Gene Expression Analysis in Red Clover

The Plant Genome ◽

10.3835/plantgenome2015.06.0048 ◽

2016 ◽

Vol 9 (2) ◽

Cited By ~ 9

Author(s):

Manohar Chakrabarti ◽

Randy D. Dinkins ◽

Arthur G. Hunt

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

De Novo ◽

Transcriptome Assembly ◽

Red Clover ◽

De Novo Transcriptome Assembly ◽

De Novo Transcriptome


A practical guide to build de-novo assemblies for single tissues of non-model organisms: the example of a Neotropical frog

PeerJ ◽

10.7717/peerj.3702 ◽

2017 ◽

Vol 5 ◽

pp. e3702 ◽

Cited By ~ 5

Author(s):

Santiago Montero-Mendieta ◽

Manfred Grabherr ◽

Henrik Lantz ◽

Ignacio De la Riva ◽

Jennifer A. Leonard ◽

...

Keyword(s):

Defense Mechanisms ◽

De Novo ◽

Transcriptome Assembly ◽

Cost Effective ◽

Model Organisms ◽

Rna Seq ◽

Assembly Pipeline ◽

Wide Variability ◽

History Of ◽

Inexperienced User

Whole genome sequencing (WGS) is a very valuable resource to understand the evolutionary history of poorly known species. However, in organisms with large genomes, as most amphibians, WGS is still excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome and the transcriptome must be assembled de-novo. We used RNA-seq to obtain the transcriptomic profile for Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating a de-novo transcriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the necessary steps to build de-novo transcriptome assemblies using readily available software and is freely available at: https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki.


An Improved Human smORF Annotation Workflow Combining De Novo Transcriptome Assembly and Ribo-Seq

10.1101/523860 ◽

2019 ◽

Author(s):

Thomas F. Martinez ◽

Qian Chu ◽

Cynthia Donaldson ◽

Dan Tan ◽

Maxim N. Shokhirev ◽

...

Keyword(s):

De Novo ◽

Transcriptome Assembly ◽

Human Leukocyte ◽

Open Reading Frames ◽

Translation Efficiency ◽

Proteomics Data ◽

Protein Coding ◽

Leukocyte Antigen ◽

Human Genes ◽

Small Open Reading Frames

Protein-coding small open reading frames (smORFs) are emerging as an important class of genes; however, the coding capacity of smORFs in the human genome is unclear. By integrating de novo transcriptome assembly and Ribo-Seq, we confidently annotate thousands of novel translated smORFs in three human cell lines. We find that smORF translation prediction is noisier than for annotated coding sequences, underscoring the importance of analyzing multiple experiments and footprinting conditions. These smORFs are located within non-coding and antisense transcripts, the UTRs of mRNAs, and unannotated transcripts. Analysis of RNA levels and translation efficiency during cellular stress identifies regulated smORFs, providing an approach to select smORFs for further investigation. Sequence conservation and signatures of positive selection indicate that encoded microproteins are likely functional. Additionally, proteomics data from enriched human leukocyte antigen complexes validate the translation of hundreds of smORFs and position them as a source of novel antigens. Thus, smORFs represent a significant number of important, yet unexplored human genes.


Genome sequence of the model rice variety KitaakeX

BMC Genomics ◽

10.1186/s12864-019-6262-4 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 6

Author(s):

Rashmi Jain ◽

Jerry Jenkins ◽

Shengqiang Shu ◽

Mawsheng Chern ◽

Joel A. Martin ◽

...

Keyword(s):

Functional Genomics ◽

De Novo ◽

Rice Variety ◽

Rice Genome ◽

Life Cycles ◽

Model Organisms ◽

High Quality ◽

Rice Varieties ◽

Protein Coding ◽

High Quality Genome

Abstract Background The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
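The contig N50 quoted in the abstract above is a standard assembly contiguity statistic: the length of the contig at which, going from longest to shortest, the running total first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the KitaakeX assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum covers half the total.
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one non-zero contig length")


# Toy assembly: total length 290, so the halfway point is 145.
contig_lengths = [80, 70, 50, 40, 30, 20]
print(n50(contig_lengths))
```

Because N50 depends only on the distribution of lengths, a high value indicates contiguity but says nothing about whether the contigs are assembled correctly.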
