A comprehensive online database for exploring ~20,000 public Arabidopsis RNA-Seq libraries

2019 ◽  
Author(s):  
Hong Zhang ◽  
Fei Zhang ◽  
Li Feng ◽  
Jinbu Jia ◽  
Jixian Zhai

Abstract Application of Next Generation Sequencing (NGS) technology in transcriptome profiling has greatly improved our understanding of transcriptional regulation at genome-wide scale in the last decade, and tens of thousands of RNA-sequencing (RNA-seq) libraries have been produced by the research community. However, accessing such a huge amount of RNA-seq data poses a big challenge for groups that lack dedicated bioinformatic personnel or expensive computational resources. Here, we introduce the Arabidopsis RNA-seq database (ARS), a free, web-accessible, and user-friendly tool to quickly explore the expression level of any gene in 20,000+ publicly available Arabidopsis RNA-seq libraries.

2021 ◽  
Author(s):  
Tanzeem Fatima ◽  
Rangachari Krishnan ◽  
Ashutosh Srivastava ◽  
Vageeshbabu S. Hanur ◽  
M. Srinivasa Rao

East Indian sandalwood (Santalum album L.) is highly valued for its heartwood and its oil. No comparative study has yet been made of high and low oil-yielding, genetically identical sandalwood trees grown under similar climatic conditions. We therefore carried out a genome-wide transcriptome analysis to identify the genes involved in high oil biosynthesis in S. album. In this study, 15-year-old S. album genotypes (SaSHc and SaSLc) were analysed to understand the contribution of genetic background to high oil biosynthesis. A total of 28,959,187 and 25,598,869 raw paired-end (PE) reads were generated by Illumina sequencing, and 2.12 million and 1.811 million coding sequences (CDS) were obtained in the respective accessions. Based on GO terms, 21,262 and 18,113 CDS were assigned to 26 functional groups in three GO categories: biological process (4,168; 3,641), cellular component (5,758; 4,971) and molecular function (5,108; 4,441). In total, 41,900 and 36,571 genes were functionally annotated, and KEGG pathway analysis of the differentially expressed genes (DEGs) identified 213 metabolic pathways. Of these, 14 pathways were involved in secondary metabolite biosynthesis in S. album. Among 237 cytochrome families, nine groups of cytochromes participated in high oil biosynthesis. A total of 16,665 differentially expressed genes were detected in both accessions (SaHc and SaSLc). The results showed that 784 genes were upregulated and 339 genes were downregulated in SaHc, whilst 635 were upregulated and 299 downregulated in SaSLc. RNA-Seq results were further validated by quantitative RT-PCR. The maximum number of BLAST hits was found against Vitis vinifera. From this study we have identified an additional number of cytochrome families in SaHc. The availability of RNA-Seq data for high oil-yielding sandalwood accessions will have broader implications for the conservation and selection of superior elite samples/populations for further genetic improvement programs.


2020 ◽  
Vol 21 (15) ◽  
pp. 5492 ◽  
Author(s):  
Yu Jin Jung ◽  
Jong Hee Kim ◽  
Hyo Ju Lee ◽  
Dong Hyun Kim ◽  
Jihyeon Yu ◽  
...  

The rice SLR1 gene encodes the DELLA protein (a protein with the DELLA amino acid motif), and a loss-of-function mutant is dwarfed because plant growth is inhibited. We generated slr1-d mutants with a semi-dominant dwarf phenotype by targeting mutations to the DELLA/TVHYNP domain using CRISPR/Cas9 genome editing in rice. Sixteen genetically edited lines were generated out of 31 transgenic plants. Deep sequencing results showed that the mutants had six different mutation types at the target site of the TVHYNP domain of the SLR1 gene. Homozygous edited plants without the transferred DNA (T-DNA) were selected by segregation in the T1 generation. The slr1-d7 and slr1-d8 plants showed a gibberellin (GA)-insensitive dwarf phenotype with shrunken leaves and shortened internodes. A genome-wide gene expression analysis by RNA-seq indicated that the expression levels of two GA-related genes, GA20OX2 (Gibberellin oxidase) and GA3OX2, were increased in the edited mutant plants, suggesting that GA20OX2 acts as a converter of GA12 signaling. These mutant plants appear to have altered GA responses, at least partially through a defect in the phytohormone signaling process that prevented cell elongation. The new mutants, namely the slr1-d7 and slr1-d8 lines, are valuable semi-dominant dwarf alleles with potential application value for molecular breeding using the CRISPR/Cas9 system in rice.


2021 ◽  
Author(s):  
Nicolas Eugenie ◽  
Yvan Zivanovic ◽  
Gaelle Lelandais ◽  
Genevieve Coste ◽  
Claire Bouthier de la Tour ◽  
...  

Numerous genes are overexpressed in the radioresistant bacterium Deinococcus radiodurans after exposure to radiation or prolonged desiccation. The DdrO and IrrE proteins play a major role in regulating the expression of a predicted set of approximately twenty of these genes. The transcriptional repressor DdrO blocks the expression of these genes under normal growth conditions. After exposure to genotoxic agents, the IrrE metalloprotease cleaves DdrO and relieves gene repression. Bioinformatic analyses showed that this mechanism seems to be conserved in several species of Deinococcus, but many questions remain, such as the number of genes regulated by DdrO. Here, by RNA-seq and ChIP-seq assays performed at a genome-wide scale coupled with bioinformatic analyses, we show that the DdrO regulon in D. radiodurans includes many more genes than those previously described. These results thus pave the way to a better understanding of the radioresistance mechanisms encoded by this bacterium.


2014 ◽  
Author(s):  
Adam Siepel ◽  
Leonardo Arbiza

Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider "new frontiers" in this field stemming from recent research on transcriptional regulation.


Author(s):  
A T Vivek ◽  
Shailesh Kumar

Abstract The plant transcriptome encompasses numerous endogenous, regulatory non-coding RNAs (ncRNAs) that play a major biological role in regulating key physiological mechanisms. While studies have shown that ncRNAs are extremely diverse and ubiquitous, the functions of the vast majority of ncRNAs are still unknown. With an ever-increasing number of ncRNAs under study, it is essential to identify, categorize and annotate these ncRNAs on a genome-wide scale. The use of high-throughput RNA sequencing (RNA-seq) technologies provides a broader picture of the non-coding component of the transcriptome, enabling the comprehensive identification and annotation of all major ncRNAs across samples. However, the detection of known and emerging classes of ncRNAs from RNA-seq data demands complex computational methods owing to their unique as well as similar characteristics. Here, we discuss the major plant endogenous, regulatory ncRNAs in an RNA sample, followed by the computational strategies applied to discover each class of ncRNAs using RNA-seq. We also provide a collection of relevant software packages and databases to present a comprehensive bioinformatics toolbox for plant ncRNA researchers. We assume that the discussions in this review will provide a rationale for the discovery of all major categories of plant ncRNAs.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Alessandro La Ferlita ◽  
Salvatore Alaimo ◽  
Sebastiano Di Bella ◽  
Emanuele Martorana ◽  
Georgios I. Laliotis ◽  
...  

Abstract Background: RNA-Seq is a well-established technology extensively used for transcriptome profiling, allowing the analysis of coding and non-coding RNA molecules. However, this technology produces a vast amount of data requiring more sophisticated computational approaches for their analysis than other traditional technologies such as Real-Time PCR or microarrays, strongly discouraging non-expert users. For this reason, dozens of pipelines have been deployed for the analysis of RNA-Seq data. Although interesting, these present several limitations and their usage requires a technical background, which may be uncommon in small research laboratories. Therefore, the application of these technologies in such contexts is still limited and causes a clear bottleneck in knowledge advancement. Results: Motivated by these considerations, we have developed RNAdetector, a new free, cross-platform and user-friendly RNA-Seq data analysis software that can be used locally or in cloud environments through an easy-to-use Graphical User Interface, allowing the analysis of coding and non-coding RNAs from RNA-Seq datasets of any sequenced biological species. Conclusions: RNAdetector is a new software that fills an essential gap between the needs of biomedical and research labs to process RNA-Seq data and their common lack of technical background in performing such analysis, which usually relies on outsourcing such steps to third-party bioinformatics facilities or using expensive commercial software.


2021 ◽  
Author(s):  
Sebastien Riquier ◽  
Chloe Bessiere ◽  
Benoit Guibert ◽  
Anne-Laure Bouge ◽  
Anthony Boureux ◽  
...  

The huge body of publicly available RNA-seq libraries is a treasure trove of functional information, allowing the quantification of known or novel transcripts in tissues. However, transcript quantification commonly relies on alignment methods requiring a lot of computational resources and processing time, which does not scale easily to large datasets. K-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to accurately quantify gene expression in a less resource-consuming way. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize the characteristics of large datasets. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling the measurement of gene expression with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor-gene-specific k-mers to infer metadata including library protocol, sample features or contaminations from RNA-seq datasets. KmerExploR results are visualised through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
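
The idea of gene-specific k-mer signatures described above can be illustrated with a minimal sketch. This is not the Kmerator implementation: the function names, the toy sequences and the choice of k = 4 are all assumptions made for illustration, and the sketch ignores reverse complements and sequencing errors.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def specific_kmers(target, background, k):
    """K-mers present in the target sequence but in none of the background sequences."""
    shared = set()
    for other in background:
        shared |= kmers(other, k)
    return kmers(target, k) - shared


def count_signature(reads, signature, k):
    """Count how often each signature k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in signature:
                counts[kmer] += 1
    return counts


# Toy data (hypothetical sequences, far shorter than real transcripts or reads).
gene_a = "ACGTACGGTCA"
gene_b = "TTGACGTACCA"
reads = ["ACGGTC", "GTACGG", "TTGACG"]
k = 4

signature = specific_kmers(gene_a, [gene_b], k)
counts = count_signature(reads, signature, k)

# Mean count per signature k-mer, used here as a crude expression proxy.
mean_count = sum(counts.values()) / len(signature)
```

Because only exact k-mer lookups are needed, no read alignment is performed, which is the source of the resource savings the abstract refers to.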


2020 ◽  
Author(s):  
Daniel Dimitrov ◽  
Quan Gu

Abstract RNA sequencing is a high-throughput sequencing technique considered an indispensable research tool in a broad range of transcriptome analysis studies. The most common application of RNA sequencing is differential expression analysis, which is used to determine genetic loci with distinct expression across different conditions. On the other hand, an emerging field called single-cell RNA sequencing is used for transcriptome profiling at the individual cell level. The standard protocols for both these types of analyses include the processing of sequencing libraries and result in the generation of count matrices. An obstacle to these analyses and the acquisition of meaningful results is that both require programming expertise. BingleSeq was developed as an intuitive application that provides a user-friendly solution for the analysis of count matrices produced by both bulk and single-cell RNA-Seq experiments. This was achieved by building an interactive dashboard-like user interface and incorporating three state-of-the-art software packages for each type of the aforementioned analyses, alongside additional features such as key visualisation techniques, functional gene annotation analysis and rank-based consensus for differential gene analysis results, among others. As a result, BingleSeq puts the best and most widely used packages and tools for RNA-Seq analyses at the fingertips of biologists with no programming experience.
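
A rank-based consensus across several differential expression results can be sketched as follows. This is one simple way to build such a consensus (averaging per-method ranks of p-values) and is not necessarily how BingleSeq does it; the method names, gene names and p-values are hypothetical.

```python
def ranks(pvalues):
    """Map gene -> rank (1 = smallest p-value); ties are broken by gene name."""
    ordered = sorted(pvalues, key=lambda gene: (pvalues[gene], gene))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def consensus(results):
    """Average each gene's rank across methods.

    Genes missing from any method are skipped. Returns (gene, mean_rank)
    pairs sorted from best to worst consensus rank.
    """
    per_method = [ranks(pvals) for pvals in results.values()]
    common = set.intersection(*(set(r) for r in per_method))
    mean_rank = {
        gene: sum(r[gene] for r in per_method) / len(per_method)
        for gene in common
    }
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Hypothetical p-values from three differential expression methods.
results = {
    "method_a": {"g1": 0.001, "g2": 0.04, "g3": 0.20},
    "method_b": {"g1": 0.010, "g2": 0.002, "g3": 0.50},
    "method_c": {"g1": 0.003, "g2": 0.03, "g3": 0.01},
}

ranking = consensus(results)
```

Averaging ranks rather than raw p-values keeps the consensus insensitive to differences in how each method scales its statistics.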


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Shanrong Zhao ◽  
Kurt Prenger ◽  
Lance Smith

RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by applying it to the analysis of 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, and open-source-based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.
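
As a rough worked example of the figures quoted above, the following sketch estimates the total cost and compute time for the 178-sample test. It assumes that the average cost of $3.50 and the 6 to 8 hour range apply uniformly to every sample, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Figures taken from the abstract.
samples = 178
cost_per_sample_usd = 3.50
hours_low, hours_high = 6, 8

# Total cost scales linearly with the number of samples.
total_cost_usd = samples * cost_per_sample_usd

# Cumulative compute hours if samples were processed one after another.
serial_hours_low = samples * hours_low
serial_hours_high = samples * hours_high

# Assumption: with one cloud instance per sample running in parallel,
# wall-clock time stays near the per-sample range of 6 to 8 hours.
print(f"Total cost: ${total_cost_usd:.2f}")
print(f"Serial compute time: {serial_hours_low} to {serial_hours_high} hours")
```

Under these assumptions the whole test would cost about $623, while running the same workload serially would take roughly 1,068 to 1,424 hours.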


2019 ◽  
Vol 21 (6) ◽  
pp. 1987-1998 ◽  
Author(s):  
Sebastiano Di Bella ◽  
Alessandro La Ferlita ◽  
Giovanni Carapezza ◽  
Salvatore Alaimo ◽  
Antonella Isacchi ◽  
...  

Abstract Next-Generation Sequencing (NGS) is a high-throughput technology widely applied to genome sequencing and transcriptome profiling. RNA-Seq uses NGS to reveal RNA identities and quantities in a given sample. However, it produces a huge amount of raw data that needs to be preprocessed with fast and effective computational methods. RNA-Seq can look at different populations of RNAs, including ncRNAs. Indeed, in the last few years, several pipelines have been developed for ncRNA analysis from RNA-Seq experiments. In this paper, we analyze eight recent pipelines (iSmaRT, iSRAP, miARma-Seq, Oasis 2, SPORTS1.0, sRNAnalyzer, sRNApipe, sRNA workbench) which allow the analysis not only of single specific classes of ncRNAs but also of more than one ncRNA class. Our systematic performance evaluation aims at guiding users to select the appropriate pipeline for processing each ncRNA class, focusing on three key points: (i) accuracy in ncRNA identification, (ii) accuracy in read count estimation and (iii) deployment and ease of use.

