Computational approaches for the analysis of epigenome and transcriptome characterisation in Paramecium tetraurelia

2021 ◽  
Author(s):  
Sivarajan Karunanithi

In the last two decades, our understanding of human gene regulation has improved tremendously. There are plentiful computational methods that focus on integrative data analysis of humans and model organisms such as mouse and Drosophila. However, these tools are not directly employable by researchers working on non-model organisms to answer fundamental biological and evolutionary questions. We aimed to develop new tools and adapt existing software for the analysis of transcriptomic and epigenomic data of one such non-model organism, Paramecium tetraurelia, a unicellular eukaryote. Paramecium contains two diploid (2n) germline micronuclei (MIC) and a polyploid (800n) somatic macronucleus (MAC). The transcriptomic and epigenomic regulatory landscape of the MAC genome, which has 80% protein-coding genes and short intergenic regions, is poorly understood. We developed a generic automated eukaryotic short interfering RNA (siRNA) analysis tool, called RAPID. Our tool captures diverse siRNA characteristics from small RNA sequencing data and provides easily navigable visualisations. We also introduced a normalisation technique to facilitate comparison of multiple siRNA-based gene knockdown studies. Further, we developed a pipeline to characterise novel genome-wide endogenous short interfering RNAs (endo-siRNAs). In contrast to many organisms, we found that the endo-siRNAs do not act in cis to silence their parent mRNA. We also predicted phasing of siRNAs, which are regulated by the RNA interference (RNAi) pathway. Further, using RAPID, we investigated the aberrations of endo-siRNAs, and their respective transcriptomic alterations, caused by an RNAi pathway triggered by feeding small RNAs against a target gene. We find that the small RNA transcriptome is altered even if a gene unrelated to the RNAi pathway is targeted. This is important in the context of investigations of genetically modified organisms (GMOs). We suggest that future studies need to distinguish transcriptomic changes caused by RNAi-inducing techniques from actual regulatory changes. Subsequently, we adapted existing epigenomics analysis tools to conduct the first comprehensive epigenomic characterisation of nucleosome positioning and histone modifications of the Paramecium MAC. We identified well-positioned nucleosomes shifted downstream of the transcription start site. GC content seems to dictate, in cis, the positioning of nucleosomes, histone marks (H3K4me3, H3K9ac, and H3K27me3), and Pol II in the AT-rich Paramecium genome. We employed a chromatin state segmentation approach on nucleosomes and histone marks, which revealed genes with active, repressive, and bivalent chromatin states. Further, we constructed a regulatory association network of all the aforementioned data using the sparse partial correlation network technique. Our analysis revealed subsets of genes whose expression is positively associated with H3K27me3, in contrast to the negative association with gene expression reported in many other organisms. Further, we developed a Random Forests classifier to predict gene expression using genic (gene length, intron frequency, etc.) and epigenetic features. Our model has a test performance (PR-AUC) of 0.83. Upon evaluating different feature sets, we found that genic features are as predictive of gene expression as the epigenetic features. We used Shapley local feature explanation values to suggest that high H3K4me3, high intron frequency, low gene length, high sRNA, and high GC content are the most important elements for determining gene expression status. In this thesis, we developed novel tools and employed several bioinformatics and machine learning methods to characterise the regulatory landscape of the Paramecium (epi)genome.
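
The abstract above reports classifier performance as PR-AUC. As a rough illustration of what that metric measures, the sketch below computes average precision, one common estimator of the area under the precision-recall curve. It is not the thesis code: the function name `average_precision` and the toy labels and scores are assumptions made for this example, and the thesis may have used a different estimator.

```python
def average_precision(y_true, scores):
    """Average precision: mean of precision values at each positive hit.

    y_true: list of 0/1 labels (1 = expressed, in this illustration).
    scores: list of classifier scores, higher = more likely positive.
    Ties in scores are broken by input order, which is a simplification.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank samples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i]:
            true_positives += 1
            # Precision among the top `rank` predictions.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


# Hypothetical toy data: a perfect ranking and a fully inverted one.
perfect = average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
inverted = average_precision([0, 1], [0.9, 0.1])
```

Unlike ROC-AUC, this score depends on the fraction of positive examples, so a value such as 0.83 is best read against the class balance of the test set.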

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Mosharrof Mondal ◽  
Jacob Peter ◽  
Obrie Scarbrough ◽  
Alex Flynt

Background: RNA interference (RNAi) regulates gene expression in most multicellular organisms through binding of small RNA effectors to target transcripts. Exploiting this process is a popular strategy for genetic manipulation and has applications that include arthropod pest control. RNAi technologies are dependent on delivery method, with the most convenient likely being feeding, which is effective in some animals while others are insensitive. The two-spotted spider mite, Tetranychus urticae, is a prime candidate for developing RNAi approaches due to the frequent occurrence of conventional pesticide resistance. Using a sequencing-based approach, the fate of ingested RNAs was explored to identify features and conditions that affect small RNA biogenesis from external sources to better inform RNAi design. Results: Biochemical and sequencing approaches in conjunction with extensive computational assessment were used to evaluate metabolism of ingested RNAs in T. urticae. This chelicerate arthropod shows only modest response to oral RNAi and has biogenesis pathways distinct from model organisms. Processing of synthetic and plant host RNAs ingested during feeding was evaluated to identify active substrates for spider mite RNAi pathways. Through cataloging characteristics of biochemically purified RNA from these sources, trans-acting small RNAs could be distinguished from degradation fragments and their origins documented. Conclusions: Using a strategy that delineates small RNA processing, we found many transcripts have the potential to enter spider mite RNAi pathways; however, trans-acting RNAs appear very unstable and rare. This suggests potential RNAi pathway substrates from ingested materials are mostly degraded and infrequently converted into regulators of gene expression. Spider mites infest a variety of plants, and it would be maladaptive to generate diverse gene regulators from dietary RNAs. This study provides a framework for assessing RNAi technology in organisms where genetic and biochemical tools are absent and benefits rational design of RNAi triggers for T. urticae.


2019 ◽  
Author(s):  
Mosharrof Mondal ◽  
Jacob Peter ◽  
Obrie Scarbrough ◽  
Alex Flynt

RNA interference (RNAi) regulates gene expression in most multicellular organisms through binding of small RNA effectors to target transcripts. Exploiting this process is a popular strategy for genetic manipulation in invertebrates and has applications that include control of pests. Successful RNAi technologies are dependent on delivery method. The most convenient method is likely feeding, which is effective in some animals while others are insensitive. Thus, there is a need to develop RNAi technology on a per-species basis, which will require a comprehensive approach for assessing small RNA production from synthetic nucleic acids. Using biochemical and sequencing approaches, we investigated the metabolism of ingested RNAs using the two-spotted spider mite, Tetranychus urticae, as a model for RNAi insensitivity. This chelicerate arthropod shows only modest response to oral RNAi and has biogenesis pathways distinct from model organisms. To identify RNAi substrates in T. urticae, we characterized processing of synthetic RNAs and those derived from plant transcripts ingested during feeding. Through characterization of read length and overlaps of small RNA reads, visualization methods were developed that facilitate distinguishing trans-acting small RNAs from degradation fragments. Using a strategy that delineates small RNA classes, we found a variety of RNA species are gated into spider mite RNAi pathways; however, potential mature trans-acting RNAs appear very unstable and rare. This suggests spider mite RNAi pathway products that originate as ingested materials may be preferentially metabolized instead of converted into regulators of gene expression. Spider mites infest a variety of plants, and it would be maladaptive to generate diverse gene regulators from dietary RNAs. This study provides a framework for assessing RNAi technology in organisms where genetic and biochemical tools are absent and benefits rational design of RNAi triggers.


2021 ◽  
Author(s):  
Juan Manuel Trinidad ◽  
Rafael Sebastian Fort ◽  
Guillermo Trinidad ◽  
Beatriz Garat ◽  
Maria A Duhagon

MicroRNAs are small RNAs that regulate gene expression through complementary base pairing with their target mRNAs. Given the small size of the pairing region and the large number of mRNAs that each microRNA can control, the identification of biologically relevant targets is difficult. Since current knowledge of target recognition and repression has mainly relied on in vitro studies, we sought to determine if the interrogation of gene expression data of unperturbed tissues could yield new insight into these processes. The transcriptome-wide repression at the microRNA-mRNA canonical interaction sites (seed and 3'-supplementary region, identified by sole base complementarity) was calculated as a normalized Spearman correlation (Z-score) between the abundance of the transcripts in the PRAD-TCGA tissues (RNA-seq and small RNA-seq data of 546 samples). Using the repression values obtained, we confirmed established properties of microRNA targeting efficacy, such as the preference for gene regions (3'UTR>CDS>5'UTR), the proportionality between repression and seed length (6mer<7mer<8mer) and the contribution to the repression exerted by the supplementary pairing at 13-16nt of the microRNA. Our results suggest that the 7mer-m8 seed could be more repressive than the 7mer-A1, while they have similar efficacy when they interact using the 3'-supplementary pairing. Strikingly, the 6mer+suppl sites yielded a normalized Z-score of repression similar to the sole 7mer-m8 or 7mer-A1 seeds, which raises awareness of their potential biological relevance. We then used the approach to further characterize the 3'-supplementary pairing, using 39 microRNAs that hold repressive 3'-supplementary interactions. The analysis of the bridge between seed and 3'-supplementary pairing site confirmed the optimum +1 offset previously evidenced, but higher offsets appear to hold similar repressive strength. In addition, they show a low GC content at position 13-16, and base preferences that allow the selection of a candidate sequence motif. Overall, our study demonstrates that transcriptome-wide analysis of microRNA-mRNA correlations in large, matched RNA-seq and small-RNA-seq data has the power to uncover hints of microRNA targeting determinants operating in the in vivo unperturbed set. Finally, we made available a bioinformatic tool to analyze microRNA-target mRNA interactions using our approach.
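
The seed site types named in the abstract above (6mer, 7mer-m8, 7mer-A1, 8mer) follow the standard canonical definitions: a 6mer pairs with microRNA positions 2-7, a 7mer-m8 adds a match at position 8, a 7mer-A1 adds an adenine opposite position 1, and an 8mer has both. The sketch below locates such sites by base complementarity alone. It is not the authors' tool: the function names are hypothetical, and 3'-supplementary pairing and offsets are not modelled.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, mrna):
    """List (start, site_type) for every canonical seed match in `mrna`.

    Both sequences are read 5'->3'; DNA input (T) is converted to RNA (U).
    `start` is the 0-based index of the 6-nt core match on the mRNA.
    """
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")

    core = reverse_complement(mirna[1:7])    # pairs with miRNA positions 2-7
    m8_base = reverse_complement(mirna[7])   # pairs with miRNA position 8

    sites = []
    start = mrna.find(core)
    while start != -1:
        end = start + 6
        # Position 8 pairs with the base just 5' of the core on the mRNA.
        has_m8 = start > 0 and mrna[start - 1] == m8_base
        # The A opposite miRNA position 1 lies just 3' of the core.
        has_a1 = end < len(mrna) and mrna[end] == "A"
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = mrna.find(core, start + 1)
    return sites


# let-7a as the example microRNA; the target fragments are made up.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
example = seed_sites(LET7A, "AAACUACCUCAGGG")
```

Each core match is reported once with its most extended site type, so an 8mer is not also counted as a 7mer or 6mer.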


2014 ◽  
Vol 46 (15) ◽  
pp. 533-546 ◽  
Author(s):  
William R. Swindell ◽  
Xianying Xing ◽  
John J. Voorhees ◽  
James T. Elder ◽  
Andrew Johnston ◽  
...  

Gene expression profiling of psoriasis has driven research advances and may soon provide the basis for clinical applications. For expression profiling studies, RNA-seq is now a competitive technology, but RNA-seq results may differ from those obtained by microarray. We therefore compared findings obtained by RNA-seq with those from eight microarray studies of psoriasis. RNA-seq and microarray datasets identified similar numbers of differentially expressed genes (DEGs), with certain genes uniquely identified by each technology. Correspondence between platforms and the balance of increased to decreased DEGs was influenced by mRNA abundance, GC content, and gene length. Weakly expressed genes, genes with low GC content, and long genes were all biased toward decreased expression in psoriasis lesions. The strength of these trends differed among array datasets, most likely due to variations in RNA quality. Gene length bias was by far the strongest trend and was evident in all datasets regardless of the expression profiling technology. The effect was due to differences between lesional and uninvolved skin with respect to the genome-wide correlation between gene length and gene expression, which was consistently more negative in psoriasis lesions. These findings demonstrate the complementary nature of RNA-seq and microarray technology and show that integrative analysis of both data types can provide a richer view of the transcriptome than strict reliance on a single method alone. Our results also highlight factors affecting correspondence between technologies, and we have established that gene length is a major determinant of differential expression in psoriasis lesions.


2021 ◽  
Author(s):  
Sabina Moser Tralamazza ◽  
Leen Nachira Abraham ◽  
Benedito Correa ◽  
Daniel Croll

Epigenetic modifications are key regulators of gene expression and underpin genome integrity. Yet, how epigenetic changes affect the evolution and transcriptional robustness of genes remains largely unknown. Here, we show how the repressive histone mark H3K27me3 influences the trajectory of highly conserved genes in fungi. We first performed transcriptomic profiling on closely related species of the plant pathogen Fusarium graminearum species complex. We determined transcriptional responsiveness of genes across environmental conditions to determine expression robustness. To infer evolutionary conservation of coding sequences, we used a comparative genomics framework of 23 species across the Fusarium genus. We integrated histone methylation data from three Fusarium species across the phylogenetic breadth of the genus. Gene expression variation is negatively correlated with gene conservation confirming that highly conserved genes show higher expression robustness. Furthermore, we show that highly conserved genes marked by H3K27me3 deviate from the typical housekeeping gene archetype. Compared to the genomic background, H3K27me3 marked genes encode smaller proteins, exhibit lower GC content, weaker codon usage bias, higher levels of hydrophobicity and are enriched for functions related to regulation and membrane transport. The evolutionary age of conserved genes with H3K27me3 histone marks falls typically within the origins of the Fusarium genus. We show that highly conserved genes marked by H3K27me3 are more likely to be dispensable for survival. Lastly, we show that conserved genes exposed to repressive H3K27me3 marks across distantly related fungi predict transcriptional perturbation at the microevolutionary scale in Fusarium fungi. In conclusion, we establish how repressive histone marks determine the evolutionary fate of highly conserved genes across evolutionary timescales.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Wanlu Liu ◽  
Javier Gallego-Bartolomé ◽  
Yuxing Zhou ◽  
Zhenhui Zhong ◽  
Ming Wang ◽  
...  

The ability to target epigenetic marks like DNA methylation to specific loci is important in both basic research and in crop plant engineering. However, heritability of targeted DNA methylation, how it impacts gene expression, and which epigenetic features are required for proper establishment are mostly unknown. Here, we show that targeting the CG-specific methyltransferase M.SssI with an artificial zinc finger protein can establish heritable CG methylation and silencing of a targeted locus in Arabidopsis. In addition, we observe highly heritable widespread ectopic CG methylation mainly over euchromatic regions. This hypermethylation shows little effect on transcription while it triggers a mild but significant reduction in the accumulation of H2A.Z and H3K27me3. Moreover, ectopic methylation occurs preferentially at less open chromatin that lacks positive histone marks. These results outline general principles of the heritability and interaction of CG methylation with other epigenomic features that should help guide future efforts to engineer epigenomes.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Daniel Stribling ◽  
Peter L. Chang ◽  
Justin E. Dalton ◽  
Christopher A. Conow ◽  
Malcolm Rosenthal ◽  
...  

Objectives: Arachnids have fascinating and unique biology, particularly for questions on sex differences and behavior, creating the potential for development of powerful emerging models in this group. Recent advances in genomic techniques have paved the way for a significant increase in the breadth of genomic studies in non-model organisms. One growing area of research is comparative transcriptomics. When phylogenetic relationships to model organisms are known, comparative genomic studies provide context for analysis of homologous genes and pathways. The goal of this study was to lay the groundwork for comparative transcriptomics of sex differences in the brain of wolf spiders, a non-model organism of the phylum Euarthropoda, by generating transcriptomes and analyzing gene expression. Data description: To examine sex-differential gene expression, short read transcript sequencing and de novo transcriptome assembly were performed. Messenger RNA was isolated from brain tissue of male and female subadult and mature wolf spiders (Schizocosa ocreata). The raw data consist of sequences for the two different life stages in each sex. Computational analyses on these data include de novo transcriptome assembly and differential expression analyses. Sample-specific and combined transcriptomes, gene annotations, and differential expression results are described in this data note and are available from publicly available databases.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Bing He ◽  
Ping Chen ◽  
Sonia Zambrano ◽  
Dina Dabaghie ◽  
Yizhou Hu ◽  
...  

Molecular characterization of the individual cell types in human kidney as well as model organisms is critical in defining organ function and understanding translational aspects of biomedical research. Previous studies have uncovered gene expression profiles of several kidney glomerular cell types; however, important cells, including mesangial cells (MCs) and glomerular parietal epithelial cells (PECs), are missing or incompletely described, and a systematic comparison between mouse and human kidney is lacking. To this end, we use Smart-seq2 to profile 4332 individual glomerulus-associated cells isolated from human living donor renal biopsies and mouse kidney. The analysis reveals genetic programs for all four glomerular cell types (podocytes, glomerular endothelial cells, MCs and PECs) as well as rare glomerulus-associated macula densa cells. Importantly, we detect heterogeneity in glomerulus-associated Pdgfrb-expressing cells, including bona fide intraglomerular MCs with the functionally active phagocytic molecular machinery, as well as a unique mural cell type located in the central stalk region of the glomerulus tuft. Furthermore, we observe remarkable species differences in the individual gene expression profiles of defined glomerular cell types that highlight translational challenges in the field and provide a guide to design translational studies.


2021 ◽  
Vol 7 (3) ◽  
pp. 42
Author(s):  
Victoria Mamontova ◽  
Barbara Trifault ◽  
Lea Boten ◽  
Kaspar Burger

Gene expression is an essential process for cellular growth, proliferation, and differentiation. The transcription of protein-coding genes and non-coding loci depends on RNA polymerases. Interestingly, numerous loci encode long non-coding (lnc)RNA transcripts that are transcribed by RNA polymerase II (RNAPII) and fine-tune the RNA metabolism. The nucleolus is a prime example of how different lncRNA species concomitantly regulate gene expression by facilitating the production and processing of ribosomal (r)RNA for ribosome biogenesis. Here, we summarise the current findings on how RNAPII influences nucleolar structure and function. We describe how RNAPII-dependent lncRNA can both promote nucleolar integrity and inhibit ribosomal (r)RNA synthesis by modulating the availability of rRNA synthesis factors in trans. Surprisingly, some lncRNA transcripts can directly originate from nucleolar loci and function in cis. The nucleolar intergenic spacer (IGS), for example, encodes nucleolar transcripts that counteract spurious rRNA synthesis in unperturbed cells. In response to DNA damage, RNAPII-dependent lncRNA originates directly at broken ribosomal (r)DNA loci and is processed into small ncRNA, possibly to modulate DNA repair. Thus, lncRNA-mediated regulation of nucleolar biology occurs by several modes of action and is more direct than anticipated, pointing to an intimate crosstalk of RNA metabolic events.

