MultiPLIER: a transfer learning framework for transcriptomics reveals systemic features of rare disease

2018 ◽  
Author(s):  
Jaclyn N. Taroni ◽  
Peter C. Grayson ◽  
Qiwen Hu ◽  
Sean Eddy ◽  
Matthias Kretzler ◽  
...  

Summary: Unsupervised machine learning methods provide a promising means to analyze and interpret large datasets. However, most gene expression datasets generated by individual researchers remain too small to fully benefit from these methods. In the case of rare diseases, there may be too few cases available, even when multiple studies are combined. We trained a Pathway Level Information ExtractoR (PLIER) model on a large public data compendium composed of multiple experiments, tissues, and biological conditions. We then transferred the model to small rare disease datasets in an approach we term MultiPLIER. Models constructed from large, diverse public data i) included features that aligned well to important biological factors; ii) were more comprehensive than those constructed from individual datasets or conditions; and iii) transferred to rare disease datasets, where the models describe biological processes related to disease severity more effectively than models trained specifically on those datasets.
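The transfer step described in the summary can be pictured as projecting a small dataset into a latent space learned elsewhere. The following is a minimal sketch, assuming the trained model is summarized by a gene-by-latent-variable loading matrix `Z` and that the projection is a ridge-regularized least-squares fit; the summary does not state the projection formula, so the function name `project_into_latent_space`, the penalty `l2`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def project_into_latent_space(Z, Y_new, l2=1.0):
    """Project new expression data onto previously learned loadings.

    Z      : (genes x latent variables) loadings, assumed to come from a
             model trained on a large compendium (hypothetical here).
    Y_new  : (genes x samples) expression matrix for the small dataset,
             with rows in the same gene order as Z.
    l2     : ridge penalty (an assumed hyperparameter).

    Returns B = (Z^T Z + l2 * I)^-1 Z^T Y_new, shape (latent variables x samples).
    """
    k = Z.shape[1]
    gram = Z.T @ Z + l2 * np.eye(k)
    return np.linalg.solve(gram, Z.T @ Y_new)


# Toy data standing in for a large-compendium model and a small dataset.
rng = np.random.default_rng(0)
n_genes, n_lvs, n_samples = 50, 3, 4
Z = np.abs(rng.normal(size=(n_genes, n_lvs)))      # hypothetical learned loadings
B_true = rng.normal(size=(n_lvs, n_samples))       # latent values used to simulate data
Y_small = Z @ B_true                               # simulated small dataset

# With a tiny penalty the projection recovers the simulated latent values.
B_hat = project_into_latent_space(Z, Y_small, l2=1e-8)
```

The design point is that only `B` is estimated for the small dataset; `Z` stays fixed, so the latent variables keep the meaning they acquired on the larger training data.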

2019 ◽  
Vol 16 (7) ◽  
pp. 607-610 ◽  
Author(s):  
Weiguang Mao ◽  
Elena Zaslavsky ◽  
Boris M. Hartmann ◽  
Stuart C. Sealfon ◽  
Maria Chikina

2021 ◽  
Author(s):  
Yoo-Ah Kim ◽  
Ermin Hodzic ◽  
Ariella Saslafsky ◽  
Damian Wojtowicz ◽  
Bayarbaatar Amgalan ◽  
...  

Background: Environmental exposures such as smoking are widely recognized risk factors in the emergence of lung diseases such as lung cancer and acute respiratory distress syndrome (ARDS). However, the strength of environmental exposures is difficult to measure, making it challenging to understand their impacts. On the other hand, some COVID-19 patients develop ARDS in an unfavorable disease progression, and smoking has been suggested as a potential risk factor among others. Yet initial studies on COVID-19 cases reported contradictory results on the effects of smoking on the disease. Some suggest that smoking might have a protective effect against it, while other studies report an increased risk. A better understanding of how exposure to smoking and other environmental factors affects biological processes relevant to SARS-CoV-2 infection and unfavorable disease progression is needed. Approach: In this study, we utilize mutational signatures associated with environmental factors as sensors of their exposure level. Many environmental factors, including smoking, are mutagenic and leave characteristic patterns of mutations, called mutational signatures, in affected genomes. We postulated that analyzing mutational signatures, combined with gene expression, can shed light on the impact of mutagenic environmental factors on biological processes. In particular, we utilized mutational signatures from the lung adenocarcinoma (LUAD) data set collected in TCGA to investigate the role of environmental factors in COVID-19 vulnerabilities. Integrating mutational signatures with gene expression in normal tissues and using a pathway-level analysis, we examined how exposure to smoking and other mutagenic environmental factors affects the infectivity of the virus and disease progression.
Results: By delineating changes associated with smoking in pathway-level gene expression and cell type proportions, our study demonstrates that mutational signatures can be utilized to study the impact of exogenous mutagenic factors on them. Consistent with previous findings, our analysis showed that the smoking mutational signature (SBS4) is associated with activation of cytokine-mediated signaling pathways, leading to inflammatory responses. Smoking-related changes in cell composition were also observed, including the correlation of SBS4 with the expansion of goblet cells. On the other hand, increased proportions of basal cells and decreased proportions of ciliated cells were associated with the strength of a different mutational signature (SBS5), which is present abundantly but not exclusively in smokers. In addition, we found that smoking increases the expression levels of genes that are upregulated in severe COVID-19 cases. Jointly, these results suggest an unfavorable impact of smoking on disease progression and also provide novel findings on how smoking impacts biological processes in the lung.
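The idea of using a mutational signature as an exposure sensor can be illustrated by correlating per-sample signature strength with a pathway-level expression score. This is a minimal sketch, assuming a plain Pearson correlation as a stand-in; the abstract does not say which statistic was used, and the `pearson` helper and the sample values below are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical per-sample values: signature strength and a pathway score.
sbs4_exposure = [0.0, 10.0, 20.0, 30.0, 40.0]
pathway_score = [1.0, 1.5, 2.0, 2.5, 3.0]   # constructed to rise linearly with exposure

r = pearson(sbs4_exposure, pathway_score)
```

A coefficient near 1 would indicate that the pathway score rises with signature strength across samples; on real data a rank-based statistic and multiple-testing correction would likely be needed as well.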


2019 ◽  
Vol 23 (15) ◽  
pp. 1663-1670 ◽  
Author(s):  
Chunyan Ao ◽  
Shunshan Jin ◽  
Yuan Lin ◽  
Quan Zou

Protein methylation is an important and reversible post-translational modification that regulates many biological processes in cells. It occurs mainly on lysine and arginine residues and is involved in many important biological processes, including transcriptional activity, signal transduction, and the regulation of gene expression. Protein methylation and its regulatory enzymes are related to a variety of human diseases, so improved identification of methylation sites is useful for designing drugs for these diseases. In this review, we systematically summarize and analyze the tools used for the prediction of protein methylation sites on arginine and lysine residues over the last decade.


Author(s):  
Rianne R. Campbell ◽  
Siwei Chen ◽  
Joy H. Beardwood ◽  
Alberto J. López ◽  
Lilyana V. Pham ◽  
...  

Abstract: During the initial stages of drug use, cocaine-induced neuroadaptations within the ventral tegmental area (VTA) are critical for drug-associated cue learning and drug reinforcement processes. These neuroadaptations occur, in part, from alterations to the transcriptome. Although cocaine-induced transcriptional mechanisms within the VTA have been examined, various regimens and paradigms have been employed to examine candidate target genes. In order to identify key genes and biological processes regulating cocaine-induced processes, we employed genome-wide RNA-sequencing to analyze transcriptional profiles within the VTA from male mice that underwent one of four commonly used paradigms: acute home cage injections of cocaine, chronic home cage injections of cocaine, cocaine conditioning, or intravenous self-administration of cocaine. We found that cocaine alters distinct sets of VTA genes within each exposure paradigm. Using behavioral measures from cocaine self-administering mice, we also found several genes whose expression patterns correlate with cocaine intake. In addition to overall gene expression levels, we identified several predicted upstream regulators of cocaine-induced transcription shared across all paradigms. Although distinct gene sets were altered across cocaine exposure paradigms, we found, from Gene Ontology (GO) term analysis, that biological processes important for energy regulation and synaptic plasticity were affected across all cocaine paradigms. Coexpression analysis also identified gene networks that are altered by cocaine. These data indicate that cocaine alters networks enriched with glial cell markers of the VTA that are involved in gene regulation and synaptic processes. Our analyses demonstrate that transcriptional changes within the VTA depend on the route, dose and context of cocaine exposure, and highlight several biological processes affected by cocaine.
Overall, these findings provide a unique resource of gene expression data for future studies examining novel cocaine gene targets that regulate drug-associated behaviors.


2021 ◽  
Vol 16 (1) ◽  
pp. 1-24
Author(s):  
Yaojin Lin ◽  
Qinghua Hu ◽  
Jinghua Liu ◽  
Xingquan Zhu ◽  
Xindong Wu

In multi-label learning, label correlations commonly exist in the data. Such correlation not only provides useful information, but also imposes significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to explore label-specific features from the training data, and uses features highly customized to the multi-label set for learning. While such feature embedding methods have demonstrated good performance, the creation of the feature embedding space is only based on a single label, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, multi-label-specific feature space ensemble (MULFE), takes into consideration label-specific features, label correlation, and the weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label’s negative and positive instances, MULFE first creates features customized to each label. After that, MULFE utilizes the label correlation to optimize the margin distribution of the base classifiers, which are induced by the related label-specific feature spaces. By combining multiple label-specific features, label-correlation-based weighting, and ensemble learning, MULFE achieves the goal of maximum-margin multi-label classification through the underlying optimization framework. Empirical studies on 10 public data sets demonstrate the effectiveness of MULFE.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Nathan J. VanDusen ◽  
Julianna Y. Lee ◽  
Weiliang Gu ◽  
Catalina E. Butler ◽  
Isha Sethi ◽  
...  

Abstract: The forward genetic screen is a powerful, unbiased method to gain insights into biological processes, yet this approach has infrequently been used in vivo in mammals because of high resource demands. Here, we use in vivo somatic Cas9 mutagenesis to perform an in vivo forward genetic screen in mice to identify regulators of cardiomyocyte (CM) maturation, the coordinated changes in phenotype and gene expression that occur in neonatal CMs. We discover and validate a number of transcriptional regulators of this process. Among these are RNF20 and RNF40, which form a complex that monoubiquitinates H2B on lysine 120. Mechanistic studies indicate that this epigenetic mark controls dynamic changes in gene expression required for CM maturation. These insights into CM maturation will inform efforts in cardiac regenerative medicine. More broadly, our approach will enable unbiased forward genetics across mammalian organ systems.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Wiruntita Chankeaw ◽  
Sandra Lignier ◽  
Christophe Richard ◽  
Theodoros Ntallaris ◽  
Mariam Raliou ◽  
...  

Abstract. Background: A number of studies have examined mRNA expression profiles of bovine endometrium at estrus and around the peri-implantation period of pregnancy. However, to date, these studies have been performed on the whole endometrium, which is a complex tissue. Consequently, knowledge of cell-specific gene expression remains limited when analysis is performed on whole endometrium, which restricts the relevance of the results of gene expression studies. Thus, the aim of this study was to characterize the specific transcriptome of the three main cell types of the bovine endometrium at day 15 of the estrus cycle. Results: In the RNA-Seq analysis, the number of expressed genes detected over 10 transcripts per million was 6622, 7814 and 8242 for LE, GE and ST, respectively. ST exclusively expressed 1236 genes, while only 551 transcripts were specific to GE and 330 were specific to LE. For ST, over-represented biological processes included many regulation processes and response to stimulus, cell communication and cell adhesion, extracellular matrix organization, and developmental process. For GE, cilium organization, cilium movement, protein localization to cilium and microtubule-based process were the only four main biological processes enriched. For LE, over-represented biological processes were enzyme-linked receptor protein signaling pathway, cell-substrate adhesion and circulatory system process. Conclusion: The data show that each endometrial cell type has a distinct molecular signature and provide a significantly improved overview of the biological processes supported by specific cell types. The most interesting result is that stromal cells express more genes than the two epithelial types and are associated with a greater number of pathways and ontology terms.
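The counts reported in the Results (genes detected over 10 transcripts per million, and genes exclusive to one cell type) correspond to a threshold followed by set differences. The sketch below shows that bookkeeping under stated assumptions: the gene names and TPM values are invented, and the helper names `expressed_genes` and `exclusive_genes` are hypothetical rather than taken from the study's pipeline.

```python
TPM_THRESHOLD = 10.0  # "detected over 10 transcripts per million"


def expressed_genes(tpm_by_gene, threshold=TPM_THRESHOLD):
    """Return the set of genes whose TPM is strictly above the threshold."""
    return {gene for gene, tpm in tpm_by_gene.items() if tpm > threshold}


def exclusive_genes(expressed_by_type):
    """For each cell type, return genes expressed in it and in no other type."""
    exclusive = {}
    for cell_type, genes in expressed_by_type.items():
        others = set().union(
            *(g for ct, g in expressed_by_type.items() if ct != cell_type)
        )
        exclusive[cell_type] = genes - others
    return exclusive


# Invented TPM values for three cell types (labels follow the abstract).
tpm = {
    "LE": {"geneA": 50.0, "geneB": 12.0, "geneC": 2.0, "geneD": 0.0},
    "GE": {"geneA": 30.0, "geneB": 3.0, "geneC": 25.0, "geneD": 1.0},
    "ST": {"geneA": 40.0, "geneB": 1.0, "geneC": 0.5, "geneD": 15.0},
}

expressed = {cell_type: expressed_genes(values) for cell_type, values in tpm.items()}
exclusive = exclusive_genes(expressed)
```

In this toy input `geneA` passes the threshold in all three cell types and is therefore exclusive to none, while each of the other genes is exclusive to a single type.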


2021 ◽  
Vol 22 (2) ◽  
pp. 522
Author(s):  
Noreen Falak ◽  
Qari Muhammad Imran ◽  
Adil Hussain ◽  
Byung-Wook Yun

Plants are in continuous conflict with environmental constraints, and their sessile nature demands a fine-tuned, well-designed defense mechanism that can cope with a multitude of biotic and abiotic assaults. Therefore, plants have developed innate immunity, R-gene-mediated resistance, and systemic acquired resistance to ensure their survival. Transcription factors (TFs) are among the most important genetic components for the regulation of gene expression and several other biological processes. They bind to specific sequences in the DNA called transcription factor binding sites (TFBSs) that are present in the regulatory regions of genes. Depending on the environmental conditions, TFs can either enhance or suppress transcriptional processes. In the last couple of decades, nitric oxide (NO) has emerged as a crucial molecule for signaling and regulating biological processes. Here, we give an overview of the plant defense system, the role of TFs in mediating the defense response, and how NO can manipulate transcriptional changes, including direct post-translational modifications of TFs. We also propose that NO might regulate gene expression by regulating the recruitment of RNA polymerase during transcription.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zhihao Fang ◽  
Yiqiu Hu ◽  
Jinhui Hu ◽  
Yanqin Huang ◽  
Shu Zheng ◽  
...  

Abstract: As the predominant modification in RNA, N6-methyladenosine (m6A) has attracted increasing attention in the past few years since it plays vital roles in many biological processes. This chemical modification is dynamic, reversible and regulated by several methyltransferases, demethylases and proteins that recognize m6A modification. m6A modification exists in messenger RNA and affects its splicing, nuclear export, stability, decay, and translation, thereby modulating gene expression. In addition, the existence of m6A in noncoding RNAs (ncRNAs) could also directly or indirectly regulate gene expression. Colorectal cancer (CRC) is a common cancer around the world with high mortality. Increasing evidence has shown that changes in m6A level and the dysregulation of m6A regulatory proteins are implicated in CRC carcinogenesis and progression. However, the underlying mechanisms by which m6A modification regulates CRC remain elusive, and a better understanding of these mechanisms will benefit diagnosis and therapy. In the present review, the latest studies on the dysregulation of m6A and its regulators in CRC are summarized. We focus on the crucial roles of m6A modification in the carcinogenesis and development of CRC. Moreover, we discuss the potential applications of m6A modification in CRC diagnosis and therapeutics.


2009 ◽  
Vol 2009 ◽  
pp. 1-7 ◽  
Author(s):  
Constance Schmelzer ◽  
Mitsuaki Kitano ◽  
Gerald Rimbach ◽  
Petra Niklowitz ◽  
Thomas Menke ◽  
...  

MicroRNAs (miRs) are involved in key biological processes via suppression of gene expression at posttranscriptional levels. Given their overarching regulatory functions, subtle modulation of miR expression by certain compounds or nutrients is desirable under particular conditions. Bacterial lipopolysaccharide (LPS) induces a reactive oxygen species-/NF-κB-dependent pathway which increases the expression of the anti-inflammatory miR-146a. We hypothesized that this induction could be modulated by the antioxidant ubiquinol-10. Preincubation of human monocytic THP-1 cells with ubiquinol-10 reduced the LPS-induced expression level of miR-146a to 78.9±13.22%. In liver samples of mice injected with LPS, supplementation with ubiquinol-10 led to a reduction of LPS-induced miR-146a expression to 78.12±21.25%. From these consistent in vitro and in vivo data, we conclude that ubiquinol-10 may fine-tune the inflammatory response via moderate reduction of miR-146a expression.

