Accurate differential analysis of transcription factor activity from gene expression

2018 ◽  
Author(s):  
Viren Amin ◽  
Murat Can Cobanoglu

Abstract We present EPEE (Effector and Perturbation Estimation Engine), a method for differential analysis of transcription factor (TF) activity from gene expression data. EPEE addresses two principal challenges in the field, namely incorporating context-specific TF-gene regulatory networks, and accounting for the fact that TF activity inference is intrinsically coupled for all TFs that share targets. Our validations in well-studied immune and cancer contexts show that addressing the overlap challenge and using state-of-the-art regulatory networks enable EPEE to consistently produce accurate results. (Accessible at: https://github.com/Cobanoglu-Lab/EPEE)

2017 ◽  
Author(s):  
Yijie Wang ◽  
Dong-Yeon Cho ◽  
Hangnoh Lee ◽  
Justin Fear ◽  
Brian Oliver ◽  
...  

Abstract Understanding gene regulation is a fundamental step towards understanding how cells function and respond to environmental cues and perturbations. An important step in this direction is the ability to infer the transcription factor (TF)-gene regulatory network (GRN). However, gene regulatory networks are typically constructed disregarding the fact that regulatory programs are conditioned on tissue type, developmental stage, sex, and other factors. Because they lack biological context specificity, these context-agnostic networks may not reveal the precise actions of genes in the specific biological system of interest. Collecting the multitude of features required for reliable construction of GRNs, such as physical features (TF binding, chromatin accessibility) and functional features (correlation of expression or chromatin patterns), for every context of interest is costly. Therefore, we need methods that can utilize knowledge about a context-agnostic network (or a network constructed in a related context) to construct a context-specific regulatory network. To address this challenge, we developed a computational approach that utilizes expression data obtained in a specific biological context (such as a particular developmental stage, sex, or tissue type) and a GRN constructed in a different but related context (alternatively, an incomplete or noisy network for the same context) to construct a context-specific GRN. Our method, NetREX, is inspired by network component analysis (NCA), which estimates TF activities and their influences on target genes given a predetermined topology of a TF-gene network. To predict a network under a different condition, NetREX removes the restriction that the topology of the TF-gene network is fixed and allows for adding and removing edges to that network. 
To solve the corresponding optimization problem, which is non-convex and non-smooth, we provide a general mathematical framework allowing use of the recently proposed Proximal Alternative Linearized Maximization technique and prove that our formulation has the properties required for convergence. We tested NetREX on simulated data and subsequently applied it to gene expression data in adult females from 99 hemizygotic lines of the Drosophila deletion (DrosDel) panel. The networks predicted by NetREX showed higher biological consistency than alternative approaches. In addition, we used the list of recently identified targets of the Doublesex (DSX) transcription factor to demonstrate the predictive power of our method.
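
The fixed-topology idea behind NCA that the abstract refers to can be illustrated with a minimal sketch. It assumes the common bilinear form E ≈ S·A, where E is genes × samples expression, S is the genes × TFs influence matrix whose zero pattern is given by a known network mask, and A is the TFs × samples activity matrix. The function name `nca_als`, the toy dimensions, and the plain alternating least squares updates are illustrative assumptions; this is not the NetREX algorithm (which also edits the topology) and not necessarily the exact procedure of any published NCA implementation.

```python
import numpy as np

def nca_als(E, mask, n_iter=200, seed=0):
    """Hypothetical helper: fit E ~= S @ A with S restricted to the mask's support."""
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    # Random start that already respects the fixed network topology.
    S = rng.standard_normal((n_genes, n_tfs)) * mask
    A = np.zeros((n_tfs, E.shape[1]))
    for _ in range(n_iter):
        # Step 1: with S fixed, solve for TF activities A by least squares.
        A = np.linalg.lstsq(S, E, rcond=None)[0]
        # Step 2: with A fixed, refit each gene using only its allowed TFs.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size:
                S[g, idx] = np.linalg.lstsq(A[idx].T, E[g], rcond=None)[0]
    return S, A

# Toy data (assumed, noise-free): 6 genes, 2 TFs, 8 samples.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [1, 1]], dtype=bool)
S_true = rng.standard_normal(mask.shape) * mask
A_true = rng.standard_normal((2, 8))
E = S_true @ A_true

S_hat, A_hat = nca_als(E, mask)
rel_err = np.linalg.norm(E - S_hat @ A_hat) / np.linalg.norm(E)
print("relative reconstruction error:", rel_err)
```

Note that S and A are only identifiable up to a per-TF scaling in this formulation, so the fitted values should be compared to a reference through the product S·A or after normalization. Entries of S outside the mask stay exactly zero throughout, which is the restriction NetREX is described as relaxing.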


2019 ◽  
Vol 35 (23) ◽  
pp. 5018-5029
Author(s):  
Viren Amin ◽  
Didem Ağaç ◽  
Spencer D Barnes ◽  
Murat Can Çobanoğlu

Abstract Motivation Activity of transcriptional regulators is crucial in elucidating the mechanism of phenotypes. However, regulatory activity hypotheses are difficult to test experimentally. Therefore, we need accurate and reliable computational methods for regulator activity inference. There is extensive work in this area; however, current methods have difficulty with one or more of the following: resolving activity of TFs with overlapping regulons, reflecting known regulatory relationships, or flexible modeling of TF activity over the regulon. Results We present Effector and Perturbation Estimation Engine (EPEE), a method for differential analysis of transcription factor (TF) activity from gene expression data. EPEE addresses each of these principal challenges in the field. Firstly, EPEE collectively models all TF activity in a single multivariate model, thereby accounting for the intrinsic coupling among TFs that share targets, which is highly frequent. Secondly, EPEE incorporates context-specific TF-gene regulatory networks and therefore adapts the analysis to each biological context. Finally, EPEE can flexibly reflect different regulatory activity of a single TF among its potential targets. This flexibility allows it to implicitly recover other regulatory influences such as co-activators or repressors. We comparatively validated EPEE in 15 datasets from three well-studied contexts, namely immunology, cancer, and hematopoiesis. We show that addressing the aforementioned challenges enables EPEE to outperform alternative methods and reliably produce accurate results. Availability and implementation https://github.com/Cobanoglu-Lab/EPEE. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Neel Patel ◽  
William S. Bush

Abstract Background Transcriptional regulation is complex, requiring multiple cis (local) and trans acting mechanisms working in concert to drive gene expression, with disruption of these processes linked to multiple diseases. Previous computational attempts to understand the influence of regulatory mechanisms on gene expression have used prediction models containing input features derived from cis regulatory factors. However, local chromatin looping and trans-acting mechanisms are known to also influence transcriptional regulation, and their inclusion may improve model accuracy and interpretation. In this study, we create a general model of transcription factor influence on gene expression by incorporating both cis and trans gene regulatory features. Results We describe a computational framework to model gene expression for GM12878 and K562 cell lines. This framework weights the impact of transcription factor-based regulatory data using multi-omics gene regulatory networks to account for both cis and trans acting mechanisms, and measures of the local chromatin context. These prediction models perform significantly better compared to models containing cis-regulatory features alone. Models that additionally integrate long distance chromatin interactions (or chromatin looping) between distal transcription factor binding regions and gene promoters also show improved accuracy. As a demonstration of their utility, effect estimates from these models were used to weight cis-regulatory rare variants for sequence kernel association test analyses of gene expression. Conclusions Our models generate refined effect estimates for the influence of individual transcription factors on gene expression, allowing characterization of their roles across the genome. This work also provides a framework for integrating multiple data types into a single model of transcriptional regulation.


Biotechnology ◽  
2019 ◽  
pp. 265-304
Author(s):  
David Correa Martins Jr. ◽  
Fabricio Martins Lopes ◽  
Shubhra Sankar Ray

The inference of Gene Regulatory Networks (GRNs) is a very challenging problem which has attracted increasing attention since the development of high-throughput sequencing and gene expression measurement technologies. Many models and algorithms have been developed to identify GRNs, mainly using gene expression profiles as the data source. As gene expression data usually have a limited number of samples and inherent noise, the integration of gene expression with several other sources of information can be vital for accurately inferring GRNs. For instance, some prior information about the overall topological structure of the GRN can guide inference techniques toward better results. In addition to gene expression data, biological information from heterogeneous data sources has recently been integrated by GRN inference methods as well. The objective of this chapter is to present an overview of GRN inference models and techniques, with a focus on the incorporation of prior information such as global and local topological features and the integration of several heterogeneous data sources.


2011 ◽  
Vol 28 (2) ◽  
pp. 214-221 ◽  
Author(s):  
Geert Geeven ◽  
Ronald E. van Kesteren ◽  
August B. Smit ◽  
Mathisca C. M. de Gunst
