funMotifs: Tissue-specific transcription factor motifs

2019 ◽  
Author(s):  
Husen M. Umer ◽  
Karolina Smolinska-Garbulowska ◽  
Nour-al-dain Marzouka ◽  
Zeeshan Khaliq ◽  
Claes Wadelius ◽  
...  

ABSTRACT Transcription factors (TFs) regulate gene expression by binding to specific sequences known as motifs. A bottleneck in our knowledge of gene regulation is the lack of functional characterization of TF motifs, which is mainly due to the large number of predicted TF motifs and the tissue specificity of TF binding. We built a framework to identify tissue-specific functional motifs (funMotifs) across the genome based on thousands of annotation tracks obtained from large-scale genomics projects including ENCODE, RoadMap Epigenomics and FANTOM. The annotations were weighted using a logistic regression model trained on regulatory elements obtained from massively parallel reporter assays. Overall, genome-wide predicted motifs of 519 TFs were characterized across fifteen tissue types. funMotifs summarizes the weighted annotations into a functional activity score for each of the predicted motifs. funMotifs enabled us to measure tissue specificity of different TFs and to identify candidate functional variants in TF motifs from the 1000 Genomes Project, the GTEx project, the GWAS catalogue, and in 2,515 cancer samples from the Pan-cancer analysis of whole genome sequences (PCAWG) cohort. To enable researchers to annotate genomic variants or regions of interest, we have implemented a command-line pipeline and a web-based interface that is publicly accessible at http://bioinf.icm.uu.se/funmotifs.
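The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes binary annotations and a plain weighted sum as the score. The feature names, toy training data, and the `activity_score` definition are hypothetical and are not taken from the funMotifs pipeline.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def activity_score(annotations, w):
    """Weighted sum of a motif's annotations (assumed score definition)."""
    return sum(wj * xj for wj, xj in zip(w, annotations))


# Toy training set standing in for reporter-assay elements. Columns are
# hypothetical binary annotations: open chromatin, TF ChIP-seq peak,
# active-enhancer chromatin state. Labels: 1 = active, 0 = inactive.
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(X, y)
score_active = activity_score([1, 1, 1], weights)
score_inactive = activity_score([0, 0, 0], weights)
print(weights, score_active, score_inactive)
```

In this toy setup the learned weights play the role of annotation weights, so a motif overlapping more of the informative annotations receives a higher score.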

2016 ◽  
Author(s):  
Yue Li ◽  
Jose Davila-Velderrain ◽  
Manolis Kellis

Abstract Dissecting the physiological circuitry underlying diverse human complex traits associated with heritable common mutations is an ongoing effort. The primary challenge involves identifying the relevant cell types and the causal variants among the vast majority of the associated mutations in the noncoding regions. To address this challenge, we developed an efficient probabilistic framework. First, we propose a sparse group-guided learning algorithm to infer cell-type-specific enrichments. Second, we propose a Bayesian fine-mapping model that incorporates the sparse enrichments as Bayesian priors to infer risk variants. Using the proposed framework to analyze 32 complex human traits revealed meaningful tissue-specific epigenomic enrichments indicative of the relevant disease pathologies. The prioritized variants exhibit prominent tissue-specific epigenomic signatures and significant enrichments for eQTL and conserved elements. Together, we demonstrate the general benefits of the proposed integrative framework in elucidating meaningful tissue-specific epigenomic elements from large-scale correlated annotations and the implicated functional variants for future experimental interrogation.
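How annotation-derived priors can shift fine-mapping results is easiest to see in a simplified setting. The sketch below assumes exactly one causal variant per locus and hypothetical per-variant Bayes factors. It is a textbook simplification, not the model proposed in the abstract.

```python
def posterior_inclusion(bayes_factors, priors):
    """Posterior probability that each variant is the causal one,
    assuming exactly one causal variant in the locus."""
    weighted = [bf * p for bf, p in zip(bayes_factors, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical association evidence for three variants in one locus.
bayes_factors = [50.0, 40.0, 2.0]

# Flat prior versus a prior that favours the second variant, e.g. because
# it lies in an annotation enriched for the trait (assumed values).
flat_prior = [1 / 3, 1 / 3, 1 / 3]
informed_prior = [0.1, 0.8, 0.1]

pip_flat = posterior_inclusion(bayes_factors, flat_prior)
pip_informed = posterior_inclusion(bayes_factors, informed_prior)
print(pip_flat)
print(pip_informed)
```

With the flat prior the first variant ranks highest. With the informed prior the second variant overtakes it, which is the kind of reprioritization an enrichment-based prior is meant to provide.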


2017 ◽  
Vol 43 (6) ◽  
pp. 789
Author(s):  
Rui WANG ◽  
Meng-Lin ZHU ◽  
Fang-Yuan GAO ◽  
Juan-Sheng REN ◽  
Xian-Jun LU ◽  
...  

Circulation ◽  
2012 ◽  
Vol 125 (suppl_10) ◽  
Author(s):  
Christy L Avery ◽  
Praveen Sethupathy ◽  
Steven Buyske ◽  
Q. C He ◽  
Dan Y Lin ◽  
...  

The QT interval (QT) is a heritable trait and its prolongation is an established risk factor for ventricular tachyarrhythmia and sudden cardiac death. Most genetic studies of QT have examined populations of European ancestry, although the increased genetic diversity in populations of African descent provides opportunity for fine-mapping, which can help narrow association signals and identify candidates for functional characterization. We examined whether eleven previously identified QT loci comprising 6,681 variants on the Illumina Metabochip array were associated with QT in 7,516 African American participants from the Atherosclerosis Risk in Communities study and Women’s Health Initiative clinical trial. Among associated loci, we used conditional analyses and queried bioinformatics databases to identify and functionally categorize signals. We identified nine of the eleven QT loci in African American populations (P < 0.0045 under an additive genetic model adjusting for ancestry and demographic characteristics: NOS1AP, ATP1B1, SCN5A, SLC35F1, KCNH2, KCNQ1, LITAF, NDRG4, and RFFL). We also identified two independent secondary signals in NOS1AP and ATP1B1 (P < 7.4 × 10⁻⁶). Conditional analyses adjusting for published loci in European populations demonstrated that eight of these eleven SNPs (nine primary; two secondary) were independent of previously reported SNPs. We then performed the first bioinformatics-based functional characterization of QT loci using the eleven primary and secondary variants and SNPs in strong LD (r² > 0.5) among these African American participants. Only the SCN5A locus included a non-synonymous coding variant (rs1805124, H558R, r² = 0.7 with primary SNP rs9871385, P = 4.7 × 10⁻⁴). The remaining ten loci harbored variants located exclusively within non-coding regions.
Specifically, three contained SNPs within candidate long-range regulatory elements in human cardiomyocytes, five were in or near annotated promoter regions, and the remaining two were in un-annotated, but highly conserved non-coding elements. Several of the QT risk alleles at these SNPs significantly alter the predicted binding affinity for transcription factors, such as TBX5 and AhR, which have been previously implicated in cardiac formation and function. In summary, the findings provide compelling evidence that the same genes influence variation in QT across global populations and that additional, independent signals exist in African Americans. Moreover, of those SNPs identified as strong candidates for functional evaluation, the majority implicate gene regulatory dysfunction in QT prolongation.
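The two significance thresholds quoted in this abstract are numerically consistent with a Bonferroni correction at α = 0.05, once over the eleven loci and once over the 6,681 variants. The abstract does not name the correction, so the arithmetic below is an inference rather than the authors' stated method.

```python
alpha = 0.05        # assumed family-wise error rate
n_loci = 11         # previously identified QT loci
n_variants = 6681   # variants genotyped across those loci

locus_threshold = alpha / n_loci        # roughly 0.0045
variant_threshold = alpha / n_variants  # roughly 7.5e-6

print(f"per-locus threshold:   {locus_threshold:.4f}")
print(f"per-variant threshold: {variant_threshold:.2e}")
```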


2021 ◽  
Author(s):  
Sneha Gopalan ◽  
Yuqing Wang ◽  
Nicholas W. Harper ◽  
Manuel Garber ◽  
Thomas G Fazzio

Methods derived from CUT&RUN and CUT&Tag enable genome-wide mapping of the localization of proteins on chromatin from as few as one cell. These and other mapping approaches focus on one protein at a time, preventing direct measurements of co-localization of different chromatin proteins in the same cells and requiring prioritization of targets where samples are limiting. Here we describe multi-CUT&Tag, an adaptation of CUT&Tag that overcomes these hurdles by using antibody-specific barcodes to simultaneously map multiple proteins in the same cells. Highly specific multi-CUT&Tag maps of histone marks and RNA Polymerase II uncovered sites of co-localization in the same cells, active and repressed genes, and candidate cis-regulatory elements. Single-cell multi-CUT&Tag profiling facilitated identification of distinct cell types from a mixed population and characterization of cell type-specific chromatin architecture. In sum, multi-CUT&Tag increases the information content per cell of epigenomic maps, facilitating direct analysis of the interplay of different proteins on chromatin.


2020 ◽  
Author(s):  
Yanan Song ◽  
Hongli Cui ◽  
Ying Shi ◽  
Jinai Xue ◽  
Chunli Ji ◽  
...  

Abstract Background: WRKY transcription factors are a superfamily of regulators involved in diverse biological processes and stress responses in plants. However, knowledge of the WRKY family is limited in camelina (Camelina sativa), an important Brassicaceae oil crop with strong tolerance to various stresses. Here, a genome-wide characterization of WRKY proteins was performed to examine their gene structures, phylogenetics, expression, conserved motif organization, and functional annotation, in order to identify candidate WRKYs that mediate stress resistance in camelina. Results: A total of 242 CsWRKY proteins, encoded by 224 gene loci unevenly distributed across the chromosomes, were identified and classified into three groups by phylogenetic analysis according to their WRKY domains and zinc finger motifs. Fifteen CsWRKY gene loci generated 33 splice variants. Orthologous WRKY gene pairs were identified, with 173 pairs between the C. sativa and Arabidopsis genomes and 282 pairs between C. sativa and B. napus. A total of 137 segmental duplication events, but no tandem duplications, were observed in the camelina genome. Ten major conserved motifs were examined, with WRKYGQK the most conserved, and several variants of it were present in many CsWRKYs. Expression analysis revealed that more than half of the CsWRKY genes were expressed constitutively, and a subset showed tissue-specific expression. Notably, 11 CsWRKY genes exhibited significant expression changes in seedlings under cold, salt, and drought stress, showing preferentially inducible expression patterns in response to these stresses. Conclusions: The present study describes a detailed analysis of the CsWRKY gene family and its expression profiles in twelve tissues and under several stress conditions. Segmental duplication is the major force behind the large expansion of this gene family, and CsWRKY proteins have evolved under strong purifying pressure.
CsWRKY proteins play important roles in plant development, with differential functions in different tissues. Notably, eleven CsWRKYs, particularly five alternatively spliced isoforms, were identified as possible key players in mediating plant responses to various stresses. Overall, our results provide a foundation for understanding the roles of CsWRKYs and the precise mechanism through which they regulate stress resistance, as well as for developing stress-tolerant cultivars of Cruciferae crops.
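Locating the conserved WRKYGQK heptapeptide and its variants in a protein sequence can be sketched with a simple pattern scan. The pattern below allows any residue at the sixth position only, which is a simplifying assumption, and the example sequence is invented for illustration. Neither comes from the study.

```python
import re

CANONICAL = "WRKYGQK"
# Assumed variant definition: any residue in place of Q at position 6.
HEPTAD_PATTERN = re.compile(r"WRKYG[A-Z]K")


def find_wrky_heptads(protein: str):
    """Return (start, heptad, is_canonical) for each match in the sequence."""
    return [
        (m.start(), m.group(), m.group() == CANONICAL)
        for m in HEPTAD_PATTERN.finditer(protein)
    ]


# Invented toy sequence with one canonical heptad and one variant.
toy_protein = "MSDWRKYGQKAAAPWRKYGKKTT"
hits = find_wrky_heptads(toy_protein)
print(hits)
```

A real analysis would typically use profile-based domain searches rather than a fixed pattern, since the full WRKY domain spans about 60 residues.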


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11508
Author(s):  
Yubing Yong ◽  
Yue Zhang ◽  
Yingmin Lyu

Background. We previously analyzed the cold-responsive transcriptome in the mature leaves of tiger lily (Lilium lancifolium) by gene co-expression network identification. The results revealed that a ZFHD gene, annotated as encoding a zinc finger homeodomain protein, may play an essential regulatory role in the tiger lily response to cold stress. Methods. In this study, we further investigated the response of this ZFHD gene (termed LlZFHD4) to osmotic stresses, including cold, salt, and water stresses, and to abscisic acid (ABA). Based on the transcriptome sequences, the coding region and 5′ promoter region of LlZFHD4 were cloned from mature tiger lily leaves. Stress response analysis was performed under continuous 4 °C, NaCl, PEG, and ABA treatments. Functional characterization of LlZFHD4 was conducted in transgenic Arabidopsis, tobacco, and yeast. Results. LlZFHD4 encodes a nuclear-localized protein consisting of 180 amino acids. The N-terminal region of LlZFHD4 has transcriptional activation activity in yeast. The 4 °C, NaCl, PEG, and ABA treatments induced the expression of LlZFHD4. Several stress- or hormone-responsive cis-acting regulatory elements (T-Box, BoxI, and ARF) and binding sites of transcription factors (MYC, DRE and W-box) were found in the core promoter region (789 bp) of LlZFHD4. In addition, the GUS gene driven by the LlZFHD4 promoter was up-regulated by cold, NaCl, water stresses, and ABA in Arabidopsis. Overexpression of LlZFHD4 improved cold and drought tolerance in transgenic Arabidopsis; a higher survival rate and better osmotic adjustment capacity were observed in LlZFHD4 transgenic plants than in wild type (WT) plants under 4 °C and PEG conditions. However, LlZFHD4 transgenic plants were less tolerant to salinity and hypersensitive to ABA compared to WT plants. The transcript levels of stress- and ABA-responsive genes were more strongly up-regulated in LlZFHD4 transgenic Arabidopsis than in WT.
These results indicate that LlZFHD4 is involved in the ABA signaling pathway and plays a crucial role in regulating the response of tiger lily to cold, salt and water stresses.


2016 ◽  
Vol 19 (11) ◽  
pp. 1454-1462 ◽  
Author(s):  
Arjun Krishnan ◽  
Ran Zhang ◽  
Victoria Yao ◽  
Chandra L Theesfeld ◽  
Aaron K Wong ◽  
...  
