Quantification and discovery of sequence determinants of protein per mRNA amount in 29 human tissues

bioRxiv ◽

10.1101/353763 ◽

2018 ◽

Author(s):

Basak Eraslan ◽

Dongxue Wang ◽

Mirjana Gusic ◽

Holger Prokisch ◽

Björn Hallström ◽

...

Keyword(s):

Dynamic Range ◽

Large Fraction ◽

Regulatory Elements ◽

Integrative Model ◽

Human Tissues ◽

Sequence Motifs ◽

Transcriptional Regulatory Elements ◽

Association Testing ◽

Competition Binding ◽

Protein Sequence Motifs

Abstract Despite their importance in determining protein abundance, a comprehensive catalogue of sequence features controlling protein-to-mRNA (PTR) ratios and a quantification of their effects is still lacking. Here we quantified PTR ratios for 11,575 proteins across 29 human tissues using matched transcriptomes and proteomes. We analyzed the contribution of known sequence determinants of protein synthesis and degradation and 15 novel mRNA and protein sequence motifs that we found by association testing. While the dynamic range of PTR ratios spans more than 2 orders of magnitude, our integrative model predicts PTR ratios at a median precision of 3.2-fold. A reporter assay provided significant functional support for two novel UTR motifs, and a proteome-wide competition-binding assay identified motif-specific bound proteins for one motif. Moreover, our direct comparison of protein to RNA levels led to a new metric of codon optimality. Altogether, this study shows that a large fraction of PTR ratio variance across genes can be predicted from sequence and identifies many new candidate post-transcriptional regulatory elements in the human genome.
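
The two quantities in this abstract, the PTR ratio and a "fold" precision, can be illustrated with a minimal sketch. The function names and the toy numbers below are hypothetical, and reading "median precision" as the median multiplicative error between observed and predicted ratios is an assumption; the abstract does not define the metric.

```python
import math
from statistics import median


def log10_ptr(protein, mrna):
    """log10 of the protein-to-mRNA ratio for one gene in one tissue.

    Both inputs are assumed to be positive abundance estimates
    on linear (not log) scales.
    """
    return math.log10(protein / mrna)


def median_fold_error(observed_log10, predicted_log10):
    """Median multiplicative error between observed and predicted ratios.

    A difference of d on the log10 scale corresponds to a 10**|d|-fold
    error, so a return value of 3.2 would mean "typically off by 3.2-fold".
    This definition is an assumption, not taken from the paper.
    """
    return median(
        10 ** abs(obs - pred)
        for obs, pred in zip(observed_log10, predicted_log10)
    )


# Toy values for three hypothetical genes (log10 PTR ratios).
observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]
fold = median_fold_error(observed, predicted)
```

Working on the log scale makes over- and under-prediction symmetric: predicting twice or half the observed ratio both count as a 2-fold error.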

Unbiased, Genome-Wide In Vivo Mapping of Transcriptional Regulatory Elements Reveals Sex Differences in Chromatin Structure Associated with Sex-Specific Liver Gene Expression

Molecular and Cellular Biology ◽

10.1128/mcb.00601-10 ◽

2010 ◽

Vol 30 (23) ◽

pp. 5531-5544 ◽

Cited By ~ 71

Author(s):

Guoyu Ling ◽

Aarathi Sugathan ◽

Tali Mazor ◽

Ernest Fraenkel ◽

David J. Waxman

Keyword(s):

High Throughput Sequencing ◽

Strong Association ◽

Regulatory Elements ◽

Sequence Motifs ◽

Transcriptional Regulatory Elements ◽

Transcriptional Regulatory ◽

Mammalian Tissues ◽

Hypersensitive Sites ◽

Regulatory Sites

ABSTRACT We have used a simple and efficient method to identify condition-specific transcriptional regulatory sites in vivo to help elucidate the molecular basis of sex-related differences in transcription, which are widespread in mammalian tissues and affect normal physiology, drug response, inflammation, and disease. To systematically uncover transcriptional regulators responsible for these differences, we used DNase hypersensitivity analysis coupled with high-throughput sequencing to produce condition-specific maps of regulatory sites in male and female mouse livers and in livers of male mice feminized by continuous infusion of growth hormone (GH). We identified 71,264 hypersensitive sites, with 1,284 showing robust sex-related differences. Continuous GH infusion suppressed the vast majority of male-specific sites and induced a subset of female-specific sites in male livers. We also identified broad genomic regions (up to ∼100 kb) showing sex-dependent hypersensitivity and similar patterns of GH responses. We found a strong association of sex-specific sites with sex-specific transcription; however, a majority of sex-specific sites were >100 kb from sex-specific genes. By analyzing sequence motifs within regulatory regions, we identified two known regulators of liver sexual dimorphism and several new candidates for further investigation. This approach can readily be applied to mapping condition-specific regulatory sites in mammalian tissues under a wide variety of physiological conditions.
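
The statement that most sex-specific sites lie more than 100 kb from sex-specific genes amounts to a nearest-feature distance calculation. The sketch below is one hypothetical way to compute such a fraction; it assumes each site and each gene is reduced to a single coordinate on the same chromosome, which is a simplification and not the authors' stated procedure.

```python
from bisect import bisect_left


def distance_to_nearest(site, sorted_genes):
    """Distance in bp from `site` to the closest position in `sorted_genes`.

    `sorted_genes` must be a non-empty list sorted in ascending order.
    """
    i = bisect_left(sorted_genes, site)
    # Only the neighbours immediately left and right of the insertion
    # point can be the nearest gene.
    candidates = sorted_genes[max(i - 1, 0):i + 1]
    return min(abs(site - gene) for gene in candidates)


def fraction_distal(sites, gene_positions, threshold=100_000):
    """Fraction of sites farther than `threshold` bp from every gene."""
    genes = sorted(gene_positions)
    far = sum(distance_to_nearest(site, genes) > threshold for site in sites)
    return far / len(sites)


# Toy coordinates on one hypothetical chromosome.
genes = [1_000_000, 2_000_000]
sites = [1_050_000, 1_500_000, 2_150_000]
distal = fraction_distal(sites, genes)
```

A real analysis would run this per chromosome and would have to decide whether distance is measured to the transcription start site or to the nearest gene boundary.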

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Bioinformatics ◽

10.1093/bioinformatics/btab083 ◽

2021 ◽

Author(s):

Yanrong Ji ◽

Zhihan Zhou ◽

Han Liu ◽

Ramana V Davuluri

Keyword(s):

Dna Sequences ◽

Regulatory Elements ◽

Ease Of Use ◽

Fine Tuning ◽

Supplementary Information ◽

Sequence Motifs ◽

Semantic Relationship ◽

Accurate Identification ◽

Conserved Sequence ◽

Genome Wide

Abstract Motivation: Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationships, which previous informatics methods often fail to capture, especially in data-scarce scenarios. Results: To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture a global and transferable understanding of genomic DNA sequences based on upstream and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory element prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformer model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationships within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that DNABERT pre-trained on the human genome can be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned for many other sequence analysis tasks. Availability and implementation: The source code and the pretrained and fine-tuned models for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT). Supplementary information: Supplementary data are available at Bioinformatics online.

Genomic structure of murine methylmalonyl-CoA mutase: evidence for genetic and epigenetic mechanisms determining enzyme activity

Biochemical Journal ◽

10.1042/bj2960663 ◽

1993 ◽

Vol 296 (3) ◽

pp. 663-670 ◽

Cited By ~ 13

Author(s):

M F Wilkemeyer ◽

E R Andrews ◽

F D Ledley

Keyword(s):

Enzyme Activity ◽

Steady State ◽

Transcription Initiation ◽

Cultured Cells ◽

Genomic Structure ◽

Genetic Regulation ◽

Regulatory Elements ◽

Transcription Unit ◽

Mrna Levels ◽

Sequence Motifs

Methylmalonyl-CoA mutase (MCM) is a nuclear-encoded mitochondrial matrix enzyme. We have reported characterization of murine MCM and cloning of a murine MCM cDNA and now describe the murine Mut locus, its promoter and evidence for tissue-specific variation in MCM mRNA, enzyme and holoenzyme levels. The Mut locus spans 30 kb and contains 13 exons constituting a unique transcription unit. A B1 repeat element was found in the 3′ untranslated region (exon 13). The transcription initiation site was identified and upstream sequences were shown to direct expression of a reporter gene in cultured cells. The promoter contains sequence motifs characteristic of: (1) TATA-less housekeeping promoters; (2) enhancer elements purportedly involved in co-ordinating expression of nuclear-encoded mitochondrial proteins; and (3) regulatory elements including CCAAT boxes, cyclic AMP-response elements and potential AP-2-binding sites. Northern blots demonstrate a greater than 10-fold variation in steady-state mRNA levels, which correlate with tissue levels of enzyme activity. However, the ratio of holoenzyme to total enzyme varies among different tissues, and there is no correlation between steady-state mRNA levels and holoenzyme activity. These results suggest that, although there may be regulation of MCM activity at the level of mRNA, the significance of genetic regulation is unclear owing to the presence of epigenetic regulation of holoenzyme formation.

Identification of transcriptional elements within the long terminal repeat of Rous sarcoma virus

Molecular and Cellular Biology ◽

10.1128/mcb.3.10.1834-1845.1983 ◽

1983 ◽

Vol 3 (10) ◽

pp. 1834-1845

Author(s):

G M Gilmartin ◽

J T Parsons

Keyword(s):

Long Terminal Repeat ◽

Virus Replication ◽

Rous Sarcoma Virus ◽

Regulatory Elements ◽

Terminal Repeat ◽

Tata Box ◽

Transcriptional Regulatory Elements ◽

Rous Sarcoma ◽

Transcriptional Regulatory ◽

Sarcoma Virus

Transcriptional regulatory elements within the Rous sarcoma virus long terminal repeat were examined by the construction of a series of deletions and small insertions within the U3 region of the long terminal repeat. The analysis of these mutations in chicken embryo cells and COS cells permitted the identification of important transcriptional regulatory elements. Sequences within the region 31 to 18 base pairs upstream of the RNA cap site (-31 to -18), encompassing a TATA box-like sequence, function in the selection of the correct site of transcription initiation and, in addition, augment the efficiency of transcription. These sequences are essential for virus replication. Sequences within the region -79 to -59, overlapping a CAAT box-like sequence, are not required for virus replication and have no obvious effect on viral RNA transcription in the presence of an intact TATA box. However, in mutants lacking a functional TATA sequence, mutations in this region serve to decrease the efficiency of correct transcriptional initiation events.

Identification of positive and negative transcriptional regulatory elements of the rabbit angiotensin-converting enzyme gene

Nucleic Acids Research ◽

10.1093/nar/22.7.1194 ◽

1994 ◽

Vol 22 (7) ◽

pp. 1194-1201 ◽

Cited By ~ 15

Author(s):

Tauqir Y. Goraya ◽

Sean P. Kessler ◽

Ravi S. Kumar ◽

Janice Douglas ◽

Ganes C. Sen

Keyword(s):

Angiotensin Converting Enzyme ◽

Regulatory Elements ◽

Angiotensin Converting Enzyme Gene ◽

Converting Enzyme ◽

Enzyme Gene ◽

Transcriptional Regulatory Elements ◽

Transcriptional Regulatory

Discovering Transcriptional Regulatory Elements From Run‐On and Sequencing Data Using the Web‐Based dREG Gateway

Current Protocols in Bioinformatics ◽

10.1002/cpbi.70 ◽

2018 ◽

Vol 66 (1) ◽

Cited By ~ 4

Author(s):

Tinyi Chu ◽

Zhong Wang ◽

Shao‐Pei Chou ◽

Charles G. Danko

Keyword(s):

Regulatory Elements ◽

Sequencing Data ◽

Web Based ◽

Transcriptional Regulatory Elements ◽

Transcriptional Regulatory ◽

The Web

Prediction of protein-ligand interactions from paired protein sequence motifs and ligand substructures

Biocomputing 2018 ◽

10.1142/9789813235533_0003 ◽

2017 ◽

Cited By ~ 1

Author(s):

Peyton Greenside ◽

Maureen Hillenmeyer ◽

Anshul Kundaje

Keyword(s):

Protein Sequence ◽

Sequence Motifs ◽

Protein Ligand Interactions ◽

Ligand Interactions ◽

Protein Sequence Motifs

Effects of NRF-1 and PGC-1α Cooperation on HIF-1α and Rat Cardiomyocyte Apoptosis Under Hypoxia

10.21203/rs.3.rs-181724/v1 ◽

2021 ◽

Author(s):

Nan Niu ◽

Hui Li ◽

Xiancai Du ◽

Chan Wang ◽

Junliang Li ◽

...

Keyword(s):

Hypoxia Inducible Factor ◽

Regulatory Elements ◽

Cardiomyocyte Apoptosis ◽

Peroxisome Proliferator ◽

Peroxisome Proliferator Activated Receptor ◽

Rat Cardiomyocytes ◽

Nuclear Respiratory Factor ◽

Transcriptional Regulatory Elements ◽

Nuclear Respiratory Factor 1 ◽

Myocardial Hypoxia

Abstract Background: Hypoxia is a primary inducer of cardiomyocyte injury, its significant marker being hypoxia-induced cardiomyocyte apoptosis. Nuclear respiratory factor-1 (NRF-1) and hypoxia-inducible factor-1α (HIF-1α) are transcriptional regulatory elements implicated in multiple biological functions, including oxidative stress response. However, their roles in hypoxia-induced cardiomyocyte apoptosis remain unknown. The effect that HIF-1α, together with NRF-1, exerts on cardiomyocyte apoptosis also remains unclear. Methods: We established a myocardial hypoxia model and investigated the effects of these proteins on the proliferation and apoptosis of rat cardiomyocytes (H9C2) under hypoxia. Further, we examined the association between NRF-1 and HIF-1α to improve the current understanding of NRF-1 anti-apoptotic mechanisms. Results: The results show that NRF-1 and HIF-1α are important anti-apoptotic molecules in H9C2 cells under hypoxia, although their regulatory mechanisms differ. NRF-1 could bind to the promoter region of Hif1a and negatively regulate its expression. Additionally, HIF-1β exhibited competitive binding with NRF-1 and HIF-1α, demonstrating a synergism between NRF-1 and the peroxisome proliferator-activated receptor-gamma coactivator-1α. Conclusion: These results indicate that cardiomyocytes can regulate different molecular patterns to tolerate hypoxia, providing a novel methodological framework for studying cardiomyocyte apoptosis under hypoxia.

Systematic dissection of transcriptional regulatory networks by genome-scale and single-cell CRISPR screens

Science Advances ◽

10.1126/sciadv.abf5733 ◽

2021 ◽

Vol 7 (27) ◽

pp. eabf5733

Author(s):

Rui Lopes ◽

Kathleen Sprouffske ◽

Caibin Sheng ◽

Esther C. H. Uijttewaal ◽

Adriana Emma Wesdorp ◽

...

Keyword(s):

Breast Cancer ◽

Regulatory Networks ◽

Essential Role ◽

Regulatory Elements ◽

Transcriptional Regulatory Networks ◽

Transcriptional Regulatory Elements ◽

Transcriptional Regulatory ◽

Functional Relevance ◽

Genome Scale ◽

Transcription Program

Millions of putative transcriptional regulatory elements (TREs) have been cataloged in the human genome, yet their functional relevance in specific pathophysiological settings remains to be determined. This is critical to understand how oncogenic transcription factors (TFs) engage specific TREs to impose transcriptional programs underlying malignant phenotypes. Here, we combine cutting-edge CRISPR screens and epigenomic profiling to functionally survey ≈15,000 TREs engaged by estrogen receptor (ER). We show that ER exerts its oncogenic role in breast cancer by engaging TREs enriched in GATA3, TFAP2C, and H3K27Ac signal. These TREs control critical downstream TFs, among which TFAP2C plays an essential role in ER-driven cell proliferation. Together, our work reveals novel insights into a critical oncogenic transcription program and provides a framework to map regulatory networks, enabling dissection of the function of the noncoding genome of cancer cells.

Elucidating the Regulatory Elements for Transcription Termination and Posttranscriptional Processing in the Streptomyces clavuligerus Genome

mSystems ◽

10.1128/msystems.01013-20 ◽

2021 ◽

Vol 6 (3) ◽

Author(s):

Soonkyu Hwang ◽

Namil Lee ◽

Donghui Choe ◽

Yongjae Lee ◽

Woori Kim ◽

...

Keyword(s):

Secondary Metabolite ◽

Transcription Termination ◽

Gene Clusters ◽

Streptomyces Clavuligerus ◽

Regulatory Elements ◽

Regulation Of Transcription ◽

Content Type ◽

Transcriptional Regulatory Elements ◽

Transcriptional Regulatory ◽

Posttranscriptional Processing

ABSTRACT Identification of transcriptional regulatory elements in the GC-rich Streptomyces genome is essential for the production of novel biochemicals from secondary metabolite biosynthetic gene clusters (smBGCs). Despite many efforts to understand the regulation of transcription initiation in smBGCs, information on the regulation of transcription termination and posttranscriptional processing remains scarce. In this study, we identified the transcriptional regulatory elements in β-lactam antibiotic-producing Streptomyces clavuligerus ATCC 27064 by determining a total of 1,427 transcript 3′-end positions (TEPs) using the term-seq method. Termination of transcription was governed by three classes of TEPs, each of which displayed unique sequence features. Integrating these data with transcription start sites and transcriptome data generated 1,648 transcription units (TUs) and 610 transcription unit clusters (TUCs). TU architecture showed that the transcript abundance in TU isoforms of a TUC was potentially affected by the sequence context of their TEPs, suggesting that the regulatory elements of TEPs could control the transcription level in additional layers. We also identified TU features of a xenobiotic response element (XRE) family regulator and DUF397 domain-containing protein, particularly showing the abundance of bidirectional TEPs. Finally, we found that 189 noncoding TUs contained potential cis- and trans-regulatory elements that played a major role in regulating the 5′ and 3′ UTR. These findings highlight the role of transcriptional regulatory elements in transcription termination and posttranscriptional processing in Streptomyces sp.

IMPORTANCE Streptomyces sp. is a great source of bioactive secondary metabolites, including antibiotics, antifungal agents, antiparasitic agents, immunosuppressant compounds, and other drugs. Secondary metabolites are synthesized via multistep conversions of the precursor molecules from primary metabolism, governed by multicomplex enzymes from secondary metabolite biosynthetic gene clusters. As their production is closely related to the growth phase and dynamic cellular status in response to various intra- and extracellular signals, complex regulatory systems tightly control the gene expression related to secondary metabolism. In this study, we determined genome-wide transcript 3′-end positions and transcription units in the β-lactam antibiotic producer Streptomyces clavuligerus ATCC 27064 to elucidate the transcriptional regulatory elements in transcription termination and posttranscriptional processing by integration of multiomics data. These unique features, such as transcript 3′-end sequence, potential riboregulators, and potential 3′-untranslated region (UTR) cis-regulatory elements, can potentially be used to design engineering tools that regulate the transcript abundance of genes for enhancing secondary metabolite production.
