Enhancers facilitate the birth of de novo genes and their integration into regulatory networks

2019 ◽  
Author(s):  
Paco Majic ◽  
Joshua L. Payne

Abstract Regulatory networks control the spatiotemporal gene expression patterns that give rise to and define the individual cell types of multicellular organisms. In eumetazoa, distal regulatory elements called enhancers play a key role in determining the structure of such networks, particularly the wiring diagram of “who regulates whom.” Mutations that affect enhancer activity can therefore rewire regulatory networks, potentially causing changes in gene expression that are adaptive. Here, we use whole-tissue and single-cell transcriptomic and chromatin accessibility data from mouse to show that enhancers play an additional role in the evolution of regulatory networks: They facilitate network growth by creating transcriptionally active regions of open chromatin that are conducive to de novo gene evolution. Specifically, our comparative transcriptomic analysis with three other mammalian species shows that young, mouse-specific intergenic open reading frames are preferentially located near enhancers, whereas older open reading frames are not. Mouse-specific intergenic open reading frames that are proximal to enhancers are more highly and stably transcribed than those that are not proximal to enhancers or promoters, and they are transcribed in a limited diversity of cellular contexts. Furthermore, we report several instances of mouse-specific intergenic open reading frames that are proximal to promoters that show evidence of being repurposed enhancers. We also show that open reading frames gradually acquire specific interactions with enhancers over macro-evolutionary timescales, helping integrate new genes into existing regulatory networks. Taken together, our results highlight a dual role of enhancers in expanding and rewiring gene regulatory networks.

2019 ◽  
Vol 37 (4) ◽  
pp. 1165-1178 ◽  
Author(s):  
Paco Majic ◽  
Joshua L Payne

Abstract Regulatory networks control the spatiotemporal gene expression patterns that give rise to and define the individual cell types of multicellular organisms. In eumetazoa, distal regulatory elements called enhancers play a key role in determining the structure of such networks, particularly the wiring diagram of “who regulates whom.” Mutations that affect enhancer activity can therefore rewire regulatory networks, potentially causing adaptive changes in gene expression. Here, we use whole-tissue and single-cell transcriptomic and chromatin accessibility data from mouse to show that enhancers play an additional role in the evolution of regulatory networks: They facilitate network growth by creating transcriptionally active regions of open chromatin that are conducive to de novo gene evolution. Specifically, our comparative transcriptomic analysis with three other mammalian species shows that young, mouse-specific intergenic open reading frames are preferentially located near enhancers, whereas older open reading frames are not. Mouse-specific intergenic open reading frames that are proximal to enhancers are more highly and stably transcribed than those that are not proximal to enhancers or promoters, and they are transcribed in a limited diversity of cellular contexts. Furthermore, we report several instances of mouse-specific intergenic open reading frames proximal to promoters showing evidence of being repurposed enhancers. We also show that open reading frames gradually acquire interactions with enhancers over macroevolutionary timescales, helping integrate genes—those that have arisen de novo or by other means—into existing regulatory networks. Taken together, our results highlight a dual role of enhancers in expanding and rewiring gene regulatory networks.


2019 ◽  
Author(s):  
Robin A. Sorg ◽  
Clement Gallay ◽  
Jan-Willem Veening

Abstract Streptococcus pneumoniae can cause disease in various human tissues and organs, including the ear, the brain, the blood and the lung, and thus in highly diverse and dynamic environments. It is challenging to study how pneumococci control virulence factor expression, because cues of natural environments and the presence of an immune system are difficult to simulate in vitro. Here, we apply synthetic biology methods to reverse-engineer gene expression control in S. pneumoniae. A selection platform is described that allows for straightforward identification of transcriptional regulatory elements out of combinatorial libraries. We present TetR- and LacI-regulated promoters that show expression ranges of four orders of magnitude. Based on these promoters, regulatory networks of higher complexity are assembled, such as logic AND and IMPLY gates. Finally, we demonstrate single-copy genome-integrated toggle switches that give rise to bimodal population distributions. The tools described here can be used to mimic complex expression patterns, such as the ones found for pneumococcal virulence factors, paving the way for in vivo investigations of the importance of gene expression control on the pathogenicity of S. pneumoniae.
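For readers unfamiliar with the gate names, the following is a minimal sketch of the Boolean behavior of AND and IMPLY gates. It is a purely logical abstraction, not a model of the authors' constructs: the abstract does not say which inducer or repressor maps to which input, so the inputs `a` and `b` and all function names here are hypothetical.

```python
from itertools import product


def and_gate(a: bool, b: bool) -> bool:
    """Output is ON only when both inputs are ON."""
    return a and b


def imply_gate(a: bool, b: bool) -> bool:
    """Material implication (a IMPLY b): OFF only when a is ON and b is OFF."""
    return (not a) or b


def truth_table(gate):
    """Evaluate a two-input gate on every combination of inputs."""
    return {(a, b): gate(a, b) for a, b in product([False, True], repeat=2)}


if __name__ == "__main__":
    for name, gate in (("AND", and_gate), ("IMPLY", imply_gate)):
        print(name)
        for (a, b), out in truth_table(gate).items():
            print(f"  a={int(a)} b={int(b)} -> {int(out)}")
```

The AND gate is ON for exactly one of the four input combinations, while IMPLY is ON for three of them; in a real circuit these idealized ON/OFF states would correspond to high and low expression levels rather than strict Booleans.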


2002 ◽  
Vol 68 (11) ◽  
pp. 5671-5684 ◽  
Author(s):  
Hiroaki Iwaki ◽  
Yoshie Hasegawa ◽  
Shaozhao Wang ◽  
Margaret M. Kayser ◽  
Peter C. K. Lau

ABSTRACT Cyclopentanone 1,2-monooxygenase, a flavoprotein produced by Pseudomonas sp. strain NCIMB 9872 upon induction by cyclopentanol or cyclopentanone (M. Griffin and P. W. Trudgill, Biochem. J. 129:595-603, 1972), has been utilized as a biocatalyst in Baeyer-Villiger oxidations. To further explore this biocatalytic potential and to discover new genes, we have cloned and sequenced a 16-kb chromosomal locus of strain 9872 that is herein reclassified as belonging to the genus Comamonas. Sequence analysis revealed a cluster of genes and six potential open reading frames designated and grouped in at least four possible transcriptional units as (orf11-orf10-orf9)-(cpnE-cpnD-orf6-cpnC)-(cpnR-cpnB-cpnA)-(orf3-orf4 [partial 3′ end]). The cpnABCDE genes encode enzymes for the five-step conversion of cyclopentanol to glutaric acid catalyzed by cyclopentanol dehydrogenase, cyclopentanone 1,2-monooxygenase, a ring-opening 5-valerolactone hydrolase, 5-hydroxyvalerate dehydrogenase, and 5-oxovalerate dehydrogenase, respectively. Inactivation of cpnB by using a lacZ-Kmr cassette resulted in a strain that was not capable of growth on cyclopentanol or cyclopentanone as a sole carbon and energy source. The presence of σ54-dependent regulatory elements in front of the divergently transcribed cpnB and cpnC genes supports the notion that cpnR is a regulatory gene of the NtrC type. Knowledge of the nucleotide sequence of the cpn genes was used to construct isopropyl-β-thio-d-galactoside-inducible clones of Escherichia coli cells that overproduce the five enzymes of the cpn pathway. The substrate specificities of CpnA and CpnB were studied in particular to evaluate the potential of these enzymes and establish the latter recombinant strain as a bioreagent for Baeyer-Villiger oxidations. Although frequently nonenantioselective, cyclopentanone 1,2-monooxygenase was found to exhibit a broader substrate range than the related cyclohexanone 1,2-monooxygenase from Acinetobacter sp. strain NCIMB 9871. However, in a few cases opposite enantioselectivity was observed between the two biocatalysts.


2019 ◽  
Author(s):  
Eirene Markenscoff-Papadimitriou ◽  
Sean Whalen ◽  
Pawel Przytycki ◽  
Reuben Thomas ◽  
Fadya Binyameen ◽  
...  

Abstract Gene expression differs between cell types and regions within complex tissues such as the developing brain. To discover regulatory elements underlying this specificity, we generated genome-wide maps of chromatin accessibility in eleven anatomically-defined regions of the developing human telencephalon, including upper and deep layers of the prefrontal cortex. We predicted a subset of open chromatin regions (18%) that are most likely to be active enhancers, many of which are dynamic with 26% differing between early and late mid-gestation and 28% present in only one brain region. These region-specific predicted regulatory elements (pREs) are enriched proximal to genes with expression differences across regions and developmental stages and harbor distinct sequence motifs that suggest potential upstream regulators of regional and temporal transcription. We leverage this atlas to identify regulators of genes associated with autism spectrum disorder (ASD) including an enhancer of BCL11A, validated in mouse, and two functional de novo mutations in individuals with ASD in an enhancer of SLC6A1, validated in neuroblastoma cells. These applications demonstrate the utility of this atlas for decoding neurodevelopmental gene regulation in health and disease. Summary To discover regulatory elements driving the specificity of gene expression in different cell types and regions of the developing human brain, we generated an atlas of open chromatin from eleven dissected regions of the mid-gestation human telencephalon, including upper and deep layers of the prefrontal cortex. We identified a subset of open chromatin regions (OCRs), termed predicted regulatory elements (pREs), that are likely to function as developmental brain enhancers. pREs showed regional differences in chromatin accessibility, including many specific to one brain region, and were correlated with gene expression differences across the same regions and gestational ages. pREs allowed us to map neurodevelopmental disorder risk genes to developing telencephalic regions, and we identified three functional de novo noncoding variants in pREs that alter enhancer function. In addition, transgenic experiments in mouse validated enhancer activity for a pRE proximal to BCL11A, showing how this atlas serves as a resource for decoding neurodevelopmental gene regulation in health and disease.


2019 ◽  
Author(s):  
Evan Witt ◽  
Sigi Benjamin ◽  
Nicolas Svetec ◽  
Li Zhao

Summary The testis is a peculiar tissue in many respects. It shows patterns of rapid gene evolution and provides a hotspot for the origination of genetic novelties such as de novo genes, duplications and mutations. To investigate the expression patterns of genetic novelties across cell types, we performed single-cell RNA-sequencing of adult Drosophila testis. We found that new genes were expressed in various cell types, the patterns of which may be influenced by their mode of origination. In particular, lineage-specific de novo genes are commonly expressed in early spermatocytes, while young duplicated genes are often bimodally expressed. Analysis of germline substitutions suggests that spermatogenesis is a highly reparative process, with the mutational load of germ cells decreasing as spermatogenesis progresses. By elucidating the distribution of genetic novelties across spermatogenesis, this study provides a deeper understanding of how the testis maintains its core reproductive function while being a hotbed of evolutionary innovation.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Evan Witt ◽  
Sigi Benjamin ◽  
Nicolas Svetec ◽  
Li Zhao

The testis is a peculiar tissue in many respects. It shows patterns of rapid gene evolution and provides a hotspot for the origination of genetic novelties such as de novo genes, duplications and mutations. To investigate the expression patterns of genetic novelties across cell types, we performed single-cell RNA-sequencing of adult Drosophila testis. We found that new genes were expressed in various cell types, the patterns of which may be influenced by their mode of origination. In particular, lineage-specific de novo genes are commonly expressed in early spermatocytes, while young duplicated genes are often bimodally expressed. Analysis of germline substitutions suggests that spermatogenesis is a highly reparative process, with the mutational load of germ cells decreasing as spermatogenesis progresses. By elucidating the distribution of genetic novelties across spermatogenesis, this study provides a deeper understanding of how the testis maintains its core reproductive function while being a hotbed of evolutionary innovation.


2020 ◽  
Author(s):  
Bethany M. Moore ◽  
Yun Sun Lee ◽  
Erich Grotewold ◽  
Shin-Han Shiu

Abstract Plants respond to wounding stress by changing gene expression patterns and inducing jasmonic acid (JA), as well as other plant hormones. This includes activating some specialized metabolism pathways, including the glucosinolate pathways, in the case of Arabidopsis thaliana. We model how these responses are regulated by using machine learning to predict expression for genes clustered by their wound response, incorporating putative cis-regulatory elements (pCREs), known transcription factor binding sites from the literature, in vitro DNA affinity purification sequencing (DAP-seq) data, and DNase I hypersensitive sites. We found temporal patterns where regulatory sites and regions of open chromatin differed between clusters of genes up-regulated at early and late wounding time points as well as clusters where JA response was induced relative to clusters where JA response was not induced. Overall, we identified pCREs that improved model predictions of expression clusters over known binding sites. We discovered 4,255 pCREs related to wound response at different time points and 2,569 pCREs related to differences between JA-induced and non-JA-induced wound response. In addition, pCREs found to be important at different wounding time points were mapped to the promoters of genes in a glucosinolate biosynthesis pathway, indicating regulation of this pathway under wounding stress. Finally, we experimentally validated a predicted cis-regulatory element, CCGCGT, showing that knock-out via CRISPR-Cas9 reduces gene expression in response to wounding.


2021 ◽  
pp. 002203452110120
Author(s):  
C. Gluck ◽  
S. Min ◽  
A. Oyelakin ◽  
M. Che ◽  
E. Horeth ◽  
...  

The parotid, submandibular, and sublingual glands represent a trio of oral secretory glands whose primary function is to produce saliva, facilitate digestion of food, provide protection against microbes, and maintain oral health. While recent studies have begun to shed light on the global gene expression patterns and profiles of salivary glands, particularly those of mice, relatively little is known about the location and identity of transcriptional control elements. Here we have established the epigenomic landscape of the mouse submandibular salivary gland (SMG) by performing chromatin immunoprecipitation sequencing experiments for 4 key histone marks. Our analysis of the comprehensive SMG data sets and comparisons with those from other adult organs have identified critical enhancers and super-enhancers of the mouse SMG. By further integrating these findings with complementary RNA-sequencing based gene expression data, we have unearthed a number of molecular regulators such as members of the Fox family of transcription factors that are enriched and likely to be functionally relevant for SMG biology. Overall, our studies provide a powerful atlas of cis-regulatory elements that can be leveraged for better understanding the transcriptional control mechanisms of the mouse SMG, discovery of novel genetic switches, and modulating tissue-specific gene expression in a targeted fashion.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Stéphane Deschamps ◽  
John A. Crow ◽  
Nadia Chaidir ◽  
Brooke Peterson-Burch ◽  
Sunil Kumar ◽  
...  

Abstract Background Three-dimensional chromatin loop structures connect regulatory elements to their target genes in regions known as anchors. In complex plant genomes, such as maize, it has been proposed that loops span heterochromatic regions marked by higher repeat content, but little is known about their spatial organization and genome-wide occurrence in relation to transcriptional activity. Results Here, ultra-deep Hi-C sequencing of maize B73 leaf tissue was combined with gene expression and open chromatin sequencing for chromatin loop discovery and correlation with hierarchical topologically-associating domains (TADs) and transcriptional activity. A majority of all anchors are shared between multiple loops from previous public maize high-resolution interactome datasets, suggesting a highly dynamic environment, with a conserved set of anchors involved in multiple interaction networks. Chromatin loop interiors are marked by higher repeat contents than the anchors flanking them. A small fraction of high-resolution interaction anchors, fully embedded in larger chromatin loops, co-locate with active genes and putative protein-binding sites. Combinatorial analyses indicate that all anchors studied here co-locate with at least 81.5% of expressed genes and 74% of open chromatin regions. Approximately 38% of all Hi-C chromatin loops are fully embedded within hierarchical TAD-like domains, while the remaining ones share anchors with domain boundaries or with distinct domains. Those various loop types exhibit specific patterns of overlap for open chromatin regions and expressed genes, but no apparent pattern of gene expression. In addition, up to 63% of all unique variants derived from a prior public maize eQTL dataset overlap with Hi-C loop anchors. Anchor annotation suggests that < 7% of all loops detected here are potentially devoid of any genes or regulatory elements. The overall organization of chromatin loop anchors in the maize genome suggests a loop modeling system hypothesized to resemble phase separation of repeat-rich regions. Conclusions Sets of conserved chromatin loop anchors mapping to hierarchical domains contain core structural components of the gene expression machinery in maize. The data presented here will be a useful reference to further investigate their function in regard to the formation of transcriptional complexes and the regulation of transcriptional activity in the maize genome.


2006 ◽  
Vol 3 (2) ◽  
pp. 109-122 ◽  
Author(s):  
Christopher H. Bryant ◽  
Graham J.L. Kemp ◽  
Marija Cvijovic

Summary We have taken a first step towards learning which upstream Open Reading Frames (uORFs) regulate gene expression (i.e., which uORFs are functional) in the yeast Saccharomyces cerevisiae. We do this by integrating data from several resources and combining a bioinformatics tool, ORF Finder, with a machine learning technique, inductive logic programming (ILP). Here, we report the challenge of using ILP as part of this integrative system, in order to automatically generate a model that identifies functional uORFs. Our method makes searching for novel functional uORFs more efficient than random sampling. An attempt has been made to predict novel functional uORFs using our method. Some preliminary evidence that our model may be biologically meaningful is presented.
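To make the notion of an upstream ORF concrete, here is a minimal sketch that scans a 5′ leader sequence for candidate uORFs. It is neither the ORF Finder tool nor the ILP model described above: it assumes a simplified definition (an ATG in the leader followed by an in-frame stop codon within the leader), and the function name `find_uorfs` and the example sequence are hypothetical.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader: str) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each candidate uORF in a 5' leader.

    Simplifying assumption: a candidate uORF is an ATG followed by the first
    in-frame stop codon, both lying entirely inside the leader. Coordinates
    are 0-based and end-exclusive.
    """
    seq = leader.upper()
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon to the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


if __name__ == "__main__":
    # Hypothetical leader containing one short uORF (ATG AAA TAG).
    print(find_uorfs("CCATGAAATAGGG"))
```

A scan like this only enumerates candidates; deciding which of them are functional is the harder problem that the summary addresses with inductive logic programming.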

