Origins of Bidirectional Promoters: Computational Analyses of Intergenic Distance in the Human Genome

D. Takai

doi:10.1093/molbev/msh040

High frequency of alternative first exons in erythroid genes suggests a critical role in regulating gene function

Blood ◽

10.1182/blood-2005-07-2957 ◽

2006 ◽

Vol 107 (6) ◽

pp. 2557-2561 ◽

Cited By ~ 24

Author(s):

Jeff S. Tan ◽

Narla Mohandas ◽

John G. Conboy

Keyword(s):

Human Genome ◽

High Frequency ◽

Critical Role ◽

Test Group ◽

Protein Isoforms ◽

Alternative Promoters ◽

Transcriptional Promoters ◽

Show Evidence ◽

Computational Analyses ◽

Frequent Presence

Abstract: The human genome uses alternative pre-mRNA splicing as an important mechanism to encode a complex proteome from a relatively small number of genes. An unknown number of these genes also possess multiple transcriptional promoters and alternative first exons that contribute another layer of complexity to gene expression mechanisms. Using a collection of more than 100 erythroid-expressed genes as a test group, we used genome browser tools and genetic databases to assess the frequency of alternative first exons in the genome. Remarkably, 35% of these erythroid genes show evidence of alternative first exons. The majority of the candidate first exons are situated upstream of the coding exons, whereas a few are located internally within the gene. Computational analyses predict transcriptional promoters closely associated with many of the candidate first exons, supporting their authenticity. Importantly, the frequent presence of consensus translation initiation sites among the alternative first exons suggests that many proteins have alternative N-terminal structures whose expression can be coupled to promoter choice. These findings indicate that alternative promoters and first exons are more widespread in the human genome than previously appreciated and that they may play a major role in regulating expression of selected protein isoforms in a tissue-specific manner. (Blood. 2006;107:2557-2561)


An Abundance of Bidirectional Promoters in the Human Genome

Genome Research ◽

10.1101/gr.1982804 ◽

2003 ◽

Vol 14 (1) ◽

pp. 62-66 ◽

Cited By ~ 361

Author(s):

N. D. Trinklein

Keyword(s):

Human Genome ◽

Bidirectional Promoters


Invited Talk: A Computational Study of Bidirectional Promoters in the Human Genome

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-540-72031-7_33 ◽

2007 ◽

pp. 361-371 ◽

Cited By ~ 6

Author(s):

Mary Qu Yang ◽

Laura L. Elnitski

Keyword(s):

Human Genome ◽

Computational Study ◽

Bidirectional Promoters


Feature Characterization and Testing of Bidirectional Promoters in the Human Genome—Significance and Applications in Human Genome Research

Machine Learning in Bioinformatics ◽

10.1002/9780470397428.ch15 ◽

2009 ◽

pp. 321-338

Author(s):

Mary Q. Yang ◽

David C. King ◽

Laura L. Elnitski

Keyword(s):

Human Genome ◽

Genome Research ◽

Bidirectional Promoters ◽

Feature Characterization


A pan-cancer landscape of somatic substitutions in non-unique regions of the human genome

10.1101/2020.04.14.040634 ◽

2020 ◽

Author(s):

Maxime Tarabichi ◽

Jonas Demeulemeester ◽

Annelien Verfaillie ◽

Adrienne M. Flanagan ◽

Peter Van Loo ◽

...

Keyword(s):

Human Genome ◽

Sequence Similarity ◽

Gene Families ◽

Regulatory Elements ◽

Cancer Genes ◽

Mutation Load ◽

Sequencing Data ◽

High Sequence Similarity ◽

Computational Analyses ◽

Pan Cancer

Abstract: Around 13% of the human genome displays high sequence similarity with at least one other chromosomal position and thereby poses challenges for computational analyses such as detection of somatic events in cancer. We here extract features of sequencing data from across non-unique regions and employ a machine learning pipeline to describe a landscape of somatic substitutions in 2,658 cancers from the PCAWG cohort. We show mutations in non-unique regions are consistent with mutations in unique regions in terms of mutation load and substitution profiles, and can be validated with linked-read sequencing. This uncovers hidden mutations in ~1,700 coding sequences and thousands of regulatory elements, including known cancer genes, immunoglobulins, and highly mutated gene families.


Purine-rich low complexity regions are potential RNA binding hubs in the human genome

F1000Research ◽

10.12688/f1000research.13522.2 ◽

2019 ◽

Vol 7 ◽

pp. 76 ◽

Cited By ~ 1

Author(s):

Ivan Antonov ◽

Yulia A. Medvedeva

Keyword(s):

Human Genome ◽

DNA Sequences ◽

RNA Binding ◽

Statistical Significance ◽

Low Complexity ◽

Distal Enhancer ◽

Long Distance ◽

Wide Range ◽

Data Files ◽

Computational Analyses

Abstract: Many long noncoding RNAs are bound to the chromatin, and some of these interactions are mediated by triple helices. It is usually assumed that a transcript can form triplexes with a distinct set of genomic loci, also known as triplex target sites (TTSs). Here we performed computational analyses of the TTSs that have been experimentally identified for particular RNAs. To assess the ability of these TTSs to bind other transcripts, we developed a method to estimate the statistical significance of the predicted number of triplexes for a given RNA-DNA pair. We demonstrated that each DNA set included a subset of sequences that have the potential to form a statistically significant (adjusted p-value < 0.01) number of triplexes with the majority (>90%) of the analyzed transcripts. Because these DNA sequences are predicted to interact with a wide range of different RNAs, we called them "universal TTSs". While the universal TTSs were quite rare in the human genome (around 0.5%), they were more frequent (>15%) among the MEG3 binding sites (ChOP-seq peaks) and especially among the shared Capture-seq peaks (40%). The universal TTSs were enriched with purine-rich low complexity regions. The role of chromatin-bound RNAs in the formation of 3D chromatin structure is currently under active discussion. We speculated that such universal TTSs may contribute to establishing long-distance chromosomal contacts and may facilitate distal enhancer-promoter interactions. All the scripts and the data files related to this study are available at: https://github.com/vanya-antonov/universal_tts
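The abstract above does not say how the significance of a predicted triplex count was estimated. As a rough illustration only, the sketch below assumes one common approach: an empirical p-value that compares the observed count with counts obtained from a null set (for example, shuffled sequences), followed by Benjamini-Hochberg adjustment across many RNA-DNA pairs. The function names, the null model, and the choice of adjustment are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def empirical_p_value(observed: int, null_counts: Sequence[int]) -> float:
    """Empirical p-value: share of null counts at least as large as the
    observed count, with a +1 pseudocount so the result is never zero.
    (Assumed null model; the study's actual method is not described here.)"""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order.
    (Assumed adjustment; the abstract only says 'adjusted p-value'.)"""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: one observed value and four null values.
    p = empirical_p_value(10, [1, 2, 3, 4])
    print(p)
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under these assumptions, an RNA-DNA pair would be called significant when its adjusted value falls below the 0.01 threshold quoted in the abstract.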


DNA methylation in satellite repeats disorders

Essays in Biochemistry ◽

10.1042/ebc20190028 ◽

2019 ◽

Vol 63 (6) ◽

pp. 757-771 ◽

Cited By ~ 4

Author(s):

Claire Francastel ◽

Frédérique Magdinier

Keyword(s):

DNA Methylation ◽

Human Genome ◽

Repetitive DNA ◽

DNA Sequences ◽

Satellite Repeats ◽

Tremendous Progress ◽

Genes Encoding ◽

DNA Elements ◽

Near Future

Abstract: Despite the tremendous progress made in recent years in assembling the human genome, tandemly repeated DNA elements remain poorly characterized. These sequences account for the vast majority of methylated sites in the human genome and their methylated state is necessary for this repetitive DNA to function properly and to maintain genome integrity. Furthermore, recent advances highlight the emerging role of these sequences in regulating the functions of the human genome and its variability during evolution, among individuals, or in disease susceptibility. In addition, a number of inherited rare diseases are directly linked to the alteration of some of these repetitive DNA sequences, either through changes in the organization or size of the tandem repeat arrays or through mutations in genes encoding chromatin modifiers involved in the epigenetic regulation of these elements. Although largely overlooked so far in the functional annotation of the human genome, satellite elements play key roles in its architectural and topological organization. This includes functions as boundary elements delimiting functional domains or assembly of repressive nuclear compartments, with local or distal impact on gene expression. Thus, the consideration of satellite repeat organization and the associated epigenetic landmarks, including DNA methylation (DNAme), will become unavoidable in the near future to fully decipher human phenotypes and associated diseases.
