Tissue-specific evolution of protein coding genes in human and mouse

Mapping Intimacies ◽

10.1101/011692 ◽

2014 ◽

Author(s):

Nadezda Kryuchkova-Mostacci ◽

Marc Robinson-Rechavi

Keyword(s):

Gene Expression ◽

Positive Selection ◽

Evolutionary Rate ◽

Purifying Selection ◽

Expression Level ◽

Neutral Evolution ◽

Protein Coding ◽

Tissue Specific ◽

Protein Coding Genes ◽

Mouse Tissues

Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking into account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of strong purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which is explained by weak purifying selection rather than by positive selection.


GENE EXPRESSION LEVEL ANALYSIS OF PROTEIN CODING GENES INCLUDING NON-CODING RNA GENES IN INTRONIC REGIONS

Quantum Bio-Informatics VI ◽

10.1142/9789811217838_0009 ◽

2020 ◽

Author(s):

YOSUKE KONDO ◽

SATORU MIYAZAKI

Keyword(s):

Gene Expression ◽

Gene Expression Level ◽

Expression Level ◽

Protein Coding ◽

Protein Coding Genes ◽

Non Coding Rna ◽

Rna Genes ◽

Level Analysis


How do we transition from non-coding to coding?

10.7287/peerj.preprints.3031v1 ◽

2017 ◽

Author(s):

Jorge Ruiz-Orera ◽

José Luis Villanueva-Cañas ◽

William Blevins ◽

M.Mar Albà

Keyword(s):

De Novo ◽

Gene Evolution ◽

Purifying Selection ◽

Neutral Evolution ◽

Functional Protein ◽

Protein Coding ◽

Coding Sequences ◽

Sequence Composition ◽

Protein Coding Genes ◽

Small Proteins

Recent years have witnessed the discovery of protein-coding genes which appear to have evolved de novo from previously non-coding sequences. This has changed the long-standing view that coding sequences can only evolve from other coding sequences. However, there are still many open questions regarding how new protein-coding sequences can arise from non-genic DNA. Two prerequisites for the birth of a new functional protein-coding gene are that the corresponding DNA fragment is transcribed and that it is also translated. Transcription is known to be pervasive in the genome, producing a large number of transcripts that do not correspond to conserved protein-coding genes, and which are usually annotated as long non-coding RNAs (lncRNA). Recently, sequencing of ribosome protected fragments (Ribo-Seq) has provided evidence that many of these transcripts actually translate small proteins. We have used mouse non-synonymous and synonymous variation data to estimate the strength of purifying selection acting on the translated open reading frames (ORFs). Whereas a subset of the lncRNAs are likely to actually be true protein-coding genes (and thus previously misclassified), the bulk of lncRNAs code for proteins which show variation patterns consistent with neutral evolution. We also show that the ORFs that have a more favorable, coding-like, sequence composition are more likely to be translated than other ORFs in lncRNAs. This study provides strong evidence that there is a large and ever-changing reservoir of lowly abundant proteins; some of these peptides may become useful and act as seeds for de novo gene evolution.


Patterns of Positive Selection and Neutral Evolution in the Protein-Coding Genes of Tetraodon and Takifugu

PLoS ONE ◽

10.1371/journal.pone.0024800 ◽

2011 ◽

Vol 6 (9) ◽

pp. e24800 ◽

Cited By ~ 26

Author(s):

Juan I. Montoya-Burgos

Keyword(s):

Positive Selection ◽

Neutral Evolution ◽

Protein Coding ◽

Protein Coding Genes


GC-AG introns features in long non-coding and protein-coding genes suggest their role in gene expression regulation

10.26226/morressier.5ebd45acffea6f735881b039 ◽

2020 ◽

Author(s):

Monah Abou Alezz

Keyword(s):

Gene Expression ◽

Gene Expression Regulation ◽

Expression Regulation ◽

Protein Coding ◽

Protein Coding Genes


Analysis of Stop Codons within Prokaryotic Protein-Coding Genes Suggests Frequent Readthrough Events

International Journal of Molecular Sciences ◽

10.3390/ijms22041876 ◽

2021 ◽

Vol 22 (4) ◽

pp. 1876

Author(s):

Frida Belinky ◽

Ishan Ganguly ◽

Eugenia Poliakov ◽

Vyacheslav Yurchenko ◽

Igor B. Rogozin

Keyword(s):

Stop Codon ◽

Purifying Selection ◽

Protein Product ◽

Intermediate Step ◽

Protein Coding ◽

Stop Codons ◽

Protein Coding Genes ◽

Synonymous Sites ◽

Prokaryotic Protein ◽

Sense Codon

Nonsense mutations turn a coding (sense) codon into an in-frame stop codon that is assumed to result in a truncated protein product. Thus, nonsense substitutions are the hallmark of pseudogenes and are used to identify them. Here we show that in-frame stop codons within bacterial protein-coding genes are widespread. Their evolutionary conservation suggests that many of them are not pseudogenes, since they maintain dN/dS values (ratios of substitution rates at non-synonymous and synonymous sites) significantly lower than 1 (this is a signature of purifying selection in protein-coding regions). We also found that double substitutions in codons—where an intermediate step is a nonsense substitution—show a higher rate of evolution compared to null models, indicating that a stop codon was introduced and then changed back to sense via positive selection. This further supports the notion that nonsense substitutions in bacteria are relatively common and do not necessarily cause pseudogenization. In-frame stop codons may be an important mechanism of regulation: Such codons are likely to cause a substantial decrease of protein expression levels.
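
As a reading aid for the dN/dS criterion used in the abstract above, the conventional interpretation of the ratio can be written out as follows. The symbol \omega is an assumption here (it is common notation but does not appear in the abstract), and the block requires the amsmath package:

```latex
% dN: substitution rate at non-synonymous sites
% dS: substitution rate at synonymous sites
% \omega is assumed notation for the ratio; it is not used in the abstract.
\[
\omega = \frac{d_N}{d_S},
\qquad
\begin{cases}
\omega < 1 & \text{purifying selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
\]
```

Under this reading, genes with in-frame stop codons that keep \omega significantly below 1 are behaving like functional coding sequences rather than pseudogenes, which is the argument the abstract makes.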


Classification of topological domains based on gene expression and regulation

Genome ◽

10.1139/gen-2013-0111 ◽

2013 ◽

Vol 56 (7) ◽

pp. 415-423 ◽

Cited By ~ 2

Author(s):

Jingjing Zhao ◽

Hongbo Shi ◽

Nadav Ahituv

Keyword(s):

Gene Expression ◽

Gene Ontology ◽

Tissue Type ◽

Specific Gene ◽

Gene Expression And Regulation ◽

Genome Chromosome ◽

Chromosome Conformation ◽

Topological Domains ◽

Tissue Specific ◽

Mouse Tissues

Tissue-specific gene expression is thought to be one of the major forces shaping mammalian gene order. A recent study that used whole-genome chromosome conformation assays has shown that the mammalian genome is divided into specific topological domains that are shared between different tissues and organisms. Here, we wanted to assess whether gene expression and regulation are involved in shaping these domains and can be used to classify them. We analyzed gene expression and regulation levels in these domains by using RNA-seq and enhancer-associated ChIP-seq datasets for 17 different mouse tissues. We found 162 domains that are active (high gene expression and regulation) in all 17 tissues. These domains are significantly shorter, contain fewer repeats, and have more housekeeping genes. In contrast, we found 29 domains that are inactive (low gene expression and regulation) in all analyzed tissues and are significantly longer and have more repeats and gene deserts. Tissue-specific active domains showed some correlation with tissue-type and gene ontology. Domain temporal gene regulation and expression differences also displayed some gene ontology terms fitting their temporal function. Combined, our results provide a catalog of shared and tissue-specific topological domains and suggest that gene expression and regulation could have a role in shaping them.


Tri-methylation of Histone H3 Lysine 4 Facilitates Gene Expression in Ageing Cells

10.1101/238048 ◽

2017 ◽

Cited By ~ 1

Author(s):

Cristina Cruz ◽

Monica Della Rosa ◽

Christel Krueger ◽

Qian Gao ◽

Lucy Field ◽

...

Keyword(s):

Gene Expression ◽

Budding Yeast ◽

Histone H3 ◽

Specific Factor ◽

Protein Coding ◽

Replicative Lifespan ◽

Protein Coding Genes ◽

Genome Wide ◽

Normal Expression ◽

Transcriptional Induction

Transcription of protein coding genes is accompanied by recruitment of COMPASS to promoter-proximal chromatin, which deposits di- and tri-methylation on histone H3 lysine 4 (H3K4) to form H3K4me2 and H3K4me3. Here we determine the importance of COMPASS in maintaining gene expression across lifespan in budding yeast. We find that COMPASS mutations dramatically reduce replicative lifespan and cause widespread gene expression defects. Known repressive functions of H3K4me2 are progressively lost with age, while hundreds of genes become dependent on H3K4me3 for full expression. Induction of these H3K4me3 dependent genes is also impacted in young cells lacking COMPASS components including the H3K4me3-specific factor Spp1. Remarkably, the genome-wide occurrence of H3K4me3 is progressively reduced with age despite widespread transcriptional induction, minimising the normal positive correlation between promoter H3K4me3 and gene expression. Our results provide clear evidence that H3K4me3 is required to attain normal expression levels of many genes across organismal lifespan.


Two waves of transcriptomic changes in periovulatory human granulosa cells

Human Reproduction ◽

10.1093/humrep/deaa043 ◽

2020 ◽

Vol 35 (5) ◽

pp. 1230-1245 ◽

Cited By ~ 2

Author(s):

L C Poulsen ◽

J A Bøtkjær ◽

O Østrup ◽

K B Petersen ◽

C Yding Andersen ◽

...

Keyword(s):

Gene Expression ◽

Growth Factor ◽

Tumour Necrosis Factor ◽

Tumour Necrosis ◽

Fertility Treatment ◽

Protein Coding ◽

Time Points ◽

Protein Coding Genes ◽

Upstream Regulators ◽

Necrosis Factor

STUDY QUESTION: How does the human granulosa cell (GC) transcriptome change during ovulation?

SUMMARY ANSWER: Two transcriptional peaks were observed at 12 h and at 36 h after induction of ovulation, both dominated by genes and pathways known from the inflammatory system.

WHAT IS KNOWN ALREADY: The crosstalk between GCs and the oocyte, which is essential for ovulation and oocyte maturation, can be assessed through transcriptomic profiling of GCs. Detailed transcriptional changes during ovulation have not previously been assessed in humans.

STUDY DESIGN, SIZE, DURATION: This prospective cohort study comprised 50 women undergoing fertility treatment in a standard antagonist protocol at a university hospital-affiliated fertility clinic in 2016–2018.

PARTICIPANTS/MATERIALS, SETTING, METHODS: From each woman, one sample of GCs was collected by transvaginal ultrasound-guided follicle aspiration either before or 12 h, 17 h or 32 h after ovulation induction (OI). A second sample was collected at oocyte retrieval, 36 h after OI. Total RNA was isolated from GCs and analyzed by microarray. Gene expression differences between the five time points were assessed by ANOVA with a random factor accounting for the pairing of samples, and seven clusters of protein-coding genes representing distinct expression profiles were identified. These were used as input for subsequent bioinformatic analyses to identify enriched pathways and suggest upstream regulators. Subsets of genes were assessed to explore specific ovulatory functions.

MAIN RESULTS AND THE ROLE OF CHANCE: We identified 13 345 differentially expressed transcripts across the five time points (false discovery rate, <0.01) of which 58% were protein-coding genes. Two clusters of mainly downregulated genes represented cell cycle pathways and DNA repair. Upregulated genes showed one peak at 12 h that resembled the initiation of an inflammatory response, and one peak at 36 h that resembled the effector functions of inflammation such as vasodilation, angiogenesis, coagulation, chemotaxis and tissue remodelling. Genes involved in cell–matrix interactions as a part of cytoskeletal rearrangement and cell motility were also upregulated at 36 h. Predicted activated upstream regulators of ovulation included FSH, LH, transforming growth factor B1, tumour necrosis factor, nuclear factor kappa-light-chain-enhancer of activated B cells, coagulation factor 2, fibroblast growth factor 2, interleukin 1 and cortisol, among others. The results confirmed early regulation of several previously described factors in a cascade inducing meiotic resumption and suggested new factors involved in cumulus expansion and follicle rupture through co-regulation with previously described factors.

LARGE SCALE DATA: The microarray data were deposited to the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/gds/, accession number: GSE133868).

LIMITATIONS, REASONS FOR CAUTION: The study included women undergoing ovarian stimulation and the findings may therefore differ from a natural cycle. However, the results confirm significant regulation of many well-established ovulatory genes from a series of previous studies such as amphiregulin, epiregulin, tumour necrosis factor alfa induced protein 6, tissue inhibitor of metallopeptidases 1 and plasminogen activator inhibitor 1, which support the relevance of the results.

WIDER IMPLICATIONS OF THE FINDINGS: The study increases our understanding of human ovarian function during ovulation, and the publicly available dataset is a valuable resource for future investigations. Suggested upstream regulators and highly differentially expressed genes may be potential pharmaceutical targets in fertility treatment and gynaecology.

STUDY FUNDING/COMPETING INTEREST(S): The study was funded by EU Interreg ÖKS V through ReproUnion (www.reprounion.eu) and by a grant from the Region Zealand Research Foundation. None of the authors have any conflicts of interest to declare.


Identifying inaccuracies in gene expression estimates from unstranded RNA-seq data

Scientific Reports ◽

10.1038/s41598-019-52584-w ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Mikhail Pomaznoy ◽

Ashu Sethi ◽

Jason Greenbaum ◽

Bjoern Peters

Keyword(s):

Gene Expression ◽

Differential Expression Analysis ◽

Cell Types ◽

Library Preparation ◽

Rna Seq ◽

Protein Coding ◽

Protein Coding Genes ◽

Machine Learning Model ◽

Specific Manner ◽

Library Preparation Protocol

RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology which can skew the gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information was not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
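
The two-fold criterion mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the uslcount package or its API; the function name `flag_strand_sensitive`, the pseudocount, and the toy gene names and values are all hypothetical, chosen only to show how a fold-difference filter between stranded and unstranded expression estimates might look:

```python
from typing import Dict, List


def flag_strand_sensitive(
    stranded: Dict[str, float],
    unstranded: Dict[str, float],
    fold: float = 2.0,
    pseudocount: float = 1.0,
) -> List[str]:
    """Return genes whose two estimates differ by at least `fold`.

    Hypothetical helper: the pseudocount (an assumption, not from the
    abstract) avoids division by zero for unexpressed genes.
    """
    flagged = []
    # Only genes present in both tables can be compared.
    for gene in sorted(stranded.keys() & unstranded.keys()):
        s = stranded[gene] + pseudocount
        u = unstranded[gene] + pseudocount
        # Symmetric fold difference: over- and under-estimates both count.
        if max(s, u) / min(s, u) >= fold:
            flagged.append(gene)
    return flagged


# Toy estimates for illustration only; not data from the study.
counts_stranded = {"GENE_A": 100.0, "GENE_B": 10.0, "GENE_C": 0.0}
counts_unstranded = {"GENE_A": 110.0, "GENE_B": 45.0, "GENE_C": 0.0}
print(flag_strand_sensitive(counts_stranded, counts_unstranded))
```

In this toy input only GENE_B exceeds the threshold, for instance because reads from an overlapping gene on the opposite strand inflate its unstranded estimate.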


Signals of positive selection in mitochondrial protein‐coding genes of woolly mammoth: Adaptation to extreme environments?

Ecology and Evolution ◽

10.1002/ece3.5250 ◽

2019 ◽

Vol 9 (12) ◽

pp. 6821-6832 ◽

Cited By ~ 2

Author(s):

Jacob Njaramba Ngatia ◽

Tian Ming Lan ◽

Thi Dao Dinh ◽

Le Zhang ◽

Ahmed Khalid Ahmed ◽

...

Keyword(s):

Positive Selection ◽

Mitochondrial Protein ◽

Extreme Environments ◽

Protein Coding ◽

Protein Coding Genes ◽

Woolly Mammoth
