Biased gene conversion drives codon usage in human and precludes selection on translation efficiency

Mapping Intimacies ◽

10.1101/086447 ◽

2016 ◽

Author(s):

Fanny Pouyet ◽

Dominique Mouchiroud ◽

Laurent Duret ◽

Marie Sémon

Keyword(s):

Gene Expression ◽

Codon Usage ◽

Gene Conversion ◽

Large Scale ◽

Synonymous Codon ◽

Gc Content ◽

Intragenic Recombination ◽

Translation Efficiency ◽

Functional Categories ◽

Biased Gene Conversion

Abstract. In humans, as in other mammals, synonymous codon usage (SCU) varies widely among genes. In particular, genes involved in cell differentiation or in proliferation display a distinct codon usage, suggesting that SCU is adaptively constrained to optimize translation efficiency in distinct cellular states. However, in mammals, SCU is known to correlate with large-scale fluctuations of GC-content along chromosomes, caused by meiotic recombination, via the non-adaptive process of GC-biased gene conversion (gBGC). To disentangle and to quantify the different factors driving SCU in humans, we analyzed the relationships between functional categories, base composition, recombination, and gene expression. We first demonstrate that SCU is predominantly driven by large-scale variation in GC-content and is not linked to constraints on tRNA abundance, which excludes an effect of translational selection. In agreement with the gBGC model, we show that differences in SCU among functional categories are explained by variation in intragenic recombination rate, which, in turn, is strongly negatively correlated with gene expression levels during meiosis. Our results indicate that variation in SCU among functional categories (including variation associated with differentiation or proliferation) results from differences in levels of meiotic transcription, which interferes with the formation of crossovers and thereby affects gBGC intensity within genes. Overall, the gBGC model explains 70% of the variance in SCU among genes. We argue that the strong heterogeneity of SCU induced by gBGC in mammalian genomes precludes any optimization of the tRNA pool to the demand in codon usage.


Recombination, meiotic expression and human codon usage

eLife ◽

10.7554/elife.27344 ◽

2017 ◽

Vol 6 ◽

Cited By ~ 23

Author(s):

Fanny Pouyet ◽

Dominique Mouchiroud ◽

Laurent Duret ◽

Marie Sémon

Keyword(s):

Codon Usage ◽

Large Scale ◽

Synonymous Codon ◽

Gc Content ◽

Synonymous Codon Usage ◽

Translation Efficiency ◽

Functional Categories ◽

Human Genes ◽

Biased Gene Conversion ◽

Mammalian Genomes

Synonymous codon usage (SCU) varies widely among human genes. In particular, genes involved in different functional categories display a distinct codon usage, which was interpreted as evidence that SCU is adaptively constrained to optimize translation efficiency in distinct cellular states. We demonstrate here that SCU is not driven by constraints on tRNA abundance, but by large-scale variation in GC-content, caused by meiotic recombination, via the non-adaptive process of GC-biased gene conversion (gBGC). Expression in meiotic cells is associated with a strong decrease in recombination within genes. Differences in SCU among functional categories reflect differences in levels of meiotic transcription, which is linked to variation in recombination and therefore in gBGC. Overall, the gBGC model explains 70% of the variance in SCU among genes. We argue that the strong heterogeneity of SCU induced by gBGC in mammalian genomes precludes any optimization of the tRNA pool to the demand in codon usage.


Selection on silent sites in the rodent H3 histone gene family.

Genetics ◽

10.1093/genetics/138.1.191 ◽

1994 ◽

Vol 138 (1) ◽

pp. 191-202

Author(s):

R W DeBry ◽

W F Marzluff

Keyword(s):

Codon Usage ◽

Gene Conversion ◽

Base Composition ◽

Large Scale ◽

Synonymous Codon ◽

Histone Genes ◽

Flanking Sequence ◽

Effective Population ◽

Mutation Pressure ◽

Synonymous Codons

Abstract. Selection promoting differential use of synonymous codons has been shown for several unicellular organisms and for Drosophila, but not for mammals. Selection coefficients operating on synonymous codons are likely to be extremely small, so that a very large effective population size is required for selection to overcome the effects of drift. In mammals, codon-usage bias is believed to be determined exclusively by mutation pressure, with differences between genes due to large-scale variation in base composition around the genome. The replication-dependent histone genes are expressed at extremely high levels during periods of DNA synthesis, and thus are among the most likely mammalian genes to be affected by selection on synonymous codon usage. We suggest that the extremely biased pattern of codon usage in the H3 genes is determined in part by selection. Silent site G + C content is much higher than expected based on flanking sequence G + C content, compared to other rodent genes with similar silent site base composition but lower levels of expression. Dinucleotide-mediated mutation bias does affect codon usage, but the effect is limited to the choice between G and C in some fourfold degenerate codons. Gene conversion between the two clusters of histone genes has not been an important force in the evolution of the H3 genes, but gene conversion appears to have had some effect within the cluster on chromosome 13.


Large scale variation in the rate of de novo mutation, base composition, divergence and diversity in humans

10.1101/110452 ◽

2017 ◽

Author(s):

Thomas Smith ◽

Peter Arndt ◽

Adam Eyre-Walker

Keyword(s):

Gene Conversion ◽

Mutation Rate ◽

Base Composition ◽

Large Scale ◽

De Novo ◽

Gc Content ◽

De Novo Mutation ◽

Biased Gene Conversion ◽

Different Types ◽

Scale Variation

Abstract. It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. It is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scale that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome and hence unlikely to generate variation in GC-content. We confirm this using two different analyses. We find that genomic features explain less than 50% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species and the level of diversity within humans are correlated to the rate of DNM. However, the correlations are weaker than if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. We find no evidence that linked selection affects the relationship between divergence and DNM density. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.

Author summary. Using a dataset of 40,000 de novo mutations we show that there is large-scale variation in the mutation rate at the 100KB and 1MB scale. We show that different types of mutation vary in concert and in a manner that is not expected to generate variation in base composition; hence mutation bias is not responsible for the large-scale variation in base composition that is observed across human chromosomes. As expected, large-scale variation in the rate of divergence between species and the variation within species across the genome are correlated to the rate of mutation, but the correlation between divergence and the mutation rate is not as strong as it could be. We show that biased gene conversion is responsible for weakening the correlation. In contrast, we find that most of the variation across the genome in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between the rate of mutation in humans and the divergence between humans and other species weakens as the species become more divergent.


Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

BioMed Research International ◽

10.1155/2014/851425 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 15

Author(s):

Edgar E. Lara-Ramírez ◽

Ma Isabel Salazar ◽

María de Jesús López-López ◽

Juan Santiago Salas-Benito ◽

Alejandro Sánchez-Varela ◽

...

Keyword(s):

Dengue Virus ◽

Codon Usage ◽

Large Scale ◽

Synonymous Codon ◽

Geographic Origin ◽

Gc Content ◽

Mutational Bias ◽

Stabilizing Selection ◽

Nucleotide Position ◽

Clustering Patterns

The increasing number of dengue virus (DENV) genome sequences available makes it possible to identify the factors contributing to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3) as well as the effective number of codons (ENC, ENCp) versus GC3 plots revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressures on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution.
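Several abstracts on this page rely on GC3 and RSCU. As a rough illustration only (not the pipeline used in any of the listed studies), the sketch below computes both for a single coding sequence. It assumes the standard genetic code, the usual RSCU definition (observed codon count divided by the mean count across the codons of the same amino acid), and GC3 taken over all recognized codons. The function names `gc3` and `rscu` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering (assumption:
# no alternative translation table is needed for the sequence at hand).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def gc3(seq):
    """Fraction of recognized codons whose third position is G or C."""
    cs = [c for c in codons(seq) if c in CODON_TABLE]
    if not cs:
        return float("nan")
    return sum(c[2] in "GC" for c in cs) / len(cs)


def rscu(seq):
    """RSCU per codon: observed count / mean count within its synonymous family."""
    counts = Counter(c for c in codons(seq) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are ignored
            families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


example = "GCTGCTGCCGCA"  # four alanine codons: GCT, GCT, GCC, GCA
print(gc3(example))
print(rscu(example)["GCT"])
```

An RSCU of 1 means a codon is used as often as expected under equal usage within its family; values above 1 indicate over-representation.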


Massively parallel gene expression variation measurement of a synonymous codon library

BMC Genomics ◽

10.1186/s12864-021-07462-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Alexander Schmitz ◽

Fuzhong Zhang

Keyword(s):

Gene Expression ◽

Codon Usage ◽

Single Cells ◽

Massively Parallel ◽

Protein Abundance ◽

Translation Efficiency ◽

Gene Expression Variation ◽

Expression Variation ◽

Change In Mean ◽

Adaptation Index

Abstract. Background: Cell-to-cell variation in gene expression strongly affects population behavior and is key to multiple biological processes. While codon usage is known to affect ensemble gene expression, how codon usage influences variation in gene expression between single cells is not well understood. Results: Here, we used a Sort-seq based massively parallel strategy to quantify gene expression variation from a green fluorescent protein (GFP) library containing synonymous codons in Escherichia coli. We found that sequences containing codons with higher tRNA Adaptation Index (TAI) scores, and higher codon adaptation index (CAI) scores, have higher GFP variance. This trend is not observed for codons with high Normalized Translation Efficiency Index (nTE) scores nor from the free energy of folding of the mRNA secondary structure. GFP noise, or squared coefficient of variation (CV²), scales with mean protein abundance for low-abundance proteins but does not change at high mean protein abundance. Conclusions: Our results suggest that the main source of noise for high-abundance proteins is likely not originating at translation elongation. Additionally, the drastic change in mean protein abundance with small changes in protein noise seen from our library implies that codon optimization can be performed without concern for gene expression noise in biotechnology applications.
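The noise measure named in this abstract, the squared coefficient of variation, is the variance divided by the squared mean. The sketch below is a minimal illustration of that formula on made-up fluorescence values. It is not the authors' Sort-seq analysis; the function name `noise_cv2` and the use of the population variance are assumptions.

```python
from statistics import fmean, pvariance


def noise_cv2(values):
    """Squared coefficient of variation: population variance / mean**2."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("CV^2 is undefined when the mean is zero")
    return pvariance(values, mu=mean) / mean ** 2


# Hypothetical single-cell fluorescence readings (arbitrary units).
dim = [8, 12]
bright = [80, 120]  # same relative spread, ten times the mean

print(noise_cv2(dim))
print(noise_cv2(bright))
```

Because CV² is dimensionless, the two hypothetical populations above share the same noise value even though their absolute variances differ a hundredfold, which is why it is a convenient way to compare variability across very different expression levels.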


Codon usage bias and environmental adaptation in microbial organisms

Molecular Genetics and Genomics ◽

10.1007/s00438-021-01771-4 ◽

2021 ◽

Author(s):

Davide Arella ◽

Maddalena Dilucca ◽

Andrea Giansanti

Keyword(s):

Codon Usage ◽

Codon Usage Bias ◽

Pathogenic Bacteria ◽

Large Scale ◽

Synonymous Codon ◽

Principal Component ◽

Gene Copy ◽

Phenotypic Traits ◽

Translational Efficiency ◽

Microbial Organisms

Abstract. In each genome, synonymous codons are used with different frequencies; this general phenomenon is known as codon usage bias. It has been previously recognised that codon usage bias could affect the cellular fitness and might be associated with the ecology of microbial organisms. In this exploratory study, we investigated the relationship between codon usage bias, lifestyles (thermophiles vs. mesophiles; pathogenic vs. non-pathogenic; halophilic vs. non-halophilic; aerobic vs. anaerobic and facultative) and habitats (aquatic, terrestrial, host-associated, specialised, multiple) of 615 microbial organisms (544 bacteria and 71 archaea). Principal component analysis revealed that species with given phenotypic traits and living in similar environmental conditions have similar codon preferences, as represented by the relative synonymous codon usage (RSCU) index, and similar spectra of tRNA availability, as gauged by the tRNA gene copy number (tGCN). Moreover, by measuring the average tRNA adaptation index (tAI) for each genome, an index that can be associated with translational efficiency, we observed that organisms able to live in multiple habitats, including facultative organisms, mesophiles and pathogenic bacteria, are characterised by a reduced translational efficiency, consistent with their need to adapt to different environments. Our results show that synonymous codon choices might be under strong translational selection, which modulates the choice of the codons to differently match tRNA availability, depending on the organism’s lifestyle needs. To our knowledge, this is the first large-scale study that examines the role of codon bias and translational efficiency in the adaptation of microbial organisms to the environment in which they live.


Intragenomic variation in mutation biases causes underestimation of selection on synonymous codon usage

10.1101/2021.10.29.466462 ◽

2021 ◽

Author(s):

Alexander L Cope ◽

Premal Shah

Keyword(s):

Population Genetics ◽

Natural Selection ◽

Codon Usage ◽

Codon Bias ◽

Synonymous Codon ◽

Synonymous Codon Usage ◽

Mutation Bias ◽

Biased Gene Conversion ◽

Intragenomic Variation ◽

The Impact

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.


Gene Expression Levels Are Correlated with Synonymous Codon Usage, Amino Acid Composition, and Gene Architecture in the Red Flour Beetle, Tribolium castaneum

Molecular Biology and Evolution ◽

10.1093/molbev/mss184 ◽

2012 ◽

Vol 29 (12) ◽

pp. 3755-3766 ◽

Cited By ~ 28

Author(s):

Anna Williford ◽

Jeffery P. Demuth

Keyword(s):

Gene Expression ◽

Amino Acid ◽

Codon Usage ◽

Amino Acid Composition ◽

Tribolium Castaneum ◽

Synonymous Codon ◽

Synonymous Codon Usage ◽

Red Flour Beetle ◽

Gene Architecture ◽

Gene Expression Levels


Synonymous Codon Usage Analysis of Thirty Two Mycobacteriophage Genomes

Advances in Bioinformatics ◽

10.1155/2009/316936 ◽

2009 ◽

Vol 2009 ◽

pp. 1-11 ◽

Cited By ~ 15

Author(s):

Sameer Hassan ◽

Vasantha Mahalingam ◽

Vanaja Kumar

Keyword(s):

Codon Usage ◽

Synonymous Codon ◽

Nucleotide Composition ◽

Synonymous Codon Usage ◽

Compositional Bias ◽

Trna Genes ◽

Translation Efficiency ◽

Multivariate Statistical ◽

Strong Negative Correlation ◽

Highly Expressed Genes

Synonymous codon usage of protein coding genes of thirty two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage is identified to be compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C-ending codons are highly preferred over G-ending codons. A strong negative correlation between effective number of codons (Nc) and GC3s content was also observed, showing that the codon usage was affected by gene nucleotide composition. Translational selection is also identified to play a role in shaping the codon usage, operating at the level of translational accuracy. A high level of heterogeneity is seen among and between the genomes. Length of genes is also identified to influence the codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper is identified to be the most highly biased genome, with better translation efficiency, comparing well with the host-specific tRNA genes.


In-depth analysis of amino acid and nucleotide sequences of Hsp60: how conserved is this protein?

10.22541/au.163254907.78657374/v1 ◽

2021 ◽

Author(s):

Tatyana Tikhomirova ◽

Maxim Matyunin ◽

Mikhail Lobanov ◽

Oxana Galzitskaya

Keyword(s):

Amino Acid ◽

Codon Usage ◽

Synonymous Codon ◽

Dominant Role ◽

Gc Content ◽

Nucleotide Composition ◽

Nucleotide Sequences ◽

Amino Acid Sequences ◽

Mutational Pressure ◽

Depth Analysis

Chaperonin Hsp60, as a protein found in all organisms, is of great interest in medicine, since it is present in many tissues and can be used both as a drug and as an object of targeted therapy. Hence, Hsp60 deserves a fundamental comparative analysis to assess its evolutionary characteristics. It was found that the percent identity of Hsp60 amino acid sequences both within and between phyla was not high enough to identify Hsp60s as highly conserved proteins. In turn, their amino acid composition remained relatively constant. At the same time, the analysis of the nucleotide sequences showed that GC content in the Hsp60 genes was comparable to or greater than the genomic values, which may indicate a high resistance to mutations due to tight control of the nucleotide composition by DNA repair systems. Natural selection plays a dominant role in the evolution of Hsp60 genes. The degree of mutational pressure affecting the Hsp60 genes is quite low, and its direction does not depend on taxonomy. Interestingly, for the Hsp60 genes from Chordata, Arthropoda, and Proteobacteria the exact direction of mutational pressure could not be determined. However, upon further division into classes, it was found that the direction of the mutational pressure for Hsp60 genes from Fish differs from that for other chordates. The direction of the mutational pressure affects the synonymous codon usage bias. The number of high and low represented codons increases with increasing GC content, which can improve codon usage.
