Hidden patterns of codon usage bias across kingdoms

10.1101/478016 ◽

2018 ◽

Author(s):

Yun Deng ◽

Fabio de Lima Hedayioglu ◽

Jeremie Kalfon ◽

Dominique Chu ◽

Tobias von der Haar

Keyword(s):

Amino Acids ◽

Codon Usage ◽

Codon Usage Bias ◽

Large Scale ◽

Model Organisms ◽

Stochastic Thermodynamics ◽

Synonymous Codons ◽

Genome Wide ◽

Taxonomic Class ◽

Taxonomic Groups

The genetic code is necessarily degenerate, with 64 possible nucleotide triplets being translated into 20 amino acids. Eighteen of the 20 amino acids are encoded by multiple synonymous codons. While synonymous codons are clearly equivalent in terms of the information they carry, it is now well established that they are used in a biased fashion. There is currently no consensus as to the origin of this bias. Drawing on ideas from stochastic thermodynamics, we derive from first principles a mathematical model describing the statistics of codon usage bias. We show that the model accurately describes the distribution of codon usage bias of genomes in the fungal and bacterial kingdoms. Based on it, we derive a new computational measure of codon usage bias, the distance D, capturing two aspects of codon usage bias: (i) differences in the genome-wide frequency of codons and (ii) apparent non-random distributions of codons across mRNAs. By means of large-scale computational analysis of over 900 species across two kingdoms of life, we demonstrate that our measure provides novel biological insights. Specifically, we show that while codon usage bias is clearly based on heritable traits and closely related species show similar degrees of bias, there is considerable variation in the magnitude of D within taxonomic classes, suggesting that the contribution of sequence-level selection to codon bias varies substantially within relatively confined taxonomic groups. Interestingly, commonly used model organisms are near the median for values of D for their taxonomic class, suggesting that they may not be good representative models for species with more extreme D, which comprise organisms of medical and agricultural interest. We also demonstrate that amino acid-specific patterns of codon usage are themselves quite variable between branches of the tree of life, and that some of this variability correlates with organismal tRNA content.


Hidden patterns of codon usage bias across kingdoms

Journal of The Royal Society Interface ◽

10.1098/rsif.2019.0819 ◽

2020 ◽

Vol 17 (163) ◽

pp. 20190819

Author(s):

Yun Deng ◽

Fabio de Lima Hedayioglu ◽

Jeremie Kalfon ◽

Dominique Chu ◽

Tobias von der Haar

Keyword(s):

Amino Acids ◽

Codon Usage ◽

Codon Usage Bias ◽

Large Scale ◽

Model Organisms ◽

Stochastic Thermodynamics ◽

Synonymous Codons ◽

Taxonomic Class ◽

Taxonomic Groups ◽

Two Kingdoms

The genetic code is necessarily degenerate, with 64 possible nucleotide triplets being translated into 20 amino acids. Eighteen of the 20 amino acids are encoded by multiple synonymous codons. While synonymous codons are clearly equivalent in terms of the information they carry, it is now well established that they are used in a biased fashion. There is currently no consensus as to the origin of this bias. Drawing on ideas from stochastic thermodynamics, we derive from first principles a mathematical model describing the statistics of codon usage bias. We show that the model accurately describes the distribution of codon usage bias of genomes in the fungal and bacterial kingdoms. Based on it, we derive a new computational measure of codon usage bias, the distance D, capturing two aspects of codon usage bias: (i) differences in the genome-wide frequency of codons and (ii) apparent non-random distributions of codons across mRNAs. By means of large-scale computational analysis of over 900 species across two kingdoms of life, we demonstrate that our measure provides novel biological insights. Specifically, we show that while codon usage bias is clearly based on heritable traits and closely related species show similar degrees of bias, there is considerable variation in the magnitude of D within taxonomic classes, suggesting that the contribution of sequence-level selection to codon bias varies substantially within relatively confined taxonomic groups. Interestingly, commonly used model organisms are near the median for values of D for their taxonomic class, suggesting that they may not be good representative models for species with more extreme D, which comprise organisms of medical and agricultural interest. We also demonstrate that amino acid-specific patterns of codon usage are themselves quite variable between branches of the tree of life, and that some of this variability correlates with organismal tRNA content.


Patterns of genome-wide codon usage bias in tobacco, tomato and potato

Biotechnology & Biotechnological Equipment ◽

10.1080/13102818.2021.1911684 ◽

2021 ◽

Vol 35 (1) ◽

pp. 657-664

Author(s):

Ali Mostafa Anwar ◽

Maha Aljabri ◽

Mohamed El-Soda

Keyword(s):

Codon Usage ◽

Codon Usage Bias ◽

Genome Wide


Codon usage bias and environmental adaptation in microbial organisms

Molecular Genetics and Genomics ◽

10.1007/s00438-021-01771-4 ◽

2021 ◽

Author(s):

Davide Arella ◽

Maddalena Dilucca ◽

Andrea Giansanti

Keyword(s):

Codon Usage ◽

Codon Usage Bias ◽

Pathogenic Bacteria ◽

Large Scale ◽

Synonymous Codon ◽

Principal Component ◽

Gene Copy ◽

Phenotypic Traits ◽

Translational Efficiency ◽

Microbial Organisms

In each genome, synonymous codons are used with different frequencies; this general phenomenon is known as codon usage bias. It has been previously recognised that codon usage bias could affect cellular fitness and might be associated with the ecology of microbial organisms. In this exploratory study, we investigated the relationship between codon usage bias, lifestyles (thermophiles vs. mesophiles; pathogenic vs. non-pathogenic; halophilic vs. non-halophilic; aerobic vs. anaerobic and facultative) and habitats (aquatic, terrestrial, host-associated, specialised, multiple) of 615 microbial organisms (544 bacteria and 71 archaea). Principal component analysis revealed that species with given phenotypic traits and living in similar environmental conditions have similar codon preferences, as represented by the relative synonymous codon usage (RSCU) index, and similar spectra of tRNA availability, as gauged by the tRNA gene copy number (tGCN). Moreover, by measuring the average tRNA adaptation index (tAI) for each genome, an index that can be associated with translational efficiency, we observed that organisms able to live in multiple habitats, including facultative organisms, mesophiles and pathogenic bacteria, are characterised by a reduced translational efficiency, consistent with their need to adapt to different environments. Our results show that synonymous codon choices might be under strong translational selection, which modulates the choice of codons to match tRNA availability differently, depending on the organism's lifestyle needs. To our knowledge, this is the first large-scale study that examines the role of codon bias and translational efficiency in the adaptation of microbial organisms to the environment in which they live.
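The RSCU index mentioned in this abstract can be illustrated with a minimal sketch. It uses the common textbook definition (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally) and assumes the standard genetic code; the function name `rscu` and the toy sequence are hypothetical and are not taken from the paper, whose exact pipeline is not described here.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(coding_sequences):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        # Walk complete codons only; a trailing partial codon is ignored.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group sense codons into synonymous families (stop codons excluded).
    families = {}
    for codon, aa in CODON_TO_AA.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input: RSCU undefined
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


# Hypothetical toy input: four alanine codons, three GCT and one GCC.
toy = rscu(["GCTGCTGCTGCC"])
print(toy["GCT"], toy["GCC"], toy["GCA"])
```

A value of 1 means a codon is used as often as expected under uniform synonymous usage; values above 1 indicate preference. Single-codon amino acids (Met, Trp) always score 1 when present and are usually dropped before a principal component analysis.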


Genome-Wide Analysis of Codon Usage Patterns of SARS-CoV-2 Virus Reveals Global Heterogeneity of COVID-19

Biomolecules ◽

10.3390/biom11060912 ◽

2021 ◽

Vol 11 (6) ◽

pp. 912

Author(s):

Saadullah Khattak ◽

Mohd Ahmar Rauf ◽

Qamar Zaman ◽

Yasir Ali ◽

Shabeen Fatima ◽

...

Keyword(s):

Standard Deviation ◽

Codon Usage ◽

Codon Usage Bias ◽

Geographic Location ◽

Mutation Pressure ◽

Genome Wide ◽

Margin Of Error ◽

Usage Patterns ◽

Causative Agents ◽

Virus Genomes

The ongoing outbreak of coronavirus disease COVID-19 is significantly influenced by global heterogeneity in the genome organization of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The causes of global heterogeneity in the whole genome of SARS-CoV-2 are not well characterized because of the lack of a comparative study with a sample size from around the globe large enough to reduce the standard deviation to an acceptable margin of error. To better understand the SARS-CoV-2 genome architecture, we have performed a comprehensive analysis of codon usage bias of sixty (60) strains to get a snapshot of its global heterogeneity. Our study shows a relatively low codon usage bias in the SARS-CoV-2 viral genome globally, with nearly all of the over-preferred codons being A/U-ended. We concluded that the SARS-CoV-2 genome is primarily shaped by mutation pressure; however, marginal selection pressure cannot be overlooked. Within the A/U-rich virus genomes of SARS-CoV-2, the standard deviations in GC content (42.91% ± 5.84%) and in the GC3 value (30.14% ± 6.93%) point towards global heterogeneity of the virus. The observation that several SARS-CoV-2 strains originating from different viral lineages were found at the same geographic location also supports this. Taken together, these findings suggest that the root ancestry of the global genomes differs, with different genome-level adaptation to the host. This research may provide new insights into the codon patterns, host adaptation, and global heterogeneity of SARS-CoV-2.
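The GC and GC3 statistics quoted in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: GC content is taken as the fraction of G and C bases in a sequence, GC3 as the same fraction restricted to third codon positions of an in-frame coding sequence, and the spread is the sample standard deviation across genomes. The function names and toy sequences are hypothetical; the paper's own tooling is not described here.

```python
from statistics import mean, stdev


def gc_fraction(seq):
    """Fraction of G or C bases in the whole sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    # Slicing from index 2 in steps of 3 picks the third base of each
    # complete codon; a trailing partial codon contributes nothing.
    thirds = cds.upper()[2::3]
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def mean_and_sd(values):
    """Mean and sample standard deviation (needs at least two values)."""
    return mean(values), stdev(values)


# Hypothetical toy coding sequences standing in for per-strain CDS sets.
toy_cds = ["ATGGCC", "ATGAAA", "ATGAAT"]
gc3_values = [gc3_fraction(s) for s in toy_cds]
print(mean_and_sd(gc3_values))
```

Reporting the result as mean ± standard deviation, as the abstract does, only makes sense when each value comes from a comparable unit, for example one concatenated CDS set per strain.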


The effect of expression levels on codon usage in Plasmodium falciparum

Parasitology ◽

10.1017/s0031182003004517 ◽

2004 ◽

Vol 128 (3) ◽

pp. 245-251 ◽

Cited By ~ 26

Author(s):

L. PEIXOTO ◽

V. FERNÁNDEZ ◽

H. MUSTO

Keyword(s):

Amino Acids ◽

Plasmodium Falciparum ◽

Natural Selection ◽

Codon Usage ◽

Complete Sequence ◽

Expression Data ◽

Expression Levels ◽

Synonymous Codons ◽

Translational Selection ◽

Highly Expressed Genes

The usage of alternative synonymous codons in the completely sequenced, extremely A+T-rich parasite Plasmodium falciparum was studied. Confirming previous studies obtained with less than 3% of the total genes recently described, we found that A- and U-ending triplets predominate but translational selection increases the frequency of a subset of codons in highly expressed genes. However, some new results come from the analysis of the complete sequence. First, there is more variation in GC3 than previously described; second, the effect of natural selection acting at the level of translation has been analysed with real expression data at 4 different stages; and third, we found that highly expressed proteins increment the frequency of energetically less expensive amino acids. The implications of these results are discussed.


Codon usage bias creates a ramp of hydrogen bonding at the 5′-end in prokaryotic ORFeomes

10.1101/811612 ◽

2019 ◽

Author(s):

Juan C. Villada ◽

Maria F. Duran ◽

Patrick K. H. Lee

Keyword(s):

Hydrogen Bonding ◽

Codon Usage ◽

Codon Usage Bias ◽

Translation Efficiency ◽

Molecular Processes ◽

Molecular Feature ◽

Web Based ◽

Synonymous Codons ◽

Double Stranded Dna ◽

Codon Positions

Codon usage bias exerts control over a wide variety of molecular processes. The positioning of synonymous codons within coding sequences (CDSs) dictates protein expression by mechanisms such as local translation efficiency, mRNA Gibbs free energy, and protein co-translational folding. In this work, we explore how codon variants affect the position-dependent content of hydrogen bonding, which in turn influences energy requirements for unwinding double-stranded DNA. By analyzing over 14,000 bacterial, archaeal, and fungal ORFeomes, we found that Bacteria and Archaea exhibit an exponential ramp of hydrogen bonding at the 5′-end of CDSs, while a similar ramp was not found in Fungi. The ramp develops within the first 20 codon positions in prokaryotes, eventually reaching a steady carrying capacity of hydrogen bonding that does not differ from Fungi. Selection against uniformity tests proved that selection acts against synonymous codons with high content of hydrogen bonding at the 5′-end of prokaryotic ORFeomes. Overall, this study provides novel insights into the molecular feature of hydrogen bonding that is governed by the genetic code at the 5′-end of CDSs. A web-based application to analyze the position-dependent hydrogen bonding of ORFeomes has been developed and is publicly available (https://juanvillada.shinyapps.io/hbonds/).
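The position-dependent hydrogen bonding described in this abstract can be sketched roughly as below. The only fixed fact used is that a Watson-Crick A/T pair forms two hydrogen bonds and a G/C pair forms three. Summing bonds per codon and averaging by codon position across coding sequences is an assumption made for illustration, and the names `codon_hbonds` and `mean_profile` are hypothetical; the authors' published method and web application may differ.

```python
# Hydrogen bonds per Watson-Crick base pair in double-stranded DNA.
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def codon_hbonds(cds):
    """Hydrogen bonds per complete codon, in 5' to 3' order.

    Raises KeyError on ambiguous bases such as N (assumption: inputs
    are clean A/C/G/T coding sequences).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return [
        sum(HBONDS[base] for base in cds[i:i + 3])
        for i in range(0, usable, 3)
    ]


def mean_profile(cds_list, n_positions=20):
    """Mean hydrogen bonds at each of the first n codon positions.

    Sequences shorter than a given position do not contribute to it;
    positions covered by no sequence are reported as None.
    """
    sums = [0] * n_positions
    hits = [0] * n_positions
    for cds in cds_list:
        for pos, bonds in enumerate(codon_hbonds(cds)[:n_positions]):
            sums[pos] += bonds
            hits[pos] += 1
    return [s / h if h else None for s, h in zip(sums, hits)]


# Hypothetical toy input: two short coding sequences.
print(mean_profile(["ATGGCC", "ATGAAA"], n_positions=3))
```

Each codon scores between 6 (all A/T) and 9 (all G/C), so a ramp at the 5′-end would appear as a profile that starts low and rises over the first positions before levelling off.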


Analysis of computational codon usage models and their association with translationally slow codons

10.1101/2020.03.26.010488 ◽

2020 ◽

Author(s):

Gabriel Wright ◽

Anabel Rodriguez ◽

Jun Li ◽

Patricia L. Clark ◽

Tijana Milenković ◽

...

Keyword(s):

Codon Usage ◽

Computational Models ◽

Selective Pressure ◽

Synonymous Codon ◽

Ground Truth ◽

Protein Translation ◽

Weak Correlation ◽

Experimental Conditions ◽

Synonymous Codons ◽

Genome Wide

Improved computational modeling of protein translation rates, including better prediction of where translational slowdowns along an mRNA sequence may occur, is critical for understanding co-translational folding. Because codons within a synonymous codon group are translated at different rates, many computational translation models rely on analyzing synonymous codons. Some models rely on genome-wide codon usage bias (CUB), believing that globally rare and common codons are the most informative of slow and fast translation, respectively. Others use the CUB observed only in highly expressed genes, which should be under selective pressure to be translated efficiently (and whose CUB may therefore be more indicative of translation rates). No prior work has analyzed these models for their ability to predict translational slowdowns. Here, we evaluate five models for their association with slowly translated positions as denoted by two independent ribosome footprint (RFP) count experiments from S. cerevisiae, because RFP data is often considered as a “ground truth” for translation rates across mRNA sequences. We show that all five considered models strongly associate with the RFP data and therefore have potential for estimating translational slowdowns. However, we also show that there is a weak correlation between RFP counts for the same genes originating from independent experiments, even when their experimental conditions are similar. This raises concerns about the efficacy of using current RFP experimental data for estimating translation rates and highlights a potential advantage of using computational models to understand translation rates instead.


Comparative analysis of codon usage bias in Crenarchaea and Euryarchaea genome reveals differential preference of synonymous codons to encode highly expressed ribosomal and RNA polymerase proteins

Journal of Genetics ◽

10.1007/s12041-016-0667-5 ◽

2016 ◽

Vol 95 (3) ◽

pp. 537-549 ◽

Cited By ~ 2

Author(s):

VISHWA JYOTI BARUAH ◽

SIDDHARTHA SANKAR SATAPATHY ◽

BHESH RAJ POWDEL ◽

ROCKTOTPAL KONWARH ◽

ALAK KUMAR BURAGOHAIN ◽

...

Keyword(s):

Comparative Analysis ◽

Rna Polymerase ◽

Codon Usage ◽

Codon Usage Bias ◽

Synonymous Codons


Codon usage of highly expressed genes affects proteome-wide translation efficiency

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1719375115 ◽

2018 ◽

Vol 115 (21) ◽

pp. E4940-E4949 ◽

Cited By ~ 57

Author(s):

Idan Frumkin ◽

Marc J. Lajoie ◽

Christopher J. Gregg ◽

Gil Hornung ◽

George M. Church ◽

...

Keyword(s):

Escherichia Coli ◽

Amino Acid ◽

Codon Usage ◽

Codon Usage Bias ◽

Protein Translation ◽

Translation Efficiency ◽

Synonymous Codons ◽

Codon Composition ◽

Theoretical Predictions ◽

Highly Expressed Genes

Although the genetic code is redundant, synonymous codons for the same amino acid are not used with equal frequencies in genomes, a phenomenon termed “codon usage bias.” Previous studies have demonstrated that synonymous changes in a coding sequence can exert significant cis effects on the gene’s expression level. However, whether the codon composition of a gene can also affect the translation efficiency of other genes has not been thoroughly explored. To study how codon usage bias influences the cellular economy of translation, we massively converted abundant codons to their rare synonymous counterpart in several highly expressed genes in Escherichia coli. This perturbation reduces both the cellular fitness and the translation efficiency of genes that have high initiation rates and are naturally enriched with the manipulated codon, in agreement with theoretical predictions. Interestingly, we could alleviate the observed phenotypes by increasing the supply of the tRNA for the highly demanded codon, thus demonstrating that the codon usage of highly expressed genes was selected in evolution to maintain the efficiency of global protein translation.


Genome-wide codon usage pattern analysis reveals the correlation between codon usage bias and gene expression in Cuscuta australis

Genomics ◽

10.1016/j.ygeno.2020.03.002 ◽

2020 ◽

Vol 112 (4) ◽

pp. 2695-2702 ◽

Cited By ~ 2

Author(s):

Xu-Yuan Liu ◽

Yu Li ◽

Kai-Kai Ji ◽

Jie Zhu ◽

Peng Ling ◽

...

Keyword(s):

Gene Expression ◽

Codon Usage ◽

Codon Usage Bias ◽

Pattern Analysis ◽

Codon Usage Pattern ◽

Usage Pattern ◽

Genome Wide
