COUSIN (COdon Usage Similarity INdex): A normalized measure of Codon Usage Preferences

2019 ◽  
Author(s):  
Jérôme Bourret ◽  
Samuel Alizon ◽  
Ignacio G. Bravo

Abstract Codon Usage Preferences (CUPrefs) describe the unequal usage of synonymous codons at the gene, genomic region or genome scale. Numerous indices have been developed to measure the CUPrefs of a sequence. We introduce a normalized index to calculate CUPrefs called COUSIN for COdon Usage Similarity INdex. This index compares the CUPrefs of a query against those of a reference dataset and normalizes the output over a Null Hypothesis of random codon usage. COUSIN results can be easily interpreted, quantitatively and qualitatively. We exemplify the use of COUSIN and highlight its advantages with an analysis on the complete coding sequences of eight divergent genomes, two of them with extreme nucleotide composition. Strikingly, COUSIN captures a hitherto unreported bimodal distribution in CUPrefs in genes in the human and in the chicken genomes. We show that this bimodality can be explained by the global nucleotide composition bias of the chromosome in which the gene resides, and by the precise location within the chromosome. Our results highlight the power of the COUSIN index and uncover unexpected characteristics of the CUPrefs in human and chicken. An eponymous tool written in Python3 to calculate COUSIN is available for online or local use.

2019 ◽  
Vol 11 (12) ◽  
pp. 3523-3528 ◽  
Author(s):  
Jérôme Bourret ◽  
Samuel Alizon ◽  
Ignacio G Bravo

Abstract Codon Usage Preferences (CUPrefs) describe the unequal usage of synonymous codons at the gene, chromosome, or genome levels. Numerous indices have been developed to evaluate CUPrefs, either in absolute terms or with respect to a reference. We introduce the normalized index COUSIN (for COdon Usage Similarity INdex), which compares the CUPrefs of a query against those of a reference and normalizes the output over a Null Hypothesis of random codon usage. The added value of COUSIN is that its results are easily interpreted, both quantitatively and qualitatively. An eponymous software tool written in Python3 is available for local or online use (http://cousin.ird.fr). This software allows for an easy and complete analysis of CUPrefs via COUSIN, includes seven other indices, and provides additional features such as statistical analyses, clustering, and CUPrefs optimization for gene expression. We illustrate the flexibility of COUSIN and highlight its advantages by analyzing the complete coding sequences of eight divergent genomes. Strikingly, COUSIN captures a bimodal distribution in the CUPrefs of human and chicken genes hitherto unreported with such precision. COUSIN opens new perspectives to uncover CUPrefs specificities in genomes in a practical, informative, and user-friendly way.


Genetics ◽  
2001 ◽  
Vol 159 (3) ◽  
pp. 1191-1199
Author(s):  
Araxi O Urrutia ◽  
Laurence D Hurst

Abstract In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.


Parasitology ◽  
2004 ◽  
Vol 128 (3) ◽  
pp. 245-251 ◽  
Author(s):  
L. PEIXOTO ◽  
V. FERNÁNDEZ ◽  
H. MUSTO

The usage of alternative synonymous codons in the completely sequenced, extremely A+T-rich parasite Plasmodium falciparum was studied. Confirming previous studies obtained with less than 3% of the total genes recently described, we found that A- and U-ending triplets predominate but translational selection increases the frequency of a subset of codons in highly expressed genes. However, some new results come from the analysis of the complete sequence. First, there is more variation in GC3 than previously described; second, the effect of natural selection acting at the level of translation has been analysed with real expression data at 4 different stages and third, we found that highly expressed proteins increment the frequency of energetically less expensive amino acids. The implications of these results are discussed.


2021 ◽  
Author(s):  
Neetu Tyagi ◽  
Rahila Sardar ◽  
Dinesh Gupta

Abstract The Coronavirus disease 2019 (COVID-19) outbreak caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) poses a worldwide human health crisis, causing respiratory illness with a high mortality rate. To investigate the factors governing codon usage bias in respiratory viruses, a codon usage bias (CUBs) analysis was performed on all the respiratory viruses, including SARS-CoV-2 isolates from different geographical locations (~62K) and two recently emerged strains: VUI202012/01 from the United Kingdom (UK) and 501.Y.V2 from South Africa (SA). The analysis includes RSCU analysis, GC content calculation, ENC analysis, dinucleotide frequency and neutrality plot analysis. The study had two primary aims: first, to identify the differences in codon usage bias amongst all SARS-CoV-2 genomes and, second, to compare their CUBs properties with those of other respiratory viruses. A biased nucleotide composition was found, as most of the highly preferred codons were A/U-ending in all the respiratory viruses studied here. Compared with the human host, the RSCU analysis led to the identification of 11 over-represented codons and 9 under-represented codons in SARS-CoV-2 genomes. Correlation analysis of ENC and GC3s revealed that mutational pressure is the leading force determining the CUBs. The results of the present study yield a better understanding of codon usage preferences for SARS-CoV-2 genomes and reveal the possible evolutionary determinants responsible for the biases found among the respiratory viruses, thus unveiling a unique feature of SARS-CoV-2 evolution and adaptation. To the best of our knowledge, this is the first attempt at comparative CUBs analysis on the worldwide genomes of SARS-CoV-2, including newly emerged strains and other respiratory viruses.
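
Two of the quantities named in this abstract, RSCU and third-position GC content, have simple textbook definitions that a short sketch can make concrete. The Python below is illustrative only and is not the authors' pipeline: the function names `rscu` and `gc3` are hypothetical, the standard genetic code is assumed, and `gc3` counts every third codon position (the stricter GC3s variant, which considers only synonymous third positions, is not implemented here).

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split a coding sequence into complete codons (RNA 'U' is read as 'T')."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def rscu(cds):
    """Relative Synonymous Codon Usage: observed count of a codon divided by
    the count expected if all synonymous codons were used equally."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


def gc3(cds):
    """Fraction of G or C at the third position of all complete codons."""
    thirds = [codon[2] for codon in split_codons(cds)]
    return sum(base in "GC" for base in thirds) / len(thirds)


example = "GCTGCTGCCGCA"  # four alanine codons
print(rscu(example))
print(gc3(example))
```

An RSCU value above 1 indicates a codon used more often than expected under equal synonymous usage, and a value below 1 indicates the opposite; the thresholds used in the study to call codons over- or under-represented are not stated in the abstract.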


2019 ◽  
Author(s):  
Juan C. Villada ◽  
Maria F. Duran ◽  
Patrick K. H. Lee

Codon usage bias exerts control over a wide variety of molecular processes. The positioning of synonymous codons within coding sequences (CDSs) dictates protein expression by mechanisms such as local translation efficiency, mRNA Gibbs free energy, and protein co-translational folding. In this work, we explore how codon variants affect the position-dependent content of hydrogen bonding, which in turn influences energy requirements for unwinding double-stranded DNA. By analyzing over 14,000 bacterial, archaeal, and fungal ORFeomes, we found that Bacteria and Archaea exhibit an exponential ramp of hydrogen bonding at the 5′-end of CDSs, while a similar ramp was not found in Fungi. The ramp develops within the first 20 codon positions in prokaryotes, eventually reaching a steady carrying capacity of hydrogen bonding that does not differ from Fungi. Selection against uniformity tests proved that selection acts against synonymous codons with high content of hydrogen bonding at the 5′-end of prokaryotic ORFeomes. Overall, this study provides novel insights into the molecular feature of hydrogen bonding that is governed by the genetic code at the 5′-end of CDSs. A web-based application to analyze the position-dependent hydrogen bonding of ORFeomes has been developed and is publicly available (https://juanvillada.shinyapps.io/hbonds/).
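
The idea of position-dependent hydrogen bonding can be illustrated with a small sketch. It assumes the standard Watson-Crick counts (two hydrogen bonds per A-T pair, three per G-C pair) and simply averages the per-codon total at each codon position across a set of CDSs. The function names and the 20-position default are hypothetical choices for illustration, not the authors' implementation or their web application.

```python
# Watson-Crick hydrogen bonds contributed by each base when paired.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def codon_hbonds(cds):
    """Return the hydrogen-bond count of each complete codon in a CDS."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [
        sum(H_BONDS.get(base, 0) for base in cds[i:i + 3])
        for i in range(0, usable, 3)
    ]


def positional_mean_hbonds(cds_list, n_positions=20):
    """Average hydrogen-bond count at each of the first n codon positions,
    using only the sequences long enough to reach that position."""
    per_sequence = [codon_hbonds(cds) for cds in cds_list]
    means = []
    for pos in range(n_positions):
        column = [h[pos] for h in per_sequence if len(h) > pos]
        if not column:
            break  # no sequence reaches this codon position
        means.append(sum(column) / len(column))
    return means


print(positional_mean_hbonds(["ATGGCC", "ATGAAA"], n_positions=2))
```

Plotting the returned list against codon position for a whole ORFeome would show whether the values rise over the first codons, which is the kind of 5′-end ramp the abstract describes.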


2020 ◽  
Author(s):  
Gabriel Wright ◽  
Anabel Rodriguez ◽  
Jun Li ◽  
Patricia L. Clark ◽  
Tijana Milenković ◽  
...  

Abstract Improved computational modeling of protein translation rates, including better prediction of where translational slowdowns along an mRNA sequence may occur, is critical for understanding co-translational folding. Because codons within a synonymous codon group are translated at different rates, many computational translation models rely on analyzing synonymous codons. Some models rely on genome-wide codon usage bias (CUB), believing that globally rare and common codons are the most informative of slow and fast translation, respectively. Others use the CUB observed only in highly expressed genes, which should be under selective pressure to be translated efficiently (and whose CUB may therefore be more indicative of translation rates). No prior work has analyzed these models for their ability to predict translational slowdowns. Here, we evaluate five models for their association with slowly translated positions as denoted by two independent ribosome footprint (RFP) count experiments from S. cerevisiae, because RFP data is often considered as a “ground truth” for translation rates across mRNA sequences. We show that all five considered models strongly associate with the RFP data and therefore have potential for estimating translational slowdowns. However, we also show that there is a weak correlation between RFP counts for the same genes originating from independent experiments, even when their experimental conditions are similar. This raises concerns about the efficacy of using current RFP experimental data for estimating translation rates and highlights a potential advantage of using computational models to understand translation rates instead.


2016 ◽  
Vol 95 (3) ◽  
pp. 537-549 ◽  
Author(s):  
VISHWA JYOTI BARUAH ◽  
SIDDHARTHA SANKAR SATAPATHY ◽  
BHESH RAJ POWDEL ◽  
ROCKTOTPAL KONWARH ◽  
ALAK KUMAR BURAGOHAIN ◽  
...  

2015 ◽  
Vol 13 (02) ◽  
pp. 1550002
Author(s):  
Mohammad-Hadi Foroughmand-Araabi ◽  
Bahram Goliaei ◽  
Kasra Alishahi ◽  
Mehdi Sadeghi ◽  
Sama Goliaei

Although it is known that synonymous codons are not chosen randomly, the role of codon usage in gene regulation is not yet clearly understood. Researchers have investigated the relation between codon usage and various properties, such as gene regulation, translation rate, translation efficiency, mRNA stability, splicing, and protein domains. Recently, a universal codon-usage-based mechanism for gene regulation has been proposed. We studied the role of protein sequence patterns in the codon usage of related genes. Considering a subsequence of a protein that matches a pattern or motif, we showed that the parts of the genes that are translated to this subsequence use specific ratios of synonymous codons. We also built a multinomial logistic regression model for codon usage that considers the effect of patterns on codon usage. This model explains the observed codon usage preference better than the classic organism-dependent codon usage. Our results showed that codon usage plays a role in controlling protein levels for genes that participate in a specific biological function. This is the first time that this phenomenon has been reported.


2015 ◽  
Vol 94 (2) ◽  
pp. 251-260 ◽  
Author(s):  
XIAO-XIA MA ◽  
YU-PING FENG ◽  
JIA-LING BAI ◽  
DE-RONG ZHANG ◽  
XIN-SHI LIN ◽  
...  
