Elucidation of Codon Usage Signatures across the Domains of Life

2019 ◽  
Vol 36 (10) ◽  
pp. 2328-2339 ◽  
Author(s):  
Eva Maria Novoa ◽  
Irwin Jungreis ◽  
Olivier Jaillon ◽  
Manolis Kellis

Abstract Because of the degeneracy of the genetic code, multiple codons are translated into the same amino acid. Despite being “synonymous,” these codons are not equally used. Selective pressures are thought to drive the choice among synonymous codons within a genome, while GC content, which is typically attributed to mutational drift, is the major determinant of variation across species. Here, we find that in addition to GC content, interspecies codon usage signatures can also be detected. More specifically, we show that a single amino acid, arginine, is the major contributor to codon usage bias differences across domains of life. We then exploit this finding and show that domain-specific codon bias signatures can be used to classify a given sequence into its corresponding domain of life with high accuracy. We then wondered whether the inclusion of codon autocorrelation patterns, which reflect the nonrandom distribution of codon occurrences throughout a transcript, might improve the classification performance of our algorithm. However, we find that autocorrelation patterns are not domain-specific, and surprisingly, are unrelated to tRNA reusage, in contrast to previous reports. Instead, our results suggest that codon autocorrelation patterns are a by-product of codon optimality throughout a sequence, where highly expressed genes display autocorrelated “optimal” codons, whereas lowly expressed genes display autocorrelated “nonoptimal” codons.
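
To illustrate the kind of feature the abstract singles out, the relative usage of the six arginine codons in a coding sequence could be computed roughly as follows. This is a minimal sketch, not the authors' classifier: the function name `arginine_codon_fractions` is hypothetical, and the code assumes an in-frame coding sequence and the standard genetic code.

```python
from collections import Counter

# The six arginine codons of the standard genetic code (DNA alphabet).
ARG_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def arginine_codon_fractions(cds):
    """Fraction of each arginine codon among all arginine codons in an in-frame CDS.

    Hypothetical helper for illustration only; assumes the standard genetic code.
    """
    seq = cds.upper().replace("U", "T")
    # Read codon by codon, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in ARG_CODONS)
    total = sum(counts.values())
    if total == 0:
        # No arginine codons present: report zeros rather than dividing by zero.
        return {c: 0.0 for c in ARG_CODONS}
    return {c: counts[c] / total for c in ARG_CODONS}
```

A vector of such fractions per sequence is one plausible input to a domain classifier; how the published method actually builds and weights its features is not stated in the abstract.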

2018 ◽  
Author(s):  
Eva Maria Novoa ◽  
Olivier Jaillon ◽  
Irwin Jungreis ◽  
Manolis Kellis

Abstract Due to the degeneracy of the genetic code, multiple codons are translated into the same amino acid. Despite being ‘synonymous’, these codons are not equally used. Selective pressures are thought to drive the choice among synonymous codons within a genome, while GC content, which is generally attributed to mutational drift, is the major determinant of interspecies codon usage bias. Here we find that in addition to the bias caused by GC content, inter-species codon usage signatures can also be detected. More specifically, we show that a single amino acid, arginine, is the major contributor to codon usage bias differences across domains of life. We then exploit this finding, and show that the identified domain-specific codon bias signatures can be used to classify a given sequence into its corresponding domain with high accuracy. Considering that species belonging to the same domain share similar tRNA decoding strategies, we then wondered whether the inclusion of codon autocorrelation patterns might improve the classification performance of our algorithm. However, we find that autocorrelation patterns are not domain-specific, and surprisingly, are unrelated to tRNA reusage, in contrast to the common belief. Instead, our results reveal that codon autocorrelation patterns are a consequence of codon optimality throughout a sequence, where highly expressed genes display autocorrelated ‘optimal’ codons, whereas lowly expressed genes display autocorrelated ‘non-optimal’ codons.


Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1169
Author(s):  
Xin Li ◽  
Xiaocen Wang ◽  
Pengtao Gong ◽  
Nan Zhang ◽  
Xichen Zhang ◽  
...  

Giardia duodenalis, a flagellated parasitic protozoan, is the most common cause of parasite-induced diarrheal diseases worldwide. Codon usage bias (CUB) is an important evolutionary character in most species. However, G. duodenalis CUB remains unclear. Thus, this study analyzes codon usage patterns to assess the restriction factors and obtain useful information on what shapes G. duodenalis CUB. The neutrality analysis indicates that G. duodenalis has a wide GC3 distribution, which significantly correlates with GC12. The ENC-plot shows that most genes were close to the expected curve, with only a few points straying from it. This indicates that mutational pressure and natural selection played an important role in the development of CUB. The Parity Rule 2 (PR2) plot demonstrates that the usage of GC and AT was out of proportion. Interestingly, we identified 26 optimal codons in the G. duodenalis genome, ending with G or C. In addition, GC content, gene expression, and protein size also influence G. duodenalis CUB formation. This study systematically analyzes the G. duodenalis codon usage pattern and clarifies the mechanisms of G. duodenalis CUB. These results will be very useful for identifying new genes, for molecular genetic manipulation, and for studying G. duodenalis evolution.
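
The neutrality analysis mentioned above compares GC content at the first two codon positions (GC12) with GC content at the third position (GC3), gene by gene. A minimal sketch of how those two values could be obtained for one coding sequence is given here. The function name `gc12_gc3` is hypothetical, and as a simplification it counts every complete codon, whereas published analyses often exclude start, stop, Met, and Trp codons.

```python
def gc12_gc3(cds):
    """Return (GC12, GC3) for an in-frame coding sequence.

    GC12 is the mean of the GC fractions at codon positions 1 and 2;
    GC3 is the GC fraction at codon position 3.
    Simplification (assumption): all complete codons are counted.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence has no complete codon")

    def gc_fraction(bases):
        return sum(b in "GC" for b in bases) / len(bases)

    # Slice out the bases at each codon position.
    gc1 = gc_fraction(seq[0:usable:3])
    gc2 = gc_fraction(seq[1:usable:3])
    gc3 = gc_fraction(seq[2:usable:3])
    return (gc1 + gc2) / 2, gc3
```

Plotting GC12 against GC3 across genes and fitting a regression line gives the neutrality plot; a slope near 1 is usually read as mutation-dominated, a slope near 0 as selection-dominated.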


2018 ◽  
Vol 115 (21) ◽  
pp. E4940-E4949 ◽  
Author(s):  
Idan Frumkin ◽  
Marc J. Lajoie ◽  
Christopher J. Gregg ◽  
Gil Hornung ◽  
George M. Church ◽  
...  

Although the genetic code is redundant, synonymous codons for the same amino acid are not used with equal frequencies in genomes, a phenomenon termed “codon usage bias.” Previous studies have demonstrated that synonymous changes in a coding sequence can exert significant cis effects on the gene’s expression level. However, whether the codon composition of a gene can also affect the translation efficiency of other genes has not been thoroughly explored. To study how codon usage bias influences the cellular economy of translation, we massively converted abundant codons to their rare synonymous counterpart in several highly expressed genes in Escherichia coli. This perturbation reduces both the cellular fitness and the translation efficiency of genes that have high initiation rates and are naturally enriched with the manipulated codon, in agreement with theoretical predictions. Interestingly, we could alleviate the observed phenotypes by increasing the supply of the tRNA for the highly demanded codon, thus demonstrating that the codon usage of highly expressed genes was selected in evolution to maintain the efficiency of global protein translation.


Author(s):  
Tatyana Tikhomirova ◽  
Maxim Matyunin ◽  
Mikhail Lobanov ◽  
Oxana Galzitskaya

Chaperonin Hsp60, as a protein found in all organisms, is of great interest in medicine, since it is present in many tissues and can be used both as a drug and as an object of targeted therapy. Hence, Hsp60 deserves a fundamental comparative analysis to assess its evolutionary characteristics. It was found that the percent identity of Hsp60 amino acid sequences both within and between phyla was not high enough to identify Hsp60s as highly conserved proteins. In turn, their amino acid composition remained relatively constant. At the same time, the analysis of the nucleotide sequences showed that GC content in the Hsp60 genes was comparable to or greater than the genomic values, which may indicate a high resistance to mutations due to tight control of the nucleotide composition by DNA repair systems. Natural selection plays a dominant role in the evolution of Hsp60 genes. The degree of mutational pressure affecting the Hsp60 genes is quite low, and its direction does not depend on taxonomy. Interestingly, for the Hsp60 genes from Chordata, Arthropoda, and Proteobacteria the exact direction of mutational pressure could not be determined. However, upon further division into classes, it was found that the direction of the mutational pressure for Hsp60 genes from Fish differs from that for other chordates. The direction of the mutational pressure affects the synonymous codon usage bias. The number of high and low represented codons increases with increasing GC content, which can improve codon usage.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e10787
Author(s):  
Huirong Duan ◽  
Qian Zhang ◽  
Chunmei Wang ◽  
Fang Li ◽  
Fuping Tian ◽  
...  

Background Codon usage bias analysis is a suitable strategy for identifying the principal evolutionary driving forces in different organisms. Delphinium grandiflorum L. is a perennial herb with high economic value and typical biological characteristics. Evolutionary analysis of D. grandiflorum can provide a rich resource of genetic information for developing hybridization resources of the genus Delphinium. Methods Synonymous codon usage (SCU) and related indices of 51 coding sequences from the D. grandiflorum chloroplast (cp) genome were calculated using CodonW, CUSP of EMBOSS, SPSS and Microsoft Excel. Multivariate statistical analysis combining principal component analysis (PCA), correspondence analysis (COA), PR2-plot mapping analysis and ENC plot analysis was then conducted to explore the factors affecting the usage of synonymous codons. Results The SCU bias of D. grandiflorum was weak, and codons preferentially ended in A/T. A SCU imbalance between A/T and G/C at the third base position was revealed by PR2-plot mapping analysis. A total of eight codons were identified as the optimal codons. The PCA and COA results indicated that base composition (GC content, GC3 content) and gene expression were important for SCU bias. A majority of genes were distributed below the expected curve in the ENC plot analysis and above the standard curve in the neutrality plot analysis. Our results showed that with the exception of notable mutation pressure effects, the majority of genetic evolution in the D. grandiflorum cp genome might be driven by natural selection. Discussions Our results provide a theoretical foundation for elucidating the genetic architecture and mechanisms of D. grandiflorum, and contribute to enriching D. grandiflorum genetic resources.
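
The PR2-plot analysis referred to above places each gene at the point (G3/(G3+C3), A3/(A3+T3)), computed from third codon positions, so that (0.5, 0.5) corresponds to no A/T or G/C imbalance. The following is a minimal sketch under stated assumptions: the function name `pr2_coordinates` is hypothetical, only the fourfold-degenerate codon families of the standard genetic code are counted (a common but not universal convention), and the exact procedure used in the study is not given in the abstract.

```python
# First two bases of the eight fourfold-degenerate codon families
# in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def pr2_coordinates(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) over fourfold-degenerate codons.

    A coordinate is None when its denominator is zero.
    """
    seq = cds.upper().replace("U", "T")
    third = {"A": 0, "T": 0, "G": 0, "C": 0}
    usable = len(seq) - len(seq) % 3
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        # Count the third base only for fourfold-degenerate families.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in third:
            third[codon[2]] += 1
    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    x = third["G"] / gc if gc else None
    y = third["A"] / at if at else None
    return x, y
```

Points that scatter away from the center of the plot indicate unequal use of A versus T or G versus C at the third position.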


2010 ◽  
Vol 7 (1) ◽  
pp. 131-135 ◽  
Author(s):  
Laura R. Emery ◽  
Paul M. Sharp

Patterns of codon usage have been extensively studied among Bacteria and Eukaryotes, but there has been little investigation of species from the third domain of life, the Archaea. Here, we examine the nature of codon usage bias in a methanogenic archaeon, Methanococcus maripaludis. Genome-wide patterns of codon usage are dominated by a strong A + T bias, presumably largely reflecting mutation patterns. Nevertheless, there is variation among genes in the use of a subset of putatively translationally optimal codons, which is strongly correlated with gene expression level. In comparison with Bacteria such as Escherichia coli, the strength of selected codon usage bias in highly expressed genes in M. maripaludis seems surprisingly high given its moderate growth rate. However, the pattern of selected codon usage differs between M. maripaludis and E. coli: in the archaeon, strongly selected codon usage bias is largely restricted to twofold degenerate amino acids (AAs). Weaker bias among the codons for fourfold degenerate AAs is consistent with the small number of tRNA genes in the M. maripaludis genome.


2021 ◽  
Author(s):  
Lirong Bai ◽  
Lili Lu ◽  
Suping Li ◽  
Jicui He ◽  
Jian Chen ◽  
...  

Abstract Background: Epinephelus fuscoguttatus is a rare marine fish of high economic value. At present, research on groupers focuses mainly on artificial propagation, physiology and biochemistry, and diseases, and there are few reports at the mitochondrial genome level. This study aimed to analyze the codon composition and usage preference of the E. fuscoguttatus mitochondrial genome and to explore the main factors shaping codon preference, thereby providing a theoretical basis for studying species evolution, genetics and breeding, and for improving the expression efficiency of exogenous genes. Results: The GC content of the E. fuscoguttatus mitochondrial genome ranged from 44.00% to 46.30%, with a mean of 45.40%. CAI values ranged from 0.125 to 0.202, with a mean of 0.155. The effective number of codons (ENC) ranged from 36.08 to 49.55, with a mean of 44.98. Thirty-two codons had a relative synonymous codon usage (RSCU) greater than 1, and these mainly ended in A/C. ENC-plot analysis found that all the genes were in the lower middle of the standard curve, with a large difference between actual and theoretical ENC, indicating that codon bias was mainly affected by selection. Correspondence analysis showed that the first axis contributed 58.85% of the variation, while the second, third and fourth axes contributed 14.59%, 7.66% and 5.43%, respectively. The cumulative contribution rate of the first four axes was 85.53%. Finally, nine optimal codons were selected: CUU, AUC, GUU, CCU, GCA, UAU, CGC, AGC and GGC. Conclusions: The codon usage preference of the E. fuscoguttatus mitochondrial genome was weak, with a preference for codons ending in A/C, and was mainly influenced by selection.
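
Relative synonymous codon usage (RSCU), used above, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 mark preferred codons. A minimal sketch follows. The names `CODE` and `rscu` are hypothetical, and the table is the standard genetic code, which is an assumption: a mitochondrial genome such as the one in this abstract would need the vertebrate mitochondrial table instead.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# Assumption: swap in another table for mitochondrial or other genomes.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids with two or more codons.

    RSCU = count(codon) * (number of synonymous codons) / count(amino acid).
    Amino acids absent from the sequence are omitted.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families.
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue  # Met and Trp have one codon; unused amino acids are skipped.
        for c in codons:
            result[c] = counts[c] * len(codons) / total
    return result
```

For example, in a sequence whose alanine codons are GCT, GCT, GCA, and GCC, GCT gets an RSCU of 2.0 and the unused GCG gets 0.0.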


2020 ◽  
Vol 21 (11) ◽  
Author(s):  
Redi Aditama ◽  
Zulfikar Achmad Tanjung ◽  
Widyartini Made Sudania ◽  
Yogo Adhi Nugroho ◽  
Condro Utomo ◽  
...  

Abstract. Aditama R, Tanjung ZA, Sudania WM, Nugroho YA, Utomo C, Liwang T. 2020. Analysis of codon usage bias reveals optimal codons in Elaeis guineensis. Biodiversitas 21: 5331-5337. Codon usage bias of the oil palm genome was characterized using several indices, including GC content, relative synonymous codon usage (RSCU), the effective number of codons (ENC), and the codon adaptation index (CAI). A unimodal distribution of GC content was observed, matching the characteristics of non-grass monocots. Correspondence analysis (COA) on synonymous codon usage bias showed that the main axis was strongly driven by GC content. The ENC and neutrality plots of oil palm genes indicate that natural selection played a more vital role than mutational bias in shaping codon usage bias. A positive correlation between calculated CAI and experimental oil palm gene expression data was detected, indicating that this index performs well. Finally, eighteen codons were defined as “optimal codons” that may provide a useful reference for heterologous expression and genome editing studies.
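
The codon adaptation index (CAI) named above is, in its usual definition, the geometric mean of per-codon relative adaptiveness values w, where each w is a codon's usage in a reference gene set divided by the usage of the most used synonymous codon. A minimal sketch follows. The function name `cai` and the example weights are hypothetical, and the weights are supplied by the caller rather than derived from any oil palm reference set.

```python
import math


def cai(cds, weights):
    """Geometric mean of relative adaptiveness values over the codons of a CDS.

    `weights` maps codon -> w with 0 < w <= 1. Codons missing from `weights`
    (for example Met, Trp, and stop codons) are skipped.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    logs = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon in weights:
            logs.append(math.log(weights[codon]))
    if not logs:
        raise ValueError("no scorable codons in sequence")
    # exp(mean(log w)) is the geometric mean of the w values.
    return math.exp(sum(logs) / len(logs))


# Illustrative weights only (not taken from any real reference set).
example_weights = {"GCT": 1.0, "GCA": 0.25}
example_cai = cai("ATGGCTGCA", example_weights)
```

Weights must be strictly positive; implementations typically replace zero counts in the reference set with a small pseudocount before computing w.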


2000 ◽  
Vol 81 (9) ◽  
pp. 2313-2325 ◽  
Author(s):  
David B. Levin ◽  
Beatrixe Whittome

Phylogenetic analyses based on baculovirus polyhedrin nucleotide and amino acid sequences revealed two major nucleopolyhedrovirus (NPV) clades, designated Group I and Group II. Subsequent phylogenetic analyses have revealed three Group II subclades, designated A, B and C. Variations in amino acid frequencies determine the extent of dissimilarity for divergent but structurally and functionally conserved genes and therefore significantly influence the analysis of phylogenetic relationships. Hence, it is important to consider variations in amino acid codon usage. The Genome Hypothesis postulates that genes in any given genome use the same coding pattern with respect to synonymous codons and that genes in phylogenetically related species generally show the same pattern of codon usage. We have examined codon usage in six genes from six NPVs and found that: (1) there is significant variation in codon use by genes within the same virus genome; (2) there is significant variation in the codon usage of homologous genes encoded by different NPVs; (3) there is no correlation between the level of gene expression and codon bias in NPVs; (4) there is no correlation between gene length and codon bias in NPVs; and (5) that while codon use bias appears to be conserved between viruses that are closely related phylogenetically, the patterns of codon usage also appear to be a direct function of the GC-content of the virus-encoded genes.


2020 ◽  
Author(s):  
Carrie A. Whittle ◽  
Arpita Kulkarni ◽  
Nina Chung ◽  
Cassandra G. Extavour

Abstract Background For multicellular organisms, much remains unknown about the dynamics of synonymous codon and amino acid use in highly expressed genes, including whether their use varies with expression in different tissue types and sexes. Moreover, specific codons and amino acids may have translational functions in highly transcribed genes that largely depend on their relationships to tRNA gene copies in the genome. However, these relationships and putative functions are poorly understood, particularly in multicellular systems. Results Here, we rigorously studied codon and amino acid use in highly expressed genes from reproductive and nervous system tissues (male and female gonad, somatic reproductive system, brain, ventral nerve cord, and male accessory glands) in the cricket Gryllus bimaculatus. We report an optimal codon, defined as the codon preferentially used in highly expressed genes, for each of the 18 amino acids with synonymous codons in this organism. The optimal codons were largely shaped by selection, and their identities were mostly shared among tissue types and both sexes. However, the frequency of optimal codons was highest in gonadal genes. Concordant with translational selection, a majority of the optimal codons had abundant matching tRNA gene copies in the genome, but sometimes obligately required wobble tRNAs. We suggest the latter may comprise a mechanism for slowing translation of abundant transcripts, particularly for cell-cycle genes. Non-optimal codons, defined as those least commonly used in highly transcribed genes, intriguingly often had abundant tRNAs, and had elevated use in a subset of genes with specialized functions (gametic and apoptosis genes), suggesting their use promotes the upregulation of particular mRNAs. In terms of amino acids, we found evidence suggesting that amino acid frequency, tRNA gene copy number, and amino acid biosynthetic costs (size/complexity) had all interdependently evolved in this insect model, potentially for translational optimization. Conclusions Collectively, the results strongly suggest that codon use in highly expressed genes, including optimal, wobble, and non-optimal codons, and their tRNA abundances, as well as amino acid use, have been adapted for various functional roles in translation within this cricket. The effects of expression in different tissue types and the two sexes are discussed.

