Synonymous Dinucleotide Usage: A Codon-Aware Metric for Quantifying Dinucleotide Representation in Viruses

Viruses ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 462 ◽  
Author(s):  
Spyros Lytras ◽  
Joseph Hughes

Distinct patterns of dinucleotide representation, such as CpG and UpA suppression, are characteristic of certain viral genomes. Recent research has uncovered vertebrate immune mechanisms that select against specific dinucleotides in targeted viruses. This evidence highlights the importance of systematically examining the dinucleotide composition of viral genomes. We have developed a novel metric, called synonymous dinucleotide usage (SDU), for quantifying dinucleotide representation in coding sequences. Our method compares the abundance of a given dinucleotide to the null hypothesis of equal synonymous codon usage in the sequence. We present a Python 3 package, DinuQ, for calculating SDU and other relevant metrics. We have applied this method to two sets of invertebrate- and vertebrate-specific flaviviruses and rhabdoviruses. The SDU shows that the vertebrate viruses exhibit consistently greater under-representation of CpG dinucleotides in all three codon positions in both datasets. In comparison to existing metrics for dinucleotide quantification, the SDU allows for a statistical interpretation of its values by comparing them to a null expectation based on the codon table. Here we apply the method to viruses, but coding sequences of other living organisms can be analysed in the same way.
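The comparison against a null of equal synonymous codon usage can be illustrated with a minimal sketch. It is not the DinuQ implementation and does not use the DinuQ API: the function name `synonymous_dinucleotide_ratio`, the restriction to within-codon positions (1-2 and 2-3), the use of the standard genetic code, and the weighting by amino acid count are all assumptions made for illustration. For each amino acid whose synonymous codons include some, but not all, codons carrying the dinucleotide at the chosen position, the observed proportion of such codons is divided by the proportion expected if all synonymous codons were used equally, and the ratios are averaged with weights equal to the amino acid counts.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_dinucleotide_ratio(cds, dinuc, offset):
    """Weighted mean observed/expected proportion of `dinuc` within codons.

    offset=0 looks at codon positions 1-2, offset=1 at positions 2-3.
    The expectation assumes equal usage of all synonymous codons.
    Returns None when no informative amino acid occurs in the sequence.
    """
    cds = cds.upper().replace("U", "T")
    dinuc = dinuc.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    weighted_sum, total = 0.0, 0
    for family in families.values():
        carriers = [c for c in family if c[offset:offset + 2] == dinuc]
        expected = len(carriers) / len(family)
        if expected in (0.0, 1.0):
            continue  # amino acid carries no synonymous choice for this dinucleotide
        n = sum(counts[c] for c in family)
        if n == 0:
            continue  # amino acid absent from the sequence
        observed = sum(counts[c] for c in carriers) / n
        weighted_sum += n * observed / expected
        total += n
    return weighted_sum / total if total else None
```

Under these assumptions a value of 1 means the dinucleotide occurs as often as equal synonymous codon usage would predict, and values below 1 indicate under-representation. For instance, `synonymous_dinucleotide_ratio("GCGGCAGCTGCC", "CG", 1)` returns `1.0`, because the four alanine codons are used equally.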


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5099 ◽  
Author(s):  
Salvatore Camiolo ◽  
Andrea Porceddu

Background Optimization of transgene expression can be achieved by designing coding sequences with the synonymous codon usage of genes that are highly expressed in the host organism. Identifying the so-called “favoured codons” generally requires access to either the genome or the coding sequences, as well as expression data. Results Here we describe corseq, a fast and reliable software tool for detecting the favoured codons directly from RNAseq data without prior knowledge of genomic sequence or gene annotation. The tool infers the codons that are preferentially used in highly expressed genes while estimating transcript abundance with a new k-mer-based approach. corseq is implemented in Python and runs under any operating system. The software requires the Biopython 1.65 library (or later versions) and is available under the ‘GNU General Public License version 3’ at the project webpage https://sourceforge.net/projects/corseq/files. Conclusion corseq represents a faster and easy-to-use alternative for the detection of favoured codons in non-model organisms.


Author(s):  
Yicong Li ◽  
Rui Wang ◽  
Huihui Wang ◽  
Feiyang Pu ◽  
Xili Feng ◽  
...  

Synonymous codon usage bias is a universal characteristic of genomes across various organisms. Autophagy-related gene 13 (atg13) is an essential gene for autophagy initiation, yet the evolutionary trends of the atg13 gene in nucleotide and synonymous codon usage remain unexplored. According to phylogenetic analyses of the atg13 gene of 226 eukaryotic organisms at the nucleotide and amino acid levels, nucleotide usage carries more genetic information than amino acid usage. Specifically, the overall nucleotide usage bias quantified by information entropy showed that the usage biases at the first and second codon positions were stronger than those at the third position of the atg13 genes. Furthermore, the bias level of nucleotide ‘G’ usage is highest, while that of nucleotide ‘C’ usage is lowest in the atg13 genes. In addition, the genetic features represented by synonymous codon usage exhibit, to some extent, a species-specific pattern in the evolution of the atg13 genes. Interestingly, the codon usages of atg13 genes in the ancestor animals (Latimeria chalumnae, Petromyzon marinus, and Rhinatrema bivittatum) are strongly influenced by mutation pressure from nucleotide composition constraint. However, the distributions of nucleotide composition at different codon positions in the atg13 gene show that natural selection still dominates atg13 codon usage during the evolution of these organisms.
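One common way to quantify position-specific nucleotide usage bias with information entropy is the Shannon entropy of the base frequencies at each codon position, where lower entropy indicates stronger bias. The sketch below illustrates that idea only; the abstract does not state the exact formulation used, so the function name `codon_position_entropy` and the choice of base-2 Shannon entropy are assumptions.

```python
import math
from collections import Counter


def codon_position_entropy(cds):
    """Return Shannon entropy (bits) of base usage at codon positions 1, 2 and 3.

    The maximum is 2 bits (all four bases equally frequent); lower values
    mean more biased nucleotide usage at that position.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    entropies = []
    for pos in range(3):
        # Every third base starting at `pos`, skipping ambiguous symbols.
        counts = Counter(b for b in cds[pos:usable:3] if b in "ACGT")
        total = sum(counts.values())
        entropy = sum(
            (n / total) * math.log2(total / n) for n in counts.values()
        ) if total else 0.0
        entropies.append(entropy)
    return entropies
```

For the toy sequence `"AAACAAGAATAA"` the first codon position uses all four bases equally and the other two use only A, so the function returns `[2.0, 0.0, 0.0]`.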


2009 ◽  
Vol 2 (3) ◽  
pp. 133-141
Author(s):  
Tangjie Zhang ◽  
Hong Chang ◽  
Yuzhi Liu ◽  
Huifang Li ◽  
Kuanwei Chen

Codon usage in mitochondrial genes of 11 Gallus gallus and two Anatidae species was analysed to determine the general patterns in codon choice of Gallus gallus species. C3 contents were higher in Gallus gallus than in mammalian mitochondrial genomes that encode protein codon positions. The high C3 contents of Gallus gallus might be the result of relatively strong mutational bias that occurred in the lineage of the Gallus gallus species. A- and C-ending codons were detected as the “preferred” codons in Gallus gallus and Anatidae. The NNR codon families are dominated by the A-ending codons, the NNY codon families are dominated by the C-ending codons and the NNN codon families are dominated by the A-ending or the C-ending codons. A comparison of the relative synonymous codon usage (RSCU) and synonymous codon families (SCF) of tRNA and proteins was made, and two groups can be classified by SCF. The codon usage in Gallus gallus species indicates that codons containing A or C at the third position are used preferentially, regardless of whether corresponding tRNAs are encoded in the mtDNA. In both Gallus gallus and Anatidae species mtDNA, codon usage biases are highly related to CC-ending binucleotide codons.


Viruses ◽  
2019 ◽  
Vol 11 (8) ◽  
pp. 752 ◽  
Author(s):  
Zhen He ◽  
Haifeng Gan ◽  
Xinyan Liang

Potato virus M (PVM) is a member of the genus Carlavirus of the family Betaflexviridae and causes large economic losses in nightshade crops. Several previous studies have elucidated the population structure, evolutionary timescale and adaptive evolution of PVM. However, the synonymous codon usage pattern of PVM remains unclear. In this study, we performed comprehensive analyses of the codon usage and composition of PVM based on 152 nucleotide sequences of the coat protein (CP) gene and 125 sequences of the cysteine-rich nucleic acid binding protein (NABP) gene. We observed that the PVM CP and NABP coding sequences were GC- and AU-rich, respectively, whereas U- and G-ending codons were preferred in the PVM CP and NABP coding sequences. The lower codon usage of the PVM CP and NABP coding sequences indicated a relatively stable and conserved genomic composition. Natural selection and mutation pressure shaped the codon usage patterns of PVM, with natural selection being the most important factor. The codon adaptation index (CAI) and relative codon deoptimization index (RCDI) analysis revealed that the greatest adaptation of PVM was to pepino, followed by tomato and potato. Moreover, similarity index (SiD) analysis showed that pepino had a greater impact on PVM than tomato and potato. Our study is the first attempt to evaluate the codon usage pattern of the PVM CP and NABP genes to better understand the evolutionary changes of a carlavirus.


Genetics ◽  
2001 ◽  
Vol 159 (1) ◽  
pp. 347-358
Author(s):  
Brian R Morton

Abstract A previously employed method that uses the composition of noncoding DNA as the basis of a test for selection between synonymous codons in plastid genes is reevaluated. The test requires the assumption that in the absence of selective differences between synonymous codons the composition of silent sites in coding sequences will match the composition of noncoding sites. It is demonstrated here that this assumption is not necessarily true and, more generally, that using compositional properties to draw inferences about selection on silent changes in coding sequences is much more problematic than commonly assumed. This is so because selection on nonsynonymous changes can influence the composition of synonymous sites (i.e., codon usage) in a complex manner, meaning that the composition biases of different silent sites, including neutral noncoding DNA, are not comparable. These findings also draw into question the commonly utilized method of investigating how selection to increase translation accuracy influences codon usage. The work then focuses on implications for studies that assess codon adaptation, which is selection on codon usage to enhance translation rate, in plastid genes. A new test that does not require the use of noncoding DNA is proposed and applied. The results of this test suggest that far fewer plastid genes display codon adaptation than previously thought.


2021 ◽  
Author(s):  
Puttatida Mahapattanakul ◽  
Pragun Rajbhandari ◽  
Patsarin Rodpothong

Abstract Codon usage is a reflection of evolutionary adaptation to environmental pressure. The pattern of usage may be unique to species of viruses, genomes of the same species or genes within the same genome. Here we have analysed the overall nucleotide composition and the nucleotides at different codon positions in the genomes of 6 Alphabaculoviruses. Principal Component Analysis (PCA) based on Relative Synonymous Codon Usage (RSCU) of all Open Reading Frames (ORFs) was employed to investigate the pattern of the codon usage. The results suggest that the Alphabaculovirus genomes, except that of Agrotis ipsilon mNPV (AgipNPV), are predominantly under the influence of neutral mutation biased toward A/T. The majority of the ORFs, except those of the AgipNPV, cluster at the same location in the 2-dimensional PCA map, with one prominent outlier that has been identified as a P6.9 gene. The six Alphabaculovirus P6.9 genes have a high G/C content, dissimilar to the majority of the ORFs. The G/C content is found to be significantly high at the 2nd codon position, suggesting the influence of natural selection and perhaps reflecting its functional conservation in DNA packaging as well as its evolutionary relation to Protamine.
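RSCU is conventionally defined as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid, so a value of 1 indicates no bias. The sketch below computes RSCU for a single ORF under the standard genetic code; the function name `rscu` and the choice to omit amino acids that are absent from the ORF are assumptions for illustration, not details taken from the study. One such vector per ORF could then serve as the input rows for a PCA.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(orf):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs in `orf`.

    RSCU = codon count / mean count over that amino acid's synonymous codons.
    """
    orf = orf.upper().replace("U", "T")
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    counts = Counter(orf[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        mean = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / mean
    return values
```

For example, `rscu("GCGGCGGCAGCA")` gives 2.0 for GCG and GCA and 0.0 for GCT and GCC, since only two of the four alanine codons are used.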


2021 ◽  
Author(s):  
J. Daron ◽  
I.G. Bravo

Abstract Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the third virus within the Orthocoronavirinae causing an emergent infectious disease in humans, the ongoing coronavirus disease 2019 pandemic (COVID-19). Due to the high zoonotic potential of these viruses, it is critical to unravel their evolutionary history of host species shift, adaptation and emergence. Only such knowledge can guide virus discovery, surveillance and research efforts to identify viruses posing a pandemic risk in humans. We present a comprehensive analysis of the composition and codon usage bias of the 82 Orthocoronavirinae members, infecting 47 different avian and mammalian hosts. Our results clearly establish that synonymous codon usage varies widely among viruses and is only weakly dependent on the type of host they infect. Instead, we identify mutational bias towards AT-enrichment and selection against CpG dinucleotides as the main factors responsible for the codon usage bias variation. Further insight into the mutational equilibrium within Orthocoronavirinae revealed that most coronavirus genomes are close to their neutral equilibrium; the exceptions are the three recently-infecting human coronaviruses, which lie further away from the mutational equilibrium than their endemic human coronavirus counterparts. Finally, our results suggest that while replicating in humans SARS-CoV-2 is slowly becoming AT-richer, likely until attaining a new mutational equilibrium.

