A Relationship Between GC Content and Coding-Sequence Length

1996 ◽  
Vol 43 (3) ◽  
pp. 216-223
Author(s):  
José L. Oliver ◽  
Antonio Marín

2021 ◽  
Author(s):  
Amit Kumar ◽  
Malyaj R Prajapati ◽  
Surendra Upadhyay ◽  
Anamika Bhordia ◽  
Vinod Kumar Singh ◽  
...  

Abstract This report presents the first complete genome sequence of a Brucella abortus 2308 strain isolated from an abortion storm on a dairy farm at Kanpur, Uttar Pradesh, India. The outbreak caused last-trimester abortions in 32 of 100 cows on the farm over a period of 60 days. The bacteria were isolated in pure culture from the placentas of aborted cows. The genome of the isolate is 3,285,606 bp long with a GC content of 57.25 %, an N50 value of 296,426, and an L50 value of 4; it contains 3,119 coding DNA sequences (CDSs), 49 tRNAs, 1 transfer-messenger RNA (tmRNA), and 3 rRNA genes. It is the first report of Brucella abortus 2308 isolation and complete genome sequencing from the Indian subcontinent.
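The assembly statistics quoted in this abstract (GC content, N50, L50) follow standard definitions, and a minimal Python sketch of them is given here for orientation. The function names `gc_content` and `n50_l50` are hypothetical and chosen for illustration; the sketch assumes contigs are available as plain strings or as a list of lengths, and it is not the pipeline the authors used.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the A/C/G/T bases of a sequence.

    Ambiguous bases such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50 is the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    assembly size. L50 is the number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contigs given")
```

Read this way, an L50 of 4 means that the four longest contigs together cover at least half of the assembled genome.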


2020 ◽  
Author(s):  
Sameer Aryal ◽  
Francesco Longo ◽  
Eric Klann

Abstract Loss of the fragile X mental retardation protein (FMRP) causes fragile X syndrome (FXS). FMRP is widely thought to repress protein synthesis, but its translational targets and modes of control remain in dispute. We previously showed that genetic removal of p70 S6 kinase 1 (S6K1) corrects altered protein synthesis as well as synaptic and behavioral phenotypes in FXS mice. In this study, we examined the gene-specificity of altered mRNA translation in FXS and the mechanism of rescue with genetic reduction of S6K1 by carrying out ribosome profiling and RNA-Seq on cortical lysates from wild-type, FXS, S6K1 knockout, and double knockout mice. We observed reduced ribosome footprint abundance in the majority of differentially translated genes in the cortices of FXS mice. We used molecular assays to discover evidence that the reduction in ribosome footprint abundance reflects an increased rate of ribosome translocation, which is captured as a decrease in the number of translating ribosomes at steady state, and is normalized by inhibition of S6K1. We also found that genetic removal of S6K1 prevented a positive-to-negative gradation of alterations in translation efficiencies (RF/mRNA) with coding sequence length across mRNAs in FXS mouse cortices. Our findings reveal the identities of dysregulated mRNAs and a molecular mechanism by which reduction of S6K1 prevents altered translation in FXS.
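The abstract defines translation efficiency as the ratio of ribosome footprint abundance to mRNA abundance (RF/mRNA). A minimal sketch of that ratio, and of a log2 comparison between two conditions, might look as follows. The function names and the assumption that both abundances are already normalized to the same scale are illustrative only; the abstract does not describe the authors' normalization or statistics.

```python
import math


def translation_efficiency(rf: float, mrna: float) -> float:
    """Translation efficiency as the RF/mRNA ratio for one gene.

    rf and mrna are assumed to be normalized abundances on the same
    scale; that normalization is outside this sketch.
    """
    if mrna <= 0:
        raise ValueError("mRNA abundance must be positive")
    return rf / mrna


def log2_te_change(rf_a: float, mrna_a: float,
                   rf_b: float, mrna_b: float) -> float:
    """log2 ratio of translation efficiency in condition A over B.

    Negative values mean lower efficiency in A than in B. A zero
    footprint abundance in A raises ValueError from math.log2.
    """
    te_a = translation_efficiency(rf_a, mrna_a)
    te_b = translation_efficiency(rf_b, mrna_b)
    return math.log2(te_a / te_b)
```

Plotting such a log2 change against coding sequence length, gene by gene, is one way to look for the kind of length-dependent gradation the abstract describes.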


2018 ◽  
Vol 68 (4) ◽  
pp. 353-365 ◽  
Author(s):  
Li Gong ◽  
Wei Shi ◽  
Min Yang ◽  
Xiaoyu Kong

Abstract The eukaryotic ribosomal DNA (rDNA) cluster consists of multiple copies of three genes (18S, 5.8S, and 28S rDNA) and two internal transcribed spacers (ITS1 and ITS2). In recent years, an increasing number of rDNA sequence polymorphisms have been identified in numerous species. In the present study, we provide 33 complete ITS (ITS1-5.8S-ITS2) sequences from two Symphurus plagiusa individuals. To the best of our knowledge, these sequences are the first detailed information on ITS sequences in Pleuronectiformes. Here, two divergent types (Type A and B) of the ITS1-5.8S-ITS2 rDNA sequence were found, which mainly differ in sequence length, GC content, nucleotide diversity (π), secondary structure and minimum free energy. The ITS1-5.8S-ITS2 rDNA sequence of Type B was speculated to be a putative pseudogene according to pseudogene identification criteria. Cluster analysis showed that sequences from the same type clustered into one group and two major groups were formed. The high degree of ITS1-5.8S-ITS2 sequence polymorphism at the intra-specific level indicated that the S. plagiusa genome has evolved in a non-concerted evolutionary manner. These results not only provide useful data for ribosomal pseudogene identification, but also further contribute to the study of rDNA evolution in teleostean genomes.


2018 ◽  
Vol 15 (138) ◽  
pp. 20170667 ◽  
Author(s):  
Sophia S. Liu ◽  
Adam J. Hockenberry ◽  
Michael C. Jewett ◽  
Luís A. N. Amaral

The unequal utilization of synonymous codons affects numerous cellular processes including translation rates, protein folding and mRNA degradation. In order to understand the biological impact of variable codon usage bias (CUB) between genes and genomes, it is crucial to be able to accurately measure CUB for a given sequence. A large number of metrics have been developed for this purpose, but there is currently no way of systematically testing the accuracy of individual metrics or knowing whether metrics provide consistent results. This lack of standardization can result in false-positive and false-negative findings if underpowered or inaccurate metrics are applied as tools for discovery. Here, we show that the choice of CUB metric impacts both the significance and measured effect sizes in numerous empirical datasets, raising questions about the generality of findings in published research. To bring about standardization, we developed a novel method to create synthetic protein-coding DNA sequences according to different models of codon usage. We use these benchmark sequences to identify the most accurate and robust metrics with regard to sequence length, GC content and amino acid heterogeneity. Finally, we show how our benchmark can aid the development of new metrics by providing feedback on its performance compared to the state of the art.


2020 ◽  
Author(s):  
Xueping LI ◽  
Jianhong Li ◽  
Yonghong Qi ◽  
Yonggang Liu ◽  
Minquan Li

Abstract Background: Fusarium equiseti is a plant pathogen with a wide range of hosts and diverse effects, including probiotic activity. However, the underlying molecular mechanisms remain unclear, hindering its effective control and utilization. In this study, the Illumina HiSeq 4000 and PacBio platforms were used to sequence and assemble the whole genome of Fusarium equiseti D25-1. Results: The assembly included 16 fragments with a GC content of 48.01%, gap number of zero, and size of 40,776,005 bp. There were 40,110 exons and 26,281 introns having a total size of 19,787,286 bp and 2,290,434 bp, respectively. The genome had an average copy number of 333, 71, 69, 31, and 108 for tRNAs, rRNAs, sRNAs, snRNAs, and miRNAs, respectively. The total repetitive sequence length was 1,713,918 bp, accounting for 4.2033% of the genome. In total, 13,134 functional genes were annotated, accounting for 94.97% of the total gene number. Toxin-related genes, including two related to zearalenone and 23 related to trichothecene, were identified. A comparative genomic analysis supported the high quality of the F. equiseti assembly, exhibiting good collinearity with the reference strains, 3,483 species-specific genes, and 1,805 core genes. A gene family analysis revealed more than 2,500 single-copy orthologs. F. equiseti was most closely related to Fusarium pseudograminearum based on a phylogenetic analysis at the whole-genome level. Conclusions: Our comprehensive analysis of the whole genome of F. equiseti provides basic data for studies of gene expression, regulatory and functional mechanisms, evolutionary processes, as well as disease prevention and control.


2021 ◽  
Author(s):  
Phillip C. Burke ◽  
Heungwon Park ◽  
Arvind Rasi Subramaniam

Abstract Stability of eukaryotic mRNAs is associated with their codon, amino acid, and GC content. Yet, coding sequence motifs that predictably alter mRNA stability in human cells remain poorly defined. Here, we develop a massively parallel assay to measure mRNA effects of thousands of synthetic and endogenous coding sequence motifs in human cells. We identify several families of simple dipeptide repeats whose translation triggers acute mRNA instability. Rather than individual amino acids, specific combinations of bulky and positively charged amino acids are critical for the destabilizing effects of dipeptide repeats. Remarkably, dipeptide sequences that form extended β strands in silico and in vitro drive ribosome stalling and mRNA instability in vivo. The resulting nascent peptide code underlies ribosome stalling and mRNA-destabilizing effects of hundreds of endogenous peptide sequences in the human proteome. Our work reveals an intrinsic role for the ribosome as a selectivity filter against the synthesis of bulky and aggregation-prone peptides.


2021 ◽  
Author(s):  
Yvain Desplat ◽  
Jacob F Warner ◽  
Jose V Lopez

Abstract Marine sponge transcriptomes are underrepresented in current databases. Furthermore, only two sponge genomes are available for comparative studies. Here we present the assembled and annotated holo-transcriptome of the common Florida reef sponge from the species Cinachyrella alloclada. After Illumina high throughput sequencing, the data assembled using Trinity v2.5 confirmed a highly symbiotic organism, with the complexity of high microbial abundance (HMA) sponges. This dataset is enriched in poly-A selected eukaryotic, rather than microbial transcripts. Overall, 39,813 transcripts with verified sponge sequence homology coded for 8,496 unique proteins. The average sequence length was found to be 946 bp with an N50 sequence length of 1290 bp. Overall, the sponge assembly resulted in a GC content of 51.04%, which is within the range of GC bases in a eukaryotic transcriptome. BUSCO scored completeness analysis revealed a completeness of 60.3% and 60.1% based on the Eukaryota and Metazoa databases, respectively. Overall, this study points to an overarching goal of developing the Cinachyrella alloclada sponge as a useful new experimental model organism.


2021 ◽  
Author(s):  
Johana R. C. Fajardo ◽  
Diethard Tautz

We study the potential for the de novo evolution of genes from random nucleotide sequences using libraries of E. coli expressing random sequence peptides. We assess the effects of such peptides on cell growth by monitoring frequency changes of individual clones in a complex library through four serial passages. Using a new analysis pipeline that allows peptides of all lengths to be traced, we find that over half of the peptides have consistent effects on cell growth. Across nine different experiments, around 16 % of clones increase in frequency and 36 % decrease, with some variation between individual experiments. Shorter peptides (8-20 residues) are more likely to increase in frequency, whereas longer ones are more likely to decrease. GC content, amino acid composition, intrinsic disorder and aggregation propensity show slightly different patterns between peptide groups. Sequences that increase in frequency tend to be more disordered with lower aggregation propensity. This coincides with the observation that young genes with more disordered structures are better tolerated in genomes. Our data indicate that random sequences can be a source of evolutionary innovation, since a large fraction of them are well tolerated by the cells or can provide a growth advantage.

