Structural-Statistical Properties of the Flavivirus Genomes

Author(s):  
Ж.С. Тюлько ◽  
Zh.S. Tyulko

Essential structural-statistical properties of coding regions in flavivirus genomes are investigated on the basis of the spectral-statistical approach. Both full-length polyprotein coding sequences and their separate structural segments are analyzed. On the whole, the structural-statistical properties of the flavivirus genome sequences are shown to be similar to the properties of 3-regularity and latent triplet profile periodicity revealed earlier in the coding regions of prokaryotic and eukaryotic genomes. However, the two-level organization of coding does not occur in the individual segments coding for structural proteins in the flavivirus genomes, and a significant proportion of these segments exhibit sequence homogeneity. These peculiarities of the coding sequences are explained by the simple structure and high mutation rate of the flavivirus genomes.
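As a rough illustration of what a triplet profile is, the sketch below tabulates nucleotide frequencies at each of the three codon positions of a coding sequence. It is not the spectral-statistical approach itself, whose details the abstract does not give. The function name `triplet_profile` and the toy input are hypothetical, and the code assumes the sequence starts in frame.

```python
from collections import Counter


def triplet_profile(cds):
    """Return {base: [f1, f2, f3]}, the frequency of each base at codon positions 1-3.

    Assumption: `cds` starts in frame; a trailing incomplete codon is ignored.
    Symbols other than A, C, G, T are counted in the denominator but not reported,
    so the columns may sum to less than 1 for sequences with ambiguous bases.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete final codon
    codons = usable // 3
    # One Counter per codon position: every third symbol starting at 0, 1 and 2.
    columns = [Counter(cds[pos:usable:3]) for pos in range(3)]
    profile = {}
    for base in "ACGT":
        profile[base] = [col[base] / codons if codons else 0.0 for col in columns]
    return profile
```

A sequence whose three columns differ strongly from one another shows a period-3 signal. A homogeneous sequence, in the sense used above, would have nearly identical columns.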

Author(s):  
В.А. Кутыркин ◽  
V.A. Kutyrkin

Structural-statistical characteristics of the coding DNA sequences (CDSs) from the human genome are investigated within the framework of the spectral-statistical approach (the 2S-approach). The properties of 3-regularity and latent profile periodicity are among these characteristics. The special meaning and intrinsic existence of these properties are confirmed by studying binary-recoded CDSs. Only one kind of singular recoding, the one that identifies complementary nucleotides, preserves the characteristics of the original CDSs. The use of nonsingular binary recoding supports the claim that the latent triplet periodicity in the CDSs of the human genome belongs to a previously unknown type called profile periodicity.


2020 ◽  
Author(s):  
Bhaskar Bhadra ◽  
Saakshi Jalali ◽  
Santanu Dasgupta

The outbreak of the infectious and rapidly expanding coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has had a devastating effect on public health and the global economy. The daily country-wise updates from the World Health Organization on the number of infected cases and death rates show diverse statistics. In this study, we performed a comparative analysis between the COVID-19 death rate and the mutation rate for selected structural and non-structural proteins. A total of 7200 genome sequences of the SARS-CoV-2 virus from 49 countries were investigated. The mutation rate of specific SARS-CoV-2 proteins over four months (April to July 2020) was correlated with the death rate across various countries. Our findings suggest a significant correlation between the mutation rates of NSP6 and the surface glycoprotein and the death rate. Additional analysis of the cumulative mutation rate of these two proteins against the death rate of three major clusters led us to hypothesize that mutations of these two proteins will grow consistently while the death rate would drop below 0.5% by the end of 2020 in cluster I countries. Hence, we propose that, with the current mutation rate trend, the COVID-19 death rate would decline significantly by the end of 2020.


2019 ◽  
Vol 17 (04) ◽  
pp. 386-389
Author(s):  
Miguel Bento ◽  
Sónia Gomes Pereira ◽  
Wanda Viegas ◽  
Manuela Silva

Abstract Assessing durum wheat genomic diversity is crucial in a changing environment, particularly in the Mediterranean region, where durum wheat is largely used to produce pasta. Durum wheat varieties cultivated in Portugal and previously assessed for thermotolerance were screened for the variability of coding sequences associated with technological traits and of repetitive sequences. As expected, reduced variability was observed for low molecular weight glutenin subunits (LMW-GS), but a specific LMW-GS allelic form associated with improved pasta-making characteristics was absent in one variety. In contrast, molecular markers targeting repetitive elements such as microsatellites and retrotransposons, namely Inter Simple Sequence Repeat (ISSR) and Inter Retrotransposons Amplified Polymorphism (IRAP), disclosed significant inter- and intra-varietal diversity. This high level of polymorphism was revealed by the 20 distinct ISSR/IRAP concatenated profiles observed among the 23 individuals analysed. Interestingly, median joining networks and PCoA grouped individuals of the same variety and clustered varieties according to geographical origin. Overall, this work demonstrates that durum wheat breeding strategies induced selection pressure for some relevant coding sequences while maintaining high levels of genomic variability in non-coding regions enriched in repetitive sequences.


2020 ◽  
Vol 10 (9) ◽  
pp. 3399-3402 ◽  
Author(s):  
Dae-Kyum Kim ◽  
Jennifer J Knapp ◽  
Da Kuang ◽  
Aditya Chawla ◽  
Patricia Cassonnet ◽  
...  

Abstract The world is facing a global pandemic of COVID-19 caused by the SARS-CoV-2 coronavirus. Here we describe a collection of codon-optimized coding sequences for SARS-CoV-2 cloned into Gateway-compatible entry vectors, which enable rapid transfer into a variety of expression and tagging vectors. The collection is freely available. We hope that widespread availability of this SARS-CoV-2 resource will enable many subsequent molecular studies to better understand the viral life cycle and how to block it.


2018 ◽  
Vol 7 (14) ◽  
Author(s):  
Nikolay V. Volozhantsev ◽  
Angelina A. Kislichkina ◽  
Anastasia I. Lev ◽  
Ekaterina V. Solovieva ◽  
Vera P. Myakinina ◽  
...  

We report here the genome sequences of 10 Klebsiella pneumoniae strains of capsular type K2 isolated in Russia from patients in an infectious clinical hospital and neurosurgical intensive care unit. The draft genome sizes range from 5.34 to 5.87 Mb and include 5,448 to 6,137 protein-coding sequences.


2016 ◽  
Vol 4 (6) ◽  
Author(s):  
Xuehua Wan ◽  
Shaobin Hou ◽  
Kazukuni Hayashi ◽  
James Anderson ◽  
Stuart P. Donachie

Rheinheimera salexigens KH87 T is an obligately halophilic gammaproteobacterium. The strain’s draft genome sequence, generated by the Roche 454 GS FLX+ platform, comprises two scaffolds of ~3.4 Mbp and ~3 kbp, with 3,030 protein-coding sequences and 58 tRNA coding regions. The G+C content is 42 mol%.
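The G+C figure quoted above is the share of G and C among the bases of the sequence, expressed as a percentage. A minimal Python sketch of that calculation follows. It assumes plain string input and ignores ambiguous symbols such as N. The function name `gc_content` is hypothetical and is not taken from the report.

```python
def gc_content(seq):
    """Return the G+C percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) enter the calculation;
    any other symbol (N, gaps, IUPAC codes) is skipped.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

For a multi-scaffold draft assembly, the scaffolds would be concatenated, or their counts summed, before taking the ratio, so that each base carries equal weight.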


BIOPHYSICS ◽  
2019 ◽  
Vol 64 (3) ◽  
pp. 339-348
Author(s):  
Yu. M. Suvorova ◽  
V. M. Pugacheva ◽  
E. V. Korotkov

2015 ◽  
Author(s):  
Malgorzata Habich ◽  
Sergej Djuranovic ◽  
Pawel Szczesny

A recent addition to the repertoire of gene expression regulatory mechanisms is polyadenylate (polyA) tracks encoding poly-lysine runs in protein sequences. Such tracks stall the translation apparatus and induce frameshifting independently of the effects of the charged nascent poly-lysine sequence on the ribosome exit channel. As such, they substantially influence the stability of the mRNA and the amount of protein produced from a given transcript. Single base changes in these regions are enough to exert a measurable effect on both protein and mRNA abundance, making each of these sequences a potentially interesting case study for the effects of synonymous mutation, gene dosage balance and natural frameshifting. Here we present PATACSDB, a resource that contains a comprehensive list of polyA tracks from over 250 eukaryotic genomes. Our data are based on the Ensembl genomic database of coding sequences and filtered with the 12A-1 algorithm, which selects polyA tracks with a minimal length of 12 A's, allowing for one mismatched base. The PATACSDB database is accessible at: http://sysbio.ibb.waw.pl/patacsdb. Source code is available for download from the GitHub repository at http://github.com/habich/PATACSDB, including the scripts to recreate the database from scratch on the user's own computer.
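One way to read the 12A-1 rule described above is as a sliding window of 12 nucleotides that must contain at least 11 A's. The sketch below implements that reading in Python and merges overlapping hits into a single track. It is an illustration under that assumption, not the PATACSDB code. The function name `find_polya_tracks` and its parameters are hypothetical.

```python
def find_polya_tracks(seq, window=12, max_mismatch=1):
    """Return merged (start, end) half-open intervals of candidate polyA tracks.

    Assumption: a window qualifies when it holds at least
    `window - max_mismatch` A's; overlapping or adjacent qualifying windows
    are merged into a single track.
    """
    seq = seq.upper()
    tracks = []
    if len(seq) < window:
        return tracks
    a_count = seq[:window].count("A")          # A's in the first window
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one base: add the entering base, drop the leaving one.
            a_count += (seq[start + window - 1] == "A") - (seq[start - 1] == "A")
        if a_count >= window - max_mismatch:
            end = start + window
            if tracks and start <= tracks[-1][1]:
                tracks[-1] = (tracks[-1][0], end)   # extend the previous track
            else:
                tracks.append((start, end))
    return tracks
```

Because one mismatch is tolerated, a reported interval can include a single non-A base at either edge of a pure run. A stricter variant could trim such flanking bases after merging.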


2019 ◽  
Author(s):  
Juan C. Villada ◽  
Maria F. Duran ◽  
Patrick K. H. Lee

Understanding the interplay between genotype and phenotype is a fundamental goal of functional genomics. Methane oxidation is a microbial phenotype with global-scale significance as part of the carbon biogeochemical cycle, and is a sink for greenhouse gas. Microorganisms that oxidize methane (methanotrophs) are taxonomically diverse and widespread around the globe. Recent reports have suggested that type Ia methanotrophs are the most prevalent methane-oxidizing bacteria in different environments. In methanotrophic bacteria, complete methane oxidation is encoded in four operons (pmoCAB, mmoXYZBCD, mxaFI, and xoxF), but how evolution has shaped these genes to execute methane oxidation remains poorly understood. Here, we used a genomic meta-analysis to investigate the coding sequences that encode methane oxidation. By analyzing isolate and metagenome-assembled genomes from phylogenetically and geographically diverse sources, we detected an anomalous nucleotide composition bias in the coding sequences of particulate methane monooxygenase genes (pmoCAB) from type Ia methanotrophs around the globe. We found that this was a highly conserved sequence that optimizes codon usage in order to maximize translation efficiency and accuracy, while minimizing the synthesis cost of transcripts and proteins. We show that among the seven types of methanotrophs, only type Ia methanotrophs possess a unique coding sequence of the pmoCAB operon that is under positive selection for optimal resource allocation and efficient synthesis of transcripts and proteins in environmental counter gradients with high oxygen and low methane concentrations. This adaptive trait possibly enables type Ia methanotrophs to respond robustly to fluctuating methane availability and explains their global prevalence.


2020 ◽  
Author(s):  
Anyou Wang ◽  
Rong Hai

Abstract Eukaryotic genomes gradually gain noncoding regions as evolution advances, and the human genome actively transcribes >90% of its noncoding regions [1], suggesting their importance in the evolution of the human genome. Yet <1% of them have been functionally characterized [2], leaving most of the human genome in the dark. Here we systematically decode endogenous lncRNAs located in unannotated regions of the human genome and decipher a distinctive functional regime of lncRNAs hidden in massive RNA-seq data. LncRNAs are distributed divergently across chromosomes, independently of protein-coding regions. Their transcription rarely initiates at promoters through polymerase II, but mostly at enhancers. Yet conventional enhancer activators (e.g., H3K4me1) account for only a small proportion of lncRNA activation, suggesting that alternative, unknown mechanisms initiate the majority of lncRNAs. Meanwhile, lncRNA self-regulation also contributes notably to lncRNA activation. LncRNAs trans-regulate a broad range of bioprocesses, including transcription and RNA processing, cell cycle, respiration, response to stress, chromatin organization, post-translational modification, and development. Overall, lncRNAs govern their own regime, distinct from that of proteins.

