Computational Characterization of the mtORF of Pocilloporid Corals: Insights into Protein Structure and Function in Stylophora Lineages from Contrasting Environments

Genes ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 324
Author(s):  
Banguera-Hinestroza ◽  
Ferrada ◽  
Sawall ◽  
Flot

More than a decade ago, a new mitochondrial Open Reading Frame (mtORF) was discovered in corals of the family Pocilloporidae and has been used since then as an effective barcode for these corals. Recently, mtORF sequencing revealed the existence of two differentiated Stylophora lineages occurring in sympatry along the environmental gradient of the Red Sea (18.5°C to 33.9°C). In the endemic Red Sea lineage RS_LinB, the mtORF and the heat shock protein gene hsp70 uncovered similar phylogeographic patterns strongly correlated with environmental variations. This suggests that the mtORF too might be involved in thermal adaptation. Here, we used computational analyses to explore the features and putative function of this mtORF. In particular, we tested the likelihood that this gene encodes a functional protein and whether it may play a role in adaptation. Analyses of full mitogenomes showed that the mtORF originated in the common ancestor of Madracis and other pocilloporids, and that it encodes a transmembrane protein differing in length and domain architecture among genera. Homology-based annotation and the relative conservation of metal-binding sites revealed traces of an ancient hydrolase catalytic activity. Furthermore, signals of pervasive purifying selection, lack of stop codons in 1830 sequences analyzed, and a codon-usage bias similar to that of other mitochondrial genes indicate that the protein is functional, i.e., not a pseudogene. Other features, such as intrinsically disordered regions, tandem repeats, and signals of positive selection particularly in Stylophora RS_LinB populations, are consistent with a role of the mtORF in adaptive responses to environmental changes.
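One of the lines of evidence above, the absence of stop codons in the analyzed sequences, can be illustrated with a minimal sketch. It assumes the sequences are in-frame DNA strings and that NCBI translation table 4 (the mold, protozoan and coelenterate mitochondrial code, in which TGA encodes tryptophan) is the appropriate code; the function name `internal_stop_positions` is hypothetical and is not taken from the paper.

```python
# Stop codons under NCBI translation table 4 (assumed here for coral
# mitochondria): TGA encodes tryptophan, so only TAA and TAG terminate.
STOP_CODONS_TABLE4 = {"TAA", "TAG"}


def internal_stop_positions(orf, stops=STOP_CODONS_TABLE4):
    """Return 0-based codon indices of in-frame stops, ignoring the final codon.

    `orf` is assumed to start at the first base of the reading frame.
    Trailing bases that do not form a complete codon are ignored.
    """
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last codon is allowed to be the legitimate terminator.
    return [i for i, codon in enumerate(codons[:-1]) if codon in stops]


if __name__ == "__main__":
    # Toy sequences, not real mtORF data.
    print(internal_stop_positions("ATGTGAGGATAA"))  # TGA is not a stop here
    print(internal_stop_positions("ATGTAAGGATAA"))  # TAA at codon index 1
```

An empty list for every sequence in a data set would be consistent with an intact reading frame, although on its own it does not demonstrate that the protein is functional.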

Genes ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 407 ◽  
Author(s):  
Matteo Delucchi ◽  
Elke Schaper ◽  
Oxana Sachenkova ◽  
Arne Elofsson ◽  
Maria Anisimova

Protein tandem repeats (TRs) are often associated with immunity-related functions and diseases. Since the last census of protein TRs in 1999, the number of curated proteins has increased more than seven-fold and new TR prediction methods have been published. TRs appear to be enriched with intrinsic disorder and vice versa. The significance and the biological reasons for this association are unknown. Here, we characterize protein TRs across all kingdoms of life and their overlap with intrinsic disorder in unprecedented detail. Using state-of-the-art prediction methods, we estimate that 50.9% of proteins contain at least one TR, often located at the sequence flanks. A positive linear correlation between the proportion of TRs and protein length was observed universally. Eukaryotes in general have more TRs, but the difference is quite small once the difference in length is taken into account. TRs were enriched with disorder-promoting amino acids and were inside intrinsically disordered regions. Many such TRs were homorepeats. Our results support the view that TRs mostly originate by duplication and are involved in essential functions such as transcription processes, structural organization, electron transport and iron-binding. In viruses, TRs are found in proteins essential for virulence.


2021 ◽  
Vol 1 ◽  
Author(s):  
Max A. Verbiest ◽  
Matteo Delucchi ◽  
Tugce Bilgin Sonay ◽  
Maria Anisimova

Short tandem repeats (STRs) are abundant in genomic sequences and are known for comparatively high mutation rates; STRs are therefore thought to be a potent source of genetic diversity. In protein-coding sequences, STRs primarily encode disorder-promoting amino acids and are often located in intrinsically disordered regions (IDRs). STRs are frequently studied in the scope of microsatellite instability (MSI) in cancer, with little focus on the connection between protein STRs and IDRs. We believe, however, that this relationship should be explicitly included when ascertaining STR functionality in cancer. Here we explore this notion using all canonical human proteins from SwissProt, wherein we detected 3,699 STRs. Over 80% of these consisted completely of disorder-promoting amino acids. 62.1% of amino acids in STR sequences were predicted to also be in an IDR, compared to 14.2% for non-repeat sequences. Over-representation analysis showed STR-containing proteins to be primarily located in the nucleus, where they perform protein- and nucleotide-binding functions and regulate gene expression. They were also enriched in cancer-related signaling pathways. Furthermore, we found enrichments of STR-containing proteins among those correlated with patient survival for cancers derived from eight different anatomical sites. Intriguingly, several of these cancer types are not known to have an MSI-high (MSI-H) phenotype, suggesting that protein STRs play a role in cancer pathology in non-MSI-H settings. Their intrinsic link with IDRs could therefore be an attractive topic of future research to further explore the role of STRs and IDRs in cancer. We speculate that our observations may be linked to the known dosage-sensitivity of disordered proteins, which could hint at a concentration-dependent gain-of-function mechanism in cancer for proteins containing STRs and IDRs.
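The overlap statistic reported above (the share of STR residues that also fall in an IDR) can be sketched as follows. This is only an illustration under stated assumptions: regions are given as 0-based, end-exclusive `(start, end)` pairs for a single protein, and the function name `fraction_in_idr` is hypothetical rather than the authors' code.

```python
def fraction_in_idr(str_regions, idr_regions):
    """Fraction of STR residues that also lie inside a predicted IDR.

    Both arguments are lists of (start, end) pairs, 0-based and
    end-exclusive (an assumed convention for this sketch).
    """
    str_positions = {i for start, end in str_regions for i in range(start, end)}
    idr_positions = {i for start, end in idr_regions for i in range(start, end)}
    if not str_positions:
        return 0.0
    return len(str_positions & idr_positions) / len(str_positions)


if __name__ == "__main__":
    # Toy coordinates: a 10-residue repeat, half of which overlaps an IDR.
    print(fraction_in_idr([(0, 10)], [(5, 20)]))
```

A proteome-wide figure would pool the residue counts over all proteins before dividing, rather than averaging per-protein fractions.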


2019 ◽  
Vol 47 (W1) ◽  
pp. W373-W378 ◽  
Author(s):  
Damiano Piovesan ◽  
Silvio C E Tosatto

Our current knowledge of complex biological systems is stored in a computable form through the Gene Ontology (GO), which provides a comprehensive description of gene function. Prediction of GO terms from the sequence remains, however, a challenging task, which is particularly critical for novel genomes. Here we present INGA 2.0, a new version of the INGA software for protein function prediction. INGA exploits homology, domain architecture, interaction networks and information from the ‘dark proteome’, like transmembrane and intrinsically disordered regions, to generate a consensus prediction. INGA was ranked in the top ten methods on both CAFA2 and CAFA3 blind tests. The new algorithm can process entire genomes in a few hours or even less when additional input files are provided. The new interface provides a better user experience by integrating filters and widgets to explore the graph structure of the predicted terms. The INGA web server, databases and benchmarking are available from URL: https://inga.bio.unipd.it/.


Author(s):  
Eulalia Banguera-Hinestroza ◽  
Yvonne Sawall ◽  
Jean-François Flot

More than a decade ago, a new mitochondrial Open Reading Frame (mtORF) was discovered in corals of the family Pocilloporidae, and it has turned out to be an effective barcode gene for these corals. However, its function remains unknown. Recently, this gene revealed the existence of a hybrid Stylophora lineage (RS_LinA) living in sympatry with its parental species (RS_LinB) along the environmental gradient of the Red Sea (18.5°C to 33.9°C). Furthermore, in RS_LinB, the mtORF uncovered phylogeographic patterns that were strongly correlated with environmental variations. This was similar to the patterns revealed by hsp70, suggesting that the mtORF too might be involved in thermal adaptation. Here we used computational approaches to characterize the mtORF and to identify its potential role. Results showed that this gene encodes a transmembrane protein (0.97 < P < 1.00) involved in transport (0.80 < P < 0.87), regulation of metabolic processes (0.70 < P < 0.85), and likely in the cell-surface receptor signaling pathway (0.56 < P < 0.80). Predicted protein functions differed among Stylophora lineages and, interestingly, in RS_LinB only, the protein was intrinsically disordered and displayed domains involved in cellular complexes and stress response (0.0001 < P < 0.001). These characteristics, exclusive to an endemic lineage adapted to extreme environmental fluctuations, support a role of the mtORF in stress response, speciation and adaptation.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9669
Author(s):  
Paul M. Harrison

Prions are self-propagating alternative states of protein domains. They are linked to both diseases and functional protein roles in eukaryotes. Prion-forming domains in Saccharomyces cerevisiae are typically domains with high intrinsic protein disorder (i.e., that remain unfolded in the cell during at least some part of their functioning), that are converted to self-replicating amyloid forms. S. cerevisiae is a member of the fungal class Saccharomycetes, during the evolution of which a large population of prion-like domains has appeared. It is still unclear what principles might govern the molecular evolution of prion-forming domains, and intrinsically disordered domains generally. Here, it is discovered that in a set of such prion-forming domains some evolve in the fungal class Saccharomycetes in such a way as to absorb general mutation biases across millions of years, whereas others do not, indicating a spectrum of selection pressures on composition and sequence. Thus, if the bias-absorbing prion formers are conserving a prion-forming capability, then this capability is not interfered with by the absorption of bias changes over the duration of evolutionary epochs. Evidence is discovered for selective constraint against the occurrence of lysine residues (which likely disrupt prion formation) in S. cerevisiae prion-forming domains as they evolve across Saccharomycetes. These results provide a case study of the absorption of mutational trends by compositionally biased domains, and suggest methodology for assessing selection pressures on the composition of intrinsically disordered regions.


2018 ◽  
Author(s):  
Michael Babokhov ◽  
Bradley I. Reinfeld ◽  
Kevin Hackbarth ◽  
Yotam Bentov ◽  
Stephen M. Fuchs

Copy-number variation in tandem repeat coding regions is more prevalent in eukaryotic genomes than current literature suggests. We have reexamined the genomes of nearly 100 yeast strains to map regions of repeat variation. From this analysis we have found that length variation is highly correlated with intrinsically disordered regions (IDRs). Furthermore, the majority of length variation is associated with tandem repeats. These repetitive regions are rich in homopolymeric amino acid sequences, but nearly half of the variation comes from longer-repeating motifs. Comparisons of repeat copy number and sequence between strains of budding yeast as well as closely related fungi suggest selection for and conservation of IDR-related tandem repeats. In some instances, repeat variation has been demonstrated to mediate binding affinity, aggregation, and protein stability. With this analysis, we can identify proteins for which repeat variation may play conserved roles in modulating protein function.


2018 ◽  
Author(s):  
Antonio Deiana ◽  
Sergio Forcelloni ◽  
Alessandro Porrello ◽  
Andrea Giansanti

We propose a new, sequence-only classification of intrinsically disordered human proteins which is based on two parameters: dr, the percentage of disordered residues, and Ld, the length of the longest disordered segment in the sequence. Depending on dr and Ld, we distinguish five variants: i) ordered proteins (ORDs); ii) not disordered proteins (NDPs); iii) proteins with intrinsically disordered regions (PDRs); iv) intrinsically disordered proteins (IDPs); and v) proteins with fragmented disorder (FRAGs). PDRs have been considered in the general category of intrinsically disordered proteins for a long time. We show that PDRs are closer to globular, ordered proteins (ORDs and NDPs) than to disordered ones (IDPs), both in amino acid composition and functionally. Moreover, NDPs and PDRs are uniformly spread over several functional protein classes, whereas IDPs are concentrated in only two, namely nucleic acid binding proteins and transcription factors, which are just a subset of the functions that are commonly associated with protein intrinsic disorder. In conclusion, PDRs and IDPs should be considered, in future classifications, as distinct variants of disordered proteins, with different physical-chemical properties and functional spectra.
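The two parameters defined above can be computed from a per-residue disorder prediction, as in the following minimal sketch. It assumes the prediction is available as a sequence of booleans (True for a disordered residue); the function name `disorder_parameters` is hypothetical, and the thresholds that separate the five variants are not stated in the abstract, so they are deliberately left out.

```python
from itertools import groupby


def disorder_parameters(mask):
    """Return (dr, Ld) for a per-residue disorder mask.

    dr is the percentage of disordered residues and Ld is the length of
    the longest run of consecutive disordered residues. `mask` is an
    iterable of booleans (an assumed input format for this sketch).
    """
    mask = [bool(flag) for flag in mask]
    if not mask:
        return 0.0, 0
    dr = 100.0 * sum(mask) / len(mask)
    # groupby yields consecutive runs; keep only the disordered ones.
    run_lengths = [sum(1 for _ in run) for flag, run in groupby(mask) if flag]
    ld = max(run_lengths, default=0)
    return dr, ld


if __name__ == "__main__":
    # Toy mask: 8 residues, 5 disordered, longest disordered run of 3.
    toy = [True, True, False, True, True, True, False, False]
    print(disorder_parameters(toy))
```

Keeping the classification step separate from the parameter calculation means any published cut-offs can be applied afterwards without changing this function.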


2021 ◽  
Vol 22 (7) ◽  
pp. 3300
Author(s):  
Jing Yang ◽  
Ying Cao ◽  
Ligeng Ma

Most protein-coding genes in eukaryotes possess at least two poly(A) sites, and alternative polyadenylation is considered a contributing factor to transcriptomic and proteomic diversity. Following transcription, a nascent RNA usually undergoes capping, splicing, cleavage, and polyadenylation, resulting in a mature messenger RNA (mRNA); however, increasing evidence suggests that transcription and RNA processing are coupled. Plants, which must produce rapid responses to environmental changes because of their limited mobility, exhibit such coupling. In this review, we summarize recent advances in our understanding of the coupling of transcription with RNA processing in plants, and we describe the possible spatial environment and important proteins involved. Moreover, we describe how liquid–liquid phase separation, mediated by the C-terminal domain of RNA polymerase II and RNA processing factors with intrinsically disordered regions, enables efficient co-transcriptional mRNA processing in plants.

