Folding Rate Optimization Promotes Frustrated Interactions in Entangled Protein Structures

Federico Norbiato; Flavio Seno; Antonio Trovato; Marco Baiesi

doi:10.3390/ijms21010213

International Journal of Molecular Sciences ◽

10.3390/ijms21010213 ◽

2019 ◽

Vol 21 (1) ◽

pp. 213

Author(s):

Federico Norbiato ◽

Flavio Seno ◽

Antonio Trovato ◽

Marco Baiesi

Keyword(s):

Amino Acids ◽

Structural Biology ◽

Weak Interactions ◽

Protein Structures ◽

Protein Sequences ◽

Control Case ◽

Folding Rate ◽

Empirical Observation ◽

Rate Optimization ◽

Kinetic Traps

Many native structures of proteins accommodate complex topological motifs such as knots, lassos, and other geometrical entanglements. How proteins can fold quickly even in the presence of such topological obstacles is a debated question in structural biology. Recently, the hypothesis that energetic frustration might be a mechanism to avoid topological frustration has been put forward based on the empirical observation that loops involved in entanglements are stabilized by weak interactions between amino acids at their extrema. To verify this idea, we use a toy lattice model for the folding of proteins into two almost identical structures, one entangled and one not. As expected, the folding time is longer when random sequences fold into the entangled structure. This holds also under an evolutionary pressure simulated by optimizing the folding time. It turns out that optimized protein sequences in the entangled structure are in fact characterized by frustrated interactions at the closures of entangled loops. This phenomenon is much less enhanced in the control case where the entanglement is not present. Our findings, which are in agreement with experimental observations, corroborate the idea that an evolutionary pressure shapes the folding funnel to avoid topological and kinetic traps.
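As an illustration of what a "weak interaction at a loop closure" can mean in a lattice setting, the following minimal sketch counts the non-bonded contacts of a chain on a cubic lattice and sums a pairwise energy over them. It is not the model used in the paper: the two-letter H/P alphabet, the energy table `PAIR_ENERGY`, and the four-residue conformation `square` are all hypothetical choices made only for this example.

```python
from itertools import combinations

def lattice_contacts(coords):
    """Return index pairs (i, j) of non-bonded residues on adjacent lattice sites."""
    contacts = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if j - i < 2:
            continue  # consecutive residues are bonded, not in contact
        if sum(abs(x - y) for x, y in zip(a, b)) == 1:
            contacts.append((i, j))
    return contacts

def contact_energy(coords, sequence, pair_energy):
    """Sum the pairwise energies over all contacts of the conformation."""
    total = 0.0
    for i, j in lattice_contacts(coords):
        key = tuple(sorted((sequence[i], sequence[j])))
        total += pair_energy[key]
    return total

# Hypothetical two-letter energy table (assumption, not from the paper).
PAIR_ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}

# Toy conformation: four residues forming a closed square loop.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

strong_closure = contact_energy(square, "HPPH", PAIR_ENERGY)  # H-H closes the loop
weak_closure = contact_energy(square, "HPPP", PAIR_ENERGY)    # H-P closes the loop
```

In this toy setting the only contact is the one between the first and last residue, so changing the residue types at the ends of the loop changes how strongly the loop closure is stabilized.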

Analysis of Oncogene Protein Structure Using Small World Network Concept

Current Bioinformatics ◽

10.2174/1574893614666191113143840 ◽

2020 ◽

Vol 15 (7) ◽

pp. 732-740

Author(s):

Neetu Kumari ◽

Anshul Verma

Keyword(s):

Amino Acids ◽

Protein Structure ◽

Degree Distribution ◽

Protein Structures ◽

Small World ◽

Extreme Condition ◽

Centrality Measures ◽

Small World Network ◽

Network Concept ◽

Oncogene Protein

Background: The basic building block of a body is protein, a complex system whose structure plays a key role in activation, catalysis, messaging and disease states. Therefore, careful investigation of protein structure is necessary for the diagnosis of diseases and for drug design. Protein structures are described at different levels of complexity: primary (chain), secondary (helical), tertiary (3D), and quaternary structure. Analyzing the complex 3D structure of a protein is a difficult task, but it can be analyzed as a network of interconnections between its components, where amino acids are considered as nodes and the interconnections between them as edges. Objective: Many works in the literature have shown that the small world network concept provides new opportunities to investigate networks of biological systems. The objective of this paper is to analyze protein structure using the small world concept. Methods: The protein is analyzed using the small world network concept, specifically where the extreme condition is a degree distribution that follows a power law. To verify the proposed approach, a dataset of the Oncogene protein structure is analyzed using Python programming. Results: The protein structure is plotted as a network of amino acids (Residue Interaction Graph (RIG)) using the distance matrix of nodes with a given threshold; then various centrality measures (i.e., degree distribution, Degree-Betweenness correlation, and Betweenness-Closeness correlation) are calculated for 1323 nodes and graphs are plotted. Conclusion: Ultimately, it is concluded that there exist hubs with higher centrality degree that are fewer in number, and they are expected to be robust toward harmful effects of mutations with new functions.
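The construction described in the Results (nodes are amino acids, edges connect residues closer than a threshold) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the coordinates in `toy_coords` and the 5.0 Å cutoff are hypothetical, and the abstract does not state which threshold or which atoms were used.

```python
import math
from collections import Counter

def residue_interaction_graph(coords, threshold):
    """Adjacency sets: residues i and j are linked if their distance <= threshold."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def degree_distribution(adj):
    """Map each degree k to the number of nodes having that degree."""
    return Counter(len(neighbours) for neighbours in adj.values())

# Hypothetical residue coordinates in angstroms (one point per residue).
toy_coords = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (9, 0, 0), (3, 3, 0)]

adj = residue_interaction_graph(toy_coords, threshold=5.0)  # cutoff is an assumption
dist = degree_distribution(adj)
```

On a real structure the same degree counts could be plotted on log-log axes to judge whether the distribution is compatible with a power law; betweenness and closeness would need a shortest-path computation on `adj`, which is omitted here.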

A Study on Host Tropism Determinants of Influenza Virus Using Machine Learning

Current Bioinformatics ◽

10.2174/1574893614666191104160927 ◽

2020 ◽

Vol 15 (2) ◽

pp. 121-134 ◽

Cited By ~ 2

Author(s):

Eunmi Kwon ◽

Myeongji Cho ◽

Hayeon Kim ◽

Hyeon S. Son

Keyword(s):

Machine Learning ◽

Amino Acids ◽

Influenza Virus ◽

Random Forest ◽

Physicochemical Properties ◽

Protein Sequences ◽

Influenza Viruses ◽

Host Tropism ◽

Post Hoc ◽

Ha Protein

Background: The host tropism determinants of influenza virus, which cause changes in the host range and increase the likelihood of interaction with specific hosts, are critical for understanding the infection and propagation of the virus in diverse host species. Methods: Six types of protein sequences of influenza viral strains isolated from three classes of hosts (avian, human, and swine) were obtained. Random forest, naïve Bayes classification, and k-nearest neighbor algorithms were used for host classification. The Java language was used for sequence analysis programming and identifying host-specific position markers. Results: A machine learning technique was explored to derive the physicochemical properties of amino acids used in host classification and prediction. HA protein was found to play the most important role in determining host tropism of the influenza virus, and the random forest method yielded the highest accuracy in host prediction. Conserved amino acids that exhibited host-specific differences were also selected and verified, and they were found to be useful position markers for host classification. Finally, ANOVA analysis and post-hoc testing revealed that the physicochemical properties of amino acids, comprising protein sequences combined with position markers, differed significantly among hosts. Conclusion: The host tropism determinants and position markers described in this study can be used in related research to classify, identify, and predict the hosts of influenza viruses that are currently susceptible or likely to be infected in the future.
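The idea of classifying hosts from physicochemical properties at selected sequence positions can be illustrated with a small k-nearest neighbor sketch. It is written in Python for brevity (the study itself reports using Java) and is not the authors' pipeline: the Kyte-Doolittle hydropathy scale is just one possible property, and the peptide fragments, labels, and marker positions below are invented for the example.

```python
from collections import Counter

# Kyte-Doolittle hydropathy index, used here as one example property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode(sequence, positions):
    """Turn the residues at the marker positions into a numeric feature vector."""
    return [KYTE_DOOLITTLE[sequence[p]] for p in positions]

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))
    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical marker positions and toy fragments (not real influenza data).
positions = [1, 3]
labelled = [
    ("AIKLGV", "avian"), ("AIRLGV", "avian"), ("ALKLGV", "avian"),
    ("ADKSGV", "human"), ("AEKSGV", "human"), ("ADKTGV", "human"),
]
train = [(encode(seq, positions), host) for seq, host in labelled]

predicted = knn_predict(train, encode("AVKLGV", positions), k=3)
```

A random forest or naïve Bayes classifier would consume the same kind of feature vectors; only the prediction step would differ.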

Determinants of adenine-mutagenesis in diversity-generating retroelements

Nucleic Acids Research ◽

10.1093/nar/gkaa1240 ◽

2020 ◽

Author(s):

Sumit Handa ◽

Andres Reyna ◽

Timothy Wiryaman ◽

Partho Ghosh

Keyword(s):

Amino Acids ◽

Dark Matter ◽

Reverse Transcription ◽

Genetic Information ◽

Human Microbiome ◽

Protein Sequences ◽

Catalytic Efficiency ◽

Natural World ◽

In Vitro System

Diversity-generating retroelements (DGRs) vary protein sequences to the greatest extent known in the natural world. These elements are encoded by constituents of the human microbiome and the microbial ‘dark matter’. Variation occurs through adenine-mutagenesis, in which genetic information in RNA is reverse transcribed faithfully to cDNA for all template bases but adenine. We investigated the determinants of adenine-mutagenesis in the prototypical Bordetella bacteriophage DGR through an in vitro system composed of the reverse transcriptase bRT, Avd protein, and a specific RNA. We found that the catalytic efficiency for correct incorporation during reverse transcription by the bRT-Avd complex was strikingly low for all template bases, with the lowest occurring for adenine. Misincorporation across a template adenine was only somewhat lower in efficiency than correct incorporation. We found that the C6, but not the N1 or C2, purine substituent was a key determinant of adenine-mutagenesis. bRT-Avd was insensitive to the C6 amine of adenine but recognized the C6 carbonyl of guanine. We also identified two bRT amino acids predicted to nonspecifically contact incoming dNTPs, R74 and I181, as promoters of adenine-mutagenesis. Our results suggest that the overall low catalytic efficiency of bRT-Avd is intimately tied to its ability to carry out adenine-mutagenesis.

Protein Structures-based Neighborhood Analysis vs Preferential Interactions Between the Special Pairs of Amino acids?

Journal of Biomolecular Structure and Dynamics ◽

10.1080/073911011010524968 ◽

2011 ◽

Vol 28 (4) ◽

pp. 629-632 ◽

Cited By ~ 3

Author(s):

Jihua Wang ◽

Zanxia Cao ◽

Jiafeng Yu

Keyword(s):

Amino Acids ◽

Protein Structures ◽

Neighborhood Analysis ◽

Preferential Interactions

Beyond History: The List of The Most Well Studied Human Protein Structures

10.20944/preprints202008.0655.v1 ◽

2020 ◽

Author(s):

Zhenlu Li ◽

Matthias Buck

Keyword(s):

Protein Structures ◽

Protein Sequences ◽

Human Protein ◽

Current Status ◽

Protein Database ◽

X Ray ◽

X Ray Crystallography ◽

Protein Biophysics ◽

The Relationship ◽

Past Trend

Of 20,000 or so canonical human protein sequences, as of July 2020, 6,747 proteins have had their full or partial medium- to high-resolution structures determined by X-ray crystallography or other methods. Which of these proteins dominate the protein database (the PDB) and why? In this paper, we list the 272 top protein structures based on the number of their PDB depositions. This set of proteins accounts for more than 40% of all available human PDB entries and represents the past trend and current status of protein science. We briefly discuss the relationship which some of the prominent protein structures have with protein biophysics research and mention their relevance to human diseases. The information may inspire researchers who are new to protein science, but it also provides a year 2020 snapshot of the state of protein science.

Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2020.626363 ◽

2021 ◽

Vol 7 ◽

Author(s):

Castrense Savojardo ◽

Matteo Manfredi ◽

Pier Luigi Martelli ◽

Rita Casadio

Keyword(s):

Solvent Accessibility ◽

Protein Structures ◽

Three Dimensional ◽

Protein Sequences ◽

Large Data ◽

Human Protein ◽

Dimensional Structure ◽

Wild Type ◽

Solvent Exposure ◽

Data Set

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask to what extent solvent exposure of residues can be associated with the pathogenicity of the variation. By this, SASA of the wild-type residue acquires a role in the context of functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease-related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in-house developed predictor, based on a deep learning procedure and performing at the state of the art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that surface accessible area is a distinguished property of residues that undergo variation and that pathogenicity is more frequently associated with the buried property than with the exposed one.

Thermodynamic stereoselectivity assisted by weak interactions in metal complexes. Copper(II) ternary complexes of cyclo-L-histidyl-L-histidine and L- or D-amino acids in aqueous solution

Journal of the Chemical Society Dalton Transactions ◽

10.1039/dt9910003203 ◽

1991 ◽

pp. 3203 ◽

Cited By ~ 7

Author(s):

Giuseppe Arena ◽

Raffaele P. Bonomo ◽

Luigi Casella ◽

Michele Gullotti ◽

Giuseppe Impellizzeri ◽

...

Keyword(s):

Aqueous Solution ◽

Amino Acids ◽

Metal Complexes ◽

Weak Interactions ◽

Ternary Complexes

3. Proteins

Biochemistry: A Very Short Introduction ◽

10.1093/actrade/9780198833871.003.0003 ◽

2021 ◽

pp. 34-51

Author(s):

Mark Lorch

Keyword(s):

Amino Acids ◽

Protein Folding ◽

Protein Structure ◽

Protein Structures ◽

Structure And Function ◽

Vast Array ◽

A Cell ◽

Cellular Machinery ◽

And Function ◽

The Relationship

This chapter examines proteins, the dominant proportion of cellular machinery, and the relationship between protein structure and function. The multitude of biological processes needed to keep cells functioning are managed in the organism or cell by a massive cohort of proteins, together known as the proteome. The twenty amino acids that make up the bulk of proteins produce the vast array of protein structures. However, amino acids alone do not provide quite enough chemical variety to complete all of the biochemical activity of a cell, so the chapter also explores post-translational modifications. It finishes by looking at some dynamic aspects of proteins, including enzyme kinetics and the protein folding problem.

BIOPEP-UWM Database of Bioactive Peptides: Current Opportunities

International Journal of Molecular Sciences ◽

10.3390/ijms20235978 ◽

2019 ◽

Vol 20 (23) ◽

pp. 5978 ◽

Cited By ~ 49

Author(s):

Minkiewicz ◽

Iwaniak ◽

Darewicz

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Chronic Diseases ◽

Bioactive Peptides ◽

Protein Sequences ◽

Batch Processing ◽

Amino Acid Sequences ◽

Quantitative Parameters ◽

New Information

The BIOPEP-UWM™ database of bioactive peptides (formerly BIOPEP) has recently become a popular tool in research on bioactive peptides, especially those derived from foods and being constituents of diets that prevent the development of chronic diseases. The database is continuously updated and modified. The addition of new peptides and the introduction of new information about the existing ones (e.g., chemical codes and references to other databases) are in progress. New opportunities include the possibility of annotating peptides containing D-enantiomers of amino acids, a batch processing option, converting amino acid sequences into SMILES code, new quantitative parameters characterizing the presence of bioactive fragments in protein sequences, and finding proteinases that release particular peptides.

Diverse protein assembly driven by metal and chelating amino acids with selectivity and tunability

Nature Communications ◽

10.1038/s41467-019-13491-w ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 10

Author(s):

Minwoo Yang ◽

Woon Ju Song

Keyword(s):

Amino Acids ◽

Self Assembly ◽

Protein Structures ◽

Functional Materials ◽

Building Blocks ◽

Unnatural Amino Acid ◽

Metal Coordination ◽

Specific Chemical ◽

Natural Building ◽

Kinetics Of

Proteins are versatile natural building blocks with highly complex and multifunctional architectures, and self-assembled protein structures have been created by the introduction of covalent, noncovalent, or metal-coordination bonding. Here, we report the robust, selective, and reversible metal coordination properties of unnatural chelating amino acids as the sufficient and dominant driving force for diverse protein self-assembly. Bipyridine-alanine is genetically incorporated into a D3 homohexamer. Depending on the position of the unnatural amino acid, 1-directional, crystalline and noncrystalline 2-directional, combinatory, and hierarchical architectures are effectively created upon the addition of metal ions. The length and shape of the structures are tunable by altering conditions related to thermodynamics and kinetics of metal-coordination and subsequent reactions. The crystalline 1-directional and 2-directional biomaterials retain their native enzymatic activities with increased thermal stability, suggesting that introducing chelating ligands provides a specific chemical basis to synthesize diverse protein-based functional materials while retaining their native structures and functions.
