Amino-acid site variability among natural and designed proteins

10.7287/peerj.preprints.74v1 ◽

2013 ◽

Author(s):

Eleisha L. Jackson ◽

Noah Ollikainen ◽

Arthur W. Covert III ◽

Tanja Kortemme ◽

Claus O. Wilke

Keyword(s):

Amino Acid ◽

Protein Design ◽

Protein Sequences ◽

Structural Constraints ◽

Scoring Functions ◽

Solvent Exposure ◽

Backbone Flexibility ◽

Hydrophobic Residues ◽

Designed Proteins ◽

Site Variability

Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins while proteins designed with an intermediate amount of backbone flexibility result in more realistic site variability. Further, the correlation between solvent exposure and site variability in designed proteins is lower than that in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins we find that in the designed proteins hydrophobic residues are underrepresented in the core. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.
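The comparison described in this abstract rests on two quantities: a per-site variability score for each alignment column, and its correlation with solvent exposure. The abstract does not say which variability measure was used, so the sketch below assumes Shannon entropy per column and a plain Pearson correlation. The toy alignment and the exposure values are hypothetical, chosen only to make the example run.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in nats) of one alignment column, ignoring gaps."""
    counts = Counter(aa for aa in column if aa != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log(1/p) summed over observed residues; 0.0 for a fully conserved site.
    return sum((n / total) * math.log(total / n) for n in counts.values())


def alignment_entropies(alignment):
    """Entropy of every column in a list of equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [site_entropy([seq[i] for seq in alignment]) for i in range(length)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical toy alignment (4 sequences, 4 sites) and hypothetical
# relative solvent accessibility per site; not data from the paper.
toy_alignment = ["ACDE", "ACEE", "AGFE", "AGHE"]
toy_exposure = [0.0, 0.3, 0.9, 0.1]

entropies = alignment_entropies(toy_alignment)
print(entropies)
print(pearson(entropies, toy_exposure))
```

Running the same two functions on a natural alignment and on a designed alignment of the same structure would give the two correlations that the abstract compares.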


Computational Protein Design Quantifies Structural Constraints on Amino Acid Covariation

PLoS Computational Biology ◽

10.1371/journal.pcbi.1003313 ◽

2013 ◽

Vol 9 (11) ◽

pp. e1003313 ◽

Cited By ~ 24

Author(s):

Noah Ollikainen ◽

Tanja Kortemme

Keyword(s):

Amino Acid ◽

Protein Design ◽

Computational Protein Design ◽

Structural Constraints


Improving folding properties of computationally designed proteins

Protein Engineering Design and Selection ◽

10.1093/protein/gzz025 ◽

2019 ◽

Vol 32 (3) ◽

pp. 145-151

Author(s):

Benjamin Bjerre ◽

Jakob Nissen ◽

Mikkel Madsen ◽

Jūratė Fahrig-Kamarauskaitė ◽

Rasmus K Norrild ◽

...

Keyword(s):

Amino Acid ◽

Protein Design ◽

Genetic Selection ◽

Single Amino Acid ◽

Structured Design ◽

Designed Proteins ◽

Improved Stability ◽

The Difference ◽

Essential Enzyme ◽

Selection Of

While the field of computational protein design has made remarkable progress in recent years, folding properties still constitute a significant barrier to designing new and larger proteins. In order to assess and improve the folding properties of designed proteins, we have developed a genetics-based folding assay and selection system based on the essential enzyme orotate phosphoribosyl transferase from Escherichia coli. This system allows for both screening of candidate designs with good folding properties and genetic selection of improved designs. Using it, we identified single amino acid substitutions in two failed designs that rescued poorly folding and unstable proteins. Furthermore, when these substitutions were transferred into a well-structured design featuring a complex folding profile, the resulting protein exhibited native-like cooperative folding with significantly improved stability. In protein design, a single amino acid can make the difference between folding and misfolding, and this approach provides a useful new platform to identify and improve candidate designs.


Coupling backbone flexibility and amino acid sequence selection in protein design

Protein Science ◽

10.1002/pro.5560060810 ◽

1997 ◽

Vol 6 (8) ◽

pp. 1701-1707 ◽

Cited By ~ 77

Author(s):

Alyce Su ◽

Stephen L. Mayo

Keyword(s):

Amino Acid ◽

Amino Acid Sequence ◽

Protein Design ◽

Backbone Flexibility ◽

Sequence Selection


Frequencies of amino acid strings in globular protein sequences indicate suppression of blocks of consecutive hydrophobic residues

Protein Science ◽

10.1110/ps.33201 ◽

2001 ◽

Vol 10 (5) ◽

pp. 1023-1031 ◽

Cited By ~ 63

Author(s):

Russell Schwartz ◽

Sorin Istrail ◽

Jonathan King

Keyword(s):

Amino Acid ◽

Protein Sequences ◽

Globular Protein ◽

Hydrophobic Residues


Peer Review #1 of "Amino-acid site variability among natural and designed proteins (v0.1)"

10.7287/peerj.211v0.1/reviews/1 ◽

2013 ◽

Keyword(s):

Amino Acid ◽

Peer Review ◽

Acid Site ◽

Amino Acid Site ◽

Designed Proteins ◽

Site Variability


Peer Review #2 of "Amino-acid site variability among natural and designed proteins (v0.1)"

10.7287/peerj.211v0.1/reviews/2 ◽

2013 ◽

Keyword(s):

Amino Acid ◽

Peer Review ◽

Acid Site ◽

Amino Acid Site ◽

Designed Proteins ◽

Site Variability


Amino-acid site variability among natural and designed proteins

PeerJ ◽

10.7717/peerj.211 ◽

2013 ◽

Vol 1 ◽

pp. e211 ◽

Cited By ~ 16

Author(s):

Eleisha L. Jackson ◽

Noah Ollikainen ◽

Arthur W. Covert ◽

Tanja Kortemme ◽

Claus O. Wilke

Keyword(s):

Amino Acid ◽

Acid Site ◽

Amino Acid Site ◽

Designed Proteins ◽

Site Variability


Computational Analysis of Therapeutic Enzyme Uricase from Different Source Organisms

Current Proteomics ◽

10.2174/1570164616666190617165107 ◽

2020 ◽

Vol 17 (1) ◽

pp. 59-77

Author(s):

Anand Kumar Nelapati ◽

JagadeeshBabu PonnanEttiyappan

Keyword(s):

Uric Acid ◽

Amino Acid ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Protein Sequences ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Multiple Sequence ◽

Physiochemical Properties ◽

Pharmaceutical Industries

Background: Hyperuricemia and gout are conditions that result from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that reduces the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are found in many sources, including bacteria, fungi, yeast, plants and animals. Objective: The present study aims to elucidate the structure and physiochemical properties of uricase by in silico analysis. Methods: A total of sixty uricase amino acid sequences from different sources were obtained from NCBI, and several analyses were performed: multiple sequence alignment (MSA), homology search, phylogenetic relation, motif search, domain architecture, and physiochemical properties including pI, EC, Ai, and Ii. Results: Multiple sequence alignment of all the selected protein sequences showed distinct differences between bacterial, fungal, plant and animal sources, based on the position-specific occurrence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. Within individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed from the amino acid sequences revealed clusters indicating that the uricases come from different sources. The physiochemical features showed that the uricases are 300-338 amino acid residues long, with molecular weights of 33-39 kDa and theoretical pI ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency of 8.79 percent relative to the other amino acids across all analyzed species. Conclusion: Within the field of bioinformatics, this work may be informative and serve as a stepping stone for other researchers seeking to understand the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
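The amino acid composition result reported here (an average valine frequency across species) can be illustrated with a short calculation. This is a minimal sketch that assumes composition is computed as a percentage over the 20 standard residues in each sequence and then averaged across sequences. The abstract does not state the exact procedure, and the example sequences are hypothetical, not uricase data.

```python
from collections import Counter

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(sequence):
    """Percentage of each standard amino acid in one sequence."""
    counts = Counter(aa for aa in sequence.upper() if aa in STANDARD_AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


def average_composition(sequences):
    """Mean per-sequence percentage of each amino acid across sequences."""
    per_sequence = [composition_percent(seq) for seq in sequences]
    return {
        aa: sum(comp[aa] for comp in per_sequence) / len(per_sequence)
        for aa in STANDARD_AMINO_ACIDS
    }


# Hypothetical toy sequences, used only to show the arithmetic.
toy_sequences = ["VVAA", "VAAA"]
averages = average_composition(toy_sequences)
print(averages["V"])  # (50.0 + 25.0) / 2
```

Averaging per-sequence percentages weights every species equally regardless of sequence length. Pooling all residues before computing percentages is an alternative that would weight longer sequences more heavily.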


iAFP-gap-SMOTE: An Efficient Feature Extraction Scheme Gapped Dipeptide Composition is Coupled with an Oversampling Technique for Identification of Antifreeze Proteins

Letters in Organic Chemistry ◽

10.2174/1570178615666180816101653 ◽

2019 ◽

Vol 16 (4) ◽

pp. 294-302 ◽

Cited By ~ 6

Author(s):

Shahid Akbar ◽

Maqsood Hayat ◽

Muhammad Kabir ◽

Muhammad Iqbal

Keyword(s):

Feature Extraction ◽

Amino Acid ◽

Antifreeze Proteins ◽

Protein Sequences ◽

Sampling Technique ◽

Lower Class ◽

Success Rates ◽

Throughput Model ◽

Extraction Scheme ◽

Living Organisms

Antifreeze proteins (AFPs) play distinct roles in maintaining homeostasis in living organisms and protect their cells and bodies from freezing in extremely cold conditions. Owing to the high diversity of protein sequences and structures, discriminating AFPs from non-AFPs through experimental approaches is expensive and slow. It is therefore highly desirable to have an intelligent, high-throughput computational model that identifies AFPs quickly and accurately. To this end, a new predictor called “iAFP-gap-SMOTE” is proposed for the identification of AFPs. Protein sequences are represented using three numerical feature extraction schemes: Split Amino Acid Composition, G-gap Dipeptide Composition and Reduced Amino Acid Alphabet Composition. Classification hypotheses are usually biased towards the majority class when the dataset is imbalanced. The Synthetic Minority Over-sampling Technique (SMOTE) is therefore employed to increase the number of instances of the minority class and control the bias. A 10-fold cross-validation test is applied to assess the success rates of the “iAFP-gap-SMOTE” model. In the empirical investigation, the “iAFP-gap-SMOTE” model obtained 95.02% accuracy. The comparison suggests that the accuracy of the “iAFP-gap-SMOTE” model is higher than that of existing techniques in the literature. The proposed model may be helpful for the research community and academia.
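Of the three feature schemes named in this abstract, the g-gap dipeptide composition is the least self-explanatory. The sketch below uses the commonly used definition: for a gap g, count every ordered residue pair separated by exactly g intervening positions and normalize by the number of such pairs, giving a 400-dimensional vector. The abstract does not spell out its own formula, so this definition, the function name and the toy sequence are assumptions rather than the authors' implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order that defines the feature vector.
PAIRS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, g):
    """Frequencies of residue pairs separated by g intervening residues.

    g = 0 reduces to the ordinary (adjacent) dipeptide composition.
    Pairs containing a non-standard residue are skipped.
    """
    if g < 0:
        raise ValueError("g must be non-negative")
    sequence = sequence.upper()
    n_pairs = len(sequence) - g - 1
    if n_pairs <= 0:
        raise ValueError("sequence is too short for this gap")
    counts = dict.fromkeys(PAIRS, 0)
    valid = 0
    for i in range(n_pairs):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            valid += 1
    if valid == 0:
        raise ValueError("no pairs of standard amino acids found")
    return [counts[pair] / valid for pair in PAIRS]


# Hypothetical toy sequence: with g = 1 the pairs are (A, A) and (C, C).
features = g_gap_dipeptide_composition("ACAC", g=1)
print(len(features))
print(features[PAIRS.index("AA")], features[PAIRS.index("CC")])
```

A vector like this, computed for each sequence, would form one block of the feature matrix on which oversampling and classification are then performed.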
