Accurate prediction of transmembrane β-barrel proteins from sequences

2014 ◽  
Author(s):  
Sikander Hayat ◽  
Chris Sander ◽  
Arne Elofsson ◽  
Debora S. Marks

Abstract: Transmembrane β-barrels are known to play major roles in substrate transport and protein biogenesis in gram-negative bacteria, chloroplasts and mitochondria. However, the exact number of transmembrane β-barrel families is unknown and experimental structure determination is challenging. In theory, if one knows the number of strands in the β-barrel, then modeling the 3D structure of the barrel could be trivial, but current topology predictions do not predict accurate structures and are unable to give information beyond the β-strands in the barrel. Recent work has shown successful prediction of globular and alpha-helical membrane proteins from sequence alignments, by using high-ranked evolutionary couplings between residues as distance constraints to fold extended polypeptides. However, these methods have not addressed the calculation of the precise β-sheet hydrogen bonding that defines transmembrane β-barrels and would be required to fold these proteins successfully. Hence we developed a method (EVFold_BB) that can successfully model transmembrane β-barrels by combining evolutionary couplings with topology predictions. EVFold_BB is validated by the accurate all-atom 3D modeling of 18 proteins, representing all known membrane β-barrel families that have sufficient sequences available. To demonstrate the potential of our approach, we predict the unknown 3D structure of the LptD protein; the plausibility of the model is supported by the blindly predicted benchmarks, and it is consistent with experimental observations. Our approach can naturally be extended to all unknown β-barrel proteins with sufficient sequence information.
Significance: EVFold_BB produces fast, accurate 3D models of large membrane β-barrels that are notoriously hard to solve experimentally. The major advance is the use of evolutionary couplings from sequence alignments together with the β-strand prediction to ascertain accurate hydrogen bonding between the β-strands, which gives rise to the canonical barrel shapes. The method will enable biological research into outer-membrane proteins.

2015 ◽  
Vol 112 (17) ◽  
pp. 5413-5418 ◽  
Author(s):  
Sikander Hayat ◽  
Chris Sander ◽  
Debora S. Marks ◽  
Arne Elofsson

Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand–strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
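
The idea of combining residue couplings with predicted β-strands can be illustrated with a small sketch. It is not the EVfold_bb procedure itself: the function names, the tuple layouts and the rule "keep only couplings between neighbouring strands, treating the barrel as a closed ring" are assumptions made here for illustration.

```python
from typing import List, Tuple

Strand = Tuple[int, int]            # inclusive (start, end) residue indices (assumed layout)
Coupling = Tuple[int, int, float]   # (residue i, residue j, coupling score) (assumed layout)


def strand_of(residue: int, strands: List[Strand]) -> int:
    """Return the index of the strand containing `residue`, or -1 if none."""
    for idx, (start, end) in enumerate(strands):
        if start <= residue <= end:
            return idx
    return -1


def adjacent_strand_couplings(couplings: List[Coupling],
                              strands: List[Strand]) -> List[Coupling]:
    """Keep couplings whose two residues sit on neighbouring strands.

    Strands are treated as a closed ring, so the last strand counts as a
    neighbour of the first. Results are sorted by descending score.
    """
    n = len(strands)
    kept = []
    for i, j, score in couplings:
        a, b = strand_of(i, strands), strand_of(j, strands)
        if a < 0 or b < 0 or a == b:
            continue  # outside any strand, or both residues on the same strand
        if (a - b) % n in (1, n - 1):
            kept.append((i, j, score))
    return sorted(kept, key=lambda c: c[2], reverse=True)


# Hypothetical four-strand barrel and a handful of made-up couplings.
strands = [(1, 5), (10, 14), (20, 24), (30, 34)]
couplings = [(2, 12, 0.9), (3, 22, 0.8), (1, 33, 0.7), (2, 4, 0.95), (7, 12, 0.5)]
print(adjacent_strand_couplings(couplings, strands))
# → [(2, 12, 0.9), (1, 33, 0.7)]
```

In this toy input, the pair (3, 22) is dropped because strands 1 and 3 are not neighbours, while (1, 33) is kept because the last strand closes the ring with the first.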


2015 ◽  
Author(s):  
Robert Sheridan ◽  
Robert J. Fieldhouse ◽  
Sikander Hayat ◽  
Yichao Sun ◽  
Yevgeniy Antipin ◽  
...  

Recently developed maximum entropy methods infer evolutionary constraints on protein function and structure from the millions of protein sequences available in genomic databases. The EVfold web server (at EVfold.org) makes these methods available to predict functional and structural interactions in proteins. The key algorithmic development has been to disentangle direct and indirect residue-residue correlations in large multiple sequence alignments and derive direct residue-residue evolutionary couplings (EVcouplings or ECs). For proteins of unknown structure, distance constraints obtained from evolutionary couplings between residue pairs are used to predict all-atom 3D structures de novo, often to good accuracy. Given sufficient sequence information in a protein family, this is a major advance toward solving the problem of computing the native 3D fold of proteins from sequence information alone. Availability: EVfold server at http://evfold.org/
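
The difference between direct and indirect correlations can be shown with a toy Gaussian analogy. This is not the EVfold model, which works on amino acid states in sequence alignments; the three-variable chain, the sample size and all function names below are assumptions chosen for illustration. Variable a influences b, and b influences c, so a and c are correlated only through b. Inverting the covariance matrix yields partial correlations, in which that indirect a–c signal largely disappears.

```python
import random


def covariance(columns):
    """Sample covariance matrix for a list of equally long columns."""
    n = len(columns[0])
    k = len(columns)
    means = [sum(col) / n for col in columns]
    return [[sum((columns[a][t] - means[a]) * (columns[b][t] - means[b])
                 for t in range(n)) / (n - 1)
             for b in range(k)] for a in range(k)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(k)]
           for r, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def normalise(matrix, sign=1.0):
    """Scale each entry by the diagonal: sign * m_ab / sqrt(m_aa * m_bb)."""
    k = len(matrix)
    return [[sign * matrix[a][b] / (matrix[a][a] * matrix[b][b]) ** 0.5
             for b in range(k)] for a in range(k)]


random.seed(0)
n_samples = 20000
a = [random.gauss(0, 1) for _ in range(n_samples)]
b = [x + random.gauss(0, 1) for x in a]   # b depends directly on a
c = [y + random.gauss(0, 1) for y in b]   # c depends directly on b only

cov = covariance([a, b, c])
correlation = normalise(cov)                 # direct and indirect signal mixed
partial = normalise(invert(cov), sign=-1.0)  # off-diagonals: direct signal only

print("corr(a, c)    = %.2f" % correlation[0][2])
print("partial(a, c) = %.2f" % partial[0][2])
```

Only the off-diagonal entries of `partial` are meaningful here. The raw correlation between a and c is substantial even though the two never interact directly, whereas their partial correlation is close to zero.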


Author(s):  
Kang Mo Lee ◽  
Seung-Hak Cho ◽  
Cheorl-Ho Kim ◽  
Jong Hyun Kim ◽  
Sung Soon Kim

Objectives: Lectin-like adhesins of enteric bacterial pathogens such as Escherichia coli are an attractive target for vaccine or drug development. Here, we have developed e-Membranome as a database of genome-wide putative adhesins in Escherichia coli (E. coli). Methods: The outer membrane adhesins were predicted from the annotated genes of Escherichia coli strains using the PSORTb program. Further analysis was performed using Interproscan and the String database. The candidate proteins can be investigated for homology modeling of the three-dimensional (3D) structure (I-TASSER version 5.1), epitope region (ABCpred), and the glycan array. Results: e-Membranome is implemented using the Django (version 2.2.5) framework. The Web Application Server Apache Tomcat 6.0 is integrated in the platform on Ubuntu Linux (version 16.04). MySQL database (version 5.7) is used as a database engine. The information of homology model of the 3D structure, epitope region, and affinity information from the glycan array will be stored in the e-Membranome database. As a case study, we performed a genome-wide screening of outer membrane-embedded proteins from the annotated genes of E. coli using the e-Membranome pipeline. Conclusion: This platform is expected to be a valuable resource for advancing research of outer membrane proteins for the construction of lectin-glycan interaction network of E. coli. In addition, the e-Membranome pipeline can be extended to other similar biological systems that need to address host-pathogen interactions.


2015 ◽  
Author(s):  
Caleb Weinreb ◽  
Torsten Gross ◽  
Chris Sander ◽  
Debora S Marks

Non-protein-coding RNAs are ubiquitous in cell physiology, with a diverse repertoire of known functions. In fact, the majority of the eukaryotic genome does not code for proteins, and thousands of conserved long non-protein-coding RNAs of currently unknown function have been identified. When available, knowledge of their 3D structure is very helpful in elucidating the function of these RNAs. However, despite some outstanding structure elucidation of RNAs using X-ray crystallography, NMR and cryoEM, learning RNA 3D structures remains low-throughput. RNA structure prediction in silico is a promising alternative approach and works well for double-helical stems, but full 3D structure determination requires tertiary contacts outside of secondary structures that are difficult to infer from sequence information. Here, based only on information from RNA multiple sequence alignments, we use a global statistical sequence probability model of co-variation in pairs of nucleotide positions to detect 3D contacts, in analogy to recently developed breakthrough methods for computational protein folding. In blinded tests on 22 known RNA structures ranging in size from 65 to 1800 nucleotides, the predicted contacts matched physical nucleotide interactions with 65-95% true positive prediction accuracy. Importantly, we infer many long-range tertiary contacts, including non-Watson-Crick interactions, where secondary structure elements assemble in 3D. When used as restraints in molecular dynamics simulations, the inferred contacts improve RNA 3D structure prediction to a coordinate error as low as 6 to 10 Å rmsd in atom positions, with potential for further refinement by molecular dynamics. These contacts include functionally important interactions, such as those that distinguish the active and inactive conformations of four riboswitches. 
In blind prediction mode, we present evolutionary couplings suitable for folding simulations for 180 RNAs of unknown structure, available at https://marks.hms.harvard.edu/ev_rna/. We anticipate that this approach can help shed light on the structure and function of non-protein-coding RNAs as well as 3D-structured mRNAs.
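
A true positive rate for predicted contacts, as quoted above, is the fraction of top-ranked predicted pairs that are real contacts in the known structure. The sketch below shows one simple way to compute it; the function name, the input layout and the example pairs are assumptions for illustration and are not taken from the paper.

```python
def contact_precision(ranked_pairs, true_contacts, top_n):
    """Fraction of the `top_n` highest-ranked predicted pairs that are true contacts.

    `ranked_pairs` is a list of (i, j) position pairs sorted from strongest
    to weakest prediction; `true_contacts` is an iterable of (i, j) pairs
    observed in a reference structure. Pair order does not matter.
    """
    true_set = {frozenset(pair) for pair in true_contacts}
    top = ranked_pairs[:top_n]
    if not top:
        return 0.0
    hits = sum(1 for pair in top if frozenset(pair) in true_set)
    return hits / len(top)


# Made-up example: four predictions, three of which match the reference.
predicted = [(1, 10), (2, 9), (3, 30), (4, 7)]
reference = [(10, 1), (9, 2), (7, 4)]
print(contact_precision(predicted, reference, top_n=4))
# → 0.75
```

Using `frozenset` makes the comparison symmetric, so (1, 10) and (10, 1) count as the same contact.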


2002 ◽  
Vol 70 (5) ◽  
pp. 2311-2318 ◽  
Author(s):  
Paul A. Cullen ◽  
Stuart J. Cordwell ◽  
Dieter M. Bulach ◽  
David A. Haake ◽  
Ben Adler

ABSTRACT Recombinant leptospiral outer membrane proteins (OMPs) can elicit immunity to leptospirosis in a hamster infection model. Previously characterized OMPs appear highly conserved, and thus their potential to stimulate heterologous immunity is of critical importance. In this study we undertook a global analysis of leptospiral OMPs, which were obtained by Triton X-114 extraction and phase partitioning. Outer membrane fractions were isolated from Leptospira interrogans serovar Lai grown at 20, 30, and 37°C with or without 10% fetal calf serum and, finally, in iron-depleted medium. The OMPs were separated by two-dimensional gel electrophoresis. Gel patterns from each of the five conditions were compared via image analysis, and 37 gel-purified proteins were tryptically digested and characterized by mass spectrometry (MS). Matrix-assisted laser desorption ionization-time-of-flight MS was used to rapidly identify leptospiral OMPs present in sequence databases. Proteins identified by this approach included the outer membrane lipoproteins LipL32, LipL36, LipL41, and LipL48. No known proteins from any cellular location other than the outer membrane were identified. Tandem electrospray MS was used to obtain peptide sequence information from eight novel proteins designated pL18, pL21, pL22, pL24, pL45, pL47/49, pL50, and pL55. The expression of LipL36 and pL50 was not apparent at temperatures above 30°C or under iron-depleted conditions. The expression of pL24 was also downregulated after iron depletion. The leptospiral major OMP LipL32 was observed to undergo substantial cleavage under all conditions except iron depletion. Additionally, significant downregulation of these mass forms was observed under iron limitation at 30°C, but not at 30°C alone, suggesting that LipL32 processing is dependent on iron-regulated extracellular proteases. 
However, separate cleavage products responded differently to changes in growth temperature and medium constituents, indicating that more than one process may be involved in LipL32 processing. Furthermore, under iron-depleted conditions there was no concomitant increase in the levels of the intact form of LipL32. The temperature- and iron-regulated expression of LipL36 and the iron-dependent cleavage of LipL32 were confirmed by immunoblotting with specific antisera. Global analysis of the cellular location and expression of leptospiral proteins will be useful in the annotation of genomic sequence data and in providing insight into the biology of Leptospira.


2015 ◽  
Vol 370 (1679) ◽  
pp. 20150023 ◽  
Author(s):  
Sarah E. Rollauer ◽  
Moloud A. Sooreshjani ◽  
Nicholas Noinaj ◽  
Susan K. Buchanan

Gram-negative bacteria contain a double membrane which serves for both protection and for providing nutrients for viability. The outermost of these membranes is called the outer membrane (OM), and it contains a host of fully integrated membrane proteins which serve essential functions for the cell, including nutrient uptake, cell adhesion, cell signalling and waste export. For pathogenic strains, many of these outer membrane proteins (OMPs) also serve as virulence factors for nutrient scavenging and evasion of host defence mechanisms. OMPs are unique membrane proteins in that they have a β-barrel fold and can range in size from 8 to 26 strands, yet can still serve many different functions for the cell. Despite their essential roles in cell survival and virulence, the exact mechanism for the biogenesis of these OMPs into the OM has remained largely unknown. However, the past decade has witnessed significant progress towards unravelling the pathways and mechanisms necessary for moulding a nascent polypeptide into a functional OMP within the OM. Here, we will review some of these recent discoveries that have advanced our understanding of the biogenesis of OMPs in Gram-negative bacteria, starting with synthesis in the cytoplasm to folding and insertion into the OM.


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 451
Author(s):  
Pablo Mier ◽  
Miguel A. Andrade-Navarro

Low complexity regions (LCRs) in proteins are characterized by amino acid frequencies that differ from the average. These regions evolve faster and tend to be less conserved between homologs than globular domains. They are not common in bacteria, as compared to their prevalence in eukaryotes. Studying their conservation could help provide hypotheses about their function. To obtain the appropriate evolutionary focus for this rapidly evolving feature, here we study the conservation of LCRs in bacterial strains and compare their high variability to the closeness of the strains. For this, we selected 20 taxonomically diverse bacterial species and obtained the completely sequenced proteomes of two strains per species. We calculated all orthologous pairs for each of the 20 strain pairs. Per orthologous pair, we computed the conservation of two types of LCRs: compositionally biased regions (CBRs) and homorepeats (polyX). Our results show that, in bacteria, Q-rich CBRs are the most conserved, while A-rich CBRs and polyA are the most variable. LCRs have generally higher conservation when comparing pathogenic strains. However, this result depends on protein subcellular location: LCRs accumulate in extracellular and outer membrane proteins, with conservation increased in the extracellular proteins of pathogens, and decreased for polyX in the outer membrane proteins of pathogens. We conclude that these dependencies support the functional importance of LCRs in host–pathogen interactions.
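
Homorepeats (polyX) are runs of a single repeated amino acid, so they can be located with a short regular expression. The sketch below is only an illustration: the minimum run length of 4 is an assumed threshold, not one stated in the abstract, and the function name and example sequence are hypothetical.

```python
import re


def find_homorepeats(sequence, min_length=4):
    """Return (residue, start, length) for each run of >= min_length identical residues.

    `start` is a 0-based index into `sequence`. The default threshold of 4
    is an assumption for illustration.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # ([A-Z]) captures one residue; \1{k,} requires at least k more copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [(m.group(1), m.start(), m.end() - m.start())
            for m in pattern.finditer(sequence)]


print(find_homorepeats("MKQQQQQLAAAAGSS"))
# → [('Q', 2, 5), ('A', 8, 4)]
```

The final "SS" is ignored because it is shorter than the assumed threshold. Comparing the output for two orthologous sequences would be one simple starting point for asking whether a given repeat is conserved.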

