Accurate prediction of transmembrane β-barrel proteins from sequences

Mapping Intimacies ◽

10.1101/006577 ◽

2014 ◽

Cited By ~ 2

Author(s):

Sikander Hayat ◽

Chris Sander ◽

Arne Elofsson ◽

Debora S. Marks

Keyword(s):

Membrane Proteins ◽

Outer Membrane Proteins ◽

3D Structure ◽

3D Models ◽

Biological Research ◽

Sequence Information ◽

Major Advance ◽

Sequence Alignments ◽

Successful Prediction ◽

Protein Biogenesis

Abstract: Transmembrane β-barrels are known to play major roles in substrate transport and protein biogenesis in gram-negative bacteria, chloroplasts and mitochondria. However, the exact number of transmembrane β-barrel families is unknown and experimental structure determination is challenging. In theory, if one knows the number of strands in the β-barrel, then determining the 3D structure of the barrel could be trivial, but current topology predictions do not predict accurate structures and are unable to give information beyond the β-strands in the barrel. Recent work has shown successful prediction of globular and alpha-helical membrane proteins from sequence alignments, by using high-ranked evolutionary couplings between residues as distance constraints to fold extended polypeptides. However, these methods have not addressed the calculation of the precise β-sheet hydrogen bonding that defines transmembrane β-barrels and would be required to fold these proteins successfully. Hence we developed a method (EVFold_BB) that can successfully model transmembrane β-barrels by combining evolutionary couplings with topology predictions. EVFold_BB is validated by the accurate all-atom 3D modeling of 18 proteins, representing all known membrane β-barrel families that have sufficient sequences available. To demonstrate the potential of our approach we predict the unknown 3D structure of the LptD protein; the plausibility of this model is supported by the blindly predicted benchmarks, and the model is consistent with experimental observations. Our approach can naturally be extended to all unknown β-barrel proteins with sufficient sequence information.
Significance: EVFold_BB quickly predicts accurate 3D models of large membrane β-barrels that are notoriously hard to solve experimentally. The major advance is the use of evolutionary couplings from sequence alignments together with β-strand prediction to ascertain the accurate hydrogen bonding between the β-strands that gives rise to the canonical barrel shapes. The method will enable biological research into outer-membrane proteins.


All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1419956112 ◽

2015 ◽

Vol 112 (17) ◽

pp. 5413-5418 ◽

Cited By ~ 41

Author(s):

Sikander Hayat ◽

Chris Sander ◽

Debora S. Marks ◽

Arne Elofsson

Keyword(s):

Structure Prediction ◽

De Novo ◽

3D Structure ◽

3D Models ◽

Sequence Information ◽

Sequence Alignments ◽

Residue Contacts ◽

Machine Learning Approach ◽

3D Structure Prediction ◽

Structure Accuracy

Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher fraction (44%) of residue pairs in correct strand–strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for the FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.
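The TM scores quoted in this abstract follow the standard template-modeling score, which averages a distance-dependent term over the residues of the target. The Python sketch below is illustrative only and is not taken from EVfold_bb: it assumes the model has already been superimposed on the experimental structure and that `distances` holds the per-residue Cα deviations in ångströms, whereas the real TM-score program also searches for the best superposition. The names `tm_d0` and `tm_score` and the 0.5 Å floor on the distance scale are assumptions of this sketch.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    # Assumption of this sketch: very short chains fall back to a 0.5 A floor,
    # which also avoids taking a cube root of a non-positive number.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances     -- deviations (angstroms) of the aligned residue pairs
    target_length -- number of residues in the target (reference) structure
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    # Each aligned pair contributes 1 / (1 + (d / d0)^2); unaligned residues
    # contribute nothing because the sum is normalised by the target length.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # A perfect model of a 100-residue target scores 1.0.
    print(tm_score([0.0] * 100, 100))
```

Because the sum is divided by the target length rather than by the number of aligned residues, a model that covers only half of the target perfectly scores 0.5.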


EVfold.org: Evolutionary Couplings and Protein 3D Structure Prediction

10.1101/021022 ◽

2015 ◽

Cited By ~ 14

Author(s):

Robert Sheridan ◽

Robert J. Fieldhouse ◽

Sikander Hayat ◽

Yichao Sun ◽

Yevgeniy Antipin ◽

...

Keyword(s):

Protein Function ◽

Structure Prediction ◽

De Novo ◽

3D Structure ◽

Sequence Information ◽

Major Advance ◽

Sequence Alignments ◽

Multiple Sequence ◽

Genomic Databases ◽

Multiple Sequence Alignments

Recently developed maximum entropy methods infer evolutionary constraints on protein function and structure from the millions of protein sequences available in genomic databases. The EVfold web server (at EVfold.org) makes these methods available to predict functional and structural interactions in proteins. The key algorithmic development has been to disentangle direct and indirect residue-residue correlations in large multiple sequence alignments and derive direct residue-residue evolutionary couplings (EVcouplings or ECs). For proteins of unknown structure, distance constraints obtained from evolutionary couplings between residue pairs are used to predict all-atom 3D structures de novo, often to good accuracy. Given sufficient sequence information in a protein family, this is a major advance toward solving the problem of computing the native 3D fold of proteins from sequence information alone. Availability: EVfold server at http://evfold.org/. Contact: [email protected]
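To make "residue-residue correlations in multiple sequence alignments" concrete, the Python sketch below computes the mutual information between two alignment columns. This is explicitly not the EVfold method: mutual information is a local, pairwise statistic that mixes direct and indirect correlations, which is the limitation the maximum entropy approach described above is meant to address. The function name and the toy alignments are assumptions made for illustration.

```python
from collections import Counter
from math import log


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    msa -- list of equal-length aligned sequences (strings)
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    i_counts = Counter(seq[i] for seq in msa)
    j_counts = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n
        # Compare the observed pair frequency with the product of the
        # single-column frequencies expected under independence.
        mi += p_ab * log(p_ab / ((i_counts[a] / n) * (j_counts[b] / n)))
    return mi


if __name__ == "__main__":
    covarying = ["AC", "AC", "GT", "GT"]    # column 1 always tracks column 0
    independent = ["AC", "AT", "GC", "GT"]  # every combination equally often
    print(column_mutual_information(covarying, 0, 1))
    print(column_mutual_information(independent, 0, 1))
```

In the toy data, the covarying columns give log 2 and the independent columns give 0. A global model is needed to decide whether a high value reflects a direct coupling or a chain of correlations through other positions.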


e-Membranome: a Database for Genome-Wide Analysis of Escherichia coli Outer Membrane Proteins

Current Pharmaceutical Biotechnology ◽

10.2174/1389201021666200610105549 ◽

2020 ◽

Vol 21 ◽

Author(s):

Kang Mo Lee ◽

Seung-Hak Cho ◽

Cheorl-Ho Kim ◽

Jong Hyun Kim ◽

Sung Soon Kim

Keyword(s):

Escherichia Coli ◽

Membrane Proteins ◽

Outer Membrane ◽

Outer Membrane Proteins ◽

3D Structure ◽

Glycan Array ◽

Epitope Region ◽

E Coli ◽

Genome Wide ◽

A Genome

Objectives: Lectin-like adhesins of enteric bacterial pathogens such as Escherichia coli are an attractive target for vaccine or drug development. Here, we have developed e-Membranome as a database of genome-wide putative adhesins in Escherichia coli (E. coli). Methods: The outer membrane adhesins were predicted from the annotated genes of Escherichia coli strains using the PSORTb program. Further analysis was performed using InterProScan and the STRING database. The candidate proteins can be investigated by homology modeling of the three-dimensional (3D) structure (I-TASSER version 5.1), epitope region prediction (ABCpred), and the glycan array. Results: e-Membranome is implemented using the Django (version 2.2.5) framework. The Web Application Server Apache Tomcat 6.0 is integrated in the platform on Ubuntu Linux (version 16.04). MySQL (version 5.7) is used as the database engine. The homology model of the 3D structure, the epitope region, and the affinity information from the glycan array will be stored in the e-Membranome database. As a case study, we performed a genome-wide screening of outer membrane-embedded proteins from the annotated genes of E. coli using the e-Membranome pipeline. Conclusion: This platform is expected to be a valuable resource for advancing research on outer membrane proteins and for constructing the lectin-glycan interaction network of E. coli. In addition, the e-Membranome pipeline can be extended to other similar biological systems that need to address host-pathogen interactions.


3D RNA from evolutionary couplings

10.1101/028456 ◽

2015 ◽

Author(s):

Caleb Weinreb ◽

Torsten Gross ◽

Chris Sander ◽

Debora S Marks

Keyword(s):

Molecular Dynamics ◽

Structure Prediction ◽

Probability Model ◽

3D Structure ◽

Cell Physiology ◽

Sequence Information ◽

Sequence Alignments ◽

Protein Coding ◽

Promising Alternative ◽

Tertiary Contacts

Non-protein-coding RNAs are ubiquitous in cell physiology, with a diverse repertoire of known functions. In fact, the majority of the eukaryotic genome does not code for proteins, and thousands of conserved long non-protein-coding RNAs of currently unknown function have been identified. When available, knowledge of their 3D structure is very helpful in elucidating the function of these RNAs. However, despite some outstanding structure elucidation of RNAs using X-ray crystallography, NMR and cryoEM, learning RNA 3D structures remains low-throughput. RNA structure prediction in silico is a promising alternative approach and works well for double-helical stems, but full 3D structure determination requires tertiary contacts outside of secondary structures that are difficult to infer from sequence information. Here, based only on information from RNA multiple sequence alignments, we use a global statistical sequence probability model of co-variation in pairs of nucleotide positions to detect 3D contacts, in analogy to recently developed breakthrough methods for computational protein folding. In blinded tests on 22 known RNA structures ranging in size from 65 to 1800 nucleotides, the predicted contacts matched physical nucleotide interactions with 65-95% true positive prediction accuracy. Importantly, we infer many long-range tertiary contacts, including non-Watson-Crick interactions, where secondary structure elements assemble in 3D. When used as restraints in molecular dynamics simulations, the inferred contacts improve RNA 3D structure prediction to a coordinate error as low as 6 to 10 angstrom RMSD in atom positions, with potential for further refinement by molecular dynamics. These contacts include functionally important interactions, such as those that distinguish the active and inactive conformations of four riboswitches.
In blind prediction mode, we present evolutionary couplings suitable for folding simulations for 180 RNAs of unknown structure, available at https://marks.hms.harvard.edu/ev_rna/. We anticipate that this approach can help shed light on the structure and function of non-protein-coding RNAs as well as 3D-structured mRNAs.
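The "true positive prediction accuracy" reported in this abstract can be read as the fraction of predicted contacts that are also contacts in the known structure. The Python sketch below shows that bookkeeping under stated assumptions: contacts are given as pairs of position indices, pairs are unordered, and the set of true contacts has already been derived from the experimental structure by some distance criterion that is not specified here. The function name and the toy numbers are hypothetical.

```python
def contact_precision(predicted_pairs, true_contacts):
    """Fraction of predicted contacts that occur in the reference structure.

    predicted_pairs -- iterable of (i, j) position pairs, e.g. top-ranked couplings
    true_contacts   -- iterable of (i, j) pairs observed in the known structure
    """
    # frozenset makes (i, j) and (j, i) compare equal.
    truth = {frozenset(pair) for pair in true_contacts}
    predicted = [frozenset(pair) for pair in predicted_pairs]
    if not predicted:
        raise ValueError("no predicted contacts to evaluate")
    hits = sum(1 for pair in predicted if pair in truth)
    return hits / len(predicted)


if __name__ == "__main__":
    predicted = [(1, 10), (2, 9), (3, 30)]
    observed = [(10, 1), (2, 9), (4, 7)]
    # Two of the three predictions are found in the structure.
    print(contact_precision(predicted, observed))
```

In practice this value depends on how many top-ranked couplings are evaluated and on the distance cutoff used to define a true contact, so both choices should be reported alongside it.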


Global Analysis of Outer Membrane Proteins from Leptospira interrogans Serovar Lai

Infection and Immunity ◽

10.1128/iai.70.5.2311-2318.2002 ◽

2002 ◽

Vol 70 (5) ◽

pp. 2311-2318 ◽

Cited By ~ 129

Author(s):

Paul A. Cullen ◽

Stuart J. Cordwell ◽

Dieter M. Bulach ◽

David A. Haake ◽

Ben Adler

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Global Analysis ◽

Outer Membrane Proteins ◽

Calf Serum ◽

Peptide Sequence ◽

Sequence Information ◽

Leptospira Interrogans ◽

Iron Depletion ◽

Cellular Location

ABSTRACT Recombinant leptospiral outer membrane proteins (OMPs) can elicit immunity to leptospirosis in a hamster infection model. Previously characterized OMPs appear highly conserved, and thus their potential to stimulate heterologous immunity is of critical importance. In this study we undertook a global analysis of leptospiral OMPs, which were obtained by Triton X-114 extraction and phase partitioning. Outer membrane fractions were isolated from Leptospira interrogans serovar Lai grown at 20, 30, and 37°C with or without 10% fetal calf serum and, finally, in iron-depleted medium. The OMPs were separated by two-dimensional gel electrophoresis. Gel patterns from each of the five conditions were compared via image analysis, and 37 gel-purified proteins were tryptically digested and characterized by mass spectrometry (MS). Matrix-assisted laser desorption ionization-time-of-flight MS was used to rapidly identify leptospiral OMPs present in sequence databases. Proteins identified by this approach included the outer membrane lipoproteins LipL32, LipL36, LipL41, and LipL48. No known proteins from any cellular location other than the outer membrane were identified. Tandem electrospray MS was used to obtain peptide sequence information from eight novel proteins designated pL18, pL21, pL22, pL24, pL45, pL47/49, pL50, and pL55. The expression of LipL36 and pL50 was not apparent at temperatures above 30°C or under iron-depleted conditions. The expression of pL24 was also downregulated after iron depletion. The leptospiral major OMP LipL32 was observed to undergo substantial cleavage under all conditions except iron depletion. Additionally, significant downregulation of these mass forms was observed under iron limitation at 30°C, but not at 30°C alone, suggesting that LipL32 processing is dependent on iron-regulated extracellular proteases. 
However, separate cleavage products responded differently to changes in growth temperature and medium constituents, indicating that more than one process may be involved in LipL32 processing. Furthermore, under iron-depleted conditions there was no concomitant increase in the levels of the intact form of LipL32. The temperature- and iron-regulated expression of LipL36 and the iron-dependent cleavage of LipL32 were confirmed by immunoblotting with specific antisera. Global analysis of the cellular location and expression of leptospiral proteins will be useful in the annotation of genomic sequence data and in providing insight into the biology of Leptospira.


Outer membrane protein biogenesis in Gram-negative bacteria

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2015.0023 ◽

2015 ◽

Vol 370 (1679) ◽

pp. 20150023 ◽

Cited By ~ 82

Author(s):

Sarah E. Rollauer ◽

Moloud A. Sooreshjani ◽

Nicholas Noinaj ◽

Susan K. Buchanan

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Outer Membrane Proteins ◽

Cell Signalling ◽

Gram Negative Bacteria ◽

Defence Mechanisms ◽

Gram Negative ◽

Fully Integrated ◽

The Past ◽

Protein Biogenesis

Gram-negative bacteria contain a double membrane which serves both for protection and for providing nutrients for viability. The outermost of these membranes is called the outer membrane (OM), and it contains a host of fully integrated membrane proteins which serve essential functions for the cell, including nutrient uptake, cell adhesion, cell signalling and waste export. For pathogenic strains, many of these outer membrane proteins (OMPs) also serve as virulence factors for nutrient scavenging and evasion of host defence mechanisms. OMPs are unique membrane proteins in that they have a β-barrel fold and can range in size from 8 to 26 strands, yet can still serve many different functions for the cell. Despite their essential roles in cell survival and virulence, the exact mechanism for the biogenesis of these OMPs into the OM has remained largely unknown. However, the past decade has witnessed significant progress towards unravelling the pathways and mechanisms necessary for moulding a nascent polypeptide into a functional OMP within the OM. Here, we review some of these recent discoveries that have advanced our understanding of the biogenesis of OMPs in Gram-negative bacteria, from synthesis in the cytoplasm to folding and insertion into the OM.


Identification and characterization of the major outer membrane proteins of Haemophilus parasuis for the rational development of a vaccine candidate and diagnostic for Glässer's disease

10.31274/etd-180810-3675 ◽

2011 ◽

Author(s):

Mandy Kay Zimmerli

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Vaccine Candidate ◽

Outer Membrane Proteins ◽

Haemophilus Parasuis ◽

Rational Development ◽

Glässer's Disease ◽

Glasser's Disease ◽

Identification And Characterization


Faculty Opinions recommendation of Cloning, expression and immunogenicity analysis of five outer membrane proteins of Vibrio parahaemolyticus zj2003.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1089015.542160 ◽

2007 ◽

Author(s):

Nitaya Thammapalerd

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Vibrio Parahaemolyticus ◽

Outer Membrane Proteins


Faculty Opinions recommendation of Supramolecular assemblies underpin turnover of outer membrane proteins in bacteria.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.725547362.793508247 ◽

2015 ◽

Author(s):

Chris Whitfield

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Outer Membrane Proteins ◽

Supramolecular Assemblies


The Conservation of Low Complexity Regions in Bacterial Proteins Depends on the Pathogenicity of the Strain and Subcellular Location of the Protein

Genes ◽

10.3390/genes12030451 ◽

2021 ◽

Vol 12 (3) ◽

pp. 451

Author(s):

Pablo Mier ◽

Miguel A. Andrade-Navarro

Keyword(s):

Membrane Proteins ◽

Outer Membrane ◽

Bacterial Species ◽

Outer Membrane Proteins ◽

Subcellular Location ◽

Low Complexity ◽

Extracellular Proteins ◽

Bacterial Strains ◽

Bacterial Proteins ◽

Protein Subcellular Location

Low complexity regions (LCRs) in proteins are characterized by amino acid frequencies that differ from the average. These regions evolve faster and tend to be less conserved between homologs than globular domains. They are not common in bacteria, as compared to their prevalence in eukaryotes. Studying their conservation could help provide hypotheses about their function. To obtain the appropriate evolutionary focus for this rapidly evolving feature, here we study the conservation of LCRs in bacterial strains and compare their high variability to the closeness of the strains. For this, we selected 20 taxonomically diverse bacterial species and obtained the completely sequenced proteomes of two strains per species. We calculated all orthologous pairs for each of the 20 strain pairs. Per orthologous pair, we computed the conservation of two types of LCRs: compositionally biased regions (CBRs) and homorepeats (polyX). Our results show that, in bacteria, Q-rich CBRs are the most conserved, while A-rich CBRs and polyA are the most variable. LCRs have generally higher conservation when comparing pathogenic strains. However, this result depends on protein subcellular location: LCRs accumulate in extracellular and outer membrane proteins, with conservation increased in the extracellular proteins of pathogens, and decreased for polyX in the outer membrane proteins of pathogens. We conclude that these dependencies support the functional importance of LCRs in host–pathogen interactions.
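One of the two LCR types named in this abstract, homorepeats (polyX), is simple enough to illustrate directly. The Python sketch below finds runs of a single repeated amino acid with a regular expression. It is a minimal illustration, not the authors' pipeline: the minimum run length of 4 is an arbitrary assumption, perfect runs only are detected, and compositionally biased regions (CBRs) would need a separate composition-based test that is not shown.

```python
import re


def find_homorepeats(sequence, min_len=4):
    """Return (residue, start, end) for each run of one repeated residue.

    start is 0-based and end is exclusive, as in Python slicing.
    min_len is an assumed threshold, not a value from the paper.
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), m.end()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Toy sequence with a polyA run and a polyQ run.
    print(find_homorepeats("MKAAAAAQLQQQQT"))
    # → [('A', 2, 7), ('Q', 9, 13)]
```

Comparing the runs found in two orthologous sequences, for example by residue type and length, is one simple way to ask whether a homorepeat is conserved between strains.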
