Membrane-spanning α-helical barrels as tractable protein-design targets

Ai Niitsu; Jack W. Heal; Kerstin Fauland; Andrew R. Thomson; Derek N. Woolfson

doi:10.1098/rstb.2016.0213

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2016.0213 ◽

2017 ◽

Vol 372 (1726) ◽

pp. 20160213 ◽

Cited By ~ 11

Author(s):

Ai Niitsu ◽

Jack W. Heal ◽

Kerstin Fauland ◽

Andrew R. Thomson ◽

Derek N. Woolfson

Keyword(s):

Protein Structure ◽

Protein Design ◽

De Novo ◽

Protein Structures ◽

Coiled Coil ◽

Deep Understanding ◽

Data Bank ◽

Water Soluble ◽

Limiting Factor ◽

Membrane Spanning

The rational (de novo) design of membrane-spanning proteins lags behind that for water-soluble globular proteins. This is due to gaps in our knowledge of membrane-protein structure, and experimental difficulties in studying such proteins compared to water-soluble counterparts. One limiting factor is the small number of experimentally determined three-dimensional structures for transmembrane proteins. By contrast, many tens of thousands of globular protein structures provide a rich source of ‘scaffolds’ for protein design, and the means to garner sequence-to-structure relationships to guide the design process. The α-helical coiled coil is a protein-structure element found in both globular and membrane proteins, where it cements a variety of helix–helix interactions and helical bundles. Our deep understanding of coiled coils has enabled a large number of successful de novo designs. For one class, the α-helical barrels (that is, symmetric bundles of five or more helices with central accessible channels), there are both water-soluble and membrane-spanning examples. Recent computational designs of water-soluble α-helical barrels with five to seven helices have advanced the design field considerably. Here we identify and classify analogous and more complicated membrane-spanning α-helical barrels from the Protein Data Bank. These provide tantalizing but tractable targets for protein engineering and de novo protein design. This article is part of the themed issue ‘Membrane pores: from structure and assembly, to medicine and technology’.

Guidelines for the assembly of novel coiled-coil structures: α-sheets and α-cylinders

Biochemical Society Symposium ◽

10.1042/bss0680111 ◽

2001 ◽

Vol 68 ◽

pp. 111-123 ◽

Cited By ~ 6

Author(s):

John Walshaw ◽

Jennifer M. Shipway ◽

Derek N. Woolfson

Keyword(s):

Protein Design ◽

Protein Interactions ◽

Protein Structures ◽

Coiled Coil ◽

Data Bank ◽

Coiled Coils ◽

Heptad Repeat ◽

Protein Protein Interactions ◽

Helical Bundles ◽

Heptad Repeats

The coiled coil is a ubiquitous motif that guides many different protein-protein interactions. The accepted hallmark of coiled coils is a seven-residue (heptad) sequence repeat. The positions of this repeat are labelled a-b-c-d-e-f-g, with residues at a and d tending to be hydrophobic. Such sequences form amphipathic α-helices, which assemble into helical bundles via knobs-into-holes interdigitation of residues from neighbouring helices. We wrote an algorithm, SOCKET, to identify this packing in protein structures, and used this to gather a database of coiled-coil structures from the Protein Data Bank. Surprisingly, in addition to commonly accepted structures with a single, contiguous heptad repeat, we identified sequences with multiple, offset heptad repeats. These 'new' sequence patterns help to explain oligomer-state specification in coiled coils. Here we focus on the structural consequences for sequences with two heptad repeats offset by two residues, i.e. a/f′-b/g′-c/a′-d/b′-e/c′-f/d′-g/e′. This sets up two hydrophobic seams on opposite sides of the helix formed. We describe how such helices may combine to bury these hydrophobic surfaces in two different ways and form two distinct structures: open 'α-sheets' and closed 'α-cylinders'. We highlight these with descriptions of natural structures and outline possibilities for protein design.
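The double-register labelling quoted in this abstract can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, not the SOCKET algorithm: the function names are hypothetical, and the rule that hydrophobic residues sit at a and d in each register is taken from the abstract's own description of the heptad repeat.

```python
HEPTAD = "abcdefg"

def heptad_labels(n_residues, start="a"):
    """Return the heptad position letter for each residue, beginning at `start`."""
    first = HEPTAD.index(start)
    return [HEPTAD[(first + i) % 7] for i in range(n_residues)]

def paired_labels(n_residues):
    """Label each residue in two registers offset by two residues.

    The primed register begins at f' where the unprimed register begins at a,
    which reproduces the a/f'-b/g'-c/a'-... pattern given in the abstract.
    """
    primary = heptad_labels(n_residues, start="a")
    primed = heptad_labels(n_residues, start="f")
    return [f"{p}/{q}'" for p, q in zip(primary, primed)]

def hydrophobic_sites(n_residues):
    """Indices that fall on an a or d position in either register."""
    primary = heptad_labels(n_residues, start="a")
    primed = heptad_labels(n_residues, start="f")
    return [i for i in range(n_residues)
            if primary[i] in "ad" or primed[i] in "ad"]

if __name__ == "__main__":
    print("-".join(paired_labels(7)))
    print(hydrophobic_sites(7))
```

Under these assumptions, the hydrophobic sites within one heptad fall at positions a, c, d and f of the unprimed register: a/d forms one seam and c/f (that is, a′/d′) forms the other.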

RamaNet: Computational de novo helical protein backbone design using a long short-term memory generative neural network

10.1101/671552 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sari Sabban ◽

Mikhail Markovsky

Keyword(s):

Neural Network ◽

Protein Data Bank ◽

Protein Design ◽

Short Term Memory ◽

De Novo ◽

Protein Structures ◽

Data Bank ◽

Protein Backbone ◽

Helical Protein ◽

Long Short Term Memory

Abstract: The ability to perform de novo protein design will allow researchers to expand the variety of available proteins. By designing synthetic structures computationally, they can utilise more structures than those available in the Protein Data Bank, design structures that are not found in nature, or direct the design of proteins to acquire a specific desired structure. While some researchers attempt to design proteins from first physical and thermodynamic principles, we set out to test whether it is possible to perform de novo helical protein design of just the backbone statistically using machine learning, by building a model that uses a long short-term memory (LSTM) architecture. The LSTM model used only the ϕ and ψ angles of each residue from an augmented dataset of only helical protein structures. Though the network’s generated backbone structures were not perfect, they were idealised and evaluated after generation, with the non-ideal structures filtered out and the adequate structures kept. The results were successful in developing a logical, rigid, compact, helical protein backbone topology. This paper is a proof of concept that shows it is possible to generate a novel helical backbone topology with an LSTM neural network architecture that uses only the ϕ and ψ angles as features. The next step is to take these backbone topologies and sequence design them to form complete protein structures.

Author summary: This research project stemmed from the desire to expand the pool of protein structures that can be used as scaffolds in computational vaccine development, since the number of structures available from the Protein Data Bank was not sufficient to allow for great diversity and increase the probability of grafting a target motif onto a protein scaffold. Since a protein structure’s backbone can be defined by the ϕ and ψ angles of each amino acid in the polypeptide and can effectively translate a protein’s 3D structure into a table of numbers, and since protein structures are not random, this numerical representation of protein structures can be used to train a neural network to mathematically generalise what a protein structure is, and therefore generate a new protein backbone. Instead of using all proteins in the Protein Data Bank, a curated dataset was used, encompassing protein structures with specific characteristics that will, theoretically, allow them to be evaluated computationally. This paper details how a trained neural network was able to successfully generate helical protein backbones.
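The "table of numbers" described here rests on the standard torsion-angle calculation: ϕ is the dihedral over C(i−1), N, Cα, C and ψ is the dihedral over N, Cα, C, N(i+1). The sketch below shows that calculation in plain Python. It is not the RamaNet code; the `residues` input format and all function names are assumptions made for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (cis = 0, trans = 180)."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def backbone_torsions(residues):
    """Return one (phi, psi) pair per residue; angles undefined at chain ends are None.

    `residues` is a hypothetical input format: a list of dicts holding the
    N, CA and C atom coordinates of each residue as (x, y, z) tuples.
    """
    table = []
    for i, res in enumerate(residues):
        phi = psi = None
        if i > 0:
            phi = dihedral(residues[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < len(residues) - 1:
            psi = dihedral(res["N"], res["CA"], res["C"], residues[i + 1]["N"])
        table.append((phi, psi))
    return table

if __name__ == "__main__":
    # Made-up coordinates, used only to show the shape of the output table.
    toy_chain = [
        {"N": (0, 0, 0), "CA": (1, 0, 0), "C": (1, 1, 0)},
        {"N": (1, 1, 1), "CA": (2, 1, 1), "C": (2, 2, 1)},
    ]
    for phi, psi in backbone_torsions(toy_chain):
        print(phi, psi)
```

Each row of the returned table corresponds to one residue, so a whole backbone reduces to a list of angle pairs of the kind the abstract describes as network features.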

How coiled-coil assemblies accommodate multiple aromatic residues

10.1101/2021.02.01.429152 ◽

2021 ◽

Author(s):

Guto G. Rhys ◽

William M. Dawson ◽

Joseph L. Beesley ◽

Freddie J. O. Martin ◽

R. Leo Brady ◽

...

Keyword(s):

Hydrogen Bond ◽

Protein Design ◽

De Novo ◽

Complex Structure ◽

Protein Structures ◽

Coiled Coil ◽

Coiled Coils ◽

Aromatic Residues ◽

Hydrogen Bond Networks ◽

Rational Protein Design

Abstract: Rational protein design requires understanding the contribution of each amino acid to a targeted protein fold. For a subset of protein structures, namely the α-helical coiled coils (CCs), knowledge is sufficiently advanced to allow the rational de novo design of many structures, including entirely new protein folds. However, current CC design rules center on using aliphatic hydrophobic residues predominantly to drive the folding and assembly of amphipathic α helices. The consequences of introducing aromatic residues (which would be useful for adding structural probes, and binding and catalytic functionalities) into these interfaces are not understood. There are specific examples of designed CCs containing such aromatic residues, e.g., phenylalanine-rich sequences, and the use of polar aromatic residues to make buried hydrogen-bond networks. However, it is not known generally if sequences rich in tyrosine can form CCs, or what CC assemblies these would lead to. Here we explore tyrosine-rich sequences in a general CC-forming background and resolve new CC structures. In one of these, an antiparallel tetramer, the tyrosine residues are solvent accessible and pack at the interface between the core and the surface. In the other more-complex structure, the residues are buried and form an extended hydrogen-bond network.

A nanobody toolbox targeting dimeric coiled-coil modules for functionalization of designed protein origami structures

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2021899118 ◽

2021 ◽

Vol 118 (17) ◽

pp. e2021899118

Author(s):

Andreja Majerle ◽

San Hadži ◽

Jana Aupič ◽

Tadej Satler ◽

Fabio Lapenta ◽

...

Keyword(s):

Protein Design ◽

Self Assembly ◽

De Novo ◽

Protein Structures ◽

Coiled Coil ◽

Triangular Prism ◽

Single Chain ◽

Protein Cages ◽

X Ray Scattering ◽

Structure Relationship

Coiled-coil (CC) dimers are widely used in protein design because of their modularity and well-understood sequence–structure relationship. In CC protein origami design, a polypeptide chain is assembled from a defined sequence of CC building segments that determine the self-assembly of protein cages into polyhedral shapes, such as the tetrahedron, triangular prism, or four-sided pyramid. However, a targeted functionalization of the CC modules could significantly expand the versatility of protein origami scaffolds. Here, we describe a panel of single-chain camelid antibodies (nanobodies) directed against different CC modules of a de novo designed protein origami tetrahedron. We show that these nanobodies are able to recognize the same CC modules in different polyhedral contexts, such as isolated CC dimers, tetrahedra, triangular prisms, or trigonal bipyramids, thereby extending the ability to functionalize polyhedra with nanobodies in a desired stoichiometry. Crystal structures of five nanobody-CC complexes in combination with small-angle X-ray scattering show binding interactions between nanobodies and CC dimers forming the edges of a tetrahedron with the nanobody entering the tetrahedral cavity. Furthermore, we identified a pair of allosteric nanobodies in which the binding to the distant epitopes on the antiparallel homodimeric APH CC is coupled via a strong positive cooperativity. A toolbox of well-characterized nanobodies specific for CC modules provides a unique tool to target defined sites in the designed protein structures, thus opening numerous opportunities for the functionalization of CC protein origami polyhedra or CC-based bionanomaterials.

Protein structure prediction and design in a biologically-realistic implicit membrane

10.1101/630715 ◽

2019 ◽

Author(s):

Rebecca F. Alford ◽

Patrick J. Fleming ◽

Karen G. Fleming ◽

Jeffrey J. Gray

Keyword(s):

Protein Structure ◽

Amino Acid ◽

Membrane Proteins ◽

Membrane Protein ◽

Protein Structure Prediction ◽

Protein Design ◽

Structure Prediction ◽

De Novo ◽

Computational Design ◽

Amino Acid Distribution

Abstract: Protein design is a powerful tool for elucidating mechanisms of function and engineering new therapeutics and nanotechnologies. While soluble protein design has advanced, membrane protein design remains challenging due to difficulties in modeling the lipid bilayer. In this work, we developed an implicit approach that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. The model improves performance in computational benchmarks against experimental targets including prediction of protein orientations in the bilayer, ΔΔG calculations, native structure discrimination, and native sequence recovery. When applied to de novo protein design, this approach designs sequences with an amino acid distribution near the native amino acid distribution in membrane proteins, overcoming a critical flaw in previous membrane models that were prone to generating leucine-rich designs. Further, the proteins designed in the new membrane model exhibit native-like features including interfacial aromatic side chains, hydrophobic lengths compatible with bilayer thickness, and polar pores. Our method advances high-resolution membrane protein structure prediction and design toward tackling key biological questions and engineering challenges.

Significance Statement: Membrane proteins participate in many life processes including transport, signaling, and catalysis. They constitute over 30% of all proteins and are targets for over 60% of pharmaceuticals. Computational design tools for membrane proteins will transform the interrogation of basic science questions such as membrane protein thermodynamics and the pipeline for engineering new therapeutics and nanotechnologies. Existing tools are either too expensive to compute or rely on manual design strategies. In this work, we developed a fast and accurate method for membrane protein design. The tool is available to the public and will accelerate the experimental design pipeline for membrane proteins.

De Novo Design of Granulopoietic Proteins

Blood ◽

10.1182/blood-2020-138852 ◽

2020 ◽

Vol 136 (Supplement 1) ◽

pp. 34-35

Author(s):

Julia Skokowa ◽

Mohammad Elgamacy ◽

Patrick Müller

Keyword(s):

Protein Design ◽

De Novo ◽

Conflicts Of Interest ◽

Specific Activity ◽

Structural Information ◽

Coiled Coil ◽

Design Strategies ◽

Nmr Structure Determination ◽

Binding Epitope

Protein therapeutics are clinically developed and used as minorly engineered forms of their natural templates. This direct adoption of natural proteins in therapeutic contexts very frequently faces major challenges, including instability, poor solubility, and aggregation, which may result in undesired clinical outcomes. In contrast to classical protein engineering techniques, de novo protein design enables the introduction of radical sequence and structure manipulations, which can be used to address these challenges. In this work, we test the utility of two different design strategies to design novel granulopoietic proteins, using structural information from human granulocyte-colony stimulating factor (hG-CSF) as a template. The two strategies are: (1) an epitope rescaffolding strategy, where we migrate a tertiary structural epitope to simpler, idealised protein scaffolds (Fig. 1A-C), and (2) a topological refactoring strategy, where we change the protein fold by rearranging connections across the secondary structures and optimise the designed sequence of the new fold (Fig. 1A,D,E). Testing only eight designs, we obtained novel granulopoietic proteins that bind to the G-CSF receptor, have nanomolar activity in cell-based assays, and are highly thermostable and protease-resistant. NMR structure determination showed three designs to match their designed coordinates to within less than 2.5 Å. While the designs possessed starkly different sequence and structure from the native G-CSF, they showed very specific activity in differentiating primary human haematopoietic stem cells into fully mature granulocytes. Moreover, one design shows significant and specific activity in vivo in zebrafish and mice. These results are prospectively directing us to investigate the role of the dimerisation geometry of the G-CSF receptor on activation magnitude and downstream signalling pathways. More broadly, the results also motivate our ongoing work to design other haematopoietic agents. In conclusion, our findings highlight the utility of computational protein design as a highly effective and guided means for discovering novel receptor modulators, and for obtaining new mechanistic information about the target molecule.

Figure 1. Two different strategies to generate superfolding G-CSF designs. (A) X-ray structure of G-CSF (orange) bound to its cognate receptor (red) through its binding epitope (blue). According to the epitope rescaffolding strategy, (B) the critical binding epitope residues were disembodied and used as a geometric search query against the entire Protein Data Bank (PDB) to retrieve structurally compatible scaffolds. The top six compatible scaffold structures are shown in cartoon representation. (C) The top two templates chosen for sequence design were a de novo designed coiled-coil and a four-helix bundle with unknown function. The binding epitopes were grafted, and the scaffolds were optimised to rigidly host the guest epitope. (D-E) According to the topological refactoring strategy (D) the topology of the native G-CSF was rewired from around the fixed binding epitope, and then was further mutated to idealise the core residues (blue volume (E)) and residues distal from the binding epitope (orange crust (E)). Both strategies aimed at simplifying the topology, reducing the size, and rigidifying the bound epitope conformation through alternate means.

Disclosures: No relevant conflicts of interest to declare.

Protein designer David Baker: I like doing things that seem like magic

National Science Review ◽

10.1093/nsr/nwaa071 ◽

2020 ◽

Vol 7 (8) ◽

pp. 1410-1412

Author(s):

Weijie Zhao ◽

Chu Wang

Keyword(s):

Protein Design ◽

De Novo ◽

Protein Structures ◽

Computational Prediction ◽

Biological Functions ◽

Personal Experiences ◽

De Novo Protein Design ◽

And Function ◽

The University ◽

Opening Up

Abstract: Search ‘de novo protein design’ on Google and you will find the name David Baker in every result on the first page. Professor David Baker at the University of Washington and other scientists are opening up a new world of fantastic proteins. Protein is the direct executor of most biological functions and its structure and function are fully determined by its primary sequence. Baker's group developed the Rosetta software suite that enabled the computational prediction and design of protein structures. Being able to design proteins from scratch means being able to design executors for diverse purposes and benefit society in multiple ways. Recently, NSR interviewed Prof. Baker on this fast-developing field and his personal experiences.

De novo design of a pentameric coiled-coil: decoding the motif for tetramer versus pentamer formation in water-soluble phospholamban*

Journal of Peptide Research ◽

10.1111/j.1399-3011.2005.00244.x ◽

2005 ◽

Vol 65 (3) ◽

pp. 312-321 ◽

Cited By ~ 14

Author(s):

A.M. Slovic ◽

J.D. Lear ◽

W.F. DeGrado

Keyword(s):

De Novo ◽

Coiled Coil ◽

De Novo Design ◽

Water Soluble

TOP: a new method for protein structure comparisons and similarity searches

Journal of Applied Crystallography ◽

10.1107/s0021889899012339 ◽

2000 ◽

Vol 33 (1) ◽

pp. 176-183 ◽

Cited By ~ 149

Author(s):

Guoguang Lu

Keyword(s):

User Interface ◽

Protein Structure ◽

Protein Structures ◽

Three Dimensional ◽

Data Bank ◽

Structure Alignment ◽

Dimensional Structure ◽

Protein Structure Alignment ◽

Protein Structure Analysis ◽

Structure Comparison

In order to facilitate the three-dimensional structure comparison of proteins, software for making comparisons and searching for similarities to protein structures in databases has been developed. The program identifies the residues that share similar positions of both main-chain and side-chain atoms between two proteins. The unique functions of the software also include database processing via Internet- and Web-based servers for different types of users. The developed method and its friendly user interface cope with many of the problems that frequently occur in protein structure comparisons, such as detecting structurally equivalent residues, misalignment caused by coincident match of Cα atoms, circular sequence permutations, tedious repetition of access, maintenance of the most recent database, and inconvenience of user interface. The program is also designed to cooperate with other tools in structural bioinformatics, such as the 3DB Browser software [Prilusky (1998). Protein Data Bank Q. Newslett. 84, 3–4] and the SCOP database [Murzin, Brenner, Hubbard & Chothia (1995). J. Mol. Biol. 247, 536–540], for convenient molecular modelling and protein structure analysis. A similarity ranking score of 'structure diversity' is proposed in order to estimate the evolutionary distance between proteins based on the comparisons of their three-dimensional structures. The function of the program has been utilized as a part of an automated program for multiple protein structure alignment. In this paper, the algorithm of the program and results of systematic tests are presented and discussed.

De novo protein design: how do we expand into the universe of possible protein structures?

Current Opinion in Structural Biology ◽

10.1016/j.sbi.2015.05.009 ◽

2015 ◽

Vol 33 ◽

pp. 16-26 ◽

Cited By ~ 110

Author(s):

Derek N Woolfson ◽

Gail J Bartlett ◽

Antony J Burton ◽

Jack W Heal ◽

Ai Niitsu ◽

...

Keyword(s):

Protein Design ◽

De Novo ◽

Protein Structures ◽

De Novo Protein Design ◽

The Universe
