Consensus sequence design as a general strategy to create hyperstable, biologically active proteins
Abstract
Consensus sequence design offers a promising strategy for designing proteins of high stability while retaining biological activity, because it draws upon an evolutionary history in which residues important for both stability and function are likely to be conserved. Although there have been several reports of successful consensus design of individual targets, it is unclear from these anecdotal studies how often this approach succeeds and how often it fails. Here, we attempt to assess generality by designing consensus sequences for a set of six protein families with a range of chain lengths, structures, and activities. We characterize the resulting consensus proteins for stability, structure, and biological activities in an unbiased way. We find that all six consensus proteins adopt cooperatively folded structures in solution. Strikingly, four of the six consensus proteins show increased thermodynamic stability over naturally occurring homologues. Each consensus protein tested for function maintained at least partial biological activity. Though peptide binding affinity by a consensus-designed SH3 is rather low, Km values for consensus enzymes are similar to values from extant homologues. Though consensus enzymes are slower than extant homologues at low temperature, they are faster than some thermophilic enzymes at high temperature. An analysis of sequence properties shows consensus proteins to be enriched in charged residues and depleted in uncharged polar residues. Sequence differences between consensus and extant homologues are predominantly located at weakly conserved surface residues, highlighting the importance of these residues in the success of the consensus strategy.

Significance Statement
A major goal of protein design is to create proteins that have high stability and biological activity. Drawing on evolutionary information encoded within extant protein sequences, consensus sequence design has produced several successes in achieving this goal. Here we explore the generality with which consensus design can be used to enhance protein stability and maintain biological activity. By designing and characterizing consensus sequences for six unrelated protein families, we find that consensus design shows high success rates in creating well-folded, hyperstable proteins that retain biological activities. Remarkably, many of these consensus proteins show higher stabilities than naturally occurring sequences of their respective protein families. Our study highlights the utility of consensus sequence design and informs the mechanisms by which it works.
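
To make the core idea concrete, the following is a minimal Python sketch of how a consensus sequence can be derived from a multiple sequence alignment by taking the most frequent residue at each aligned position. It illustrates the general approach only and is not the procedure used in this study: the function name `consensus_sequence`, the `max_gap_fraction` threshold for skipping gap-rich columns, the alphabetical tie-breaking rule, and the toy alignment are all assumptions made for the example.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-", max_gap_fraction=0.5):
    """Return the most frequent residue at each column of an alignment.

    `alignment` is a list of equal-length aligned sequences (strings).
    Columns whose gap fraction exceeds `max_gap_fraction` are skipped;
    this threshold is an illustrative assumption, not a published value.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):
        # Skip positions that are gaps in most sequences.
        if column.count(gap) / len(column) > max_gap_fraction:
            continue
        counts = Counter(residue for residue in column if residue != gap)
        if not counts:
            continue
        # Pick the most frequent residue; break ties alphabetically so the
        # result is deterministic (an arbitrary choice for this sketch).
        best = min(counts, key=lambda residue: (-counts[residue], residue))
        consensus.append(best)
    return "".join(consensus)


# Hypothetical toy alignment, not data from the study.
toy_alignment = ["MKV-A", "MRV-A", "MKI-G", "MKVLA"]
print(consensus_sequence(toy_alignment))
```

In practice, the input alignment would typically be curated first (for example, by removing fragments and reducing redundancy among near-identical sequences), since the consensus reflects whatever biases the sequence set contains.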