Reporter system architecture affects measurements of noncanonical amino acid incorporation efficiency and fidelity

Mapping Intimacies ◽

10.1101/737197 ◽

2019 ◽

Author(s):

K.A. Potts ◽

J.T. Stieglitz ◽

M. Lei ◽

J.A. Van Deventer

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Fluorescent Protein ◽

Protein Translation ◽

Chemical Diversity ◽

Reporter System ◽

Noncanonical Amino Acids ◽

Genetic Codes ◽

Incorporation Efficiency

Abstract: The ability to genetically encode noncanonical amino acids (ncAAs) within proteins supports a growing number of applications ranging from fundamental biological studies to enhancing the properties of biological therapeutics. Currently, our quantitative understanding of ncAA incorporation systems is confounded by the diverse set of characterization and analysis approaches used to quantify ncAA incorporation events. While several effective reporter systems support such measurements, it is not clear how quantitative results from different reporters relate to one another, or which details influence measurements most strongly. Here, we evaluate the quantitative performance of single-fluorescent protein reporters, dual-fluorescent protein reporters, and cell surface displayed protein reporters of ncAA insertion in response to the TAG (amber) codon in yeast. While different reporters support varying levels of apparent readthrough efficiency, flow cytometry-based evaluations with dual reporters yielded measurements exhibiting consistent quantitative trends and precision across all evaluated conditions. Further investigations of dual-fluorescent protein reporter architecture revealed that quantitative outputs are influenced by stop codon location and N- and C-terminal fluorescent protein identity. Both dual-fluorescent protein reporters and a “drop-in” version of yeast display support quantification of ncAA incorporation in several single-gene knockout strains, revealing strains that enhance ncAA incorporation efficiency without compromising fidelity. Our studies reveal critical details regarding reporter system performance in yeast and how to effectively deploy such reporters. These findings have substantial implications for how to engineer ncAA incorporation systems, and protein translation apparatuses, to better accommodate alternative genetic codes for expanding the chemical diversity of biosynthesized proteins.

Design, System, Application Paragraph: On Earth, the genetic code provides nearly invariant instructions for generating the proteins present in all organisms using 20 primary amino acid building blocks. Scientists and engineers have long recognized the potential power of altering the genetic code to introduce amino acids that enhance the chemical versatility of proteins. Proteins containing such “noncanonical amino acids” (ncAAs) can be used to elucidate basic biological phenomena, discover new therapeutics, or engineer new materials. However, tools for measuring ncAA incorporation during protein translation (reporters) exhibit highly variable properties, severely limiting our ability to engineer improved ncAA incorporation systems. In this work, we sought to understand which properties of these reporters affect measurements of ncAA incorporation events. Using a series of ncAA incorporation systems in yeast, we evaluated reporter architecture, measurement techniques, and alternative data analysis methods. We identified key factors contributing to quantification of ncAA incorporation in all of these categories and demonstrated the immediate utility of our approach in identifying genomic knockouts that enhance ncAA incorporation efficiency. Our findings have important implications for how to evolve cells to better accommodate alternative genetic codes.
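The abstract above does not state how efficiency and fidelity are computed from dual-fluorescent reporter data. One commonly used formulation, assumed here for illustration and not taken from this abstract, normalizes the C-terminal to N-terminal fluorescence ratio of the TAG-containing reporter by the same ratio for a control without the stop codon (relative readthrough efficiency), and then compares readthrough measured without and with the ncAA (maximum misincorporation frequency). The function names and arguments are hypothetical.

```python
def relative_readthrough(c_tag, n_tag, c_wt, n_wt):
    """Relative readthrough efficiency (assumed definition).

    c_tag, n_tag: C- and N-terminal fluorescence of the TAG-containing reporter.
    c_wt, n_wt:   the same signals for a control reporter without the stop codon.
    """
    if n_tag <= 0 or n_wt <= 0 or c_wt <= 0:
        raise ValueError("fluorescence denominators must be positive")
    # Ratio of C/N signal for the TAG reporter, normalized by the control's C/N signal.
    return (c_tag / n_tag) / (c_wt / n_wt)


def max_misincorporation(rre_without_ncaa, rre_with_ncaa):
    """Maximum misincorporation frequency (assumed definition).

    Readthrough observed without ncAA, relative to readthrough with ncAA.
    Lower values suggest higher fidelity.
    """
    if rre_with_ncaa <= 0:
        raise ValueError("readthrough with ncAA must be positive")
    return rre_without_ncaa / rre_with_ncaa
```

With background-subtracted median fluorescence values from flow cytometry as inputs, a value of `relative_readthrough` near 1 would indicate readthrough comparable to the stop-codon-free control under this assumed definition.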


Transferability of N-terminal mutations of pyrrolysyl-tRNA synthetase in one species to that in another species on unnatural amino acid incorporation efficiency

Amino Acids ◽

10.1007/s00726-020-02927-z ◽

2020 ◽

Author(s):

Thomas L. Williams ◽

Debra J. Iskandar ◽

Alexander R. Nödling ◽

Yurong Tan ◽

Louis Y. P. Luk ◽

...

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Unnatural Amino Acids ◽

Trna Synthetase ◽

Unnatural Amino Acid ◽

Amino Acid Incorporation ◽

Acid Incorporation ◽

Genetic Code Expansion ◽

Incorporation Efficiency

Abstract: Genetic code expansion is a powerful technique for site-specific incorporation of an unnatural amino acid into a protein of interest. This technique relies on an orthogonal aminoacyl-tRNA synthetase/tRNA pair and has enabled incorporation of over 100 different unnatural amino acids into ribosomally synthesized proteins in cells. Pyrrolysyl-tRNA synthetase (PylRS) and its cognate tRNA from Methanosarcina species are arguably the most widely used orthogonal pair. Here, we investigated whether the beneficial effect on unnatural amino acid incorporation caused by N-terminal mutations in the PylRS of one species is transferable to the PylRS of another species. It was previously shown that conserved mutations in the N-terminal domain of MmPylRS improved unnatural amino acid incorporation efficiency up to fivefold. As MbPylRS shares high sequence identity with MmPylRS, and the two homologs are often used interchangeably, we examined incorporation of five unnatural amino acids by four MbPylRS variants at two temperatures. Our results indicate that the beneficial N-terminal mutations in MmPylRS did not improve unnatural amino acid incorporation efficiency by MbPylRS. Knowledge from this work contributes to our understanding of PylRS homologs, which is needed to improve the technique of genetic code expansion in the future.


Evolving a mitigation of the stress response pathway to change the basic chemistry of life

10.1101/2021.09.23.461486 ◽

2021 ◽

Author(s):

Isabella Tolle ◽

Stefan Oehm ◽

Michael Georg Hoesl ◽

Christin Treiber-Kleinke ◽

Lauri Peil ◽

...

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Stress Response ◽

Genetic Code ◽

Protein Translation ◽

Synthetic Analog ◽

E Coli ◽

Regulatory Constraints ◽

Translation Apparatus ◽

Canonical Amino Acid

Abstract: Billions of years of evolution have produced only slight variations in the standard genetic code, and the number and identity of proteinogenic amino acids have remained mostly consistent throughout all three domains of life. These observations suggest a certain rigidity of the genetic code and prompt musings as to the origin and evolution of the code. Here we conducted an adaptive laboratory evolution (ALE) experiment to push the limits of the code restriction, by evolving Escherichia coli to fully replace tryptophan, thought to be the latest addition to the genetic code, with the analog L-β-(thieno[3,2-b]pyrrolyl)alanine ([3,2]Tpa). We identified an overshooting of the stress response system as the main factor limiting ancestral growth upon exposure to β-thieno[3,2-b]pyrrole ([3,2]Tp), a metabolic precursor of [3,2]Tpa, and Trp limitation. During the ALE, E. coli was able to “calm down” its stress response machinery, thereby restoring growth. In particular, the inactivation of RpoS itself, the master regulator of the general stress response, was a key event during the adaptation. Knocking out the rpoS gene in the ancestral background, independent of other changes, conferred growth on [3,2]Tp. Our results add evidence that frozen regulatory constraints, rather than a rigid protein translation apparatus, are Life’s gatekeepers of the canonical amino acid repertoire. This information will not only enable us to design enhanced synthetic amino acid incorporation systems but may also shed light on a general biological mechanism trapping organismal configurations in a status quo.

Significance Statement: The (apparent) rigidity of the genetic code, as well as its universality, has long prompted explorations into expanding the code with synthetic, new-to-nature building blocks and testing its boundaries. While nowadays even proteome-wide incorporation of synthetic amino acids has been reported on several occasions [1–3], little is known about the underlying mechanisms.

We here report ALE with auxotrophic E. coli that yielded successful proteome-wide replacement of Trp by its synthetic analog [3,2]Tpa, accompanied by selection for loss of RpoS [4] function. Such laboratory domestication of bacteria by the acquisition of rpoS mitigation mutations is beneficial not only to overcome the stress of nutrient (Trp) starvation but also to evolve the paths to use environmental xenobiotics (e.g. [3,2]Tp) as essential nutrients for growth.

We propose that regulatory constraints, rather than a rigid and conserved protein translation apparatus, are Life’s gatekeepers of the canonical amino acid repertoire (at least where close structural analogs are concerned). Our findings contribute a step towards understanding possible environmental causes of genetic changes and their relationship to evolution.

Our evolved strain affords a platform for homogeneous protein labeling with [3,2]Tpa as well as for the production of biomolecules [5], which are challenging to synthesize chemically. Top-down synthetic biology will also benefit greatly from breaking through the boundaries of the frozen bacterial genetic code, as this will enable us to begin creating synthetic cells capable of utilizing an expanded range of substrates essential for life.


Evolving Bacterial Fitness with an Expanded Genetic Code

10.1101/169409 ◽

2017 ◽

Author(s):

Drew S. Tack ◽

Austin C. Cole ◽

R. Shroff ◽

B.R. Morrow ◽

Andrew D. Ellington

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Unnatural Amino Acid ◽

Noncanonical Amino Acids ◽

Bacterial Fitness ◽

Genetic Code Expansion ◽

Encoding Strategy ◽

Translation Systems ◽

Expanded Genetic Code

Abstract: Evolution has for the most part used the canonical 20 amino acids of the natural genetic code to construct proteins. While several theories regarding the evolution of the genetic code have been proposed, experimental exploration of these theories has largely been restricted to phylogenetic and computational modeling. The development of orthogonal translation systems has allowed noncanonical amino acids to be inserted at will into proteins. We have taken advantage of these advances to evolve bacteria to accommodate a 21 amino acid genetic code in which the amber codon ambiguously encodes either 3-nitro-L-tyrosine or stop. Such an ambiguous encoding strategy recapitulates numerous models for genetic code expansion, and we find that evolved lineages first accommodate the unnatural amino acid, and then begin to evolve on a neutral landscape where stop codons begin to appear within genes. The resultant lines represent transitional intermediates on the way to the fixation of a functional 21 amino acid code.


On the Importance of Asymmetry in the Phenotypic Expression of the Genetic Code upon the Molecular Evolution of Proteins

Symmetry ◽

10.3390/sym12060997 ◽

2020 ◽

Vol 12 (6) ◽

pp. 997

Author(s):

Marco V. José ◽

Gabriel S. Zamudio

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Phenotypic Expression ◽

Standard Genetic Code ◽

Specific Amino Acid ◽

Trna Synthetases ◽

Probability Of Occurrence ◽

Genetic Codes ◽

Evolution Of Proteins

The standard genetic code (SGC) is a mapping between the 64 possible arrangements of the four RNA nucleotides (C, A, U, G) into triplets or codons, where 61 codons are assigned to a specific amino acid and the other three are stop codons for terminating protein synthesis. Aminoacyl-tRNA synthetases (aaRSs) are responsible for implementing the SGC by specifically aminoacylating only their cognate transfer RNAs (tRNAs), thereby linking an amino acid with its corresponding anticodon triplets. Each tRNA molecule binds a codon through its anticodon. To understand the meaning of symmetrical/asymmetrical properties of the SGC, we designed synthetic genetic codes with known symmetries and with the same degeneracy as the SGC. We determined their impact on the substitution rates for each amino acid under a neutral model of protein evolution. We prove that the phenotypic graphs of the SGC for codons and anticodons for all the possible arrangements of nucleotides are asymmetric and the amino acids do not form orbits. In the symmetrical synthetic codes, the amino acids are grouped according to their codonicity, that is, the number of triplets encoding a given amino acid. Both the SGC and symmetrical synthetic codes exhibit a probability of occurrence of the amino acids proportional to their degeneracy. Unlike the SGC, the synthetic codes display a constant probability of occurrence of the amino acids according to their codonicity. The asymmetry of the phenotypic graphs of codons and anticodons of the SGC has important implications for the evolutionary processes of proteins.
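The codon counts and the notion of codonicity described above can be checked with a short sketch. It builds the standard genetic code from the usual compact 64-letter table (bases ordered U, C, A, G; `*` marks a stop codon) and groups amino acids by the number of codons that encode them. The helper names are hypothetical, and the occurrence probabilities assume, purely for illustration, that all 61 sense codons are equally likely; this is not necessarily the model used in the paper.

```python
from collections import Counter, defaultdict

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def standard_genetic_code():
    """Return a dict mapping each of the 64 RNA codons to an amino acid or '*'."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, AMINO_ACIDS))


def codonicity(code):
    """Number of codons encoding each amino acid (and the stop signal)."""
    return Counter(code.values())


def group_by_codonicity(code):
    """Group amino acids (stops excluded) by how many codons encode them."""
    groups = defaultdict(list)
    for amino_acid, n in codonicity(code).items():
        if amino_acid != "*":
            groups[n].append(amino_acid)
    return {n: sorted(members) for n, members in sorted(groups.items())}


def occurrence_probability(code):
    """Probability of each amino acid if all sense codons were equally likely."""
    counts = codonicity(code)
    sense_total = sum(n for aa, n in counts.items() if aa != "*")
    return {aa: n / sense_total for aa, n in counts.items() if aa != "*"}


if __name__ == "__main__":
    code = standard_genetic_code()
    for n, members in group_by_codonicity(code).items():
        print(n, "".join(members))
```

Under the equal-codon assumption, an amino acid's probability is simply its codonicity divided by 61, which is the proportionality to degeneracy that the abstract refers to.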


Optimization of Expanded Genetic Codes via Genetic Algorithms

10.5753/eniac.2018.4440 ◽

2018 ◽

Author(s):

Maísa de Carvalho Silva ◽

Lariza Laura De Oliveira ◽

Renato Tinós

Keyword(s):

Amino Acids ◽

Genetic Algorithms ◽

Amino Acid ◽

Genetic Code ◽

Unnatural Amino Acids ◽

Genetically Modified Organisms ◽

Fitness Function ◽

Unnatural Amino Acid ◽

Standard Genetic Code ◽

Genetic Codes

In the last decades, researchers have proposed the use of genetically modified organisms that utilize unnatural amino acids, i.e., amino acids other than the 20 amino acids encoded in the standard genetic code. Unnatural amino acids have been incorporated into genetically engineered organisms for the development of new drugs, fuels and chemicals. When new amino acids are incorporated, it is necessary to modify the standard genetic code. Expanded genetic codes have been created without considering the robustness of the code. The objective of this work is the use of genetic algorithms (GAs) for the optimization of expanded genetic codes. The GA indicates which codons of the standard genetic code should be used to encode a new unnatural amino acid. The fitness function has two terms; one for robustness of the new code and another that takes into account the frequency of use of amino acids. Experiments show that, by controlling the weighting between the two terms, it is possible to obtain more or less amino acid substitutions at the same time that the robustness is minimized.


Targeting motifs and functional parameters governing the assembly of connexins into gap junctions

Biochemical Journal ◽

10.1042/bj3490281 ◽

2000 ◽

Vol 349 (1) ◽

pp. 281-287 ◽

Cited By ~ 7

Author(s):

Patricia E. M. MARTIN ◽

James STEGGLES ◽

Claire WILSON ◽

Shoeb AHMAD ◽

W. Howard EVANS

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Gap Junctions ◽

Gap Junction ◽

Mammalian Cells ◽

Fluorescent Protein ◽

Lucifer Yellow ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Green Fluorescent

To study the assembly of gap junctions, connexin-green-fluorescent-protein (Cx-GFP) chimeras were expressed in COS-7 and HeLa cells. Cx26- and Cx32-GFP were targeted to gap junctions where they formed functional channels that transferred Lucifer Yellow. A series of Cx32-GFP chimeras, truncated from the C-terminal cytoplasmic tail, were studied to identify amino acid sequences governing targeting from intracellular assembly sites to the gap junction. Extensive truncation of Cx32 resulted in failure to integrate into membranes. Truncation of Cx32 to residue 207, corresponding to removal of most of the 78 amino acids on the cytoplasmic C-terminal tail, led to arrest in the endoplasmic reticulum and incomplete oligomerization. However, truncation to amino acid 219 did not impair Cx oligomerization and connexon hemichannels were targeted to the plasma membrane. It was concluded that a crucial gap-junction targeting sequence resides between amino acid residues 207 and 219 on the cytoplasmic C-terminal tail of Cx32. Studies of a Cx32E208K mutation identified this as one of the key amino acids dictating targeting to the gap junction, although oligomerization of this site-specific mutation into hexameric hemichannels was relatively unimpaired. The studies show that expression of these Cx-GFP constructs in mammalian cells allowed an analysis of amino acid residues involved in gap-junction assembly.


Can Power Laws Help Us Understand Gene and Proteome Information?

Advances in Mathematical Physics ◽

10.1155/2013/917153 ◽

2013 ◽

Vol 2013 ◽

pp. 1-10 ◽

Cited By ~ 3

Author(s):

J. A. Tenreiro Machado ◽

António C. Costa ◽

Maria Dulce Quelhas

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Amino Acid Sequence ◽

Genetic Code ◽

Protein Function ◽

Sequence Data ◽

Proteome Analysis ◽

Power Laws ◽

Linear Sequence ◽

Graphical Visualization

Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
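The first step described above, a relative frequency histogram over fixed-length amino acid words, can be sketched as follows. The function name is hypothetical, and the use of overlapping windows that advance one residue at a time is an assumption; the abstract does not say how the words are extracted.

```python
from collections import Counter


def word_frequencies(sequence, k):
    """Relative frequency of every length-k word (tuple) in a protein sequence.

    Words are read from overlapping windows that slide one residue at a time
    (an assumption made for this sketch).
    """
    if k < 1 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


if __name__ == "__main__":
    # Toy sequence chosen only to show the output shape.
    for word, freq in sorted(word_frequencies("MKKLLKK", 2).items()):
        print(word, round(freq, 3))
```

The resulting dictionary is alignment-free in the sense used above: it depends only on word counts within each sequence, so histograms from different proteins can be compared without aligning them.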


Constrained mutational sampling of amino acids in HIV-1 protease evolution

10.1101/354597 ◽

2018 ◽

Author(s):

Jeffrey I. Boucher ◽

Troy W. Whitfield ◽

Ann Dauphin ◽

Gily Nachum ◽

Carl Hollins ◽

...

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Genetic Code ◽

Protein Sequence ◽

Fitness Landscape ◽

Sequence Evolution ◽

Single Mutation ◽

The Impact ◽

Protein Sequence Evolution ◽

Hiv 1

Abstract: The evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impacts of these factors on protein sequence evolution are interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available dataset of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple base mutations in HIV-1 protease is strongly influenced by mutational sampling.
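The distinction between amino acid changes reachable by a single nucleotide mutation and those requiring multiple mutations follows directly from the standard genetic code. The sketch below, with hypothetical helper names, counts the minimum number of nucleotide substitutions needed to turn a given codon into any codon for a target amino acid. It illustrates the concept only and is not the authors' analysis pipeline.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODE = dict(zip(CODONS, AMINO_ACIDS))


def hamming(codon_a, codon_b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def min_nt_changes(codon, target_amino_acid):
    """Fewest nucleotide substitutions from `codon` to any codon for the target."""
    return min(
        hamming(codon, other)
        for other, amino_acid in CODE.items()
        if amino_acid == target_amino_acid
    )


def single_mutation_neighbors(codon):
    """Amino acids (excluding the original and stop) one substitution away."""
    source = CODE[codon]
    return sorted({
        amino_acid
        for other, amino_acid in CODE.items()
        if hamming(codon, other) == 1 and amino_acid not in (source, "*")
    })


if __name__ == "__main__":
    print(single_mutation_neighbors("ATG"))
    print(min_nt_changes("ATG", "W"))
```

Applied to each codon of an ancestral sequence, `min_nt_changes` would separate the amino acid variants that mutational sampling reaches easily (one substitution) from those that need two or three.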


Design of Orthogonal Pairs for Protein Translation: Selection Systems for Genetically Encoding Noncanonical Amino Acids in E. coli

Springer Protocols Handbooks - Hydrocarbon and Lipid Microbiology Protocols ◽

10.1007/8623_2015_105 ◽

2015 ◽

pp. 71-82 ◽

Cited By ~ 1

Author(s):

Jelena Jaric ◽

Nediljko Budisa

Keyword(s):

Amino Acids ◽

Protein Translation ◽

Noncanonical Amino Acids ◽

E Coli ◽

Selection Systems ◽

Orthogonal Pairs


A global perspective of codon usage

10.1101/076679 ◽

2016 ◽

Author(s):

Bohdan B. Khomtchouk ◽

Claes Wahlestedt ◽

Wolfgang Nonner

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Codon Usage ◽

Genetic Code ◽

Common Ancestor ◽

Tree Of Life ◽

Last Universal Common Ancestor ◽

Synonymous Codons ◽

Universal Common Ancestor ◽

Evolutionary Progression

Codon usage in 2730 genomes is analyzed for evolutionary patterns in the usage of synonymous codons and amino acids across prokaryotic and eukaryotic taxa. We group together genomes that have similar amounts of intra-genomic bias in their codon usage, and then compare how usage of particular codons is diversified within each genome group, and how that usage varies from group to group. Inter-genomic diversity of codon usage increases with intra-genomic usage bias, following a universal pattern. The frequencies of the different codons vary in robust mutual correlation, and the implied synonymous codon and amino acid usages drift together. This kind of correlation indicates that the variation of codon usage across organisms is chiefly a consequence of lateral DNA transfer among diverse organisms. The group of genomes with the greatest intra-genomic bias comprises two distinct subgroups, with each one restricting its codon usage to essentially one unique half of the genetic code table. These organisms include eubacteria and archaea thought to be closest to the hypothesized last universal common ancestor (LUCA). Their codon usages imply genetic diversity near the hypothesized base of the tree of life. There is a continuous evolutionary progression across taxa from the two extremely diversified usages toward balanced usage of different codons (as approached, e.g., in mammals). In that progression, codon frequency variations are correlated as expected from a blending of the two extreme codon usages seen in prokaryotes.

Author Summary: The redundancy intrinsic to the genetic code allows different amino acids to be encoded by up to six synonymous codons. Genomes of different organisms prefer different synonymous codons, a phenomenon known as ‘codon usage bias.’ The phenomenon of codon usage bias is of fundamental interest for evolutionary biology, and is important in a variety of applied settings (e.g., transgene expression). The spectrum of codon usage biases seen in current organisms is commonly thought to have arisen by the combined actions of mutations and selective pressures. This view focuses on codon usage in specific genomes and the consequences of that usage for protein expression.

Here we investigate an unresolved question of molecular genetics: are there global rules governing the usage of synonymous codons made by genomic DNA across organisms? To answer this question, we employed a data-driven approach to surveying 2730 species from all kingdoms of the ‘tree of life’ in order to classify their codon usage. A first major result was that the large majority of these organisms use codons rather uniformly on the genome-wide scale, without giving preference to particular codons among possible synonymous alternatives. A second major result was that two compartments of codon usage seem to co-exist and to be expressed in different proportions by different organisms. As such, we investigate how individual codons are used in different organisms from all taxa. Whereas codon usage is generally believed to be the evolutionary result of both mutations and natural selection, our results suggest a different perspective: the usage of different codons (and amino acids) by different organisms follows a superposition of two distinct patterns of usage. One distinction locates to the third base of each codon, which in one pattern is U or A, and in the other pattern is G or C. This result has two major implications: (1) the variation of codon usage as seen across different organisms is best accounted for by lateral gene transfer among diverse organisms; (2) the organisms that are grouped by protein homology near the base of the ‘tree of life’ comprise two genetically distinct lineages.

We find that, over evolutionary time, codon usages have converged from two distinct, non-overlapping usages (e.g., as evident in bacteria and archaea) to a near-uniform, balanced usage of synonymous codons (e.g., in mammals). This shows that the variations of codon (and amino acid) biases reveal a distinct evolutionary progression. We also find that codon usage in bacteria and archaea is most diverse between organisms thought to be closest to the hypothesized last universal common ancestor (LUCA). The dichotomy in codon (and amino acid) usages present near the origin of the current ‘tree of life’ might provide information about the evolutionary development of the genetic code.
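The two usage patterns described above differ in the third base of each codon (U or A versus G or C). A minimal sketch of how that property can be measured for a single coding sequence is shown below: it tallies codons and reports the fraction ending in G or C. The function names are hypothetical and the sketch is not the authors' pipeline; it assumes an in-frame coding sequence and ignores any trailing partial codon.

```python
from collections import Counter


def codon_counts(coding_sequence):
    """Count in-frame codons; RNA input is converted to DNA letters."""
    seq = coding_sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def gc3_fraction(coding_sequence):
    """Fraction of codons whose third base is G or C."""
    counts = codon_counts(coding_sequence)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no complete codon")
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc3 / total


if __name__ == "__main__":
    # Toy sequence chosen only to show the output shape.
    print(codon_counts("ATGGCCAAA"))
    print(gc3_fraction("ATGGCCAAA"))
```

A genome whose coding sequences give values near 0 or near 1 would sit close to one of the two extreme usage patterns, while values near 0.5 would correspond to the balanced usage the summary attributes to mammals.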
