Crossing fitness valleys via double substitutions within codons
Abstract
Single-nucleotide substitutions in protein-coding genes can be divided into synonymous (S) substitutions, which have little effect on fitness, and non-synonymous (N) substitutions, which alter amino acids and thus generally have a greater effect. Most N substitutions are subject to purifying selection, which eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of a single substitution and thus might be subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. We approached this problem by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, every within-codon double substitution can be represented as an “ancestral-intermediate-final” sequence and assigned to one of four classes: 1) SS: S intermediate – S final, 2) SN: S intermediate – N final, 3) NS: N intermediate – S final, 4) NN: N intermediate – N final. We found that the selective pressure on the second substitution differs markedly among these classes. Like single S substitutions, SS double substitutions evolve neutrally, whereas SN double substitutions, like single N substitutions, are subject to purifying selection. In contrast, NS double substitutions show positive selection on the second step because the original amino acid is recovered. The NN double substitutions are heterogeneous and can be subject to purifying or positive selection, or evolve neutrally, depending on how similar the final or intermediate amino acid is to the ancestral one. The general trend is that the second mutation compensates for the deleterious effect of the first one, resulting in frequent crossing of valleys on the fitness landscape.
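The partition into SS, SN, NS and NN can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure used in the study: it assumes the standard genetic code, treats a stop codon as a non-synonymous state, and labels both the intermediate and the final codon by comparison with the ancestral codon, which is how the class definitions read here. The names `step_type`, `classify_path` and `classify_double` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def step_type(ancestral, derived):
    """Return 'S' if both codons encode the same amino acid, else 'N'.

    Assumption: a stop codon ('*') counts as a distinct, non-synonymous state.
    """
    return "S" if CODON_TABLE[ancestral] == CODON_TABLE[derived] else "N"


def classify_path(ancestral, intermediate, final):
    """Classify one ancestral-intermediate-final path as SS, SN, NS or NN.

    Assumption: both letters are assigned relative to the ancestral codon,
    so 'NS' means the final codon restores the ancestral amino acid.
    """
    return step_type(ancestral, intermediate) + step_type(ancestral, final)


def classify_double(ancestral, final):
    """Map each of the two possible intermediates to its class.

    The two codons must differ at exactly two positions; each intermediate
    carries one of the two changes (substitutions occur one at a time).
    """
    diff = [i for i in range(3) if ancestral[i] != final[i]]
    if len(diff) != 2:
        raise ValueError("codons must differ at exactly two positions")
    classes = {}
    for i in diff:
        intermediate = ancestral[:i] + final[i] + ancestral[i + 1:]
        classes[intermediate] = classify_path(ancestral, intermediate, final)
    return classes


if __name__ == "__main__":
    # Serine TCT -> serine AGT: both intermediates change the amino acid.
    print(classify_double("TCT", "AGT"))
    # Arginine CGT -> arginine AGA: one synonymous and one non-synonymous route.
    print(classify_double("CGT", "AGA"))
```

A single ancestral-final pair can fall into different classes depending on the intermediate: in this sketch CGT to AGA is SS via CGA (arginine) but NS via AGT (serine), so any tally of classes has to decide how to treat the two possible paths.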