Ongoing evolution of KRAB zinc finger protein-coding genes in modern humans
Abstract
Background
Krüppel-associated box (KRAB) zinc finger proteins (KZFPs) constitute the largest and fastest evolving family of gene regulators encoded by the human genome. Recent data indicate that many KZFPs serve as repressors of transposable element-embedded regulatory sequences (TEeRS) and that the evolutionary turnover of KZFP genes is mainly attributable to the changing transposable element (TE) load of their hosts. However, how natural selection and genetic variation are shaping this process is still poorly defined.
Methods
Genetic information was collected from nine primate species and 138,500 human genomes. Gene-wide as well as functional amino acid position-specific constraint was calculated across all human KZFPs.
Results
We found that the most conserved KZFPs, some of which go back close to 400 million years, have been subjected to marked negative selection in the evolutionarily recent past and are very homogeneous within the human population. In contrast, younger, largely primate-restricted family members present evidence of less negative selection than the rest of the genome and lower levels of coding constraint, particularly within the sequences encoding the functional sites of their zinc finger (ZF) arrays. We defined 33 sets of KZFP paralogs, which pairwise displayed a broad range of coding constraint differentials, with more recently emerged paralogs usually displaying a higher frequency of putatively deleterious mutations and missense variants within the functional sites of their ZF arrays than their source gene.
Finally, we identified three KZFP genes more constrained in the genomes of individuals of African ancestry than in Europeans, with their modes of expression or DNA targets pointing to possible links between these inter-populational genetic differences and regional differences in the prevalence of some diseases.
Conclusions
This work shows how the ongoing selection of KZFPs contributes to modern human genetic variation, in particular through the constraint of putatively deleterious and missense variants in functional protein sites, and how ongoing interplay between environment and KZFP genes might be impacting the biology of modern humans.