Compound Dynamics and Combinatorial Patterns of Amino Acid Repeats Encode a System of Evolutionary and Developmental Markers

2019 ◽  
Vol 11 (11) ◽  
pp. 3159-3178
Author(s):  
Ilaria Pelassa ◽  
Marica Cibelli ◽  
Veronica Villeri ◽  
Elena Lilliu ◽  
Serena Vaglietti ◽  
...  

Abstract Homopolymeric amino acid repeats (AARs) like polyalanine (polyA) and polyglutamine (polyQ) in some developmental proteins (DPs) regulate certain aspects of organismal morphology and behavior, suggesting an evolutionary role for AARs as developmental “tuning knobs.” It is still unclear, however, whether these are occasional protein-specific phenomena or hints at the existence of a whole AAR-based regulatory system in DPs. Using novel approaches to trace their functional and evolutionary history, we find quantitative evidence supporting a generalized, combinatorial role of AARs in developmental processes with evolutionary implications. We observe nonrandom AAR distributions and combinations in HOX and other DPs, as well as in their interactomes, defining elements of a proteome-wide combinatorial functional code whereby different AARs and their combinations appear preferentially in proteins involved in the development of specific organs/systems. Such functional associations can be either static or display detectable evolutionary dynamics. These findings suggest that progressive changes in AAR occurrence/combination, by altering embryonic development, may have contributed to taxonomic divergence, leaving detectable traces in the evolutionary history of proteomes. Consistent with this hypothesis, we find that the evolutionary trajectories of the 20 AARs in eukaryotic proteomes are highly interrelated and their individual or compound dynamics can sharply mark taxonomic boundaries, or display clock-like trends, carrying overall a strong phylogenetic signal. These findings provide quantitative evidence and an interpretive framework outlining a combinatorial system of AARs whose compound dynamics mark at the same time DP functions and evolutionary transitions.

Science ◽  
2021 ◽  
Vol 373 (6556) ◽  
pp. 792-796 ◽  
Author(s):  
Paul K. Strother ◽  
Clinton Foster

Molecular time trees indicating that embryophytes originated around 500 million years ago (Ma) during the Cambrian are at odds with the record of fossil plants, which first appear in the mid-Silurian almost 80 million years later. This time gap has been attributed to a missing fossil plant record, but that attribution belies the case for fossil spores. Here, we describe a Tremadocian (Early Ordovician, about 480 Ma) assemblage with elements of both Cambrian and younger embryophyte spores that provides a new level of evolutionary continuity between embryophytes and their algal ancestors. This finding suggests that the molecular phylogenetic signal retains a latent evolutionary history of the acquisition of the embryophytic developmental genome, a history that perhaps began during Ediacaran-Cambrian time but was not completed until the mid-Silurian (about 430 Ma).


2018 ◽  
Author(s):  
Gang Li ◽  
Henrique V. Figueiro ◽  
Eduardo Eizirik ◽  
William J. Murphy

Current phylogenomic approaches implicitly assume that the predominant phylogenetic signal within a genome reflects the true evolutionary history of organisms, without assessing the confounding effects of gene flow that result in a mosaic of phylogenetic signals that interact with recombinational variation. Here we tested the validity of this assumption with a recombination-aware analysis of whole genome sequences from 27 species of the cat family. We found that the prevailing phylogenetic signal within the autosomes is not always representative of speciation history, due to ancient hybridization throughout felid evolution. Instead, phylogenetic signal was concentrated within large, conserved X-chromosome recombination deserts that exhibited recurrent patterns of strong genetic differentiation and selective sweeps across mammalian orders. By contrast, regions of high recombination were enriched for signatures of ancient gene flow, and these sequences inflated crown-lineage divergence times by ~40%. We conclude that standard phylogenomic approaches to infer the Tree of Life may be highly misleading without considering the genomic partitioning of phylogenetic signal relative to recombination rate, and its interplay with historical hybridization.


2020 ◽  
Vol 165 (11) ◽  
pp. 2599-2603
Author(s):  
Mi-ran Yun ◽  
Jungsang Ryou ◽  
Wooyoung Choi ◽  
Joo-Yeon Lee ◽  
Sun-Whan Park ◽  
...  

Abstract Severe fever with thrombocytopenia syndrome (SFTS) is caused by SFTS virus (SFTSV). Although SFTS originated in China, it is an emerging infectious disease with prevalence confirmed in Japan, Korea, and Vietnam. The full-length genomes of 51 Korean SFTSV isolates from 2013 to 2016 were sequenced, and the sequences were deposited into a public database (GenBank) and analyzed to elucidate the phylogeny and evolution of the virus. Although most of the Korean SFTSV isolates were closely related to previously reported Japanese isolates, some were closely related to previously reported Chinese isolates. We identified one Korean strain that appears to have resulted from multiple inter-lineage reassortments. Several nucleotide and amino acid variations specific to the Korean isolates were identified. Future studies should focus on how these variations affect virus pathogenicity and evolution.


Quaternary ◽  
2018 ◽  
Vol 1 (3) ◽  
pp. 26
Author(s):  
Maria Palombo

Explaining the multifaceted, dynamic interactions of the manifold factors that have modelled throughout the ages the evolutionary history of the biosphere is undoubtedly a fascinating and challenging task that has been intriguing palaeontologists, biologists and ecologists for decades, in a never-ending pursuit of the causal factors that controlled the evolutionary dynamics of the Earth’s ecosystems throughout deep and Quaternary time. [...]


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Dario Karmeinski ◽  
Karen Meusemann ◽  
Jessica A. Goodheart ◽  
Michael Schroedl ◽  
Alexander Martynov ◽  
...  

Abstract Background The soft-bodied cladobranch sea slugs represent roughly half of the biodiversity of marine nudibranch molluscs on the planet. Despite their global distribution from shallow waters to the deep sea, from tropical into polar seas, and their important role in marine ecosystems and for humans (as targets for drug discovery), the evolutionary history of cladobranch sea slugs is not yet fully understood. Results To enlarge the current knowledge on the phylogenetic relationships, we generated new transcriptome data for 19 species of cladobranch sea slugs and two additional outgroup taxa (Berthella plumula and Polycera quadrilineata). We complemented our taxon sampling with previously published transcriptome data, resulting in a final data set covering 56 species from all but one accepted cladobranch superfamilies. We assembled all transcriptomes using six different assemblers, selecting those assemblies that provided the largest amount of potentially phylogenetically informative sites. Quality-driven compilation of data sets resulted in four different supermatrices: two with full coverage of genes per species (446 and 335 single-copy protein-coding genes, respectively) and two with a less stringent coverage (667 genes with 98.9% partition coverage and 1767 genes with 86% partition coverage, respectively). We used these supermatrices to infer statistically robust maximum-likelihood trees. All analyses, irrespective of the data set, indicate maximal statistical support for all major splits and phylogenetic relationships at the family level. Besides the questionable position of Noumeaella rubrofasciata, rendering the Facelinidae as polyphyletic, the only notable discordance between the inferred trees is the position of Embletonia pulchra. Extensive testing using Four-cluster Likelihood Mapping, Approximately Unbiased tests, and Quartet Scores revealed that its position is not due to any informative phylogenetic signal, but caused by confounding signal. 
Conclusions Our data matrices and the inferred trees can serve as a solid foundation for future work on the taxonomy and evolutionary history of Cladobranchia. The placement of E. pulchra, however, proves challenging, even with large data sets and various optimization strategies. Moreover, quartet mapping results show that confounding signal present in the data is sufficient to explain the inferred position of E. pulchra, again leaving its phylogenetic position as an enigma.


1975 ◽  
Vol 53 (5) ◽  
pp. 561-564 ◽  
Author(s):  
Keith Scott ◽  
Burt Zerner

The amino acid compositions of the carboxylesterases from chicken, horse, ox, sheep, and pig livers are reported and compared. As would be expected for this homologous series, the compositions show a general similarity. However, there are some significant differences, but the degree to which particular pairs of enzymes differ is consistent with the evolutionary history of the species from which they were isolated.


2021 ◽  
Author(s):  
Cedoljub Bundalovic-Torma ◽  
Darrell Desveaux ◽  
David S Guttman

A critical step in studying biological features (e.g., genetic variants, gene families, metabolic capabilities, or taxa) underlying traits or outcomes of interest is assessing their diversity and distribution. Accurate assessments of these patterns are essential for linking features to traits or outcomes and understanding their functional impact. Consequently, it is of crucial importance that the metrics employed for quantifying feature diversity can perform robustly under any evolutionary scenario. However, the standard metrics used for quantifying and comparing the distribution of features, such as prevalence, phylogenetic diversity, and related approaches, either do not take into consideration evolutionary history, or assume strictly vertical patterns of inheritance. Consequently, these approaches cannot accurately assess diversity for features that have undergone recombination or horizontal transfer. To address this issue, we have devised RecPD, a novel recombination-aware phylogenetic-diversity metric for measuring the distribution and diversity of features under all evolutionary scenarios. RecPD utilizes ancestral-state reconstruction to map the presence/absence of features onto ancestral nodes in a species tree, and then identifies potential recombination events in the evolutionary history of the feature. We also derive a number of related metrics from RecPD that can be used to assess and quantify evolutionary dynamics and correlation of feature evolutionary histories. We used simulation studies to show that RecPD reliably identifies evolutionary histories under diverse recombination and loss scenarios. We then apply RecPD in a real-world scenario in a preliminary study of type III effector protein families secreted by the plant pathogenic bacterium Pseudomonas syringae and demonstrate that prevalence is an inadequate metric that obscures the potential impact of recombination.
We believe RecPD will have broad utility for revealing and quantifying complex evolutionary processes for features at any biological level.
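
The contrast this abstract draws between prevalence and phylogenetic diversity can be illustrated with a toy example. The sketch below is not RecPD and does not reproduce the authors' method. It only computes two of the standard metrics the abstract names, on an invented four-tip tree, assuming a common rooted variant of phylogenetic diversity (total branch length on the paths from the tips carrying a feature back to the root). The tree, tip names, branch lengths, and function names are all hypothetical.

```python
# Toy rooted species tree: child -> (parent, branch_length).
# Topology and branch lengths are invented for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "D": ("n2", 2.0),
    "n1": ("root", 3.0),
    "n2": ("root", 2.0),
}
TIPS = ["A", "B", "C", "D"]


def prevalence(present):
    """Fraction of tips that carry the feature; ignores the tree entirely."""
    return len(present) / len(TIPS)


def rooted_pd(present):
    """Sum of branch lengths on the paths from each carrying tip to the root.

    Each branch is counted once, even when several tips share it.
    """
    seen = set()
    total = 0.0
    for tip in sorted(present):
        node = tip
        # Walk towards the root, stopping at branches already counted.
        while node in TREE and node not in seen:
            seen.add(node)
            parent, length = TREE[node]
            total += length
            node = parent
    return total


clustered = {"A", "B"}   # feature confined to one clade
dispersed = {"A", "C"}   # feature spread across both clades

for name, tips in (("clustered", clustered), ("dispersed", dispersed)):
    print(name, prevalence(tips), rooted_pd(tips))
```

Both feature distributions have the same prevalence (0.5), yet the dispersed one spans more of the tree (8.0 versus 5.0 branch-length units). Neither number says whether the dispersed pattern arose by vertical inheritance with losses or by horizontal transfer, which is the gap the abstract says RecPD is designed to address.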


2021 ◽  
Author(s):  
Lei Yang ◽  
Raunaq Malhotra ◽  
Rayan Chikhi ◽  
Daniel Elleder ◽  
Theodora Kaiser ◽  
...  

Abstract Background All vertebrate genomes have been colonized by retroviruses along their evolutionary trajectory. Although it is clear that endogenous retroviruses (ERVs) can contribute important physiological functions to contemporary hosts, such benefits are attributed to long-term co-evolution of ERV and host. Newly colonized ERVs are thought unlikely to contribute to host genome evolution because germline infections are rare and because the host effectively silences them. The genomes of several outbred species including mule deer (Odocoileus hemionus) are currently being colonized by ERVs, which provides an opportunity to study ERV dynamics at a time when few are fixed. Here we investigate the history of cervid endogenous retrovirus (CrERV) acquisition and expansion in the mule deer genome to determine the potential impact of endogenizing retroviruses on host genomic diversity. Methods A mule deer genome was de novo assembled from short and long insert mate pair reads. Scaffolds were further assembled using reference assisted chromosome assembly (RACA) to provide spatial orientation of CrERV insertion sites and to facilitate assembly of CrERV sequences. We applied phylogenetic and coalescent approaches to non-recombinant genomes to determine CrERV evolutionary history, augmenting ancestral divergence estimates with the prevalence of each CrERV locus in a population of mule deer. Recombination history was investigated on partial genome alignments. Results The CrERV composition and diversity in the mule deer genome have recently and measurably increased through horizontal acquisition of a new retrovirus lineage and through recombination with existing CrERV. Resulting interlineage recombinants also endogenized and subsequently retrotransposed.
CrERV loci are significantly closer to genes than expected if integration were random and gene proximity might explain the recent expansion by retrotransposition of one recombinant CrERV lineage. Conclusions There has been a burst of CrERV integrations during a recent retrovirus epizootic that increased genomic CrERV burden and has resulted in extensive insertional polymorphism in contemporary mule deer genomes. Recombination is a defining feature of CrERV evolutionary dynamics driven by this colonization, increasing CrERV burden and CrERV genetic diversity. These data support that retroviral colonization during an epizootic provides a burst of genomic diversity to the host population.


2018 ◽  
Vol 14 (10) ◽  
pp. 20180502 ◽  
Author(s):  
Manabu Sakamoto ◽  
Chris Venditti

Statistical non-independence of species’ biological traits is recognized in most traits under selection. Yet, whether or not the evolutionary rates of such biological traits are statistically non-independent remains to be tested. Here, we test the hypothesis that phenotypic evolutionary rates are non-independent, i.e. contain phylogenetic signal, using empirical rates of evolution in three separate traits: body mass in mammals, beak shape in birds and bite force in amniotes. Specifically, we test if evolutionary rates are phylogenetically interdependent. We find evidence for phylogenetic signal in evolutionary rates in all three case studies. While phylogenetic signal diminishes deeper in time, this is reflective of statistical power owing to small sample and effect sizes. When effect size is large, e.g. owing to the presence of fossil tips, we detect high phylogenetic signals even in deeper time slices. Thus, we recommend that rates be treated as being non-independent throughout the evolutionary history of the group of organisms under study, and any summaries or analyses of rates through time—including associations of rates with traits—need to account for the undesired effects of shared ancestry.


2017 ◽  
Author(s):  
Roman Sloutsky ◽  
Kristen M. Naegle

Abstract Evolutionary reconstruction algorithms produce models of the evolutionary history of proteins: the order of duplications and speciations that led to extant homologous proteins observed across species. Although they are regularly used to gain insight into protein function, these models are estimates of an unknowable truth according to the underlying assumptions inherent in each algorithm, its objective function, and the input sequences supplied for reconstruction. In practice, the generated models are highly sensitive to the sequence inputs. In this work, we asked whether we could identify stronger phylogenetic signal by capitalizing on the variance introduced by perturbing the input to evolutionary reconstruction to explore a rich space of possible models that could explain protein evolution. We subsampled from available protein orthologs, “same” proteins across multiple extant species, and produced an ensemble of topologies representing the duplication history which produced related proteins (paralogs) for simulated protein families and in a real protein family – the LacI transcription factor family. We found that two very important phenomena arise from this approach. First, the reproducibility of an all-sequence, single-alignment reconstruction, measured by comparing topologies inferred from 90% subsamples, directly correlates with the accuracy of that single-alignment reconstruction, producing a measurable value for something that has been traditionally unknowable. Second, if we take a large ensemble of trees inferred from 50% subsamples and cast the ensemble into a form that represents the distribution of pairwise leaf distances observed across the ensemble, then trees that capture the most frequently observed relationships are also the most accurate. We propose a new methodology, ASPEN, a meta-algorithm that finds and ranks the trees that are most consistent with observations across the ensemble.
Top-ranked ASPEN trees are significantly more accurate than the single-alignment tree produced from all available sequences. Importantly, our findings suggest that the true tree is currently inaccessible for most real protein families. Instead, applications that rely on evolutionary models should integrate across many trees that are equally likely to represent the true evolutionary history of a protein family.

