Primate phylogenomics uncovers multiple rapid radiations and ancient interspecific introgression

PLoS Biology ◽  
2020 ◽  
Vol 18 (12) ◽  
pp. e3000954
Author(s):  
Dan Vanderpool ◽  
Bui Quang Minh ◽  
Robert Lanfear ◽  
Daniel Hughes ◽  
Shwetha Murali ◽  
...  

Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place.
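
The asymmetry argument in this abstract can be illustrated with a small calculation. Under incomplete lineage sorting alone, the two minor (discordant) topologies around a branch are expected to be equally frequent, so a strong excess of one of them points toward introgression. The sketch below is a minimal illustration under that assumption, not necessarily the authors' procedure: the function name `discordance_asymmetry`, the normalized difference `delta`, and the exact binomial test are all choices made here for illustration.

```python
from math import comb


def discordance_asymmetry(n_minor1, n_minor2):
    """Summarize the imbalance between the two minor gene-tree topologies.

    Returns (delta, p_value), where delta = |n1 - n2| / (n1 + n2) and
    p_value is a two-sided exact binomial test of n1 == n2 in expectation
    (probability 0.5 for each minor topology, as expected under ILS alone).
    """
    total = n_minor1 + n_minor2
    if total == 0:
        raise ValueError("need at least one discordant gene tree")
    delta = abs(n_minor1 - n_minor2) / total
    # Upper-tail probability of seeing a count at least as large as the
    # more common minor topology when both are equally likely.
    k = max(n_minor1, n_minor2)
    upper_tail = sum(comb(total, i) for i in range(k, total + 1)) / 2 ** total
    # The null distribution is symmetric, so double the tail and cap at 1.
    p_value = min(1.0, 2 * upper_tail)
    return delta, p_value


if __name__ == "__main__":
    # Hypothetical counts of gene trees supporting each minor topology.
    print(discordance_asymmetry(120, 60))
```

A small p-value here only flags an imbalance; systematic gene-tree estimation error can also produce asymmetry, which is why the counts above are labeled hypothetical and the test is offered only as a sketch.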


PLoS Genetics ◽  
2021 ◽  
Vol 17 (2) ◽  
pp. e1009095
Author(s):  
Thomas C. Nelson ◽  
Angela M. Stathos ◽  
Daniel D. Vanderpool ◽  
Findley R. Finseth ◽  
Yao-wu Yuan ◽  
...  

Inferences about past processes of adaptation and speciation require a gene-scale and genome-wide understanding of the evolutionary history of diverging taxa. In this study, we use genome-wide capture of nuclear gene sequences, plus skimming of organellar sequences, to investigate the phylogenomics of monkeyflowers in Mimulus section Erythranthe (27 accessions from seven species). Taxa within Erythranthe, particularly the parapatric and putatively sister species M. lewisii (bee-pollinated) and M. cardinalis (hummingbird-pollinated), have been a model system for investigating the ecological genetics of speciation and adaptation for over five decades. Across >8000 nuclear loci, multiple methods resolve a predominant species tree in which M. cardinalis groups with other hummingbird-pollinated taxa (37% of gene trees), rather than being sister to M. lewisii (32% of gene trees). We independently corroborate a single evolution of hummingbird pollination syndrome in Erythranthe by demonstrating functional redundancy in genetic complementation tests of floral traits in hybrids; together, these analyses overturn a textbook case of pollination-syndrome convergence. Strong asymmetries in allele-sharing (Patterson’s D-statistic and related tests) indicate that gene-tree discordance reflects ancient and recent introgression rather than incomplete lineage sorting. Consistent with abundant introgression blurring the history of divergence, low-recombination and adaptation-associated regions support the new species tree, while high-recombination regions generate phylogenetic evidence for sister status for M. lewisii and M. cardinalis. Population-level sampling of core taxa also revealed two instances of chloroplast capture, with Sierran M. lewisii and Southern Californian M. parishii each carrying organelle genomes nested within respective sympatric M. cardinalis clades. A recent organellar transfer from M. cardinalis, an outcrosser where selfish cytonuclear dynamics are more likely, may account for the unexpected cytoplasmic male sterility effects of selfer M. parishii organelles in hybrids with M. lewisii. Overall, our phylogenomic results reveal extensive reticulation throughout the evolutionary history of a classic monkeyflower radiation, suggesting that natural selection (re-)assembles and maintains species-diagnostic traits and barriers in the face of gene flow. Our findings further underline the challenges, even in reproductively isolated species, in distinguishing re-use of adaptive alleles from true convergence and emphasize the value of a phylogenomic framework for reconstructing the evolutionary genetics of adaptation and speciation.
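
For readers unfamiliar with Patterson's D-statistic mentioned above, the following is a minimal sketch of the standard ABBA-BABA calculation, D = (nABBA - nBABA) / (nABBA + nBABA), for a four-taxon arrangement (((P1, P2), P3), outgroup). It assumes one aligned haploid sequence per taxon and treats the outgroup base as ancestral; the function name `patterson_d` and the toy sequences are illustrative, and the study itself may have used allele frequencies and block-based significance tests that are not shown here.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (abba, baba, d).

    The outgroup base is treated as the ancestral allele "A" and the other
    base at a biallelic site as the derived allele "B". d is None when no
    informative site is found.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        bases = {a, b, c, o}
        # Skip gaps, ambiguity codes, and sites that are not biallelic.
        if not bases <= valid or len(bases) != 2:
            continue
        if a == o and b == c:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c:
            baba += 1  # P1 and P3 share the derived allele
    informative = abba + baba
    d = (abba - baba) / informative if informative else None
    return abba, baba, d


if __name__ == "__main__":
    # Toy alignment: three ABBA sites, one BABA site, one invariant site.
    print(patterson_d("AAGAA", "GGAAG", "GGGAG", "AAAAA"))
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero; a consistently positive or negative D is the kind of allele-sharing asymmetry the abstract refers to.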


2020 ◽  
Author(s):  
Manuel Lafond ◽  
Marc Hellmuth

Abstract Background: The history of gene families -- which are equivalent to event-labeled gene trees -- can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are "biologically feasible", which is the case if one can find a species tree with which the gene tree can be reconciled in a time-consistent way. Results: In this contribution, we consider event-labeled gene trees that contain speciations, duplications as well as horizontal gene transfer (HGT) and we assume that the species tree is unknown. Although many problems become NP-hard as soon as HGT and time-consistency are involved, we show, in contrast, that the problem of finding a time-consistent species tree for a given event-labeled gene tree can be solved in polynomial time. We provide a cubic-time algorithm to decide whether a "time-consistent" species tree for a given event-labeled gene tree exists and, in the affirmative case, to construct the species tree within the same time complexity.


Genetics ◽  
2003 ◽  
Vol 164 (4) ◽  
pp. 1645-1656 ◽  
Author(s):  
Bruce Rannala ◽  
Ziheng Yang

Abstract The effective population sizes of ancestral as well as modern species are important parameters in models of population genetics and human evolution. The commonly used method for estimating ancestral population sizes, based on counting mismatches between the species tree and the inferred gene trees, is highly biased as it ignores uncertainties in gene tree reconstruction. In this article, we develop a Bayes method for simultaneous estimation of the species divergence times and current and ancestral population sizes. The method uses DNA sequence data from multiple loci and extracts information about conflicts among gene tree topologies and coalescent times to estimate ancestral population sizes. The topology of the species tree is assumed known. A Markov chain Monte Carlo algorithm is implemented to integrate over uncertain gene trees and branch lengths (or coalescence times) at each locus as well as species divergence times. The method can handle any species tree and allows different numbers of sequences at different loci. We apply the method to published noncoding DNA sequences from the human and the great apes. There are strong correlations between posterior estimates of speciation times and ancestral population sizes. With the use of an informative prior for the human-chimpanzee divergence date, the population size of the common ancestor of the two species is estimated to be ∼20,000, with a 95% credibility interval (8000, 40,000). Our estimates, however, are affected by model assumptions as well as data quality. We suggest that reliable estimates have yet to await more data and more realistic models.


2020 ◽  
Author(s):  
Fernando Lopes ◽  
Larissa R Oliveira ◽  
Amanda Kessler ◽  
Yago Beux ◽  
Enrique Crespo ◽  
...  

Abstract The phylogeny and systematics of fur seals and sea lions (Otariidae) have long been studied with diverse data types, including an increasing amount of molecular data. However, only a few phylogenetic relationships have reached acceptance because of strong gene tree-species tree discordance. Divergence time estimates in the group also vary widely between studies. These uncertainties impeded the understanding of the biogeographical history of the group, such as when and how trans-equatorial dispersal and subsequent speciation events occurred. Here we used high-coverage genome-wide sequencing for 14 of the 15 species of Otariidae to elucidate the phylogeny of the family and its bearing on the taxonomy and biogeographical history. Despite extreme topological discordance among gene trees, we found a fully supported species tree that agrees with the few well-accepted relationships and establishes monophyly of the genus Arctocephalus. Our data support a relatively recent trans-hemispheric dispersal at the base of a southern clade, which rapidly diversified into six major lineages between 3 and 2.5 Ma. Otaria diverged first, followed by Phocarctos and then four major lineages within Arctocephalus. However, we found Zalophus to be non-monophyletic, with California (Z. californianus) and Steller sea lions (Eumetopias jubatus) grouping closer than the Galapagos sea lion (Z. wollebaeki), with evidence for introgression between the two genera. Overall, the high degree of genealogical discordance was best explained by incomplete lineage sorting resulting from quasi-simultaneous speciation within the southern clade, with introgression playing a subordinate role in explaining the incongruence among and within prior phylogenetic studies of the family.


2022 ◽  
Vol 12 ◽  
Author(s):  
Martha Kandziora ◽  
Petr Sklenář ◽  
Filip Kolář ◽  
Roswitha Schmickl

A major challenge in phylogenetics and phylogenomics is to resolve young, rapidly radiating groups. The fast succession of speciation events increases the probability of incomplete lineage sorting (ILS), and different topologies of the gene trees are expected, leading to gene tree discordance, i.e., not all gene trees represent the species tree. Phylogenetic discordance is common in phylogenomic datasets, and apart from ILS, additional sources include hybridization, whole-genome duplication, and methodological artifacts. Despite a high degree of gene tree discordance, species trees are often well supported and the sources of discordance are not further addressed in phylogenomic studies, which can eventually lead to incorrect phylogenetic hypotheses, especially in rapidly radiating groups. We chose the high-Andean Asteraceae genus Loricaria to shed light on the potential sources of phylogenetic discordance and generated a phylogenetic hypothesis. By accounting for paralogy during gene tree inference, we generated a species tree based on hundreds of nuclear loci, using Hyb-Seq, and a plastome phylogeny obtained from off-target reads during target enrichment. We observed a high degree of gene tree discordance, which we found implausible at first sight, because the genus did not show evidence of hybridization in previous studies. We used various phylogenomic analyses (trees and networks) as well as D-statistics to test for ILS and hybridization, which we developed into a workflow on how to tackle phylogenetic discordance in recent radiations. We found strong evidence for ILS and hybridization within the genus Loricaria. Low genetic differentiation was evident between species located in different Andean cordilleras, which could be indicative of substantial introgression between populations, promoted during Pleistocene glaciations, when alpine habitats shifted, creating opportunities for secondary contact and hybridization.


2020 ◽  
Author(s):  
Matthew H Van Dam ◽  
James B Henderson ◽  
Lauren Esposito ◽  
Michelle Trautwein

Abstract Ultraconserved genomic elements (UCEs) are generally treated as independent loci in phylogenetic analyses. The identification pipeline for UCE probes does not require prior knowledge of genetic identity, only selecting loci that are highly conserved, single copy, without repeats, and of a particular length. Here, we characterized UCEs from 11 phylogenomic studies across the animal tree of life, from birds to marine invertebrates. We found that within vertebrate lineages, UCEs are mostly intronic and intergenic, while in invertebrates, the majority are in exons. We then curated four different sets of UCE markers by genomic category from five different studies including: birds, mammals, fish, Hymenoptera (ants, wasps, and bees), and Coleoptera (beetles). Of genes captured by UCEs, we find that many are represented by two or more UCEs, corresponding to nonoverlapping segments of a single gene. We considered these UCEs to be nonindependent, merged all UCEs that belonged to a particular gene, constructed gene and species trees, and then evaluated the subsequent effect of merging cogenic UCEs on gene and species tree reconstruction. Average bootstrap support for merged UCE gene trees was significantly improved across all data sets apparently driven by the increase in loci length. Additionally, we conducted simulations and found that gene trees generated from merged UCEs were more accurate than those generated by unmerged UCEs. As loci length improves gene tree accuracy, this modest degree of UCE characterization and curation impacts downstream analyses and demonstrates the advantages of incorporating basic genomic characterizations into phylogenomic analyses. [Anchored hybrid enrichment; ants; ASTRAL; bait capture; carangimorph; Coleoptera; conserved nonexonic elements; exon capture; gene tree; Hymenoptera; mammal; phylogenomic markers; songbird; species tree; ultraconserved elements; weevils.]


2018 ◽  
Author(s):  
Stephen A. Smith ◽  
Nathanael Walker-Hale ◽  
Joseph F. Walker ◽  
Joseph W. Brown

Abstract Studies have demonstrated that pervasive gene tree conflict underlies several important phylogenetic relationships where different species tree methods produce conflicting results. Here, we present a means of dissecting the phylogenetic signal for alternative resolutions within a dataset in order to resolve recalcitrant relationships and, importantly, identify what the dataset is unable to resolve. These procedures extend upon methods for isolating conflict and concordance involving specific candidate relationships and can be used to identify systematic error and disambiguate sources of conflict among species tree inference methods. We demonstrate these on a large phylogenomic plant dataset. Our results support the placement of Amborella as sister to the remaining extant angiosperms, Gnetales as sister to pines, and the monophyly of extant gymnosperms. Several other contentious relationships, including the resolution of relationships within the bryophytes and the eudicots, remain uncertain given the low number of supporting gene trees. To address whether concatenation of filtered genes amplified phylogenetic signal for relationships, we implemented a combinatorial heuristic to test combinability of genes. We found that nested conflicts limited the ability of data filtering methods to fully ameliorate conflicting signal amongst gene trees. These analyses confirmed that the underlying conflicting signal does not support broad concatenation of genes. Our approach provides a means of dissecting a specific dataset to address deep phylogenetic relationships while also identifying the inferential boundaries of the dataset.


2020 ◽  
Author(s):  
Michael J. Sanderson ◽  
Michelle M. McMahon ◽  
Mike Steel

Abstract Terraces in phylogenetic tree space are sets of trees with identical optimality scores for a given data set, arising from missing data. These were first described for multilocus phylogenetic data sets in the context of maximum parsimony inference and maximum likelihood inference under certain model assumptions. Here we show how the mathematical properties that lead to terraces extend to gene tree - species tree problems in which the gene trees are incomplete. Inference of species trees from either sets of gene family trees subject to duplication and loss, or allele trees subject to incomplete lineage sorting, can exhibit terraces in their solution space. First, we show conditions that lead to a new kind of terrace, which stems from subtree operations that appear in reconciliation problems for incomplete trees. Then we characterize when terraces of both types can occur when the optimality criterion for tree search is based on duplication, loss or deep coalescence scores. Finally, we examine the impact of assumptions about the causes of losses: whether they are due to imperfect sampling or true evolutionary deletion.


2020 ◽  
Author(s):  
Ishrat Tanzila Farah ◽  
Md Muktadirul Islam ◽  
Kazi Tasnim Zinat ◽  
Atif Hasan Rahman ◽  
Md Shamsuzzoha Bayzid

Abstract Species tree estimation from multi-locus datasets is extremely challenging, especially in the presence of gene tree heterogeneity across the genome due to incomplete lineage sorting (ILS). Summary methods have been developed which estimate gene trees and then combine the gene trees to estimate a species tree by optimizing various optimization scores. In this study, we have formalized the concept of “phylogenomic terraces” in the species tree space, where multiple species trees with distinct topologies may have exactly the same optimization score (quartet score, extra lineage score, etc.) with respect to a collection of gene trees. We investigated the presence and implication of terraces in species tree estimation from multi-locus data by taking ILS into account. We analyzed two of the most popular ILS-aware optimization criteria: maximize quartet consistency (MQC) and minimize deep coalescence (MDC). Methods based on MQC are provably statistically consistent, whereas MDC is not a consistent criterion for species tree estimation. Our experiments, on a collection of datasets simulated under ILS, indicate that MDC-based methods may achieve competitive or identical quartet consistency scores as MQC but could be significantly worse than MQC in terms of tree accuracy – demonstrating the presence and effect of phylogenomic terraces. This is the first known study that formalizes the concept of phylogenomic terraces in the context of species tree estimation from multi-locus data, and reports the presence and implications of terraces in species tree estimation under ILS.
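
To make the quartet score concrete, here is a minimal brute-force sketch: for every set of four taxa, it checks whether a gene tree and a candidate species tree induce the same unrooted quartet topology, and sums the agreements over all gene trees. The nested-tuple tree encoding and the function names are assumptions made for this illustration, every gene tree is assumed to contain all taxa of the species tree, and the enumeration is only practical for small trees. It is not the implementation used in the study.

```python
from itertools import combinations


def clades(tree):
    """Return the leaf set below every node of a nested-tuple rooted tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def quartet_topology(tree_clades, quartet):
    """Return the split {{a, b}, {c, d}} induced on a quartet, or None if unresolved."""
    q = frozenset(quartet)
    for clade in tree_clades:
        side = clade & q
        # An edge resolves the quartet when it separates two taxa from the other two.
        if len(side) == 2:
            return frozenset([side, q - side])
    return None


def quartet_score(species_tree, gene_trees):
    """Count gene-tree quartets that agree with the species tree."""
    sp_clades = clades(species_tree)
    taxa = sorted(max(sp_clades, key=len))
    score = 0
    for gene_tree in gene_trees:
        g_clades = clades(gene_tree)
        for quartet in combinations(taxa, 4):
            topo = quartet_topology(g_clades, quartet)
            if topo is not None and topo == quartet_topology(sp_clades, quartet):
                score += 1
    return score


if __name__ == "__main__":
    genes = [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))]
    print(quartet_score((("A", "B"), ("C", "D")), genes))
    print(quartet_score((("A", "C"), ("B", "D")), genes))
```

In this toy input, the two candidate species trees have different topologies yet receive the same quartet score, which is the simplest form of the tie that the terrace concept describes.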

