Robust methods for detecting convergent shifts in evolutionary rates

2018 ◽  
Author(s):  
Raghavendran Partha ◽  
Amanda Kowalczyk ◽  
Nathan Clark ◽  
Maria Chikina

Abstract Identifying genomic elements underlying phenotypic adaptations is an important problem in evolutionary biology. Comparative analyses learning from convergent evolution of traits are gaining momentum in accurately detecting such elements. We previously developed a method for predicting phenotypic associations of genetic elements by contrasting patterns of sequence evolution in species showing a phenotype with those that do not. Using this method, we successfully demonstrated convergent evolutionary rate shifts in genetic elements associated with two phenotypic adaptations, namely the independent subterranean and marine transitions of terrestrial mammalian lineages. Our method calculates gene-specific rates of evolution on branches of phylogenetic trees using linear regression. These rates represent the extent of sequence divergence on a branch after removing the expected divergence on the branch due to background factors. The rates calculated using this regression analysis exhibit an important statistical limitation, namely heteroscedasticity. We observe that the rates on branches that are longer on average show higher variance, and describe how this problem adversely affects the confidence with which we can make inferences about rate shifts. Using a combination of data transformation and weighted regression, we have developed an updated method that corrects this heteroscedasticity in the rates. We additionally illustrate the improved performance offered by the updated method at robust detection of convergent rate shifts in phylogenetic trees of protein-coding genes across mammals, as well as using simulated tree datasets. Overall, we present an important extension to our evolutionary-rates-based method that performs more robustly and consistently at detecting convergent shifts in evolutionary rates.

2019 ◽  
Vol 36 (8) ◽  
pp. 1817-1830 ◽  
Author(s):  
Raghavendran Partha ◽  
Amanda Kowalczyk ◽  
Nathan L Clark ◽  
Maria Chikina

Abstract Identifying genomic elements underlying phenotypic adaptations is an important problem in evolutionary biology. Comparative analyses learning from convergent evolution of traits are gaining momentum in accurately detecting such elements. We previously developed a method for predicting phenotypic associations of genetic elements by contrasting patterns of sequence evolution in species showing a phenotype with those that do not. Using this method, we successfully demonstrated convergent evolutionary rate shifts in genetic elements associated with two phenotypic adaptations, namely the independent subterranean and marine transitions of terrestrial mammalian lineages. Our original method calculates gene-specific rates of evolution on branches of phylogenetic trees using linear regression. These rates represent the extent of sequence divergence on a branch after removing the expected divergence on the branch due to background factors. The rates calculated using this regression analysis exhibit an important statistical limitation, namely heteroscedasticity. We observe that the rates on branches that are longer on average show higher variance, and describe how this problem adversely affects the confidence with which we can make inferences about rate shifts. Using a combination of data transformation and weighted regression, we have developed an updated method that corrects this heteroscedasticity in the rates. We additionally illustrate the improved performance offered by the updated method at robust detection of convergent rate shifts in phylogenetic trees of protein-coding genes across mammals, as well as using simulated tree data sets. Overall, we present an important extension to our evolutionary-rates-based method that performs more robustly and consistently at detecting convergent shifts in evolutionary rates.
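
The combination of data transformation and weighted regression described in this abstract can be illustrated with a minimal sketch. The abstract does not state which transform or weights the authors use, so the square-root transform, the inverse-branch-length weights, the function names and the branch-length values below are all assumptions chosen for illustration, not the published method. The residuals play the role of gene-specific relative rates: what remains of a gene's branch lengths after removing the divergence expected from the average tree.

```python
import math


def weighted_residuals(x, y, weights):
    """Fit y = a + b * x by weighted least squares and return the residuals."""
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def relative_rates(avg_lengths, gene_lengths):
    """Residual rates for one gene tree against the average tree."""
    # Assumption: a square-root transform to dampen the growth of variance
    # with branch length.
    x = [math.sqrt(v) for v in avg_lengths]
    y = [math.sqrt(v) for v in gene_lengths]
    # Assumption: weights inversely proportional to average branch length,
    # so that long, high-variance branches influence the fit less.
    w = [1.0 / v for v in avg_lengths]
    return weighted_residuals(x, y, w)


# Hypothetical branch lengths (substitutions per site) for five branches.
avg = [0.02, 0.05, 0.10, 0.30, 0.60]
gene = [0.03, 0.04, 0.12, 0.25, 0.70]
rates = relative_rates(avg, gene)
```

A positive residual marks a branch on which the gene evolved faster than expected from the average tree, and a negative one marks a slower branch. In a real analysis the weights would be estimated from the observed mean-variance relationship rather than fixed in advance.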


Genetics ◽  
1996 ◽  
Vol 144 (1) ◽  
pp. 427-437 ◽  
Author(s):  
C William Birky

Abstract Little attention has been paid to the consequences of long-term asexual reproduction for sequence evolution in diploid or polyploid eukaryotic organisms. Some elementary theory shows that the amount of neutral sequence divergence between two alleles of a protein-coding gene in an asexual individual will be greater than that in a sexual species by a factor of 2tu, where t is the number of generations since sexual reproduction was lost and u is the mutation rate per generation in the asexual lineage. Phylogenetic trees based on only one allele from each of two or more species will show incorrect divergence times and, more often than not, incorrect topologies. This allele sequence divergence can be stopped temporarily by mitotic gene conversion, mitotic crossing-over, or ploidy reduction. If these convergence events are rare, ancient asexual lineages can be recognized by their high allele sequence divergence. At intermediate frequencies of convergence events, it will be impossible to reconstruct the correct phylogeny of an asexual clade from the sequences of protein coding genes. Convergence may be limited by allele sequence divergence and heterozygous chromosomal rearrangements which reduce the homology needed for recombination and result in aneuploidy after crossing-over or ploidy cycles.


2014 ◽  
Vol 369 (1649) ◽  
pp. 20130252 ◽  
Author(s):  
William Pitchers ◽  
Jason B. Wolf ◽  
Tom Tregenza ◽  
John Hunt ◽  
Ian Dworkin

A fundamental question in evolutionary biology is the relative importance of selection and genetic architecture in determining evolutionary rates. Adaptive evolution can be described by the multivariate breeders' equation (Δz = Gβ), which predicts evolutionary change for a suite of phenotypic traits (Δz) as a product of directional selection acting on them (β) and the genetic variance–covariance matrix for those traits (G). Despite being empirically challenging to estimate, there are enough published estimates of G and β to allow for synthesis of general patterns across species. We use published estimates to test the hypotheses that there are systematic differences in the rate of evolution among trait types, and that these differences are, in part, due to genetic architecture. We find some evidence that sexually selected traits exhibit faster rates of evolution compared with life-history or morphological traits. This difference does not appear to be related to stronger selection on sexually selected traits. Using numerous proposed approaches to quantifying the shape, size and structure of G, we examine how these parameters relate to one another, and how they vary among taxonomic and trait groupings. Despite considerable variation, they do not explain the observed differences in evolutionary rates.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jaime Iranzo ◽  
Yuri I. Wolf ◽  
Eugene V. Koonin ◽  
Itamar Sela

Abstract Bacterial and archaeal evolution involve extensive gene gain and loss. Thus, phylogenetic trees of prokaryotes can be constructed both by traditional sequence-based methods (gene trees) and by comparison of gene compositions (genome trees). Comparing the branch lengths in gene and genome trees with identical topologies for 34 clusters of closely related bacterial and archaeal genomes, we show here that terminal branches of gene trees are systematically compressed compared to those of genome trees. Thus, sequence evolution is delayed compared to genome evolution by gene gain and loss. The extent of this delay differs widely among bacteria and archaea. Mathematical modeling shows that the divergence delay can result from sequence homogenization by homologous recombination. The model explains how homologous recombination maintains the cohesiveness of the core genome of a species while allowing extensive gene gain and loss within the accessory genome. Once evolving genomes become isolated by barriers impeding homologous recombination, gene and genome evolution processes settle into parallel trajectories, and genomes diverge, resulting in speciation.


2019 ◽  
Author(s):  
Jaime Iranzo ◽  
Yuri I. Wolf ◽  
Eugene V. Koonin ◽  
Itamar Sela

Abstract Evolution of bacterial and archaeal genomes is a highly dynamic process that involves extensive gain and loss of genes. Therefore, phylogenetic trees of prokaryotes can be constructed both by the traditional sequence-based methods (gene trees) and by comparison of gene compositions (genome trees). Comparing the branch lengths in gene and genome trees with identical topologies for 34 clusters of closely related bacterial and archaeal genomes, we found that the terminal branches of gene trees were systematically compressed compared to those of genome trees. Thus, sequence evolution seems to be significantly delayed with respect to genome evolution by gene gain and loss. The extent of this delay widely differs among bacterial and archaeal lineages. We develop and explore mathematical models demonstrating that the delay of sequence divergence can be explained by sequence homogenization that is caused by homologous recombination. Once evolving genomes become isolated by barriers that impede homologous recombination, gene and genome evolution processes settle into parallel trajectories, and genomes diverge, resulting in speciation. This model of prokaryotic genome evolution gives a mechanistic explanation of our previous finding that archaeal genomes contain a class of genes that turn over rapidly, before significant sequence divergence occurs, and provides a framework for correcting phylogenetic trees, to make them consistent with the dynamics of gene turnover.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5498 ◽  
Author(s):  
Jadranka Rota ◽  
Tobias Malm ◽  
Nicolas Chazot ◽  
Carlos Peña ◽  
Niklas Wahlberg

Background Multiple studies have demonstrated that partitioning of molecular datasets is important in model-based phylogenetic analyses. Commonly, partitioning is done a priori based on some known properties of sequence evolution, e.g. differences in rate of evolution among codon positions of a protein-coding gene. Here we propose a new method for data partitioning based on relative evolutionary rates of the sites in the alignment of the dataset being analysed. The rates are inferred using the previously published Tree Independent Generation of Evolutionary Rates (TIGER), and the partitioning is conducted using our novel python script RatePartitions. We conducted simulations to assess the performance of our new method, and we applied it to eight published multi-locus phylogenetic datasets, representing different taxonomic ranks within the insect order Lepidoptera (butterflies and moths) and one phylogenomic dataset, which included ultra-conserved elements as well as introns. Methods We used TIGER-rates to generate relative evolutionary rates for all sites in the alignments. Then, using RatePartitions, we partitioned the data into partitions based on their relative evolutionary rate. RatePartitions applies a simple formula that ensures a distribution of sites into partitions following the distribution of rates of the characters from the full dataset. This ensures that the invariable sites are placed in a partition with slowly evolving sites, avoiding the pitfalls of previously used methods, such as k-means. Different partitioning strategies were evaluated using BIC scores as calculated by PartitionFinder. Results Simulations did not highlight any misbehaviour of our partitioning approach, even under difficult parameter conditions or missing data. 
In all eight phylogenetic datasets, partitioning using TIGER-rates and RatePartitions was significantly better as measured by the BIC scores than other partitioning strategies, such as the commonly used partitioning by gene and codon position. We compared the resulting topologies and node support for these eight datasets as well as for the phylogenomic dataset. Discussion We developed a new method of partitioning phylogenetic datasets without using any prior knowledge (e.g. DNA sequence evolution). This method is entirely based on the properties of the data being analysed and can be applied to DNA sequences (protein-coding, introns, ultra-conserved elements), protein sequences, as well as morphological characters. A likely explanation for why our method performs better than other tested partitioning strategies is that it accounts for the heterogeneity in the data to a much greater extent than when data are simply subdivided based on prior knowledge.
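
The abstract compares partitioning strategies by BIC score. As a rough sketch of what such a comparison involves, the standard definition BIC = k ln(n) - 2 ln(L) can be applied to each strategy and the lowest score preferred. The strategy names, log-likelihoods, parameter counts and alignment length below are invented for illustration and are not results from the study, and the sketch does not reproduce how PartitionFinder fits models or counts parameters.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


# Hypothetical (log-likelihood, number of free parameters) per strategy.
strategies = {
    "by_gene_and_codon": (-25500.0, 90),
    "by_rate_bins": (-25200.0, 60),
}
n_sites = 6000  # hypothetical alignment length

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in strategies.items()}
best = min(scores, key=scores.get)
```

With these made-up numbers the rate-based scheme scores lower because it fits better while using fewer parameters. BIC penalises each extra parameter by ln(n), so a scheme with many partitions must improve the likelihood substantially to be preferred.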


2014 ◽  
Author(s):  
William Pitchers ◽  
Jason B. Wolf ◽  
Tom Tregenza ◽  
John Hunt ◽  
Ian Dworkin

A fundamental question in evolutionary biology is the relative importance of selection and genetic architecture in determining evolutionary rates. Adaptive evolution can be described by the multivariate breeders' equation (Δz = Gβ), which predicts evolutionary change for a suite of phenotypic traits (Δz) as a product of directional selection acting on them (β) and the genetic variance-covariance matrix for those traits (G). Despite being empirically challenging to estimate, there are enough published estimates of G and β to allow for synthesis of general patterns across species. We use published estimates to test the hypotheses that there are systematic differences in the rate of evolution among trait types, and that these differences are in part due to genetic architecture. We find evidence that sexually selected traits exhibit faster rates of evolution compared to life-history or morphological traits. This difference does not appear to be related to stronger selection on sexually selected traits. Using numerous proposed approaches to quantifying the shape, size and structure of G, we examine how these parameters relate to one another, and how they vary among taxonomic and trait groupings. Despite considerable variation, they do not explain the observed differences in evolutionary rates.
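
A small worked example may help make the multivariate breeders' equation Δz = Gβ concrete. The two-trait G matrix and selection gradient below are hypothetical values chosen only to show how genetic covariance changes the predicted response. They are not estimates from the study.

```python
def predicted_response(G, beta):
    """Multivariate breeders' equation: delta_z = G @ beta (matrix-vector product)."""
    return [sum(g * b for g, b in zip(row, beta)) for row in G]


# Hypothetical genetic variance-covariance matrix for two traits.
G = [
    [1.0, 0.5],
    [0.5, 2.0],
]
# Hypothetical directional selection gradients: trait 1 favoured, trait 2 disfavoured.
beta = [0.2, -0.1]

delta_z = predicted_response(G, beta)
# Trait 1: 1.0 * 0.2 + 0.5 * (-0.1), approximately 0.15
# Trait 2: 0.5 * 0.2 + 2.0 * (-0.1), approximately -0.10
```

Without the covariance term, trait 1 would respond by 0.2. The positive genetic covariance with a trait under opposing selection reduces that to about 0.15, which shows how genetic architecture, and not selection alone, shapes the predicted rate of evolution.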


2017 ◽  
Author(s):  
Jadranka Rota ◽  
Tobias Malm ◽  
Niklas Wahlberg

Background. Multiple studies have demonstrated that partitioning of molecular datasets is important in model-based phylogenetic analyses. Commonly, partitioning is done a priori based on some known properties of sequence evolution, e.g. differences in rate of evolution among codon positions of a protein-coding gene. Here we propose a new method for data partitioning based on relative evolutionary rates of the sites in the alignment of the dataset being analysed. The rates are inferred using the previously published Tree Independent Generation of Evolutionary Rates (TIGER), and the partitioning is conducted using our novel python script RatePartitions. We applied this method to eight published multi-locus phylogenetic datasets, representing different taxonomic ranks within the insect order Lepidoptera (butterflies and moths). Methods. We used TIGER to generate relative evolutionary rates for all sites in the alignments. Then, using RatePartitions, we partitioned the data into bins based on their relative evolutionary rate. RatePartitions applies a simple formula that ensures a distribution of sites into partitions following the distribution of rates of the characters from the full dataset. This ensures that the invariable sites are placed in a partition with slowly evolving sites, avoiding the pitfalls of previously used methods, such as k-means. Different partitioning strategies were evaluated using BIC scores as calculated by PartitionFinder. Results. In all eight datasets, partitioning using TIGER and RatePartitions was significantly better as measured by the BIC scores than other partitioning strategies, such as the commonly used partitioning by gene and codon position. Discussion. We developed a new method of partitioning phylogenetic datasets without using any prior knowledge (e.g. DNA sequence evolution). 
This method is entirely based on the properties of the data being analysed and can be applied to DNA sequences (protein-coding, introns, ultra-conserved elements), protein sequences, as well as morphological characters. A likely explanation for why our method performs better than other tested partitioning strategies is that it accounts for the heterogeneity in the data to a much greater extent than when data are simply subdivided based on prior knowledge.


2012 ◽  
Vol 39 (2) ◽  
pp. 217-233 ◽  
Author(s):  
J. David Archibald

Studies of the origin and diversification of major groups of plants and animals are contentious topics in current evolutionary biology. This includes the study of the timing and relationships of the two major clades of extant mammals – marsupials and placentals. Molecular studies concerned with marsupial and placental origin and diversification can be at odds with the fossil record. Such studies are, however, not a recent phenomenon. Over 150 years ago Charles Darwin weighed two alternative views on the origin of marsupials and placentals. Less than a year after the publication of On the origin of species, Darwin outlined these in a letter to Charles Lyell dated 23 September 1860. The letter concluded with two competing phylogenetic diagrams. One showed marsupials as ancestral to both living marsupials and placentals, whereas the other showed a non-marsupial, non-placental as being ancestral to both living marsupials and placentals. These two diagrams are published here for the first time. These are the only such competing phylogenetic diagrams that Darwin is known to have produced. In addition to examining the question of mammalian origins in this letter and in other manuscript notes discussed here, Darwin confronted the broader issue as to whether major groups of animals had a single origin (monophyly) or were the result of “continuous creation” as advocated for some groups by Richard Owen. Charles Lyell had held similar views to those of Owen, but it is clear from correspondence with Darwin that he was beginning to accept the idea of monophyly of major groups.

