Phylogenetic signal in phonotactics

Diachronica ◽  
2021 ◽  
Author(s):  
Jayden L. Macklin-Cordes ◽  
Claire Bowern ◽  
Erich R. Round

Abstract Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data – in this instance, statistical phonotactics. We extract phonotactic data from 112 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the viability of employing a new source of readily extractable data in historical and comparative linguistics.
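
The first two datasets described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy lexicon, and the choice to normalise transition counts by the first segment are all assumptions made for illustration, and words are assumed to be already segmented into lists of segments.

```python
from collections import Counter
from itertools import product


def biphone_counts(lexicon):
    """Count adjacent two-segment sequences (biphones) over a segmented lexicon."""
    counts = Counter()
    for word in lexicon:
        # zip(word, word[1:]) pairs each segment with the one that follows it
        counts.update(zip(word, word[1:]))
    return counts


def biphone_presence(counts, inventory):
    """Binary variables: 1 if a biphone occurs in the lexicon, else 0."""
    pairs = product(sorted(inventory), repeat=2)
    return {pair: int(counts[pair] > 0) for pair in pairs}


def transition_frequencies(counts):
    """Relative frequency of each following segment, given the first segment.

    Assumption: counts are normalised per first segment; the paper may
    define its frequencies differently.
    """
    totals = Counter()
    for (first, _), n in counts.items():
        totals[first] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}


# Hypothetical toy lexicon: each word is a list of segments.
toy_lexicon = [["k", "a", "t", "a"], ["t", "a", "k"]]
toy_counts = biphone_counts(toy_lexicon)
toy_presence = biphone_presence(toy_counts, {"a", "k", "t"})
toy_freqs = transition_frequencies(toy_counts)
```

Under the same assumptions, replacing each segment with a sound-class label before counting would give class-level transitions comparable in spirit to the third dataset.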

2014 ◽  
Vol 281 (1786) ◽  
pp. 20140479 ◽  
Author(s):  
Maximilian J. Telford ◽  
Christopher J. Lowe ◽  
Christopher B. Cameron ◽  
Olga Ortega-Martinez ◽  
Jochanan Aronowicz ◽  
...  

While some aspects of the phylogeny of the five living echinoderm classes are clear, the position of the ophiuroids (brittlestars) relative to asteroids (starfish), echinoids (sea urchins) and holothurians (sea cucumbers) is controversial. Ophiuroids have a pluteus-type larva in common with echinoids, giving some support to an ophiuroid/echinoid/holothurian clade named Cryptosyringida. Most molecular phylogenetic studies, however, support an ophiuroid/asteroid clade (Asterozoa), implying either convergent evolution of the pluteus or reversals to an auricularia-type larva in asteroids and holothurians. A recent study of 10 genes from four of the five echinoderm classes used ‘phylogenetic signal dissection’ to separate alignment positions into subsets of (i) suboptimal, heterogeneously evolving sites (invariant plus rapidly changing) and (ii) the remaining optimal, homogeneously evolving sites. Along with most previous molecular phylogenetic studies, their set of heterogeneous sites, expected to be more prone to systematic error, supports Asterozoa. The homogeneous sites, in contrast, support an ophiuroid/echinoid grouping, consistent with the cryptosyringid clade, leading them to posit homology of the ophiopluteus and echinopluteus. Our new dataset comprises 219 genes from all echinoderm classes; analyses using probabilistic Bayesian phylogenetic methods strongly support Asterozoa. The most reliable, slowly evolving quartile of genes also gives highest support for Asterozoa; this support diminishes in the second and third quartiles, and the fastest changing quartile places the ophiuroids close to the root. Using phylogenetic signal dissection, we find that heterogeneous sites support an unlikely grouping of Ophiuroidea + Holothuria, while homogeneous sites again strongly support Asterozoa. Our large and taxonomically complete dataset finds no support for the cryptosyringid hypothesis; in showing strong support for the Asterozoa, our preferred topology leaves the question of homology of pluteus larvae open.


2018 ◽  
Author(s):  
Stephen A. Smith ◽  
Nathanael Walker-Hale ◽  
Joseph F. Walker ◽  
Joseph W. Brown

Abstract Studies have demonstrated that pervasive gene tree conflict underlies several important phylogenetic relationships where different species tree methods produce conflicting results. Here, we present a means of dissecting the phylogenetic signal for alternative resolutions within a dataset in order to resolve recalcitrant relationships and, importantly, identify what the dataset is unable to resolve. These procedures extend methods for isolating conflict and concordance involving specific candidate relationships and can be used to identify systematic error and disambiguate sources of conflict among species tree inference methods. We demonstrate these on a large phylogenomic plant dataset. Our results support the placement of Amborella as sister to the remaining extant angiosperms, Gnetales as sister to pines, and the monophyly of extant gymnosperms. Several other contentious relationships, including the resolution of relationships within the bryophytes and the eudicots, remain uncertain given the low number of supporting gene trees. To address whether concatenation of filtered genes amplified phylogenetic signal for relationships, we implemented a combinatorial heuristic to test combinability of genes. We found that nested conflicts limited the ability of data filtering methods to fully ameliorate conflicting signal amongst gene trees. These analyses confirmed that the underlying conflicting signal does not support broad concatenation of genes. Our approach provides a means of dissecting a specific dataset to address deep phylogenetic relationships while also identifying the inferential boundaries of the dataset.


Author(s):  
Julieta Rodríguez ◽  
Rocío Deanna ◽  
Franco Chiarini

Abstract Within the cosmopolitan family Solanaceae, Physalideae is the tribe with the highest generic diversity (30 genera and more than 200 species). This tribe embraces subtribe Physalidinae, in which the positions of some genera are not entirely resolved. Chromosomes may help toward this goal by providing information on the processes underlying speciation. Thus, cytogenetic analyses were carried out in the subtribe in order to establish its chromosome number and morphology. Physalidinae is characterized by x = 12, and most species show a highly asymmetric karyotype. These karyotype traits were mapped onto a molecular phylogeny to test the congruence between karyotype evolution and clade differentiation. A diploid ancestor was reconstructed for the subtribe, and five to six independent polyploidy events were estimated, plus one aneuploidy event (x = 11 in the monotypic genus Quincula). Comparative phylogenetic methods showed that asymmetry indices and chromosome arm ratio (r) have a high phylogenetic signal, whereas the number of telocentric and submetacentric chromosomes presented a conspicuous amount of changes. Karyotype asymmetry allows us to differentiate genera within the subtribe. Overall, our study suggests that Physalidineae diversification has been accompanied by karyotype changes, which can be applied to delimit genera within the group.


2015 ◽  
Vol 282 (1815) ◽  
pp. 20151278 ◽  
Author(s):  
Kevin Zhou ◽  
Claire Bowern

Researchers have long been interested in the evolution of culture and the ways in which change in cultural systems can be reconstructed and tracked. Within the realm of language, these questions are increasingly investigated with Bayesian phylogenetic methods. However, such work in cultural phylogenetics could be improved by more explicit quantification of reconstruction and transition probabilities. We apply such methods to numerals in the languages of Australia. As a large phylogeny with almost universal ‘low-limit’ systems, Australian languages are ideal for investigating numeral change over time. We reconstruct the most likely extent of the system at the root and use that information to explore the ways numerals evolve. We show that these systems do not increment serially, but most commonly vary their upper limits between 3 and 5. While there is evidence for rapid system elaboration beyond the lower limits, languages lose numerals as well as gain them. We investigate the ways larger numerals build on smaller bases, and show that there is a general tendency to both gain and replace 4 by combining 2 + 2 (rather than inventing a new unanalysable word ‘four’). We develop a series of methods for quantifying and visualizing the results.


2017 ◽  
Author(s):  
Xiaofan Zhou ◽  
Xingxing Shen ◽  
Chris Todd Hittinger ◽  
Antonis Rokas

Abstract Phylogenetics has witnessed dramatic increases in the sizes of data matrices assembled to resolve branches of the tree of life, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these four programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets from diverse animal, plant, and fungal lineages with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the relative performance of the programs. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses.


2021 ◽  
Vol 8 ◽  
Author(s):  
Christian M. Ibáñez ◽  
Mariana Díaz-Santana-Iturrios ◽  
Sergio A. Carrasco ◽  
Fernando A. Fernández-Álvarez ◽  
David A. López-Córdova ◽  
...  

Fecundity is one of the major mechanisms shaping animal fitness dynamics. Fecundity as a trait does not evolve independently, but rather interacts with other traits such as body and egg size. Here, our aim was to correctly infer the macroevolutionary trade-offs between body length, egg length, and potential fecundity, using cephalopods as a study model. The correlated evolution among those traits was inferred by comparative phylogenetic methods. Literature data on biological and reproductive traits (body length, egg length, and potential fecundity) were obtained for 90 cephalopod species, and comparative phylogenetic methods based on a previous molecular phylogeny were used to test the correlated evolution hypothesis. Additionally, we estimated the phylogenetic signal and fitted five different evolutionary models to each trait. All traits showed high phylogenetic signal, and the selected model suggested an evolutionary trend toward increasing body length, egg length, and fecundity in relation to the ancestral state. Evidence of correlated evolution between body length and fecundity was observed, although this relationship was not detected between body length and egg length. The robust inverse relationship between fecundity and egg length indicates that cephalopods evolved under directional selection that favored either an increase in fecundity and a reduction in egg length in larger species, or an increase in egg length with a concomitant reduction in fecundity and body length in order to benefit offspring survival. The use of phylogenetic comparative methods allowed us to properly detect macroevolutionary trade-offs.


2008 ◽  
Vol 363 (1512) ◽  
pp. 4013-4021 ◽  
Author(s):  
Mark T Holder ◽  
Derrick J Zwickl ◽  
Christophe Dessimoz

Computer simulations provide a flexible method for assessing the power and robustness of phylogenetic inference methods. Unfortunately, simulated data are often obviously atypical of data encountered in studies of molecular evolution. Unrealistic simulations can lead to conclusions that are irrelevant to real-data analyses or can provide a biased view of which methods perform well. Here, we present a software tool designed to generate data under a complex codon model that allows each residue in the protein sequence to have a different set of equilibrium amino acid frequencies. The software can obtain maximum-likelihood estimates of the parameters of the Halpern and Bruno model from empirical data and a fixed tree; given an arbitrary tree and a fixed set of parameters, the software can then simulate artificial datasets. We present the results of a simulation experiment using randomly generated tree shapes and substitution parameters estimated from 1610 mammalian cytochrome b sequences. We tested tree inference at the amino acid, nucleotide and codon levels and under parsimony, maximum-likelihood, Bayesian and distance criteria (for a total of more than 650 analyses on each dataset). Based on these simulations, nucleotide-level analyses seem to be more accurate than amino acid and codon analyses. The performance of distance-based phylogenetic methods appears to be quite sensitive to the choice of model and the form of rate heterogeneity used. Further studies are needed to assess the generality of these conclusions. For example, fitting parameters of the Halpern and Bruno model to sequences from other genes will reveal the extent to which our conclusions were influenced by the choice of cytochrome b. Incorporating codon bias and more sources of heterogeneity into the simulator will be crucial to determining whether the current results are caused by a bias in the current simulation study in favour of nucleotide analyses.


2010 ◽  
Vol 365 (1559) ◽  
pp. 3845-3854 ◽  
Author(s):  
Claire Bowern

This paper presents an overview of the current state of historical linguistics in Australian languages. Australian languages have been important in theoretical debates about the nature of language change and the possibilities for reconstruction and classification in areas of intensive diffusion. I summarize the most important outstanding questions for Australian linguistic prehistory and present a case study of the Karnic subgroup of Pama–Nyungan, which illustrates the problems for classification in Australian languages and potential approaches using phylogenetic methods.


2019 ◽  
Vol 36 (10) ◽  
pp. 2111-2126 ◽  
Author(s):  
Gang Li ◽  
Henrique V Figueiró ◽  
Eduardo Eizirik ◽  
William J Murphy

Abstract Current phylogenomic approaches implicitly assume that the predominant phylogenetic signal within a genome reflects the true evolutionary history of organisms, without assessing the confounding effects of postspeciation gene flow that can produce a mosaic of phylogenetic signals that interact with recombinational variation. Here, we tested the validity of this assumption with a phylogenomic analysis of 27 species of the cat family, assessing local effects of recombination rate on species tree inference and divergence time estimation across their genomes. We found that the prevailing phylogenetic signal within the autosomes is not always representative of the most probable speciation history, due to ancient hybridization throughout felid evolution. Instead, phylogenetic signal was concentrated within regions of low recombination, and notably enriched within large X chromosome recombination cold spots that exhibited recurrent patterns of strong genetic differentiation and selective sweeps across mammalian orders. By contrast, regions of high recombination were enriched for signatures of ancient gene flow, and these sequences inflated crown-lineage divergence times by ∼40%. We conclude that existing phylogenomic approaches to infer the Tree of Life may be highly misleading without considering the genomic architecture of phylogenetic signal relative to recombination rate and its interplay with historical hybridization.


2011 ◽  
Vol 279 (1729) ◽  
pp. 715-721 ◽  
Author(s):  
Kari L. Allen ◽  
Richard F. Kay

The high energetic costs of building and maintaining large brains are thought to constrain encephalization. The ‘expensive-tissue hypothesis’ (ETH) proposes that primates (especially humans) overcame this constraint through reduction of another metabolically expensive tissue, the gastrointestinal tract. Small guts characterize animals specializing on easily digestible diets. Thus, the hypothesis may be tested via the relationship between brain size and diet quality. Platyrrhine primates present an interesting test case, as they are more variably encephalized than other extant primate clades (excluding Hominoidea). We find a high degree of phylogenetic signal in the data for diet quality, endocranial volume and body size. Controlling for phylogenetic effects, we find no significant correlation between relative diet quality and relative endocranial volume. Thus, diet quality fails to account for differences in platyrrhine encephalization. One taxon in particular, Brachyteles, violates predictions made by ETH in having a large brain and low-quality diet. Dietary reconstructions of stem platyrrhines further indicate that a relatively high-quality diet was probably in place prior to increases in encephalization. Therefore, it is unlikely that a shift in diet quality was a primary constraint release for encephalization in platyrrhines and, by extrapolation, humans.

