A simple method for data partitioning based on relative evolutionary rates

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5498 ◽  
Author(s):  
Jadranka Rota ◽  
Tobias Malm ◽  
Nicolas Chazot ◽  
Carlos Peña ◽  
Niklas Wahlberg

Background. Multiple studies have demonstrated that partitioning of molecular datasets is important in model-based phylogenetic analyses. Commonly, partitioning is done a priori based on some known properties of sequence evolution, e.g. differences in rate of evolution among codon positions of a protein-coding gene. Here we propose a new method for data partitioning based on relative evolutionary rates of the sites in the alignment of the dataset being analysed. The rates are inferred using the previously published Tree Independent Generation of Evolutionary Rates (TIGER), and the partitioning is conducted using our novel Python script RatePartitions. We conducted simulations to assess the performance of our new method, and we applied it to eight published multi-locus phylogenetic datasets, representing different taxonomic ranks within the insect order Lepidoptera (butterflies and moths), and to one phylogenomic dataset, which included ultra-conserved elements as well as introns.
Methods. We used TIGER-rates to generate relative evolutionary rates for all sites in the alignments. Then, using RatePartitions, we divided the sites into partitions based on their relative evolutionary rates. RatePartitions applies a simple formula that ensures a distribution of sites into partitions following the distribution of rates of the characters from the full dataset. This ensures that the invariable sites are placed in a partition with slowly evolving sites, avoiding the pitfalls of previously used methods, such as k-means. Different partitioning strategies were evaluated using BIC scores as calculated by PartitionFinder.
Results. Simulations did not highlight any misbehaviour of our partitioning approach, even under difficult parameter conditions or with missing data. In all eight phylogenetic datasets, partitioning using TIGER-rates and RatePartitions was significantly better as measured by the BIC scores than other partitioning strategies, such as the commonly used partitioning by gene and codon position. We compared the resulting topologies and node support for these eight datasets as well as for the phylogenomic dataset.
Discussion. We developed a new method of partitioning phylogenetic datasets without using any prior knowledge (e.g. of DNA sequence evolution). This method is entirely based on the properties of the data being analysed and can be applied to DNA sequences (protein-coding, introns, ultra-conserved elements), protein sequences, as well as morphological characters. A likely explanation for why our method performs better than other tested partitioning strategies is that it accounts for the heterogeneity in the data to a much greater extent than when data are simply subdivided based on prior knowledge.
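
The idea of binning sites by relative rate, with the slowest and invariable sites ending up together in the final partition, can be illustrated with a minimal Python sketch. This is not the RatePartitions script and does not reproduce its published formula: the function name `partition_by_rate`, the parameters `divisor`, `step` and `tail_fraction`, the cutoff rule, and the convention that a higher value means a faster site are all assumptions made for illustration.

```python
from typing import List


def partition_by_rate(rates: List[float], divisor: float = 2.5,
                      step: float = 0.3,
                      tail_fraction: float = 0.10) -> List[List[int]]:
    """Group site indices into bins ordered from fastest to slowest.

    Hypothetical rule (an assumption, not the published formula): each bin
    takes the sites whose rate lies above a cutoff placed a shrinking
    fraction of the way from the current top rate down to the minimum rate.
    Assumes higher value = faster site; flip the input if your rate tool
    uses the opposite convention.
    """
    if divisor <= 1:
        raise ValueError("divisor must be greater than 1")
    n = len(rates)
    if n == 0:
        return []
    lowest = min(rates)
    # Site indices sorted from fastest to slowest.
    remaining = sorted(range(n), key=lambda i: rates[i], reverse=True)
    bins: List[List[int]] = []
    while remaining:
        upper = rates[remaining[0]]
        # Close with one final bin when few sites are left or only
        # minimum-rate (e.g. invariable) sites remain.
        if len(remaining) <= tail_fraction * n or upper == lowest:
            bins.append(remaining)
            break
        cutoff = upper - (upper - lowest) / (divisor + len(bins) * step)
        current = [i for i in remaining if rates[i] > cutoff]
        if not current:  # guard against floating-point rounding
            bins.append(remaining)
            break
        bins.append(current)
        remaining = remaining[len(current):]  # current is a prefix
    return bins


# Made-up rates for ten sites; three of them are invariable (rate 0.0).
example_rates = [0.9, 0.8, 0.1, 0.0, 0.0, 0.5, 1.0, 0.0, 0.2, 0.05]
print(partition_by_rate(example_rates, tail_fraction=0.4))
```

With a larger `tail_fraction`, the slowest variable sites are merged with the invariable ones in the last bin, which mirrors the behaviour the abstract describes; the specific thresholds here are arbitrary.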

2017 ◽  
Author(s):  
Jadranka Rota ◽  
Tobias Malm ◽  
Niklas Wahlberg

Background. Multiple studies have demonstrated that partitioning of molecular datasets is important in model-based phylogenetic analyses. Commonly, partitioning is done a priori based on some known properties of sequence evolution, e.g. differences in rate of evolution among codon positions of a protein-coding gene. Here we propose a new method for data partitioning based on relative evolutionary rates of the sites in the alignment of the dataset being analysed. The rates are inferred using the previously published Tree Independent Generation of Evolutionary Rates (TIGER), and the partitioning is conducted using our novel Python script RatePartitions. We applied this method to eight published multi-locus phylogenetic datasets, representing different taxonomic ranks within the insect order Lepidoptera (butterflies and moths).
Methods. We used TIGER to generate relative evolutionary rates for all sites in the alignments. Then, using RatePartitions, we divided the sites into bins based on their relative evolutionary rates. RatePartitions applies a simple formula that ensures a distribution of sites into partitions following the distribution of rates of the characters from the full dataset. This ensures that the invariable sites are placed in a partition with slowly evolving sites, avoiding the pitfalls of previously used methods, such as k-means. Different partitioning strategies were evaluated using BIC scores as calculated by PartitionFinder.
Results. In all eight datasets, partitioning using TIGER and RatePartitions was significantly better as measured by the BIC scores than other partitioning strategies, such as the commonly used partitioning by gene and codon position.
Discussion. We developed a new method of partitioning phylogenetic datasets without using any prior knowledge (e.g. of DNA sequence evolution). This method is entirely based on the properties of the data being analysed and can be applied to DNA sequences (protein-coding, introns, ultra-conserved elements), protein sequences, as well as morphological characters. A likely explanation for why our method performs better than other tested partitioning strategies is that it accounts for the heterogeneity in the data to a much greater extent than when data are simply subdivided based on prior knowledge.
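
Comparing partitioning strategies by BIC can be sketched as follows, using the standard definition BIC = -2 ln L + k ln n, where lower is better. This is only an illustration of the criterion, not PartitionFinder's own code; the function names `bic` and `rank_schemes` are hypothetical, and taking the sample size n to be the alignment length is an assumption.

```python
import math
from typing import Dict, List, Tuple


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: -2 ln L + k ln n (lower is better).

    n_sites is used as the sample size n; treating the alignment length
    as n is an assumption of this sketch.
    """
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def rank_schemes(schemes: Dict[str, Tuple[float, int]],
                 n_sites: int) -> List[Tuple[str, float]]:
    """Rank partitioning schemes by BIC, best (lowest) first.

    schemes maps a scheme name to (log-likelihood, number of free
    parameters) as reported by whatever software fitted the models.
    """
    scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in schemes.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

A scheme with more partitions has more free parameters, so it only wins under BIC if the gain in likelihood outweighs the `k ln n` penalty.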


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10364
Author(s):  
Natalia I. Abramson ◽  
Fedor N. Golenishchev ◽  
Semen Yu. Bodrov ◽  
Olga V. Bondareva ◽  
Evgeny A. Genelt-Yanovskiy ◽  
...  

In this article, we present the nearly complete mitochondrial genome of the Subalpine Kashmir vole Hyperacrius fertilis (Arvicolinae, Cricetidae, Rodentia), assembled using data from Illumina next-generation sequencing (NGS) of the DNA from a century-old museum specimen. The de novo assembly comprised 16,341 bp and included all mitogenome protein-coding genes as well as 12S and 16S RNAs, tRNAs and the D-loop. Using the alignment of protein-coding genes of 14 previously published Arvicolini tribe mitogenomes, seven Clethrionomyini mitogenomes, and Ondatra and Dicrostonyx as outgroups, we conducted phylogenetic reconstructions based on a dataset of 13 protein-coding genes (PCGs) under maximum likelihood and Bayesian inference. Phylogenetic analyses robustly supported the position of this species within the tribe Arvicolini. Among the Arvicolini, Hyperacrius represents one of the early-diverged lineages. This result altered the conventional view of the phylogenetic relatedness between Hyperacrius and Alticola and prompted a revision of the morphological characters underlying the former assumption. The morphological analysis performed here confirmed the molecular data and provided additional evidence for transferring the genus Hyperacrius from the tribe Clethrionomyini to the tribe Arvicolini.


2021 ◽  
Author(s):  
Phougeishangbam Rolish Singh ◽  
Bart van de Vossenberg ◽  
Katarzyna Rybarczyk-Mydłowska ◽  
Magdalena Kowalewska-Groszkowska ◽  
Wim Bert ◽  
...  

Rotylenchus is a widely distributed, economically important plant-parasitic nematode group whose species-level identification relies largely on limited morphological characters, including character-based tabular keys, and on molecular data from ribosomal and mitochondrial genes. In this study, a combined morphological and molecular analysis of three populations of R. goodeyi from Belgium, Poland and the Netherlands revealed important character variations of this species, leading to synonymisation of R. rhomboides with R. goodeyi, and a high nucleotide variation within cox1 gene sequences in these populations. Additional Illumina sequencing of DNA from individuals of the Dutch population revealed two mitogenome variants, each approximately 23 kb in size, differing by about 9% and containing eleven protein-coding genes, two ribosomal RNA genes and up to 29 transfer RNA genes. In addition to the first representative whole genome shotgun sequence datasets of the genus Rotylenchus, this study also provides the full-length mitogenome and the ribosomal DNA sequences of R. goodeyi.


Phytotaxa ◽  
2019 ◽  
Vol 427 (1) ◽  
pp. 31-42
Author(s):  
LEI SHU ◽  
RUI-LIANG ZHU

Based on molecular phylogenetic analyses and morphological characters, a new species from Bangladesh, northern Vietnam, and southwestern China, Leptolejeunea nigra, is described. It is most similar to L. balansae but remarkable for having brownish black ocelli in its leaf lobes. In the molecular phylogeny, the samples of L. nigra are not nested within any clade and form an independent lineage. In particular, the molecular dating suggested that the divergence of L. nigra happened within the time span of the formation of the Himalayas.


2015 ◽  
Vol 46 (3) ◽  
pp. 269-290 ◽  
Author(s):  
Ian J. Kitching ◽  
C. Lorna Culverwell ◽  
Ralph E. Harbach

Lutzia Theobald was reduced to a subgenus of Culex in 1932 and was treated as such until it was restored to its original generic status in 2003, based mainly on modifications of the larvae for predation. Previous phylogenetic studies based on morphological and molecular data have provided conflicting support for the generic status of Lutzia: analyses of morphological data support the generic status whereas analyses based on DNA sequences do not. Our previous phylogenetic analyses of Culicini (based on 169 morphological characters and 86 species representing the four genera and 26 subgenera of Culicini, most informal group taxa of subgenus Culex and five outgroup species from other tribes) seemed to indicate a conflict between adult and larval morphological data. Hence, we conducted a series of comparative and data exclusion analyses to determine whether the alternative positions of Lutzia are due to conflicting signal or to a lack of strong signal. We found that separate and combined analyses of adult and larval data support different patterns of relationships between Lutzia and other Culicini. However, the majority of conflicting clades are poorly supported and once these are removed from consideration, most of the topological disparity disappears, along with much of the resolution, suggesting that morphology alone does not have sufficiently strong signal to resolve the position of Lutzia. We critically examine the results of other phylogenetic studies of culicinine relationships and conclude that no morphological or molecular data set analysed in any study conducted to date has adequate signal to place Lutzia unequivocally with regard to other taxa in Culicini. Phylogenetic relationships observed thus far suggest that Lutzia is placed within Culex but further data and extended taxon sampling are required to confirm its position relative to Culex.


2014 ◽  
Vol 62 (8) ◽  
pp. 638 ◽  
Author(s):  
Farrokh Ghahremaninejad ◽  
Mehrshid Riahi ◽  
Melina Babaei ◽  
Faride Attar ◽  
Lütfi Behçet ◽  
...  

Verbascum is one of the main genera of Scrophulariaceae, but delimitation and phylogenetic relationships of this genus are unclear and have not yet been studied using DNA sequences. Here, using four selected molecular markers (nrDNA ITS and the plastid spacers trnS/G, psbA-trnH and trnY/T), we present a phylogeny of Verbascum and test previous infrageneric taxonomic hypotheses as well as its monophyly with respect to Scrophularia. We additionally discuss morphological variation and the utility of morphological characters as predictors of phylogenetic relationships. Our results show that while molecular data unambiguously support the circumscription of Verbascum inferred from morphology, they prove to be of limited utility in resolving infrageneric relationships, suggesting that Verbascum's high species diversity is due to rapid and recent radiation. Our work provides phylogenetic estimation of the genus Verbascum using molecular data and can serve as a starting point for future investigations of Verbascum and relatives.


Phytotaxa ◽  
2017 ◽  
Vol 302 (2) ◽  
pp. 101 ◽  
Author(s):  
FABIO RENATO BORGES ◽  
ORLANDO NECCHI JR

South American studies on the genus Chara are relatively scarce, most consisting of floristic surveys and using only traditional morphological characters. This study is a first approach to the systematics of the genus Chara applying modern techniques (DNA sequences and oospore SEM analyses) in addition to the alpha-taxonomy investigations that have been conducted in Brazil. Twelve populations of Chara were analyzed from the midwest and southeast regions of Brazil. Sequences of three molecular markers were used to infer phylogenies. The ultrastructure of the oospore wall and currently used morphological characters were analyzed for Chara populations. Maximum likelihood and Bayesian analyses of sequences of rbcL, ITS2, and matK were congruent in that they grouped the populations into six clades, each representing one species: Chara braunii C.C. Gmelin, C. foliolosa C.L. Willdenow, C. guairensis R. Bicudo, C. haitensis M.P.J.F. Turpin, C. hydropitys H. Reichenbach and C. rusbyana M. Howe. Morphological characters, including ultrastructure of the oospore wall, provided good evidence to characterize each species. Molecular data supported the recent view that some traditional infra-generic taxa (e.g. subgenus Charopsis and subsection Willdenowia) are not supported in phylogenetic analyses, whereas some species (e.g. C. foliolosa, C. haitensis, C. hydropitys and C. rusbyana, previously considered as varieties and forms of C. zeylanica) were consistently distinguished in the analyses for the three molecular markers.


Botany ◽  
2016 ◽  
Vol 94 (9) ◽  
pp. 863-884 ◽  
Author(s):  
David S. Gernandt ◽  
Garth Holman ◽  
Christopher Campbell ◽  
Matthew Parks ◽  
Sarah Mathews ◽  
...  

Relationships of living and fossil Pinaceae were inferred using parsimony and Bayesian inference of morphological characters and plastid and nuclear DNA sequences. When considering extant taxa only, adding molecular to morphological characters resulted in markedly increased resolution and branch support compared with analysis of morphology alone. Including 45 fossil taxa resulted in drastically decreased resolution in morphology-based consensus trees. We evaluated the effect on branch support and resolution of including DNA sequences, deleting fossils lacking information for cone scale apices and seeds, using reduced consensus methods, and using implied weighting, and found that the greatest improvements were found by including DNA sequences and using implied weighting. The tree topologies from parsimony and Bayesian inference confirm previous findings that the fossil genus Pseudoaraucaria and a few species of Pityostrobus from the Lower Cretaceous are related to abietoid genera, and that other species of Pityostrobus are pinoid and closely related to Pinus. Focusing phylogenetic analyses on the most complete fossil cones, specifically those that are anatomically preserved and include both cone scale apices and seeds, and taking into account homoplasy, resulted in the clearest hypotheses for the timing and sequence of diversification in the family.


2021 ◽  
Author(s):  
Whitney L M Bouma

The fern family Pteridaceae is among the largest fern families in New Zealand. It comprises 17 native species in five genera. Traditionally, the classification of Pteridaceae was based on morphological characters. The advent of molecular technology now makes it possible to test these morphology-based classifications. The Pteridaceae has previously been subjected to phylogenetic analyses; however, representatives from New Zealand and the South Pacific have never been well represented in these studies. This thesis research aimed to investigate the phylogenetic relationships of the New Zealand Pteridaceae, as well as the phylogenetic relationships of the New Zealand species to their overseas relatives. The DNA sequences of several chloroplast loci (e.g. trnL-trnF locus, rps4 and rps4-trnS IGS, atpB, and rbcL) were determined, and the phylogenetic relationships of the New Zealand Pteridaceae and several species-specific questions within the genera Pellaea and Adiantum were investigated. Results presented in this thesis confirm previously published phylogenies of the Pteridaceae, which show the resolution of five major clades, i.e., cryptogrammoids, ceratopteridoids, pteridoids, cheilanthoids, and adiantoids. The addition of the New Zealand species revealed possible South West Pacific groups formed by the respective genera, where New Zealand species were generally more closely related to one another than to overseas relatives. Within the New Zealand Pellaea, the analysis of the trnL-trnF locus sequence data showed that the morphologically intermediate plants P. aff. falcata, responsible for taxonomic confusion, were more closely related to P. rotundifolia than to P. falcata. Furthermore, the plants collected on the Kermadec Islands, previously thought to be P. falcata, are genetically distinct from the Australian P. falcata and could constitute a new species. Adiantum hispidulum, which is polymorphic for two hair types that have been used to distinguish different species, was also reinvestigated morphologically and molecularly. Morphological inspection of hairs revealed three hair types as opposed to the previously thought two, and furthermore, they correspond to three different trnL-trnF sequence haplotypes.

