TreeCluster: Massively scalable transmission clustering using phylogenetic trees

2018 ◽  
Author(s):  
Niema Moshiri

Abstract. Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multiple clustering optimization functions. Conclusions: TreeCluster is a freely available, cross-platform, open-source Python 3 tool for inferring transmission clusters from phylogenetic trees. Code, usage information, and in-depth descriptions of the implemented clustering modes are publicly available at the following repository: https://github.com/niemasd/TreeCluster

2017 ◽  
Author(s):  
Annam Pavan-Kumar

Background: India is one of the mega-biodiverse countries, with a large number of endemic freshwater fishes. Recently, species of the genus Horaglanis (family: Clariidae) have been reported from the southern part of India. Due to their unique morphological adaptations, these enigmatic species have been subjected to phylogenetic studies to understand the evolution of adaptive traits. Further, the taxonomic status of these species has not been verified using molecular data. Methods: In the present study, secondary data, i.e. reported sequences of the mitochondrial cytochrome c oxidase subunit I gene, were used to estimate genetic divergence values for 14 species of the family Clariidae. Phylogenetic trees were reconstructed using maximum parsimony, maximum likelihood and Bayesian inference methods. Results: The average genetic divergence value among the genera Clarias-Clariallabes-Platyallabes-Dolichallabes-Gymnallabes-Tanganikallabes-Channallabes was 0.082 ± 0.01. However, these genera showed an average divergence value of 0.296 ± 0.02 from the genus Horaglanis. In all tree topologies, species of the genus Horaglanis formed a basal group to all other clariids. Discussion: The higher genetic divergence between the genus Horaglanis and the other genera of the family Clariidae suggests that Horaglanis may belong to a separate subfamily. Based on the phylogenetic trees, species of Horaglanis might have originated earlier in the evolution of clariids than other species.


2016 ◽  
Author(s):  
Hussein A Hejase ◽  
Kevin J Liu

Abstract. Background: Branching events in phylogenetic trees reflect strictly bifurcating and/or multifurcating speciation and splitting events. In the presence of gene flow, a phylogeny cannot be described by a tree but is instead a directed acyclic graph known as a phylogenetic network. Both phylogenetic trees and networks are typically reconstructed using computational analysis of multi-locus sequence data. The advent of high-throughput sequencing technologies has brought about two main scalability challenges: (1) dataset size in terms of the number of taxa and (2) the evolutionary divergence of the taxa in a study. The impact of both dimensions of scale on phylogenetic tree inference has been well characterized by recent studies; in contrast, the scalability limits of phylogenetic network inference methods are largely unknown. In this study, we quantify the performance of state-of-the-art phylogenetic network inference methods on large-scale datasets using empirical data sampled from natural mouse populations and synthetic data capturing a wide range of evolutionary scenarios. Results: We find that, as in the case of phylogenetic tree inference, the performance of leading network inference methods is negatively impacted by both dimensions of dataset scale. In general, we found that topological accuracy degrades as the number of taxa increases; a similar effect was observed with increased sequence mutation rate. The most accurate methods were probabilistic inference methods which maximize either likelihood under coalescent-based models or pseudo-likelihood approximations to the model likelihood. Furthermore, probabilistic inference methods with optimization criteria which did not make use of gene tree root and/or branch length information performed best, a result that runs contrary to widely held assumptions in the literature. The improved accuracy obtained with probabilistic inference methods comes at a computational cost in terms of runtime and main memory usage, which quickly become prohibitive as dataset size grows past thirty taxa. Conclusions: We conclude that the state of the art of phylogenetic network inference lags well behind the scope of current phylogenomic studies. New algorithmic development is critically needed to address this methodological gap.


Entropy ◽  
2019 ◽  
Vol 21 (3) ◽  
pp. 313
Author(s):  
Jun Feng ◽  
Zeyun Liu ◽  
Hongwei Feng ◽  
Richard Sutcliffe ◽  
Jianni Liu ◽  
...  

To address the instability of phylogenetic trees in morphological datasets caused by missing values, we present a phylogenetic inference method based on a concept decision tree (CDT) in conjunction with attribute reduction. First, a reliable initial phylogenetic seed tree is created using a few species with relatively complete morphological information by using biologists’ prior knowledge or by applying existing tools such as MrBayes. Second, using a top-down data processing approach, we construct concept-sample templates by performing attribute reduction at each node in the initial phylogenetic seed tree. In this way, each node is turned into a decision point with multiple concept-sample templates, providing decision-making functions for grafting. Third, we apply a novel matching algorithm to evaluate the degree of similarity between the species’ attributes and their concept-sample templates and to determine the location of the species in the initial phylogenetic seed tree. In this manner, the phylogenetic tree is established step by step. We apply our algorithm to several datasets and compare it with the maximum parsimony, maximum likelihood, and Bayesian inference methods using the two evaluation criteria of accuracy and stability. The experimental results indicate that as the proportion of missing data increases, the accuracy of the CDT method remains at 86.5%, outperforming all other methods and producing a reliable phylogenetic tree.


Author(s):  
Lu-Lu Li ◽  
Ji-Wei Xu ◽  
Wei-Chen Yao ◽  
Hui-Hui Yang ◽  
Youssef Dewer ◽  
...  

Abstract The tobacco cutworm Spodoptera litura (Lepidoptera: Noctuidae) is a polyphagous pest with a highly selective and sensitive chemosensory system involved in complex physiological behaviors such as searching for food sources, feeding, courtship, and oviposition. However, effective management strategies for controlling the insect pest populations under threshold levels are lacking. Therefore, there is an urgent need to formulate eco-friendly pest control strategies based on the disruption of the insect chemosensory system. In this study, we identified 158 putative chemosensory genes based on transcriptomic and genomic data for S. litura, including 45 odorant-binding proteins (OBPs, nine were new), 23 chemosensory proteins (CSPs), 60 odorant receptors (ORs, three were new), and 30 gustatory receptors (GRs, three were new), a number higher than those reported by previous transcriptome studies. Subsequently, we constructed phylogenetic trees based on these genes in moths and analyzed the dynamic expression of various genes in head capsules across larval instars using quantitative real-time polymerase chain reaction. Nine genes–SlitOBP8, SlitOBP9, SlitOBP25, SlitCSP1, SlitCSP7, SlitCSP18, SlitOR34, SlitGR240, and SlitGR242–were highly expressed in the heads of 3- to 5-day-old S. litura larvae. The genes differentially expressed in olfactory organs during larval development might play crucial roles in the chemosensory system of S. litura larvae. Our findings substantially expand the gene inventory for S. litura and present potential target genes for further studies on larval feeding in S. litura.


2021 ◽  
Vol 82 (1-2) ◽  
Author(s):  
Lena Collienne ◽  
Alex Gavryushkin

Abstract. Many popular algorithms for searching the space of leaf-labelled (phylogenetic) trees are based on tree rearrangement operations. Under any such operation, the problem is reduced to searching a graph where vertices are trees and (undirected) edges are given by pairs of trees connected by one rearrangement operation (sometimes called a move). Most popular are the classical nearest neighbour interchange, subtree prune and regraft, and tree bisection and reconnection moves. The problem of computing distances, however, is NP-hard in each of these graphs, making tree inference and comparison algorithms challenging to design in practice. Although ranked phylogenetic trees are one of the central objects of interest in applications such as cancer research, immunology, and epidemiology, the computational complexity of the shortest path problem for these trees remained unsolved for decades. In this paper, we settle this problem for the ranked nearest neighbour interchange operation by establishing that the complexity depends on the weight difference between the two types of tree rearrangements (rank moves and edge moves), and varies from quadratic, which is the lowest possible complexity for this problem, to NP-hard, which is the highest. In particular, our result provides the first example of a phylogenetic tree rearrangement operation for which shortest paths, and hence the distance, can be computed efficiently. Specifically, our algorithm scales to trees with tens of thousands of leaves (and likely hundreds of thousands if implemented efficiently).


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Ranju Ravindran Santhakumari Manoj ◽  
Maria Stefania Latrofa ◽  
Sara Epis ◽  
Domenico Otranto

Abstract. Background: Wolbachia is an obligate intracellular, maternally transmitted, gram-negative bacterium which forms a spectrum of endosymbiotic relationships, from parasitism to obligatory mutualism, in a wide range of arthropods and onchocercid nematodes, respectively. In arthropods, Wolbachia produces reproductive manipulations such as male killing, feminization, parthenogenesis and cytoplasmic incompatibility for its propagation and provides an additional fitness benefit for the host by protecting it against pathogens, whilst in onchocercid nematodes, apart from the mutual metabolic dependence, this bacterium is involved in moulting, embryogenesis, growth and survival of the host. Methods: This review details the molecular data of Wolbachia and its effect on host biology, immunity, ecology and evolution, reproduction, and the endosymbiont-based treatment and control strategies exploited for filariasis. Relevant peer-reviewed scientific papers available in various authenticated scientific databases were considered while writing the review. Conclusions: The information presented provides an overview of Wolbachia biology and its use in the control and/or treatment of vectors, onchocercid nematodes and viral diseases of medical and veterinary importance. This offers scope for the development of new approaches for the control of a variety of vector-borne diseases.


2014 ◽  
Vol 95 (11) ◽  
pp. 2372-2376 ◽  
Author(s):  
Andi Krumbholz ◽  
Jeannette Lange ◽  
Andreas Sauerbrei ◽  
Marco Groth ◽  
Matthias Platzer ◽  
...  

The avian-like swine influenza viruses emerged in 1979 in Belgium and Germany. Thereafter, they spread through many European swine-producing countries, replaced the circulating classical swine H1N1 influenza viruses, and became endemic. Serological and subsequent molecular data indicated an avian source, but details remained obscure due to a lack of relevant avian influenza virus sequence data. Here, the origin of the European avian-like swine influenza viruses was analysed using a collection of 16 European swine H1N1 influenza viruses sampled in 1979–1981 in Germany, the Netherlands, Belgium, Italy and France, as well as several contemporaneous avian influenza viruses of various serotypes. The phylogenetic trees suggested a triple reassortant with a unique genotype constellation. Time-resolved maximum clade credibility trees indicated times to the most recent common ancestors of 34–46 years (before 2008) depending on the RNA segment and the method of tree inference.


1983 ◽  
Vol 19 (2) ◽  
pp. 153-170 ◽  
Author(s):  
Masatoshi Nei ◽  
Fumio Tajima ◽  
Yoshio Tateno

Zootaxa ◽  
2021 ◽  
Vol 4951 (3) ◽  
pp. 559-570
Author(s):  
EUGENYI A.  MAKARCHENKO ◽  
ALEXANDER A. SEMENCHENKO ◽  
DMITRY M. PALATOV

Chironomids of the genus Pagastia Oliver (Diamesinae, Diamesini) from the mountains of Central Asia are revised using both morphological characters and molecular data. Illustrated descriptions of the adult male of Pagastia (P.) caelestomontana sp. nov. from Kirgizstan and Tajikistan and of P. (P.) hanseni sp. nov. from Tajikistan, a record of an apparently new species, P. (P.) aff. lanceolata (Tokunaga), from Tajikistan, and an updated key for the determination of the adult males of all known species of Pagastia are provided. A phylogenetic framework is reconstructed based on two mitochondrial genes: cytochrome oxidase subunit I (COI) sequences of 34 samples belonging to 7 species of the genus Pagastia, and cytochrome oxidase subunit II (COII) sequences available for most samples. Phylogenetic trees of some known species of the genus Pagastia were reconstructed using the combined dataset and Bayesian inference (BI) and Maximum Likelihood (ML) methods. The interspecific K2P distances between the seven Pagastia species, including P. (P.) caelestomontana sp. nov., P. (P.) hanseni sp. nov. and the undescribed P. (P.) aff. lanceolata (Tokunaga), are 6.3–13.2, which corresponds to species level.
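
For readers unfamiliar with the K2P (Kimura two-parameter) distance mentioned in this abstract, the following is a minimal sketch of the standard formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transition and transversion differences between two aligned sequences. The abstract does not say which software the authors used; the function name `k2p_distance` and the rule of skipping sites with gaps or ambiguous bases are assumptions made only for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence has
    a gap or an ambiguous base are skipped (an assumption, not a rule
    taken from the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Calling `k2p_distance` on each pair of aligned COI sequences would give a pairwise distance matrix; barcoding studies often report such distances as percentages, but the abstract does not state the unit of its 6.3–13.2 range.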

