Display Sets of Normal and Tree-Child Networks

Janosch Döcker; Simone Linz; Charles Semple

doi:10.37236/9128

Display Sets of Normal and Tree-Child Networks

The Electronic Journal of Combinatorics ◽

10.37236/9128 ◽

2021 ◽

Vol 28 (1) ◽

Author(s):

Janosch Döcker ◽

Simone Linz ◽

Charles Semple

Keyword(s):

Decision Problem ◽

Phylogenetic Trees ◽

Phylogenetic Network ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Directed Acyclic Graphs ◽

Phylogenetic Networks ◽

Acyclic Graphs ◽

Normal Network ◽

Normal Networks

Phylogenetic networks are leaf-labelled directed acyclic graphs that are used in computational biology to analyse and represent the evolutionary relationships of a set of species or viruses. In contrast to phylogenetic trees, phylogenetic networks have vertices of in-degree at least two that represent reticulation events such as hybridisation, lateral gene transfer, or reassortment. By systematically deleting various combinations of arcs in a phylogenetic network $\mathcal N$, one derives a set of phylogenetic trees that are embedded in $\mathcal N$. We recently showed that the problem of deciding if two binary phylogenetic networks embed the same set of phylogenetic trees is computationally hard; in particular, we showed it to be $\Pi^P_2$-complete. In this paper, we establish a polynomial-time algorithm for this decision problem if the initial two networks consist of a normal network and a tree-child network, two well-studied topologically restricted subclasses of phylogenetic networks, with normal networks being more structurally constrained than tree-child networks. The running time of the algorithm is quadratic in the size of the leaf sets.
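For very small networks, the set of embedded trees mentioned in this abstract can be enumerated by brute force. The Python sketch below is only an illustration under stated assumptions and is not the polynomial-time algorithm of the paper: it keeps one incoming arc per reticulation, deletes the others, and encodes each resulting tree by its clusters (the leaf sets below each vertex). The function name `display_set`, the arc-list encoding, and the example network `N` are hypothetical, and the running time is exponential in the number of reticulations.

```python
from itertools import product


def display_set(arcs, leaves):
    """Return the trees embedded in a small rooted network, each encoded
    as a frozenset of clusters (sets of leaf labels below a vertex)."""
    leaves = set(leaves)
    parents = {}
    for u, v in arcs:
        parents.setdefault(v, []).append(u)
    # The root is the vertex without incoming arcs.
    root = next(u for u, _ in arcs if u not in parents)
    reticulations = [v for v, ps in parents.items() if len(ps) >= 2]

    trees = set()
    # Keep exactly one incoming arc per reticulation, delete the others.
    for choice in product(*(parents[r] for r in reticulations)):
        kept = dict(zip(reticulations, choice))
        children = {}
        for u, v in arcs:
            if kept.get(v, u) == u:
                children.setdefault(u, []).append(v)

        clusters = set()

        def below(v):
            # Leaf labels reachable from v; empty for unlabelled dead ends.
            if v in leaves:
                cluster = frozenset([v])
            else:
                cluster = frozenset().union(
                    *(below(w) for w in children.get(v, []))
                )
            if cluster:
                clusters.add(cluster)
            return cluster

        below(root)
        trees.add(frozenset(clusters))
    return trees


# Hypothetical example: one reticulation h with parents p and q.
N = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
     ("q", "h"), ("q", "c"), ("h", "b")]
trees = display_set(N, "abc")
```

Under this encoding, two small networks embed the same trees exactly when `display_set` returns equal sets for both. Using clusters avoids having to suppress degree-two vertices explicitly, since a suppressed vertex has the same cluster as its only remaining child.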

Merging Arcs to Produce Acyclic Phylogenetic Networks and Normal Networks

Bulletin of Mathematical Biology ◽

10.1007/s11538-021-00986-1 ◽

2022 ◽

Vol 84 (2) ◽

Author(s):

Stephen J. Willson

Keyword(s):

Phylogenetic Network ◽

Phylogenetic Networks ◽

Original Network ◽

Acyclic Network ◽

Normal Network ◽

Normal Networks ◽

The Given

Abstract As phylogenetic networks grow increasingly complicated, systematic methods for simplifying them to reveal properties will become more useful. This paper considers how to modify acyclic phylogenetic networks into other acyclic networks by contracting specific arcs that include a set D. The networks need not be binary, so vertices in the networks may have more than two parents and/or more than two children. In general, in order to make the resulting network acyclic, additional arcs not in D must also be contracted. This paper shows how to choose D so that the resulting acyclic network is “pre-normal”. As a result, removal of all redundant arcs yields a normal network. The set D can be selected based only on the geometry of the network, giving a well-defined normal phylogenetic network depending only on the given network. There are CSD maps relating most of the networks. The resulting network can be visualized as a “wired lift” in the original network, which appears as the original network with each arc drawn in one of three ways.

On the Subnet Prune and Regraft Distance

The Electronic Journal of Combinatorics ◽

10.37236/7860 ◽

2019 ◽

Vol 26 (2) ◽

Cited By ~ 1

Author(s):

Jonathan Klawitter ◽

Simone Linz

Keyword(s):

Gene Transfer ◽

Horizontal Gene Transfer ◽

Phylogenetic Tree ◽

Phylogenetic Network ◽

Directed Acyclic Graphs ◽

Phylogenetic Networks ◽

Evolutionary Relationships ◽

Acyclic Graphs ◽

Subtree Prune And Regraft

Phylogenetic networks are rooted directed acyclic graphs that represent evolutionary relationships between species whose past includes reticulation events such as hybridisation and horizontal gene transfer. To search the space of phylogenetic networks, the popular tree rearrangement operation rooted subtree prune and regraft (rSPR) was recently generalised to phylogenetic networks. This new operation – called subnet prune and regraft (SNPR) – induces a metric on the space of all phylogenetic networks as well as on several widely-used network classes. In this paper, we investigate several problems that arise in the context of computing the SNPR-distance. For a phylogenetic tree $T$ and a phylogenetic network $N$, we show how this distance can be computed by considering the set of trees that are embedded in $N$ and then use this result to characterise the SNPR-distance between $T$ and $N$ in terms of agreement forests. Furthermore, we analyse properties of shortest SNPR-sequences between two phylogenetic networks $N$ and $N'$, and answer the question whether or not any of the classes of tree-child, reticulation-visible, or tree-based networks isometrically embeds into the class of all phylogenetic networks under SNPR.

OPTIMAL, EFFICIENT RECONSTRUCTION OF PHYLOGENETIC NETWORKS WITH CONSTRAINED RECOMBINATION

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720004000521 ◽

2004 ◽

Vol 02 (01) ◽

pp. 173-213 ◽

Cited By ~ 117

Author(s):

DAN GUSFIELD ◽

SATISH EDDHU ◽

CHARLES LANGLEY

Keyword(s):

Phylogenetic Tree ◽

Polynomial Time ◽

Phylogenetic Network ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Phylogenetic Networks ◽

Seminal Paper ◽

Back Mutation ◽

True Tree ◽

The One

A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not tree-like. In a seminal paper, Wang et al. [1] studied the problem of constructing a phylogenetic network, allowing recombination between sequences, with the constraint that the resulting cycles must be disjoint. We call such a phylogenetic network a "galled-tree". They gave a polynomial-time algorithm that was intended to determine whether or not a set of sequences could be generated on a galled-tree. Unfortunately, the algorithm by Wang et al. [1] is incomplete and does not constitute a necessary test for the existence of a galled-tree for the data. In this paper, we completely solve the problem. Moreover, we prove that if there is a galled-tree, then the one produced by our algorithm minimizes the number of recombinations over all phylogenetic networks for the data, even allowing multiple-crossover recombinations. We also prove that when there is a galled-tree for the data, the galled-tree minimizing the number of recombinations is "essentially unique". We also note two additional results: first, any set of sequences that can be derived on a galled tree can be derived on a true tree (without recombination cycles), where at most one back mutation per site is allowed; second, the site compatibility problem (which is NP-hard in general) can be solved in polynomial time for any set of sequences that can be derived on a galled tree. Perhaps more important than the specific results about galled-trees, we introduce an approach that can be used to study recombination in general phylogenetic networks. This paper greatly extends the conference version that appears in an earlier work [8]. PowerPoint slides of the conference talk can be found at our website [7].

k-Efficient domination: Algorithmic perspective

Discrete Mathematics Algorithms and Applications ◽

10.1142/s1793830922500513 ◽

2022 ◽

Author(s):

Mohsen Alambardar Meybodi

Keyword(s):

Decision Problem ◽

Dominating Set ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Chordal Graphs ◽

Sparse Graphs ◽

Fpt Algorithm ◽

Domination Problem ◽

Efficient Dominating Set ◽

Np Complete

A set [Formula: see text] of a graph [Formula: see text] is called an efficient dominating set of [Formula: see text] if every vertex [Formula: see text] has exactly one neighbor in [Formula: see text]; in other words, the vertex set [Formula: see text] is partitioned into circles of radius one such that the vertices in [Formula: see text] are the centers of the parts. A generalization of this concept, introduced by Chellali et al. [k-Efficient partitions of graphs, Commun. Comb. Optim. 4 (2019) 109–122], is called a [Formula: see text]-efficient dominating set, which partitions the vertices of a graph using different radii. It leads to a partition set [Formula: see text] such that each [Formula: see text] consists of a center vertex [Formula: see text] and all the vertices at distance [Formula: see text], where [Formula: see text]. In other words, there exist dominators with various dominating powers. The problem of finding a minimum set [Formula: see text] is called the minimum [Formula: see text]-efficient domination problem. Given a positive integer [Formula: see text] and a graph [Formula: see text], the [Formula: see text]-efficient Domination Decision problem is to decide whether [Formula: see text] has a [Formula: see text]-efficient dominating set of cardinality at most [Formula: see text]. The [Formula: see text]-efficient Domination Decision problem is known to be NP-complete even for bipartite graphs [M. Chellali, T. W. Haynes and S. Hedetniemi, k-Efficient partitions of graphs, Commun. Comb. Optim. 4 (2019) 109–122]. Clearly, every graph has a [Formula: see text]-efficient dominating set, but this does not hold for efficient dominating sets. In this paper, we study the following: the [Formula: see text]-efficient domination problem is NP-complete even in chordal graphs; a polynomial-time algorithm for [Formula: see text]-efficient domination in trees; and [Formula: see text]-efficient domination on sparse graphs from the parametrized complexity perspective. In particular, we show that it is [Formula: see text]-hard on d-degenerate graphs, while the original dominating set problem has a fixed-parameter tractable (FPT) algorithm on d-degenerate graphs. [Formula: see text]-efficient domination on nowhere-dense graphs is FPT.
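The radius-one case can be checked directly on small graphs. The sketch below assumes the standard reading of the definition, which the elided formulas above do not confirm: a set D is an efficient dominating set when every vertex has exactly one member of D in its closed neighbourhood. The function names and the adjacency-dictionary encoding are hypothetical, and the exhaustive search is exponential, so it is unrelated to the algorithms of the paper.

```python
from itertools import combinations


def is_efficient_dominating_set(adj, candidate):
    """adj maps each vertex to the set of its neighbours."""
    chosen = set(candidate)
    # Every vertex must see exactly one chosen vertex in its closed neighbourhood.
    return all(len(({v} | adj[v]) & chosen) == 1 for v in adj)


def find_efficient_dominating_set(adj):
    """Exhaustive search; returns a smallest such set or None."""
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_efficient_dominating_set(adj, candidate):
                return set(candidate)
    return None


# Path on three vertices: the middle vertex dominates everything exactly once.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
# Cycle on four vertices: no efficient dominating set exists.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

path_result = find_efficient_dominating_set(path3)
cycle_result = find_efficient_dominating_set(cycle4)
```

The four-cycle illustrates the remark in the abstract that not every graph has an efficient dominating set.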

IT IS NL-COMPLETE TO DECIDE WHETHER A HAIRPIN COMPLETION OF REGULAR LANGUAGES IS REGULAR

International Journal of Foundations of Computer Science ◽

10.1142/s0129054111009057 ◽

2011 ◽

Vol 22 (08) ◽

pp. 1813-1828 ◽

Cited By ~ 1

Author(s):

VOLKER DIEKERT ◽

STEFFEN KOPECKI

Keyword(s):

Polynomial Time ◽

Dna Computing ◽

Regular Language ◽

Decision Problem ◽

Polynomial Time Algorithm ◽

Time Algorithm ◽

Complexity Bound ◽

The One ◽

Context Free ◽

Hairpin Formation

The hairpin completion is an operation on formal languages which is inspired by the hairpin formation in biochemistry. Hairpin formations occur naturally within DNA-computing. It has been known that the hairpin completion of a regular language is linear context-free, but not regular, in general. However, for some time it was open whether the regularity of the hairpin completion of a regular language is decidable. In 2009 this decidability problem was solved positively in [5] by providing a polynomial time algorithm. In this paper we improve the complexity bound by showing that the decision problem is actually NL-complete. This complexity bound holds for both the one-sided and the two-sided hairpin completions.

Applicability of several rooted phylogenetic network algorithms for representing the evolutionary history of SARS-CoV-2

BMC Ecology and Evolution ◽

10.1186/s12862-021-01946-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Rosanne Wallin ◽

Leo van Iersel ◽

Steven Kelk ◽

Leen Stougie

Keyword(s):

Phylogenetic Trees ◽

Evolutionary History ◽

Network Inference ◽

Phylogenetic Network ◽

Phylogenetic Networks ◽

Network Algorithms ◽

Running Time ◽

Inference Algorithms ◽

History Of ◽

The Impact

Abstract Background: Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination. Various methods have been developed to construct such networks, using for example a multiple sequence alignment or multiple phylogenetic trees as input data. Coronaviruses are known to recombine frequently, but rooted phylogenetic networks have not yet been used extensively to describe their evolutionary history. Here, we created a workflow to compare the evolutionary history of SARS-CoV-2 with other SARS-like viruses using several rooted phylogenetic network inference algorithms. This workflow includes filtering noise from sets of phylogenetic trees by contracting edges based on branch length and bootstrap support, followed by resolution of multifurcations. We explored the running times of the network inference algorithms, the impact of filtering on the properties of the produced networks, and attempted to derive biological insights regarding the evolution of SARS-CoV-2 from them. Results: The network inference algorithms are capable of constructing rooted phylogenetic networks for coronavirus data, although running-time limitations require restricting such datasets to a relatively small number of taxa. Filtering generally reduces the number of reticulations in the produced networks and increases their temporal consistency. Taxon bat-SL-CoVZC45 emerges as a major and structural source of discordance in the dataset. The tested algorithms often indicate that SARS-CoV-2/RaTG13 is a tree-like clade, with possibly some reticulate activity further back in their history. A smaller number of constructed networks posit SARS-CoV-2 as a possible recombinant, although this might be a methodological artefact arising from the interaction of bat-SL-CoVZC45 discordance and the optimization criteria used. Conclusion: Our results demonstrate that as part of a wider workflow and with careful attention paid to running time, rooted phylogenetic network algorithms are capable of producing plausible networks from coronavirus data. These networks partly corroborate existing theories about SARS-CoV-2, and partly produce new avenues for exploration regarding the location and significance of reticulate activity within the wider group of SARS-like viruses. Our workflow may serve as a model for pipelines in which phylogenetic network algorithms can be used to analyse different datasets and test different hypotheses.

Tree-Based Unrooted Phylogenetic Networks

Bulletin of Mathematical Biology ◽

10.1007/s11538-017-0381-3 ◽

2017 ◽

Vol 80 (2) ◽

pp. 404-416 ◽

Cited By ~ 10

Author(s):

A. Francis ◽

K. T. Huber ◽

V. Moulton

Keyword(s):

Gene Transfer ◽

Horizontal Gene Transfer ◽

Phylogenetic Tree ◽

Phylogenetic Trees ◽

Phylogenetic Network ◽

Simple Graph ◽

Phylogenetic Networks ◽

Underlying Graph ◽

Finite Set ◽

Computational Properties

Abstract Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.
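The definition of an unrooted phylogenetic network given in this abstract translates directly into a small validity check. The following Python sketch is a hypothetical helper, not code from the paper; the function names and the edge-list encoding are assumptions. It tests that the graph is simple and connected, that every vertex has degree 1 or 3, and that the degree-1 vertices are exactly the taxa in X. It does not decide whether a network is tree-based.

```python
from collections import deque


def is_unrooted_phylogenetic_network(edges, taxa):
    """Check the definition: connected simple graph, degrees 1 or 3, leaf set X."""
    adj = {}
    for u, v in edges:
        if u == v:
            return False  # a loop means the graph is not simple
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if v in adj[u]:
            return False  # a repeated edge means the graph is not simple
        adj[u].add(v)
        adj[v].add(u)
    if not adj:
        return False
    if any(len(nbrs) not in (1, 3) for nbrs in adj.values()):
        return False
    if {v for v, nbrs in adj.items() if len(nbrs) == 1} != set(taxa):
        return False
    # Breadth-first search to confirm connectivity.
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)


def is_phylogenetic_tree(edges, taxa):
    """A valid network whose underlying graph is a tree (|E| = |V| - 1)."""
    vertices = {x for e in edges for x in e}
    return (is_unrooted_phylogenetic_network(edges, taxa)
            and len(edges) == len(vertices) - 1)


# Quartet tree ab|cd and a network containing one triangle.
quartet = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
triangle = [("u", "v"), ("v", "w"), ("w", "u"), ("a", "u"), ("b", "v"), ("c", "w")]
```

Here `quartet` passes both checks, while `triangle` is a valid network that is not a tree because it contains a cycle.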

Constructing Phylogenetic Networks Based on the Isomorphism of Datasets

BioMed Research International ◽

10.1155/2016/4236858 ◽

2016 ◽

Vol 2016 ◽

pp. 1-7

Author(s):

Juan Wang ◽

Zhibin Zhang ◽

Yanjuan Li

Keyword(s):

Molecular Evolution ◽

Phylogenetic Trees ◽

Phylogenetic Network ◽

Phylogenetic Networks ◽

The Relationship

Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. So far, many methods have been presented in this area, of which the most efficient are based on the incompatible graph, such as CASS, LNETWORK, and BIMLR. This paper studies the commonalities of the methods based on the incompatible graph, the relationship between the incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find all the simplest datasets for a topology G and construct a network for every dataset. For any dataset C, we can compute a network from the network representing the simplest dataset that is isomorphic to C. This process saves time for the algorithms when constructing networks.

Generating normal networks via leaf insertion and nearest neighbor interchange

BMC Bioinformatics ◽

10.1186/s12859-019-3209-3 ◽

2019 ◽

Vol 20 (S20) ◽

Cited By ~ 1

Author(s):

Louxin Zhang

Keyword(s):

Population Genetics ◽

Phylogenetic Trees ◽

Nearest Neighbor ◽

Structural Condition ◽

Phylogenetic Networks ◽

Topological Structures ◽

Theoretical Population ◽

Theoretical Population Genetics ◽

Normal Networks ◽

Leaf Insertion

Abstract Background: Galled trees are studied as a recombination model in theoretical population genetics. This class of phylogenetic networks has been generalized to tree-child networks and other network classes by relaxing a structural condition imposed on galled trees. Although these networks are simple, their topological structures have yet to be fully understood. Results: It is well-known that all phylogenetic trees on n taxa can be generated by the insertion of the n-th taxon to each edge of all the phylogenetic trees on n−1 taxa. We prove that all tree-child (resp. normal) networks with k reticulate nodes on n taxa can be uniquely generated via three operations from all the tree-child (resp. normal) networks with k−1 or k reticulate nodes on n−1 taxa. Applying this result to counting rooted phylogenetic networks, we show that there are exactly $\frac{(2n)!}{2^{n} (n-1)!}-2^{n-1} n!$ binary phylogenetic networks with one reticulate node on n taxa. Conclusions: The work makes two contributions to understanding normal networks. One is a generalization of an enumeration procedure for phylogenetic trees into one for normal networks. Another is simple formulas for counting normal networks and phylogenetic networks that have only one reticulate node.
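The counting formula stated in this abstract is easy to evaluate for small n. The short Python sketch below only transcribes the expression; the function name `one_reticulation_count` is hypothetical, and the code assumes n is at least 1. The integer division is exact because (2n)!/(2^n (n−1)!) equals n times the product of the odd numbers up to 2n−1.

```python
from math import factorial


def one_reticulation_count(n):
    """Evaluate (2n)! / (2^n (n-1)!) - 2^(n-1) n! for n >= 1."""
    first = factorial(2 * n) // (2 ** n * factorial(n - 1))
    second = 2 ** (n - 1) * factorial(n)
    return first - second


# Values of the formula for a few small numbers of taxa.
counts = {n: one_reticulation_count(n) for n in range(2, 5)}
```

For n = 2, 3, 4 the expression evaluates to 2, 21 and 228.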

Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data

Systematic Biology ◽

10.1093/sysbio/syz056 ◽

2019 ◽

Vol 69 (3) ◽

pp. 593-601 ◽

Cited By ~ 9

Author(s):

Christopher Blair ◽

Cécile Ané

Keyword(s):

Gene Flow ◽

Phylogenetic Trees ◽

Evolutionary History ◽

Incomplete Lineage Sorting ◽

Gene Tree ◽

Phylogenetic Network ◽

Genomic Data ◽

Point Of View ◽

Phylogenetic Networks ◽

Lineage Sorting

Abstract Genomic data have had a profound impact on nearly every biological discipline. In systematics and phylogenetics, the thousands of loci that are now being sequenced can be analyzed under the multispecies coalescent model (MSC) to explicitly account for gene tree discordance due to incomplete lineage sorting (ILS). However, the MSC assumes no gene flow post divergence, calling for additional methods that can accommodate this limitation. Explicit phylogenetic network methods have emerged, which can simultaneously account for ILS and gene flow by representing evolutionary history as a directed acyclic graph. In this point of view, we highlight some of the strengths and limitations of phylogenetic networks and argue that tree-based inference should not be blindly abandoned in favor of networks simply because they represent more parameter rich models. Attention should be given to model selection of reticulation complexity, and the most robust conclusions regarding evolutionary history are likely obtained when combining tree- and network-based inference.
