Phylogenetic tree shapes resolve disease transmission patterns

2014 ◽  
Author(s):  
Caroline Colijn ◽  
Jennifer Gardy

Whole genome sequencing is becoming popular as a tool for understanding outbreaks of communicable diseases, with phylogenetic trees being used to identify individual transmission events or to characterize overall outbreak-level transmission dynamics. Existing methods to infer transmission dynamics from sequence data rely on well-characterised infectious periods and on epidemiological and clinical meta-data which may not always be available, and typically require computationally intensive analysis focussing on the branch lengths in phylogenetic trees. We sought to determine whether the topological structures of phylogenetic trees contain signatures of the overall transmission patterns underlying an outbreak. Here we use simulated outbreaks to train and then test computational classifiers. We test the method on data from two real-world outbreaks. We find that different transmission patterns result in quantitatively different phylogenetic tree shapes. We describe five topological features that summarize a phylogeny’s structure and find that computational classifiers based on these are capable of predicting an outbreak’s transmission dynamics. The method is robust to variations in the transmission parameters and network types, and recapitulates known epidemiology of previously characterized real-world outbreaks. We conclude that there are simple structural properties of phylogenetic trees which, when combined, can distinguish communicable disease outbreaks with a super-spreader, homogeneous transmission, and chains of transmission. This is possible using genome data alone, and can be done during an outbreak. We discuss the implications for management of outbreaks.
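The abstract above does not name its five topological features, so the following is only a minimal sketch of the general idea, using two commonly used tree-shape statistics chosen here as assumptions: the Sackin index (sum of leaf depths, a measure of imbalance) and the number of cherries (internal nodes whose two children are both leaves). Trees are represented as nested tuples, which is an illustrative convention and not the authors' data format.

```python
def leaf_depths(tree, depth=0):
    """Yield the depth of every leaf in a nested-tuple tree."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from leaf_depths(child, depth + 1)
    else:
        yield depth


def sackin_index(tree):
    """Sum of leaf depths; larger values indicate a more unbalanced tree."""
    return sum(leaf_depths(tree))


def count_cherries(tree):
    """Count internal nodes whose two children are both leaves."""
    if not isinstance(tree, tuple):
        return 0
    if len(tree) == 2 and not any(isinstance(c, tuple) for c in tree):
        return 1
    return sum(count_cherries(child) for child in tree)


# Two hypothetical four-tip trees: a ladder-like one and a balanced one.
caterpillar = ((("A", "B"), "C"), "D")
balanced = (("A", "B"), ("C", "D"))

for name, tree in [("caterpillar", caterpillar), ("balanced", balanced)]:
    print(name, sackin_index(tree), count_cherries(tree))
```

Summary statistics of this kind could then be fed into any standard classifier as a feature vector; which features and classifier the authors actually used is not stated in the abstract.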

1980 ◽  
Vol 187 (1) ◽  
pp. 65-74 ◽  
Author(s):  
D Penny ◽  
M D Hendy ◽  
L R Foulds

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally, we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.
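To make the notion of a "shortest" tree concrete, here is a minimal sketch that scores a fixed rooted binary tree by the number of substitutions it requires, using Fitch's small-parsimony procedure. This is a swapped-in textbook technique for illustration, not the taxon-by-taxon construction method of the paper; the alignment and both trees are hypothetical.

```python
def fitch(tree, states):
    """Return (possible_states, substitutions) for one site on a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra substitution needed.
        return common, left_cost + right_cost
    # Children disagree: one substitution is required on this node's branches.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total substitutions over all sites of an aligned set of sequences."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical two-site alignment for four taxa.
alignment = {"A": "AC", "B": "AC", "C": "GC", "D": "GT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(tree_length(tree_1, alignment), tree_length(tree_2, alignment))
```

Under this criterion the tree with the smaller length is preferred; a search for a minimal tree compares such lengths across candidate topologies.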


2013 ◽  
Author(s):  
Xavier Didelot ◽  
Jennifer Gardy ◽  
Caroline Colijn

Genomics is increasingly being used to investigate disease outbreaks, but an important question remains unanswered: how well do genomic data capture known transmission events, particularly for pathogens with long carriage periods or large within-host population sizes? Here we present a novel Bayesian approach to reconstruct densely-sampled outbreaks from genomic data whilst considering within-host diversity. We infer a time-labelled phylogeny using BEAST, then infer a transmission network via Markov chain Monte Carlo. We find that under a realistic model of within-host evolution, reconstructions of simulated outbreaks contain substantial uncertainty even when genomic data reflect a high substitution rate. Reconstruction of a real-world tuberculosis outbreak displayed similar uncertainty, although the correct source case and several clusters of epidemiologically linked cases were identified. We conclude that genomics cannot wholly replace traditional epidemiology, but that Bayesian reconstructions derived from sequence data may form a useful starting point for a genomic epidemiology investigation.


2015 ◽  
Author(s):  
Jennifer Fouquier ◽  
Jai R Rideout ◽  
Evan Bolyen ◽  
John H Chase ◽  
Arron Shiffer ◽  
...  

Ghost-tree is a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach uses one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families) as a “foundation” phylogeny. A second, more rapidly evolving genetic marker is then used to build “extension” phylogenies for more closely related organisms (e.g., fungal species or strains) that are then grafted on to the foundation tree by mapping taxonomic names. We apply ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. The result is a phylogenetic tree, compatible with the commonly used UNITE fungal database, that supports phylogenetic diversity analysis (e.g., UniFrac) of fungal communities profiled using ITS markers. Availability: ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree.


2018 ◽  
Vol 4 (1) ◽  
pp. 21-26
Author(s):  
Sean Oddoye

Lassa virus (LASV) is the etiological agent of Lassa fever, an acute hemorrhagic disease with a mortality rate of 15%. Many aspects of the Lassa virus are not understood, such as the cause of deafness in one third of surviving patients or why symptoms are benign for 80% of those infected with the virus. Ambiguities like these suggest that there might be some genomic heterogeneity among infecting viruses and demonstrate a need to quantify and analyze polymorphisms within LASV. Patterns that emerge from phylogenetic trees can be used to assess the structure of a population while also providing insight into its genetic makeup. The purpose of this investigation was to develop a more streamlined means of calculating nucleotide diversity within a subpopulation of Lassa virus strains and to augment a phylogenetic tree of the Lassa virus glycoprotein precursor (GPC) segment. A total of 25 partial and complete sequences of LASV strains were obtained from GenBank. During phase one of this investigation, the sequence data were entered into the MEGA analytical software and the sequence diversity was derived at the nucleotide level. Data from the individual sequences were used to augment a phylogenetic tree using TreeView X. In phase two, an algorithm was created in RStudio with the BSgenome and Biostrings packages. The sequence diversity derived from the statistical analyses in MEGA was compared with that of the algorithm. A p-value of 0.08 was found, which falls outside the accepted non-medical p-value range of 0.00 to 0.05. It is suggested that future research focus on creating a revised version of the algorithm that calculates nucleotide diversity within a percent error of 5%.
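As a rough illustration of the quantity being calculated, nucleotide diversity is usually defined as the average number of differences per site between all pairs of aligned sequences. The sketch below assumes equal-length, pre-aligned sequences and ignores gaps and ambiguity codes; it is written in Python for brevity and is not the R algorithm described in the abstract. The example sequences are made up.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average per-site pairwise difference across aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # For each pair, count mismatching positions and normalise by length.
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length for s1, s2 in pairs
    )
    return total / len(pairs)


# Hypothetical toy alignment of three four-base sequences.
toy = ["ACGT", "ACGA", "ACGT"]
print(nucleotide_diversity(toy))
```

A real analysis would also need a policy for gapped or ambiguous positions, which tools such as MEGA expose as options.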


2018 ◽  
Author(s):  
Axel Trefzer ◽  
Alexandros Stamatakis

Bayesian Markov chain Monte Carlo (MCMC) methods for phylogenetic tree inference, that is, inference of the evolutionary history of distinct species using their molecular sequence data, typically generate large sets of phylogenetic trees. The trees generated by the MCMC procedure are samples of the posterior probability distribution that MCMC methods approximate. Thus, they generate a stream of correlated binary trees that need to be stored. Here, we adapt state-of-the-art algorithms for binary tree compression to phylogenetic tree data streams and extend them to also store the required meta-data. On a phylogenetic tree stream containing 1,000 trees with 500 leaves including branch length values, we achieve a compression rate of 5.4 compared to the uncompressed tree files and of 1.8 compared to bzip2-compressed tree files. For compressing the same trees, but without branch length values, our compression method is approximately an order of magnitude better than bzip2. A prototype implementation is available at https://github.com/axeltref/tree-compression.git.


2021 ◽  
Vol 2 (2) ◽  
pp. 19-28
Author(s):  
Oke Isaiah Idisi ◽  
Tunde Tajudeen Yusuf

Lassa fever, caused by Lassa virus, is a vector-host transmitted infectious disease whose prevalence has been rising over the past few decades. Considering the grave implications of the continuous spread of the disease, an epidemic model was developed to describe the disease transmission dynamics and the impact of proposed control measures. This is to help inform effective control strategies that would curtail and contain the disease in its endemic areas. The model is qualitatively analyzed to characterize its long-run behavior, and the model's basic reproduction number $(\mathcal{R}_0)$ is derived. The analysis reveals that the disease-free equilibrium is locally and globally stable whenever $ \mathcal{R}_0 < 1 $ and that the disease prevalence would be high as long as $ \mathcal{R}_0 > 1 $. Finally, the model is numerically solved and simulated for different scenarios of disease outbreaks, and the findings from the simulations are discussed.
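The threshold role of the basic reproduction number can be illustrated with a much simpler model than the vector-host one described above. The sketch below uses a plain SIR model, for which $\mathcal{R}_0 = \beta / \gamma$, integrated with a forward Euler step; the model choice, parameter values, and function name are all assumptions made for illustration.

```python
def simulate_sir(beta, gamma, i0=0.01, days=200, dt=0.1):
    """Euler-integrate a simple SIR model; return (peak, final) infected fraction."""
    s, i = 1.0 - i0, i0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak, i


# R0 = beta / gamma: 0.5 in the first run, 3.0 in the second.
peak_low, final_low = simulate_sir(beta=0.05, gamma=0.1)
peak_high, final_high = simulate_sir(beta=0.3, gamma=0.1)
print(peak_low, final_low)
print(peak_high, final_high)
```

With $\mathcal{R}_0 < 1$ the infected fraction never rises above its starting value and decays towards zero, while with $\mathcal{R}_0 > 1$ it grows into an outbreak before declining.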


2017 ◽  
Vol 18 (4) ◽  
pp. 193-198 ◽  
Author(s):  
Jonathan Besney ◽  
Danusia Moreau ◽  
Angela Jacobs ◽  
Dan Woods ◽  
Diane Pyne ◽  
...  

Correctional facilities face increased risk of communicable disease transmission and outbreaks. We describe the progression of an influenza outbreak in a Canadian remand facility and suggest strategies for preventing, identifying and responding to outbreaks in this setting. In total, six inmates had laboratory-confirmed influenza, resulting in 144 exposed contacts. Control measures included enhanced isolation precautions, restricting admissions to affected living units, targeted vaccination and antiviral prophylaxis. This report highlights the importance of setting-specific outbreak guidelines in addressing population and environmental challenges, as well as implementation of effective infection prevention and control (IPAC) and public health measures when managing influenza and other communicable disease outbreaks.


2021 ◽  
Author(s):  
Xuemei Liu ◽  
Wen Li ◽  
Guanda Huang ◽  
Tianlai Huang ◽  
Qingang Xiong ◽  
...  

Algorithms for constructing phylogenetic trees are fundamental to studying the evolution of viruses, bacteria, and other microbes. Established multiple alignment-based algorithms are inefficient for large-scale metagenomic sequence data because of their high requirement of inter-sequence correlation and high computational complexity. In this paper, we present SeqDistK, a novel tool for alignment-free phylogenetic analysis. SeqDistK computes the dissimilarity matrix for phylogenetic analysis, incorporating seven k-mer based dissimilarity measures, namely d2, d2S, d2star, Euclidean, Manhattan, CVTree, and Chebyshev. Based on these dissimilarities, SeqDistK constructs a phylogenetic tree using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm. Using a gold standard dataset of 16S rRNA and its associated phylogenetic tree, we compared SeqDistK to Muscle, a multiple sequence aligner. We found SeqDistK was not only 38 times faster than Muscle but also more accurate. SeqDistK achieved the smallest symmetric difference between the inferred and ground truth trees, ranging from 13 to 18, while that of Muscle was 62. When the measures d2, d2star, d2S, and Euclidean were used with k-mer size k=5, SeqDistK consistently inferred a phylogenetic tree almost identical to the ground truth tree. We also performed clustering of 16S rRNA sequences using SeqDistK and found the clustering was highly consistent with known biological taxonomy. Among all the measures, d2S (k=5, M=2) showed the best accuracy as it correctly clustered and classified all sample sequences. In summary, SeqDistK is a novel, fast and accurate alignment-free tool for large-scale phylogenetic analysis. SeqDistK software is freely available at https://github.com/htczero/SeqDistK.
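The alignment-free idea can be sketched in a few lines: count the k-mers in each sequence, then compare the count vectors. The example below covers only the three simplest measures named above (Euclidean, Manhattan, Chebyshev) on raw counts; it is an assumption-laden illustration, not SeqDistK's implementation, and it omits d2, d2S, d2star, and CVTree as well as any frequency normalisation the tool may apply. The input sequences are made up.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distances(seq_a, seq_b, k=2):
    """Compare two sequences by their raw k-mer count vectors."""
    counts_a, counts_b = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    kmers = set(counts_a) | set(counts_b)
    # Counter returns 0 for k-mers missing from one sequence.
    diffs = [abs(counts_a[kmer] - counts_b[kmer]) for kmer in kmers]
    return {
        "euclidean": sum(d * d for d in diffs) ** 0.5,
        "manhattan": sum(diffs),
        "chebyshev": max(diffs, default=0),
    }


distances = kmer_distances("ACGT", "ACGA", k=2)
print(distances)
```

Computing such a distance for every pair of sequences gives the dissimilarity matrix that a clustering method such as UPGMA takes as input.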


2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Aisha Haynie ◽  
Sherry Jin ◽  
Leann Liu ◽  
Sherrill Pirsamadi ◽  
Benjamin Hornstein ◽  
...  

Objective: 1) Describe HCPH’s disease surveillance and prevention activities within the NRG Center mega-shelter; 2) Present surveillance findings with an emphasis on sharing tools that were developed and may be utilized for future disaster response efforts; 3) Discuss successes achieved, challenges encountered, and lessons learned from this emergency response.

Introduction: Hurricane Harvey made landfall along the Texas coast on August 25th, 2017 as a Category 4 storm. It is estimated that the ensuing rainfall caused record flooding of at least 18 inches in 70% of Harris County. Over 30,000 residents were displaced and 50 deaths occurred due to the devastation. At least 53 temporary refuge shelters opened in various parts of Harris County to accommodate displaced residents. On the evening of August 29th, Harris County and community partners set up a 10,000 bed mega-shelter at NRG Center, in an effort to centralize refuge efforts. Harris County Public Health (HCPH) was responsible for round-the-clock surveillance to monitor resident health status and prevent communicable disease outbreaks within the mega-shelter. This was accomplished through direct and indirect resident health assessments, along with coordinated prevention and disease control efforts. Despite HCPH’s 20-day active response, and identification of two relatively small but potentially worrisome communicable disease outbreaks, no large-scale disease outbreaks occurred within the NRG Center mega-shelter.

Methods: Active surveillance was conducted in the NRG shelter to rapidly detect communicable and high-consequence illness and to prevent disease transmission. An online survey tool and novel epidemiology consulting method were developed to aid in this surveillance. Surveillance included daily review of onsite medical, mental health, pharmacy, and vaccination activities, as well as nightly cot-to-cot resident health surveys. Symptoms of infectious disease, exacerbation of chronic disease, and mental health issues among evacuees were closely monitored. Rapid epidemiology consultations were performed for shelter residents displaying symptoms consistent with communicable illness or other signs of distress during nightly cot surveys. Onsite rapid assay tests and public health laboratory testing were used to confirm disease diagnoses. When indicated, disease control measures were implemented and residents referred for further evaluation. Frequencies and percentages were used in the descriptive analysis.

Results: Harris County’s NRG Center mega-shelter housed 3,365 evacuees at its peak. 3,606 household health surveys were completed during 20 days of active surveillance, representing 7,152 individual resident evaluations, and 395 epidemiology consultations. Multifaceted surveillance uncovered influenza-like illness and gastrointestinal (GI) complaints, revealing an Influenza A outbreak of 20 cases, 3 isolated cases of strep throat, and a Norovirus cluster of 5 cases. Disease control activities included creation of respiratory and GI isolation rooms, provision of over 771 influenza vaccinations, generous distribution of hand sanitizer throughout the shelter, placement of hygiene signage, and frequent bilingual public health public service announcements in the dormitory areas. No widespread outbreaks of communicable disease occurred. Additionally, a number of shelter residents were referred to the clinic after reporting exacerbation of chronic conditions or mental health concerns, including one individual with suicidal ideation.

Conclusions: Effective public health surveillance and implementation of disease control measures in disaster shelters are critical to detecting and preventing communicable illness. HCPH’s rigorous surveillance and response system in the NRG Center mega-shelter, including online survey tool and novel consultation method, resulted in timely identification and isolation of patients with gastrointestinal and influenza-like illness. These were likely key factors in the successful prevention of widespread disease transmission. Additional success factors included successful partnerships with onsite clinical and pharmacy teams, cooperative and engaged shelter leadership, synergistic internal surveillance team dynamics, availability of student volunteers, sufficient quantities of influenza vaccine, and access to mobile survey technology. Challenges, mostly related to scope and magnitude of response, included lack of pre-designed survey tools, relatively new staff without significant disaster experience, and simultaneous management of multiple surveillance activities within the community. Personal hurricane-related losses experienced by HCPH staff also impacted response efforts. HCPH’s rich disaster response experiences at the NRG mega-shelter and developed surveillance tools can serve as a planning guide for future public health emergencies in Harris County and other jurisdictions.

