Evolutionary and Functional Implications of Hypervariable Loci Within the Skin Virome

Mapping Intimacies ◽

10.1101/078170 ◽

2016 ◽

Author(s):

Geoffrey D. Hannigan ◽

Qi Zheng ◽

Jacquelyn S. Meisel ◽

Samuel Minot ◽

Frederick D. Bushman ◽

...

Keyword(s):

Gene Annotation ◽

Viral Evolution ◽

Human Microbiome ◽

Purifying Selection ◽

Genomic Variation ◽

Healthy Human ◽

Metagenomic Sequence ◽

Genomic Variability ◽

Functional Implications ◽

Hypervariable Loci

ABSTRACT Localized genomic variability is crucial for the ongoing conflicts between infectious microbes and their hosts. An understanding of evolutionary and adaptive patterns associated with genomic variability will help guide development of vaccines and anti-microbial agents. While most analyses of the human microbiome have focused on taxonomic classification and gene annotation, we investigated genomic variation of skin-associated viral communities. We evaluated patterns of viral genomic variation across 16 healthy human volunteers. HPV and Staphylococcus phages contained 106 and 465 regions of diversification, or hypervariable loci, respectively. Propionibacterium phage genomes were minimally divergent and contained no hypervariable loci. Genes containing hypervariable loci were involved in functions including host tropism and immune evasion. HPV and Staphylococcus phage hypervariable loci were associated with purifying selection. Amino acid substitution patterns were virus dependent, as were predictions of their phenotypic effects. We identified diversity generating retroelements as one likely mechanism driving hypervariability. We validated these findings in an independently collected skin metagenomic sequence dataset, suggesting that these features of skin virome genomic variability are widespread. Our results highlight the genomic variation landscape of the skin virome and provide a foundation for better understanding community viral evolution and the functional implications of genomic diversification of skin viruses.

Evolutionary and functional implications of hypervariable loci within the skin virome

PeerJ ◽

10.7717/peerj.2959 ◽

2017 ◽

Vol 5 ◽

pp. e2959 ◽

Cited By ~ 14

Author(s):

Geoffrey D. Hannigan ◽

Qi Zheng ◽

Jacquelyn S. Meisel ◽

Samuel S. Minot ◽

Frederick D. Bushman ◽

...

Keyword(s):

Antimicrobial Agents ◽

Gene Annotation ◽

Human Microbiome ◽

Purifying Selection ◽

Genomic Variation ◽

Healthy Human ◽

Metagenomic Sequence ◽

Genomic Variability ◽

Functional Implications ◽

Hypervariable Loci

Localized genomic variability is crucial for the ongoing conflicts between infectious microbes and their hosts. An understanding of evolutionary and adaptive patterns associated with genomic variability will help guide development of vaccines and antimicrobial agents. While most analyses of the human microbiome have focused on taxonomic classification and gene annotation, we investigated genomic variation of skin-associated viral communities. We evaluated patterns of viral genomic variation across 16 healthy human volunteers. Human papillomavirus (HPV) and Staphylococcus phages contained 106 and 465 regions of diversification, or hypervariable loci, respectively. Propionibacterium phage genomes were minimally divergent and contained no hypervariable loci. Genes containing hypervariable loci were involved in functions including host tropism and immune evasion. HPV and Staphylococcus phage hypervariable loci were associated with purifying selection. Amino acid substitution patterns were virus dependent, as were predictions of their phenotypic effects. We identified diversity generating retroelements as one likely mechanism driving hypervariability. We validated these findings in an independently collected skin metagenomic sequence dataset, suggesting that these features of skin virome genomic variability are widespread. Our results highlight the genomic variation landscape of the skin virome and provide a foundation for better understanding community viral evolution and the functional implications of genomic diversification of skin viruses.

Peer Review #1 of "Evolutionary and functional implications of hypervariable loci within the skin virome (v0.2)"

10.7287/peerj.2959v0.2/reviews/1 ◽

2017 ◽

Keyword(s):

Peer Review ◽

Functional Implications ◽

Hypervariable Loci

Metabolic network-guided binning of metagenomic sequence fragments

Bioinformatics ◽

10.1093/bioinformatics/btv671 ◽

2015 ◽

Vol 32 (6) ◽

pp. 867-874 ◽

Cited By ~ 4

Author(s):

Matthew B. Biggs ◽

Jason A. Papin

Keyword(s):

Metabolic Network ◽

Dna Sequences ◽

Metabolic Networks ◽

Human Microbiome ◽

Environmental Dna ◽

Human Microbiome Project ◽

Supplementary Information ◽

Metagenomic Sequence ◽

Connectivity Score ◽

Genome Scale

Abstract Motivation: Most microbes on Earth have never been grown in a laboratory, and can only be studied through DNA sequences. Environmental DNA sequence samples are complex mixtures of fragments from many different species, often unknown. There is a pressing need for methods that can reliably reconstruct genomes from complex metagenomic samples in order to address questions in ecology, bioremediation, and human health. Results: We present the SOrting by NEtwork Completion (SONEC) approach for assigning reactions to incomplete metabolic networks based on a metabolite connectivity score. We successfully demonstrate proof of concept in a set of 100 genome-scale metabolic network reconstructions, and delineate the variables that impact reaction assignment accuracy. We further demonstrate the integration of SONEC with existing approaches (such as cross-sample scaffold abundance profile clustering) on a set of 94 metagenomic samples from the Human Microbiome Project. We show that not only does SONEC aid in reconstructing species-level genomes, but it also improves functional predictions made with the resulting metabolic networks. Availability and implementation: The datasets and code presented in this work are available at: https://bitbucket.org/mattbiggs/sorting_by_network_completion/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Gastrointestinal carriage is a major reservoir of K. pneumoniae infection in intensive care patients

10.1101/096446 ◽

2016 ◽

Cited By ~ 2

Author(s):

Claire L Gorrie ◽

Mirjana Mirceta ◽

Ryan R Wick ◽

David J Edwards ◽

Richard A Strugnell ◽

...

Keyword(s):

Intensive Care ◽

Human Microbiome ◽

Significant Risk ◽

Clinical Information ◽

Epidemiological Data ◽

Significant Risk Factor ◽

Opportunistic Pathogen ◽

Healthy Human ◽

Icu Patients ◽

Gut Colonization

Abstract Background: Klebsiella pneumoniae is an opportunistic pathogen and a leading cause of hospital-associated (HA) infections. Patients in intensive care units (ICUs) are particularly at risk, and outbreaks are frequently reported in ICUs. K. pneumoniae is also part of the healthy human microbiome, providing a potential reservoir for HA infection. However, the frequency of K. pneumoniae gut colonization and its contribution to HA infections are not well characterized. Methods: We conducted a one-year prospective cohort study of ICU patients. Participants (n=498) were screened for rectal and throat carriage of K. pneumoniae shortly after admission, and clinical information was extracted from hospital records. K. pneumoniae isolated from screening swabs and clinical diagnostic samples were characterized using whole genome sequencing. Genomic and epidemiological data were combined to identify likely transmission events. Results and Conclusions: K. pneumoniae carriage frequencies were estimated at 6% (95% CI, 3%-8%) amongst ICU patients admitted direct from the community, and 19% (95% CI, 14%-51%) amongst those who had recent contact with healthcare. Gut colonisation on admission was significantly associated with subsequent K. pneumoniae infection (infection risk 16% vs 3%, OR=6.9, p<0.001), and genome data indicated a match between carriage and infection isolates in most patients. Five likely transmission chains were identified, resulting in six infections (12% of K. pneumoniae infections in ICU). In contrast, 49% of K. pneumoniae infections were caused by a strain that was unique to the patient, and 48% of patients with K. pneumoniae infections who participated in screening were positive for prior colonisation. These data confirm K. pneumoniae colonisation is a significant risk factor for subsequent infection in ICU, and indicate that half of all K. pneumoniae infections result from patients' own microbiota. Screening for colonisation on admission could limit risk of infection in the colonised patient and others.
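As a rough illustration of how an odds ratio relates to the two infection risks quoted in this abstract, the sketch below converts each risk to odds and takes their ratio. The function names are hypothetical, and the inputs are the rounded percentages from the abstract (16% and 3%), not the study's raw counts, which are not given here. The result therefore does not reproduce the reported OR of 6.9 exactly.

```python
def odds(risk: float) -> float:
    """Convert a risk (probability) into odds."""
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must lie strictly between 0 and 1")
    return risk / (1.0 - risk)


def odds_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Ratio of the odds in the exposed group to the odds in the unexposed group."""
    return odds(risk_exposed) / odds(risk_unexposed)


# Rounded risks quoted in the abstract: colonised vs not colonised on admission.
approx_or = odds_ratio(0.16, 0.03)
print(f"Odds ratio from rounded risks: {approx_or:.2f}")  # about 6.16
```

The gap between this value and the published 6.9 is what one would expect when an odds ratio is recomputed from rounded percentages rather than from the underlying 2x2 table.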

The ancient fusogen EnvP(b)1 is expressed in human tissues and its structure informs the evolution of gammaretrovirus envelope proteins

10.1101/2020.04.22.056234 ◽

2020 ◽

Author(s):

Kevin R. McCarthy ◽

Joseph L. Timpona ◽

Simon Jenni ◽

Vesna Brusic ◽

Welkin E. Johnson ◽

...

Keyword(s):

Receptor Binding ◽

Transcriptional Control ◽

Purifying Selection ◽

Binding Domain ◽

Endogenous Retroviruses ◽

Envelope Gene ◽

Human Tissues ◽

Receptor Binding Domain ◽

Functional Protein ◽

Healthy Human

ABSTRACT Host genomes have acquired diversity from viruses through the capture of viral elements, often from endogenous retroviruses (ERVs). These viral elements contribute new transcriptional control elements and new protein encoding genes, and their refinement through evolution can generate novel physiological functions for the host. EnvP(b)1 is an endogenous retroviral envelope gene found in human and other primate genomes. We show that EnvP(b)1 arose very early in the evolution of primates, i.e. at least 40-47 million years ago, but has nevertheless retained its ability to fuse primate cells. We have detected similar sequences in the genome of a lemur species, suggesting that a progenitor virus may have circulated 55+ million years ago. We demonstrate that EnvP(b)1 protein is expressed in multiple human tissues and is fully processed, rendering it competent to fuse cells. This activated fusogen is expressed in multiple healthy human tissues and is under purifying selection, suggesting that its expression is selectively advantageous. We determined a structure of the inferred receptor binding domain of human EnvP(b)1, revealing close structural similarities between this Env protein and those of currently circulating leukemia viruses, despite poor sequence conservation. This observation highlights a common scaffold from which novel receptor binding specificities have evolved. The evolutionary plasticity of this domain may underlie the diversity of related Envs in circulating viruses and coopted elements alike. The function of EnvP(b)1 in primates remains unknown.

SIGNIFICANCE STATEMENT: Organisms can access genetic and functional novelty by capturing viral elements within their genomes, where they can evolve to drive new cellular or organismal processes. We demonstrate that a retrovirus envelope gene, EnvP(b)1, has been maintained as a functional protein for 40 to ≥55 million years and is expressed as a protein in multiple healthy human tissues. We believe it has an unknown function in primates. We determined the structure of its inferred receptor binding domain and compared it with the same domain in modern viruses. We find a common conserved architecture that underlies the varied receptor binding activity of divergent Env genes. The modularity and versatility of this domain may underpin the evolutionary success of this clade of fusogens.

Searching more genomic sequence with less memory for fast and accurate metagenomic profiling

10.1101/036681 ◽

2016 ◽

Author(s):

Shea N Gardner ◽

Sasha K Ames ◽

Maya B Gokhale ◽

Tom R Slezak ◽

Jonathan Allen

Keyword(s):

Large Scale ◽

Genomic Sequence ◽

Sequence Data ◽

Low Cost ◽

False Negative ◽

Human Microbiome ◽

Human Microbiome Project ◽

Metagenomic Data ◽

Reference Database ◽

Metagenomic Sequence

Software for rapid, accurate, and comprehensive microbial profiling of metagenomic sequence data on a desktop will play an important role in large scale clinical use of metagenomic data. Here we describe LMAT-ML (Livermore Metagenomics Analysis Toolkit-Marker Library) which can be run with 24 GB of DRAM memory, an amount available on many clusters, or with 16 GB DRAM plus a 24 GB low cost commodity flash drive (NVRAM), a cost effective alternative for desktop or laptop users. We compared results from LMAT with five other rapid, low-memory tools for metagenome analysis for 131 Human Microbiome Project samples, and assessed discordant calls with BLAST. All the tools except LMAT-ML reported overly specific or incorrect species and strain resolution of reads that were in fact much more widely conserved across species, genera, and even families. Several of the tools misclassified reads from synthetic or vector sequence as microbial or human reads as viral. We attribute the high numbers of false positive and false negative calls to a limited reference database with inadequate representation of known diversity. Our comparisons with real world samples show that LMAT-ML is the only tool tested that classifies the majority of reads, and does so with high accuracy.

Evolutionary and Functional Lessons from Human-Specific Amino-Acid Substitution Matrices

10.21203/rs.3.rs-63387/v1 ◽

2020 ◽

Author(s):

Tair Shauli ◽

Nadav Brandes ◽

Michal Linial

Keyword(s):

Amino Acid ◽

Protein Function ◽

Purifying Selection ◽

Post Translational Modification ◽

Single Nucleotide Variants ◽

Healthy Human ◽

Substitution Rates ◽

Substitution Matrices ◽

Specific Direction ◽

Ion Binding Sites

Abstract The characterization of human genetic variation in coding regions is fundamental to the understanding of protein function, structure and evolution. Amino-acid (AA) substitution matrices encapsulate the stochastic nature of such proteomic variation and are widely used in studying protein families and evolutionary processes. The conventional substitution matrices, namely BLOSUM and PAM, were constructed to reflect polymorphism across species. In this study, we analyzed the frequencies of >4.8M single nucleotide variants within the healthy human population to accurately represent proteomic variability within the human species, at codon and AA resolution. Our model exposes various AA substitutions which are observed more frequently in one specific direction than in the opposite direction. We further demonstrate that nucleotide substitution rates only partially determine AA substitution rates. Finally, we investigate AA substitutions in post-translational modification and ion-binding sites, exposing purifying selection over a range of residue-based functions. These novel matrices provide a robust baseline for the analysis of protein variation in health and disease.
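The directional asymmetry mentioned in this abstract (a substitution seen more often in one direction than in the reverse) can be illustrated with a small sketch. This is not the authors' model: the function name, the pseudocount, and the toy counts are all hypothetical, and the score is simply a log2 ratio of forward to reverse counts for each unordered amino-acid pair.

```python
from collections import Counter
from math import log2


def direction_bias(counts, pseudocount=1.0):
    """For each unordered pair (a, b) with a < b, return
    log2((count[a->b] + pseudocount) / (count[b->a] + pseudocount)).
    Positive values mean a->b is seen more often than b->a."""
    pairs = {tuple(sorted(pair)) for pair in counts if pair[0] != pair[1]}
    return {
        (a, b): log2(
            (counts.get((a, b), 0) + pseudocount)
            / (counts.get((b, a), 0) + pseudocount)
        )
        for a, b in sorted(pairs)
    }


# Hypothetical toy counts of (reference AA, variant AA) observations.
toy_counts = Counter({
    ("R", "H"): 30, ("H", "R"): 10,   # asymmetric pair
    ("A", "V"): 20, ("V", "A"): 20,   # symmetric pair
})

bias = direction_bias(toy_counts)
for pair, score in bias.items():
    print(pair, round(score, 3))
```

With these toy numbers the symmetric pair scores zero, while the H/R pair scores negative because R to H outnumbers H to R.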

Genomic Variability of Serial Human Isolates of Salmonella enterica Serovar Typhimurium Associated with Prolonged Carriage

Journal of Clinical Microbiology ◽

10.1128/jcm.01733-15 ◽

2015 ◽

Vol 53 (11) ◽

pp. 3507-3514 ◽

Cited By ~ 11

Author(s):

Sophie Octavia ◽

Qinning Wang ◽

Mark M. Tanaka ◽

Vitali Sintchenko ◽

Ruiting Lan

Keyword(s):

Nonsense Mutation ◽

Salmonella Enterica ◽

Salmonella Enterica Serovar Typhimurium ◽

Genomic Variation ◽

Nucleotide Polymorphisms ◽

Nonsense Mutations ◽

Genomic Variability ◽

Content Type ◽

Serovar Typhimurium ◽

Severe Gastroenteritis

Salmonella enterica serovar Typhimurium is an important foodborne human pathogen that often causes self-limiting but severe gastroenteritis. Prolonged excretion of S. Typhimurium after the infection can lead to secondary transmissions. However, little is known about within-host genomic variation in bacteria associated with asymptomatic shedding. Genomes of 35 longitudinal isolates of S. Typhimurium recovered from 11 patients (children and adults) with culture-confirmed gastroenteritis were sequenced. There were three or four isolates obtained from each patient. Single nucleotide polymorphisms (SNPs) were analyzed in these isolates, which were recovered between 1 and 279 days after the initial diagnosis. Limited genomic variation (5 SNPs or fewer) was associated with short- and long-term carriage of S. Typhimurium. None of the isolates was shown to be due to reinfection. SNPs occurred randomly, and the majority of the SNPs were nonsynonymous. Two nonsense mutations were observed. A nonsense mutation in flhC rendered the isolate nonmotile, whereas the significance of a nonsense mutation in yihV is unknown. The estimated mutation rate is 1.49 × 10^-6 substitutions per site per year. S. Typhimurium isolates excreted in stools following acute gastroenteritis in children and adults demonstrated limited genomic variability over time, regardless of the duration of carriage. These findings have important implications for the detection of possible transmission events suspected by public health genomic surveillance of S. Typhimurium infections.
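To give a feel for what a per-site rate of this size means at the genome level, the sketch below multiplies the rate from the abstract by a genome size and a sampling interval. The genome size of 4.9 Mb is an assumption (a round figure for S. Typhimurium, not stated in the abstract), and the function name is hypothetical.

```python
RATE_PER_SITE_PER_YEAR = 1.49e-6    # estimate quoted in the abstract
ASSUMED_GENOME_SIZE_BP = 4_900_000  # assumption: approximate genome size, not from the abstract


def expected_snps(days: float,
                  rate: float = RATE_PER_SITE_PER_YEAR,
                  genome_size: int = ASSUMED_GENOME_SIZE_BP) -> float:
    """Expected number of substitutions per genome after `days` days,
    assuming a constant rate across all sites."""
    return rate * genome_size * days / 365.0


# Sampling window reported in the abstract: 1 to 279 days after diagnosis.
for days in (1, 279, 365):
    print(f"{days:>3} days: {expected_snps(days):.2f} expected SNPs per genome")
```

Under these assumptions the expectation is a handful of SNPs per genome over the longest sampling interval, the same order of magnitude as the "5 SNPs or fewer" reported in the abstract.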

Protein length distribution is remarkably consistent across Life

10.1101/2021.12.03.470944 ◽

2021 ◽

Author(s):

Yannis Nevers ◽

Natasha Glover ◽

Christophe Dessimoz ◽

Odile Lecompte

Keyword(s):

Gene Annotation ◽

Length Distribution ◽

Gc Content ◽

Purifying Selection ◽

Genomic Features ◽

Protein Length ◽

Open Questions ◽

Size Number ◽

A Genome ◽

Living Species

Abstract In every living species, the function of a protein depends on its organisation of structural domains, and the length of a protein is a direct reflection of this. Because every species evolved under different evolutionary pressures, the protein length distribution, much like other genomic features, is expected to vary across species. Here we evaluated this diversity by comparing protein length distribution across 2,326 species (1,688 bacteria, 153 archaea and 485 eukaryotes). We found that proteins tend to be on average slightly longer in eukaryotes than in bacteria or archaea, but that the variation of length distribution across species is low, especially compared to the variation of other genomic features (genome size, number of proteins, gene length, GC content, isoelectric points of proteins). Moreover, most cases of atypical protein length distribution appear to be due to artifactual gene annotation, suggesting the actual variation of protein length distribution across species is even smaller. These results open the way for developing a genome annotation quality metric based on protein length distribution to complement conventional quality measures. Overall, our findings show that protein length distribution between living species is more consistent than previously thought, and provide evidence for a universal purifying selection on protein length, whose mechanism and fitness effect remain intriguing open questions.

Into the deep (sequence) of the foot-and-mouth disease virus gene pool: bottlenecks and adaptation during infection in naïve and vaccinated cattle

10.1101/850743 ◽

2019 ◽

Author(s):

Ian Fish ◽

Carolina Stenfeldt ◽

Rachel M. Palinski ◽

Steven J. Pauszek ◽

Jonathan Arzt

Keyword(s):

Disease Virus ◽

Asymptomatic Carrier ◽

Viral Evolution ◽

Foot And Mouth Disease ◽

Mouth Disease ◽

Genomic Variation ◽

Vaccination Programs ◽

Natural Host Species ◽

Mouth Disease Virus ◽

Foot And Mouth

Abstract Foot-and-mouth disease virus (FMDV), like many RNA viruses, infects hosts as a population of closely related viruses referred to as a quasispecies. The behavior of this quasispecies has not been described in detail over the full course of infection in a natural host species. In this study, virus samples taken from vaccinated and non-vaccinated cattle up to 35 days post experimental infection with FMDV A24-Cruzeiro were analyzed by deep-sequencing. Vaccination induced significant differences, compared to viruses from non-vaccinated cattle, in virus substitution rates, entropy, and evidence for adaptation. Genomic variation detected during early infection was found to reflect the diversity inherited from the source virus (inoculum), whereas by 12 days post infection (dpi) dominant viruses were defined by newly acquired mutations. In most serially sampled cattle, mutations conferring recognized fitness gain occurred within numerous genetic backgrounds, often associated with selective sweeps. Persistent infections always included multiple FMDV subpopulations, suggesting independently maintained foci of infection within the nasopharyngeal mucosa. Although vaccination prevented disease, subclinical infection in this group was associated with very early bottlenecks which subsequently reduced the diversity within the virus population. This implies an added consequence of vaccination in the control of foot-and-mouth disease. Viruses sampled from both animal cohorts contained putative antigenic escape mutations. However, these mutations occurred during later stages of infection, at which time transmission between animals is less likely to occur.

Importance: Preparedness and control of foot-and-mouth disease virus have substantial, yet distinct implications in endemic and free regions. Viral evolution and emergence of novel strains are of critical concern in both settings. The factors that contribute to the asymptomatic carrier state, a common form of long-term FMDV infection in cattle and other species, are important but not well-understood. This experimental study of foot-and-mouth disease virus in cattle explored the evolution of the pathogen through detailed sampling and analytical methods in both vaccinated and non-vaccinated hosts. Significant differences were identified between the viruses subclinically infecting vaccinated animals and those causing clinical disease in the non-vaccinated cohort. These results can benefit vaccination programs and contribute to the understanding of persistent infection of cattle.
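The abstract names entropy as one of the measures compared between cohorts but does not say how it was computed. A common choice for deep-sequencing data is per-site Shannon entropy over base counts; the sketch below shows that calculation under this assumption. The function name and the read counts are hypothetical.

```python
from math import log2


def site_entropy(base_counts: dict) -> float:
    """Shannon entropy (in bits) of the base frequencies at one genome position.
    0 means every read carries the same base; 2 is the maximum for four bases."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this site")
    return sum(
        (n / total) * log2(total / n)
        for n in base_counts.values()
        if n > 0
    )


# Hypothetical read counts at three positions.
fixed_site = {"A": 100, "C": 0, "G": 0, "T": 0}
mixed_site = {"A": 50, "G": 50}
uniform_site = {"A": 25, "C": 25, "G": 25, "T": 25}

for name, site in [("fixed", fixed_site), ("mixed", mixed_site), ("uniform", uniform_site)]:
    print(name, site_entropy(site))
```

Averaging such per-site values across a genome gives one number per sample, which is one plausible way a reduction in population diversity after a bottleneck would show up.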
