The single-species metagenome: subtyping Staphylococcus aureus core genome sequences from shotgun metagenomic data

PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2571 ◽  
Author(s):  
Sandeep J. Joseph ◽  
Ben Li ◽  
Robert A. Petit III ◽  
Zhaohui S. Qin ◽  
Lyndsey Darrow ◽  
...  

In this study we developed a genome-based method for detecting Staphylococcus aureus subtypes from metagenome shotgun sequence data. We used a binomial mixture model and the coverage counts at >100,000 known S. aureus SNP (single nucleotide polymorphism) sites derived from prior comparative genomic analysis to estimate the proportion of 40 subtypes in metagenome samples. We were able to obtain >87% sensitivity and >94% specificity at 0.025X coverage for S. aureus. We found that 321 and 149 metagenome samples from the Human Microbiome Project and the metaSUB analysis of the New York City subway, respectively, contained S. aureus at genome coverage >0.025. In both projects, CC8 and CC30 were the most common S. aureus clonal complexes encountered. We found evidence that the subtype composition at different body sites of the same individual was more similar than expected from random sampling and more limited evidence that certain body sites were enriched for particular subtypes. One surprising finding was the apparent high frequency of CC398, a lineage often associated with livestock, in samples from the tongue dorsum. Epidemiologic analysis of the HMP subject population suggested that high BMI (body mass index) and health insurance are possibly associated with S. aureus carriage but there was limited power to identify factors linked to carriage of even the most common subtype. In the NYC subway data, we found a small signal of geographic distance affecting subtype clustering but other unknown factors influence taxonomic distribution of the species around the city.
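The 0.025X threshold quoted above can be made concrete with a small calculation. The sketch below is not taken from the paper: it assumes the usual definition of average coverage (total sequenced bases divided by genome length), an approximate S. aureus genome size of 2.8 Mb, and hypothetical function names chosen for illustration.

```python
# Illustrative only: names and the genome size are assumptions, not the authors' code.
S_AUREUS_GENOME_BP = 2_800_000  # assumed approximate S. aureus genome length
COVERAGE_THRESHOLD = 0.025      # threshold quoted in the abstract


def genome_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average coverage: total sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def passes_threshold(n_reads: int, read_length: int,
                     genome_size: int = S_AUREUS_GENOME_BP,
                     threshold: float = COVERAGE_THRESHOLD) -> bool:
    """True when the sample's coverage is strictly above the threshold."""
    return genome_coverage(n_reads, read_length, genome_size) > threshold
```

Under these assumptions, roughly 700 reads of 100 bp assigned to S. aureus sit right at the threshold, which shows how little sequence a sample needs to contribute before it is counted.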

2015 ◽  
Author(s):  
Sandeep J. Joseph ◽  
Ben Li ◽  
Robert A. Petit ◽  
Zhaohui S. Qin ◽  
Lyndsey A. Darrow ◽  
...  

Abstract Metagenome shotgun sequence projects offer the potential for large scale biogeographic analysis of microbial species. In this project we developed a method for detecting 33 common subtypes of the pathogenic bacterium Staphylococcus aureus. We used a binomial mixture model implemented in the binstrain software and the coverage counts at > 100,000 known S. aureus SNP (single nucleotide polymorphism) sites derived from prior comparative genomic analysis to estimate the proportion of each subtype in metagenome samples. Using this pipeline we were able to obtain > 87% sensitivity and > 94% specificity when testing on low genome coverage samples of diverse S. aureus strains (0.025X). We found that 321 and 149 metagenome samples from the Human Microbiome Project and the metaSUB analysis of the New York City subway, respectively, contained S. aureus at genome coverage > 0.025. In both projects, CC8 and CC30 were the most common S. aureus subtypes encountered. We found evidence that the subtype composition at different body sites of the same individual was more similar than expected from random sampling and more limited evidence that certain body sites were enriched for particular subtypes. One surprising finding was the apparent high frequency of CC398, a lineage associated with livestock, in samples from the tongue dorsum. Epidemiologic analysis of the HMP subject population suggested that high BMI (body mass index) and health insurance are risk factors for S. aureus carriage but there was limited power to find factors linked to carriage of even the most common subtype. In the NYC subway data, we found a small signal of geographic distance affecting subtype clustering but other unknown factors influence taxonomic distribution of the species around the city. We argue that pathogen detection in metagenome samples requires the use of subtypes based on whole species population genomic analysis rather than using ad hoc collections of reference strains.
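The idea of estimating subtype proportions from allele counts at known SNP sites can be sketched with a small mixture model. The code below is not the binstrain implementation and makes no claim about its internals: it is a minimal expectation-maximization (EM) sketch that assumes each subtype has a known 0/1 genotype at every site, that reads are drawn independently, and that a fixed per-read error rate applies. All names (`estimate_proportions`, `genotypes`, `error`) are hypothetical.

```python
# Illustrative EM for mixture proportions; an assumption-laden sketch,
# not the method used by binstrain.
def estimate_proportions(genotypes, alt_counts, ref_counts,
                         error=0.01, iterations=200):
    """Estimate subtype proportions from per-site allele counts.

    genotypes[k][j] is 1 if subtype k carries the alternate allele at site j.
    alt_counts[j] / ref_counts[j] are reads showing the alternate / reference
    allele at site j. Returns a list of proportions that sums to 1.
    """
    n_subtypes = len(genotypes)
    n_sites = len(alt_counts)
    total_reads = sum(alt_counts) + sum(ref_counts)
    if total_reads == 0:
        raise ValueError("no reads cover the SNP sites")

    # Probability that a read from subtype k shows the alternate allele at site j.
    p_alt = [[(1.0 - error) if g else error for g in row] for row in genotypes]
    pi = [1.0 / n_subtypes] * n_subtypes  # start from a uniform mixture

    for _ in range(iterations):
        expected = [0.0] * n_subtypes
        for j in range(n_sites):
            mix_alt = sum(pi[k] * p_alt[k][j] for k in range(n_subtypes))
            mix_ref = 1.0 - mix_alt
            for k in range(n_subtypes):
                # E-step: expected number of reads at site j that came from subtype k.
                if alt_counts[j]:
                    expected[k] += alt_counts[j] * pi[k] * p_alt[k][j] / mix_alt
                if ref_counts[j]:
                    expected[k] += ref_counts[j] * pi[k] * (1.0 - p_alt[k][j]) / mix_ref
        # M-step: proportions are the expected share of reads per subtype.
        pi = [value / total_reads for value in expected]
    return pi
```

For example, with two hypothetical subtypes whose genotypes differ at two sites, `estimate_proportions([[1, 0], [0, 1]], [75, 25], [25, 75])` returns proportions close to a 3:1 mixture. Sites where all subtypes share the same allele carry no information about the mixture, which is one reason a large panel of discriminating SNP sites helps at low coverage.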


2016 ◽  
Author(s):  
Shea N Gardner ◽  
Sasha K Ames ◽  
Maya B Gokhale ◽  
Tom R Slezak ◽  
Jonathan Allen

Software for rapid, accurate, and comprehensive microbial profiling of metagenomic sequence data on a desktop will play an important role in large scale clinical use of metagenomic data. Here we describe LMAT-ML (Livermore Metagenomics Analysis Toolkit-Marker Library) which can be run with 24 GB of DRAM memory, an amount available on many clusters, or with 16 GB DRAM plus a 24 GB low cost commodity flash drive (NVRAM), a cost effective alternative for desktop or laptop users. We compared results from LMAT with five other rapid, low-memory tools for metagenome analysis for 131 Human Microbiome Project samples, and assessed discordant calls with BLAST. All the tools except LMAT-ML reported overly specific or incorrect species and strain resolution of reads that were in fact much more widely conserved across species, genera, and even families. Several of the tools misclassified reads from synthetic or vector sequences as microbial, or human reads as viral. We attribute the high numbers of false positive and false negative calls to a limited reference database with inadequate representation of known diversity. Our comparisons with real world samples show that LMAT-ML is the only tool tested that classifies the majority of reads, and does so with high accuracy.


Pathogens ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 86
Author(s):  
Erin M. Garcia ◽  
Myrna G. Serrano ◽  
Laahirie Edupuganti ◽  
David J. Edwards ◽  
Gregory A. Buck ◽  
...  

Gardnerella vaginalis has recently been split into 13 distinct species. In this study, we tested the hypotheses that species-specific variations in the vaginolysin (VLY) amino acid sequence could influence the interaction between the toxin and vaginal epithelial cells and that VLY variation may be one factor that distinguishes less virulent or commensal strains from more virulent strains. This was assessed by bioinformatic analyses of publicly available Gardnerella spp. sequences and quantification of cytotoxicity and cytokine production from purified, recombinantly produced versions of VLY. After identifying conserved differences that could distinguish distinct VLY types, we analyzed metagenomic data from a cohort of female subjects from the Vaginal Human Microbiome Project to investigate whether these different VLY types exhibited any significant associations with symptoms or Gardnerella spp.-relative abundance in vaginal swab samples. While Type 1 VLY was most prevalent among the subjects and may be associated with increased reports of symptoms, subjects with Type 2 VLY dominant profiles exhibited increased relative Gardnerella spp. abundance. Our findings suggest that amino acid differences alter the interaction of VLY with vaginal keratinocytes, which may potentiate differences in bacterial vaginosis (BV) immunopathology in vivo.


2021 ◽  
Vol 9 (2) ◽  
pp. 348
Author(s):  
Florian Tagini ◽  
Trestan Pillonel ◽  
Claire Bertelli ◽  
Katia Jaton ◽  
Gilbert Greub

The Mycobacterium kansasii species comprises six subtypes that were recently classified into six closely related species: Mycobacterium kansasii (formerly M. kansasii subtype 1), Mycobacterium persicum (subtype 2), Mycobacterium pseudokansasii (subtype 3), Mycobacterium ostraviense (subtype 4), Mycobacterium innocens (subtype 5) and Mycobacterium attenuatum (subtype 6). Together with Mycobacterium gastri, they form the M. kansasii complex. M. kansasii is the most frequent and most pathogenic species of the complex. M. persicum is classically associated with diseases in immunosuppressed patients, and the other species are mostly colonizers, and are only very rarely reported in ill patients. Comparative genomics was used to assess the genetic determinants leading to the pathogenicity of members of the M. kansasii complex. The genomes of 51 isolates collected from patients with and without disease were sequenced and compared with 24 publicly available genomes. The pathogenicity of each isolate was determined based on the clinical records or public metadata. A comparative genomic analysis showed that all M. persicum, M. ostraviense, M. innocens and M. gastri isolates lacked the ESX-1-associated EspACD locus that is thought to play a crucial role in the pathogenicity of M. tuberculosis and other non-tuberculous mycobacteria. Furthermore, M. kansasii was the only species exhibiting a 25-kb genomic island encoding 17 type-VII secretion system-associated proteins. Finally, a genome-wide association analysis revealed that two consecutive genes encoding a hemerythrin-like protein and a nitroreductase-like protein were significantly associated with pathogenicity. These two genes may be involved in the resistance to reactive oxygen and nitrogen species, a required mechanism for the intracellular survival of bacteria. Three non-pathogenic M. kansasii lacked these genes likely due to two distinct distributive conjugal transfers (DCTs) between M. attenuatum and M. kansasii, and one DCT between M. persicum and M. kansasii. To our knowledge, this is the first study linking DCT to reduced pathogenicity.


2021 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying genetic underpinnings for its unique traits. In this study, we constructed a high-quality de novo assembly of an elite soybean cultivar Jidou 17 (JD17) with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis based on the reference genome of JD17 with three published soybeans (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984 - 46,912 PAVs spanning 13.1 - 46.9 Mb in size, and 5 - 53 large PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741 - 3,664,629 SNPs and 446,689 - 800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects from these variants were further evaluated. It was found that the coding sequences of 9 nitrogen fixation-related genes were greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.


mSphere ◽  
2019 ◽  
Vol 4 (6) ◽  
Author(s):  
Sophie L. Nixon ◽  
Rebecca A. Daly ◽  
Mikayla A. Borton ◽  
Lindsey M. Solden ◽  
Susan A. Welch ◽  
...  

ABSTRACT Bacteria of the phylum Verrucomicrobia are prevalent and are particularly common in soil and freshwater environments. Their cosmopolitan distribution and reported capacity for polysaccharide degradation suggest that members of Verrucomicrobia are important contributors to carbon cycling across Earth’s ecosystems. Despite their prevalence, the Verrucomicrobia are underrepresented in isolate collections and genome databases; consequently, their ecophysiological roles may not be fully realized. Here, we expand genomic sampling of the Verrucomicrobia phylum by describing a novel genus, “Candidatus Marcellius,” belonging to the order Opitutales. “Ca. Marcellius” was recovered from a shale-derived produced fluid metagenome collected 313 days after hydraulic fracturing, the deepest environment from which a member of the Verrucomicrobia has been recovered to date. We uncover genomic attributes that may explain the capacity of this organism to inhabit a shale gas well, including the potential for utilization of organic polymers common in hydraulic fracturing fluids, nitrogen fixation, adaptation to high salinities, and adaptive immunity via CRISPR-Cas. To illuminate the phylogenetic and environmental distribution of these metabolic and adaptive traits across the Verrucomicrobia phylum, we performed a comparative genomic analysis of 31 publicly available, nearly complete Verrucomicrobia genomes. Our genomic findings extend the environmental distribution of the Verrucomicrobia 2.3 kilometers into the terrestrial subsurface. Moreover, we reveal traits widely encoded across members of the Verrucomicrobia, including the capacity to degrade hemicellulose and to adapt to physical and biological environmental perturbations, thereby contributing to the expansive habitat range reported for this phylum. IMPORTANCE The Verrucomicrobia phylum of bacteria is widespread in many different ecosystems; however, its role in microbial communities remains poorly understood. Verrucomicrobia are often low-abundance community members, yet previous research suggests they play a major role in organic carbon degradation. While Verrucomicrobia remain poorly represented in culture collections, numerous genomes have been reconstructed from metagenomic data sets in recent years. The study of genomes from across the phylum allows for an extensive assessment of their potential ecosystem roles. The significance of this work is (i) the recovery of a novel genus of Verrucomicrobia from 2.3 km in the subsurface with the ability to withstand the extreme conditions that characterize this environment, and (ii) the most extensive assessment of ecophysiological traits encoded by Verrucomicrobia genomes to date. We show that members of this phylum are specialist organic polymer degraders that can withstand a wider range of environmental conditions than previously thought.


2014 ◽  
Vol 58 (5) ◽  
pp. 2871-2877 ◽  
Author(s):  
L. Chen ◽  
K. D. Chavda ◽  
R. G. Melano ◽  
M. R. Jacobs ◽  
B. Koll ◽  
...  

2019 ◽  
Author(s):  
Thomas Flouris ◽  
Xiyun Jiao ◽  
Bruce Rannala ◽  
Ziheng Yang

Abstract Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in speciation. Here we implement the multispecies-coalescent-with-introgression (MSci) model, an extension of the multispecies-coalescent (MSC) model to incorporate introgression, in our Bayesian Markov chain Monte Carlo (MCMC) program BPP. The MSci model accommodates deep coalescence (or incomplete lineage sorting) and introgression and provides a natural framework for inference using genomic sequence data. Computer simulation confirms the good statistical properties of the method, although hundreds or thousands of loci are typically needed to estimate introgression probabilities reliably. Re-analysis of datasets from the purple cone spruce confirms the hypothesis of homoploid hybrid speciation. Using genomic sequence data from six mosquito species in the Anopheles gambiae species complex, we estimated the introgression probability, which varies considerably across the genome, likely driven by differential selection against introgressed alleles.


2020 ◽  
Author(s):  
Marko Premzl

Abstract The eutherian genomics momentum greatly advanced biology and medicine. Nevertheless, future revisions and updates of eutherian genomic sequence data sets were expected, due to potential genomic sequence errors and incompleteness of genomic sequences. The eutherian comparative genomic analysis protocol was established as guidance in protection against potential genomic sequence errors in public eutherian genomic sequence assemblies. The protocol revised, updated and published 12 major eutherian gene data sets, including 1853 complete coding sequences deposited in European Nucleotide Archive as curated third party data gene data sets under accession numbers: FR734011-FR734074, HF564658-HF564785, HF564786-HF564815, HG328835-HG329089, HG426065-HG426183, HG931734-HG931849, LM644135-LM644234, LN874312-LN874522, LT548096-LT548244, LT631550-LT631670, LT962964-LT963174 and LT990249-LT990597.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Prince Kumar ◽  
Mukesh K. Meghvansi ◽  
D. V. Kamboj

Abstract Shigella has the remarkable capability to acquire antibiotic resistance rapidly, thereby posing a significant public health challenge for the effective treatment of dysentery (shigellosis). Phage therapy has been proven as an effective alternative strategy for controlling Shigella infections. In this study, we illustrate the isolation and detailed characterization of a polyvalent phage 2019SD1, which demonstrates lytic activity against Shigella dysenteriae, Escherichia coli, Vibrio cholerae, Enterococcus saccharolyticus and Enterococcus faecium. The newly isolated phage 2019SD1 shows an adsorption time < 6 min, a latent period of 20 min and a burst size of 151 PFU per bacterial cell. 2019SD1 exhibits considerable stability in a wide pH range and survives an hour at 50 °C. Under the transmission electron microscope, 2019SD1 shows an icosahedral capsid (60 nm diameter) and a 140 nm long tail. Further, detailed bioinformatic analyses of whole genome sequence data obtained through the Oxford Nanopore platform revealed that 2019SD1 belongs to the genus Hanrivervirus of subfamily Tempevirinae under the family Drexlerviridae. The concatenated protein phylogeny of 2019SD1 with the members of Drexlerviridae, built from four genes (DNA Primase, ATP Dependent DNA Helicase, Large Terminase Protein, and Portal Protein) using the maximum parsimony method, also suggested that 2019SD1 formed a distinct clade with the closest match of the taxa belonging to the genus Hanrivervirus. The genome analysis data indicate the occurrence of putative tail fiber proteins and a DNA methylation mechanism. In addition, 2019SD1 has a well-established anti-host defence system, as suggested through identification of putative anti-CRISPR and anti-restriction endonuclease systems, thereby also indicating its biocontrol potential.

