An integrated metagenomics pipeline for strain profiling reveals novel patterns of transmission and global biogeography of bacteria

Mapping Intimacies ◽

10.1101/031757 ◽

2015 ◽

Cited By ~ 4

Author(s):

Stephen Nayfach ◽

Beltran Rodriguez-Mueller ◽

Nandita Garud ◽

Katherine S. Pollard

Keyword(s):

Bacterial Species ◽

Species Abundance ◽

Human Microbiome ◽

Genomic Variation ◽

Strain Level ◽

Gene Content ◽

Taxonomic Resolution ◽

Nucleotide Polymorphisms ◽

Single Nucleotide Variants ◽

Single Nucleotide

Abstract We present the Metagenomic Intra-species Diversity Analysis System (MIDAS), an integrated computational pipeline for quantifying bacterial species abundance and strain-level genomic variation, including gene content and single nucleotide polymorphisms, from shotgun metagenomes. Our method leverages a database of >30,000 bacterial reference genomes which we clustered into species groups. These cover the majority of abundant species in the human microbiome but only a small proportion of microbes in other environments, including soil and seawater. We applied MIDAS to stool metagenomes from 98 Swedish mothers and their infants over one year and used rare single nucleotide variants to reveal extensive vertical transmission of strains at birth but colonization with strains unlikely to derive from the mother at later time points. This pattern was missed with species-level analysis, because the infant gut microbiome composition converges towards that of an adult over time. We also applied MIDAS to 198 globally distributed marine metagenomes and used gene content to show that many prevalent bacterial species have population structure that correlates with geographic location. Strain-level genetic variants present in metagenomes clearly reveal extensive structure and dynamics that are obscured when data are analyzed at a coarser taxonomic resolution.

Strand-wise and bait-assisted assembly of nearly-full rrn operons applied to assess species engraftment after faecal microbiota transplantation

10.1101/2020.09.11.292896 ◽

2020 ◽

Author(s):

Alfonso Benítez-Páez ◽

Annick V. Hartstra ◽

Max Nieuwdorp ◽

Yolanda Sanz

Keyword(s):

Gut Microbiota ◽

Bacterial Species ◽

Cost Effective ◽

Strain Level ◽

Level Variation ◽

Faecal Microbiota ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Faecal Microbiota Transplantation ◽

Effective Manner

Abstract Background Effective methodologies to accurately identify members of the gut microbiota at the species and strain levels are necessary to unveil more specific and detailed host-microbe interactions and associations with health and disease. Methods A MinION™ MkIb nanopore-based device and the R9.5 flowcell chemistry were used to sequence and assemble dozens of rrn regions (16S-ITS-23S) derived from the most prevalent bacterial species in the human gut microbiota. As a proof of concept for detecting further strain-level variation, we performed a complementary analysis in a subset of samples derived from a faecal microbiota transplantation (FMT) trial aimed at ameliorating glucose and lipid metabolism in overweight subjects with metabolic syndrome. Results The resulting updated rrn database, the data processing pipeline, and the precise control of covariates (sequencing run, sex, age, BMI, donor) were pivotal to accurately estimating the changes in gut microbial species abundance in the recipients after FMT. Furthermore, the rrn methodology described here demonstrated the ability to detect strain-level variation, which is critical for evaluating the transfer of bacteria from donors to recipients as a consequence of the FMT. In this regard, we showed that our FMT trial successfully induced engraftment of donors' strains of, e.g., Parabacteroides merdae in recipients by mapping and assessing their associated single nucleotide variants (SNV). Conclusions We developed a methodology that enables the identification of microbiota at species and strain level in a cost-effective manner. Despite its error-prone nature and modest per-base accuracy, the nanopore data proved to be of sufficient quality to estimate single-nucleotide variation. This methodology and data analysis represent a cost-effective way to trace the genetic variability needed to better understand the health effects of the human microbiome. Trial registration The study was prospectively registered at the Dutch Trial registry - NTR4488 (https://www.trialregister.nl/trial/4488).

Single-Nucleotide Polymorphism-Based Genetic Diversity Analysis of Clinical Pseudomonas aeruginosa Isolates

Genome Biology and Evolution ◽

10.1093/gbe/evaa059 ◽

2020 ◽

Vol 12 (4) ◽

pp. 396-406 ◽

Cited By ~ 1

Author(s):

Uthayakumar Muthukumarasamy ◽

Matthias Preusse ◽

Adrian Kordes ◽

Michal Koska ◽

Monika Schniederjans ◽

...

Keyword(s):

Pseudomonas Aeruginosa ◽

Single Nucleotide Polymorphisms ◽

Core Genome ◽

Phenotypic Diversity ◽

Bacterial Species ◽

Genomic Variation ◽

Genomic Diversity ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genetic Changes

Abstract Extensive use of next-generation sequencing has the potential to transform our knowledge of how genomic variation within bacterial species impacts phenotypic versatility. Because different environments have unique selection pressures, they drive divergent evolution. However, there is also parallel or convergent evolution of traits in independent bacterial isolates inhabiting similar environments. The application of tools to describe population-wide genomic diversity provides an opportunity to measure the predictability of genetic changes underlying adaptation. Here, we describe patterns of sequence variation in the core genome among 99 individual Pseudomonas aeruginosa clinical isolates and identify single-nucleotide polymorphisms that are the basis for branching of the phylogenetic tree. We also identify single-nucleotide polymorphisms that were acquired independently, in separate lineages, and not through inheritance from a common ancestor. Although our results demonstrate that the Pseudomonas aeruginosa core genome is highly conserved and, in general, not subject to adaptive evolution, instances of parallel evolution will provide an opportunity to uncover genetic changes that underlie phenotypic diversity.

Impact of Pre and Post Variant Filtration Strategies on Imputation

10.21203/rs.3.rs-128366/v1 ◽

2020 ◽

Author(s):

Celine Charon ◽

Rodrigue Allodji ◽

Vincent Meyer ◽

Jean-François Deleuze

Keyword(s):

Quality Control ◽

Rare Variants ◽

Association Studies ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Direct Effects ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Genome Wide ◽

Conservative Post

Abstract Quality control methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1,031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1,089 NCBI recorded individuals for additional validation. Without variant pre-filtration based on quality control (QC), we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E-04-1E-03) and rare variants (1E-03-5E-03) (p < 1E-04). Increasing the post-filtration imputation quality score threshold from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) with frequency <0.001 by 2.5-fold, with or without QC pre-filtration, and halved the number of very rare variants (5E-04). As a result, to maintain confidence and enough SNVs, we propose here a 2-step post-filtration approach to increase the number of very rare and rare variants compared to conservative post-filtration methods.
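
To make the effect of the post-filtration threshold concrete, the following is a minimal Python sketch of filtering imputed variants by quality score and counting survivors per frequency range. The variant records, the tuple layout and the function names are all hypothetical and are not taken from the study; only the two thresholds (0.3 and 0.8) and the frequency ranges come from the abstract. The authors' proposed 2-step post-filtration is not described in the abstract and is not reproduced here.

```python
from collections import Counter

# Hypothetical imputed variants: (id, minor allele frequency, imputation quality score).
# These values are invented for illustration only.
variants = [
    ("var1", 0.0006, 0.35),
    ("var2", 0.0008, 0.85),
    ("var3", 0.0020, 0.50),
    ("var4", 0.0040, 0.90),
    ("var5", 0.0300, 0.95),
    ("var6", 0.0007, 0.25),
]

def frequency_bin(maf):
    """Assign a variant to the frequency ranges named in the abstract."""
    if 5e-4 <= maf < 1e-3:
        return "very rare"
    if 1e-3 <= maf < 5e-3:
        return "rare"
    return "other"

def post_filter(records, min_quality):
    """Keep variants whose imputation quality score is at least min_quality."""
    return [rec for rec in records if rec[2] >= min_quality]

def count_by_bin(records):
    """Count variants per frequency range."""
    return Counter(frequency_bin(maf) for _, maf, _ in records)

lenient = count_by_bin(post_filter(variants, 0.3))
strict = count_by_bin(post_filter(variants, 0.8))
print("threshold 0.3:", dict(lenient))
print("threshold 0.8:", dict(strict))
```

In this toy data set the stricter threshold removes half of the very rare and rare variants, which mirrors the direction of the trade-off the abstract describes, not its magnitude.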

Genetics of Schizophrenia and Bipolar Disorder

10.1093/med/9780190681425.003.0013 ◽

2017 ◽

Author(s):

Alexander Charney ◽

Pamela Sklar

Keyword(s):

Bipolar Disorder ◽

Copy Number ◽

Psychotic Disorders ◽

Copy Number Variants ◽

Nucleotide Polymorphisms ◽

Common Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Number Variation

Schizophrenia and bipolar disorder are the classic psychotic disorders. Both diseases are strongly familial, but have proven recalcitrant to genetic methodologies for identifying the etiology until recently. There is now convincing genetic evidence that indicates a contribution of many DNA changes to the risk of becoming ill. For schizophrenia, there are large contributions of rare copy number variants and common single nucleotide variants, with an overall highly polygenic genetic architecture. For bipolar disorder, the role of copy number variation appears to be much less pronounced. Specific common single nucleotide polymorphisms are associated, and there is evidence for polygenicity. Several surprises have emerged from the genetic data that indicate there is significantly more molecular overlap in copy number variants between autism and schizophrenia, and in common variants between schizophrenia and bipolar disorder.

Fido-SNP: the first webserver for scoring the impact of single nucleotide variants in the dog genome

Nucleic Acids Research ◽

10.1093/nar/gkz420 ◽

2019 ◽

Vol 47 (W1) ◽

pp. W136-W141 ◽

Cited By ~ 1

Author(s):

Emidio Capriotti ◽

Ludovica Montanucci ◽

Giuseppe Profiti ◽

Ivan Rossi ◽

Diana Giannuzzi ◽

...

Keyword(s):

Matthews Correlation Coefficient ◽

Genomic Variation ◽

Gradient Boosting ◽

Binary Classifier ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Coding Regions ◽

Variation Data ◽

Boosting Algorithm ◽

The Impact

Abstract As the amount of genomic variation data increases, tools that are able to score the functional impact of single nucleotide variants become more and more necessary. While there are several prediction servers available for interpreting the effects of variants in the human genome, only a few have been developed for other species, and none were specifically designed for species of veterinary interest such as the dog. Here, we present Fido-SNP, the first predictor able to discriminate between Pathogenic and Benign single-nucleotide variants in the dog genome. Fido-SNP is a binary classifier based on the Gradient Boosting algorithm. It is able to classify and score the impact of variants in both coding and non-coding regions based on sequence features within seconds. When validated on a previously unseen set of annotated variants from the OMIA database, Fido-SNP reaches 88% overall accuracy, 0.77 Matthews correlation coefficient and 0.91 Area Under the ROC Curve.
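
For readers unfamiliar with the reported metrics, here is a small Python sketch showing how overall accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The counts below are hypothetical and are not Fido-SNP's validation results; the function names are illustrative.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical confusion matrix (Pathogenic = positive class), for illustration only.
tp, tn, fp, fn = 40, 50, 4, 6

acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {acc:.2f}, MCC = {mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the Pathogenic and Benign classes are imbalanced.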

Genetic Factors of Nitric Oxide’s System in Psychoneurologic Disorders

International Journal of Molecular Sciences ◽

10.3390/ijms21051604 ◽

2020 ◽

Vol 21 (5) ◽

pp. 1604 ◽

Cited By ~ 1

Author(s):

Regina F. Nasyrova ◽

Polina V. Moskaleva ◽

Elena E. Vaiman ◽

Natalya A. Shnayder ◽

Nataliya L. Blatt ◽

...

Keyword(s):

Nitric Oxide ◽

Neurological Diseases ◽

Small Sample ◽

Nucleotide Polymorphisms ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Penile Erections ◽

Genetic And Environmental Factors ◽

Small Sample Sizes ◽

Disease Modifying Treatment

According to recent data, nitric oxide (NO) is a chemical messenger that mediates functions such as vasodilation and neurotransmission, as well as displaying antimicrobial and antitumoral activities. NO has been implicated in the neurotoxicity associated with stroke and neurodegenerative diseases; neural regulation of smooth muscle, including peristalsis; and penile erections. We searched for full-text English publications from the past 15 years in the PubMed and SNPedia databases using keywords and combined word searches (nitric oxide, single nucleotide variants, single nucleotide polymorphisms, genes). In addition, earlier publications of historical interest were included in the review. In our review, we have summarized information regarding all NOS1, NOS2, NOS3, and NOS1AP single nucleotide variants (SNVs) involved in the development of mental disorders and neurological diseases/conditions. The results of the studies we have discussed in this review are contradictory, which might be due to different designs of the studies, small sample sizes in some of them, and different social and geographical characteristics. However, the contribution of genetic and environmental factors has been understudied, which makes this issue increasingly important for researchers as the understanding of these mechanisms can support a search for new approaches to pathogenetic and disease-modifying treatment.

AsCRISPR: a web server for allele-specific sgRNA design in precision medicine

10.1101/672634 ◽

2019 ◽

Author(s):

Guihu Zhao ◽

Jinchen Li ◽

Yu Tang

Keyword(s):

Nucleotide Polymorphisms ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Guide RNA ◽

Inherited Diseases ◽

Bioinformatic Tools ◽

Point Of Entry ◽

Allele Specific ◽

sgRNA Design ◽

Specific Restriction

Abstract Allele-specific genomic targeting by CRISPR provides a point of entry for personalized gene therapy of dominantly inherited diseases, by selectively disrupting the mutant alleles or disease-causing single nucleotide polymorphisms (SNPs), ideally while leaving normal alleles intact. Moreover, allele-specific engineering has been increasingly exploited not only in treating inherited diseases and mutation-driven cancers, but also in other important fields such as genome imprinting, haploinsufficiency, genome loci imaging and immunocompatible manipulations. Despite the tremendous utility of allele-specific targeting by CRISPR, very few bioinformatic tools have been implemented for this purpose. We thus developed AsCRISPR (Allele-specific CRISPR), a web tool to aid the design of guide RNA (gRNA) sequences that can discriminate between alleles. It enables users with limited bioinformatics skills to analyze both their own identified variants and heterozygous SNPs deposited in the dbSNP database. Multiple CRISPR nucleases and their engineered variants, including the newly developed Cas12b and CasX, are available for users to choose from. AsCRISPR also evaluates the on-target efficiencies, specificities and potential off-targets of gRNA candidates, and displays the allele-specific restriction enzyme sites that might be disrupted upon successful genome edits. In addition, AsCRISPR was used to analyze dominant single nucleotide variants (SNVs) retrieved from the ClinVar and OMIM databases, generating a Dominant Database of candidate discriminating gRNAs that may specifically target the alternative allele for each dominant SNV site. A Validated Database was also established, containing manually curated discriminating gRNAs that have been experimentally validated in the growing literature. AsCRISPR is freely available at http://www.genemed.tech/ascrispr.
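
The core idea of an allele-discriminating guide can be sketched in a few lines of Python. This is an illustrative toy, not AsCRISPR's algorithm: it assumes an SpCas9-style NGG PAM, scans only the forward strand, and treats any protospacer or PAM that covers the variant position as a candidate. The sequences, the function name `candidate_guides` and its parameters are hypothetical.

```python
import re

def candidate_guides(seq, snp_index, guide_len=20):
    """Return (start, protospacer, pam) for forward-strand NGG sites whose
    protospacer or PAM covers snp_index (0-based) in seq."""
    hits = []
    # A lookahead is used so that overlapping sites are all reported.
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len
    for match in re.finditer(pattern, seq):
        start = match.start()
        end = start + guide_len + 3  # exclusive end of protospacer + PAM
        if start <= snp_index < end:
            hits.append((start, match.group(1), match.group(2)))
    return hits

# Hypothetical wild-type and mutant alleles differing at index 20 (C -> T).
wild_type = "TTACA" + "ACTGATCATCAGTCACCTAC" + "TGG" + "ATCTA"
mutant = "TTACA" + "ACTGATCATCAGTCATCTAC" + "TGG" + "ATCTA"
snp_index = 20

for start, protospacer, pam in candidate_guides(mutant, snp_index):
    # Distance of the variant from the PAM-proximal end of the protospacer.
    pam_distance = start + guide_len_used if (guide_len_used := len(protospacer)) else 0
    pam_distance -= snp_index
    differs = protospacer != wild_type[start:start + len(protospacer)]
    print(start, protospacer, pam, "bases from PAM:", pam_distance, "allele-specific:", differs)
```

A real design tool would also scan the reverse strand, support other nucleases and PAMs, and score on-target efficiency and off-targets, as the abstract describes.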

A high-resolution pipeline for 16S-sequencing identifies bacterial strains in human microbiome

10.1101/565572 ◽

2019 ◽

Cited By ~ 1

Author(s):

Igor Segota ◽

Tao Long

Keyword(s):

Bacterial Species ◽

Human Microbiome ◽

Amplicon Sequencing ◽

R Package ◽

Strain Level ◽

Sequencing Data ◽

Bacterial Strains ◽

16S Sequencing ◽

16S Amplicon Sequencing ◽

Sequencing Data Analysis

We developed a High-resolution Microbial Analysis Pipeline (HiMAP) for 16S amplicon sequencing data analysis, aiming at bacterial species- or strain-level identification from the human microbiome to enable experimental validation of causal effects of the associated bacterial strains on health and disease. HiMAP achieved higher accuracy in identifying species in a human microbiome mock community than other pipelines. HiMAP identified the majority of the species detected by whole genome shotgun sequencing using MetaPhlAn2, with strain-level resolution wherever possible, and reported comparable relative abundances. HiMAP is an open-source R package available at https://github.com/taolonglab/himap.

Rare single nucleotide variants in COL5A1 promoter do not play a major role in keratoconus susceptibility associated with rs1536482

BMC Ophthalmology ◽

10.1186/s12886-021-02128-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Liubov O. Skorodumova ◽

Alexandra V. Belodedova ◽

Elena I. Sharova ◽

Elena S. Zakharova ◽

Liliia N. Iulmetova ◽

...

Keyword(s):

Rare Variants ◽

SNP Array ◽

Promoter Sequence ◽

Minor Allele ◽

Control Group ◽

Nucleotide Polymorphisms ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Degenerative Disorder ◽

PCR Targeting

Abstract Background Keratoconus is a chronic degenerative disorder of the cornea characterized by thinning and cone-shaped protrusions. Although genetic factors play a key role in keratoconus development, the etiology is still under investigation. The occurrence of single-nucleotide polymorphisms (SNPs) associated with keratoconus in Russian patients is poorly studied. The purpose of this study was to validate whether three reported keratoconus-associated SNPs (rs1536482 near the COL5A1 gene, rs2721051 near the FOXO1 gene, rs1324183 near the MPDZ gene) are also relevant in a Russian cohort of patients. Additionally, we investigated the COL5A1 promoter sequence for single-nucleotide variants (SNVs) in a subgroup of keratoconus patients with at least one rs1536482 minor allele (rs1536482+) to assess the role of these SNVs in keratoconus susceptibility associated with rs1536482. Methods This case-control study included 150 keratoconus patients and two control groups (main and additional, 205 and 474 participants, respectively). We performed PCR targeting regions flanking SNVs and the COL5A1 promoter, followed by Sanger sequencing of amplicons. The additional control group was genotyped using an SNP array. Results The minor allele frequency was significantly different between the keratoconus and control cohorts (main and combined) for rs1536482, rs2721051, and rs1324183 (p-value < 0.05). The rare variants rs1043208782 and rs569248712 were found in the COL5A1 promoter in two out of 94 rs1536482+ keratoconus patients. Conclusion rs1536482, rs2721051, and rs1324183 were associated with keratoconus in a Russian cohort. SNVs in the COL5A1 promoter do not play a major role in keratoconus susceptibility associated with rs1536482.
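
As an illustration of the kind of comparison reported in the Results, the Python sketch below computes minor allele frequency (MAF) from genotype counts and runs an allelic chi-square test on a 2x2 table of allele counts. The abstract does not state which statistical test the authors used, so the choice of a Pearson chi-square test (1 degree of freedom, no continuity correction) is an assumption, and the genotype counts are hypothetical rather than the study's data.

```python
import math

def minor_allele_frequency(n_hom_major, n_het, n_hom_minor):
    """MAF from genotype counts: each individual carries two alleles."""
    total_alleles = 2 * (n_hom_major + n_het + n_hom_minor)
    return (n_het + 2 * n_hom_minor) / total_alleles

def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.
    Returns (chi2, p_value)."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts (homozygous major, heterozygous, homozygous minor).
cases = (50, 40, 10)
controls = (128, 64, 8)

maf_cases = minor_allele_frequency(*cases)
maf_controls = minor_allele_frequency(*controls)

case_minor = cases[1] + 2 * cases[2]
case_major = 2 * sum(cases) - case_minor
ctrl_minor = controls[1] + 2 * controls[2]
ctrl_major = 2 * sum(controls) - ctrl_minor

chi2, p_value = allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major)
print(f"MAF cases = {maf_cases:.2f}, MAF controls = {maf_controls:.2f}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these invented counts the MAF is 0.30 in cases and 0.20 in controls, and the difference is significant at the 0.05 level used in the abstract.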

Impact of insertion sequences on convergent evolution of Shigella species

10.1101/680777 ◽

2019 ◽

Cited By ~ 1

Author(s):

Jane Hawkey ◽

Jonathan M. Monk ◽

Helen Billman-Jacobe ◽

Bernhard Palsson ◽

Kathryn E. Holt

Keyword(s):

Evolutionary Dynamics ◽

Shigella Flexneri ◽

Bacterial Species ◽

Genomic Variation ◽

Shigella Dysenteriae ◽

Insertion Sequences ◽

Single Nucleotide Variants ◽

Genomic Studies ◽

Metabolic Reduction ◽

Genome Scale

Abstract Shigella species are specialised lineages of Escherichia coli that have converged to become human-adapted and cause dysentery by invading human gut epithelial cells. Most studies of Shigella evolution have been restricted to comparisons of single representatives of each species; and population genomic studies of individual Shigella species have focused on genomic variation caused by single nucleotide variants and ignored the contribution of insertion sequences (IS) which are highly prevalent in Shigella genomes. Here, we investigate the distribution and evolutionary dynamics of IS within populations of Shigella dysenteriae Sd1, Shigella sonnei and Shigella flexneri. We find that five IS (IS1, IS2, IS4, IS600 and IS911) have undergone expansion in all Shigella species, creating substantial strain-to-strain variation within each population and contributing to convergent patterns of functional gene loss within and between species. We find that IS expansion and genome degradation are most advanced in S. dysenteriae and least advanced in S. sonnei; and using genome-scale models of metabolism we show that Shigella species display convergent loss of core E. coli metabolic capabilities, with S. sonnei and S. flexneri following a similar trajectory of metabolic streamlining to that of S. dysenteriae. This study highlights the importance of IS to the evolution of Shigella and provides a framework for the investigation of IS dynamics and metabolic reduction in other bacterial species.
