Retrospective Definition of Clostridioides difficile PCR Ribotypes on the Basis of Whole Genome Polymorphisms: A Proof of Principle Study

Diagnostics ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 1078
Author(s):  
Manisha Goyal ◽  
Lysiane Hauben ◽  
Hannes Pouseele ◽  
Magali Jaillard ◽  
Katrien De Bruyne ◽  
...  

Clostridioides difficile is a cause of health care-associated infections. The epidemiological study of C. difficile infection (CDI) traditionally involves PCR ribotyping. However, ribotyping will be increasingly replaced by whole genome sequencing (WGS). This implies that WGS types need correlation with classical ribotypes (RTs) in order to perform retrospective clinical studies. Here, we selected genomes of hyper-virulent C. difficile strains of RT001, RT017, RT027, RT078, and RT106 to identify new discriminatory markers using in silico ribotyping PCR and De Bruijn graph-based Genome Wide Association Studies (DBGWAS). First, in silico ribotyping PCR was performed using reference primer sequences and 30 C. difficile genomes of the five different RTs identified above. Second, discriminatory genomic markers were sought with DBGWAS using a set of 160 independent C. difficile genomes (14 ribotypes). RT-specific genetic polymorphisms were annotated and validated for their specificity and sensitivity against a larger dataset of 2425 C. difficile genomes covering 132 different RTs. In silico PCR ribotyping was unsuccessful due to non-specific or missing theoretical RT PCR fragments. More successfully, DBGWAS discovered a total of 47 new markers (13 in RT017, 12 in RT078, 9 in RT106, 7 in RT027, and 6 in RT001) with minimum q-values of 0 to 7.40 × 10⁻⁵, indicating excellent marker selectivity. The specificity and sensitivity of individual markers ranged between 0.92 and 1.0 but increased to 1 by combining two markers, hence providing undisputed RT identification based on a single genome sequence. Markers were scattered throughout the C. difficile genome in intra- and intergenic regions. We propose here a set of new genomic polymorphisms that efficiently identify five hyper-virulent RTs utilizing WGS data only. Further studies need to show whether this initial proof-of-principle observation can be extended to all 600 existing RTs.
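
How combining two markers can raise specificity to 1 without losing sensitivity can be illustrated with a minimal sketch. It assumes one presence/absence call per genome and an AND rule for combining markers; the abstract does not state how the markers were combined, and the calls below are made-up toy values, not data from the study.

```python
from typing import Sequence, Tuple


def sens_spec(calls: Sequence[bool], is_target: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of marker calls against true RT labels."""
    tp = sum(c and t for c, t in zip(calls, is_target))
    fn = sum((not c) and t for c, t in zip(calls, is_target))
    tn = sum((not c) and (not t) for c, t in zip(calls, is_target))
    fp = sum(c and (not t) for c, t in zip(calls, is_target))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 genomes of the target ribotype, 6 of other ribotypes.
is_target = [True] * 4 + [False] * 6
# Each marker is present in every target genome but also in one (different) non-target genome.
marker_a = [True, True, True, True, True, False, False, False, False, False]
marker_b = [True, True, True, True, False, True, False, False, False, False]

# Assumed combination rule: a genome is called positive only if both markers are present.
combined = [a and b for a, b in zip(marker_a, marker_b)]

for name, calls in [("marker A", marker_a), ("marker B", marker_b), ("A and B", combined)]:
    sens, spec = sens_spec(calls, is_target)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case the two false positives fall on different genomes, so requiring both markers removes them. An AND rule can lower sensitivity when a marker is missing from some target genomes, so the benefit depends on the markers chosen.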

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chao-Yu Guo ◽  
Reng-Hong Wang ◽  
Hsin-Chou Yang

Abstract After the genome-wide association studies (GWAS) era, whole-genome sequencing is highly engaged in identifying the association of complex traits with rare variations. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and the superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared with methods that do not consider gene-by-environment interactions.


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Gabriel Costa Monteiro Moreira ◽  
Clarissa Boschiero ◽  
Aline Silva Mello Cesar ◽  
James M. Reecy ◽  
Thaís Fernanda Godoy ◽  
...  

2021 ◽  
Author(s):  
Helgi Hilmarsson ◽  
Arvind S. Kumar ◽  
Richa Rastogi ◽  
Carlos D. Bustamante ◽  
Daniel Mas Montserrat ◽  
...  

Abstract As genome-wide association studies and genetic risk prediction models are extended to globally diverse and admixed cohorts, ancestry deconvolution has become an increasingly important tool. Also known as local ancestry inference (LAI), this technique identifies the ancestry of each region of an individual’s genome, thus permitting downstream analyses to account for genetic effects that vary between ancestries. Since existing LAI methods were developed before the rise of massive, whole genome biobanks, they are computationally burdened by these large next generation datasets. Current LAI algorithms also fail to harness the potential of whole genome sequences, falling well short of the accuracy that such high variant densities can enable. Here we introduce Gnomix, a set of algorithms that address each of these points, achieving higher accuracy and swifter computational performance than any existing LAI method, while also enabling portable models that are particularly useful when training data are not shareable due to privacy or other restrictions. We demonstrate Gnomix (and its swift phase correction counterpart Gnofix) on worldwide whole-genome data from both humans and canids and utilize its high resolution accuracy to identify the location of ancient New World haplotypes in the Xoloitzcuintle, dating back over 100 generations. Code is available at https://github.com/AI-sandbox/gnomix.


2020 ◽  
Vol 2 (4) ◽  
Author(s):  
Gerard A Bouland ◽  
Joline W J Beulens ◽  
Joey Nap ◽  
Arno R van der Slik ◽  
Arnaud Zaldumbide ◽  
...  

Abstract Numerous large genome-wide association studies have been performed to understand the influence of genetics on traits. Many identified risk loci are in non-coding and intergenic regions, which complicates understanding how genes and their downstream pathways are influenced. An integrative data approach is required to understand the mechanism and consequences of identified risk loci. Here, we developed the R-package CONQUER. Data for SNPs of interest are acquired from static and dynamic repositories (build GRCh38/hg38), including GTExPortal, Epigenomics Project, 4D genome database and genome browsers. All visualizations are fully interactive so that the user can immediately access the underlying data. CONQUER is a user-friendly tool to perform an integrative approach on multiple SNPs where risk loci are not seen as individual risk factors but rather as a network of risk factors.


2020 ◽  
Vol 27 (9) ◽  
pp. 1425-1430
Author(s):  
Inès Krissaane ◽  
Carlos De Niz ◽  
Alba Gutiérrez-Sacristán ◽  
Gabor Korodi ◽  
Nneka Ede ◽  
...  

Abstract Objective Advancements in human genomics have generated a surge of available data, fueling the growth and accessibility of databases for more comprehensive, in-depth genetic studies. Methods We provide a straightforward and innovative methodology to optimize cloud configuration in order to conduct genome-wide association studies. We utilized Spark clusters on both Google Cloud Platform and Amazon Web Services, as well as Hail (http://doi.org/10.5281/zenodo.2646680) for analysis and exploration of genomic variant datasets. Results Comparative evaluation of numerous cloud-based cluster configurations demonstrates a successful and unprecedented compromise between speed and cost for performing genome-wide association studies on 4 distinct whole-genome sequencing datasets. Results are consistent across the 2 cloud providers and could be highly useful for accelerating research in genetics. Conclusions We present a timely piece for one of the most frequently asked questions when moving to the cloud: what is the trade-off between speed and cost?


2020 ◽  
Vol 11 ◽  
Author(s):  
Frederik Krull ◽  
Marc Hirschfeld ◽  
Wilhelm Ewald Wemheuer ◽  
Bertram Brenig

Since their first description almost 100 years ago, bovine spastic paresis (BSP) and bovine spastic syndrome (BSS) have been assumed to be inherited neuronal-progressive diseases in cattle. Affected animals are characterized by (frequent) spasms primarily located in the hind limbs, accompanied by severe pain symptoms and reduced vigor, thus initiating premature slaughter or euthanasia. Due to the late onset of BSP and BSS and the massively decreased lifespan of modern cattle, the importance of these diseases is underestimated. In the present study, BSP/BSS-affected German Holstein breeding sires from artificial insemination centers were collected and pedigree analysis, genome-wide association studies, whole genome resequencing, protein–protein interaction network analysis, and protein-homology modeling were performed to elucidate the genetic background. The analysis of 46 affected and 213 control cattle revealed four significantly associated positions on chromosome 15 (BTA15), i.e., AC_000172.1:g.83465449A>G (–log10P = 19.17), AC_000172.1:g.81871849C>T (–log10P = 8.31), AC_000172.1:g.81872621A>T (–log10P = 6.81), and AC_000172.1:g.81872661G>C (–log10P = 6.42). Two additional significantly associated loci were located on BTA8 and BTA19, i.e., AC_000165.1:g.71177788T>C and AC_000176.1:g.30140977T>G, respectively. Whole genome resequencing of five affected individuals and six unaffected relatives (two fathers, two mothers, a half sibling, and a full sibling) belonging to three different, not directly related families was performed. After filtering, a homozygous loss of function variant was identified in the affected cattle, causing a frameshift in the so far unknown gene locus LOC100848076 encoding an adenosine-A1-receptor homolog. An allele frequency of the variant of 0.74 was determined in 3,093 samples of the 1000 Bull Genomes Project.
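
The –log10P values reported for the BTA15 positions can be converted back to plain p-values with a short sketch. The function name `p_from_neglog10` is hypothetical; the variant identifiers and scores are the ones quoted in the abstract.

```python
def p_from_neglog10(neg_log10_p: float) -> float:
    """Convert a -log10(P) score back to the p-value P."""
    return 10.0 ** (-neg_log10_p)


# -log10P values as reported in the abstract for the four BTA15 positions.
reported = {
    "AC_000172.1:g.83465449A>G": 19.17,
    "AC_000172.1:g.81871849C>T": 8.31,
    "AC_000172.1:g.81872621A>T": 6.81,
    "AC_000172.1:g.81872661G>C": 6.42,
}

for variant, score in reported.items():
    print(f"{variant}: p = {p_from_neglog10(score):.2e}")
```

A larger –log10P corresponds to a smaller p-value, so the first position shows by far the strongest association of the four.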


2019 ◽  
Vol 47 (14) ◽  
pp. e79-e79
Author(s):  
Aitor González ◽  
Marie Artufel ◽  
Pascal Rihet

Abstract Genome-wide association studies (GWAS) associate single nucleotide polymorphisms (SNPs) with complex phenotypes. Most human SNPs fall in non-coding regions and are likely regulatory SNPs, but linkage disequilibrium (LD) blocks make it difficult to distinguish functional SNPs. Therefore, putative functional SNPs are usually annotated with molecular markers of gene regulatory regions and prioritized with dedicated prediction tools. We integrated associated SNPs, LD blocks and regulatory features into a supervised model called TAGOOS (TAG SNP bOOSting) and computed scores genome-wide. The TAGOOS scores enriched and prioritized unseen associated SNPs with an odds ratio of 4.3 and 3.5 and an area under the curve (AUC) of 0.65 and 0.6 for intronic and intergenic regions, respectively. The TAGOOS score was correlated with the maximal significance of associated SNPs and expression quantitative trait loci (eQTLs) and with the number of biological samples annotated for key regulatory features. Analysis of loci and regions associated with cleft lip and human adult height phenotypes recovered known functional loci and predicted new functional loci enriched in transcription factors related to the phenotypes. In conclusion, we trained a supervised model based on associated SNPs to prioritize putative functional regions. The TAGOOS scores, annotations and UCSC genome tracks are available here: https://tagoos.readthedocs.io.
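
The two evaluation metrics named above, odds ratio and AUC, can be sketched generically as follows. This is not the TAGOOS evaluation code; the abstract does not describe how the metrics were computed, and the function names and the 2x2 table layout are assumptions made for illustration.

```python
from typing import Sequence


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from a 2x2 table.

    Assumed layout: a = high-score associated SNPs, b = high-score non-associated,
    c = low-score associated, d = low-score non-associated.
    """
    return (a * d) / (b * c)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC AUC).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An odds ratio above 1 means associated SNPs are over-represented among high-scoring positions, and an AUC of 0.5 corresponds to a score that ranks associated SNPs no better than chance.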


2020 ◽  
Vol 14 (Supplement_1) ◽  
pp. S092-S092
Author(s):  
D Modos ◽  
J Brooks ◽  
P Sudhakar ◽  
B Verstockt ◽  
B Alexander-Dann ◽  
...  

Abstract Background Genome-wide association studies have deciphered the single nucleotide polymorphisms (SNPs) which are responsible for ulcerative colitis (UC) susceptibility. However, to understand how these SNPs are involved in UC, additional methods are necessary. One such approach is in silico network propagation modelling, which can discover how the effects of SNPs in UC can affect the whole cell. A complementary approach is weighted gene co-expression network analysis (WGCNA), where co-regulated genes are identified using transcriptomic data. Integrating these two methods can shed light on how SNPs are affecting the transcriptome of UC patients. Methods We used immunochip profiles of 941 UC patients and focussed on UC-associated SNPs altering regulatory regions. Based on these regions, we identified affected genes. To understand how their corresponding proteins rewire transcriptional regulation, we predicted the path between these proteins and relevant transcription factors (TF) using the OmniPath signalling network (http://omnipathdb.org). From the TFs, we propagated the signal further to target genes using TFlink (https://tflink.net) and GTRD (http://gtrd.biouml.org). To evaluate the predicted network propagation signal, we conducted WGCNA with transcriptomics data from 46 matching patients (GEO ID: GSE48959). To interpret the results, we used Gene Ontology Biological Process annotations of the target genes, and we compared the function and regulation of affected genes and the determined WGCNA modules. Results We found 9 predominant signalling pathways, some already known from other studies to be involved in UC pathogenesis, including NFkB signalling, chemokine signalling, Notch pathway, JAK/STAT signalling. Downstream of these pathways we identified potential key TFs that regulate the UC phenotype, for example NFKB1, GATA3, GTF2I. The targets of these TFs were enriched in the WGCNA modules of the patients. The WGCNA modules and the transcriptionally affected genes had enriched processes including cell migration, TGF-β signalling, exocytosis, adaptive T- and B-cell-specific immune responses and tight junctions. We also found myogenetic development specific TFs affected transcriptionally such as MyoD, MEF2A, MEF2D. We are currently validating these results through patient-specific biopsies. Conclusion In silico methods bring us closer to understanding UC pathogenesis. Our results suggest that in a well-defined set of patients, weakened tight junctions and insufficient immune response can lead to a dysfunctional epithelial barrier, resulting in poor wound healing in UC. We hope the developed workflow will provide novel diagnostic and therapeutic options in UC.

