Potpourri: An Epistasis Test Prioritization Algorithm via Diverse SNP Selection

2019 ◽  
Author(s):  
Gizem Caylak ◽  
Oznur Tastan ◽  
A. Ercument Cicek

Abstract: Genome-wide association studies explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the process prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely-epistatic SNP pairs to limit the number of tests, yet they still suffer from very low precision. It has been shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm which optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. SNP pairs from these regions are then further ranked with respect to their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that we (i) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (ii) decrease the number of tests 25-fold, and (iii) decrease the runtime 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri.
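
The two stages described in this abstract (diverse selection by submodular optimization, then ranking of pairs by co-coverage) can be illustrated with a minimal sketch. This is not the Potpourri implementation. The objective used here, a sum over regions of the square root of the selected association scores, is a generic diversity-rewarding submodular function chosen only for illustration, and "co-coverage" is assumed to mean the number of cases carrying both SNPs. All names and data (`scores`, `regions`, `carriers`, the rs identifiers) are hypothetical.

```python
import math
from itertools import combinations


def greedy_diverse_selection(scores, regions, k):
    """Greedily pick up to k SNPs maximizing an illustrative submodular objective.

    Objective (an assumption, not the paper's function):
        f(S) = sum over regions r of sqrt(sum of scores of SNPs in S located in r)
    The square root gives diminishing returns within a region, so the greedy
    step prefers SNPs from regions that are not yet represented.
    """
    selected = []
    region_mass = {}  # region -> total score already selected in that region
    candidates = set(scores)

    def gain(snp):
        # Marginal gain of adding snp to the current selection.
        mass = region_mass.get(regions[snp], 0.0)
        return math.sqrt(mass + scores[snp]) - math.sqrt(mass)

    for _ in range(min(k, len(scores))):
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        region_mass[regions[best]] = region_mass.get(regions[best], 0.0) + scores[best]
        candidates.remove(best)
    return selected


def rank_pairs_by_cocoverage(selected, carriers, regions):
    """Rank cross-region SNP pairs by how many cases carry both SNPs (assumed definition)."""
    pairs = [(a, b) for a, b in combinations(selected, 2) if regions[a] != regions[b]]
    return sorted(pairs, key=lambda p: len(carriers[p[0]] & carriers[p[1]]), reverse=True)


# Hypothetical toy data: association scores, region labels, and case carriers per SNP.
scores = {"rs1": 0.9, "rs2": 0.8, "rs3": 0.5, "rs4": 0.4}
regions = {"rs1": "A", "rs2": "A", "rs3": "B", "rs4": "C"}
carriers = {"rs1": {1, 2, 3, 4}, "rs2": {1, 2}, "rs3": {2, 3, 4}, "rs4": {4, 5}}

chosen = greedy_diverse_selection(scores, regions, k=3)
ranked = rank_pairs_by_cocoverage(chosen, carriers, regions)
print(chosen)
print(ranked)
```

In this toy run, `rs2` is skipped despite its high score because its region is already represented by `rs1`. That is the diversity behaviour a submodular objective is meant to encourage, and only the pairs formed from the selected SNPs would then be tested.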

2015 ◽  
Vol 97 ◽  
Author(s):  
LISETTE GRAAE ◽  
SILVIA PADDOCK ◽  
ANDREA CARMINE BELIN

Summary: Studies of complex genetic diseases have revealed many risk factors of small effect, but the combined amount of heritability explained is still low. Genome-wide association studies are often underpowered to identify true effects because of the very large number of parallel tests. There is, therefore, a great need to generate data sets that are enriched for those markers that have an increased a priori chance of being functional, such as markers in genomic regions involved in gene regulation. ReMo-SNPs is a computational program developed to aid researchers in the process of selecting functional SNPs for association analyses in user-specified regions and/or motifs genome-wide. Because the program automatically selects the genotyped markers in the user-provided material, the output data are ready to be used in a subsequent association study. In this article we describe the program and its functions. We also validate the program by including an example study on three different transcription factors and results from an association study on two psychiatric phenotypes. The flexibility of the ReMo-SNPs program enables the user to study any region or sequence of interest, without limitation to transcription factor binding regions and motifs. The program is freely available at: http://www.neuro.ki.se/ReMo-SNPs/
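
The statement that a very large number of parallel tests reduces power can be made concrete with a Bonferroni correction, used here only as a familiar example of a multiple-testing adjustment. The abstract does not say which correction is applied, and the marker counts below are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical marker counts: a genome-wide panel vs. a functionally enriched subset.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
enriched = bonferroni_threshold(0.05, 10_000)
print(genome_wide, enriched)
```

Testing fewer, better-motivated markers relaxes the per-test threshold, which is the rationale for enriching a data set before the association analysis.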


2020 ◽  
Vol 36 (9) ◽  
pp. 2936-2937 ◽  
Author(s):  
Gareth Peat ◽  
William Jones ◽  
Michael Nuhn ◽  
José Carlos Marugán ◽  
William Newell ◽  
...  

Abstract. Motivation: Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data. Results: We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource. Availability and implementation: The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.


2021 ◽  
Vol 28 ◽  
Author(s):  
Vinutha Kanuganahalli Somegowda ◽  
Laavanya Rayaprolu ◽  
Abhishek Rathore ◽  
Santosh Pandurang Deshpande ◽  
Rajeev Gupta

The main focus of this review is to discuss the current status of the use of GWAS for fodder quality and biofuel traits, owing to the similarity of these traits. Sorghum is a potential multipurpose crop, popularly cultivated for various uses as food, feed fodder, and biomass for ethanol. Its production of a huge quantity of biomass and its genetic variation for complex sugars are the main motivations not only for livestock nutritionists to use sorghum as fodder but also for considering it a potential candidate for biofuel generation. Few studies have reported on the knowledge that can be transferred from the development of biofuel technologies to complement improved fodder quality and vice versa. With recent advances in genotyping technologies, GWAS has become one of the primary tools used to identify the genes/genomic regions associated with a phenotype. These modern tools and technologies accelerate the genomics-assisted breeding process to enhance the rate of genetic gains. Hence, this mini-review focuses on GWAS studies on the genetic architecture and dissection of traits underpinning fodder quality and biofuel traits, and on their limited comparison with other related model crop species.


2020 ◽  
Vol 82 (1) ◽  
pp. 413-431 ◽  
Author(s):  
Edwin K. Silverman

Although chronic obstructive pulmonary disease (COPD) risk is strongly influenced by cigarette smoking, genetic factors are also important determinants of COPD. In addition to Mendelian syndromes such as alpha-1 antitrypsin deficiency, many genomic regions that influence COPD susceptibility have been identified in genome-wide association studies. Similarly, multiple genomic regions associated with COPD-related phenotypes, such as quantitative emphysema measures, have been found. Identifying the functional variants and key genes within these association regions remains a major challenge. However, newly identified COPD susceptibility genes are already providing novel insights into COPD pathogenesis. Network-based approaches that leverage these genetic discoveries have the potential to assist in decoding the complex genetic architecture of COPD.


2011 ◽  
Vol 26 (S2) ◽  
pp. 1346-1346
Author(s):  
D. Benmessaoud ◽  
A.-M. Lepagnol-Bestel ◽  
M. Delepine ◽  
J. Hager ◽  
J.-M. Moalic ◽  
...  

Genome-wide association studies (GWAS) of Schizophrenia (SZ) patients have identified common variants in ten genes including SMARCA2 (Koga et al., HMG, 2009). We found that the SZ-GWAS genes are part of an interacting network centered on SMARCA2 (Loe-Mie et al., HMG, 2010). Furthermore, SMARCA2 was found disrupted in SZ (Walsh et al., Science, 2008). SMARCA2 encodes the ATPase (BRM) of the SWI/SNF chromatin remodeling complex that is at the interface of genome and environmental adaptation. Taking advantage of an Algerian trio cohort of one hundred SZ patients (Benmessaoud et al., BMC Psychiatry, 2008), we replicated the association of SNP rs2296212, localized in exon 33, already shown to be associated in the Koga study and resulting in a D1546E amino acid change in the SMARCA2 protein. We studied SMARCA2 codons and found that exon 33 displays a signature of positive evolution in the primate lineage. Our working hypothesis is that the coding regions displaying positive selection are targets of novel rare variants. To address this question, we sequenced two exons displaying positive evolution and one exon without evidence of positive evolution. We found (i) that rare variants are significantly in excess in SZ patients compared to their parents (p = 0.038, Fisher test) and (ii) a higher proportion of rare variants in the primate-accelerated exons compared with the non-evolutionary exon in SZ patients (p = 0.032, Fisher test). SMARCA2 exon sequencing and whole exome sequencing from patients harboring the SNP rs2296212 common variant are in progress. Altogether, these results are expected to give new insights into the genetic architecture of SZ.
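
The comparisons above are reported as Fisher tests. As a minimal sketch of how a two-sided Fisher exact p-value is computed for a 2x2 table, the function below sums the hypergeometric probabilities of all tables that are no more likely than the observed one. The abstract does not report the underlying counts or whether the tests were one- or two-sided, so the example table is hypothetical and is not expected to reproduce p = 0.038 or p = 0.032.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows = patients / parents, columns = carries a rare variant / does not.
print(fisher_exact_two_sided(8, 92, 1, 199))
```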


Author(s):  
Anne Hinks ◽  
Wendy Thomson

Juvenile rheumatic diseases are heterogeneous, complex genetic diseases; to date only juvenile idiopathic arthritis (JIA) has been extensively studied in terms of identifying genetic risk factors. The MHC region is a well-established risk factor but in the last few years candidate gene and large-scale genome-wide association studies have been utilized in the search for non-HLA risk factors. There are now 17 JIA susceptibility loci which reach the genome-wide significance threshold for association and a further 7 regions with evidence for association in more than one study. In addition, some subtype-specific associations are emerging. These risk loci now need to be investigated further using fine-mapping strategies and then appropriate functional studies to show how the variant alters the gene function. This knowledge will not only lead to a better understanding of disease pathogenesis for juvenile rheumatic diseases but may also aid in the classification of these heterogeneous diseases. It may identify new pathways for potential therapeutic targets and help in the prediction of disease outcome and response to treatment.


2020 ◽  
Vol 10 (11) ◽  
pp. 3991-4000
Author(s):  
Wenqian Kong ◽  
Huizhe Jin ◽  
Valorie H. Goff ◽  
Susan A. Auckland ◽  
Lisa K. Rainville ◽  
...  

Biofuel made from agricultural products has the potential to contribute to a stable supply of fuel for growing energy demands. Some salient plant traits, such as stem diameter and water content, and their relationship to other important biomass-related traits are so far poorly understood. Here, we performed QTL mapping for three stem diameter and two water content traits in a S. bicolor BTx623 x IS3620c recombinant inbred line population of 399 genotypes, and validated the genomic regions identified using genome-wide association studies (GWAS) in a diversity panel of 354 accessions. The discovery of both co-localized and non-overlapping loci affecting stem diameter traits suggests that stem widths at different heights share some common genetic control, but also have some distinct genetic influences. Co-localizations of stem diameter and water content traits with other biomass traits including plant height, flowering time and the ‘dry’ trait, suggest that their inheritance may be linked functionally (pleiotropy) or physically (linkage disequilibrium). Water content QTL in homeologous regions resulting from an ancient duplication event may have been retained and continue to have related functions for an estimated 96 million years. Integration of QTL and GWAS data advanced knowledge of the genetic basis of stem diameter and water content components in sorghum, which may lead to tools and strategies for either enhancing or suppressing these traits, supporting advances toward improved quality of plant-based biomass for biofuel production.


2019 ◽  
Vol 35 (22) ◽  
pp. 4724-4729 ◽  
Author(s):  
Wujuan Zhong ◽  
Cassandra N Spracklen ◽  
Karen L Mohlke ◽  
Xiaojing Zheng ◽  
Jason Fine ◽  
...  

Abstract. Summary: Tens of thousands of reproducibly identified GWAS (Genome-Wide Association Studies) variants, with the vast majority falling in non-coding regions resulting in no eventual protein products, call urgently for mechanistic interpretations. Although numerous methods exist, there are few, if any, methods for simultaneously testing the mediation effects of multiple correlated SNPs via some mediator (e.g. the expression of a gene in the neighborhood) on phenotypic outcome. We propose the multi-SNP mediation intersection-union test (SMUT) to fill this methodological gap. Our extensive simulations demonstrate the validity of SMUT as well as substantial, up to 92%, power gains over alternative methods. In addition, SMUT confirmed known mediators in a real dataset of Finns for plasma adiponectin level, which were missed by many alternative methods. We believe SMUT will become a useful tool to generate mechanistic hypotheses underlying GWAS variants, facilitating functional follow-up. Availability and implementation: The R package SMUT is publicly available from CRAN at https://CRAN.R-project.org/package=SMUT. Supplementary information: Supplementary data are available at Bioinformatics online.
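
The general intersection-union principle behind a mediation test of this kind can be sketched briefly. A mediated effect requires both a SNP-to-mediator effect and a mediator-to-outcome effect, so the null hypothesis is rejected only when every component null is rejected, and the combined p-value is the largest component p-value. The sketch below shows that combination rule only. It is not the SMUT package's API, and the component p-values are hypothetical.

```python
def intersection_union_pvalue(p_values):
    """Combine component p-values with the intersection-union rule: take the maximum."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one component p-value is required")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return max(p_values)


# Hypothetical component p-values: SNPs -> mediator, and mediator -> outcome.
p_snps_to_mediator = 0.001
p_mediator_to_outcome = 0.04
print(intersection_union_pvalue([p_snps_to_mediator, p_mediator_to_outcome]))
```

Taking the maximum means that one weak link is enough to prevent a mediation claim, which keeps the test valid without a further multiplicity adjustment across its components.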


2018 ◽  
Author(s):  
Satish K Nandakumar ◽  
Sean K McFarland ◽  
Laura Marlene Mateyka ◽  
Caleb A Lareau ◽  
Jacob C Ulirsch ◽  
...  

Genome-wide association studies (GWAS) have identified thousands of variants associated with human diseases and traits. However, the majority of GWAS-implicated variants are in non-coding genomic regions and require in-depth follow-up to identify target genes and decipher biological mechanisms. Here, rather than focusing on causal variants, we have undertaken a pooled loss-of-function screen in primary hematopoietic cells to interrogate 389 candidate genes contained in 75 loci associated with red blood cell traits. Using this approach, we identify 77 genes at 38 GWAS loci, with most loci harboring 1-2 candidate genes. Importantly, the hit set was strongly enriched for genes validated through orthogonal genetic approaches. Genes identified by this approach are enriched in relevant biological pathways, allowing regulators of human erythropoiesis and blood disease modifiers to be defined. More generally, this functional screen provides a paradigm for gene-centric follow-up of GWAS for a variety of human diseases and traits.

