LFMM 2.0: Latent factor models for confounder adjustment in genome and epigenome-wide association studies

2018
Author(s): Kevin Caye, Basile Jumentier, Olivier François

Abstract
Motivation: Genome-wide, epigenome-wide and gene-environment association studies are plagued with the problems of confounding and causality. Although those problems have received considerable attention in each application field, no consensus has emerged on which approaches are most appropriate to solve them. Current methods use approximate heuristics for estimating confounders and often ignore correlation between confounders and primary variables, resulting in suboptimal power and precision.
Results: In this study, we developed a least-squares estimation theory of confounder estimation using latent factor models, providing a unique framework for several categories of genomic data. Based on statistical learning methods, the proposed algorithms are fast and efficient, and can be proven mathematically to provide optimal solutions. In simulations, the algorithms outperformed commonly used methods based on principal components and surrogate variable analysis. In analyses of methylation profiles and genotypic data, they provided new insights into the molecular basis of diseases and the adaptation of humans to their environment.
Availability and implementation: Software is available in the R package lfmm at https://bcm-uga.github.io/lfmm/.
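The least-squares approach to confounder estimation described in this abstract can be illustrated with a small NumPy sketch. It assumes a ridge-penalized latent factor regression that minimizes ||Y - W - X B^T||^2 + lam * ||B||^2 over the effect sizes B and a confounder matrix W of rank at most K. The function names, arguments and penalty value are hypothetical choices made for illustration; this is not the API or the code of the R package lfmm.

```python
import numpy as np

def ridge_latent_factor_fit(Y, X, K, lam=1e-2):
    """Sketch: jointly estimate a rank-K confounder matrix W and effects B.

    Y: (n, p) response matrix (e.g. genotypes or methylation levels).
    X: (n, d) primary variables, with n > d assumed.
    Returns W of shape (n, p) and B of shape (p, d).
    """
    n, d = X.shape
    # Full SVD of X: the columns of Q form an orthonormal basis of R^n.
    Q, sigma, _ = np.linalg.svd(X, full_matrices=True)
    # Weights below 1 for directions spanned by X, equal to 1 elsewhere.
    d_lam = np.ones(n)
    d_lam[:sigma.size] = np.sqrt(lam / (lam + sigma**2))
    # Best rank-K approximation of the reweighted, rotated response matrix.
    Z = d_lam[:, None] * (Q.T @ Y)
    Uz, sz, Vzt = np.linalg.svd(Z, full_matrices=False)
    Z_K = (Uz[:, :K] * sz[:K]) @ Vzt[:K]
    # Undo the weighting and rotation to obtain the confounder matrix.
    W = Q @ (Z_K / d_lam[:, None])
    # Ridge regression of the confounder-adjusted responses on X.
    B = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ (Y - W)).T
    return W, B

def penalized_loss(Y, X, W, B, lam):
    """Ridge-penalized least-squares objective evaluated at (W, B)."""
    return float(np.sum((Y - W - X @ B.T) ** 2) + lam * np.sum(B**2))
```

Because W is estimated jointly with B rather than from residuals alone, the sketch allows the latent factors to be correlated with the primary variables X, which is the situation the abstract says heuristic methods often ignore.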

2021
Author(s): Wei Q. Deng, Lei Sun

Abstract
A joint analysis of location and scale can be a powerful tool in genome-wide association studies to uncover previously overlooked markers that influence a quantitative trait through both mean and variance, as well as to prioritize candidates for gene-environment interactions. This approach has recently been generalized to handle related samples, dosage data, and the analytically challenging X-chromosome. We disseminate the latest advances in methodology through a user-friendly software package. The implemented R package (https://cran.r-project.org/web/packages/gJLS2) can be called via PLINK to enable a streamlined population-based analysis genome-wide, or called directly as a front-end R script for intermediate gene-set based analyses, or used in any R GUI for smaller analyses targeting specific variants of interest.
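As an illustration of what a joint location-scale test can look like, the sketch below combines a location (mean) p-value and a scale (variance) p-value for one marker using Fisher's method. The abstract does not state which combination rule the package applies, so the choice of Fisher's method, the assumption that the two p-values are independent under the null, and the function name `fisher_joint_pvalue` are all assumptions made here for illustration.

```python
import math

def fisher_joint_pvalue(p_location, p_scale):
    """Combine two independent p-values with Fisher's method.

    The statistic -2 * (ln p_location + ln p_scale) follows a chi-square
    distribution with 4 degrees of freedom under the null hypothesis.
    """
    if not (0.0 < p_location <= 1.0 and 0.0 < p_scale <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * (math.log(p_location) + math.log(p_scale))
    # Chi-square survival function with 4 df: exp(-x/2) * (1 + x/2).
    return math.exp(-statistic / 2.0) * (1.0 + statistic / 2.0)
```

A marker with modest evidence on both the mean and the variance can reach a smaller joint p-value than either test alone, which is why a joint analysis may uncover markers that a location-only scan overlooks.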


2020
Author(s): Clément Gain, Olivier François

Abstract
A major objective of evolutionary biology is to understand the processes by which organisms have adapted to various environments, and to predict the response of organisms to new or future conditions. The availability of large genomic and environmental data sets provides an opportunity to address those questions, and the R package LEA has been introduced to facilitate population and ecological genomic analyses in this context. By using latent factor models, the program computes ancestry coefficients from population genetic data, and performs genotype-environment association analyses with correction for unobserved confounding variables. In this study, we present new functionalities of LEA, which include imputation of missing genotypes, fast algorithms for latent factor mixed models using multivariate predictors for genotype-environment association studies, population differentiation tests for admixed or continuous populations, and estimation of genetic offset based on climate models. The new functionalities are implemented in version 3.0 and higher releases of the package. Using simulated and real data sets, our study provides evaluations and examples of applications, outlining important practical considerations when analyzing ecological genomic data in R.


2020, Vol 36 (15), pp. 4374-4376
Author(s): Ninon Mounier, Zoltán Kutalik

Abstract
Summary: Increasing sample size is not the only strategy to improve discovery in genome-wide association studies (GWASs); we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes factors, posterior effects and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms.
Availability and implementation: The bGWAS package is freely available under a GPL-2 license and can be accessed, along with user guides and tutorials, from https://github.com/n-mounier/bGWAS.
Supplementary information: Supplementary data are available at Bioinformatics online.
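The way prior effects and observed effects can be combined may be easier to follow with a toy calculation. The sketch below assumes a simple normal-normal model on the z-score scale: the prior mean for a variant is the sum of its effects on each risk factor weighted by that risk factor's causal effect on the focal trait, and the Bayes factor compares this informative prior against a null effect of zero. All function names and the unit observation variance are assumptions made for illustration; the exact formulas used by the bGWAS package are not given in the abstract and may differ.

```python
import math

def prior_effect(causal_effects, risk_factor_effects):
    """Prior mean: sum over risk factors of causal effect times variant effect."""
    return sum(a * b for a, b in zip(causal_effects, risk_factor_effects))

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bayes_factor(z_obs, prior_mean, prior_var):
    """Evidence for effect ~ N(prior_mean, prior_var) against effect = 0.

    Assumes the observed z-score has unit variance around the true effect.
    """
    return normal_pdf(z_obs, prior_mean, 1.0 + prior_var) / normal_pdf(z_obs, 0.0, 1.0)

def posterior_effect(z_obs, prior_mean, prior_var):
    """Posterior mean of the effect under the same normal-normal model."""
    return (z_obs * prior_var + prior_mean) / (1.0 + prior_var)
```

For example, `prior_effect([0.5, -0.2], [2.0, 1.0])` gives a prior mean of 0.8 for a variant that acts on two hypothetical risk factors, and an observed z-score that agrees with a non-zero prior mean yields a Bayes factor above 1.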


2020, Vol 10 (1)
Author(s): Matthias Munz, Inken Wohlers, Eric Simon, Tobias Reinberger, Hauke Busch, ...

Abstract
Exploration of genetic variant-to-gene relationships through quantitative trait loci (QTLs), such as expression QTLs (eQTLs), is a frequently used tool in genome-wide association studies (GWAS). However, the wide range of public QTL databases and the lack of batch annotation features complicate a comprehensive annotation of GWAS results. In this work, we introduce the tool “Qtlizer” for annotating lists of human variants with associated changes in gene expression and protein abundance using an integrated database of published QTLs. Features include incorporation of variants in linkage disequilibrium (LD) and reverse search by gene names. Analyzing the database for base pair distances between the most significant eQTLs and their affected genes suggests that the commonly used cis-distance limit of 1,000,000 base pairs might be too restrictive, implying a substantial number of wrongly assigned and as yet undetected eQTLs. We also ranked genes with respect to the maximum number of tissue-specific eQTL studies in which a most significant eQTL signal was consistent. For the top 100 genes we observed the strongest enrichment with housekeeping genes (P = 2 × 10^-6) and with the 10% highest expressed genes (P = 0.005) after grouping eQTLs by r^2 > 0.95, underlining the relevance of LD information in eQTL analyses. Qtlizer can be accessed via https://genehopper.de/qtlizer or by using the respective Bioconductor R package (https://doi.org/10.18129/B9.bioc.Qtlizer).


2011, Vol 38 (3), pp. 564-566
Author(s): Proton Rahman

Psoriasis and psoriatic arthritis (PsA) are heterogeneous diseases. While both have a strong genetic basis, it is strongest for PsA, where fewer investigators are studying its genetics. Over the last year the number of independent genetic loci associated with psoriasis has substantially increased, mostly due to completion of multiple genome-wide association studies (GWAS) in psoriasis. At least 2 GWAS efforts are now under way in PsA to identify novel genes in this disease; a metaanalysis of genome-wide scans and further studies must follow to examine the genetics of disease expression, epistatic interaction, and gene-environment interaction. In the long term, it is anticipated that genome-wide sequencing is likely to generate another wave of novel genes in PsA. At the annual meeting of the Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA) in Stockholm, Sweden, in 2009, members discussed issues and challenges regarding the advancement of the genetics of PsA; results of those discussions are summarized here.


2013, Vol 475-476, pp. 1084-1089
Author(s): Hui Yuan Chang, Ding Xia Li, Qi Dong Liu, Rong Jing Hu, Rui Sheng Zhang

Recommender systems are widely employed in many fields to recommend products, services and information to potential customers. As the most successful approach to recommender systems, collaborative filtering (CF) predicts user preferences in item selection based on the known user ratings of items. It can be divided into two main branches: the neighbourhood approach (NB) and latent factor models. Some of the most successful realizations of latent factor models are based on matrix factorization (MF). Accuracy is one of the most important measurement criteria for recommender systems. In this paper, to improve accuracy, we propose an improved MF model. In this model, we not only consider the latent factors describing the user and item, but also incorporate content information directly into MF. Experiments are performed on the MovieLens dataset to compare the present approach with the other method. The experimental results indicate that the proposed approach can remarkably improve the recommendation quality.
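One possible way to incorporate content information directly into MF is sketched below in NumPy. It assumes that each item's latent vector is the sum of a free latent vector and a learned linear projection of its content features (for example, genre indicators), trained by stochastic gradient descent on the observed ratings. This particular parameterization, the function names and the hyperparameters are assumptions made for illustration; the abstract does not specify how the paper's model combines content with the factors.

```python
import numpy as np

def train_content_mf(ratings, item_features, n_users, n_items,
                     k=8, lr=0.02, reg=0.02, epochs=100, seed=0):
    """Sketch of content-aware MF trained with SGD.

    ratings: list of (user_index, item_index, rating) tuples.
    item_features: (n_items, n_features) array of item content features.
    Returns (mu, P, Q, A): global mean, user factors, free item factors,
    and the content-to-latent projection matrix.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    A = 0.1 * rng.standard_normal((item_features.shape[1], k))
    mu = float(np.mean([r for _, _, r in ratings]))
    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            f = item_features[i]
            q_eff = Q[i] + f @ A          # item factor = free part + content part
            err = r - (mu + P[u] @ q_eff)
            p_old = P[u].copy()           # use pre-update user factor for item updates
            P[u] += lr * (err * q_eff - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
            A += lr * (err * np.outer(f, p_old) - reg * A)
    return mu, P, Q, A

def predict(model, item_features, u, i):
    """Predicted rating of item i by user u."""
    mu, P, Q, A = model
    return mu + P[u] @ (Q[i] + item_features[i] @ A)

def rmse(model, item_features, ratings):
    """Root mean squared error of the model on a list of ratings."""
    errors = [r - predict(model, item_features, u, i) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))
```

Because the projection matrix is shared across items, an item with few ratings still receives a meaningful latent vector from its content features, which is one reason content information can improve accuracy.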

