A novel privacy-preserving federated genome-wide association study framework and its application in identifying potential risk variants in ankylosing spondylitis

Briefings in Bioinformatics ◽

10.1093/bib/bbaa090 ◽

2020 ◽

Author(s):

Xin Wu ◽

Hao Zheng ◽

Zuochao Dou ◽

Feng Chen ◽

Jieren Deng ◽

...

Keyword(s):

Ankylosing Spondylitis ◽

Sample Size ◽

Potential Risk ◽

Data Privacy ◽

Privacy Preserving ◽

Genome Wide Association ◽

Genomic Research ◽

Risk Variants ◽

Genome Wide ◽

The Impact

Abstract Genome-wide association studies (GWAS) have been widely used to identify potential risk variants in various diseases. A statistically meaningful GWAS typically requires a large sample size to detect disease-associated single nucleotide polymorphisms (SNPs), but a single institution usually possesses only a limited number of samples. Cross-institutional partnerships are therefore required to increase sample size and statistical power. Such partnerships pose significant challenges, a major one being data privacy: public awareness of privacy, the impact of data leakage and privacy-related risks are all growing in importance, while no de-identification standard is available to safeguard genomic data sharing. In this paper, we introduce a novel privacy-preserving federated GWAS framework (iPRIVATES). Equipped with privacy-preserving federated analysis, iPRIVATES enables multiple institutions to jointly perform GWAS analysis without leaking patient-level genotyping data. Only aggregated local statistics are exchanged within the study network. In addition, we evaluate the performance of iPRIVATES on both simulated data and a real-world application: identifying potential risk variants in ankylosing spondylitis (AS). The experimental results showed that the strongest signals of AS-associated SNPs reside mostly around the human leukocyte antigen (HLA) regions. The proposed iPRIVATES framework achieved results equivalent to those of a traditional centralized implementation, demonstrating its great potential for driving collaborative genomic research on different diseases while preserving data privacy.
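The idea that only aggregated local statistics leave each institution can be illustrated with a minimal sketch. This is not the iPRIVATES protocol, whose details the abstract does not give; it assumes a simple allelic chi-square test on one SNP, and the function names (`local_allele_counts`, `aggregate`, `allelic_chi_square`) are hypothetical. Each site reduces its genotypes to a 2x2 allele-count table, and a coordinator sums the tables before testing, so the result matches what pooling the raw data would give.

```python
import math

def local_allele_counts(genotypes, is_case):
    """Per-site step: summarise one SNP as a 2x2 allele-count table.

    genotypes: alt-allele count per individual (0, 1 or 2)
    is_case:   True for cases, False for controls
    Returns [[case_alt, case_ref], [ctrl_alt, ctrl_ref]].
    """
    table = [[0, 0], [0, 0]]
    for g, case in zip(genotypes, is_case):
        row = 0 if case else 1
        table[row][0] += g        # alt alleles carried
        table[row][1] += 2 - g    # ref alleles carried
    return table

def aggregate(tables):
    """Coordinator step: element-wise sum of the per-site tables."""
    total = [[0, 0], [0, 0]]
    for t in tables:
        for i in range(2):
            for j in range(2):
                total[i][j] += t[i][j]
    return total

def allelic_chi_square(table):
    """Pearson chi-square (1 degree of freedom) on a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a margin is empty, so no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a 1-df chi-square via the error function.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy data for two hypothetical sites; only the tables are shared.
site_a = local_allele_counts([2, 1, 1, 0], [True, True, False, False])
site_b = local_allele_counts([2, 2, 0, 0, 1], [True, True, False, False, False])
chi2, p_value = allelic_chi_square(aggregate([site_a, site_b]))
```

Because the allele-count table is additive across sites, summing the tables loses nothing relative to a centralized analysis for this particular test; models with covariates would need a different aggregation scheme.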

A cautionary note on the impact of protocol changes for genome-wide association SNP × SNP interaction studies: an example on ankylosing spondylitis

Human Genetics ◽

10.1007/s00439-015-1560-7 ◽

2015 ◽

Vol 134 (7) ◽

pp. 761-773 ◽

Cited By ~ 8

Author(s):

Kyrylo Bessonov ◽

Elena S. Gusareva ◽

Kristel Van Steen

Keyword(s):

Ankylosing Spondylitis ◽

Genome Wide Association ◽

Cautionary Note ◽

Interaction Studies ◽

Genome Wide ◽

The Impact ◽

SNP Interaction

Association of type 2 diabetes mellitus and periodontal disease susceptibility with genome-wide association–identified risk variants in a Southeastern Brazilian population

Clinical Oral Investigations ◽

10.1007/s00784-020-03717-3 ◽

2021 ◽

Author(s):

Thamiris Cirelli ◽

Rafael Nepomuceno ◽

Jéssica Marina Goveia ◽

Silvana R. P. Orrico ◽

Joni A. Cirelli ◽

...

Keyword(s):

Diabetes Mellitus ◽

Type 2 Diabetes ◽

Type 2 Diabetes Mellitus ◽

Periodontal Disease ◽

Disease Susceptibility ◽

Brazilian Population ◽

Genome Wide Association ◽

Risk Variants ◽

Genome Wide

E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics

Bioinformatics ◽

10.1093/bioinformatics/btab115 ◽

2021 ◽

Author(s):

Zachary F Gerring ◽

Angela Mina-Vargas ◽

Eric R Gamazon ◽

Eske M Derks

Keyword(s):

Genome Wide Association ◽

Chromosome 1 ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Tissue Specific ◽

Complex Disorders ◽

Risk Variants ◽

Genome Wide ◽

Causal Genes

Abstract Motivation: Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with human traits and diseases, but the exact causal genes are largely unknown. Common genetic risk variants are enriched in non-protein-coding regions of the genome and often affect gene expression (expression quantitative trait loci, eQTL) in a tissue-specific manner. To address this challenge, we developed a methodological framework, E-MAGMA, which converts genome-wide association summary statistics into gene-level statistics by assigning risk variants to their putative genes based on tissue-specific eQTL information.

Results: We compared E-MAGMA to three eQTL-informed gene-based approaches using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL at chromosome 1. We performed 10 simulations per gene. The eQTL-h2 (i.e., the proportion of variation explained by the eQTLs) was set at 1%, 2%, and 5%. We found E-MAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g. the number of identified causal genes). When applied to genome-wide association summary statistics for five neuropsychiatric disorders, E-MAGMA identified more putative candidate causal genes compared to other eQTL-based approaches. By integrating tissue-specific eQTL information, these results show E-MAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders.

Availability: A tutorial and input files are made available in a github repository: https://github.com/eskederks/eMAGMA-tutorial.

Supplementary information: Supplementary data are available at Bioinformatics online.
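The step of assigning variants to genes through eQTL information and then summarising them per gene can be sketched roughly as follows. This is an illustration under stated assumptions, not E-MAGMA's method: the gene-level score here is a naive mean of squared z-scores that ignores linkage disequilibrium between SNPs, and the function name, SNP identifiers and gene labels are all hypothetical.

```python
from collections import defaultdict

def gene_level_scores(snp_z, eqtl_map):
    """Group SNP z-scores by eQTL-assigned gene and average z^2.

    snp_z:    {snp_id: z-score from GWAS summary statistics}
    eqtl_map: {snp_id: [genes whose expression the SNP is an eQTL for]}
    SNPs without a summary statistic are skipped; a SNP that is an
    eQTL for several genes contributes to each of them.
    """
    grouped = defaultdict(list)
    for snp, genes in eqtl_map.items():
        if snp not in snp_z:
            continue
        for gene in genes:
            grouped[gene].append(snp_z[snp] ** 2)
    return {gene: sum(values) / len(values) for gene, values in grouped.items()}

# Toy inputs; a tissue-specific analysis would use one eqtl_map per tissue.
snp_z = {"rs1": 2.0, "rs2": -1.0, "rs3": 3.0}
eqtl_map = {"rs1": ["GENE_A"], "rs2": ["GENE_A", "GENE_B"], "rs4": ["GENE_C"]}
scores = gene_level_scores(snp_z, eqtl_map)
```

Note that `rs3` has a strong signal but no eQTL assignment, so it contributes to no gene; that is the main behavioural difference from assigning SNPs by physical proximity.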

The Impact of Incomplete Linkage Disequilibrium and Genetic Model Choice on the Analysis and Interpretation of Genome-wide Association Studies

Annals of Human Genetics ◽

10.1111/j.1469-1809.2010.00579.x ◽

2010 ◽

Vol 74 (4) ◽

pp. 375-379 ◽

Cited By ~ 6

Author(s):

Mark M. Iles

Keyword(s):

Linkage Disequilibrium ◽

Genetic Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Model Choice ◽

Genome Wide ◽

The Impact

Faculty Opinions recommendation of Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.733105041.793567849 ◽

2019 ◽

Author(s):

Benjamin Neale

Keyword(s):

Major Depression ◽

Genetic Architecture ◽

Genome Wide Association ◽

Association Analyses ◽

Risk Variants ◽

Genome Wide

Faculty Opinions recommendation of Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.733105041.793546400 ◽

2018 ◽

Author(s):

Holger Lerche ◽

Mahmoud Koko Musa

Keyword(s):

Major Depression ◽

Genetic Architecture ◽

Genome Wide Association ◽

Association Analyses ◽

Risk Variants ◽

Genome Wide

A Review on the Impact of Genetics and Genome Wide Association Studies in Autoimmunity

MOJ Proteomics & Bioinformatics ◽

10.15406/mojpb.2017.06.00203 ◽

2017 ◽

Vol 6 (4) ◽

Author(s):

Harishchander Anandaram

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

The Impact

Prediction of Alzheimer's disease using multi-variants from a Chinese genome-wide association study

Brain ◽

10.1093/brain/awaa364 ◽

2020 ◽

Cited By ~ 1

Author(s):

Longfei Jia ◽

Fangyu Li ◽

Cuibai Wei ◽

Min Zhu ◽

Qiumin Qu ◽

...

Keyword(s):

Alzheimer's Disease ◽

Association Study ◽

Predictive Models ◽

Genome Wide Association Study ◽

Disease Onset ◽

Genome Wide Association ◽

Nucleotide Polymorphisms ◽

Risk Variants ◽

Genome Wide

Abstract Previous genome-wide association studies have identified dozens of susceptibility loci for sporadic Alzheimer’s disease, but few of these loci have been validated in longitudinal cohorts. Establishing predictive models of Alzheimer’s disease based on these novel variants is clinically important for verifying whether they have pathological functions and for providing a useful tool for screening of disease risk. In the current study, we performed a two-stage genome-wide association study of 3913 patients with Alzheimer’s disease and 7593 controls and identified four novel variants (rs3777215, rs6859823, rs234434, and rs2255835; P_combined = 3.07 × 10^-19, 2.49 × 10^-23, 1.35 × 10^-67, and 4.81 × 10^-9, respectively) as well as nine variants in the apolipoprotein E region with genome-wide significance (P < 5.0 × 10^-8). Literature mining suggested that these novel single nucleotide polymorphisms are related to amyloid precursor protein transport and metabolism, antioxidation, and neurogenesis. Based on their possible roles in the development of Alzheimer’s disease, we used different combinations of these variants and the apolipoprotein E status and successively built 11 predictive models. The predictive models include relatively few single nucleotide polymorphisms, which makes them useful for clinical practice: the maximum number was 13 and the minimum was only four. These predictive models were all significant and their peak area under the curve reached 0.73 in both the first and second stages. Finally, these models were validated using a separate longitudinal cohort of 5474 individuals. The results showed that individuals carrying risk variants included in the models had a shorter latency and higher incidence of Alzheimer’s disease, suggesting that our models can predict Alzheimer’s disease onset in a population with genetic susceptibility.
The effectiveness of the models for predicting Alzheimer’s disease onset confirmed the contributions of these identified variants to disease pathogenesis. In conclusion, this is the first study to validate genome-wide association study-based predictive models for evaluating the risk of Alzheimer’s disease onset in a large Chinese population. The clinical application of these models will be beneficial for individuals harbouring these risk variants, and particularly for young individuals seeking genetic consultation.

A Differential Privacy Preserving Framework with Nash Equilibrium in Genome-Wide Association studies

2018 International Conference on Networking and Network Applications (NaNA) ◽

10.1109/nana.2018.8648711 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ziwei Han ◽

Hai Liu ◽

Zhenqiang Wu

Keyword(s):

Nash Equilibrium ◽

Differential Privacy ◽

Association Studies ◽

Privacy Preserving ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Erratum to: Power estimation and sample size determination for replication studies of genome-wide association studies

BMC Genomics ◽

10.1186/s12864-017-3482-3 ◽

2017 ◽

Vol 18 (1) ◽

Author(s):

Wei Jiang ◽

Weichuan Yu

Keyword(s):

Sample Size ◽

Association Studies ◽

Power Estimation ◽

Sample Size Determination ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Size Determination ◽

Replication Studies ◽

Genome Wide
