A W-test collapsing method for rare-variant association testing in exome sequencing data

Rui Sun; Haoyi Weng; Inchi Hu; Junfeng Guo; William K. K. Wu; Benny Chung-Ying Zee; Maggie Haitian Wang

doi:10.1002/gepi.22000

The weighting is the hardest part: on the behavior of the likelihood ratio test and score test under weight misspecification in rare variant association studies

10.1101/020198 ◽

2015 ◽

Author(s):

Camelia C. Minica ◽

Giulio Genovese ◽

Christina M. Hultman ◽

René Pool ◽

Jacqueline M. Vink ◽

...

Keyword(s):

Exome Sequencing ◽

Likelihood Ratio ◽

Rare Variant ◽

Association Studies ◽

Score Test ◽

P Value ◽

Ratio Test ◽

Sequencing Data ◽

Rare Variant Association ◽

Exome Sequencing Data

Rare variant association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Correct weighting is expected to boost power, yet the correct weights are generally unknown. It is therefore important to assess the effect of weight misspecification in SKAT. We evaluated the behavior of the score test and the likelihood ratio test (LRT) under weight misspecification. Simulation and empirical results revealed that the LRT is generally more robust and more powerful than the score test in such circumstances. For instance, when the simulated betas were larger for rarer than for more common variants, (incorrectly) assigning equal weights reduced the power of the LRT by ~5%, while the power of the score test dropped by ~30%. To optimize weighting, we proposed a data-driven weighting scheme. With this scheme and the LRT, we detected significant enrichment of rare case mutations (MAF<5%; P-value=7E-04) in a set of constrained genes in the Swedish schizophrenia case-control cohort with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful test. However, the LRT has the compelling qualities of being generally more powerful and more robust to misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
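To make the role of the weights concrete, the following is a minimal sketch of a SKAT-style weighted score statistic, Q = Σ_j (√w_j · S_j)², where S_j is the score for variant j. It is an illustration under stated assumptions, not the authors' implementation: the Beta(MAF; 1, 25) weight is the default proposed in the SKAT paper listed below, the intercept-only null model, the simulated effect sizes, and all function names are hypothetical, and a permutation p-value is swapped in for the analytic mixture-of-chi-squares p-value that SKAT itself uses. The LRT discussed in the abstract is not implemented here.

```python
import math
import numpy as np

def beta_density_weights(maf, a=1.0, b=25.0):
    """Beta(a, b) density evaluated at each MAF (plays the role of sqrt(w_j))."""
    maf = np.asarray(maf, dtype=float)
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)

def skat_q(y, G, sqrt_w):
    """Weighted score statistic under an intercept-only null (assumption)."""
    resid = y - y.mean()            # residuals from the null model
    scores = G.T @ resid            # per-variant score S_j
    return float(np.sum((sqrt_w * scores) ** 2))

def permutation_pvalue(y, G, sqrt_w, n_perm=200, seed=0):
    """Permutation p-value; a stand-in for SKAT's analytic p-value."""
    rng = np.random.default_rng(seed)
    q_obs = skat_q(y, G, sqrt_w)
    q_null = np.array([skat_q(rng.permutation(y), G, sqrt_w)
                       for _ in range(n_perm)])
    return (1 + np.sum(q_null >= q_obs)) / (1 + n_perm)

def simulate(n=500, p=20, seed=1):
    """Toy data: rarer variants get larger effects (illustrative values only)."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.005, 0.05, size=p)
    G = rng.binomial(2, maf, size=(n, p)).astype(float)
    effects = 0.4 * np.abs(np.log10(maf))
    y = G @ effects + rng.normal(size=n)
    return y, G, maf

y, G, _ = simulate()
maf_hat = G.mean(axis=0) / 2.0                 # sample allele frequencies
sqrt_w_beta = beta_density_weights(maf_hat)    # up-weights rarer variants
sqrt_w_flat = np.ones_like(maf_hat)            # equal weights (misspecified here)

print("p (Beta weights): ", permutation_pvalue(y, G, sqrt_w_beta))
print("p (equal weights):", permutation_pvalue(y, G, sqrt_w_flat))
```

Repeating the simulation over many seeds and counting how often each p-value falls below a threshold would give a rough power comparison between the two weightings; a single run such as this one only shows how the weights enter the statistic.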


Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test

The American Journal of Human Genetics ◽

10.1016/j.ajhg.2011.05.029 ◽

2011 ◽

Vol 89 (1) ◽

pp. 82-93 ◽

Cited By ~ 1381

Author(s):

Michael C. Wu ◽

Seunggeun Lee ◽

Tianxi Cai ◽

Yun Li ◽

Michael Boehnke ◽

...

Keyword(s):

Rare Variant ◽

Association Test ◽

Sequence Kernel Association Test ◽

Sequencing Data ◽

Rare Variant Association ◽

Association Testing


Rare Variant Association Testing for Next-Generation Sequencing Data via Hierarchical Clustering

Human Heredity ◽

10.1159/000346022 ◽

2012 ◽

Vol 74 (3-4) ◽

pp. 165-171 ◽

Cited By ~ 1

Author(s):

Ioanna Tachmazidou ◽

Andrew Morris ◽

Eleftheria Zeggini

Keyword(s):

Next Generation Sequencing ◽

Hierarchical Clustering ◽

Rare Variant ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Rare Variant Association ◽

Association Testing ◽

Generation Sequencing


Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data

Genetic Epidemiology ◽

10.1002/gepi.21783 ◽

2013 ◽

Vol 38 (2) ◽

pp. 104-113 ◽

Cited By ~ 7

Author(s):

Nicholas B. Larson ◽

Daniel J. Schaid

Keyword(s):

Exome Sequencing ◽

Rare Variant ◽

Enrichment Analysis ◽

Case Control ◽

Sequencing Data ◽

Exome Sequencing Data


Meta-analysis of whole-exome sequencing data from two independent cohorts finds no evidence for rare variant enrichment in Parkinson disease associated loci

PLoS ONE ◽

10.1371/journal.pone.0239824 ◽

2020 ◽

Vol 15 (10) ◽

pp. e0239824

Author(s):

Johannes Jernqvist Gaare ◽

Gonzalo Nido ◽

Christian Dölle ◽

Paweł Sztromwasser ◽

Guido Alves ◽

...

Keyword(s):

Parkinson Disease ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Rare Variant ◽

Meta Analysis ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data


Faculty Opinions recommendation of A statistical approach for rare-variant association testing in affected sibships.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.725400621.793510953 ◽

2015 ◽

Author(s):

Ellen Wijsman

Keyword(s):

Rare Variant ◽

Statistical Approach ◽

Rare Variant Association ◽

Association Testing


ETumorMetastasis: A Network-based Algorithm Predicts Clinical Outcomes Using Whole-exome Sequencing Data of Cancer Patients

Genomics Proteomics & Bioinformatics ◽

10.1016/j.gpb.2020.06.009 ◽

2021 ◽

Cited By ~ 1

Author(s):

Jean-Sébastien Milanese ◽

Chabane Tibiche ◽

Naif Zaman ◽

Jinfeng Zou ◽

Pengyong Han ◽

...

Keyword(s):

Exome Sequencing ◽

Cancer Patients ◽

Clinical Outcomes ◽

Whole Exome Sequencing ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data


Variable Phenotypes of Epilepsy, Intellectual Disability, and Schizophrenia Caused by 12p13.33–p13.32 Terminal Microdeletion in a Korean Family: A Case Report and Literature Review

Genes ◽

10.3390/genes12071001 ◽

2021 ◽

Vol 12 (7) ◽

pp. 1001

Author(s):

Jiyoon Han ◽

Joonhong Park

Keyword(s):

Intellectual Disability ◽

Exome Sequencing ◽

Environmental Influence ◽

Copy Number Variations ◽

Genetic Modifiers ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Korean Family ◽

Coverage Analysis ◽

Patient Will

A simultaneous analysis of nucleotide changes and copy number variations (CNVs) based on exome sequencing data has been demonstrated as a potential new first-tier diagnostic strategy for rare neuropsychiatric disorders. In this report, using depth-of-coverage analysis of exome sequencing data, we describe variable phenotypes of epilepsy, intellectual disability (ID), and schizophrenia caused by a 12p13.33–p13.32 terminal microdeletion in a Korean family. We hypothesized that, of the six candidate genes located in this region, CACNA1C and KDM5A were the best candidates for explaining epilepsy, ID, and schizophrenia and may be responsible for the clinical features reported in cases with monosomy of the 12p13.33 subtelomeric region. Against the background of this microdeletion syndrome, which has been described in clinical cases with mild, moderate, and severe neurodevelopmental manifestations as well as impairments, the clinician may determine whether the patient will end up with a more severe or milder end-phenotype, which in turn determines disease prognosis. In our case, the 12p13.33–p13.32 terminal microdeletion may explain the variable expressivity in the same family. However, further comprehensive studies with larger cohorts focusing on careful phenotyping across the lifespan are required to clearly elucidate the possible contribution of genetic modifiers and the environmental influence on the expressivity of the 12p13.33 microdeletion and associated characteristics.


UTILIZATION OF WHOLE EXOME SEQUENCING DATA TO IDENTIFY CLINICALLY RELEVANT PHARMACOGENOMIC VARIANTS IN INFLAMMATORY BOWEL DISEASE

Gastroenterology ◽

10.1053/j.gastro.2021.01.119 ◽

2021 ◽

Vol 160 (3) ◽

pp. S43

Author(s):

Daniel Mulder ◽

Sam Khalouei ◽

Neil Warner ◽

Claudia Gonzaga-Jauregui ◽

Peter Church ◽

...

Keyword(s):

Inflammatory Bowel Disease ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Bowel Disease ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Inflammatory Bowel


FC 011 KIDNEYNETWORK: USING KIDNEY DERIVED GENE EXPRESSION DATA TO PREDICT AND PRIORITIZE NOVEL GENES INVOLVED IN KIDNEY DISEASE

Nephrology Dialysis Transplantation ◽

10.1093/ndt/gfab131.001 ◽

2021 ◽

Vol 36 (Supplement_1) ◽

Author(s):

Floranne Boulogne ◽

Laura Claus ◽

Henry Wiersma ◽

Roy Oelen ◽

Floor Schukking ◽

...

Keyword(s):

Gene Expression ◽

Kidney Disease ◽

Candidate Gene ◽

Exome Sequencing ◽

Rna Sequencing ◽

Expression Patterns ◽

Genetic Diagnosis ◽

Specific Gene ◽

Sequencing Data ◽

Exome Sequencing Data

Abstract

Background and Aims: Genetic testing in patients with suspected hereditary kidney disease does not always reveal the genetic cause for the patient's disorder. Potentially pathogenic variants can reside in genes that are not known to be involved in kidney disease, which makes it difficult to prioritize and interpret the relevance of these variants. As such, there is a clear need for methods that predict the phenotypic consequences of gene expression in a way that is as unbiased as possible. To help identify candidate genes we have developed KidneyNetwork, in which tissue-specific expression is utilized to predict kidney-specific gene functions.

Method: We combined gene co-expression in 878 publicly available kidney RNA-sequencing samples with the co-expression of a multi-tissue RNA-sequencing dataset of 31,499 samples to build KidneyNetwork. The expression patterns were used to predict which genes have a kidney-related function, and which (disease) phenotypes might be caused when these genes are mutated. By integrating the information from the HPO database, in which known phenotypic consequences of disease genes are annotated, with the gene co-expression network we obtained prediction scores for each gene per HPO term. As proof of principle, we applied KidneyNetwork to prioritize variants in exome-sequencing data from 13 kidney disease patients without a genetic diagnosis.

Results: We assessed the prediction performance of KidneyNetwork by comparing it to GeneNetwork, a multi-tissue co-expression network we previously developed. In KidneyNetwork, we observe a significantly improved prediction accuracy of kidney-related HPO-terms, as well as an increase in the total number of significantly predicted kidney-related HPO-terms (figure 1). To examine its clinical utility, we applied KidneyNetwork to 13 patients with a suspected hereditary kidney disease without a genetic diagnosis. Based on the HPO terms “Renal cyst” and “Hepatic cysts”, combined with a list of potentially damaging variants in one of the undiagnosed patients with mild ADPKD/PCLD, we identified ALG6 as a new candidate gene. ALG6 bears a high resemblance to other genes implicated in this phenotype in recent years. Through the 100,000 Genomes Project and collaborators we identified three additional patients with kidney and/or liver cysts carrying a suspected deleterious variant in ALG6.

Conclusion: We present KidneyNetwork, a kidney specific co-expression network that accurately predicts what genes have kidney-specific functions and may result in kidney disease. Gene-phenotype associations of genes unknown for kidney-related phenotypes can be predicted by KidneyNetwork. We show the added value of KidneyNetwork by applying it to exome sequencing data of kidney disease patients without a molecular diagnosis and consequently we propose ALG6 as a promising candidate gene. KidneyNetwork can be applied to clinically unsolved kidney disease cases, but it can also be used by researchers to gain insight into individual genes to better understand kidney physiology and pathophysiology.

Acknowledgments: This research was made possible through access to the data and findings generated by the 100,000 Genomes Project; http://www.genomicsengland.co.uk.
