The weighting is the hardest part: on the behavior of the likelihood ratio test and score test under weight misspecification in rare variant association studies

bioRxiv ◽

10.1101/020198 ◽

2015 ◽

Author(s):

Camelia C. Minica ◽

Giulio Genovese ◽

Christina M. Hultman ◽

René Pool ◽

Jacqueline M. Vink ◽

...

Keyword(s):

Exome Sequencing ◽

Likelihood Ratio ◽

Rare Variant ◽

Association Studies ◽

Score Test ◽

P Value ◽

Ratio Test ◽

Sequencing Data ◽

Rare Variant Association ◽

Exome Sequencing Data

Rare variant association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Correct weighting is expected to boost power, yet the correct weights are generally unknown. It is therefore important to assess the effect of weight misspecification in SKAT. We evaluated the behavior of the score test and the likelihood ratio test (LRT) under weight misspecification. Simulation and empirical results revealed that the LRT is generally more robust and more powerful than the score test in such circumstances. For instance, when the simulated betas were larger for rarer than for more common variants, (incorrectly) assigning equal weights reduced the power of the LRT by ~5%, while the power of the score test dropped by ~30%. To optimize weighting, we proposed a data-driven weighting scheme. With this scheme and the LRT, we detected significant enrichment of rare case mutations (MAF<5%; P-value=7E-04) in a set of constrained genes in the Swedish schizophrenia case-control cohort with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful test. However, the LRT has the compelling qualities of being generally more powerful and more robust to misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
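
To make the role of the weights concrete, the following is a minimal sketch of a SKAT-style weighted score statistic, Q = sum_j w_j * S_j^2, where S_j is the score for variant j. It is an illustration under stated assumptions, not the authors' implementation: the null model is intercept-only, the function names (`beta_density`, `skat_q`) are hypothetical, and the Beta(MAF; 1, 25) density is used only because it is a commonly cited default for sqrt(w_j) in SKAT software. The sketch computes the statistic only; it does not compute p-values and does not implement the LRT.

```python
import math


def beta_density(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def skat_q(genotypes, phenotype, sqrt_weights):
    """SKAT-style statistic Q = sum_j (sqrt_w_j * S_j)^2.

    genotypes    : list of rows, one per individual, of minor-allele counts
    phenotype    : list of trait values, one per individual
    sqrt_weights : one value sqrt(w_j) per variant (assumed supplied by the user)

    Assumption: intercept-only null model, so residuals are y - mean(y).
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j, sw in enumerate(sqrt_weights):
        # Per-variant score: genotype column times null-model residuals.
        s_j = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += (sw * s_j) ** 2
    return q


# Toy data (made up for illustration): 4 individuals, 2 variants.
toy_genotypes = [[0, 1], [1, 0], [0, 0], [1, 2]]
toy_phenotype = [1.0, 2.0, 0.0, 3.0]

# Equal weights versus Beta(1, 25) weights evaluated at each variant's MAF.
mafs = [sum(row[j] for row in toy_genotypes) / (2 * len(toy_genotypes))
        for j in range(2)]
q_equal = skat_q(toy_genotypes, toy_phenotype, [1.0, 1.0])
q_beta = skat_q(toy_genotypes, toy_phenotype,
                [beta_density(p, 1, 25) for p in mafs])
```

Because the Beta(1, 25) density decreases with allele frequency, this choice up-weights rarer variants; if the true effects do not follow that pattern, the weights are misspecified in the sense discussed in the abstract.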

The Weighting is the Hardest Part: On the Behavior of the Likelihood Ratio Test and the Score Test Under a Data-Driven Weighting Scheme in Sequenced Samples

Twin Research and Human Genetics ◽

10.1017/thg.2017.7 ◽

2017 ◽

Vol 20 (2) ◽

pp. 108-118 ◽

Cited By ~ 1

Author(s):

Camelia C. Minică ◽

Giulio Genovese ◽

Christina M. Hultman ◽

René Pool ◽

Jacqueline M. Vink ◽

...

Keyword(s):

Exome Sequencing ◽

Likelihood Ratio ◽

Likelihood Ratio Test ◽

Association Studies ◽

Score Test ◽

Weighting Scheme ◽

Data Driven ◽

Ratio Test ◽

Sequencing Data ◽

Exome Sequencing Data

Sequence-based association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Because the true weights are generally unknown, and so are subject to misspecification, we examined the efficiency of a data-driven weighting scheme. We propose the use of a set of theoretically defensible weighting schemes, of which, we assume, the one that gives the largest test statistic is likely to best capture the allele frequency–functional effect relationship. We show that the use of alternative weights obviates the need to impose arbitrary frequency thresholds. As both the score test and the likelihood ratio test (LRT) may be used in this context, and may differ in power, we characterize the behavior of both tests. The two tests have equal power if the set includes weights resembling the correct ones. However, if the weights are badly specified, the LRT shows superior power (due to its robustness to misspecification). With this data-driven weighting procedure, the LRT detected significant signal in genes located in regions already confirmed as associated with schizophrenia, namely PRRC2A (p = 1.020e-06) and VARS2 (p = 2.383e-06), in the Swedish schizophrenia case-control cohort of 11,040 individuals with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful test. However, the LRT has the advantageous properties of being generally more robust and more powerful under weight misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
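
The selection step described above (evaluate several candidate weighting schemes and keep the one with the largest test statistic) can be sketched as follows. This is a hedged illustration, not the paper's procedure: the candidate set, the names `CANDIDATE_SCHEMES`, `weighted_q` and `pick_scheme`, and the intercept-only null model are all assumptions, and a score-type statistic stands in for the LRT. Since a raw score-type statistic grows with the overall scale of the weights, the sketch rescales each scheme so its squared weights sum to 1 before comparing; that normalisation is also an assumption made here for comparability.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


# Hypothetical candidate schemes: each maps a MAF p (0 < p < 1) to a weight.
CANDIDATE_SCHEMES = {
    "equal": lambda p: 1.0,
    "beta_1_25": lambda p: beta_pdf(p, 1, 25),
    "madsen_browning_style": lambda p: 1.0 / math.sqrt(p * (1 - p)),
}


def weighted_q(genotypes, phenotype, weights):
    """Score-type statistic with squared weights rescaled to sum to 1."""
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]  # intercept-only null model
    total = sum(w * w for w in weights)
    q = 0.0
    for j, w in enumerate(weights):
        s_j = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += (w * w / total) * s_j ** 2
    return q


def pick_scheme(genotypes, phenotype, schemes=CANDIDATE_SCHEMES):
    """Return (name of scheme with the largest statistic, all statistics).

    Assumes every variant is polymorphic, so that 0 < MAF < 1.
    """
    n, m = len(genotypes), len(genotypes[0])
    mafs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    stats = {name: weighted_q(genotypes, phenotype, [f(p) for p in mafs])
             for name, f in schemes.items()}
    best = max(stats, key=stats.get)
    return best, stats


# Toy data (made up for illustration): 4 individuals, 2 variants.
best_name, all_stats = pick_scheme([[0, 1], [1, 0], [0, 0], [1, 2]],
                                   [1.0, 2.0, 0.0, 3.0])
```

Taking the maximum over several schemes is itself a form of multiple testing, so any significance assessment of the selected statistic would need to account for the selection (for example by permutation); the sketch deliberately stops at the selection step.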

Incorporation of protein binding effects into likelihood ratio test for exome sequencing data

BMC Proceedings ◽

10.1186/s12919-016-0043-8 ◽

2016 ◽

Vol 10 (S7) ◽

Cited By ~ 1

Author(s):

Dongni Zhang ◽

Hongzhu Cui ◽

Dmitry Korkin ◽

Zheyang Wu

Keyword(s):

Protein Binding ◽

Exome Sequencing ◽

Likelihood Ratio ◽

Likelihood Ratio Test ◽

Ratio Test ◽

Sequencing Data ◽

Exome Sequencing Data

A W-test collapsing method for rare-variant association testing in exome sequencing data

Genetic Epidemiology ◽

10.1002/gepi.22000 ◽

2016 ◽

Vol 40 (7) ◽

pp. 591-596 ◽

Cited By ~ 3

Author(s):

Rui Sun ◽

Haoyi Weng ◽

Inchi Hu ◽

Junfeng Guo ◽

William K. K. Wu ◽

...

Keyword(s):

Exome Sequencing ◽

Rare Variant ◽

Sequencing Data ◽

Rare Variant Association ◽

Exome Sequencing Data ◽

Association Testing

Exome sequencing and complex disease: practical aspects of rare variant association studies

Human Molecular Genetics ◽

10.1093/hmg/dds387 ◽

2012 ◽

Vol 21 (R1) ◽

pp. R1-R9 ◽

Cited By ~ 97

Author(s):

R. Do ◽

S. Kathiresan ◽

G. R. Abecasis

Keyword(s):

Exome Sequencing ◽

Rare Variant ◽

Complex Disease ◽

Association Studies ◽

Rare Variant Association

Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data

Genetic Epidemiology ◽

10.1002/gepi.21783 ◽

2013 ◽

Vol 38 (2) ◽

pp. 104-113 ◽

Cited By ~ 7

Author(s):

Nicholas B. Larson ◽

Daniel J. Schaid

Keyword(s):

Exome Sequencing ◽

Rare Variant ◽

Enrichment Analysis ◽

Case Control ◽

Sequencing Data ◽

Exome Sequencing Data

Meta-analysis of whole-exome sequencing data from two independent cohorts finds no evidence for rare variant enrichment in Parkinson disease associated loci

PLoS ONE ◽

10.1371/journal.pone.0239824 ◽

2020 ◽

Vol 15 (10) ◽

pp. e0239824

Author(s):

Johannes Jernqvist Gaare ◽

Gonzalo Nido ◽

Christian Dölle ◽

Paweł Sztromwasser ◽

Guido Alves ◽

...

Keyword(s):

Parkinson Disease ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Rare Variant ◽

Meta Analysis ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

ETumorMetastasis: A Network-based Algorithm Predicts Clinical Outcomes Using Whole-exome Sequencing Data of Cancer Patients

Genomics Proteomics & Bioinformatics ◽

10.1016/j.gpb.2020.06.009 ◽

2021 ◽

Cited By ~ 1

Author(s):

Jean-Sébastien Milanese ◽

Chabane Tibiche ◽

Naif Zaman ◽

Jinfeng Zou ◽

Pengyong Han ◽

...

Keyword(s):

Exome Sequencing ◽

Cancer Patients ◽

Clinical Outcomes ◽

Whole Exome Sequencing ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

Variable Phenotypes of Epilepsy, Intellectual Disability, and Schizophrenia Caused by 12p13.33–p13.32 Terminal Microdeletion in a Korean Family: A Case Report and Literature Review

Genes ◽

10.3390/genes12071001 ◽

2021 ◽

Vol 12 (7) ◽

pp. 1001

Author(s):

Jiyoon Han ◽

Joonhong Park

Keyword(s):

Intellectual Disability ◽

Exome Sequencing ◽

Environmental Influence ◽

Copy Number Variations ◽

Genetic Modifiers ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Korean Family ◽

Coverage Analysis ◽

Patient Will

A simultaneous analysis of nucleotide changes and copy number variations (CNVs) based on exome sequencing data has been demonstrated as a potential new first-tier diagnostic strategy for rare neuropsychiatric disorders. In this report, using depth-of-coverage analysis of exome sequencing data, we describe variable phenotypes of epilepsy, intellectual disability (ID), and schizophrenia caused by a 12p13.33–p13.32 terminal microdeletion in a Korean family. We hypothesized that, of the six candidate genes located in this region, CACNA1C and KDM5A were the best candidates for explaining epilepsy, ID, and schizophrenia and may be responsible for the clinical features reported in cases with monosomy of the 12p13.33 subtelomeric region. Against the background of the microdeletion syndrome, which has been described in clinical cases with mild, moderate, and severe neurodevelopmental manifestations as well as impairments, the clinician may determine whether the patient will end up with a more severe or milder end-phenotype, which in turn determines disease prognosis. In our case, the 12p13.33–p13.32 terminal microdeletion may explain the variable expressivity within the same family. However, further comprehensive studies with larger cohorts, focusing on careful phenotyping across the lifespan, are required to clearly elucidate the possible contribution of genetic modifiers and environmental influences to the expressivity of the 12p13.33 microdeletion and associated characteristics.

Exome-Wide Pan-Cancer Analysis of Germline Variants in 8,719 Individuals Finds Little Evidence of Rare Variant Associations

Human Heredity ◽

10.1159/000519355 ◽

2021 ◽

pp. 1-10

Author(s):

Zoe Guan ◽

Ronglai Shen ◽

Colin B. Begg

Keyword(s):

Rare Variant ◽

Rare Variants ◽

Association Studies ◽

The Cancer Genome Atlas ◽

Considerable Proportion ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Risk Variants ◽

Cancer Types ◽

Pan Cancer

Background: Many cancer types show considerable heritability, and extensive research has been done to identify germline susceptibility variants. Linkage studies have discovered many rare high-risk variants, and genome-wide association studies (GWAS) have discovered many common low-risk variants. However, it is believed that a considerable proportion of the heritability of cancer remains unexplained by known susceptibility variants. The “rare variant hypothesis” proposes that much of the missing heritability lies in rare variants that cannot reliably be detected by linkage analysis or GWAS. Until recently, high sequencing costs have precluded extensive surveys of rare variants, but technological advances have now made it possible to analyze rare variants on a much greater scale. Objectives: In this study, we investigated associations between rare variants and 14 cancer types. Methods: We ran association tests using whole-exome sequencing data from The Cancer Genome Atlas (TCGA) and validated the findings using data from the Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG). Results: We identified four significant associations in TCGA, only one of which was replicated in PCAWG (BRCA1 and ovarian cancer). Conclusions: Our results provide little evidence in favor of the rare variant hypothesis. Much larger sample sizes may be needed to detect undiscovered rare cancer variants.

UTILIZATION OF WHOLE EXOME SEQUENCING DATA TO IDENTIFY CLINICALLY RELEVANT PHARMACOGENOMIC VARIANTS IN INFLAMMATORY BOWEL DISEASE

Gastroenterology ◽

10.1053/j.gastro.2021.01.119 ◽

2021 ◽

Vol 160 (3) ◽

pp. S43

Author(s):

Daniel Mulder ◽

Sam Khalouei ◽

Neil Warner ◽

Claudia Gonzaga-Jauregui ◽

Peter Church ◽

...

Keyword(s):

Inflammatory Bowel Disease ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Bowel Disease ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Inflammatory Bowel
