Accurate error control in high dimensional association testing using conditional false discovery rates

Mapping Intimacies ◽

10.1101/414318 ◽

2018 ◽

Cited By ~ 1

Author(s):

James Liley ◽

Chris Wallace

Keyword(s):

Error Rate ◽

Error Control ◽

Association Studies ◽

New Method ◽

High Dimensional ◽

Biomedical Sciences ◽

Type 1 Error ◽

False Discovery ◽

Multiple Covariates

High-dimensional hypothesis testing is ubiquitous in the biomedical sciences, and informative covariates may be employed to improve power. The conditional false discovery rate (cFDR) is a widely used approach suited to the setting where the covariate is a set of p-values for the equivalent hypotheses for a second trait. Although related to the Benjamini-Hochberg procedure, it does not permit any easy control of the type-1 error rate, and existing methods are over-conservative. We propose a new method for type-1 error rate control based on identifying mappings from the unit square to the unit interval defined by the estimated cFDR, and splitting observations so that each map is independent of the observations it is used to test. We also propose an adjustment to the existing cFDR estimator which further improves power. We show by simulation that the new method more than doubles the potential improvement in power over unconditional analyses compared to existing methods. We demonstrate our method on transcriptome-wide association studies, and show that the method can be used in an iterative way, enabling the use of multiple covariates successively. Our methods substantially improve the power and applicability of cFDR analysis.
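The two quantities named in this abstract can be sketched as follows. This is an illustration only, not the authors' new method: the function names are hypothetical, and the cFDR formula used (p_i times the count of q_j <= q_i, divided by the count of pairs with p_j <= p_i and q_j <= q_i) is the commonly quoted empirical estimator, assumed here rather than taken from this listing. As the abstract notes, thresholding this estimate does not by itself control the type-1 error rate.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def empirical_cfdr(p, q, i):
    """Assumed empirical cFDR estimate for observation i.

    p: p-values for the trait of interest
    q: p-values for the same hypotheses in the second (covariate) trait
    """
    n_q = sum(1 for qj in q if qj <= q[i])
    # The joint count is at least 1 because observation i counts itself.
    n_pq = sum(1 for pj, qj in zip(p, q) if pj <= p[i] and qj <= q[i])
    return min(1.0, p[i] * n_q / n_pq)


if __name__ == "__main__":
    p = [0.01, 0.5, 0.2, 0.03]
    q = [0.02, 0.9, 0.01, 0.6]
    print(benjamini_hochberg(p, alpha=0.05))
    print([round(empirical_cfdr(p, q, i), 4) for i in range(len(p))])
```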

Empirical Investigation of Type 1 Error Rate of Some Normality Test Statistics

International Journal of Psychosocial Rehabilitation ◽

10.37200/ijpr/v24i4/pr201037 ◽

2020 ◽

Vol 24 (04) ◽

pp. 591-599 ◽

Cited By ~ 1

Author(s):

John O Kuranga ◽

Kayode Ayinde ◽

Gbenga S. Solomon

Keyword(s):

Error Rate ◽

Empirical Investigation ◽

Test Statistics ◽

Type 1 Error ◽

Normality Test

Maximum type 1 error rate inflation in multiarmed clinical trials with adaptive interim sample size modifications

Biometrical Journal ◽

10.1002/bimj.201300153 ◽

2014 ◽

Vol 56 (4) ◽

pp. 614-630 ◽

Cited By ~ 11

Author(s):

Alexandra C. Graf ◽

Peter Bauer ◽

Ekkehard Glimm ◽

Franz Koenig

Keyword(s):

Clinical Trials ◽

Sample Size ◽

Error Rate ◽

Type 1 Error ◽

Multiarmed Clinical Trials

Empirical Investigation of Type 1 Error Rate of Univariate Tests of Normality

International Journal of Computer Applications ◽

10.5120/ijca2016911253 ◽

2016 ◽

Vol 148 (8) ◽

pp. 24-31

Author(s):

Kayode Ayinde ◽

John Olatunde ◽

Gbenga Sunday

Keyword(s):

Error Rate ◽

Empirical Investigation ◽

Type 1 Error

Controlling the type 1 error rate in non-inferiority trials

Statistics in Medicine ◽

10.1002/sim.3072 ◽

2008 ◽

Vol 27 (3) ◽

pp. 371-381 ◽

Cited By ~ 20

Author(s):

Steven Snapinn ◽

Qi Jiang

Keyword(s):

Error Rate ◽

Type 1 Error

Controlling type 1 error rates in genome-wide association studies in plants

Heredity ◽

10.1038/hdy.2012.101 ◽

2012 ◽

Vol 111 (1) ◽

pp. 86-87 ◽

Cited By ~ 5

Author(s):

A W George

Keyword(s):

Association Studies ◽

Error Rates ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Type 1 Error ◽

Genome Wide

Statistical Power in Psychiatric Research

Australian & New Zealand Journal of Psychiatry ◽

10.3109/00048678609161331 ◽

1986 ◽

Vol 20 (2) ◽

pp. 189-200 ◽

Cited By ~ 19

Author(s):

Kevin D. Bird ◽

Wayne Hall

Keyword(s):

Sample Size ◽

Error Rate ◽

Statistical Power ◽

Error Rates ◽

Psychiatric Research ◽

Type 1 Error ◽

Type 2 Error ◽

Power Analyses

Statistical power is neglected in much psychiatric research, with the consequence that many studies do not provide a reasonable chance of detecting differences between groups if they exist in the population. This paper attempts to improve current practice by providing an introduction to the essential quantities required for performing a power analysis (sample size, effect size, type 1 and type 2 error rates). We provide simplified tables for estimating the sample size required to detect a specified size of effect with a type 1 error rate of α and a type 2 error rate of β, and for estimating the power provided by a given sample size for detecting a specified size of effect with a type 1 error rate of α. We show how to modify these tables to perform power analyses for multiple comparisons in univariate and some multivariate designs. Power analyses for each of these types of design are illustrated by examples.
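The relationship between sample size, effect size, α and β described above can be sketched with a normal approximation. This is not a reproduction of the paper's tables: it assumes a two-sided comparison of two equal-sized groups with a standardized effect size d, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, beta=0.20):
    """Approximate sample size per group for a two-sided, two-group test.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(1 - beta)) / effect_size) ** 2)


def power_for_n(effect_size, n, alpha=0.05):
    """Approximate power (1 - beta) given n per group.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(effect_size * (n / 2) ** 0.5 - z_crit)


if __name__ == "__main__":
    n = n_per_group(0.5, alpha=0.05, beta=0.20)
    print(n, round(power_for_n(0.5, n), 3))
```

Exact calculations based on the t distribution give slightly larger sample sizes than this approximation, so the sketch is best read as a lower bound for planning.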

Robustness of testing procedures for confirmatory subpopulation analyses based on a continuous biomarker

Statistical Methods in Medical Research ◽

10.1177/0962280218777538 ◽

2018 ◽

Vol 28 (6) ◽

pp. 1879-1892 ◽

Cited By ~ 7

Author(s):

Alexandra Christine Graf ◽

Gernot Wassmer ◽

Tim Friede ◽

Roland Gerard Gera ◽

Martin Posch

Keyword(s):

Error Rate ◽

Dependence Structure ◽

Sequential Designs ◽

Group Sequential ◽

Testing Procedures ◽

Prognostic Effect ◽

Type 1 Error ◽

The Family ◽

Sequential Trials

With the advent of personalized medicine, clinical trials studying treatment effects in subpopulations are receiving increasing attention. The objectives of such studies are, besides demonstrating a treatment effect in the overall population, to identify subpopulations, based on biomarkers, where the treatment has a beneficial effect. Continuous biomarkers are often dichotomized using a threshold to define two subpopulations with low and high biomarker levels. If there is insufficient information on the dependence structure of the outcome on the biomarker, several thresholds may be investigated. The nested structure of such subpopulations is similar to the structure in group sequential trials. Therefore, it has been proposed to use the corresponding critical boundaries to test such nested subpopulations. We show that for biomarkers with a prognostic effect that is not adjusted for in the statistical model, the variability of the outcome may vary across subpopulations, which may lead to an inflation of the family-wise type 1 error rate. Using simulations, we quantify the potential inflation of testing procedures based on group sequential designs. Furthermore, alternative hypothesis tests that control the family-wise type 1 error rate under minimal assumptions are proposed. The methodological approaches are illustrated by a trial in depression.

Magnitude Based Inference in Relation to One-sided Hypotheses Testing Procedures

10.31236/osf.io/pn9s3 ◽

2020 ◽

Author(s):

Janet Aisbett ◽

Daniel Lakens ◽

Kristin Sainani

Keyword(s):

Error Control ◽

Hypothesis Test ◽

A Priori ◽

Error Rates ◽

Equivalence Testing ◽

High Type ◽

Testing Procedures ◽

Type 1 Error ◽

Sample Size Calculations

Magnitude based inference (MBI) was widely adopted by sport science researchers as an alternative to null hypothesis significance tests. It has been criticized for lacking a theoretical framework, mixing Bayesian and frequentist thinking, and encouraging researchers to run small studies with high Type 1 error rates. MBI terminology describes the position of confidence intervals in relation to smallest meaningful effect sizes. We show these positions correspond to combinations of one-sided tests of hypotheses about the presence or absence of meaningful effects, and formally describe MBI as a multiple decision procedure. MBI terminology operates as if tests are conducted at multiple alpha levels. We illustrate how error rates can be controlled by limiting each one-sided hypothesis test to a single alpha level. To provide transparent error control in a Neyman-Pearson framework and encourage the use of standard statistical software, we recommend replacing MBI with one-sided tests against smallest meaningful effects, or pairs of such tests as in equivalence testing. Researchers should pre-specify their hypotheses and alpha levels, perform a priori sample size calculations, and justify all assumptions. Our recommendations show researchers what tests to use and how to design and report their statistical analyses to accord with standard frequentist practice.
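The recommended replacement, one-sided tests against a smallest meaningful effect or a pair of such tests as in equivalence testing, can be sketched as below. This is a minimal illustration assuming a normally distributed estimate with a known standard error (a z-approximation); the function and parameter names are hypothetical and do not come from the paper.

```python
from statistics import NormalDist


def one_sided_p_values(estimate, std_error, smallest_effect):
    """P-values for the two one-sided tests against +/- smallest_effect."""
    nd = NormalDist()
    # H0: true effect <= -smallest_effect (rejected when estimate is well above it)
    p_lower = 1 - nd.cdf((estimate + smallest_effect) / std_error)
    # H0: true effect >= +smallest_effect (rejected when estimate is well below it)
    p_upper = nd.cdf((estimate - smallest_effect) / std_error)
    return p_lower, p_upper


def is_equivalent(estimate, std_error, smallest_effect, alpha=0.05):
    """Equivalence is declared only if both one-sided tests reject at one alpha."""
    return max(one_sided_p_values(estimate, std_error, smallest_effect)) <= alpha


if __name__ == "__main__":
    print(is_equivalent(0.0, 0.1, 0.3))   # estimate far inside the bounds
    print(is_equivalent(0.25, 0.1, 0.3))  # estimate close to the upper bound
```

Fixing a single pre-specified `alpha` for each one-sided test is the design choice that keeps the error rate controlled, in line with the abstract's point about limiting each test to one alpha level.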

Accurate error control in high‐dimensional association testing using conditional false discovery rates

Biometrical Journal ◽

10.1002/bimj.201900254 ◽

2021 ◽

Author(s):

James Liley ◽

Chris Wallace

Keyword(s):

Error Control ◽

High Dimensional ◽

False Discovery Rates ◽

Association Testing ◽

False Discovery ◽

Discovery Rates

EraSOR: Erase Sample Overlap in polygenic score analyses

10.1101/2021.12.10.472164 ◽

2021 ◽

Author(s):

Shing Wan Choi ◽

Timothy Shin Heng Mak ◽

Clive J. Hoggart ◽

Paul F. O'Reilly

Keyword(s):

Association Studies ◽

Polygenic Risk Score ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Uk Biobank ◽

Type 1 Error ◽

Wide Range ◽

Close Relatedness ◽

Target Data

Background: Polygenic risk score (PRS) analyses are now routinely applied in biomedical research, with great hope that they will aid in our understanding of disease aetiology and contribute to personalized medicine. The continued growth of multi-cohort genome-wide association studies (GWASs) and large-scale biobank projects has provided researchers with a wealth of GWAS summary statistics and individual-level data suitable for performing PRS analyses. However, as the size of these studies increases, the risk of inter-cohort sample overlap and close relatedness increases. Ideally sample overlap would be identified and removed directly, but this is typically not possible due to privacy laws or consent agreements. This sample overlap, whether known or not, is a major problem in PRS analyses because it can lead to inflation of type 1 error and, thus, erroneous conclusions in published work. Results: Here, for the first time, we report the scale of the sample overlap problem for PRS analyses by generating known sample overlap across sub-samples of the UK Biobank data, which we then use to produce GWAS and target data to mimic the effects of inter-cohort sample overlap. We demonstrate that inter-cohort overlap results in a significant and often substantial inflation in the observed PRS-trait association, coefficient of determination (R²) and false-positive rate. This inflation can be high even when the absolute number of overlapping individuals is small if this makes up a notable fraction of the target sample. We develop and introduce EraSOR (Erase Sample Overlap and Relatedness), software for adjusting inflation in PRS prediction and association statistics in the presence of sample overlap or close relatedness between the GWAS and target samples. A key component of the EraSOR approach is inference of the degree of sample overlap from the intercept of a bivariate LD score regression applied to the GWAS and target data, making it powered in settings where both have sample sizes over 1,000 individuals. Through extensive benchmarking using UK Biobank and HapGen2 simulated genotype-phenotype data, we demonstrate that PRSs calculated using EraSOR-adjusted GWAS summary statistics are robust to inter-cohort overlap in a wide range of realistic scenarios and are even robust to high levels of residual genetic and environmental stratification. Conclusion: The results of all PRS analyses for which sample overlap cannot be definitively ruled out should be considered with caution given the high type 1 error observed in the presence of even low overlap between base and target cohorts. Given the strong performance of EraSOR in eliminating inflation caused by sample overlap in PRS studies with large (>5k) target samples, we recommend that EraSOR be used in all future such PRS studies to mitigate the potential effects of inter-cohort overlap and close relatedness.
