Estimation of false discovery proportion in multiple testing: From normal to chi-squared test statistics

2017 ◽  
Vol 11 (1) ◽  
pp. 1048-1091
Author(s):  
Lilun Du ◽  
Chunming Zhang
2000 ◽  
Vol 25 (1) ◽  
pp. 60-83 ◽  
Author(s):  
Yoav Benjamini ◽  
Yosef Hochberg

A new approach to problems of multiple significance testing was presented in Benjamini and Hochberg (1995), which calls for controlling the expected ratio of the number of erroneous rejections to the number of rejections, the False Discovery Rate (FDR). The procedure given there was shown to control the FDR for independent test statistics. When some of the hypotheses are in fact false, that procedure is too conservative. We present here an adaptive procedure, in which the number of true null hypotheses is estimated first, as in Hochberg and Benjamini (1990), and this estimate is used in the procedure of Benjamini and Hochberg (1995). The result is still a simple stepwise procedure, for which we also give a graphical companion. The new procedure is applied in several examples drawn from educational and behavioral studies, addressing problems in multi-center studies, subset analysis, and meta-analysis. The examples vary in the number of hypotheses tested and in the implications of the new procedure for the conclusions. In a large simulation study of independent test statistics, the adaptive procedure is shown to control the FDR and to have substantially better power than the previously suggested FDR-controlling method, which is itself more powerful than the traditional familywise error rate controlling methods. In cases where most of the tested hypotheses are far from being true, there is hardly any penalty due to the simultaneous testing of many hypotheses.
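The two-stage idea in this abstract can be put in a few lines of code: run the classical step-up procedure, but first estimate the number of true nulls m0 by the lowest-slope method and test at the inflated level q·m/m0. This is an illustrative sketch, not the authors' exact implementation; tie-breaking details of the m0 estimator vary across presentations.

```python
import numpy as np

def bh_procedure(pvals, q):
    """Classical Benjamini-Hochberg step-up procedure at level q.
    Returns a boolean array of rejections."""
    m = len(pvals)
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    # find the largest k with p_(k) <= k * q / m, reject the k smallest
    thresholds = q * np.arange(1, m + 1) / m
    below = sorted_p <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

def adaptive_bh(pvals, q):
    """Adaptive BH: estimate m0 by the lowest-slope idea of
    Hochberg and Benjamini (1990), then run BH at level q * m / m0."""
    m = len(pvals)
    sorted_p = np.sort(pvals)
    # slopes S_i = (1 - p_(i)) / (m + 1 - i); stop at the first decrease
    i = np.arange(1, m + 1)
    slopes = (1 - sorted_p) / (m + 1 - i)
    m0 = m
    for j in range(1, m):
        if slopes[j] < slopes[j - 1]:
            m0 = min(m, int(1 / max(slopes[j], 1e-12)) + 1)
            break
    return bh_procedure(pvals, q * m / max(m0, 1))
```

When the data look all-null, the estimate m0 is close to m and the adaptive procedure reduces to the classical one; the power gain appears when many hypotheses are false and m0 is much smaller than m.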


Biometrika ◽  
2020 ◽  
Vol 107 (3) ◽  
pp. 761-768 ◽  
Author(s):  
E Dobriban

Summary: Multiple hypothesis testing problems arise naturally in science. This note introduces a new fast closed testing method for multiple testing that controls the familywise error rate. Controlling the familywise error rate is state-of-the-art in many important application areas and is preferred over false discovery rate control for many reasons, including that it leads to stronger reproducibility. The closure principle rejects an individual hypothesis if all global nulls of subsets containing it are rejected using some test statistics; it takes exponential time in the worst case. When the tests are symmetric and monotone, the proposed method is an exact algorithm for computing the closure, is quadratic in the number of tests, and is linear in the number of discoveries. Our framework generalizes most examples of closed testing, such as Holm's method and the Bonferroni method. As a special case of the method, we propose the Simes and higher criticism fusion test, which is powerful both for detecting a few strong signals and for detecting many moderate signals.
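Holm's method, named in the abstract as a special case of closed testing, is simple enough to state concretely: sort the p-values, compare the k-th smallest against alpha/(m − k + 1), and stop at the first failure. A minimal sketch:

```python
import numpy as np

def holm(pvals, alpha):
    """Holm's step-down procedure, a classical closed-testing method
    controlling the familywise error rate."""
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for step, idx in enumerate(order):
        # compare the (step+1)-th smallest p-value against alpha / (m - step)
        if pvals[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p-values fail too
    return reject
```

The first comparison, alpha/m, is exactly the Bonferroni threshold, which is why Holm uniformly improves on Bonferroni while keeping the same error guarantee.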


2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Yongchao Ge ◽  
Xiaochun Li

Consider the multiple testing problem of testing m null hypotheses H1, …, Hm, among which m0 hypotheses are truly null. Given the P-values for each hypothesis, the question of interest is how to combine the P-values to find out which hypotheses are false nulls and possibly to make a statistical inference on m0. Benjamini and Hochberg proposed a classical procedure that can control the false discovery rate (FDR). The FDR control is somewhat unsatisfactory in that it concerns only the expectation of the false discovery proportion (FDP). The control of the actual random variable FDP has recently drawn much attention. For any level 1 − α, this paper proposes a procedure to construct an upper prediction bound (UPB) for the FDP for a fixed rejection region. When 1 − α = 50%, our procedure is very close to the classical Benjamini and Hochberg procedure. Simultaneous UPBs for the FDPs of all rejection regions and the upper confidence bound for the unknown m0 are presented consequently. This new procedure works for finite samples and hence avoids the slow convergence problem of the asymptotic theory.
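The distinction this abstract draws, between the FDP as a random variable and the FDR as its expectation, is easy to see in a small simulation with a fixed rejection region [0, t]. All counts and distributions below are hypothetical choices for illustration, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
m, m0, t = 1000, 800, 0.05   # hypothetical m, m0, and rejection region [0, t]

fdps = []
for _ in range(200):
    p_null = rng.uniform(size=m0)            # true nulls: uniform p-values
    p_alt = rng.beta(0.2, 5.0, size=m - m0)  # false nulls: skewed toward zero
    v = np.sum(p_null <= t)                  # false rejections
    r = v + np.sum(p_alt <= t)               # total rejections
    fdps.append(v / max(r, 1))               # realized FDP in this replicate

fdps = np.array(fdps)
print(f"mean FDP (the FDR): {fdps.mean():.3f}, "
      f"95th percentile: {np.percentile(fdps, 95):.3f}")
```

The 95th percentile here plays the role of an empirical upper prediction bound at level 1 − α = 95%: individual realizations of the FDP scatter around the FDR, which is why controlling only the mean can understate the false positive burden in any single experiment.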


2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Shulian Shang ◽  
Qianhe Zhou ◽  
Mengling Liu ◽  
Yongzhao Shao

The false discovery proportion (FDP), the proportion of incorrect rejections among all rejections, is a direct measure of the abundance of false positive findings in multiple testing. Many methods have been proposed to control the FDP, but they are too conservative to be useful for power analysis. Study designs controlling the mean of the FDP, which is the false discovery rate, have been commonly used. However, there has been little attempt to design studies with direct FDP control to achieve a certain level of efficiency. We provide a sample size calculation method that uses the variance formula of the FDP under weak-dependence assumptions to achieve the desired overall power. The relationship between design parameters and sample size is explored. The adequacy of the procedure is assessed by simulation. We illustrate the method using estimated correlations from a prostate cancer dataset.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangyoon Yi ◽  
Xianyang Zhang ◽  
Lu Yang ◽  
Jinyan Huang ◽  
Yuanhang Liu ◽  
...  

Abstract: One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the traditional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we demonstrate that 2dFDR is more powerful than the traditional procedure, and that in the presence of strong confounding and weak signals the power improvement can be more than 100%.
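The "traditional procedure" this abstract refers to can be sketched directly: fit a confounder-adjusted linear model for each feature, collect the p-values for the covariate of interest, and pass them to a multiple testing correction such as Benjamini-Hochberg. The variable names below are illustrative, and this is the baseline the abstract criticizes, not the 2dFDR method itself:

```python
import numpy as np
from scipy import stats

def per_feature_pvalues(Y, x, z):
    """For each omics feature y_j (columns of Y), fit y_j ~ 1 + x + z by
    ordinary least squares, where z is a confounder, and return the
    two-sided p-value for the coefficient of x."""
    n, p = Y.shape
    X = np.column_stack([np.ones(n), x, z])   # intercept, exposure, confounder
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ Y                  # (3, p) coefficient matrix
    resid = Y - X @ beta
    dof = n - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof   # per-feature error variance
    se = np.sqrt(sigma2 * XtX_inv[1, 1])      # standard error of the x coefficient
    tstat = beta[1] / se
    return 2 * stats.t.sf(np.abs(tstat), dof)
```

Fitting all p regressions at once through the shared design matrix keeps the confounder adjustment honest while staying vectorized; the resulting p-value vector is what a one-dimensional FDR procedure would then threshold.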


2012 ◽  
Vol 02 (02) ◽  
pp. 163-171 ◽  
Author(s):  
Shulian Shang ◽  
Mengling Liu ◽  
Yongzhao Shao

2009 ◽  
Vol 36 (4) ◽  
pp. 397-418
Author(s):  
Marcin Dudziński ◽  
Konrad Furmańczyk

2021 ◽  
Author(s):  
Ronald J Yurko ◽  
Kathryn Roeder ◽  
Bernie Devlin ◽  
Max G'Sell

In genome-wide association studies (GWAS), it has become commonplace to test millions of SNPs for phenotypic association. Gene-based testing can improve power to detect weak signals by reducing multiple testing and pooling signal strength. While such tests account for the linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture the LD of SNPs falling in different nearby genes, which can induce correlation of gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of the others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into the SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signals. We demonstrate our approach in the context of weakly powered GWAS for autism spectrum disorder, contrasted with more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive p-value thresholding (AdaPT), guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably, our workflow is based on summary statistics.
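The agglomeration step described here, singleton loci for independent genes and merged loci for strongly correlated neighbors, amounts to taking connected components of a thresholded correlation graph. The sketch below uses a hypothetical threshold; the paper's actual merging criterion may differ:

```python
import numpy as np

def agglomerate_genes(corr, threshold=0.5):
    """Assign genes to loci: connected components of the graph with an
    edge wherever |corr| > threshold. A gene uncorrelated with all
    others becomes its own singleton locus. Returns integer labels."""
    g = corr.shape[0]
    labels = -np.ones(g, dtype=int)
    cur = 0
    for i in range(g):
        if labels[i] >= 0:
            continue                      # already assigned to a locus
        stack = [i]
        labels[i] = cur
        while stack:                      # depth-first search over neighbors
            j = stack.pop()
            for k in range(g):
                if labels[k] < 0 and abs(corr[j, k]) > threshold:
                    labels[k] = cur
                    stack.append(k)
        cur += 1
    return labels
```

Each resulting label groups the genes whose SNPs would be pooled and tested together as one locus, while isolated genes keep their separate gene-based tests.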

