Robust inference from multiple test statistics via permutations: a better alternative to the single test statistics approach for randomized trials

2016 ◽  
Vol 15 (2) ◽  
pp. 193-193
Author(s):  
Jitendra Ganju ◽  
Xinxin Yu ◽  
Guoguang Julie Ma
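The title above refers to combining several test statistics through permutations in a randomized trial. A minimal sketch of one standard variant of this idea, a max-combination permutation test — the candidate statistics and the max combining rule here are illustrative choices, not necessarily the authors' construction:

```python
import numpy as np

def permutation_max_test(y, group, stat_fns, n_perm=2000, seed=0):
    # Combine several candidate test statistics by taking their maximum,
    # then calibrate the result by re-randomizing the treatment labels.
    rng = np.random.default_rng(seed)
    observed = max(f(y, group) for f in stat_fns)
    exceed = 0
    for _ in range(n_perm):
        g = rng.permutation(group)  # re-randomized treatment assignment
        exceed += max(f(y, g) for f in stat_fns) >= observed
    return (1 + exceed) / (1 + n_perm)  # permutation p-value

# Two candidate statistics: absolute difference in means and in medians.
mean_diff = lambda y, g: abs(y[g == 1].mean() - y[g == 0].mean())
median_diff = lambda y, g: abs(np.median(y[g == 1]) - np.median(y[g == 0]))

rng = np.random.default_rng(1)
group = np.repeat([0, 1], 100)
y_shift = rng.normal(size=200) + group  # true treatment effect of 1
p_shift = permutation_max_test(y_shift, group, [mean_diff, median_diff])
```

Because the same permutations calibrate the maximum of all candidate statistics jointly, the test stays valid no matter how many statistics are combined.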
2018 ◽  
Vol 33 (5) ◽  
pp. 2019-2133 ◽  
Author(s):  
Kewei Hou ◽  
Chen Xue ◽  
Lu Zhang

Abstract: Most anomalies fail to hold up to currently acceptable standards for empirical finance. With microcaps mitigated via NYSE breakpoints and value-weighted returns, 65% of the 452 anomalies in our extensive data library, including 96% of the trading frictions category, cannot clear the single-test hurdle of an absolute $t$-value of 1.96. Imposing the higher multiple-test hurdle of 2.78 at the 5% significance level raises the failure rate to 82%. Even for replicated anomalies, their economic magnitudes are much smaller than originally reported. In all, capital markets are more efficient than previously recognized. Received June 12, 2017; editorial decision October 29, 2018 by Editor Stijn Van Nieuwerburgh. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.
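The two hurdles can be illustrated directly. The t-statistics below are made up for the example, not taken from the paper's anomaly library:

```python
import numpy as np

# Hypothetical anomaly t-statistics (illustrative values only).
t_stats = np.array([1.2, 2.1, 3.5, 0.8, 2.9, 1.7, 4.2, 2.5])

single_hurdle = 1.96    # conventional 5% two-sided cutoff for one test
multiple_hurdle = 2.78  # higher cutoff accounting for multiple testing

fail_single = np.mean(np.abs(t_stats) < single_hurdle)
fail_multiple = np.mean(np.abs(t_stats) < multiple_hurdle)
print(f"fail single hurdle: {fail_single:.0%}, fail multiple: {fail_multiple:.0%}")
```

Raising the cutoff from 1.96 to 2.78 mechanically moves every t-statistic in between from "replicated" to "failed", which is exactly the margin the paper quantifies.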


2019 ◽  
Vol 10 (4) ◽  
pp. 1787-1824 ◽  
Author(s):  
Qihui Chen ◽  
Zheng Fang

This paper develops a general framework for conducting inference on the rank of an unknown matrix Π₀. A defining feature of our setup is the null hypothesis of the form H₀: rank(Π₀) ≤ r. The problem is of first-order importance because the previous literature focuses on H₀′: rank(Π₀) = r, implicitly assuming away rank(Π₀) < r, which may lead to invalid rank tests due to overrejection. In particular, we show that limiting distributions of test statistics under H₀′ may not stochastically dominate those under rank(Π₀) < r. A multiple test on the nulls rank(Π₀) = 0, …, r, though valid, may be substantially conservative. We employ a test statistic whose limiting distributions under H₀ are highly nonstandard due to the inherently irregular nature of the problem, and then construct bootstrap critical values that deliver size control and improved power. Since our procedure relies on a tuning parameter, a two-step procedure is designed to mitigate concerns about this nuisance. We additionally argue that our setup is also important for estimation. We illustrate the empirical relevance of our results by testing identification in linear IV models that allow for clustered data and by conducting inference on sorting dimensions in a two-sided matching model with transferable utility.
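A hedged sketch of the kind of singular-value-based statistic involved — the paper's exact construction and bootstrap differ; this only illustrates why a statistic of this shape separates H₀: rank(Π₀) ≤ r from its alternative:

```python
import numpy as np

def rank_test_stat(Pi_hat, r, n):
    # n times the sum of squared singular values beyond the r-th: for a
    # consistent estimate Pi_hat, this stays stochastically bounded when
    # rank(Pi0) <= r and diverges when rank(Pi0) > r.
    s = np.linalg.svd(Pi_hat, compute_uv=False)
    return n * float(np.sum(s[r:] ** 2))

# A rank-1 matrix: the statistic is (numerically) zero at r = 1 and
# strictly positive at r = 0.
Pi = np.outer([1.0, 2.0, 3.0], [0.5, 1.0])
stat_r1 = rank_test_stat(Pi, 1, n=100)
stat_r0 = rank_test_stat(Pi, 0, n=100)
```

The irregularity the abstract mentions arises because the limiting law of such a statistic depends on the true rank, which is precisely what is unknown — hence the bootstrap critical values.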


Biostatistics ◽  
2018 ◽  
Vol 21 (3) ◽  
pp. 483-498 ◽  
Author(s):  
Charles E McCulloch ◽  
John M Neuhaus

Summary: With the advent of electronic health records, information collected in the course of regular health care is increasingly being used for clinical research. The hope is that the wealth of clinical data and the realistic setting (compared with information derived from highly controlled experiments like randomized trials) will aid in the investigation of determinants of disease and understanding of which treatments are effective in regular practice and for which patients. The availability of information in such databases is often driven by how a patient feels and may therefore be associated with the health outcomes being considered. We call this an outcome dependent visit process, and recent work has shown that ignoring the outcome dependence can produce significant bias in the regression coefficients when fitting longitudinal data models. It is therefore important to have tools to recognize datasets exhibiting outcome dependence. We develop a score statistic to motivate the form of diagnostic test statistics, suggest a variety of approaches for diagnosing such situations, and evaluate their performance. Simple diagnostic tests achieve high power for diagnosing outcome dependent visit processes. This occurs when generalized estimating equations methods begin to exhibit bias in estimating regression coefficients and before likelihood-based methods are substantially biased.
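A minimal simulation of an outcome dependent visit process, together with the crudest possible diagnostic. The score-based tests in the paper are more refined; the coefficients and setup here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_times = 500, 10
y = rng.normal(size=(n_subjects, n_times)).cumsum(axis=1)  # latent outcomes

# Outcome dependent visits: a higher current outcome raises the
# probability that the subject shows up and the outcome is recorded.
p_visit = 1.0 / (1.0 + np.exp(-0.8 * y))
visit = rng.random((n_subjects, n_times)) < p_visit

# Crude diagnostic: compare outcomes at visited vs non-visited occasions.
# Under an outcome independent visit process this difference is near zero.
diff = y[visit].mean() - y[~visit].mean()
print(f"mean outcome at visits minus non-visits: {diff:.2f}")
```

In practice the non-visited outcomes are unobserved, which is why the paper builds its diagnostics from observable features of the visit process rather than from this oracle comparison.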


2020 ◽  
Vol 8 (1) ◽  
pp. 172-185
Author(s):  
Nico Steffen ◽  
Thorsten Dickhaus

Abstract: In the multiple testing context, we utilize vine copulae to optimize the effective number of tests. It is well known that, for the calibration of multiple tests for control of the family-wise error rate, the dependencies between the marginal tests are of utmost importance. Previous work has shown that positive dependencies between the marginal tests can be exploited to derive a relaxed Šidák-type multiplicity correction. This correction can conveniently be expressed by calculating the corresponding "effective number of tests" for a given (global) significance level. The methodology can also be applied to blocks of test statistics, so that the overall effective number of tests is the sum of the effective numbers of tests for each block. In the present work, we demonstrate how the power of the multiple test can be optimized by choosing blocks with high inner-block dependencies. Those blocks are determined by means of an estimated vine copula model. An algorithm is presented which uses the information in the estimated vine copula to make a data-driven choice of appropriate blocks in terms of (estimated) dependencies. Numerical experiments demonstrate the usefulness of the proposed approach.
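To make the "effective number of tests" idea concrete, here is a Monte Carlo sketch under a simple equicorrelated Gaussian model standing in for the estimated vine-copula dependence model. All parameter values (m, rho, the sample size) are illustrative:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
m, rho, alpha = 20, 0.6, 0.05

# Equicorrelated Gaussian test statistics as a stand-in dependence model.
cov = rho * np.ones((m, m)) + (1 - rho) * np.eye(m)
z = rng.multivariate_normal(np.zeros(m), cov, size=50_000)

phi_sf = np.vectorize(lambda t: 0.5 * math.erfc(t / math.sqrt(2)))  # upper tail
p_min = (2 * phi_sf(np.abs(z))).min(axis=1)  # smallest two-sided p-value

# Sidak rejects when min p < 1 - (1 - alpha)^(1/M); the effective number
# of tests is the M matching the true alpha-quantile of min p under the
# dependence model. Positive dependence pushes it below the nominal m.
q = np.quantile(p_min, alpha)
m_eff = np.log(1 - alpha) / np.log(1 - q)
print(f"nominal tests: {m}, effective number of tests: {m_eff:.1f}")
```

With independent statistics the effective number would recover m itself; strong positive dependence shrinks it, which is exactly the relaxation the Šidák-type correction exploits.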


2021 ◽  
Vol 118 (15) ◽  
pp. e2014602118
Author(s):  
Vitor Hadad ◽  
David A. Hirshberg ◽  
Ruohan Zhan ◽  
Stefan Wager ◽  
Susan Athey

Adaptive experimental designs can dramatically improve efficiency in randomized trials. But with adaptively collected data, common estimators based on sample means and inverse propensity-weighted means can be biased or heavy-tailed. This poses statistical challenges, in particular when the experimenter would like to test hypotheses about parameters that were not targeted by the data-collection mechanism. In this paper, we present a class of test statistics that can handle these challenges. Our approach is to adaptively reweight the terms of an augmented inverse propensity-weighting estimator to control the contribution of each term to the estimator’s variance. This scheme reduces overall variance and yields an asymptotically normal test statistic. We validate the accuracy of the resulting estimates and their confidence intervals (CIs) in numerical experiments and show that our methods compare favorably to existing alternatives in terms of mean squared error, coverage, and CI size.
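A hedged sketch of the reweighting idea for a single arm's mean. The function name, the square-root-propensity weights, and the toy data are illustrative; the paper's estimator and its variance-targeting weight choice are more elaborate:

```python
import numpy as np

def weighted_aipw_mean(y, w, e, mu_hat, h):
    # AIPW scores are unbiased term by term; the nonnegative weights h
    # damp terms whose small propensities e would make them high-variance.
    gamma = mu_hat + w * (y - mu_hat) / e
    return float(np.sum(h * gamma) / np.sum(h))

rng = np.random.default_rng(2)
n = 2000
e = np.clip(rng.beta(2, 2, size=n), 0.05, 0.95)  # drifting assignment probs
w = (rng.random(n) < e).astype(float)            # adaptive treatment draws
y = 1.0 + w * rng.normal(size=n)                 # treated-arm mean is 1.0
mu_hat = np.full(n, 1.0)                         # outcome-model predictions
est = weighted_aipw_mean(y, w, e, mu_hat, h=np.sqrt(e))
```

Because each AIPW score has mean equal to the arm mean regardless of e, any nonnegative weighting keeps the estimate unbiased in expectation while trading off which terms dominate the variance.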


1999 ◽  
Vol 15 (5) ◽  
pp. 710-718 ◽  
Author(s):  
Wm. Brent Boning ◽  
Fallaw Sowell

This paper proposes a version of the integrated conditional moment (ICM) test that is optimal for a class of composite alternatives. The ICM test is built on the fact that a random function based on a correctly specified model should have zero mean, whereas any misspecification in the conditional mean implies a divergent mean for the random function. We derive test statistics that are optimal for each basis element of an orthonormal decomposition of the function space of which the random function is an element. We then use a weighted summation of these test statistics to form a single test statistic that is optimal for any pair of alternatives that are symmetric about zero. This test is equivalent to using a particular measure in the ICM test of Bierens and Ploberger (1997, Econometrica 65, 1129–1152).
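A rough sketch of the ingredients: per-basis-element statistics built from residuals, combined with decaying weights. The cosine basis and geometric weights are illustrative stand-ins for the optimal choices derived in the paper:

```python
import numpy as np

def icm_style_stat(x, resid, n_basis=10, decay=0.9):
    # T_j = n^{-1/2} * sum_i resid_i * phi_j(x_i); misspecification of the
    # conditional mean makes some T_j diverge, so a weighted sum of T_j^2
    # picks up any alternative with mass on the basis elements.
    n = len(x)
    u = (x - x.min()) / (x.max() - x.min())  # map regressor to [0, 1]
    total = 0.0
    for j in range(1, n_basis + 1):
        phi = np.sqrt(2.0) * np.cos(np.pi * j * u)  # orthonormal cosine basis
        total += decay ** (j - 1) * (resid @ phi) ** 2 / n
    return float(total)

rng = np.random.default_rng(3)
x = rng.uniform(size=2000)
stat_null = icm_style_stat(x, rng.normal(size=2000))  # correct model
stat_alt = icm_style_stat(x, np.cos(np.pi * x) + rng.normal(size=2000))
```

Under correct specification each squared term stays bounded in probability, while any omitted structure aligned with a basis element grows linearly in n — the divergence the abstract describes.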

