False discovery control in large-scale spatial multiple testing

Author(s):  
Wenguang Sun ◽  
Brian J. Reich ◽  
T. Tony Cai ◽  
Michele Guindani ◽  
Armin Schwartzman
2018 ◽  
Vol 113 (523) ◽  
pp. 1172-1183 ◽  
Author(s):  
Pallavi Basu ◽  
T. Tony Cai ◽  
Kiranmoy Das ◽  
Wenguang Sun

2019 ◽  
Author(s):  
David C. Handler ◽  
Paul A. Haynes

The multiple testing problem is a well-known statistical stumbling block in high-throughput data analysis, where large-scale repetition of statistical tests introduces unwanted noise into the results. While approaches exist to overcome the multiple testing problem, these methods focus on theoretical statistical clarification rather than incorporating experimentally derived measures to ensure appropriately tailored analysis parameters. Here, we introduce a permutation-based method for estimating inter-replicate variability in the reference samples of a quantitative proteomics experiment. This estimate can serve as a modulator for multiple testing corrections such as the Benjamini-Hochberg ordered Q value test. We refer to this as a 'same-same' analysis, since the method uses six biological replicates of the reference sample and determines, through non-redundant triplet pairwise comparisons, the level of quantitative noise inherent in the system. The method can be used to produce an experiment-specific Q value cut-off that achieves a specified false discovery rate at the quantitation level, such as 1%. The same-same method is applicable to any experimental set that incorporates six replicates of a reference sample. To facilitate access to this approach, we have developed a same-same analysis R module that is freely available and ready to use via the internet.
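
The triplet-splitting scheme is easy to make concrete. The following is a minimal Python sketch of the idea, not the authors' released R module; the (features × 6) input shape, the per-feature t-test, and the quantile-based cutoff rule are illustrative assumptions.

```python
# A minimal sketch of the 'same-same' idea: six reference replicates are
# split into all 10 non-redundant 3-vs-3 comparisons, where every hit is a
# false positive by construction, and the resulting null Q values calibrate
# an experiment-specific cutoff. Not the authors' R module.
import itertools
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

def same_same_q_cutoff(ref, target_fdr=0.01):
    """ref: (n_features, 6) array of quantitation values for one reference sample."""
    null_qs = []
    # Fixing replicate 0 in group A enumerates each 3-vs-3 partition once (10 total).
    for group_a in itertools.combinations(range(6), 3):
        if 0 not in group_a:
            continue
        group_b = [r for r in range(6) if r not in group_a]
        _, pvals = stats.ttest_ind(ref[:, list(group_a)], ref[:, group_b], axis=1)
        null_qs.append(multipletests(pvals, method="fdr_bh")[1])
    null_qs = np.concatenate(null_qs)
    # Q cutoff below which only `target_fdr` of the null comparisons fall.
    return np.quantile(null_qs, target_fdr)
```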


2016 ◽  
Vol 2016 ◽  
pp. 1-7
Author(s):  
Oluyemi Oyeniran ◽  
Hanfeng Chen

The problem of estimating the proportion, π0, of true null hypotheses in a multiple testing problem is important when large-scale parallel hypothesis tests are performed independently. While π0 is a quantity of interest in its own right in applications, its estimate can also be used for assessing or controlling an overall false discovery rate. In this article, we develop an innovative nonparametric maximum likelihood approach to estimate π0. The nonparametric likelihood is restricted to multinomial models, and an EM algorithm is developed to approximate the estimate of π0. Simulation studies show that the proposed method outperforms existing methods. Using experimental microarray datasets, we demonstrate that the new method provides satisfactory estimates in practice.
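
For orientation, the sketch below shows a classical baseline for this estimation problem, Storey's tail-based estimator of π0; it is not the authors' multinomial nonparametric-MLE/EM method, whose details go beyond the abstract.

```python
# Storey (2002) tail-based estimator of the null proportion pi0, shown as
# a baseline for context only (not the EM estimator described above).
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """p-values above `lam` come mostly from true nulls, which are
    uniform on [0, 1], so their count scaled by 1/(1 - lam) estimates m0."""
    pvals = np.asarray(pvals)
    return min(1.0, np.mean(pvals > lam) / (1.0 - lam))
```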


2011 ◽  
Vol 311-313 ◽  
pp. 1661-1666
Author(s):  
Pei Jin ◽  
Jian Zhang

Several biomaterials have been widely used in the treatment of cancer. However, how these biomaterials alter gene expression is poorly understood. Identifying genes that are differentially expressed across varying biological conditions, or in response to different biomaterials, based on microarray data is a typical multiple testing problem. In this paper, we focus on FDR control for large-scale multiple testing problems: using our proposed statistics and a resampling method, we provide a powerful FDR-controlling procedure for large-scale multiple testing. Simulations show that our fiducial estimator is more accurate and stable than five other traditional methods, with satisfactory FDR control. In particular, we propose a generally applicable version of the procedure for identifying differentially expressed genes in microarray experiments. This microarray method consistently shows favorable performance over existing methods. For example, in testing for differential expression between two breast cancer tumor types, the proposed procedure increases the number of genes called significant at a false discovery rate of 3% by 37% to 127%.
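
Since the abstract does not specify the fiducial statistic, the sketch below illustrates only the generic resampling logic behind permutation-based FDR estimation: shuffle group labels to approximate the null distribution of a per-gene statistic, then compare null and observed call counts. The function names and the simple Welch-type statistic are assumptions.

```python
# Generic permutation plug-in FDR estimate; not the paper's fiducial method.
import numpy as np

def permutation_fdr(X, labels, threshold, n_perm=200, seed=0):
    """X: (genes, samples); labels: 0/1 array of group membership.
    Estimates the FDR of calling |t| >= threshold significant."""
    rng = np.random.default_rng(seed)

    def tstats(y):
        a, b = X[:, y == 0], X[:, y == 1]
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] +
                     b.var(axis=1, ddof=1) / b.shape[1])
        return (a.mean(axis=1) - b.mean(axis=1)) / se

    n_called = max(int((np.abs(tstats(labels)) >= threshold).sum()), 1)
    # Average number of calls under label permutation approximates the
    # expected number of false positives at this threshold.
    null_calls = np.mean([(np.abs(tstats(rng.permutation(labels))) >= threshold).sum()
                          for _ in range(n_perm)])
    return min(1.0, null_calls / n_called)
```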


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangyoon Yi ◽  
Xianyang Zhang ◽  
Lu Yang ◽  
Jinyan Huang ◽  
Yuanhang Liu ◽  
...  

One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the traditional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we demonstrate that 2dFDR is more powerful than the traditional procedure, and that in the presence of strong confounding and weak signals, the power improvement can be more than 100%.
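
The traditional procedure that 2dFDR is compared against can be written down directly. Below is a minimal Python sketch of that baseline (not of 2dFDR itself), assuming a per-feature ordinary least squares fit and Benjamini-Hochberg correction; the variable names are illustrative.

```python
# The traditional baseline: one confounder-adjusted linear model per omics
# feature, then BH correction of the exposure p-values.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

def traditional_procedure(Y, exposure, confounders, alpha=0.05):
    """Y: (samples, features); exposure: (samples,); confounders: (samples, q)."""
    X = sm.add_constant(np.column_stack([exposure, confounders]))
    pvals = np.empty(Y.shape[1])
    for j in range(Y.shape[1]):
        fit = sm.OLS(Y[:, j], X).fit()
        pvals[j] = fit.pvalues[1]          # coefficient on the exposure
    reject, qvals, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return reject, qvals
```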


2000 ◽  
Vol 25 (1) ◽  
pp. 60-83 ◽  
Author(s):  
Yoav Benjamini ◽  
Yosef Hochberg

A new approach to problems of multiple significance testing was presented in Benjamini and Hochberg (1995), which calls for controlling the expected ratio of the number of erroneous rejections to the number of rejections, the False Discovery Rate (FDR). The procedure given there was shown to control the FDR for independent test statistics. When some of the hypotheses are in fact false, that procedure is too conservative. We present here an adaptive procedure, in which the number of true null hypotheses is estimated first as in Hochberg and Benjamini (1990), and this estimate is used in the procedure of Benjamini and Hochberg (1995). The result is still a simple stepwise procedure, for which we also give a graphical companion. The new procedure is applied in several examples drawn from educational and behavioral studies, addressing problems in multi-center studies, subset analysis, and meta-analysis. The examples vary in the number of hypotheses tested and in the implications of the new procedure for the conclusions. In a large simulation study of independent test statistics, the adaptive procedure is shown to control the FDR and to have substantially better power than the previously suggested FDR-controlling method, which is itself more powerful than the traditional familywise error-rate-controlling methods. In cases where most of the tested hypotheses are far from being true, there is hardly any penalty due to the simultaneous testing of many hypotheses.
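
The adaptive step-up idea can be sketched in a few lines. The following Python sketch replaces m with an estimate of the number of true nulls in the 1995 step-up thresholds; the simple count-based m0 estimator is an illustrative stand-in for the slope-based Hochberg and Benjamini (1990) estimate used in the paper.

```python
# Adaptive BH step-up: estimate m0, then reject p_(i) <= i*q/m0.
import numpy as np

def adaptive_bh(pvals, q=0.05):
    pvals = np.asarray(pvals)
    p = np.sort(pvals)
    m = len(p)
    # Illustrative m0 estimate: under the null, about half of the null
    # p-values exceed 0.5, so m0 is roughly twice the count above 0.5.
    m0 = min(m, 2 * int((p > 0.5).sum()) + 1)
    passed = np.nonzero(p <= q * np.arange(1, m + 1) / m0)[0]
    if passed.size == 0:
        return np.zeros(m, dtype=bool)
    return pvals <= p[passed.max()]   # reject everything up to the cutoff
```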


2017 ◽  
Vol 46 (6) ◽  
pp. 284-292 ◽  
Author(s):  
Denis G. Dumas ◽  
Daniel M. McNeish

Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the potential of students from nondominant socioeconomic or ethnic groups, who may not have had adequate opportunity to develop various academic skills but can nonetheless do so in the future. One long-standing approach to the partial rectification of this issue is dynamic assessment (DA), a technique that features multiple testing occasions integrated with learning opportunities. However, DA is extremely resource intensive to incorporate into educational assessment practice and cannot be applied to extant large-scale data sets. In this article, the authors describe a recently developed statistical technique, dynamic measurement modeling (DMM), which is capable of estimating quantities associated with DA—including student capacity for learning a particular skill—from existing large-scale longitudinal assessment data, allowing the core concepts of DA to be scaled up for use with secondary data sets such as those collected by Statewide Longitudinal Data Systems in the United States. The authors show that by considering several assessments over time, student capacity can be reliably estimated, and these capacity estimates are much less affected by student race/ethnicity, gender, and socioeconomic status than are single-timepoint assessment scores, thereby improving the consequential validity of measurement.


Circulation ◽  
2014 ◽  
Vol 129 (suppl_1) ◽  
Author(s):  
Tanika N Kelly ◽  
Xueli Yang ◽  
Dabeeru C Rao ◽  
Dongfeng Gu ◽  
James E Hixson ◽  
...  

We examined the associations of common variants in ENaC genes with BP changes and hypertension incidence in a longitudinal family study. A total of 2755 Han Chinese participants of the Genetic Epidemiology Network of Salt Sensitivity (GenSalt) baseline examination were eligible for the current study. The average of nine BP measurements obtained using a random-zero sphygmomanometer at baseline and at each of two follow-up visits was used for the current analysis. The associations of 43 tag-SNPs in ENaC genes with BP changes and hypertension incidence were assessed using mixed models to account for the correlations of repeated measures within individuals and among family members. A genotype-by-time interaction term was used to model differences in BP change according to genotype over time. The false discovery rate (FDR) method was used to adjust for multiple testing. During an average of 7.4 years of follow-up, systolic BP (SBP) and diastolic BP (DBP) increased, and approximately 33% of participants developed hypertension. Multiple variants in ENaC genes were significantly associated with longitudinal changes in SBP after adjustment for multiple testing, including SCNN1A SNP rs11064153 (P-interaction = 5.8×10⁻⁴; FDR-Q = 0.01); SCNN1G SNPs rs4299163 (P-interaction = 0.011; FDR-Q = 0.05) and rs4401050 (P-interaction = 0.001; FDR-Q = 0.01); and SCNN1B SNP rs1004749 (P-interaction = 0.006; FDR-Q = 0.03). Similar but non-significant trends were observed for the associations of rs11064153 and rs4401050 with DBP changes (P-interaction = 0.024 and 0.005, respectively; FDR-Q = 0.41 and 0.17, respectively) and of rs11064153 with hypertension incidence (P = 0.017; FDR-Q = 0.19). Our findings indicate that ENaC genes may contribute to longitudinal BP changes in the Han Chinese population. Replication of these findings is warranted.
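
The analysis structure described here, a mixed model with a genotype-by-time interaction followed by FDR adjustment, can be sketched as follows. The column names, covariate set, and random-effects structure are illustrative assumptions, not a reproduction of the GenSalt analysis.

```python
# A sketch of per-SNP mixed models with a genotype-by-time interaction,
# grouped by family, followed by BH/FDR adjustment of the interaction
# p-values. Column names (SBP, time, family_id, SNP dosage columns) are
# hypothetical; the study's covariates and correlation structure are not
# reproduced here.
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

def snp_time_interaction_qvalues(df, snps):
    """df: long-format pandas DataFrame with columns SBP, time, family_id,
    and one 0/1/2 dosage column per SNP; returns FDR-adjusted q-values."""
    pvals = []
    for snp in snps:
        model = smf.mixedlm(f"SBP ~ {snp} * time", df, groups=df["family_id"])
        fit = model.fit(reml=False)
        pvals.append(fit.pvalues[f"{snp}:time"])   # interaction term
    return dict(zip(snps, multipletests(pvals, method="fdr_bh")[1]))
```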

