False discovery control in large-scale spatial multiple testing

Author(s):  
Wenguang Sun ◽  
Brian J. Reich ◽  
T. Tony Cai ◽  
Michele Guindani ◽  
Armin Schwartzman
2018 ◽  
Vol 113 (523) ◽  
pp. 1172-1183 ◽  
Author(s):  
Pallavi Basu ◽  
T. Tony Cai ◽  
Kiranmoy Das ◽  
Wenguang Sun

2019 ◽  
Author(s):  
David C. Handler ◽  
Paul A. Haynes

The multiple testing problem is a well-known statistical stumbling block in high-throughput data analysis, where large-scale repetition of statistical tests introduces unwanted noise into the results. While approaches exist to overcome the multiple testing problem, these methods focus on theoretical statistical clarification rather than incorporating experimentally derived measures to ensure appropriately tailored analysis parameters. Here, we introduce a permutation-based method for estimating inter-replicate variability in the reference samples of a quantitative proteomics experiment. This estimate can serve as a modulator for multiple testing corrections such as the Benjamini-Hochberg ordered Q value test. We refer to this as a 'same-same' analysis, since the method uses six biological replicates of the reference sample and determines, through non-redundant triplet pairwise comparisons, the level of quantitative noise inherent in the system. The method can be used to produce an experiment-specific Q value cut-off that achieves a specified false discovery rate at the quantitation level, such as 1%. The same-same method is applicable to any experimental set that incorporates six replicates of a reference sample. To facilitate access to this approach, we have developed a same-same analysis R module that is freely available and ready to use via the internet.
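
The triplet-splitting scheme is easy to make concrete. The following is a minimal Python sketch of the idea, not the authors' released R module; the (features × 6) input shape, the per-feature t-test, and the quantile-based cutoff rule are illustrative assumptions.

```python
# A minimal sketch of the 'same-same' idea: six reference replicates are
# split into all 10 non-redundant 3-vs-3 comparisons, where every hit is a
# false positive by construction, and the resulting null Q values calibrate
# an experiment-specific cutoff. Not the authors' R module.
import itertools
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

def same_same_q_cutoff(ref, target_fdr=0.01):
    """ref: (n_features, 6) array of quantitation values for one reference sample."""
    null_qs = []
    # Fixing replicate 0 in group A enumerates each 3-vs-3 partition once (10 total).
    for group_a in itertools.combinations(range(6), 3):
        if 0 not in group_a:
            continue
        group_b = [r for r in range(6) if r not in group_a]
        _, pvals = stats.ttest_ind(ref[:, list(group_a)], ref[:, group_b], axis=1)
        null_qs.append(multipletests(pvals, method="fdr_bh")[1])
    null_qs = np.concatenate(null_qs)
    # Q cutoff below which only `target_fdr` of the null comparisons fall.
    return np.quantile(null_qs, target_fdr)
```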


2016 ◽  
Vol 2016 ◽  
pp. 1-7
Author(s):  
Oluyemi Oyeniran ◽  
Hanfeng Chen

The problem of estimating the proportion, π0, of true null hypotheses in a multiple testing problem is important when large-scale parallel hypothesis tests are performed independently. While π0 is a quantity of interest in its own right in applications, its estimate can also be used for assessing or controlling an overall false discovery rate. In this article, we develop an innovative nonparametric maximum likelihood approach to estimate π0. The nonparametric likelihood is restricted to multinomial models, and an EM algorithm is developed to approximate the estimate of π0. Simulation studies show that the proposed method outperforms existing methods. Using experimental microarray datasets, we demonstrate that the new method provides satisfactory estimates in practice.
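
For orientation, the sketch below shows a classical baseline for this estimation problem, Storey's tail-based estimator of π0; it is not the authors' multinomial nonparametric-MLE/EM method, whose details go beyond the abstract.

```python
# Storey (2002) tail-based estimator of the null proportion pi0, shown as
# a baseline for context only (not the EM estimator described above).
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """p-values above `lam` come mostly from true nulls, which are
    uniform on [0, 1], so their count scaled by 1/(1 - lam) estimates m0."""
    pvals = np.asarray(pvals)
    return min(1.0, np.mean(pvals > lam) / (1.0 - lam))
```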


2011 ◽  
Vol 311-313 ◽  
pp. 1661-1666
Author(s):  
Pei Jin ◽  
Jian Zhang

Several biomaterials have been widely used in the treatment of cancer. However, how these biomaterials alter gene expression is poorly understood. Identifying genes that are differentially expressed across varying biological conditions, or in response to different biomaterials, based on microarray data is a typical multiple testing problem. In this paper, we focus on FDR control for large-scale multiple testing problems: using our proposed statistics and a resampling method, we provide a powerful FDR-controlling procedure for large-scale multiple testing. Simulations show that our fiducial estimator is more accurate and stable than five other traditional methods, with satisfactory FDR control. In particular, we propose a generally applicable version of the procedure for identifying differentially expressed genes in microarray experiments. This microarray method consistently shows favorable performance over existing methods. For example, in testing for differential expression between two breast cancer tumor types, the proposed procedure increases the number of genes called significant at a false discovery rate of 3% by 37% to 127%.
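
Since the abstract does not specify the fiducial statistic, the sketch below illustrates only the generic resampling logic behind permutation-based FDR estimation: shuffle group labels to approximate the null distribution of a per-gene statistic, then compare null and observed call counts. The function names and the simple Welch-type statistic are assumptions.

```python
# Generic permutation plug-in FDR estimate; not the paper's fiducial method.
import numpy as np

def permutation_fdr(X, labels, threshold, n_perm=200, seed=0):
    """X: (genes, samples); labels: 0/1 array of group membership.
    Estimates the FDR of calling |t| >= threshold significant."""
    rng = np.random.default_rng(seed)

    def tstats(y):
        a, b = X[:, y == 0], X[:, y == 1]
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] +
                     b.var(axis=1, ddof=1) / b.shape[1])
        return (a.mean(axis=1) - b.mean(axis=1)) / se

    n_called = max(int((np.abs(tstats(labels)) >= threshold).sum()), 1)
    # Average number of calls under label permutation approximates the
    # expected number of false positives at this threshold.
    null_calls = np.mean([(np.abs(tstats(rng.permutation(labels))) >= threshold).sum()
                          for _ in range(n_perm)])
    return min(1.0, null_calls / n_called)
```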


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangyoon Yi ◽  
Xianyang Zhang ◽  
Lu Yang ◽  
Jinyan Huang ◽  
Yuanhang Liu ◽  
...  

One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the traditional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we demonstrate that 2dFDR is more powerful than the traditional procedure, and that in the presence of strong confounding and weak signals, the power improvement can be more than 100%.
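
The traditional procedure that 2dFDR is compared against can be written down directly. Below is a minimal Python sketch of that baseline (not of 2dFDR itself), assuming a per-feature ordinary least squares fit and Benjamini-Hochberg correction; the variable names are illustrative.

```python
# The traditional baseline: one confounder-adjusted linear model per omics
# feature, then BH correction of the exposure p-values.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

def traditional_procedure(Y, exposure, confounders, alpha=0.05):
    """Y: (samples, features); exposure: (samples,); confounders: (samples, q)."""
    X = sm.add_constant(np.column_stack([exposure, confounders]))
    pvals = np.empty(Y.shape[1])
    for j in range(Y.shape[1]):
        fit = sm.OLS(Y[:, j], X).fit()
        pvals[j] = fit.pvalues[1]          # coefficient on the exposure
    reject, qvals, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
    return reject, qvals
```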


2000 ◽  
Vol 25 (1) ◽  
pp. 60-83 ◽  
Author(s):  
Yoav Benjamini ◽  
Yosef Hochberg

A new approach to problems of multiple significance testing was presented in Benjamini and Hochberg (1995), which calls for controlling the expected ratio of the number of erroneous rejections to the number of rejections, the False Discovery Rate (FDR). The procedure given there was shown to control the FDR for independent test statistics. When some of the hypotheses are in fact false, that procedure is too conservative. We present here an adaptive procedure, in which the number of true null hypotheses is estimated first as in Hochberg and Benjamini (1990), and this estimate is used in the procedure of Benjamini and Hochberg (1995). The result is still a simple stepwise procedure, for which we also give a graphical companion. The new procedure is applied in several examples drawn from educational and behavioral studies, addressing problems in multi-center studies, subset analysis, and meta-analysis. The examples vary in the number of hypotheses tested and in the implications of the new procedure for the conclusions. In a large simulation study of independent test statistics, the adaptive procedure is shown to control the FDR and to have substantially better power than the previously suggested FDR-controlling method, which is itself more powerful than the traditional familywise error-rate-controlling methods. In cases where most of the tested hypotheses are far from being true, there is hardly any penalty due to the simultaneous testing of many hypotheses.
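
The adaptive step-up idea can be sketched in a few lines. The following Python sketch replaces m with an estimate of the number of true nulls in the 1995 step-up thresholds; the simple count-based m0 estimator is an illustrative stand-in for the slope-based Hochberg and Benjamini (1990) estimate used in the paper.

```python
# Adaptive BH step-up: estimate m0, then reject p_(i) <= i*q/m0.
import numpy as np

def adaptive_bh(pvals, q=0.05):
    pvals = np.asarray(pvals)
    p = np.sort(pvals)
    m = len(p)
    # Illustrative m0 estimate: under the null, about half of the null
    # p-values exceed 0.5, so m0 is roughly twice the count above 0.5.
    m0 = min(m, 2 * int((p > 0.5).sum()) + 1)
    passed = np.nonzero(p <= q * np.arange(1, m + 1) / m0)[0]
    if passed.size == 0:
        return np.zeros(m, dtype=bool)
    return pvals <= p[passed.max()]   # reject everything up to the cutoff
```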


2017 ◽  
Vol 46 (6) ◽  
pp. 284-292 ◽  
Author(s):  
Denis G. Dumas ◽  
Daniel M. McNeish

Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the potential of students from nondominant socioeconomic or ethnic groups, who may not have had adequate opportunity to develop various academic skills but can nonetheless do so in the future. One long-standing approach to the partial rectification of this issue is dynamic assessment (DA), a technique that features multiple testing occasions integrated with learning opportunities. However, DA is extremely resource intensive to incorporate into educational assessment practice and cannot be applied to extant large-scale data sets. In this article, the authors describe a recently developed statistical technique, dynamic measurement modeling (DMM), which is capable of estimating quantities associated with DA—including student capacity for learning a particular skill—from existing large-scale longitudinal assessment data, allowing the core concepts of DA to be scaled up for use with secondary data sets such as those collected by Statewide Longitudinal Data Systems in the United States. The authors show that by considering several assessments over time, student capacity can be reliably estimated, and these capacity estimates are much less affected by student race/ethnicity, gender, and socioeconomic status than are single-timepoint assessment scores, thereby improving the consequential validity of measurement.


Circulation ◽  
2014 ◽  
Vol 129 (suppl_1) ◽  
Author(s):  
Tanika N Kelly ◽  
Xueli Yang ◽  
Dabeeru C Rao ◽  
Dongfeng Gu ◽  
James E Hixson ◽  
...  

We examined the associations of common variants in ENaC genes with BP changes and hypertension incidence in a longitudinal family study. A total of 2755 Han Chinese participants of the Genetic Epidemiology Network of Salt Sensitivity (GenSalt) baseline examination were eligible for the current study. The average of nine BP measurements obtained using a random-zero sphygmomanometer at baseline and at each of two follow-up visits was used for the current analysis. The associations of 43 tag-SNPs in ENaC genes with BP changes and hypertension incidence were assessed using mixed models to account for the correlations of repeated measures within individuals and among family members. A genotype-by-time interaction term was used to model differences in BP change according to genotype over time. The false discovery rate (FDR) method was used to adjust for multiple testing. During an average of 7.4 years of follow-up, systolic BP (SBP) and diastolic BP (DBP) increased, and approximately 33% of participants developed hypertension. Multiple variants in ENaC genes were significantly associated with longitudinal changes in SBP after adjustment for multiple testing, including SCNN1A SNP rs11064153 (P-interaction = 5.8×10⁻⁴; FDR-Q = 0.01); SCNN1G SNPs rs4299163 (P-interaction = 0.011; FDR-Q = 0.05) and rs4401050 (P-interaction = 0.001; FDR-Q = 0.01); and SCNN1B SNP rs1004749 (P-interaction = 0.006; FDR-Q = 0.03). Similar but non-significant trends were observed for the associations of rs11064153 and rs4401050 with DBP changes (P-interaction = 0.024 and 0.005, respectively; FDR-Q = 0.41 and 0.17, respectively) and of rs11064153 with hypertension incidence (P = 0.017; FDR-Q = 0.19). Our findings indicate that ENaC genes may contribute to longitudinal BP changes in the Han Chinese population. Replication of these findings is warranted.
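
The analysis structure described here, a mixed model with a genotype-by-time interaction followed by FDR adjustment, can be sketched as follows. The column names, covariate set, and random-effects structure are illustrative assumptions, not a reproduction of the GenSalt analysis.

```python
# A sketch of per-SNP mixed models with a genotype-by-time interaction,
# grouped by family, followed by BH/FDR adjustment of the interaction
# p-values. Column names (SBP, time, family_id, SNP dosage columns) are
# hypothetical; the study's covariates and correlation structure are not
# reproduced here.
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

def snp_time_interaction_qvalues(df, snps):
    """df: long-format pandas DataFrame with columns SBP, time, family_id,
    and one 0/1/2 dosage column per SNP; returns FDR-adjusted q-values."""
    pvals = []
    for snp in snps:
        model = smf.mixedlm(f"SBP ~ {snp} * time", df, groups=df["family_id"])
        fit = model.fit(reml=False)
        pvals.append(fit.pvalues[f"{snp}:time"])   # interaction term
    return dict(zip(snps, multipletests(pvals, method="fdr_bh")[1]))
```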

