An experimentally-derived measure of inter-replicate variation in reference samples: the same-same permutation methodology

2019 ◽  
Author(s):  
David C. Handler ◽  
Paul A. Haynes

The multiple testing problem is a well-known statistical stumbling block in high-throughput data analysis, where large scale repetition of statistical methods introduces unwanted noise into the results. While approaches exist to overcome the multiple testing problem, these methods focus on theoretical statistical clarification rather than incorporating experimentally-derived measures to ensure appropriately tailored analysis parameters. Here, we introduce a method for estimating inter-replicate variability in reference samples for a quantitative proteomics experiment using permutation analysis. This can function as a modulator to multiple testing corrections such as the Benjamini-Hochberg ordered Q value test. We refer to this as a ‘same-same’ analysis, since this method incorporates the use of six biological replicates of the reference sample and determines, through non-redundant triplet pairwise comparisons, the level of quantitative noise inherent within the system. The method can be used to produce an experiment-specific Q value cut-off that achieves a specified false discovery rate at the quantitation level, such as 1%. The same-same method is applicable to any experimental set that incorporates six replicates of a reference sample. To facilitate access to this approach, we have developed a same-same analysis R module that is freely available and ready to use via the internet.
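The abstract describes, but does not implement, the enumeration of non-redundant triplet pairwise comparisons among six reference replicates. A minimal sketch of that enumeration step (the published module is in R; the Python function and label names below are illustrative, not taken from that module):

```python
import itertools

def same_same_splits(replicates):
    """Enumerate the non-redundant ways to split six replicate labels
    into two groups of three. Fixing the first label in group A
    removes mirror-image duplicates (A vs B equals B vs A)."""
    n = len(replicates)
    first = replicates[0]
    splits = []
    for rest in itertools.combinations(replicates[1:], n // 2 - 1):
        group_a = (first,) + rest
        group_b = tuple(r for r in replicates if r not in group_a)
        splits.append((group_a, group_b))
    return splits

splits = same_same_splits(("R1", "R2", "R3", "R4", "R5", "R6"))
print(len(splits))  # 10 non-redundant 3-vs-3 comparisons
```

Each of the ten splits would then be run through the same differential-quantitation pipeline as a real contrast; since both groups come from the one reference sample, any "significant" hits measure the quantitative noise inherent in the system.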

2016 ◽  
Vol 2016 ◽  
pp. 1-7
Author(s):  
Oluyemi Oyeniran ◽  
Hanfeng Chen

The problem of estimating the proportion, π0, of the true null hypotheses in a multiple testing problem is important in cases where large scale parallel hypotheses tests are performed independently. While π0 is a quantity of interest in its own right in applications, the estimate of π0 can also be used for assessing or controlling an overall false discovery rate. In this article, we develop an innovative nonparametric maximum likelihood approach to estimate π0. The nonparametric likelihood is restricted to multinomial models, and an EM algorithm is developed to approximate the estimate of π0. Simulation studies show that the proposed method outperforms other existing methods. Using experimental microarray datasets, we demonstrate that the new method provides a satisfactory estimate in practice.
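The multinomial-restricted EM itself is not specified in the abstract. As a simpler, well-known point of comparison (a different technique from the authors', shown only for orientation), Storey's tail-based estimator of π0 can be sketched as:

```python
def storey_pi0(pvals, lam=0.5):
    """Storey's estimator of the proportion of true nulls:
    null p-values are Uniform(0, 1), so the share of p-values
    above `lam`, rescaled by the tail length 1 - lam,
    approximates pi0 (capped at 1)."""
    m = len(pvals)
    tail = sum(1 for p in pvals if p > lam)
    return min(1.0, tail / ((1.0 - lam) * m))

uniform = [i / 100 for i in range(1, 101)]   # pure-null p-value grid
print(storey_pi0(uniform))                   # 1.0
```

Adding a block of very small (alternative) p-values to the grid pulls the estimate below 1, as expected.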


2011 ◽  
Vol 311-313 ◽  
pp. 1661-1666
Author(s):  
Pei Jin ◽  
Jian Zhang

Several biomaterials have been widely used in the treatment of cancer. However, how these biomaterials alter gene expression is poorly understood. The problem of identifying genes that are differentially expressed across varying biological conditions, or in response to different biomaterials, based on microarray data is a typical multiple testing problem. In this paper, we focus on FDR control for large-scale multiple testing problems; using our proposed statistics and a resampling method, we provide a powerful FDR-controlling procedure. Simulations show that our fiducial estimator is more accurate and stable than five other traditional methods, with satisfactory FDR control. In particular, we propose a generally applicable form of the procedure for identifying differentially expressed genes in microarray experiments. This microarray method consistently shows favorable performance over the existing methods. For example, in testing for differential expression between two breast cancer tumor types, the proposed procedure provides increases from 37% to 127% in the number of genes called significant at a false discovery rate of 3%.
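The fiducial resampling procedure is not spelled out in the abstract. For reference, the classical Benjamini-Hochberg step-up against which such procedures are typically benchmarked can be sketched as:

```python
def benjamini_hochberg(pvals, q):
    """BH step-up: sort the p-values, find the largest rank k with
    p_(k) <= (k / m) * q, and reject the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k = rank
    return set(order[:k])  # indices of rejected hypotheses

print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5], q=0.05))  # {0, 1, 2}
```

Note the step-up scan keeps the *largest* qualifying rank, so a p-value may be rejected even if it individually misses its own threshold, as long as a later one passes.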


2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Yongchao Ge ◽  
Xiaochun Li

Consider the multiple testing problem of testing m null hypotheses H1, …, Hm, among which m0 hypotheses are truly null. Given the P-values for each hypothesis, the question of interest is how to combine the P-values to find out which hypotheses are false nulls and possibly to make a statistical inference on m0. Benjamini and Hochberg proposed a classical procedure that can control the false discovery rate (FDR). The FDR control is somewhat unsatisfactory in that it only concerns the expectation of the false discovery proportion (FDP). The control of the actual random variable FDP has recently drawn much attention. For any level 1 − α, this paper proposes a procedure to construct an upper prediction bound (UPB) for the FDP for a fixed rejection region. When 1 − α = 50%, our procedure is very close to the classical Benjamini and Hochberg procedure. Simultaneous UPBs for the FDPs of all rejection regions and the upper confidence bound for the unknown m0 are presented consequently. This new procedure works for finite samples and hence avoids the slow convergence problem of the asymptotic theory.
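The paper's finite-sample UPB construction is not reproduced in the abstract. As a crude, conservative stand-in (not the authors' method): for a fixed rejection region {p ≤ t}, the number of false discoveries is stochastically at most Binomial(m, t) when all m hypotheses are null, so its 1 − α quantile bounds the FDP numerator:

```python
import math

def binom_quantile(n, p, level):
    """Smallest k with P(Binomial(n, p) <= k) >= level."""
    cdf = 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * (1 - p)**(n - k)
        if cdf >= level:
            return k
    return n

def fdp_upper_bound(pvals, t, level):
    """Conservative upper prediction bound on the FDP of the fixed
    rejection region {p <= t}, taking m0 = m (all hypotheses null)."""
    m = len(pvals)
    r = sum(1 for p in pvals if p <= t)
    return min(1.0, binom_quantile(m, t, level) / max(r, 1))
```

Taking level = 0.5 gives a median-type bound; higher levels trade a looser bound for a stronger prediction guarantee.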


Author(s):  
Wenguang Sun ◽  
Brian J. Reich ◽  
T. Tony Cai ◽  
Michele Guindani ◽  
Armin Schwartzman

2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Dongmei Li ◽  
Timothy D. Dye

Resampling-based multiple testing procedures are widely used in genomic studies to identify differentially expressed genes and to conduct genome-wide association studies. However, the power and stability properties of these popular resampling-based multiple testing procedures have not been extensively evaluated. Our study focuses on investigating the power and stability of seven resampling-based multiple testing procedures frequently used in high-throughput data analysis for small sample size data through simulations and gene oncology examples. The bootstrap single-step minP procedure and the bootstrap step-down minP procedure perform the best among all tested procedures, when sample size is as small as 3 in each group and either familywise error rate or false discovery rate control is desired. When sample size increases to 12 and false discovery rate control is desired, the permutation maxT procedure and the permutation minP procedure perform best. Our results provide guidance for high-throughput data analysis when sample size is small.
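The permutation maxT idea behind the procedures compared above can be illustrated in miniature. This sketch swaps the usual t-statistic for a plain difference of group means and exhausts all relabellings, which is feasible only for the tiny per-group sample sizes the study considers; names are illustrative:

```python
import itertools

def maxT_adjusted_pvalues(data, n_a):
    """Westfall-Young-style maxT adjustment. `data` holds one list of
    measurements per gene; the first `n_a` columns are group A.
    Adjusted p = fraction of relabellings whose maximum statistic
    (over all genes) reaches the gene's observed statistic."""
    n = len(data[0])
    mean = lambda xs: sum(xs) / len(xs)
    def stats_for(labels_a):
        return [abs(mean([g[i] for i in range(n) if i in labels_a]) -
                    mean([g[i] for i in range(n) if i not in labels_a]))
                for g in data]
    observed = stats_for(tuple(range(n_a)))
    perms = list(itertools.combinations(range(n), n_a))
    return [sum(1 for p in perms if max(stats_for(p)) >= t) / len(perms)
            for t in observed]

# One clearly differential gene, one null gene, 3 samples per group:
data = [[10, 10, 10, 0, 0, 0], [1, 2, 3, 2, 1, 3]]
print(maxT_adjusted_pvalues(data, 3))  # [0.1, 1.0]
```

Taking the maximum over genes inside each relabelling is what gives familywise error control: a gene is called significant only if its statistic beats the best that pure label noise achieves anywhere on the array.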


2009 ◽  
Vol 4 (3) ◽  
pp. 291-293 ◽  
Author(s):  
Thomas E. Nichols ◽  
Jean-Baptist Poline

The article “Puzzlingly High Correlations in fMRI Studies of Emotion, Personality, and Social Cognition” ( Vul, Harris, Winkielman, & Pashler, 2009 , this issue) makes a broad case that current practice in neuroimaging methodology is deficient. Vul et al. go so far as to demand that authors retract or restate results, which we find wrongly casts suspicion on the confirmatory inference methods that form the foundation of neuroimaging statistics. We contend the authors' argument is overstated and that their work can be distilled down to two points already familiar to the neuroimaging community: that the multiple testing problem must be accounted for, and that reporting of methods and results should be improved. We also illuminate their concerns with standard statistical concepts such as the distinction between estimation and inference and between confirmatory and post hoc inferences, which makes their findings less puzzling.


2018 ◽  
Vol 113 (523) ◽  
pp. 1172-1183 ◽  
Author(s):  
Pallavi Basu ◽  
T. Tony Cai ◽  
Kiranmoy Das ◽  
Wenguang Sun
