4497 Accessible False Discovery Rate Computation

2020 ◽  
Vol 4 (s1) ◽  
pp. 44-44
Author(s):  
Megan C Hollister ◽  
Jeffrey D. Blume

OBJECTIVES/GOALS: To improve the implementation of FDRs in translational research. Current statistical packages are hard to use and fail to adequately convey strong assumptions. We developed a software package that allows the user to decide on assumptions and choose the approach they desire. We encourage wider reporting of FDRs for observed findings. METHODS/STUDY POPULATION: We developed a user-friendly R function for computing FDRs from observed p-values. A variety of methods for FDR estimation and for FDR control are included so the user can select the approach most appropriate for their setting. Options include Efron’s Empirical Bayes FDR, Benjamini-Hochberg FDR control for multiple testing, Lindsey’s method for smoothing empirical distributions, estimation of the mixing proportion, and central matching. We illustrate the important difference between estimating the FDR for a particular finding and adjusting a hypothesis test to control the false discovery propensity. RESULTS/ANTICIPATED RESULTS: We compared the capabilities of our new p.fdr function with those of the popular p.adjust function from the base stats package. Specifically, we examined multiple examples of data coming from different unknown mixture distributions to highlight the null estimation methods p.fdr includes. The base package provides neither optimal FDR usage nor sufficient estimation options. We also compared the step-up and step-down procedures used in adjusted p-value hypothesis tests and discuss when these are inappropriate. The p.adjust function is not able to report raw-adjusted values, and this will be shown in the graphical results. DISCUSSION/SIGNIFICANCE OF IMPACT: FDRs reveal the propensity for an observed result to be incorrect. FDRs should accompany observed results to help contextualize the relevance and potential impact of research findings. Our results show that previous methods are not sufficiently rich or precise in their calculations. Our new package allows the user to be in control of the null estimation and step-up implementation when reporting FDRs.
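To make the Benjamini-Hochberg adjustment mentioned above concrete, the following is a minimal Python sketch of step-up adjusted p-values, the same kind of quantity that p.adjust reports with its "BH" method. It is not the authors' p.fdr function, and the names `bh_adjust` and `bh_reject` are hypothetical, chosen only for this illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Illustrative sketch only; not the p.fdr implementation.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum
        # taken from the largest p-value downwards (the step-up adjustment).
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def bh_reject(pvalues, q=0.05):
    """Flag hypotheses whose BH-adjusted p-value is at most q."""
    return [p_adj <= q for p_adj in bh_adjust(pvalues)]
```

For example, the p-values 0.005, 0.01, 0.03, and 0.04 yield adjusted values of about 0.02, 0.02, 0.04, and 0.04. Note that this controls the FDR of a testing procedure; it does not by itself estimate the FDR of one particular finding, which is the distinction the abstract draws.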

2018 ◽  
Author(s):  
Tuomas Puoliväli ◽  
Satu Palva ◽  
J. Matias Palva

Background: Reproducibility of research findings has been recently questioned in many fields of science, including psychology and neurosciences. One factor influencing reproducibility is the simultaneous testing of multiple hypotheses, which increases the number of false positive findings unless the p-values are carefully corrected. While this multiple testing problem is well known and has been studied for decades, it continues to be both a theoretical and practical problem. New Method: Here we assess the reproducibility of research involving multiple testing corrected for family-wise error rate (FWER) or false discovery rate (FDR) by techniques based on random field theory (RFT), cluster-mass based permutation testing, adaptive FDR, and several classical methods. We also investigate the performance of these methods under two different models. Results: We found that permutation testing is the most powerful method among the considered approaches to multiple testing, and that grouping hypotheses based on prior knowledge can improve power. We also found that emphasizing primary and follow-up studies equally produced the most reproducible outcomes. Comparison with Existing Method(s): We have extended the use of two-group and separate-classes models for analyzing reproducibility and provide a new open-source software “MultiPy” for multiple hypothesis testing. Conclusions: Our results suggest that performing strict corrections for multiple testing is not sufficient to improve reproducibility of neuroimaging experiments. The methods are freely available as a Python toolkit “MultiPy”, and we intend this study to help improve statistical data analysis practices and to assist in conducting power and reproducibility analyses for new experiments.
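As an illustration of FWER correction, here is a short Python sketch of two classical procedures, Bonferroni and Holm. Treating these as examples of the "classical methods" in the abstract is an assumption, and the function names are hypothetical; this is not the MultiPy API.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def holm_reject(pvalues, alpha=0.05):
    """Holm step-down procedure for family-wise error rate control."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        # The (step + 1)-th smallest p-value is compared with alpha / (m - step).
        if pvalues[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # step-down: stop at the first p-value that fails
    return reject
```

Holm rejects everything Bonferroni rejects and sometimes more, because its threshold relaxes after each rejection while still controlling the FWER.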


2013 ◽  
Vol 2013 ◽  
pp. 1-9
Author(s):  
Hisashi Noma ◽  
Shigeyuki Matsui

Multiple testing has been widely adopted for genome-wide studies such as microarray experiments. For effective gene selection in these genome-wide studies, the optimal discovery procedure (ODP), which maximizes the number of expected true positives for each fixed number of expected false positives, was developed as a multiple testing extension of the most powerful test for a single hypothesis by Storey (Journal of the Royal Statistical Society, Series B, vol. 69, no. 3, pp. 347–368, 2007). In this paper, we develop an empirical Bayes method for implementing the ODP based on a semiparametric hierarchical mixture model using the “smoothing-by-roughening” approach. Under the semiparametric hierarchical mixture model, (i) the prior distribution can be modeled flexibly, (ii) the ODP test statistic and the posterior distribution are analytically tractable, and (iii) computations are easy to implement. In addition, we provide a significance rule based on the false discovery rate (FDR) in the empirical Bayes framework. Applications to two clinical studies are presented.


2017 ◽  
Author(s):  
Kerstin Scheubert ◽  
Franziska Hufsky ◽  
Daniel Petras ◽  
Mingxun Wang ◽  
Louis-Félix Nothias ◽  
...  

The annotation of small molecules in untargeted mass spectrometry relies on the matching of fragment spectra to reference library spectra. While various spectrum-spectrum match scores exist, the field lacks statistical methods for estimating the false discovery rates (FDR) of these annotations. We present empirical Bayes and target-decoy based methods to estimate the false discovery rate. Relying on estimations of false discovery rates, we explore the effect of different spectrum-spectrum match criteria on the number and the nature of the molecules annotated. We show that the spectral matching settings need to be adjusted for each project. By adjusting the scoring parameters and thresholds, the number of annotations rose, on average, by +139% (ranging from −92% up to +5705%) when compared to a default parameter set available at GNPS. The FDR estimation methods presented will enable users to define scoring criteria for large-scale analysis of untargeted small molecule data, a capability that has been essential to the advancement of large-scale proteomics, transcriptomics, and genomics.
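The target-decoy idea can be sketched in a few lines of Python. This is the generic estimator (decoy matches above a score threshold divided by target matches above it), assuming separate searches against target and decoy libraries of equal size; it is not the authors' specific method, and `target_decoy_fdr` is a hypothetical name.

```python
def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target matches scoring at or above threshold.

    Assumes decoy matches model false target matches one-to-one
    (equally sized target and decoy libraries, searched separately).
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing is accepted, so there are no false discoveries
    # Cap at 1.0 because the ratio is only an estimate of a proportion.
    return min(1.0, n_decoy / n_target)
```

Sweeping `threshold` over the observed scores and keeping the lowest threshold whose estimate stays under a chosen level is one way such an estimate can guide project-specific scoring criteria.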


2021 ◽  
Vol 11 (2) ◽  
Author(s):  
Cynthia Dwork ◽  
Weijie Su ◽  
Li Zhang

Differential privacy provides a rigorous framework for privacy-preserving data analysis. This paper proposes the first differentially private procedure for controlling the false discovery rate (FDR) in multiple hypothesis testing. Inspired by the Benjamini-Hochberg procedure (BHq), our approach is to first repeatedly add noise to the logarithms of the p-values to ensure differential privacy and to select an approximately smallest p-value serving as a promising candidate at each iteration; the selected p-values are further supplied to the BHq and our private procedure releases only the rejected ones. Moreover, we develop a new technique that is based on a backward submartingale for proving FDR control of a broad class of multiple testing procedures, including our private procedure, and both the BHq step-up and step-down procedures. As a novel aspect, the proof works for arbitrary dependence between the true null and false null test statistics, while FDR control is maintained up to a small multiplicative factor.
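For reference, the standard (non-private) BHq step-up and step-down rules named above can be written as follows. The notation is an assumption for illustration, not taken from the paper: $p_{(1)} \le \dots \le p_{(m)}$ are the ordered p-values, $H_{(i)}$ is the hypothesis belonging to $p_{(i)}$, and $q$ is the target FDR level.

```latex
% Step-up: find the LARGEST index whose p-value lies under the BH line.
k_{\mathrm{up}} = \max\Bigl\{\, i \in \{1,\dots,m\} : p_{(i)} \le \tfrac{i\,q}{m} \Bigr\},
\qquad \text{reject } H_{(1)}, \dots, H_{(k_{\mathrm{up}})}.

% Step-down: move up from the smallest p-value and stop at the first failure.
k_{\mathrm{down}} = \max\Bigl\{\, i \in \{1,\dots,m\} : p_{(j)} \le \tfrac{j\,q}{m}
\ \text{for all } j \le i \Bigr\},
\qquad \text{reject } H_{(1)}, \dots, H_{(k_{\mathrm{down}})}.

% In both cases nothing is rejected when the set is empty, and
% k_{\mathrm{down}} \le k_{\mathrm{up}}, so step-up rejects at least as much.
```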


2016 ◽  
Vol 2016 ◽  
pp. 1-16 ◽  
Author(s):  
Meredith A. Ray ◽  
Xin Tong ◽  
Gabrielle A. Lockett ◽  
Hongmei Zhang ◽  
Wilfried J. J. Karmaus

Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for nested validation to screen CpG sites. SVA accounts for variations in the methylation not explained by the specified covariate(s) and adjusts for confounding effects. To make it easier for users, this screening method is built into a user-friendly R package, ttScreening, with efficient algorithms implemented. Various simulations were implemented to examine the robustness and sensitivity of the method compared to the classical approaches controlling for multiple testing: the false discovery rate-based (FDR-based) and the Bonferroni-based methods. The proposed approach in general performs better and has the potential to control both type I and type II errors. We applied ttScreening to 383,998 CpG sites in association with maternal smoking, one of the leading factors for cancer risk.


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 1029
Author(s):  
John A. Williams ◽  
Dominic Russ ◽  
Laura Bravo-Merodio ◽  
Victor Roth Cardoso ◽  
Samantha C. Pendleton ◽  
...  

Observational and experimental evidence has linked chronotype to both psychological and cardiometabolic traits. Recent Mendelian randomization (MR) studies have investigated direct links between chronotype and several of these traits, often in isolation of outside potential mediating or moderating traits. We mined the EpiGraphDB MR database for calculated chronotype–trait associations (p-value < 5 × 10⁻⁸). We then re-analyzed those relevant to metabolic or mental health and investigated for statistical evidence of horizontal pleiotropy. Analyses passing multiple testing correction were then investigated for confounders, colliders, intermediates, and reverse intermediates using the EpiGraphDB database, creating multiple chronotype–trait interactions among each of the traits studied. We revealed 10 significant chronotype–exposure associations (false discovery rate < 0.05) exposed to 111 potential previously known confounders, 52 intermediates, 18 reverse intermediates, and 31 colliders. Chronotype–lipid causal associations collided with treatment and diabetes effects; chronotype–bipolar associations were mediated by breast cancer; and chronotype–alcohol intake associations were impacted by confounders and intermediate variables including known zeitgebers and molecular traits. We have reported the influence of chronotype on several cardiometabolic and behavioural traits, and identified potential confounding variables not reported on in studies while discovering new associations to drugs and disease.


2020 ◽  
Author(s):  
Ahmad Sudi Pratikno

In statistics, there are various terms that may feel unfamiliar to researchers who are not accustomed to discussing them. Nevertheless, statistics offers researchers many functions and benefits for processing data, which are later interpreted into conclusions so that researchers can digest and understand the research findings. Continuous random probability distributions describe the probabilities of quantities such as time, weather, and other data obtained from the field. The standard normal distribution is a curve with zero mean and standard deviation 1, while the t distribution is used for test statistics in hypothesis tests. The chi-square distribution is used in comparative tests on two variables with a nominal data scale, while the F distribution is often used in ANOVA tests and regression analysis.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Sangyoon Yi ◽  
Xianyang Zhang ◽  
Lu Yang ◽  
Jinyan Huang ◽  
Yuanhang Liu ◽  
...  

One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction. Here we show that the traditional procedure is not optimal and present a new approach, 2dFDR, a two-dimensional false discovery rate control procedure, for powerful confounder adjustment in multiple testing. Through extensive evaluation, we demonstrate that 2dFDR is more powerful than the traditional procedure, and in the presence of strong confounding and weak signals, the power improvement could be more than 100%.

