A fast and accurate algorithm to test for binary phenotypes and its application to PheWAS

2017 ◽  
Author(s):  
Rounak Dey ◽  
Ellen M. Schmidt ◽  
Goncalo R. Abecasis ◽  
Seunggeun Lee

Abstract
The availability of electronic health record (EHR)-based phenotypes allows for genome-wide association analyses in thousands of traits, and has great potential to identify novel genetic variants associated with clinical phenotypes. We can interpret the phenome-wide association study (PheWAS) result for a single genetic variant by observing its association across a landscape of phenotypes. Since PheWAS can test thousands of binary phenotypes, most of which have unbalanced (case:control = 1:10) or often extremely unbalanced (case:control = 1:600) case-control ratios, existing methods cannot provide an accurate and scalable way to test for associations. Here we propose a computationally fast score-test-based method that estimates the distribution of the test statistic using the saddlepoint approximation. Our method is approximately 100 times faster than the state-of-the-art Firth's test. It can also adjust for covariates and control type I error rates even when the case-control ratio is extremely unbalanced. Through application to PheWAS data from the Michigan Genomics Initiative, we show that the proposed method can control type I error rates while replicating previously known association signals even for traits with a very small number of cases and a large number of controls.
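The core computation the abstract describes, a saddlepoint approximation to the tail probability of the single-variant score statistic, can be sketched generically. The sketch below applies the Barndorff-Nielsen adjusted approximation to S = Σ gᵢ(yᵢ − μ̂ᵢ) with yᵢ ~ Bernoulli(μ̂ᵢ) under the null; the function name, the root-finding bracket, and the two-sided convention are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def spa_pvalue(g, mu, s):
    """Two-sided saddlepoint p-value for the score S = sum_i g_i (y_i - mu_i),
    where y_i ~ Bernoulli(mu_i) under the null (illustrative sketch)."""
    g, mu = np.asarray(g, float), np.asarray(mu, float)

    def K(t):   # cumulant generating function of S
        return np.sum(np.log(1 - mu + mu * np.exp(g * t))) - t * np.sum(g * mu)

    def K1(t):  # first derivative K'(t)
        e = mu * np.exp(g * t)
        return np.sum(g * e / (1 - mu + e)) - np.sum(g * mu)

    def K2(t):  # second derivative K''(t)
        e = mu * np.exp(g * t)
        p = e / (1 - mu + e)
        return np.sum(g * g * p * (1 - p))

    def tail(q, upper):
        # solve the saddlepoint equation K'(zeta) = q
        zeta = brentq(lambda t: K1(t) - q, -50.0, 50.0)
        w = np.sign(zeta) * np.sqrt(2.0 * (zeta * q - K(zeta)))
        v = zeta * np.sqrt(K2(zeta))
        z = w + np.log(v / w) / w   # Barndorff-Nielsen adjustment
        return norm.sf(z) if upper else norm.cdf(z)

    # two-sided: P(S >= |s|) + P(S <= -|s|); assumes |s| is not near zero,
    # where w -> 0 and the formula needs a special case
    return tail(abs(s), True) + tail(-abs(s), False)
```

Because the approximation works with the full cumulant generating function rather than only its first two moments, it remains accurate in the far tail even when the case-control ratio is extreme, which is the point of the method.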

2019 ◽  
Author(s):  
Zhangchen Zhao ◽  
Wenjian Bi ◽  
Wei Zhou ◽  
Peter VandeHaar ◽  
Lars G. Fritsche ◽  
...  

Abstract
In biobank data analysis, most binary phenotypes have unbalanced case-control ratios, which can cause inflation of type I error rates. Recently, a saddlepoint approximation (SPA)-based single-variant test was developed to provide an accurate and scalable method to test for associations of such phenotypes. For gene- or region-based multiple-variant tests, a few methods exist that adjust for unbalanced case-control ratios; however, these methods are either less accurate when case-control ratios are extremely unbalanced or not scalable for large data analyses. To address these problems, we propose SKAT/SKAT-O type region-based tests in which the single-variant score statistic is calibrated based on SPA and efficient resampling (ER). Through simulation studies, we show that the proposed method provides well-calibrated p-values. In contrast, the unadjusted approach has greatly inflated type I error rates (90 times the exome-wide level α = 2.5 × 10⁻⁶) when the case-control ratio is 1:99. Additionally, the proposed method has computation time similar to the unadjusted approaches and is scalable for large-sample data. Our UK Biobank whole-exome sequence data analysis of 45,596 unrelated European samples and 791 PheCode phenotypes identified 10 rare-variant associations with p-value < 10⁻⁷, including the associations between JAK2 and myeloproliferative disease, TNC and large cell lymphoma, and F11 and congenital coagulation defects. All analysis summary results are publicly available through a web-based visual server.
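For context, the SKAT-type region statistic the abstract builds on is a weighted sum of squared single-variant score statistics. The minimal sketch below computes it for a covariate-free Bernoulli null and uses a moment-matched (Satterthwaite) chi-square for the p-value; the paper's contribution, calibrating each single-variant score with SPA and ER, is not reproduced here, and the function name and null model are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def skat_pvalue(G, y, mu, w):
    """SKAT variance-component test, Satterthwaite-calibrated (illustrative).
    G: n x m genotype matrix, y: binary phenotype, mu: null fitted means,
    w: length-m variant weights."""
    r = y - mu                              # score residuals under the null
    Q = np.sum(w * (G.T @ r) ** 2)          # SKAT statistic
    # Null distribution: Q ~ sum_k lambda_k * chi2_1, with lambda_k the
    # eigenvalues of W^{1/2} G' diag(V) G W^{1/2}, V_i = mu_i (1 - mu_i).
    V = mu * (1 - mu)
    A = (np.sqrt(w)[:, None] * G.T) * V     # W^{1/2} G' diag(V)
    M = (A @ G) * np.sqrt(w)[None, :]       # W^{1/2} G' diag(V) G W^{1/2}
    lam = np.linalg.eigvalsh(M)
    lam = lam[lam > 1e-10]                  # drop numerically zero eigenvalues
    # Satterthwaite moment matching: Q approx a * chi2_df
    a = np.sum(lam ** 2) / np.sum(lam)
    df = np.sum(lam) ** 2 / np.sum(lam ** 2)
    return chi2.sf(Q / a, df)
```

With a single variant and unit weight this reduces exactly to the usual score test, which is a convenient sanity check; it is also where the unadjusted chi-square calibration breaks down under extreme case-control imbalance, motivating the SPA/ER calibration the paper proposes.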


1995 ◽  
Vol 20 (1) ◽  
pp. 27-39 ◽  
Author(s):  
James Algina ◽  
R. Clifford Blair ◽  
William T. Coombs

A maximum test in which the test statistic is the more extreme of the Brown-Forsythe and O’Brien’s test statistics is developed. Estimated Type I error rates and power are presented for the Brown-Forsythe test, O’Brien’s test, and the maximum test. For the conditions included in the study, Type I error rates for the maximum test are near the nominal level. In all conditions, the power of the maximum test tended to be equal to or greater than that of the test—O’Brien or Brown-Forsythe—that had the larger power.
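The construction is straightforward to sketch: compute both scale statistics and keep the more extreme. Below, Brown-Forsythe is the median-centered Levene statistic and O'Brien's statistic is an ANOVA on O'Brien-transformed scores; the permutation calibration of the maximum is a generic stand-in for the article's critical values, and all function names are illustrative. It assumes the groups share a common location, as permutation tests of scale require.

```python
import numpy as np
from scipy.stats import levene, f_oneway

def obrien_transform(x):
    """O'Brien (1979) scores; their group mean equals the group variance,
    so an ANOVA on them tests homogeneity of variance."""
    x = np.asarray(x, float)
    n, m, s2 = len(x), x.mean(), x.var(ddof=1)
    return ((n - 1.5) * n * (x - m) ** 2 - 0.5 * s2 * (n - 1)) / ((n - 1) * (n - 2))

def max_test_stat(groups):
    f_bf, _ = levene(*groups, center='median')                  # Brown-Forsythe
    f_ob, _ = f_oneway(*[obrien_transform(g) for g in groups])  # O'Brien
    return max(f_bf, f_ob)

def max_test_pvalue(groups, n_perm=999, seed=0):
    """Permutation p-value for the maximum statistic (illustrative calibration)."""
    rng = np.random.default_rng(seed)
    obs = max_test_stat(groups)
    pooled = np.concatenate(groups)
    cuts = np.cumsum([len(g) for g in groups])[:-1]
    hits = sum(max_test_stat(np.split(rng.permutation(pooled), cuts)) >= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)
```

Taking the maximum of the two statistics inherits the better power of whichever test suits the data, which is exactly the behavior the study reports, at the cost of needing a calibration for the combined statistic.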


2019 ◽  
Vol 14 (2) ◽  
pp. 399-425 ◽  
Author(s):  
Haolun Shi ◽  
Guosheng Yin

2014 ◽  
Vol 38 (2) ◽  
pp. 109-112 ◽  
Author(s):  
Daniel Furtado Ferreira

Sisvar is a statistical analysis system widely used by the scientific community to produce statistical analyses and to support scientific results and conclusions. Its broad adoption is due to its accuracy, precision, simplicity and robustness. Among its many analysis options, Sisvar offers a less widely used one: multiple comparison procedures based on bootstrap approaches. This paper reviews this subject and shows some advantages of using Sisvar to perform such analyses to compare treatment means. Tests like Dunnett, Tukey, Student-Newman-Keuls and Scott-Knott, performed alternatively by bootstrap methods, show greater power and better control of experimentwise type I error rates under non-normal, asymmetric, platykurtic or leptokurtic distributions.
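To give a flavor of what a bootstrap multiple-comparison procedure does (this is a generic max-|t| resampling sketch, not Sisvar's implementation; all names are illustrative): resample residuals under the null of equal means, record the largest pairwise |t| in each resample, and adjust every observed pairwise comparison against that maximum distribution. Referring each comparison to the maximum is what controls the experimentwise error rate, without relying on normality.

```python
import numpy as np
from itertools import combinations

def pairwise_t(groups):
    """|t| for every pair of group means, using the pooled variance."""
    ns = np.array([len(g) for g in groups])
    s2p = sum((n - 1) * np.var(g, ddof=1)
              for n, g in zip(ns, groups)) / (ns.sum() - len(groups))
    means = [np.mean(g) for g in groups]
    return {(i, j): abs(means[i] - means[j]) / np.sqrt(s2p * (1 / ns[i] + 1 / ns[j]))
            for i, j in combinations(range(len(groups)), 2)}

def bootstrap_maxt_pvalues(groups, n_boot=1999, seed=0):
    """Experimentwise-adjusted p-values from the bootstrap max-|t| distribution."""
    rng = np.random.default_rng(seed)
    obs = pairwise_t(groups)
    centered = [g - np.mean(g) for g in groups]   # impose the null of equal means
    max_t = np.array([
        max(pairwise_t([rng.choice(c, size=len(c), replace=True)
                        for c in centered]).values())
        for _ in range(n_boot)])
    return {pair: (np.sum(max_t >= t) + 1) / (n_boot + 1) for pair, t in obs.items()}
```

Because the reference distribution is built from the data's own resampled residuals, it adapts to asymmetric, platykurtic or leptokurtic error distributions, which is where the abstract reports the bootstrap versions gaining power over the classical tests.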


2021 ◽  
Author(s):  
Megha Joshi ◽  
James E Pustejovsky ◽  
S. Natasha Beretvas

The most common and well-known meta-regression models work under the assumption that there is only one effect size estimate per study and that the estimates are independent. However, meta-analytic reviews of social science research often include multiple effect size estimates per primary study, leading to dependence in the estimates. Some meta-analyses also include multiple studies conducted by the same lab or investigator, creating another potential source of dependence. An increasingly popular method to handle dependence is robust variance estimation (RVE), but this method can result in inflated Type I error rates when the number of studies is small. Small-sample correction methods for RVE have been shown to control Type I error rates adequately but may be overly conservative, especially for tests of multiple-contrast hypotheses. We evaluated an alternative method for handling dependence, cluster wild bootstrapping, which has been examined in the econometrics literature but not in the context of meta-analysis. Results from two simulation studies indicate that cluster wild bootstrapping maintains adequate Type I error rates and provides more power than extant small-sample correction methods, particularly for multiple-contrast hypothesis tests. We recommend using cluster wild bootstrapping to conduct hypothesis tests for meta-analyses with a small number of studies. We have also created an R package that implements such tests.
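The resampling scheme itself is easy to sketch. In a cluster wild bootstrap, each cluster's residuals are multiplied by a single random sign (a Rademacher weight), so within-cluster dependence is preserved in every pseudo-sample. The sketch below bootstraps the coefficient under a null-imposed fit for a plain clustered regression; the authors work with cluster-robust test statistics, and their R package is not reproduced here, so treat the function and its details as illustrative assumptions.

```python
import numpy as np

def cluster_wild_bootstrap_p(X, y, cluster, coef, n_boot=999, seed=0):
    """p-value for H0: beta[coef] = 0 via a Rademacher cluster wild bootstrap
    with the null imposed (illustrative sketch, not a robust-t version)."""
    rng = np.random.default_rng(seed)
    ols = lambda A, b: np.linalg.lstsq(A, b, rcond=None)[0]
    b_obs = ols(X, y)[coef]
    # refit with the tested column removed so resampling happens under H0
    Xr = np.delete(X, coef, axis=1)
    fit0 = Xr @ ols(Xr, y)
    resid0 = y - fit0
    ids = np.unique(cluster)
    pos = np.searchsorted(ids, cluster)     # map each row to its cluster index
    hits = 0
    for _ in range(n_boot):
        w = rng.choice([-1.0, 1.0], size=len(ids))  # one sign per cluster
        y_star = fit0 + resid0 * w[pos]             # flip whole clusters together
        hits += abs(ols(X, y_star)[coef]) >= abs(b_obs)
    return (hits + 1) / (n_boot + 1)
```

Because the number of independent sign flips equals the number of clusters (studies), the procedure's calibration degrades gracefully as that number shrinks, which is why it is attractive for meta-analyses with few studies.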

