New Equivalence Tests for Hardy–Weinberg Equilibrium and Multiple Alleles

Stats ◽  
2020 ◽  
Vol 3 (1) ◽  
pp. 34-39
Author(s):  
Vladimir Ostrovski

We consider testing equivalence to Hardy–Weinberg Equilibrium in the case of multiple alleles. Two different test statistics are proposed for this test problem, and their asymptotic distribution is derived. The corresponding tests can be carried out using the asymptotic approximation; alternatively, the variance of the test statistics can be estimated by the bootstrap method. The proposed tests are applied to three real data sets. The finite-sample performance of the tests is studied by simulations inspired by the real data sets.
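As a rough sketch of the ingredients involved (not the authors' actual test statistics), the following Python computes a Euclidean distance between observed multi-allelic genotype frequencies and their Hardy–Weinberg expectation, and estimates the statistic's variance by a naive bootstrap. The function names and the specific choice of distance are illustrative assumptions.

```python
import math
import random

def hwe_distance(genotype_counts, k):
    """Euclidean distance between observed genotype frequencies and the
    Hardy-Weinberg expectation computed from the observed allele frequencies.
    genotype_counts[(i, j)] with i <= j counts genotypes carrying alleles i, j."""
    n = sum(genotype_counts.values())
    p = [0.0] * k
    for (i, j), c in genotype_counts.items():
        p[i] += c  # each genotype contributes one copy of allele i ...
        p[j] += c  # ... and one copy of allele j (two for homozygotes)
    p = [x / (2 * n) for x in p]
    d2 = 0.0
    for i in range(k):
        for j in range(i, k):
            exp_freq = p[i] ** 2 if i == j else 2 * p[i] * p[j]
            obs_freq = genotype_counts.get((i, j), 0) / n
            d2 += (obs_freq - exp_freq) ** 2
    return math.sqrt(d2)

def bootstrap_variance(genotype_counts, k, reps=500, seed=1):
    """Estimate the variance of the distance statistic by resampling genotypes."""
    rng = random.Random(seed)
    pool = [g for g, c in genotype_counts.items() for _ in range(c)]
    stats = []
    for _ in range(reps):
        counts = {}
        for g in (rng.choice(pool) for _ in range(len(pool))):
            counts[g] = counts.get(g, 0) + 1
        stats.append(hwe_distance(counts, k))
    mean = sum(stats) / reps
    return sum((s - mean) ** 2 for s in stats) / (reps - 1)
```

A sample in exact HWE proportions yields a distance of zero, which is the boundary case an equivalence test works against.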

Econometrics ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 10
Author(s):  
Šárka Hudecová ◽  
Marie Hušková ◽  
Simos G. Meintanis

This article considers goodness-of-fit tests for bivariate INAR and bivariate Poisson autoregression models. The test statistics are based on an L2-type distance between two estimators of the probability generating function of the observations: one entirely nonparametric, and the other semiparametric, computed under the corresponding null hypothesis. The asymptotic distribution of the proposed test statistics is derived both under the null hypothesis and under alternatives, and consistency is proved. The case of testing bivariate generalized Poisson autoregression, as well as the extension of the methods to dimensions higher than two, is also discussed. The finite-sample performance of a parametric bootstrap version of the tests is illustrated via a series of Monte Carlo experiments. The article concludes with applications to real data sets and a discussion.
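For intuition only, here is a minimal Python sketch of an L2-type probability-generating-function comparison. The semiparametric null-model estimator is deliberately replaced by the PGF of two independent Poisson variables, a much simpler null than the article's bivariate INAR/Poisson autoregression models; all names and the grid choice are illustrative assumptions.

```python
import math

def empirical_pgf(x, y, u, v):
    """Nonparametric estimate of the joint PGF E[u^X v^Y] from paired counts."""
    return sum((u ** xi) * (v ** yi) for xi, yi in zip(x, y)) / len(x)

def poisson_pgf(lam1, lam2, u, v):
    """PGF of two independent Poisson variables -- a simplified stand-in for
    the semiparametric estimator computed under the article's null models."""
    return math.exp(lam1 * (u - 1) + lam2 * (v - 1))

def l2_statistic(x, y, grid=10):
    """L2-type distance between the two PGF estimators over a grid in [0,1]^2."""
    n = len(x)
    lam1, lam2 = sum(x) / n, sum(y) / n  # MLEs under the independent-Poisson null
    total = 0.0
    for a in range(grid + 1):
        for b in range(grid + 1):
            u, v = a / grid, b / grid
            total += (empirical_pgf(x, y, u, v) - poisson_pgf(lam1, lam2, u, v)) ** 2
    return n * total / (grid + 1) ** 2
```

In practice the null distribution of such a statistic is approximated by a parametric bootstrap, as the article does.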


Stats ◽  
2019 ◽  
Vol 2 (2) ◽  
pp. 239-246 ◽  
Author(s):  
Vladimir Ostrovski

We introduce new equivalence tests for approximate independence in two-way contingency tables. The critical values are calculated asymptotically, and the finite-sample performance of the tests is improved by means of the bootstrap. An estimator of the boundary points is developed to make the bootstrap-based tests statistically efficient and computationally feasible. We compare the performance of the proposed tests for different table sizes by simulation and then apply the tests to real data sets.
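A minimal sketch of the underlying quantity, under the assumption (mine, not the article's) that departure from independence is measured by a Euclidean distance between the joint cell frequencies and the product of the marginals; an equivalence test would reject when this distance is demonstrably below a pre-specified margin.

```python
def independence_distance(table):
    """Euclidean distance between observed cell frequencies and the product of
    the marginal frequencies; it is zero iff the table is exactly independent."""
    n = sum(sum(row) for row in table)
    r = len(table)
    c = len(table[0])
    rows = [sum(row) / n for row in table]            # row marginals
    cols = [sum(table[i][j] for i in range(r)) / n    # column marginals
            for j in range(c)]
    d2 = sum((table[i][j] / n - rows[i] * cols[j]) ** 2
             for i in range(r) for j in range(c))
    return d2 ** 0.5
```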


2020 ◽  
Vol 68 (3) ◽  
pp. 949-964
Author(s):  
Dimitris Bertsimas ◽  
Bradley Sturt

The bootstrap method is one of the major developments in 20th-century statistics for computing confidence intervals directly from data. However, the bootstrap is traditionally approximated with a randomized algorithm, which can sometimes produce inaccurate confidence intervals. In “Computation of Exact Bootstrap Confidence Intervals: Complexity and Deterministic Algorithms,” Bertsimas and Sturt present a new perspective on the bootstrap method through the lens of counting integer points in a polyhedron. Through this perspective, the authors develop the first computational complexity results and an efficient deterministic approximation algorithm (a fully polynomial-time approximation scheme) for bootstrap confidence intervals, which, unlike traditional methods, has guaranteed bounds on its error. In experiments on real and synthetic data sets from clinical trials, the proposed deterministic algorithms quickly produce reliable confidence intervals that are significantly more accurate than those obtained from randomization.
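For context, this is the traditional randomized (percentile) approximation that the article's deterministic algorithms replace; the function name is illustrative.

```python
import random

def percentile_bootstrap_ci(data, stat, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo percentile bootstrap: resample with replacement, recompute
    the statistic, and take empirical quantiles of the replicates. The article
    replaces this randomized step with an exact, deterministic computation."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(stat([rng.choice(data) for _ in range(n)])
                   for _ in range(reps))
    lo = stats[int((alpha / 2) * reps)]
    hi = stats[int((1 - alpha / 2) * reps) - 1]
    return lo, hi
```

Because the replicates are random, two runs with different seeds generally return slightly different intervals, which is precisely the source of inaccuracy the exact approach removes.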


2019 ◽  
Vol 44 (4) ◽  
pp. 431-447 ◽  
Author(s):  
Scott Monroe

In item response theory (IRT) modeling, the Fisher information matrix is used for numerous inferential procedures, such as estimating parameter standard errors, constructing test statistics, and facilitating test scoring. In principle, these procedures may be carried out using either the expected information or the observed information. However, in practice, the expected information is not typically used, as it often requires a large amount of computation. In the present research, two methods to approximate the expected information by Monte Carlo are proposed. The first method is suitable for less complex IRT models, such as unidimensional models. The second method is generally applicable but is designed for use with more complex models, such as high-dimensional IRT models. The proposed methods are compared to existing methods using real data sets and a simulation study. The comparisons are based on simple-structure multidimensional IRT models with two-parameter logistic item models.
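One simple instance of the general idea (not the article's estimators, which target full multidimensional models): a Monte Carlo approximation of the expected Fisher information for a single two-parameter logistic (2PL) item, integrating over a standard-normal latent trait. For the 2PL with logit eta = a*(theta - b), the per-theta information for (a, b) is P(1-P) times the outer product of the gradient g = (theta - b, -a), a standard identity for Bernoulli likelihoods.

```python
import math
import random

def mc_expected_information(a, b, draws=20000, seed=0):
    """Monte Carlo approximation of the 2x2 expected information matrix of a
    2PL item (discrimination a, difficulty b), averaging the per-theta
    information P(1-P) * g g^T over theta ~ N(0, 1)."""
    rng = random.Random(seed)
    info = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(draws):
        theta = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # item response probability
        w = p * (1.0 - p)                             # Bernoulli variance weight
        g = (theta - b, -a)                           # gradient of eta w.r.t. (a, b)
        for i in range(2):
            for j in range(2):
                info[i][j] += w * g[i] * g[j] / draws
    return info
```

The result is symmetric and positive on the diagonal; accuracy improves at the usual Monte Carlo rate as `draws` grows.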


2001 ◽  
Vol 40 (03) ◽  
pp. 259-264 ◽  
Author(s):  
S. Itoh ◽  
T. Ishigaki ◽  
K. Yamauchi ◽  
M. Ikeda

Abstract: We investigated the application of resampling techniques to the statistical analysis of the Brier score (B) and extended them to the statistical comparison of two Brier scores derived from the same set of patients. The resampling techniques are helpful in the statistical analysis of B, and there are almost no differences between the jackknife method and the bootstrap method in this analysis. Thus, we believe that B should be used more often as an index for evaluating probabilistic judgments in cases in which the data sets for the assessment are “degenerate,” as are “receiver operating characteristic data sets.”
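A minimal sketch of the paired comparison described here: the Brier score for each forecaster, and a bootstrap interval for their difference that resamples patients so the pairing between the two forecasts is preserved. Function names are illustrative.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def bootstrap_diff_ci(probs1, probs2, outcomes, alpha=0.05, reps=2000, seed=0):
    """Percentile bootstrap CI for B1 - B2 on the same patients; resampling
    patient indices keeps the two forecasts paired within each replicate."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        d = (brier_score([probs1[i] for i in idx], [outcomes[i] for i in idx])
             - brier_score([probs2[i] for i in idx], [outcomes[i] for i in idx]))
        diffs.append(d)
    diffs.sort()
    return diffs[int(alpha / 2 * reps)], diffs[int((1 - alpha / 2) * reps) - 1]
```

An interval excluding zero would indicate a genuine difference in probabilistic accuracy between the two sets of judgments.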


Author(s):  
Wassila Nissas ◽  
Soufiane Gasmi

In the reliability literature, maintenance efficiency is usually treated as a fixed value. Since repairable systems are subject to different degrees and types of repair, it is more appropriate to model maintenance efficiency as a random variable. This paper is devoted to the statistical study of a general hybrid model for repairable systems working under imperfect maintenance. For both failure improvement and virtual age reduction of the system, maintenance efficiency is assumed to be random, with an exponential probability density function. The likelihood function of this model is provided, and the model parameters are estimated by the maximum likelihood procedure. The results are tested and applied to simulated and real data sets. Confidence intervals are constructed using the bias-corrected and accelerated bootstrap method.
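The bias-corrected and accelerated (BCa) bootstrap mentioned here can be sketched generically as follows; this is the standard textbook construction applied to an arbitrary statistic, not the paper's specific likelihood-based estimator.

```python
import random
from statistics import NormalDist

def bca_ci(data, stat, alpha=0.05, reps=2000, seed=0):
    """Bias-corrected and accelerated (BCa) bootstrap confidence interval for
    stat(data): the percentile levels are shifted by a bias correction z0 and
    an acceleration term a estimated from the jackknife."""
    rng = random.Random(seed)
    nd = NormalDist()
    n = len(data)
    theta = stat(data)
    boots = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(reps))
    # Bias correction from the fraction of replicates below the point estimate.
    frac = min(reps - 1, max(1, sum(b < theta for b in boots))) / reps
    z0 = nd.inv_cdf(frac)
    # Acceleration from the jackknife skewness of the statistic.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jm = sum(jack) / n
    num = sum((jm - j) ** 3 for j in jack)
    den = 6.0 * sum((jm - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0
    def endpoint(z):
        adj = nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))
        return boots[min(reps - 1, max(0, int(adj * reps)))]
    return endpoint(nd.inv_cdf(alpha / 2)), endpoint(nd.inv_cdf(1 - alpha / 2))
```

When the bootstrap distribution is unbiased and symmetric, z0 and a are near zero and BCa reduces to the plain percentile interval.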


2017 ◽  
Author(s):  
Jan Graffelman ◽  
Bruce Weir

Statistical tests for Hardy–Weinberg equilibrium are important elementary tools in genetic data analysis. X-chromosomal variants have long been tested by applying autosomal test procedures to females only, and gender is usually not considered when testing autosomal variants for equilibrium. Recently, we proposed specific X-chromosomal exact test procedures for bi-allelic variants that include the hemizygous males, as well as autosomal tests that take gender into account. In this paper we extend that previous work to variants with multiple alleles. A full enumeration algorithm is used for the exact calculations for tri-allelic variants; for variants with many alternate alleles we use a permutation test. Some empirical examples with data from the 1000 Genomes Project are discussed.
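The permutation idea for many-allele variants can be sketched as follows (a simplified autosomal version with heterozygosity as the test statistic and a simple two-sided criterion; the authors' procedures and statistics differ in detail).

```python
import random

def hwe_permutation_test(genotypes, reps=1000, seed=0):
    """Permutation test for HWE at a multi-allelic autosomal variant: pool all
    observed alleles, re-pair them at random (which enforces HWE by
    construction), and compare the observed heterozygosity with its
    permutation distribution."""
    rng = random.Random(seed)
    alleles = [a for g in genotypes for a in g]

    def het(pairs):
        # Fraction of heterozygous genotypes.
        return sum(a != b for a, b in pairs) / len(pairs)

    obs = het(genotypes)
    perms = []
    for _ in range(reps):
        rng.shuffle(alleles)
        perms.append(het(list(zip(alleles[::2], alleles[1::2]))))
    center = sum(perms) / reps
    extreme = sum(abs(p - center) >= abs(obs - center) for p in perms)
    return (extreme + 1) / (reps + 1)  # add-one correction avoids p = 0
```

Unlike full enumeration, the cost is controlled by `reps` rather than by the number of alleles, which is why permutation becomes attractive for many alternate alleles.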


2019 ◽  
Author(s):  
Xinzhu Wei ◽  
Rasmus Nielsen

Abstract: Previous analyses of the UK Biobank (UKB) genotyping array data at the CCR5-Δ32 locus show evidence for deviations from Hardy–Weinberg Equilibrium (HWE) and an increased mortality rate of homozygous individuals, consistent with a recessive deleterious effect of the deletion mutation. We here examine whether similar deviations from HWE can be observed in the newly released UKB Whole Exome Sequencing (WES) data and in the sequencing data of the Genome Aggregation Database (gnomAD). We also examine the reliability of the genotype calls in the UKB array data. The UKB genotyping array probe targeting CCR5-Δ32 (rs62625034) and the WES calls of Δ32 are strongly correlated (r2 = 0.97). This contrasts with tag SNPs of CCR5-Δ32 in the UKB, which have high missing-data and imputation error rates. We also show that, while different data sets are subject to different biases, both the UKB-WES and the gnomAD data have a deficiency of homozygous CCR5-Δ32 individuals compared to the HWE expectation (combined P-value < 0.01), consistent with an increased mortality rate in homozygotes. Finally, we perform a survival analysis on data from parents of UKB volunteers that, while underpowered, is also consistent with the original report of a deleterious effect of CCR5-Δ32 in the homozygous state.
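The "deficiency of homozygotes relative to HWE" comparison can be illustrated with a simple one-sided exact binomial calculation. This is a sketch, not the authors' analysis: here the allele frequency q is supplied by the caller as a fixed hypothetical value, whereas the study estimates it from the data (which changes the null distribution).

```python
import math

def homozygote_deficiency_pvalue(n, n_hom, q):
    """One-sided exact binomial P-value for observing at most n_hom homozygotes
    among n individuals when HWE with allele frequency q predicts a homozygote
    probability of q**2 per individual."""
    p = q * q  # HWE homozygote probability
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n_hom + 1))
```

A small P-value indicates fewer homozygotes than HWE predicts, the direction consistent with increased mortality of the homozygous genotype.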


Genetics ◽  
2021 ◽  
Author(s):  
Alan M Kwong ◽  
Thomas W Blackwell ◽  
Jonathon LeFaive ◽  
Mariza de Andrade ◽  
John Barnard ◽  
...  

Abstract: Traditional Hardy–Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and we evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
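For reference, the traditional χ2 HWE test that RUTH generalizes looks like this for a biallelic variant (a textbook sketch; RUTH's actual tests additionally adjust for structure and genotype uncertainty):

```python
def hwe_chi2(n_aa, n_ab, n_bb):
    """One-degree-of-freedom chi-square HWE statistic for a biallelic variant:
    compare observed genotype counts with the counts expected from the
    estimated allele frequency under random mating."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

In a structured sample this statistic is inflated even for perfectly called genotypes (the Wahlund effect), which is exactly the failure mode the abstract describes.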

