A review of issues about null hypothesis Bayesian testing

2019, Vol 24 (6), pp. 774-795
Author(s): Jorge N. Tendeiro, Henk A. L. Kiers

2021
Author(s): Jorge Tendeiro, Henk Kiers

In 2019 we published a paper (Tendeiro & Kiers, 2019) in Psychological Methods on null hypothesis Bayesian testing and its workhorse, the Bayes factor. Recently, van Ravenzwaaij and Wagenmakers (2021) offered a response to our piece, also in this journal. Although we welcome their thought-provoking remarks on our paper, we concluded that van Ravenzwaaij and Wagenmakers (2021) contains too many 'issues' of its own to go unanswered. In this paper we both defend the main premises of our original paper and subject the contribution of van Ravenzwaaij and Wagenmakers (2021) to critical appraisal. Our hope is that this exchange between scholars contributes decisively to a better understanding among psychologists of null hypothesis Bayesian testing in general and of the Bayes factor in particular.


2019
Author(s): Don van Ravenzwaaij, Eric-Jan Wagenmakers

Tendeiro and Kiers (2019) provide a detailed and scholarly critique of Null Hypothesis Bayesian Testing (NHBT) and its central component, the Bayes factor, which allows researchers to update knowledge and quantify statistical evidence. Tendeiro and Kiers conclude that NHBT constitutes an improvement over frequentist p-values, but they primarily elaborate on a list of eleven 'issues' with NHBT. In this commentary, we provide context for each issue and conclude that many of them may in fact be regarded as pronounced advantages of NHBT.


2019
Author(s): Jorge Tendeiro, Henk Kiers, Don van Ravenzwaaij

The practice of sequentially testing a null hypothesis as data are collected, until the null hypothesis is rejected, is known as optional stopping. It is well known that optional stopping is problematic in the context of null hypothesis significance testing: the false positive rate quickly exceeds the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p-values are replaced by Bayes factors, is, perhaps surprisingly, much less settled. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to sampling from prior predictive distributions. In this paper we provide formal mathematical derivations of Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, regarded as a random variable across repeated sampling. This paper thereby gives the literature a solid mathematical footing, and we believe it is a valuable contribution toward understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
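The false-positive inflation that makes optional stopping problematic for significance testing is easy to reproduce. The sketch below is illustrative only (it is not Rouder's simulation and all parameter choices are invented): data are generated under the null, a z-test with known σ = 1 is run at several interim looks, and the first "significant" look triggers stopping, so the realized false positive rate climbs well above the nominal 5%.

```python
import random
from math import sqrt

def optional_stopping_fpr(n_sims=2000, looks=(10, 20, 40, 80),
                          z_crit=1.96, seed=1):
    """Fraction of null simulations rejected at ANY interim look.

    Data are drawn from N(0, 1), so H0 is true by construction and
    every rejection is a false positive.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_sims):
        data = []
        for n in looks:
            # extend the sample up to the next planned look
            data.extend(rng.gauss(0.0, 1.0) for _ in range(n - len(data)))
            z = (sum(data) / n) * sqrt(n)      # z-test, known sigma = 1
            if abs(z) > z_crit:                # stop and "reject" H0
                false_positives += 1
                break
    return false_positives / n_sims

fpr = optional_stopping_fpr()
print(fpr)   # well above the nominal single-test level of 0.05
```

Adding more looks inflates the rate further, which is exactly the behavior the abstract contrasts with the (disputed) Bayes-factor case.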


2006, Vol 11 (1), pp. 12-24
Author(s): Alexander von Eye

At the level of manifest categorical variables, a large number of coefficients and models for the examination of rater agreement have been proposed and used. The most popular of these is Cohen's κ. In this article, a new coefficient, κ_s, is proposed as an alternative measure of rater agreement. Both κ and κ_s allow researchers to determine whether agreement in groups of two or more raters is significantly beyond chance. Stouffer's z is used to test the null hypothesis that κ_s = 0. In addition to evaluating rater agreement in a fashion parallel to κ, the coefficient κ_s allows one to (1) examine subsets of cells in agreement tables, (2) examine cells that indicate disagreement, (3) consider alternative chance models, (4) take covariates into account, and (5) compare independent samples. Results from a simulation study are reported, which suggest that (a) the four measures of rater agreement, Cohen's κ, Brennan and Prediger's κ_n, raw agreement, and κ_s, are sensitive to the same data characteristics when evaluating rater agreement, and (b) both the z-statistic for Cohen's κ and Stouffer's z for κ_s are unimodally and symmetrically distributed, but slightly heavy-tailed. Examples use data from verbal processing and applicant selection.
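The two standard ingredients named in this abstract, Cohen's κ and Stouffer's z, can be sketched in a few lines. This is illustrative code, not von Eye's κ_s (which is not specified in the abstract), and the 2x2 agreement table is invented.

```python
from math import sqrt

def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater A, cols: rater B)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n           # observed agreement
    row = [sum(table[i][j] for j in range(k)) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(row[i] * col[i] for i in range(k))           # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

def stouffer_z(zs):
    """Combine m independent z-statistics: z = sum(z_i) / sqrt(m)."""
    return sum(zs) / sqrt(len(zs))

# Invented table: the raters agree on 20 + 15 = 35 of 50 cases.
table = [[20, 5], [10, 15]]
print(round(cohens_kappa(table), 3))          # -> 0.4
print(round(stouffer_z([1.0, 2.0, 1.5]), 3))  # -> 2.598
```

Extensions (2)-(5) in the abstract amount to restricting or reweighting the cells that enter the observed- and chance-agreement sums, which is why the table-based formulation above is a natural starting point.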


1991, Vol 46 (10), pp. 1089-1089
Author(s): John J. Bartko
