scholarly journals Three New Methods for Analysis of Answer Changes

2016 ◽  
Vol 77 (1) ◽  
pp. 54-81 ◽  
Author(s):  
Sandip Sinharay ◽  
Matthew S. Johnson

In a pioneering research article, Wollack and colleagues suggested the “erasure detection index” (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction, the EDI often has inflated Type I error rates. When used with a continuity correction, the EDI has satisfactory Type I error rates, but smaller power compared with the EDI without a continuity correction. This article suggests three methods for detecting test tampering that do not rely on the assumption of a standard normal distribution under the null hypothesis. It is demonstrated in a detailed simulation study that the performance of each suggested method is slightly better than that of the EDI. The EDI and the suggested methods were applied to a real data set. The suggested methods, although more computation intensive than the EDI, seem to be promising in detecting test tampering.

2016 ◽  
Vol 5 (5) ◽  
pp. 16 ◽  
Author(s):  
Guolong Zhao

To evaluate a drug, statistical significance alone is insufficient and clinical significance is also necessary. This paper explains how to analyze clinical data with considering both statistical and clinical significance. The analysis is practiced by combining a confidence interval under null hypothesis with that under non-null hypothesis. The combination conveys one of the four possible results: (i) both significant, (ii) only significant in the former, (iii) only significant in the latter or (iv) neither significant. The four results constitute a quadripartite procedure. Corresponding tests are mentioned for describing Type I error rates and power. The empirical coverage is exhibited by Monte Carlo simulations. In superiority trials, the four results are interpreted as clinical superiority, statistical superiority, non-superiority and indeterminate respectively. The interpretation is opposite in inferiority trials. The combination poses a deflated Type I error rate, a decreased power and an increased sample size. The four results may helpful for a meticulous evaluation of drugs. Of these, non-superiority is another profile of equivalence and so it can also be used to interpret equivalence. This approach may prepare a convenience for interpreting discordant cases. Nevertheless, a larger data set is usually needed. An example is taken from a real trial in naturally acquired influenza.


2017 ◽  
Vol 43 (3) ◽  
pp. 286-315 ◽  
Author(s):  
Sandip Sinharay

Wollack, Cohen, and Eckerly suggested the “erasure detection index” (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for the group level. The asymptotic null distribution of the two modified indices is proved to be the standard normal distribution. In a simulation study, the modified indices are shown to have Type I error rates close to the nominal level and larger power than the index of Wollack and Eckerly. A real data example is also included.


Author(s):  
Riko Kelter

Abstract Testing for differences between two groups is among the most frequently carried out statistical methods in empirical research. The traditional frequentist approach is to make use of null hypothesis significance tests which use p values to reject a null hypothesis. Recently, a lot of research has emerged which proposes Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests. These proposals include Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the underlying assumptions, models and their implications for practical research of recently proposed Bayesian two-sample tests are explored and contrasted with the frequentist solutions. An extensive simulation study is provided, the results of which demonstrate that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important, because balancing the type I and II errors is a crucial goal in a variety of research, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. What is more, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.


2011 ◽  
Vol 14 (2) ◽  
pp. 1023-1049 ◽  
Author(s):  
Miguel A. García-Pérez ◽  
Rocío Alcalá-Quintana

Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope in case that regression is forced through the origin). This paper shows that this approach renders highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and in subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment with which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradict the conclusions raised through inadequate application of regression analyses.


2019 ◽  
Vol 14 (2) ◽  
pp. 399-425 ◽  
Author(s):  
Haolun Shi ◽  
Guosheng Yin

2014 ◽  
Vol 38 (2) ◽  
pp. 109-112 ◽  
Author(s):  
Daniel Furtado Ferreira

Sisvar is a statistical analysis system with a large usage by the scientific community to produce statistical analyses and to produce scientific results and conclusions. The large use of the statistical procedures of Sisvar by the scientific community is due to it being accurate, precise, simple and robust. With many options of analysis, Sisvar has a not so largely used analysis that is the multiple comparison procedures using bootstrap approaches. This paper aims to review this subject and to show some advantages of using Sisvar to perform such analysis to compare treatments means. Tests like Dunnett, Tukey, Student-Newman-Keuls and Scott-Knott are performed alternatively by bootstrap methods and show greater power and better controls of experimentwise type I error rates under non-normal, asymmetric, platykurtic or leptokurtic distributions.


Sign in / Sign up

Export Citation Format

Share Document