Three New Methods for Analysis of Answer Changes

Educational and Psychological Measurement ◽

10.1177/0013164416632287 ◽

2016 ◽

Vol 77 (1) ◽

pp. 54-81 ◽

Cited By ~ 6

Author(s):

Sandip Sinharay ◽

Matthew S. Johnson

Keyword(s):

Normal Distribution ◽

Null Hypothesis ◽

Type I Error ◽

Error Rates ◽

Standard Normal Distribution ◽

Type I ◽

Data Set ◽

Continuity Correction ◽

Type I Error Rates ◽

Standard Normal

In a pioneering research article, Wollack and colleagues suggested the “erasure detection index” (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction, the EDI often has inflated Type I error rates. When used with a continuity correction, the EDI has satisfactory Type I error rates, but smaller power compared with the EDI without a continuity correction. This article suggests three methods for detecting test tampering that do not rely on the assumption of a standard normal distribution under the null hypothesis. It is demonstrated in a detailed simulation study that the performance of each suggested method is slightly better than that of the EDI. The EDI and the suggested methods were applied to a real data set. The suggested methods, although more computationally intensive than the EDI, seem to be promising in detecting test tampering.
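The effect of a continuity correction on Type I error can be illustrated with a small sketch. The abstract does not give the EDI formula, so the code below assumes a generic standardized count statistic of the form (X - E[X] - c) / SD[X], where X is a sum of independent Bernoulli outcomes and c is 0 (no correction) or 0.5 (with correction). The function names `count_pmf` and `type1_rate` and the item probabilities are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
from math import sqrt
from statistics import NormalDist


def count_pmf(probs):
    """Exact distribution of a sum of independent Bernoulli(p_j) variables."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)      # item scored 0
            nxt[k + 1] += mass * p        # item scored 1
        pmf = nxt
    return pmf


def type1_rate(probs, correction=0.0, alpha=0.05):
    """Exact one-sided rejection rate under the null when the standardized
    count is compared with a standard normal critical value."""
    mean = sum(probs)
    sd = sqrt(sum(p * (1 - p) for p in probs))
    crit = NormalDist().inv_cdf(1 - alpha)
    return sum(
        mass
        for x, mass in enumerate(count_pmf(probs))
        if (x - mean - correction) / sd > crit
    )


# Assumed example: six items, each with null success probability 0.2.
probs = [0.2] * 6
rate_plain = type1_rate(probs)                 # no continuity correction
rate_cc = type1_rate(probs, correction=0.5)    # with continuity correction
print(f"without correction: {rate_plain:.4f}")
print(f"with correction:    {rate_cc:.4f}")
```

In this assumed setting the uncorrected statistic rejects more often than the nominal 0.05 level and the corrected one rejects less often, which matches the pattern the abstract describes: the correction restores Type I error control at some cost in power.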

Considering Both Statistical and Clinical Significance

International Journal of Statistics and Probability ◽

10.5539/ijsp.v5n5p16 ◽

2016 ◽

Vol 5 (5) ◽

pp. 16 ◽

Cited By ~ 1

Author(s):

Guolong Zhao

Keyword(s):

Clinical Significance ◽

Null Hypothesis ◽

Type I Error ◽

Statistical Significance ◽

Error Rates ◽

Type I ◽

Data Set ◽

Type I Error Rates ◽

Empirical Coverage ◽

Superiority Trials

To evaluate a drug, statistical significance alone is insufficient; clinical significance is also necessary. This paper explains how to analyze clinical data while considering both statistical and clinical significance. The analysis combines a confidence interval under the null hypothesis with one under a non-null hypothesis. The combination yields one of four possible results: (i) both significant, (ii) significant only in the former, (iii) significant only in the latter, or (iv) neither significant. The four results constitute a quadripartite procedure. Corresponding tests are described to characterize Type I error rates and power. The empirical coverage is shown by Monte Carlo simulations. In superiority trials, the four results are interpreted as clinical superiority, statistical superiority, non-superiority, and indeterminate, respectively. The interpretation is reversed in inferiority trials. The combination leads to a deflated Type I error rate, decreased power, and an increased sample size. The four results may be helpful for a meticulous evaluation of drugs. Of these, non-superiority is another profile of equivalence, so it can also be used to interpret equivalence. This approach may make it easier to interpret discordant cases. Nevertheless, a larger data set is usually needed. An example is taken from a real trial in naturally acquired influenza.

Detecting Fraudulent Erasures at an Aggregate Level

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998617739626 ◽

2017 ◽

Vol 43 (3) ◽

pp. 286-315 ◽

Cited By ~ 1

Author(s):

Sandip Sinharay

Keyword(s):

Type I Error ◽

Null Distribution ◽

Real Data ◽

Error Rates ◽

Standard Normal Distribution ◽

Type I ◽

Group Level ◽

Nominal Level ◽

Type I Error Rates ◽

Standard Normal

Wollack, Cohen, and Eckerly suggested the “erasure detection index” (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for the group level. The asymptotic null distribution of the two modified indices is proved to be the standard normal distribution. In a simulation study, the modified indices are shown to have Type I error rates close to the nominal level and larger power than the index of Wollack and Eckerly. A real data example is also included.

Analysis of type I and II error rates of Bayesian and frequentist parametric and nonparametric two-sample hypothesis tests under preliminary assessment of normality

Computational Statistics ◽

10.1007/s00180-020-01034-7 ◽

2020 ◽

Author(s):

Riko Kelter

Keyword(s):

Null Hypothesis ◽

Error Control ◽

Type I Error ◽

Error Rates ◽

Type I ◽

Type II ◽

Type II Error ◽

Preliminary Assessment ◽

Type I Error Rates ◽

Practical Research

Testing for differences between two groups is among the most frequently carried out statistical methods in empirical research. The traditional frequentist approach is to make use of null hypothesis significance tests which use p values to reject a null hypothesis. Recently, a lot of research has emerged which proposes Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests. These proposals include Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the underlying assumptions, models and their implications for practical research of recently proposed Bayesian two-sample tests are explored and contrasted with the frequentist solutions. An extensive simulation study is provided, the results of which demonstrate that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important, because balancing the type I and II errors is a crucial goal in a variety of research, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. What is more, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.
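A simulation study of Type I error rates, like the one this abstract mentions, can be sketched in a few lines. The code below covers only the frequentist side and is not the paper's design: it assumes a Mann–Whitney U test with the standard large-sample normal approximation (no tie or continuity correction), two groups of 20 standard normal observations, and 2000 replications. The function names `mann_whitney_p` and `estimate_type1_rate` and all parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def mann_whitney_p(x, y):
    """Two-sided p value for the Mann-Whitney U test using the normal
    approximation; assumes continuous data, so no tie handling."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(
        rank for rank, (_, group) in enumerate(pooled, start=1) if group == 0
    )
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


def estimate_type1_rate(n=20, reps=2000, alpha=0.05, seed=1):
    """Share of replications that reject when both groups are drawn from
    the same distribution, i.e. when the null hypothesis is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        if mann_whitney_p(x, y) < alpha:
            rejections += 1
    return rejections / reps


print(f"estimated Type I error rate: {estimate_type1_rate():.3f}")
```

An estimate close to the nominal level of 0.05 suggests adequate Type I error control in this setting. Estimating Type II error works the same way, except that the two groups are drawn from distributions that differ by a chosen effect size and the non-rejections are counted.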

Testing Equivalence with Repeated Measures: Tests of the Difference Model of Two-Alternative Forced-Choice Performance

The Spanish Journal of Psychology ◽

10.5209/rev_sjop.2011.v14.n2.48 ◽

2011 ◽

Vol 14 (2) ◽

pp. 1023-1049 ◽

Cited By ~ 2

Author(s):

Miguel A. García-Pérez ◽

Rocío Alcalá-Quintana

Keyword(s):

Regression Analysis ◽

Repeated Measures ◽

Null Hypothesis ◽

Type I Error ◽

Error Rates ◽

Published Data ◽

Type I ◽

Type I Error Rates ◽

Equality Of Means ◽

Unit Slope

Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for zero intercept and unit slope (or simply for unit slope when the regression is forced through the origin). This paper shows that this approach yields highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and on subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment in which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradicts the conclusions drawn through inadequate application of regression analyses.

Type I error rates and power of several versions of scaled chi-square difference tests in investigations of measurement invariance.

Psychological Methods ◽

10.1037/met0000097 ◽

2017 ◽

Vol 22 (3) ◽

pp. 467-485 ◽

Cited By ~ 4

Author(s):

Jordan Campbell Brace ◽

Victoria Savalei

Keyword(s):

Measurement Invariance ◽

Type I Error ◽

Error Rates ◽

Type I ◽

Chi Square ◽

Type I Error Rates

Type I Error Rates, Coverage of Confidence Intervals, and Variance Estimation in Propensity-Score Matched Analyses

The International Journal of Biostatistics ◽

10.2202/1557-4679.1146 ◽

2009 ◽

Vol 5 (1) ◽

Cited By ~ 65

Author(s):

Peter C Austin

Keyword(s):

Propensity Score ◽

Confidence Intervals ◽

Variance Estimation ◽

Type I Error ◽

Error Rates ◽

Type I ◽

Type I Error Rates

Type I Error Rates for Parscale’s Fit Index

Educational and Psychological Measurement ◽

10.1177/0013164404264849 ◽

2005 ◽

Vol 65 (1) ◽

pp. 42-50 ◽

Cited By ~ 17

Author(s):

Christine E. Demars

Keyword(s):

Type I Error ◽

Error Rates ◽

Type I ◽

Type I Error Rates ◽

Fit Index

Control of Type I Error Rates in Bayesian Sequential Designs

Bayesian Analysis ◽

10.1214/18-ba1109 ◽

2019 ◽

Vol 14 (2) ◽

pp. 399-425 ◽

Cited By ~ 4

Author(s):

Haolun Shi ◽

Guosheng Yin

Keyword(s):

Type I Error ◽

Error Rates ◽

Type I ◽

Sequential Designs ◽

Type I Error Rates

Sisvar: a Guide for its Bootstrap procedures in multiple comparisons

Ciência e Agrotecnologia ◽

10.1590/s1413-70542014000200001 ◽

2014 ◽

Vol 38 (2) ◽

pp. 109-112 ◽

Cited By ~ 299

Author(s):

Daniel Furtado Ferreira

Keyword(s):

Scientific Community ◽

Type I Error ◽

Multiple Comparisons ◽

Error Rates ◽

Type I ◽

Type I Error Rates ◽

Statistical Analysis System ◽

Scientific Results ◽

Analysis System ◽

Multiple Comparison Procedures

Sisvar is a statistical analysis system widely used by the scientific community to produce statistical analyses, scientific results, and conclusions. Its statistical procedures are widely used because they are accurate, precise, simple, and robust. Among its many analysis options, Sisvar offers one that is not widely used: multiple comparison procedures based on bootstrap approaches. This paper reviews this subject and shows some advantages of using Sisvar to perform such analyses when comparing treatment means. Tests such as Dunnett, Tukey, Student-Newman-Keuls, and Scott-Knott can alternatively be performed by bootstrap methods, which show greater power and better control of experimentwise Type I error rates under non-normal, asymmetric, platykurtic, or leptokurtic distributions.

Bootstrap Type I error rates for the correlation coefficient: An examination of alternate procedures.

Psychological Bulletin ◽

10.1037/0033-2909.104.2.290 ◽

1988 ◽

Vol 104 (2) ◽

pp. 290-292 ◽

Cited By ~ 21

Author(s):

Michael J Strube

Keyword(s):

Correlation Coefficient ◽

Type I Error ◽

Error Rates ◽

Type I ◽

Type I Error Rates
