Item Parameter Estimation Via Marginal Maximum Likelihood and an EM Algorithm: A Didactic

1988 ◽  
Vol 13 (3) ◽  
pp. 243-271 ◽  
Author(s):  
Michael R. Harwell ◽  
Frank B. Baker ◽  
Michael Zwarts

The Bock and Aitkin (1981) Marginal Maximum Likelihood/EM approach to item parameter estimation is an alternative to the classical joint maximum likelihood procedure of item response theory. Unfortunately, the complexity of the underlying mathematics and the terse nature of the existing literature have made the approach difficult to understand. To make the approach accessible to a wider audience, the present didactic paper provides the essential mathematical details of a marginal maximum likelihood/EM solution and shows how it can be used to obtain consistent item parameter estimates. For pedagogical purposes, a short BASIC computer program is used to illustrate the underlying simplicity of the method.
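The marginal maximum likelihood/EM idea the abstract describes can be sketched compactly. The snippet below is an illustrative minimal implementation for the Rasch model, not the paper's BASIC program: the trait is marginalized over a fixed normal quadrature, the E-step computes posterior node weights per examinee, and the M-step takes a Newton step on each item difficulty. All names and values are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Rasch responses (illustrative truth values, not from the paper)
n_persons, n_items = 2000, 5
true_b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])          # item difficulties
theta = rng.standard_normal(n_persons)
prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - true_b)))
X = (rng.random((n_persons, n_items)) < prob).astype(float)

# Fixed quadrature approximation to the standard normal trait distribution
nodes = np.linspace(-4.0, 4.0, 21)
weights = np.exp(-0.5 * nodes**2)
weights /= weights.sum()

b = np.zeros(n_items)                                   # starting values
for _ in range(100):
    # E-step: posterior weight of each quadrature node for each examinee
    p = 1.0 / (1.0 + np.exp(-(nodes[:, None] - b)))     # (Q, I)
    loglik = X @ np.log(p).T + (1.0 - X) @ np.log(1.0 - p).T  # (N, Q)
    post = np.exp(loglik) * weights
    post /= post.sum(axis=1, keepdims=True)
    n_q = post.sum(axis=0)                              # expected examinees per node
    r_qi = post.T @ X                                   # expected correct per node/item
    # M-step: Newton step per item; raise b_i when expected correct
    # exceeds observed correct (p falls as b rises)
    resid = (n_q[:, None] * p - r_qi).sum(axis=0)
    info = (n_q[:, None] * p * (1.0 - p)).sum(axis=0)
    b += resid / info
```

With 2,000 simulated examinees the recovered difficulties land close to the generating values, which is the "underlying simplicity" the didactic paper demonstrates.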

2021 ◽  
Author(s):  
Jan Steinfeld ◽  
Alexander Robitzsch

This article describes conditional maximum likelihood-based item parameter estimation in probabilistic multistage designs. In probabilistic multistage designs, routing is based not solely on a raw score j, a cut score c, and a routing rule such as j < c or j ≤ c, but on a probability p(j) attached to each raw score j. It can be shown that using the conventional conditional maximum likelihood estimator in multistage designs leads to severely biased item parameter estimates. Zwitser and Maris (2013) showed that with deterministic routing, integrating the design into the item parameter estimation yields unbiased estimates. This article extends that approach to probabilistic routing and, at the same time, represents a generalization of it. A simulation study shows that item parameter estimation in probabilistic designs yields unbiased item parameter estimates.
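The distinction between deterministic and probabilistic routing can be made concrete with a small sketch. The routing table below is an illustrative assumption, not taken from the article; deterministic routing is recovered as the special case where every p(j) is 0 or 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def route_deterministic(j, c):
    """Classical MST routing: a hard cut score c on the routing-module raw score j."""
    return "hard" if j >= c else "easy"

# Probabilistic routing: p_hard[j] = P(route to the hard module | raw score j).
# The values here are made up for illustration.
p_hard = {0: 0.0, 1: 0.1, 2: 0.3, 3: 0.7, 4: 0.9, 5: 1.0}

def route_probabilistic(j, rng):
    """Route to the hard module with probability p_hard[j]."""
    return "hard" if rng.random() < p_hard[j] else "easy"
```

Because the module assignment is stochastic given j, examinees with the same raw score can end up in different modules, which is what the generalized estimator has to account for.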


Psych ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 279-307 ◽  
Author(s):  
Jan Steinfeld ◽  
Alexander Robitzsch

There is some debate in the psychometric literature about item parameter estimation in multistage designs. It is occasionally argued that the conditional maximum likelihood (CML) method is superior to the marginal maximum likelihood (MML) method because no assumptions have to be made about the trait distribution. However, CML estimation in its original formulation leads to biased item parameter estimates. Zwitser and Maris (2015, Psychometrika) proposed a modified conditional maximum likelihood estimation method for multistage designs that provides practically unbiased item parameter estimates. In this article, the differences between estimation approaches for multistage designs were investigated in a simulation study. Four estimation conditions (CML; CML with the respective MST design taken into account; MML assuming a normal distribution; and MML with log-linear smoothing) were examined across different multistage designs, numbers of items, sample sizes, and trait distributions. The results showed that under substantial violation of the normal distribution, the CML method was preferable to MML estimation employing a misspecified normal trait distribution, especially as the number of items and the sample size increased. However, MML estimation using log-linear smoothing led to results that were very similar to those of the CML method that takes the respective MST design into account.
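The CML approach compared here removes the trait distribution by conditioning on the raw score; under the Rasch model the conditional pattern probabilities involve elementary symmetric functions of the item easiness parameters ε_i = exp(−b_i). A minimal sketch of that machinery (the function names are illustrative):

```python
import numpy as np
from itertools import product

def esf(eps):
    """Elementary symmetric functions gamma_0..gamma_I of the easiness
    parameters eps_i = exp(-b_i), computed by the summation algorithm."""
    gamma = np.zeros(len(eps) + 1)
    gamma[0] = 1.0
    for e in eps:
        # adding one item: gamma_r <- gamma_r + e * gamma_{r-1}
        gamma[1:] = gamma[1:] + e * gamma[:-1]
    return gamma

def cml_pattern_prob(x, eps):
    """P(response pattern x | raw score sum(x)) under the Rasch model;
    note the trait parameter has dropped out entirely."""
    gamma = esf(eps)
    r = int(sum(x))
    return float(np.prod(np.asarray(eps, dtype=float) ** np.asarray(x)) / gamma[r])
```

For any fixed raw score, these conditional probabilities sum to one over the compatible patterns, which is what makes CML estimation free of trait-distribution assumptions.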


2018 ◽  
Vol 43 (1) ◽  
pp. 18-33 ◽  
Author(s):  
Seang-Hwane Joo ◽  
Seokjoon Chun ◽  
Stephen Stark ◽  
Oleksandr S. Chernyshenko

Over the last decade, researchers have come to recognize the benefits of ideal point item response theory (IRT) models for noncognitive measurement. Although most applied studies have utilized the Generalized Graded Unfolding Model (GGUM), many others have been developed. Most notably, David Andrich and colleagues published a series of papers comparing dominance and ideal point measurement perspectives, and they proposed ideal point models for dichotomous and polytomous single-stimulus responses, known as the Hyperbolic Cosine Model (HCM) and the General Hyperbolic Cosine Model (GHCM), respectively. These models have item response functions resembling the GGUM and its more constrained forms, but they are mathematically simpler. Despite the apparent impact of Andrich’s work on ensuing investigations, the HCM and GHCM have been largely overlooked by applied researchers. This may stem from questions about the compatibility of the parameter metric with other ideal point estimation and model-data fit software or seemingly unrealistic parameter estimates sometimes produced by the original joint maximum likelihood (JML) estimation software. Given the growing list of ideal point applications and variations in sample and scale characteristics, the authors believe these HCMs warrant renewed consideration. To address this need and overcome potential JML estimation difficulties, this study developed a marginal maximum likelihood (MML) estimation algorithm for the GHCM and explored parameter estimation requirements in a Monte Carlo study manipulating sample size, scale length, and data types. The authors found a sample size of 400 was adequate for parameter estimation and, in accordance with GGUM studies, estimation was superior in polytomous conditions.
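The HCM's mathematical simplicity is easy to see from its response function. In the commonly cited parameterization (treat the details below as an assumption of this sketch), the probability of endorsing an item is exp(λ)/(exp(λ) + 2·cosh(θ − δ)), which is single-peaked at θ = δ, with λ acting as a latitude-of-acceptance ("unit") parameter:

```python
import math

def hcm_prob(theta, delta, lam):
    """Hyperbolic Cosine Model response function: the endorsement
    probability peaks when the person location theta equals the item
    location delta and falls off symmetrically on either side."""
    return math.exp(lam) / (math.exp(lam) + 2.0 * math.cosh(theta - delta))
```

The symmetry about δ and the single peak are exactly the ideal point (unfolding) behavior that distinguishes these models from dominance IRT models.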


2019 ◽  
Vol 44 (3) ◽  
pp. 309-341 ◽  
Author(s):  
Jeffrey M. Patton ◽  
Ying Cheng ◽  
Maxwell Hong ◽  
Qi Diao

In psychological and survey research, the prevalence and serious consequences of careless responses from unmotivated participants are well known. In this study, we propose to iteratively detect careless responders and cleanse the data by removing their responses. The careless responders are detected using person-fit statistics. In two simulation studies, the iterative procedure leads to nearly perfect power in detecting extremely careless responders and much higher power than the noniterative procedure in detecting moderately careless responders. Meanwhile, the false-positive error rate is close to the nominal level. In addition, item parameter estimation is much improved by iteratively cleansing the calibration sample. The bias in item discrimination and location parameter estimates is substantially reduced. The standard error estimates, which are spuriously small in the presence of careless responses, are corrected by the iterative cleansing procedure. An empirical example is also presented to illustrate the proposed procedure. These results suggest that the proposed procedure is a promising way to improve item parameter estimation for tests of 20 items or longer when data are contaminated by careless responses.
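The iterative detect-and-cleanse loop can be sketched with a standard person-fit statistic. The snippet below uses the lz statistic (standardized log-likelihood) as the detector; the flagging criterion and the skeleton of the loop are illustrative assumptions, and the re-calibration step that the full procedure performs between rounds is only indicated by a comment.

```python
import numpy as np

def lz(x, p):
    """Standardized log-likelihood person-fit statistic; large negative
    values indicate aberrant (e.g., careless) response patterns."""
    logit = np.log(p / (1.0 - p))
    l0 = np.sum(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))
    e = np.sum(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    v = np.sum(p * (1.0 - p) * logit**2)
    return (l0 - e) / np.sqrt(v)

def iterative_cleanse(X, P, crit=-1.645, max_rounds=10):
    """Iteratively flag and drop misfitting respondents.
    X: 0/1 response matrix; P: model probabilities of a correct response."""
    keep = np.ones(X.shape[0], dtype=bool)
    for _ in range(max_rounds):
        idx = np.where(keep)[0]
        flagged = np.array([lz(X[i], P[i]) for i in idx]) < crit
        if not flagged.any():
            break
        keep[idx[flagged]] = False
        # In the full procedure, item parameters (and hence P) are
        # re-estimated on the retained sample before the next round.
    return keep
```

Iterating matters because careless responders distort the item parameter estimates used to compute the fit statistics; removing the worst offenders sharpens detection of the moderate ones in later rounds.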


2021 ◽  
pp. 001316442110036
Author(s):  
Joseph A. Rios

The presence of rapid guessing (RG) presents a challenge to practitioners in obtaining accurate estimates of measurement properties and examinee ability. In response to this concern, researchers have utilized response times as a proxy for RG and have attempted to improve parameter estimation accuracy by filtering RG responses using popular scoring approaches, such as the effort-moderated item response theory (EM-IRT) model. However, such an approach assumes that RG can be correctly identified based on an indirect proxy of examinee behavior. A failure to meet this assumption leads to the inclusion of distortive and psychometrically uninformative information in parameter estimates. To address this issue, a simulation study was conducted to examine how violations of the assumption of correct RG classification influence EM-IRT item and ability parameter estimation accuracy, and to compare these results with parameter estimates from the three-parameter logistic (3PL) model, which includes RG responses in scoring. Two RG misclassification factors were manipulated: type (underclassification vs. overclassification) and rate (10%, 30%, and 50%). Results indicated that the EM-IRT model provided improved item parameter estimation over the 3PL model regardless of misclassification type and rate. Furthermore, under most conditions, increased rates of RG underclassification were associated with the greatest bias in ability parameter estimates from the EM-IRT model. In spite of this, the EM-IRT model with RG misclassifications demonstrated more accurate ability parameter estimation than the 3PL model when the mean ability of RG subgroups did not differ. This suggests that in certain situations it may be better for practitioners to (a) imperfectly identify RG than to ignore the presence of such invalid responses and (b) select liberal over conservative response time thresholds to mitigate bias from underclassified RG.
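The scoring idea behind the EM-IRT model can be sketched as a modified likelihood: responses classified as solution behavior follow the 3PL, while responses flagged as rapid guesses are treated as chance-level and therefore contribute no information about ability. The sketch below drops flagged responses from the log-likelihood sum, which is equivalent for ability estimation; parameter names are illustrative assumptions.

```python
import numpy as np

def em_irt_loglik(theta, x, a, b, c, rg_flag):
    """Effort-moderated log-likelihood sketch: unflagged responses follow
    the 3PL with discrimination a, difficulty b, and guessing c; responses
    flagged as rapid guesses (rg_flag True) are treated as uninformative
    about theta and are excluded from the sum."""
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
    ll = x * np.log(p) + (1.0 - x) * np.log(1.0 - p)
    return float(np.sum(np.where(rg_flag, 0.0, ll)))
```

The misclassification question the article studies is visible here: an underclassified rapid guess keeps a distortive term in the sum, while an overclassified effortful response merely discards valid information.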

