simulate_CAT: A Computer Program for Post-Hoc Simulation for Computerized Adaptive Testing

İlker KALENDER

doi:10.21031/epod.15905

A Note on the Relationship of the Shannon Entropy Procedure and the Jensen–Shannon Divergence in Cognitive Diagnostic Computerized Adaptive Testing

SAGE Open ◽

10.1177/2158244019899046 ◽

2020 ◽

Vol 10 (1) ◽

pp. 215824401989904

Author(s):

Wenyi Wang ◽

Lihong Song ◽

Teng Wang ◽

Peng Gao ◽

Jian Xiong

Keyword(s):

Shannon Entropy ◽

Selection Criteria ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Item Selection ◽

Item Parameters ◽

Post Hoc ◽

Relationship Of ◽

Jensen Shannon Divergence ◽

The Relationship

The purpose of this study is to investigate the relationship between the Shannon entropy procedure and the Jensen–Shannon divergence (JSD), both of which are used as item selection criteria in cognitive diagnostic computerized adaptive testing (CD-CAT). Because the JSD itself is defined in terms of the Shannon entropy, we apply the well-known relationship between the JSD and Shannon entropy to establish a relationship between the item selection criteria based on these two measures. To clarify the relationship between the two criteria further, an alternative approach is also provided. Theoretical derivations and empirical examples show that the Shannon entropy procedure and the JSD in CD-CAT are linearly related under cognitive diagnostic models. Consistent with our theoretical conclusions, simulation results show that the two item selection criteria behaved quite similarly in terms of attribute-level and pattern recovery rates under all conditions, and that they selected the same set of items for each examinee from an item bank with item parameters drawn from a uniform distribution U(0.1, 0.3) under post hoc simulations. We provide some suggestions for future studies and a discussion of the relationship between the modified posterior-weighted Kullback–Leibler index and the G-DINA (generalized deterministic inputs, noisy “and” gate) discrimination index.
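The "well-known relationship" the abstract refers to can be checked numerically. The sketch below is only an illustration of the general identity JSD = H(mixture) − Σ wᵢ·H(Pᵢ), not a reproduction of the paper's derivation. The item (success probabilities per attribute pattern) and the posterior weights are hypothetical values chosen for the example, and all function names are assumptions.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); assumes q > 0 wherever p > 0."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def mixture(dists, weights):
    """Weighted mixture of several distributions over the same outcomes."""
    return [sum(w * d[k] for w, d in zip(weights, dists))
            for k in range(len(dists[0]))]

def jsd_from_entropy(dists, weights):
    """JSD written with Shannon entropies: H(mixture) - sum_i w_i * H(P_i)."""
    mix = mixture(dists, weights)
    return shannon_entropy(mix) - sum(
        w * shannon_entropy(d) for w, d in zip(weights, dists))

def jsd_from_kl(dists, weights):
    """JSD written as the weighted KL divergence of each P_i from the mixture."""
    mix = mixture(dists, weights)
    return sum(w * kl_divergence(d, mix)
               for w, d in zip(weights, dists) if w > 0)

# Hypothetical item: probability of a correct response for four attribute patterns.
p_correct = [0.2, 0.2, 0.2, 0.8]
# Hypothetical posterior probabilities of those four patterns for one examinee.
posterior = [0.4, 0.3, 0.2, 0.1]

# Each pattern induces a distribution over the two outcomes (incorrect, correct).
response_dists = [[1 - p, p] for p in p_correct]

jsd_a = jsd_from_entropy(response_dists, posterior)
jsd_b = jsd_from_kl(response_dists, posterior)
print(jsd_a, jsd_b)  # the two forms agree up to floating-point error
```

Here the weights play the role of the current posterior over attribute patterns, so the JSD measures how strongly the item's response distribution separates the patterns that are still plausible.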

Feasibility of computerized adaptive testing evaluated by Monte-Carlo and post-hoc simulations

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems ◽

10.15439/2020f197 ◽

2020 ◽

Author(s):

Lubomír Štěpánek ◽

Patricia Martinková

Keyword(s):

Monte Carlo ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Post Hoc

Funding information of the article entitled “Post-hoc simulation study of computerized adaptive testing for the Korean Medical Licensing Examination”

Journal of Educational Evaluation for Health Professions ◽

10.3352/jeehp.2018.15.27 ◽

2018 ◽

Vol 15 ◽

pp. 27

Author(s):

Dong Gi Seo ◽

Jeongwook Choi

Keyword(s):

Simulation Study ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Medical Licensing ◽

Licensing Examination ◽

Funding Information ◽

Post Hoc

Post-hoc simulation study of computerized adaptive testing for the Korean Medical Licensing Examination

Journal of Educational Evaluation for Health Professions ◽

10.3352/jeehp.2018.15.14 ◽

2018 ◽

Vol 15 ◽

pp. 14 ◽

Cited By ~ 3

Author(s):

Dong Gi Seo ◽

Jeongwook Choi

Keyword(s):

Simulation Study ◽

Computerized Adaptive Testing ◽

Likelihood Estimation ◽

Adaptive Testing ◽

Item Selection ◽

Selection Methods ◽

Medical Licensing ◽

Licensing Examination ◽

Post Hoc ◽

Better Than

Purpose: Computerized adaptive testing (CAT) has been adopted in licensing examinations because it improves the efficiency and accuracy of the tests, as shown in many studies. This simulation study investigated CAT scoring and item selection methods for the Korean Medical Licensing Examination (KMLE). Methods: This study used a post-hoc (real data) simulation design. The item bank used in this study included all items from the January 2017 KMLE. All CAT algorithms for this study were implemented using the ‘catR’ package in the R program. Results: In terms of accuracy, the Rasch and 2-parameter logistic (2PL) models performed better than the 3PL model. The ‘modal a posteriori’ and ‘expected a posteriori’ methods provided more accurate estimates than maximum likelihood estimation or weighted likelihood estimation. Furthermore, maximum posterior weighted information and minimum expected posterior variance performed better than other item selection methods. In terms of efficiency, the Rasch model is recommended to reduce test length. Conclusion: Before implementing live CAT, a simulation study should be performed under varied test conditions. Based on the results of that simulation study, specific scoring and item selection methods should be predetermined.
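A post-hoc (real data) simulation replays responses that examinees already gave on a fixed-form test as if the items had been administered adaptively. The sketch below illustrates that idea in Python under simplifying assumptions: a Rasch model, expected a posteriori scoring on a grid with a standard normal prior, maximum-information selection, and a fixed test length. It is not the catR implementation used in the study, and the item difficulties, responses, and function names are hypothetical.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, difficulties, grid):
    """Expected a posteriori ability estimate with a standard normal prior."""
    posterior = []
    for theta in grid:
        weight = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for x, b in zip(responses, difficulties):
            p = rasch_prob(theta, b)
            weight *= p if x == 1 else 1.0 - p
        posterior.append(weight)
    total = sum(posterior)
    return sum(t * w for t, w in zip(grid, posterior)) / total

def post_hoc_cat(recorded, bank, test_length):
    """Replay one examinee's recorded responses as a fixed-length CAT.

    recorded[i] is the response (0 or 1) the examinee actually gave to item i;
    bank[i] is that item's Rasch difficulty.
    """
    grid = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4 to 4
    administered, theta = [], 0.0
    while len(administered) < min(test_length, len(bank)):
        remaining = [i for i in range(len(bank)) if i not in administered]
        # Under the Rasch model, information is highest for the item whose
        # difficulty is closest to the current ability estimate.
        nxt = min(remaining, key=lambda i: abs(bank[i] - theta))
        administered.append(nxt)
        theta = eap_estimate([recorded[i] for i in administered],
                             [bank[i] for i in administered], grid)
    return administered, theta

# Hypothetical five-item bank and one examinee's recorded responses.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
recorded = [1, 1, 1, 0, 0]
items_used, theta_hat = post_hoc_cat(recorded, bank, test_length=3)
print(items_used, round(theta_hat, 2))
```

Comparing `theta_hat` from the shortened adaptive replay with the estimate from the full response record is the usual way such a study quantifies accuracy lost in exchange for a shorter test.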

Benefits from Computerized Adaptive Testing as Seen in Simulation Studies

European Journal of Psychological Assessment ◽

10.1027//1015-5759.15.2.91 ◽

1999 ◽

Vol 15 (2) ◽

pp. 91-98 ◽

Cited By ~ 10

Author(s):

Lutz F. Hornke

Keyword(s):

Measurement Error ◽

Computerized Adaptive Testing ◽

Test Procedure ◽

Adaptive Testing ◽

Parameter Estimates ◽

Simulation Studies ◽

Computerized Adaptive Test ◽

Item Banks ◽

Item Parameters ◽

General Reliability

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so the remaining items were assembled into well-fitting item banks. In several simulation studies, 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end CAT testing. We also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees used more than 20 but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that conformed to the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday uses will show whether these trends will hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
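The link between the two stopping values in the abstract can be made concrete. The sketch below assumes the usual conversion SEM = SD·√(1 − reliability) with a trait standard deviation of 1 (the abstract does not state its scaling) and a Rasch (1PL) model for the information function. The function names are hypothetical.

```python
import math

def sem_from_reliability(reliability, trait_sd=1.0):
    """Standard error of measurement implied by a target reliability.

    Assumes SEM = SD * sqrt(1 - reliability); trait_sd = 1.0 is an assumption.
    """
    return trait_sd * math.sqrt(1.0 - reliability)

def rasch_information(theta, b):
    """Fisher information of one Rasch item at ability theta: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def standard_error(theta, difficulties):
    """Standard error of the ability estimate: 1 / sqrt(test information)."""
    info = sum(rasch_information(theta, b) for b in difficulties)
    return float("inf") if info == 0 else 1.0 / math.sqrt(info)

def should_stop(theta, difficulties, sem_target):
    """Stopping rule: end the test once the standard error reaches the target."""
    return standard_error(theta, difficulties) <= sem_target

sem_target = sem_from_reliability(0.80)  # about 0.447
print(round(sem_target, 3))
print(should_stop(0.0, [0.0] * 19, sem_target))  # 19 perfectly targeted items
print(should_stop(0.0, [0.0] * 21, sem_target))  # 21 perfectly targeted items
```

Under these assumptions a reliability of .80 corresponds to a standard error of roughly .447, close to the .44 quoted in the abstract. A perfectly targeted Rasch item contributes at most 0.25 units of information, so about 20 such items are needed to reach that error level, which is in line with the 20 to 30 items reported for the 1PL banks.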

Methods for Restricting Maximum Exposure Rate in Computerized Adaptative Testing

Methodology ◽

10.1027/1614-2241.3.1.14 ◽

2007 ◽

Vol 3 (1) ◽

pp. 14-23 ◽

Cited By ~ 9

Author(s):

Juan Ramon Barrada ◽

Julio Olea ◽

Vicente Ponsoda

Keyword(s):

Measurement Accuracy ◽

Computerized Adaptive Testing ◽

Computation Time ◽

Adaptive Testing ◽

Exposure Rate ◽

Control Parameters ◽

The Impact ◽

Two Alternatives ◽

Selection Of ◽

Maximum Exposure

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in computerized adaptive testing. Through a series of simulations, control parameters are set that determine the probability that an item is administered once it has been selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both of these problems. The impact of these methods on measurement accuracy has not yet been tested. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods yield an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with van der Linden's two methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.
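The role of the control parameters can be sketched in a few lines. This is a minimal illustration of the commonly described Sympson-Hetter scheme, not the new method proposed in the paper. The update rule shown (r_max divided by the observed selection rate, capped at 1) is the textbook form; returning `None` when no candidate passes is an assumption of this sketch, and all names are hypothetical.

```python
import random

def update_control_parameters(selection_rates, r_max):
    """One Sympson-Hetter adjustment step after a round of simulated tests.

    selection_rates[i] is the proportion of simulated examinees for whom
    item i was selected. Items selected more often than r_max get a control
    parameter below 1, so their expected exposure is pulled down to r_max.
    """
    return [min(1.0, r_max / rate) if rate > 0 else 1.0
            for rate in selection_rates]

def administer_with_exposure_control(ranked_items, control, rng):
    """Walk the candidates in order of preference and apply the lottery.

    An item that has been selected is administered with probability
    control[item]; otherwise the next-best candidate is tried.
    Returns None if no candidate passes (the caller must handle this case).
    """
    for item in ranked_items:
        if rng.random() < control[item]:
            return item
    return None

# Hypothetical selection rates from one simulation round, target r_max = 0.2.
control = update_control_parameters([0.5, 0.1, 0.0], r_max=0.2)
print(control)  # the over-selected first item receives a parameter below 1

rng = random.Random(0)  # seeded only to make the example repeatable
chosen = administer_with_exposure_control([0, 1, 2], control, rng)
print(chosen)
```

In practice the update step is repeated over many simulation rounds until the control parameters stabilise, which is the long computation the abstract identifies as a drawback.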

Computerized adaptive testing: From inquiry to operation.

10.1037/10244-000 ◽

1997 ◽

Cited By ~ 48

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing

A Comparison of Item Selection Methods for Controlling Exposure Rate in Cognitive Diagnostic Computerized Adaptive Testing

Acta Psychologica Sinica ◽

10.3724/sp.j.1041.2013.00694 ◽

2013 ◽

Vol 45 (6) ◽

pp. 694-703

Author(s):

Xiuzhen MAO ◽

Tao XIN

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Item Selection ◽

Exposure Rate ◽

Selection Methods

Dynamic and Comprehensive Item Selection Strategies for Computerized Adaptive Testing Based on Graded Response Model

Acta Psychologica Sinica ◽

10.3724/sp.j.1041.2012.00400 ◽

2012 ◽

Vol 44 (3) ◽

pp. 400-412 ◽

Cited By ~ 1

Author(s):

Fen LUO ◽

Shu-Liang DING ◽

Xiao-Qing WANG

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Item Selection ◽

Response Model ◽

Graded Response Model ◽

Selection Strategies ◽

Graded Response

Application of Online Calibration Technique in Computerized Adaptive Testing

Advances in Psychological Science ◽

10.3724/sp.j.1042.2013.01883 ◽

2013 ◽

Vol 21 (10) ◽

pp. 1883-1892

Author(s):

Ping CHEN ◽

Jiahui ZHANG ◽

Tao XIN

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Calibration Technique ◽

Online Calibration
