Evaluating Different Equating Setups in the Continuous Item Pool Calibration for Computerized Adaptive Testing

Evaluating a Computerized Adaptive Testing Version of a Cognitive Ability Test Using a Simulation Study

Journal of Psychoeducational Assessment ◽

10.1177/07342829211027753 ◽

2021 ◽

pp. 073428292110277

Author(s):

Ioannis Tsaousis ◽

Georgios D. Sideridis ◽

Hannan M. AlGhamdi

Keyword(s):

Cognitive Ability ◽

Simulation Study ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Estimation Methods ◽

Item Pool ◽

Sequential Approach ◽

Ability Test ◽

Promising Alternative ◽

Item Exposure

This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using a simulation study protocol put forth by Han, K. T. (2018a). For the needs of the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the efficiency of the GCAT, all items in the final item pool were linked (equated), following a sequential approach. Data were generated from a standard normal distribution (M = 0, SD = 1) for 10,000 virtual individuals. Using the measure's 165-item bank, the ability value (θ) for each participant was estimated. Maximum Fisher information (MFI) and maximum likelihood estimation with fences (MLEF) were used as the item selection and score estimation methods, respectively. For item exposure control, the fade away method (FAM) was preferred. The termination criterion was a standard error (SE) ≤ 0.33. The study revealed that the average number of items administered across the 10,000 participants was 15. Moreover, the precision in estimating participants' ability scores was very high, as demonstrated by the CBIAS, CMAE, and CRMSE. It is concluded that the CAT version of the test is a promising alternative to administering the corresponding full-length measure, since it reduces the number of administered items, prevents high rates of item exposure, and provides accurate scores with minimal measurement error.
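The kind of simulation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' protocol: it assumes a two-parameter logistic (2PL) model with randomly generated item parameters, replaces MLEF with a simpler grid-based expected a posteriori (EAP) estimate, omits exposure control (no fade away method), and reports overall bias, MAE, and RMSE, which may differ from how the study's CBIAS, CMAE, and CRMSE are computed. All function names and the `max_items` cap are hypothetical. Only the MFI selection rule, the SE ≤ 0.33 stopping rule, the standard normal abilities, and the pool size of 165 come from the abstract.

```python
import math
import random

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Ability grid from -4 to 4 used for the EAP estimate.
GRID = [-4.0 + 0.1 * i for i in range(81)]

def eap(responses):
    """EAP estimate and posterior SD under a standard normal prior.

    This stands in for MLEF, which the sketch does not implement.
    """
    post = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)
        for a, b, u in responses:
            p = p_correct(t, a, b)
            w *= p if u else (1.0 - p)
        post.append(w)
    total = sum(post)
    mean = sum(t * w for t, w in zip(GRID, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, post)) / total
    return mean, math.sqrt(var)

def run_cat(true_theta, pool, rng, se_target=0.33, max_items=60):
    """Administer items by maximum Fisher information until SE <= se_target."""
    available = list(range(len(pool)))
    responses = []
    theta_hat, se = 0.0, float("inf")
    while available and len(responses) < max_items and se > se_target:
        # MFI: pick the unused item most informative at the current estimate.
        j = max(available, key=lambda k: fisher_info(theta_hat, *pool[k]))
        available.remove(j)
        a, b = pool[j]
        u = rng.random() < p_correct(true_theta, a, b)  # simulated response
        responses.append((a, b, u))
        theta_hat, se = eap(responses)
    return theta_hat, se, len(responses)

def simulate(n_people=200, n_items=165, seed=1):
    """Simulate CATs for n_people simulees and summarize recovery of theta."""
    rng = random.Random(seed)
    # Hypothetical item parameters; the study's pool is not reproduced here.
    pool = [(rng.uniform(0.8, 2.0), rng.gauss(0.0, 1.0)) for _ in range(n_items)]
    errors, lengths = [], []
    for _ in range(n_people):
        theta = rng.gauss(0.0, 1.0)
        est, _, n = run_cat(theta, pool, rng)
        errors.append(est - theta)
        lengths.append(n)
    return {
        "bias": sum(errors) / n_people,
        "mae": sum(abs(e) for e in errors) / n_people,
        "rmse": math.sqrt(sum(e * e for e in errors) / n_people),
        "mean_length": sum(lengths) / n_people,
    }

if __name__ == "__main__":
    print(simulate())
```

Because the item parameters, estimator, and exposure handling all differ from the study, the numbers this sketch prints should not be compared with the reported average test length of 15 items.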


Computerized Adaptive Testing for Sleep Disorders: Development of An Item Bank and Validation in A Simulated Study

10.21203/rs.3.rs-18576/v1 ◽

2020 ◽

Author(s):

Menghua She ◽

Yaling Li ◽

Dongbo Tu ◽

Yan Cai

Keyword(s):

Sleep Disorders ◽

Computerized Adaptive Testing ◽

Assessment Tool ◽

Adaptive Testing ◽

Item Bank ◽

Accurate Assessment ◽

Item Pool ◽

Psychometric Characteristics ◽

Predictive Utility ◽

Item Fit

Background: As more and more people suffer from sleep disorders, developing an efficient, inexpensive, and accurate assessment tool for screening sleep disorders is becoming more urgent. This study developed a computerized adaptive test for sleep disorders (CAT-SD). Methods: A large sample of 1,304 participants was recruited to construct the item pool of the CAT-SD and to investigate its psychometric characteristics. First, analyses of unidimensionality, model fit, item fit, item discrimination, and differential item functioning (DIF) were conducted to construct a final item pool that meets the requirements of item response theory (IRT) measurement. Then, a simulated CAT study using participants' real response data was performed to investigate the psychometric characteristics of the CAT-SD, including reliability, validity, and predictive utility (sensitivity and specificity). Results: The final unidimensional item bank of the CAT-SD had good item fit, high discrimination, and no DIF; it also had acceptable reliability, validity, and predictive utility. Conclusions: The CAT-SD could be used as an effective and accurate assessment tool for measuring the severity of individuals' sleep disorders, and it offers a brand-new perspective for screening sleep disorders with psychological scales.
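Predictive utility is summarized here as sensitivity and specificity. As a minimal sketch of how these two quantities are computed from CAT scores and a reference classification, the following Python function may help. The function name, the cutoff, and the example data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are flagged.

    scores        : estimated severity per person (hypothetical CAT output)
    has_condition : True/False reference classification per person
    cutoff        : screening threshold (an assumption, not from the study)
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, has_condition):
        flagged = score >= cutoff
        if truth and flagged:
            tp += 1          # correctly flagged case
        elif truth:
            fn += 1          # missed case
        elif flagged:
            fp += 1          # false alarm
        else:
            tn += 1          # correctly cleared non-case
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Illustrative data only.
scores = [0.2, 1.5, -0.3, 0.9]
truth = [False, True, False, True]
print(sensitivity_specificity(scores, truth, cutoff=1.0))  # (0.5, 1.0)
```

Raising the cutoff generally trades sensitivity for specificity, which is why screening studies usually report both at a chosen threshold.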


Computerized Adaptive Testing With the Partial Credit Model: Estimation Procedures, Population Distributions, and Item Pool Characteristics

Applied Psychological Measurement ◽

10.1177/0146621605280072 ◽

2005 ◽

Vol 29 (6) ◽

pp. 433-456 ◽

Cited By ~ 11

Author(s):

Joanna S. Gorin ◽

Barbara G. Dodd ◽

Steven J. Fitzpatrick ◽

Yann Yann Shieh

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Partial Credit Model ◽

Model Estimation ◽

Item Pool ◽

Partial Credit ◽

Population Distributions ◽

Estimation Procedures


Hybrid Threshold-Based Sequential Procedures for Detecting Compromised Items in a Computerized Adaptive Testing Licensure Exam

Educational and Psychological Measurement ◽

10.1177/00131644211023868 ◽

2021 ◽

pp. 001316442110238

Author(s):

Chansoon Lee ◽

Hong Qian

Keyword(s):

Computerized Adaptive Testing ◽

Type I Error ◽

Early Stage ◽

Adaptive Testing ◽

Test Theory ◽

Type I ◽

Item Pool ◽

Sequential Procedures ◽

Local Threshold ◽

Threshold Approach

Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in a variable-length computerized adaptive test (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential procedure while controlling the Type I error rate. The hybrid threshold approach uses a local threshold for each item in an early stage of the CAT administration, and then it uses the global threshold in the decision-making stage. Applying various simulation factors, a series of simulation studies examined which factors contribute significantly to the power rate and lag time of the procedure. In addition to the simulation study, a case study investigated whether the procedures are applicable to the real item pool administered in CAT and can identify potentially compromised items in the pool. This research found that the increment in the probability of a correct answer (p-increment) was the simulation factor most important to the sequential procedures' ability to detect compromised items. This study also found that the local threshold approach improved power rates and shortened lag times when the p-increment was small. The findings of this study could help practitioners implement the sequential procedures using the hybrid threshold approach in real-time CAT administration.


Direct and Inverse Problems of Item Pool Design for Computerized Adaptive Testing

Educational and Psychological Measurement ◽

10.1177/0013164409332224 ◽

2009 ◽

Vol 69 (4) ◽

pp. 533-547 ◽

Cited By ~ 3

Author(s):

Dmitry I. Belov ◽

Ronald D. Armstrong

Keyword(s):

Inverse Problems ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Item Pool ◽

Direct And Inverse Problems ◽

Item Pool Design


Dimensionality of the Math Knowledge Item Pool for the Accelerated CAT-ASVAB (Computerized Adaptive Testing-Armed Services Vocational Aptitude Battery) Project

10.21236/ada194109 ◽

1987 ◽

Author(s):

D. R. Divgi

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Item Pool ◽

Knowledge Item ◽

Armed Services


Computerized Adaptive Testing Using the Partial Credit Model: Effects Of Item Pool Characteristics and Different Stopping Rules

Educational and Psychological Measurement ◽

10.1177/0013164493053001005 ◽

1993 ◽

Vol 53 (1) ◽

pp. 61-77 ◽

Cited By ~ 21

Author(s):

Barbara G. Dodd ◽

William R. Koch ◽

Ralph J. De Ayala

Keyword(s):

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Stopping Rules ◽

Partial Credit Model ◽

Item Pool ◽

Partial Credit


Computerized Adaptive Testing for Schizotypal Personality Disorder: Detecting Individuals at Risk

Frontiers in Psychology ◽

10.3389/fpsyg.2020.574760 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yaling Li ◽

Menghua She ◽

Dongbo Tu ◽

Yan Cai

Keyword(s):

At Risk ◽

Personality Disorder ◽

Computerized Adaptive Testing ◽

Adaptive Testing ◽

Schizotypal Personality Disorder ◽

Validation Sample ◽

Item Pool ◽

Chinese Sample ◽

Schizotypal Personality ◽

Calibration Sample

As schizotypal personality disorder (SPD) increasingly prevails in the general population, a rapid and comprehensive measurement instrument is imperative to screen individuals at risk for SPD. To address this issue, we aimed to develop a computerized adaptive test for SPD (CAT-SPD) using a non-clinical Chinese sample (N = 999), consisting of a calibration sample (N1 = 497) and a validation sample (N2 = 502). The item pool was constructed from several widely used SPD scales and refined through item response theory (IRT) analyses of the calibration sample using a graded response model (GRM). The final item pool for the CAT-SPD comprised 90 items, each of which measured at least one symptom of the diagnostic criteria for SPD in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and showed local independence, good item fit, high slope, and no differential item functioning (DIF). In addition, a simulated CAT was conducted in an independent validation sample to assess the performance of the CAT-SPD. Results showed that the CAT-SPD not only had acceptable reliability, validity, and predictive utility but also provided a shorter yet efficient assessment of SPD, which can save significant time and reduce the test burden on individuals with little loss of information.


Item Selection Criteria With Practical Constraints in Cognitive Diagnostic Computerized Adaptive Testing

Educational and Psychological Measurement ◽

10.1177/0013164418790634 ◽

2018 ◽

Vol 79 (2) ◽

pp. 335-357 ◽

Cited By ~ 2

Author(s):

Chuan-Ju Lin ◽

Hua-Hua Chang

Keyword(s):

Selection Criteria ◽

Computerized Adaptive Testing ◽

Selection Index ◽

Selection Criterion ◽

Adaptive Testing ◽

Item Selection ◽

Item Pool ◽

Single Item ◽

Estimation Precision ◽

Practical Constraints

For item selection in cognitive diagnostic computerized adaptive testing (CD-CAT), ideally, a single item selection index should be created to simultaneously regulate precision, exposure status, and attribute balancing. For this purpose, in this study, we first proposed an attribute-balanced item selection criterion, namely, the standardized weighted deviation global discrimination index (SWDGDI), and subsequently formulated the constrained progressive index (CP_SWDGDI) by casting the SWDGDI in a progressive algorithm. A simulation study revealed that the SWDGDI method was effective in balancing attribute coverage and the CP_SWDGDI method was able to simultaneously balance attribute coverage and item pool usage while maintaining acceptable estimation precision. This research also demonstrates the advantage of a relatively low number of attributes in CD-CAT applications.


Benefits from Computerized Adaptive Testing as Seen in Simulation Studies

European Journal of Psychological Assessment ◽

10.1027//1015-5759.15.2.91 ◽

1999 ◽

Vol 15 (2) ◽

pp. 91-98 ◽

Cited By ~ 10

Author(s):

Lutz F. Hornke

Keyword(s):

Measurement Error ◽

Computerized Adaptive Testing ◽

Test Procedure ◽

Adaptive Testing ◽

Parameter Estimates ◽

Simulation Studies ◽

Computerized Adaptive Test ◽

Item Banks ◽

Item Parameters ◽

General Reliability

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so that the remaining ones were assembled in well-fitting item banks. In several simulation studies, 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end CAT testing. We also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees used more than 20 but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that complied with the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday uses will show whether these trends will hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
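The stopping rule pairs a reliability of .80 with a standard error of measurement of .44. One common way to relate the two, assuming the person parameters are scaled to unit variance (an assumption not stated in the summary), is the classical relation below, where \rho denotes reliability and \sigma_\theta the standard deviation of the person parameters:

```latex
\[
  \rho = 1 - \frac{\mathrm{SEM}^2}{\sigma_\theta^2}
  \quad\Longleftrightarrow\quad
  \mathrm{SEM} = \sigma_\theta \sqrt{1 - \rho}
\]
\[
  \sigma_\theta = 1,\ \rho = .80
  \;\Rightarrow\;
  \mathrm{SEM} = \sqrt{0.20} \approx 0.447,
  \qquad
  \mathrm{SEM} = .44
  \;\Rightarrow\;
  \rho = 1 - 0.44^2 \approx 0.806
\]
```

Under this assumption the two reported values agree approximately, which suggests they describe the same stopping criterion on two scales.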
