Investigation of Classification Accuracy, Test Length and Measurement Precision at Computerized Adaptive Classification Tests

Author(s):  
Seda DEMİR ◽  
Burcu ATAR
2011 ◽  
Vol 71 (6) ◽  
pp. 1006-1022 ◽  
Author(s):  
Timo Gnambs ◽  
Bernad Batinic

Computer-adaptive classification tests focus on classifying respondents into different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications into two groups. This article extends this line of research to polytomous classification tests for two- and three-group scenarios (e.g., inferior, mediocre, and superior proficiencies). Results of two simulation experiments with generated and real responses (N = 2,000) to established personality scales of different lengths (12, 20, or 29 items) demonstrate that adaptive item presentation significantly reduces the number of items required to make such classification decisions while maintaining consistent classification accuracy. Furthermore, the simulations highlight the importance of the selected test termination criterion, which has a significant impact on the average test length.
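To illustrate how a termination criterion drives test length, here is a minimal sketch of one criterion often used in adaptive classification testing, the sequential probability ratio test (SPRT). The abstract does not state which criteria the authors compared, and the sketch deliberately simplifies to dichotomous items under a Rasch model rather than the polytomous items studied in the article. The function names, the cut score, the indifference half-width `delta`, and the error rates are all assumptions chosen for illustration.

```python
import math

def rasch_p(theta, b):
    """Probability of a keyed (1) response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sprt_classify(responses, difficulties, cut=0.0, delta=0.5,
                  alpha=0.05, beta=0.05):
    """Classify a respondent as 'above' or 'below' the cut score.

    Hypothetical setup: H_lo is theta = cut - delta, H_hi is theta = cut + delta.
    Testing stops as soon as the log-likelihood ratio crosses a Wald bound.
    Returns (decision, number_of_items_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept H_hi at or above this
    lower = math.log(beta / (1 - alpha))   # accept H_lo at or below this
    theta_hi, theta_lo = cut + delta, cut - delta
    llr = 0.0
    for n, (x, b) in enumerate(zip(responses, difficulties), start=1):
        p_hi, p_lo = rasch_p(theta_hi, b), rasch_p(theta_lo, b)
        if x == 1:
            llr += math.log(p_hi / p_lo)
        else:
            llr += math.log((1 - p_hi) / (1 - p_lo))
        if llr >= upper:
            return "above", n
        if llr <= lower:
            return "below", n
    # Item pool exhausted: fall back to the sign of the accumulated evidence.
    return ("above" if llr >= 0 else "below"), len(responses)

# A 12-item scale with all item difficulties at the cut score (illustrative).
difficulties = [0.0] * 12
print(sprt_classify([1] * 12, difficulties))
print(sprt_classify([0] * 12, difficulties))
```

With these illustrative settings, a respondent who answers consistently is classified well before the 12 items are exhausted, whereas an inconsistent response pattern runs to the full scale length. Widening `delta` or loosening `alpha` and `beta` shortens the test further, which is one way a termination rule can change the average test length.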


Author(s):  
Phillip Kim ◽  
Soyeon Ahn ◽  
Hyesung Jeon ◽  
Jae Kwan Lee ◽  
Sunghyun Park ◽  
...  

2021 ◽  
Author(s):  
Gáspár Lukács

The Response Time Concealed Information Test (RT-CIT) can reveal that a person recognizes a relevant item (probe, e.g., a murder weapon) among other, irrelevant items (controls), based on slower responses to the probe compared to the controls. The present paper assesses the influence of test length (due to practice, habituation, or fatigue) on two key variables in the RT-CIT: (a) probe-control differences and (b) classification accuracy, through a meta-analysis (using 12 previous experiments), as well as with two new experiments. It is consistently demonstrated that increased test length decreases probe-control differences but increases classification accuracies. The main implication for real-life application is that using at least around 600 trials in total is optimal for the RT-CIT.


2020 ◽  
Vol 44 (7-8) ◽  
pp. 499-514
Author(s):  
Yi Zheng ◽  
Hyunjung Cheon ◽  
Charles M. Katz

This study explores advanced techniques in machine learning to develop a short tree-based adaptive classification test based on an existing lengthy instrument. A case study was carried out for an assessment of risk for juvenile delinquency. Two unique facts of this case are (a) the items in the original instrument measure a large number of distinctive constructs; (b) the target outcomes are of low prevalence, which renders the training data imbalanced. Due to the high dimensionality of the items, traditional item response theory (IRT)-based adaptive testing approaches may not work well, whereas decision trees, which were developed in the machine learning discipline, present a promising alternative solution for adaptive tests. A cross-validation study was carried out to compare eight tree-based adaptive test constructions with five benchmark methods using data from a sample of 3,975 subjects. The findings reveal that the best-performing tree-based adaptive tests yielded better classification accuracy than the benchmark method IRT scoring with optimal cutpoints, and yielded comparable or better classification accuracy than the best benchmark method, random forest with balanced sampling. The competitive classification accuracy of the tree-based adaptive tests also comes with an over 30-fold reduction in the length of the instrument, administering only between 3 and 6 items to any individual. This study suggests that tree-based adaptive tests have enormous potential when used to shorten instruments that measure a large variety of constructs.
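The idea of a tree-based adaptive test can be sketched in a few lines: each internal node of a decision tree names the next item to administer, the response selects a branch, and a leaf gives the classification. The tree below is entirely hypothetical (the item identifiers, responses, and labels are invented for illustration and are not taken from the instrument in the study), and the sketch shows only the administration step, not how the tree would be trained.

```python
# Hypothetical, hand-written tree; a real one would be learned from data.
TREE = {
    "item": "q7",
    "branches": {
        0: {"label": "low risk"},
        1: {
            "item": "q2",
            "branches": {
                0: {"label": "low risk"},
                1: {"label": "high risk"},
            },
        },
    },
}

def administer(tree, answer_fn):
    """Walk the tree, asking only the items on the respondent's path.

    answer_fn maps an item identifier to the respondent's response.
    Returns (classification label, list of items actually administered).
    """
    node = tree
    asked = []
    while "label" not in node:
        item = node["item"]
        asked.append(item)
        node = node["branches"][answer_fn(item)]
    return node["label"], asked

# One respondent's (hypothetical) answers to the full instrument.
responses = {"q2": 1, "q7": 1, "q9": 0}
print(administer(TREE, responses.get))
```

Because only the items along one root-to-leaf path are presented, the depth of the tree bounds the test length, regardless of how many items the original instrument contains.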


2012 ◽  
Vol 433-440 ◽  
pp. 6572-6578 ◽  
Author(s):  
Dech Thammasiri ◽  
Phayung Meesad

In this research we propose an ensemble classification technique based on building classifiers with a variety of techniques, such as decision trees, support vector machines, and neural networks, then selecting the appropriate classifiers with a genetic algorithm and combining them by majority vote in order to increase classification accuracy. In classification accuracy tests on the Australian Credit, German Credit, and Bankruptcy data sets, we found that the proposed ensemble classification models selected by the genetic algorithm yield the highest performance and that our algorithms are effective in building ensembles.
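The majority-vote combination step mentioned in the abstract can be sketched as follows. This is only the voting step under simple assumptions: the predictions are hard class labels, the three prediction lists are made-up values standing in for a decision tree, a support vector machine, and a neural network, and the genetic-algorithm selection of classifiers is not shown.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: list of equal-length label lists, one per classifier.
    With an odd number of classifiers and two classes there are no ties.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

def accuracy(true_labels, predicted):
    """Proportion of cases whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted))
    return hits / len(true_labels)

# Made-up predictions from three hypothetical base classifiers.
tree_pred = [1, 0, 1, 1]
svm_pred = [1, 1, 0, 1]
net_pred = [0, 0, 1, 1]
truth = [1, 0, 0, 1]

ensemble = majority_vote([tree_pred, svm_pred, net_pred])
print(ensemble, accuracy(truth, ensemble))
```

A selection procedure such as the genetic algorithm described above would search over subsets of base classifiers and score each subset by the accuracy of its majority vote.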


2021 ◽  
Vol 11 ◽  
Author(s):  
Sedat Sen ◽  
Allan S. Cohen

Results of a comprehensive simulation study are reported investigating the effects of sample size, test length, number of attributes, and base rate of mastery on item parameter recovery and classification accuracy of four DCMs (i.e., C-RUM, DINA, DINO, and LCDMREDUCED). Effects were evaluated using bias and RMSE computed between true (i.e., generating) parameters and estimated parameters. Effects of simulated factors on attribute assignment were also evaluated using the percentage of classification accuracy. More precise estimates of item parameters were obtained with larger sample sizes and longer test lengths. Recovery of item parameters decreased as the number of attributes increased from three to five, but base rate of mastery had a varying effect on item recovery. Item parameter recovery and classification accuracy were higher for the DINA and DINO models.
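The recovery criteria named in the abstract, bias and RMSE between generating and estimated parameters, can be computed as in the sketch below. The parameter values are invented for illustration, and the sign convention (estimate minus true value) is a common choice assumed here rather than one stated in the abstract.

```python
import math

def bias(true_values, estimates):
    """Mean signed error: average of (estimate - true value)."""
    diffs = [e - t for t, e in zip(true_values, estimates)]
    return sum(diffs) / len(diffs)

def rmse(true_values, estimates):
    """Root mean squared error between estimates and true values."""
    sq = [(e - t) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(sq) / len(sq))

# Invented generating parameters and their estimates from one replication.
true_params = [0.5, 1.0, 1.5, 2.0]
estimated = [0.6, 0.9, 1.7, 2.0]

print(round(bias(true_params, estimated), 4))
print(round(rmse(true_params, estimated), 4))
```

Bias shows whether estimates are systematically too high or too low, while RMSE also reflects their variability, so the two are usually reported together.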


2019 ◽  
Vol 14 ◽  
pp. 155892501984598
Author(s):  
Junfeng Jing ◽  
Ru Ren ◽  
Pengfei Li ◽  
Minqi Li

In this research, a statistical classification algorithm based on sparse coding is presented to classify the defects on E-glass fiber fabrics adaptively. First, all images are preprocessed by convolving them with the MR8 filter bank to obtain the filter responses. For the filter response space of each type of image, we learn a class-specific dictionary, and all the class-specific dictionaries are concatenated to form a complete dictionary. Then, the reconstruction contribution rate of each atom of the complete dictionary to the image filter response is counted to obtain two types of histogram features for each image. Finally, the improved sparse representation classification is used to classify test defect images based on the histogram features. The proposed adaptive classification method achieved an average classification accuracy of 96.67% on the dataset collected onsite. The results validate the superiority of the proposed method for E-glass fiber fabrics.


2011 ◽  
Vol 27 (3) ◽  
pp. 164-170 ◽  
Author(s):  
Anna Sundström

This study evaluated the psychometric properties of a self-report scale for assessing perceived driver competence, labeled the Self-Efficacy Scale for Driver Competence (SSDC), using item response theory analyses. Two samples of Swedish driving-license examinees (n = 795; n = 714) completed two versions of the SSDC that were parallel in content. Prior work, using classical test theory analyses, has provided support for the validity and reliability of scores from the SSDC. This study investigated the measurement precision, item hierarchy, and differential functioning for males and females of the items in the SSDC, as well as how the rating scale functions. The results confirmed the previous findings that the SSDC demonstrates sound psychometric properties. In addition, the findings showed that measurement precision could be increased by adding items that tap higher self-efficacy levels. Moreover, the rating scale can be improved by reducing the number of categories or by providing each category with a label.


2011 ◽  
Author(s):  
David S. Kreiner ◽  
Joseph J. Ryan ◽  
Samuel T. Gontkovsky
