Investigation of Classification Accuracy, Test Length and Measurement Precision at Computerized Adaptive Classification Tests

Polytomous Adaptive Classification Testing

Educational and Psychological Measurement ◽

10.1177/0013164410393956 ◽

2011 ◽

Vol 71 (6) ◽

pp. 1006-1022 ◽

Cited By ~ 5

Author(s):

Timo Gnambs ◽

Bernad Batinic

Keyword(s):

Classification Accuracy ◽

Test Length ◽

Simulation Experiments ◽

Adaptive Classification ◽

Response Formats ◽

Termination Criterion

Computer-adaptive classification tests focus on classifying respondents into different proficiency groups (e.g., for pass/fail decisions). To date, adaptive classification testing has been dominated by research on dichotomous response formats and classifications into two groups. This article extends this line of research to polytomous classification tests for two- and three-group scenarios (e.g., inferior, mediocre, and superior proficiencies). Results of two simulation experiments with generated and real responses (N = 2,000) to established personality scales of different lengths (12, 20, or 29 items) demonstrate that adaptive item presentation significantly reduces the number of items required to make such classification decisions while maintaining consistent classification accuracy. Furthermore, the simulations highlight the importance of the selected test termination criterion, which has a significant impact on the average test length.
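
To make the role of a termination criterion concrete, the following is a minimal sketch of one commonly described rule: stop as soon as a confidence interval around the current trait estimate no longer contains the cutoff. It is not taken from the article, which does not specify its criteria in this abstract. The function name, the `trajectory` values, the cutoff of 0.0, and the z value of 1.96 are all illustrative assumptions, and the trait estimates are supplied ready-made rather than computed from an item response model.

```python
def classify_with_ci_stopping(estimates, cutoff, z=1.96):
    """Two-group classification with a confidence-interval stopping rule.

    estimates: list of (theta_hat, standard_error) pairs, one per
    administered item (hypothetical values, not from the article).
    Returns the decision and the number of items used.
    """
    for items_used, (theta_hat, se) in enumerate(estimates, start=1):
        lower = theta_hat - z * se
        upper = theta_hat + z * se
        if lower > cutoff:   # whole interval above the cutoff: stop early
            return "above", items_used
        if upper < cutoff:   # whole interval below the cutoff: stop early
            return "below", items_used
    # Item pool exhausted: fall back to the final point estimate.
    theta_hat, _ = estimates[-1]
    return ("above" if theta_hat >= cutoff else "below"), len(estimates)


# Hypothetical estimate trajectory: the standard error shrinks as items are added.
trajectory = [(0.4, 0.9), (0.7, 0.6), (0.9, 0.4), (1.0, 0.3)]
decision, n_items = classify_with_ci_stopping(trajectory, cutoff=0.0)
print(decision, n_items)
```

A wider interval (larger z) postpones the decision and lengthens the test, which is one way a termination criterion can drive average test length.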


Classification Accuracy Test of Hearing Laboratory Test Models for Railway Noise at Station Platform

Transactions of the Korean Society for Noise and Vibration Engineering ◽

10.5050/ksnve.2015.25.4.299 ◽

2015 ◽

Vol 25 (4) ◽

pp. 299-305 ◽

Cited By ~ 1

Author(s):

Phillip Kim ◽

Soyeon Ahn ◽

Hyesung Jeon ◽

Jae Kwan Lee ◽

Sunghyun Park ◽

...

Keyword(s):

Laboratory Test ◽

Classification Accuracy ◽

Railway Noise ◽

Accuracy Test ◽

Test Models


Prolonged Response Time Concealed Information Test Decreases Probe-Control Differences but Increases Classification Accuracy

10.31234/osf.io/g9y6w ◽

2021 ◽

Author(s):

Gáspár Lukács

Keyword(s):

Response Time ◽

Classification Accuracy ◽

Meta Analysis ◽

Real Life ◽

Test Length ◽

Concealed Information Test ◽

Key Variables ◽

Concealed Information ◽

Murder Weapon ◽

Information Test

The Response Time Concealed Information Test (RT-CIT) can reveal that a person recognizes a relevant item (probe, e.g., a murder weapon) among other, irrelevant items (controls), based on slower responses to the probe compared to the controls. The present paper assesses the influence of test length (due to practice, habituation, or fatigue) on two key variables in the RT-CIT: (a) probe-control differences and (b) classification accuracy, through a meta-analysis (using 12 previous experiments), as well as with two new experiments. It is consistently demonstrated that increased test length decreases probe-control differences but increases classification accuracy. The main implication for real-life application is that using at least around 600 trials in total is optimal for the RT-CIT.


Using Machine Learning Methods to Develop a Short Tree-Based Adaptive Classification Test: Case Study With a High-Dimensional Item Pool and Imbalanced Data

Applied Psychological Measurement ◽

10.1177/0146621620931198 ◽

2020 ◽

Vol 44 (7-8) ◽

pp. 499-514

Author(s):

Yi Zheng ◽

Hyunjung Cheon ◽

Charles M. Katz

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Imbalanced Data ◽

Training Data ◽

Adaptive Tests ◽

Promising Alternative ◽

Adaptive Classification ◽

Short Tree ◽

Classification Test

This study explores advanced techniques in machine learning to develop a short tree-based adaptive classification test based on an existing lengthy instrument. A case study was carried out for an assessment of risk for juvenile delinquency. Two unique facts of this case are that (a) the items in the original instrument measure a large number of distinctive constructs and (b) the target outcomes are of low prevalence, which yields imbalanced training data. Due to the high dimensionality of the items, traditional item response theory (IRT)-based adaptive testing approaches may not work well, whereas decision trees, which were developed in the machine learning discipline, present a promising alternative solution for adaptive tests. A cross-validation study was carried out to compare eight tree-based adaptive test constructions with five benchmark methods using data from a sample of 3,975 subjects. The findings reveal that the best-performing tree-based adaptive tests yielded better classification accuracy than the benchmark method of IRT scoring with optimal cutpoints, and yielded comparable or better classification accuracy than the best benchmark method, random forest with balanced sampling. The competitive classification accuracy of the tree-based adaptive tests also comes with an over 30-fold reduction in the length of the instrument, administering only between 3 and 6 items to any individual. This study suggests that tree-based adaptive tests have enormous potential when used to shorten instruments that measure a large variety of constructs.
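
The idea of a decision tree acting as an adaptive test can be sketched as follows: each internal node administers one item, the response selects the branch, and a leaf gives the classification, so a respondent only sees the items on one root-to-leaf path. This is an illustrative sketch rather than the authors' construction. The item names (`q7`, `q2`), the binary responses, the risk labels, and the hand-written tree are all hypothetical; in practice the tree would be learned from training data.

```python
def administer(tree, answers):
    """Walk a decision tree as an adaptive test.

    tree: nested dicts; internal nodes have "item" and "branches",
          leaves have "label" (structure assumed for this sketch).
    answers: maps item name -> response, standing in for a live respondent.
    Returns the classification and the list of items actually administered.
    """
    node = tree
    asked = []
    while "label" not in node:
        item = node["item"]
        asked.append(item)                       # only items on the path are shown
        node = node["branches"][answers[item]]   # the response picks the next node
    return node["label"], asked


# Hypothetical two-level tree over binary items.
tree = {
    "item": "q7",
    "branches": {
        0: {"label": "low risk"},
        1: {
            "item": "q2",
            "branches": {0: {"label": "low risk"}, 1: {"label": "high risk"}},
        },
    },
}

answers = {"q1": 0, "q2": 1, "q7": 1, "q9": 0}
label, asked = administer(tree, answers)
print(label, asked)
```

Because test length equals the depth of the path taken, limiting tree depth directly caps the number of items any individual receives.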


Ensemble Data Classification based on Diversity of Classifiers Optimized by Genetic Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.6572 ◽

2012 ◽

Vol 433-440 ◽

pp. 6572-6578 ◽

Cited By ~ 3

Author(s):

Dech Thammasiri ◽

Phayung Meesad

Keyword(s):

Genetic Algorithm ◽

Classification Accuracy ◽

Majority Vote ◽

Ensemble Classification ◽

Support Vector ◽

Classification Models ◽

Accuracy Test ◽

Ensemble Data ◽

Classification Technique ◽

Vector Machines

In this research, we propose an ensemble classification technique based on creating classifiers from a variety of techniques, such as decision trees, support vector machines, and neural networks, then selecting the appropriate classifiers with a genetic algorithm and combining them by majority vote in order to increase classification accuracy. In classification accuracy tests on the Australian Credit, German Credit, and Bankruptcy data sets, we found that the proposed ensemble classification models selected by the genetic algorithm yield the highest performance and that our algorithms are effective in building ensembles.
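
The majority-vote combination step can be illustrated with a minimal sketch. It covers only the voting; the genetic-algorithm selection of classifiers is not shown. The function name, the three prediction lists, and the "good"/"bad" labels are hypothetical stand-ins for the outputs of already trained classifiers.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers.

    predictions: list of equal-length label lists, one list per classifier.
    Returns one label per sample: the label predicted most often.
    Ties are resolved in favor of the label seen first among the votes.
    """
    combined = []
    for sample_votes in zip(*predictions):
        winner, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(winner)
    return combined


# Hypothetical outputs of three classifiers on four credit applicants.
tree_preds = ["good", "bad", "good", "bad"]
svm_preds = ["good", "good", "good", "bad"]
net_preds = ["bad", "bad", "good", "good"]

ensemble = majority_vote([tree_preds, svm_preds, net_preds])
print(ensemble)
```

Using an odd number of classifiers avoids ties in two-class problems, so the tie-breaking rule only matters for even-sized ensembles or more than two classes.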


Sample Size Requirements for Applying Diagnostic Classification Models

Frontiers in Psychology ◽

10.3389/fpsyg.2020.621251 ◽

2021 ◽

Vol 11 ◽

Author(s):

Sedat Sen ◽

Allan S. Cohen

Keyword(s):

Sample Size ◽

Classification Accuracy ◽

Base Rate ◽

Item Parameter ◽

Test Length ◽

Diagnostic Classification Models ◽

Parameter Recovery ◽

Estimated Parameters ◽

Item Parameters ◽

Larger Sample

Results of a comprehensive simulation study are reported investigating the effects of sample size, test length, number of attributes, and base rate of mastery on item parameter recovery and classification accuracy of four DCMs (i.e., C-RUM, DINA, DINO, and LCDMREDUCED). Effects were evaluated using bias and RMSE computed between true (i.e., generating) parameters and estimated parameters. Effects of simulated factors on attribute assignment were also evaluated using the percentage of classification accuracy. More precise estimates of item parameters were obtained with larger sample sizes and longer test lengths. Recovery of item parameters decreased as the number of attributes increased from three to five, but base rate of mastery had a varying effect on item recovery. Item parameter recovery and classification accuracy were higher for the DINA and DINO models.
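
The recovery measures named above follow their standard definitions and can be sketched as below: bias is the mean signed difference between estimated and generating parameters, RMSE is the root of the mean squared difference, and classification accuracy is the percentage of matching attribute assignments. The numbers and variable names are made up for illustration and are not results from the study.

```python
import math


def bias(true_values, estimates):
    """Mean signed error: average of (estimate - true value)."""
    return sum(e - t for t, e in zip(true_values, estimates)) / len(true_values)


def rmse(true_values, estimates):
    """Root mean squared error between estimates and true values."""
    squared = [(e - t) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))


def percent_correct(true_classes, assigned_classes):
    """Percentage of respondents whose assigned class matches the true class."""
    hits = sum(t == a for t, a in zip(true_classes, assigned_classes))
    return 100.0 * hits / len(true_classes)


# Hypothetical generating vs. estimated item parameters.
true_params = [0.2, 0.1, 0.3]
est_params = [0.3, 0.1, 0.2]

# Hypothetical true vs. assigned mastery status (1 = master, 0 = non-master).
true_mastery = [1, 0, 1, 1]
assigned_mastery = [1, 0, 0, 1]

print(bias(true_params, est_params))
print(rmse(true_params, est_params))
print(percent_correct(true_mastery, assigned_mastery))
```

In this toy example the over- and underestimates cancel, so bias is near zero while RMSE is not, which is why simulation studies typically report both.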


Statistical classification for E-glass fiber fabric defects based on sparse coding

Journal of Engineered Fibers and Fabrics ◽

10.1177/1558925019845985 ◽

2019 ◽

Vol 14 ◽

pp. 155892501984598

Author(s):

Junfeng Jing ◽

Ru Ren ◽

Pengfei Li ◽

Minqi Li

Keyword(s):

Glass Fiber ◽

Sparse Coding ◽

Classification Accuracy ◽

Statistical Classification ◽

Contribution Rate ◽

Average Classification Accuracy ◽

Adaptive Classification ◽

Sparse Representation Classification ◽

Fiber Fabric ◽

Filter Response

In this research, a statistical classification algorithm based on sparse coding is presented to classify the defects on E-glass fiber fabrics adaptively. First, all images are preprocessed by being convolved with the MR8 filter bank to obtain the filter responses. For the filter response space of each type of image, we learn a class-specific dictionary, and all the class-specific dictionaries are concatenated to form a complete dictionary. Then, the reconstructed contribution rate of each atom of the complete dictionary to the image filter response is counted to obtain two types of histogram features for each image. Finally, the improved sparse representation classification is used to classify test defect images based on the histogram features. The proposed adaptive classification method achieved an average classification accuracy of 96.67% on the dataset collected onsite. The results validate the superiority of the proposed method for E-glass fiber fabrics.


Using the Rating Scale Model to Examine the Psychometric Properties of the Self-Efficacy Scale for Driver Competence

European Journal of Psychological Assessment ◽

10.1027/1015-5759/a000063 ◽

2011 ◽

Vol 27 (3) ◽

pp. 164-170 ◽

Cited By ~ 1

Author(s):

Anna Sundström

Keyword(s):

Psychometric Properties ◽

Self Efficacy ◽

Rating Scale ◽

Measurement Precision ◽

The Self ◽

Test Theory ◽

Self Report ◽

Scale Model ◽

Validity And Reliability ◽

Two Samples

This study evaluated the psychometric properties of a self-report scale for assessing perceived driver competence, labeled the Self-Efficacy Scale for Driver Competence (SSDC), using item response theory analyses. Two samples of Swedish driving-license examinees (n = 795; n = 714) completed two versions of the SSDC that were parallel in content. Prior work, using classical test theory analyses, has provided support for the validity and reliability of scores from the SSDC. This study investigated the measurement precision, item hierarchy, and differential functioning for males and females of the items in the SSDC, as well as how the rating scale functions. The results confirmed the previous findings that the SSDC demonstrates sound psychometric properties. In addition, the findings showed that measurement precision could be increased by adding items that tap higher self-efficacy levels. Moreover, the rating scale can be improved by reducing the number of categories or by providing each category with a label.
