Time Saving Students’ Formative Assessment: Algorithm to Balance Number of Tasks and Result Reliability

2021 ◽  
Vol 11 (13) ◽  
pp. 6048
Author(s):  
Jaroslav Melesko ◽  
Simona Ramanauskaite

Feedback is a crucial component of effective, personalized learning and is usually provided through formative assessment. Introducing formative assessment into a classroom can be challenging because of the complexity of test creation and the need to set aside time for assessment. The newly proposed formative assessment algorithm uses multivariate Elo rating and multi-armed bandit approaches to solve these challenges. In a case study involving 106 students of a Cloud Computing course, the algorithm achieved twice the learning-path recommendation precision of classical test theory based assessment methods. It approaches the precision of the item response theory benchmark with a greatly reduced quiz length and no need for item difficulty calibration.
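The abstract does not spell out the rating update, but Elo-style formative assessment typically adjusts a student ability rating and an item difficulty rating after each response. A minimal single-rating sketch (hypothetical; the paper's multivariate variant keeps one rating per concept and adds a multi-armed bandit item selector):

```python
def elo_update(theta, beta, correct, k=32.0):
    """One Elo-style update after a student answers an item.

    theta: student ability rating; beta: item difficulty rating.
    Hypothetical sketch -- parameter names and k are assumptions,
    not taken from the paper.
    """
    # Expected probability of a correct answer given the rating gap
    expected = 1.0 / (1.0 + 10 ** ((beta - theta) / 400.0))
    outcome = 1.0 if correct else 0.0
    # A correct answer raises the student's rating and lowers the item's
    return theta + k * (outcome - expected), beta - k * (outcome - expected)
```

At equal ratings the expected probability is 0.5, so a correct answer moves the student up by k/2; this self-calibrating property is what lets such algorithms skip a separate item difficulty calibration step.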

2019 ◽  
Vol 9 (2) ◽  
pp. 133-146
Author(s):  
Yance Manoppo ◽  
Djemari Mardapi

This study aimed to reveal: (1) the characteristics of the items of the Chemistry Test in the National Examination, using classical test theory and item response theory; (2) the amount of cheating detected by Angoff's B-index Method, Pair 1 Method, Pair 2 Method, the Modified Error Similarity Analysis (MESA) Method, and the G2 Method; and (3) which methods detect more cheating in the Chemistry Test of the National Examination for high schools in the 2011/2012 academic year in Maluku Province. The analysis with the classical test theory approach shows that 77.5% of items have well-functioning difficulty, 55% of items have satisfactory discrimination, and 70% of items have distractors that work well, with a test reliability index of 0.772. The analysis using the item response theory approach shows that 14 items (35%) fit the model, the maximum of the information function is 11.4069 at θ = −1.6, and the measurement error is 2.296. The numbers of pairs suspected of cheating are: 13 pairs according to Angoff's B-index Method, 212 pairs according to the Pair 1 Method, 444 pairs according to the Pair 2 Method, 7 pairs according to the MESA Method, and 102 pairs according to the G2 Method. In descending order of cheating detected, the methods rank: Pair 2, Pair 1, G2, Angoff's B-index, and MESA.


2019 ◽  
Vol 23 (4) ◽  
pp. 275-283
Author(s):  
Ling Wang ◽  
John W. Nelson

The aim of the study was to evaluate the psychometric properties of the Chinese version of the Caring Factor Survey-Caring of Manager (CFS-CM) using classical test theory (CTT) and item response theory (IRT). The CTT analyses evaluated internal consistency reliability, test–retest reliability, and construct validity. The IRT analyses tested unidimensionality, item fit, item difficulty, reliability, and the rating scale. CTT showed good psychometric properties of the CFS-CM. However, IRT revealed some problems at the category level. Taking this issue into consideration, it could be beneficial to refine the CFS-CM in the future.


2021 ◽  
Author(s):  
Amanda Graham ◽  
Francis T. Cullen

Current reform efforts have called for police to use procedural justice to build trust between the police and the community. However, research in this area has employed highly heterogeneous measures of procedural justice, a practice that potentially inhibits the systematic accumulation of knowledge. To address this issue, this study develops two measures of procedural justice (a 16-item and a 10-item scale) by following standard psychometric principles, including a systematic review of previously used items, four iterative surveys to narrow the pool of items, classical test theory, and item response theory. Beyond developing scales to be used in policing research, the analysis serves as a case study in “criminometrics,” arguing that the measurement of core constructs in criminology should generally be based on psychometric principles.


Author(s):  
In Sook Park ◽  
Yeon Ok Suh ◽  
Hae Sook Park ◽  
So Young Kang ◽  
Kwang Sung Kim ◽  
...  

Purpose: The purpose of this study was to improve the quality of items on the Korean Nursing Licensing Examination by developing and evaluating case-based items that reflect integrated nursing knowledge. Methods: We conducted a cross-sectional observational study to develop new case-based items. The methods for developing test items included expert workshops, brainstorming, and verification of content validity. After a mock examination of undergraduate nursing students using the newly developed case-based items, we evaluated the appropriateness of the items through classical test theory and item response theory. Results: A total of 50 case-based items were developed for the mock examination, and their content validity was evaluated. The items integrated 34 discrete elements of nursing knowledge. The mock examination was taken by 741 baccalaureate students in their fourth year of study at 13 universities. Their average score on the mock examination was 57.4, and the examination showed a reliability of 0.40. According to classical test theory, the average item difficulty (proportion correct) was 57.4% (80%–100% for 12 items; 60%–80% for 13 items; less than 60% for 25 items). The mean discrimination index was 0.19; it was above 0.30 for 11 items and between 0.20 and 0.29 for 15 items. According to item response theory, the item discrimination parameter (in the logistic model) was zero for 10 items (0.00), very low for 20 items (0.01 to 0.34), low for 12 items (0.35 to 0.64), moderate for 6 items (0.65 to 1.34), high for 1 item (1.35 to 1.69), and very high for 1 item (above 1.70). The item difficulty was very easy for 24 items (below −2.0), easy for 8 items (−2.0 to −0.5), medium for 6 items (−0.5 to 0.5), hard for 3 items (0.5 to 2.0), and very hard for 9 items (2.0 or above). The goodness-of-fit test under the two-parameter item response model, within the range of 0.5 to 2.0, revealed that 12 items had an ideal correct answer rate. Conclusion: We surmise that the low reliability of the mock examination was influenced by the timing of the test for the examinees and the inappropriate difficulty of the items. Our study suggests a methodology for the development of future case-based items for the Korean Nursing Licensing Examination.
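The discrimination (a) and difficulty (b) bands reported above are parameters of the standard two-parameter logistic (2PL) IRT model. A minimal sketch of its response probability (the textbook 2PL form, not the study's own code):

```python
import math

def p_correct_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model:
    P(theta) = 1 / (1 + exp(-a * (theta - b))),
    where theta is examinee ability, a is item discrimination,
    and b is item difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

For a "very easy" item (b below −2.0) and an average examinee (theta = 0) with discrimination a = 1.0, the correct-response probability exceeds 0.9, which illustrates why a test dominated by very easy or near-zero-discrimination items yields low reliability.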


2019 ◽  
Vol 13 (1) ◽  
pp. 1-16
Author(s):  
Muh Syahrul Sarea ◽  
Rosnia Ruslan

This research aims to describe the characteristics of the UAS items for theme 1 at the fourth grade of primary school in Paramasan Bawah village in terms of item difficulty and discrimination. The sample of this research was 37 students who took the final examination in the 2018/2019 academic year. The objects of this research were the question items and answer sheets of the final exam, obtained from 3 different schools in Paramasan Bawah village. The data analysis technique used in this research was empirical analysis with the Bilog and Iteman programs. This analysis was used to determine the characteristics of items based on item response theory and classical test theory. The results showed that, according to item response theory, 30 items had good discrimination and 33 items had good difficulty, while according to classical test theory, 15 items had good discrimination and 27 items had good difficulty.

Keywords: characteristics of items, item difficulty, discrimination


2015 ◽  
Vol 23 (88) ◽  
pp. 593-610
Author(s):  
Patrícia Costa ◽  
Maria Eugénia Ferrão

This study aims to provide statistical evidence of the complementarity between classical test theory and item response models for certain educational assessment purposes. Such complementarity might support, at a reduced cost, future development of innovative procedures for item calibration in adaptive testing. Classical test theory and the generalized partial credit model are applied to tests comprising multiple choice, short answer, completion, and partially scored open response items. Datasets are derived from the tests administered to the Portuguese population of students enrolled in the 4th and 6th grades. The results show a very strong association between the estimates of difficulty obtained from classical test theory and item response models, corroborating the statistical theory of mental testing.
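For partially scored items, the generalized partial credit model assigns each score category a probability built from cumulative step parameters. A minimal sketch of the standard GPCM category probabilities (illustrative only, not the study's implementation; the function name and argument names are assumptions):

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities P(X = k), k = 0..m, under the generalized
    partial credit model with discrimination a and step parameters b_v.
    Numerator for category k is exp(sum_{v<=k} a*(theta - b_v)); the
    category-0 numerator is exp(0) = 1."""
    cum = [0.0]  # log-numerator for category 0
    for b in thresholds:
        cum.append(cum[-1] + a * (theta - b))
    exps = [math.exp(c) for c in cum]
    total = sum(exps)
    return [e / total for e in exps]
```

With two steps at b = −1 and b = 1 and theta = 0, the middle category is the most probable, matching the intuition that a mid-ability examinee most often earns partial credit.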


2016 ◽  
Vol 12 (28) ◽  
pp. 263
Author(s):  
Awopeju, O. A. ◽  
Afolabi, E. R. I.

The study compared Classical Test Theory (CTT) and Item Response Theory (IRT) estimates of item difficulty and item discrimination in relation to the ability of examinees in the Senior School Certificate Examination (SSCE) in Mathematics, with a view to providing an empirical basis for informed decisions on the appropriateness of statistical and psychometric tests. The study adopted an ex-post-facto design. A sample of 6,000 students was selected from the population of 35,262 students who sat for the NECO SSCE Mathematics Paper 1 in 2008 in Osun State, Nigeria. An instrument consisting of 60 multiple-choice items, the May/June 2008 NECO SSCE Mathematics Paper 1, was used. Three sampling plans (random, gender, and ability) were employed to study the behaviour of the examinees' scores under the CTT and IRT measurement frameworks. BILOG-MG 3 was used to estimate the item parameters, and SPSS 20 was used to compare the CTT- and IRT-based item parameters. The results showed that CTT-based item difficulty estimates and one-parameter IRT item difficulty estimates were comparable (the correlations were generally in the −0.702 to −0.988 range in the large sample and the −0.622 to −0.989 range in the small sample). Results also indicated that CTT-based and two-parameter IRT-based item discrimination estimates were comparable (the correlations were in the 0.430 to 0.880 range in the large sample and the 0.531 to 0.950 range in the small sample). The study concluded that CTT and IRT were comparable in estimating the item characteristics of statistical and psychometric tests and thus could be used as complementary procedures in the development of national examinations.
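The strong negative correlation between CTT difficulty (proportion correct, which falls as items get harder) and IRT difficulty estimates can be illustrated with a small simulation (a hypothetical pure-Python sketch using Rasch-generated data and the true item difficulties, not the study's BILOG-MG estimates):

```python
import math
import random

random.seed(0)
n_students, n_items = 500, 20
thetas = [random.gauss(0, 1) for _ in range(n_students)]  # abilities
bs = [random.gauss(0, 1) for _ in range(n_items)]         # item difficulties

def p(theta, b):
    # Rasch (one-parameter logistic) probability of a correct response
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# CTT difficulty: proportion of simulated students answering each item correctly
prop_correct = []
for b in bs:
    correct = sum(random.random() < p(t, b) for t in thetas)
    prop_correct.append(correct / n_students)

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sx = math.sqrt(sum((u - mx) ** 2 for u in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return cov / (sx * sy)

# Strongly negative: higher Rasch difficulty, lower proportion correct
r = pearson(prop_correct, bs)
```

The simulated correlation is strongly negative, mirroring the −0.702 to −0.988 range the study reports for CTT versus one-parameter IRT difficulty.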

