Assessment Design for International Large-Scale Assessments

2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yang Jiang ◽  
Tao Gong ◽  
Luis E. Saldivia ◽  
Gabrielle Cayton-Hodges ◽  
Christopher Agard

AbstractIn 2017, the mathematics assessments that are part of the National Assessment of Educational Progress (NAEP) program underwent a transformation shifting the administration from paper-and-pencil formats to digitally-based assessments (DBA). This shift introduced new interactive item types that bring rich process data and tremendous opportunities to study the cognitive and behavioral processes that underlie test-takers’ performances in ways that are not otherwise possible with the response data alone. In this exploratory study, we investigated the problem-solving processes and strategies applied by the nation’s fourth and eighth graders by analyzing the process data collected during their interactions with two technology-enhanced drag-and-drop items (one item for each grade) included in the first digital operational administration of the NAEP’s mathematics assessments. Results from this research revealed how test-takers who achieved different levels of accuracy on the items engaged in various cognitive and metacognitive processes (e.g., in terms of their time allocation, answer change behaviors, and problem-solving strategies), providing insights into the common mathematical misconceptions that fourth- and eighth-grade students held and the steps where they may have struggled during their solution process. Implications of the findings for educational assessment design and limitations of this research are also discussed.


Author(s):  
Clemens M. Lechner ◽  
Nivedita Bhaktha ◽  
Katharina Groskurth ◽  
Matthias Bluemke

AbstractMeasures of cognitive or socio-emotional skills from large-scale assessments surveys (LSAS) are often based on advanced statistical models and scoring techniques unfamiliar to applied researchers. Consequently, applied researchers working with data from LSAS may be uncertain about the assumptions and computational details of these statistical models and scoring techniques and about how to best incorporate the resulting skill measures in secondary analyses. The present paper is intended as a primer for applied researchers. After a brief introduction to the key properties of skill assessments, we give an overview over the three principal methods with which secondary analysts can incorporate skill measures from LSAS in their analyses: (1) as test scores (i.e., point estimates of individual ability), (2) through structural equation modeling (SEM), and (3) in the form of plausible values (PVs). We discuss the advantages and disadvantages of each method based on three criteria: fallibility (i.e., control for measurement error and unbiasedness), usability (i.e., ease of use in secondary analyses), and immutability (i.e., consistency of test scores, PVs, or measurement model parameters across different analyses and analysts). We show that although none of the methods are optimal under all criteria, methods that result in a single point estimate of each respondent’s ability (i.e., all types of “test scores”) are rarely optimal for research purposes. Instead, approaches that avoid or correct for measurement error—especially PV methodology—stand out as the method of choice. We conclude with practical recommendations for secondary analysts and data-producing organizations.


Sign in / Sign up

Export Citation Format

Share Document