Criterion-Referenced Job Proficiency Testing: A Large Scale Application

Author(s):  
Milton H. Maier ◽  
Stephen F. Hirshfeld
Author(s):  
Robert Williams ◽  
Dan Woods

This chapter begins with a consideration of the state of school-based assessments as an unavoidable consequence of the contemporary societal emphasis on accountability and curricular prescriptions at the state and national level in the United States of America. Additionally, the authors comment upon the potential inaccuracies inescapable in large-scale, high-stakes, standardized assessment instruments, especially when such instruments are turned to the task of evaluation (whether norm- or criterion-referenced) in a teaching and learning engagement. The chapter concludes with suggestions and templates (elaborately configured with specific activities and assessment rubrics included) to support teachers who want to develop their own rigorous, valid, and reliable assessment instruments embedded seamlessly in student-centered learning activities. These instruments accommodate the reality of literacy as a culturally situated behavior that, for contemporary learners, includes all manner of meaning-making in all manner of modalities, from pencil and paper to the purely electronic (and potentially wordless, at times) video- or audio-based.


2019 ◽  
Vol 38 (4) ◽  
pp. 426-444 ◽  
Author(s):  
Andreas Schulz ◽  
Timo Leuders ◽  
Ulrike Rangel

We provide evidence of validity for a newly developed diagnostic competence model of operation sense, by both (a) describing the theoretically substantiated development of the competence model in close association with its use within a large-scale formative assessment and (b) providing empirical evidence for the theoretically described cognitive levels of competences. The competence model describes students’ operation sense on four distinct levels. On each level, the model elaborates on the characteristics of tasks that students on this level are able to answer correctly. Moreover, the model explains this by referring to two kinds of cognitive processes that are supposed to be necessary to respond to these kinds of tasks successfully. In a validation study, about 85% of the variance in the item difficulties was explained by the four, a priori allocated, levels of operation sense. We discuss the relevance of the validation of the diagnostic competence model for the provision of criterion-referenced feedback in a large-scale formative assessment, including suggestions for teachers’ subsequent support activities, and the contributions of the model to the state of research about operation sense.
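As a rough illustration of what "variance in item difficulties explained by level" can mean, the sketch below computes the between-level share of total variance (eta squared) for items grouped by their assigned level. This is only one common way to obtain such a figure; the abstract does not state which statistic the authors used, and the function name and the item difficulties are hypothetical placeholders, not data from the study.

```python
def eta_squared(difficulties_by_level):
    """Share of total variance in item difficulty accounted for by level.

    difficulties_by_level maps a level label to the list of item
    difficulties allocated to that level (hypothetical structure).
    """
    all_items = [d for items in difficulties_by_level.values() for d in items]
    grand_mean = sum(all_items) / len(all_items)

    # Total sum of squares: spread of every item around the grand mean.
    ss_total = sum((d - grand_mean) ** 2 for d in all_items)
    if ss_total == 0:
        raise ValueError("item difficulties have no variance")

    # Between-level sum of squares: spread of level means around the
    # grand mean, weighted by the number of items on each level.
    ss_between = 0.0
    for items in difficulties_by_level.values():
        level_mean = sum(items) / len(items)
        ss_between += len(items) * (level_mean - grand_mean) ** 2

    return ss_between / ss_total


# Hypothetical item difficulties for four levels (illustration only).
example = {
    1: [-2.0, -1.6, -1.2],
    2: [-0.7, -0.2, 0.1],
    3: [0.4, 0.9, 1.1],
    4: [1.5, 2.1, 2.4],
}
print(f"variance explained by level: {eta_squared(example):.2f}")
```

A value near 1 means items on the same level have similar difficulties and the levels are well separated; a value near 0 means level membership says little about difficulty.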


2007 ◽  
Vol 4 (3) ◽  
pp. 343-354 ◽  
Author(s):  
Roberta F. Hammett

This article argues that if multimodal and new literacies are to become common practices in schools, they have to be included in both school and provincial/state large-scale assessment programmes. Building on current criterion-referenced testing in Newfoundland and Labrador which assesses a range of literacies (viewing, reading, writing, representing, speaking and listening), the article suggests criteria which might be considered in developing holistic and analytic rubrics for assessing new literacies in ways that are productive for learners. The article describes an interactive website that may be used to familiarize teachers and education students with rubrics for assessing children's written and graphic responses to linguistic, graphic and spoken texts.


2017 ◽  
Vol 41 (6) ◽  
pp. 472-491 ◽  
Author(s):  
Brian F. Patterson ◽  
Stefanie A. Wind ◽  
George Engelhard

This study presents a new criterion-referenced approach for exploring rating quality within the framework of latent-class signal detection theory (LC-SDT) that goes beyond commonly used reliability indices, and provides substantively meaningful indicators of rater accuracy that can be used to inform rater training and monitoring at the individual rater level. Specifically, this study illustrates a flexible application of restricted LC-SDT modeling, in which restrictions can be specified for the true latent classification to reflect the unique characteristics of a particular assessment context. While the LC-SDT modeling framework provides immediately useful characterizations of raters’ behavior, the restricted LC-SDT offers complementary evidence to further support the monitoring of rater behavior by bringing criterion ratings to bear. This study uses ratings from a large-scale writing assessment, and findings suggest that the criterion (i.e., restricted) LC-SDT provides useful information about rating quality for operational raters relative to criterion ratings, which may ultimately inform rater training and monitoring procedures.


HOW ◽  
2020 ◽  
Vol 27 (2) ◽  
pp. 135-155
Author(s):  
Frank Giraldo

Large-scale language testing uses statistical information to account for the quality of an assessment system. In this reflection article, I explain how basic statistics can be used meaningfully in the context of classroom language assessment. The paper explores a series of statistical calculations that can be used to examine test scores and assessment decisions in the language classroom. Interpretations for criterion-referenced assessment underlie the paper. Finally, I discuss limitations and include recommendations for teachers to use statistics.


2019 ◽  
Vol 144 (7) ◽  
pp. 846-852
Author(s):  
Paul N. Staats ◽  
Rhona J. Souers ◽  
Amberly Lindau Nunez ◽  
Zaibo Li ◽  
Daniel F. I. Kurtycz ◽  
...  

Context.— Repair is a challenging diagnosis and a significant source of false-positive (FP) interpretations in cervical cytology. No large-scale study of performance of repair in the liquid-based era has been performed. Objective.— To evaluate the performance of repair in the College of American Pathologists Pap Education and Proficiency Testing (PT) programs. Design.— The FP rate for slides classified as repair was evaluated by preparation type, participant type (cytotechnologist, pathologist, or laboratory), and program. The specific misdiagnosis category and individual slide performance were also evaluated. The rate of misclassification of slides as repair by participants for other diagnostic categories in the Pap Education program was assessed. Results.— The overall FP rate was 1700 of 12 715 (13.4%). There was no significant difference by program or preparation type. Within the Education program there was no difference by participant type, but pathologists' FP rate in the PT program (47 of 514, 9.1%) was significantly better than cytotechnologists in the PT program (51 of 380, 13.4%) and pathologists in the Education program (690 of 4900, 14.1%). High-grade squamous intraepithelial lesions/cancers (HSIL+) accounted for 1380 of 1602 FP interpretations (86%) in Education, but 43 of 98 (43.9%) in PT. Most slides had a low rate of misclassification, but a small number were poor performers. False-negative diagnosis of HSIL+ as repair was less common, ranging from 0.7% to 1.8%. Conclusions.— Despite initial indications that liquid-based cytology might reduce the rate of misclassification of repair, FP interpretations remain common and are no different by preparation type. Misclassification is most commonly as HSIL or carcinoma, potentially resulting in significant patient harm.
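The false-positive rates quoted above follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the counts are taken from the abstract, while the dictionary keys and the helper name are illustrative labels introduced here.

```python
# (false-positive interpretations, total responses) as quoted in the abstract.
counts = {
    "overall": (1700, 12715),
    "pathologists_pt": (47, 514),
    "cytotechnologists_pt": (51, 380),
    "pathologists_education": (690, 4900),
}


def fp_rate_percent(false_positives, total):
    """False-positive rate as a percentage, rounded to one decimal place."""
    return round(100 * false_positives / total, 1)


rates = {name: fp_rate_percent(fp, n) for name, (fp, n) in counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")

# The Education (1602) and PT (98) false positives sum to the overall count.
assert 1602 + 98 == counts["overall"][0]
```

The recomputed values match the percentages reported in the abstract (13.4%, 9.1%, 13.4%, and 14.1%).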


1999 ◽  
Vol 173 ◽  
pp. 243-248
Author(s):  
D. Kubáček ◽  
A. Galád ◽  
A. Pravda

The unusual short-period comet 29P/Schwassmann-Wachmann 1 has inspired many observers to explain its unpredictable outbursts. In this paper, large-scale structures and features from the inner part of the coma in time periods around outbursts are studied. CCD images were taken at Whipple Observatory, Mt. Hopkins, in 1989 and at Astronomical Observatory, Modra, from 1995 to 1998. Photographic plates of the comet were taken at Harvard College Observatory, Oak Ridge, from 1974 to 1982. The latter were first digitized so that the same image-processing techniques could be applied to optimize the visibility of features in the coma during outbursts. Outbursts and coma structures show various shapes.


1994 ◽  
Vol 144 ◽  
pp. 29-33
Author(s):  
P. Ambrož

The large-scale coronal structures observed during the sporadically visible solar eclipses were compared with the numerically extrapolated field-line structures of coronal magnetic field. A characteristic relationship between the observed structures of coronal plasma and the magnetic field line configurations was determined. The long-term evolution of large-scale coronal structures inferred from photospheric magnetic observations in the course of 11- and 22-year solar cycles is described. Some known parameters, such as the source surface radius, or coronal rotation rate are discussed and actually interpreted. A relation between the large-scale photospheric magnetic field evolution and the coronal structure rearrangement is demonstrated.

