alternate form reliability
Recently Published Documents

TOTAL DOCUMENTS: 38 (FIVE YEARS: 0)
H-INDEX: 9 (FIVE YEARS: 0)

2020 ◽  
pp. 153450842097845
Author(s):  
Sarah J. Conoyer ◽  
William J. Therrien ◽  
Kristen K. White

Meta-analysis was used to examine curriculum-based measurement in the content areas of social studies and science. Nineteen studies published between 1998 and 2020 were reviewed to determine the overall mean correlation for criterion validity and to examine alternate-form reliability and slope coefficients. An overall mean correlation of .59 was found for criterion validity; however, there was significant heterogeneity across studies, suggesting that curriculum-based measure (CBM) format or content area may affect findings. Alternate-form reliability coefficients ranged from low to high (.21 to .89) across CBM formats. Studies investigating slopes mostly used vocabulary-matching formats and reported growth ranging from .12 to .65 correct items per week, with a mean of .34. Our findings suggest that additional research on the validity, reliability, and slope of these measures is warranted.
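An overall mean correlation like the .59 reported above is commonly obtained by Fisher z-transforming each study's r, taking a weighted mean, and back-transforming. A minimal fixed-effect sketch (the study values below are hypothetical, and the authors' actual meta-analytic model is not specified here):

```python
import math

def pool_correlations(rs, ns):
    """Pool correlations across studies: Fisher z-transform each r,
    take a weighted mean using the conventional n - 3 weights
    (a simple fixed-effect approach), then back-transform to r."""
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return math.tanh(z_bar)

# Three hypothetical studies' criterion-validity correlations.
print(round(pool_correlations([0.55, 0.62, 0.59], [40, 80, 60]), 3))
```

A random-effects model, which the reported heterogeneity would argue for, adds a between-study variance component to the weights; the fixed-effect version above shows only the core transform-average-back-transform step.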


2020 ◽  
Vol 25 (3) ◽  
pp. 318-333 ◽  
Author(s):  
Elizabeth A Lam ◽  
Susan Rose ◽  
Kristen L McMaster

Abstract This study compared the reliability and validity of student scores from paper–pencil and e-based assessments using the “maze” and “silent reading fluency” (SRF) tasks. Forty students who were deaf or hard of hearing and reading between second- and fifth-grade levels participated, along with their teachers (n = 21). For maze, alternate form reliability coefficients obtained from correct scores and correct scores adjusted for guessing ranged from r = .61 to .84 (ps < .01); criterion-related validity coefficients ranged from r = .33 to .67 (most ps < .01). For SRF, reliability coefficients obtained from correct scores ranged from r = .50 to .75 (ps < .01); validity ranged from r = .25 to .72. Differences between student performance on paper–pencil and e-based conditions were generally non-significant for maze; significant differences between conditions for SRF favored the paper–pencil condition. Findings suggest that maze holds promise, with inconclusive results for SRF.
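The "correct scores adjusted for guessing" above most likely refer to the classical correction that subtracts the expected number of lucky guesses from the raw correct count. A sketch assuming three-choice maze items (the exact adjustment used in this study is not given here):

```python
def adjust_for_guessing(correct, incorrect, n_choices=3):
    """Classical correction for guessing: subtract the expected number
    of lucky guesses, incorrect / (n_choices - 1), from the raw correct
    count. Maze items conventionally offer three answer choices."""
    return correct - incorrect / (n_choices - 1)

# A student with 20 correct and 6 incorrect maze selections.
print(adjust_for_guessing(20, 6))  # 20 - 6/2 = 17.0
```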


2019 ◽  
pp. 153450841988394
Author(s):  
Amanda M. VanDerHeyden ◽  
Carmen Broussard

This study details the construction of parameters for generating subskill mastery math measures to be used for screening, intervention planning, progress monitoring, and proximal program evaluation. Parameters for generating assessment measures were built and tested to verify initial equivalence of generated measures, using potential digits correct as a proxy for task difficulty across generated measures. Generated measures met initial equivalence criteria and were subjected to further reliability analysis. Measures were generated and administered 1 week apart in fall and winter to students in Grades K, 1, 3, 5, and 7. Thirty-four screening measures were examined for delayed alternate form reliability, risk decision agreement, and interobserver agreement. Delayed alternate form reliability values generally exceeded r = .80, and the measures could be reliably scored and yielded consistent risk decisions. Future research directions are discussed.


2018 ◽  
Vol 45 (4) ◽  
pp. 311-320
Author(s):  
Sarah J. Conoyer ◽  
Lisa Goran ◽  
Abigail A. Allen ◽  
Katie E. Hoffman

The purpose of this preliminary study was to explore the reliability of Curriculum-Based Measurement (CBM) vocabulary-matching forms with students in an introduction to special education course in a college setting. Data from 84 students enrolled in a teacher preparation program across three semesters were examined. Results suggest low to moderate alternate form reliability with adjacent forms (r = .49) compared to the mean of two weekly forms (r = .65). Future directions on form development to strengthen reliability are discussed, as well as implications for CBM use in college classrooms as a formative assessment tool.
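Alternate-form reliability coefficients like the r = .49 and r = .65 above are conventionally Pearson correlations between the same students' scores on two forms. A self-contained sketch with hypothetical scores:

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between paired score lists, e.g. the same
    students' scores on two alternate vocabulary-matching forms."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mx) ** 2 for a in x) *
                     sum((b - my) ** 2 for b in y))
    return cov / norm

# Hypothetical scores for five students on two adjacent weekly forms.
form_a = [12, 15, 9, 20, 14]
form_b = [10, 16, 8, 18, 15]
print(round(pearson_r(form_a, form_b), 2))
```

Averaging two weekly forms before correlating, as the study did, reduces form-specific error in each score, which is consistent with the higher coefficient reported for the two-form mean.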


2018 ◽  
Vol 37 (7) ◽  
pp. 887-898
Author(s):  
Sarah J. Conoyer ◽  
Jeremy W. Ford ◽  
R. Alex Smith ◽  
Erica N. Mason ◽  
Erica S. Lembke ◽  
...  

This replication study examined the alternate form reliability, criterion validity, and predictive utility of two curriculum-based measurement (CBM) tools in science, Vocabulary-Matching (VM) and Statement Verification for Science (SV-S), for the purpose of screening. In all, 205 seventh-grade students from four middle schools were given alternate forms of each science CBM tool. Scores from the Idaho Standards Achievement Test (ISAT) science assessment were obtained. Stronger evidence of reliability and validity with the ISAT was found for VM compared with SV-S. With regard to predictive utility, VM more accurately classified students’ at-risk status compared with SV-S for identifying proficiency on the ISAT. Practical implications and directions for future research are also discussed.


2018 ◽  
Author(s):  
◽  
Robert Alexander Smith

The purpose of this study was to explore the technical adequacy of two forms of Curriculum-Based Measures-Writing (CBM-W), Word Dictation (WD) and Picture Word (PW), with English Language Learners (ELs) in the 1st through 3rd grades, and the appropriateness of using benchmarks established with the general population with these students. The study also explored the utility of combining a measure of motivated academic behavior, the Social, Academic, and Emotional Behavior Risk Screener-Academic Behavior subscale (SAEBRS-AB), with CBM-W for identifying risk in writing for young ELs. ELs in the 1st through 3rd grades (n = 71) were administered two forms of WD and PW in the fall, winter, and spring of the same academic year. Teachers (n = 9) also completed the SAEBRS-AB at each time-point for each participating student. Correlations between forms at each time-point were used to establish alternate form reliability, and validity was established with two criterion measures via correlations and regression. The utility of combining CBM-W with SAEBRS-AB was examined via logistic regression and Receiver Operating Characteristic (ROC) curve analysis, using researcher-determined cutscores for risk on the two criterion measures. Results indicated that both forms of CBM-W are reliable and valid measures of general writing performance for young ELs, that benchmarks drawn from the general population are generally applicable to young ELs, and that integrating the SAEBRS-AB with either form of CBM-W improves diagnostic accuracy.
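The ROC curve analysis above summarizes diagnostic accuracy as the area under the curve (AUC): the probability that a randomly chosen at-risk student receives a higher predicted-risk value than a randomly chosen not-at-risk student. A minimal sketch using the pairwise (Mann-Whitney) formulation, with hypothetical logistic-regression outputs:

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison (Mann-Whitney U): the fraction of
    positive/negative pairs where the positive case outscores the
    negative case; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risk probabilities, with label 1 = at risk
# on the criterion measure.
print(roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0]))
```

An AUC of .5 is chance-level classification and 1.0 is perfect separation, so "improves diagnostic accuracy" corresponds to a higher AUC for the combined CBM-W + SAEBRS-AB model than for CBM-W alone.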


2017 ◽  
Vol 43 (2) ◽  
pp. 121-127 ◽  
Author(s):  
Jeremy W. Ford ◽  
Sarah J. Conoyer ◽  
Erica S. Lembke ◽  
R. Alex Smith ◽  
John L. Hosp

In the present study, two types of curriculum-based measurement (CBM) tools in science, Vocabulary Matching (VM) and Statement Verification for Science (SV-S), a modified Sentence Verification Technique, were compared. Specifically, this study aimed to determine whether the format of information presented (i.e., SV-S vs. VM) produces differences in alternate form reliability and validity of scores or any differences in accuracy of prediction of scores on the state standardized science assessment. Overall, 25 eighth-grade science students were administered two SV-S and two VM forms with identical items along with spring eighth-grade maze passages from Aimsweb. Students had recently taken the eighth-grade state science test. Results regarding technical adequacy for each CBM tool were consistent with past findings. However, this study extends the literature base on CBM tools in science by providing evidence for using standards to develop VM forms. In addition, despite probable ceiling effects, additional evidence was found for the potential of SV-S as a CBM tool in science.


Author(s):  
Yan Jin ◽  
Eric Wu

This article aims to demonstrate how innovative testing practices can effectively prevent high-tech mass cheating and improve fairness in language assessment. The article first introduces Xi's (2010) view of validity and fairness and her proposal of an argument-based approach to empirically examining test fairness. It then describes the threat to fair testing posed by high-tech cheating on the College English Test (CET). The article reports a study of multiple-form equating aimed at achieving alternate form reliability when multiple versions and multiple forms of the CET are used, and concludes with a discussion of the usefulness of an argument-based approach to empirically examining test fairness.
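Equating multiple test forms can be done several ways; one simple option is mean-sigma linear equating, which places a score from one form on another form's scale by matching standardized scores. A hedged sketch (an illustration only; the CET study's actual equating design is not described here, and the moments below are hypothetical):

```python
def linear_equate(score, base_mean, base_sd, form_mean, form_sd):
    """Mean-sigma linear equating: map a raw score from a new form onto
    the base form's scale by matching standardized (z) scores."""
    z = (score - form_mean) / form_sd
    return base_mean + base_sd * z

# A 77 on a new form (mean 65, SD 12) maps onto the scale of a base
# form with mean 70 and SD 10.
print(linear_equate(77, 70, 10, 65, 12))  # 80.0
```

Operationally this is what lets scores from different CET forms be reported on a common scale, so that which form an examinee happened to receive does not advantage or disadvantage them.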


2017 ◽  
Vol 36 (8) ◽  
pp. 798-807 ◽  
Author(s):  
Crystal N. Taylor ◽  
Lisa Aguilar ◽  
Matthew K. Burns ◽  
June L. Preast ◽  
Kristy Warmbold-Brann

Teaching children too many words during a lesson reduces retention. The amount of new information a student can successfully rehearse and recall later is called acquisition rate (AR), which has been reliably measured with students in first, third, and fifth grades. The purpose of this study was to examine the reliability of assessing AR for sight words with kindergarten students. A total of 32 kindergarten students from five classrooms across two elementary schools participated in the study. AR was measured twice over a 2-week period, and 1-day retention was measured for the first AR. The AR data resulted in a 2-week delayed alternate form reliability of r = .83, and there was also a strong correlation between AR and number of words retained 1 day later. The limitations, implications, and considerations for the name of the construct being assessed are discussed.


2017 ◽  
Vol 71 (3) ◽  
pp. 7103190030p1
Author(s):  
Juan Pablo Saa ◽  
Meghan Doherty ◽  
Alexis Young ◽  
Meredith Spiers ◽  
Emily Leary ◽  
...  
