The Reliability of Framework for Teaching Scores in Kindergarten

2020, Vol 38 (7), pp. 831-845
Author(s): Helen Patrick, Brian F. French, Panayota Mantzicopoulos

We evaluated the score stability of the Framework for Teaching (FFT), a prominent observation instrument used for teacher evaluation. Three raters each scored 200 reading and mathematics lessons taught by 20 kindergarten teachers. Using generalizability theory analyses, we decomposed the FFT’s Classroom Environment, Instruction, and Total scores into potential sources of variation (teachers, lessons, raters, and their interactions). The proportions of score variance attributable to differences among teachers were 71% and 76% for Classroom Environment, 49% and 37% for Instruction, and 69% and 66% for the Total score, for reading and mathematics, respectively. Reliability estimates (G) ranged from 0.92 to 0.96 for Classroom Environment and Total scores; they were 0.87 and 0.79 for reading and mathematics Instruction, respectively. Decision studies indicated that two raters, each scoring three reading lessons or four mathematics lessons, are necessary to achieve sufficiently reliable Total scores. For Instruction scores, three raters each scoring seven reading lessons are needed; more than four raters each scoring eight lessons are needed for mathematics.
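As a rough illustration of what a decision study computes, the sketch below evaluates the standard relative G coefficient for a fully crossed teacher x lesson x rater design. This is an assumption made for illustration: the abstract does not state whether lessons were crossed with or nested within teachers, and the variance components used here are hypothetical values, not the study's estimates. The function name `g_coefficient` and its parameters are likewise invented for this example.

```python
def g_coefficient(var_teacher, var_teacher_lesson, var_teacher_rater,
                  var_residual, n_lessons, n_raters):
    """Relative G coefficient for a crossed teacher x lesson x rater design.

    Error variance shrinks as scores are averaged over more lessons
    and more raters; teacher variance is the signal of interest.
    """
    relative_error = (
        var_teacher_lesson / n_lessons
        + var_teacher_rater / n_raters
        + var_residual / (n_lessons * n_raters)
    )
    return var_teacher / (var_teacher + relative_error)


if __name__ == "__main__":
    # Hypothetical variance components (NOT taken from the study).
    components = dict(
        var_teacher=0.5,
        var_teacher_lesson=0.2,
        var_teacher_rater=0.1,
        var_residual=0.2,
    )
    # Decision study: project reliability for alternative designs.
    for n_raters in (1, 2, 3):
        for n_lessons in (1, 4, 8):
            g = g_coefficient(n_lessons=n_lessons, n_raters=n_raters,
                              **components)
            print(f"raters={n_raters} lessons={n_lessons} G={g:.2f}")
```

Scanning such a grid for the smallest design that clears a chosen reliability threshold is the kind of reasoning behind statements such as "two raters, each scoring three reading lessons"; the threshold the authors used is not given in the abstract.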

2019, Vol 57 (5), pp. 2021-2058
Author(s): Helen Patrick, Panayota Mantzicopoulos, Brian F. French

We used multilevel analysis to examine the predictive validity of scores from the Framework for Teaching (FFT), the observation measure used most often to evaluate teachers’ instruction. We investigated how well 81 kindergarten teachers’ FFT scores for eight reading and eight mathematics lessons observed throughout the year predicted students’ year-end achievement and motivation in reading and mathematics, controlling for students’ sex, ethnicity, and achievement entering kindergarten. Standardized reading and mathematics achievement were each predicted by FFT scores; however, the scores accounted for very little of the overall variance in students’ achievement: 2.5% for reading and 1.3% for mathematics. Neither students’ end-of-year criterion-referenced achievement nor their motivation was predicted by FFT scores.


2021
Author(s): Kurt Schilling, Chantal M.W. Tax, Francois Rheault, Colin B Hansen, Qi Yang, ...

When investigating connectivity and microstructure of white matter pathways of the brain using diffusion tractography bundle segmentation, it is important to understand potential confounds and sources of variation in the process. While cross-scanner and cross-protocol effects on diffusion microstructure measures are well described (in particular fractional anisotropy and mean diffusivity), it is unknown how potential sources of variation affect bundle segmentation results, which features of the bundle are most affected, where variability occurs, or how these sources of variation depend upon the method used to reconstruct and segment bundles. In this study, we investigate four potential sources of variation, or confounds, for bundle segmentation: variation (1) across scan repeats, (2) across scanners, (3) across acquisition protocols, and (4) across diffusion sensitization. We employ four different bundle segmentation workflows on two benchmark multi-subject cross-scanner and cross-protocol databases, and investigate reproducibility and biases in volume overlap, shape geometry features of fiber pathways, and microstructure features within the pathways. We find that the effects of acquisition protocol, in particular acquisition resolution, result in the lowest reproducibility of tractography and the largest variation of features, followed by scanner effects, and finally b-value effects, which had reproducibility similar to scan-rescan variation. However, confounds varied both across pathways and across segmentation workflows, with some bundle segmentation workflows more (or less) robust to sources of variation. Despite variability, bundle dissection is consistently able to recover the same location of pathways in the deep white matter, with variation at the gray matter/white matter interface.
Next, we show that differences due to the choice of bundle segmentation workflow are larger than any other studied confound, with low-to-moderate overlap of the same intended pathway when segmented using different methods. Finally, quantifying microstructure features within a pathway, we show that tractography adds variability over and above that which exists due to noise, scanner effects, and acquisition effects. Overall, these confounds need to be considered when harmonizing diffusion datasets, when interpreting or combining data across sites, and when attempting to understand the successes and limitations of different methodologies in the design and development of new tractography or bundle segmentation methods.
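To make "volume overlap" concrete, the sketch below computes a Dice coefficient between two segmentations of the same bundle. This is an assumption: Dice is a widely used overlap measure, but the abstract does not say which overlap metric the authors used. Each bundle is represented here as a set of occupied voxel indices, and the function name and toy coordinates are hypothetical.

```python
def dice_overlap(voxels_a, voxels_b):
    """Dice coefficient between two sets of (i, j, k) voxel indices.

    Returns 1.0 for identical bundles and 0.0 for disjoint ones.
    Two empty bundles are treated as identical by convention.
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Toy example: the same pathway segmented from a scan and a rescan
    # (hypothetical voxel coordinates, not real data).
    scan = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    rescan = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
    print(f"Dice overlap: {dice_overlap(scan, rescan):.3f}")
```

Computing this value for each pair of scans, scanners, protocols, or workflows gives one number per comparison, which is the form a reproducibility comparison across confounds would take.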

