The inter-rater reliability of the Performance Oriented Mobility Assessment tool after brain surgery

Adam Marco Galloway; Edward C Killan; Gretl A McHugh

doi:10.12968/ijtr.2018.0135

International Journal of Therapy and Rehabilitation ◽

10.12968/ijtr.2018.0135 ◽

2019 ◽

Vol 26 (12) ◽

pp. 1-7

Author(s):

Adam Marco Galloway ◽

Edward C Killan ◽

Gretl A McHugh

Keyword(s):

Hospital Admissions ◽

Assessment Tool ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Future Research ◽

Brain Surgery ◽

Rater Reliability ◽

Efficient Treatment ◽

Mobility Assessment ◽

Risk Of Falls

Background/Aims: Falls are a significant cause of hospital admissions in the UK and require clinically reasoned intervention from the multidisciplinary team to ensure the patient receives effective and efficient treatment, including physiotherapy. This study aimed to assess the inter-rater reliability of the Performance Oriented Mobility Assessment in patients who had recently undergone brain surgery. Methods: A prospective inter-rater reliability study involving 18 male and 12 female patients aged between 27 and 87 years who had recently undergone brain surgery was conducted. Three raters of varying clinical physiotherapy experience assessed participants using the Performance Oriented Mobility Assessment on an acute neurosurgical ward. Inter-rater reliability was measured using Bland–Altman plots and intraclass correlation coefficients. Results: Bland–Altman plots and intraclass correlation coefficient values demonstrated excellent inter-rater reliability, regardless of the age and sex of the patients or the clinical experience of the rater. Conclusions: Results suggest that the Performance Oriented Mobility Assessment is a potentially useful tool for assessing patients, particularly for the risk of falls, following brain surgery. Future research is needed to determine other clinimetric properties of this outcome measure before wider implementation.
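The agreement analysis named in this abstract can be illustrated with a minimal sketch of Bland–Altman limits of agreement for two raters. The function name `bland_altman` and the scores are hypothetical (they are not data from the study), and the conventional bias ± 1.96 SD limits are an assumption, since the abstract does not say how its plots were constructed.

```python
import statistics

def bland_altman(rater_a, rater_b):
    """Return (bias, lower_limit, upper_limit) for paired scores.

    bias is the mean of the paired differences; the limits of agreement
    are bias +/- 1.96 sample standard deviations of those differences.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same participants")
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical total scores from two raters for six participants.
rater_a = [20, 24, 18, 27, 22, 15]
rater_b = [21, 24, 17, 26, 23, 15]
bias, lower, upper = bland_altman(rater_a, rater_b)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits suggests the raters can be used interchangeably; what counts as "narrow" is a clinical judgment rather than a statistical one.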

Competency-based simulation assessment of resuscitation skills in emergency medicine postgraduate trainees – a Canadian multi-centred study

Canadian Medical Education Journal ◽

10.36834/cmej.36682 ◽

2016 ◽

Vol 7 (1) ◽

pp. e57-e67 ◽

Cited By ~ 11

Author(s):

J. Damon Dagnone ◽

Andrew K. Hall ◽

Stefanie Sebok-Syer ◽

Don Klinger ◽

Karen Woolfrey ◽

...

Keyword(s):

Emergency Medicine ◽

Assessment Tool ◽

Postgraduate Medical Education ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

High Fidelity ◽

Rater Reliability ◽

Simulation Based ◽

Competency Based ◽

Generalizability Study

Background: The use of high-fidelity simulation is emerging as a desirable method for competency-based assessment in postgraduate medical education. We aimed to demonstrate the feasibility and validity of a multi-centre simulation-based Objective Structured Clinical Examination (OSCE) of resuscitation competence with Canadian Emergency Medicine (EM) trainees. Method: EM postgraduate trainees (n=98) from five Canadian academic centres participated in a high-fidelity, 3-station simulation-based OSCE. Expert panels of three emergency physicians evaluated trainee performances at each centre using the Queen’s Simulation Assessment Tool (QSAT). Intraclass correlation coefficients were used to measure the inter-rater reliability, and analysis of variance was used to measure the discriminatory validity of each scenario. A fully crossed generalizability study was also conducted for each examination centre. Results: Inter-rater reliability in four of the five centres was strong, with a median absolute intraclass correlation coefficient (ICC) across centres and scenarios of 0.89 [0.65-0.97]. Discriminatory validity was also strong (p < 0.001 for scenarios 1 and 3; p < 0.05 for scenario 2). Generalizability studies found significant variations at two of the study centres. Conclusions: This study demonstrates the successful pilot administration of a multi-centre, 3-station simulation-based OSCE for the assessment of resuscitation competence in postgraduate Emergency Medicine trainees.
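As an illustration of an absolute-agreement ICC of the kind mentioned in this abstract, the sketch below computes the two-way random-effects, single-rater coefficient, often written ICC(2,1), from ANOVA mean squares. Which ICC form the study actually used is not stated in the abstract, so this choice is an assumption; the function name `icc_2_1` and the ratings are illustrative only.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores is a list of rows, one per subject, each holding one score
    per rater (every subject is scored by every rater).
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative ratings: six subjects scored by four raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))
```

Because the between-rater mean square appears in the denominator, systematic differences between raters lower this coefficient, which is what distinguishes absolute agreement from consistency.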

Comprehensive Knowledge Assessment for Athletic Trainers: Part I

Internet Journal of Allied Health Sciences and Practice ◽

10.46743/1540-580x/2019.1823 ◽

2019 ◽

Author(s):

Lindsey Eberman ◽

Jessica Edler ◽

Kenneth Games

Keyword(s):

Continuing Education ◽

Athletic Training ◽

Assessment Tool ◽

Item Difficulty ◽

Athletic Trainers ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Current Mode ◽

Future Research ◽

Actual Knowledge

Purpose: Continuing education (CE) is intended to help clinicians maintain competence, develop and advance knowledge and skills, and enhance knowledge, skills, and abilities beyond the levels required for entry-level practice. Based on previous literature, the current mode of CE in athletic training does not appear to be helping clinicians maintain competence. The purpose of this research was to validate a comprehensive assessment based on the Role Delineation Study/Practice Analysis (6th ed.) through item analysis and estimates of reliability to be used to assess athletic trainers’ actual knowledge. Method: We conducted an instrumentation validation study using the Qualtrics® web-based platform. Athletic trainers (n=191; age=31.5±8.1yrs; years of experience=8.9±11.1yrs) in good standing with the NATA and BOC completed both administrations of the assessment. Six experts developed 220 multiple-choice items for inclusion with broad application across the five domains of clinical practice (Injury/Illness and Wellness Protection [49 items], Clinical Evaluation and Diagnosis [63 items], Immediate and Emergency Care [29 items], Treatment and Rehabilitation [29 items], and Organizational and Professional Health and Wellbeing [50 items]). A random sample of NATA members was recruited via email and received weekly reminders; after four weeks, they completed a second administration of the assessment. We evaluated the assessment tool for item difficulty, item discrimination, internal consistency, item total statistics, and test-retest reliability. Results: We eliminated 42 items from the tool created by the experts that were too difficult (0.90). We eliminated 50 additional items due to point-biserial correlations between item performance and total domain score performance below 0.20. We identified additional weaknesses in 57 items through intraclass correlation coefficients (ICC …). Conclusions: We developed a valid and reliable assessment tool to measure athletic trainers’ actual knowledge. Future research should utilize a validated assessment of actual knowledge to guide continuing education activities.

Psychometric Properties of the MyotonPRO in Dementia Patients with Paratonia

Gerontology ◽

10.1159/000485462 ◽

2017 ◽

Vol 64 (4) ◽

pp. 401-412 ◽

Cited By ~ 5

Author(s):

Hans Drenth ◽

Sytse U. Zuidema ◽

Wim P. Krijnen ◽

Ivan Bautmans ◽

Cees van der Schans ◽

...

Keyword(s):

Psychometric Properties ◽

Correlation Coefficient ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Functional Mobility ◽

Future Research ◽

Minimal Detectable Change ◽

Muscle Properties ◽

Dementia Patients ◽

Longitudinal Outcome

Background: Paratonia is a distinctive form of hypertonia, with effects ranging from loss of functional mobility in the early stages of dementia to severely high muscle tone and pain in the late stages. For assessing and evaluating therapeutic interventions, objective instruments are required. Objective: Determine the psychometric properties of the MyotonPRO, a portable device that objectively measures muscle properties, in dementia patients with paratonia. Methods: Muscle properties were assessed with the MyotonPRO by 2 assessors within one session and repeated by the main researcher after 30 min and again after 6 months. Receiver operating characteristic curves were constructed for all MyotonPRO outcomes to discriminate between participants with (n = 70) and without paratonia (n = 82). In the participants with paratonia, correlation coefficients were established between the MyotonPRO outcomes and the Modified Ashworth Scale for paratonia (MAS-P) and muscle palpation. In participants with paratonia, reliability (intraclass correlation coefficient) and agreement values (standard error of measurement and minimal detectable change) were established. Longitudinal outcome from participants with paratonia throughout the study (n = 48) was used to establish the sensitivity for change (correlation coefficient) and responsiveness (minimal clinically important difference). Results: Included were 152 participants with dementia (mean [standard deviation] age of 83.5 [98.2]). The area under the curve ranged from 0.60 to 0.67, indicating the MyotonPRO is able to differentiate between participants with and without paratonia. The MyotonPRO explained 10-18% of the MAS-P score and 8-14% of the palpation score. Intraclass correlation coefficients ranged from 0.57 to 0.75 for interrater reliability and from 0.54 to 0.71 for intrarater reliability. The best agreement values were found for tone, elasticity, and stiffness. The change between baseline and 6 months in the MyotonPRO outcomes explained 8-13% of the change in the MAS-P scores. The minimal clinically important difference values were all smaller than the measurement error. Conclusion: The MyotonPRO is potentially applicable for cross-sectional studies between groups of paratonia patients and appears less suitable to measure intraindividual changes in paratonia. Because of the inherent variability in movement resistance in paratonia, the outcomes from the MyotonPRO should be interpreted with care; therefore, future research should focus on additional guidelines to aid clinical interpretation and improve reproducibility.
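The agreement values mentioned in this abstract (standard error of measurement and minimal detectable change) are commonly derived from a reliability coefficient. The sketch below uses the widely used forms SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; whether the study used exactly these forms is an assumption, and the function names and inputs are hypothetical.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), in the units of the measurement."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)

def minimal_detectable_change_95(sd, icc):
    """MDC95 = 1.96 * sqrt(2) * SEM.

    sqrt(2) accounts for measurement error in both the baseline and
    the follow-up measurement.
    """
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)

# Hypothetical inputs: between-subject SD of 2.0 units and an ICC of 0.75.
sem = standard_error_of_measurement(2.0, 0.75)
mdc = minimal_detectable_change_95(2.0, 0.75)
print(f"SEM={sem:.2f}, MDC95={mdc:.2f}")
```

A change score smaller than the MDC95 cannot be distinguished from measurement error, which is the situation the abstract describes when the minimal clinically important difference falls below the measurement error.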

Adaptation and Psychometric Evaluation of the Chinese Counseling Competencies Scale-Revised

Frontiers in Psychology ◽

10.3389/fpsyg.2021.688539 ◽

2021 ◽

Vol 12 ◽

Author(s):

Wei Xia ◽

William Ho Cheung Li ◽

Tingna Liang ◽

Yuanhui Luo ◽

Laurie Long Kwan Ho ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Psychometric Evaluation ◽

Counseling Competencies ◽

Counselors In Training ◽

Rater Reliability ◽

Retest Reliability ◽

Test Retest Reliability

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest. A confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by assessments of internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores of performance checklist (Pearson's γ = 0.781), indicating a large convergent validity, and knowledge on drug abuse (Pearson's γ = 0.833), indicating a moderate concurrent validity. Conclusion: The results indicate that the Chinese CCS-R is a valid and reliable measure of counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.

Comparing Counts of Park Users With a Wearable Video Device and an Unmanned Aerial System

Journal for the Measurement of Physical Behaviour ◽

10.1123/jmpb.2020-0063 ◽

2021 ◽

pp. 1-8

Author(s):

Richard R. Suminski ◽

Gregory M. Dominick ◽

Matthew Saponaro

Keyword(s):

Repeated Measures ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Future Research ◽

Unmanned Aerial Systems ◽

Park Use ◽

Intraclass Correlation Coefficients ◽

Potential Benefits ◽

Absolute Agreement ◽

Aerial Systems

Evidence suggests that video captured with a wearable video device (WVD) may augment or supplant traditional methods for assessing park use. Unmanned aerial systems (UASs) are used to assess human activity, but research employing them for park assessments is sparse. Therefore, this study compared park user counts between a WVD and UAS. A diverse set of 33 amenities (e.g., playground) in three parks were videoed simultaneously by one researcher wearing a WVD and another operating the UAS. Assessments were done at 12 p.m. and 7 p.m. on weekends, with one park evaluated on two occasions 7 days apart. Two investigators independently reviewed videos and reached consensus on the counts of individuals at each amenity. Intraclass correlation coefficients (ICCs) were used to determine intra- and interrater reliabilities. A total of 404 (M = 4.7; SD = 9.6) and 389 (M = 4.5; SD = 9.0) individuals were counted in the UAS and WVD videos, respectively. Absolute agreement was 86% (74/86) and 100% when no individuals were using the amenity. Whether using all 86 videos or only videos having people (48 videos), ICCs indicated excellent reliability (ICC = .99; p < .001). The totals seen for the repeated measures were UAS = 146 and WVD = 136 for Day 1 and UAS = 169 and WVD = 161 for Day 2. Intrarater reliability was excellent for the UAS (ICC = .92; p < .001) and good for the WVD (ICC = .89; p < .001). Disagreement was mainly due to obstructions—people behind or under structures. This study provides support for the use of UASs for counting park users and future research examining the potential benefits of video analysis for assessing park use.

Standardising the measurement of physical activity in people receiving haemodialysis: considerations for research and practice

BMC Nephrology ◽

10.1186/s12882-019-1634-1 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Hannah M. L. Young ◽

Mark W. Orme ◽

Yan Song ◽

Maurice Dungey ◽

James O. Burton ◽

...

Keyword(s):

Physical Activity ◽

Sample Size ◽

Repeated Measures ◽

A Priori ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Future Research ◽

Wear Time ◽

Minimum Number ◽

The UK

Background: Physical activity (PA) is exceptionally low amongst the haemodialysis (HD) population, and physical inactivity is a powerful predictor of mortality, making it a prime focus for intervention. Objective measurement of PA using accelerometers is increasing, but standard reporting guidelines essential to effectively evaluate, compare and synthesise the effects of PA interventions are lacking. This study aims to (i) determine the measurement and processing guidance required to ensure representative PA data amongst a diverse HD population, and (ii) assess adherence to PA monitor wear amongst HD patients. Methods: Clinically stable HD patients from the UK and China wore a SenseWear Armband accelerometer for 7 days. Step counts across day types (HD, weekday, weekend) were compared using repeated measures ANCOVA. Intraclass correlation coefficients (ICCs) determined reliability (≥ 0.80 acceptable). The Spearman-Brown prophecy formula, in conjunction with a priori ≥ 80% sample size retention, identified the minimum number of days required for representative PA data. Results: Seventy-seven patients (64% men, mean ± SD age 56 ± 14 years, median (interquartile range) time on HD 40 (19–72) months, 40% Chinese, 60% British) participated. Participants took fewer steps on HD days compared with non-HD weekdays and weekend days (3402 [95% CI 2665–4140], 4914 [95% CI 3940–5887], 4633 [95% CI 3558–5707] steps/day, respectively, p < 0.001). PA on HD days was less variable than on non-HD days (ICC 0.723–0.839 versus 0.559–0.611), with ≥ 1 HD day and ≥ 3 non-HD days required to provide representative data. Using these criteria, the most stringent wear-time retaining ≥ 80% of the sample was ≥ 7 h. Conclusions: At group level, a wear-time of ≥ 7 h on ≥ 1 HD day and ≥ 3 non-HD days is required to provide reliable PA data whilst retaining an acceptable sample size. PA is low across both HD and non-HD days, and future research should focus on interventions designed to increase physical activity in both the intra- and interdialytic periods.
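The Spearman-Brown prophecy formula named in this abstract projects the reliability of an average over k days from a single-day reliability r as R_k = k·r / (1 + (k − 1)·r). The sketch below inverts it to find the smallest number of days reaching a target reliability. The function names and the single-day reliability of 0.60 are hypothetical; the abstract does not report the exact inputs the authors used, so this is not a reproduction of their result.

```python
import math

def spearman_brown(single_day_reliability, n_days):
    """Projected reliability of the mean of n_days measurements."""
    r = single_day_reliability
    return n_days * r / (1.0 + (n_days - 1) * r)

def days_needed(single_day_reliability, target=0.80):
    """Smallest whole number of days whose projected reliability >= target."""
    r = single_day_reliability
    if not 0.0 < r < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    k = target * (1.0 - r) / (r * (1.0 - target))
    # Round before taking the ceiling to guard against float noise.
    return max(1, math.ceil(round(k, 9)))

# Hypothetical single-day reliability of 0.60 with the 0.80 threshold
# that the abstract treats as acceptable.
k = days_needed(0.60, target=0.80)
print(k, round(spearman_brown(0.60, k), 3))
```

The formula assumes the days are interchangeable (parallel) measurements, which is one reason a study might apply it separately to dialysis and non-dialysis days.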

Is the location of the signal intensity weighted centroid a reliable measurement of fluid displacement within the disc?

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2016-0178 ◽

2018 ◽

Vol 63 (4) ◽

pp. 453-460 ◽

Cited By ~ 7

Author(s):

Vahid Abdollah ◽

Eric C. Parent ◽

Michele C. Battié

Keyword(s):

Signal Intensity ◽

Water Distribution ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Region Of Interest ◽

Rater Reliability ◽

Fluid Displacement ◽

Intraclass Correlation Coefficients ◽

The Mean ◽

Standard Error Of Measurement

Degenerated discs have shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted-centroid reflects the water distribution within a region-of-interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted-centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed images twice and another once, blinded to measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior and posterior annulus. The coordinates of the signal-intensity-weighted-centroid for all regions demonstrated excellent intraclass-correlation-coefficients for intra- (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted-centroid for all ROIs was 0 at both levels and 0 to 2.7 mm for X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability with intraclass-correlation-coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for annulus ROIs, with intra-rater intraclass-correlation-coefficients from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted-centroid is a reliable biomarker for investigating the effects of disc interventions.
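A signal-intensity-weighted centroid is the intensity-weighted mean of pixel coordinates within a region of interest: x_c = Σ I·x / Σ I and y_c = Σ I·y / Σ I. The sketch below is one plain way to compute it on a small grid. The function name, the optional `mask` argument, and the toy image are hypothetical, and the result is in pixel indices; converting to millimetres would need the scan's pixel spacing, which the abstract does not give.

```python
def weighted_centroid(image, mask=None):
    """Return (x, y) of the intensity-weighted centroid in pixel indices.

    image is a list of rows of non-negative intensities; mask, if given,
    has the same shape and marks which pixels belong to the ROI.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            if mask is not None and not mask[y][x]:
                continue  # pixel lies outside the region of interest
            total += intensity
            sum_x += intensity * x
            sum_y += intensity * y
    if total == 0:
        raise ValueError("ROI has zero total intensity")
    return sum_x / total, sum_y / total

# Toy 3x3 image: brighter signal on the right pulls the centroid rightwards.
image = [
    [0, 0, 0],
    [0, 1, 3],
    [0, 0, 0],
]
print(weighted_centroid(image))
```

Because every pixel in the ROI contributes, the centroid shifts smoothly as signal redistributes, which fits the abstract's description of it as reflecting water distribution within the region.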

Development and assessment of the inter-rater and intra-rater reproducibility of a self-administration version of the ALSFRS-R

Journal of Neurology Neurosurgery & Psychiatry ◽

10.1136/jnnp-2019-321138 ◽

2019 ◽

Vol 91 (1) ◽

pp. 75-81 ◽

Cited By ~ 7

Author(s):

Leonhard A Bakker ◽

Carin D Schröder ◽

Harold H G Tan ◽

Simone M A G Vugts ◽

Ruben P A van Eijk ◽

...

Keyword(s):

Rating Scale ◽

Clinical Care ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

The Self ◽

Coefficient Alpha ◽

Rater Agreement ◽

Self Administration ◽

Limits Of Agreement ◽

Rater Reliability

Objective: The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of the study is to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods: The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess the inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess the intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results: The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion: The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals for both individual comparisons, facilitating the management of clinical care, and group comparisons in clinical trials.
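For the internal-consistency step mentioned in this abstract, Cronbach's coefficient alpha can be computed as α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The sketch below is a minimal version using sample variances. The function name and the responses are hypothetical (three items scored 0 to 4 for five respondents) and are not data from the study.

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([row[j] for row in rows]) for j in range(k)
    ]
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: five respondents, three items scored 0-4.
responses = [
    [4, 3, 4],
    [2, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Alpha rises when items vary together across respondents, so it describes the consistency of the item set rather than agreement between raters or occasions.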

An App to Assess Young Children’s Perceptions of Movement Competence

Journal of Motor Learning and Development ◽

10.1123/jmld.2017-0039 ◽

2018 ◽

Vol 6 (s2) ◽

pp. S252-S263 ◽

Cited By ~ 1

Author(s):

Lisa M. Barnett ◽

Owen Makin

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Future Research ◽

Children's Perceptions ◽

Intraclass Correlation Coefficients ◽

Digital Assessment ◽

Android App ◽

Test Retest Reliability ◽

Good Agreement

Assessing young children’s perceptions is commonly done one on one with an interviewer. An app enables several children to complete the scale at once. The objective was to describe an app to assess children’s perceptions of movement competence and then present consistency of child responses. The Pictorial Scale of Perceived Movement Skill Competence (PMSC) has fundamental movement skill (FMS; e.g., catch) and play items (e.g., cycling). The PMSC android app has the same items and images but children complete it independently with audio. Intraclass correlation coefficients (ICC) assessed i) test-retest reliability using the PMSC app on 18 items in 42 children (M = 6.8 yrs) and ii) consistency between measures for 13 FMS items in 44 children (M = 8.5 yrs). Over time (M = 6.9 days, SD = 0.35) the full PMSC had good consistency (ICC = 0.79, 95% CI 0.64–0.88) and the FMS items had moderate consistency (ICC = 0.68, 95% CI 0.47–0.81). There was good agreement between the app and interview for FMS items (ICC = 0.86, 95% CI 0.76–0.92). Locomotor items were less consistent. The PMSC app can generally be recommended. Future research could investigate how different forms of digital assessment affect children’s perception.

A Comparison of Reliability Coefficients for Ordinal Rating Scales

Journal of Classification ◽

10.1007/s00357-021-09386-5 ◽

2021 ◽

Author(s):

Alexandra de Raadt ◽

Matthijs J. Warrens ◽

Roel J. Bosker ◽

Henk A. L. Kiers

Keyword(s):

Empirical Data ◽

Rating Scales ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Weighted Kappa ◽

Rater Reliability ◽

Intraclass Correlations ◽

Applied Researcher ◽

Highly Correlated ◽

Reliability Coefficients

Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used. Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
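To make the weighted kappa coefficients in this abstract concrete, the sketch below computes Cohen's weighted kappa for two raters as κ_w = 1 − Σ w·O / Σ w·E, with disagreement weights w = (|i − j| / (c − 1))^p, where p = 1 gives linear and p = 2 gives quadratic weighting. The function name and the ratings are hypothetical, and categories are assumed to be coded 0 to c − 1.

```python
def weighted_kappa(rater1, rater2, n_categories, power=2):
    """Cohen's weighted kappa (power=1 linear, power=2 quadratic).

    Ratings must be integers in 0 .. n_categories - 1.
    """
    if len(rater1) != len(rater2):
        raise ValueError("both raters must rate the same items")
    c = n_categories
    n = len(rater1)
    observed = [[0] * c for _ in range(c)]
    for a, b in zip(rater1, rater2):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(c)]
    col_totals = [sum(observed[i][j] for i in range(c)) for j in range(c)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(c):
        for j in range(c):
            weight = (abs(i - j) / (c - 1)) ** power  # disagreement weight
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - weighted_observed / weighted_expected

# Hypothetical ratings on a three-point ordinal scale (0, 1, 2).
rater1 = [0, 1, 2, 2, 1, 0]
rater2 = [0, 1, 2, 1, 1, 0]
print(round(weighted_kappa(rater1, rater2, 3, power=2), 3))
print(round(weighted_kappa(rater1, rater2, 3, power=1), 3))
```

Quadratic weights penalise a two-category disagreement four times as heavily as a one-category disagreement, which is one intuition for why quadratic kappa tends to track the Pearson and intraclass correlations.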
