Development of a Musculoskeletal Imaging Competency Examination for Physical Therapists

2020 ◽  
Vol 100 (12) ◽  
pp. 2254-2265
Author(s):  
Troy Burley ◽  
Lori T Brody ◽  
William G Boissonnault ◽  
Michael D Ross

Abstract Objective The number of physical therapists with imaging ordering privileges is increasing; however, a known level of competency and knowledge is generally lacking within the profession, as is a method to determine practitioner competency. The purpose of this study was to develop a valid musculoskeletal (MSK) imaging competency examination for physical therapists. Methods This 3-round Delphi method study utilized experts to reach consensus on examination content and development. Round 1 was completed by 37 experts. The last 2 rounds were completed by 35 experts. Experts rated questions on a 5-point Likert rating scale of importance (1 = not at all important, 5 = very important). Consensus was achieved with an a priori decision of (1) >75% agreement of the expert panel rating and ≥4 on the Likert scale, and (2) ≥.90 on Cronbach alpha and intraclass correlation coefficients. Experts recommended a passing score of 75%. The examination was subsequently reviewed by a panel of 5 radiologists. Results The Delphi method and radiologist panel review resulted in the 151-question Burley Readiness Examination (BRE) for MSK Imaging Competency. Interrater agreement and internal consistency of the Delphi panel were excellent, with an average intraclass correlation coefficient and Cronbach alpha of .928 and .950, respectively. Conclusions The BRE is a tool that has the potential to demonstrate practitioners’ level of baseline competency with MSK imaging. Additional testing among physical therapists will provide further validation and reliability of the examination. Impact The use and application of diagnostic imaging is becoming more widespread in physical therapist practice throughout the United States. The BRE could potentially have broader implications for health care utilization and cost in the area of MSK imaging.
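
One possible reading of the first consensus criterion above (more than 75% of panelists rating an item 4 or 5 on the Likert scale) can be sketched as follows. The function name, its parameters, and the sample ratings are hypothetical; the abstract does not spell out how the criterion was computed.

```python
def reaches_consensus(ratings, min_rating=4, min_share=0.75):
    """Return True if more than `min_share` of the ratings are >= `min_rating`.

    This is an assumed interpretation of the a priori criterion, not the
    authors' documented procedure.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    agreeing = sum(1 for r in ratings if r >= min_rating)
    return agreeing / len(ratings) > min_share


# Hypothetical ratings from a 35-member panel (illustration only).
strong_item = [5] * 20 + [4] * 10 + [3] * 5   # 30 of 35 panelists rate >= 4
weak_item = [5] * 10 + [4] * 15 + [2] * 10    # 25 of 35 panelists rate >= 4

print(reaches_consensus(strong_item))
print(reaches_consensus(weak_item))
```

Under this reading, an item rated 4 or higher by exactly 75% of the panel would not reach consensus, because the threshold is strict.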

2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽  
Author(s):  
Sharon L. Gorman ◽  
Monica Rivera ◽  
Lise McCarthy

The function in sitting test (FIST) is a newly developed, performance-based measure examining deficits in seated postural control. The FIST has been shown to be internally consistent and valid in persons with neurological dysfunction but intra- and interrater reliability and test-retest reliability have not been previously described. Seven patients with chronic neurologic dysfunction were tested and videotaped performing the FIST on two consecutive days. Seventeen acute care and inpatient rehabilitation physical therapist raters scored six of the videotaped performance of the FIST on two occasions at least 2 weeks apart. Intraclass correlation coefficients were used to calculate the test-retest and intra- and interrater reliability of the FIST. ICC of 0.97 (95% CI 0.847–0.995) indicated excellent test-retest reliability of the FIST. Intra- and interrater reliability was also excellent with ICCs of 0.99 (95% CI 0.994–0.997) and 0.99 (95% CI 0.988–0.994), respectively. Physical therapists and other rehabilitation professionals can confidently use the FIST in a variety of clinical practice and research settings due to its favorable reliability characteristics. More studies are needed to describe the responsiveness and minimal clinically important level of change in FIST scores to further enhance clinical usefulness of this measure.


2005 ◽  
Vol 85 (7) ◽  
pp. 656-664 ◽  
Author(s):  
Joseph A Shrader ◽  
John M Popovich ◽  
G Chris Gracey ◽  
Jerome V Danoff

Abstract Background and Purpose. Navicular drop (ND) measurement may be a valuable examination technique for patients with rheumatoid arthritis (RA). However, no data exist on reliability for this technique in patients with RA. The purposes of this study were: (1) to determine interrater and intrarater reliability of ND measurements in people with RA, (2) to compare ND values of people with RA with published normative data, and (3) to investigate ND measurement error associated with the use of skin markings. Subjects. Ten women (20 feet) with RA consented to participate. Methods. Patients completed demographic and function questionnaires. Navicular height (NH) measurements were taken by 2 physical therapists and 1 physical therapist student, following four 1-hour training sessions, using standardized methods and a digital height gauge. Four different NH measurements were taken 3 times on each foot by each of the 3 examiners during a morning session and then repeated during an afternoon session on the same day. Navicular drop values were calculated, including ND1 (as reported in the literature), ND2 (compensating for skin error), and ND3 (single-limb stance). Intraclass correlation coefficients (ICCs) and standard errors of measurement (SEMs) were used to establish reliability. Results. Means (±SD) for each ND measure for sessions 1 and 2, respectively, were as follows: ND1=8.36±5.29 mm and 8.29±5.24 mm, ND2=9.95±5.44 mm and 9.57±5.37 mm. The ICCs (2,1 and 2,k, respectively) for all interrater measurements ranged from .67 to .92 (SEM=2.0–3.3 mm) and from .85 to .97 (SEM=1.1–2.0 mm). The ICCs (2,1 and 2,k, respectively) for intrarater measurements ranged from .73 to .95 (SEM=1.3–2.8 mm) and from .90 to .98 (SEM=0.7–1.6 mm). Paired t tests showed the means of ND1 and ND2 for each examiner and for both sessions were significantly different. Discussion and Conclusion. 
The results suggest that ND measurements for people with RA can be taken reliably by clinicians with varied experience. The ND values for our subjects were slightly greater than reported normal values of 6 to 8 mm. Error associated with skin markings was statistically significant for all sessions and examiners.
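
The ICC(2,1) and ICC(2,k) forms reported above can be illustrated with a minimal sketch: a two-way ANOVA estimate of ICC(2,1), plus the Spearman-Brown step-up that links a single-measurement ICC to the ICC of the mean of k measurements. The function names and the demo matrix are illustrative assumptions, not the study's data or code, and the abstract does not state which k was used.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a two-way ANOVA; rows are subjects, columns are raters."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters (or repeated measurements)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def step_up(icc_single, k):
    """Spearman-Brown step-up: reliability of the mean of k measurements."""
    return k * icc_single / (1 + (k - 1) * icc_single)


# Illustrative matrix: 6 subjects scored by 4 raters (not data from the study).
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single = icc_2_1(demo)
print(round(single, 2), round(step_up(single, 4), 2))
```

If k = 3 is assumed, `step_up(0.67, 3)` and `step_up(0.92, 3)` give roughly 0.86 and 0.97, close to the interrater ICC(2,k) range of .85 to .97 reported above.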


10.2196/20172 ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. e20172
Author(s):  
Masanori Tanaka ◽  
Manabu Saito ◽  
Michio Takahashi ◽  
Masaki Adachi ◽  
Kazuhiko Nakamura

Background Early detection and intervention for neurodevelopmental disorders are effective. Several types of paper questionnaires have been developed to assess these conditions in early childhood; however, the psychometric equivalence between the web-based and the paper versions of these questionnaires is unknown. Objective This study examined the interformat reliability of the web-based parent-rated version of the Autism Spectrum Screening Questionnaire (ASSQ), Attention-Deficit/Hyperactivity Disorder Rating Scale (ADHD-RS), Developmental Coordination Disorder Questionnaire 2007 (DCDQ), and Strengths and Difficulties Questionnaire (SDQ) among Japanese preschoolers in a community developmental health check-up setting. Methods A set of paper-based questionnaires were distributed for voluntary completion to parents of children aged 5 years. The package of the paper format questionnaires included the ASSQ, ADHD-RS, DCDQ, parent-reported SDQ (P-SDQ), and several additional demographic questions. Responses were received from 508 parents of children who agreed to participate in the study. After 3 months, 300 parents, who were among the initial responders, were randomly selected and asked to complete the web-based versions of these questionnaires. A total of 140 parents replied to the web-based format and were included as a final sample in this study. Results We obtained the McDonald ω coefficients for both the web-based and paper formats of the ASSQ (web-based: ω=.90; paper: ω=.86), ADHD-RS total and subscales (web-based: ω=.88-.94; paper: ω=.87-.93), DCDQ total and subscales (web-based: ω=.82-.94; paper: ω=.74-.92), and P-SDQ total and subscales (web-based: ω=.55-.81; paper: ω=.52-.80). 
The intraclass correlation coefficients between the web-based and paper formats were all significant at the 99.9% confidence level: ASSQ (r=0.66, P<.001); ADHD-RS total and subscales (r=0.66-0.74, P<.001); DCDQ total and subscales (r=0.66-0.71, P<.001); P-SDQ Total Difficulties and subscales (r=0.55-0.73, P<.001). There were no significant differences between the web-based and paper formats for total mean score of the ASSQ (P=.76), total (P=.12) and subscale (P=.11-.47) mean scores of DCDQ, and the P-SDQ Total Difficulties mean score (P=.20) and mean subscale scores (P=.28-.79). Although significant differences were found between the web-based and paper formats for mean ADHD-RS scores (total: t(132)=2.83, P=.005; Inattention subscale: t(133)=2.15, P=.03; Hyperactivity/Impulsivity subscale: t(133)=3.21, P=.002), the effect sizes were small (Cohen d=0.18-0.22). Conclusions These results suggest that the web-based versions of the ASSQ, ADHD-RS, DCDQ, and P-SDQ were equivalent, with the same level of internal consistency and intrarater reliability as the paper versions, indicating the applicability of the web-based versions of these questionnaires for assessing neurodevelopmental disorders.


2019 ◽  
Author(s):  
Marco Bardus ◽  
Nathalie Awada ◽  
Lilian A Ghandour ◽  
Elie-Jacques Fares ◽  
Tarek Gherbal ◽  
...  

BACKGROUND With thousands of health apps in app stores globally, it is crucial to systematically and thoroughly evaluate the quality of these apps due to their potential influence on health decisions and outcomes. The Mobile App Rating Scale (MARS) is the only currently available tool that provides a comprehensive, multidimensional evaluation of app quality; it has been used to compare medical apps from American and European app stores in various areas and is available in English, Italian, Spanish, and German. However, this tool is not available in Arabic. OBJECTIVE This study aimed to translate and adapt MARS to Arabic and validate the tool with a sample of health apps aimed at managing or preventing obesity and associated disorders. METHODS We followed a well-established and defined “universalist” process of cross-cultural adaptation using a mixed methods approach. Early translations of the tool, accompanied by confirmation of the contents by two rounds of separate discussions, were included and culminated in a final version, which was then back-translated into English. Two trained researchers piloted the MARS in Arabic (MARS-Ar) with a sample of 10 weight management apps obtained from Google Play and the App Store. Interrater reliability was established using intraclass correlation coefficients (ICCs). After reliability was ascertained, the two researchers independently evaluated an additional set of 56 apps. RESULTS MARS-Ar was highly aligned with the original English version. The ICCs for MARS-Ar (0.836, 95% CI 0.817-0.853) and MARS English (0.838, 95% CI 0.819-0.855) were good. The MARS-Ar subscales were highly correlated with the original counterparts (P<.001). The lowest correlation was observed in the area of usability (r=0.685), followed by aesthetics (r=0.827), information quality (r=0.854), engagement (r=0.894), and total app quality (r=0.897). Subjective quality was also highly correlated (r=0.820).
CONCLUSIONS MARS-Ar is a valid instrument to assess app quality among trained Arabic-speaking users of health and fitness apps. Researchers and public health professionals in the Arab world can use the overall MARS score and its subscales to reliably evaluate the quality of weight management apps. Further research is necessary to test the MARS-Ar on apps addressing various health issues, such as attention or anxiety prevention, or sexual and reproductive health.


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Nathalie Rommel ◽  
Charlotte Borgers ◽  
Dirk Van Beckevoort ◽  
Ann Goeleven ◽  
Eddy Dejaeger ◽  
...  

Background. We aimed to validate an easy-to-use videofluoroscopic analysis tool, the bolus residue scale (BRS), for detection and classification of pharyngeal retention in the valleculae, piriform sinuses, and/or the posterior pharyngeal wall. Methods. Fifty randomly selected videofluoroscopic images of 10 mL swallows (recorded in 18 dysphagia patients and 8 controls) were analyzed by 4 expert and 6 nonexpert observers. A score from 1 to 6 was assigned according to the number of structures affected by residue. Inter- and intrarater reliabilities were assessed by calculation of intraclass correlation coefficients (ICCs) for expert and nonexpert observers. Sensitivity, specificity, and interrater agreement were analyzed for different BRS levels. Results. Intrarater reproducibility was almost perfect for experts (mean ICC 0.972) and ranged from substantial to almost perfect for nonexperts (mean ICC 0.835). Interjudge agreement of the experts ranged from substantial to almost perfect (mean ICC 0.780), but interrater reliability of nonexperts ranged from substantial to good (mean ICC 0.719). For experts, the BRS shows high specificity and sensitivity; for nonexperts, it shows low sensitivity and high specificity. Conclusions. The BRS is a simple, easy-to-carry-out, and accessible rating scale to locate pharyngeal retention on videofluoroscopic images with good specificity and reproducibility for observers of different expertise levels.


2019 ◽  
Vol 65 (4) ◽  
pp. 237-244 ◽  
Author(s):  
Clément Dondé ◽  
Frédéric Haesebaert ◽  
Emmanuel Poulet ◽  
Marine Mondino ◽  
Jérôme Brunelin

Objective: The aim of this study was to validate the French version of the 7-item Auditory Hallucination Rating Scale (AHRS) so as to facilitate fine-grained assessment of auditory hallucinations (AH) in native French-speaking patients with schizophrenia (SZ) in clinical settings and studies. Method: Patients (N = 66) were diagnosed with SZ according to the Diagnostic and Statistical Manual of Mental Disorders. The French version of the AHRS was developed using a forward–backward translation procedure. Psychometric properties of the French version of the AHRS were tested including (i) construct validity with a confirmatory one-factor analysis, (ii) internal validity with Pearson correlations and Cronbach α coefficients, and (iii) external validity by correlations with the Scale for Assessment of Positive Symptoms (SAPS-H1), the Positive and Negative Syndrome Scale (PANSS-P3; concurrent), the PANSS-Negative subscale and age of subjects (divergent), and inter-rater intraclass correlation coefficients (ICCs). Results: (i) The confirmatory one-factor analysis found a root mean square error of approximation (RMSEA) = 0.00, 90% confidence interval = [0.000 to 0.011], and a comparative fit index = 0.994. (ii) Correlations between AHRS total score and individual items were mostly ≥0.4. Cronbach α coefficient was 0.61. (iii) Correlations with PANSS-P3 and SAPS-H1 were 0.42 and 0.53, respectively. In a subset of participants (N = 16), ICC values were extremely high and significant for AHRS total and individual item scores (ICCs range 0.899 to 0.996). Conclusion: The French version of the AHRS is a psychometrically acceptable instrument for the evaluation of AH severity in French-speaking patients with SZ.
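
For readers unfamiliar with the Cronbach α coefficient reported above, a minimal sketch of the standard formula follows: α = k/(k−1) × (1 − sum of item variances / variance of total scores), for k items. The function name and the sample scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach alpha for a table with one row per respondent, one column per item."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 3 respondents, 2 items (illustration only).
sample = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(sample))
```

The sketch assumes the total scores are not all identical; if they were, the total variance would be zero and the ratio undefined.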


2019 ◽  
Vol 91 (1) ◽  
pp. 75-81 ◽  
Author(s):  
Leonhard A Bakker ◽  
Carin D Schröder ◽  
Harold H G Tan ◽  
Simone M A G Vugts ◽  
Ruben P A van Eijk ◽  
...  

Objective The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of the study is to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess the inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess the intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals for both individual comparisons, facilitating the management of clinical care, and group comparisons in clinical trials.
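
The Bland-Altman agreement analysis mentioned above is conventionally summarized by the mean difference (bias) and 95% limits of agreement, bias ± 1.96 × SD of the paired differences. A minimal sketch under that conventional definition follows; the function name and the scores are hypothetical, and the abstract does not describe the authors' exact computation.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)                    # systematic difference between methods
    half_width = 1.96 * stdev(diffs)      # 95% limits under approximate normality
    return bias, bias - half_width, bias + half_width


# Hypothetical total scores: self-administered vs rater-administered.
self_scores = [40, 35, 42, 30]
rater_scores = [39, 36, 40, 31]
bias, lower, upper = bland_altman(self_scores, rater_scores)
print(bias, lower, upper)
```

A bias near zero suggests no systematic difference between the two administrations, while the width of the limits indicates how far individual paired scores may diverge.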


2000 ◽  
Vol 80 (2) ◽  
pp. 168-178 ◽  
Author(s):  
Suh-Fang Jeng ◽  
Kuo-Inn Tsou Yau ◽  
Li-Chiou Chen ◽  
Shu-Fang Hsiao

Abstract Background and Purpose. The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Subjects. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) for the AIMS. Methods. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Results. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97–.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). Conclusion and Discussion. The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.


2020 ◽  
Vol 24 (1) ◽  
pp. 12-18
Author(s):  
Bárbara Pessali-Marques ◽  
Gustavo H.C. Peixoto ◽  
Christian E.T. Cabido ◽  
André Gustavo P. Andrade ◽  
Sara A. Rodrigues ◽  
...  

This study aimed to investigate the biomechanical response of the hamstring muscles to acute stretching in dancers (D) compared to non-dancers (ND). Maximal range of motion (ROMMax) and stiffness of the hamstrings were assessed in 46 young males: 23 undergraduate students (ND) and 23 professional dancers (D). Ages of the two groups were D 21.5 ± 0.60 years and ND 27.5 ± 0.98 years. Testing was performed in two sessions, with familiarization with procedures in the first session and the tests themselves (pre- and post-test and intervention) in the second, with a 24- to 48-hour interval between. The pre-test consisted of three trials of passive knee extension to the point of increased tension in the hamstrings, defined as ROMMax. The resistance torque recorded at ROMMax was defined as torqueMax. Six 30-second constant torque stretches were performed at 100% of the torqueMax reached in the pre-test in one lower limb only (intervention), with the contralateral limb used as control. The torque measured at an identical ROM before (pre-test) and after (post-test) the intervention was defined as torqueROM, and represented stiffness in this study. Reliability of the ROMMax, torqueMax, and torqueROM was assessed via intraclass correlation coefficients (ICC3,k) and standard error of the measurements (SEM). Comparison between dancers and non-dancers, control, and intervention conditions for all dependent variables was performed using repeated-measures ANOVA followed by Tukey post hoc comparisons to highlight any interaction. The submaximal stretch intensity applied caused torqueROM to decrease in both D and ND groups (p < 0.01), indicating a decrease in stiffness, but no difference between the groups was found. A significantly greater increase in ROMMax was found for the D group compared to the ND group (p < 0.01), suggesting that other aspects in addition to MTU biomechanical adaptations may have played a role in the ROMMax increase, especially for the D group. Further research is needed to explore what those other adaptations are. Meanwhile, coaches and physical therapists should be aware that dancers may require different stretch training protocols than non-dancers.


2013 ◽  
Vol 25 (9) ◽  
pp. 1503-1511 ◽  
Author(s):  
Florindo Stella ◽  
Orestes Vicente Forlenza ◽  
Jerson Laks ◽  
Larissa Pires de Andrade ◽  
Michelle A. Ljubetic Avendaño ◽  
...  

ABSTRACT Background: Patients with dementia may be unable to describe their symptoms, and caregivers frequently suffer emotional burden that can interfere with judgment of the patient's behavior. The Neuropsychiatric Inventory-Clinician rating scale (NPI-C) was therefore developed as a comprehensive and versatile instrument to assess and accurately measure neuropsychiatric symptoms (NPS) in dementia, thereby using information from caregiver and patient interviews, and any other relevant available data. The present study is a follow-up to the original, cross-national NPI-C validation, evaluating the reliability and concurrent validity of the NPI-C in quantifying psychopathological symptoms in dementia in a large Brazilian cohort. Methods: Two blinded raters evaluated 312 participants (156 patient-knowledgeable informant dyads) using the NPI-C for a total of 624 observations in five Brazilian centers. Inter-rater reliability was determined through intraclass correlation coefficients for the NPI-C domains and the traditional NPI. Convergent validity included correlations of specific domains of the NPI-C with the Brief Psychiatric Rating Scale (BPRS), the Cohen-Mansfield Agitation Index (CMAI), the Cornell Scale for Depression in Dementia (CSDD), and the Apathy Inventory (AI). Results: Inter-rater reliability was strong for all NPI-C domains. There were high correlations between NPI-C/delusions and BPRS, NPI-C/apathy-indifference with the AI, NPI-C/depression-dysphoria with the CSDD, NPI-C/agitation with the CMAI, and NPI-C/aggression with the CMAI. There was moderate correlation between the NPI-C/aberrant vocalizations and CMAI and the NPI-C/hallucinations with the BPRS. Conclusion: The NPI-C is a comprehensive tool that provides accurate measurement of NPS in dementia with high concurrent validity and inter-rater reliability in the Brazilian setting. In addition to universal assessment, the NPI-C can be completed by individual domains.

