Navicular Drop Measurement in People With Rheumatoid Arthritis: Interrater and Intrarater Reliability

Joseph A Shrader; John M Popovich; G Chris Gracey; Jerome V Danoff

doi:10.1093/ptj/85.7.656

Physical Therapy ◽

10.1093/ptj/85.7.656 ◽

2005 ◽

Vol 85 (7) ◽

pp. 656-664 ◽

Cited By ~ 27

Author(s):

Joseph A Shrader ◽

John M Popovich ◽

G Chris Gracey ◽

Jerome V Danoff

Keyword(s):

Rheumatoid Arthritis ◽

Physical Therapists ◽

Physical Therapist ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Afternoon Session ◽

Intrarater Reliability ◽

Navicular Drop ◽

Physical Therapist Student ◽

And Function

Abstract Background and Purpose. Navicular drop (ND) measurement may be a valuable examination technique for patients with rheumatoid arthritis (RA). However, no data exist on reliability for this technique in patients with RA. The purposes of this study were: (1) to determine interrater and intrarater reliability of ND measurements in people with RA, (2) to compare ND values of people with RA with published normative data, and (3) to investigate ND measurement error associated with the use of skin markings. Subjects. Ten women (20 feet) with RA consented to participate. Methods. Patients completed demographic and function questionnaires. Navicular height (NH) measurements were taken by 2 physical therapists and 1 physical therapist student, following four 1-hour training sessions, using standardized methods and a digital height gauge. Four different NH measurements were taken 3 times on each foot by each of the 3 examiners during a morning session and then repeated during an afternoon session on the same day. Navicular drop values were calculated, including ND1 (as reported in the literature), ND2 (compensating for skin error), and ND3 (single-limb stance). Intraclass correlation coefficients (ICCs) and standard errors of measurement (SEMs) were used to establish reliability. Results. Means (±SD) for each ND measure for sessions 1 and 2, respectively, were as follows: ND1=8.36±5.29 mm and 8.29±5.24 mm, ND2=9.95±5.44 mm and 9.57±5.37 mm. The ICCs (2,1 and 2,k, respectively) for all interrater measurements ranged from .67 to .92 (SEM=2.0–3.3 mm) and from .85 to .97 (SEM=1.1–2.0 mm). The ICCs (2,1 and 2,k, respectively) for intrarater measurements ranged from .73 to .95 (SEM=1.3–2.8 mm) and from .90 to .98 (SEM=0.7–1.6 mm). Paired t tests showed the means of ND1 and ND2 for each examiner and for both sessions were significantly different. Discussion and Conclusion. The results suggest that ND measurements for people with RA can be taken reliably by clinicians with varied experience. The ND values for our subjects were slightly greater than reported normal values of 6 to 8 mm. Error associated with skin markings was statistically significant for all sessions and examiners.
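
The abstract above reports ICC(2,1), ICC(2,k), and SEM values without showing how they are computed. As a rough illustration only, the sketch below uses the standard two-way random-effects mean squares for the two ICC forms, and assumes SEM = SD × sqrt(1 − ICC) with SD taken over all scores. The abstract does not state which SEM formula the authors used, so that part is an assumption. The function names and the example values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def icc_two_way_random(scores):
    """Return (ICC(2,1), ICC(2,k)) for a subjects-by-raters table.

    scores: list of rows, one row per subject (e.g. per foot),
    one column per rater. Hypothetical helper, not the authors' code.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square

    icc_2_1 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_2_k = (msr - mse) / (msr + (msc - mse) / n)
    return icc_2_1, icc_2_k


def standard_error_of_measurement(scores, icc):
    """SEM = SD * sqrt(1 - ICC); SD over all scores is an assumption."""
    sd = stdev(x for row in scores for x in row)
    return sd * sqrt(1 - icc)


# Hypothetical ND values in mm: 5 feet measured by 3 examiners.
example = [
    [8.1, 8.9, 7.6],
    [12.4, 13.0, 11.8],
    [4.2, 5.1, 4.0],
    [9.7, 9.1, 10.3],
    [15.0, 14.2, 15.9],
]
single, average = icc_two_way_random(example)
print(single, average, standard_error_of_measurement(example, single))
```

ICC(2,k) describes the reliability of the mean of k raters, so for the same data it should not fall below ICC(2,1) when the single-rater value is positive, which matches the pattern of the ranges reported in the abstract.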

Reliability of the Function in Sitting Test (FIST)

Rehabilitation Research and Practice ◽

10.1155/2014/593280 ◽

2014 ◽

Vol 2014 ◽

pp. 1-6 ◽

Cited By ~ 4

Author(s):

Sharon L. Gorman ◽

Monica Rivera ◽

Lise McCarthy

Keyword(s):

Interrater Reliability ◽

Physical Therapists ◽

Physical Therapist ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Inpatient Rehabilitation ◽

Neurological Dysfunction ◽

Retest Reliability ◽

Reliability Characteristics ◽

Test Retest Reliability

The function in sitting test (FIST) is a newly developed, performance-based measure examining deficits in seated postural control. The FIST has been shown to be internally consistent and valid in persons with neurological dysfunction, but intra- and interrater reliability and test-retest reliability have not been previously described. Seven patients with chronic neurologic dysfunction were tested and videotaped performing the FIST on two consecutive days. Seventeen acute care and inpatient rehabilitation physical therapist raters scored six of the videotaped performances of the FIST on two occasions at least 2 weeks apart. Intraclass correlation coefficients were used to calculate the test-retest and intra- and interrater reliability of the FIST. An ICC of 0.97 (95% CI 0.847–0.995) indicated excellent test-retest reliability of the FIST. Intra- and interrater reliability was also excellent with ICCs of 0.99 (95% CI 0.994–0.997) and 0.99 (95% CI 0.988–0.994), respectively. Physical therapists and other rehabilitation professionals can confidently use the FIST in a variety of clinical practice and research settings due to its favorable reliability characteristics. More studies are needed to describe the responsiveness and minimal clinically important level of change in FIST scores to further enhance clinical usefulness of this measure.

Reliability of Measurement of Maximal Isometric Lateral Trunk-Flexion Strength in Athletes Using Handheld Dynamometry

Journal of Sport Rehabilitation ◽

10.1123/jsr.2012.tr6 ◽

2012 ◽

Vol 21 (4) ◽

Cited By ~ 2

Author(s):

Bram L. Newman ◽

Courtney L. Pollock ◽

Michael A. Hunt

Keyword(s):

Interrater Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Trunk Flexion ◽

Intrarater Reliability ◽

Force Output ◽

Test Occasion ◽

Maximum Effort ◽

And Function ◽

Flexion Strength

Context: Lateral trunk-flexion strength is an important determinant of overall trunk stability and function, but the reliability in measuring this outcome clinically in athletic individuals is not known. Objective: To determine the interrater and intrarater reliability of lateral trunk-flexion strength measurement in athletic individuals using handheld dynamometry. Design: Reliability study. Setting: Research laboratory. Participants: 12 healthy, athletic individuals. Intervention: Lateral trunk-flexion strength was measured using handheld dynamometry across 2 different trunk placements (lateral aspect of the axilla and laterally at the level of the midtrunk) and 2 testing occasions by 2 therapists. Three maximum-effort trials during a "make test" at each placement were completed for each therapist on both occasions. Main Outcome Measures: Maximum force output was identified and converted to a torque. Intraclass correlation coefficients (ICC(2,1)) were calculated for each dynamometer placement, therapist, and test occasion to determine intrarater and interrater reliability. Results: Intrarater reliability was moderate to good (ICC(2,1) = .53-.77), while interrater reliability was good to very good (ICC(2,1) = .79-.81) at the axilla position. For the midtrunk position, intrarater reliability was good to very good (ICC(2,1) = .80-.86), while interrater reliability was good on both days (ICC(2,1) = .87-.88). Finally, the standard errors of measurement were low for the axilla position (0.20 Nm/kg; 95% CI .15, .28) and midtrunk position (0.09 Nm/kg; 95% CI .07, .12). Conclusions: Maximum lateral trunk-flexion strength can be reliably measured in athletic individuals with greater overall strength. Based on the 2 positions used in this study, measurement with a dynamometer placement at the midtrunk may be more reliable than that obtained at the axilla.
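
The abstract above says that maximum force output was converted to a torque, and it reports errors in Nm/kg. A minimal sketch of one way such a conversion could work is given below. It assumes torque is the peak force across trials multiplied by a lever arm, then divided by body mass. The lever-arm definition, the function name, and the example numbers are assumptions for illustration and are not described in the abstract.

```python
def normalized_peak_torque(trial_forces_n, lever_arm_m, body_mass_kg):
    """Peak force (N) times lever arm (m), divided by body mass (kg).

    Hypothetical helper: the abstract does not say how the lever arm
    was measured or how trials were combined.
    """
    peak_force = max(trial_forces_n)   # best of the maximum-effort trials
    torque_nm = peak_force * lever_arm_m
    return torque_nm / body_mass_kg    # result in Nm/kg


# Hypothetical example: three trials at one dynamometer placement.
print(normalized_peak_torque([182.0, 195.5, 190.2], 0.42, 74.0))
```

Normalizing by body mass makes values comparable across participants of different size, which is presumably why the reported errors carry Nm/kg units.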

Development of a Musculoskeletal Imaging Competency Examination for Physical Therapists

Physical Therapy ◽

10.1093/ptj/pzaa154 ◽

2020 ◽

Vol 100 (12) ◽

pp. 2254-2265

Author(s):

Troy Burley ◽

Lori T Brody ◽

William G Boissonnault ◽

Michael D Ross

Keyword(s):

Delphi Method ◽

Rating Scale ◽

Physical Therapists ◽

Physical Therapist ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

The United States ◽

Care Utilization ◽

Delphi Panel ◽

Cronbach Alpha

Abstract Objective The number of physical therapists with imaging ordering privileges is increasing; however, a known level of competency and knowledge is generally lacking within the profession, as is a method to determine practitioner competency. The purpose of this study was to develop a valid musculoskeletal (MSK) imaging competency examination for physical therapists. Methods This 3-round Delphi method study utilized experts to reach consensus on examination content and development. Round 1 was completed by 37 experts. The last 2 rounds were completed by 35 experts. Experts rated questions on a 5-point Likert rating scale of importance (1 = not at all important, 5 = very important). Consensus was achieved with an a priori decision of (1) >75% agreement of the expert panel rating and ≥4 on the Likert scale, and (2) ≥.90 on Cronbach alpha and intraclass correlation coefficients. Experts recommended a passing score of 75%. The examination was subsequently reviewed by a panel of 5 radiologists. Results The Delphi method and radiologist panel review resulted in the 151-question Burley Readiness Examination (BRE) for MSK Imaging Competency. Interrater agreement and internal consistency of the Delphi panel were excellent, with an average intraclass correlation coefficient and Cronbach alpha of .928 and .950, respectively. Conclusions The BRE is a tool that has the potential to demonstrate practitioners’ level of baseline competency with MSK imaging. Additional testing among physical therapists will provide further validation and reliability of the examination. Impact The use and application of diagnostic imaging is becoming more widespread in physical therapist practice throughout the United States. The BRE could potentially have broader implications for health care utilization and cost in the area of MSK imaging.
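
The consensus rule in the abstract above combines a rating threshold with Cronbach alpha. The sketch below shows one possible reading: an item passes the rating criterion when more than 75% of experts rate it 4 or higher, and Cronbach alpha is computed with the usual variance formula. How the study arranged its data (which dimension counts as "items") is not stated, so the table layout, function names, and example ratings here are assumptions.

```python
from statistics import variance


def meets_rating_criterion(ratings, threshold=4, required_share=0.75):
    """True when more than `required_share` of ratings are >= threshold.

    One interpretation of the '>75% agreement ... >=4' rule; hypothetical.
    """
    share = sum(r >= threshold for r in ratings) / len(ratings)
    return share > required_share


def cronbach_alpha(rows):
    """Cronbach alpha for a table with one row per respondent
    and one column per item (layout is an assumption)."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical Likert ratings (1-5) from five experts for one question.
print(meets_rating_criterion([5, 4, 4, 5, 3]))

# Hypothetical table: 4 respondents by 3 items.
print(cronbach_alpha([[4, 5, 4], [3, 4, 3], [5, 5, 5], [2, 3, 3]]))
```

Note that exactly 75% of ratings at 4 or above would not pass under this reading, because the abstract specifies a strict ">75%".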

Reliability and Change in Erosion Measurements by High-Resolution peripheral Quantitative Computed Tomography in a Longitudinal Dataset of Rheumatoid Arthritis Patients

The Journal of Rheumatology ◽

10.3899/jrheum.191391 ◽

2020 ◽

pp. jrheum.191391 ◽

Cited By ~ 1

Author(s):

Stephanie Finzel ◽

Sarah L. Manske ◽

Cheryl Barnabe ◽

Andrew J. Burghardt ◽

Hubert Marotte ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Computed Tomography ◽

High Resolution ◽

Intraclass Correlation ◽

Quantitative Computed Tomography ◽

Correlation Coefficients ◽

Peripheral Quantitative Computed Tomography ◽

Good Reliability ◽

Change Over Time ◽

Over Time

Objective The aim of this multi-reader exercise was to assess the reliability and change over time of erosion measurements in rheumatoid arthritis (RA) patients using high-resolution peripheral quantitative computed tomography (HR-pQCT). Methods HR-pQCT scans of 23 patients with RA were assessed at baseline and 12 months. Four experienced readers examined the dorsal, palmar, radial, and ulnar surfaces of the metacarpal head (MH) and phalangeal base (PB) of the 2nd and 3rd digits, blinded to time order. In total, 368 surfaces (23 patients × 16 surfaces) were evaluated per time point to characterize cortical breaks as pathological (erosion) or physiological, and to quantify erosion width and depth. Reliability was evaluated by intraclass correlation coefficients (ICC), percentage agreement, and Light’s kappa; change over time was defined by means ± SD of erosion numbers and dimensions. Results ICCs for the mean measurements of width and depth of the pathological breaks ranged from 0.819 to 0.883 and from 0.771 to 0.907, respectively. Most physiological cortical breaks were found at the palmar PB, whereas most pathological cortical breaks were located at the radial MH. There was a significant increase in both the numbers and the dimensions of erosions between baseline and follow-up (p=0.0001 for erosion numbers, width, and depth in axial plane, and p=0.001 for depth in perpendicular plane). Conclusion This exercise confirmed good reliability of HR-pQCT erosion measurements and their ability to detect change over time.
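
The abstract above mentions Light's kappa for the multi-reader classification of cortical breaks. Light's kappa is the mean of Cohen's kappa over all pairs of readers, and the sketch below illustrates that definition. The function names and the example labels are hypothetical and do not come from the study.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two readers' labels on the same surfaces."""
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n           # observed agreement
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in labels)
    if p_exp == 1.0:
        return 1.0  # both readers used one identical label throughout
    return (p_obs - p_exp) / (1 - p_exp)


def light_kappa(labels_by_reader):
    """Mean of pairwise Cohen's kappa across all reader pairs."""
    pairs = list(combinations(labels_by_reader, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical labels for six surfaces from three readers.
reader_1 = ["path", "path", "phys", "phys", "path", "phys"]
reader_2 = ["path", "phys", "phys", "phys", "path", "phys"]
reader_3 = ["path", "path", "phys", "path", "path", "phys"]
print(light_kappa([reader_1, reader_2, reader_3]))
```

Unlike plain percentage agreement, kappa subtracts the agreement expected by chance, so it can be noticeably lower when one category dominates.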

Infant with Clefts Observation Outcomes Instrument (iCOO): A New Outcome for Infants and Young Children with Orofacial Clefts

The Cleft Palate-Craniofacial Journal ◽

10.1177/10556656211040307 ◽

2021 ◽

pp. 105566562110403

Author(s):

Todd C. Edwards ◽

Carrie L. Heike ◽

Kathleen A. Kapp-Simon ◽

Salene M. Jones ◽

Brian G. Leroux ◽

...

Keyword(s):

Cleft Lip ◽

Intraclass Correlation ◽

Primary Caregivers ◽

Correlation Coefficients ◽

Measurement Properties ◽

Cross Sectional ◽

Scale Scores ◽

Health Domains ◽

And Function ◽

The Impact

Objective We evaluated the measurement properties for item and domain scores of the Infant with Clefts Observation Outcomes Instrument (iCOO). Design Cross-sectional (before lip surgery) and longitudinal study (preoperative baseline and 2 days and 2 months after lip surgery). Setting Three academic craniofacial centers and national online advertisements. Participants Primary caregivers with an infant with cleft lip with or without cleft palate (CL ± P) scheduled to undergo primary lip repair. There were 133 primary caregivers at baseline, 115 at 2 days postsurgery, and 112 at 2 months postsurgery. Main Outcome Measure(s) Caregiver observation items (n = 61) and global impression of health and function items (n = 8) across eight health domains. Results Mean age at surgery was 6.0 months (range 2.7-11.8 months). Five of eight iCOO domains have scale scores, with Cronbach’s alphas ranging from 0.67 to 0.87. Except for the Facial Skin and Mouth domain, iCOO scales had acceptable intraclass correlation coefficients (ICCs) ranging from 0.76 to 0.84. The internal consistency of the Global Impression items across all domains was 0.90 and had acceptable ICCs (range 0.76-0.91). Sixteen out of 20 (nonscale) items had acceptable ICCs (range 0.66-0.96). As anticipated, iCOO scores 2 days postoperatively were generally lower than baseline and scores 2 months postsurgery were consistent with baseline or higher. The iCOO took approximately 10 min to complete. Conclusions The iCOO meets measurement standards and may be used for assessing the impact of cleft-related treatments in clinical research and care. More research is needed on its use in various treatment contexts.

Dynamic Footprint Measurement Collection Technique and Intrarater Reliability

Journal of the American Podiatric Medical Association ◽

10.7547/1020130 ◽

2012 ◽

Vol 102 (2) ◽

pp. 130-138 ◽

Cited By ~ 17

Author(s):

Jeanna M. Fascione ◽

Ryan T. Crews ◽

James S. Wrobel

Keyword(s):

Repeated Measures ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Healthy Population ◽

Intrarater Reliability ◽

Intraclass Correlation Coefficients ◽

Foot Posture ◽

Arch Index ◽

Post Hoc ◽

Dynamic Footprint

Background: Identifying the variability of footprint measurement collection techniques and the reliability of footprint measurements would assist with appropriate clinical foot posture appraisal. We sought to identify relationships between these measures in a healthy population. Methods: On 30 healthy participants, midgait dynamic footprint measurements were collected using an ink mat, paper pedography, and electronic pedography. The footprints were then digitized, and the following footprint indices were calculated with photo digital planimetry software: footprint index, arch index, truncated arch index, Chippaux-Smirak Index, and Staheli Index. Differences between techniques were identified with repeated-measures analysis of variance with post hoc test of Scheffe. In addition, to assess practical similarities between the different methods, intraclass correlation coefficients (ICCs) were calculated. To assess intrarater reliability, footprint indices were calculated twice on 10 randomly selected ink mat footprint measurements, and the ICC was calculated. Results: Dynamic footprint measurements collected with an ink mat significantly differed from those collected with paper pedography (ICC, 0.85–0.96) and electronic pedography (ICC, 0.29–0.79), regardless of the practical similarities noted with ICC values (P = .00). Intrarater reliability for dynamic ink mat footprint measurements was high for the footprint index, arch index, truncated arch index, Chippaux-Smirak Index, and Staheli Index (ICC, 0.74–0.99). Conclusions: Footprint measurements collected with various techniques demonstrate differences. Interchangeable use of exact values without adjustment is not advised. Intrarater reliability of a single method (ink mat) was found to be high. (J Am Podiatr Med Assoc 102(2): 130–138, 2012)

Assessment of the Intrarater and Interrater Reliability of an Established Clinical Task Analysis Methodology

Anesthesiology ◽

10.1097/00000542-200205000-00016 ◽

2002 ◽

Vol 96 (5) ◽

pp. 1129-1139 ◽

Cited By ~ 46

Author(s):

Jason Slagle ◽

Matthew B. Weinger ◽

My-Than T. Dinh ◽

Vanessa V. Brumer ◽

Kevin Williams

Keyword(s):

Real Time ◽

Task Analysis ◽

Interrater Reliability ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Intrarater Reliability ◽

Intraclass Correlation Coefficients ◽

Percent Time ◽

Analysis Methodology ◽

And Task

Background Task analysis may be useful for assessing how anesthesiologists alter their behavior in response to different clinical situations. In this study, the authors examined the intraobserver and interobserver reliability of an established task analysis methodology. Methods During 20 routine anesthetic procedures, a trained observer sat in the operating room and categorized in real-time the anesthetist's activities into 38 task categories. Two weeks later, the same observer performed task analysis from videotapes obtained intraoperatively. A different observer performed task analysis from the videotapes on two separate occasions. Data were analyzed for percent of time spent on each task category, average task duration, and number of task occurrences. Rater reliability and agreement were assessed using intraclass correlation coefficients. Results Intrarater reliability was generally good for categorization of percent time on task and task occurrence (mean intraclass correlation coefficients of 0.84-0.97). There was a comparably high concordance between real-time and video analyses. Interrater reliability was generally good for percent time and task occurrence measurements. However, the interrater reliability of the task duration metric was unsatisfactory, primarily because of the technique used to capture multitasking. Conclusions A task analysis technique used in anesthesia research for several decades showed good intrarater reliability. Off-line analysis of videotapes is a viable alternative to real-time data collection. Acceptable interrater reliability requires the use of strict task definitions, sophisticated software, and rigorous observer training. New techniques must be developed to more accurately capture multitasking. Substantial effort is required to conduct task analyses that will have sufficient reliability for purposes of research or clinical evaluation.

Alberta Infant Motor Scale: Reliability and Validity When Used on Preterm Infants in Taiwan

Physical Therapy ◽

10.1093/ptj/80.2.168 ◽

2000 ◽

Vol 80 (2) ◽

pp. 168-178 ◽

Cited By ~ 76

Author(s):

Suh-Fang Jeng ◽

Kuo-Inn Tsou Yau ◽

Li-Chiou Chen ◽

Shu-Fang Hsiao

Keyword(s):

Preterm Infants ◽

Interrater Reliability ◽

Physical Therapist ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Correlation Coefficients ◽

Intraclass Correlation Coefficients ◽

Scale Reliability ◽

Scale Scores ◽

Acceptable Reliability

Abstract Background and Purpose. The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Subjects. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) for the AIMS. Methods. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Results. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97–.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). Conclusion and Discussion. The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.

Biomechanical Response to Acute Stretching in Dancers and Non-Dancers

Journal of Dance Medicine & Science ◽

10.12678/1089-313x.24.1.12 ◽

2020 ◽

Vol 24 (1) ◽

pp. 12-18

Author(s):

Bárbara Pessali-Marques ◽

Gustavo H.C. Peixoto ◽

Christian E.T. Cabido ◽

André Gustavo P. Andrade ◽

Sara A. Rodrigues ◽

...

Keyword(s):

Undergraduate Students ◽

Repeated Measures ◽

Mechanical Response ◽

Physical Therapists ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Knee Extension ◽

Young Males ◽

Resistance Torque ◽

Post Test

This study aimed to investigate the biomechanical response of the hamstring muscles to acute stretching in dancers (D) compared to non-dancers (ND). Maximal range of motion (ROMMax) and stiffness of the hamstrings were assessed in 46 young males: 23 undergraduate students (ND) and 23 professional dancers (D). Ages of the two groups were 21.5 ± 0.60 years (D) and 27.5 ± 0.98 years (ND). Testing was performed in two sessions, familiarization with procedures in the first session and the tests themselves (pre- and post-test and intervention) in the second, with a 24- to 48-hour interval between. The pre-test consisted of three trials of passive knee extension to the point of increased tension in the hamstrings, defined as ROMMax. The resistance torque recorded at ROMMax was defined as torqueMax. Six 30-second constant torque stretches were performed at 100% of the torqueMax reached in the pre-test in one lower limb only (intervention), with the contralateral limb used as control. The torque measured at an identical ROM before (pre-test) and after (post-test) the intervention was defined as torqueROM, and represented stiffness in this study. Reliability of the ROMMax, torqueMax, and torqueROM was assessed via intraclass correlation coefficients (ICC(3,k)) and standard error of the measurements (SEM). Comparisons between dancers and non-dancers and between control and intervention conditions for all dependent variables were performed using repeated-measures ANOVA followed by Tukey post hoc comparisons to highlight any interaction. The submaximal stretch intensity applied caused torqueROM to decrease in both D and ND groups (p < 0.01), indicating a decrease in stiffness, but no difference between the groups was found. A significantly greater increase in ROMMax was found for the D group compared to the ND group (p < 0.01), suggesting that other aspects in addition to MTU biomechanical adaptations may have played a role in the ROMMax increase, especially for the D group. Further research is needed to explore what those other adaptations are. Meanwhile, coaches and physical therapists should be aware that dancers may require different stretch training protocols than non-dancers.

Reliability, Internal Consistency, and Validity of Data Obtained With the Functional Gait Assessment

Physical Therapy ◽

10.1093/ptj/84.10.906 ◽

2004 ◽

Vol 84 (10) ◽

pp. 906-918 ◽

Cited By ~ 267

Author(s):

Diane M Wrisley ◽

Gregory F Marchetti ◽

Diane K Kuharsky ◽

Susan L Whitney

Keyword(s):

Internal Consistency ◽

Concurrent Validity ◽

Physical Therapist ◽

Rank Order ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Vestibular Disorders ◽

Intraclass Correlation Coefficients ◽

Gait Assessment ◽

Functional Gait

Background and Purpose. The Functional Gait Assessment (FGA) is a 10-item gait assessment based on the Dynamic Gait Index. The purpose of this study was to evaluate the reliability, internal consistency, and validity of data obtained with the FGA when used with people with vestibular disorders. Subjects. Seven physical therapists from various practice settings, 3 physical therapist students, and 6 patients with vestibular disorders volunteered to participate. Methods. All raters were given 10 minutes to review the instructions, the test items, and the grading criteria for the FGA. The 10 raters concurrently rated the performance of the 6 patients on the FGA. Patients completed the FGA twice, with an hour's rest between sessions. Reliability of total FGA scores was assessed using intraclass correlation coefficients (2,1). Internal consistency of the FGA was assessed using the Cronbach alpha and confirmatory factor analysis. Concurrent validity was assessed using the correlation of the FGA scores with balance and gait measurements. Results. Intraclass correlation coefficients of .86 and .74 were found for interrater and intrarater reliability of the total FGA scores. Internal consistency of the FGA scores was .79. Spearman rank order correlation coefficients of the FGA scores with balance measurements ranged from .11 to .67. Discussion and Conclusion. The FGA demonstrates what we believe is acceptable reliability, internal consistency, and concurrent validity with other balance measures used for patients with vestibular disorders.
