Allodynography: Reliability of a New Procedure for Objective Clinical Examination of Static Mechanical Allodynia

Tara L Packham; Claude J Spicher; Joy C MacDermid; Norman D Buckley

doi:10.1093/pm/pnz045

Allodynography: Reliability of a New Procedure for Objective Clinical Examination of Static Mechanical Allodynia

Pain Medicine ◽

10.1093/pm/pnz045 ◽

2019 ◽

Vol 21 (1) ◽

pp. 101-108

Author(s):

Tara L Packham ◽

Claude J Spicher ◽

Joy C MacDermid ◽

Norman D Buckley

Keyword(s):

Clinical Examination ◽

Clinical Assessment ◽

Mechanical Allodynia ◽

Intraclass Correlation ◽

Assessment Tools ◽

Measurement Properties ◽

Rater Reliability ◽

Retest Reliability ◽

Static Mechanical ◽

Test Retest Reliability

Abstract. Objective: There is a need for reliable and valid clinical assessment tools for quantifying allodynia in neuropathic pain. Allodynography has been proposed as a useful standardized procedure for clinical assessment of mechanical allodynia. This study (www.clinicaltrials.gov NCT02070367) undertook a preliminary investigation of the measurement properties of allodynography, a new standardized clinical examination procedure for mapping the area of cutaneous allodynia. Methods: Persons with pain in one upper extremity after complex regional pain syndrome, a peripheral nerve injury, or who had recently experienced a hand fracture were recruited for assessment of static mechanical allodynia (based on perception of a 15 g force stimulus delivered by Semmes-Weinstein monofilament #5.18 as painful) by two raters at baseline; the assessment was repeated one week later. Results: Single-measures estimates suggested inter-rater reliability for allodynography was excellent at an intraclass correlation coefficient (ICC) of 0.97 (N = 12); test–retest reliability was also excellent at ICC = 0.89 (N = 10) for allodynography (P < 0.001 for both). The lower bounds of the confidence intervals confirm inter-rater reliability as excellent (0.90) but were less definitive for test–retest reliability (0.59). Conclusions: This preliminary study supports the inter-rater and test–retest reliability of allodynography. Studies on larger samples in multiple contexts and reporting other measurement properties are warranted.
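
For readers unfamiliar with a single-measures ICC, the following is a minimal sketch of one common variant, ICC(2,1): two-way random effects, absolute agreement, single rater. The abstract does not state which ICC model the authors used, so the choice of model, the function name `icc_2_1`, and the example data are assumptions for illustration only.

```python
def icc_2_1(scores):
    """Single-measures ICC(2,1) from a table with one row per subject
    and one column per rater (assumed model: two-way random effects,
    absolute agreement)."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares for subjects, raters, and residual error
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical ratings: 3 subjects scored identically by 2 raters
print(icc_2_1([[1, 1], [2, 2], [3, 3]]))
```

With only 10 to 12 subjects, as in this study, the point estimate is best read together with its confidence interval, which is why the abstract reports the lower bounds.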

Validating a Series of Photo-Numeric Rating Scales for Use in Facial Aesthetics Using Statistical Analysis of Intra- and Inter-Rater Reliability

Aesthetic Surgery Journal Open Forum ◽

10.1093/asjof/ojab039 ◽

2021 ◽

Author(s):

Z Paul Lorenc ◽

Derek Jones ◽

Jeongyun Kim ◽

Hee Min Gwak ◽

Samixa Batham ◽

...

Keyword(s):

Clinical Significance ◽

Rating Scales ◽

Objective Assessment ◽

Intraclass Correlation ◽

Unmet Need ◽

Assessment Tools ◽

Facial Aesthetics ◽

Rater Reliability ◽

Retest Reliability ◽

Test Retest Reliability

Abstract. Background: Growing demand for minimally invasive aesthetic procedures to correct age-related facial changes and optimize facial proportions has been met with innovation, but has created an unmet need for objective assessment tools to evaluate results empirically. Objectives: The purpose of this study is to establish the intra- and inter-rater reliability of ordinal, photonumeric, 4- or 5-point rating scales for clinical use to assess facial aesthetics. Methods: Board-certified plastic surgeons and dermatologists (3 raters) performed live validation of rating scales for jawline contour, temple volume, chin retrusion, nasolabial folds, vertical perioral lip lines, midface volume loss, lip fullness, and crow’s feet (dynamic and at rest) over 2 rounds, 2 weeks apart. Subjects selected for live validation represented the range of scores and included 54-83 subjects for each scale. Test-retest reliability was quantified through intra- and inter-rater reliability, determined from the mean weighted kappa and Round 2 Intraclass Correlation Coefficients (ICC), respectively. The clinical significance of a one-grade difference was assessed through rater comparison of 31 pairs of side-by-side photographs of subjects with the same grade or a different grade on the developed scales. Results: The study demonstrated substantial to near-perfect intra-rater and inter-rater reliability of all scales when utilized by trained raters to assess a diverse group of live subjects. Furthermore, the clinical significance of a 1-point difference on all the developed scales was established. Conclusions: The high test-retest reliability and intuitive layout of these scales provide an objective approach with standardized ratings for clinical assessment of various facial features.

Validation of a menstrual pictogram and a daily bleeding diary for assessment of uterine fibroid treatment efficacy in clinical studies

Journal of Patient-Reported Outcomes ◽

10.1186/s41687-020-00263-0 ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Claudia Haberland ◽

Anna Filonenko ◽

Christian Seitz ◽

Matthias Börner ◽

Christoph Gerlinger ◽

...

Keyword(s):

Uterine Fibroid ◽

Full Range ◽

Intraclass Correlation ◽

Phase Iii ◽

Measurement Properties ◽

Retest Reliability ◽

Response Options ◽

Patient Global Impression ◽

Patient Reported ◽

Test Retest Reliability

Abstract. Background: To evaluate the psychometric and measurement properties of two patient-reported outcome instruments, the menstrual pictogram superabsorbent polymer-containing version 3 (MP SAP-c v3) and Uterine Fibroid Daily Bleeding Diary (UF-DBD). Test-retest reliability, criterion, construct validity, responsiveness, missingness and comparability of the MP SAP-c v3 and UF-DBD versus the alkaline hematin (AH) method and a patient global impression of severity (PGI-S) were analyzed in post hoc trial analyses. Results: Analyses were based on data from up to 756 patients. The full range of MP SAP-c v3 and UF-DBD response options was used, with score distributions reflecting the cyclic character of the disease. Test-retest reliability of MP SAP-c v3 and UF-DBD scores was supported by acceptable intraclass correlation coefficients when stability was defined by the AH method and Patient Global Impression of Severity (PGI-S) scores (0.80–0.96 and 0.42–0.94, respectively). MP SAP-c v3 and UF-DBD scores demonstrated strong and moderate-to-strong correlations with menstrual blood loss assessed by the AH method. Scores increased in monotonic fashion with greater disease severities, defined by the AH method and PGI-S scores; differences between groups were mostly statistically significant (P < 0.05). MP SAP-c v3 and UF-DBD were sensitive to changes in disease severity, defined by the AH method and PGI-S. MP SAP-c v3 and UF-DBD showed a lower frequency of missing patient data versus the AH method, and good agreement with the AH method. Conclusions: This evidence supports the use of the MP SAP-c v3 and UF-DBD to assess clinical efficacy endpoints in UF phase III studies replacing the AH method.

The Brain Metastases Symptom Checklist as a novel tool for symptom measurement in patients with brain metastases undergoing whole-brain radiotherapy

Current Oncology ◽

10.3747/co.23.2936 ◽

2016 ◽

Vol 23 (3) ◽

pp. 239 ◽

Cited By ~ 5

Author(s):

D. Rodin ◽

B. Banihashemi ◽

L. Wang ◽

A. Lau ◽

S. Harris ◽

...

Keyword(s):

Brain Metastases ◽

Intraclass Correlation ◽

Whole Brain Radiotherapy ◽

Assessment Tools ◽

Symptom Checklist ◽

Retest Reliability ◽

Brain Radiotherapy ◽

Test Retest Reliability ◽

The Brain ◽

Responsiveness To Change

Purpose: We evaluated the feasibility, reliability, and validity of the Brain Metastases Symptom Checklist (BMSC), a novel self-report measure of common symptoms experienced by patients with brain metastases. Methods: Patients with first-presentation symptomatic brain metastases (n = 137) referred for whole-brain radiotherapy (WBRT) completed the BMSC at time points before and after treatment. Their caregivers (n = 48) provided proxy ratings twice on the day of consultation to assess reliability, and at week 4 after WBRT to assess responsiveness to change. Correlations with 4 other validated assessment tools were evaluated. Results: The symptoms reported on the BMSC were largely mild to moderate, with tiredness (71%) and difficulties with balance (61%) reported most commonly at baseline. Test–retest reliability for individual symptoms had a median intraclass correlation of 0.59 (range: 0.23–0.85). Caregiver proxy and patient responses had a median intraclass correlation of 0.52. Correlation of absolute scores on the BMSC and other symptom assessment tools was low, but consistency in the direction of symptom change was observed. At week 4, change in symptoms was variable, with improvements in weight gain and sleep of 42% and 41% respectively, and worsening of tiredness and drowsiness of 62% and 59% respectively. Conclusions: The BMSC captures a wide range of symptoms experienced by patients with brain metastases, and it is sensitive to change. It demonstrated adequate test–retest reliability and face validity in terms of its responsiveness to change. Future research is needed to determine whether modifications to the BMSC itself or correlation with more symptom-specific measures will enhance validity.

Adaptation and Psychometric Evaluation of the Chinese Counseling Competencies Scale-Revised

Frontiers in Psychology ◽

10.3389/fpsyg.2021.688539 ◽

2021 ◽

Vol 12 ◽

Author(s):

Wei Xia ◽

William Ho Cheung Li ◽

Tingna Liang ◽

Yuanhui Luo ◽

Laurie Long Kwan Ho ◽

...

Keyword(s):

Concurrent Validity ◽

Convergent Validity ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Psychometric Evaluation ◽

Counseling Competencies ◽

Counselors In Training ◽

Rater Reliability ◽

Retest Reliability ◽

Test Retest Reliability

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest. The confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores of performance checklist (Pearson's γ = 0.781), indicating a large convergent validity, and knowledge on drug abuse (Pearson's γ = 0.833), indicating a moderate concurrent validity. Conclusion: The results support that the Chinese CCS-R is a valid and reliable measure of the counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.

Measuring test-retest reliability (TRR) of AMSTAR provides moderate to perfect agreement – a contribution to the discussion of the importance of TRR in relation to the psychometric properties of assessment tools

BMC Medical Research Methodology ◽

10.1186/s12874-021-01231-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Stefanie Bühn ◽

Peggy Ober ◽

Tim Mathes ◽

Uta Wegewitz ◽

Anja Jacobs ◽

...

Keyword(s):

Psychometric Properties ◽

Systematic Reviews ◽

Methodological Quality ◽

Assessment Tools ◽

Measurement Properties ◽

Perfect Agreement ◽

Retest Reliability ◽

Test Retest Reliability ◽

The Impact

Abstract. Background: Systematic Reviews (SRs) can build the groundwork for evidence-based health care decision-making. A sound methodological quality of SRs is crucial. AMSTAR (A Measurement Tool to Assess Systematic Reviews) is a widely used tool developed to assess the methodological quality of SRs of randomized controlled trials (RCTs). Research shows that AMSTAR seems to be valid and reliable in terms of interrater reliability (IRR), but the test-retest reliability (TRR) of AMSTAR has never been investigated. In our study we investigated the TRR of AMSTAR to evaluate the importance of its measurement and contribute to the discussion of the measurement properties of AMSTAR and other quality assessment tools. Methods: Seven raters at three institutions independently assessed the methodological quality of SRs in the field of occupational health with AMSTAR. Approximately two years elapsed between the first and second ratings. Answers were dichotomized, and we calculated the TRR of all raters and AMSTAR items using Gwet’s AC1 coefficient. To investigate the impact of variation in the ratings over time, we obtained summary scores for each review. Results: AMSTAR item 4 (Was the status of publication used as an inclusion criterion?) provided the lowest median TRR of 0.53 (moderate agreement). Perfect agreement of all reviewers was detected for AMSTAR item 1 (Gwet’s AC1 of 1). The median TRR of the single raters varied between 0.69 (substantial agreement) and 0.89 (almost perfect agreement). Variation of two or more points in yes-scored AMSTAR items was observed in 65% (73/112) of all assessments. Conclusions: The high variation between the first and second AMSTAR ratings suggests that consideration of the TRR is important when evaluating the psychometric properties of AMSTAR. However, more evidence is needed to investigate this neglected issue of measurement properties. Our results may initiate discussion of the importance of considering the TRR of assessment tools. A further examination of the TRR of AMSTAR, as well as other recently established rating tools such as AMSTAR 2 and ROBIS (Risk Of Bias In Systematic reviews), would be useful.
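
As an illustration of the agreement statistic named in this abstract, here is a minimal sketch of Gwet's AC1 for the simplest case: one rater's dichotomized (yes = 1, no = 0) answers at two time points. The function name `gwet_ac1` and the example answers are hypothetical, and the authors' actual computation (for example, any pooling across raters or items) is not described in the abstract.

```python
def gwet_ac1(first, second):
    """Gwet's AC1 for two sets of binary ratings (1 = yes, 0 = no),
    e.g. the same rater's first and second assessment of each item."""
    n = len(first)
    # Observed proportion of items rated identically on both occasions
    agree = sum(a == b for a, b in zip(first, second)) / n
    # Average proportion of "yes" answers across the two occasions
    pi_yes = (sum(first) + sum(second)) / (2 * n)
    # Chance agreement under Gwet's model for two categories
    chance = 2 * pi_yes * (1 - pi_yes)
    return (agree - chance) / (1 - chance)


# Hypothetical example: 4 items, one answer changes between ratings
print(gwet_ac1([1, 1, 1, 0], [1, 1, 0, 0]))
```

One reason AC1 is sometimes preferred over Cohen's kappa for data like these is that it stays defined when every answer is "yes" on both occasions, where kappa's denominator becomes zero.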

Development and reliability of the Korean version of the Feeding Abilities Assessment

Hong Kong Journal of Occupational Therapy ◽

10.1177/1569186119850694 ◽

2019 ◽

Vol 32 (1) ◽

pp. 69-74

Author(s):

Seul Gi Koo ◽

Hae Yean Park ◽

Jongbae Kim ◽

Areum Han

Keyword(s):

Correlation Coefficient ◽

Internal Consistency ◽

Content Validity ◽

Assessment Tool ◽

High Reliability ◽

Intraclass Correlation ◽

Rater Reliability ◽

Retest Reliability ◽

Korean Version ◽

Test Retest Reliability

Objective: The purpose of this study is to introduce a standardised assessment tool by verifying the reliability of the translated Korean version of the Feeding Abilities Assessment (K-FAA), which was developed to suit Korean culture. Methods: The research subjects were 65 patients with dementia living in nursing homes. The K-FAA was completed by verifying the suitability of translation and reverse translation. The validity of the K-FAA was established through content validity, while its reliability was analysed based on internal consistency reliability for the items, test–retest reliability and inter-rater reliability. Results: The content validity index, based on the assessment of professors, occupational therapists, and nurses, was more than .70. Cronbach’s α was more than .929, showing good internal consistency. A test–retest reliability of .884 was derived using Pearson’s correlation coefficient (p < .01), and an inter-rater reliability of .800 was derived using the kappa coefficients; the intraclass correlation coefficient was .897, which also indicated good reliability. Conclusion: The K-FAA was modified to fit the Korean domestic situation, and this assessment had high reliability. Therefore, the K-FAA can evaluate the feeding ability of patients with dementia. Future studies should focus on providing evidence-based data to maintain or supplement the feeding ability of patients with dementia in Korea.
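
The internal-consistency statistic reported above, Cronbach's α, can be sketched as follows. This is the standard textbook formula rather than the authors' code; the function name `cronbach_alpha` and the item scores are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a table with one row per respondent
    and one column per item."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 3 respondents, 2 items
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```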

Reliability and Validity of the Dyskinesia Impairment Scale in Children and Young Adults with Inherited or Idiopathic Dystonia

Journal of Clinical Medicine ◽

10.3390/jcm9082597 ◽

2020 ◽

Vol 9 (8) ◽

pp. 2597

Author(s):

Annika Danielsson ◽

Inti Vanmechelen ◽

Cecilia Lidbeck ◽

Lena Krumlinde-Sundholm ◽

Els Ortibus ◽

...

Keyword(s):

Young Adults ◽

Rating Scale ◽

Intraclass Correlation ◽

Reliability And Validity ◽

Children And Youth ◽

Rater Reliability ◽

Retest Reliability ◽

Children And Young Adults ◽

Test Retest Reliability ◽

Idiopathic Dystonia

Background: The Dyskinesia Impairment Scale (DIS) is a new assessment scale for dystonia and choreoathetosis in children and youth with dyskinetic cerebral palsy. Today, the Burke–Fahn–Marsden Dystonia Rating Scale (BFM) is mostly used to assess dystonia in children with inherited dystonia. The aim of this study was to assess reliability and validity of the DIS in children and youth with inherited or idiopathic dystonia. Methods: Reliability was measured by (1) the intraclass correlation coefficients (ICCs) for inter-rater and test-retest reliability, as well as (2) standard error of measurement (SEM) and minimal detectable difference (MDD). For concurrent validity of the DIS-dystonia subscale, the BFM was administered. Results: In total, 11 males and 9 females (median age 16 years and 7 months, range 6 to 24 years) were included. For inter-rater reliability, the ICCs for the DIS total score and the dystonia and choreoathetosis subscale scores were 0.83, 0.87, and 0.71, respectively. For test-retest reliability, the ICCs for the DIS total score and the dystonia and choreoathetosis subscale scores were 0.95, 0.88, and 0.93, respectively. The SEM and MDD for the total DIS were 3.98% and 11.04%, respectively. The Spearman correlation coefficient between the dystonia subscale and the BFM was 0.88 (p < 0.01). Conclusions: Good to excellent inter-rater, test-retest reliability, and validity were found for the total DIS and the dystonia subscale. The choreoathetosis subscale showed moderate inter-rater reliability and excellent test-retest reliability. The DIS may be a promising tool to assess dystonia and choreoathetosis in children and young adults with inherited or idiopathic dystonia.
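
The SEM and MDD figures above can be related through the conventional formulas SEM = SD × √(1 − ICC) and MDD = 1.96 × √2 × SEM. The sketch below assumes these conventional definitions at the 95% level; the abstract does not state which formulas the authors used, and the function names are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD
    and a reliability coefficient (conventional formula)."""
    return sd * math.sqrt(1.0 - icc)


def mdd95(sem):
    """Minimal detectable difference at the 95% level
    (conventional formula: 1.96 * sqrt(2) * SEM)."""
    return 1.96 * math.sqrt(2.0) * sem


# Using the SEM reported in the abstract (3.98%)
print(round(mdd95(3.98), 2))
```

Under these assumptions, an SEM of 3.98% gives an MDD of about 11.03%, close to the 11.04% reported; the small gap is consistent with the SEM having been rounded before publication.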

Comparative study of psychometric properties of three assessment tools for degenerative rotator cuff disease

Clinical Rehabilitation ◽

10.1177/0269215518796888 ◽

2018 ◽

Vol 33 (2) ◽

pp. 277-284 ◽

Cited By ~ 3

Author(s):

Etienne James-Belin ◽

Anne Laure Roy ◽

Sandra Lasbleiz ◽

Agnès Ostertag ◽

Alain Yelnik ◽

...

Keyword(s):

Rotator Cuff ◽

Psychometric Properties ◽

Intraclass Correlation ◽

Assessment Tools ◽

University Hospital ◽

Rotator Cuff Disease ◽

Retest Reliability ◽

Good For ◽

Test Retest Reliability ◽

Improvement Score

Objective: To compare psychometric properties of the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire, Shoulder Pain and Disability Index (SPADI) and Constant–Murley scale in patients with degenerative rotator cuff disease (DRCD). Design: Longitudinal cohort. Setting: One French university hospital. Methods: The scales were applied twice at a one-week interval before physiotherapy and once after physiotherapy two months later. The perceived improvement after treatment was self-assessed on a numerical scale (0–4). The test–retest reliability of the DASH, SPADI and Constant–Murley scales was assessed before treatment by the intraclass correlation coefficient (ICC). The responsiveness was assessed by the paired t-test (P < 0.05) and standardized mean difference (SMD). The correlation between the percentage of variation in scale scores and the self-assessed improvement score after treatment was measured by the Spearman coefficient. Results: Fifty-three patients were included. Only 26 were available for the reliability analysis. The test–retest reliability was very good for the DASH (ICC = 0.97), SPADI (0.95) and Constant–Murley (0.92). The scale score was improved after treatment for each scale (P < 0.05). The SMD was moderate for the DASH (0.56) and SPADI (0.56) scales, and small for the Constant–Murley (0.44). The correlation between the percentage of variation in scores and self-assessed improvement score after treatment was high, moderate and not significant for the SPADI (0.59, P < 0.0001), DASH (0.42, P < 0.01) and Constant–Murley scales, respectively. Conclusion: The test–retest reliability of the DASH, SPADI and Constant–Murley scales is very good for patients with DRCD. The highest responsiveness was achieved with the SPADI.

Patient Preference Assessment Reveals Disease Aspects Not Covered by Recommended Outcomes in Polymyositis and Dermatomyositis

ISRN Rheumatology ◽

10.5402/2011/463124 ◽

2011 ◽

Vol 2011 ◽

pp. 1-5 ◽

Cited By ~ 13

Author(s):

Li Alemo Munters ◽

Ronald F. van Vollenhoven ◽

Helene Alexanderson

Keyword(s):

Patient Preference ◽

Intraclass Correlation ◽

Health Assessment ◽

Preference Assessment ◽

Weighted Kappa ◽

Measurement Properties ◽

Retest Reliability ◽

Related Quality ◽

Health Related ◽

Test Retest Reliability

Objectives. Polymyositis (PM) and dermatomyositis (DM) are characterized by impaired muscle function with a majority of patients developing sustained disability. The aim of this study was to evaluate the patient’s individual priorities (patient preference) of disabilities most important to improve in PM/DM using the McMaster Toronto Arthritis Patient Preference Disability Questionnaire (MACTAR), to correlate the MACTAR to myositis outcomes and to evaluate its test-retest reliability. Methods. Twenty-eight patients with PM/DM performed recommended outcomes as well as the MACTAR, which was administered twice, one week apart. Results. Sexual activity, walking, biking, social activities, and sleep constituted the predominating disabilities. Seventy-two percent and 33% of the identified disabilities were not covered by items of the Health Assessment Questionnaire and the Myositis Activities Profile, respectively. Correlations between the MACTAR and health-related quality of life measures were −0.67–0.73, and correlations with measures of activities of daily living and participation in society were 0.51–0.60, with lower correlations for other outcomes. Intraclass correlation (ICC) and weighted kappa coefficients were 0.83 and 0.68, respectively, for test-retest reliability of the MACTAR. Conclusions. The MACTAR interview had promising measurement properties and identified patient preference disabilities in PM/DM that were not covered by recommended outcomes.

Reliability of Autism-Tics, AD/HD, and other Comorbidities (A–TAC) Inventory in a Test-Retest Design

Psychological Reports ◽

10.2466/03.15.pr0.114k10w1 ◽

2014 ◽

Vol 114 (1) ◽

pp. 93-103 ◽

Cited By ~ 15

Author(s):

Tomas Larson ◽

Eva Norén Selinus ◽

Clara Hellner Gumpert ◽

Thomas Nilsson ◽

Nóra Kerekes ◽

...

Keyword(s):

Intraclass Correlation ◽

Correlation Coefficients ◽

Population Based ◽

Autism Spectrum ◽

Good Test ◽

Rater Reliability ◽

Retest Reliability ◽

Intraclass Correlation Coefficients ◽

Intraclass Correlations ◽

Test Retest Reliability

The Autism-Tics, AD/HD, and other Comorbidities (A–TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A–TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A–TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's κ. A–TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A–TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's κ indicated acceptable reliability. The current study provides statistical evidence that the A–TAC yields good test-retest reliability in a population-based cohort of children.
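
Cohen's κ, used above alongside the intraclass correlations, corrects observed agreement for the agreement expected by chance. The following is a minimal sketch of the unweighted statistic for two sets of categorical ratings; the function name `cohen_kappa` and the example screening outcomes are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two sets of categorical ratings
    of the same cases (e.g. test and retest)."""
    n = len(first)
    # Observed proportion of cases given the same category both times
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal category frequencies
    c1, c2 = Counter(first), Counter(second)
    categories = set(first) | set(second)
    expected = sum(c1[c] * c2[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical screen-positive (1) / screen-negative (0) outcomes
print(cohen_kappa([1, 1, 1, 0], [1, 1, 0, 0]))
```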
