HoNOS–ABI: a reliable outcome measure of neuropsychiatric sequelae to brain injury?

2005
Vol 29 (2)
pp. 53-55
Author(s):
S. Fleminger
E. Leigh
P. Eames
L. Langrell
R. Nagraj
...  

Aims and Method: The Health of the Nation Outcome Scale for Acquired Brain Injury (HoNOS–ABI) is a relatively new outcome measure designed to assess the neuropsychiatric sequelae of brain damage. This study investigated the interrater reliability of the scale. Fifty patients with traumatic brain injury receiving rehabilitation were each rated twice on the HoNOS–ABI, by two different raters; there were 24 raters in total.
Results: Weighted kappa values ranged from 0.43 to 0.84, and intraclass correlation coefficients from 0.58 to 0.97, for the ten items assessed, indicating moderate to substantial agreement on all items.
Clinical Implications: The scale measured the items of interest consistently across different raters, indicating that the HoNOS–ABI is a reliable outcome measure when applied by different raters in routine clinical practice.
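The weighted kappa reported here penalizes disagreements between two raters by their distance apart on the ordinal scale. A minimal pure-Python sketch, for illustration only (not the study's analysis code; categories are assumed to be coded as integers 0..n_categories−1):

```python
from collections import Counter

def weighted_kappa(rater1, rater2, n_categories, weights="linear"):
    """Weighted kappa for two raters scoring the same cases on an
    ordinal scale with categories 0..n_categories-1."""
    n = len(rater1)
    # observed joint proportions, and chance-expected proportions from marginals
    obs = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater1, rater2):
        obs[a][b] += 1.0 / n
    m1, m2 = Counter(rater1), Counter(rater2)
    exp = [[(m1[i] / n) * (m2[j] / n) for j in range(n_categories)]
           for i in range(n_categories)]
    # disagreement weights grow with distance from the diagonal
    power = 1 if weights == "linear" else 2
    w = [[abs(i - j) ** power for j in range(n_categories)]
         for i in range(n_categories)]
    observed = sum(w[i][j] * obs[i][j]
                   for i in range(n_categories) for j in range(n_categories))
    expected = sum(w[i][j] * exp[i][j]
                   for i in range(n_categories) for j in range(n_categories))
    return 1.0 - observed / expected
```

A value of 1.0 indicates perfect agreement and 0.0 agreement no better than chance; by the common Landis and Koch benchmarks, 0.41–0.60 is moderate and 0.61–0.80 substantial, which matches the abstract's characterization of its 0.43–0.84 range.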

2010
Vol 11 (2)
pp. 113-124
Author(s):
Elizabeth Davis
Jane Galvin
Cheryl Soo

Abstract
Introduction: The ability to use both hands to interact with objects is required in daily activities and is therefore important to measure in clinical practice. The Assisting Hand Assessment (AHA) is unique in evaluating the function of a child or youth's assisting hand by observing the spontaneous manipulation of objects during bimanual activity. The AHA was developed for children with unilateral motor impairment and shows strong psychometric properties when used with children who have cerebral palsy (CP) or obstetric brachial plexus palsy (OBPP). The AHA is currently used in clinical practice with children who have an acquired brain injury (ABI); however, there is limited research on the measurement properties of its use with this population.
Objectives: The study aimed to determine the interrater and intrarater reliability of the AHA for children and youth with unilateral motor impairment following ABI.
Methods: For interrater reliability, two occupational therapists (OT1 and OT2) independently rated the same 26 children and youth. For intrarater reliability, OT2 conducted a second assessment of the 26 participants 1 week later. Associations between item scores on the AHA were analysed using weighted kappa (Kw), while intraclass correlation coefficients (ICCs) were used for domain and total scores.
Results: The AHA items demonstrated good to excellent intrarater reliability (Kw = 0.67–1.00). Interrater reliability was good to excellent (Kw = 0.60–0.84) for 20 of the 22 items of the AHA. Interrater and intrarater reliability coefficients for all domain and total scores were in the excellent range (ICC = 0.85–0.99).
Conclusion: The current study indicates that the AHA shows good interrater and intrarater reliability when used with the paediatric ABI population. Findings provide preliminary support for the continued use of the AHA for children and youth with acquired hemiplegia.


1991
Vol 34 (5)
pp. 989-999
Author(s):
Stephanie Shaw
Truman E. Coggins

This study examines whether observers reliably categorize selected speech production behaviors in hearing-impaired children. A group of experienced speech-language pathologists was trained to score the elicited imitations of 5 profoundly and 5 severely hearing-impaired subjects using the Phonetic Level Evaluation (Ling, 1976). Interrater reliability was calculated using intraclass correlation coefficients. Overall, the magnitude of the coefficients was found to be considerably below what would be accepted in published behavioral research. Failure to obtain acceptably high levels of reliability suggests that the Phonetic Level Evaluation may not yet be an accurate and objective speech assessment measure for hearing-impaired children.
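The intraclass correlation coefficients used in this and several of the studies collected here compare between-subject variance to rater and error variance. For a fully crossed design (every rater scores every subject), the two-way random-effects, absolute-agreement, single-measures form ICC(2,1) of Shrout and Fleiss (1979) can be sketched in plain Python; this is an illustrative implementation, not the analysis code of any study above:

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    `ratings` is a list of rows, one per subject, each holding one score
    per rater (all raters score all subjects)."""
    n = len(ratings)        # subjects
    k = len(ratings[0])     # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # mean squares for subjects (rows), raters (columns), and residual error
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum((ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Values near 1 indicate that nearly all score variance reflects true subject differences rather than rater disagreement or noise; other ICC forms (consistency rather than absolute agreement, or average measures) use the same mean squares combined differently.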


2018
Vol 25 (3)
pp. 286-290
Author(s):
Elif Bilgic
Madoka Takao
Pepa Kaneva
Satoshi Endo
Toshitatsu Takao
...  

Background. Needs assessment identified a gap regarding laparoscopic suturing skills targeted in simulation. This study collected validity evidence for an advanced laparoscopic suturing task using an Endo Stitch™ device. Methods. Experienced (ES) and novice surgeons (NS) performed continuous suturing after watching an instructional video. Scores were based on time and accuracy and on the Global Operative Assessment of Laparoscopic Surgery. Data are shown as medians [25th-75th percentiles] (ES vs NS). Interrater reliability was calculated using intraclass correlation coefficients (confidence interval). Results. Seventeen participants were enrolled. Experienced surgeons had significantly greater task (980 [964-999] vs 666 [391-711], P = .0035) and Global Operative Assessment of Laparoscopic Surgery scores (25 [24-25] vs 14 [12-17], P = .0029). Interrater reliability coefficients for time and accuracy were 1.0 and 0.9 (0.74-0.96), respectively. All experienced surgeons agreed that the task was relevant to practice. Conclusion. This study provides validity evidence for the task as a measure of laparoscopic suturing skill using an automated suturing device. It could help trainees acquire the skills they need to better prepare for clinical learning.


1997
Vol 17 (4)
pp. 280-287
Author(s):
Margaret Wallen
Mary-Ann Bonney
Lyn Lennox

The Handwriting Speed Test (HST), a standardized, norm-referenced test, was developed to provide an objective evaluation of the handwriting speed of school students from approximately 8 to 18 years of age. Part of the test development involved an examination of interrater reliability. Two raters scored 165 (13%) of the total 1292 handwriting samples. Using intraclass correlation coefficients, the interrater reliability was found to be excellent (ICC=1.00, P<0.0001). The process of examining interrater reliability resulted in modification to the scoring criteria of the test. Excellent interrater reliability provides support for the HST as a valuable clinical and research tool.


2013
Vol 5 (2)
pp. 252-256
Author(s):
Hans B. Kersten
John G. Frohna
Erin L. Giudice

Abstract Background Competence in evidence-based medicine (EBM) is an important clinical skill. Pediatrics residents are expected to acquire competence in EBM during their education, yet few validated tools exist to assess residents' EBM skills. Objective We sought to develop a reliable tool to evaluate residents' EBM skills in the critical appraisal of a research article, the development of a written EBM critically appraised topic (CAT) synopsis, and a presentation of the findings to colleagues. Methods Instrument development used a modified Delphi technique. We defined the skills to be assessed while reviewing (1) a written CAT synopsis and (2) a resident's EBM presentation. We defined skill levels for each item using the Dreyfus and Dreyfus model of skill development and created behavioral anchors using a frame-of-reference training technique to describe performance for each skill level. We evaluated the assessment instrument's psychometric properties, including internal consistency and interrater reliability. Results The EBM Critically Appraised Topic Presentation Evaluation Tool (EBM C-PET) is composed of 14 items that assess residents' EBM and global presentation skills. Resident presentations (N = 27) and the corresponding written CAT synopses were evaluated using the EBM C-PET. The EBM C-PET had excellent internal consistency (Cronbach α = 0.94). Intraclass correlation coefficients were used to assess interrater reliability. Intraclass correlation coefficients for individual items ranged from 0.31 to 0.74; the average intraclass correlation coefficient across the 14 items was 0.67. Conclusions We identified essential components of an assessment tool for an EBM CAT synopsis and presentation with excellent internal consistency and a good level of interrater reliability across 3 different institutions. The EBM C-PET is a reliable tool to document resident competence in higher-level EBM skills.
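The internal-consistency statistic reported here, Cronbach's α, is k/(k−1) times one minus the ratio of summed item variances to the variance of the total score. A brief illustrative sketch (not the authors' code; items are assumed to be lists of scores aligned across respondents):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a k-item instrument.

    `items` is a list of k lists, one per item, each holding that item's
    score for every respondent (same respondent order in every list)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # total score per respondent, summed across items
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(item) for item in items) / var(totals))
```

When items covary strongly, the total-score variance dwarfs the summed item variances and α approaches 1; α near 0 indicates items that do not hang together as a single scale.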


2019
Vol 5 (2)
pp. 294-323
Author(s):  
Charles Nagle

Abstract Researchers have increasingly turned to Amazon Mechanical Turk (AMT) to crowdsource speech data, predominantly in English. Although AMT and similar platforms are well positioned to enhance the state of the art in L2 research, it is unclear if crowdsourced L2 speech ratings are reliable, particularly in languages other than English. The present study describes the development and deployment of an AMT task to crowdsource comprehensibility, fluency, and accentedness ratings for L2 Spanish speech samples. Fifty-four AMT workers who were native Spanish speakers from 11 countries participated in the ratings. Intraclass correlation coefficients were used to estimate group-level interrater reliability, and Rasch analyses were undertaken to examine individual differences in rater severity and fit. Excellent reliability was observed for the comprehensibility and fluency ratings, but indices were slightly lower for accentedness, leading to recommendations to improve the task for future data collection.


2002
Vol 96 (5)
pp. 1129-1139
Author(s):
Jason Slagle
Matthew B. Weinger
My-Than T. Dinh
Vanessa V. Brumer
Kevin Williams

Background Task analysis may be useful for assessing how anesthesiologists alter their behavior in response to different clinical situations. In this study, the authors examined the intraobserver and interobserver reliability of an established task analysis methodology. Methods During 20 routine anesthetic procedures, a trained observer sat in the operating room and categorized the anesthetist's activities in real time into 38 task categories. Two weeks later, the same observer performed task analysis from videotapes obtained intraoperatively. A different observer performed task analysis from the videotapes on two separate occasions. Data were analyzed for percent of time spent on each task category, average task duration, and number of task occurrences. Rater reliability and agreement were assessed using intraclass correlation coefficients. Results Intrarater reliability was generally good for categorization of percent time on task and task occurrence (mean intraclass correlation coefficients of 0.84-0.97). There was a comparably high concordance between real-time and video analyses. Interrater reliability was generally good for percent time and task occurrence measurements. However, the interrater reliability of the task duration metric was unsatisfactory, primarily because of the technique used to capture multitasking. Conclusions A task analysis technique used in anesthesia research for several decades showed good intrarater reliability. Off-line analysis of videotapes is a viable alternative to real-time data collection. Acceptable interrater reliability requires the use of strict task definitions, sophisticated software, and rigorous observer training. New techniques must be developed to more accurately capture multitasking. Substantial effort is required to conduct task analyses that will have sufficient reliability for purposes of research or clinical evaluation.


2000
Vol 80 (2)
pp. 168-178
Author(s):
Suh-Fang Jeng
Kuo-Inn Tsou Yau
Li-Chiou Chen
Shu-Fang Hsiao

Abstract Background and Purpose. The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Subjects. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) of the AIMS. Methods. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Results. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97–.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). Conclusion and Discussion. The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.


2012
Vol 92 (9)
pp. 1197-1207
Author(s):
Parminder K. Padgett
Jesse V. Jacobs
Susan L. Kasser

Background The Balance Evaluation Systems Test (BESTest) and Mini-BESTest are clinical examinations of balance impairment, but the tests are lengthy and the Mini-BESTest is theoretically inconsistent with the BESTest. Objective The purpose of this study was to generate an alternative version of the BESTest that is valid, reliable, time efficient, and founded upon the same theoretical underpinnings as the original test. Design This was a cross-sectional study. Methods Three raters evaluated 20 people with and without a neurological diagnosis. Test items with the highest item-section correlations defined the new Brief-BESTest. The validity of the BESTest, the Mini-BESTest, and the new Brief-BESTest to identify people with or without a neurological diagnosis was compared. Interrater reliability of the test versions was evaluated by intraclass correlation coefficients. Validity was further investigated by determining the ability of each version of the examination to identify the fall status of a second cohort of 26 people with and without multiple sclerosis. Results Items of hip abductor strength, functional reach, one-leg stance, lateral push-and-release, standing on foam with eyes closed, and the Timed “Up & Go” Test defined the Brief-BESTest. Intraclass correlation coefficients for all examination versions were greater than .98. The accuracy of identifying people from the first cohort with or without a neurological diagnosis was 78% for the BESTest versus 72% for the Mini-BESTest or Brief-BESTest. The sensitivity to fallers from the second cohort was 100% for the Brief-BESTest, 71% for the Mini-BESTest, and 86% for the BESTest, and all versions exhibited specificity of 95% to 100% to identify nonfallers. Limitations Further testing is needed to improve the generalizability of findings. 
Conclusions Although preliminary, the Brief-BESTest demonstrated reliability comparable to that of the Mini-BESTest and potentially superior sensitivity while requiring half the items of the Mini-BESTest and representing all theoretically based sections of the original BESTest.


2015
Vol 95 (5)
pp. 758-766
Author(s):
Diane U. Jette
Mary Stilphen
Vinoth K. Ranganathan
Sandra Passek
Frederick S. Frost
...  

Background: The interrater reliability of 2 new inpatient functional short-form measures, Activity Measure for Post-Acute Care (AM-PAC) “6-Clicks” basic mobility and daily activity scores, has yet to be established.
Objective: The purpose of this study was to examine the interrater reliability of AM-PAC “6-Clicks” measures.
Design: A prospective observational study was conducted.
Methods: Four pairs of physical therapists rated basic mobility and 4 pairs of occupational therapists rated daily activity of patients in 1 of 4 hospital services. One therapist in a pair was the primary therapist directing the assessment while the other therapist observed. Each therapist was unaware of the other's AM-PAC “6-Clicks” scores. Reliability was assessed with intraclass correlation coefficients (ICCs), Bland-Altman plots, and weighted kappa.
Results: The ICCs for the overall reliability of basic mobility and daily activity were .849 (95% confidence interval [CI]=.784, .895) and .783 (95% CI=.696, .847), respectively. The ICCs for the reliability of each pair of raters ranged from .581 (95% CI=.260, .789) to .960 (95% CI=.897, .983) for basic mobility and from .316 (95% CI=−.061, .611) to .907 (95% CI=.801, .958) for daily activity. The weighted kappa values for item agreement ranged from .492 (95% CI=.382, .601) to .712 (95% CI=.607, .816) for basic mobility and from .251 (95% CI=.057, .445) to .751 (95% CI=.653, .848) for daily activity. Mean differences between raters' scores were near zero.
Limitations: Raters were from one health system. Each pair of raters assessed different patients in different services.
Conclusions: The ICCs for AM-PAC “6-Clicks” total scores were very high. Levels of agreement varied across pairs of raters, from large to nearly perfect for physical therapists and from moderate to nearly perfect for occupational therapists. Levels of agreement for individual item scores ranged from small to very large.
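The Bland-Altman analysis used here alongside ICCs summarizes agreement between paired scores as the mean difference (bias) plus 95% limits of agreement at ±1.96 standard deviations of the differences. A brief illustrative sketch (not the study's code):

```python
def bland_altman_limits(scores_a, scores_b):
    """Mean difference (bias) and 95% limits of agreement for paired
    scores from two raters."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    # sample standard deviation of the rater differences
    sd = (sum((d - bias) ** 2 for d in diffs) / (n - 1)) ** 0.5
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias near zero with narrow limits, as reported above, means neither rater scores systematically higher and individual disagreements stay small; the Bland-Altman plot is simply each pair's difference plotted against its mean with these three horizontal lines overlaid.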

