Competency-based simulation assessment of resuscitation skills in emergency medicine postgraduate trainees – a Canadian multi-centred study

2016 ◽  
Vol 7 (1) ◽  
pp. e57-e67 ◽  
Author(s):  
J. Damon Dagnone ◽  
Andrew K. Hall ◽  
Stefanie Sebok-Syer ◽  
Don Klinger ◽  
Karen Woolfrey ◽  
...  

Background: The use of high-fidelity simulation is emerging as a desirable method for competency-based assessment in postgraduate medical education. We aimed to demonstrate the feasibility and validity of a multi-centre simulation-based Objective Structured Clinical Examination (OSCE) of resuscitation competence with Canadian Emergency Medicine (EM) trainees. Methods: EM postgraduate trainees (n=98) from five Canadian academic centres participated in a high-fidelity, 3-station simulation-based OSCE. Expert panels of three emergency physicians evaluated trainee performances at each centre using the Queen’s Simulation Assessment Tool (QSAT). Intraclass correlation coefficients were used to measure inter-rater reliability, and analysis of variance was used to measure the discriminatory validity of each scenario. A fully crossed generalizability study was also conducted for each examination centre. Results: Inter-rater reliability in four of the five centres was strong, with a median absolute intraclass correlation coefficient (ICC) across centres and scenarios of 0.89 [0.65-0.97]. Discriminatory validity was also strong (p < 0.001 for scenarios 1 and 3; p < 0.05 for scenario 2). Generalizability studies found significant variations at two of the study centres. Conclusions: This study demonstrates the successful pilot administration of a multi-centre, 3-station simulation-based OSCE for the assessment of resuscitation competence in postgraduate Emergency Medicine trainees.
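The inter-rater reliability statistic used throughout these studies, the intraclass correlation coefficient, can be computed from a two-way ANOVA decomposition of the subjects-by-raters score matrix. A minimal sketch of the absolute-agreement, single-rater form, ICC(2,1); the function name and scores below are illustrative, not the study's data or code:

```python
import numpy as np

def icc2_1(X):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    X is an (n_subjects, k_raters) matrix of scores.
    """
    n, k = X.shape
    grand = X.mean()
    row_means = X.mean(axis=1)   # per-subject means
    col_means = X.mean(axis=0)   # per-rater means
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_err = np.sum((X - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # mean square, subjects
    msc = ss_cols / (k - 1)                 # mean square, raters
    mse = ss_err / ((n - 1) * (k - 1))      # mean square, error
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For a 98-trainee by 3-rater score matrix, the same decomposition yields the kind of per-scenario ICCs reported above.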

CJEM ◽  
2016 ◽  
Vol 18 (S1) ◽  
pp. S97-S98 ◽  
Author(s):  
C. Hagel ◽  
A.K. Hall ◽  
D. Klinger ◽  
G. McNeil ◽  
D. Dagnone

Introduction: The use of high-fidelity simulation is emerging as an effective method for competency-based assessment in postgraduate medical education. We have previously reported the development of the Queen’s Simulation Assessment Tool (QSAT) for use in simulation-based Objective Structured Clinical Examinations (OSCEs) for Emergency Medicine (EM) trainees. We aimed to demonstrate the feasibility and present an argument for the validity of a simulation-based OSCE utilizing the QSAT with EM residents from multiple Canadian training sites. Methods: EM post-graduate trainees (PGY 2-5) from 9 Canadian EM training programs participated in an 8-station simulation-based resuscitation OSCE at Queen’s University in Kingston, ON. Each station was scored by a single trained rater from a group of 9 expert Canadian EM physicians. Raters utilized a station-specific QSAT and provided an Entrustment Score. A post-examination questionnaire was administered to the trainees to quantify perceived realism, comfort and educational impact. Statistical analyses included analysis of variance to measure the discriminatory capabilities and a generalizability study to examine the sources of variability in the scores. Results: EM postgraduate trainees (N=36) participated in the study. Discriminatory validity was strong, with senior trainees (PGY4-5) outperforming junior trainees (PGY2-3) in 6 of 8 scenarios and in aggregated QSAT and Entrustment Scores across all 8 stations (p<0.01). Generalizability studies found that the largest sources of random variability were the trainee-by-station interaction and the error term, with a G coefficient of 0.84. Resident trainees reported reasonable comfort being assessed in the simulation environment (3.6/5), indicated significant perceived realism (4.1/5), and found the OSCE valuable to their learning (4.8/5).
Conclusion: Overall, this study demonstrates that a large-scale simulation-based EM resuscitation OSCE is feasible, and an argument has been presented for the validity of such an examination. The incorporation of simulation or a simulation-based OSCE in the national certification process in EM may help to satisfy the increased demand for competency-based assessment required by the Royal College of Physicians & Surgeons of Canada’s Competency by Design transition.
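The senior-versus-junior discrimination reported in these OSCE studies is the kind of comparison a one-way ANOVA tests. A self-contained sketch of the F statistic; the group scores are invented for illustration and are not the study data:

```python
import numpy as np

def one_way_anova_F(*groups):
    """F statistic for a one-way ANOVA across any number of score groups."""
    groups = [np.asarray(g, float) for g in groups]
    all_scores = np.concatenate(groups)
    grand = all_scores.mean()
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical aggregated station scores for two seniority cohorts.
junior = [62.0, 58.0, 65.0, 60.0]
senior = [78.0, 81.0, 74.0, 80.0]
F = one_way_anova_F(junior, senior)
```

With two groups this reduces to the squared two-sample t statistic; the p-value would then come from the F distribution with (k−1, n−k) degrees of freedom.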


2016 ◽  
Vol 2 (1) ◽  
pp. 12-18 ◽  
Author(s):  
Francesca Innocenti ◽  
Elena Angeli ◽  
Andrea Alesi ◽  
Margherita Scorpiniti ◽  
Riccardo Pini

Background: Teamwork training has been included in several emergency medicine (EM) curricula; the aim of this study was to compare the performance of different scales for teamwork evaluation during simulation for EM residents. Methods: In the period October 2013–June 2014, we performed bimonthly high-fidelity simulation sessions with novice (years I–III, group 1 (G1)) and senior (years IV–V, group 2 (G2)) EM residents; scenarios were designed to simulate the management of critical patients. Videos were assessed by three independent raters with the following scales: Emergency Team Dynamics (ETD), Clinical Teamwork Scale (CTS) and Team Emergency Assessment Measure (TEAM). In the period March–June, after each scenario, participants completed the CTS and ETD. Results: The analysis, based on 18 sessions, showed good internal consistency and good to fair inter-rater reliability for the three scales (TEAM, CTS, ETD: Cronbach's α 0.954, 0.954, 0.921; intraclass correlation coefficients (ICC) 0.921, 0.917, 0.608). Single CTS items achieved highly significant ICC results, with 12 of the 13 items achieving ICC ≥0.70; a similar result was confirmed for 4 of the 11 TEAM items and 1 of the 8 ETD items. Spearman's r was 0.585 between ETD and CTS, 0.694 between ETD and TEAM, and 0.634 between TEAM and CTS (scales converted to percentages, all p<0.0001). Participants gave themselves a better evaluation compared with external raters (CTS: 101±9 vs 90±9; ETD: 25±3 vs 20±5, all p<0.0001). Conclusions: All examined scales demonstrated good internal consistency, with slightly better inter-rater reliability for CTS compared with the other tools.
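Internal-consistency figures like the Cronbach's α values reported for TEAM, CTS and ETD follow from a simple variance ratio over the item scores. A minimal sketch; the score matrix in the test is invented for illustration:

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for an (n_observations, k_items) score matrix."""
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)          # variance of each item
    total_var = X.sum(axis=1).var(ddof=1)      # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```

Values near 1 (such as the 0.95 figures above) mean the items covary strongly, i.e. the scale behaves as a coherent whole.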


2019 ◽  
Vol 26 (12) ◽  
pp. 1-7
Author(s):  
Adam Marco Galloway ◽  
Edward C Killan ◽  
Gretl A McHugh

Background/Aims Falls are a significant cause of hospital admissions in the UK and require clinically reasoned intervention from the multidisciplinary team to ensure the patient receives an effective and efficient treatment, including physiotherapy. This study aimed to assess the inter-rater reliability of the Performance Oriented Mobility Assessment in patients who had recently undergone brain surgery. Methods A prospective inter-rater reliability study involving 18 male and 12 female patients aged between 27 and 87 years who had recently undergone brain surgery was conducted. Three raters of varying clinical physiotherapy experience assessed participants using the Performance Oriented Mobility Assessment on an acute neurosurgical ward. Inter-rater reliability was measured using Bland–Altman plots and intraclass correlation coefficients. Results Bland–Altman plots and intraclass correlation coefficient values demonstrated excellent inter-rater reliability, regardless of the age and sex of the patients or the clinical experience of the rater. Conclusions Results suggest that the Performance Oriented Mobility Assessment is a potentially useful tool for assessing patients, particularly for the risk of falls, following brain surgery. Future research is needed to determine other clinimetric properties of this outcome measure before wider implementation.
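Bland–Altman analysis, as used in this study, summarises agreement between two raters as the mean difference (bias) and 95% limits of agreement around it. A minimal sketch under the usual 1.96-SD convention; the scores are illustrative, not the study's data:

```python
import numpy as np

def limits_of_agreement(scores_a, scores_b):
    """Bland-Altman bias and 95% limits of agreement for two raters' scores."""
    diff = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Plotting each pair's difference against its mean, with these three horizontal lines, gives the Bland–Altman plot the authors report alongside the ICCs.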


CJEM ◽  
2012 ◽  
Vol 14 (03) ◽  
pp. 139-146 ◽  
Author(s):  
Andrew Koch Hall ◽  
William Pickett ◽  
Jeffrey Damon Dagnone

ABSTRACT Objective: We sought to develop and validate a three-station simulation-based Objective Structured Clinical Examination (OSCE) tool to assess emergency medicine resident competency in resuscitation scenarios. Methods: An expert panel of emergency physicians developed three scenarios for use with high-fidelity mannequins. For each scenario, a corresponding assessment tool was developed with an essential actions (EA) checklist and a global assessment score (GAS). The scenarios were (1) unstable ventricular tachycardia, (2) respiratory failure, and (3) ST elevation myocardial infarction. Emergency medicine residents were videotaped completing the OSCE, and three clinician experts independently evaluated the videotapes using the assessment tool. Results: Twenty-one residents completed the OSCE (nine residents in the College of Family Physicians of Canada–Emergency Medicine [CCFP-EM] program, six junior residents in the Fellow of the Royal College of Physicians of Canada–Emergency Medicine [FRCP-EM] program, six senior residents in the FRCP-EM). Interrater reliability for the EA scores was good but varied between scenarios (Spearman rho = 0.68, 0.81, and 0.41 for scenarios 1, 2, and 3, respectively). Interrater reliability for the GAS was also good, with less variability (rho = 0.64, 0.56, and 0.62). When comparing GAS scores, senior FRCP residents outperformed CCFP-EM residents in all scenarios and junior residents in two of three scenarios (p < 0.001 to 0.01). Based on EA scores, senior FRCP residents outperformed CCFP-EM residents, but junior residents outperformed senior FRCP residents in scenario 1 and CCFP-EM residents in all scenarios (p = 0.006 to 0.04). Conclusions: This study outlines the creation of a high-fidelity simulation assessment tool for trainees in emergency medicine. A single-point GAS demonstrated stronger relational validity and more consistent reliability in comparison with an EA checklist.
This preliminary work will provide a foundation for the future development of simulation-based assessment tools.


2017 ◽  
Vol 4 (2) ◽  
pp. 59-64 ◽  
Author(s):  
Annemarie F Fransen ◽  
M Beatrijs van der Hout-van der Jagt ◽  
Roxane Gardner ◽  
Manuela Capelle ◽  
Sebastiaan P Oei ◽  
...  

Introduction: To achieve expert performance of care teams, adequate simulation-based team training courses with an effective instructional design are essential. As the importance of the instructional design becomes ever more clear, an objective assessment tool would be valuable for educators and researchers. We therefore aimed to develop an evidence-based and objective assessment tool for the evaluation of the instructional design of simulation-based team training courses. Methods: A validation study in which we developed an assessment tool containing an evidence-based questionnaire with a Visual Analogue Scale (VAS) and a visual chart directly translating the results of the questionnaire. Psychometric properties of the assessment tool were tested using five descriptions of simulation-based team training courses. An expert-opinion-based ranking from poor to excellent was obtained. Ten independent raters assessed the five training courses twice, using the developed questionnaire, with an interval of 2 weeks. Validity and reliability analyses were performed by comparing the raters’ scores with the expert ranking. Usability was assessed by an 11-item survey. Results: A 42-item questionnaire, using VAS, and a propeller chart were developed. The correlation between the expert-opinion-based ranking and the evaluators’ scores (Spearman correlation) was 0.95, and the variance due to subjectivity of raters was 3.5% (VTraining*Rater). The G-coefficient was 0.96. The inter-rater reliability (intraclass correlation coefficient (ICC)) was 0.91 (95% CI 0.77 to 0.99), and intra-rater reliability for the overall score (ICC) ranged from 0.91 to 0.99. Conclusions: We developed an evidence-based and reliable assessment tool for the evaluation of the instructional design of simulation-based team training: the ID-SIM. The ID-SIM is available as a free mobile application.


2021 ◽  
Vol 12 ◽  
Author(s):  
Wei Xia ◽  
William Ho Cheung Li ◽  
Tingna Liang ◽  
Yuanhui Luo ◽  
Laurie Long Kwan Ho ◽  
...  

Objectives: This study conducted a linguistic and psychometric evaluation of the Chinese Counseling Competencies Scale-Revised (CCS-R). Methods: The Chinese CCS-R was created from the original English version using a standard forward-backward translation process. The psychometric properties of the Chinese CCS-R were examined in a cohort of 208 counselors-in-training by two independent raters. Fifty-three counselors-in-training were asked to undergo another counseling performance evaluation for the test-retest assessment. Confirmatory factor analysis (CFA) was conducted for the Chinese CCS-R, followed by analyses of internal consistency, test-retest reliability, inter-rater reliability, convergent validity, and concurrent validity. Results: The results of the CFA supported the factorial validity of the Chinese CCS-R, with adequate construct replicability. The scale had a McDonald's omega of 0.876, and intraclass correlation coefficients of 0.63 and 0.90 for test-retest reliability and inter-rater reliability, respectively. Significantly positive correlations were observed between the Chinese CCS-R score and scores on a performance checklist (Pearson's r = 0.781), indicating large convergent validity, and on knowledge of drug abuse (Pearson's r = 0.833), indicating moderate concurrent validity. Conclusion: The results support the Chinese CCS-R as a valid and reliable measure of counseling competencies. Practice implication: The CCS-R provides trainers with a reliable tool to evaluate counseling students' competencies and to facilitate discussions with trainees about their areas for growth.


2018 ◽  
Vol 63 (4) ◽  
pp. 453-460 ◽  
Author(s):  
Vahid Abdollah ◽  
Eric C. Parent ◽  
Michele C. Battié

Abstract Degenerated discs have a shorter T2-relaxation time and lower MR signal. The location of the signal-intensity-weighted centroid reflects the water distribution within a region of interest (ROI). This study compared the reliability of the location of the signal-intensity-weighted centroid to mean signal intensity and area measurements. L4-L5 and L5-S1 discs were measured on 43 mid-sagittal T2-weighted 3T MRI images in adults with back pain. One rater analysed the images twice and another once, blinded to prior measurements. Discs were semi-automatically segmented into a whole disc, nucleus, anterior annulus and posterior annulus. The coordinates of the signal-intensity-weighted centroid for all regions demonstrated excellent intraclass correlation coefficients for intra-rater (0.99–1.00) and inter-rater reliability (0.97–1.00). The standard error of measurement for the Y-coordinates of the signal-intensity-weighted centroid was 0 for all ROIs at both levels, and ranged from 0 to 2.7 mm for the X-coordinates. The mean signal intensity and area for the whole disc and nucleus presented excellent intra-rater reliability, with intraclass correlation coefficients from 0.93 to 1.00, and 0.92 to 1.00 for inter-rater reliability. The mean signal intensity and area had lower reliability for the annulus ROIs, with intra-rater intraclass correlation coefficients from 0.5 to 0.76 and inter-rater from 0.33 to 0.58. The location of the signal-intensity-weighted centroid is a reliable biomarker for investigating the effects of disc interventions.
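The signal-intensity-weighted centroid described above is the intensity-weighted mean position of the voxels in an ROI. A minimal sketch of that computation; the array layout and function name are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def signal_weighted_centroid(roi):
    """(x, y) centroid of a 2D ROI, weighted by voxel signal intensity."""
    roi = np.asarray(roi, float)
    ys, xs = np.indices(roi.shape)   # row (y) and column (x) coordinates
    total = roi.sum()
    return (xs * roi).sum() / total, (ys * roi).sum() / total
```

Because brighter voxels pull the centroid toward them, shifts in this point track redistribution of water within the disc, which is why it can be more stable than hand-dependent area or boundary measures.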


CJEM ◽  
2016 ◽  
Vol 18 (S1) ◽  
pp. S62-S62 ◽  
Author(s):  
L.B. Chartier ◽  
S. Vaillancourt ◽  
M. McGowan ◽  
K. Dainty ◽  
A.H. Cheng

Introduction: The Canadian Medical Education Directives for Specialists (CanMEDS) framework defines the competencies that postgraduate medical education programs must cover for resident physicians. The 2015 iteration of the CanMEDS framework emphasizes Quality Improvement and Patient Safety (QIPS), given their role in the provision of high value and cost-effective care. However, the opinion of Emergency Medicine (EM) program directors (PDs) regarding the need for QIPS curricula is unknown, as is the current level of knowledge of EM residents in QIPS principles. We therefore sought to determine the need for a QIPS curriculum for EM residents in a Canadian Royal College EM program. Methods: We developed a national multi-modal needs assessment. This included a survey of all Royal College EM residency PDs across Canada, as well as an evaluative assessment of baseline QIPS knowledge of 30 EM residents at the University of Toronto (UT). The resident evaluation was done using the validated Revised QI Knowledge Application Tool (QIKAT-R), which evaluates an individual’s ability to decipher a systematic quality problem from short clinical scenarios and to propose change initiatives for improvement. Results: Eight of the 13 (62%) PDs responded to the survey, unanimously agreeing that QIPS should be a formal part of residency training. However, challenges identified included the lack of qualified and available faculty to develop and teach QIPS material. 30 of 30 (100%) residents spanning three cohorts completed the QIKAT-R. The median overall score was 11 out of 27 points (IQR 9-14), demonstrating poor baseline QIPS knowledge amongst residents. Conclusion: QIPS is felt to be a necessary part of residency training, but the lack of available and qualified faculty makes developing and implementing such a curriculum challenging. Residents at UT consistently performed poorly on a validated QIPS assessment tool, confirming the need for a formal QIPS curriculum.
We are now developing a longitudinal, evidence-based QIPS curriculum that trains both residents and faculty to contribute to QI projects at the institution level.


2019 ◽  
Vol 91 (1) ◽  
pp. 75-81 ◽  
Author(s):  
Leonhard A Bakker ◽  
Carin D Schröder ◽  
Harold H G Tan ◽  
Simone M A G Vugts ◽  
Ruben P A van Eijk ◽  
...  

Objective: The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) is widely applied to assess disease severity and progression in patients with motor neuron disease (MND). The objective of this study was to assess the inter-rater and intra-rater reproducibility, i.e., the inter-rater and intra-rater reliability and agreement, of a self-administration version of the ALSFRS-R for use in apps, online platforms, clinical care and trials. Methods: The self-administration version of the ALSFRS-R was developed based on both patient and expert feedback. To assess inter-rater reproducibility, 59 patients with MND filled out the ALSFRS-R online and were subsequently assessed on the ALSFRS-R by three raters. To assess intra-rater reproducibility, patients were invited on two occasions to complete the ALSFRS-R online. Reliability was assessed with intraclass correlation coefficients, agreement was assessed with Bland-Altman plots and paired-samples t-tests, and internal consistency was examined with Cronbach’s coefficient alpha. Results: The self-administration version of the ALSFRS-R demonstrated excellent inter-rater and intra-rater reliability. The assessment of inter-rater agreement demonstrated small systematic differences between patients and raters and acceptable limits of agreement. The assessment of intra-rater agreement demonstrated no systematic changes between time points; limits of agreement were 4.3 points for the total score and ranged from 1.6 to 2.4 points for the domain scores. Coefficient alpha values were acceptable. Discussion: The self-administration version of the ALSFRS-R demonstrates high reproducibility and can be used in apps and online portals both for individual comparisons, facilitating the management of clinical care, and for group comparisons in clinical trials.


Author(s):  
Alexandra de Raadt ◽  
Matthijs J. Warrens ◽  
Roel J. Bosker ◽  
Henk A. L. Kiers

Abstract: Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen’s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are the intraclass correlation ICC(3,1), Pearson’s correlation, Spearman’s rho, and Kendall’s tau-b. The primary goal is to provide a thorough understanding of these coefficients so that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of coefficient matters. We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods and simulated and empirical data. Using analytical methods, it is shown that differences between quadratically weighted kappa and the Pearson and intraclass correlations increase as agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase as agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached the same conclusion as with the correlation coefficients in a great number of cases. Hence, for the data in this study, it does not much matter which of these five coefficients is used. Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.
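Weighted kappa, the family of coefficients this study compares, penalizes each disagreement by the normalised distance between the two ratings, squared in the quadratic variant. A minimal sketch for two raters on an ordinal scale coded 0..k−1; the function name and data are illustrative:

```python
import numpy as np

def weighted_kappa(r1, r2, n_cat, quadratic=True):
    """Weighted Cohen's kappa for two raters on an ordinal 0..n_cat-1 scale."""
    O = np.zeros((n_cat, n_cat))          # observed joint proportions
    for a, b in zip(r1, r2):
        O[a, b] += 1
    O /= O.sum()
    E = np.outer(O.sum(axis=1), O.sum(axis=0))   # chance-expected proportions
    i, j = np.indices((n_cat, n_cat))
    d = np.abs(i - j) / (n_cat - 1)       # normalised category distance
    W = d ** 2 if quadratic else d        # quadratic vs. linear weights
    return 1 - (W * O).sum() / (W * E).sum()
```

With `quadratic=True` this is the variant the authors find behaves most like the Pearson and intraclass correlations; setting it to `False` gives linearly weighted kappa, and replacing `W` by `(d > 0)` recovers unweighted Cohen's kappa.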

