A simple technique for improving the quality of parameter estimates in learning hierarchy validation studies

Psychometrika ◽  
1980 ◽  
Vol 45 (2) ◽  
pp. 269-271 ◽  
Author(s):  
Alan R. Barton

SLEEP ◽  
2020 ◽  
Author(s):  
Luca Menghini ◽  
Nicola Cellini ◽  
Aimee Goldstone ◽  
Fiona C Baker ◽  
Massimiliano de Zambotti

Abstract: Sleep-tracking devices, particularly within the consumer sleep technology (CST) space, are increasingly used in both research and clinical settings, providing new opportunities for large-scale data collection in highly ecological conditions. Owing to the fast pace of the CST industry and the lack of a standardized framework for evaluating the performance of sleep trackers, their accuracy and reliability in measuring sleep remain largely unknown. Here, we provide a step-by-step analytical framework for evaluating the performance of sleep trackers (including standard actigraphy) against gold-standard polysomnography (PSG) or other reference methods. The analytical guidelines are based on recent recommendations for evaluating and using CST from our group and others (de Zambotti and colleagues; Depner and colleagues), and cover raw data organization as well as critical analytical procedures, including discrepancy analysis, Bland–Altman plots, and epoch-by-epoch analysis. Analytical steps are accompanied by open-source R functions (available at https://sri-human-sleep.github.io/sleep-trackers-performance/AnalyticalPipeline_v1.0.0.html). In addition, an empirical sample dataset is used to describe and discuss the main outcomes of the proposed pipeline. The guidelines and the accompanying functions aim to standardize the testing of CST performance, not only to increase the replicability of validation studies but also to provide ready-to-use tools to researchers and clinicians. All in all, this work can help to increase the efficiency, interpretation, and quality of validation studies, and to improve the informed adoption of CST in research and clinical settings.
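The discrepancy and epoch-by-epoch steps of such a pipeline reduce to a small amount of arithmetic. The following is a minimal Python sketch (not the authors' R functions; the data, function names, and scoring labels are invented for illustration) of two core computations: Bland–Altman bias with 95% limits of agreement for summary sleep measures, and epoch-by-epoch accuracy, sensitivity, and specificity against PSG.

```python
# Illustrative sketch of two core device-vs-PSG performance metrics.
import statistics

def bland_altman(device, reference):
    """Bias and 95% limits of agreement between paired measurements."""
    diffs = [d - r for d, r in zip(device, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def epoch_by_epoch(device_epochs, psg_epochs, positive="sleep"):
    """Accuracy, sensitivity (sleep), specificity (wake) across scored epochs."""
    tp = fp = tn = fn = 0
    for d, p in zip(device_epochs, psg_epochs):
        if p == positive:
            tp += d == positive
            fn += d != positive
        else:
            tn += d != positive
            fp += d == positive
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Invented example: total sleep time (minutes) across five nights
device_tst = [400, 380, 420, 390, 410]
psg_tst = [390, 385, 400, 395, 405]
bias, loa = bland_altman(device_tst, psg_tst)

# Invented example: four 30-second epochs scored by device and PSG
metrics = epoch_by_epoch(["sleep", "sleep", "wake", "sleep"],
                         ["sleep", "wake", "wake", "sleep"])
```

A positive bias here would indicate the device overestimates total sleep time relative to PSG; the limits of agreement bound the discrepancy expected for most nights.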


2019 ◽  
Vol 11 (3) ◽  
pp. 481-492 ◽  
Author(s):  
Amir Ghiasi ◽  
Grigorios Fountas ◽  
Panagiotis Anastasopoulos ◽  
Fred Mannering

Purpose: Unlike many other quantitative characteristics used to determine higher education rankings, opinion-based peer assessment scores and the factors that may influence them are not well understood. Using peer scores of US colleges of engineering as reported annually in US News and World Report (USNews) rankings, the purpose of this paper is to provide some insight into peer assessments by statistically identifying the factors that influence them.
Design/methodology/approach: With highly detailed data, a random parameters linear regression is estimated to statistically identify the factors determining a college of engineering's average USNews peer assessment score.
Findings: The findings show that a wide variety of college- and university-specific attributes influence average peer impressions of a university's college of engineering, including the size of the faculty, the quality of admitted students, and the quality of the faculty as measured by their citation data, among other factors.
Originality/value: The paper demonstrates that average peer assessment scores can be readily and accurately predicted from observable data on the college of engineering and the university as a whole. In addition, the individual parameter estimates from the statistical modeling provide insight into how specific college and university attributes can help guide policies to improve an individual college's average peer assessment score and its overall ranking.
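The paper's model is a random parameters linear regression, in which coefficients may vary across observations. As a much simpler fixed-parameter analogue, the sketch below fits a one-predictor least-squares regression of peer score on a single hypothetical attribute (mean citations per faculty member); all numbers and variable names are invented for illustration, not taken from the paper.

```python
# Simple closed-form OLS fit: peer score regressed on one attribute.
def ols_fit(x, y):
    """Simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

citations = [2.1, 3.5, 4.2, 5.0]   # hypothetical mean citations per faculty
peer_score = [2.4, 3.1, 3.6, 4.3]  # hypothetical USNews-style 1-5 peer scores

intercept, slope = ols_fit(citations, peer_score)
predicted = [intercept + slope * c for c in citations]
```

A positive estimated slope would correspond to the paper's finding that faculty citation quality raises average peer impressions; the random parameters approach additionally lets such a slope differ across colleges.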


2020 ◽  
pp. 1-2
Author(s):  
A. Geerinck ◽  
M. Locquet ◽  
J.-Y. Reginster

The Sarcopenia Quality of Life (SarQoL®) questionnaire was developed in 2015 to fill the need for a specific instrument to measure quality of life in sarcopenia. Since then, its validity and reliability have been evaluated in multiple languages, and it is now available in 30 language-specific versions. In multiple validation studies, the SarQoL® has demonstrated its ability to discriminate between sarcopenic and non-sarcopenic subjects when diagnosed according to the EWGSOP criteria (1). However, these criteria have now been updated, and the discriminative power of the SarQoL® questionnaire should be reaffirmed using the EWGSOP2 criteria (2). The analysis presented below aims to establish whether the SarQoL® questionnaire can discriminate between sarcopenic, probably sarcopenic (low grip strength in the EWGSOP2 algorithm) and non-sarcopenic participants.
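Discrimination between three groups on an ordinal quality-of-life score is commonly tested with a Kruskal-Wallis test. Below is a hedged, self-contained Python sketch of that kind of analysis; the group sizes and SarQoL-like scores are invented, and this is not the analysis code from the study.

```python
# Kruskal-Wallis H statistic for comparing scores across k groups.
def ranks(values):
    """Average 1-based ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the tied 1-based ranks
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def kruskal_wallis_h(groups):
    pooled = [v for g in groups for v in g]
    r = ranks(pooled)
    n = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        rg = r[start:start + len(g)]
        start += len(g)
        h += len(g) * (sum(rg) / len(g) - (n + 1) / 2) ** 2
    return 12 / (n * (n + 1)) * h

# Invented SarQoL-like scores (0-100) for the three EWGSOP2 groups
sarcopenic = [45.0, 50.2, 48.1]
probable = [60.5, 58.3, 62.0]
non_sarcopenic = [75.4, 80.1, 78.2]
H = kruskal_wallis_h([sarcopenic, probable, non_sarcopenic])
```

With two degrees of freedom, an H above the chi-squared critical value of about 5.99 would indicate a significant difference in quality-of-life scores across the three groups at the 0.05 level.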


2020 ◽  
Vol 102-B (12) ◽  
pp. 1599-1607
Author(s):  
Ben A. Marson ◽  
Simon Craxford ◽  
Sandeep R. Deshmukh ◽  
Douglas J. C. Grindlay ◽  
Joseph C. Manning ◽  
...  

Aims: This study evaluates the quality of patient-reported outcome measures (PROMs) reported in childhood fracture trials and recommends outcome measures for assessing and reporting physical function, functional capacity, and quality of life, using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) standards.
Methods: A Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)-compliant systematic review of OVID Medline, Embase, and Cochrane CENTRAL was performed to identify all PROMs reported in trials. A search of OVID Medline, Embase, and PsycINFO was performed to identify all PROMs with validation studies in childhood fractures. Development studies were identified through hand-searching. Data extraction was undertaken by two reviewers. Study quality and risk of bias were evaluated according to COSMIN guidelines and recorded on standardized checklists.
Results: Searches yielded 13,672 studies, which were screened to identify 124 trials and two validation studies. Review of the 124 trials identified 16 reported PROMs, of which two had validation studies. The development papers were retrieved for all PROMs. The quality of the original development studies was adequate for the Patient-Reported Outcomes Measurement Information System (PROMIS) Mobility and Upper Extremity measures and doubtful for the EuroQol Five Dimension Youth questionnaire (EQ-5D-Y). All other PROMs were found to have inadequate development studies. No content validity studies were identified. Reviewer-rated content validity was acceptable for six PROMs: the Activity Scale for Kids (ASK), the Childhood Health Assessment Questionnaire, PROMIS Upper Extremity, PROMIS Mobility, EQ-5D-Y, and the Pediatric Quality of Life Inventory (PedsQL4.0). The Modified Disabilities of the Arm, Shoulder, and Hand (DASH) questionnaire was shown to have indeterminate reliability and convergent validity in one study, and PROMIS Upper Extremity had insufficient convergent validity in one study.
Conclusion: There is insufficient evidence to strongly recommend the use of any single PROM to assess and report physical function or quality of life following childhood fractures. There is a need to conduct validation studies for PROMs. In the absence of these studies, we cautiously recommend the use of the PROMIS measures or ASK-P for physical function and the PedsQL4.0 or EQ-5D-Y for quality of life. Cite this article: Bone Joint J 2020;102-B(12):1599–1607.


Author(s):  
Angela Lisibach ◽  
Valérie Benelli ◽  
Marco Giacomo Ceppi ◽  
Karin Waldner-Knogler ◽  
Chantal Csajka ◽  
...  

Abstract: Purpose: Older people are at risk of anticholinergic side effects due to changes affecting drug elimination and a higher sensitivity to drugs' side effects. Anticholinergic burden scales (ABS) were developed to quantify the anticholinergic drug burden (ADB). We aim to identify all published ABS, to compare them systematically, and to evaluate their associations with clinical outcomes.
Methods: We conducted a literature search in MEDLINE and EMBASE to identify all published ABS, and a Web of Science (WoS) citation analysis to track validation studies reporting clinical outcomes. The quality of the ABS was assessed using an adapted AGREE II tool. For the validation studies, we used the Newcastle-Ottawa Scale and the Cochrane RoB 2.0 tool. The validation studies were categorized into six evidence levels, based on the propositions of the Oxford Centre for Evidence-Based Medicine, with respect to their quality. At least two researchers independently performed screening and quality assessments.
Results: Out of 1,297 records, we identified 19 ABS and 104 validation studies. Despite differences in quality, all ABS were recommended for use. The Anticholinergic Cognitive Burden (ACB) scale and the German Anticholinergic Burden Scale (GABS) achieved the highest percentage in quality. Most ABS are validated, yet validation studies for newer scales are lacking. Only two studies compared eight ABS simultaneously. The four most investigated clinical outcomes (delirium, cognition, mortality, and falls) showed contradictory results.
Conclusion: There is a need for good-quality validation studies comparing multiple scales, in order to define the best scale and to conduct a meta-analysis assessing their clinical impact.
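Most anticholinergic burden scales share the same mechanic: each drug receives a scale-specific rating (typically 0-3) and a patient's burden is the sum over their medication list. The sketch below illustrates that mechanic only; the ratings dictionary is invented for illustration and is not any published ABS.

```python
# Minimal illustration of summing per-drug anticholinergic ratings.
HYPOTHETICAL_RATINGS = {  # invented 0-3 ratings, not a real scale
    "diphenhydramine": 3,
    "paroxetine": 3,
    "ranitidine": 1,
    "metoprolol": 0,
}

def anticholinergic_burden(medications, ratings):
    """Sum the scale ratings; drugs absent from the scale contribute 0."""
    return sum(ratings.get(drug.lower(), 0) for drug in medications)

patient_meds = ["Diphenhydramine", "Ranitidine", "Lisinopril"]
burden = anticholinergic_burden(patient_meds, HYPOTHETICAL_RATINGS)
```

Because each scale assigns different ratings (and covers different drugs), the same medication list can yield different burden scores across scales, which is one reason the review finds contradictory associations with clinical outcomes.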


2017 ◽  
Vol 35 (4) ◽  
pp. 477-499 ◽  
Author(s):  
Ute Knoch ◽  
Carol A. Chapelle

Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a validity argument. This article presents one such analysis by examining how issues associated with the rating of test takers’ linguistic performance can be included in a validity argument. Through a manual search of published language testing research, we gathered examples of research studies investigating the quality of rating processes and products. We then analyzed them in terms of how the research could be framed within a validity argument. Drawing on Kane’s (2001, 2006, 2013) conceptualization of inferences, warrants, and assumptions, we show that the relevance of research about the rating of test performances extends beyond one or two inferences about rater reliability. Such research results, for example, provide backing for assumptions about the correspondence of the rating scale to the test construct (explanation inference) and the context of extrapolation as well as the decisions made based on the ratings and their consequences. Our analysis reveals a picture of the extensive reach of the rating process into many aspects of test score meaning as well as concrete suggestions for integrating rating issues into future argument-based validation studies.
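One concrete product of the rating-quality research surveyed here is an inter-rater agreement statistic. As a side sketch (the ratings are invented, and this is only one of many statistics such studies report), Cohen's kappa corrects raw two-rater agreement for the agreement expected by chance:

```python
# Cohen's kappa for two raters assigning ordinal band scores.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same scripts."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[c] * cb[c] for c in set(ca) | set(cb)) / n ** 2
    return (observed - expected) / (1 - expected)

# Invented band scores (1-5) from two raters over eight scripts
a = [3, 4, 4, 2, 5, 3, 4, 2]
b = [3, 4, 3, 2, 5, 3, 4, 3]
kappa = cohens_kappa(a, b)
```

In the article's terms, a high kappa backs only the evaluation inference's reliability assumptions; the authors' point is that rating research also bears on explanation, extrapolation, and utilization inferences.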

