Influence of Healthcare Organization Factors on Cardiovascular Diseases Mortality

Mapping Intimacies ◽

10.3233/shti210835 ◽

2021 ◽

Author(s):

Oleg Metsker ◽

Georgy Kopanitsa

Keyword(s):

Public Health ◽

Machine Learning ◽

Health Care Organizations ◽

Characteristic Curve ◽

Vascular Diseases ◽

Testing Dataset ◽

Target Indicator ◽

Health Factors ◽

Operating Characteristic Curve ◽

Cardio Vascular

One serious pandemic can nullify years of efforts to extend life expectancy and reduce disability. The coronavirus pandemic has been a perturbing factor that has provided an opportunity to assess not only the effectiveness of health systems for cardiovascular diseases (CVD), but also their sustainability. The goal of our research is to analyze the influence of public health factors on mortality from circulatory diseases using machine learning methods. We analysed a very large dataset that consisted of information collected from the national registers in Russia. We included data from 2015 to 2021. The dataset included 340 factors that characterize the organization of healthcare in Russia. The resulting area under the receiver operating characteristic curve (ROC AUC) of the Random Forest based regression model was 92% on a testing dataset. The models allow for automated retraining as time passes and as epidemiological and other situations change. They also allow additional characteristics of regions and health care organizations to be added to existing training datasets depending on the target. The developed models allow the calculation of the probability of the target for 6–12 months with an error of 8%. Moreover, the models allow scenarios and the value of the target indicator to be calculated when other indicators of the region change.

A Review on Heart Disease Detection Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0200 ◽

2017 ◽

Vol 7 (7) ◽

pp. 395

Author(s):

. Anika ◽

Navpreet Kaur

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Review Paper ◽

Vascular Diseases ◽

Disease Detection ◽

Automated Classification ◽

Feature Extraction Method ◽

Detection Techniques ◽

Cardio Vascular ◽

Public Domain Software

This paper presents a formal review of the early detection of heart disease, which is a major cause of death. Computational science has the potential to detect disease automatically at early stages. In this review paper we describe machine learning for disease detection. Machine learning is a method of data analysis that automates analytical model building. Various techniques have been developed to predict cardiac disease from MRI cases. The review covers automated classification using machine learning and feature extraction using CellProfiler and GLCM. CellProfiler is freely available public domain software developed by the Broad Institute's Imaging Platform, and GLCM (gray-level co-occurrence matrix) is a statistical method of examining texture. Various techniques to detect cardiovascular diseases are discussed.

Machine learning for identification of surgeries with high risks of cancellation

Health Informatics Journal ◽

10.1177/1460458218813602 ◽

2018 ◽

Vol 26 (1) ◽

pp. 141-155 ◽

Cited By ~ 2

Author(s):

Li Luo ◽

Fengyi Zhang ◽

Yao Yao ◽

RenRong Gong ◽

Martina Fu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Predictive Value ◽

Operating Characteristic ◽

Sampling Methods ◽

Characteristic Curve ◽

Support Vector ◽

Chi Square ◽

Stable Performance ◽

Operating Characteristic Curve

Surgery cancellations waste scarce operative resources and hinder patients’ access to operative services. In this study, the Wilcoxon and chi-square tests were used for predictor selection, and three machine learning models – random forest, support vector machine, and XGBoost – were used for the identification of surgeries with high risks of cancellation. The optimal performances of the identification models were as follows: sensitivity − 0.615; specificity − 0.957; positive predictive value − 0.454; negative predictive value − 0.904; accuracy − 0.647; and area under the receiver operating characteristic curve − 0.682. Of the three models, the random forest model achieved the best performance. Thus, the effective identification of surgeries with high risks of cancellation is feasible with stable performance. Models and sampling methods significantly affect the performance of identification. This study is a new application of machine learning for the identification of surgeries with high risks of cancellation and facilitation of surgery resource management.
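The metrics listed in this abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions; the function name `confusion_metrics` and the counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = confusion_metrics(tp=40, fp=10, tn=190, fn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, specificity is high (0.950) while sensitivity is low (0.400), which shows why the abstract reports the metrics together rather than accuracy alone.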

A Time-Updated, Parsimonious Model to Predict AKI in Hospitalized Children

Journal of the American Society of Nephrology ◽

10.1681/asn.2019070745 ◽

2020 ◽

Vol 31 (6) ◽

pp. 1348-1357 ◽

Cited By ~ 1

Author(s):

Ibrahim Sandokji ◽

Yu Yamamoto ◽

Aditya Biswas ◽

Tanima Arora ◽

Ugochukwu Ugwuowo ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Receiver Operating Characteristic Curve ◽

Operating Characteristic ◽

Characteristic Curve ◽

External Validation ◽

Health Record ◽

Hospitalized Children ◽

Operating Characteristic Curve ◽

Electronic Health

Background: Timely prediction of AKI in children can allow for targeted interventions, but the wealth of data in the electronic health record poses unique modeling challenges. Methods: We retrospectively reviewed the electronic medical records of all children younger than 18 years old who had at least two creatinine values measured during a hospital admission from January 2014 through January 2018. We divided the study population into derivation, and internal and external validation cohorts, and used five feature selection techniques to select 10 of 720 potentially predictive variables from the electronic health records. Model performance was assessed by the area under the receiver operating characteristic curve in the validation cohorts. The primary outcome was development of AKI (per the Kidney Disease Improving Global Outcomes creatinine definition) within a moving 48-hour window. Secondary outcomes included severe AKI (stage 2 or 3), inpatient mortality, and length of stay. Results: Among 8473 encounters studied, AKI occurred in 516 (10.2%), 207 (9%), and 27 (2.5%) encounters in the derivation, and internal and external validation cohorts, respectively. The highest-performing model used a machine learning-based genetic algorithm, with an overall receiver operating characteristic curve in the internal validation cohort of 0.76 [95% confidence interval (CI), 0.72 to 0.79] for AKI, 0.79 (95% CI, 0.74 to 0.83) for severe AKI, and 0.81 (95% CI, 0.77 to 0.86) for neonatal AKI. To translate this prediction model into a clinical risk-stratification tool, we identified high- and low-risk threshold points. Conclusions: Using various machine learning algorithms, we identified and validated a time-updated prediction model of ten readily available electronic health record variables to accurately predict imminent AKI in hospitalized children.

Machine-Learning and Stochastic Tumor Growth Models for Predicting Outcomes in Patients With Advanced Non–Small-Cell Lung Cancer

JCO Clinical Cancer Informatics ◽

10.1200/cci.19.00046 ◽

2019 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Kien Wei Siah ◽

Sean Khozin ◽

Chi Heem Wong ◽

Andrew W. Lo

Keyword(s):

Machine Learning ◽

Lung Cancer ◽

Clinical Trials ◽

Operating Characteristic ◽

Advanced Nsclc ◽

Characteristic Curve ◽

Small Cell ◽

Small Cell Lung ◽

Out Of Sample ◽

Operating Characteristic Curve

PURPOSE The prediction of clinical outcomes for patients with cancer is central to precision medicine and the design of clinical trials. We developed and validated machine-learning models for three important clinical end points in patients with advanced non–small-cell lung cancer (NSCLC)—objective response (OR), progression-free survival (PFS), and overall survival (OS)—using routinely collected patient and disease variables. METHODS We aggregated patient-level data from 17 randomized clinical trials recently submitted to the US Food and Drug Administration evaluating molecularly targeted therapy and immunotherapy in patients with advanced NSCLC. To our knowledge, this is one of the largest studies of NSCLC to consider biomarker and inhibitor therapy as candidate predictive variables. We developed a stochastic tumor growth model to predict tumor response and explored the performance of a range of machine-learning algorithms and survival models. Models were evaluated on out-of-sample data using the standard area under the receiver operating characteristic curve and concordance index (C-index) performance metrics. RESULTS Our models achieved promising out-of-sample predictive performances of 0.79 area under the receiver operating characteristic curve (95% CI, 0.77 to 0.81), 0.67 C-index (95% CI, 0.66 to 0.69), and 0.73 C-index (95% CI, 0.72 to 0.74) for OR, PFS, and OS, respectively. The calibration plots for PFS and OS suggested good agreement between actual and predicted survival probabilities. In addition, the Kaplan-Meier survival curves showed that the difference in survival between the low- and high-risk groups was significant (log-rank test P < .001) for both PFS and OS. CONCLUSION Biomarker status was the strongest predictor of OR, PFS, and OS in patients with advanced NSCLC treated with immune checkpoint inhibitors and targeted therapies. However, single biomarkers have limited predictive value, especially for programmed death-ligand 1 immunotherapy. 
To advance beyond the results achieved in this study, more comprehensive data on composite multiomic signatures is required.

Validating the validation: reanalyzing a large-scale comparison of deep learning and machine learning models for bioactivity prediction

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-019-00274-0 ◽

2020 ◽

Vol 34 (7) ◽

pp. 717-730 ◽

Cited By ~ 9

Author(s):

Matthew C. Robinson ◽

Robert C. Glen ◽

Alpha A. Lee

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Numerical Experiments ◽

Large Scale ◽

Operating Characteristic ◽

Characteristic Curve ◽

Learning Models ◽

Bioactivity Prediction ◽

Operating Characteristic Curve ◽

Machine Learning Models

Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of area under the receiver operating characteristic curve as a metric in virtual screening. We further suggest that area under the precision–recall curve should be used in conjunction with the receiver operating characteristic curve. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross validation.
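To illustrate why the two areas can diverge when positives are scarce, here is a minimal sketch in plain Python. It is not the authors' experiment: the helper names `roc_auc` and `average_precision`, and the toy data (2 actives among 100 compounds), are assumptions made for illustration. Average precision is used here as a step-wise estimate of the area under the precision–recall curve.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / sum(labels)


# Hypothetical screen: 2 actives, 98 inactives; 5 inactives receive high scores.
labels = [1, 1] + [0] * 98
scores = [0.55, 0.45] + [0.9, 0.8, 0.7, 0.6, 0.5] + [k / 1000 for k in range(1, 94)]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC: {auc:.3f}, average precision: {ap:.3f}")
```

In this toy case the ROC AUC is about 0.95 while the average precision is about 0.24, because a handful of highly ranked inactives barely moves the ROC curve but dominates precision at the top of the list.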

Supervised Machine-learning Predictive Analytics for Prediction of Postinduction Hypotension

Anesthesiology ◽

10.1097/aln.0000000000002374 ◽

2018 ◽

Vol 129 (4) ◽

pp. 675-688 ◽

Cited By ~ 45

Author(s):

Samir Kendale ◽

Prathamesh Kulkarni ◽

Andrew D. Rosenberg ◽

Jing Wang

Keyword(s):

Machine Learning ◽

Receiver Operating Characteristic Curve ◽

Operating Characteristic ◽

Predictive Analytics ◽

Characteristic Curve ◽

Supervised Machine Learning ◽

Gradient Boosting ◽

Machine Learning Methods ◽

Gradient Boosting Machine ◽

Operating Characteristic Curve

Background: Hypotension is a risk factor for adverse perioperative outcomes. Machine-learning methods allow large amounts of data for development of robust predictive analytics. The authors hypothesized that machine-learning methods can provide prediction for the risk of postinduction hypotension. Methods: Data was extracted from the electronic health record of a single quaternary care center from November 2015 to May 2016 for patients over age 12 that underwent general anesthesia, without procedure exclusions. Multiple supervised machine-learning classification techniques were attempted, with postinduction hypotension (mean arterial pressure less than 55 mmHg within 10 min of induction by any measurement) as primary outcome, and preoperative medications, medical comorbidities, induction medications, and intraoperative vital signs as features. Discrimination was assessed using cross-validated area under the receiver operating characteristic curve. The best performing model was tuned and final performance assessed using split-set validation. Results: Out of 13,323 cases, 1,185 (8.9%) experienced postinduction hypotension. Area under the receiver operating characteristic curve using logistic regression was 0.71 (95% CI, 0.70 to 0.72), support vector machines was 0.63 (95% CI, 0.58 to 0.60), naive Bayes was 0.69 (95% CI, 0.67 to 0.69), k-nearest neighbor was 0.64 (95% CI, 0.63 to 0.65), linear discriminant analysis was 0.72 (95% CI, 0.71 to 0.73), random forest was 0.74 (95% CI, 0.73 to 0.75), neural nets 0.71 (95% CI, 0.69 to 0.71), and gradient boosting machine 0.76 (95% CI, 0.75 to 0.77).
Test set area for the gradient boosting machine was 0.74 (95% CI, 0.72 to 0.77). Conclusions: The success of this technique in predicting postinduction hypotension demonstrates feasibility of machine-learning models for predictive analytics in the field of anesthesiology, with performance dependent on model selection and appropriate tuning.

Predicting Antituberculosis Drug–Induced Liver Injury Using an Interpretable Machine Learning Method: Model Development and Validation Study (Preprint)

10.2196/preprints.29226 ◽

2021 ◽

Author(s):

Tao Zhong ◽

Zian Zhuang ◽

Xiaoli Dong ◽

Ka Hing Wong ◽

Wing Tak Wong ◽

...

Keyword(s):

Machine Learning ◽

Liver Injury ◽

Receiver Operating Characteristic Curve ◽

Receiver Operating Characteristic ◽

Operating Characteristic ◽

Characteristic Curve ◽

Alanine Transaminase ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

Operating Characteristic Curve

BACKGROUND Tuberculosis (TB) is a pandemic, being one of the top 10 causes of death and the main cause of death from a single source of infection. Drug-induced liver injury (DILI) is the most common and serious side effect during the treatment of TB. OBJECTIVE We aim to predict the status of liver injury in patients with TB at the clinical treatment stage. METHODS We designed an interpretable prediction model based on the XGBoost algorithm and identified the most robust and meaningful predictors of the risk of TB-DILI on the basis of clinical data extracted from the Hospital Information System of Shenzhen Nanshan Center for Chronic Disease Control from 2014 to 2019. RESULTS In total, 757 patients were included, and 287 (38%) had developed TB-DILI. Based on values of relative importance and area under the receiver operating characteristic curve, machine learning tools selected patients’ most recent alanine transaminase levels, average rate of change of patients’ last 2 measures of alanine transaminase levels, cumulative dose of pyrazinamide, and cumulative dose of ethambutol as the best predictors for assessing the risk of TB-DILI. In the validation data set, the model had a precision of 90%, recall of 74%, classification accuracy of 76%, and balanced error rate of 77% in predicting cases of TB-DILI. The area under the receiver operating characteristic curve score upon 10-fold cross-validation was 0.912 (95% CI 0.890-0.935). In addition, the model provided warnings of high risk for patients in advance of DILI onset for a median of 15 (IQR 7.3-27.5) days. CONCLUSIONS Our model shows high accuracy and interpretability in predicting cases of TB-DILI, which can provide useful information to clinicians to adjust the medication regimen and avoid more serious liver injury in patients.

Different Relationships Between Steps and Movements and Healthy Biomarkers in People With and Without Disability

Journal of Physical Activity and Health ◽

10.1123/jpah.2019-0639 ◽

2021 ◽

pp. 1-12

Author(s):

Chungyi Chiu ◽

Alicia R. Covello-Jones ◽

Esteban Montenegro ◽

Jessica M. Brooks ◽

Sa Shen

Keyword(s):

Public Health ◽

Receiver Operating Characteristic Curve ◽

Receiver Operating Characteristic ◽

Operating Characteristic ◽

Characteristic Curve ◽

Data Driven ◽

Active Lifestyle ◽

Physically Active ◽

Cut Points ◽

Operating Characteristic Curve

Background: Physical activity benefits have been extensively studied. However, the public health guidelines seem unclear about the relationships between steps and movements with healthy biomarkers for people with (PWD) and without disabilities (PWOD), respectively. While public health guidelines illustrate types of exercise (eg, running, swimming), it is equally important to provide data-driven recommended amounts of daily steps or movements to achieve health biomarkers and further promote a physically active lifestyle. Methods: Data from the National Health and Nutrition Examination Survey 2003–2006 were used. The authors conducted sensitivity, specificity, and receiver-operating-characteristic curve analyses regarding cut points from ActiGraph 7164 of daily steps and movements for health biomarkers (eg, body mass index, cholesterol) in PWD (2178 participants) and PWOD (4414 participants). The authors also examined the dose relationships of steps, movements, and healthy biomarkers in each group. Results: The authors found significant differences in the cut points of daily steps and movement for health biomarkers in PWD and PWOD. For daily steps, cut points of PWD ranged from 3222 to 8311 (area under the receiver-operating-characteristic curve [AUC] range = 0.52–0.93), significantly lower than PWOD’s daily steps (range = 5455–14,272; AUC = 0.58–0.87). For daily movement, cut points of PWD ranged from 115,451 to 430,324 (AUC = 0.53–0.91), significantly lower than the PWOD’s daily movements (range = 215,288–282,307; AUC = 0.60–0.88). The authors found strong but different dose relationships of many biomarkers in each group. Conclusions: PWD need fewer daily steps or movement counts to achieve health biomarkers than PWOD. The authors provided data-driven, condition-specific recommendations on promoting a physically active lifestyle.

Prediction of extubation failure in the paediatric cardiac ICU using machine learning and high-frequency physiologic data

Cardiology in the Young ◽

10.1017/s1047951121004959 ◽

2021 ◽

pp. 1-8

Author(s):

Sydney R. Rooney ◽

Evan L. Reynolds ◽

Mousumi Banerjee ◽

Sara K. Pasquali ◽

John R. Charpie ◽

...

Keyword(s):

Machine Learning ◽

Receiver Operating Characteristic Curve ◽

Receiver Operating Characteristic ◽

Operating Characteristic ◽

Characteristic Curve ◽

Extubation Failure ◽

Machine Learning Methods ◽

Operating Characteristic Curve ◽

Physiologic Data ◽

Receiver Operating

Background: Cardiac intensivists frequently assess patient readiness to wean off mechanical ventilation with an extubation readiness trial despite it being no more effective than clinician judgement alone. We evaluated the utility of high-frequency physiologic data and machine learning for improving the prediction of extubation failure in children with cardiovascular disease. Methods: This was a retrospective analysis of clinical registry data and streamed physiologic extubation readiness trial data from one paediatric cardiac ICU (12/2016-3/2018). We analysed patients’ final extubation readiness trial. Machine learning methods (classification and regression tree, Boosting, Random Forest) were performed using clinical/demographic data, physiologic data, and both datasets. Extubation failure was defined as reintubation within 48 hrs. Classifier performance was assessed on prediction accuracy and area under the receiver operating characteristic curve. Results: Of 178 episodes, 11.2% (N = 20) failed extubation. Using clinical/demographic data, our machine learning methods identified variables such as age, weight, height, and ventilation duration as being important in predicting extubation failure. Best classifier performance with this data was Boosting (prediction accuracy: 0.88; area under the receiver operating characteristic curve: 0.74). Using physiologic data, our machine learning methods found oxygen saturation extremes and descriptors of dynamic compliance, central venous pressure, and heart/respiratory rate to be of importance. The best classifier in this setting was Random Forest (prediction accuracy: 0.89; area under the receiver operating characteristic curve: 0.75). Combining both datasets produced classifiers highlighting the importance of physiologic variables in determining extubation failure, though predictive performance was not improved.
Conclusion: Physiologic variables not routinely scrutinised during extubation readiness trials were identified as potential extubation failure predictors. Larger analyses are necessary to investigate whether these markers can improve clinical decision-making.

Revisiting performance metrics for prediction with rare outcomes

Statistical Methods in Medical Research ◽

10.1177/09622802211038754 ◽

2021 ◽

Vol 30 (10) ◽

pp. 2352-2366

Author(s):

Samrachana Adhikari ◽

Sharon-Lise Normand ◽

Jordan Bloom ◽

David Shahian ◽

Sherri Rose

Keyword(s):

Machine Learning ◽

Receiver Operating Characteristic Curve ◽

Receiver Operating Characteristic ◽

Operating Characteristic ◽

Characteristic Curve ◽

True Positive ◽

Post Surgery ◽

Operating Characteristic Curve ◽

Prediction Problems ◽

Receiver Operating

Machine learning algorithms are increasingly used in the clinical literature, claiming advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were [Formula: see text]1%, however, true positive rates were [Formula: see text]7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve ([Formula: see text]90%) accompanying low true positive rates. Clinical studies should not primarily report only area under the receiver operating characteristic curve or accuracy.
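The pattern described in this abstract (high accuracy and a perfect positive predictive value alongside a very low true positive rate) can be reproduced with simple arithmetic. The sketch below uses an entirely hypothetical cohort and classifier; none of the numbers come from the study's registry.

```python
# Hypothetical cohort: 1000 patients, 30 deaths (3% event rate).
y_true = [1] * 30 + [0] * 970
# Hypothetical classifier that flags only 2 patients, both of whom died.
y_pred = [1] * 2 + [0] * 998

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)
tn = sum(t == 0 and p == 0 for t, p in pairs)

accuracy = (tp + tn) / len(y_true)
tpr = tp / (tp + fn)   # true positive rate (sensitivity)
fpr = fp / (fp + tn)   # false positive rate
ppv = tp / (tp + fp)   # positive predictive value

print(f"accuracy={accuracy:.3f} TPR={tpr:.3f} FPR={fpr:.3f} PPV={ppv:.3f}")
```

Here accuracy is 0.972 and the positive predictive value is 1.0, yet the classifier finds only 2 of the 30 deaths (a true positive rate of about 0.067), so no single number summarizes its usefulness.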
