Machine Learning-Based Model to Predict the Disease Severity and Outcome in COVID-19 Patients

Scientific Programming ◽

10.1155/2021/5587188 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Sumayh S. Aljameel ◽

Irfan Ullah Khan ◽

Nida Aslam ◽

Malak Aljabri ◽

Eman S. Alsulmi

Keyword(s):

At Risk ◽

Survival Rate ◽

Early Identification ◽

Global Economy ◽

Prediction Method ◽

Data Partitioning ◽

University Hospital ◽

Gradient Boosting ◽

Extreme Gradient Boosting ◽

Novel Coronavirus

The novel coronavirus (COVID-19) outbreak has had devastating effects on the global economy and the health of entire communities. Although the COVID-19 survival rate is high, the number of severe cases that result in death is increasing daily. Timely prediction of at-risk COVID-19 patients, combined with precautionary measures, is expected to increase the survival rate and reduce the fatality rate. This research provides a prediction method for the early identification of COVID-19 patients' outcomes based on patient characteristics monitored at home while in quarantine. The study was performed using 287 COVID-19 patient samples from the King Fahad University Hospital, Saudi Arabia. The data were analyzed using three classification algorithms, namely, logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). Initially, the data were preprocessed using several preprocessing techniques. Furthermore, 10-fold cross-validation was applied for data partitioning and SMOTE for alleviating the data imbalance. Experiments were performed using twenty clinical features identified as significant for predicting survival versus death in COVID-19 patients. The results showed that RF outperformed the other classifiers with an accuracy of 0.95 and an area under the curve (AUC) of 0.99. The proposed model can assist decision-making and health care professionals through effective early identification of at-risk COVID-19 patients.
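
The data partitioning and rebalancing steps mentioned in this abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the function names `smote_like` and `k_fold_indices`, the toy minority samples, and the neighbour count `k=3` are assumptions made for illustration. The abstract does not say whether resampling was applied inside or outside the folds; a common recommendation is to oversample only the training folds.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices once and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Hypothetical two-feature minority-class samples (not from the study).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like(minority, n_new=6)

# 287 is the sample count reported in the abstract.
folds = k_fold_indices(287, k=10)
```

Each fold serves once as the held-out set while the remaining nine are used for training, so every sample is evaluated exactly once.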

Application of Machine-Learning-Based Fusion Model in Visibility Forecast: A Case Study of Shanghai, China

Remote Sensing ◽

10.3390/rs13112096 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2096

Author(s):

Zhongqi Yu ◽

Yuanhao Qu ◽

Yunxin Wang ◽

Jinghui Ma ◽

Yu Cao

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Eastern China ◽

Prediction Method ◽

Sampling Technique ◽

Environmental Modeling ◽

Gradient Boosting ◽

Fusion Model ◽

Light Gradient ◽

Extreme Gradient Boosting

A visibility forecast model called a boosting-based fusion model (BFM) was established in this study. The model uses a fusion machine learning model based on multisource data, including air pollutants, meteorological observations, moderate resolution imaging spectroradiometer (MODIS) aerosol optical depth (AOD) data, and outputs from an operational regional atmospheric environmental modeling system for eastern China (RAEMS). Extreme gradient boosting (XGBoost), a light gradient boosting machine (LightGBM), and a numerical prediction method, i.e., RAEMS, were fused to establish this prediction model. Three sets of prediction models, that is, BFM, LightGBM based on multisource data (LGBM), and RAEMS, were used to conduct visibility prediction tasks. The training set covered 1 January 2015 to 31 December 2018 and used several data pre-processing methods, including synthetic minority over-sampling technique (SMOTE) data resampling, a loss function adjustment, and 10-fold cross-validation. Moreover, apart from the basic features (variables), more spatial and temporal gradient features were considered. The testing set covered 1 January to 31 December 2019 and was adopted to validate the feasibility of the BFM, LGBM, and RAEMS. Statistical indicators confirmed that the machine learning methods improved the RAEMS forecast significantly and consistently. The root mean square error and correlation coefficient of BFM for the next 24/48 h were 5.01/5.47 km and 0.80/0.77, respectively, which were markedly better than those of RAEMS. The statistics and binary score analysis for different areas in Shanghai also confirmed the reliability and accuracy of BFM, particularly in low-visibility forecasting. Overall, BFM is a suitable tool for predicting visibility. It provides a more accurate visibility forecast for the next 24 and 48 h in Shanghai than LGBM and RAEMS. The results of this study provide support for real-time operational visibility forecasts.
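
The two statistical indicators quoted in this abstract, root mean square error and the correlation coefficient, can be computed as in the following sketch. The visibility values are made up for illustration, and the correlation is assumed to be the Pearson coefficient, which the abstract does not state explicitly. A lower RMSE and a higher correlation both indicate a better forecast.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical observed and forecast visibility in km (not from the study).
obs = [10.0, 8.0, 3.0, 1.5, 6.0]
pred = [9.0, 7.5, 4.0, 2.5, 6.5]

error_km = rmse(obs, pred)
correlation = pearson_r(obs, pred)
```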

iterb-PPse: Identification of transcriptional terminators in bacterial by incorporating nucleotide properties into PseKNC

10.1101/2020.01.17.910232 ◽

2020 ◽

Author(s):

Yongxian Fan ◽

Wanru Wang ◽

Qingqi Zhu

Keyword(s):

Prediction Method ◽

Disease Diagnosis ◽

Extraction Methods ◽

Gradient Boosting ◽

Step Method ◽

Base Content ◽

Transcriptional Termination ◽

Extreme Gradient Boosting ◽

New Feature ◽

Fold Cross Validation

Abstract. A terminator is a DNA sequence that gives RNA polymerase the transcriptional termination signal. Identifying terminators correctly can improve genome annotation and, more importantly, has considerable application value in disease diagnosis and therapy. However, accurate prediction methods are lacking and urgently needed. Therefore, we propose a prediction method for terminators, “iterb-PPse”, that incorporates 47 nucleotide properties into PseKNC-I and PseKNC-II and uses Extreme Gradient Boosting to predict terminators based on Escherichia coli and Bacillus subtilis. Combining these with preceding methods, we employed three new feature extraction methods, K-pwm, Base-content, and Nucleotidepro, to formulate raw samples. A two-step method was applied to select features. When identifying terminators based on the optimized features, we compared five single models as well as 16 ensemble models. As a result, the accuracy of our method on the benchmark dataset reached 99.88%, higher than that of the existing state-of-the-art predictor iTerm-PseKNC in a 100-times five-fold cross-validation test. Its prediction accuracy for two independent datasets reached 94.24% and 99.45%, respectively. For the convenience of users, software with the same name was developed on the basis of “iterb-PPse”. The open software and source code of “iterb-PPse” are available at https://github.com/Sarahyouzi/iterb-PPse.
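
To give a sense of how a DNA sequence can be turned into a numeric feature vector before classification, the sketch below computes a plain k-mer composition. This is a generic illustration only: it is not the K-pwm, Base-content, Nucleotidepro, or PseKNC encoding used in the paper, and the function name `kmer_composition` and the example sequence are assumptions.

```python
from itertools import product

def kmer_composition(seq, k=2, alphabet="ACGT"):
    """Return the relative frequency of every k-mer over the alphabet,
    in lexicographic order, for one sequence."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]

# Hypothetical sequence; yields a 16-dimensional dinucleotide feature vector.
features = kmer_composition("ACGTACGT", k=2)
```

Vectors of this kind, one per sequence, are the sort of fixed-length input that a gradient boosting classifier expects.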

Predictive modeling of susceptibility to substance abuse, mortality and drug-drug interactions in opioid patients

10.1101/506451 ◽

2018 ◽

Author(s):

Ramya Vunikili ◽

Benjamin S Glicksberg ◽

Kipp W Johnson ◽

Joel Dudley ◽

Lakshminarayanan Subramanian ◽

...

Keyword(s):

At Risk ◽

Drug Interaction ◽

Drug Interactions ◽

Predictive Models ◽

Opioid Abuse ◽

Gradient Boosting ◽

Interaction Patterns ◽

Extreme Gradient Boosting ◽

Patients At Risk ◽

Drug Drug Interaction

Opioid addiction causes a high degree of morbidity and mortality. Preemptive identification of patients at risk of opioid dependence and the development of intelligent clinical decisions to deprescribe opioids for the vulnerable patient population may help reduce this burden. Identifying patients susceptible to mortality due to opioid-induced side effects and understanding the landscape of drug-drug interaction pairs aggravating opioid usage are significant yet unexplored research questions. In this study, we present a collection of predictive models to identify patients at risk of opioid abuse, mortality, and drug-drug interactions in the context of opioid usage. Using a publicly available dataset from MIMIC-III, we developed predictive models (opioid abuse models: a = Logistic Regression, b = Extreme Gradient Boosting; mortality model = Extreme Gradient Boosting) and identified potential drug-drug interaction patterns. To enable the translational value of our work, the predictive models and all associated software code are provided. This repository could be used to build clinical decision aids and thus improve the optimization of prescription rates for the vulnerable population.

A Prediction Method Based on Extreme Gradient Boosting Tree Model and its Application

Journal of Physics Conference Series ◽

10.1088/1742-6596/1995/1/012017 ◽

2021 ◽

Vol 1995 (1) ◽

pp. 012017

Author(s):

Yongchang Lao ◽

Fangzhong Qi ◽

Jiakai Zhou ◽

Xiaobao Fang

Keyword(s):

Prediction Method ◽

Gradient Boosting ◽

Tree Model ◽

Extreme Gradient Boosting

A CEEMDAN and XGBOOST-Based Approach to Forecast Crude Oil Prices

Complexity ◽

10.1155/2019/4392785 ◽

2019 ◽

Vol 2019 ◽

pp. 1-15 ◽

Cited By ~ 22

Author(s):

Yingrui Zhou ◽

Taiyong Li ◽

Jiayi Shi ◽

Zijie Qian

Keyword(s):

Crude Oil ◽

Global Economy ◽

Oil Prices ◽

Ensemble Empirical Mode Decomposition ◽

Gradient Boosting ◽

Intrinsic Mode Functions ◽

Crude Oil Prices ◽

Novel Approach ◽

Mode Decomposition ◽

Extreme Gradient Boosting

Crude oil is one of the most important types of energy for the global economy, and hence understanding the movement of crude oil prices is of great interest. However, crude oil price sequences usually show nonstationarity and nonlinearity, making accurate forecasting of crude oil prices very challenging. To cope with this issue, in this paper, we propose a novel approach that integrates complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and extreme gradient boosting (XGBOOST), called CEEMDAN-XGBOOST, for forecasting crude oil prices. Firstly, we use CEEMDAN to decompose the nonstationary and nonlinear sequences of crude oil prices into several intrinsic mode functions (IMFs) and one residue. Secondly, XGBOOST is used to predict each IMF and the residue individually. Finally, the corresponding prediction results of each IMF and the residue are aggregated as the final forecasting results. To demonstrate the performance of the proposed approach, we conduct extensive experiments on the West Texas Intermediate (WTI) crude oil prices. The experimental results show that the proposed CEEMDAN-XGBOOST outperforms some state-of-the-art models in terms of several evaluation metrics.
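
The three steps described in this abstract (decompose, forecast each component, aggregate) can be sketched structurally as follows. Two stand-ins are swapped in to keep the example self-contained: a trailing moving average replaces CEEMDAN, and a simple drift forecast replaces XGBOOST. The function names and the price series are hypothetical, so this shows only the shape of the pipeline, not the paper's method.

```python
def moving_average(series, window=3):
    """Trailing moving average; early points use the samples available so far."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def decompose(series, window=3):
    """Stand-in for CEEMDAN: split the series into a smooth part and a
    remainder that add back up to the original series."""
    smooth = moving_average(series, window)
    remainder = [s - m for s, m in zip(series, smooth)]
    return [smooth, remainder]

def drift_forecast(component):
    """Stand-in for a learned model: last value plus the last observed step."""
    return component[-1] + (component[-1] - component[-2])

def decompose_and_forecast(series):
    """Forecast each component separately, then sum the component forecasts."""
    components = decompose(series)
    return sum(drift_forecast(c) for c in components)

# Hypothetical price series (not WTI data).
prices = [50.0, 52.0, 51.0, 53.0, 55.0, 54.0]
next_price = decompose_and_forecast(prices)
```

Because the stand-in forecaster is linear, the aggregated forecast here equals a drift forecast on the raw series. With a nonlinear per-component model such as a boosted tree, the decomposition can change the result, which is the motivation the abstract gives for the approach.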

Research on Accurate Prediction of the Container Ship Resistance by RBFNN and Other Machine Learning Algorithms

Journal of Marine Science and Engineering ◽

10.3390/jmse9040376 ◽

2021 ◽

Vol 9 (4) ◽

pp. 376 ◽

Cited By ~ 1

Author(s):

Yunfei Yang ◽

Haiwen Tu ◽

Lei Song ◽

Lin Chen ◽

De Xie ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Prediction Method ◽

Resistance Coefficient ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Container Ship ◽

Extreme Gradient Boosting ◽

Better Than

Resistance is one of the important performance indicators of ships. In this paper, a prediction method based on the Radial Basis Function neural network (RBFNN) is proposed to predict the resistance of a 13,500 twenty-foot equivalent unit (13500TEU) container ship at different drafts. Prediction for a draft state within the known range is called interpolation prediction; otherwise, it is extrapolation prediction. First, ship features are extracted to predict the resistance Rt. The resistance prediction results show that the performance of the RBFNN is significantly better than that of the other four machine learning models: backpropagation neural network (BPNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost). Then, the ship data are made dimensionless, and the models mentioned above are used to predict the total resistance coefficient Ct of the container ship. The prediction results show that the RBFNN prediction model still performs well. Good results can be obtained by RBFNN in interpolation prediction, even when using only part of the dimensionless features. Finally, the accuracy of the prediction method based on RBFNN is greatly improved compared with the modified admiralty coefficient.
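
For readers unfamiliar with the dimensionless quantity mentioned here, the sketch below uses the conventional naval-architecture definition of the total resistance coefficient, Ct = Rt / (0.5 · ρ · S · V²). The abstract does not spell out its exact nondimensionalization, so this definition, the seawater density default, and the numeric inputs are all assumptions for illustration.

```python
def total_resistance_coefficient(rt_newton, speed_m_s, wetted_area_m2, rho=1025.0):
    """Conventional definition Ct = Rt / (0.5 * rho * S * V^2).

    rho defaults to a typical seawater density in kg/m^3 (an assumption)."""
    return rt_newton / (0.5 * rho * wetted_area_m2 * speed_m_s ** 2)

def resistance_from_coefficient(ct, speed_m_s, wetted_area_m2, rho=1025.0):
    """Inverse mapping: recover the dimensional resistance Rt in newtons."""
    return ct * 0.5 * rho * wetted_area_m2 * speed_m_s ** 2

# Hypothetical values, not taken from the paper.
ct = total_resistance_coefficient(rt_newton=1.0e6, speed_m_s=10.0, wetted_area_m2=1.0e4)
rt_back = resistance_from_coefficient(ct, speed_m_s=10.0, wetted_area_m2=1.0e4)
```

Predicting Ct instead of Rt removes most of the dependence on speed and hull size, so a model trained on the coefficient works on a narrower numeric range.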

Using explainable machine learning to identify patients at risk of reattendance at discharge from emergency departments

Scientific Reports ◽

10.1038/s41598-021-00937-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

F. P. Chmiel ◽

D. K. Burns ◽

M. Azor ◽

F. Borca ◽

M. J. Boniface ◽

...

Keyword(s):

Machine Learning ◽

At Risk ◽

Emergency Departments ◽

Gradient Boosting ◽

Tree Model ◽

Post Discharge ◽

Test Set ◽

Increased Risk ◽

Extreme Gradient Boosting ◽

Patients At Risk

Abstract. Short-term reattendances to emergency departments are a key quality of care indicator. Identifying patients at increased risk of early reattendance could help reduce the number of missed critical illnesses and could reduce avoidable utilization of emergency departments by enabling targeted post-discharge intervention. In this manuscript, we present a retrospective, single-centre study where we created and evaluated an extreme gradient boosting decision tree model trained to identify patients at risk of reattendance within 72 h of discharge from an emergency department (University Hospitals Southampton Foundation Trust, UK). Our model was trained using 35,447 attendances by 28,945 patients and evaluated on a hold-out test set featuring 8847 attendances by 7237 patients. The set of attendances from a given patient appeared exclusively in either the training or the test set. Our model was trained using both visit-level variables (e.g., vital signs, arrival mode, and chief complaint) and a set of variables available in a patient's electronic patient record, such as age and any recorded medical conditions. On the hold-out test set, our highest performing model obtained an AUROC of 0.747 (95% CI 0.722–0.773) and an average precision of 0.233 (95% CI 0.194–0.277). These results demonstrate that machine-learning models can be used to classify patients, with moderate performance, into low- and high-risk groups for reattendance. We explained our model's predictions using SHAP values, a concept developed from coalitional game theory, capable of explaining predictions at an attendance level. We demonstrated how clustering techniques (the UMAP algorithm) can be used to investigate the different sub-groups of explanations present in our patient cohort.
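
The constraint that all attendances from a given patient fall in either the training or the test set can be enforced by splitting on patient identifiers, not on individual attendances. The sketch below is one minimal way to do that, not the authors' code: the field name `patient_id`, the 20% test fraction, and the toy records are assumptions.

```python
import random

def split_by_patient(attendances, test_fraction=0.2, seed=0):
    """Assign whole patients to train or test, so no patient spans both sets."""
    patients = sorted({a["patient_id"] for a in attendances})
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [a for a in attendances if a["patient_id"] not in test_patients]
    test = [a for a in attendances if a["patient_id"] in test_patients]
    return train, test

# Hypothetical records: 10 patients with between one and three visits each.
attendances = [
    {"patient_id": pid, "visit": visit}
    for pid in range(10)
    for visit in range(pid % 3 + 1)
]
train, test = split_by_patient(attendances)
```

Splitting this way avoids leakage, where a model is scored on a patient whose earlier visits it has already seen during training.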

Predicting WNV Circulation in Italy Using Earth Observation Data and Extreme Gradient Boosting Model

Remote Sensing ◽

10.3390/rs12183064 ◽

2020 ◽

Vol 12 (18) ◽

pp. 3064

Author(s):

Luca Candeloro ◽

Carla Ippoliti ◽

Federica Iapaolo ◽

Federica Monaco ◽

Daniela Morelli ◽

...

Keyword(s):

At Risk ◽

Environmental Conditions ◽

Land Surface ◽

Vegetation Index ◽

Earth Observation ◽

Gradient Boosting ◽

Observation Data ◽

West Nile ◽

Extreme Gradient Boosting ◽

Earth Observation Data

West Nile Disease (WND), caused by a vector-borne virus, is one of the most widespread zoonoses in Italy and Europe. Its transmission cycle is well understood, with birds acting as the primary hosts and mosquito vectors transmitting the virus to other birds, while humans and horses are occasional dead-end hosts. Identifying suitable environmental conditions across large areas containing multiple species of potential hosts and vectors can be difficult. The recent and massive availability of Earth Observation data and the continuous development of innovative Machine Learning methods can help to automatically identify patterns in big datasets and to identify areas at risk with high accuracy. In this paper, we investigated West Nile Virus (WNV) circulation in relation to Land Surface Temperature, Normalized Difference Vegetation Index, and Surface Soil Moisture collected during the 160 days before the infection took place, with the aim of evaluating the predictive capacity of lagged remotely sensed variables in the identification of areas at risk for WNV circulation. WNV detections in mosquitoes, birds, and horses in 2017, 2018, and 2019 were collected from the National Information System for Animal Disease Notification. An Extreme Gradient Boosting model was trained with data from 2017 and 2018 and tested on the 2019 epidemic, predicting the spatio-temporal WNV circulation two weeks in advance with an overall accuracy of 0.84. This work lays the basis for a future early warning system that could alert public authorities when climatic and environmental conditions become favourable to the onset and spread of WNV.
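
Two ideas from this abstract lend themselves to a short sketch: summarising a remotely sensed variable over the 160 days preceding an event, and splitting the data by year so that the model is tested on a later season than it was trained on. The code below assumes daily values keyed by date and aggregates the window with a mean; the paper may aggregate differently, and all names and values are hypothetical.

```python
from datetime import date, timedelta

def lagged_mean(daily_values, event_day, lag_days=160):
    """Mean of a daily variable over the lag_days before event_day
    (the event day itself is excluded)."""
    start = event_day - timedelta(days=lag_days)
    window = [v for d, v in daily_values.items() if start <= d < event_day]
    return sum(window) / len(window) if window else None

def split_by_year(records, train_years=(2017, 2018), test_years=(2019,)):
    """Temporal hold-out: train on earlier years, test on a later one."""
    train = [r for r in records if r["date"].year in train_years]
    test = [r for r in records if r["date"].year in test_years]
    return train, test

# Hypothetical daily series: value i on day i after 1 January 2019.
first_day = date(2019, 1, 1)
daily_values = {first_day + timedelta(days=i): float(i) for i in range(200)}
feature = lagged_mean(daily_values, first_day + timedelta(days=180))

# Hypothetical detection records, one per year.
records = [{"date": date(year, 8, 1), "positive": True} for year in (2017, 2018, 2019)]
train, test = split_by_year(records)
```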

Predictive Modelling of Susceptibility to Substance Abuse, Mortality and Drug-Drug Interactions in Opioid Patients

Frontiers in Artificial Intelligence ◽

10.3389/frai.2021.742723 ◽

2021 ◽

Vol 4 ◽

Author(s):

Ramya Vunikili ◽

Benjamin S. Glicksberg ◽

Kipp W. Johnson ◽

Joel T. Dudley ◽

Lakshminarayanan Subramanian ◽

...

Keyword(s):

At Risk ◽

Drug Interactions ◽

Predictive Models ◽

Causal Effect ◽

Opioid Abuse ◽

Gradient Boosting ◽

Diabetic Patients ◽

Opioid Overdose ◽

Extreme Gradient Boosting ◽

Patients At Risk

Objective: Opioids are a class of drugs that are known for their use as pain relievers. They bind to opioid receptors on nerve cells in the brain and the nervous system to mitigate pain. Addiction is one of the chronic and primary adverse events of prolonged usage of opioids. They may also cause psychological disorders, muscle pain, depression, anxiety attacks, etc. In this study, we present a collection of predictive models to identify patients at risk of opioid abuse and mortality by using their prescription histories. We also discover particularly threatening drug-drug interactions in the context of opioid usage. Methods and Materials: Using a publicly available dataset from MIMIC-III, two models were trained, Logistic Regression with L2 regularization (baseline) and Extreme Gradient Boosting (enhanced model), to classify the patients of interest into two categories based on their susceptibility to opioid abuse. We also used K-Means clustering, an unsupervised algorithm, to explore drug-drug interactions that might be of concern. Results: The baseline model for classifying patients susceptible to opioid abuse has an F1 score of 76.64% (accuracy 77.16%), while the enhanced model has an F1 score of 94.45% (accuracy 94.35%). These models can be used as a preliminary step towards inferring the causal effect of opioid usage and can help monitor prescription practices to minimize opioid abuse. Discussion and Conclusion: Results suggest that the enhanced model provides a promising approach to the preemptive identification of patients at risk for opioid abuse. By discovering and correlating the patterns contributing to opioid overdose or abuse among a variety of patients, machine learning models can be used as an efficient tool to help uncover the existing gaps and/or fraudulent practices in prescription writing.
As an example of one such incidental finding, our study discovered that insulin might interact with opioids in an unfavourable way, leading to complications in diabetic patients. This indicates that diabetic patients under long-term opioid usage might need to take increased amounts of insulin to make it more effective. This observation supports prior research on a similar aspect. To increase the translational value of our work, the predictive models and the associated software code are made available under the MIT License.
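
The F1 score and accuracy reported in the Results can both be derived from the four cells of a binary confusion matrix. The sketch below shows the arithmetic; the counts are invented for illustration and do not reproduce the paper's figures.

```python
def classification_scores(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts (not from the study).
scores = classification_scores(tp=40, fp=10, fn=10, tn=40)
```

F1 ignores true negatives, so it can differ noticeably from accuracy when the classes are imbalanced, which is why abstracts often report both.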

Early Identification of Youth at Risk for Suicidal Behavior

Crisis ◽

10.1027/0227-5910/a000569 ◽

2019 ◽

Vol 40 (5) ◽

pp. 326-332

Author(s):

Ivonne Andrea Florez ◽

Devon LoParo ◽

Nakia Valentine ◽

Dorian A. Lamis

Keyword(s):

At Risk ◽

Early Identification ◽

Suicide Risk ◽

Psychiatric Symptoms ◽

Demographic Factors ◽

Health Centers ◽

Lack Of Information ◽

Youth At Risk ◽

Logistic Regressions ◽

Lethal Means

Abstract. Background: Early identification and appropriate referral services are priorities to prevent suicide. Aims: The aim of this study was to describe patterns of identification and referrals among three behavioral health centers and determine whether youth demographic factors and type of training received by providers were associated with identification and referral patterns. Method: The Early Identification Referral Forms were used to gather the data of interest among 820 youth aged 10–24 years who were screened for suicide risk (females = 53.8%). Descriptive statistics and binary logistic regressions were conducted to examine significant associations. Results: Significant associations between gender, race, and age and screening positive for suicide were found. Age and race were significantly associated with different patterns of referrals and/or services received by youths. For providers, being trained in Counseling on Access to Lethal Means was positively associated with number of referrals to inpatient services. Limitations: The correlational nature of the study and lack of information about suicide risk and comorbidity of psychiatric symptoms limit the implications of the findings. Conclusion: The results highlight the importance of considering demographic factors when identifying and referring youth at risk to ensure standard yet culturally appropriate procedures to prevent suicide.
