A Machine Learning Approach as an Aid for Early COVID-19 Detection

Roberto Martinez-Velazquez; Diana P. Tobón V.; Alejandro Sanchez; Abdulmotaleb El Saddik; Emil Petriu

doi:10.3390/s21124202

Sensors ◽

10.3390/s21124202 ◽

2021 ◽

Vol 21 (12) ◽

pp. 4202

Author(s):

Roberto Martinez-Velazquez ◽

Diana P. Tobón V. ◽

Alejandro Sanchez ◽

Abdulmotaleb El Saddik ◽

Emil Petriu

Keyword(s):

Machine Learning ◽

Operating Characteristic ◽

Area Under The Curve ◽

The Novel ◽

Sensitivity Score ◽

Machine Learning Approach ◽

Physical Interactions ◽

Learning Test ◽

Novel Coronavirus ◽

Specificity Score

The novel coronavirus SARS-CoV-2 that causes the disease COVID-19 has forced us to go into our homes and limit our physical interactions with others. Economies around the world have come to a halt, with non-essential businesses being forced to close in order to prevent further propagation of the virus. Developing countries are having more difficulties due to their lack of access to diagnostic resources. In this study, we present an approach for detecting COVID-19 infections exclusively on the basis of self-reported symptoms. Such an approach is of great interest because it is relatively inexpensive and easy to deploy at either an individual or population scale. Our best model delivers a sensitivity score of 0.752, a specificity score of 0.609, and an area under the curve for the receiver operating characteristic of 0.728. These are promising results that justify continuing research efforts towards a machine learning test for detecting COVID-19.
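
The three figures quoted in this abstract (sensitivity, specificity, and area under the ROC curve) can be computed from labels and model scores as in the following minimal sketch. The labels, scores, and the 0.5 decision threshold are made-up illustrations, not data or settings from the study, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only (assumption): 4 infected and 4 non-infected respondents.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.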

How Does the Novel Coronavirus Kill? A Machine Learning Approach

SSRN Electronic Journal ◽

10.2139/ssrn.3618304 ◽

2020 ◽

Author(s):

Logan Ryan ◽

Huaqin Pan ◽

Samson Mataraso ◽

Anna Lynn-Palevsky ◽

Emily Pellegrini ◽

...

Keyword(s):

Machine Learning ◽

Learning Approach ◽

The Novel ◽

Machine Learning Approach ◽

Novel Coronavirus

Effect of Temperature on the Transmission of COVID-19: A Machine Learning Case Study in Spain

10.1101/2020.05.01.20087759 ◽

2020 ◽

Cited By ~ 1

Author(s):

Amir Abdollahi ◽

Maryam Rahbaralam

Keyword(s):

Machine Learning ◽

Wind Speed ◽

Inverse Correlation ◽

Effect Of Temperature ◽

The Novel ◽

Autonomous Communities ◽

Infected People ◽

Machine Learning Approach ◽

Potential Factors ◽

Novel Coronavirus

The novel coronavirus (COVID-19) has already spread to almost every country in the world and has infected over 3 million people. To understand the transmission mechanism of this highly contagious virus, it is necessary to study the potential factors, including meteorological conditions. Here, we present a machine learning approach to study the effect of temperature, humidity and wind speed on the number of infected people in the three most populous autonomous communities in Spain. We find that there is a moderate inverse correlation between temperature and the daily number of infections. This correlation manifests for temperatures recorded up to 6 days before the onset, which corresponds well to the known mean incubation period of COVID-19. We also show that the correlation for humidity and wind speed is not significant.
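
The idea of correlating infections with temperatures recorded several days earlier can be illustrated with a lagged Pearson correlation. This is only a sketch of the general technique: the abstract does not say which estimator the authors used, and the series below are synthetic, constructed so that cases depend on the temperature three days earlier.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(temps, cases, lag):
    """Correlate the temperature on day t - lag with the cases on day t."""
    return pearson(temps[: len(temps) - lag], cases[lag:])


# Synthetic data (assumption): cases on day t fall linearly with the
# temperature on day t - 3; the first three days are arbitrary padding.
temps = [10, 12, 15, 11, 9, 14, 18, 20, 16, 13, 12, 17]
cases = [80, 80, 80] + [100 - 2 * t for t in temps[:9]]

correlations = {lag: lagged_correlation(temps, cases, lag) for lag in range(7)}
best_lag = min(correlations, key=correlations.get)  # most negative correlation
print(best_lag, round(correlations[best_lag], 3))
```

Scanning lags from 0 to 6 days mirrors the window mentioned in the abstract; the lag with the most negative coefficient is the one at which the inverse relationship is strongest.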

Analysis and Prediction of COVID-19 Using SIR, SEIQR, and Machine Learning Models: Australia, Italy, and UK Cases

Information ◽

10.3390/info12030109 ◽

2021 ◽

Vol 12 (3) ◽

pp. 109 ◽

Cited By ~ 1

Author(s):

Iman Rahimi ◽

Amir H. Gandomi ◽

Panagiotis G. Asteris ◽

Fang Chen

Keyword(s):

Machine Learning ◽

Logistic Function ◽

Prediction Performance ◽

Machine Learning Algorithms ◽

Model Parameters ◽

The Novel ◽

Chinese City ◽

Limited Memory ◽

Increasing Trend ◽

Novel Coronavirus

The novel coronavirus disease, also known as COVID-19, is a disease outbreak that was first identified in Wuhan, a Central Chinese city. In this report, a short analysis focusing on Australia, Italy, and UK is conducted. The analysis includes confirmed and recovered cases and deaths, the growth rate in Australia compared with that in Italy and UK, and the trend of the disease in different Australian regions. Mathematical approaches based on susceptible, infected, and recovered (SIR) cases and susceptible, exposed, infected, quarantined, and recovered (SEIQR) cases models are proposed to predict epidemiology in the above-mentioned countries. Since the performance of the classic forms of SIR and SEIQR depends on parameter settings, some optimization algorithms, namely Broyden–Fletcher–Goldfarb–Shanno (BFGS), conjugate gradients (CG), limited memory bound constrained BFGS (L-BFGS-B), and Nelder–Mead, are proposed to optimize the parameters and the predictive capabilities of the SIR and SEIQR models. The results of the optimized SIR and SEIQR models were compared with those of two well-known machine learning algorithms, i.e., the Prophet algorithm and logistic function. The results demonstrate the different behaviors of these algorithms in different countries as well as the better performance of the improved SIR and SEIQR models. Moreover, the Prophet algorithm was found to provide better prediction performance than the logistic function, as well as better prediction performance for Italy and UK cases than for Australian cases. Therefore, it seems that the Prophet algorithm is suitable for data with an increasing trend in the context of a pandemic. Optimization of SIR and SEIQR model parameters yielded a significant improvement in the prediction accuracy of the models. Despite the availability of several algorithms for trend predictions in this pandemic, there is no single algorithm that would be optimal for all cases.
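
To make the dependence of SIR predictions on parameter settings concrete, the sketch below integrates the standard SIR equations with a forward Euler step and recovers the transmission rate from a synthetic infection curve. A plain grid search stands in for the optimizers named in the abstract (BFGS, CG, L-BFGS-B, Nelder-Mead), so this is not the authors' procedure; all parameter values, the population size, and the function names are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the SIR model; returns daily (S, I, R) tuples."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


def fit_beta(observed_infected, gamma, s0, i0, r0, candidates):
    """Pick the candidate beta with the smallest squared error (grid search)."""
    days = len(observed_infected) - 1

    def sse(beta):
        sim = simulate_sir(beta, gamma, s0, i0, r0, days)
        return sum((row[1] - obs) ** 2 for row, obs in zip(sim, observed_infected))

    return min(candidates, key=sse)


# Synthetic "observations" generated with beta = 0.30 (assumed values throughout).
truth = simulate_sir(0.30, 0.10, 990, 10, 0, days=30)
observed = [row[1] for row in truth]
best_beta = fit_beta(observed, 0.10, 990, 10, 0, [0.20, 0.25, 0.30, 0.35, 0.40])
print(best_beta)
```

A gradient-based or simplex optimizer would replace the grid search when several parameters are fitted at once, which is the situation the abstract describes.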

Monitoring the Impact of Air Quality on the COVID-19 Fatalities in Delhi, India: Using Machine Learning Techniques

Disaster Medicine and Public Health Preparedness ◽

10.1017/dmp.2020.372 ◽

2020 ◽

pp. 1-8

Author(s):

Jasleen Kaur Sethi ◽

Mamta Mittal

Keyword(s):

Machine Learning ◽

Air Quality ◽

Air Pollutants ◽

Machine Learning Techniques ◽

Environmental Restoration ◽

The Novel ◽

Ozone Pollution ◽

Learning Techniques ◽

Novel Coronavirus ◽

The Impact

ABSTRACT Objective: The focus of this study is to monitor the effect of the coronavirus disease (COVID-19) lockdown on various air pollutants and to identify the pollutants that affect COVID-19 fatalities, so that measures to control them can be enforced. Methods: Several machine learning techniques (Decision Trees, Linear Regression, and Random Forest) were applied to correlate air pollutants with COVID-19 fatalities in Delhi. Furthermore, the concentrations of various air pollutants and the air quality index during the lockdown period are compared with those of the previous two years, 2018 and 2019. Results: The experiments show that the concentrations of ozone and toluene increased during the lockdown period. They also indicate that the pollutants that may affect COVID-19 mortality are ozone, NH3, NO2, and PM10. Conclusions: The lockdown imposed in response to the novel coronavirus has led to environmental restoration. However, measures to control ozone pollution are needed, as its concentration has increased significantly and it also affects the COVID-19 mortality rate.

Prediction of COVID-19 Severity Using Chest Computed Tomography and Laboratory Measurements: Evaluation Using a Machine Learning Approach (Preprint)

10.2196/preprints.21604 ◽

2020 ◽

Author(s):

Daowei Li ◽

Qiang Zhang ◽

Yue Tan ◽

Xinghuo Feng ◽

Yuanyi Yue ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Laboratory Tests ◽

Operating Characteristic ◽

Clinical Laboratory ◽

Severe Disease ◽

Ct Images ◽

Model Combining ◽

Machine Learning Model ◽

Machine Learning Approach

BACKGROUND Most of the mortality resulting from COVID-19 has been associated with severe disease. Effective treatment of severe cases remains a challenge due to the lack of early detection of the infection. OBJECTIVE This study aimed to develop an effective prediction model for COVID-19 severity by combining radiological outcome with clinical biochemical indexes. METHODS A total of 46 patients with COVID-19 (10 severe, 36 nonsevere) were examined. To build the prediction model, a set of 27 severe and 151 nonsevere clinical laboratory records and computerized tomography (CT) records were collected from these patients. We managed to extract specific features from the patients’ CT images by using a recently published convolutional neural network. We also trained a machine learning model combining these features with clinical laboratory results. RESULTS We present a prediction model combining patients’ radiological outcomes with their clinical biochemical indexes to identify severe COVID-19 cases. The prediction model yielded a cross-validated area under the receiver operating characteristic (AUROC) score of 0.93 and an F1 score of 0.89, which showed a 6% and 15% improvement, respectively, compared to the models based on laboratory test features only. In addition, we developed a statistical model for forecasting COVID-19 severity based on the results of patients’ laboratory tests performed before they were classified as severe cases; this model yielded an AUROC score of 0.81. CONCLUSIONS To our knowledge, this is the first report predicting the clinical progression of COVID-19, as well as forecasting severity, based on a combined analysis using laboratory tests and CT images.

Shape-based Machine Learning Models for the Potential Novel COVID-19 Protease Inhibitors Assisted by Molecular Dynamics Simulation

Current Topics in Medicinal Chemistry ◽

10.2174/1568026620666200704135327 ◽

2020 ◽

Vol 20 (24) ◽

pp. 2146-2167 ◽

Cited By ~ 1

Author(s):

Anuraj Nayarisseri ◽

Ravina Khandelwal ◽

Maddala Madhavi ◽

Chandrabose Selvaraj ◽

Umesh Panwar ◽

...

Keyword(s):

Machine Learning ◽

Molecular Dynamics ◽

Protease Inhibitors ◽

Dynamics Simulation ◽

The Novel ◽

Public Health Services ◽

Dynamic Simulations ◽

Disease Patterns ◽

Novel Coronavirus ◽

Dynamics Simulations

Background: The vast geographical expansion of novel coronavirus and an increasing number of COVID-19 affected cases have overwhelmed health and public health services. Artificial Intelligence (AI) and Machine Learning (ML) algorithms have extended their major role in tracking disease patterns, and in identifying possible treatments. Objective: This study aims to identify potential COVID-19 protease inhibitors through shape-based Machine Learning assisted by Molecular Docking and Molecular Dynamics simulations. Methods: 31 Repurposed compounds have been selected targeting the main coronavirus protease (6LU7) and a machine learning approach was employed to generate shape-based molecules starting from the 3D shape to the pharmacophoric features of their seed compound. Ligand-Receptor Docking was performed with Optimized Potential for Liquid Simulations (OPLS) algorithms to identify high-affinity compounds from the list of selected candidates for 6LU7, which were subjected to Molecular Dynamic Simulations followed by ADMET studies and other analyses. Results: Shape-based Machine learning reported remdesivir, valrubicin, aprepitant, and fulvestrant as the best therapeutic agents with the highest affinity for the target protein. Among the best shape-based compounds, a novel compound identified was not indexed in any chemical databases (PubChem, Zinc, or ChEMBL). Hence, the novel compound was named 'nCorv-EMBS'. Further, toxicity analysis showed nCorv-EMBS to be suitable for further consideration as the main protease inhibitor in COVID-19. Conclusion: Effective ACE-II, GAK, AAK1, and protease 3C blockers can serve as a novel therapeutic approach to block the binding and attachment of the main COVID-19 protease (PDB ID: 6LU7) to the host cell and thus inhibit the infection at AT2 receptors in the lung. The novel compound nCorv-EMBS herein proposed stands as a promising inhibitor to be evaluated further for COVID-19 treatment.

Prediction of COVID-19 Severity Using Chest Computed Tomography and Laboratory Measurements: Evaluation Using a Machine Learning Approach

JMIR Medical Informatics ◽

10.2196/21604 ◽

2020 ◽

Vol 8 (11) ◽

pp. e21604

Author(s):

Daowei Li ◽

Qiang Zhang ◽

Yue Tan ◽

Xinghuo Feng ◽

Yuanyi Yue ◽

...

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Laboratory Tests ◽

Operating Characteristic ◽

Clinical Laboratory ◽

Severe Disease ◽

Ct Images ◽

Model Combining ◽

Machine Learning Model ◽

Machine Learning Approach

Background Most of the mortality resulting from COVID-19 has been associated with severe disease. Effective treatment of severe cases remains a challenge due to the lack of early detection of the infection. Objective This study aimed to develop an effective prediction model for COVID-19 severity by combining radiological outcome with clinical biochemical indexes. Methods A total of 46 patients with COVID-19 (10 severe, 36 nonsevere) were examined. To build the prediction model, a set of 27 severe and 151 nonsevere clinical laboratory records and computerized tomography (CT) records were collected from these patients. We managed to extract specific features from the patients’ CT images by using a recently published convolutional neural network. We also trained a machine learning model combining these features with clinical laboratory results. Results We present a prediction model combining patients’ radiological outcomes with their clinical biochemical indexes to identify severe COVID-19 cases. The prediction model yielded a cross-validated area under the receiver operating characteristic (AUROC) score of 0.93 and an F1 score of 0.89, which showed a 6% and 15% improvement, respectively, compared to the models based on laboratory test features only. In addition, we developed a statistical model for forecasting COVID-19 severity based on the results of patients’ laboratory tests performed before they were classified as severe cases; this model yielded an AUROC score of 0.81. Conclusions To our knowledge, this is the first report predicting the clinical progression of COVID-19, as well as forecasting severity, based on a combined analysis using laboratory tests and CT images.

Use of Natural Language Processing to Improve Identification of Patients With Peripheral Artery Disease

Circulation Cardiovascular Interventions ◽

10.1161/circinterventions.120.009447 ◽

2020 ◽

Vol 13 (10) ◽

Cited By ~ 1

Author(s):

E. Hope Weissler ◽

Jikai Zhang ◽

Steven Lippmann ◽

Shelley Rusincovitch ◽

Ricardo Henao ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Operating Characteristic ◽

Structured Data ◽

Learning Approach ◽

Peripheral Artery ◽

Machine Learning Approach ◽

Artery Disease

Background: Peripheral artery disease (PAD) is underrecognized, undertreated, and understudied: each of these endeavors requires efficient and accurate identification of patients with PAD. Currently, PAD patient identification relies on diagnosis/procedure codes or lists of patients diagnosed or treated by specific providers in specific locations and ways. The goal of this research was to leverage natural language processing to more accurately identify patients with PAD in an electronic health record system compared with a structured data–based approach. Methods: The clinical notes from a cohort of 6861 patients in our health system whose PAD status had previously been adjudicated were used to train, test, and validate a natural language processing model using 10-fold cross-validation. The performance of this model was described using the area under the receiver operating characteristic and average precision curves; its performance was quantitatively compared with an administrative data–based least absolute shrinkage and selection operator (LASSO) approach using the DeLong test. Results: The median (SD) of the area under the receiver operating characteristic curve for the natural language processing model was 0.888 (0.009) versus 0.801 (0.017) for the LASSO-based approach alone (DeLong P <0.0001). The median (SD) of the area under the precision curve was 0.909 (0.008) versus 0.816 (0.012) for the structured data–based approach. When sensitivity was set at 90%, the precision for LASSO was 65% and the machine learning approach was 74%, while the specificity for LASSO was 41% and for the machine learning approach was 62%. Conclusions: Using a natural language processing approach in addition to partial cohort preprocessing with a LASSO-based model, we were able to meaningfully improve our ability to identify patients with PAD compared with an approach using structured data alone. This model has potential applications to both interventions targeted at improving patient care as well as efficient, large-scale PAD research. Graphic Abstract: A graphic abstract is available for this article.
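
Reporting precision and specificity "when sensitivity was set at 90%" amounts to choosing the highest score threshold that still reaches the target sensitivity and then reading off the other metrics at that threshold. The sketch below shows that operating-point selection on made-up scores; the data and the function name are hypothetical and unrelated to the study's cohort.

```python
def metrics_at_sensitivity(y_true, scores, target):
    """Return (threshold, sensitivity, precision, specificity) at the highest
    threshold whose sensitivity is at least `target`."""
    for thr in sorted(set(scores), reverse=True):
        pred = [1 if s >= thr else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
        tn = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 0)
        fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
        sensitivity = tp / (tp + fn)
        if sensitivity >= target:
            return thr, sensitivity, tp / (tp + fp), tn / (tn + fp)
    raise ValueError("target sensitivity is not reachable")


# Illustrative scores (assumption): 10 patients with PAD, 10 without.
pos_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.45, 0.2]
neg_scores = [0.55, 0.5, 0.4, 0.35, 0.3, 0.25, 0.15, 0.1, 0.05, 0.02]
y_true = [1] * len(pos_scores) + [0] * len(neg_scores)
scores = pos_scores + neg_scores

thr, sens, prec, spec = metrics_at_sensitivity(y_true, scores, target=0.9)
print(thr, sens, round(prec, 3), spec)
```

Fixing sensitivity first makes two models directly comparable: whichever has the higher precision and specificity at the same sensitivity produces fewer false alarms for the same share of detected patients.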

Prediction of Colon Cancer Stages and Survival Period with Machine Learning Approach

Cancers ◽

10.3390/cancers11122007 ◽

2019 ◽

Vol 11 (12) ◽

pp. 2007 ◽

Cited By ~ 5

Author(s):

Pushpanjali Gupta ◽

Sum-Fu Chiang ◽

Prasan Kumar Sahoo ◽

Suvendu Kumar Mohapatra ◽

Jeng-Fu You ◽

...

Keyword(s):

Machine Learning ◽

Colon Cancer ◽

Random Forest ◽

Area Under The Curve ◽

Disease Free Survival ◽

Tumor Stage ◽

Tnm Staging ◽

Chang Gung Memorial Hospital ◽

Aggression Score ◽

Machine Learning Approach

This study examines the use of machine learning (ML) in clinical research to predict the tumor stage of colon cancer in TNM (tumor, node, and metastasis) staging from the most influential histopathology parameters, and to predict the five-year disease-free survival (DFS) period. From the colorectal cancer (CRC) registry of Chang Gung Memorial Hospital, Linkou, Taiwan, 4021 patients were selected for the analysis. Various ML algorithms were applied to predict the tumor stage of colon cancer, with the Tumor Aggression Score (TAS) considered as a prognostic factor. The performance of the different ML algorithms was evaluated using five-fold cross-validation, an effective means of model validation. The accuracy achieved by the algorithms was determined both for standard TNM staging and for TNM staging with the Tumor Aggression Score. The Random Forest model achieved an F-measure of 0.89 when the Tumor Aggression Score was considered as an attribute along with the standard attributes normally used for TNM stage prediction. We also found that the Random Forest algorithm outperformed all other algorithms, with an accuracy of approximately 84% and an area under the curve (AUC) of 0.82 ± 0.10 for predicting the five-year DFS.

A Novel Ensemble Machine Learning Approach for Bioarchaeological Sex Prediction

Technologies ◽

10.3390/technologies9020023 ◽

2021 ◽

Vol 9 (2) ◽

pp. 23

Author(s):

Evan Muzzall

Keyword(s):

Machine Learning ◽

Missing Data ◽

Central Italy ◽

Area Under The Curve ◽

Machine Learning Algorithms ◽

Low Rank ◽

Future Research ◽

Learning Approach ◽

Machine Learning Approach ◽

Metric Distances

I present a novel machine learning approach to predict sex in the bioarchaeological record. Eighteen cranial interlandmark distances and five maxillary dental metric distances were recorded from n = 420 human skeletons from the necropolises at Alfedena (600–400 BCE) and Campovalano (750–200 BCE and 9th–11th centuries CE) in central Italy. A generalized low rank model (GLRM) was used to impute missing data, and the area under the receiver operating characteristic curve (AUC-ROC) with 20-fold stratified cross-validation was used to evaluate the predictive performance of eight machine learning algorithms on different subsets of the data. Additional perspectives such as this one show strong potential for sex prediction in bioarchaeological and forensic anthropological contexts. Furthermore, GLRMs have the potential to handle missing data in ways previously unexplored in the discipline. Although the results of this study look promising (highest AUC-ROC = 0.9722 for predicting binary male/female sex), the main limitation is that the sexes of the individuals included were not known but were estimated using standard macroscopic bioarchaeological methods. Future research should therefore apply this machine learning approach to known-sex reference samples in order to better understand its value, along with the more general contributions that machine learning can make to the reconstruction of past human lifeways.
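
Stratified cross-validation, as mentioned in this abstract, splits the sample into folds that each preserve the overall class proportions. The sketch below builds such folds by dealing the indices of each class round-robin; it omits shuffling for reproducibility, uses 4 folds instead of the study's 20 to keep the example small, and the labels are invented.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold keeps the class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Invented labels (assumption): 12 skeletons estimated female, 8 estimated male.
labels = ["F"] * 12 + ["M"] * 8
folds = stratified_folds(labels, k=4)

for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    # A model would be trained on `train` and scored on `test_fold` here.
    print(len(train), len(test_fold))
```

Each fold serves once as the held-out test set, and the per-fold scores (for example AUC-ROC) are then averaged to estimate predictive performance.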
