Mathematical Modelling and Prediction Tools for the COVID-19 Pandemic: A Review (Preprint)

Mapping Intimacies ◽

10.2196/preprints.30546 ◽

2021 ◽

Author(s):

Chin Kuan Ho ◽

Seng Huat Ong ◽

Kamarul Imran Musa ◽

Choo Yee Ting ◽

Chiung Ching Ho ◽

...

Keyword(s):

Phenomenological Models ◽

Drug Repurposing ◽

Quality Data ◽

Mechanistic Models ◽

Learning Approaches ◽

Agent Based ◽

Prediction Tools ◽

Ensure Patient Safety ◽

Empirical Perspective ◽

Interpretable Models

UNSTRUCTURED The latest threat to global health is the ongoing outbreak of the Coronavirus Disease 2019 (COVID-19). There are three main areas of modeling research, namely epidemiology, drug repurposing and vaccine design. The most important purpose of the models is to inform institutional and nationwide efforts to ensure patient safety. This study aimed to review COVID-19 modelling and prediction tools. Understanding these methods streamlines the strengths and limitations of each method. We researched the traditional model and the more current models that flourish during the pandemic. This understanding is the key to the proper use of specific models to achieve certain goals. Modeling approaches for COVID-19 can be very broadly categorized into phenomenological models and mechanistic models. Phenomenological approaches treat the modeling problem purely from an empirical perspective. From our survey, there are three major types of approaches under the phenomenological models: time-series analysis and forecasting, fractal-based models, and machine learning approaches. Mechanistic models consider the underlying mechanics of the epidemic. In this survey, compartmental models and agent-based models are categorized as mechanistic models. We studied 46 scientific articles (published between 22 February 2020 and 29 January 2021) that we think are representative of the scientific community’s approaches in modeling and prediction. We highlight the challenges and limitations of modelling approaches such as the need for high quality data, and interpretable models. Finally, we list the desired features for developing robust and reliable modelling and prediction tools.

Download Full-text

Potential 2019-nCoV 3C-like Protease Inhibitors Designed Using Generative Deep Learning Approaches

10.26434/chemrxiv.11829102.v1 ◽

2020 ◽

Cited By ~ 7

Author(s):

Alex Zhavoronkov ◽

Vladimir Aladinskiy ◽

Alexander Zhebrak ◽

Bogdan Zagribelnyy ◽

Victor Terentiev ◽

...

Keyword(s):

Homology Modelling ◽

Drug Repurposing ◽

Hiv Protease ◽

Learning Approaches ◽

Protein Targets ◽

Chemical Libraries ◽

Internal Resources ◽

Novel Coronavirus ◽

Approved Drugs ◽

Novel Drug

<div> <div> <div> <p>The emergence of the 2019 novel coronavirus (2019-nCoV), for which there is no vaccine or any known effective treatment created a sense of urgency for novel drug discovery approaches. One of the most important 2019-nCoV protein targets is the 3C-like protease for which the crystal structure is known. Most of the immediate efforts are focused on drug repurposing of known clinically-approved drugs and virtual screening for the molecules available from chemical libraries that may not work well. For example, the IC50 of lopinavir, an HIV protease inhibitor, against the 3C-like protease is approximately 50 micromolar. In an attempt to address this challenge, on January 28th, 2020 Insilico Medicine decided to utilize a part of its generative chemistry pipeline to design novel drug-like inhibitors of 2019-nCoV and started generation on January 30th. It utilized three of its previously validated generative chemistry approaches: crystal-derived pocked- based generator, homology modelling-based generation, and ligand-based generation. Novel druglike compounds generated using these approaches are being published at www.insilico.com/ncov-sprint/ and will be continuously updated. Several molecules will be synthesized and tested using the internal resources; however, the team is seeking collaborations to synthesize, test, and, if needed, optimize the published molecules. </p> </div> </div> </div>

Download Full-text

ReVac: a reverse vaccinology computational pipeline for prioritization of prokaryotic protein vaccine candidates

BMC Genomics ◽

10.1186/s12864-019-6195-y ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Adonis D’Mello ◽

Christian P. Ahearn ◽

Timothy F. Murphy ◽

Hervé Tettelin

Keyword(s):

Machine Learning ◽

Experimental Testing ◽

Negative Control ◽

Reverse Vaccinology ◽

Learning Approaches ◽

Protein Vaccine ◽

Vaccine Candidates ◽

Variable Expression ◽

Prediction Tools ◽

Potential Vaccine

Abstract Background Reverse vaccinology accelerates the discovery of potential vaccine candidates (PVCs) prior to experimental validation. Current programs typically use one bacterial proteome to identify PVCs through a filtering architecture using feature prediction programs or a machine learning approach. Filtering approaches may eliminate potential antigens based on limitations in the accuracy of prediction tools used. Machine learning approaches are heavily dependent on the selection of training datasets with experimentally validated antigens (positive control) and non-protective-antigens (negative control). The use of one or few bacterial proteomes does not assess PVC conservation among strains, an important feature of vaccine antigens. Results We present ReVac, which implements both a panoply of feature prediction programs without filtering out proteins, and scoring of candidates based on predictions made on curated positive and negative control PVCs datasets. ReVac surveys several genomes assessing protein conservation, as well as DNA and protein repeats, which may result in variable expression of PVCs. ReVac’s orthologous clustering of conserved genes, identifies core and dispensable genome components. This is useful for determining the degree of conservation of PVCs among the population of isolates for a given pathogen. Potential vaccine candidates are then prioritized based on conservation and overall feature-based scoring. We present the application of ReVac, applied to 69 Moraxella catarrhalis and 270 non-typeable Haemophilus influenzae genomes, prioritizing 64 and 29 proteins as PVCs, respectively. Conclusion ReVac’s use of a scoring scheme ranks PVCs for subsequent experimental testing. It employs a redundancy-based approach in its predictions of features using several prediction tools. The protein’s features are collated, and each protein is ranked based on the scoring scheme. Multi-genome analyses performed in ReVac allow for a comprehensive overview of PVCs from a pan-genome perspective, as an essential pre-requisite for any bacterial subunit vaccine design. ReVac prioritized PVCs of two human respiratory pathogens, identifying both novel and previously validated PVCs.

Download Full-text

Surrogate Models for Predicting the Performance of a Liquid-Liquid Cylindrical Cyclone Separator

Process Industries ◽

10.1115/imece2006-14759 ◽

2006 ◽

Author(s):

Jorge E. Pacheco ◽

Miguel A. Reyes

Keyword(s):

Flow Pattern ◽

Mean Squared Error ◽

Surrogate Models ◽

Mechanistic Models ◽

Water Mixture ◽

Inlet Flow ◽

Prediction Tools ◽

Squared Error ◽

Cylindrical Cyclone ◽

Oil Water

Liquid-Liquid Cylindrical Cyclone (LLCC) separators are devices used in the petroleum industry to extract a portion of the water from the oil-water mixture obtained at the well. The oil-water mixture entering the separator is divided due to centrifugal and buoyancy forces in an upper (oil rich) exit and a bottom (water rich) exit. The advantages in size and cost compared with traditional vessel type static separators are significant. The use of LLCC separators has not been widespread due to the lack of proven performance prediction tools. Mechanistic models have been developed over the years as tools for predicting the behavior of these separators. These mechanistic models are highly dependent on the inlet flow pattern prediction. Thus, for each specific inlet flow pattern a sub-model has to be developed. The use of surrogate models will result in prediction tools that are accurate over a wider range of operational conditions. We propose in this study to use surrogate models based on a minimum-mean-squared-error method of spatial prediction known as Kriging. Kriging models have been used in different applications ranging from structural optimization, conceptual design, multidisciplinary design optimization to mechanical and biomedical engineering. These models have been developed for deterministic data. They are targeted for applications where the available information is limited due to the cost of the experiments or the time consumed in numerical simulations. We propose to use these models with a different framework so that they can manage information from replications. For the LLCC separator a two-stage surrogate model is built based on the Bayesian surrogate multistage approach, which allows for data to be incorporated as the model is improved. Cross validation mean squared error measurements are analyzed and the model obtained shows good predicting capabilities. These surrogate models are efficient and versatile predicting tools that do not require information about the physical phenomena that drives the separation process.

Download Full-text

Face Image Quality Assessment: A Literature Survey

ACM Computing Surveys ◽

10.1145/3507901 ◽

2022 ◽

Author(s):

Torsten Schlett ◽

Christian Rathgeb ◽

Olaf Henniger ◽

Javier Galbally ◽

Julian Fierrez ◽

...

Keyword(s):

Deep Learning ◽

Image Quality ◽

Quality Assessment ◽

Image Quality Assessment ◽

Face Image ◽

Quality Data ◽

Learning Approaches ◽

The Face ◽

Future Work

The performance of face analysis and recognition systems depends on the quality of the acquired face data, which is influenced by numerous factors. Automatically assessing the quality of face data in terms of biometric utility can thus be useful to detect low-quality data and make decisions accordingly. This survey provides an overview of the face image quality assessment literature, which predominantly focuses on visible wavelength face image input. A trend towards deep learning based methods is observed, including notable conceptual differences among the recent approaches, such as the integration of quality assessment into face recognition models. Besides image selection, face image quality assessment can also be used in a variety of other application scenarios, which are discussed herein. Open issues and challenges are pointed out, i.a. highlighting the importance of comparability for algorithm evaluations, and the challenge for future work to create deep learning approaches that are interpretable in addition to providing accurate utility predictions.

Download Full-text

Identifying Protein Features Responsible for Improved Drug Repurposing Accuracies Using The CANDO Platform: Implications for Drug Design

10.20944/preprints201811.0429.v1 ◽

2018 ◽

Cited By ~ 2

Author(s):

William Mangione ◽

Ram Samudrala

Keyword(s):

Drug Design ◽

Drug Repurposing ◽

Biomedical Literature ◽

Learning Approaches ◽

Drug Candidates ◽

Protein Library ◽

Fold Reduction ◽

Ligand Interactions ◽

Novel Drug

Drug repurposing is a valuable tool for combating the slowing rates of novel therapeutic discovery. The Computational Analysis of Novel Drug Opportunities (CANDO) platform performs shotgun repurposing of 2030 indications/diseases using 3733 drugs/compounds to predict interactions with 46,784 proteins and relating them via proteomic interaction signatures. An accuracy is calculated by comparing interaction similarities of drugs approved for the same indications. We performed a unique subset analysis by breaking down the full protein library into smaller subsets and then recombining the best performing subsets into larger supersets. Up to 14% improvement in accuracy is seen upon benchmarking the supersets, representing a 100–1000 fold reduction in the number of proteins considered relative to the full library. Further analysis revealed that libraries comprised of proteins with more equitably diverse ligand interactions are important for describing compound behavior. Using one of these libraries to generate putative drug candidates against malaria results in more drugs that could be validated in the biomedical literature than the list suggested by the full protein library. Our work elucidates the role of particular protein subsets and corresponding ligand interactions that play a role in drug repurposing, with implications for drug design and machine learning approaches to improve the CANDO platform.

Download Full-text

Potential COVID-2019 3C-like Protease Inhibitors Designed Using Generative Deep Learning Approaches

10.26434/chemrxiv.11829102.v2 ◽

2020 ◽

Cited By ~ 8

Author(s):

Alex Zhavoronkov ◽

Vladimir Aladinskiy ◽

Alexander Zhebrak ◽

Bogdan Zagribelnyy ◽

Victor Terentiev ◽

...

Keyword(s):

Homology Modelling ◽

Drug Repurposing ◽

Hiv Protease ◽

Learning Approaches ◽

Protein Targets ◽

Chemical Libraries ◽

Internal Resources ◽

Novel Coronavirus ◽

Approved Drugs ◽

Novel Drug

<div> <div> <p>The emergence of the 2019 novel coronavirus (COVID-19), for which there is no vaccine or any known effective treatment created a sense of urgency for novel drug discovery approaches. One of the most important COVID-19 protein targets is the 3C-like protease for which the crystal structure is known. Most of the immediate efforts are focused on drug repurposing of known clinically-approved drugs and virtual screening for the molecules available from chemical libraries that may not work well. For example, the IC50 of lopinavir, an HIV protease inhibitor, against the 3C-like protease is approximately 50 micromolar, which is far from ideal. In an attempt to address this challenge, on January 28th, 2020 Insilico Medicine decided to utilize a part of its generative chemistry pipeline to design novel drug-like inhibitors of COVID-19 and started generation on January 30th. It utilized three of its previously validated generative chemistry approaches: crystal-derived pocked-based generator, homology modelling-based generation, and ligand-based generation. Novel druglike compounds generated using these approaches were published at <a href="http://www.insilico.com/ncov-sprint/">www.insilico.com/ncov-sprint/</a>. Several molecules will be synthesized and tested using the internal resources; however, the team is seeking collaborations to synthesize, test, and, if needed, optimize the published molecules. <br></p> </div> </div>

Download Full-text

Machine Learning Approaches for the Prioritization of Genomic Variants Impacting Pre-mRNA Splicing

10.20944/preprints201911.0085.v1 ◽

2019 ◽

Author(s):

Charles F Rowlands ◽

Diana Baralle ◽

Jamie M Ellingford

Keyword(s):

Machine Learning ◽

In Silico Analysis ◽

Mrna Splicing ◽

Comparative Approach ◽

Mendelian Disease ◽

Learning Approaches ◽

Basic Principles ◽

Prediction Tools ◽

Splicing Defects ◽

Made In

Defects in pre-mRNA splicing are frequently a cause of Mendelian disease. Despite the advent of next-generation sequencing, allowing a deeper insight into a patient’s variant landscape, the ability to characterize variants causing splicing defects has not progressed with the same speed. To address this, recent years have seen a sharp spike in the number of splice prediction tools leveraging machine learning approaches, leaving clinical geneticists with a plethora of choices for in silico analysis. In this Review, some basic principles of machine learning are introduced in the context of genomics and splicing analysis. A critical comparative approach is then used to describe seven recent machine learning-based splice prediction tools, revealing highly diverse approaches and common caveats. We find that, although great progress has been made in producing specific and sensitive tools, there is still much scope for personalized approaches to prediction of variant impact on splicing. Such approaches may increase diagnostic yields and underpin improvements to patient care.

Download Full-text

Machine Learning Approaches to Predict Peak Demand Days of Cardiovascular Admissions Considering Environmental Exposure

10.21203/rs.2.19636/v3 ◽

2020 ◽

Author(s):

Hang Qiu ◽

Lin Luo ◽

Ziqi Su ◽

Li Zhou ◽

Liya Wang ◽

...

Keyword(s):

Machine Learning ◽

Environmental Exposure ◽

Loss Function ◽

Ambient Air ◽

Quality Data ◽

Gradient Boosting ◽

Learning Approaches ◽

Learning Models ◽

Peak Demand ◽

Logarithmic Loss

Abstract Background: Accumulating evidence has linked environmental exposure, such as ambient air pollution and meteorological factors, to the development and severity of cardiovascular diseases (CVDs), resulting in increased healthcare demand. Effective prediction of demand for healthcare services, particularly those associated with peak events of CVDs, can be useful in optimizing the allocation of medical resources. However, few studies have attempted to adopt machine learning approaches with excellent predictive abilities to forecast the healthcare demand for CVDs. This study aims to develop and compare several machine learning models in predicting the peak demand days of CVDs admissions using the hospital admissions data, air quality data and meteorological data in Chengdu, China from 2015 to 2017.Methods: Six machine learning algorithms, including logistic regression (LR), support vector machine (SVM), artificial neural network (ANN), random forest (RF), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) were applied to build the predictive models with a unique feature set. The area under a receiver operating characteristic curve (AUC), logarithmic loss function, accuracy, sensitivity, specificity, precision, and F1 score were used to evaluate the predictive performances of the six models.Results: The LightGBM model exhibited the highest AUC (0.940, 95% CI: 0.900-0.980), which was significantly higher than that of LR (0.842, 95% CI: 0.783-0.901), SVM (0.834, 95% CI: 0.774-0.894) and ANN (0.890, 95% CI: 0.836-0.944), but did not differ significantly from that of RF (0.926, 95% CI: 0.879-0.974) and XGBoost (0.930, 95% CI: 0.878-0.982). In addition, the LightGBM has the optimal logarithmic loss function (0.218), accuracy (91.3%), specificity (94.1%), precision (0.695), and F1 score (0.725). Feature importance identification indicated that the contribution rate of meteorological conditions and air pollutants for the prediction was 32% and 43%, respectively.Conclusion: This study suggests that ensemble learning models, especially the LightGBM model, can be used to effectively predict the peak events of CVDs admissions, and therefore could be a very useful decision-making tool for medical resource management.

Download Full-text

Air quality data series estimation based on machine learning approaches for urban environments

Air Quality Atmosphere & Health ◽

10.1007/s11869-020-00925-4 ◽

2020 ◽

Author(s):

Alireza Rahimpour ◽

Jamil Amanollahi ◽

Chris G. Tzanis

Keyword(s):

Machine Learning ◽

Air Quality ◽

Urban Environments ◽

Quality Data ◽

Data Series ◽

Learning Approaches ◽

Series Estimation ◽

Air Quality Data

Download Full-text

Drug repurposing for COVID-19 using machine learning and mechanistic models of signal transduction circuits related to SARS-CoV-2 infection

Signal Transduction and Targeted Therapy ◽

10.1038/s41392-020-00417-y ◽

2020 ◽

Vol 5 (1) ◽

Author(s):

Carlos Loucera ◽

Marina Esteban-Medina ◽

Kinza Rian ◽

Matías M. Falco ◽

Joaquín Dopazo ◽

...

Keyword(s):

Machine Learning ◽

Signal Transduction ◽

Drug Repurposing ◽

Mechanistic Models

Download Full-text