A Critical Review of Spatial Predictive Modeling Process in Environmental Sciences with Reproducible Examples in R

Jin Li

doi:10.3390/app9102048

Applied Sciences ◽

10.3390/app9102048 ◽

2019 ◽

Vol 9 (10) ◽

pp. 2048 ◽

Cited By ~ 4

Author(s):

Jin Li

Keyword(s):

Predictive Model ◽

Predictive Modeling ◽

Predictive Models ◽

Predictive Accuracy ◽

Hybrid Methods ◽

R Package ◽

Environmental Sciences ◽

Modeling Process ◽

Predictive Methods ◽

Management And Conservation

Spatial predictive methods are increasingly being used to generate predictions across various disciplines in environmental sciences. Accuracy of the predictions is critical as they form the basis for environmental management and conservation. Therefore, improving the accuracy by selecting an appropriate method and then developing the most accurate predictive model(s) is essential. However, it is challenging to select an appropriate method and find the most accurate predictive model for a given dataset due to many aspects and multiple factors involved in the modeling process. Many previous studies considered only a portion of these aspects and factors, often leading to sub-optimal or even misleading predictive models. This study evaluates a spatial predictive modeling process, and identifies nine major components for spatial predictive modeling. Each of these nine components is then reviewed, and guidelines for selecting and applying relevant components and developing accurate predictive models are provided. Finally, reproducible examples using spm, an R package, are provided to demonstrate how to select and develop predictive models using machine learning, geostatistics, and their hybrid methods according to predictive accuracy for spatial predictive modeling; reproducible examples are also provided to generate and visualize spatial predictions in environmental sciences.

Predicting skilled delivery service use in Ethiopia: dual application of logistic regression and machine learning algorithms

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0942-5 ◽

2019 ◽

Vol 19 (1) ◽

Cited By ~ 1

Author(s):

Brook Tesfaye ◽

Suleman Atique ◽

Tariq Azim ◽

Mihiretu M. Kebede

Keyword(s):

Logistic Regression ◽

Predictive Model ◽

Predictive Models ◽

Service Use ◽

Predictive Accuracy ◽

Maternal Deaths ◽

Skilled Delivery ◽

Skilled Assistance ◽

Delivery Assistance ◽

Delivery Services

Background: Skilled assistance during childbirth is essential to reduce maternal deaths. However, in Ethiopia, which is among the six countries contributing to more than half of global maternal deaths, the coverage of births attended by skilled health personnel remains very low. The aim of this study was to identify determinants and develop a predictive model for skilled delivery service use in Ethiopia by applying logistic regression and machine-learning techniques. Methods: Data from the 2016 Ethiopian Demographic and Health Survey (EDHS) were used for this study. Statistical Package for Social Sciences (SPSS) and Waikato Environment for Knowledge Analysis (WEKA) tools were used for logistic regression and model building, respectively. The classification algorithms J48, Naïve Bayes, Support Vector Machine (SVM), and Artificial Neural Network (ANN) were used for model development. The predictive models were validated using accuracy, sensitivity, specificity, and area under the Receiver Operating Characteristic (ROC) curve. Results: Only 27.7% of women received skilled delivery assistance in Ethiopia. First antenatal care (ANC) [AOR = 1.83, 95% CI (1.24–2.69)], birth order [AOR = 0.22, 95% CI (0.11–0.46)], television ownership [AOR = 6.83, 95% CI (2.52–18.52)], contraceptive use [AOR = 1.92, 95% CI (1.26–2.97)], cost needed for healthcare [AOR = 2.17, 95% CI (1.47–3.21)], age at first birth [AOR = 1.96, 95% CI (1.31–2.94)], and age at first sex [AOR = 2.72, 95% CI (1.55–4.76)] were determinants of utilizing skilled delivery services during childbirth. Predictive models were developed, and the J48 model had superior predictive accuracy (98%), sensitivity (96%), specificity (99%), and area under the ROC curve (98%). Conclusions: First ANC and contraceptive use were among the determinants of utilization of skilled delivery services. A predictive model was developed to forecast the likelihood of a pregnant woman seeking skilled delivery assistance; the model can therefore help to decide on targeted interventions for a pregnant woman to ensure skilled assistance at childbirth. The model developed through the J48 algorithm has better predictive accuracy. A web-based application can be built based on the results of this study.
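The validation measures named in the abstract above (accuracy, sensitivity, specificity) can be illustrated with a minimal Python sketch. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study, which reports using SPSS and WEKA rather than Python.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positives predicted correctly / missed
    tn/fp: negatives predicted correctly / wrongly flagged as positive
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }


# Hypothetical counts for illustration only (not EDHS data).
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting all three measures together matters when classes are imbalanced, as in a setting where only a minority of women receive skilled delivery assistance: a model can reach high accuracy while still missing most positive cases.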

Study of the route correction system for the on-board navigation system of an unmanned aerial vehicle on the grounds of radar images of the terrain

Automation. Modern Technologies ◽

10.36652/0869-4931-2020-74-3-129-134 ◽

2020 ◽

pp. 129-134

Keyword(s):

Predictive Model ◽

Unmanned Aerial Vehicle ◽

Predictive Models ◽

Navigation System ◽

Error Compensation ◽

Compensation Scheme ◽

Radar Images ◽

Aerial Vehicle ◽

System Errors ◽

Correction System

A route correction system for an unmanned aerial vehicle (UAV) is considered. The on-board radar complex is used for route correction. Under active interference, radar images cannot be used for route correction, so it is proposed to use the on-board navigation system with algorithmic correction. An error compensation scheme is applied to the output signal of the navigation system, using an algorithm that constructs a predictive model of the system errors. The predictive model is built using a genetic algorithm and the group method of data handling (GMDH, literally the "method of group accounting of arguments"). The quality of the algorithms for constructing predictive models is compared using mathematical modeling.

Assessing elderly’s functional balance and mobility via analyzing data from waist-mounted tri-axial wearable accelerometers in timed up and go tests

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01463-4 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Lisha Yu ◽

Yang Zhao ◽

Hailiang Wang ◽

Tien-Lung Sun ◽

Terrence E. Murphy ◽

...

Keyword(s):

Predictive Models ◽

Cross Validation ◽

Motor Coordination ◽

Predictive Accuracy ◽

Short Form ◽

Community Dwelling ◽

Regularized Regression ◽

Balance Assessment ◽

Timed Up And Go ◽

Functional Balance

Background: Poor balance has been cited as one of the key causal factors of falls. Timely detection of balance impairment can help identify the elderly prone to falls and also trigger early interventions to prevent them. The goal of this study was to develop a surrogate approach for assessing the elderly's functional balance based on the Short Form Berg Balance Scale (SFBBS) score. Methods: Data were collected from a waist-mounted tri-axial accelerometer while participants performed a timed up and go test. Clinically relevant variables were extracted from the segmented accelerometer signals for fitting SFBBS predictive models. Regularized regression together with random-shuffle-split cross-validation was used to facilitate the development of the predictive models for automatic balance estimation. Results: Eighty-five community-dwelling older adults (72.12 ± 6.99 years) participated in our study. Our results demonstrated that combined clinical and sensor-based variables, together with regularized regression and cross-validation, achieved moderate-to-high predictive accuracy for SFBBS scores (mean MAE = 2.01 and mean RMSE = 2.55). Step length, gender, gait speed, and linear acceleration variables describing motor coordination were identified as significant contributors to balance estimation. The predictive model also showed moderate-to-high discrimination in classifying the risk levels in the performance of three balance assessment motions, with AUC values of 0.72, 0.79, and 0.76, respectively. Conclusions: The study presented a feasible option for quantitatively accurate, objectively measured, and unobtrusively collected functional balance assessment at the point of care or in the home environment. It also provided clinicians and the elderly with stable and sensitive biomarkers for long-term monitoring of functional balance.
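The two error measures reported in the abstract above, mean absolute error (MAE) and root mean squared error (RMSE), can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up scores; it does not reproduce the study's regularized regression or its cross-validation scheme.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error: like MAE, but large errors weigh more."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical balance scores for illustration only (not study data).
observed = [20, 24, 18, 26]
predicted = [22, 23, 21, 26]

print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
```

RMSE is never smaller than MAE on the same predictions, so a gap between the two (as in 2.01 versus 2.55) indicates that some individual errors are noticeably larger than the typical one.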

Using Predictive Modeling and Supervised Machine Learning to Identify Patients at Risk for Venous Thromboembolism Following Posterior Lumbar Fusion

Global Spine Journal ◽

10.1177/21925682211019361 ◽

2021 ◽

pp. 219256822110193

Author(s):

Kevin Y. Wang ◽

Ijezie Ikwuezunma ◽

Varun Puvanesarajah ◽

Jacob Babu ◽

Adam Margalit ◽

...

Keyword(s):

Machine Learning ◽

At Risk ◽

Venous Thromboembolism ◽

Risk Stratification ◽

Predictive Model ◽

Predictive Modeling ◽

Lumbar Fusion ◽

Posterior Lumbar Fusion ◽

Single Level ◽

Patients At Risk

Study Design: Retrospective review. Objective: To use predictive modeling and machine learning to identify patients at risk for venous thromboembolism (VTE) following posterior lumbar fusion (PLF) for degenerative spinal pathology. Methods: Patients undergoing single-level PLF in the inpatient setting were identified in the National Surgical Quality Improvement Program database. Our outcome measure of VTE included all patients who experienced a pulmonary embolism and/or deep venous thrombosis within 30 days of surgery. Two different methodologies were used to identify VTE risk: 1) a novel predictive model derived from multivariable logistic regression of significant risk factors, and 2) a tree-based extreme gradient boosting (XGBoost) algorithm using preoperative variables. The methods were compared against legacy risk-stratification measures, ASA and the Charlson Comorbidity Index (CCI), using the area-under-the-curve (AUC) statistic. Results: 13,500 patients who underwent single-level PLF met the study criteria. Of these, 0.95% had a VTE within 30 days of surgery. The 5 clinical variables found to be significant in the multivariable predictive model were: age > 65, obesity grade II or above, coronary artery disease, functional status, and prolonged operative time. The predictive model exhibited an AUC of 0.716, which was significantly higher than the AUCs of ASA and CCI (all, P < 0.001), and comparable to that of the XGBoost algorithm (P > 0.05). Conclusion: Predictive analytics and machine learning can be leveraged to aid in identification of patients at risk of VTE following PLF. Surgeons and perioperative teams may find these tools useful to augment clinical decision-making and risk stratification.
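The AUC statistic used above to compare risk models can be read as the probability that a randomly chosen patient with the event receives a higher risk score than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are hypothetical, and the abstract does not state how the authors computed their AUCs.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the event, 0 for patients without
    scores: predicted risk, higher meaning more likely to have the event
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and risk scores for illustration only.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(f"AUC = {auc(labels, scores):.3f}")
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations; for a cohort of thousands a rank-based implementation from a statistics library would be the practical choice. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.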

O-203 Application of machine learning to predict aneuploidy and mosaicism in embryos from in vitro fertilization (IVF) cycles

Human Reproduction ◽

10.1093/humrep/deab128.014 ◽

2021 ◽

Vol 36 (Supplement_1) ◽

Author(s):

J A Ortiz ◽

R Morales ◽

B Lledo ◽

E Garcia-Hernandez ◽

A Cascales ◽

...

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Predictive Models ◽

Maternal Age ◽

The Other ◽

Predictor Variables ◽

Learning Models ◽

Male Factor ◽

Factors Associated ◽

Machine Learning Models

Study question: Is it possible to predict the likelihood of an IVF embryo being aneuploid and/or mosaic using a machine learning algorithm? Summary answer: There are paternal, maternal, embryonic and IVF-cycle factors that are associated with embryonic chromosomal status and can be used as predictors in machine learning models. What is known already: The factors associated with embryonic aneuploidy have been extensively studied. Mostly maternal age, and to a lesser extent male factor and ovarian stimulation, have been related to the occurrence of chromosomal alterations in the embryo. On the other hand, the main factors that may increase the incidence of embryo mosaicism have not yet been established. The models obtained using classical statistical methods to predict embryonic aneuploidy and mosaicism are not highly reliable. As an alternative to traditional methods, different machine and deep learning algorithms are being used to generate predictive models in different areas of medicine, including human reproduction. Study design, size, duration: The study design is observational and retrospective. A total of 4654 embryos from 1558 PGT-A cycles were included (January 2017 to December 2020). The trophectoderm biopsies on D5, D6 or D7 blastocysts were analysed by NGS. Embryos with ≤25% aneuploid cells were considered euploid, those with 25-50% were classified as mosaic, and those with >50% as aneuploid. The variables of the PGT-A were recorded in a database from which predictive models of embryonic aneuploidy and mosaicism were developed. Participants/materials, setting, methods: The main indications for PGT-A were advanced maternal age, abnormal sperm FISH and recurrent miscarriage or implantation failure. Embryo analyses were performed using Veriseq-NGS (Illumina). The software used to carry out all the analyses was R (RStudio). The library used to implement the different algorithms was caret. In the machine learning models, 22 predictor variables were introduced, which can be classified into 4 categories: maternal, paternal, embryonic and those specific to the IVF cycle. Main results and the role of chance: The different couple, embryo and stimulation cycle variables were recorded in a database (22 predictor variables). Two different predictive models were built, one for aneuploidy and the other for mosaicism. The predicted variable was multi-class, since it included the segmental and whole chromosome alteration categories. The data frame was first preprocessed and the different classes to be predicted were balanced. 80% of the data were used for training the model and 20% were reserved for further testing. The classification algorithms applied include multinomial regression, neural networks, support vector machines, neighborhood-based methods, classification trees, gradient boosting, ensemble methods, Bayesian and discriminant analysis-based methods. The algorithms were optimized by minimizing the log loss (Log_Loss), which measures accuracy while penalizing misclassifications. The best predictive models were achieved with the XG-Boost and random forest algorithms. The AUC of the predictive model for aneuploidy was 80.8% (Log_Loss: 1.028) and for mosaicism 84.1% (Log_Loss: 0.929). The best predictor variables of the models were maternal age, embryo quality, day of biopsy and whether or not the couple had a history of pregnancies with chromosomopathies. The male factor played a relevant role only in the mosaicism model, not in the aneuploidy model. Limitations, reasons for caution: Although the predictive models obtained can be very useful for estimating the probability of achieving euploid embryos in an IVF cycle, increasing the sample size and including additional variables could improve the models and thus increase their predictive capacity. Wider implications of the findings: Machine learning can be a very useful tool in reproductive medicine, since it can allow the determination of factors associated with embryonic aneuploidies and mosaicism in order to establish a predictive model for both. Identifying couples at risk of embryo aneuploidy/mosaicism could allow them to benefit from the use of PGT-A. Trial registration number: Not applicable.
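Two rules stated in the abstract above lend themselves to a short sketch: the thresholds used to label an embryo from its share of aneuploid cells, and the log loss that the algorithms were tuned to minimize. The code is a minimal Python illustration with hypothetical function names and made-up inputs; the study itself used R with the caret library, and treating exactly 25% as euploid and exactly 50% as mosaic is an assumption read from the "≤25%" and ">50%" wording.

```python
import math


def classify_embryo(aneuploid_fraction):
    """Label an embryo from its fraction of aneuploid cells (0.0 to 1.0).

    Thresholds follow the abstract: <=25% euploid, 25-50% mosaic, >50% aneuploid.
    """
    if aneuploid_fraction <= 0.25:
        return "euploid"
    if aneuploid_fraction <= 0.50:
        return "mosaic"
    return "aneuploid"


def log_loss(true_classes, predicted_probs, eps=1e-15):
    """Multi-class log loss: mean negative log-probability of the true class.

    true_classes: index of the correct class for each sample
    predicted_probs: one list of class probabilities per sample
    eps: clipping bound (an assumption) that avoids log(0)
    """
    total = 0.0
    for cls, probs in zip(true_classes, predicted_probs):
        p = min(max(probs[cls], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_classes)


# Hypothetical inputs for illustration only (not study data).
print(classify_embryo(0.10), classify_embryo(0.40), classify_embryo(0.80))
print(log_loss([0, 1], [[0.5, 0.5], [0.25, 0.75]]))
```

Unlike plain accuracy, log loss grows sharply when a model assigns a low probability to the true class, which is why it rewards well-calibrated probabilities rather than only correct labels.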

Predictive Modeling for Adverse Events and Risk Stratification Programs for People Receiving Cancer Treatment

JCO Oncology Practice ◽

10.1200/op.21.00198 ◽

2021 ◽

pp. OP.21.00198

Author(s):

Chelsea K. Osterman ◽

Hanna K. Sanoff ◽

William A. Wood ◽

Megan Fasold ◽

Jennifer Elston Lafata

Keyword(s):

Clinical Practice ◽

Risk Stratification ◽

Cancer Treatment ◽

Acute Care ◽

Predictive Modeling ◽

Predictive Models ◽

Emergency Department Visits ◽

Targeted Interventions ◽

Risk Algorithm ◽

Oncology Care

Emergency department visits and hospitalizations are common among people receiving cancer treatment, accounting for a large proportion of spending in oncology care and negatively affecting quality of life. As oncology care shifts toward value- and quality-based payment models, there is a need to develop interventions that can prevent these costly and low-value events among people receiving cancer treatment. Risk stratification programs have the potential to address this need and optimally would consist of three components: (1) a risk stratification algorithm that accurately identifies patients with modifiable risk(s), (2) intervention(s) that successfully reduce this risk, and (3) the ability to implement the risk algorithm and intervention(s) in an adaptable and sustainable way. Predictive modeling is a common method of risk stratification, and although a number of predictive models have been developed for use in oncology care, they have rarely been tested alongside corresponding interventions or developed with implementation in clinical practice as an explicit consideration. In this article, we review the available published predictive models for treatment-related toxicity or acute care events among people receiving cancer treatment and highlight challenges faced when attempting to use these models in practice. To move the field of risk-stratified oncology care forward, we argue that it is critical to evaluate predictive models alongside targeted interventions that address modifiable risks and to demonstrate that these two key components can be implemented within clinical practice to avoid unplanned acute care events among people receiving cancer treatment.

Real-world observational study of assessment of CHA2DS2-VASc, C2HEST and HAVOC scores for atrial fibrillation among patients with rheumatological disorders: a nationwide analysis

Postgraduate Medical Journal ◽

10.1136/postgradmedj-2021-140754 ◽

2021 ◽

pp. postgradmedj-2021-140754

Author(s):

Wei Syun Hu ◽

Cheng Li Lin

Keyword(s):

Atrial Fibrillation ◽

Retrospective Study ◽

Predictive Models ◽

Operating Characteristic ◽

Predictive Accuracy ◽

Scoring Systems ◽

Gray Model ◽

Discriminative Ability ◽

Discriminatory Ability ◽

Rheumatological Disease

Purpose: This is a nationwide retrospective study comparing three scoring systems (CHA2DS2-VASc, C2HEST and HAVOC scores) for the prediction of atrial fibrillation (AF) in patients with rheumatological disease. Methods: We used the Fine and Gray model to estimate the risk of AF (subhazard ratio and 95% CI). The predictive accuracy and discriminatory ability of the predictive models were evaluated using the receiver operating characteristic (ROC) curve. Results: Among the three predictive models, the model using the CHA2DS2-VASc score had the best discriminative ability, with an area under the ROC curve of 0.79. The model using the C2HEST score had an area under the ROC curve of 0.78, and the model using the HAVOC score had one of 0.77. Conclusion: We concluded that the CHA2DS2-VASc score performs better in predicting AF than the C2HEST score or the HAVOC score.

The effect of Kurtosis on the accuracy of artificial neural network predictive model

MATEC Web of Conferences ◽

10.1051/matecconf/201820402018 ◽

2018 ◽

Vol 204 ◽

pp. 02018

Author(s):

Aisyah Larasati ◽

Anik Dwiastutik ◽

Darin Ramadhanti ◽

Aal Mahardika

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Predictive Model ◽

Predictive Models ◽

Output Layer ◽

Input Layer ◽

Artificial Neural ◽

Misclassification Rates ◽

Hidden Layer ◽

Program Data

This study aims to explore the effect of the kurtosis level of the data in the output layer on the accuracy of artificial neural network predictive models. The artificial neural network predictive models comprise one node in the output layer and six nodes in the input layer. The number of hidden layers is determined automatically by the program. Data are generated using a simulation approach. The results show that the kurtosis level of the node in the output layer significantly affects the accuracy of the artificial neural network predictive model. Platykurtic and leptokurtic data have significantly higher misclassification rates than mesokurtic data. However, the misclassification rates of platykurtic and leptokurtic data are not significantly different. Thus, a data distribution with kurtosis close to zero results in a better ANN predictive model.
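The three kurtosis categories named in the abstract above can be made concrete with a short Python sketch. It assumes the abstract refers to excess kurtosis (kurtosis minus 3, so that a normal distribution scores zero), uses the simple population moment estimator, and introduces a hypothetical tolerance for what counts as "close to zero"; none of these choices is stated in the source.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment / variance^2, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2 - 3.0


def kurtosis_category(values, tolerance=0.5):
    """Label data as platykurtic, mesokurtic or leptokurtic.

    tolerance is a hypothetical cut-off for 'close to zero'.
    """
    k = excess_kurtosis(values)
    if k < -tolerance:
        return "platykurtic"    # flatter, lighter tails than normal
    if k > tolerance:
        return "leptokurtic"    # more peaked, heavier tails than normal
    return "mesokurtic"         # roughly normal-like tails


# Hypothetical samples for illustration only.
flat = [-1, 1, -1, 1]
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -3, 3]
print(kurtosis_category(flat), kurtosis_category(heavy_tailed))
```

Checking the kurtosis of a target variable before training is a cheap diagnostic under the abstract's finding, since both strongly flat and strongly heavy-tailed outputs were associated with higher misclassification rates.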

La misura dell'attendibilità dei modelli di previsione e l'unità del sapere

FUTURIBILI ◽

10.3280/fu2008-001006 ◽

2009 ◽

pp. 98-107

Author(s):

Luciana Bozzo

Keyword(s):

Urban Development ◽

Predictive Model ◽

Predictive Models ◽

Territorial Planning ◽

Private Citizens ◽

Development Plans ◽

Unity Of Knowledge ◽

Science Cafe

The reliability of predictive models is assured by the ability to establish a unity of knowledge, or rather of many branches of knowledge. This is the idea that leads the author to reflect on the prediction derived first of all from the "science café", defined as "a talking shop for scholars from a range of disciplines", who represent many branches of knowledge which are in fact a complete whole: "knowledge". The background for the predictive model discussed here is territorial planning, which encompasses an instrumental-explanatory component, a predictive component and an ideal. The construction of the predictive model and the degree of its reliability are produced by the process of unifying knowledge, and this confluence derives from the knowledge of geographers, biologists, chemists, engineers, architects, agronomists, sociologists and private citizens. General Urban Development Plans stand as the instrumental and predictive model in which a certain unification of knowledge, at least an operational one, is achieved.

Factors Associated With Unplanned Acute Care Services for Patients With Newly Diagnosed Hematologic Malignancies

JCO Clinical Cancer Informatics ◽

10.1200/cci.21.00110 ◽

2021 ◽

pp. 1197-1206

Author(s):

Kai Zu ◽

Kristina L. Greenwood ◽

Joyce C. LaMori ◽

Besa Smith ◽

Tyler Smith ◽

...

Keyword(s):

Acute Care ◽

Predictive Models ◽

Psychosocial Functioning ◽

Predictive Accuracy ◽

Care Utilization ◽

Medical Complication ◽

Quality Reporting ◽

Care Services ◽

Acute Care Services ◽

The Impact

PURPOSE This study evaluated risk factors predicting unplanned 30-day acute service utilization among adults subsequent to hospitalization for a new diagnosis of leukemia, lymphoma, or myeloma. This study explored the prevalence of medical complications (aligned with OP-35 measure specifications from the Centers for Medicare & Medicaid Services [CMS] Hospital Outpatient Quality Reporting Program) and the potential impact of psychosocial factors on unplanned acute care utilization. METHODS This study included 933 unique patients admitted to three acute care inpatient facilities within a nonprofit community-based health care system in southern California from 2012 to 2017. Integrated comprehensive data elements from electronic medical records and facility oncology registries were leveraged for univariate statistics, predictive models constructed using multivariable logistic regression, and further exploratory data mining, with predictive accuracy of the models measured with c-statistics. RESULTS The mean age of study participants was 65 years, and 55.1% were male. Specific diagnoses were lymphoma (48.7%), leukemia (35.2%), myeloma (14.0%), and mixed types (2.1%). Approximately one fifth of patients received unplanned acute care services within 30 days postdischarge, and over half of these patients presented with one or more symptoms associated with the CMS medical complication measure. The predictive models, with c-statistics of 0.7 and above for each type of hematologic malignancy, indicated good predictive quality and showed an impact of psychosocial functioning on the use of acute care services (P values < .05), including lack of a social work consult during initial admission (lymphoma or myeloma), history of counseling or use of psychotropic medications (lymphoma), and past substance use (myeloma). CONCLUSION This study provides insights into patient-related factors that may inform a proactive approach to improve health outcomes, such as enhanced care transition, monitoring, and support interventions.
