Development of Models for Prompt Responses from Natural Disasters

Kichul Jung; Daeryong Park; Sangki Park

doi:10.3390/su12187803

Development of Models for Prompt Responses from Natural Disasters

Sustainability ◽

10.3390/su12187803 ◽

2020 ◽

Vol 12 (18) ◽

pp. 7803

Author(s):

Kichul Jung ◽

Daeryong Park ◽

Sangki Park

Keyword(s):

Natural Disasters ◽

Structural Damage ◽

Model Performance ◽

Gaussian Process Regression ◽

Principal Component ◽

Fragility Curve ◽

Support Vector ◽

Displacement Estimation ◽

Rapid Responses ◽

Story Drift

This study aims to provide an enhanced model for rapid response to natural disasters by estimating the maximum structural displacement. Linear regression, support vector machine, and Gaussian process regression (GPR) models were applied to obtain displacement estimates. In addition, normalization (NM) and standardization (SD) of variables and principal component analysis (PCA) were applied to improve model performance. The k-fold cross-validation approach was used to assess the model results based on the root-mean-square error and R-squared indices. According to the results, the GPR model with NM and SD tended to provide the best estimates among the three models. The model based on a PCA value of 97% yielded better displacement estimates than the models with PCA values of 95% and 100%. Based on the displacement estimates, the maximum inter-story drift ratio was used to produce a fragility curve that can be used for risk assessment. The fragility curve parameters obtained from the actual numerical and the predicted models were investigated and yielded similar responses. The proposed model can thus provide accurate and quick responses in disaster cases by rapidly predicting structural damage information.
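The two rescaling steps named in this abstract can be sketched as follows. This is a minimal illustration, assuming NM means min-max scaling to [0, 1] and SD means z-score scaling; the abstract does not define either term. The function names and the input values are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev

def normalize(values):
    """Min-max scaling: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant feature")
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]

# Illustrative feature values only (not data from the study).
feature = [0.1, 0.2, 0.4, 0.8]
feature_nm = normalize(feature)     # smallest value maps to 0, largest to 1
feature_sd = standardize(feature)   # zero mean, unit population std
```

Either transform puts features with different units on a comparable scale, which tends to matter for distance-based and kernel-based models such as SVM and GPR.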


Gaussian Process Regression-Based Structural Response Model and Its Application to Regional Damage Assessment

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10090574 ◽

2021 ◽

Vol 10 (9) ◽

pp. 574

Author(s):

Sangki Park ◽

Kichul Jung

Keyword(s):

Damage Assessment ◽

Structural Damage ◽

Seismic Waves ◽

Mean Squared Error ◽

Fragility Curves ◽

Model Performance ◽

Structural Response ◽

Gaussian Process Regression ◽

Ground Acceleration ◽

Maximum Displacement

Seismic activities are serious disasters that induce natural hazards resulting in an incalculable amount of damage to properties and millions of deaths. Typically, seismic risk assessment can be performed by means of structural damage information computed based on the maximum displacement of the structure. In this study, machine learning models based on GPR are developed to estimate the maximum displacement of structures under seismic activities and are then used to construct fragility curves as an application. During construction of the models, 13 features of seismic waves are considered, and six wave features are selected by correlation analysis to establish the seismic models, with the variables normalized by the peak ground acceleration. Two models for six-floor and 13-floor buildings are developed, and a sensitivity analysis is performed to identify the relationship between prediction accuracy and sampling size. A 10-fold cross-validation method is used to evaluate the model performance, using the R-squared, root mean squared error, Nash criterion, and mean bias. Results of the six-parameter-based model indicate a performance similar to that of the 13-parameter-based model for the two types of buildings. The model for the six-floor building affords a steadily enhanced performance with increasing sampling size, while the model for the 13-floor building shows a significantly improved performance with a sampling size of over 200. The results indicate that the taller structure requires a larger sampling size because it has more degrees of freedom that can influence the model performance. Finally, the proposed models are successfully constructed to estimate the maximum displacement and applied to obtain fragility curves with various performance levels. Then, the regional seismic damage is assessed in Gyeongju city of South Korea as an application of the developed models. The damage assessment with the fragility curve provides the structural response from the seismic activities, which can assist in minimizing damage.
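A fragility curve of the kind mentioned in this abstract can be sketched as follows. This assumes the widely used lognormal form, in which the probability of reaching a damage state at intensity IM is Phi(ln(IM / median) / beta); the abstract does not state which functional form the authors fitted. The function name `fragility` and the parameter values (median capacity 0.4 g, dispersion 0.5) are hypothetical.

```python
import math

def fragility(im, median, beta):
    """Probability of reaching a damage state at intensity measure `im`.

    Assumes the lognormal form Phi(ln(im / median) / beta), where Phi is
    the standard normal CDF. This form is an assumption, not the source's.
    """
    if im <= 0:
        return 0.0
    z = math.log(im / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical parameters, chosen only for illustration.
curve = [(pga, fragility(pga, median=0.4, beta=0.5))
         for pga in (0.1, 0.2, 0.4, 0.8)]
```

At an intensity equal to the median capacity the probability is 0.5 by construction, and a separate (median, beta) pair would be fitted for each performance level.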


Adaptable and Explainable Predictive Maintenance: Semi-Supervised Deep Learning for Anomaly Detection and Diagnosis in Press Machine Data

Applied Sciences ◽

10.3390/app11167376 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7376

Author(s):

Oscar Serradilla ◽

Ekhi Zugasti ◽

Julian Ramirez de Okariz ◽

Jon Rodriguez ◽

Urko Zurutuza

Keyword(s):

Anomaly Detection ◽

Null Space ◽

Model Performance ◽

Principal Component ◽

Predictive Maintenance ◽

Data Driven ◽

Support Vector ◽

Operational Conditions ◽

Detection Model ◽

Cluster Data

Predictive maintenance (PdM) has the potential to reduce industrial costs by anticipating failures and extending the work life of components. Nowadays, factories monitor their assets, and most of the collected data belong to correct working conditions. Therefore, semi-supervised data-driven models are relevant for enabling PdM by learning from assets’ data. However, the main challenges for their application in industry are achieving high accuracy in anomaly detection, diagnosing novel failures, and adapting to changing environmental and operational conditions (EOC). This article aims to tackle these challenges by experimenting with algorithms on press machine data from a production line. Initially, the performance of state-of-the-art and classic data-driven anomaly detection models is compared, including 2D autoencoder, null-space, principal component analysis (PCA), one-class support vector machine (OC-SVM), and extreme learning machine (ELM) algorithms. Then, diagnosis tools are developed based on the autoencoder’s latent-space feature vector, including clustering and projection algorithms that cluster data of synthetic failure types in a semi-supervised manner. In addition, explainable artificial intelligence techniques make it possible to trace the autoencoder’s loss back to the input data in order to detect anomalous signals. Finally, transfer learning is applied to adapt autoencoders to changing EOC data of the same process. The data-driven techniques used in this work can be adapted to address other industrial use cases, helping stakeholders gain trust and thus promoting the adoption of data-driven PdM systems in smart factories.


Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets

Briefings in Bioinformatics ◽

10.1093/bib/bbaa321 ◽

2020 ◽

Author(s):

Zhenxing Wu ◽

Minfeng Zhu ◽

Yu Kang ◽

Elaine Lai-Han Leung ◽

Tailong Lei ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Gaussian Process Regression ◽

Principal Component ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Data Sets ◽

Linear Svm

Abstract: Although a wide variety of machine learning (ML) algorithms have been utilized to learn quantitative structure–activity relationships (QSARs), there is no agreed single best algorithm for QSAR learning. Therefore, a comprehensive understanding of the performance characteristics of popular ML algorithms used in QSAR learning is highly desirable. In this study, five linear algorithms [linear function Gaussian process regression (linear-GPR), linear function support vector machine (linear-SVM), partial least squares regression (PLSR), multiple linear regression (MLR) and principal component regression (PCR)], three analogizers [radial basis function support vector machine (rbf-SVM), K-nearest neighbor (KNN) and radial basis function Gaussian process regression (rbf-GPR)], six symbolists [extreme gradient boosting (XGBoost), Cubist, random forest (RF), multivariate adaptive regression splines (MARS), gradient boosting machine (GBM), and classification and regression tree (CART)] and two connectionists [principal component analysis artificial neural network (pca-ANN) and deep neural network (DNN)] were employed to learn regression-based QSAR models for 14 public data sets comprising nine physicochemical properties and five toxicity endpoints. The results show that rbf-SVM, rbf-GPR, XGBoost and DNN generally perform better than the other algorithms. The overall performances of the algorithms can be ranked from best to worst as follows: rbf-SVM > XGBoost > rbf-GPR > Cubist > GBM > DNN > RF > pca-ANN > MARS > linear-GPR ≈ KNN > linear-SVM ≈ PLSR > CART ≈ PCR ≈ MLR. In terms of prediction accuracy and computational efficiency, SVM and XGBoost are recommended for regression learning on small data sets, and XGBoost is an excellent choice for large data sets. We then investigated the performance of ensemble models that integrate the predictions of multiple ML algorithms. The results illustrate that ensembles of two or three algorithms from different categories can indeed improve on the predictions of the best individual ML algorithms.
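The ensemble idea in the last two sentences can be sketched as follows. This is a toy illustration that assumes the predictions are combined by a plain element-wise average; the abstract does not say how the authors integrated them. The helper names and the numbers are hypothetical, and the two "models" are contrived so that their errors have opposite signs.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def average_ensemble(*predictions):
    """Element-wise mean of the predictions from several models."""
    return [sum(col) / len(col) for col in zip(*predictions)]

# Made-up targets and predictions (not results from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]   # overestimates every target by 0.5
model_b = [0.5, 1.5, 2.5, 3.5]   # underestimates every target by 0.5
ensemble = average_ensemble(model_a, model_b)

rmse_a = rmse(y_true, model_a)
rmse_ensemble = rmse(y_true, ensemble)
```

Averaging helps most when the member models make different kinds of errors, which is consistent with the abstract's point about combining algorithms from different categories.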


Soft sensor based on Gaussian process regression and its application in erythromycin fermentation process

Chemical Industry and Chemical Engineering Quarterly ◽

10.2298/ciceq150125026m ◽

2016 ◽

Vol 22 (2) ◽

pp. 127-135 ◽

Cited By ~ 4

Author(s):

Congli Mei ◽

Ming Yang ◽

Dongxin Shu ◽

Hui Jiang ◽

Guohai Liu ◽

...

Keyword(s):

Gaussian Process ◽

High Performance ◽

Fermentation Process ◽

Low Cost ◽

Gaussian Process Regression ◽

Principal Component ◽

Soft Sensor ◽

Support Vector ◽

Soft Sensors ◽

Uncertainty Measurement

The erythromycin fermentation process is a typical microbial fermentation process. Soft sensors can be used to estimate the biomass in the erythromycin fermentation process because of their relatively low cost, simple development, and ability to predict difficult-to-measure variables. However, traditional soft sensors, e.g., artificial neural network (ANN) and support vector machine (SVM) soft sensors, cannot represent the uncertainty (measurement precision) of their outputs, which causes difficulties in practice. Gaussian process regression (GPR) provides a novel framework for solving regression problems. The output of a GPR model follows a Gaussian distribution, expressed in terms of a mean and a variance. The mean represents the predicted output. The variance can be viewed as a measure of confidence in the predicted output, which distinguishes GPR from ANN and SVM soft sensor models. We propose a systematic approach based on GPR and principal component analysis (PCA) to establish a soft sensor that estimates the biomass in the erythromycin fermentation process. Simulations on industrial data from an erythromycin fermentation process show that the proposed GPR soft sensor performs well in modeling the uncertainty of its estimates.
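The mean-and-variance output described in this abstract can be sketched as follows. This is a minimal, dependency-free sketch that assumes a zero-mean Gaussian process with a squared-exponential kernel and fixed hyperparameters. The source does not specify a kernel, a practical GPR would fit the hyperparameters to data, and all names and numbers here are hypothetical.

```python
import math

def rbf(a, b, length=1.0, signal=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return signal * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gpr_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    alpha = solve(K, y_train)   # (K + noise*I)^-1 y
    v = solve(K, k_star)        # (K + noise*I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clip tiny negative round-off

# Made-up training data (not from the fermentation study).
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.0, 0.8, 0.9, 0.1]
mean_near, var_near = gpr_predict(x_train, y_train, 1.0)   # at a training point
mean_far, var_far = gpr_predict(x_train, y_train, 10.0)    # far from the data
```

Near a training point the variance collapses toward the noise level, while far from the data it returns to the prior variance. This is the confidence signal that ANN and SVM point predictions do not provide.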


QSAR Study of PARP Inhibitors by GA-MLR, GA-SVM and GA-ANN Approaches

Current Analytical Chemistry ◽

10.2174/1573411016999200518083359 ◽

2020 ◽

Vol 16 (8) ◽

pp. 1088-1105

Author(s):

Nafiseh Vahedi ◽

Majid Mohammadhosseini ◽

Mehdi Nekoei

Keyword(s):

Present Report ◽

Principal Component ◽

Parp Inhibitors ◽

Support Vector ◽

Ann Model ◽

Statistical Parameters ◽

Qsar Study ◽

Data Set ◽

Test Set ◽

Non Linear

Background: The poly(ADP-ribose) polymerases (PARP) are a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, efficient linear and non-linear methods, including multiple linear regression (MLR), support vector machine (SVM), and artificial neural networks (ANN), were used to develop quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set into training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset of descriptors that contribute most significantly to the overall inhibitory activity. Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation with an external test set, and Y-randomization (chance correlation) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that non-linear modeling approaches, including SVM and ANN, could provide much greater predictive capability. Conclusion: Among the constructed models, and in terms of root mean square error of prediction (RMSEP), cross-validation coefficients (Q2 LOO and Q2 LGO), as well as R2 and F-statistic values for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were better with the GA-ANN model.


Application of Machine Learning in Animal Disease Analysis and Prediction

Current Bioinformatics ◽

10.2174/1574893615999200728195613 ◽

2020 ◽

Vol 15 ◽

Author(s):

Shuwen Zhang ◽

Qiang Su ◽

Qin Chen

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Clustering Algorithm ◽

Principal Component ◽

Support Vector ◽

Animal Disease ◽

Human Beings ◽

Animal Diseases ◽

Disease Analysis

Abstract: Major animal diseases pose a great threat to animal husbandry and human beings. With the deepening of globalization and the abundance of data resources, the prediction and analysis of animal diseases using big data are becoming more and more important. The focus of machine learning is to make computers learn from data and use the learned experience to analyze and predict. This paper first introduces the animal epidemic situation and machine learning. It then briefly introduces the application of machine learning in animal disease analysis and prediction. Machine learning is mainly divided into supervised learning and unsupervised learning. Supervised learning includes support vector machines, naive Bayes, decision trees, random forests, logistic regression, artificial neural networks, deep learning, and AdaBoost. Unsupervised learning includes the expectation-maximization algorithm, principal component analysis, hierarchical clustering, and maxent. Through this discussion, readers can gain a clearer concept of machine learning and understand its application prospects in animal diseases.


Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods, including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values over the next 60 minutes using linear regression. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes in the future with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
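The step of converting continuous MAP predictions into binary AHE predictions, and then scoring them with recall and precision, can be sketched as follows. The 60 mmHg cutoff is a hypothetical placeholder because the abstract does not give the AHE definition that was used. The values are made up, and a single-sample threshold is a simplification of any clinical definition.

```python
def precision_recall(actual, predicted):
    """Precision and recall for boolean labels (True = event)."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

MAP_THRESHOLD = 60.0  # mmHg; hypothetical cutoff, not taken from the abstract

predicted_map = [72.0, 58.0, 55.0, 65.0, 59.0]    # made-up regression outputs
actual_event = [False, True, True, True, False]   # made-up ground truth
predicted_event = [m < MAP_THRESHOLD for m in predicted_map]
precision, recall = precision_recall(actual_event, predicted_event)
```

Moving the cutoff trades recall against precision, which is one way a regression output can be tuned toward the recall-heavy operating point the abstract reports.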


Modeling of Cutting Force in the Turning of AISI 4340 Using Gaussian Process Regression Algorithm

Applied Sciences ◽

10.3390/app11094055 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4055

Author(s):

Mahdi S. Alajmi ◽

Abdullah M. Almeshal

Keyword(s):

Gaussian Process ◽

Cutting Force ◽

Predictive Accuracy ◽

Gaussian Process Regression ◽

Machining Process ◽

Support Vector ◽

Process Data ◽

Cutting Force Prediction ◽

Artificial Neural Network Ann ◽

Aisi 4340

Machining process data can be utilized to predict cutting force and optimize process parameters. Cutting force is an essential parameter that has a significant impact on the metal turning process. In this study, a cutting force prediction model for turning AISI 4340 alloy steel was developed using Gaussian process regression (GPR), support vector machines (SVM), and artificial neural network (ANN) methods. The GPR simulations demonstrated a reliable prediction of surface roughness for the dry turning method with R² = 0.9843, MAPE = 5.12%, and RMSE = 1.86%. Performance comparisons between GPR, SVM, and ANN show that GPR is an effective method that can ensure high predictive accuracy of the cutting force in the turning of AISI 4340.


Possibility of Human Gender Recognition Using Raman Spectra of Teeth

Molecules ◽

10.3390/molecules26133983 ◽

2021 ◽

Vol 26 (13) ◽

pp. 3983

Author(s):

Ozren Gamulin ◽

Marko Škrabić ◽

Kristina Serec ◽

Matej Par ◽

Marija Baković ◽

...

Keyword(s):

Raman Spectra ◽

Principal Component ◽

Support Vector ◽

Gender Recognition ◽

Proof Of Concept ◽

Male And Female ◽

Tooth Type ◽

Tooth Apex ◽

The Difference

Gender determination of human remains can be very challenging, especially when the remains are incomplete. Herein, we report a proof-of-concept experiment in which the possibility of gender recognition using Raman spectroscopy of teeth is investigated. Raman spectra were recorded from male and female molars and premolars at two distinct sites, the tooth apex and the anatomical neck. Recorded spectra were sorted into suitable datasets and initially analyzed with principal component analysis, which showed a distinction between spectra of male and female teeth. Then, reduced datasets with scores of the first 20 principal components were formed, and two classification algorithms, support vector machine and artificial neural networks, were applied to form classification models for gender recognition. The obtained results showed that gender recognition with Raman spectra of teeth is possible but strongly depends on both the tooth type and the spectrum recording site. The differences in classification accuracy between tooth types and recording sites are discussed in terms of differences in molecular structure caused by the influence of masticatory loading or gender-dependent life events.


A Methodology Based on FT-IR Data Combined with Random Forest Model to Generate Spectralprints for the Characterization of High-Quality Vinegars

Foods ◽

10.3390/foods10061411 ◽

2021 ◽

Vol 10 (6) ◽

pp. 1411

Author(s):

José Luis P. Calle ◽

Marta Ferreiro-González ◽

Ana Ruiz-Rodríguez ◽

Gerardo F. Barbero ◽

José Á. Álvarez ◽

...

Keyword(s):

Random Forest ◽

Raw Materials ◽

Principal Component ◽

Hierarchical Cluster ◽

Raw Material ◽

Support Vector ◽

Protected Designation Of Origin ◽

Ft Ir

Sherry wine vinegar is a Spanish gourmet product under Protected Designation of Origin (PDO). Before a vinegar can be labeled as Sherry vinegar, the product must meet certain requirements as established by its PDO, which, in this case, means that it has been produced following the traditional solera and criadera ageing system. The quality of the vinegar is determined by many factors, such as the raw material, the acetification process, or the ageing system. For this reason, producers in particular, but also consumers, would benefit from effective analytical tools that can precisely determine the origin and quality of vinegar. In the present study, a total of 48 Sherry vinegar samples manufactured from three different starting wines (Palomino Fino, Moscatel, and Pedro Ximénez wine) were analyzed by Fourier-transform infrared (FT-IR) spectroscopy. The spectroscopic data were combined with unsupervised exploratory techniques such as hierarchical cluster analysis (HCA) and principal component analysis (PCA), as well as other nonparametric supervised techniques, namely, support vector machine (SVM) and random forest (RF), for the characterization of the samples. The HCA and PCA results present a clear grouping trend of the vinegar samples according to their raw materials. SVM in combination with leave-one-out cross-validation (LOOCV) successfully classified 100% of the samples according to the type of wine used for their production. The RF method allowed selecting the most important variables to develop the characteristic fingerprint (“spectralprint”) of the vinegar samples according to their starting wine. Furthermore, the RF model reached 100% accuracy for both LOOCV and out-of-bag (OOB) sets.
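The leave-one-out cross-validation (LOOCV) procedure mentioned in this abstract can be sketched as follows. To keep the example dependency-free, a 1-nearest-neighbor classifier stands in for the SVM used in the study, so this illustrates the validation loop only and not the authors' model. The two-feature samples and the class labels are made up.

```python
def nearest_label(train, query):
    """Label of the training sample closest to `query` (squared Euclidean)."""
    def dist(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], query))
    return min(train, key=dist)[1]

def loocv_accuracy(samples):
    """Hold out each sample once, predict it from the rest, return accuracy."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]   # every sample except the i-th
        hits += nearest_label(rest, features) == label
    return hits / len(samples)

# Made-up two-feature "spectra" for two hypothetical classes.
samples = [
    ((0.10, 0.20), "A"), ((0.15, 0.25), "A"), ((0.12, 0.18), "A"),
    ((0.90, 0.80), "B"), ((0.85, 0.75), "B"), ((0.88, 0.82), "B"),
]
accuracy = loocv_accuracy(samples)
```

LOOCV suits small sample sets such as the 48 vinegars here, because every sample is used for testing exactly once while the rest are used for training.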
