A Novel Methodology for Hydrocarbon Depth Prediction in Seabed Logging: Gaussian Process-Based Inverse Modeling of Electromagnetic Data

Hanita Daud; Muhammad Naeim Mohd Aris; Khairul Arifin Mohd Noh; Sarat Chandra Dass

doi:10.3390/app11041492

A Novel Methodology for Hydrocarbon Depth Prediction in Seabed Logging: Gaussian Process-Based Inverse Modeling of Electromagnetic Data

Applied Sciences ◽

10.3390/app11041492 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1492

Author(s):

Hanita Daud ◽

Muhammad Naeim Mohd Aris ◽

Khairul Arifin Mohd Noh ◽

Sarat Chandra Dass

Keyword(s):

Gaussian Process ◽

Inverse Modeling ◽

Mean Squared Error ◽

Computational Effort ◽

Percentage Error ◽

Depth Prediction ◽

Receiver System ◽

Realistic Representation ◽

And Performance ◽

GP Model

Seabed logging (SBL) is an application of electromagnetic (EM) waves, reliant on a source–receiver system, for detecting potential marine hydrocarbon-saturated reservoirs. One concern in the modeling and inversion of EM data is the need for a realistic representation of complex geo-electrical models. At the same time, the corresponding forward modeling algorithms should be robust and efficient, with low computational effort, because the inversion calls them repeatedly. This work proposes a new inversion methodology consisting of two frameworks: a Gaussian process (GP), which allows greater flexibility in modeling a variety of EM responses, and gradient descent (GD), which finds the best minimizer (i.e., the hydrocarbon depth). Computer simulation technology (CST), which uses the finite element (FE) method, was used to generate prior EM responses so that the GP could evaluate EM profiles at “untried” depths. GD was then used to minimize the mean squared error (MSE), with the GP acting as the forward model. Because acquiring EM responses using mesh-based algorithms is time-consuming, this work compared the time taken by CST and the GP to evaluate the EM profiles. To assess accuracy and performance, the GP model was compared with EM responses modeled by FE, and the percentage error between the estimate and the “untried” computer input was calculated. The results indicate that GP-based inverse modeling can efficiently predict the hydrocarbon depth in SBL.
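
The inversion loop described in this abstract (a GP trained on simulated responses, then gradient descent on the MSE with the GP as forward model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_em_response` is a hypothetical smooth function standing in for simulator output, not a physical EM model; the kernel, length scale, jitter, learning rate and finite-difference gradient are arbitrary choices and are not taken from the paper.

```python
import numpy as np


def toy_em_response(offsets, depth):
    """Hypothetical smooth stand-in for simulator output (not a physical EM model)."""
    return np.exp(-offsets / (1.0 + depth))


def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class GPForwardModel:
    """Zero-mean GP regressor over (offset, depth) pairs, fixed hyperparameters."""

    def __init__(self, X, y, length_scale=1.0, jitter=1e-6):
        self.X = X
        self.length_scale = length_scale
        K = rbf_kernel(X, X, length_scale) + jitter * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)  # weights of the posterior mean

    def predict(self, offsets, depth):
        """Posterior-mean response at every offset for a single depth."""
        X_new = np.column_stack([offsets, np.full_like(offsets, depth)])
        return rbf_kernel(X_new, self.X, self.length_scale) @ self.alpha


def mse(gp, offsets, observed, depth):
    """Mean squared error between the GP profile at `depth` and the observation."""
    return float(np.mean((gp.predict(offsets, depth) - observed) ** 2))


def invert_depth(gp, offsets, observed, start, lr=10.0, steps=300, h=1e-3):
    """Gradient descent on MSE(depth) using a central finite-difference gradient."""
    depth = start
    for _ in range(steps):
        grad = (
            mse(gp, offsets, observed, depth + h)
            - mse(gp, offsets, observed, depth - h)
        ) / (2.0 * h)
        depth -= lr * grad
    return depth


# "Prior" responses at a few tried depths (the role the simulator plays above).
offsets = np.linspace(0.0, 3.0, 8)
train_depths = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
X_train = np.array([[x, d] for d in train_depths for x in offsets])
y_train = toy_em_response(X_train[:, 0], X_train[:, 1])
gp = GPForwardModel(X_train, y_train)

# Observation generated at an "untried" depth, then recovered by inversion.
true_depth = 1.25
observed = toy_em_response(offsets, true_depth)
estimated_depth = invert_depth(gp, offsets, observed, start=2.0)
percentage_error = 100.0 * abs(estimated_depth - true_depth) / true_depth
print(estimated_depth, percentage_error)
```

Once trained, each GP evaluation costs one small matrix-vector product, which is why a surrogate of this kind is attractive when the forward model has to be called at every descent step.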

Advanced Prediction of Roadway Broken Rock Zone Based on a Novel Hybrid Soft Computing Model Using Gaussian Process and Particle Swarm Optimization

Applied Sciences ◽

10.3390/app10176031 ◽

2020 ◽

Vol 10 (17) ◽

pp. 6031

Author(s):

Zhi Yu ◽

Xiuzhi Shi ◽

Jian Zhou ◽

Rendong Huang ◽

Yonggang Gou

Keyword(s):

Particle Swarm Optimization ◽

Gaussian Process ◽

Evaluation Method ◽

Mean Squared Error ◽

Particle Swarm ◽

Predictive Performance ◽

Support Design ◽

Swarm Optimization ◽

Roadway Stability ◽

GP Model

A simple and accurate method for evaluating broken rock zone thickness (BRZT), which is commonly used to describe the broken rock zone (BRZ), is valuable because it provides a reference for roadway stability evaluation and support design. To relate various geological variables to BRZT, multiple linear regression (MLR), an artificial neural network (ANN), a Gaussian process (GP) and a particle swarm optimization (PSO)-GP method were used, and the corresponding intelligence models were developed from a database collected from various mines in China. Four variables, namely embedding depth (ED), drift span (DS), surrounding rock mass strength (RMS) and joint index (JI), were selected as inputs to train the models, with BRZT as the output variable, and k-fold cross-validation was applied during training. After training, three validation metrics, variance accounted for (VAF), the coefficient of determination (R²) and root mean squared error (RMSE), were used to describe the predictive performance of the developed models. A ranking-based comparison shows that the PSO-GP model provides the best predictive performance in estimating BRZT. In addition, the collected variables can be ranked by their sensitivity effect on BRZT as JI, ED, DS and RMS, with JI the most sensitive factor.
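
For reference, the three validation metrics named in this abstract are commonly defined as follows. The abstract itself does not state the formulas, so these are the usual textbook forms and are given here as an assumption, with $y_i$ the measured BRZT, $\hat{y}_i$ the predicted BRZT, $\bar{y}$ the mean of the measurements and $n$ the number of samples:

```latex
\mathrm{VAF} = \left(1 - \frac{\operatorname{var}(y - \hat{y})}{\operatorname{var}(y)}\right) \times 100\%,
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```

Under these definitions a perfect model gives VAF = 100%, R² = 1 and RMSE = 0.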

How to improve infectious disease prediction by integrating environmental data: an application of a novel ensemble analysis strategy to predict HFMD

Epidemiology and Infection ◽

10.1017/s0950268821000091 ◽

2021 ◽

Vol 149 ◽

Author(s):

Junwen Tao ◽

Yue Ma ◽

Xuefei Zhuang ◽

Qiang Lv ◽

Yaqiong Liu ◽

...

Keyword(s):

Variable Selection ◽

Mean Squared Error ◽

Moving Average ◽

Dynamic Bayesian Networks ◽

Environmental Data ◽

Control Measures ◽

Coefficient Of Determination ◽

Percentage Error ◽

Analysis Strategy ◽

Ensemble Analysis

This study proposed a novel ensemble analysis strategy to improve hand, foot and mouth disease (HFMD) prediction by integrating environmental data. The approach began by establishing a vector autoregressive (VAR) model. Then, a dynamic Bayesian network (DBN) model was used for variable selection of environmental factors. Finally, a VAR model with constraints (CVAR) was established for predicting the incidence of HFMD in Chengdu city from 2011 to 2017. The DBN showed that temperature was related to HFMD at lags 1 and 2. Humidity, wind speed, sunshine, PM10, SO2 and NO2 were related to HFMD at lag 2. Compared with the autoregressive integrated moving average model with external variables (ARIMAX), the CVAR model had a higher coefficient of determination (R², average difference: +2.11%; t = 6.2051, P = 0.0003 < 0.05), a lower root mean squared error (−24.88%; t = −5.2898, P = 0.0007 < 0.05) and a lower mean absolute percentage error (−16.69%; t = −4.3647, P = 0.0024 < 0.05). The accuracy of predicting the time-series shape was 88.16% for the CVAR model and 86.41% for ARIMAX. The CVAR model performed better in terms of variable selection, model interpretation and prediction. Therefore, it could be used by health authorities to identify potential HFMD outbreaks and develop disease control measures.

Influence of Dextransucrase of Weissella Cibaria Nitcsk4 on Low Molecular Weight Dextran Yield: a Statistical Approach using Mixed Level Taguchi Design and Artificial Neural Network

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1161.1292s219 ◽

2019 ◽

Vol 9 (2S2) ◽

pp. 657-664

Keyword(s):

Neural Network ◽

Molecular Weight ◽

Mean Squared Error ◽

Sucrose Concentration ◽

Average Molecular Weight ◽

Taguchi Design ◽

Percentage Error ◽

Low Molecular Weight ◽

Weight Average Molecular Weight ◽

Weissella Cibaria

In the present study, the influence of dextransucrase of Weissella cibaria NITCSK4 (DSWc4), sucrose concentration, and reaction temperature on the yield of low molecular weight dextran (LMWD-DexWc4) was investigated using a mixed level Taguchi design and a back propagation neural network (BPNN). A BPNN model with three neurons in a hidden layer generated a low mean squared error (MSE). The determination coefficients (R² values) for the ANN and Taguchi models were 0.991 and 0.998, respectively. Considering the absolute average deviation (AAD) and MSE, the Taguchi model is more adequate. Among the three factors, the percentage yield of low molecular weight dextran depends most consistently on the sucrose concentration. The study suggested that a low sucrose concentration (3% w/v), DSWc4 (0.25 IU/ml) and a slightly high temperature (35°C) favored the production of LMWD-DexWc4 (91.639%). LMW-DexWc4 produced by DSWc4 at the optimized conditions was analyzed. The weight average molecular weight of LMW-DexWc4, calculated using the M-H expression, was found to be 85775 (≈90 kDa). The relative percentage error between the number and weight average molecular weights was found to be low (4.42%). The polydispersity (PD) index of the LMW-DexWc4 was found to be 0.9576, which is close to 1. The PD value indicated that the molecular weight distribution of the dextran was narrowly dispersed.

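Forecasting the Beef Cattle Population in South Kalimantan Using the Moving Average, Exponential Smoothing and Trend Analysis Methods (original title: Peramalan Jumlah Populasi Sapi Potong di Kalimantan Selatan Menggunakan Metode Moving Average, Exponential Smoothing dan Trend Analysis)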

Jurnal Teknologi Agro-Industri ◽

10.34128/jtai.v6i1.88 ◽

2019 ◽

Vol 6 (1) ◽

pp. 41

Author(s):

Jaka Darma Jaya

Keyword(s):

Trend Analysis ◽

Mean Squared Error ◽

Moving Average ◽

Exponential Smoothing ◽

Mean Absolute Percentage Error ◽

Percentage Error ◽

Absolute Percentage Error ◽

Polynomial Trend ◽

Squared Error

Beef production in Indonesia has generally tended to increase over the past 30 years. Domestic supply still cannot meet Indonesia's demand for beef, so beef has to be imported from abroad. A study projecting the future availability of the beef cattle population is needed so that appropriate policies can be adopted to maintain the stability and adequacy of the national meat supply. This study aims to forecast the beef cattle population using three forecasting methods: moving average, exponential smoothing and trend analysis. The accuracy of the forecasts was then measured using MAD (Mean Absolute Deviation), MSE (Mean Squared Error) and MAPE (Mean Absolute Percentage Error). The projected beef cattle populations for 2019 (the next period) under the three forecasting methods are 195,100 (moving average), 218,225 (exponential smoothing) and 262,899 (trend analysis). The accuracy measurements using MAD, MSE and MAPE show that the most accurate method for forecasting the beef cattle population is polynomial trend analysis (MAD 14,716.12; MSE 327,282,084.17; MAPE 0.09), because it has a smaller error than the forecasts from the moving average and exponential smoothing methods.
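
The three accuracy measures used in this abstract (MAD, MSE and MAPE) can be computed as in the short sketch below. The function name `forecast_errors` and the sample numbers are hypothetical and are not data from the study; MAPE is returned here as a fraction rather than a percentage, which is an assumption about the convention behind the reported value of 0.09.

```python
def forecast_errors(actual, forecast):
    """Return (MAD, MSE, MAPE) for paired actual and forecast values.

    MAPE is returned as a fraction (0.05 means 5%) and requires every
    actual value to be non-zero.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n                 # mean absolute deviation
    mse = sum(e * e for e in errors) / n                  # mean squared error
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mad, mse, mape


# Hypothetical illustration values, not figures from the paper.
mad, mse, mape = forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(mad, mse, mape)
```

Because MSE squares each error, it penalizes a few large misses more heavily than MAD does, so the two measures can rank forecasting methods differently on the same series.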

RobOMP: Robust variants of Orthogonal Matching Pursuit for sparse representations

10.7287/peerj.preprints.27482v1 ◽

2019 ◽

Author(s):

Carlos A Loza

Keyword(s):

Mean Squared Error ◽

Matching Pursuit ◽

Synthetic Data ◽

Optimal Solution ◽

Parameter Tuning ◽

Weight Vector ◽

Orthogonal Matching Pursuit ◽

Main Mode ◽

Observation Matrix ◽

And Performance

Sparse coding aims to find a parsimonious representation of an example given an observation matrix or dictionary. In this regard, Orthogonal Matching Pursuit (OMP) provides an intuitive, simple and fast approximation of the optimal solution. However, its main building block is anchored on the minimization of the Mean Squared Error (MSE) cost function. This approach is only optimal if the errors follow a Gaussian distribution without samples that strongly deviate from the main mode, i.e. outliers. If this assumption is violated, the sparse code will likely be biased and performance will degrade accordingly. In this paper, we introduce five robust variants of OMP (RobOMP) fully based on the theory of M-Estimators under a linear model. The proposed framework exploits efficient Iteratively Reweighted Least Squares (IRLS) techniques to mitigate the effect of outliers and emphasize the samples corresponding to the main mode of the data. This is done adaptively via a learned weight vector that models the distribution of the data in a robust manner. Experiments on synthetic data under several noise distributions and on image recognition under different combinations of occlusion and missing pixels thoroughly detail the superiority of RobOMP over MSE-based approaches and similar robust alternatives. We also introduce a denoising framework based on robust, sparse and redundant representations that opens the door to potential further applications of the proposed techniques. The five variants of RobOMP do not require parameter tuning from the user and, hence, constitute principled alternatives to OMP.
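
For context, the baseline that this abstract builds on, plain MSE-based OMP, can be sketched in a few lines. This is the standard greedy algorithm and not any of the RobOMP variants; the function name `omp`, the fixed sparsity stopping rule and the orthonormal test dictionary are assumptions chosen to keep the example small and exactly recoverable.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy OMP: approximate y with at most n_nonzero columns (atoms) of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Least-squares (MSE-optimal) refit on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


# Easy case for illustration: an orthonormal dictionary and a 3-sparse code.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
x_true = np.zeros(16)
x_true[[2, 7, 11]] = [2.0, -1.5, 1.0]
y = Q @ x_true
x_hat = omp(Q, y, n_nonzero=3)
print(np.allclose(x_hat, x_true))
```

The least-squares refit inside the loop is the MSE-minimizing step the abstract refers to; it is also the step that outliers in `y` can bias, which is the motivation given for replacing it with a robust estimator.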

Gaussian Process-Based Response Surface Method for Slope Reliability Analysis

Advances in Civil Engineering ◽

10.1155/2019/9185756 ◽

2019 ◽

Vol 2019 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Bin Hu ◽

Guo-shao Su ◽

Jianqing Jiang ◽

Yilong Xiao

Keyword(s):

Gaussian Process ◽

Reliability Analysis ◽

Response Surface ◽

Response Surface Method ◽

Failure Probability ◽

Limit State ◽

State Function ◽

Limit State Function ◽

Surface Method ◽

GP Model

A new response surface method (RSM) for slope reliability analysis was proposed based on Gaussian process (GP) machine learning technology. The method involves approximating the limit state function with a trained GP model and estimating the failure probability using the first-order reliability method (FORM). First, a small number of training samples were generated by the limit equilibrium method to train the GP model. Then, the implicit limit state function of the slope was approximated by the trained GP model. Thus, the implicit limit state function and its derivatives for slope stability analysis were approximated by the GP model in explicit form. Furthermore, an iterative algorithm was presented to improve the precision of the approximation of the limit state function in the region near the design point, which contributes significantly to the failure probability. Results of four case studies, including one non-slope and three slope problems, indicate that the proposed method achieves reasonable accuracy for slope reliability analysis more efficiently than the traditional RSM.

Prediction of tensile strength of polymer carbon nanotube composites using practical machine learning method

Journal of Composite Materials ◽

10.1177/0021998320953540 ◽

2020 ◽

pp. 002199832095354 ◽

Cited By ~ 5

Author(s):

Tien-Thinh Le

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Tensile Strength ◽

Carbon Nanotube ◽

Polymer Matrix ◽

Mean Squared Error ◽

Gaussian Process Regression ◽

Weight Fraction ◽

Percentage Error ◽

Input Variables

This paper is devoted to the development and construction of a practical Machine Learning (ML)-based model for predicting the tensile strength of polymer carbon nanotube (CNT) composites. To this end, a database composed of 11 input variables was compiled from the available literature. The input variables for predicting the tensile strength of nanocomposites were selected to cover the following main aspects: (i) type of polymer matrix, (ii) mechanical properties of the polymer matrix, (iii) physical characteristics of the CNTs, (iv) mechanical properties of the CNTs and (v) incorporation parameters such as CNT weight fraction, CNT surface modification method and processing method. As the prediction problem is high-dimensional (with 11 dimensions), the Gaussian Process Regression (GPR) model was selected and optimized by means of a parametric study. The correlation coefficient (R), Willmott’s index of agreement (IA), slope of regression, Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) were employed as error measurement criteria when training the GPR model. The GPR model exhibited good performance for both the training and testing parts (RMSE = 5.982 and 5.327 MPa, MAE = 3.447 and 3.539 MPa, respectively). In addition, uncertainty analysis was applied to estimate the prediction confidence intervals. Finally, the prediction capability of the GPR model over different ranges of input variable values was investigated and discussed. For practical application, a Graphical User Interface (GUI) was developed in Matlab for predicting the tensile strength of nanocomposites.

NIMG-52. UNCERTAINTY QUANTIFICATION IN RADIOMICS

Neuro-Oncology ◽

10.1093/neuonc/noz175.721 ◽

2019 ◽

Vol 21 (Supplement_6) ◽

pp. vi172-vi173

Author(s):

Lujia Wang ◽

Hyunsoo Yoon ◽

Andrea Hawkins-Daarud ◽

Kyle Singleton ◽

Kamala Clark-Swanson ◽

...

Keyword(s):

Gaussian Process ◽

Copy Number ◽

Prediction Accuracy ◽

Clinical Decision Making ◽

Diffusion Tensor ◽

Feature Space ◽

Training Sample ◽

P Value ◽

The Impact ◽

GP Model

INTRODUCTION: The quantification of intratumoral heterogeneity through radiomics-based approaches can help resolve the regionally distinct genetic drug targets that may co-exist within a single Glioblastoma (GBM) tumor. While this offers potential diagnostic value under the paradigm of individualized oncology, clinical decision-making must also consider the degree of uncertainty associated with each model. In this study, we evaluate the performance of a novel machine-learning (ML) algorithm, called Gaussian Process (GP) modeling, that can quantify the impact of multiple sources of uncertainty in ML model development and prediction accuracy, including variabilities in the copy number measurement, radiomics features, training sample characteristics, and training sample size. METHOD: We collected 95 image-localized biopsies from 25 primary GBM patients. We coregistered stereotactic locations with preoperative multi-parametric MRI features (conventional MRI, DSC perfusion, Diffusion Tensor Imaging) to generate spatially matched pairs of MRI and copy number variants (CNV) for each biopsy. We developed a Gaussian Process (GP) model to predict CNV for Epidermal Growth Factor Receptor (EGFR) based on MRI radiomic features in each patient. We used leave-one-patient-out cross validation to quantify prediction accuracy and model uncertainty. Spatial prediction and uncertainty (p-value) maps were overlaid to help visualize regional genetic variation of EGFR and uncertainty of the radiomic predictions. RESULT: The initial GP radiomics model for EGFR amplification (CNV > 3.5) produced a sensitivity of 0.8 and specificity of 0.8. Samples/regions associated with high uncertainty (p-value > 0.05) correlated with either 1) extrapolation of radiomic features from the training set-defined feature space or 2) insufficient training samples in the feature space. CONCLUSION: We present an ML-based model that quantifies spatial genetic heterogeneity in GBM, while also estimating model uncertainties that result from multi-source data variabilities. This approach lays the groundwork for prospective clinical integration of modeling-based diagnostic approaches in the paradigm of individualized medicine.

A Model Tree-Based Vehicle Emission Model at Freeway Toll Plazas

Sustainability ◽

10.3390/su12218959 ◽

2020 ◽

Vol 12 (21) ◽

pp. 8959

Author(s):

Yueru Xu ◽

Chao Wang ◽

Yuan Zheng ◽

Zhuoqun Sun ◽

Zhirui Ye

Keyword(s):

Vehicle Emissions ◽

Polynomial Regression ◽

Mean Squared Error ◽

Percentage Error ◽

Vehicle Emission ◽

Emission Model ◽

Toll Plazas ◽

Model Tree ◽

Proposed Model ◽

Toll Collection

With increasing concern over sustainable development, many efforts have been made to alleviate air quality deterioration. Freeway toll plazas can cause serious pollution because stop-and-go operations increase emissions. Different toll collection methods and fuel types influence vehicle emissions at freeway toll plazas. Therefore, this paper proposes a model tree-based vehicle emission model that takes these factors into account. On-road emissions data and vehicle operation data were obtained from two different freeway toll plazas. The statistical analysis indicates that the toll collection method and fuel type have significant impacts on vehicle emissions at freeway toll plazas. The performance of the proposed model was compared with that of a polynomial regression method. The mean absolute percentage error (MAPE), root mean squared error (RMSE), and mean absolute error (MAE) of the proposed model were all smaller, while the R-squared value increased from 0.714 to 0.833. Finally, the variations in vehicle emissions at different locations in freeway toll plazas were calculated and shown in heat maps. The results of this study can help to better estimate vehicle emissions and can inform the development of electronic toll collection (ETC) lanes and relevant policies at freeway toll plazas.

Hybrid Load Forecasting Using Gaussian Process Regression and Novel Residual Prediction

Applied Sciences ◽

10.3390/app10134588 ◽

2020 ◽

Vol 10 (13) ◽

pp. 4588 ◽

Cited By ~ 1

Author(s):

Cosmin Darab ◽

Turcu Antoniu ◽

Horia Gheorghe Beleiu ◽

Sorin Pavel ◽

Iulian Birou ◽

...

Keyword(s):

Power Systems ◽

Gaussian Process ◽

Electricity Markets ◽

Load Forecasting ◽

Gaussian Process Regression ◽

Influential Factors ◽

Percentage Error ◽

Weather Factors ◽

Empirical Wavelet Transform ◽

Utility Companies

Short-term electricity load forecasting has attracted considerable attention as a result of the crucial role that it plays in power systems and electricity markets. This paper presents a novel hybrid forecasting method that combines an autoregressive model with Gaussian process regression. Mixed-user, hourly, historical data are used to train, validate, and evaluate the model. The empirical wavelet transform was used to preprocess the data. Among the perturbing factors recorded, the most influential predictors were the weather factors and day type. The developed methodology is upgraded using a novel closed-loop algorithm that uses the forecast values and influential factors to predict the residuals. Most of the computed performance indicators show that forecasting the residuals improves the method’s precision, decreasing the mean absolute percentage error from 5.04% to 4.28%. Measured data are used to validate the effectiveness of the presented approach, making it a suitable tool for use in load forecasting by utility companies.
