Field Development Optimization Using Machine Learning Methods to Identify the Optimal Water Flooding Regime

2021
Author(s):
Alexey Vasilievich Timonov
Arturas Rimo Shabonas
Sergey Alexandrovich Schmidt

Abstract The main technology used to optimize field development is hydrodynamic modeling, which is very costly in terms of computing resources and the expert time needed to configure the model. In the case of brownfields, the complexity increases exponentially. The paper describes the stages of developing a hybrid geological-physical-mathematical proxy model using machine learning methods, which makes it possible to run multivariate calculations and predict production under various injection well operating regimes. Based on these calculations, we search for the optimal distribution of injection volumes among injection wells under given infrastructure constraints. The approach implemented in this work takes into account many factors (some features of the geological structure, the history of field development, the mutual influence of wells, etc.) and can offer optimal options for distributing injection volumes among injection wells without performing full-scale or sector hydrodynamic simulation. To predict production, we use machine learning methods (based on decision trees and neural networks) and methods for optimizing the target functions. As a result of this research, a unified algorithm for data verification and preprocessing has been developed for feature extraction and for preparing input data for deep machine learning models. Various machine learning algorithms were tested, and it was determined that the highest prediction accuracy is achieved by machine learning models based on Temporal Convolutional Networks (TCN) and gradient boosting. An algorithm for finding the optimal allocation of injection volumes, taking into account the existing infrastructure constraints, was developed and tested. Different optimization algorithms were tested. It was determined that the choice and setting of boundary conditions is critical for optimization algorithms in this problem. The integrated approach was tested on terrigenous formations of a West Siberian field, where the developed algorithm proved effective.
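The allocation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `proxy_production` is a toy concave function standing in for the trained proxy model (the abstract names TCN and gradient boosting, which are not reproduced here), the `gains` values are invented, and the greedy search is a simple substitute for whichever optimizers the authors tested. The per-well bounds and the total volume play the role of the infrastructure constraints.

```python
import math

def proxy_production(rates, gains):
    """Toy stand-in for a trained proxy model (hypothetical):
    each injector contributes a concave response to its rate."""
    return sum(g * math.log1p(r) for g, r in zip(gains, rates))

def allocate(total, lower, upper, gains, step=1.0):
    """Greedy allocation under bounds: start every well at its lower
    bound, then repeatedly give one `step` of volume to the well whose
    predicted production increases the most."""
    rates = list(lower)
    remaining = total - sum(rates)
    if remaining < 0:
        raise ValueError("lower bounds exceed the total available volume")
    while remaining >= step:
        base = proxy_production(rates, gains)
        best, best_gain = None, 0.0
        for i in range(len(rates)):
            if rates[i] + step > upper[i]:
                continue  # this well is already at its upper bound
            trial = rates[:]
            trial[i] += step
            gain = proxy_production(trial, gains) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            break  # no feasible move improves the prediction
        rates[best] += step
        remaining -= step
    return rates

# Three injectors sharing 100 units of injection capacity.
rates = allocate(100, lower=[10, 10, 10], upper=[60, 60, 60],
                 gains=[1.0, 2.0, 3.0])
print(rates, proxy_production(rates, [1.0, 2.0, 3.0]))
```

Tightening or loosening `lower` and `upper` changes the result directly, which is one way to see why the abstract calls the choice of boundary conditions critical.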

2021
Author(s):
Bruno Barbosa Miranda de Paiva
Polianna Delfino Pereira
Claudio Moises Valiense de Andrade
Virginia Mara Reis Gomes
Maria Clara Pontello Barbosa Lima
...  

Objective: To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for determining in-hospital mortality in COVID-19 patients using data available upon hospital admission; to study the reliability of the predictions of the most effective methods by correlating the probability of the outcome with the accuracy of the methods; to investigate how explainable the predictions produced by the most effective methods are. Materials and Methods: De-identified data were obtained from COVID-19 positive patients in 36 participating hospitals, from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistics models were trained on this prediction task using a folded cross-validation procedure, from which we assessed performance and interpretability metrics. Results: Stacking of machine learning models improved on the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro-F1 of 73.9%. We also show that some machine learning models can be very interpretable and reliable, yielding more accurate predictions while providing a good explanation of why. Conclusion: The best results were obtained using the meta-learning ensemble model Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistics models while also yielding highly reliable predictions. Key words: COVID-19; prognosis; prediction model; machine learning
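The macro-F1 figure reported above averages the F1 score over classes with equal weight, so a rare class of interest such as death counts as much as the majority class. The sketch below shows that calculation on invented labels; the function names are hypothetical and the data are not from the study.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score of one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Invented example: 2 deaths (1) among 10 patients, model predicts "survived" (0) for all.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

In this toy case accuracy is 0.8 while macro-F1 is about 0.44, because the F1 score of the missed minority class is zero.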


2021
Vol 19 (3)
pp. 55-64
Author(s):  
K. N. Maiorov ◽  

The paper examines the life cycle of field development and analyzes the processes of the field development design stage with a view to applying machine learning methods. For each process, relevant problems are highlighted, existing solutions based on machine learning methods are reviewed, and ideas and problems that could be effectively solved by machine learning methods are proposed. For most of the processes, examples of solutions are briefly described, and the advantages and disadvantages of the approaches are identified. The most common solution method is feed-forward neural networks. Subject to preliminary normalization of the input data, this is the most versatile algorithm for regression and classification problems. However, in the problem of selecting wells for hydraulic fracturing, a whole ensemble of machine learning models was used, which included a random forest, gradient boosting and linear regression in addition to a neural network. For the problem of optimizing the placement of a grid of oil wells, the disadvantages of existing solutions based on a neural network and on a simple reinforcement learning approach based on a Markov decision process are identified. A deep reinforcement learning algorithm called AlphaZero is proposed, which has previously shown significant results as an artificial intelligence for games. This algorithm is a decision tree search directed by a neural network: only those branches that have received the best estimates from the neural network are considered more thoroughly. The paper highlights the similarities between the tasks for which AlphaZero was previously used and the task of optimizing the placement of a grid of oil producing wells. Conclusions are drawn about the possibility of using and modifying the algorithm for the optimization problem being solved. An approach is proposed to take into account symmetric states in a Monte Carlo tree to reduce the number of required simulations.
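One way to picture the treatment of symmetric states is to map every placement grid to a single canonical representative, so that tree-search statistics are shared between states that differ only by a rotation or reflection. The sketch below is a guess at the general idea, not the paper's method: it assumes a square grid, assumes all eight rotations and reflections are valid symmetries (which would require the underlying reservoir properties to be symmetric too), and uses hypothetical function names.

```python
def rotate(grid):
    """Rotate a square grid 90 degrees clockwise."""
    return tuple(zip(*grid[::-1]))

def mirror(grid):
    """Mirror a grid left to right."""
    return tuple(tuple(row[::-1]) for row in grid)

def canonical(grid):
    """Return one fixed representative of the grid's symmetry class:
    the smallest of its 8 rotated and mirrored variants."""
    grid = tuple(tuple(row) for row in grid)
    variants = []
    for _ in range(4):
        variants.append(grid)
        variants.append(mirror(grid))
        grid = rotate(grid)
    return min(variants)

# Visit statistics keyed by canonical state: symmetric placements share one entry.
visits = {}
for state in (
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],  # well in the top-left corner
    [[0, 0, 1], [0, 0, 0], [0, 0, 0]],  # well in the top-right corner
    [[0, 0, 0], [0, 0, 0], [0, 0, 1]],  # well in the bottom-right corner
):
    key = canonical(state)
    visits[key] = visits.get(key, 0) + 1
print(len(visits))
```

Here the three corner placements collapse to a single entry, so one simulation result informs all of them.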


2019
pp. 1-11
Author(s):
David Chen
Gaurav Goyal
Ronald S. Go
Sameer A. Parikh
Che G. Ngufor

PURPOSE Time to event is an important aspect of clinical decision making. This is particularly true when diseases have highly heterogeneous presentations and prognoses, as in chronic lymphocytic leukemia (CLL). Although machine learning methods can readily learn complex nonlinear relationships, many methods are criticized as inadequate because of limited interpretability. We propose using unsupervised clustering of the continuous output of machine learning models to provide discrete risk stratification for predicting time to first treatment in a cohort of patients with CLL. PATIENTS AND METHODS A total of 737 treatment-naïve patients with CLL diagnosed at Mayo Clinic were included in this study. We compared predictive abilities for two survival models (Cox proportional hazards and random survival forest) and four classification methods (logistic regression, support vector machines, random forest, and gradient boosting machine). Probability of treatment was then stratified. RESULTS Machine learning methods did not yield significantly more accurate predictions of time to first treatment. However, automated risk stratification provided by clustering was able to better differentiate patients who were at risk for treatment within 1 year than models developed using standard survival analysis techniques. CONCLUSION Clustering the posterior probabilities of machine learning models provides a way to better interpret machine learning models.
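The stratification idea can be sketched as clustering a list of predicted probabilities into a few ordered risk groups. The abstract does not say which clustering algorithm was used, so the one-dimensional k-means below is an assumption, as are the number of groups, the function names, and the invented probabilities.

```python
def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalar values; returns the centroids in ascending order."""
    data = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [data[(2 * j + 1) * len(data) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Keep the old centroid if a group happens to be empty.
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break  # assignments no longer change
        centroids = updated
    return sorted(centroids)

def risk_group(probability, centroids):
    """Index of the nearest centroid; 0 is the lowest-risk group."""
    return min(range(len(centroids)), key=lambda j: abs(probability - centroids[j]))

# Invented predicted probabilities of treatment from some classifier.
probs = [0.05, 0.08, 0.10, 0.45, 0.50, 0.55, 0.90, 0.92, 0.95]
centroids = kmeans_1d(probs, k=3)
print([risk_group(p, centroids) for p in probs])
```

The output of such a step is a small set of discrete groups (for example low, intermediate, high) that is easier to communicate than a continuous score.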


2019
Vol 40 (Supplement_1)
Author(s):
G Sng
D Y Z Lim
C H Sia
J S W Lee
X Y Shen
...  

Abstract Background/Introduction Classic electrocardiographic (ECG) criteria for left ventricular hypertrophy (LVH) have been well studied in Western populations, particularly in hypertensive patients. However, their utility in Asian populations is not well studied, and their applicability to young pre-participation cohorts is unclear. Aims We sought to evaluate the performance of classical criteria against the performance of novel machine learning models in the identification of LVH. Methodology Between November 2009 and December 2014, pre-participation screening ECG and subsequent echocardiographic data were collected from 13,954 males aged 16 to 22 who reported for medical screening prior to military conscription. Final diagnosis of LVH was made on echocardiography, with LVH defined as a left ventricular mass index >115 g/m². The continuous and binary forms of classical criteria were compared against machine learning models using receiver operating characteristic (ROC) curve analysis. An 80:20 split was used to divide the data into training and test sets for the machine learning models, and three-fold cross-validation was used in training the models. We also compared the important variables identified by machine learning models with the input variables of classical criteria. Results Prevalence of echocardiographic LVH in this population was 0.91% (127 cases). Classical ECG criteria had poor performance in predicting LVH, with the best predictions achieved by the continuous Sokolow-Lyon (AUC = 0.63, 95% CI = 0.58–0.68) and the continuous Modified Cornell (AUC = 0.63, 95% CI = 0.58–0.68). Machine learning methods achieved superior performance: Random Forest (AUC = 0.74, 95% CI = 0.66–0.82), Gradient Boosting Machines (AUC = 0.70, 95% CI = 0.61–0.79), GLMNet (AUC = 0.78, 95% CI = 0.70–0.86). Novel and less recognized ECG parameters identified by the machine learning models as being predictive of LVH included mean QT interval, mean QRS interval, R in V4, and R in I. Conclusion The prevalence of LVH in our population is lower than that previously reported in other similar populations. Classical ECG criteria perform poorly in this context. Machine learning methods show superior predictive performance and demonstrate non-traditional predictors of LVH from ECG data. Further research is required to improve the predictive ability of machine learning models, and to understand the underlying pathology of the novel ECG predictors identified.
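The AUC values quoted above summarize a ROC curve as a single number: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that calculation follows, using the pairwise (Mann-Whitney) formulation. The function name and the tiny data set are invented for illustration, and the confidence intervals reported in the abstract are not reproduced.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve as the share of positive/negative pairs
    in which the positive case scores higher; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 1 = LVH on echocardiography, score = model output or voltage criterion.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

Because the measure only depends on how cases are ranked, it can be applied equally to the continuous form of a classical criterion and to a model's predicted probability, which is what makes the comparison in the abstract possible.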


Author(s):  
Wolfgang Drobetz ◽  
Tizian Otto

Abstract This paper evaluates the predictive performance of machine learning methods in forecasting European stock returns. Compared to a linear benchmark model, interactions and nonlinear effects help improve the predictive performance. But machine learning models must be adequately trained and tuned to overcome the high dimensionality problem and to avoid overfitting. Across all machine learning methods, the most important predictors are based on price trends and fundamental signals from valuation ratios. However, the models exhibit substantial variation in statistical predictive performance that translates into pronounced differences in economic profitability. The return and risk measures of long-only trading strategies indicate that machine learning models produce sizeable gains relative to our benchmark. Neural networks perform best, also after accounting for transaction costs. A classification-based portfolio formation, utilizing a support vector machine that avoids estimating stock-level expected returns, performs even better than the neural network architecture.


2019
Vol 6 (2)
pp. 343-349
Author(s):
Daniele Padula
Jack D. Simpson
Alessandro Troisi

Combining electronic and structural similarity between organic donors in kernel-based machine learning methods allows photovoltaic efficiencies to be predicted reliably.


2021
Vol 23 (35)
pp. 19781-19789
Author(s):
Tom Vermeyen
Jure Brence
Robin Van Echelpoel
Roy Aerts
Guillaume Acke
...  

The capabilities of machine learning models to extract the absolute configuration of a series of compounds from their vibrational circular dichroism spectra have been demonstrated. The important spectral areas are identified.


2021
Vol 2089 (1)
pp. 012047
Author(s):
Vuppu Padmakar
B V Ramana Murthy

Abstract This project aims to provide improved security by letting a user know who is actually accessing the system, using facial recognition. The system allows only authorized users to gain access. Python is used as the programming language, along with machine learning methods and an open-source library for designing, building and training machine learning models. An interface is also provided for unauthorized users to register and obtain access with prior permission from the admin.


2020
Author(s):
Mo Zhang
Wenjiao Shi

Abstract. Soil texture and soil particle size fractions (PSFs) play an increasing role in physical, chemical and hydrological processes. Many previous studies have used machine-learning and log ratio transformation methods for soil texture classification and soil PSFs interpolation to improve the prediction accuracy. However, few reports have systematically compared their performance in both classification and interpolation. Here, a total of 45 evaluation models generated from five machine-learning models (K-nearest neighbor (KNN), multilayer perceptron neural network (MLP), random forest (RF), support vector machines (SVM) and extreme gradient boosting (XGB)), combined with the original data and three log ratio methods (additive log ratio (ALR), centered log ratio (CLR) and isometric log ratio (ILR)), were applied to evaluate and compare both of them using 640 soil samples in the Heihe River Basin (HRB) in China. The results demonstrated that log ratio transformation methods decreased the skewness of the distributions of soil PSFs data. For soil texture classification, RF and XGB showed better performance in terms of overall accuracy and kappa coefficients; they were also recommended for evaluating the classification capacity of imbalanced data according to the area under the precision-recall curve (AUPRC) analysis. For soil PSFs interpolation, RF delivered the best performance among the five machine-learning models, with the lowest root mean squared error (RMSE, sand: 15.09 %, silt: 13.86 %, clay: 6.31 %), mean absolute error (MAE, sand: 10.65 %, silt: 9.99 %, clay: 5.00 %), Aitchison distance (AD, 0.84) and standardized residual sum of squares (STRESS, 0.61), and the highest coefficient of determination (R², sand: 53.28 %, silt: 45.77 %, clay: 53.75 %). STRESS was improved using log ratio methods, especially CLR and ILR. For the comparison of direct and indirect classification, prediction maps were similar on the middle and upper reaches and different on the lower reaches of the HRB. Moreover, indirect classification maps based on log ratio transformed data contained more detailed information. Indirect methods gave a pronounced improvement of 21.3 % in the kappa coefficient for soil texture classification compared with direct ones. RF was recommended as the best strategy among these five machine-learning models according to the accuracy evaluation of soil PSFs interpolation and soil texture classification, and ILR was recommended for component-wise machine-learning methods without multivariate treatment, considering the constrained nature of compositional data. In addition, XGB was preferred over other models when the trade-off between accuracy and time was considered. Our findings can provide a reference for other research on the spatial prediction of soil PSFs and texture using machine-learning methods with skewed soil PSFs data in a large area.
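The log ratio methods named above move compositional data such as sand, silt and clay fractions, which must sum to one, into an unconstrained space where standard models can be fitted, and then map predictions back. The sketch below shows the additive and centered variants with their inverses, using the standard textbook definitions. The isometric variant is omitted for brevity, the function names are hypothetical, and the sample fractions are invented. All parts are assumed to be strictly positive, since the logarithm of zero is undefined.

```python
import math

def clr(parts):
    """Centered log ratio: log of each part minus the mean log (log geometric mean)."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_inverse(coords):
    """Map CLR coordinates back to fractions that sum to one."""
    exps = [math.exp(c) for c in coords]
    total = sum(exps)
    return [e / total for e in exps]

def alr(parts):
    """Additive log ratio, using the last part as the reference."""
    return [math.log(p / parts[-1]) for p in parts[:-1]]

def alr_inverse(coords):
    """Map ALR coordinates back to fractions that sum to one."""
    exps = [math.exp(c) for c in coords] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]

# Invented sample: sand, silt and clay fractions.
sample = [0.6, 0.3, 0.1]
print(clr(sample), clr_inverse(clr(sample)))
print(alr(sample), alr_inverse(alr(sample)))
```

Because the inverse transforms always return positive values that sum to one, predictions made in the transformed space stay valid compositions after back-transformation.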

