Subsurface Characterization Using Ensemble Machine Learning

2021 ◽  
Author(s):  
Gorka G Leiceaga ◽  
Robert Balch ◽  
George El-kaseeh

Abstract Reservoir characterization is an ambitious challenge that aims to predict variations within the subsurface using fit-for-purpose information that follows physical and geological sense. To properly achieve subsurface characterization, artificial intelligence (AI) algorithms may be used. Machine learning, a subset of AI, is a data-driven approach that has exploded in popularity in recent decades in industries such as healthcare, banking and finance, cryptocurrency, data security, and e-commerce. An advantage of machine learning methods is that they can produce results without a complete theoretical scientific model for a problem having first been established – with a set of complex model equations to be solved analytically or numerically. The principal challenge of machine learning lies in obtaining enough training data, which is essential for building a model that predicts with a high level of accuracy. Ensemble machine learning in reservoir characterization studies is a candidate to reduce subsurface uncertainty by integrating seismic and well data. In this article, a bootstrap aggregating algorithm is evaluated to determine its potential as a subsurface discriminator. The algorithm fits decision trees on various sub-samples of a dataset and uses averaging to improve the accuracy of the prediction without over-fitting. The gamma ray results from our test dataset show a high correlation with the measured logs, giving confidence in our workflow applied to subsurface characterization.
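The bootstrap-aggregating idea described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the authors' workflow: the feature matrix stands in for seismic attributes and the target for a gamma-ray log, both synthetic.

```python
# Sketch of bagging (bootstrap aggregating) with decision trees:
# each tree is fit on a bootstrap sub-sample and predictions are
# averaged, reducing variance relative to a single deep tree.
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))  # hypothetical seismic attributes
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=500)  # proxy "gamma ray"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = BaggingRegressor(
    DecisionTreeRegressor(),
    n_estimators=100,
    bootstrap=True,   # sample the training set with replacement per tree
    random_state=0,
).fit(X_tr, y_tr)

print(round(model.score(X_te, y_te), 3))  # held-out R^2
```

Averaging over bootstrap replicates is what lets deep, individually over-fit trees combine into a low-variance ensemble.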

Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1052
Author(s):  
Baozhong Wang ◽  
Jyotsna Sharma ◽  
Jianhua Chen ◽  
Patricia Persaud

Estimation of fluid saturation is an important step in dynamic reservoir characterization. Machine learning techniques have been increasingly used in recent years for reservoir saturation prediction workflows. However, most of these studies require input parameters derived from cores, petrophysical logs, or seismic data, which may not always be readily available. Additionally, very few studies incorporate production data, which is an important reflection of the dynamic reservoir properties and also typically the most frequently and reliably measured quantity throughout the life of a field. In this research, we implement the random forest ensemble machine-learning algorithm, which uses field-wide production and injection data (both measured at the surface) as the only input parameters to predict the time-lapse oil saturation profiles at well locations. The algorithm is optimized using feature selection based on feature importance scores and the Pearson correlation coefficient, in combination with geophysical domain knowledge. The workflow is demonstrated using actual field data from a structurally complex, heterogeneous, and heavily faulted offshore reservoir. The random forest model captures the trends from three and a half years of historical field production, injection, and simulated saturation data to predict future time-lapse oil saturation profiles at four deviated well locations with over 90% R-squared, less than 6% root mean square error, and less than 7% mean absolute percentage error in each case.
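The feature-selection step described above (random forest importance combined with the Pearson correlation coefficient) can be sketched as follows. All variables are synthetic, hypothetical stand-ins for the surface production/injection measurements; the thresholds are illustrative, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 600
# Hypothetical surface measurements
prod_oil = rng.uniform(100, 1000, n)
prod_water = rng.uniform(50, 500, n)
inj_rate = rng.uniform(200, 1200, n)
noise_col = rng.normal(size=n)  # irrelevant feature to be screened out
X = np.column_stack([prod_oil, prod_water, inj_rate, noise_col])
# Synthetic oil-saturation proxy driven by the first three features only
y = (0.8 - 2e-4 * prod_oil - 2e-4 * prod_water + 1e-4 * inj_rate
     + rng.normal(scale=0.005, size=n))

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Screen features by RF importance and |Pearson r| against the target
importances = rf.feature_importances_
pearson = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
keep = [j for j in range(X.shape[1])
        if importances[j] > 0.05 or pearson[j] > 0.3]
print(keep)  # the noise column should be dropped
```

In practice the retained set would also be vetted against domain knowledge, as the abstract notes, rather than trusting the scores alone.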


2020 ◽  
Vol 214 ◽  
pp. 01023
Author(s):  
Linan (Frank) Zhao

Long-term unemployment has significant societal impact and is of particular concern for policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risks of becoming long-term unemployed, using data collected from European public authorities for employment services. The proposed model achieves 81.2% accuracy in identifying citizens with high risks of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both a local and a global level using SHAP, a state-of-the-art model-agnostic approach to explain factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question when applying machine learning in the public domain, that is, the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the recent increasing shift for governments to adopt machine learning models to profile and prioritize employment resources to reduce the detrimental effects of long-term unemployment and improve public welfare.
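The paper uses SHAP for its explanations; as a dependency-light sketch of the same idea (attributing a gradient-boosted model's predictions to input factors at a global level), here is permutation importance on a synthetic example. The features and risk formula are entirely hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 1000
months_unemployed = rng.integers(0, 24, n)   # hypothetical factor
age = rng.integers(18, 65, n)                # hypothetical factor (noise here)
education_years = rng.integers(8, 20, n)     # hypothetical factor
X = np.column_stack([months_unemployed, age, education_years])
# Synthetic long-term-unemployment risk, driven mainly by unemployment history
logits = 0.3 * months_unemployed - 0.15 * education_years - 2.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Global explanation: how much does shuffling each feature hurt held-out accuracy?
result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
print([round(v, 3) for v in result.importances_mean])
```

Unlike SHAP, permutation importance gives only global attributions, but the auditing question is the same: which factors drive the model, and do protected attributes (such as age) exert unjustified influence.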


2020 ◽  
Vol 8 (3) ◽  
pp. SL25-SL34
Author(s):  
Shirui Wang ◽  
Qiuyang Shen ◽  
Xuqing Wu ◽  
Jiefu Chen

Depth matching of multiple logging curves is essential to any well evaluation or reservoir characterization. Depth matching can be applied to various measurements of a single well or to multiple log curves from multiple wells within the same field. Because many drilling advisory projects have been launched to digitalize well-log analysis, accurate depth matching becomes an important factor in improving well evaluation, production, and recovery. It is a challenge, though, to align the log curves from multiple wells due to the unpredictable structure of the geologic formations. We have conducted a study on the alignment of multiple gamma-ray well logs using state-of-the-art machine-learning techniques. Our objective is to automate the depth-matching task with minimum human intervention. We have developed a novel multitask learning approach that uses a deep neural network to optimize the depth-matching strategy correlating multiple gamma-ray logs in the same field. Our approach can be extended to other applications as well, such as automatic formation-top labeling for an ongoing well given a reference well.
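The paper's method is a multitask deep network; as a point of reference, the classical baseline it improves on is a simple cross-correlation shift between two gamma-ray curves, sketched below on synthetic logs.

```python
import numpy as np

def depth_shift(ref_log, target_log):
    """Samples by which target_log lags ref_log, estimated by the peak
    of the normalized cross-correlation (constant-shift alignment only;
    real logs need the piecewise warps the paper's network learns)."""
    a = (ref_log - ref_log.mean()) / ref_log.std()
    b = (target_log - target_log.mean()) / target_log.std()
    corr = np.correlate(b, a, mode="full")
    return int(np.argmax(corr)) - (len(a) - 1)

# Synthetic gamma-ray curve and the same curve sampled 7 samples shallower
rng = np.random.default_rng(3)
base = np.convolve(rng.normal(size=500), np.ones(5) / 5, mode="same")
ref = 60 + 20 * base[50:450]
target = 60 + 20 * base[43:443]

print(depth_shift(ref, target))  # recovers the 7-sample offset
```

A constant shift fails exactly where the abstract says the problem is hard: formation thickness varies unpredictably between wells, so the true alignment is a nonuniform warp, which motivates the learned approach.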


2021 ◽  
Author(s):  
Ardiansyah Negara ◽  
Arturo Magana-Mora ◽  
Khaqan Khan ◽  
Johannes Vossen ◽  
Guodong David Zhan ◽  
...  

Abstract This study presents a data-driven approach using machine learning algorithms to provide predicted analogues in the absence of acoustic logs, especially while drilling. Acoustic logs are commonly used to derive rock mechanical properties; however, these data are not always available. Well logging data (wireline/logging while drilling - LWD), such as gamma ray, density, neutron porosity, and resistivity, are used as input parameters to develop the data-driven rock mechanical models. In addition to the logging data, real-time drilling data (i.e., weight-on-bit, rotation speed, torque, rate of penetration, flowrate, and standpipe pressure) are used to derive the model. In the data preprocessing stage, we labeled drilling and well logging data based on formation tops in the drilling plan and performed data cleansing to remove outliers. A set of field data from different wells across the same formation is used to build and train the predictive models. We computed feature importance to rank the data based on relevance for predicting acoustic logs and applied feature selection techniques to remove redundant features that may unnecessarily require a more complex model. An additional feature, mechanical specific energy, is also generated from the real-time drilling data to improve the prediction accuracy. We studied a number of scenarios comparing different predictive models, and the results demonstrated that adding drilling data and/or feature engineering to the model could improve its accuracy.
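The engineered feature mentioned above, mechanical specific energy, is conventionally computed from real-time drilling data with Teale's formula. A sketch in standard oilfield units (the input values below are illustrative, not from the study):

```python
import math

def mechanical_specific_energy(wob_lbf, rpm, torque_ftlbf, rop_ft_per_hr,
                               bit_diameter_in):
    """Teale's mechanical specific energy (psi): energy input per unit
    volume of rock drilled, from weight-on-bit (lbf), rotary speed (rpm),
    torque (ft-lbf), rate of penetration (ft/hr), and bit diameter (in)."""
    area_in2 = math.pi * (bit_diameter_in / 2.0) ** 2
    axial = wob_lbf / area_in2
    # 120*pi converts rotary work per revolution to in-lbf per hour of ROP
    rotary = (120.0 * math.pi * rpm * torque_ftlbf) / (area_in2 * rop_ft_per_hr)
    return axial + rotary

mse = mechanical_specific_energy(
    wob_lbf=30_000, rpm=120, torque_ftlbf=8_000,
    rop_ft_per_hr=60, bit_diameter_in=8.5,
)
print(round(mse))  # on the order of rock confined compressive strength
```

Because MSE folds weight-on-bit, rotation speed, torque, and ROP into one quantity with the units of rock strength, it carries more signal about the formation than any of those raw channels alone, which is why it helps the predictive models here.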


2019 ◽  
Author(s):  
Siddhartha Laghuvarapu ◽  
Yashaswi Pathak ◽  
U. Deva Priyakumar

Recent advances in artificial intelligence, along with the development of large datasets of energies calculated using quantum mechanical (QM)/density functional theory (DFT) methods, have enabled prediction of accurate molecular energies at reasonably low computational cost. However, machine learning models that have been reported so far require as input the atomic positions obtained from geometry optimizations using high-level QM/DFT methods in order to predict the energies, and do not allow for geometry optimization. In this paper, a transferable and molecule-size-independent machine learning model (BAND NN), based on a chemically intuitive representation inspired by molecular mechanics force fields, is presented. The model predicts the atomization energies of equilibrium and non-equilibrium structures as a sum of energy contributions from bonds (B), angles (A), nonbonds (N) and dihedrals (D) with remarkable accuracy. The robustness of the proposed model is further validated by calculations that span the conformational, configurational and reaction space. The transferability of this model to systems larger than the ones in the dataset is demonstrated by performing calculations on select large molecules. Importantly, employing the BAND NN model, it is possible to perform geometry optimizations starting from non-equilibrium structures along with predicting their energies.
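The additive structure described above can be sketched as follows. This is purely illustrative of the architecture, not the trained BAND NN: the per-term networks are tiny, randomly initialized, and untrained, and the featurized bond/angle/nonbond/dihedral inputs are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

def tiny_mlp(in_dim, hidden=8):
    """One-hidden-layer network, randomly initialized (untrained; sketch only)."""
    W1, b1 = rng.normal(size=(in_dim, hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(size=(hidden, 1)), np.zeros(1)
    return lambda x: float(np.tanh(x @ W1 + b1) @ W2 + b2)

# One network per interaction type, shared across every instance of that type;
# this weight sharing is what makes the model molecule-size independent.
f_bond, f_angle, f_nonbond, f_dihedral = (tiny_mlp(d) for d in (2, 3, 2, 4))

def atomization_energy(bonds, angles, nonbonds, dihedrals):
    """E = sum of contributions from bonds (B), angles (A),
    nonbonds (N) and dihedrals (D)."""
    return (sum(f_bond(x) for x in bonds)
            + sum(f_angle(x) for x in angles)
            + sum(f_nonbond(x) for x in nonbonds)
            + sum(f_dihedral(x) for x in dihedrals))

# Hypothetical featurized terms (e.g. bond length plus a type code)
E = atomization_energy(
    bonds=[np.array([1.09, 0.0]), np.array([1.53, 1.0])],
    angles=[np.array([109.5, 0.0, 1.0])],
    nonbonds=[np.array([2.5, 0.3])],
    dihedrals=[np.array([60.0, 0.0, 1.0, 0.0])],
)
print(np.isfinite(E))
```

Because every term is a smooth function of internal coordinates, the total energy is differentiable with respect to atomic positions, which is what makes geometry optimization from non-equilibrium structures possible.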


2020 ◽  
Author(s):  
Sina Faizollahzadeh Ardabili ◽  
Amir Mosavi ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
Annamaria R. Varkonyi-Koczy ◽  
...  

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities, and they are popular in the media. Due to a high level of uncertainty and lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the essential generalization and robustness abilities of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models to predict the COVID-19 outbreak as an alternative to SIR and SEIR models. Among a wide range of machine learning models investigated, two models showed promising results (i.e., multi-layered perceptron, MLP, and adaptive network-based fuzzy inference system, ANFIS). Based on the results reported here, and due to the highly complex nature of the COVID-19 outbreak and variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. This paper provides an initial benchmarking to demonstrate the potential of machine learning for future research. The paper further suggests that real novelty in outbreak prediction can be realized through integrating machine learning and SEIR models.
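For reference, the SIR model the paper positions machine learning against is a three-compartment ODE system; a minimal forward-Euler sketch is below. The parameter values are illustrative, not fitted to any COVID-19 data.

```python
import numpy as np

def sir(beta, gamma, s0, i0, days, dt=0.1):
    """Forward-Euler integration of the classic SIR compartment model.
    s, i, r are population fractions; beta is the transmission rate and
    gamma the recovery rate. Returns daily infected fractions."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    daily_infected = []
    for step in range(int(days / dt)):
        ds = -beta * s * i           # susceptibles infected
        di = beta * s * i - gamma * i  # new infections minus recoveries
        s, i, r = s + ds * dt, i + di * dt, r + gamma * i * dt
        if step % int(1 / dt) == 0:
            daily_infected.append(i)
    return np.array(daily_infected)

# Illustrative basic reproduction number R0 = beta/gamma = 2.5:
# infections rise, peak, then decline as susceptibles deplete
infected = sir(beta=0.5, gamma=0.2, s0=0.999, i0=0.001, days=120)
print(round(float(infected.max()), 3), round(float(infected[-1]), 4))
```

The model's rigidity is the point of contrast: its trajectory is fully determined by a handful of parameters, whereas the MLP and ANFIS models fit the observed case curves directly, which is why they generalized better under the data uncertainty the abstract describes.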

