Machine learning and geophysical inversion — A numerical study

2019 ◽  
Vol 38 (7) ◽  
pp. 512-519 ◽  
Author(s):  
Brian Russell

As geophysicists, we are trained to conceptualize geophysical problems in detail. However, machine learning algorithms are more difficult to understand and are often thought of as simply “black boxes.” A numerical example is used here to illustrate the difference between geophysical inversion and inversion by machine learning. In doing so, an attempt is made to demystify machine learning algorithms and show that, like inverse problems, they have a definite mathematical structure that can be written down and understood. The example used is the extraction of the underlying reflection coefficients from a synthetic seismic response that was created by convolving a reflection coefficient dipole with a symmetric wavelet. Because the dipole is below the seismic tuning frequency, the overlapping wavelets create both an amplitude increase and extra nonphysical reflection coefficients in the synthetic seismic data. This is a common problem in real seismic data. In discussing the solution to this problem, the topics of deconvolution, recursive inversion, linear regression, and nonlinear regression using a feedforward neural network are covered. It is shown that if the inputs to the deconvolution problem are fully understood, this is the optimal way to extract the true reflection coefficients. However, if the geophysics is not fully understood and large amounts of data are available, machine learning can provide a viable alternative to geophysical inversion.
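
The tuning effect the abstract describes can be reproduced in a few lines. The sketch below is illustrative only (the wavelet frequency, sample interval, and spike spacing are assumptions, not the paper's values): a reflection-coefficient dipole is convolved with a symmetric Ricker wavelet, and the overlapping wavelets both boost the peak amplitude above the true 0.1 reflection coefficient and create nonphysical sidelobe events.

```python
import numpy as np

# Illustrative sketch of the tuning effect: a reflection-coefficient
# "dipole" (two closely spaced, opposite-polarity spikes) convolved with
# a symmetric Ricker wavelet. Parameter values are assumptions.

def ricker(f, dt, n):
    """Symmetric Ricker wavelet of peak frequency f (Hz), n samples."""
    t = (np.arange(n) - n // 2) * dt
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

dt = 0.002                       # 2 ms sample interval
r = np.zeros(101)                # reflection-coefficient series
r[48], r[52] = 0.1, -0.1         # dipole spaced below tuning thickness

trace = np.convolve(r, ricker(25.0, dt, 81), mode="same")

# Overlapping wavelets push the peak amplitude above the 0.1 spike size
# and produce "reflections" at samples where r is actually zero.
print(np.abs(trace).max())
print(np.count_nonzero(np.abs(trace) > 0.01))
```

With these assumed parameters the peak trace amplitude exceeds the true reflection coefficient of 0.1, which is exactly the amplitude increase the abstract attributes to tuning.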

2021 ◽  
Vol 11 (9) ◽  
pp. 4251
Author(s):  
Jinsong Zhang ◽  
Shuai Zhang ◽  
Jianhua Zhang ◽  
Zhiliang Wang

In digital microfluidic experiments, droplet characteristics and flow patterns are generally identified and predicted by empirical methods, which make it difficult to mine large amounts of data. In addition, because inevitable human intervention introduces inconsistent judgment standards, comparison between different experiments is cumbersome and almost impossible. In this paper, we tried to use machine learning to build algorithms that could automatically identify, judge, and predict flow patterns and droplet characteristics, so that the empirical judgment was transformed into an intelligent process. In contrast to the usual machine learning approaches, a generalized variable system was introduced to describe the different geometry configurations of digital microfluidics. Specifically, Buckingham's π theorem was adopted to obtain multiple groups of dimensionless numbers as the input variables of the machine learning algorithms. In verification, the SVM and BPNN algorithms successfully classified and predicted the different flow patterns and droplet characteristics (length and frequency). Compared with the primitive-parameter system, the dimensionless-number system was superior in predictive capability. The dimensionless numbers selected for the machine learning algorithms should be chosen for their strong physical meaning rather than merely mathematical meaning. Applying dimensionless numbers reduced the dimensionality of the system and the amount of computation without losing the information contained in the primitive parameters.
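
The dimensionless-input idea can be sketched as follows. The choice of primitive parameters here (flow rates, viscosity, interfacial tension, density, channel width) and the resulting groups (capillary number, Reynolds number, flow-rate ratio) are assumptions for illustration, not necessarily the groups the paper derives:

```python
import numpy as np

# Hypothetical sketch: six dimensional quantities collapse, via
# Buckingham's pi theorem, into three dimensionless groups that then
# serve as the ML input features instead of the raw parameters.

def dimensionless_features(Qc, Qd, mu, sigma, rho, w):
    u = Qc / w**2                 # characteristic continuous-phase velocity
    Ca = mu * u / sigma           # capillary number: viscous vs interfacial
    Re = rho * u * w / mu         # Reynolds number: inertial vs viscous
    q = Qd / Qc                   # flow-rate ratio
    return np.array([Ca, Re, q])

# Example values (illustrative only, SI units)
feats = dimensionless_features(Qc=1e-9, Qd=5e-10, mu=1e-3,
                               sigma=5e-3, rho=1000.0, w=1e-4)
print(feats)   # three dimensionless inputs instead of six dimensional ones
```

Feeding such groups to an SVM or BPNN lowers the input dimensionality while, by construction, retaining the physics encoded in the primitive parameters.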


2020 ◽  
Vol 11 (3) ◽  
pp. 80-105 ◽  
Author(s):  
Vijay M. Khadse ◽  
Parikshit Narendra Mahalle ◽  
Gitanjali R. Shinde

The emerging area of the internet of things (IoT) generates large amounts of data from IoT applications such as health care, smart cities, etc. This data needs to be analyzed in order to derive useful inferences, and machine learning (ML) plays a significant role in analyzing it. It is difficult to select the optimal algorithm from the available set of algorithms/classifiers to obtain the best results, because the performance of algorithms differs when applied to datasets from different application domains. It is also difficult to tell whether a difference in performance is real or due to random variation in the test data, the training data, or the internal randomness of the learning algorithms. This study takes these issues into account in a comparison of ML algorithms for binary and multivariate classification, and it provides guidelines for the statistical validation of results. The results obtained show that the accuracy of one algorithm differs from the others by more than the critical difference (CD) over binary and multivariate datasets drawn from different application domains.
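
A critical-difference check of the kind referenced above can be sketched with a Nemenyi-style post-hoc test: two classifiers are considered significantly different only if their mean ranks across N datasets differ by more than CD. The accuracy table and the critical value below are illustrative assumptions, not the study's data:

```python
import numpy as np

# Hedged sketch of a critical-difference (CD) comparison.
acc = np.array([            # rows: datasets, columns: algorithms A, B, C
    [0.81, 0.78, 0.70],
    [0.85, 0.84, 0.72],
    [0.79, 0.80, 0.68],
    [0.88, 0.86, 0.75],
    [0.83, 0.81, 0.74],
])
N, k = acc.shape
# rank algorithms within each dataset (1 = best accuracy)
ranks = (-acc).argsort(axis=1).argsort(axis=1) + 1
mean_ranks = ranks.mean(axis=0)

q_alpha = 2.343                 # Nemenyi critical value for k=3, alpha=0.05
cd = q_alpha * np.sqrt(k * (k + 1) / (6.0 * N))
print(mean_ranks, cd)
# A pair of algorithms differs significantly when
# |mean_rank_i - mean_rank_j| > cd.
```

In this toy table, algorithm C's mean rank differs from A's by more than CD, so the gap would be judged real rather than random variation.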


2021 ◽  
Author(s):  
Yew Kee Wong

In the information era, enormous amounts of data have become available to decision makers. Big data refers to datasets that are not only big, but also high in variety and velocity, which makes them difficult to handle using traditional tools and techniques. Due to the rapid growth of such data, solutions need to be studied and provided in order to handle these datasets and extract value and knowledge from them. Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention. Such minimal human intervention can be provided using big data analytics, which is the application of advanced analytics techniques to big data. This paper aims to analyse some of the different machine learning algorithms and methods which can be applied to big data analysis, as well as the opportunities provided by the application of big data analytics in various decision-making domains.


2021 ◽  
Vol 5 (2(15)) ◽  
pp. 61-76
Author(s):  
Vasilii Konstantinovich Alekhin

The social network TikTok has a strong competitive differentiator compared with other platforms: ByteDance exploits machine learning algorithms to generate a recommendation feed (the "For You" page). The algorithm is based on two main mechanisms. The first clusters the content database by type, audio track, video captions, and hashtags. The second analyzes the user's behavioral patterns based on their actions in the application. The next step is the formation of user-interaction scenarios. The difference between the predicted behavior and the real behavior is the object of analysis: if it equals zero, the recommendation feed is formed correctly, and the user keeps watching more and more interesting videos, just scrolling from video to video.
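
The feedback signal described above can be reduced to a toy calculation: predict how the user will interact with each suggested video, observe the real behavior, and treat the gap as the quantity to drive toward zero. All numbers here are invented for illustration:

```python
# Toy sketch of the prediction-vs-behavior gap used as the feed's
# quality signal. Values are fabricated for illustration.

predicted = [0.9, 0.2, 0.7, 0.1]   # predicted watch-through probability
observed  = [1.0, 0.0, 1.0, 0.0]   # 1 = watched fully, 0 = skipped

error = sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)
print(error)   # 0 would mean the feed matches behavior perfectly
```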


Author(s):  
Hozan Khalid Hamarashid

The mean result of machine learning models is determined using k-fold cross-validation. The algorithm with the best average performance should surpass those with the poorest, but what if the difference in average outcomes is the consequence of a statistical anomaly? To determine whether the difference in mean results between two algorithms is genuine, a statistical hypothesis test is utilized. Using statistical hypothesis testing, this study demonstrates how to compare machine learning algorithms. During model selection, the output of several machine learning algorithms or simulation pipelines is compared; the model that performs best on your performance measure becomes the final model, which can be used to make predictions on new data. This can be conducted with classification and regression prediction models, utilizing traditional machine learning and deep learning methods. The difficulty is to identify whether the difference between two models is genuine.
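
A minimal version of the comparison described above pairs the per-fold scores of two models evaluated on the same folds and applies a paired t-test to the differences. The scores below are fabricated for illustration; with real models they would come from a cross-validation loop:

```python
import numpy as np
from scipy.stats import ttest_rel

# Sketch: paired t-test on per-fold scores from the same 10 folds.
scores_a = np.array([0.82, 0.79, 0.85, 0.80, 0.83,
                     0.81, 0.84, 0.78, 0.82, 0.80])   # model A
scores_b = np.array([0.76, 0.74, 0.79, 0.75, 0.77,
                     0.78, 0.76, 0.73, 0.77, 0.75])   # model B, same folds

t_stat, p_value = ttest_rel(scores_a, scores_b)
alpha = 0.05
# p_value < alpha: reject "same mean", the gap looks genuine
print(t_stat, p_value)
```

Pairing by fold matters: both models see identical train/test splits, so the test acts on the per-fold differences rather than on two unrelated samples.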


2021 ◽  
Author(s):  
Karyna Rodriguez ◽  
Neil Hodgson

<p>Seismic data has been and continues to be the main tool for hydrocarbon exploration. Storing very large quantities of seismic data, as well as making it easily accessible and with machine learning functionality, is the way forward to gain regional and local understanding of petroleum systems. Seismic data has been made available as a streamed service through a web-based platform allowing seismic data access on the spot, from large datasets stored in the cloud. A data lake can be defined as transformed data used for tasks such as reporting, visualization, advanced analytics and machine learning. The global library of data has been deconstructed from the rigid flat file format traditionally associated with seismic and transformed into a distributed, scalable, big data store. This allows for rapid access, complex queries, and efficient use of computer power – fundamental criteria for enabling Big Data technologies such as deep learning.  </p><p>This data lake concept is already changing the way we access seismic data, enhancing the efficiency of gaining insights into any hydrocarbon basin. Examples include the identification of potentially prolific mixed turbidite/contourite systems in the Trujillo Basin offshore Peru, together with important implications of BSR-derived geothermal gradients, which are much higher than expected in a fore arc setting, opening new exploration opportunities. Another example is de-risking and ranking of offshore Malvinas Basin blocks by gaining new insights into areas until very recently considered to be non-prospective. Further de-risking was achieved by carrying out an in-depth source rock analysis in the Malvinas and conjugate southern South Africa Basins. 
Additionally, the data lake enabled the development of machine learning algorithms for channel recognition, which were successfully applied to data offshore Australia and Norway.</p><p>“On demand” regional seismic dataset access is proving invaluable in our efforts to make hydrocarbon exploration more efficient and successful. Machine learning algorithms are helping to automate the more mechanical tasks, leaving time for the more valuable task of analysing the results. The geological insights gained by combining these two aspects confirm the value of seismic data lakes.</p>

