Model-Free Assessment of Inter-Well Connectivity in CO2 WAG Projects Using Statistical Recurrent Unit Models

2021 ◽  
Author(s):  
Hongquan Chen ◽  
Deepthi Sen ◽  
Akhil Datta-Gupta ◽  
Masahiro Nagao

Abstract Routine well-wise injection and production measurements contain significant information on subsurface structure and properties. Data-driven technology that interprets surface data into subsurface structure or properties can assist operators in making informed decisions by providing a better understanding of field assets. Our machine-learning framework is built on the statistical recurrent unit (SRU) model and interprets well-based injection/production data into inter-well connectivity without relying on a geologic model. We test it on synthetic and field-scale CO2 EOR projects utilizing the water-alternating-gas (WAG) process. The SRU is a special type of recurrent neural network (RNN) that allows for better characterization of temporal trends by learning various statistics of the input at different time scales. In our application, the complete states (injection rate, pressure, and cumulative injection) at injectors and the pressure states at producers are fed to the SRU as the input, and the phase rates at producers are treated as the output. Once the SRU is trained and validated, it is used to assess the connectivity of each injector to any producer using the permutation variable importance method, wherein the inputs corresponding to an injector are shuffled and the increase in prediction error at a given producer is recorded as the importance (connectivity metric) of that injector to the producer. This method is tested in both synthetic and field-scale cases. Validation of the proposed data-driven inter-well connectivity assessment is performed using synthetic data from simulation models, where inter-well connectivity can be easily measured using streamline-based flux allocation. The SRU model is shown to offer excellent prediction performance on the synthetic case. 
Despite significant measurement noise and frequent well shut-ins imposed in the field-scale case, the SRU model offers good prediction accuracy: the overall relative error of the phase production rates at most producers ranges from 10% to 30%. It is shown that the dominant connections identified by the data-driven method and the streamline method are in close agreement, which significantly improves confidence in our data-driven procedure. The novelty of this work is that it is a purely data-driven method that can directly interpret routine surface measurements into intuitive subsurface knowledge. Furthermore, the streamline-based validation procedure provides physics-based backing to the results obtained from data analytics. The study results in a reliable and efficient data analytics framework that is well-suited for large field applications.
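The permutation variable importance step described in the abstract can be sketched as follows. The array layout, the injector-to-column mapping, and the `model` interface are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def permutation_importance(model, X, y, injector_cols, n_repeats=10, seed=0):
    """Connectivity metric: increase in prediction error at a producer
    when one injector's input columns are shuffled in time."""
    rng = np.random.default_rng(seed)
    base_error = np.mean((model.predict(X) - y) ** 2)
    importance = {}
    for injector, cols in injector_cols.items():
        errors = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            perm = rng.permutation(len(X))
            # Shuffle only this injector's input columns across time steps.
            X_perm[:, cols] = X_perm[perm][:, cols]
            errors.append(np.mean((model.predict(X_perm) - y) ** 2))
        importance[injector] = np.mean(errors) - base_error
    return importance
```

An injector whose columns barely affect the producer's predicted rates gets an importance near zero, i.e. weak connectivity; a strongly connected injector produces a large error increase when shuffled.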

2011 ◽  
Vol 323 ◽  
pp. 40-45 ◽  
Author(s):  
Shi Lun Zuo ◽  
Jia Xu Wang ◽  
Tai Fu Li

The continuous casting process is a traditional and widely used technique for producing the cathodes of electrolytic lead. In this paper, soft-sensors based on a support vector regression (SVR) model and an artificial neural network (ANN) model, respectively, are presented for estimating the thickness of the lead slices in the process. Experiments were performed on the continuous casting machine to obtain the data used for training and testing the soft-sensors. For the continuous casting process, the soft-sensors proposed here represent viable and inexpensive on-line sensors. The study results indicate that the soft-sensors provide good prediction accuracy for the slice thickness, and that even better performance can be achieved by applying pre-processing procedures to the input data. The results also show that the SVR model is an attractive alternative to the ANN model for the soft-sensors when the number of samples is relatively small.
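A minimal soft-sensor of this kind can be sketched with scikit-learn's `SVR`. The process variables and the thickness target below are synthetic stand-ins for the casting-machine measurements, and the standardization step stands in for the pre-processing the paper reports as beneficial:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-ins for process variables (e.g. casting speed, melt
# temperature, roller gap) and the measured lead-slice thickness in mm.
X = rng.uniform(size=(120, 3))
thickness = 4.0 + 1.5 * X[:, 0] - 0.8 * X[:, 2] + rng.normal(0, 0.05, 120)

# Standardized inputs + RBF-kernel SVR, which suits small sample counts.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X[:100], thickness[:100])

pred = model.predict(X[100:])
mae = np.mean(np.abs(pred - thickness[100:]))
```

The held-out mean absolute error `mae` plays the role of the prediction-accuracy assessment in the study; an ANN soft-sensor would slot into the same pipeline in place of `SVR`.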


Water ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1208
Author(s):  
Massimiliano Bordoni ◽  
Fabrizio Inzaghi ◽  
Valerio Vivaldi ◽  
Roberto Valentino ◽  
Marco Bittelli ◽  
...  

Soil water potential is a key factor for studying water dynamics in soil and for estimating the occurrence of natural hazards such as landslides. This parameter can be measured in the field or estimated through physically-based models, which are limited by the availability of effective input soil properties and by preliminary calibrations. Data-driven models, based on machine learning techniques, could overcome these gaps. The aim of this paper is therefore to develop an innovative machine learning methodology to assess soil water potential trends and to implement them in models to predict shallow landslides. Monitoring data collected since 2012 from test-site slopes in Oltrepò Pavese (northern Italy) were used to build the models. Among the tested techniques, Random Forest models allowed an outstanding reconstruction of the measured temporal trends of soil water potential. Each model is sensitive to meteorological and hydrological characteristics according to soil depths and features. The reliability of the proposed models was confirmed by the correct estimation of the days when shallow landslides were triggered in the study areas in December 2020, after implementing the modeled trends in a slope stability model, and by the correct choice of physically-based rainfall thresholds. These results confirm the potential application of the developed methodology for estimating hydrological scenarios that could be used for decision-making purposes.


Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, creating economic and environmental loss. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, these efforts focused on specific perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters such as powerlines, terrain, and vegetation in different perspectives, improving the spatial and temporal accuracy of wildfire risk prediction, including fire ignition. The combined model uses the spatial and temporal parameters as a single combined dataset to train and predict fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiments show that the combined model produced better accuracy than the ensemble of random forest models trained on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The study achieved an accuracy of 92% in predicting wildfire risk, including ignition, by utilizing regional spatial and temporal data along with standard data parameters in Northern California.
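The contrast between the combined and ensemble approaches can be sketched as follows. The feature sets and labels are synthetic placeholders, not the study's wildfire data, and simple probability averaging stands in for the stacking step:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic placeholders: spatial features (terrain, vegetation, powerlines)
# and temporal features (weather history), with a binary fire-risk label.
X_spatial = rng.normal(size=(300, 3))
X_temporal = rng.normal(size=(300, 2))
y = ((X_spatial[:, 0] + X_temporal[:, 0]) > 0).astype(int)

# Combined model: one forest trained on the concatenated feature set.
combined = RandomForestClassifier(n_estimators=100, random_state=0)
combined.fit(np.hstack([X_spatial, X_temporal]), y)

# Ensemble model: separate forests per feature set, whose predicted
# probabilities are then merged (here by averaging) into a single output.
rf_s = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_spatial, y)
rf_t = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_temporal, y)

def ensemble_predict(Xs, Xt):
    p = (rf_s.predict_proba(Xs)[:, 1] + rf_t.predict_proba(Xt)[:, 1]) / 2
    return (p > 0.5).astype(int)
```

Because the label here depends jointly on a spatial and a temporal feature, the combined forest can exploit interactions that the per-feature-set forests never see, which mirrors the study's finding that the combined model is more accurate.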


2013 ◽  
Vol 631-632 ◽  
pp. 681-685
Author(s):  
Fang Shao ◽  
Fa Qing Li ◽  
Hai Ying Zhang ◽  
Xuan Gao

Aero-engine alloys (also known as superalloys) are known as difficult-to-machine materials, especially at higher cutting speeds, due to several inherent properties such as low thermal conductivity and high reactivity with cutting tool materials. In this paper, a finite element analysis (FEA) of machining Incoloy 907 is presented. In particular, the thermodynamical constitutive equation (T-C-E) in the FEA is applied for both the workpiece material and the tool material. Cutting temperature and cutting force are predicted, and the comparison between the predicted and experimental cutting temperatures and cutting forces is presented and discussed. The results indicate that good prediction accuracy of both the principal cutting temperature and the cutting force can be achieved by the FEA method with the thermodynamical constitutive equation.


2020 ◽  
Vol 45 (s1) ◽  
pp. 535-559
Author(s):  
Christian Pentzold ◽  
Lena Fölsche

Abstract Our article examines how journalistic reports and online comments have made sense of computational politics. It treats the discourse around data-driven campaigns as its object of analysis and codifies four main perspectives that have structured the debates about the use of large data sets and data analytics in elections. We study American, British, and German sources on the 2016 United States presidential election, the 2017 United Kingdom general election, and the 2017 German federal election. There, groups of speakers maneuvered between enthusiastic, skeptical, agnostic, or admonitory stances and so could not be clearly mapped onto these four discursive positions. Alongside these inconsistent accounts, public sensemaking was marked by an atmosphere of speculation about the substance and effects of computational politics. We conclude that this equivocality helped journalists and commentators to sideline prior reporting on the issue in order to repeatedly rediscover the practices they had already covered.


Author(s):  
Philip Joseph D Sarmiento ◽  
John Federick C Yap ◽  
Kevin Aldrin G Espinosa ◽  
Ria P Ignacio ◽  
Carisma A Caro

ABSTRACT A recent short report highlighted the necessity of sophisticated record-gathering practices that facilitate data sharing and enable data-driven analysis in the time of COVID-19. Consequently, there is a need to present the truth in data analytics in the era of COVID-19. This paper discusses the urgent call for people handling COVID-19 data to be ethically responsible in their handling, processing, and reporting, which impacts the lives of ordinary people, especially during this pandemic and public health crisis.


2021 ◽  

The COVID-19 pandemic is one of the worst public health crises that Brazil and the world have ever faced. One of the main challenges that healthcare systems face in decision-making is that the protocols tested in other epidemics do not guarantee success in controlling the spread of COVID-19, given its complexity. In this context, an effective response to guide the competent authorities in adopting public policies to fight COVID-19 depends on thoughtful analysis and effective data visualization, ideally based on different data sources. In this paper, we discuss and provide data analytics tools that can help in responding to the COVID-19 outbreak in Recife, Brazil. We use exploratory data analysis and inferential analysis to determine trend changes in COVID-19 cases and their effective or instantaneous reproduction numbers. According to the data on confirmed COVID-19 cases disaggregated at a regional level, we note a heterogeneous spread across most megaregions of Recife, Brazil. When the decreed quarantines are incorporated, their effectiveness is detected in the regions. Our results indicate that the measures have effectively curbed the spread of the disease in Recife, Brazil. However, other factors can cause the effective reproduction number to fall outside the expected ranges, which must be studied further.
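In the spirit of the inferential analysis described, one common way to compute an instantaneous reproduction number from a case series is to divide new cases by the infection pressure implied by a serial-interval distribution (a simplified Cori-style estimate; the case counts and serial interval below are made up for illustration):

```python
import numpy as np

def effective_r(cases, serial_interval):
    """Simplified instantaneous reproduction number R_t: new cases
    divided by the weighted sum of recent cases, with weights given
    by the (normalized) serial-interval distribution."""
    w = np.asarray(serial_interval, dtype=float)
    w /= w.sum()                      # normalize to a distribution
    r = np.full(len(cases), np.nan)   # undefined until enough history
    for t in range(len(w), len(cases)):
        # w[0] weights yesterday's cases, w[1] the day before, etc.
        pressure = np.dot(cases[t - len(w):t][::-1], w)
        if pressure > 0:
            r[t] = cases[t] / pressure
    return r
```

With a flat case series the estimate sits at 1, and sustained growth or decline pushes it above or below 1, which is the kind of trend-change signal the paper tracks per megaregion.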


2021 ◽  
Author(s):  
Gaurav Modi ◽  
Manu Ujjwal ◽  
Srungeer Simha

Abstract Short Term Injection Re-distribution (STIR) is a Python-based real-time waterflood optimization technique for brownfield assets that uses advanced data analytics. The objective of this technique is to generate recommendations for injection water re-distribution that maximize oil production at the facility level. Even though this is a data-driven technique, it is tightly bounded by petroleum engineering principles such as material balance. The workflow integrates and analyzes short-term data (the last 3-6 months) at the reservoir, well, and facility levels. The STIR workflow is divided into three modules: (1) injector-producer connectivity, (2) injector efficiency, and (3) injection water optimization. The first module uses four major data types to estimate the connectivity between each injector-producer pair in the reservoir: producer data (pressure, watercut, GOR, salinity), fault presence, subsurface distance, and perforation similarity (layers and kh). The second module uses the connectivity and watercut data to establish injector efficiency: higher-efficiency injectors contribute most to production, while poor-efficiency injectors contribute to water recycling. The third module contains a mathematical optimizer that maximizes oil production by re-distributing the injection water among injectors while honoring the constraints at each node (well, facility, etc.) of the production system. The STIR workflow has been applied to six reservoirs across different assets, and an annual increase of 3-7% in oil production is predicted. Each recommendation is verified using an independent source of data, and hence the generated recommendations align well with the reservoir understanding. The benefits of this technique can be seen within 3-6 months of implementation in terms of increased oil production and better support (pressure increase) to low-watercut producers. 
The inherent flexibility of the workflow allows for easy replication in any waterflooded reservoir, and the workflow works best when the injector well count in the reservoir is relatively high. Geological features are well represented in the workflow, which is one of its unique functionalities. The method also generates producer bean-up and injector stimulation candidate opportunities. This low-cost (no CAPEX) technique offers the advantages of both conventional petroleum engineering techniques and data-driven approaches. It provides a strong alternative for waterflood management in brownfields where performing a reliable conventional analysis is challenging or at times impossible. STIR can be implemented in a reservoir from scratch in a 3-6 week timeframe.
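The water re-distribution idea in the third module can be sketched with a simple greedy allocation, used here purely as an illustrative stand-in for STIR's mathematical optimizer. The efficiencies and rate limits are hypothetical:

```python
def redistribute_injection(total_water, injectors):
    """Greedy stand-in for an injection-water optimizer: allocate the
    available water to injectors in order of efficiency (incremental oil
    per unit of injected water), honoring each injector's min/max rate.

    injectors: dict of name -> (efficiency, min_rate, max_rate)
    """
    # Every injector first receives its minimum rate (well constraint).
    alloc = {name: lo for name, (_, lo, _) in injectors.items()}
    remaining = total_water - sum(alloc.values())
    if remaining < 0:
        raise ValueError("total_water cannot satisfy minimum rates")
    # Spend the rest on the most efficient injectors first.
    for name, (eff, lo, hi) in sorted(
        injectors.items(), key=lambda kv: kv[1][0], reverse=True
    ):
        extra = min(hi - lo, remaining)
        alloc[name] += extra
        remaining -= extra
    return alloc
```

A production-grade optimizer would also handle facility-node constraints and the nonlinear response of producers, but the structure, ranking injectors by efficiency and honoring per-node limits, matches the module's description.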


2021 ◽  
Author(s):  
Alex Chin ◽  
Dean Eckles ◽  
Johan Ugander

When trying to maximize the adoption of a behavior in a population connected by a social network, it is common to strategize about where in the network to seed the behavior, often with an element of randomness. Selecting seeds uniformly at random is a basic but compelling strategy in that it distributes seeds broadly throughout the network. A more sophisticated stochastic strategy, one-hop targeting, is to select random network neighbors of random individuals; this exploits a version of the friendship paradox, whereby the friend of a random individual is expected to have more friends than a random individual, with the hope that seeding a behavior at more connected individuals leads to more adoption. Many seeding strategies have been proposed, but empirical evaluations have demanded large field experiments designed specifically for this purpose and have yielded relatively imprecise comparisons of strategies. Here we show how stochastic seeding strategies can be evaluated more efficiently in such experiments, how they can be evaluated “off-policy” using existing data arising from experiments designed for other purposes, and how to design more efficient experiments. In particular, we consider contrasts between stochastic seeding strategies and analyze nonparametric estimators adapted from policy evaluation and importance sampling. We use simulations on real networks to show that the proposed estimators and designs can substantially increase precision while yielding valid inference. We then apply our proposed estimators to two field experiments, one that assigned households to an intensive marketing intervention and one that assigned students to an antibullying intervention. This paper was accepted by Gui Liberali, Special Issue on Data-Driven Prescriptive Analytics.
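The off-policy idea, reweighting outcomes observed under one seeding distribution to evaluate another, can be sketched with a basic importance-sampling estimator. The single-seed setup and the per-node probabilities below are simplifications of the paper's estimators, not a reproduction of them:

```python
import numpy as np

def off_policy_value(seeds, outcomes, p_logging, p_target):
    """Importance-sampling estimate of the mean outcome under a target
    seeding strategy, using seeds drawn from a logging strategy.

    seeds:     index of the seeded node in each experiment run
    outcomes:  observed adoption outcome for each run
    p_logging: probability each node is seeded under the deployed strategy
    p_target:  probability each node is seeded under the strategy to evaluate
    """
    w = p_target[seeds] / p_logging[seeds]  # importance weights
    return np.mean(w * outcomes)
```

Runs whose seeds the target strategy would rarely choose are down-weighted, and runs it would favor are up-weighted, so existing experimental data can score a strategy, such as one-hop targeting, that was never deployed.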

