Fracture time predictor in mask data preparation using machine learning

RSSI Data Preparation for Machine Learning

2020 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS) ◽

10.1109/icimcis51567.2020.9354273 ◽

2020 ◽

Author(s):

Dodo Zaenal Abidin ◽

Siti Nurmaini ◽

Reza Firsandava Malik ◽

Erwin ◽

Errissya Rasywir ◽

...

Keyword(s):

Machine Learning ◽

Data Preparation


Application of Machine Learning For Reservoir Facies Classification in Port Field, Offshore Niger Delta

10.2118/207163-ms ◽

2021 ◽

Author(s):

Jerome Asedegbega ◽

Oladayo Ayinde ◽

Alexander Nwakanma

Keyword(s):

Neural Network ◽

Machine Learning ◽

Niger Delta ◽

Nearest Neighbor ◽

Support Vector ◽

Data Preparation ◽

K Nearest Neighbor ◽

Blind Test ◽

Facies Classification ◽

Well Data

Several computer-aided techniques have been developed in the recent past to improve the interpretational accuracy of subsurface geology. This paradigm shift has brought considerable success in a variety of machine learning application domains and supports better feasibility studies in reservoir evaluation using multiple classification techniques. Facies classification is an essential subsurface exploration task, as sedimentary facies reflect the physical, chemical, and biological conditions that a formation unit experienced during sedimentation. This study employed formation samples for facies classification using machine learning (ML) techniques and classified different facies from well logs in seven (7) wells of the PORT Field, Offshore Niger Delta. Six wells were concatenated during data preparation and used to train supervised ML algorithms; the models were then validated by blind testing on one well log to predict discrete facies groups. The analysis started with data preparation and examination, in which various features of the available well data were conditioned. For model building and performance evaluation, support vector machine, random forest, decision tree, extra tree, neural network (multilayer perceptron), k-nearest neighbor, and logistic regression models were built after dividing the data into training, test, and blind test well sets. For the blind test well, the Jaccard index and F1-score were, respectively, 0.73 and 0.82 for support vector machine, 0.38 and 0.54 for random forest, 0.78 and 0.83 for extra tree, 0.91 and 0.95 for k-nearest neighbor, 0.41 and 0.56 for decision tree, 0.63 and 0.74 for logistic regression, and 0.55 and 0.68 for neural network. The efficiency of ML techniques in improving prediction accuracy and reducing procedure time, together with their data-driven approach, makes them well worth recommending for subsurface facies classification analysis.
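The two metrics reported for the blind test well can be illustrated with a minimal sketch. The abstract does not say how the scores were averaged across facies classes, so macro averaging is an assumption here, and the facies labels and the function names `per_class_counts` and `macro_jaccard_f1` are hypothetical.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    return tp, fp, fn


def macro_jaccard_f1(y_true, y_pred):
    """Return (macro Jaccard index, macro F1-score) over all classes seen.

    Per class: Jaccard = TP / (TP + FP + FN), F1 = 2*TP / (2*TP + FP + FN).
    Macro averaging (an assumption, not stated in the abstract) weights
    every class equally.
    """
    tp, fp, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    jaccards, f1s = [], []
    for c in classes:
        # Each class occurs at least once, so both denominators are > 0.
        jaccards.append(tp[c] / (tp[c] + fp[c] + fn[c]))
        f1s.append(2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]))
    return sum(jaccards) / len(classes), sum(f1s) / len(classes)


# Hypothetical blind-well labels, for illustration only.
blind_true = ["sand", "sand", "shale", "shale", "silt", "silt"]
blind_pred = ["sand", "shale", "shale", "shale", "silt", "sand"]
jaccard, f1 = macro_jaccard_f1(blind_true, blind_pred)
```

For any single class the two scores are tied by F1 = 2J / (1 + J), which is why the F1-score is never lower than the Jaccard index in the results quoted above.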


Data preparation step for automated diagnosis based on HRV analysis and machine learning

2016 6th International Conference on System Engineering and Technology (ICSET) ◽

10.1109/icsengt.2016.7849639 ◽

2016 ◽

Cited By ~ 2

Author(s):

Vincentius Timothy ◽

Ary Setijadi Prihatmanto ◽

Kyung-Hyune Rhee

Keyword(s):

Machine Learning ◽

Data Preparation ◽

Automated Diagnosis ◽

Preparation Step ◽

Hrv Analysis


Layer 2 Path Evaluation System using Machine Learning

ECTI Transactions on Electrical Engineering, Electronics, and Communications ◽

10.37936/ecti-eec.2021193.244943 ◽

2021 ◽

Vol 19 (3) ◽

Author(s):

Mahamah Sebakor

Keyword(s):

Machine Learning ◽

Spanning Tree ◽

Evaluation System ◽

Software Defined Network ◽

Data Preparation ◽

Spanning Tree Protocol ◽

Novel Approach ◽

Layer 2 ◽

And Performance ◽

Network Administrators

Is it strange that the spanning tree protocol (STP) has been the only mechanism used to protect the Layer-2 backbone against loops? Do we trust it? For several decades, the campus backbone has often been an unsuspected source of problems, one of which is STP failure. Meanwhile, MAC address flapping is a plausible issue for modern network fabrics. In view of these serious Layer-2 issues, particularly in legacy switches with an extended STP design, this work uses a software-defined-network style of approach to evaluate traditional and modern networks. Through MAC address lookup on all bridge devices, this work proposes the Layer-2 evaluation system (LES), which uses a novel approach known as support supervised learning to create the data preparation for machine learning. Additionally, the LES enables network administrators to assess their backbones. This study is intended to evaluate potential network slowdowns caused by MAC address problems. Furthermore, this work investigates the proposed method in a real network and covers its evaluation and performance.


Data Preparation of the nuMoM2b Dataset

10.1101/2021.08.24.21262142 ◽

2021 ◽

Author(s):

Anton Goretsky ◽

Anastasia Dmitrienko ◽

Irene Tang ◽

Nicolae Lari ◽

Owen Kunhardt ◽

...

Keyword(s):

Machine Learning ◽

Pregnancy Outcomes ◽

Statistical Models ◽

Diverse Population ◽

Data Preparation ◽

Large Dataset ◽

Nulliparous Women ◽

Data Extract ◽

Outcomes Study ◽

Study Monitoring

In 2010, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) started the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be (nuMoM2b), a prospective cohort study of a racially/ethnically/geographically diverse population of nulliparous women with singleton gestation. The nuMoM2b is a very large dataset, consisting of data for 10,038 patients with over 4,600 features per patient, spread out over 80 files. In this report, we share our experience preparing and working with this dataset. We present our preprocessing of the nuMoM2b dataset, which aims to give a deeper understanding of the data, extract the most relevant features, make the fewest assumptions when filling in unknown values, and reduce the dimensionality of the data. We hope this report is useful to researchers interested in building machine learning and statistical models from the nuMoM2b dataset.
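The report's actual procedure is not given in this abstract, so the following is only a sketch of two generic steps that match its stated goals: dropping features that are mostly missing or never vary (reducing dimensionality), and flagging unknown values instead of guessing replacements (making few assumptions). The field names, the 0.5 threshold, and the functions `prune_features` and `encode_with_missing_flag` are all hypothetical.

```python
def prune_features(rows, max_missing_fraction=0.5):
    """Return the names of features worth keeping.

    A feature is dropped when too many of its values are missing (None)
    or when its observed values never vary.
    """
    features = sorted({name for row in rows for name in row})
    kept = []
    for name in features:
        observed = [row[name] for row in rows if row.get(name) is not None]
        missing_fraction = 1 - len(observed) / len(rows)
        if missing_fraction > max_missing_fraction:
            continue  # mostly unknown
        if len(set(observed)) <= 1:
            continue  # constant, carries no information
        kept.append(name)
    return kept


def encode_with_missing_flag(rows, features):
    """Keep unknown values as None and add an explicit indicator column,
    rather than imputing a value."""
    encoded = []
    for row in rows:
        out = {}
        for name in features:
            value = row.get(name)
            out[name] = value
            out[name + "_missing"] = value is None
        encoded.append(out)
    return encoded


# Hypothetical patient records, for illustration only.
records = [
    {"age": 29, "bmi": 24.1, "site": "A", "rare_lab": None},
    {"age": 33, "bmi": None, "site": "A", "rare_lab": None},
    {"age": 25, "bmi": 27.8, "site": "A", "rare_lab": 1.2},
    {"age": 31, "bmi": 22.5, "site": "A", "rare_lab": None},
]
kept = prune_features(records)
encoded = encode_with_missing_flag(records, kept)
```

Here `rare_lab` is dropped because three of four values are missing, and `site` is dropped because it is constant, leaving `age` and `bmi`.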


Machine Learning and data mining tools applied for databases of low number of records

Advanced Engineering Research ◽

10.23947/2687-1653-2021-21-4-346-363 ◽

2022 ◽

Vol 21 (4) ◽

pp. 346-363

Author(s):

Hubert Anysz

Keyword(s):

Machine Learning ◽

Data Mining ◽

Computational Methods ◽

Large Datasets ◽

Learning Tools ◽

Data Preparation ◽

Preparation Methods ◽

Use Of Data ◽

Small Set ◽

Mining Tools

The use of data mining and machine learning tools is becoming increasingly common. Their usefulness is most noticeable with large datasets, where the information sought, or new relationships, are extracted from information noise. As these tools develop, datasets with far fewer records, usually associated with specific phenomena, are also being explored. This specificity most often makes it impossible to increase the number of cases, which would otherwise facilitate the search for dependences in the phenomena under study. The paper discusses the features of applying the selected tools to a small set of data. It attempts to present methods of data preparation and methods for calculating the performance of the tools, taking into account the specifics of databases with a small number of records. The techniques selected by the author are proposed; they helped to break a deadlock in calculations, i.e., obtaining results much worse than expected. The need to apply methods that improve the accuracy of forecasts and of classification arose from the small amount of analysed data. This paper is not a review of popular methods of machine learning and data mining; nevertheless, the collected and presented material will help the reader to shorten the path to obtaining satisfactory results when using the described computational methods.
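The abstract does not name the performance-estimation methods it covers. One common choice when records are scarce is leave-one-out cross-validation, shown below as a sketch; it is an assumption, not necessarily the author's technique. The 1-nearest-neighbour classifier, the data, and the function names are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the closest training point (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each record once, train on the rest, and average the hits.

    Every record is used for testing exactly once, so no data is
    permanently set aside, which matters when the dataset is small.
    """
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Six hypothetical records with two features each.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.2), (3.1, 2.9), (1.9, 2.0)]
labels = ["low", "low", "low", "high", "high", "high"]
accuracy = leave_one_out_accuracy(points, labels)
```

With these six records, five held-out points are classified correctly and the borderline point `(1.9, 2.0)` is not, giving an accuracy of 5/6.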


RPCA FOR INCORRECT UPPER AIR RADIO SOUNDING DATA DETECTION

Meteorologiya i Gidrologiya ◽

10.52002/0130-2906-2021-9-105-116 ◽

2021 ◽

pp. 105-116

Author(s):

A. M. KOZIN ◽


A. D. LYKOV ◽

I. A. VYAZANKIN ◽

A. S. VYAZANKIN ◽

...

Keyword(s):

Machine Learning ◽

Middle Atmosphere ◽

Data Detection ◽

Analytic Center ◽

Data Preparation ◽

Learning Methods ◽

International Network ◽

Machine Learning Methods ◽

Central Aerological Observatory

The “Middle Atmosphere” Regional Information and Analytic Center (Central Aerological Observatory) is developing algorithms for analyzing the quality of aerological data based on machine learning methods. Different approaches to data preparation are described, examples of data that were rejected using standard approaches are given, and ways to develop and improve the quality of aerological information transmitted to the WMO international network are outlined.


Data preprocessing in predictive data mining

The Knowledge Engineering Review ◽

10.1017/s026988891800036x ◽

2019 ◽

Vol 34 ◽

Cited By ~ 9

Author(s):

Stamatios-Aggelos N. Alexandropoulos ◽

Sotiris B. Kotsiantis ◽

Michael N. Vrahatis

Keyword(s):

Machine Learning ◽

Data Mining ◽

Processing Time ◽

Data Preprocessing ◽

Difficult Problem ◽

Data Preparation ◽

Learning Tasks ◽

Effective Performance ◽

Predictive Data Mining

A large variety of issues influence the success of data mining on a given problem. Two primary and important issues are the representation and the quality of the dataset. Specifically, if much redundant and unrelated or noisy and unreliable information is presented, then knowledge discovery becomes a very difficult problem. It is well-known that data preparation steps require significant processing time in machine learning tasks. It would be very helpful and quite useful if there were various preprocessing algorithms with the same reliable and effective performance across all datasets, but this is impossible. To this end, we present the most well-known and widely used up-to-date algorithms for each step of data preprocessing in the framework of predictive data mining.


Preparation of Quality Data for Air Pollution Forecasting

International Journal of Scientific Research in Science and Technology ◽

10.32628/ijsrst196511 ◽

2019 ◽

pp. 51-56

Author(s):

Y. Lathasree ◽

G. Mamatha

Keyword(s):

Machine Learning ◽

Air Pollution ◽

Physical Models ◽

Machine Learning Techniques ◽

Quality Data ◽

Data Preparation ◽

Learning Techniques ◽

Machine Learning Model ◽

Applications Of Machine Learning ◽

Air Pollution Forecasting

This paper proposes a procedure for preparing quality data to train an accurate machine learning model. Data preparation is very important in machine learning. Here the data are prepared for air pollution forecasting. Air pollution forecasting has traditionally been done with physical models of the atmosphere, which are unstable and inaccurate over long periods of time. Since machine learning techniques are more robust to perturbations, this paper explores data preparation and applications of machine learning to air pollution forecasting to potentially generate more accurate predictions. A linear regression model is trained on the prepared data to predict air pollution more accurately.
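As a rough illustration of the pipeline this abstract outlines, the sketch below drops incomplete readings and then fits a one-variable linear regression by ordinary least squares. The paper's actual features and cleaning rules are not given here, so the readings (previous-day level, next-day level) and the function names `clean_pairs`, `fit_line`, and `predict` are hypothetical.

```python
def clean_pairs(pairs):
    """Drop readings where either value is missing (None)."""
    return [(x, y) for x, y in pairs if x is not None and y is not None]


def fit_line(pairs):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical (previous-day level, next-day level) readings with gaps.
raw_readings = [
    (40.0, 32.0), (60.0, 42.0), (None, 50.0),
    (80.0, 52.0), (100.0, None), (100.0, 62.0),
]
model = fit_line(clean_pairs(raw_readings))
forecast = predict(model, 120.0)
```

On this toy data the fitted line has slope 0.5 and intercept 12, so the forecast for an input of 120 is 72. Dropping incomplete rows is the simplest possible cleaning rule; whether it is appropriate depends on how much data is lost.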


Towards low-cost machine learning solutions for manufacturing SMEs

AI & Society ◽

10.1007/s00146-021-01332-8 ◽

2021 ◽

Author(s):

Jan Kaiser ◽

German Terrazas ◽

Duncan McFarlane ◽

Lavindra de Silva

Keyword(s):

Machine Learning ◽

Production Systems ◽

Low Cost ◽

Parameter Tuning ◽

Simple Solution ◽

Data Preparation ◽

Comprehensive Understanding ◽

Manufacturing Environment ◽

Extensive Data ◽

Learning Capabilities

Machine learning (ML) is increasingly used to enhance production systems and meet the requirements of a rapidly evolving manufacturing environment. Compared to larger companies, however, small- and medium-sized enterprises (SMEs) lack resources, available data and skills, which impedes the potential adoption of analytics solutions. This paper proposes a preliminary yet general approach to identify low-cost analytics solutions for manufacturing SMEs, with particular emphasis on ML. The initial studies seem to suggest that, contrary to what is usually thought at first glance, SMEs seldom need digital solutions that use advanced ML algorithms which require extensive data preparation, laborious parameter tuning and a comprehensive understanding of the underlying problem. If an analytics solution does require learning capabilities, a ‘simple solution’, which we will characterise in this paper, should be sufficient.
