Comparison of Machine Learning Classifiers for Accurate Prediction of Real-Time Stuck Pipe Incidents

Energies ◽  
2020 ◽  
Vol 13 (14) ◽  
pp. 3683 ◽  
Author(s):  
Javed Akbar Khan ◽  
Muhammad Irfan ◽  
Sonny Irawan ◽  
Fong Kam Yao ◽  
Md Shokor Abdul Rahaman ◽  
...  

Stuck pipe incidents are one of the contributors to non-productive time (NPT), where they can result in a higher well cost. This research investigates the feasibility of applying machine learning to predict events of stuck pipes during drilling operations in petroleum fields. The predictive model aims to predict the occurrence of stuck pipes so that relevant drilling operation personnel are warned to enact a mitigation plan to prevent stuck pipes. Two machine learning methodologies were studied in this research, namely, the artificial neural network (ANN) and support vector machine (SVM). A total of 268 data sets were successfully collected through data extraction for the well drilling operation. The data also consist of the parameters with which the stuck pipes occurred during the drilling operations. These drilling parameters include information such as the properties of the drilling fluid, bottom-hole assembly (BHA) specification, state of the bore-hole and operating conditions. The R programming software was used to construct both the ANN and SVM machine learning models. The prediction performance of the machine learning models was evaluated in terms of accuracy, sensitivity and specificity. Sensitivity analysis was conducted on these two machine learning models. For the ANN, two activation functions—namely, the logistic activation function and hyperbolic tangent activation function—were tested. Additionally, all the possible combinations of network structures, from [19, 1, 1, 1, 1] to [19, 10, 10, 10, 1], were tested for each activation function. For the SVM, three kernel functions—namely, linear, Radial Basis Function (RBF) and polynomial—were tested. Apart from that, SVM hyper-parameters such as the regularization factor (C), sigma (σ) and degree (D) were used in sensitivity analysis as well. 
The results from the sensitivity analysis demonstrate that the best ANN model achieved 88.89% accuracy, 91.89% sensitivity and 86.36% specificity, whereas the best SVM model achieved 83.95% accuracy, 86.49% sensitivity and 81.82% specificity. The ANN is therefore the better machine learning model in this study: it outperforms the best SVM model on all three metrics. In conclusion, given the promising prediction accuracy demonstrated in this study, stuck pipe prediction using machine learning appears practical.
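Accuracy, sensitivity and specificity all follow from a confusion matrix, which the following minimal sketch illustrates. The counts are an assumption: 34 of 37 stuck-pipe cases and 38 of 44 non-stuck cases classified correctly for the ANN (32 of 37 and 36 of 44 for the SVM) were chosen only because they reproduce the reported percentages; the abstract does not state the test-set composition. The function and variable names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity

# Assumed counts, not taken from the paper
ann = classification_metrics(tp=34, fn=3, tn=38, fp=6)
svm = classification_metrics(tp=32, fn=5, tn=36, fp=8)

for name, metrics in (("ANN", ann), ("SVM", svm)):
    print(name, [f"{100 * value:.2f}%" for value in metrics])
```

Under these assumed counts the two models differ by only two cases in each class, which gives a sense of how small the test set behind the comparison may be.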

2019 ◽  
Vol 3 (s1) ◽  
pp. 60-61
Author(s):  
Kadie Clancy ◽  
Esmaeel Dadashzadeh ◽  
Christof Kaltenmeier ◽  
JB Moses ◽  
Shandong Wu

OBJECTIVES/SPECIFIC AIMS: This retrospective study aims to create and train machine learning models using a radiomic-based feature extraction method for two classification tasks: benign vs. pathologic PI and operation of benefit vs. operation not needed. The long-term goal of our study is to build a computerized model that incorporates both radiomic features and critical non-imaging clinical factors to improve current surgical decision-making when managing PI patients. METHODS/STUDY POPULATION: Searched radiology reports from 2010-2012 via the UPMC MARS Database for reports containing the term “pneumatosis” (subsequently accounting for negations and age restrictions). Our inclusion criteria included: patient age 18 or older, clinical data available at time of CT diagnosis, and PI visualized on manual review of imaging. Cases with intra-abdominal free air were excluded. Collected CT imaging data and an additional 149 clinical data elements per patient for a total of 75 PI cases. Data collection of an additional 225 patients is ongoing. We trained models for two clinically-relevant prediction tasks. The first (referred to as prediction task 1) classifies between benign and pathologic PI. Benign PI is defined as either lack of intraoperative visualization of transmural intestinal necrosis or successful non-operative management until discharge. Pathologic PI is defined as either intraoperative visualization of transmural PI or withdrawal of care and subsequent death during hospitalization. The distribution of data samples for prediction task 1 is 47 benign cases and 38 pathologic cases. The second (referred to as prediction task 2) classifies between whether the patient benefitted from an operation or not. “Operation of benefit” is defined as patients with PI, be it transmural or simply mucosal, who benefited from an operation. 
“Operation not needed” is defined as patients who were safely discharged without an operation or patients who had an operation, but nothing was found. The distribution of data samples for prediction task 2 is 37 operation not needed cases and 38 operation of benefit cases. An experienced surgical resident from UPMC manually segmented 3D PI ROIs from the CT scans (5 mm Axial cut) for each case. The most concerning ~10-15 cm segment of bowel for necrosis with a 1 cm margin was selected. A total of 7 slices per patient were segmented for consistency. For both prediction task 1 and prediction task 2, we independently completed the following procedure for testing and training: 1.) Extracted radiomic features from the 3D PI ROIs that resulted in 99 total features. 2.) Used LASSO feature selection to determine the subset of the original 99 features that are most significant for performance of the prediction task. 3.) Used leave-one-out cross-validation for testing and training to account for the small dataset size in our preliminary analysis. Implemented and trained several machine learning models (AdaBoost, SVM, and Naive Bayes). 4.) Evaluated the trained models in terms of AUC and Accuracy and determined the ideal model structure based on these performance metrics. RESULTS/ANTICIPATED RESULTS: Prediction Task 1: The top-performing model for this task was an SVM model trained using 19 features. This model had an AUC of 0.79 and an accuracy of 75%. Prediction Task 2: The top-performing model for this task was an SVM model trained using 28 features. This model had an AUC of 0.74 and an accuracy of 64%. DISCUSSION/SIGNIFICANCE OF IMPACT: To the best of our knowledge, this is the first study to use radiomic-based machine learning models for the prediction of tissue ischemia, specifically intestinal ischemia in the setting of PI. 
In this preliminary study, which serves as a proof of concept, the performance of our models has demonstrated the potential of machine learning based only on radiomic imaging features to have discriminative power for surgical decision-making problems. While many non-imaging-related clinical factors play a role in the gestalt of clinical decision making when PI presents, we have presented radiomic-based models that may augment this decision-making process, especially for more difficult cases when clinical features indicating acute abdomen are absent. It should be noted that prediction task 2, whether or not a patient presenting with PI would benefit from an operation, has lower performance than prediction task 1 and is also a more challenging task for physicians in real clinical environments. While our results are promising and demonstrate potential, we are currently working to increase our dataset to 300 patients to further train and assess our models.
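The leave-one-out cross-validation step described in the methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: a nearest-centroid classifier stands in for the SVM, AdaBoost and Naive Bayes models, the two-feature data are made up, and all names are hypothetical. In a full pipeline, feature selection such as LASSO would normally be repeated inside each fold so that the held-out case does not influence which features are chosen.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], sample))


def leave_one_out_accuracy(xs, ys):
    """Hold out each case once, train on the rest, and score the held-out case."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up two-feature data: 0 = benign, 1 = pathologic
features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
            [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
labels = [0, 0, 0, 1, 1, 1]

loo_accuracy = leave_one_out_accuracy(features, labels)
print(loo_accuracy)
```

With n cases the model is trained n times, which is affordable for a data set of this size and uses every case for testing exactly once.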


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1620.1-1621
Author(s):  
J. Lee ◽  
H. Kim ◽  
S. Y. Kang ◽  
S. Lee ◽  
Y. H. Eun ◽  
...  

Background: Tumor necrosis factor (TNF) inhibitors are important drugs in treating patients with ankylosing spondylitis (AS). However, they are not used as a first-line treatment for AS. There is an insufficient treatment response to the first-line treatment, non-steroidal anti-inflammatory drugs (NSAIDs), in over 40% of patients. If we can predict who will need TNF inhibitors at an earlier phase, adequate treatment can be provided at an appropriate time and potential damage can be avoided. There is no precise predictive model at present. Recently, various machine learning methods have shown strong performance in predictions using clinical data. Objectives: We aim to generate an artificial neural network (ANN) model to predict early TNF inhibitor users in patients with ankylosing spondylitis. Methods: The baseline demographic and laboratory data of patients who visited Samsung Medical Center rheumatology clinic from Dec. 2003 to Sep. 2018 were analyzed. Patients were divided into two groups: early TNF inhibitor users treated by TNF inhibitors within six months of their follow-up (early-TNF users), and the others (non-early-TNF users). Machine learning models were formulated to predict the early-TNF users using the baseline data. Additionally, feature importance analysis was performed to delineate significant baseline characteristics. Results: The numbers of early-TNF and non-early-TNF users were 90 and 509, respectively. The best performing ANN model utilized 3 hidden layers with 50 hidden nodes each; its performance (area under curve (AUC) = 0.75) was superior to the logistic regression, support vector machine, and random forest models (AUC = 0.72, 0.65, and 0.71, respectively) in predicting early-TNF users. Feature importance analysis revealed erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), and height as the top significant baseline characteristics for predicting early-TNF users. 
Among these characteristics, height was revealed by machine learning models but not by conventional statistical techniques. Conclusion: Our model displayed superior performance in predicting early-TNF users compared with logistic regression and other machine learning models. Machine learning can be a vital tool in predicting treatment response in various rheumatologic diseases. Disclosure of Interests: None declared


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Martine De Cock ◽  
Rafael Dowsley ◽  
Anderson C. A. Nascimento ◽  
Davis Railsback ◽  
Jianwei Shen ◽  
...  

Abstract Background: In biomedical applications, valuable data is often split between owners who cannot openly share the data because of privacy regulations and concerns. Training machine learning models on the joint data without violating privacy is a major technology challenge that can be addressed by combining techniques from machine learning and cryptography. When collaboratively training machine learning models with the cryptographic technique named secure multi-party computation, the price paid for keeping the data of the owners private is an increase in computational cost and runtime. A careful choice of machine learning techniques, together with algorithmic and implementation optimizations, is necessary to enable practical secure machine learning over distributed data sets. Such optimizations can be tailored to the kind of data and machine learning problem at hand. Methods: Our setup involves secure two-party computation protocols, along with a trusted initializer that distributes correlated randomness to the two computing parties. We use a gradient-descent-based algorithm for training a logistic-regression-like model with a clipped ReLU activation function, and we break down the algorithm into corresponding cryptographic protocols. Our main contributions are a new protocol for computing the activation function that requires neither secure comparison protocols nor Yao’s garbled circuits, and a series of cryptographic engineering optimizations to improve the performance. Results: For our largest gene expression data set, we train a model that requires over 7 billion secure multiplications; the training completes in about 26.90 s on a local area network. The implementation in this work is a further optimized version of the implementation with which we won first place in Track 4 of the iDASH 2019 secure genome analysis competition. 
Conclusions: In this paper, we present a secure logistic regression training protocol and its implementation, with a new subprotocol to securely compute the activation function. To the best of our knowledge, we present the fastest existing secure multi-party computation implementation for training logistic regression models on high dimensional genome data distributed across a local area network.
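A common way to build a secure multiplication from correlated randomness supplied by a trusted initializer is a Beaver multiplication triple over additive secret shares. The sketch below illustrates that general technique only; the abstract does not specify the paper's share representation, modulus or protocol details, so the modulus Q, the function names and the single-process simulation of both parties are all assumptions.

```python
import random

Q = 2 ** 61 - 1  # assumed modulus for the share arithmetic


def share(value, rng):
    """Split `value` into two additive shares modulo Q."""
    s0 = rng.randrange(Q)
    return s0, (value - s0) % Q


def trusted_initializer(rng):
    """Pre-distribute shares of a random triple (a, b, c) with c = a * b mod Q."""
    a, b = rng.randrange(Q), rng.randrange(Q)
    return share(a, rng), share(b, rng), share(a * b % Q, rng)


def secure_multiply(x_sh, y_sh, triple):
    """Return additive shares of x * y; both parties are simulated in one process."""
    a_sh, b_sh, c_sh = triple
    # Each party masks its shares locally; only the masked values d and e are opened.
    d = (x_sh[0] - a_sh[0] + x_sh[1] - a_sh[1]) % Q  # d = x - a
    e = (y_sh[0] - b_sh[0] + y_sh[1] - b_sh[1]) % Q  # e = y - b
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % Q  # party 0 adds d * e
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % Q
    return z0, z1


# Fixed seed for reproducibility; `random` is not cryptographically secure.
rng = random.Random(0)
x_sh, y_sh = share(12345, rng), share(6789, rng)
z_sh = secure_multiply(x_sh, y_sh, trusted_initializer(rng))
product = (z_sh[0] + z_sh[1]) % Q
print(product == 12345 * 6789)
```

Because a and b are uniformly random and used once, the opened values d and e reveal nothing about x and y. The online cost of each multiplication is one exchange of masked values, which is why the number of secure multiplications dominates the runtime of such protocols.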


2020 ◽  
Vol 13 (2) ◽  
pp. 148-156
Author(s):  
Keon Vin Park ◽  
Kyoung Ho Oh ◽  
Yong Jun Jeong ◽  
Jihye Rhee ◽  
Mun Soo Han ◽  
...  

Objectives. Prognosticating idiopathic sudden sensorineural hearing loss (ISSNHL) is an important challenge. In our study, a dataset was split into training and test sets and cross-validation was implemented on the training set, thereby determining the hyperparameters for machine learning models with high test accuracy and low bias. The effectiveness of the following five machine learning models for predicting the hearing prognosis in patients with ISSNHL after 1 month of treatment was assessed: adaptive boosting, K-nearest neighbor, multilayer perceptron, random forest (RF), and support vector machine (SVM). Methods. The medical records of 523 patients with ISSNHL admitted to Korea University Ansan Hospital between January 2010 and October 2017 were retrospectively reviewed. In this study, we analyzed data from 227 patients (recovery, 106; no recovery, 121) after excluding those with missing data. To determine risk factors, statistical hypothesis tests (e.g., the two-sample t-test for continuous variables and the chi-square test for categorical variables) were conducted to compare patients who did or did not recover. Variables were selected using an RF model depending on two criteria (mean decreases in the Gini index and accuracy). Results. The SVM model using selected predictors achieved both the highest accuracy (75.36%) and the highest F-score (0.74) on the test set. The RF model with selected variables demonstrated the second-highest accuracy (73.91%) and F-score (0.74). The RF model with the original variables showed the same accuracy (73.91%) as that of the RF model with selected variables, but a lower F-score (0.73). All the tested models, except RF, demonstrated better performance after variable selection based on RF. Conclusion. The SVM model with selected predictors was the best-performing of the tested prediction models. The RF model with selected predictors was the second-best model. 
Therefore, machine learning models can be used to predict hearing recovery in patients with ISSNHL.
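The Gini criterion used for variable selection can be illustrated on a single split. In the sketch below the root counts (recovery 106, no recovery 121) come from the abstract, while the split into two child nodes is made up and the function names are hypothetical. A random forest's mean decrease in Gini averages quantities like this over every node and tree in which a variable is used, which this single-split sketch does not attempt.

```python
def gini_impurity(counts):
    """Gini impurity 1 - sum(p_k^2) for a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, children):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * gini_impurity(child) for child in children)
    return gini_impurity(parent) - weighted


# Class counts reported in the abstract: recovery 106, no recovery 121
root = [106, 121]
# Hypothetical split on some predictor (made-up counts that sum to the root)
left, right = [80, 30], [26, 91]

print(round(gini_impurity(root), 4))
print(round(gini_decrease(root, [left, right]), 4))
```

A predictor that separates the classes well yields purer child nodes and therefore a larger decrease, which is the basis for ranking variables by this criterion.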


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Seulkee Lee ◽  
Yeonghee Eun ◽  
Hyungjin Kim ◽  
Hoon-Suk Cha ◽  
Eun-Mi Koh ◽  
...  

Abstract: We aim to generate an artificial neural network (ANN) model to predict early TNF inhibitor users in patients with ankylosing spondylitis. The baseline demographic and laboratory data of patients who visited Samsung Medical Center rheumatology clinic from Dec. 2003 to Sep. 2018 were analyzed. Patients were divided into two groups: early-TNF and non-early-TNF users. Machine learning models were formulated to predict the early-TNF users using the baseline data. Feature importance analysis was performed to delineate significant baseline characteristics. The numbers of early-TNF and non-early-TNF users were 90 and 505, respectively. The ANN model achieved an area under the receiver operating characteristic curve (AUC) of 0.783, superior to the logistic regression, support vector machine, random forest, and XGBoost models (AUC = 0.719, 0.699, 0.761, and 0.713, respectively) in predicting early-TNF users. Feature importance analysis revealed CRP and ESR as the top significant baseline characteristics for predicting early-TNF users. Our model displayed superior performance in predicting early-TNF users compared with logistic regression and other machine learning models. Machine learning can be a vital tool in predicting treatment response in various rheumatologic diseases.
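The AUC used to compare the models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the labels and scores are made up for illustration and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = early-TNF user) and model scores
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, y_score)
print(auc)
```

Because the measure depends only on how positives rank against negatives, it is not inflated by class imbalance in the way raw accuracy can be, which matters when one group is much smaller than the other.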


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Wenjin Zhu ◽  
Zhiming Chao ◽  
Guotao Ma

In this paper, a database of rock permeability was compiled from the existing literature. Based on this database, a Support Vector Machine (SVM) model with hyperparameters optimised by the Mind Evolutionary Algorithm (MEA) was proposed to predict the permeability of rock. Genetic Algorithm (GA)-SVM and Particle Swarm Algorithm (PSO)-SVM models were also constructed so that the improvement in predictive accuracy from MEA could be compared with that from GA and PSO. The following conclusions were drawn. MEA remarkably increases the predictive accuracy of the constructed machine learning models within a few iterations and shows better optimisation performance than GA and PSO. MEA-SVM has the best forecasting performance, followed by PSO-SVM, while GA-SVM is the least accurate of the three. The proposed MEA-SVM model can accurately predict the permeability of rock, indicating that the model has satisfactory generalization and extrapolation capacity.
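As an illustration of how a population-based optimiser searches a hyperparameter space, the sketch below implements a basic global-best particle swarm, one of the two baseline optimisers named above. It is not the MEA of the paper and not the authors' PSO configuration: the inertia and acceleration constants are commonly used defaults, and the objective is a hypothetical quadratic standing in for the cross-validated error of an SVM as a function of two log-scaled hyperparameters.

```python
import random


def pso_minimise(objective, bounds, n_particles=20, n_iters=200, seed=0):
    """Minimise `objective` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    w, c1, c2 = 0.729, 1.49445, 1.49445  # common inertia/acceleration settings
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_error(params):
    """Hypothetical stand-in for cross-validated SVM error; minimum at (1, -2)."""
    log_c, log_sigma = params
    return (log_c - 1.0) ** 2 + (log_sigma + 2.0) ** 2


best_params, best_error = pso_minimise(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_error)
```

In a real tuning run each objective evaluation would train and validate an SVM, so the number of iterations an optimiser needs to converge translates directly into computing cost.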


Energies ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 289
Author(s):  
Maria Krechowicz ◽  
Adam Krechowicz

There is a growing demand for new gas pipeline installations in Europe, and a large number of them are installed using trenchless Horizontal Directional Drilling (HDD) technology. The aim of this work was to develop and compare new machine learning models for risk assessment in HDD projects. Data from 133 HDD projects in eight countries were gathered, profiled, and preprocessed. Three machine learning models, logistic regression, random forests, and an Artificial Neural Network (ANN), were developed to predict the overall HDD project outcome (failure-free installation or installation likely to fail) and the occurrence of identified unwanted events. The best performance in terms of recall and accuracy was achieved by the ANN model, which proved to be efficient, fast and robust in predicting risks in HDD projects. Applying machine learning in the proposed models eliminated the need to involve a group of experts in the risk assessment process and therefore significantly lowered its cost. Future research may be oriented towards developing a comprehensive risk management system that enables dynamic risk assessment, taking into account various combinations of risk mitigation actions.


2021 ◽  
Author(s):  
M.D.S. Sudaraka ◽  
I. Abeyagunawardena ◽  
E. S. De Silva ◽  
S Abeyagunawardena

Abstract Background: Electrocardiogram (ECG) is a key diagnostic test in cardiac investigation. Interpretation of ECG is based on the understanding of normal electrical patterns produced by the heart and alterations of those patterns in specific disease conditions. With machine learning techniques, it is possible to interpret ECGs with increased accuracy. However, there is a lacuna in machine learning models to detect myocardial infarction (MI) coupled with the affected territories of the heart. Methods: The dataset was obtained from the University of California, Irvine, Machine Learning Repository. It was filtered to obtain observations categorized as Normal, Ischemic changes, Old Anterior MI and Old Inferior MI. The dataset was randomly split into a training set (70%) and a test set (30%). 73 out of the 270 ECG features were selected based on the changes observed following MI, after excluding predictors that had near zero variance across the observations. Three machine learning classification models (Bootstrap Aggregation Decision Trees, Random Forest, Multi-layer Perceptron) were trained using the training dataset, optimizing for the Kappa statistic, and the parameter tuning was achieved with repeated 10-fold cross validation. Accuracy and Kappa of the samples were used to evaluate performance between the models. Results: The Random Forest model identified old anterior and old inferior MIs with 100% sensitivity and specificity and all 4 categorized observations with an overall accuracy of 0.9167 (95% CI 0.8424 - 0.9633). 
Both the Bootstrap Aggregation Decision Trees and the Multi-layer Perceptron models identified old anterior MIs with 100% sensitivity and specificity and their overall accuracies for all 4 observations were 0.8958 (95% CI 0.8168 - 0.9489) and 0.8542 (95% CI 0.7674 - 0.9179) respectively. Conclusion: With a medically informed feature selection we were able to identify old anterior MI with 100% sensitivity and specificity by all three models in this study, and old inferior MI with 100% sensitivity and specificity by the Random Forest model. If the data set can be improved, it is possible to utilize these machine learning models in a hospital setting to identify cardiac emergencies by incorporating them into cardiac monitors, until trained personnel become available.
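The Kappa statistic used for tuning and evaluation corrects observed agreement for the agreement expected by chance. The sketch below computes Cohen's kappa from a confusion matrix; the two-class matrix is made up for illustration (the study has four classes, which the same function would handle), and the function name is hypothetical.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows = actual, columns = predicted)."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Made-up two-class confusion matrix, not taken from the study
matrix = [[20, 5],
          [10, 15]]

kappa = cohen_kappa(matrix)
print(round(kappa, 3))
```

Here the raw accuracy is 0.7 but half of that agreement would be expected by chance, so kappa is noticeably lower. This is why kappa is often preferred over accuracy when class frequencies are unequal.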

