Natural Language Processing and Text Mining Approaches in Production Shortfalls Analytics: Methodology, Case-Study and Value in the North Sea

2021
Author(s): Edgar Bernier, Sebastien Perrier

Abstract: Maximizing operational efficiency is a critical challenge in oil and gas production, particularly for mature assets in the North Sea. The causes of production shortfalls are numerous and distributed across a wide range of disciplines, spanning technical and non-technical causes. The primary reason to apply Natural Language Processing (NLP) and text mining to several years of shortfall history was the need to efficiently support digital transformation use-case screenings and value mapping exercises through a proper mapping of the issues faced. This mapping naturally also informed operational surveillance and maintenance strategies aimed at reducing production shortfalls. This paper presents a methodology in which the historical records of descriptions, comments and investigation results regarding production shortfalls are revisited, adding to existing shortfall classifications and statistics in two domains in particular: richer first root-cause mapping, and a series of advanced visualizations and analytics. The methodology combines natural-language pre-processing techniques with keyword-based text mining and classification. The limitations associated with the size and quality of these language datasets are described and the results discussed, highlighting the value of reaching a high level of data granularity while defeating the 'more information, less attention' bias. At the same time, visual designs are introduced to display the different dimensions of this data efficiently (impact, frequency evolution through time, location in terms of field and affected systems, root causes and other cause-related categories). The ambition in the visualization domain is to create user-experience-friendly shortfall analytics that can be displayed in smart rooms and collaborative rooms, where display efficiency is higher when user interactions are kept minimal, the number of charts is limited and multiple dimensions do not collide. The paper is based on several applications across the North Sea. This case study and the associated lessons learned regarding NLP and text mining applied to similarly concise technical data answer several frequently asked questions about the value of textual data records gathered over years.
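
As a concrete illustration of the keyword-based text-mining and classification step, the sketch below normalizes shortfall free text and matches it against a root-cause lexicon. This is a minimal sketch: the categories, keywords and sample records are illustrative assumptions, not the authors' actual taxonomy or data.

```python
# Minimal sketch of keyword-based root-cause classification for shortfall
# records. The category lexicon and sample records are illustrative only;
# they are not the paper's actual taxonomy or data.
import re
from collections import Counter

# Hypothetical root-cause lexicon: category -> trigger keywords.
LEXICON = {
    "compressor": {"compressor", "anti-surge", "recycle valve"},
    "well_integrity": {"annulus", "casing", "wellhead leak"},
    "power": {"generator", "turbine trip", "power loss"},
}

def normalise(text: str) -> str:
    """Lowercase, strip punctuation noise and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s-]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def classify(record: str) -> list[str]:
    """Return every category whose keywords appear in the record."""
    clean = normalise(record)
    return [cat for cat, kws in LEXICON.items()
            if any(kw in clean for kw in kws)] or ["unclassified"]

records = [
    "Turbine trip on platform A, power loss for 6 hrs",
    "Recycle valve stuck open, compressor offline",
]
counts = Counter(cat for r in records for cat in classify(r))
print(counts)  # frequency per root-cause category
```

In practice the lexicon would be far larger and curated with domain experts, and records matching several categories would be routed to review rather than counted twice.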

2019 · Vol 58 (2) · pp. 315-337
Author(s): Thomas Cogswell

Abstract: Historians have not paid close attention to the activities of freebooters operating out of Dunkirk in the late 1620s. This essay corrects that omission by first studying the threat from Dunkirk to England's east coast and then addressing how the central government, counties, and coastal towns responded. A surprisingly rich vein of manuscript material from Great Yarmouth, and particularly from the Suffolk fishing community of Aldeburgh, informs this case study of the impact of this conflict around the North Sea.


2021 · pp. 1-13
Author(s): Lamiae Benhayoun, Daniel Lang

BACKGROUND: The renewed advent of Artificial Intelligence (AI) is inducing profound changes in the classic categories of technology professions and creating the need for new, specific skills.
OBJECTIVE: Identify the skills gaps between academic AI training in French engineering and business schools and the requirements of the labour market.
METHOD: Extraction of AI training content from the schools' websites and scraping of a job advertisements website, followed by analysis based on a text mining approach using Python code for Natural Language Processing.
RESULTS: A categorization of AI-related occupations and a characterization of three classes of skills for the AI market: technical, soft and interdisciplinary. The skills gaps concern certain professional certifications, the mastery of specific tools, research abilities, and awareness of the ethical and regulatory dimensions of AI.
CONCLUSIONS: A deep analysis using Natural Language Processing algorithms whose results provide a better understanding of the components of AI capability at the individual and organizational levels, and a study that can help shape educational programs to respond to AI market requirements.
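
To illustrate the kind of gap analysis the method describes, the following minimal sketch compares skill mentions extracted from training content against those found in job advertisements. The skill vocabulary and the two toy corpora are assumptions made for illustration, not the study's data or code.

```python
# Minimal sketch of a curriculum-vs-job-market skills gap analysis.
# The skill vocabulary and both toy corpora are illustrative assumptions.
import re

SKILLS = {"python", "deep learning", "nlp", "ethics", "docker", "research"}

def extract_skills(corpus: list[str]) -> set[str]:
    """Return the vocabulary skills mentioned anywhere in the corpus."""
    text = " ".join(corpus).lower()
    return {s for s in SKILLS if re.search(r"\b" + re.escape(s) + r"\b", text)}

curricula = ["Courses cover Python, deep learning and NLP fundamentals."]
job_ads = ["Seeking NLP engineer: Python, Docker, awareness of AI ethics."]

taught = extract_skills(curricula)
required = extract_skills(job_ads)
print("Gap (required but not taught):", required - taught)
# -> {'docker', 'ethics'}
```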


Author(s): Jacqueline Peng, Mengge Zhao, James Havrilla, Cong Liu, Chunhua Weng, ...

Abstract
Background: Natural language processing (NLP) tools can facilitate the extraction of biomedical concepts from unstructured free texts, such as research articles or clinical notes. The NLP software tools CLAMP, cTAKES, and MetaMap are among the most widely used for extracting biomedical concept entities. However, their performance in extracting disease-specific terminology from the literature has not been compared extensively, especially for complex neuropsychiatric disorders with a diverse set of phenotypic and clinical manifestations.
Methods: We comparatively evaluated these NLP tools using autism spectrum disorder (ASD) as a case study. We collected 827 ASD-related terms from previous literature as the benchmark list for performance evaluation. We then applied CLAMP, cTAKES, and MetaMap to 544 full-text articles and 20,408 abstracts from PubMed to extract ASD-related terms, and evaluated predictive performance using precision, recall, and F1 score.
Results: CLAMP had the best performance in terms of F1 score, followed by cTAKES and then MetaMap. CLAMP achieved much higher precision than cTAKES and MetaMap, while cTAKES and MetaMap had higher recall than CLAMP.
Conclusion: The analysis protocols used in this study can be applied to other neuropsychiatric or neurodevelopmental disorders that lack well-defined terminology sets to describe their phenotypic presentations.
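
The scoring protocol is standard; as an illustration, the sketch below evaluates a set of extracted terms against a benchmark list using precision, recall, and F1 score. The term sets are toy examples, not the study's 827-term ASD benchmark.

```python
# Minimal sketch of the evaluation protocol: score terms extracted by an
# NLP tool against a benchmark term list. The term sets below are toy
# examples, not the study's ASD benchmark.

benchmark = {"autism spectrum disorder", "echolalia", "stereotypy", "hypotonia"}
extracted = {"autism spectrum disorder", "echolalia", "seizure"}

tp = len(extracted & benchmark)  # true positives: correct extractions
precision = tp / len(extracted) if extracted else 0.0
recall = tp / len(benchmark) if benchmark else 0.0
f1 = (2 * precision * recall / (precision + recall)
      if precision + recall else 0.0)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.67 recall=0.50 f1=0.57
```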


2021
Author(s): Matthew Kelsey, Magnus Raaholt, Olav Einervoll, Rustem Nafikov, Stian Amble

Abstract: Multilateral technology has, for nearly three decades, extended the production life of fields in the North Sea by delivering a higher recovery factor supported by the cumulative production of multiple laterals. Operators also continue to look for methods to reduce the environmental impact of drilling and intervention. Taking advantage of the latest multilateral technology can turn otherwise unviable reservoirs into economically sound targets by achieving a longer field life while minimizing construction costs, risk, and environmental impact. This paper focuses on mature fields in the region where multilateral applications have extended wells that were reaching the end of their life into further economic production. It discusses the challenges of providing a multilateral solution for drilling new lateral legs in existing wells where no slots are available for new wells, as well as completion designs that tie new laterals into existing production casing. The case study covers workover operations, isolation methods, and lateral creation systems, and centers on a retrofit multilateral well constructed in the North Sea that extended production life in a mature field using innovative multilateral re-entry methods. The paper also provides insight into methodology for continually improving the reliability of multilateral installations to maximize efficiencies.


Author(s): Sourajit Roy, Pankaj Pathak, S. Nithya

The advent of the 21st century brought a series of technical breakthroughs and developments. Natural Language Processing (NLP) is one of the most promising of these disciplines, advancing rapidly through groundbreaking findings across computing. With the digital revolution, the amount of data generated by machine-to-machine (M2M) communication across devices and platforms such as Amazon Alexa, Apple Siri, and Microsoft Cortana has increased significantly. This produces a great deal of unstructured data that does not fit standard computational models. In addition, the growing problems of language complexity, data variability and voice ambiguity make models increasingly hard to implement. The current study provides an overview of the potential and breadth of the NLP market and its adoption across industries, in particular after Covid-19. It also gives a macroscopic picture of progress in natural language processing research, development and implementation.


Author(s): Clifford Nangle, Stuart McTaggart, Margaret MacLeod, Jackie Caldwell, Marion Bennie

ABSTRACT
Objectives: The Prescribing Information System (PIS) datamart, hosted by NHS National Services Scotland, receives around 90 million electronic prescription messages per year from GP practices across Scotland. Prescription messages contain information including drug name, quantity and strength stored as coded, machine-readable data, while prescription dose instructions are unstructured free text that is difficult to interpret and analyse in volume. The aim, using Natural Language Processing (NLP), was to extract drug dose amount, unit and frequency metadata from freely typed text in dose instructions to support calculating the intended number of days' treatment. This then allows comparison with actual prescription frequency, treatment adherence and the impact upon prescribing safety and effectiveness.
Approach: An NLP algorithm was developed using the Ciao implementation of Prolog to extract dose amount, unit and frequency metadata from dose instructions held in the PIS datamart for drugs used in the treatment of gastrointestinal, cardiovascular and respiratory disease. Accuracy estimates were obtained by randomly sampling 0.1% of the distinct dose instructions from source records and comparing these with metadata extracted by the algorithm; an iterative approach was used to modify the algorithm to increase accuracy and coverage.
Results: The NLP algorithm was applied to 39,943,465 prescription instructions issued in 2014, consisting of 575,340 distinct dose instructions. For drugs used in the gastrointestinal, cardiovascular and respiratory systems (i.e. chapters 1, 2 and 3 of the British National Formulary (BNF)), the NLP algorithm successfully extracted drug dose amount, unit and frequency metadata from 95.1%, 98.5% and 97.4% of prescriptions respectively. However, instructions containing terms such as 'as directed' or 'as required' reduce the usability of the metadata by making it difficult to calculate the total dose intended for a specific time period: 7.9%, 0.9% and 27.9% of dose instructions contained terms meaning 'as required', while 3.2%, 3.7% and 4.0% contained terms meaning 'as directed', for drugs used in BNF chapters 1, 2 and 3 respectively.
Conclusion: The NLP algorithm developed can extract dose, unit and frequency metadata from text found in prescriptions issued to treat a wide range of conditions, and this information may be used to support calculating treatment durations, medicines adherence and cumulative drug exposure. The presence of terms such as 'as required' and 'as directed' has a negative impact on the usability of the metadata, and further work is required to determine the level of impact this has on calculating treatment durations and cumulative drug exposure.
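
The study's algorithm was implemented in the Ciao implementation of Prolog; purely to illustrate the extraction idea, the sketch below shows a much simplified equivalent in Python. The regular expression and frequency vocabulary are assumptions, far narrower than the production algorithm, and 'as directed'/'as required' instructions fall out as unparseable, mirroring the limitation discussed above.

```python
# Illustrative sketch (Python, not the study's Ciao Prolog) of pulling
# dose amount, unit and frequency out of free-text dose instructions.
# The pattern and frequency map are simplified assumptions.
import re

FREQ = {"once daily": 1, "twice daily": 2, "three times daily": 3}

def parse_dose(instruction: str) -> dict | None:
    """Extract amount, unit and daily frequency, or None if unmatched."""
    text = instruction.lower()
    m = re.search(r"(?:take\s+)?(\d+(?:\.\d+)?)\s*(tablet|capsule|ml)s?", text)
    freq = next((n for phrase, n in FREQ.items() if phrase in text), None)
    if not m or freq is None:
        return None  # e.g. 'as directed' / 'as required' instructions
    return {"amount": float(m.group(1)), "unit": m.group(2), "per_day": freq}

print(parse_dose("Take 2 tablets twice daily"))
# {'amount': 2.0, 'unit': 'tablet', 'per_day': 2}
print(parse_dose("Use as directed"))  # -> None
```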


10.2196/20773 · 2020 · Vol 22 (8) · pp. e20773
Author(s): Antoine Neuraz, Ivan Lerner, William Digan, Nicolas Paris, Rosy Tsopra, ...

Background: A novel disease poses special challenges for informatics solutions. Biomedical informatics relies for the most part on structured data, which require a preexisting data or knowledge model; however, novel diseases have no preexisting knowledge model. In an emergent epidemic, language processing can enable rapid conversion of unstructured text into a novel knowledge model. Although this idea has often been suggested, no opportunity had arisen to actually test it in real time. The current coronavirus disease (COVID-19) pandemic presents such an opportunity.
Objective: The aim of this study was to evaluate the added value of information from clinical text in response to emergent diseases using natural language processing (NLP).
Methods: We explored the effects of long-term treatment with calcium channel blockers on the outcomes of COVID-19 infection in patients with high blood pressure during in-patient hospital stays, using two sources of information: data available strictly from structured electronic health records (EHRs), and data available through structured EHRs plus text mining.
Results: In this multicenter study involving 39 hospitals, text mining increased the statistical power sufficiently to turn a non-significant adjusted hazard ratio into a significant one. Compared to the baseline structured data, the number of patients available for inclusion increased by 2.95 times, the amount of available information on medications increased by 7.2 times, and the amount of additional phenotypic information increased by 11.9 times.
Conclusions: In our study, use of calcium channel blockers was associated with decreased in-hospital mortality in patients with COVID-19 infection. This finding was obtained by quickly adapting an NLP pipeline to the domain of the novel disease; the adapted pipeline still performed well enough to extract useful information. When that information was used to supplement existing structured data, the sample size could be increased sufficiently to detect treatment effects that were not previously statistically detectable.
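
To illustrate the augmentation idea, the minimal sketch below unions medication exposure recorded in structured EHR tables with exposure mined from free-text notes, so that a patient documented only in text still qualifies for the cohort. The patients, notes, and drug list are toy assumptions, not the study's pipeline.

```python
# Minimal sketch: union medication exposure from structured EHR tables
# with exposure mined from free-text notes, enlarging the eligible cohort.
# Patient data and the drug list are toy assumptions.
import re

CCB_NAMES = {"amlodipine", "nifedipine", "diltiazem"}  # calcium channel blockers

structured_meds = {"pt1": {"amlodipine"}, "pt2": set()}
notes = {
    "pt2": "Long-term nifedipine for hypertension, continued on admission.",
    "pt3": "No antihypertensive therapy documented.",
}

def mined_meds(note: str) -> set[str]:
    """Return CCB names mentioned in a clinical note."""
    return {d for d in CCB_NAMES if re.search(r"\b" + d + r"\b", note.lower())}

exposed = {p for p, meds in structured_meds.items() if meds & CCB_NAMES}
exposed |= {p for p, note in notes.items() if mined_meds(note)}
print(sorted(exposed))  # ['pt1', 'pt2'] -- pt2 recovered from text only
```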

