POP-ON: Prediction of Process Using One-Way Language Model Based on NLP Approach

Junhyung Moon; Gyuyoung Park; Jongpil Jeong

doi:10.3390/app11020864

POP-ON: Prediction of Process Using One-Way Language Model Based on NLP Approach

Applied Sciences ◽

10.3390/app11020864 ◽

2021 ◽

Vol 11 (2) ◽

pp. 864

Author(s):

Junhyung Moon ◽

Gyuyoung Park ◽

Jongpil Jeong

Keyword(s):

Natural Language ◽

Language Processing ◽

Business Process ◽

Process Management ◽

Language Model ◽

Superior Performance ◽

Event Prediction ◽

Process Prediction ◽

Monitoring Service ◽

Deep Learning Model

In business process management, the monitoring service is an important element that can prevent various problems before they occur in companies and industries. Execution logs are created in process-aware information systems, and these logs help predict the process. The ultimate goal of the proposed method is to predict how a running process instance will continue and to predict events based on previously completed event log data, so that companies can respond flexibly to unwanted deviations in their workflow. To solve the next event prediction problem, we use a fully attention-based transformer, which has performed well in recent natural language processing approaches. We treat the name attribute of each event as natural language and apply several elements necessary for predicting the next event. The proposed deep learning model is trained on data prepared through specific pre-processing steps. Experiments using various business process log datasets demonstrate the superior performance of the proposed method. We name the proposed process prediction model “POP-ON”.
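The abstract does not spell out the pre-processing steps, so the following is only a minimal sketch of one plausible setup: event names are mapped to token ids as if they were words, each trace prefix is paired with the event that follows it, and a left-to-right attention mask keeps a position from seeing later events. The function names, the `<pad>` token, and the example traces are all hypothetical, and reading "one-way" as left-to-right masking is an assumption.

```python
from typing import Dict, List, Tuple


def build_vocab(traces: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to every distinct event name (id 0 is padding)."""
    vocab = {"<pad>": 0}
    for trace in traces:
        for event in trace:
            if event not in vocab:
                vocab[event] = len(vocab)
    return vocab


def prefix_pairs(traces: List[List[str]], vocab: Dict[str, int]) -> List[Tuple[List[int], int]]:
    """Turn each trace into (prefix ids, next event id) training pairs."""
    pairs = []
    for trace in traces:
        ids = [vocab[event] for event in trace]
        for k in range(1, len(ids)):
            pairs.append((ids[:k], ids[k]))
    return pairs


def causal_mask(n: int) -> List[List[bool]]:
    """True where position i may attend to position j (only j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


# Hypothetical event log with two completed process instances.
traces = [
    ["Create Order", "Check Credit", "Ship Goods"],
    ["Create Order", "Reject Order"],
]
vocab = build_vocab(traces)
pairs = prefix_pairs(traces, vocab)
```

The resulting pairs could then be fed to any sequence model; the transformer itself is omitted here.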

On the Generation of E-Learning Resources Using Business Process, Natural Language Processing, and Web Services

IT Professional ◽

10.1109/mitp.2021.3054640 ◽

2021 ◽

Vol 23 (2) ◽

pp. 40-44

Author(s):

Olivia Fragoso-Diaz ◽

Vitervo Lopez Caballero ◽

Juan Carlos Rojas-Perez ◽

Rene Santaolaya-Salgado ◽

Juan Gabriel Gonzalez-Serna

Keyword(s):

Natural Language Processing ◽

Web Services ◽

Natural Language ◽

Language Processing ◽

Business Process ◽

Learning Resources ◽

E Learning

EMOSIS Sentiment Analysis on Tweets with Emotion and Intensity Level Recognition Considering Ending Punctuation Marks

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4518.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 10289-10293

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Emotion Recognition ◽

Sentiment Analysis ◽

Language Processing ◽

Significant Role ◽

Language Model ◽

Intensity Level ◽

Processing Stage ◽

Overall Performance

Sentiment Analysis is a tool used for determining the Polarity or Emotion of a Sentence. It is a field of Natural Language Processing which focuses on the study of opinions. In this study, the researchers addressed one key challenge in Sentiment Analysis, which is to consider the Ending Punctuation Marks present in a sentence. Ending punctuation marks play a significant role in Emotion Recognition and Intensity Level Recognition. The research made use of tweets expressing opinions about Philippine President Rodrigo Duterte. These downloaded tweets served as the inputs. They were first subjected to a pre-processing stage to prepare the sentences for processing. A Language Model was created to serve as the classifier for determining the scores of the tweets. The scores give the polarity of the sentence. Accuracy is very important in sentiment analysis. To increase the chance of correctly identifying the polarity of the tweets, the input underwent Intensity Level Recognition, which determines the intensifiers and negations within the sentences. In evaluation, the system achieved an overall performance of 80.27%.
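As a rough illustration of how intensifiers, negations, and ending punctuation can feed into a polarity score, here is a minimal sketch. It is not the EMOSIS classifier: the tiny lexicons, the weights, and the rule that each exclamation mark adds 25% are hypothetical values chosen only to make the idea concrete.

```python
import re

# Hypothetical toy lexicons; a real system would use far larger resources.
POLARITY = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "so": 1.3}
NEGATIONS = {"not", "never", "no"}


def ending_boost(text: str) -> float:
    """Scale factor from the run of punctuation marks that ends the text."""
    match = re.search(r"[!?.]+$", text.strip())
    if not match:
        return 1.0
    # Assumed rule: each trailing exclamation mark raises intensity by 25%.
    return 1.0 + 0.25 * match.group(0).count("!")


def score(text: str) -> float:
    """Sum word polarities, applying pending intensifiers and negations."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total, scale, negate = 0.0, 1.0, False
    for tok in tokens:
        if tok in INTENSIFIERS:
            scale *= INTENSIFIERS[tok]
        elif tok in NEGATIONS:
            negate = True
        elif tok in POLARITY:
            value = POLARITY[tok] * scale
            total += -value if negate else value
            # Modifiers are consumed by the polarity word they precede.
            scale, negate = 1.0, False
    return total * ending_boost(text)
```

Under these assumed weights, `score("very good!!")` is larger than `score("very good")`, and `score("not good")` is negative.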

Putting ontologies to use

The Knowledge Engineering Review ◽

10.1017/s0269888998001027 ◽

1998 ◽

Vol 13 (1) ◽

pp. 1-3 ◽

Cited By ~ 19

Author(s):

MIKE USCHOLD ◽

AUSTIN TATE

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Knowledge Acquisition ◽

Language Processing ◽

Common Core ◽

Process Management ◽

Planning Process ◽

Schema Integration ◽

Nature Development ◽

Theoretical Issues

Interest in the nature, development and use of ontologies is becoming increasingly widespread. Since the early nineties, numerous workshops have been held. Representatives from historically separate disciplines concerned with philosophical issues, knowledge acquisition and representation, planning, process management, database schema integration, natural language processing and enterprise modelling, came together to identify a common core of issues of interest. There was highly varied and inconsistent usage of a wide variety of terms, most notably, “ontology”, rendering cross-discipline communication difficult. However, progress was made toward understanding the commonality among the disciplines. Subsequent workshops addressed various aspects of the field, including theoretical issues, methodologies for building ontologies, as well as specific applications in government and industry.

Self-normalizing learning on biomedical ontologies using a deep Siamese neural network

10.1101/2020.04.23.057117 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fatima Zohra Smaili ◽

Xin Gao ◽

Robert Hoehndorf

Keyword(s):

Neural Network ◽

Natural Language ◽

Language Processing ◽

Research Group ◽

Prediction Method ◽

Entity Recognition ◽

Superior Performance ◽

Biomedical Ontologies ◽

Sources Of Information ◽

Structured Information

Abstract Motivation Ontologies are widely used in biomedicine for the annotation and standardization of data. One of the main roles of ontologies is to provide structured background knowledge within a domain as well as a set of labels, synonyms, and definitions for the classes within a domain. The two types of information provided by ontologies have been extensively exploited in natural language processing and machine learning applications. However, they are commonly used separately, and thus it is unknown if joining the two sources of information can further benefit data analysis tasks. Results We developed a novel method that applies named entity recognition and normalization methods on texts to connect the structured information in biomedical ontologies with the information contained in natural language. We apply this normalization both to literature and to the natural language information contained within ontologies themselves. The normalized ontologies and text are then used to generate embeddings, and relations between entities are predicted using a deep Siamese neural network model that takes these embeddings as input. We demonstrate that our novel embedding and prediction method using self-normalized biomedical ontologies significantly outperforms the state-of-the-art methods in embedding ontologies on two benchmark tasks: prediction of interactions between proteins and prediction of gene–disease associations. Our method also allows us to apply ontology-based annotations and axioms to the prediction of toxicological effects of chemicals where our method shows superior performance. Our method is generic and can be applied in scenarios where ontologies consisting of both structured information and natural language labels or synonyms are used. Availability https://github.com/bio-ontology-research-group/[email protected] and [email protected]

DeNERT-KG: Named Entity and Relation Extraction Model Using DQN, Knowledge Graph, and BERT

Applied Sciences ◽

10.3390/app10186429 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6429

Author(s):

SungMin Yang ◽

SoYeop Yoo ◽

OkRan Jeong

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Language Model ◽

Named Entity Recognition ◽

Relation Extraction ◽

Entity Recognition ◽

Knowledge Graph ◽

Named Entity ◽

Artificial Intelligence Technology

Along with studies on artificial intelligence technology, research is also being carried out actively in the field of natural language processing to understand and process people’s language, in other words, natural language. For computers to learn on their own, the skill of understanding natural language is very important. There are a wide variety of tasks involved in the field of natural language processing, but we would like to focus on the named entity recognition and relation extraction task, which is considered to be the most important in understanding sentences. We propose DeNERT-KG, a model that can extract subject, object, and relationships, to grasp the meaning inherent in a sentence. Based on the BERT language model and Deep Q-Network, the named entity recognition (NER) model for extracting subject and object is established, and a knowledge graph is applied for relation extraction. Using the DeNERT-KG model, it is possible to extract the subject, type of subject, object, type of object, and relationship from a sentence, and we verify this model through experiments.

HIDING CRITICAL INFORMATION WHEN TRAINING LANGUAGE MODELS

EurasianUnionScientists ◽

10.31618/esu.2413-9335.2021.1.86.1349 ◽

2021 ◽

pp. 15-18

Author(s):

A. Evtushenko

Keyword(s):

Natural Language ◽

Language Processing ◽

Text Processing ◽

Language Model ◽

Personal Data ◽

Language Models ◽

Training Dataset ◽

Critical Information ◽

Research Company ◽

Learning Language

Machine learning language models are combinations of algorithms and neural networks designed for processing text composed in natural language (Natural Language Processing, NLP). In 2020, the artificial intelligence research company OpenAI released GPT-3, its largest language model, with up to 175 billion parameters. Increasing the number of model parameters by more than 100 times made it possible to improve the quality of generated texts to a level that is hard to distinguish from human-written texts. It is noteworthy that this model was trained on a training dataset mainly collected from open sources on the Internet, the volume of which is estimated at 570 GB. This article discusses the problem of memorizing critical information, in particular the personal data of individuals, at the stage of training large language models (GPT-2/3 and derivatives). It also describes an algorithmic approach to solving this problem, which consists of additional preprocessing of the training dataset and refinement of the model inference, in the context of generating pseudo-personal data and embedding it into the results of summarization, text generation, question answering, and other seq2seq tasks.
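The preprocessing idea can be pictured with a minimal sketch that masks personal data before text enters a training set. This is not the article's algorithm: the two regular expressions, the `<EMAIL>` and `<PHONE>` placeholder tokens, and the function name are hypothetical, and the article speaks of generating pseudo-personal data rather than fixed placeholders.

```python
import re

# Simplified, assumed patterns; real personal-data detection needs much more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_personal_data(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = PHONE_RE.sub("<PHONE>", text)
    return text


sample = "Write to jane.doe@example.com or call +1 (555) 010-9999."
masked = mask_personal_data(sample)
```

A pseudo-data variant would substitute a randomly generated but realistic value for each placeholder instead of a fixed token.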

Highly accurate classification of chest radiographic reports using a deep learning natural language model pre-trained on 3.8 million text reports

Bioinformatics ◽

10.1093/bioinformatics/btaa668 ◽

2020 ◽

Author(s):

Keno K Bressem ◽

Lisa C Adams ◽

Robert A Gaudin ◽

Daniel Tröltzsch ◽

Bernd Hamm ◽

...

Keyword(s):

Natural Language ◽

Language Processing ◽

Language Model ◽

Fine Tuning ◽

Supplementary Information ◽

Free Text ◽

Clinical Workflow ◽

Text Data ◽

Unlabelled Data ◽

Medical Reports

Abstract Motivation The development of deep, bidirectional transformers such as Bidirectional Encoder Representations from Transformers (BERT) led to improved results on several Natural Language Processing (NLP) benchmarks. Especially in radiology, large amounts of free-text data are generated in daily clinical workflow. These report texts could be of particular use for the generation of labels in machine learning, especially for image classification. However, as report texts are mostly unstructured, advanced NLP methods are needed to enable accurate text classification. While neural networks can be used for this purpose, they must first be trained on large amounts of manually labelled data to achieve good results. In contrast, BERT models can be pre-trained on unlabelled data and then only require fine tuning on a small amount of manually labelled data to achieve even better results. Results Using BERT to identify the most important findings in intensive care chest radiograph reports, we achieve areas under the receiver operating characteristic curve of 0.98 for congestion, 0.97 for effusion, 0.97 for consolidation and 0.99 for pneumothorax, surpassing the accuracy of previous approaches with comparatively little annotation effort. Our approach could therefore help to improve information extraction from free-text medical reports. Availability and implementation We make the source code for fine-tuning the BERT-models freely available at https://github.com/fast-raidiology/bert-for-radiology. Supplementary information Supplementary data are available at Bioinformatics online.
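For readers unfamiliar with the metric reported above, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy labels and scores are illustrative and are not taken from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: one of the four positive/negative pairs is ranked wrongly.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

This pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation is the usual choice for large datasets.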

An artificial intelligence approach to COVID-19 infection risk assessment in virtual visits: A case report

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa105 ◽

2020 ◽

Vol 27 (8) ◽

pp. 1321-1325 ◽

Cited By ~ 7

Author(s):

Jihad S Obeid ◽

Matthew Davis ◽

Matthew Turner ◽

Stephane M Meystre ◽

Paul M Heider ◽

...

Keyword(s):

Artificial Intelligence ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Characteristic Curve ◽

Computer Algorithms ◽

Outbreak Response ◽

Virtual Visits ◽

Informatics Tools ◽

Deep Learning Model

Abstract Objective In an effort to improve the efficiency of computer algorithms applied to screening for coronavirus disease 2019 (COVID-19) testing, we used natural language processing and artificial intelligence–based methods with unstructured patient data collected through telehealth visits. Materials and Methods After segmenting and parsing documents, we conducted analysis of overrepresented words in patient symptoms. We then developed a word embedding–based convolutional neural network for predicting COVID-19 test results based on patients’ self-reported symptoms. Results Text analytics revealed that concepts such as smell and taste were more prevalent than expected in patients testing positive. As a result, screening algorithms were adapted to include these symptoms. The deep learning model yielded an area under the receiver-operating characteristic curve of 0.729 for predicting positive results and was subsequently applied to prioritize testing appointment scheduling. Conclusions Informatics tools such as natural language processing and artificial intelligence methods can have significant clinical impacts when applied to data streams early in the development of clinical systems for outbreak response.

Extracting Business Process Models Using Natural Language Processing (NLP) Techniques

2017 IEEE 19th Conference on Business Informatics (CBI) ◽

10.1109/cbi.2017.41 ◽

2017 ◽

Cited By ~ 7

Author(s):

Konstantinos Sintoris ◽

Kostas Vergidis

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Business Process ◽

Process Models ◽

Business Process Models

Inducing Relational Knowledge from BERT

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6242 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7456-7463 ◽

Cited By ~ 3

Author(s):

Zied Bouraoui ◽

Jose Camacho-Collados ◽

Steven Schockaert

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Language Model ◽

Language Models ◽

Word Embeddings ◽

Relational Knowledge ◽

Wide Range ◽

Fine Tune ◽

Standard Word

One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.
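The template step can be pictured with a minimal sketch: a sentence that mentions a seed pair is turned into a template with two slots, and the template is later instantiated with a candidate pair to form the classifier input. The slot markers `[HEAD]` and `[TAIL]`, the function names, and the use of plain substring replacement are assumptions made for illustration; the sentence filtering and the fine-tuned language model are not shown.

```python
def make_template(sentence: str, head: str, tail: str) -> str:
    """Replace the seed pair in a sentence with slot markers."""
    if head not in sentence or tail not in sentence:
        raise ValueError("sentence must mention both words of the seed pair")
    return sentence.replace(head, "[HEAD]").replace(tail, "[TAIL]")


def instantiate(template: str, head: str, tail: str) -> str:
    """Fill the slots of a template with a candidate word pair."""
    return template.replace("[HEAD]", head).replace("[TAIL]", tail)


# Hypothetical seed instance of a capital-of relation.
template = make_template("Paris is the capital of France.", "Paris", "France")
candidate = instantiate(template, "Rome", "Italy")
```

The instantiated sentence would then be passed to the fine-tuned model, which predicts whether the pair is likely to be an instance of the relation.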
