A Semi-automatic Ontology Learning Based on WordNet and Event-based Natural Language Processing

EventEpi–A Natural Language Processing Framework for Event-Based Surveillance

10.1101/19006395 ◽

2019 ◽

Author(s):

Auss Abbood ◽

Alexander Ullrich ◽

Rüdiger Busche ◽

Stéphane Ghozzi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Web Application ◽

Fine Tuning ◽

Entity Recognition ◽

World Health ◽

Support Vector ◽

Event Based ◽

Processing Framework

According to the World Health Organization (WHO), around 60% of all outbreaks are detected using informal sources. In many public health institutes, including the WHO and the Robert Koch Institute (RKI), dedicated groups of epidemiologists sift through numerous articles and newsletters to detect relevant events. This media screening is one important part of event-based surveillance (EBS). Reading the articles, discussing their relevance, and putting key information into a database is a time-consuming process. To support EBS, but also to gain insights into what makes an article and the event it describes relevant, we developed a natural-language-processing framework for automated information extraction and relevance scoring. First, we scraped relevant sources for EBS as done at RKI (WHO Disease Outbreak News and ProMED) and automatically extracted the articles’ key data: disease, country, date, and confirmed-case count. For this, we performed named entity recognition in two steps: EpiTator, an open-source epidemiological annotation tool, suggested many different possibilities for each. We trained a naive Bayes classifier to find the single most likely one using RKI’s EBS database as labels. Then, for relevance scoring, we defined two classes to which any article might belong: The article is relevant if it is in the EBS database and irrelevant otherwise. We compared the performance of different classifiers, using document and word embeddings. Two of the tested algorithms stood out: The multilayer perceptron performed best overall, with a precision of 0.19, recall of 0.50, specificity of 0.89, F1 of 0.28, and the highest tested index balanced accuracy of 0.46. The support-vector machine, on the other hand, had the highest recall (0.88) which can be of higher interest for epidemiologists. Finally, we integrated these functionalities into a web application called EventEpi where relevant sources are automatically analyzed and put into a database. The user can also provide any URL or text, that will be analyzed in the same way and added to the database. Each of these steps could be improved, in particular with larger labeled datasets and fine-tuning of the learning algorithms. The overall framework, however, works already well and can be used in production, promising improvements in EBS. The source code is publicly available at https://github.com/aauss/EventEpi.
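
The metrics quoted in this abstract can all be derived from a binary confusion matrix. The sketch below is illustrative only: the counts are hypothetical, and the index balanced accuracy is assumed to take the commonly used form (1 + alpha * (TPR - TNR)) * TPR * TNR with alpha = 0.1; the abstract does not state which variant or alpha the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # relevant articles flagged as relevant
    fp: int  # irrelevant articles flagged as relevant
    tn: int  # irrelevant articles flagged as irrelevant
    fn: int  # relevant articles flagged as irrelevant


def precision(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Also called sensitivity or true positive rate (TPR).
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # True negative rate (TNR).
    return c.tn / (c.tn + c.fp)


def f1_score(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)


def index_balanced_accuracy(tpr: float, tnr: float, alpha: float = 0.1) -> float:
    # Assumed definition: squared geometric mean (tpr * tnr) weighted by
    # the dominance term (tpr - tnr); alpha = 0.1 is an assumption.
    return (1 + alpha * (tpr - tnr)) * tpr * tnr


# Hypothetical counts for an imbalanced relevance-scoring test set.
counts = ConfusionCounts(tp=5, fp=20, tn=170, fn=5)
p, r, s = precision(counts), recall(counts), specificity(counts)
print(f"precision={p:.2f} recall={r:.2f} specificity={s:.2f}")
print(f"F1={f1_score(p, r):.2f} IBA={index_balanced_accuracy(r, s):.2f}")
```

As a consistency check, f1_score(0.19, 0.50) evaluates to roughly 0.28, which agrees with the precision, recall, and F1 reported for the multilayer perceptron.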

Application of natural language processing algorithms for extracting information from news articles in event-based surveillance

Canada Communicable Disease Report ◽

10.14745/ccdr.v46i06a06 ◽

2020 ◽

pp. 186-191

Author(s):

Victoria Ng ◽

Erin E Rees ◽

Jingcheng Niu ◽

Abdelhamid Zaghlool

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Algorithms ◽

Event Based

Identification Technology of Grid Monitoring Alarm Event Based on Natural Language Processing and Deep Learning in China

Energies ◽

10.3390/en12173258 ◽

2019 ◽

Vol 12 (17) ◽

pp. 3258 ◽

Cited By ~ 2

Author(s):

Bai ◽

Sun ◽

Zang ◽

Zhang ◽

Shen ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Event Identification ◽

Short Period ◽

Grid Monitoring ◽

Low Efficiency ◽

Event Based ◽

Power Dispatching

Power dispatching systems currently receive massive, complicated, and irregular monitoring alarms during their operation, which prevents the controllers from making accurate judgments on the alarm events that occur within a short period of time. In view of the current situation with the low efficiency of monitoring alarm information, this paper proposes a method based on natural language processing (NLP) and a hybrid model that combines long short-term memory (LSTM) and convolutional neural network (CNN) for the identification of grid monitoring alarm events. Firstly, the characteristics of the alarm information text were analyzed and induced and then preprocessed. Then, the monitoring alarm information was vectorized based on the Word2vec model. Finally, a monitoring alarm event identification model based on a combination of LSTM and CNN was established for the characteristics of the alarm information. The feasibility and effectiveness of the method in this paper were verified by comparison with multiple identification models.

Extracting Causal Claims from Information Systems Papers with Natural Language Processing for Theory Ontology Learning

Proceedings of the 51st Hawaii International Conference on System Sciences ◽

10.24251/hicss.2018.660 ◽

2018 ◽

Cited By ~ 5

Author(s):

Roland M. Mueller ◽

Sebastian Huettemann

Keyword(s):

Natural Language Processing ◽

Information Systems ◽

Natural Language ◽

Language Processing ◽

Ontology Learning

Natural language processing based ontology learning

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) ◽

10.1109/iccasm.2010.5620325 ◽

2010 ◽

Author(s):

Chengxiang Yuan ◽

Yi Zhuang ◽

Xiaojun Li

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Ontology Learning

EventEpi—A natural language processing framework for event-based surveillance

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008277 ◽

2020 ◽

Vol 16 (11) ◽

pp. e1008277

Author(s):

Auss Abbood ◽

Alexander Ullrich ◽

Rüdiger Busche ◽

Stéphane Ghozzi

Keyword(s):

Public Health ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Fine Tuning ◽

Entity Recognition ◽

World Health ◽

Case Count ◽

Event Based ◽

Processing Framework

According to the World Health Organization (WHO), around 60% of all outbreaks are detected using informal sources. In many public health institutes, including the WHO and the Robert Koch Institute (RKI), dedicated groups of public health agents sift through numerous articles and newsletters to detect relevant events. This media screening is one important part of event-based surveillance (EBS). Reading the articles, discussing their relevance, and putting key information into a database is a time-consuming process. To support EBS, but also to gain insights into what makes an article and the event it describes relevant, we developed a natural language processing framework for automated information extraction and relevance scoring. First, we scraped relevant sources for EBS as done at the RKI (WHO Disease Outbreak News and ProMED) and automatically extracted the articles’ key data: disease, country, date, and confirmed-case count. For this, we performed named entity recognition in two steps: EpiTator, an open-source epidemiological annotation tool, suggested many different possibilities for each. We extracted the key country and disease using a heuristic with good results. We trained a naive Bayes classifier to find the key date and confirmed-case count, using the RKI’s EBS database as labels which performed modestly. Then, for relevance scoring, we defined two classes to which any article might belong: The article is relevant if it is in the EBS database and irrelevant otherwise. We compared the performance of different classifiers, using bag-of-words, document and word embeddings. The best classifier, a logistic regression, achieved a sensitivity of 0.82 and an index balanced accuracy of 0.61. Finally, we integrated these functionalities into a web application called EventEpi where relevant sources are automatically analyzed and put into a database. The user can also provide any URL or text, that will be analyzed in the same way and added to the database. Each of these steps could be improved, in particular with larger labeled datasets and fine-tuning of the learning algorithms. The overall framework, however, works already well and can be used in production, promising improvements in EBS. The source code and data are publicly available under open licenses.
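
The abstract mentions a heuristic for picking the key country and disease from the many candidates an annotator suggests, but does not say what that heuristic is. One plausible choice, assumed here purely for illustration, is to take the most frequently mentioned candidate. The function name `key_entity` and the sample mentions are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def key_entity(candidates: Sequence[str]) -> Optional[str]:
    """Return the most frequently mentioned candidate, or None if empty.

    Assumption: frequency of mention is used as a proxy for importance.
    Ties are resolved in favour of the candidate seen first, because
    Counter.most_common keeps first-encountered order for equal counts.
    """
    if not candidates:
        return None
    counts = Counter(candidates)
    return counts.most_common(1)[0][0]


# Hypothetical candidate mentions suggested for a single article.
country_mentions = ["Nigeria", "Ghana", "Nigeria", "Nigeria", "Togo"]
disease_mentions = ["Lassa fever", "malaria", "Lassa fever"]

print(key_entity(country_mentions))
print(key_entity(disease_mentions))
```

A frequency heuristic of this kind needs no training data, which may be why a simple rule can suffice for country and disease while date and case count are handled by a trained classifier.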

An Overview of Shallow and Deep Natural Language Processing for Ontology Learning

Ontology Learning and Knowledge Discovery Using the Web ◽

10.4018/978-1-60960-625-1.ch002 ◽

2011 ◽

pp. 16-37 ◽

Cited By ~ 13

Author(s):

Amal Zouaq

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Process ◽

Semantic Analysis ◽

State Of The Art ◽

Knowledge Extraction ◽

The State ◽

Ontology Learning ◽

Analysis Methods

This chapter gives an overview over the state-of-the-art in natural language processing for ontology learning. It presents two main NLP techniques for knowledge extraction from text, namely shallow techniques and deep techniques, and explains their usefulness for each step of the ontology learning process. The chapter also advocates the interest of deeper semantic analysis methods for ontology learning. In fact, there have been very few attempts to create ontologies using deep NLP. After a brief introduction to the main semantic analysis approaches, the chapter focuses on lexico-syntactic patterns based on dependency grammars and explains how these patterns can be considered as a step towards deeper semantic analysis. Finally, the chapter addresses the “ontologization” task that is the ability to filter important concepts and relationships among the mass of extracted knowledge.
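
To make the idea of a lexico-syntactic pattern concrete, the sketch below applies a surface-level "X such as Y" pattern with a regular expression. This is a deliberately simpler stand-in: the chapter focuses on patterns over dependency parses, which this example does not implement. The names `extract_is_a`, `PATTERN`, and `SPLIT`, and the sample sentence, are assumptions made for illustration.

```python
import re
from typing import List, Tuple

# "<hypernym> such as <item>, <item> and <item>" (surface pattern only).
PATTERN = re.compile(
    r"\b(\w+) such as "
    r"(\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)
# Separators between list items: ", and ", ", ", " and ", " or ".
SPLIT = re.compile(r",\s*(?:and|or)\s+|,\s*|\s+(?:and|or)\s+")


def extract_is_a(text: str) -> List[Tuple[str, str]]:
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in PATTERN.finditer(text):
        hypernym = match.group(1)
        for hyponym in SPLIT.split(match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs


sentence = "The report covers diseases such as cholera, measles and Ebola."
print(extract_is_a(sentence))
```

Only single-word terms are captured here; handling multi-word noun phrases is one reason to move from surface patterns to parse-based ones.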

A modular framework for ontology learning from text in Portuguese

Multi-Science Journal ◽

10.33837/msj.v3i3.899 ◽

2020 ◽

Vol 3 (3) ◽

pp. 37-42

Author(s):

Norton Coelho Guimarães ◽

Cedric Luiz De Carvalho

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Ontology Learning ◽

Computational Framework ◽

Learning From Text ◽

Learning From Texts ◽

Processing Techniques ◽

Taxonomic Relations

Research on ontology learning has been carried out in many knowledge areas, especially in Artificial Intelligence. Semi-automatic or automatic ontology learning can contribute to the field of knowledge representation. Many semi-automatic approaches to ontology learning from texts have been proposed. Most of these proposals use natural language processing techniques. This paper describes a computational framework construction for semi-automated ontology learning from texts in Portuguese. Axioms are not treated in this paper. The work described here originated from the Philipp Cimiano’s proposal along with text standardization mechanisms, natural language processing, identification of taxonomic relations and techniques for structuring ontologies. In this work, a case study on public security domain was also done, showing the benefits of the developed computational framework. The result of this case study is an ontology for this area.

Natural Language Processing methods and systems for biomedical ontology learning

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2010.07.006 ◽

2011 ◽

Vol 44 (1) ◽

pp. 163-179 ◽

Cited By ~ 74

Author(s):

Kaihong Liu ◽

William R. Hogan ◽

Rebecca S. Crowley

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Biomedical Ontology ◽

Ontology Learning ◽

Processing Methods

Construction and evaluation of event graphs

Natural Language Engineering ◽

10.1017/s1351324914000060 ◽

2014 ◽

Vol 21 (4) ◽

pp. 607-652 ◽

Cited By ~ 7

Author(s):

Goran Glavaš ◽

Jan Šnajder

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Information Needs ◽

Relation Extraction ◽

Event Extraction ◽

The Individual ◽

Event Based

Events play an important role in natural language processing and information retrieval due to numerous event-oriented texts and information needs. Many natural language processing and information retrieval applications could benefit from a structured event-oriented document representation. In this paper, we propose event graphs as a novel way of structuring event-based information from text. Nodes in event graphs represent the individual mentions of events, whereas edges represent the temporal and coreference relations between mentions. Contrary to previous natural language processing research, which has mainly focused on individual event extraction tasks, we describe a complete end-to-end system for event graph extraction from text. Our system is a three-stage pipeline that performs anchor extraction, argument extraction, and relation extraction (temporal relation extraction and event coreference resolution), each at a performance level comparable with the state of the art. We present EvExtra, a large newspaper corpus annotated with event mentions and event graphs, on which we train and evaluate our models. To measure the overall quality of the constructed event graphs, we propose two metrics based on the tensor product between automatically and manually constructed graphs. Finally, we evaluate the overall quality of event graphs with the proposed evaluation metrics and perform a headroom analysis of the system.
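
The event graph described in this abstract (nodes for event mentions, edges for temporal and coreference relations) can be sketched as a small data structure. Everything below is an illustrative assumption rather than the authors' representation: the class names, the specific relation labels, and the sample mentions are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set, Tuple


class Relation(Enum):
    # Hypothetical label set; the abstract only says "temporal and coreference".
    BEFORE = "before"
    AFTER = "after"
    OVERLAP = "overlap"
    COREFERENCE = "coreference"


@dataclass(frozen=True)
class EventMention:
    mention_id: str
    anchor: str          # the word that signals the event
    sentence_index: int  # where the mention occurs in the document


@dataclass
class EventGraph:
    mentions: Dict[str, EventMention] = field(default_factory=dict)
    edges: List[Tuple[str, str, Relation]] = field(default_factory=list)

    def add_mention(self, mention: EventMention) -> None:
        self.mentions[mention.mention_id] = mention

    def relate(self, source_id: str, target_id: str, relation: Relation) -> None:
        # Edges may only connect mentions that are already in the graph.
        for mention_id in (source_id, target_id):
            if mention_id not in self.mentions:
                raise KeyError(f"unknown mention: {mention_id}")
        self.edges.append((source_id, target_id, relation))

    def coreferent_with(self, mention_id: str) -> Set[str]:
        # Direct coreference links only, treated as symmetric (no transitive closure).
        linked = set()
        for source, target, relation in self.edges:
            if relation is not Relation.COREFERENCE:
                continue
            if source == mention_id:
                linked.add(target)
            elif target == mention_id:
                linked.add(source)
        return linked


graph = EventGraph()
graph.add_mention(EventMention("e1", "erupted", 0))
graph.add_mention(EventMention("e2", "evacuated", 1))
graph.add_mention(EventMention("e3", "eruption", 2))
graph.relate("e1", "e2", Relation.BEFORE)
graph.relate("e1", "e3", Relation.COREFERENCE)
print(graph.coreferent_with("e3"))
```

Storing edges as (source, target, relation) triples keeps temporal relations directed while letting coreference be queried symmetrically.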
