Delayed Combination of Feature Embedding in Bidirectional LSTM CRF for NER

Chirawan Ronran; Seungwoo Lee; Hong Jun Jang

doi:10.3390/app10217557

Applied Sciences ◽

10.3390/app10217557 ◽

2020 ◽

Vol 10 (21) ◽

pp. 7557

Author(s):

Chirawan Ronran ◽

Seungwoo Lee ◽

Hong Jun Jang

Keyword(s):

Neural Network ◽

Language Processing ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Network Models ◽

Word Embedding ◽

Entity Recognition ◽

Neural Network Models ◽

Bidirectional Lstm

Named Entity Recognition (NER) plays a vital role in natural language processing (NLP). Currently, deep neural network models have achieved significant success in NER. Recent advances in NER systems have introduced various feature selections to identify appropriate representations and handle Out-Of-the-Vocabulary (OOV) words. After selecting the features, they are all concatenated at the embedding layer before being fed into a model to label the input sequences. However, when concatenating the features, information collisions may occur, and this can limit or degrade performance. To overcome the information collisions, some works tried to directly connect some features to later layers. We call this the delayed combination and show its effectiveness by comparing it to the early combination. As feature encodings for input, we selected the character-level Convolutional Neural Network (CNN) or Long Short-Term Memory (LSTM) word encoding, the pre-trained word embedding, and the contextual word embedding, and additionally designed a CNN-based sentence encoding using a dictionary. These feature encodings are combined at an early or delayed position of the bidirectional LSTM Conditional Random Field (CRF) model according to each feature’s characteristics. We evaluated the performance of this model on the CoNLL 2003 and OntoNotes 5.0 datasets using the F1 score and compared the delayed combination model with our own implementation of the early combination as well as the previous works. This comparison convinces us that our delayed combination is more effective than the early one and also highly competitive.
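The difference between early and delayed combination can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, `toy_bidirectional_encoder` is a trivial stand-in for a BiLSTM (running means in each direction), and the choice to delay the contextual embedding is arbitrary, since the abstract does not say which features are delayed.

```python
from typing import List

Vector = List[float]
FeatureSeq = List[Vector]  # one feature vector per token


def concat(*features: FeatureSeq) -> FeatureSeq:
    """Concatenate several feature sequences token by token."""
    return [[x for vec in token_vecs for x in vec] for token_vecs in zip(*features)]


def toy_bidirectional_encoder(seq: FeatureSeq) -> FeatureSeq:
    """Stand-in for a BiLSTM (assumption): emits one forward and one
    backward running mean per token, so every input feature is mixed."""
    sums = [sum(vec) for vec in seq]
    n = len(sums)
    forward, total = [], 0.0
    for count, value in enumerate(sums, start=1):
        total += value
        forward.append(total / count)
    backward, total = [0.0] * n, 0.0
    for count, index in enumerate(range(n - 1, -1, -1), start=1):
        total += sums[index]
        backward[index] = total / count
    return [[f, b] for f, b in zip(forward, backward)]


def early_combination(char: FeatureSeq, word: FeatureSeq, ctx: FeatureSeq) -> FeatureSeq:
    """All features are concatenated at the embedding layer, then encoded."""
    return toy_bidirectional_encoder(concat(char, word, ctx))


def delayed_combination(char: FeatureSeq, word: FeatureSeq, ctx: FeatureSeq) -> FeatureSeq:
    """Some features are encoded first; the delayed feature (here ctx, by
    assumption) is attached to the encoder output just before tagging."""
    hidden = toy_bidirectional_encoder(concat(char, word))
    return concat(hidden, ctx)
```

In the delayed variant the contextual vectors reach the tagging layer unchanged, which is one way to picture how a feature can avoid colliding with the others inside the encoder.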

Bidirectional Recurrent Neural Network Approach for Arabic Named Entity Recognition

Future Internet ◽

10.3390/fi10120123 ◽

2018 ◽

Vol 10 (12) ◽

pp. 123 ◽

Cited By ~ 7

Author(s):

Mohammed Ali ◽

Guanzheng Tan ◽

Aamir Hussain

Keyword(s):

Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Named Entity Recognition ◽

Recognition Task ◽

Word Embedding ◽

Entity Recognition ◽

Named Entity ◽

Lstm Network

Recurrent neural network (RNN) has achieved remarkable success in sequence labeling tasks with memory requirement. RNN can remember previous information of a sequence and can thus be used to solve natural language processing (NLP) tasks. Named entity recognition (NER) is a common task of NLP and can be considered a classification problem. We propose a bidirectional long short-term memory (LSTM) model for this entity recognition task of the Arabic text. The LSTM network can process sequences and relate to each part of it, which makes it useful for the NER task. Moreover, we use pre-trained word embedding to train the inputs that are fed into the LSTM network. The proposed model is evaluated on a popular dataset called “ANERcorp.” Experimental results show that the model with word embedding achieves a high F-score measure of approximately 88.01%.

End-to-End Recurrent Neural Network Models for Vietnamese Named Entity Recognition: Word-Level Vs. Character-Level

Communications in Computer and Information Science - Computational Linguistics ◽

10.1007/978-981-10-8438-6_18 ◽

2018 ◽

pp. 219-232 ◽

Cited By ~ 5

Author(s):

Thai-Hoang Pham ◽

Phuong Le-Hong

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Named Entity Recognition ◽

Network Models ◽

Entity Recognition ◽

Neural Network Models ◽

Named Entity ◽

Word Level ◽

End To End

Information Extraction of Cybersecurity Concepts: An LSTM Approach

Applied Sciences ◽

10.3390/app9193945 ◽

2019 ◽

Vol 9 (19) ◽

pp. 3945 ◽

Cited By ~ 4

Author(s):

Houssem Gasmi ◽

Jannik Laval ◽

Abdelaziz Bouras

Keyword(s):

Neural Network ◽

Domain Knowledge ◽

Conditional Random Fields ◽

Short Term Memory ◽

Network Models ◽

Relation Extraction ◽

Entity Recognition ◽

Feature Engineering ◽

Neural Network Models ◽

Feature Based

Extracting cybersecurity entities and the relationships between them from online textual resources such as articles, bulletins, and blogs and converting these resources into more structured and formal representations has important applications in cybersecurity research and is valuable for professional practitioners. Previous works to accomplish this task were mainly based on utilizing feature-based models. Feature-based models are time-consuming and need labor-intensive feature engineering to describe the properties of entities, domain knowledge, entity context, and linguistic characteristics. Therefore, to alleviate the need for feature engineering, we propose the usage of neural network models, specifically the long short-term memory (LSTM) models to accomplish the tasks of Named Entity Recognition (NER) and Relation Extraction (RE). We evaluated the proposed models on two tasks. The first task is performing NER and evaluating the results against the state-of-the-art Conditional Random Fields (CRFs) method. The second task is performing RE using three LSTM models and comparing their results to assess which model is more suitable for the domain of cybersecurity. The proposed models achieved competitive performance with less feature-engineering work. We demonstrate that exploiting neural network models in cybersecurity text mining is effective and practical.

Innovative Deep Neural Network Modeling for Fine-Grained Chinese Entity Recognition

Electronics ◽

10.3390/electronics9061001 ◽

2020 ◽

Vol 9 (6) ◽

pp. 1001 ◽

Cited By ~ 1

Author(s):

Jingang Liu ◽

Chunhe Xia ◽

Haihua Yan ◽

Wenjing Xu

Keyword(s):

Neural Network ◽

Language Processing ◽

Short Term Memory ◽

Named Entity Recognition ◽

Training Model ◽

Entity Recognition ◽

Coarse Grained ◽

Neural Network Modeling ◽

Fine Grained ◽

Named Entity

Named entity recognition (NER) is a basic but crucial task in the field of natural language processing (NLP) and big data analysis. The recognition of named entities based on Chinese is more complicated and difficult than English, which makes the task of NER in Chinese more challenging. In particular, fine-grained named entity recognition is more challenging than traditional named entity recognition tasks, mainly because fine-grained tasks have higher requirements for the ability of automatic feature extraction and information representation of deep neural models. In this paper, we propose an innovative neural network model named En2BiLSTM-CRF to improve the effect of fine-grained Chinese entity recognition tasks. This proposed model including the initial encoding layer, the enhanced encoding layer, and the decoding layer combines the advantages of pre-training model encoding, dual bidirectional long short-term memory (BiLSTM) networks, and a residual connection mechanism. Hence, it can encode information multiple times and extract contextual features hierarchically. We conducted sufficient experiments on two representative datasets using multiple important metrics and compared them with other advanced baselines. We present promising results showing that our proposed En2BiLSTM-CRF has better performance as well as better generalization ability in both fine-grained and coarse-grained Chinese entity recognition tasks.

CWPC_BiAtt: Character–Word–Position Combined BiLSTM-Attention for Chinese Named Entity Recognition

Information ◽

10.3390/info11010045 ◽

2020 ◽

Vol 11 (1) ◽

pp. 45 ◽

Cited By ~ 1

Author(s):

Shardrom Johnson ◽

Sherlock Shen ◽

Yuanchen Liu

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Attention Mechanism ◽

Entity Recognition ◽

Position Information ◽

Named Entity ◽

Pos Tagging ◽

Word Position

Usually taken as linguistic features by Part-Of-Speech (POS) tagging, Named Entity Recognition (NER) is a major task in Natural Language Processing (NLP). In this paper, we put forward a new comprehensive-embedding, considering three aspects, namely character-embedding, word-embedding, and pos-embedding stitched in the order we give, and thus get their dependencies, based on which we propose a new Character–Word–Position Combined BiLSTM-Attention (CWPC_BiAtt) for the Chinese NER task. Comprehensive-embedding via the Bidirectional Long Short-Term Memory (BiLSTM) layer can get the connection between the historical and future information, and then employ the attention mechanism to capture the connection between the content of the sentence at the current position and that at any location. Finally, we utilize Conditional Random Field (CRF) to decode the entire tagging sequence. Experiments show that the CWPC_BiAtt model we proposed is well qualified for the NER task on Microsoft Research Asia (MSRA) dataset and Weibo NER corpus. A high precision and recall were obtained, which verified the stability of the model. Position-embedding in comprehensive-embedding can compensate for attention-mechanism to provide position information for the disordered sequence, which shows that comprehensive-embedding has completeness. Looking at the entire model, our proposed CWPC_BiAtt has three distinct characteristics: completeness, simplicity, and stability. Our proposed CWPC_BiAtt model achieved the highest F-score, achieving the state-of-the-art performance in the MSRA dataset and Weibo NER corpus.

Ontology-Based Healthcare Named Entity Recognition from Twitter Messages Using a Recurrent Neural Network Approach

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph16193628 ◽

2019 ◽

Vol 16 (19) ◽

pp. 3628 ◽

Cited By ~ 5

Author(s):

Erdenebileg Batbaatar ◽

Keun Ho Ryu

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Adverse Drug Events ◽

Viterbi Algorithm ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Health Related

Named Entity Recognition (NER) in the healthcare domain involves identifying and categorizing disease, drugs, and symptoms for biosurveillance, extracting their related properties and activities, and identifying adverse drug events appearing in texts. These tasks are important challenges in healthcare. Analyzing user messages in social media networks such as Twitter can provide opportunities to detect and manage public health events. Twitter provides a broad range of short messages that contain interesting information for information extraction. In this paper, we present a Health-Related Named Entity Recognition (HNER) task using healthcare-domain ontology that can recognize health-related entities from large numbers of user messages from Twitter. For this task, we employ a deep learning architecture which is based on a recurrent neural network (RNN) with little feature engineering. To achieve our goal, we collected a large number of Twitter messages containing health-related information, and detected biomedical entities from the Unified Medical Language System (UMLS). A bidirectional long short-term memory (BiLSTM) model learned rich context information, and a convolutional neural network (CNN) was used to produce character-level features. The conditional random field (CRF) model predicted a sequence of labels that corresponded to a sequence of inputs, and the Viterbi algorithm was used to detect health-related entities from Twitter messages. We provide comprehensive results giving valuable insights for identifying medical entities in Twitter for various applications. The BiLSTM-CRF model achieved a precision of 93.99%, recall of 73.31%, and F1-score of 81.77% for disease or syndrome HNER; a precision of 90.83%, recall of 81.98%, and F1-score of 87.52% for sign or symptom HNER; and a precision of 94.85%, recall of 73.47%, and F1-score of 84.51% for pharmacologic substance named entities. The ontology-based manual annotation results show that it is possible to perform high-quality annotation despite the complexity of medical terminology and the lack of context in tweets.
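The Viterbi decoding step mentioned above can be sketched as follows. This is a generic textbook version for a linear-chain CRF, not the authors' code: the function name, the score layout (`emissions[t][tag]`, `transitions[prev][cur]`), and the absence of start and stop transitions are all assumptions made for brevity.

```python
from typing import List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> Tuple[List[int], float]:
    """Return the highest-scoring tag sequence and its score.

    emissions[t][tag] is the per-token score of a tag (e.g. from a BiLSTM);
    transitions[prev][cur] is the score of moving from tag prev to tag cur.
    """
    if not emissions:
        return [], 0.0
    num_tags = len(transitions)
    score = list(emissions[0])            # best score ending in each tag so far
    backpointers: List[List[int]] = []    # best previous tag for each step and tag

    for emit in emissions[1:]:
        new_score, pointers = [], []
        for cur in range(num_tags):
            best_prev = max(
                range(num_tags),
                key=lambda prev: score[prev] + transitions[prev][cur],
            )
            new_score.append(score[best_prev] + transitions[best_prev][cur] + emit[cur])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    best_last = max(range(num_tags), key=lambda tag: score[tag])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]
```

With emissions `[[2.0, 0.0], [0.0, 1.0], [0.0, 1.5]]` and a heavy penalty on the transition from tag 0 to tag 1 (`[[0.0, -5.0], [0.0, 0.0]]`), choosing the best tag per token independently would start with tag 0, whereas the decoder returns `[1, 1, 1]` because it scores the sequence as a whole.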

An Experimental Study of Hybrid Machine Learning Models for Extracting Named Entities

10.29007/dp5m ◽

2019 ◽

Author(s):

Lei Jiang ◽

Elena Bolshakova

Keyword(s):

Neural Network ◽

Conditional Random Fields ◽

Named Entity Recognition ◽

Network Models ◽

Entity Recognition ◽

Neural Network Models ◽

Named Entities ◽

Hybrid Neural Network ◽

Named Entity ◽

Two Hybrid

The paper describes two hybrid neural network models for named entity recognition (NER) in texts, as well as results of experiments with them. The first model, namely Bi-LSTM-CRF, is known and used for NER, while the other model, named Gated-CNN-CRF, is proposed in this work. It combines convolutional neural network (CNN), gated linear units, and conditional random fields (CRF). Both models were tested for NER on three different language datasets, for English, Russian, and Chinese. All resulting scores of precision, recall and F1-measure for both models are close to the state-of-the-art for NER, and for the English dataset CoNLL-2003, the Gated-CNN-CRF model achieves an F1-measure of 92.66, outperforming the known result.
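As a rough illustration of the gated linear unit named above, the sketch below applies the commonly used form, values multiplied elementwise by the sigmoid of gates. The abstract does not give the exact layer layout of Gated-CNN-CRF, so the function names and the assumption that `values` and `gates` come from two parallel convolutions are illustrative only.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_linear_unit(values: List[float], gates: List[float]) -> List[float]:
    """Elementwise GLU: each value is scaled by sigmoid(gate).

    Assumption: `values` and `gates` are the outputs of two parallel
    convolutions over the same input window.
    """
    if len(values) != len(gates):
        raise ValueError("values and gates must have the same length")
    return [v * sigmoid(g) for v, g in zip(values, gates)]
```

A gate of 0 passes half of the value, a large positive gate passes almost all of it, and a large negative gate suppresses it, which lets the network learn which convolutional features to keep for each token.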

Viability of Neural Networks for Core Technologies for Resource-Scarce Languages

Information ◽

10.3390/info11010041 ◽

2020 ◽

Vol 11 (1) ◽

pp. 41

Author(s):

Melinda Loubser ◽

Martin J. Puttkammer

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

South African ◽

Language Processing ◽

Named Entity Recognition ◽

Entity Recognition ◽

Neural Network Models ◽

African Languages ◽

Pos Tagging

In this paper, the viability of neural network implementations of core technologies (the focus of this paper is on text technologies) for 10 resource-scarce South African languages is evaluated. Neural networks are increasingly being used in place of other machine learning methods for many natural language processing tasks with good results. However, in the South African context, where most languages are resource-scarce, very little research has been done on neural network implementations of core language technologies. In this paper, we address this gap by evaluating neural network implementations of four core technologies for ten South African languages. The technologies we address are part of speech tagging, named entity recognition, compound analysis and lemmatization. Neural architectures that performed well on similar tasks in other settings were implemented for each task and the performance was assessed in comparison with currently used machine learning implementations of each technology. The neural network models evaluated perform better than the baselines for compound analysis, are viable and comparable to the baseline on most languages for POS tagging and NER, and are viable, but not on par with the baseline, for Afrikaans lemmatization.

Quantifying Uncertainties in Natural Language Processing Tasks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017322 ◽

2019 ◽

Vol 33 ◽

pp. 7322-7329 ◽

Cited By ~ 1

Author(s):

Yijun Xiao ◽

William Yang Wang

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Intelligent Systems ◽

Named Entity Recognition ◽

Network Models ◽

Entity Recognition ◽

Neural Network Models ◽

Named Entity ◽

Modeling Uncertainties

Reliable uncertainty quantification is a first step towards building explainable, transparent, and accountable artificial intelligent systems. Recent progress in Bayesian deep learning has made such quantification realizable. In this paper, we propose novel methods to study the benefits of characterizing model and data uncertainties for natural language processing (NLP) tasks. With empirical experiments on sentiment analysis, named entity recognition, and language modeling using convolutional and recurrent neural network models, we show that explicitly modeling uncertainties is not only necessary to measure output confidence levels, but also useful at enhancing model performances in various NLP tasks.
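One widely used way to separate model and data uncertainty for a classifier is sketched below. It is not necessarily the method proposed in the paper: it assumes several stochastic forward passes (for example with dropout kept active) and uses the entropy-based decomposition, where total uncertainty is the entropy of the averaged prediction, data uncertainty is the average entropy of the individual predictions, and model uncertainty is the difference. All names are hypothetical.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(samples: List[List[float]]) -> Dict[str, float]:
    """Split predictive uncertainty given class distributions from several
    stochastic forward passes (assumed input, one distribution per pass)."""
    n = len(samples)
    mean_probs = [sum(column) / n for column in zip(*samples)]
    total = entropy(mean_probs)                    # entropy of the averaged prediction
    data = sum(entropy(p) for p in samples) / n    # average per-pass entropy
    return {"total": total, "data": data, "model": total - data}
```

If every pass returns `[0.5, 0.5]`, the uncertainty is attributed entirely to the data; if the passes confidently disagree, for example `[1.0, 0.0]` and `[0.0, 1.0]`, it is attributed entirely to the model.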

Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition

Applied Sciences ◽

10.3390/app11199038 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9038

Author(s):

Wazir Ali ◽

Jay Kumar ◽

Zenglin Xu ◽

Rajesh Kumar ◽

Yazhou Ren

Keyword(s):

Language Processing ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Representation Learning ◽

Entity Recognition ◽

Context Aware ◽

Named Entity ◽

Word Level ◽

Task Oriented

Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with the attention mechanism yield excellent performance in NER by taking advantage of character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies upon contextual representation learning (CRL), bidirectional encoder, self-attention, and sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It takes character-level input to learn the character representations. Afterwards, the character representations are transformed into word features, and the bidirectional encoder learns the word representations. The output of the final encoder is fed into the self-attention through a hidden layer before decoding. Finally, we employ the CRF for the prediction of label sequences. The baselines and the proposed CaBiLSTM model are compared by exploiting pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. Our proposed CaBiLSTM model achieved a high F1-score of 91.25% on the SiNER dataset with CRL without relying on additional handmade features, such as hand-crafted rules, gazetteers, or dictionaries.
