A Kernel-Based Approach for Biomedical Named Entity Recognition

The Scientific World Journal ◽

10.1155/2013/950796 ◽

2013 ◽

Vol 2013 ◽

pp. 1-7 ◽

Cited By ~ 8

Author(s):

Rakesh Patra ◽

Sujan Kumar Saha

Keyword(s):

Kernel Function ◽

Text Processing ◽

Named Entity Recognition ◽

Kernel Functions ◽

Entity Recognition ◽

Machine Learning Techniques ◽

Support Vector ◽

SVM Classifier ◽

Named Entity ◽

Tree Kernel

Support vector machine (SVM) is a popular machine learning technique used in various text processing tasks, including named entity recognition (NER). The performance of an SVM classifier depends largely on the appropriateness of the kernel function. In the last few years, a number of task-specific kernel functions, such as the string kernel, graph kernel, and tree kernel, have been proposed and used in various text processing tasks. So far, very little effort has been devoted to developing kernels specific to the NER task. In the literature, we found that the tree kernel has been used in NER only for entity boundary detection or reannotation. The conventional tree kernel cannot perform the complete NER task on its own. In this paper, we propose a kernel function, motivated by the tree kernel, that is able to perform the complete NER task. To examine its effectiveness, we applied the proposed kernel to the openly available JNLPBA 2004 data. Our kernel performs the complete NER task and achieves reasonable accuracy.
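
For readers unfamiliar with the conventional tree kernel mentioned in this abstract, the following is a minimal sketch of a subset-tree kernel in the style of Collins and Duffy. It is not the kernel proposed in the paper. The nested-tuple tree encoding, the function names, and the `decay` parameter are assumptions made for illustration.

```python
# Illustrative subset-tree kernel; NOT the kernel proposed in the paper.
# A tree is a nested tuple (label, child, ...); a leaf word is a plain string.

def nodes(tree):
    """Return every internal node of the tree (leaf words are skipped)."""
    if isinstance(tree, str):
        return []
    found = [tree]
    for child in tree[1:]:
        found.extend(nodes(child))
    return found


def production(node):
    """Grammar production at a node: its label plus the labels of its children."""
    children = tuple(
        ("word", c) if isinstance(c, str) else ("label", c[0]) for c in node[1:]
    )
    return (node[0], children)


def common_fragments(n1, n2, decay):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # equal productions, so c2 is also a node
            score *= 1.0 + common_fragments(c1, c2, decay)
    return score


def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(
        common_fragments(a, b, decay) for a in nodes(t1) for b in nodes(t2)
    )


if __name__ == "__main__":
    t1 = ("NP", ("DT", "the"), ("NN", "protein"))
    t2 = ("NP", ("DT", "the"), ("NN", "gene"))
    print(tree_kernel(t1, t1), tree_kernel(t1, t2))
```

With `decay=1.0` the kernel counts the fragments two trees share; smaller values down-weight larger fragments.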

An Improved Word Representation for Deep Learning Based NER in Indian Languages

Information ◽

10.3390/info10060186 ◽

2019 ◽

Vol 10 (6) ◽

pp. 186 ◽

Cited By ~ 1

Author(s):

Ajees A P ◽

Manju K ◽

Sumam Mary Idicula

Keyword(s):

Deep Learning ◽

Named Entity Recognition ◽

Entity Recognition ◽

Machine Learning Techniques ◽

Support Vector ◽

Indian Languages ◽

Named Entity ◽

Text Document ◽

Learning Techniques ◽

Word Representation

Named Entity Recognition (NER) is the process of identifying the elementary units in a text document and classifying them into predefined categories such as person, location, and organization. NER plays an important role in many Natural Language Processing applications, such as information retrieval, question answering, and machine translation. Resolving the ambiguities of the lexical items in a text document is a challenging task. NER in Indian languages is particularly complex because of their morphological richness and agglutinative nature. Although different solutions have been proposed for NER, it is still an unsolved problem. Traditional approaches to Named Entity Recognition applied hand-crafted features to classical machine learning techniques such as Hidden Markov Model (HMM), Support Vector Machine (SVM), and Conditional Random Field (CRF). The introduction of deep learning techniques to the NER problem changed this scenario, and state-of-the-art results have since been achieved using deep learning architectures. In this paper, we address the problem of effective word representation for NER in Indian languages by capturing syntactic, semantic, and morphological information. We propose a deep learning based entity extraction system for Indian languages using a novel combined word representation that includes character-level, word-level, and affix-level embeddings. We used the ‘ARNEKT-IECSIL 2018’ shared data for training and testing. Our results highlight the improvement that we obtained over the existing pre-trained word representations.
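
The combined word representation described in this abstract can be pictured as a concatenation of several embedding vectors. The sketch below is only an illustration under stated assumptions: the lookup tables are toy dictionaries, averaging character vectors stands in for whatever character-level encoder the authors trained, and every name and dimension here is hypothetical.

```python
# Toy illustration of a concatenated word representation; not the authors' model.

def lookup(table, key, dim):
    """Return the stored vector for key, or a zero vector if key is unknown."""
    return table.get(key, [0.0] * dim)


def mean_vectors(vectors, dim):
    """Element-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combined_representation(word, word_emb, char_emb, affix_emb, dims, affix_len=3):
    """Concatenate word-level, character-level, prefix and suffix vectors."""
    word_dim, char_dim, affix_dim = dims
    word_vec = lookup(word_emb, word, word_dim)
    # Assumption: averaging character vectors replaces a learned character encoder.
    char_vec = mean_vectors([lookup(char_emb, c, char_dim) for c in word], char_dim)
    prefix_vec = lookup(affix_emb, word[:affix_len], affix_dim)
    suffix_vec = lookup(affix_emb, word[-affix_len:], affix_dim)
    return word_vec + char_vec + prefix_vec + suffix_vec


if __name__ == "__main__":
    word_emb = {"delhi": [0.1, 0.2, 0.3, 0.4]}
    char_emb = {"d": [1.0, 0.0], "i": [0.0, 1.0]}
    affix_emb = {"del": [0.5, 0.5], "lhi": [0.2, 0.8]}
    vec = combined_representation("delhi", word_emb, char_emb, affix_emb, dims=(4, 2, 2))
    print(len(vec), vec)
```

Unknown words, characters, and affixes fall back to zero vectors here, so the output length is always the word dimension plus the character dimension plus twice the affix dimension.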

Named Entity Recognition for Code Mixed Social Media Sentences

International Journal of Software Science and Computational Intelligence ◽

10.4018/ijssci.2021040102 ◽

2021 ◽

Vol 13 (2) ◽

pp. 23-36

Author(s):

Yashvardhan Sharma ◽

Rupal Bhargava ◽

Bapiraju Vamsi Tadikonda

Keyword(s):

Social Media ◽

Language Processing ◽

Short Term Memory ◽

Named Entity Recognition ◽

Entity Recognition ◽

Machine Learning Techniques ◽

Support Vector ◽

Internet Applications ◽

Named Entity ◽

Code Mixing

With the growth of internet applications and social media platforms, informal text communication has increased. People from different regions tend to mix their regional language with English in social media text. This trend is now common in many multilingual nations and is known as code mixing. In code mixing, multiple languages are used within a statement. Named entity recognition (NER) is a well-researched topic in natural language processing (NLP), but present NER systems tend to perform poorly on code-mixed text. This paper proposes three approaches to improve named entity recognizers for handling code mixing. The first approach is based on machine learning techniques such as support vector machines and other tree-based classifiers. The second approach is based on neural networks, and the third approach uses a long short-term memory (LSTM) architecture to solve the problem.

A COMPARISON STUDY OF DIFFERENT KERNEL FUNCTIONS FOR SVM-BASED CLASSIFICATION OF MULTI-TEMPORAL POLARIMETRY SAR DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xl-2-w3-281-2014 ◽

2014 ◽

Vol XL-2/W3 ◽

pp. 281-285 ◽

Cited By ~ 21

Author(s):

B. Yekkehkhany ◽

A. Safari ◽

S. Homayouni ◽

M. Hasanlou

Keyword(s):

Kernel Function ◽

Temporal Integration ◽

Experimental Tests ◽

Kernel Functions ◽

Polynomial Kernel ◽

Support Vector ◽

SVM Classifier ◽

RBF Kernel ◽

Multi Temporal ◽

Dimension Space

In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates than single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial, and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.
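
The three kernel families compared in this abstract have standard textbook definitions, sketched below in plain Python. The hyperparameter defaults (`degree`, `gamma`, `coef0`) and the function names are assumptions for illustration; the abstract does not state the settings used in the study.

```python
import math

# Textbook kernel definitions; hyperparameter defaults are illustrative only.

def linear_kernel(x, y):
    """Plain dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; degree=3 mirrors the cubic case."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); equals 1.0 when x == y."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, kernel):
    """Pairwise kernel values, the matrix an SVM solver works with."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    samples = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]
    for kernel in (linear_kernel, polynomial_kernel, rbf_kernel):
        print(kernel.__name__, gram_matrix(samples, kernel))
```

A Gram matrix computed this way can be passed to an SVM implementation that accepts precomputed kernels, which makes it straightforward to compare kernels on the same data.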

Exploring the Adaptation of Recurrent Neural Network Approaches for Extracting Drug–Drug Interactions from Biomedical Text

International Journal of Machine Learning and Computing ◽

10.18178/ijmlc.2021.11.4.1046 ◽

2021 ◽

Vol 11 (4) ◽

pp. 267-273

Author(s):

Wen-Juan Hou ◽

Bamfa Ceesay

Keyword(s):

Text Processing ◽

Named Entity Recognition ◽

Event Extraction ◽

Entity Recognition ◽

Biomedical Text ◽

Automatic Extraction ◽

Named Entity ◽

Structured Information ◽

Network Approaches ◽

Form Information

Information extraction (IE) is the process of automatically identifying structured information in unstructured or partially structured text. IE processes can involve several activities, such as named entity recognition, event extraction, relationship discovery, and document classification, with the overall goal of translating text into a more structured form. A change in the effect of a drug when it is taken in combination with a second drug is known as a drug–drug interaction (DDI). DDIs can delay, decrease, or enhance the absorption of drugs and thus decrease or increase their efficacy or cause adverse effects. Recent research trends have shown several adaptations of recurrent neural networks (RNNs) to text. In this study, we highlight significant challenges of using RNNs in biomedical text processing and propose an approach to the automatic extraction of DDIs that aims to overcome some of these challenges. Our results show that the system is competitive with other systems on the task of extracting DDIs.

ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition

BioMed Research International ◽

10.1155/2016/4248026 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 5

Author(s):

Abbas Akkasi ◽

Ekrem Varoğlu ◽

Nazife Dimililer

Keyword(s):

Conditional Random Fields ◽

Named Entity Recognition ◽

Classification Performance ◽

Entity Recognition ◽

Support Vector ◽

Learning Approaches ◽

Data Set ◽

Rule Based ◽

Named Entity ◽

Vector Machines

Named Entity Recognition (NER) from text constitutes the first step in many text mining applications. The most important preliminary step for NER systems using machine learning approaches is tokenization, where raw text is segmented into tokens. This study proposes an enhanced rule based tokenizer, ChemTok, which utilizes rules extracted mainly from the training data set. The main novelty of ChemTok is the use of the extracted rules to merge the tokens split in the previous steps, thus producing longer and more discriminative tokens. ChemTok is compared to the tokenization methods utilized by ChemSpot and tmChem. Support Vector Machines and Conditional Random Fields are employed as the learning algorithms. The experimental results show that the classifiers trained on the output of ChemTok outperform all classifiers trained on the output of the other two tokenizers in terms of classification performance and the number of incorrectly segmented entities.
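
The split-then-merge idea in this abstract can be illustrated with a small sketch. This is not ChemTok itself: the single merge rule used here (rejoin tokens that touch in the raw text when one side ends or begins with a listed joining character) and the names `fine_split`, `merge_tokens`, and `join_chars` are hypothetical stand-ins for the rules the authors extract from training data.

```python
import re

# Hypothetical split-then-merge tokenizer; the merge rule is an assumption,
# not one of the rules used by ChemTok.

def fine_split(text):
    """Split text into (token, start, end): word runs and single punctuation marks."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def merge_tokens(text, join_chars):
    """Rejoin adjacent tokens that touch in the raw text around a joining character."""
    merged = []
    previous_end = None
    for token, start, end in fine_split(text):
        touching = previous_end == start  # no whitespace between the two tokens
        if merged and touching and (token in join_chars or merged[-1][-1] in join_chars):
            merged[-1] += token
        else:
            merged.append(token)
        previous_end = end
    return merged


if __name__ == "__main__":
    print(merge_tokens("2-methyl-propane is stable.", join_chars={"-"}))
```

In this toy rule the hyphen keeps a chemical name together as one longer token, while the sentence-final period stays separate because it is not in `join_chars`.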

SCIENTIFIC NAMED ENTITY RECOGNITION WITH THE HELP OF MODERN METHODS

Bulletin Series of Physics & Mathematical Sciences ◽

10.51889/2021-3.1728-7901.11 ◽

2021 ◽

Vol 75 (3) ◽

pp. 94-99

Author(s):

A.M. Yelenov ◽

A.B. Jaxylykova

Keyword(s):

Machine Learning ◽

Language Processing ◽

Named Entity Recognition ◽

Recognition Task ◽

Entity Recognition ◽

Support Vector ◽

Scientific Article ◽

Natural Languages ◽

Named Entity ◽

Learning Area

This research presents a comparative study of the Named Entity Recognition task on the texts of scientific articles. Natural language processing can be considered one of the cornerstones of machine learning; it is concerned with the understanding of natural languages and with linguistic analysis. Deep learning techniques have already been shown to perform well in areas such as image recognition, pattern recognition, and computer vision, which suggests that they may also succeed in natural language processing and has led to a sharp increase in research interest in this topic. For a long time, fairly simple algorithms were used in this area, such as support vector machines or various types of regression, together with basic encodings of text data, and these did not give strong results. The experiments used the Scientific Entity Relation Core dataset. The algorithms used were long short-term memory, a Random Forest classifier with Conditional Random Fields, and named entity recognition with Bidirectional Encoder Representations from Transformers. In the findings, the metric scores of all models are compared with one another. The research concentrates on scientific articles from the machine learning area because this subject has not been investigated thoroughly enough. Progress on this task can help machines understand natural languages better, so that they can perform better on other natural language processing tasks, improving their scores in general.

Advances in Computational Linguistics and Text Processing Frameworks

Advances in Computer and Electrical Engineering - Handbook of Research on Engineering Innovations and Technology Management in Organizations ◽

10.4018/978-1-7998-2772-6.ch012 ◽

2020 ◽

pp. 217-244

Author(s):

Ayush Srivastav ◽

Hera Khan ◽

Amit Kumar Mishra

Keyword(s):

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Computational Linguistics ◽

Language Processing ◽

Text Processing ◽

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Part Of Speech

The chapter provides an eloquent account of the major methodologies and advances in the field of Natural Language Processing. The most popular models that have been used over time for Natural Language Processing are discussed along with their applications to specific tasks. The chapter begins with the fundamental concepts of regex and tokenization. It provides insight into text preprocessing and its methodologies, such as Stemming and Lemmatization and Stop Word Removal, followed by Part-of-Speech tagging and Named Entity Recognition. Further, the chapter elaborates on the concept of Word Embedding, its various types, and some common frameworks such as word2vec, GloVe, and fastText. A brief description of classification algorithms used in Natural Language Processing is provided next, followed by Neural Networks and their advanced forms, such as Recursive Neural Networks and Seq2seq models, that are used in Computational Linguistics. A brief description of chatbots and Memory Networks concludes the chapter.

Utilizing external corpora through kernel function: application in biomedical named entity recognition

Progress in Artificial Intelligence ◽

10.1007/s13748-020-00208-0 ◽

2020 ◽

Vol 9 (3) ◽

pp. 209-219

Author(s):

Rakesh Patra ◽

Sujan Kumar Saha

Keyword(s):

Kernel Function ◽

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Biomedical Named Entity Recognition

Tuning support vector machines for biomedical named entity recognition

10.3115/1118149.1118150 ◽

2002 ◽

Cited By ~ 82

Author(s):

Jun'ichi Kazama ◽

Takaki Makino ◽

Yoshihiro Ohta ◽

Jun'ichi Tsujii

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines ◽

Biomedical Named Entity Recognition

NAMED ENTITY RECOGNITION IN BIOMEDICAL LITERATURE USING TWO-LAYER SUPPORT VECTOR MACHINES

Proceedings of the Ninth International Conference on Enterprise Information Systems ◽

10.5220/0002357300390045 ◽

2007 ◽

Keyword(s):

Support Vector Machines ◽

Named Entity Recognition ◽

Biomedical Literature ◽

Entity Recognition ◽

Support Vector ◽

Named Entity ◽

Vector Machines
