Tagger: BeCalm API for rapid named entity recognition

2017 ◽  
Author(s):  
Lars Juhl Jensen

Abstract Most BioCreative tasks to date have focused on assessing the quality of text-mining annotations in terms of precision and recall. Interoperability, speed, and stability are, however, other important factors to consider for practical applications of text mining. The new BioCreative/BeCalm TIPS task focuses purely on these factors. To participate in this task, I implemented a BeCalm API within the real-time tagging server also used by the Reflect and EXTRACT tools. In addition to retrieval of patent abstracts, PubMed abstracts, and PubMed Central open-access articles as required in the TIPS task, the BeCalm API implementation facilitates retrieval of documents from other sources specified as custom request parameters. As in earlier tests, the tagger proved to be both highly efficient and stable, consistently processing requests of 5000 abstracts in less than half a minute, including retrieval of the document text.
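The source-configurable retrieval described above can be sketched as a simple dispatch table; all names here are illustrative stand-ins, not the actual BeCalm request schema or the tagger's internals.

```python
# Sketch of document retrieval dispatched by source, loosely mirroring the idea
# of supporting extra sources via custom request parameters. The fetchers are
# placeholder stubs, not real PubMed/patent clients.

def fetch_pubmed(doc_id):
    return f"<abstract text for PubMed document {doc_id}>"  # stub fetcher

def fetch_patent(doc_id):
    return f"<abstract text for patent {doc_id}>"  # stub fetcher

# New sources could be registered here based on custom request parameters.
SOURCES = {"PUBMED": fetch_pubmed, "PATENT": fetch_patent}

def retrieve(source, doc_id):
    """Return the document text for (source, doc_id), or fail loudly."""
    try:
        return SOURCES[source](doc_id)
    except KeyError:
        raise ValueError(f"unknown document source: {source}")
```

A batch request would then map `retrieve` over its document list before tagging.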

2011 ◽  
Vol 46 (4) ◽  
pp. 543-563 ◽  
Author(s):  
Harith Al-Jumaily ◽  
Paloma Martínez ◽  
José L. Martínez-Fernández ◽  
Erik Van der Goot

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Nícia Rosário-Ferreira ◽  
Victor Guimarães ◽  
Vítor S. Costa ◽  
Irina S. Moreira

Abstract Background: Blood cancers (BCs) are responsible for over 720,000 deaths worldwide every year. Their prevalence and mortality rate underline the relevance of research related to BCs. Despite the availability of different resources establishing Disease–Disease Associations (DDAs), the knowledge is scattered and not accessible to the scientific community in a straightforward way. Here, we propose SicknessMiner, a biomedical Text-Mining (TM) approach towards the centralization of DDAs. Our methodology encompasses Named Entity Recognition (NER) and Named Entity Normalization (NEN) steps, and the DDAs retrieved were compared to the DisGeNET resource for qualitative and quantitative comparison. Results: We obtained the DDAs via co-mention using our SicknessMiner or via gene- or variant-disease similarity on DisGeNET. SicknessMiner was able to retrieve around 92% of the DisGeNET results, and nearly 15% of the SicknessMiner results were specific to our pipeline. Conclusions: SicknessMiner is a valuable tool to extract disease–disease relationships from a raw input corpus.
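Co-mention extraction of DDAs, as used above, amounts to counting disease pairs that appear in the same abstract after NER and NEN. A minimal sketch, with made-up disease names standing in for normalized identifiers:

```python
from collections import Counter
from itertools import combinations

def co_mention_ddas(abstract_mentions):
    """Count disease-disease co-mentions across abstracts.

    abstract_mentions: one set of normalized disease names per abstract,
    i.e. the output of an NER + NEN pipeline.
    """
    pairs = Counter()
    for mentions in abstract_mentions:
        # Sort so each unordered pair is counted under one canonical key.
        for a, b in combinations(sorted(mentions), 2):
            pairs[(a, b)] += 1
    return pairs

corpus = [
    {"leukemia", "anemia"},
    {"leukemia", "anemia", "lymphoma"},
    {"lymphoma"},
]
print(co_mention_ddas(corpus)[("anemia", "leukemia")])  # → 2
```

Thresholding or scoring these counts would then separate robust associations from chance co-occurrences.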


2016 ◽  
Vol 12 (4) ◽  
pp. 21-44 ◽  
Author(s):  
R. Hema ◽  
T. V. Geetha

The two main challenges in chemical entity recognition are: (i) new chemical compounds are constantly being synthesized, and (ii) chemical representation is highly ambiguous, with a single chemical entity described by several different nomenclatures. The identification and maintenance of chemical terminologies is therefore a difficult task. Since most existing text-mining methods follow term-based approaches, the problems of polysemy and synonymy arise. To address this, a Named Entity Recognition (NER) system based on pattern matching in the chemical domain is developed to extract chemical entities from chemical documents. The Tf-idf and PMI association measures are used to filter out non-chemical terms. An F-score of 92.19% is achieved for chemical NER. The proposed method is compared with a baseline method and other existing approaches. As a final step, the filtered chemical entities are classified into sixteen functional groups using an SVM one-against-all multiclass classification approach, achieving an accuracy of 87%. One-way ANOVA is used to compare the quality of the pattern-matching method with other existing chemical NER methods.
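The PMI filtering step mentioned above can be sketched from corpus counts; the counts and threshold below are illustrative, not the paper's values.

```python
import math

def pmi(term_freq, context_freq, joint_freq, n):
    """Pointwise mutual information between a candidate term and a
    chemistry-indicative context word, computed from corpus counts:
    PMI = log2( p(t, c) / (p(t) * p(c)) ).
    """
    p_t = term_freq / n
    p_c = context_freq / n
    p_tc = joint_freq / n
    return math.log2(p_tc / (p_t * p_c))

# A candidate term is kept as chemical if its association with chemistry
# cues (e.g. "solvent", "reaction") exceeds a chosen threshold.
score = pmi(term_freq=50, context_freq=200, joint_freq=40, n=10_000)
print(score > 0)  # positive PMI: co-occurrence above chance
```

A positive score means the candidate co-occurs with chemical context more often than independence would predict; non-chemical terms tend to score near or below zero.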


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 73729-73740 ◽  
Author(s):  
Donghyeon Kim ◽  
Jinhyuk Lee ◽  
Chan Ho So ◽  
Hwisang Jeon ◽  
Minbyul Jeong ◽  
...  

2021 ◽  
Author(s):  
Afia Fairoose Abedin ◽  
Amirul Islam Al Mamun ◽  
Rownak Jahan Nowrin ◽  
Amitabha Chakrabarty ◽  
Moin Mostakim ◽  
...  

In recent times, a large number of people have become involved in establishing their own businesses. Unlike humans, chatbots can serve multiple customers at a time, are available 24/7, and reply within a fraction of a second. Though chatbots perform well in task-oriented activities, in most cases they fail to understand personalized opinions, statements, or even queries, which later impacts the organization through poor service management. A lack of understanding capability makes humans lose interest in continuing conversations with bots. Usually, chatbots give absurd responses when they are unable to interpret a user's text accurately. By extracting client reviews from chatbot conversations, organizations can reduce the major gap of understanding between users and the chatbot and improve the quality of their products and services. Thus, in our research we incorporated all the key elements that are necessary for a chatbot to analyse and understand an input text precisely and accurately. We performed sentiment analysis, emotion detection, intent classification, and named-entity recognition using deep learning to develop chatbots with humanistic understanding and intelligence. The efficiency of our approach is demonstrated by the detailed analysis.


Author(s):  
Girish Keshav Palshikar

While building and using a fully semantic understanding of Web contents is a distant goal, named entities (NEs) provide a small, tractable set of elements carrying a well-defined semantics. Generic named entities are names of persons, locations, organizations, phone numbers, and dates, while domain-specific named entities include, for example, names of proteins, enzymes, organisms, genes, and cells in the biological domain. An ability to automatically perform named entity recognition (NER) – i.e., to identify occurrences of NEs in Web contents – can have multiple benefits, such as improving the expressiveness of queries and also improving the quality of the search results. A number of factors make building highly accurate NER a challenging task. Given the importance of NER in semantic processing of text, this chapter presents a detailed survey of NER techniques for English text.
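The generic NE types named above (dates, phone numbers) are the easiest to illustrate with a rule-based recognizer; the patterns below are toy examples, far from the coverage of the surveyed techniques.

```python
import re

# Minimal rule-based recognizer for two generic NE types. The patterns are
# deliberately narrow and illustrative only.
PATTERNS = {
    "DATE": re.compile(
        r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]* \d{4}\b"
    ),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def recognize(text):
    """Return (label, surface form) for every pattern match in text."""
    return [
        (label, m.group())
        for label, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]

print(recognize("Call 555-123-4567 before 3 March 2024."))
# → [('DATE', '3 March 2024'), ('PHONE', '555-123-4567')]
```

Real NER systems replace such hand-written rules with gazetteers and statistical or neural sequence models, precisely because rules like these break down on ambiguous names.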


2013 ◽  
pp. 400-426 ◽  
Author(s):  
Girish Keshav Palshikar

While building and using a fully semantic understanding of Web contents is a distant goal, named entities (NEs) provide a small, tractable set of elements carrying a well-defined semantics. Generic named entities are names of persons, locations, organizations, phone numbers, and dates, while domain-specific named entities include, for example, names of proteins, enzymes, organisms, genes, and cells in the biological domain. An ability to automatically perform named entity recognition (NER) – i.e., to identify occurrences of NEs in Web contents – can have multiple benefits, such as improving the expressiveness of queries and also improving the quality of the search results. A number of factors make building highly accurate NER a challenging task. Given the importance of NER in semantic processing of text, this chapter presents a detailed survey of NER techniques for English text.


2018 ◽  
Vol 4 (2) ◽  
pp. 81
Author(s):  
Fatra Nonggala Putra ◽  
Chastine Fatichah

An incident detection system based on Twitter data aims to obtain information in real time as a low-cost alternative incident detection system. Research on incident detection systems has been conducted before. One of the main modules of an incident detection system is the incident-type classification module. Information can be classified as an important incident only if it contains an entity representing where the incident occurred. Several previous studies still rely on handcrafted features or pipeline-based feature models such as n-grams as the key classification features, which are ineffective and yield suboptimal performance. Therefore, this research proposes combining Neuro Named Entity Recognition (NeuroNER) with a Recurrent Convolutional Neural Network (RCNN) classifier as an effective and optimal method for incident detection. First, the system performs named entity recognition on tweets to identify location entities in the tweet text, because incident information must contain at least one location entity. Second, if a tweet is found to contain a location entity, the incident is classified using the RCNN classifier. Experimental results show that the incident detection system combining NeuroNER and RCNN works very well, with average precision, recall, and f-measure of 94.87%, 92.73%, and 93.73%, respectively.
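The two-stage gating described above (require a location entity, then classify) can be sketched with stub components; the gazetteer lookup and keyword classifier below are toy stand-ins for NeuroNER and the RCNN.

```python
# Toy sketch of the two-stage incident detection pipeline: tweets without a
# location entity are discarded before classification. Both stages are stubs.

GAZETTEER = {"jakarta", "surabaya", "bandung"}  # illustrative location list

def find_locations(tweet):
    """Stage 1 stand-in for NeuroNER: naive gazetteer lookup."""
    return [w for w in tweet.lower().split() if w.strip(".,!?") in GAZETTEER]

def classify_incident(tweet):
    """Stage 2 stand-in for the RCNN: keyword-based incident typing."""
    return "flood" if "flood" in tweet.lower() else "other"

def detect(tweet):
    if not find_locations(tweet):      # incident info needs >= 1 location entity
        return None
    return classify_incident(tweet)    # only then classify the incident type

print(detect("Flood reported in Jakarta today"))  # → "flood"
print(detect("Flood somewhere"))                  # → None (no location entity)
```

The gate keeps the classifier from ever seeing tweets that cannot be actionable incident reports, which is the division of labor the paper's pipeline relies on.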


2020 ◽  
Vol 2019 ◽  
Author(s):  
Andrea Bertino ◽  
Luca Foppiano ◽  
Laurent Romary ◽  
Pierre Mounier

This paper addresses the integration of a Named Entity Recognition and Disambiguation (NERD) service within a group of open access (OA) publishing digital platforms and considers its potential impact on both research and scholarly publishing. The software powering this service, called entity-fishing, was initially developed by Inria in the context of the EU FP7 project CENDARI and provides automatic entity recognition and disambiguation using the Wikipedia and Wikidata data sets. The application is distributed with an open-source licence, and it has been deployed as a web service in DARIAH's infrastructure hosted by the French infrastructure Huma-Num. In the paper, we focus on the specific issues related to its integration on five OA platforms specialized in the publication of scholarly monographs in the social sciences and humanities (SSH), as part of the work carried out within the EU H2020 project HIRMEOS (High Integration of Research Monographs in the European Open Science infrastructure). In the first section, we give a brief overview of the current status and evolution of OA publications, considering specifically the challenges that OA monographs are encountering. In the second part, we show how the HIRMEOS project aims to face these challenges by optimizing five OA digital platforms for the publication of monographs from the SSH and ensuring their interoperability. In sections three and four we give a comprehensive description of the entity-fishing service, focusing on its concrete applications in real use cases together with some further possible ideas on how to exploit the annotations generated. We show that entity-fishing annotations can improve both the research and the publishing process. In the last section, we briefly present further possible application scenarios that could be made available through infrastructural projects.
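A platform would typically call entity-fishing over HTTP with a JSON query. The payload below follows the field names in the public entity-fishing documentation as I recall them; treat the exact schema (and any endpoint URL) as an assumption to verify against the current docs before use.

```python
import json

# Sketch of a query payload for entity-fishing's disambiguation service.
# Field names ("text", "language", "mentions") are assumptions based on the
# public documentation, not verified against a live deployment.
query = {
    "text": "Named entity recognition links mentions to Wikidata.",
    "language": {"lang": "en"},       # disambiguation is language-specific
    "mentions": ["ner", "wikipedia"], # which mention detectors to apply
}
payload = json.dumps(query)
print(json.loads(payload)["language"]["lang"])  # round-trips as valid JSON
```

The service would respond with recognized entities linked to Wikipedia/Wikidata identifiers, which is what the platforms surface as annotations.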


2017 ◽  
Author(s):  
David Westergaard ◽  
Hans-Henrik Stærfeldt ◽  
Christian Tønsberg ◽  
Lars Juhl Jensen ◽  
Søren Brunak

Abstract Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823–2016. We describe the development in article length and publication sub-topics during these nearly 200 years. We showcase the potential of text mining by extracting published protein–protein, disease–gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.
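Benchmarking extracted associations against a gold standard, as done above, reduces to set comparison of association pairs. A minimal sketch with invented example pairs:

```python
def precision_recall(extracted, gold):
    """Score extracted association pairs against a gold-standard set.

    precision = |extracted ∩ gold| / |extracted|
    recall    = |extracted ∩ gold| / |gold|
    """
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Illustrative pairs only, not benchmark data from the paper.
gold = {("BRCA1", "breast cancer"), ("TP53", "lung cancer")}
found = {("BRCA1", "breast cancer"), ("EGFR", "glioma")}
print(precision_recall(found, gold))  # → (0.5, 0.5)
```

Running the same scoring on abstract-only versus full-text extractions is what supports the paper's comparison between the two corpora.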

