scholarly journals An Algorithm of Vocabulary Enhanced Intelligent Question Answering Based on FLAT1

2021 ◽  
Author(s):  
Jing Sheng Lei ◽  
Shi Chao Ye ◽  
Sheng Ying Yang ◽  
Wei Song ◽  
Guan Mian Liang

The main purpose of the intelligent question answering system based on the knowledge graph is to accurately match the natural language question and the triple information in the knowledge graph. Among them, the entity recognition part is one of the key points. The wrong entity recognition result will cause the error to be done propagated, resulting in the ultimate failure to get the correct answer. In recent years, the lexical enhancement structure of word nodes combined with word nodes has been proved to be an effective method for Chinese named entity recognition. In order to solve the above problems, this paper proposes a vocabulary-enhanced entity recognition algorithm (KGFLAT) based on FLAT for intelligent question answering system. This method uses a new dictionary that combines the entity information of the knowledge graph, and only uses layer normalization for the removal of residual connection for the shallower network model. The system uses data provided by the NLPCC 2018 Task7 KBQA task for evaluation. The experimental results show that this method can effectively solve the entity recognition task in the intelligent question answering system and achieve the improvement of the FLAT model, and the average F1 value is 94.72

2020 ◽  
Author(s):  
Vladislav Mikhailov ◽  
Tatiana Shavrina

Named Entity Recognition (NER) is a fundamental task in the fields of natural language processing and information extraction. NER has been widely used as a standalone tool or an essential component in a variety of applications such as question answering, dialogue assistants and knowledge graphs development. However, training reliable NER models requires a large amount of labelled data which is expensive to obtain, particularly in specialized domains. This paper describes a method to learn a domain-specific NER model for an arbitrary set of named entities when domain-specific supervision is not available. We assume that the supervision can be obtained with no human effort, and neural models can learn from each other. The code, data and models are publicly available.


2020 ◽  
Vol 34 (05) ◽  
pp. 7919-7926
Author(s):  
Qizhen He ◽  
Liang Wu ◽  
Yida Yin ◽  
Heming Cai

By modeling the context information, ELMo and BERT have successfully improved the state-of-the-art of word representation, and demonstrated their effectiveness on the Named Entity Recognition task. In this paper, in addition to such context modeling, we propose to encode the prior knowledge of entities from an external knowledge base into the representation, and introduce a Knowledge-Graph Augmented Word Representation or KAWR for named entity recognition. Basically, KAWR provides a kind of knowledge-aware representation for words by 1) encoding entity information from a pre-trained KG embedding model with a new recurrent unit (GERU), and 2) strengthening context modeling from knowledge wise by providing a relation attention scheme based on the entity relations defined in KG. We demonstrate that KAWR, as an augmented version of the existing linguistic word representations, promotes F1 scores on 5 datasets in various domains by +0.46∼+2.07. Better generalization is also observed for KAWR on new entities that cannot be found in the training sets.


Author(s):  
Ivan Christanno ◽  
Priscilla Priscilla ◽  
Jody Johansyah Maulana ◽  
Derwin Suhartono ◽  
Rini Wongso

The objective of this research was to create a closed-domain of automated question answering system specifically for events called Eve. Automated Question Answering System (QAS) is a system that accepts question input in the form of natural language. The question will be processed through modules to finally return the most appropriate answer to the corresponding question instead of returning a full document as an output. Thescope of the events was those which were organized by Students Association of Computer Science (HIMTI) in Bina Nusantara University. It consisted of 3 main modules namely query processing, information retrieval, and information extraction. Meanwhile, the approaches used in this system included question classification, document indexing, named entity recognition and others. For the results, the system can answer 63 questions for word matching technique, and 32 questions for word similarity technique out of 94 questions correctly.


2021 ◽  
Vol 16 ◽  
pp. 1-10
Author(s):  
Husni Teja Sukmana ◽  
JM Muslimin ◽  
Asep Fajar Firmansyah ◽  
Lee Kyung Oh

In Indonesia, philanthropy is identical to Zakat. Zakat belongs to a specific domain because it has its characteristics of knowledge. This research studied knowledge graph in the Zakat domain called KGZ which is conducted in Indonesia. This area is still rarely performed, thus it becomes the first knowledge graph for Zakat in Indonesia. It is designed to provide basic knowledge on Zakat and managing the Zakat in Indonesia. There are some issues with building KGZ, firstly, the existing Indonesian named entity recognition (NER) is non-restricted and general-purpose based which data is obtained from a general source like news. Second, there is no dataset for NER in the Zakat domain. We define four steps to build KGZ, involving data acquisition, extracting entities and their relationship, mapping to ontology, and deploying knowledge graphs and visualizations. This research contributed a knowledge graph for Zakat (KGZ) and a building NER model for Zakat, called KGZ-NER. We defined 17 new named entity classes related to Zakat with 272 entities, 169 relationships and provided labelled datasets for KGZ-NER that are publicly accessible. We applied the Indonesian-Open Domain Information Extractor framework to process identifying entities’ relationships. Then designed modeling of information using resources description framework (RDF) to build the knowledge base for KGZ and store it to GraphDB, a product from Ontotext. This NER model has a precision 0.7641, recall 0.4544, and F1-score 0.5655. The increasing data size of KGZ is required to discover all of the knowledge of Zakat and managing Zakat in Indonesia. Moreover, sufficient resources are required in future works.


2021 ◽  
Vol 75 (3) ◽  
pp. 94-99
Author(s):  
A.M. Yelenov ◽  
◽  
A.B. Jaxylykova ◽  

This research focuses on a comparative study of the Named Entity Recognition task for scientific article texts. Natural language processing could be considered as one of the cornerstones in the machine learning area which devotes its attention to the problems connected with the understanding of different natural languages and linguistic analysis. It was already shown that current deep learning techniques have a good performance and accuracy in such areas as image recognition, pattern recognition, computer vision, that could mean that such technology probably would be successful in the neuro-linguistic programming area too and lead to a dramatic increase on the research interest on this topic. For a very long time, quite trivial algorithms have been used in this area, such as support vector machines or various types of regression, basic encoding on text data was also used, which did not provide high results. The following dataset was used to process the experiment models: Dataset Scientific Entity Relation Core. The algorithms used were Long short-term memory, Random Forest Classifier with Conditional Random Fields, and Named-entity recognition with Bidirectional Encoder Representations from Transformers. In the findings, the metrics scores of all models were compared to each other to make a comparison. This research is devoted to the processing of scientific articles, concerning the machine learning area, because the subject is not investigated on enough properly level.The consideration of this task can help machines to understand natural languages better, so that they can solve other neuro-linguistic programming tasks better, enhancing scores in common sense.


Sign in / Sign up

Export Citation Format

Share Document