Acquiring dominant compound terms to build Korean domain knowledge bases

Author(s):  
Hanmin Jung ◽  
HeeKwan Koo ◽  
Byeong-Hee Lee ◽  
Won-Kyung Sung
1994 ◽  
Vol 33 (05) ◽  
pp. 454-463 ◽  
Author(s):  
A. M. van Ginneken ◽  
J. van der Lei ◽  
J. H. van Bemmel ◽  
P. W. Moorman

Abstract: Clinical narratives in patient records are usually recorded in free text, limiting the use of this information for research, quality assessment, and decision support. This study focuses on the capture of clinical narratives in a structured format by supporting physicians with structured data entry (SDE). We analyzed and made explicit which requirements SDE should meet to be acceptable for the physician on the one hand, and to generate unambiguous patient data on the other. Starting from these requirements, we found that in order to support SDE, the knowledge on which it is based needs to be made explicit: we refer to this knowledge as descriptional knowledge. We articulate the nature of this knowledge and propose a model in which it can be formally represented. The model allows the construction of specific knowledge bases, each representing the knowledge needed to support SDE within a circumscribed domain. Data entry is made possible through a general entry program, whose behavior is determined by a combination of user input and the content of the applicable domain knowledge base. We clarify how descriptional knowledge is represented, modeled, and used for data entry to achieve SDE that meets the proposed requirements.
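
As an illustration only, the idea of a general entry program driven by a domain knowledge base might be sketched as follows. The concepts, descriptors, and function names are hypothetical and are not taken from the paper; the sketch only shows how the knowledge base, rather than the program, decides which entries are accepted.

```python
# Hypothetical domain knowledge base: each concept lists the descriptors
# it accepts and the values allowed for each descriptor.
KNOWLEDGE_BASE = {
    "cough": {
        "duration": {"days", "weeks", "months"},
        "type": {"dry", "productive"},
    },
    "chest pain": {
        "severity": {"mild", "moderate", "severe"},
    },
}


def options_for(concept):
    """Return the descriptors the entry program would offer for a concept."""
    return sorted(KNOWLEDGE_BASE[concept])


def enter(record, concept, descriptor, value):
    """Add one structured finding; reject anything the knowledge base does not allow."""
    allowed = KNOWLEDGE_BASE.get(concept, {}).get(descriptor)
    if allowed is None:
        raise ValueError(f"{descriptor!r} is not a descriptor of {concept!r}")
    if value not in allowed:
        raise ValueError(f"{value!r} is not a valid {descriptor} for {concept!r}")
    record.setdefault(concept, {})[descriptor] = value
    return record


# The same generic program serves any domain: only the knowledge base changes.
record = {}
enter(record, "cough", "duration", "weeks")
enter(record, "cough", "type", "dry")
print(options_for("cough"))
print(record)
```

Because the entry logic never names a clinical concept itself, swapping in a different knowledge base changes what can be recorded without changing the program.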


Author(s):  
Aatif Ahmad Khan ◽  
Sanjay Kumar Malik

Semantic search refers to a set of approaches that use Semantic Web technologies for information retrieval, making the process machine-understandable and fetching precise results. Knowledge bases (KBs) act as the backbone of semantic search approaches, providing machine-interpretable information for query processing and retrieval of results. These KBs include Resource Description Framework (RDF) datasets and populated ontologies. This paper presents an assessment of the largest cross-domain KBs that are exploited in large-scale semantic search and are freely available on the Linked Open Data Cloud. Analysis of these datasets is a prerequisite for modeling effective semantic search approaches because it reveals their suitability for particular applications. Only large-scale, cross-domain datasets with more than 10 million RDF triples are considered. The survey reports the size of each dataset in triples, along with the triple data format(s) it supports, which is significant for developing effective semantic search models.


Author(s):  
Iván Cantador ◽  
Pablo Castells ◽  
Alejandro Bellogín

Recommender systems have achieved success in a variety of domains, as a means to help users in information overload scenarios by proactively finding items or services on their behalf, taking into account or predicting their tastes, priorities, or goals. Challenging issues in their research agenda include the sparsity of user preference data and the lack of flexibility to incorporate contextual factors in the recommendation methods. To a significant extent, these issues can be related to a limited description and exploitation of the semantics underlying both user and item representations. The authors propose a three-fold knowledge representation, in which an explicit, semantic-rich domain knowledge space is incorporated between user and item spaces. The enhanced semantics support the development of contextualisation capabilities and enable performance improvements in recommendation methods. As a proof of concept and evaluation testbed, the approach is implemented in a news recommender system and tested with real users. In this scenario, semantic knowledge bases and item annotations are automatically produced from public sources.


2018 ◽  
Author(s):  
Xia Jing ◽  
Nicholas R Hardiker ◽  
Stephen Kay ◽  
Yongsheng Gao

BACKGROUND Ontologies are key enabling technologies for the Semantic Web. The Web Ontology Language (OWL) is a semantic markup language for publishing and sharing ontologies. OBJECTIVE The supply of customizable, computable, and formally represented molecular genetics information and health information, via electronic health record (EHR) interfaces, can play a critical role in achieving precision medicine. In this study, we used cystic fibrosis as an example to build an Ontology-based Knowledge Base prototype on Cystic Fibrosis (OntoKBCF) to supply such information via an EHR prototype. In addition, we elaborate on the construction and representation principles, approaches, applications, and representation challenges that we faced in the construction of OntoKBCF. The principles and approaches can be referenced and applied in constructing other ontology-based domain knowledge bases. METHODS First, we defined the scope of OntoKBCF according to possible clinical information needs about cystic fibrosis on both a molecular level and a clinical phenotype level. We then selected the knowledge sources to be represented in OntoKBCF. We utilized top-to-bottom content analysis and bottom-up construction to build OntoKBCF. Protégé-OWL was used to construct OntoKBCF. The construction principles were (1) to use existing basic terms as much as possible; (2) to use intersection and combination in representations; (3) to represent as many different types of facts as possible; and (4) to provide 2-5 examples for each type. HermiT 1.3.8.413 within Protégé-5.1.0 was used to check the consistency of OntoKBCF. RESULTS OntoKBCF was constructed successfully, with the inclusion of 408 classes, 35 properties, and 113 equivalent classes. OntoKBCF includes both atomic concepts (such as amino acid) and complex concepts (such as “adolescent female cystic fibrosis patient”) and their descriptions. We demonstrated that OntoKBCF could make customizable molecular and health information available automatically and usable via an EHR prototype. The main challenges include providing a more comprehensive account of different patient groups and representing uncertain knowledge, ambiguous concepts, negative statements, and more complicated and detailed molecular mechanisms or pathway information about cystic fibrosis. CONCLUSIONS Although cystic fibrosis is just one example, based on the current structure of OntoKBCF, it should be relatively straightforward to extend the prototype to cover different topics. Moreover, the principles underpinning its development could be reused for building knowledge bases for other human monogenetic diseases.
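
The use of intersection to build complex concepts from atomic ones can be illustrated with a minimal sketch. It uses plain Python sets as a stand-in for OWL class expressions and is not OntoKBCF or Protégé-OWL code; the attribute names and helper functions are hypothetical.

```python
# Hypothetical atomic concepts, each with the patient attributes it requires.
ATOMIC = {
    "adolescent": {("age_group", "adolescent")},
    "female": {("sex", "female")},
    "cystic fibrosis patient": {("diagnosis", "cystic fibrosis")},
}


def intersection(*names):
    """Define a complex concept as the conjunction of atomic concepts.

    Intersecting classes means a member must satisfy every class at once,
    so the requirement sets are combined.
    """
    required = set()
    for name in names:
        required |= ATOMIC[name]
    return frozenset(required)


def is_instance(patient, concept):
    """A patient belongs to a concept when it meets every requirement."""
    return concept <= set(patient.items())


adolescent_female_cf = intersection(
    "adolescent", "female", "cystic fibrosis patient"
)

patient_a = {"age_group": "adolescent", "sex": "female", "diagnosis": "cystic fibrosis"}
patient_b = {"age_group": "adolescent", "sex": "male", "diagnosis": "cystic fibrosis"}
print(is_instance(patient_a, adolescent_female_cf))
print(is_instance(patient_b, adolescent_female_cf))
```

Reusing the atomic definitions keeps each complex concept short and mirrors the first two construction principles listed in the abstract.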


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Ruiqing Yan ◽  
Lanchang Sun ◽  
Fang Wang ◽  
Xiaoming Zhang

Recently, pretrained language models, such as BERT and XLNet, have rapidly advanced the state of the art on many NLP tasks. They can model implicit semantic information between words in the text. However, this modeling happens solely at the token level, without considering background knowledge. Intuitively, background knowledge influences the efficacy of text understanding. Inspired by this, we focus on improving model pretraining by leveraging external knowledge. Unlike recent research that optimizes pretraining models through knowledge masking strategies, we propose a simple but general method to transfer explicit knowledge with pretraining. Specifically, we first match knowledge facts from a knowledge base (KB) and then add a knowledge injection layer to a transformer directly, without changing its architecture. This study seeks to find the direct impact of explicit knowledge on model pretraining. We conduct experiments on 7 datasets using 5 knowledge bases in different downstream tasks. Our investigation reveals promising results in all the tasks. The experiments also verify that domain-specific knowledge is superior to open-domain knowledge in domain-specific tasks, and that different knowledge bases perform differently across tasks.
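
A rough sketch of the fact-matching step is given below. It is a deliberate simplification: matched facts are appended to the input as extra tokens, whereas the paper describes a dedicated layer inside the transformer. The toy knowledge base, the `[KNOW]` marker, and the function names are all hypothetical.

```python
# Hypothetical toy knowledge base of (subject, relation, object) facts.
KB = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "drug"),
    ("paris", "capital_of", "france"),
]


def match_facts(tokens, kb):
    """Return the facts whose subject appears among the input tokens."""
    present = {token.lower() for token in tokens}
    return [fact for fact in kb if fact[0] in present]


def inject(tokens, facts):
    """Append matched facts after the text, leaving the original tokens unchanged.

    This stands in for the injection step; a real model would consume the
    facts through its own layer rather than through plain concatenation.
    """
    extra = []
    for subject, relation, obj in facts:
        extra += ["[KNOW]", subject, relation, obj]
    return tokens + extra


tokens = "Aspirin relieves pain".split()
facts = match_facts(tokens, KB)
print(facts)
print(inject(tokens, facts))
```

Swapping `KB` for a domain-specific or an open-domain fact list is the kind of comparison the abstract reports, although the sketch itself makes no claim about which performs better.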


Semantic Web ◽  
2013 ◽  
pp. 235-269 ◽  
Author(s):  
Iván Cantador ◽  
Pablo Castells ◽  
Alejandro Bellogín

Recommender systems have achieved success in a variety of domains, as a means to help users in information overload scenarios by proactively finding items or services on their behalf, taking into account or predicting their tastes, priorities, or goals. Challenging issues in their research agenda include the sparsity of user preference data and the lack of flexibility to incorporate contextual factors in the recommendation methods. To a significant extent, these issues can be related to a limited description and exploitation of the semantics underlying both user and item representations. The authors propose a three-fold knowledge representation, in which an explicit, semantic-rich domain knowledge space is incorporated between user and item spaces. The enhanced semantics support the development of contextualisation capabilities and enable performance improvements in recommendation methods. As a proof of concept and evaluation testbed, the approach is implemented in a news recommender system and tested with real users. In this scenario, semantic knowledge bases and item annotations are automatically produced from public sources.

