Turning Informal Thesauri into Formal Ontologies: A Feasibility Study on Biomedical Knowledge Re-Use

2003 ◽  
Vol 4 (1) ◽  
pp. 94-97 ◽  
Author(s):  
Udo Hahn

This paper reports a large-scale knowledge conversion and curation experiment. Biomedical domain knowledge from a semantically weak and shallow terminological resource, the UMLS, is transformed into a rigorous description logics format. In this way, the broad coverage of the UMLS is combined with inference mechanisms for consistency and cycle checking. These mechanisms are key to properly cleansing the knowledge imported directly from the UMLS, as well as to the subsequent updating, maintenance and refinement of large knowledge repositories. The emerging biomedical knowledge base currently comprises more than 240,000 conceptual entities and hence constitutes one of the largest formal knowledge repositories ever built.
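
The cycle checking mentioned above can be illustrated with a short, self-contained sketch. The is-a fragment below is invented for illustration (the real UMLS import is far larger); a depth-first search flags any cycle in the imported subsumption hierarchy, since a concept that subsumes itself via a chain of is-a links signals an import error to be cleansed.

```python
# Sketch of cycle checking over an imported is-a hierarchy (hypothetical data;
# the UMLS import itself is not shown). A cycle among concepts makes the
# subsumption hierarchy inconsistent, so the offending chain must be flagged.

def find_cycle(parents):
    """Return one cycle in the is-a graph as a list of concepts, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {c: WHITE for c in parents}
    stack = []

    def visit(c):
        color[c] = GRAY
        stack.append(c)
        for p in parents.get(c, ()):
            if color.get(p, WHITE) == GRAY:      # back edge -> cycle found
                return stack[stack.index(p):] + [p]
            if color.get(p, WHITE) == WHITE:
                found = visit(p)
                if found:
                    return found
        stack.pop()
        color[c] = BLACK
        return None

    for c in list(parents):
        if color[c] == WHITE:
            found = visit(c)
            if found:
                return found
    return None

# Toy fragment with a deliberately erroneous edge closing a cycle.
isa = {
    "Enzyme": ["Protein"],
    "Protein": ["Macromolecule"],
    "Macromolecule": ["Enzyme"],   # bad edge imported from the thesaurus
    "Cell": ["AnatomicalStructure"],
    "AnatomicalStructure": [],
}
print(find_cycle(isa))
```

An acyclic fragment returns `None`, so the same check doubles as a regression test after each curation step.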

2013 ◽  
Vol 22 (01) ◽  
pp. 132-146 ◽  
Author(s):  
L. Jansen ◽  
S. Schulz

Summary Objectives: Medical decision support and other intelligent applications in the life sciences depend on increasing amounts of digital information. Knowledge bases as well as formal ontologies are used to organize biomedical knowledge and data. However, these two kinds of artefacts are not always clearly distinguished. Whereas the popular RDF(S) standard provides an intuitive triple-based representation, it is semantically weak. Description-logics-based ontology languages like OWL DL carry a clear-cut semantics, but they are computationally expensive, and they are often misused to encode all kinds of statements, including those which are not ontological. Method: We distinguish four kinds of statements needed to comprehensively represent domain knowledge: universal statements, terminological statements, statements about particulars and contingent statements. We argue that the task of formal ontologies is solely to represent universal statements, while the non-ontological kinds of statements can nevertheless be connected with ontological representations. To illustrate these four types of representation, we use a running example from parasitology. Results: We formulate recommendations for semantically adequate ontologies that can be used efficiently as a stable framework for more context-dependent biomedical knowledge representation and reasoning applications such as clinical decision support systems.
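
As an illustration of the proposed separation (the stores and example statements below are our own sketch, loosely echoing the paper's parasitology running example, not the authors' formalism), the four statement kinds can be kept in distinct stores so that only universal statements feed the DL reasoner:

```python
# Minimal sketch: one store per statement kind. Only the universal store
# belongs in the formal ontology (TBox); the other three are cross-referenced
# with the ontology's concept identifiers rather than encoded as axioms.

from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    universal: list = field(default_factory=list)      # TBox axioms
    terminological: list = field(default_factory=list) # names, synonyms, codes
    particulars: list = field(default_factory=list)    # ABox facts about individuals
    contingent: list = field(default_factory=list)     # statistical / default knowledge

kb = KnowledgeBase()
kb.universal.append("Plasmodium SubClassOf hasPart some Apicoplast")
kb.terminological.append("'malaria tertiana' synonymOf 'tertian fever'")
kb.particulars.append("patient_32 hostsParasite parasite_p1")
kb.contingent.append("most malaria infections occur in tropical regions")
```

Keeping the stores apart means the expensive DL reasoning runs only over the universal statements, while applications can still join the other kinds in by concept identifier.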


2016 ◽  
Vol 13 (1) ◽  
pp. 287-308 ◽  
Author(s):  
Zhang Tingting ◽  
Liu Xiaoming ◽  
Wang Zhixue ◽  
Dong Qingchao

A number of problems may arise in architectural requirements modeling, including aligning requirements with business strategy, integrating models, and handling uncertain and vague information. This paper introduces an ontology-based, capability-oriented method for eliciting and modeling architectural requirements. Requirements are modeled within a three-layer framework. The Capability Meta-concept Framework is provided at the top level. Domain experts capture domain knowledge within this framework, forming the domain ontology at the second level. The domain concepts can then be used to extend UML into a domain-specific modeling language. A fuzzy UML is introduced to model the vague and uncertain features of capability requirements, and an algorithm is provided to transform fuzzy UML models into a fuzzy Description Logics ontology for model verification. A case study demonstrates the applicability of the method.
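
The flavor of fuzzy DL verification can be sketched in a few lines. The individuals, concepts, degrees, and the use of the Gödel t-norm below are illustrative assumptions, not taken from the paper's transformation algorithm:

```python
# Hedged sketch of graded (fuzzy) instance checking, as used when vague
# capability requirements are turned into fuzzy DL assertions.

fuzzy_abox = {
    # (individual, concept) -> membership degree in [0, 1]
    ("radar_sys_1", "HighMobilityCapability"): 0.8,
    ("radar_sys_1", "LongRangeDetection"): 0.6,
}

def satisfies(individual, concept, threshold):
    """Graded instance check: membership degree >= threshold."""
    return fuzzy_abox.get((individual, concept), 0.0) >= threshold

def degree_and(*degrees):
    # Goedel t-norm: a conjunction holds to the degree of its weakest conjunct.
    return min(degrees)

print(satisfies("radar_sys_1", "HighMobilityCapability", 0.7))  # True
```

Verification then amounts to checking that every required capability holds above its stated threshold, with conjunction degrees combined via the t-norm.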


2020 ◽  
pp. 1814-1825
Author(s):  
Said Fathalla ◽  
Yaman M. Khalid Kannot

The successful application of the semantic web in medical informatics and the rapid expansion of biomedical knowledge have created a need for a standardized representation of knowledge and an efficient algorithm for querying this extensive information. The spreading activation algorithm is well suited to large and incomplete datasets. This article presents a method called SAOO (Spreading Activation over Ontology), which identifies the relatedness between two human diseases by applying a spreading activation algorithm, based on a bidirectional search technique, over a large disease ontology. The proposed methodology has two phases: semantic matching and disease relatedness detection. In semantic matching, the diseases mentioned in the user's query are identified semantically in the ontology. In disease relatedness detection, the URIs of the diseases are passed to the relatedness detector, which returns the set of diseases that may connect them. The proposed method improves on non-semantic medical systems by using semantic domain knowledge to infer disease relatedness.
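
The bidirectional idea can be sketched on a toy graph. The graph, depth bound, and activation scheme below are illustrative assumptions; SAOO itself operates over a large disease ontology via URIs:

```python
# Illustrative sketch: activate outward from both query diseases and collect
# the nodes reached from BOTH directions -- these are the diseases that may
# connect the two queries.

from collections import deque

def related_via(graph, source, target, max_depth=3):
    """Bounded breadth-first activation from both ends; intersect the frontiers."""
    def spread(start):
        reached = {start: 0}
        frontier = deque([start])
        while frontier:
            node = frontier.popleft()
            if reached[node] >= max_depth:
                continue
            for nb in graph.get(node, ()):
                if nb not in reached:
                    reached[nb] = reached[node] + 1
                    frontier.append(nb)
        return reached

    fwd, bwd = spread(source), spread(target)
    return sorted(set(fwd) & set(bwd) - {source, target})

toy = {
    "diabetes": ["obesity", "hypertension"],
    "obesity": ["diabetes", "hypertension", "sleep_apnea"],
    "hypertension": ["diabetes", "obesity", "stroke"],
    "stroke": ["hypertension"],
    "sleep_apnea": ["obesity"],
}
print(related_via(toy, "diabetes", "stroke"))
```

Searching from both endpoints keeps each activation wave shallow, which is what makes the approach tractable on a large ontology.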


2020 ◽  
Vol 36 (13) ◽  
pp. 4097-4098 ◽  
Author(s):  
Anna Breit ◽  
Simon Ott ◽  
Asan Agibetov ◽  
Matthias Samwald

Abstract Summary Recently, novel machine-learning algorithms have shown potential for predicting undiscovered links in biomedical knowledge networks. However, dedicated benchmarks for measuring algorithmic progress have not yet emerged. With OpenBioLink, we introduce a large-scale, high-quality and highly challenging biomedical link prediction benchmark to transparently and reproducibly evaluate such algorithms. Furthermore, we present preliminary baseline evaluation results. Availability and implementation Source code and data are openly available at https://github.com/OpenBioLink/OpenBioLink. Supplementary information Supplementary data are available at Bioinformatics online.
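
Link prediction benchmarks of this kind are typically scored by ranking the true entity among corrupted candidates and reporting mean reciprocal rank (MRR) and hits@k. A minimal sketch (the ranks below are fabricated for illustration, not OpenBioLink results):

```python
# Standard rank-based evaluation for link prediction: for each test triple,
# the model ranks the correct entity among candidates; lower rank is better.

def mrr_and_hits(ranks, k=3):
    """Mean reciprocal rank and fraction of ranks at or below k."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(r <= k for r in ranks) / len(ranks)
    return mrr, hits

# Hypothetical rank of the correct entity for each of four test triples.
ranks = [1, 4, 2, 10]
mrr, hits3 = mrr_and_hits(ranks, k=3)
```

Reporting both metrics matters: MRR rewards getting the answer exactly right, while hits@k credits near misses.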


2011 ◽  
Vol 20 (01) ◽  
pp. 30-32
Author(s):  
P. Ruch ◽  

Summary To summarize current advances of the so-called Web 3.0 and emerging trends of the semantic web, we provide a synopsis of the articles selected for the IMIA Yearbook 2011, from which we attempt to derive a synthetic overview of today's and future activities in the field. While the state of research in the field is illustrated by a set of fairly heterogeneous studies, it is possible to identify significant clusters. While the most salient challenge and overarching ambition of the semantic web remains to simply interconnect all available information, it is interesting to observe the development of complementary research fields such as information science and text analytics. The combined expressive power and virtually unlimited data-aggregation capabilities of Web 3.0 technologies make them a disruptive instrument for discovering new biomedical knowledge. In parallel, this unprecedented situation creates new threats for patients participating in large-scale genetic studies, as Wjst demonstrates how various data sets can be coupled to re-identify anonymous genetic information. The best-paper selection of articles on decision support shows examples of excellent research, both on methods concerning original development of core semantic web techniques and on transdisciplinary achievements exemplified by literature-based analytics. This selected set of scientific investigations also demonstrates the need for computerized applications to transform the biomedical data overflow into more operational clinical knowledge, along with the potential threats to confidentiality directly associated with such advances. Altogether, these papers support the idea that more elaborate computer tools, likely combining heterogeneous text and data contents, should soon emerge for the benefit of experimentalists and, hopefully, clinicians.


2020 ◽  
Vol 52 (1) ◽  
pp. 477-508 ◽  
Author(s):  
Steven L. Brunton ◽  
Bernd R. Noack ◽  
Petros Koumoutsakos

The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from experiments, field measurements, and large-scale simulations at multiple spatiotemporal scales. Machine learning (ML) offers a wealth of techniques to extract information from data that can be translated into knowledge about the underlying fluid mechanics. Moreover, ML algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of the history, current developments, and emerging opportunities of ML for fluid mechanics. We outline fundamental ML methodologies and discuss their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experiments, and simulations. ML provides a powerful information-processing framework that can augment, and possibly even transform, current lines of fluid mechanics research and industrial applications.


2018 ◽  
Vol 11 (4) ◽  
pp. 137-154 ◽  
Author(s):  
Lei Li ◽  
Min Feng ◽  
Lianwen Jin ◽  
Shenjin Chen ◽  
Lihong Ma ◽  
...  

Online services are now commonly deployed via cloud computing, from Infrastructure-as-a-Service (IaaS) to Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). However, workload is not constant over time, so guaranteeing quality of service (QoS) and resource cost-effectiveness, both determined by on-demand workload resource requirements, is a challenging issue. In this article, the authors propose a neural-network-based method termed domain knowledge embedding regularization neural networks (DKRNN) for large-scale workload prediction. Based on an analysis of the statistical properties of a real large-scale workload, domain knowledge, which provides extended information about workload changes, is embedded into artificial neural networks (ANNs) for regression to improve prediction accuracy. Furthermore, noise-based regularization is combined with this to improve the generalization ability of the networks. The experiments demonstrate that the model achieves more accurate workload prediction, provides more adaptive resource provisioning for higher cost-effectiveness, and has less impact on QoS.
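
The core idea of embedding domain knowledge as a regularizer can be sketched without neural-network machinery. The data, prior, and hyperparameters below are invented for illustration and are not the DKRNN architecture itself:

```python
# Stdlib-only sketch: fit a linear predictor while a penalty term pulls the
# weights toward values suggested by domain knowledge (here, a made-up prior
# saying "the most recent load matters most").

def fit(xs, ys, prior, lam=0.1, lr=0.01, steps=2000):
    w = [0.0] * len(prior)
    n = len(xs)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n
        for j in range(len(w)):
            grad[j] += 2 * lam * (w[j] - prior[j])   # domain-knowledge penalty
            w[j] -= lr * grad[j]
    return w

# Features: [load one step ago, load two steps ago]; target ~ 0.9*x1 + 0.1*x2.
xs = [(1.0, 0.5), (0.8, 1.0), (1.2, 0.9), (0.7, 0.6)]
ys = [0.9 * a + 0.1 * b for a, b in xs]
w = fit(xs, ys, prior=[0.9, 0.1])
```

With little data, the prior keeps the weights near plausible values; as data accumulates, the squared-error term dominates and the fit follows the observations.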


Author(s):  
Jakub Flotyński

Abstract The main element of extended reality (XR) environments is behavior-rich 3D content consisting of objects that act and interact with one another as well as with users. Such actions and interactions constitute the evolution of the content over time. Multiple application domains of XR, e.g., education, training, marketing, merchandising, and design, could benefit from the analysis of 3D content changes based on general or domain knowledge comprehensible to average users or domain experts. Such analysis can be intended, in particular, to monitor, comprehend, examine, and control XR environments as well as users’ skills, experience, interests and preferences, and XR objects’ features. However, it is difficult to achieve as long as XR environments are developed with methods and tools that focus on programming and 3D modeling rather than expressing domain knowledge accompanying content users and objects, and their behavior. The main contribution of this paper is an approach to creating explorable knowledge-based XR environments with semantic annotations. The approach combines description logics with aspect-oriented programming, which enables knowledge representation in an arbitrary domain as well as transformation of available environments with minimal users’ effort. We have implemented the approach using well-established development tools and exemplify it with an explorable immersive car showroom. The approach enables efficient creation of explorable XR environments and knowledge acquisition from XR.
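
A hedged Python analogue of the weaving idea (the paper's implementation uses its own aspect-oriented stack; the class, method, and triple format below are invented): a decorator intercepts each behavior invocation and records it as a triple that can later be queried as domain knowledge.

```python
# Sketch: weave knowledge capture into XR object behavior without touching the
# behavior code itself -- each decorated method call is logged as a triple.

knowledge = []   # (subject, predicate, object) triples describing behavior

def explorable(method):
    def wrapper(self, *args):
        knowledge.append((self.name, method.__name__, args))
        return method(self, *args)
    return wrapper

class CarModel:
    def __init__(self, name):
        self.name = name

    @explorable
    def open_door(self, door):
        return f"{door} door opened"

car = CarModel("showroom_car_1")
car.open_door("left")
```

Because the capture lives in the aspect (here, the decorator), existing environments can be made explorable with minimal changes to their behavior code, which is the transformation-with-minimal-effort point the paper makes.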


Author(s):  
Nicola Fanizzi

This paper presents an approach to ontology construction pursued through the induction of concept descriptions expressed in Description Logics. The author surveys the theoretical foundations of the standard representations for formal ontologies in the Semantic Web. After stating the learning problem in this particular context, a FOIL-like algorithm is presented that can be applied to learn DL concept descriptions. The algorithm searches a space of candidate concept definitions by means of refinement operators, guided by heuristics based on the available examples. The author discusses related theoretical aspects of learning under the inherent incompleteness underlying the semantics of this representation. The experimental evaluation of the system DL-Foil, which implements the learning algorithm, was carried out in two series of sessions on real ontologies from standard repositories, covering different domains and expressed in diverse description logics.
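
The FOIL-style search can be caricatured in propositional form. The toy features and greedy gain score below are our own simplification; DL-Foil operates on description logic concepts via proper refinement operators, not feature sets:

```python
# Toy specialization loop: start from the most general concept (no constraints)
# and greedily add the feature that best separates positives from negatives,
# until no negative example is still covered.

def covers(concept, example):
    return concept.issubset(example)   # concept as a set of required features

def learn(positives, negatives, features):
    concept = set()
    candidates = sorted(features)
    while candidates and any(covers(concept, n) for n in negatives):
        best = max(
            candidates,
            key=lambda f: sum(f in p for p in positives)
                        - sum(f in n for n in negatives),
        )
        concept.add(best)
        candidates.remove(best)
    return concept

pos = [{"has_wing", "lays_eggs", "has_beak"}, {"has_wing", "has_beak"}]
neg = [{"lays_eggs"}, {"has_fur"}]
features = {"has_wing", "lays_eggs", "has_beak", "has_fur"}
print(learn(pos, neg, features))
```

The downward direction (general to specific) mirrors the refinement operators in the paper: each step shrinks the set of covered individuals.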


AI Magazine ◽  
2016 ◽  
Vol 37 (2) ◽  
pp. 19-32 ◽  
Author(s):  
Sasin Janpuangtong ◽  
Dylan A. Shell

The infrastructure and tools necessary for large-scale data analytics, formerly the exclusive purview of experts, are increasingly available. Whereas a knowledgeable data-miner or domain expert can rightly be expected to exercise caution when required (for example, around fallacious conclusions supposedly supported by the data), the nonexpert may benefit from some judicious assistance. This article describes an end-to-end learning framework that allows a novice to create models from data easily by helping structure the model building process and capturing extended aspects of domain knowledge. By treating the whole modeling process interactively and exploiting high-level knowledge in the form of an ontology, the framework is able to aid the user in a number of ways, including in helping to avoid pitfalls such as data dredging. Prudence must be exercised to avoid these hazards as certain conclusions may only be supported if, for example, there is extra knowledge which gives reason to trust a narrower set of hypotheses. This article adopts the solution of using higher-level knowledge to allow this sort of domain knowledge to be used automatically, selecting relevant input attributes, and thence constraining the hypothesis space. We describe how the framework automatically exploits structured knowledge in an ontology to identify relevant concepts, and how a data extraction component can make use of online data sources to find measurements of those concepts so that their relevance can be evaluated. To validate our approach, models of four different problem domains were built using our implementation of the framework. Prediction error on unseen examples of these models show that our framework, making use of the ontology, helps to improve model generalization.
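
The attribute-selection step described above can be sketched as a bounded walk over an ontology graph. The ontology fragment, hop bound, and attribute names below are invented for illustration, not the framework's actual knowledge source:

```python
# Sketch: keep only candidate input attributes whose concept is linked to the
# target concept within a small number of hops in the ontology, so unrelated
# attributes (a data-dredging hazard) never enter the hypothesis space.

related = {
    "obesity": {"diet", "exercise"},
    "diet": {"calorie_intake"},
    "exercise": set(),
    "stock_price": set(),
}

def relevant(target, attrs, max_hops=2):
    reached, frontier = {target}, {target}
    for _ in range(max_hops):
        frontier = {n for c in frontier for n in related.get(c, set())} - reached
        reached |= frontier
    return [a for a in attrs if a in reached]

attrs = ["diet", "calorie_intake", "stock_price"]
print(relevant("obesity", attrs))
```

Here `stock_price` is excluded despite any spurious correlation it might show in the data, which is exactly the kind of constraint on the hypothesis space the article argues for.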

