Semantic Similarity Measures Applied to an Ontology for Human-Like Interaction

Mining information from sentences through Semantic Web data and Information Extraction tasks

Journal of Information Science ◽

10.1177/0165551520934387 ◽

2020 ◽

pp. 016555152093438

Author(s):

Jose L. Martinez-Rodriguez ◽

Ivan Lopez-Arevalo ◽

Ana B. Rios-Alvarado

Keyword(s):

Semantic Web ◽

Natural Language ◽

Information Extraction ◽

Knowledge Base ◽

Semantic Similarity ◽

Similarity Measure ◽

Real World ◽

Semantic Similarity Measure ◽

Web Standards ◽

Extract Information

The Semantic Web provides guidelines for representing information about real-world objects (entities) and their relations (properties). This is helpful for the dissemination and consumption of information by people and applications. However, the information is mainly contained within natural language sentences, which do not have a structure or linguistic descriptions ready to be directly processed by computers. The challenge is therefore to identify and extract the elements of information that can be represented. This article presents a strategy to extract information from sentences and represent it with Semantic Web standards. Our strategy involves Information Extraction tasks and a hybrid semantic similarity measure to get entities and relations that are later associated with individuals and properties from a Knowledge Base to create RDF triples (Subject–Predicate–Object structures). The experiments demonstrate the feasibility of our method and show that it is more accurate than a pattern-based method from the literature.
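
The last step of the abstract above (associating an extracted relation with a Knowledge Base property and emitting an RDF triple) can be illustrated with a minimal sketch. The namespace, the property labels, and the example sentence are hypothetical, and a plain token-overlap (Jaccard) score stands in for the hybrid semantic similarity measure, which the abstract does not specify.

```python
# Hypothetical namespace and Knowledge Base property labels, for illustration only.
EX = "http://example.org/"
KB_PROPERTIES = {
    "birthPlace": "birth place",
    "deathPlace": "death place",
    "spouse": "spouse",
}


def jaccard(a: set, b: set) -> float:
    """Token-overlap score; a stand-in for the paper's hybrid measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_property(relation_phrase: str) -> str:
    """Pick the KB property whose label overlaps most with the extracted phrase."""
    words = set(relation_phrase.lower().split())
    return max(
        KB_PROPERTIES,
        key=lambda prop: jaccard(words, set(KB_PROPERTIES[prop].split())),
    )


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one Subject-Predicate-Object triple as an N-Triples line."""
    return f"<{EX}{subject}> <{EX}{predicate}> <{EX}{obj}> ."


# Entities and relation phrase assumed to be already extracted from a sentence.
prop = best_property("place of birth")
triple = to_ntriple("Marie_Curie", prop, "Warsaw")
print(triple)
```

A real pipeline would resolve entities against Knowledge Base individuals in the same way, rather than building IRIs directly from surface strings.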

A New Semantic Similarity Measure Based On Ontology for Movie Rate Prediction

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4442.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 6756-6762

Keyword(s):

Semantic Similarity ◽

Similarity Measure ◽

Experimental Evaluation ◽

Pearson Correlation ◽

Similarity Measures ◽

Similarity Score ◽

Cosine Similarity ◽

Semantic Similarity Measure ◽

Rate Prediction ◽

Target User

A recommendation algorithm comprises two important steps: 1) predicting rates, and 2) recommendation. Rate prediction is a cumulative function of the similarity score between two movies and the rating history of those movies by other users. There are various methods for rate prediction, such as the weighted sum method, regression, and deviation-based methods. All these methods rely on finding items similar to those previously viewed or rated by the target user, on the assumption that a user tends to rate similar items similarly. Similarities can be computed using various measures, such as Euclidean Distance, Cosine Similarity, Adjusted Cosine Similarity, Pearson Correlation, and Jaccard Similarity. All of these well-known approaches calculate the similarity score between two movies using simple rating-based data. Hence, such similarity measures cannot accurately model the rating behavior of a user. In this paper, we show that the accuracy of rate prediction can be enhanced by incorporating ontological domain knowledge in the similarity computation. This paper introduces a new ontological semantic similarity measure between two movies. For experimental evaluation, the performance of the proposed approach is compared with two existing approaches, 1) Adjusted Cosine Similarity (ACS) and 2) the Weighted Slope One (WSO) algorithm, in terms of two performance measures: 1) execution time and 2) Mean Absolute Error (MAE). The open-source MovieLens (ml-1m) dataset is used for experimental evaluation. As our results show, the ontological semantic similarity measure enhances the performance of rate prediction compared to the existing well-known approaches.
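
As a rough illustration of the baseline pipeline named in this abstract, the sketch below computes Adjusted Cosine Similarity between items, a weighted-sum rate prediction, and MAE. The toy ratings are invented, and restricting the weighted sum to positively similar items is an assumption made here; the ontological measure proposed in the paper is not reproduced.

```python
from math import sqrt


def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    Each rating is centered on the rating user's own mean, and only
    users who rated both items contribute.
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di, dj = user_ratings[i] - mean, user_ratings[j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (sqrt(den_i) * sqrt(den_j))


def predict(ratings, user_ratings, target):
    """Weighted-sum prediction over positively similar items (an assumption)."""
    num = den = 0.0
    for item, rating in user_ratings.items():
        if item == target:
            continue
        sim = adjusted_cosine(ratings, target, item)
        if sim > 0:
            num += sim * rating
            den += sim
    return num / den if den else None


def mae(predicted, actual):
    """Mean Absolute Error between two equally long rating lists."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Invented toy data: user -> {movie: rating}.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
}
sim_ab = adjusted_cosine(ratings, "A", "B")
sim_ac = adjusted_cosine(ratings, "A", "C")
predicted_a = predict(ratings, {"B": 4, "C": 2}, "A")
```

In this toy data movie A is positively similar to B and negatively similar to C, so only the rating for B contributes to the prediction for A.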

Methods for a similarity measure for clinical attributes based on survival data analysis

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0917-6 ◽

2019 ◽

Vol 19 (1) ◽

Author(s):

Christian Karmen ◽

Matthias Gietzelt ◽

Petra Knaup-Gregori ◽

Matthias Ganzinger

Keyword(s):

Overall Survival ◽

Clinical Trials ◽

Similarity Measure ◽

Survival Data ◽

Similarity Measures ◽

Case Based Reasoning ◽

Local Similarity ◽

Survival Functions ◽

Global Similarity ◽

Case Based

Background: Case-based reasoning is a proven method that relies on learned cases from the past for decision support of a new case. The accuracy of such a system depends on the applied similarity measure, which quantifies the similarity between two cases. This work proposes a collection of methods for similarity measures, especially for the comparison of clinical cases based on survival data, as available for example from clinical trials.
Methods: Our approach is intended for scenarios where it is of interest to use longitudinal data, such as survival data, for a case-based reasoning approach. This might be especially important where uncertainty about the ideal therapy decision exists. The collection of methods consists of definitions of the local similarity of nominal as well as numeric attributes, a calculation of attribute weights, a feature selection method and finally a global similarity measure. All of them use survival time (consisting of survival status and overall survival) as a reference of similarity. As a baseline, we calculate a survival function for each value of any given clinical attribute.
Results: We define the similarity between values of the same attribute by putting the estimated survival functions in relation to each other. Finally, we quantify the similarity by determining the area between corresponding curves of survival functions. The proposed global similarity measure is designed especially for cases from randomized clinical trials or other collections of clinical data with survival information. Overall survival can be considered an eligible alternative solution for similarity calculations. It is especially useful when similarity measures that depend on the classic solution-describing attribute “applied therapy” are not applicable. This is often the case for data from clinical trials containing randomized arms.
Conclusions: In silico evaluation scenarios showed that the mean accuracy of biomarker detection in the k = 10 most similar cases is higher (0.909–0.998) than for competing similarity measures, such as the Heterogeneous Euclidean-Overlap Metric (0.657–0.831) and the Discretized Value Difference Metric (0.535–0.671). The weight calculation method showed a more than six times (6.59–6.95) higher weight for biomarker attributes than for non-biomarker attributes. These results suggest that the similarity measure described here is suitable for applications based on survival data.
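
The idea of comparing attribute values through the area between their survival curves can be sketched as follows. This is an illustration only: it assumes the survival functions are Kaplan-Meier estimates, uses invented follow-up data, and integrates up to a fixed horizon chosen here, none of which is confirmed by the abstract.

```python
def kaplan_meier(samples):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    samples: list of (time, event), event = 1 for death, 0 for censored.
    """
    steps = []
    surv = 1.0
    for t in sorted({time for time, event in samples if event}):
        at_risk = sum(1 for time, _ in samples if time >= t)
        deaths = sum(1 for time, event in samples if time == t and event)
        surv *= 1 - deaths / at_risk
        steps.append((t, surv))
    return steps


def survival_at(steps, t):
    """Value of the right-continuous step function at time t."""
    value = 1.0
    for time, surv in steps:
        if time <= t:
            value = surv
        else:
            break
    return value


def area_between(steps_a, steps_b, horizon):
    """Area between two step curves on the interval [0, horizon]."""
    cuts = sorted({0.0, horizon} | {t for t, _ in steps_a + steps_b if t < horizon})
    area = 0.0
    for left, right in zip(cuts, cuts[1:]):
        gap = abs(survival_at(steps_a, left) - survival_at(steps_b, left))
        area += gap * (right - left)
    return area


# Invented follow-up data for two values of one clinical attribute.
group_a = [(2, 1), (4, 1), (6, 1), (8, 1)]
group_b = [(2, 1), (4, 0), (6, 1), (8, 0)]
km_a, km_b = kaplan_meier(group_a), kaplan_meier(group_b)
area = area_between(km_a, km_b, horizon=8)
```

A smaller area means the two attribute values are associated with more similar survival; how the area is normalized into a local similarity is left open here.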

EMD Based Semantic User Similarity using Past Travel Histories

Journal of Cases on Information Technology ◽

10.4018/jcit.20220801oa04 ◽

2022 ◽

Vol 24 (3) ◽

pp. 0-0

Keyword(s):

Information Retrieval ◽

Mobile Devices ◽

Semantic Similarity ◽

Similarity Measure ◽

Similarity Measures ◽

Cost Effective ◽

Semantic Similarity Measure ◽

User Similarity ◽

Percentage Improvement ◽

The Cost

The cost-effective and easy availability of handheld mobile devices and the ubiquity of location acquisition services such as GPS and GSM networks have enabled convenient logging and sharing of the location histories of mobile users. This work aims to find semantic user similarity using users' past travel histories. The semantic similarity measure can be applied in tourism-related recommender systems and information retrieval. The paper presents an Earth Mover's Distance (EMD) based semantic user similarity measure using users' GPS logs. The similarity measure is applied and evaluated on the GPS dataset of 182 users collected from April 2007 to August 2012 by Microsoft's GeoLife project. The proposed similarity measure is compared with conventional similarity measures used in the literature, such as Jaccard, Dice, and Pearson's correlation. The percentage improvement of the EMD-based approach over existing approaches is 10.70% in terms of average RMSE and 5.73% in terms of average MAE.
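
To make the comparison in this abstract concrete, the sketch below contrasts the set-based Jaccard and Dice baselines with a one-dimensional Earth Mover's Distance. It assumes each user is summarized as a histogram over the same ordered bins with unit distance between adjacent bins; the paper's actual ground distance between semantic locations is not given in the abstract, and the example data are invented.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two histograms of equal total mass on the same ordered bins.

    With unit distance between adjacent bins, the EMD equals the sum of
    absolute differences between the two cumulative histograms.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def jaccard(a: set, b: set) -> float:
    """Shared elements divided by all distinct elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """Twice the shared elements divided by the sum of both set sizes."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Invented visit-frequency histograms for two users over three ordered bins.
user_p = [0.5, 0.5, 0.0]
user_q = [0.5, 0.0, 0.5]
distance = emd_1d(user_p, user_q)

# Set-based baselines only see which location types were visited at all.
visited_p = {"park", "museum"}
visited_q = {"museum", "cafe"}
```

Unlike the set-based baselines, the EMD takes into account how far mass has to move between bins, so two users who visit nearby categories come out closer than two who visit distant ones.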

EMD-Based Semantic User Similarity Using Past Travel Histories

Journal of Cases on Information Technology ◽

10.4018/jcit.20220701.oa2 ◽

2022 ◽

Vol 24 (3) ◽

pp. 1-17

Author(s):

Sunita Tiwari ◽

Saroj Kaushik

Keyword(s):

Information Retrieval ◽

Mobile Devices ◽

Semantic Similarity ◽

Similarity Measure ◽

Similarity Measures ◽

Cost Effective ◽

Semantic Similarity Measure ◽

User Similarity ◽

Percentage Improvement ◽

The Cost

Faculty Opinions recommendation of Exploiting disjointness axioms to improve semantic similarity measures.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.722317980.793528331 ◽

2017 ◽

Author(s):

Sebastian Köhler

Keyword(s):

Semantic Similarity ◽

Similarity Measures

A Semantic Similarity Measure between Ontological Concepts

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2012.00229 ◽

2012 ◽

Vol 38 (2) ◽

pp. 229-235 ◽

Cited By ~ 3

Author(s):

Wen-Qing LI ◽

Xin SUN ◽

Chang-You ZHANG ◽

Ye FENG

Keyword(s):

Semantic Similarity ◽

Similarity Measure ◽

Semantic Similarity Measure

MATHURA (MBI) - A NOVEL IMPUTATION MEASURE FOR IMPUTATION OF MISSING VALUES IN MEDICAL DATASETS

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666191216123352 ◽

2019 ◽

Vol 13 ◽

Author(s):

B. Mathura Bai ◽

N. Mangathayaru ◽

B. Padmaja Rani ◽

Shadi Aljawarneh

Keyword(s):

Similarity Measure ◽

Medical Records ◽

Missing Values ◽

Similarity Measures ◽

Common Problems ◽

Experiment Analysis

Missing attribute values are one of the most common problems faced when mining medical datasets. Estimating missing values is a major challenge in the pre-processing of datasets. Any wrong estimate of missing attribute values can lead to inefficient and improper classification, resulting in lower classifier accuracies. Similarity measures play a key role during the imputation process. The use of an appropriate similarity measure can help to achieve better imputation and improved classification accuracies. This paper proposes a novel imputation measure for finding similarity between missing and non-missing instances in medical datasets. Experiments are carried out by applying both the proposed imputation technique and popular existing benchmark imputation techniques. Classification is carried out using KNN, J48, SMO and RBFN classifiers. Experimental analysis showed that, after imputation of medical records using the proposed imputation technique, the classification accuracies reported by the KNN, J48 and SMO classifiers improved compared to the other existing benchmark imputation techniques.
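
The role a similarity measure plays in imputation can be sketched with a generic nearest-neighbor scheme. This is not the measure proposed in the paper, which the abstract does not define: a root-mean-square distance over jointly observed attributes is used as a stand-in, and the records are invented.

```python
from math import sqrt


def distance(a, b):
    """Root-mean-square difference over attributes observed in both records.

    Missing values are represented as None. This distance is a stand-in,
    not the imputation measure proposed in the paper.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def impute(record, complete_records):
    """Fill missing attributes from the most similar complete record."""
    nearest = min(complete_records, key=lambda r: distance(record, r))
    return [n if v is None else v for v, n in zip(record, nearest)]


# Invented numeric records; None marks a missing attribute value.
complete = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
filled = impute([1.2, None, 2.9], complete)
```

Swapping in a different `distance` function changes which complete record is treated as the nearest neighbor, which is where a better similarity measure would affect the imputed values.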
