Keyword Query Expansion on Linked Data Using Linguistic and Semantic Features

Author(s):  
Saeedeh Shekarpour ◽  
Konrad Hoffner ◽  
Jens Lehmann ◽  
Soren Auer
Author(s):  
VELISLAVA STOYKOVA ◽  
DANIELA MAJCHRAKOVA

The paper presents results of applying a statistical approach to Slovak-to-Bulgarian machine translation. It uses search techniques inspired by Information Retrieval and employs several algorithmic steps of parallel statistical search with query expansion in the Slovak-Bulgarian EUROPARL 7 Corpus, using the Sketch Engine software and its scoring. The search includes the generation of concordances, collocations, word sketch differences, word sketches, and thesauri of the studied keyword (query) using statistical scoring, which is regarded as an intermediate (inter-lingual) semantic standard presentation by means of which the studied keyword (from the source language) is mapped together with its possible translation equivalents (onto the target language). The results present a study of adjectival collocability in both Slovak and Bulgarian, drawn from a corpus of political speech texts, and outline the standard semantic relations based on the evaluation of statistical scoring. Finally, the advantages and shortcomings of the approach are discussed.


2021 ◽  
pp. 563-579
Author(s):  
Evan W. Patton ◽  
William Van Woensel ◽  
Oshani Seneviratne ◽  
Giuseppe Loseto ◽  
Floriano Scioscia ◽  
...  

Author(s):  
Isabelle Augenstein ◽  
Anna Lisa Gentile ◽  
Barry Norton ◽  
Ziqi Zhang ◽  
Fabio Ciravegna

2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Yingqi Wang ◽  
Nianbin Wang ◽  
Lianke Zhou

Due to the ambiguity and imprecision of keyword queries in relational databases, research on keyword query expansion has attracted wide attention. Existing query expansion methods expose users’ query intentions to a certain extent, but most of them cannot balance precision and recall. To address this problem, a novel two-step query expansion approach is proposed based on query recommendation and query interpretation. First, a probabilistic recommendation algorithm is put forward by constructing a term similarity matrix and a Viterbi model. Second, by using a translation algorithm for triples and a construction algorithm for query subgraphs, query keywords are translated into query subgraphs with structural and semantic information. Finally, experimental results on a real-world dataset demonstrate the effectiveness and rationality of the proposed method.


Author(s):  
Jordi Armengol-Estapé ◽  
Marta R. Costa-jussà

Abstract: Introducing factors such as linguistic features has long been proposed in machine translation to improve the quality of translations. More recently, factored machine translation has proven to still be useful in the case of sequence-to-sequence systems. In this work, we investigate whether these gains hold in the case of the state-of-the-art architecture in neural machine translation, the Transformer, instead of recurrent architectures. We propose a new model, the Factored Transformer, to introduce an arbitrary number of word features in the source sequence in an attentional system. Specifically, we suggest two variants depending on the level at which the features are injected. Moreover, we suggest two combination mechanisms for the word features and the words themselves. We experiment both with classical linguistic features and with semantic features extracted from a linked data database, on two low-resource datasets. With the best-found configuration, we show improvements of 0.8 BLEU over the baseline Transformer in the IWSLT German-to-English task. Moreover, we experiment with the more challenging FLoRes English-to-Nepali benchmark, which includes both low-resource and very distant languages, and obtain an improvement of 1.2 BLEU. These improvements are achieved with linguistic and not with semantic information.


Author(s):  
Sarah Dahir ◽  
Abderrahim El Qadi ◽  
Hamid Bennis

Information Retrieval (IR) in the medical domain is considered a challenging task for many reasons. Short health queries tend to lack information on the user's intent, and the target corpus may not have sufficient information for relevance feedback. Even if users obtain documents relevant to their queries, they may find the technical terms difficult to understand. In this paper, we propose an approach for reformulating health queries based on graph matching between two external linked data sources: DBpedia and the Unified Medical Language System (UMLS). DBpedia has broad topic coverage and less noise than Wikipedia articles, and UMLS is specific to the medical domain. We also introduce degree centrality to measure graph connectivity and to select the most effective candidate terms for query expansion. Experimental results on the MEDLINE collection using Okapi BM25 as the retrieval model showed that our approach outperformed related methods, and the two sources achieved very good retrieval results. They helped diversify the retrieved documents and improve recall.
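
As an illustration of the degree-centrality idea in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes the candidate terms and their links have already been collected into an undirected graph; the toy edge list, the function names `degree_centrality` and `select_expansion_terms`, and the cutoff `k` are all hypothetical. Degree centrality is computed in the usual way, as a node's degree divided by the number of other nodes.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality of each node in an undirected graph.

    Centrality = (number of distinct neighbors) / (n - 1),
    where n is the number of nodes in the graph.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def select_expansion_terms(query_terms, edges, k=3):
    """Pick the k best-connected terms that are not already in the query."""
    scores = degree_centrality(edges)
    candidates = [(t, s) for t, s in scores.items() if t not in query_terms]
    # Highest centrality first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda ts: (-ts[1], ts[0]))
    return [t for t, _ in candidates[:k]]


# Hypothetical toy graph; a real one would come from the linked data sources.
EDGES = [
    ("heart attack", "myocardial infarction"),
    ("heart attack", "chest pain"),
    ("myocardial infarction", "chest pain"),
    ("myocardial infarction", "coronary artery disease"),
    ("myocardial infarction", "troponin"),
    ("chest pain", "angina"),
]

expanded = select_expansion_terms({"heart attack"}, EDGES, k=2)
print(expanded)
```

In this toy graph, the two selected terms are the ones linked to the most other terms, which is the sense in which centrality is used here to rank expansion candidates.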


2009 ◽  
Vol 2 (1) ◽  
pp. 121-132 ◽  
Author(s):  
Nikos Sarkas ◽  
Nilesh Bansal ◽  
Gautam Das ◽  
Nick Koudas
