Semantic based Information Retrieval System by using WSD and DICE Coefficient

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207259 ◽

2020 ◽

pp. 274-279

Author(s):

Prof Thwe ◽

Thi Thi Tun ◽

Ohnmar Aung

Keyword(s):

Information Retrieval ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Relevant Information ◽

Word Sense ◽

Improve Performance ◽

Lexical Resource ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Similarity Method

Word sense disambiguation (WSD) is an important technique in many NLP applications, such as machine translation, content analysis and information retrieval. In an information retrieval (IR) system, ambiguous words have a damaging effect on precision. In this situation, WSD is useful for automatically identifying the correct meaning of an ambiguous word. Therefore, this system proposes a word sense disambiguation algorithm to increase the precision of the IR system. The system disambiguates the meaning of each keyword in the query and, with the help of glosses, adds conceptually related words as additional semantics. It uses WordNet as the lexical resource that encodes the concepts of each term. The senses provided by the WSD algorithm are used as semantics for indexing the documents to improve the performance of the IR system. Using keywords and senses, the system retrieves relevant information according to the Dice similarity method.
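
The Dice similarity step can be illustrated with a short sketch. The abstract does not state which formulation the authors use, so the common set-based form, Dice(A, B) = 2|A ∩ B| / (|A| + |B|), is assumed here. The function name, the sample query and the sample documents are hypothetical.

```python
def dice_coefficient(query_terms, doc_terms):
    """Set-based Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(query_terms), set(doc_terms)
    if not a and not b:
        return 0.0  # assumption: two empty term sets are treated as dissimilar
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical query: the keyword "bank" expanded with sense-related words.
query = ["bank", "financial", "institution", "deposit"]
docs = {
    "d1": ["bank", "deposit", "loan", "interest"],
    "d2": ["river", "bank", "water", "slope"],
}

# Rank documents by their Dice score against the expanded query.
ranked = sorted(docs, key=lambda d: dice_coefficient(query, docs[d]), reverse=True)
```

Because the expanded query carries sense-related words, the document about the financial sense shares more terms with it and ranks above the document about the river sense.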


A Word Sense Disambiguation Method Based on Reconstruction of Context by Correlation

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.135-136.160 ◽

2011 ◽

Vol 135-136 ◽

pp. 160-166 ◽

Cited By ~ 1

Author(s):

Xin Hua Fan ◽

Bing Jun Zhang ◽

Dong Zhou

Keyword(s):

Information Entropy ◽

Average Distance ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Experimental Results ◽

Occurrence Frequency ◽

Word Sense ◽

Occurrence Data ◽

Ambiguous Words ◽

Sense Disambiguation

This paper presents a word sense disambiguation method that reconstructs the context using the correlation between words. First, we compute the relevance between words from statistical quantities drawn from the corpus (co-occurrence frequency, average distance and information entropy). Second, we treat the context words that have a larger correlation value with the ambiguous word than the other words as the important words, and use them to reconstruct the context; the reconstructed context then serves as the new context of the ambiguous word. Finally, we use the sememe co-occurrence data method [10] for word sense disambiguation. The experimental results demonstrate the feasibility of this method.
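
Two of the statistics named above, co-occurrence frequency and average distance, can be sketched as follows. The abstract does not say how the paper combines them with information entropy into one correlation value, so this sketch only computes the two raw quantities per context word. The function name, the sentence-level window and the toy corpus are assumptions.

```python
from collections import defaultdict

def cooccurrence_stats(sentences, target):
    """For each word that shares a sentence with `target`, return
    (number of co-occurring tokens, average distance to the nearest `target`)."""
    counts = defaultdict(int)
    dist_sums = defaultdict(int)
    for tokens in sentences:
        target_positions = [i for i, t in enumerate(tokens) if t == target]
        if not target_positions:
            continue  # the target does not occur in this sentence
        for j, word in enumerate(tokens):
            if word == target:
                continue
            counts[word] += 1
            # Token distance to the nearest occurrence of the target word.
            dist_sums[word] += min(abs(j - i) for i in target_positions)
    return {w: (counts[w], dist_sums[w] / counts[w]) for w in counts}

# Toy corpus of tokenized sentences (hypothetical).
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["she", "sat", "on", "the", "river", "bank"],
    ["the", "bank", "raised", "its", "loan", "rates"],
]
stats = cooccurrence_stats(corpus, "bank")
```

A context word that co-occurs often and sits close to the ambiguous word would be a natural candidate for the reconstructed context, although the actual selection rule is the paper's own.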


Query Expansion using Semantic Network for Information Retrieval in Telugu Language

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1586.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 874-877

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Query Expansion ◽

Retrieval System ◽

Semantic Network ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Information Retrieval System ◽

Word Sense ◽

Sense Disambiguation

Nowadays digital documents play a major role in all areas and on the web, as nearly all information is digitised. Search engines use queries to retrieve information. The query plays a major role in an information retrieval system; as a result, both relevant and non-relevant documents are retrieved. Query expansion techniques can improve the performance of the information retrieval system. Our proposed query expansion technique uses Word Sense Disambiguation (WSD) to find the correct sense of an ambiguous word in the regional Telugu language. In query expansion, if the added query term is an ambiguous word, the accuracy of the retrieved documents will be very low. To avoid this, the proposed method uses WSD, which is related to Natural Language Processing (NLP) and Artificial Intelligence (AI). WSD improves the accuracy of the information retrieval system.


Word Sense Disambiguation for Improving the Quality of Machine Translation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.981.153 ◽

2014 ◽

Vol 981 ◽

pp. 153-156

Author(s):

Chun Xiang Zhang ◽

Long Deng ◽

Xue Yao Gao ◽

Li Li Guo

Keyword(s):

Machine Translation ◽

Language Processing ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Translation System ◽

Word Sense ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Machine Translation System

Word sense disambiguation is key to many application problems in natural language processing. In this paper, a dedicated word sense disambiguation classifier is introduced into a machine translation system in order to improve the quality of the output translation. First, the translation of the ambiguous word is deleted from the machine translation of the Chinese sentence. Second, the ambiguous word is disambiguated, where the classification labels are the translations of the ambiguous word. Third, these two translations are combined. Fifty Chinese sentences containing ambiguous words are collected for test experiments. Experimental results show that the translation quality improves after the proposed method is applied.


Developing Corpora using Wikipedia and Word2vec for Word Sense Disambiguation

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v12.i3.pp1239-1246 ◽

2018 ◽

Vol 12 (3) ◽

pp. 1239

Author(s):

Farza Nurifan ◽

Riyanarto Sarno ◽

Cahyaningtyas Sekar Wahyuni

Keyword(s):

Semantic Similarity ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Anaphora Resolution ◽

Word Sense ◽

Accuracy Rate ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Improve Accuracy ◽

Research Show

Word Sense Disambiguation (WSD) is one of the most difficult problems in the artificial intelligence field, known as AI-hard or AI-complete. Many problems can be addressed using word sense disambiguation approaches, such as sentiment analysis, machine translation, search engine relevance, coherence, anaphora resolution, and inference. In this paper, we address the WSD problem with two small corpora. We propose the use of Word2vec and Wikipedia to develop the corpora. After developing the corpora, we measure the similarity between the sentence and the corpora using cosine similarity to determine the meaning of the ambiguous word. Lastly, to improve accuracy, we use Lesk algorithms and Wu-Palmer similarity to handle cases where no word from the sentence appears in the corpora (we call this semantic similarity). Our results show an 86.94% accuracy rate, and the semantic similarity improves the accuracy rate by 12.96% in determining the meaning of ambiguous words.
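
The cosine-similarity step can be sketched as below. The paper builds its corpora with Word2vec and Wikipedia; this sketch swaps in plain term-count vectors as a stand-in so that it runs without a trained model, and it is not the authors' pipeline. The function name, the sense labels and the tiny per-sense corpora are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty token list has no direction to compare
    return dot / (norm_a * norm_b)

# One small corpus per candidate sense of "bank" (hypothetical contents).
sense_corpora = {
    "bank/finance": ["money", "loan", "deposit", "account"],
    "bank/river": ["river", "water", "shore", "slope"],
}
sentence = ["she", "opened", "an", "account", "to", "deposit", "money"]

# Pick the sense whose corpus is most similar to the sentence.
best_sense = max(
    sense_corpora,
    key=lambda s: cosine_similarity(sentence, sense_corpora[s]),
)
```

When the sentence shares no word with any corpus, every score is zero and the choice is arbitrary, which is the gap the abstract's Lesk and Wu-Palmer fallback is meant to cover.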


Word sense disambiguation using implicit information

Natural Language Engineering ◽

10.1017/s1351324919000421 ◽

2019 ◽

Vol 26 (4) ◽

pp. 413-432 ◽

Cited By ~ 1

Author(s):

Goonjan Jain ◽

D.K. Lobiyal

Keyword(s):

Word Sense Disambiguation ◽

Ambiguous Word ◽

Word Sense ◽

Implicit Information ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Novel Method ◽

Unsupervised Approach ◽

Polysemous Words ◽

Better Than

Humans proficiently interpret the true sense of an ambiguous word by establishing association among words in a sentence. The complete sense of text is also based on implicit information, which is not explicitly mentioned. The absence of this implicit information is a significant problem for a computer program that attempts to determine the correct sense of ambiguous words. In this paper, we propose a novel method to uncover the implicit information that links the words of a sentence. We reveal this implicit information using a graph, which is then used to disambiguate the ambiguous word. The experiments show that the proposed algorithm interprets the correct sense for both homonyms and polysemous words. Our proposed algorithm has performed better than the approaches presented in the SemEval-2013 task for word sense disambiguation and has shown an accuracy of 79.6 percent, which is 2.5 percent better than the best unsupervised approach in SemEval-2007.


A HIGHLY ACCURATE BOOTSTRAPPING ALGORITHM FOR WORD SENSE DISAMBIGUATION

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213001000398 ◽

2001 ◽

Vol 10 (01n02) ◽

pp. 5-21 ◽

Cited By ~ 17

Author(s):

RADA F. MIHALCEA ◽

DAN I. MOLDOVAN

Keyword(s):

High Precision ◽

Word Sense Disambiguation ◽

Original Text ◽

Word Sense ◽

New Words ◽

Input Text ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Very High

In this paper, we present a bootstrapping algorithm for Word Sense Disambiguation which succeeds in disambiguating a subset of the words in the input text with very high precision. It uses WordNet and a semantically tagged corpus to identify the correct sense of the words in a given text. The bootstrapping process initializes a set of ambiguous words with all the nouns and verbs in the text. It then applies various disambiguation procedures and builds a set of disambiguated words: new words are sense tagged based on their relation to the already disambiguated words, and then added to the set. This process allows us to identify, in the original text, a set of words which can be disambiguated with high precision; 55% of the verbs and nouns are disambiguated with an accuracy of 92%.


Evaluating Word Sense Disambiguation Tools for Information Retrieval Task

Lecture Notes in Computer Science - Evaluating Systems for Multilingual and Multimodal Information Access ◽

10.1007/978-3-642-04447-2_13 ◽

2009 ◽

pp. 113-117 ◽

Cited By ~ 3

Author(s):

Fernando Martínez-Santiago ◽

José M. Perea-Ortega ◽

Miguel A. García-Cumbreras

Keyword(s):

Information Retrieval ◽

Word Sense Disambiguation ◽

Word Sense ◽

Retrieval Task ◽

Sense Disambiguation


A Comparative Analysis of Supervised Word Sense Disambiguation in Information Retrieval

Communication and Intelligent Systems - Lecture Notes in Networks and Systems ◽

10.1007/978-981-16-1089-9_10 ◽

2021 ◽

pp. 111-120

Author(s):

Chandrakala Arya ◽

Manoj Diwakar ◽

Shobha Arya

Keyword(s):

Information Retrieval ◽

Comparative Analysis ◽

Word Sense Disambiguation ◽

Word Sense ◽

Sense Disambiguation


A Novel Approach to Word Sense Disambiguation Based on Topical and Semantic Association

The Scientific World JOURNAL ◽

10.1155/2013/586327 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 2

Author(s):

Xin Wang ◽

Wanli Zuo ◽

Ying Wang

Keyword(s):

Language Processing ◽

Fundamental Problem ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Semantic Features ◽

Word Sense ◽

Semantic Association ◽

Data Set ◽

Novel Approach ◽

Sense Disambiguation

Word sense disambiguation (WSD) is a fundamental problem in natural language processing, the objective of which is to identify the most appropriate sense for an ambiguous word in a given context. Although WSD has been researched over the years, the performance of existing algorithms in terms of accuracy and recall is still unsatisfactory. In this paper, we propose a novel approach to word sense disambiguation based on topical and semantic association. For a given document, supposing that its topic category is accurately discriminated, the correct sense of the ambiguous term is identified through the corresponding topic and semantic contexts. We first extract topic-discriminative terms from the document and construct a topical graph based on topic span intervals to implement topic identification. We then exploit syntactic features, topic span features, and semantic features to disambiguate nouns and verbs in the context of the ambiguous word. Finally, we conduct experiments on the standard data set SemCor to evaluate the performance of the proposed method, and the results indicate that our approach achieves relatively better performance than existing approaches.


Word sense disambiguation to improve precision for ambiguous queries

Open Computer Science ◽

10.2478/s13537-012-0032-6 ◽

2012 ◽

Vol 2 (4) ◽

Cited By ~ 2

Author(s):

Adrian-Gabriel Chifu ◽

Radu-Tudor Ionescu

Keyword(s):

Information Retrieval ◽

Ad Hoc ◽

Naive Bayes ◽

Word Sense Disambiguation ◽

Naïve Bayes ◽

Word Sense ◽

Interdisciplinary Approaches ◽

Sense Disambiguation ◽

Ranking Technique

Success in Information Retrieval (IR) depends on many variables. Several interdisciplinary approaches try to improve the quality of the results obtained by an IR system. In this paper we propose a new way of using word sense disambiguation (WSD) in IR. The method we develop is based on Naïve Bayes classification and can be used both as a filtering and as a re-ranking technique. We show on the TREC ad-hoc collection that WSD is useful for queries that are difficult because of sense ambiguity. Our interest lies in improving the precision after 5, 10 and 30 retrieved documents (P@5, P@10 and P@30, respectively) for such lowest-precision queries.
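
For readers unfamiliar with the P@k measures mentioned above, a minimal sketch follows. It assumes the usual convention of dividing by k even when fewer than k documents are retrieved. The function name, the ranking and the relevance judgments are hypothetical and do not come from the TREC collection.

```python
def precision_at_k(ranked_doc_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k  # convention: divide by k, not by len(top_k)

# Hypothetical system ranking and relevance judgments for one query.
ranking = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d0"]
relevant = {"d3", "d1", "d2", "d6"}

p_at_5 = precision_at_k(ranking, relevant, 5)
p_at_10 = precision_at_k(ranking, relevant, 10)
```

A filtering or re-ranking step improves P@k when it moves relevant documents into the first k positions or pushes non-relevant ones out of them.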
