An Approach to Word Sense Disambiguation Combining Modified Lesk and Bag-of-Words

Author(s):  
Alok Ranjan Pal ◽  
Anirban Kundu ◽  
Abhay Singh ◽  
Raj Shekhar ◽  
Kunal Sinha
2017 ◽  
Vol 14 (4) ◽  
Author(s):  
Rui Antunes ◽  
Sérgio Matos

Abstract: Word sense disambiguation (WSD) is an important step in biomedical text mining: it assigns an unequivocal concept to an ambiguous term, which improves the accuracy of biomedical information extraction systems. In this work we followed supervised and knowledge-based disambiguation approaches, with the best results obtained by supervised means. In the supervised method we used bag-of-words as local features and word embeddings as global features. In the knowledge-based method we combined word embeddings, concept textual definitions extracted from the UMLS database, and concept association values calculated from MeSH co-occurrence counts in MEDLINE articles. Also in the knowledge-based method, we tested different word embedding averaging functions for calculating the surrounding context vectors, with the goal of giving more importance to the words closest to the ambiguous term. The MSH WSD dataset, the most common dataset used for evaluating biomedical concept disambiguation, was used to evaluate our methods. We obtained a top accuracy of 95.6% by supervised means, while the best knowledge-based accuracy was 87.4%. Our results show that word embedding models improved the disambiguation accuracy, proving to be a powerful resource in the WSD task.
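
The idea of a distance-weighted context vector mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The exponential decay weighting, the function names, the toy two-dimensional embeddings, and the concept labels are all assumptions made for illustration. The sketch covers only the embedding-averaging step and leaves out the UMLS definitions and MeSH co-occurrence association values that the abstract also describes.

```python
import math


def weighted_context_vector(tokens, target_index, embeddings, decay=0.5):
    """Average context-word embeddings, giving nearer words larger weights.

    The decay scheme is an assumption: a word at distance k from the
    ambiguous term receives weight decay ** (k - 1).
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for i, token in enumerate(tokens):
        # Skip the ambiguous term itself and out-of-vocabulary words.
        if i == target_index or token not in embeddings:
            continue
        weight = decay ** (abs(i - target_index) - 1)
        for d, value in enumerate(embeddings[token]):
            total[d] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known context words: zero vector
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(tokens, target_index, embeddings, concept_vectors, decay=0.5):
    """Return the concept whose vector is most similar to the context vector."""
    context = weighted_context_vector(tokens, target_index, embeddings, decay)
    return max(concept_vectors, key=lambda c: cosine(context, concept_vectors[c]))


# Hypothetical toy data: real systems would use trained embeddings.
TOY_EMBEDDINGS = {
    "patient": [1.0, 0.0],
    "fever": [0.9, 0.1],
    "winter": [0.1, 0.9],
    "weather": [0.0, 1.0],
}
TOY_CONCEPTS = {
    "common_cold_disease": [1.0, 0.0],
    "cold_temperature": [0.0, 1.0],
}

if __name__ == "__main__":
    sentence = ["the", "patient", "has", "a", "cold", "and", "fever"]
    print(disambiguate(sentence, 4, TOY_EMBEDDINGS, TOY_CONCEPTS))
```

Setting `decay=1.0` reduces the weighting to a plain unweighted average, which makes it easy to compare the two averaging choices on the same input.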


Author(s):  
Manuel Ladron de Guevara ◽  
Christopher George ◽  
Akshat Gupta ◽  
Daragh Byrne ◽  
Ramesh Krishnamurti

2017 ◽  
Vol 132 ◽  
pp. 47-61 ◽  
Author(s):  
Yoan Gutiérrez ◽  
Sonia Vázquez ◽  
Andrés Montoyo

2005 ◽  
Vol 12 (5) ◽  
pp. 554-565 ◽  
Author(s):  
Martijn J. Schuemie ◽  
Jan A. Kors ◽  
Barend Mons
