Word Sense Induction Using Correlated Topic Model

Author(s):  
Thanh Tung Hoang ◽  
Phuong Thai Nguyen


Author(s):  
Jing Wang ◽  
Mohit Bansal ◽  
Kevin Gimpel ◽  
Brian D. Ziebart ◽  
Clement T. Yu

Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task.
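One enrichment idea described above is expanding each instance's local context using neural word embeddings, so the sense variable has more evidence to condition on. The sketch below illustrates the general technique with a tiny hand-made embedding table; the vectors, vocabulary, and function names are illustrative assumptions, not the authors' implementation, which would use pretrained embeddings over a large corpus.

```python
import math

# Toy word vectors. In practice these would be pretrained neural
# embeddings; the values here are purely illustrative.
EMBEDDINGS = {
    "bank":  [0.9, 0.1, 0.0],
    "money": [0.8, 0.2, 0.1],
    "loan":  [0.7, 0.3, 0.0],
    "river": [0.1, 0.9, 0.2],
    "water": [0.0, 0.8, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def expand_context(context_words, k=2):
    """Append the k nearest vocabulary words (by cosine similarity)
    for each context word, enlarging the local context around the
    ambiguous target word."""
    expanded = list(context_words)
    for w in context_words:
        if w not in EMBEDDINGS:
            continue
        sims = sorted(
            ((cosine(EMBEDDINGS[w], EMBEDDINGS[v]), v)
             for v in EMBEDDINGS if v != w),
            reverse=True,
        )
        expanded.extend(v for _, v in sims[:k])
    return expanded

print(expand_context(["money"]))  # → ['money', 'bank', 'loan']
```

With real embeddings the expansion pulls in distributionally similar words, which sharpens the sense posterior when the observed context window is short.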


Author(s):  
Reinald Kim Amplayo ◽  
Seung-won Hwang ◽  
Min Song

Word sense induction (WSI), or the task of automatically discovering multiple senses or meanings of a word, has three main challenges: domain adaptability, novel sense detection, and sense granularity flexibility. While current latent variable models are known to solve the first two challenges, they are not flexible to different word sense granularities, which vary widely across words, from aardvark with one sense to play with over 50 senses. Current models either require hyperparameter tuning or nonparametric induction of the number of senses, and we find both to be ineffective. Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring words. These observations alleviate the problem by (a) discarding garbage senses and (b) additionally inducing fine-grained word senses. Results show substantial improvements over state-of-the-art models on popular WSI datasets. We also show that AutoSense is able to learn the appropriate sense granularity of a word. Finally, we apply AutoSense to the unsupervised author name disambiguation task, where the sense granularity problem is more evident, and show that AutoSense clearly outperforms competing models. We share our data and code here: https://github.com/rktamplayo/AutoSense.
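The "garbage sense" idea above can be sketched as follows: start with a generous maximum number of candidate senses and keep only those that actually win instance assignments, so the surviving sense count adapts per word. This is a minimal illustration under assumed inputs, not the authors' AutoSense inference code; the function name and threshold are hypothetical.

```python
def prune_garbage_senses(posteriors, min_instances=1):
    """posteriors: per-instance distributions over the same candidate
    senses (list of lists of probabilities). Keeps only the senses
    that are the argmax assignment for at least `min_instances`
    instances; the rest are treated as garbage senses and dropped."""
    counts = [0] * len(posteriors[0])
    for dist in posteriors:
        best = max(range(len(dist)), key=dist.__getitem__)
        counts[best] += 1
    return [s for s, c in enumerate(counts) if c >= min_instances]

# Four instances, three candidate senses; sense 2 never wins an
# assignment, so it is pruned and the word ends up with two senses.
posts = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
print(prune_garbage_senses(posts))  # → [0, 1]
```

Starting wide and pruning avoids both per-word hyperparameter tuning and nonparametric machinery for choosing the number of senses.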


2014 ◽  
Vol 04 (11) ◽  
pp. 879-888
Author(s):  
Xingchen Yu ◽  
Ernest Fokoué

2018 ◽  
Vol 52 (3) ◽  
pp. 733-770 ◽  
Author(s):  
Flavio Massimiliano Cecchini ◽  
Martin Riedl ◽  
Elisabetta Fersini ◽  
Chris Biemann
