Intuitive Contrasting Map for Antonym Embeddings

2021 ◽  
Author(s):  
Igor Samenko ◽  
Alexey Tikhonov ◽  
Ivan P. Yamshchikov

This paper shows that modern word embeddings contain information that distinguishes synonyms and antonyms despite small cosine similarities between corresponding vectors. This information is implicitly encoded in the geometry of the embeddings and could be extracted with a straightforward manifold learning procedure or a contrasting map. Such a map is trained on a small labeled subset of the data and can produce new embeddings that explicitly highlight specific semantic attributes of the word. The new embeddings produced by the map are shown to improve the performance on downstream tasks.
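The abstract's starting point is that raw cosine similarity alone cannot separate synonyms from antonyms, which is why a learned contrasting map is needed. A minimal sketch of that observation, using toy two-dimensional vectors invented here for illustration (not the paper's embeddings or its actual map):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy vectors standing in for "hot", "warm" (synonyms) and "cold" (antonym).
hot, warm, cold = [0.9, 0.1], [0.85, 0.15], [0.8, 0.3]

# Both pairs score near 1, so cosine alone does not reveal the
# synonym/antonym distinction -- the information must be extracted
# from the embedding geometry by a trained map, as the paper proposes.
syn_sim = cosine(hot, warm)
ant_sim = cosine(hot, cold)
```

In the paper's setup the contrasting map itself is trained on a small labeled subset of synonym/antonym pairs; the sketch above only illustrates why the raw similarity scores are insufficient as input.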

Author(s):  
Tatsunori B. Hashimoto ◽  
David Alvarez-Melis ◽  
Tommi S. Jaakkola

Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with a Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and demonstrates that existing algorithms are consistent metric recovery methods given co-occurrence counts from random walks. Furthermore, we propose a simple, principled, direct metric recovery algorithm that performs on par with state-of-the-art word embedding and manifold learning methods. Finally, we complement the recent focus on analogies by constructing two new inductive reasoning datasets—series completion and classification—and demonstrate that word embeddings can be used to solve them as well.
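The core hypothesis above is that log co-occurrence counts behave like a negative squared Euclidean distance, so the latent semantic space can be recovered by metric-recovery techniques. A toy numpy sketch of that idea, using synthetic noiseless "co-occurrences" and classical multidimensional scaling (an assumption for illustration; the paper's own estimator and corpus statistics differ):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth "semantic" points we will try to recover.
points = rng.normal(size=(6, 2))
sqd = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)

# Simulated log co-occurrence under the Euclidean hypothesis:
# log n_ij ~ -d_ij^2 / 2 + const.
log_cooc = -sqd / 2.0 + 3.0

# Invert the relation; the unknown constant cancels against the
# maximum entry (the zero-distance diagonal).
d2 = -2.0 * (log_cooc - log_cooc.max())

# Classical MDS: double-center, then embed with top eigenvectors.
n = d2.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ d2 @ J
w, V = np.linalg.eigh(B)
X = V[:, -2:] * np.sqrt(np.maximum(w[-2:], 0.0))

# Pairwise distances of the recovered configuration match the originals
# (the configuration itself is only identified up to rotation/translation).
recovered = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
```

With noiseless counts the recovery is exact up to rigid motion; the paper's contribution is showing that realistic co-occurrence statistics from random walks still support consistent recovery.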


2021 ◽  
Vol 32 (2) ◽  
pp. 218-240 ◽  
Author(s):  
Tessa E. S. Charlesworth ◽  
Victor Yang ◽  
Thomas C. Mann ◽  
Benedek Kurdi ◽  
Mahzarin R. Banaji

Stereotypes are associations between social groups and semantic attributes that are widely shared within societies. The spoken and written language of a society affords a unique way to measure the magnitude and prevalence of these widely shared collective representations. Here, we used word embeddings to systematically quantify gender stereotypes in language corpora that are unprecedented in size (65+ million words) and scope (child and adult conversations, books, movies, TV). Across corpora, gender stereotypes emerged consistently and robustly for both theoretically selected stereotypes (e.g., work–home) and comprehensive lists of more than 600 personality traits and more than 300 occupations. Despite underlying differences across language corpora (e.g., time periods, formats, age groups), results revealed the pervasiveness of gender stereotypes in every corpus. Using gender stereotypes as the focal issue, we unite 19th-century theories of collective representations and 21st-century evidence on implicit social cognition to understand the subtle yet persistent presence of collective representations in language.
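One common way such group-attribute associations are quantified with word embeddings is by comparing a word's cosine similarity to male versus female anchor vectors; the sketch below assumes that projection-difference measure and uses toy vectors invented here (not the study's corpora, embeddings, or exact metric):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_association(word_vec, male_vec, female_vec):
    """Positive -> closer to the male anchor; negative -> closer to female."""
    return cosine(word_vec, male_vec) - cosine(word_vec, female_vec)

# Toy anchors and attribute vectors for the theoretically selected
# work-home stereotype mentioned in the abstract.
he, she = [1.0, 0.0], [0.0, 1.0]
work, home = [0.8, 0.2], [0.2, 0.8]

work_score = gender_association(work, he, she)   # > 0 in this toy setup
home_score = gender_association(home, he, she)   # < 0 in this toy setup
```

In the study itself, scores like these are computed across 600+ traits and 300+ occupations per corpus, and the consistency of their signs across corpora is what supports the pervasiveness claim.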


2014 ◽  
Vol 39 (12) ◽  
pp. 2077-2089 ◽  
Author(s):  
Min YUAN ◽  
Lei CHENG ◽  
Ran-Gang ZHU ◽  
Ying-Ke LEI

2013 ◽  
Vol 32 (6) ◽  
pp. 1670-1673
Author(s):  
Xue-yan ZHOU ◽  
Jian-min HAN ◽  
Yu-bin ZHAN
