Pipeline for a Data-driven Network of Linguistic Terms

2021
Author(s):  
Søren Wichmann

The present work is aimed at (1) developing a search engine adapted to the large DReaM corpus of linguistic descriptive literature and (2) gaining insight into how a data-driven ontology of linguistic terminology might be built. Starting from close to 20,000 text documents from the literature of language descriptions, either born digital or scanned and OCR'd, we extract keywords and pass them through a pruning pipeline in which mainly keywords belonging to linguistic terminology survive. Subsequently we quantify relations among those terms using Normalized Pointwise Mutual Information (NPMI) and use the resulting measures, in conjunction with Google PageRank, to build networks of linguistic terms.
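The association step of such a pipeline can be sketched as follows. This is a minimal illustration, not the authors' code: the toy keyword sets and document-level co-occurrence counting are assumptions standing in for the paper's 20,000-document corpus. NPMI rescales pointwise mutual information into [-1, 1]; the resulting weighted term pairs would then serve as edges for a PageRank computation over the term network.

```python
import math
from collections import Counter
from itertools import combinations

# Toy "documents": each is the set of surviving keywords extracted from one text
# (hypothetical terms chosen for illustration).
docs = [
    {"ergative", "absolutive", "case"},
    {"ergative", "case", "agreement"},
    {"tone", "downstep", "pitch"},
    {"tone", "pitch", "case"},
]

n_docs = len(docs)
term_df = Counter(t for d in docs for t in d)                      # document frequency per term
pair_df = Counter(p for d in docs for p in combinations(sorted(d), 2))

def npmi(x, y):
    """Normalized PMI in [-1, 1], estimated from document co-occurrence."""
    p_xy = pair_df[tuple(sorted((x, y)))] / n_docs
    if p_xy == 0:
        return -1.0                       # never co-occur: minimum association
    p_x, p_y = term_df[x] / n_docs, term_df[y] / n_docs
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)          # normalize by -log p(x, y)

# Rank candidate edges by association strength; these weighted pairs
# would feed the network construction and PageRank step.
edges = sorted(pair_df, key=lambda p: npmi(*p), reverse=True)
```

Here terms that share more documents than their individual frequencies predict (e.g. "ergative" and "case" above) rise to the top of the edge list.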

2021
Author(s):  
Gourab Das

LitRev is a novel, robust, data-driven approach developed for quick literature review on a particular topic of interest. The method identifies common biological phrases that follow a power-law distribution, as well as important phrases whose normalized pointwise mutual information (NPMI) score is greater than zero.
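The NPMI-greater-than-zero filter can be illustrated on adjacent word pairs. This is a sketch under simplifying assumptions: whitespace tokenization and bigram counting stand in for LitRev's actual phrase extraction, and the toy text stands in for a biomedical corpus. A positive NPMI means the two words co-occur more often than their individual frequencies predict.

```python
import math
from collections import Counter

# Toy corpus (assumption: a stand-in for the biomedical abstracts LitRev mines).
tokens = ("gene expression profile shows gene expression change while "
          "random words drift gene expression again").split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())

def npmi_bigram(w1, w2):
    """NPMI of an adjacent word pair; -1 if the pair never occurs."""
    p_xy = bigrams[(w1, w2)] / n_bi
    if p_xy == 0:
        return -1.0
    p_x, p_y = unigrams[w1] / n_uni, unigrams[w2] / n_uni
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

# Keep only "important" phrases: NPMI strictly greater than zero.
important = [bg for bg in bigrams if npmi_bigram(*bg) > 0]
```

The recurring collocation "gene expression" scores higher than incidental neighbours such as "expression profile", so it survives the filter.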


2013
Vol 20 (2)
pp. 185-234
Author(s):  
AKIRA UTSUMI

Abstract: This study examines the ability of a semantic space model to represent the meaning of noun compounds such as ‘information gathering’ or ‘heart disease.’ For a semantic space model to compute the meaning and the attributional similarity (or semantic relatedness) of unfamiliar noun compounds that do not occur in a corpus, the vector for a noun compound must be computed from the vectors of its constituent words using vector composition algorithms. Six composition algorithms (i.e., centroid, multiplication, circular convolution, predication, comparison, and dilation) are compared in terms of the quality of the computed attributional similarity for English and Japanese noun compounds. To evaluate the similarity computation, this study uses three tasks (i.e., related word ranking, similarity correlation, and semantic classification) and two types of semantic spaces (i.e., latent semantic analysis-based and positive pointwise mutual information-based spaces). The results of these tasks show that the dilation algorithm is generally most effective in computing the similarity of noun compounds, while the multiplication algorithm is best suited specifically to the positive pointwise mutual information-based space. In addition, the comparison algorithm works better for unfamiliar noun compounds that do not occur in the corpus. These findings indicate that a semantic space model in general, and the dilation, multiplication, and comparison algorithms in particular, have sufficient ability to compute the attributional similarity of noun compounds.
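Three of the composition algorithms named above can be sketched directly. This is an illustrative implementation, not the paper's code: the 3-d vectors are toy stand-ins for rows of an LSA or PPMI space (real spaces have hundreds of dimensions), and the dilation formula follows the widely used Mitchell & Lapata definition, which this line of work typically assumes.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def centroid(u, v):
    """Componentwise average of the constituent word vectors."""
    return [(a + b) / 2 for a, b in zip(u, v)]

def multiplication(u, v):
    """Componentwise product (the multiplicative composition model)."""
    return [a * b for a, b in zip(u, v)]

def dilation(u, v, lam=2.0):
    """Stretch v along the direction of u by factor lam:
    p = (u.u)v + (lam - 1)(u.v)u."""
    uu, uv = dot(u, u), dot(u, v)
    return [uu * b + (lam - 1) * uv * a for a, b in zip(u, v)]

def cosine(u, v):
    """Attributional similarity between composed (or atomic) vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

# Hypothetical constituent vectors for 'heart' and 'disease'.
heart, disease = [0.9, 0.1, 0.3], [0.2, 0.8, 0.4]
hd = dilation(heart, disease)        # vector for the unseen compound 'heart disease'
sim = cosine(hd, [0.5, 0.5, 0.4])    # compare against some other term's vector
```

The composed vector `hd` can then be ranked against other word vectors by cosine similarity, which is how the related word ranking task above is scored.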

