The interdependence of frequency, predictability, and informativity in the segmental domain

2018 ◽  
Vol 4 (s2) ◽  
Author(s):  
Uriel Cohen Priva ◽  
T. Florian Jaeger

It has long been noted that language production seems to reflect a correlation between message redundancy and signal reduction. More frequent words and contextually predictable instances of words, for example, tend to be produced with shorter and less clear signals. The same tendency is observed in the language code (e.g. the phonological lexicon), where more frequent words and words that are typically contextually predictable tend to have fewer segments or syllables. Average predictability in context (informativity) also seems to be an important factor in understanding phonological alternations. What has received little attention so far is the relation between various information-theoretic indices, such as frequency, contextual predictability, and informativity. Although each of these indices has been associated with different theories about the source of the redundancy-reduction link, different indices tend to be highly correlated in natural language, making it difficult to tease apart their effects. We present a computational approach to this problem. We assess the correlations between frequency, predictability, and informativity, and assess when these correlations are likely to create spurious (null or non-null) effects depending on, for example, the amount of data available to the researcher.
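The three indices the abstract contrasts can be made concrete on a toy corpus. The sketch below is illustrative only (bigram contexts, a hand-made word list); the authors' actual corpora and estimators are not described here, and all names are invented for the example.

```python
import math
from collections import Counter

# Toy corpus; in practice these counts would come from a large corpus.
corpus = ("the cat sat on the mat the cat ate the fish "
          "a dog sat on a mat a dog ate a bone").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = len(corpus)

def frequency(w):
    """Relative frequency of w in the corpus."""
    return unigrams[w] / total

def predictability(prev, w):
    """In-context predictability: conditional probability P(w | prev)."""
    return bigrams[(prev, w)] / unigrams[prev]

def informativity(w):
    """Average surprisal of w over its contexts:
    -sum_c P(c | w) * log2 P(w | c)."""
    occurrences = [(prev, n) for (prev, nxt), n in bigrams.items() if nxt == w]
    tot = sum(n for _, n in occurrences)
    return -sum((n / tot) * math.log2(predictability(prev, w))
                for prev, n in occurrences)

# "cat" only ever follows "the", and "the" is followed by "cat" half
# the time, so its informativity is -log2(0.5) = 1 bit.
```

With all three quantities computed per word, their pairwise correlations over the vocabulary can then be inspected, which is the kind of analysis the abstract describes.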

2017 ◽  
Vol 8 (1) ◽  
pp. 106-131
Author(s):  
Frances Yung ◽  
Kevin Duh ◽  
Taku Komura ◽  
Yuji Matsumoto

Discourse relations can either be explicitly marked by discourse connectives (DCs), such as therefore and but, or implicitly conveyed in natural language utterances. How speakers choose between the two options is not well understood. In this study, we propose a psycholinguistic model that predicts whether or not speakers will produce an explicit marker given the discourse relation they wish to express. Our model is based on two information-theoretic frameworks: (1) the Rational Speech Acts model, which captures the pragmatic interaction between language production and interpretation by Bayesian inference, and (2) the Uniform Information Density theory, which holds that speakers adjust linguistic redundancy to maintain a uniform rate of information transmission. Specifically, our model quantifies the utility of using or omitting a DC based on the expected surprisal of comprehension, the cost of production, and the availability of other signals in the rest of the utterance. Experiments on the Penn Discourse Treebank show that our approach outperforms the state-of-the-art model (Patterson and Kehler, 2013) at predicting the presence of DCs, in addition to giving an explanatory account of the speaker's choice.
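The trade-off the model formalizes can be caricatured in a few lines: produce the connective when the comprehension surprisal it saves exceeds its production cost. This is a toy rendering of the idea, not the authors' actual RSA/UID model; the probabilities and cost below are invented for illustration.

```python
import math

def dc_utility(p_relation_with_dc, p_relation_without_dc, production_cost):
    """Utility difference of producing vs. omitting a discourse connective.

    Each option is scored by the comprehender's surprisal at the intended
    relation; producing the DC pays off when the surprisal it saves
    exceeds its production cost.
    """
    surprisal_with = -math.log2(p_relation_with_dc)
    surprisal_without = -math.log2(p_relation_without_dc)
    return (surprisal_without - surprisal_with) - production_cost

# A contrastive relation that is hard to infer implicitly (P = 0.1) but
# near-certain once "but" is produced (P = 0.9), at a modest cost:
produce_dc = dc_utility(0.9, 0.1, production_cost=1.0) > 0
```

When the connective adds no predictability (both probabilities equal), the utility is just the negated production cost, so the model prefers omission — matching the UID intuition that redundant signal is dropped.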


2021 ◽  
Vol 18 (2) ◽  
pp. 172988142199958
Author(s):  
Larkin Folsom ◽  
Masahiro Ono ◽  
Kyohei Otsu ◽  
Hyoshin Park

Mission-critical exploration of uncertain environments requires reliable and robust mechanisms for achieving information gain. Typical measures of information gain, such as Shannon entropy and KL divergence, are either unable to distinguish between different bimodal probability distributions or introduce bias toward one mode of a bimodal distribution. Using a standard deviation (SD) metric reduces bias while retaining the ability to distinguish between higher- and lower-risk distributions. Areas of high SD can be safely explored through observation with an autonomous Mars Helicopter, allowing safer and faster path plans for ground-based rovers. First, this study presents a single-agent, information-theoretic, utility-based path planning method for a highly correlated uncertain environment. Then, an information-theoretic two-stage multiagent rapidly exploring random tree framework is presented, which guides the Mars Helicopter through regions of high SD to reduce uncertainty for the rover. In a Monte Carlo simulation, we compare our information-theoretic framework with a rover-only approach and a naive approach in which the helicopter scouts ahead of the rover along its planned path. Finally, the model is demonstrated in a case study on the Jezero region of Mars. Results show that the information-theoretic helicopter improves the rover's travel time on average compared with the rover alone or with the helicopter scouting ahead along the rover's initially planned route.
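The entropy-vs-SD point is easy to verify numerically: two symmetric bimodal distributions with the same mode probabilities have identical Shannon entropy regardless of how far apart the modes sit, while SD grows with mode separation and so tracks risk. The numbers below are illustrative, not from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def std_dev(values, probs):
    """Standard deviation of a discrete distribution over values."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return math.sqrt(var)

probs = [0.5, 0.5]
values_near = [4.0, 6.0]   # modes close together: lower risk
values_far = [0.0, 10.0]   # modes far apart: higher risk

# Entropy depends only on the probabilities, so both distributions score
# exactly 1 bit; SD separates them (1.0 vs. 5.0).
```

This is why a SD-based utility can rank candidate observation sites by how much a single helicopter measurement would shrink the rover's uncertainty, where an entropy-based utility would be indifferent.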


Author(s):  
Subhro Roy ◽  
Tim Vieira ◽  
Dan Roth

Little work from the Natural Language Processing community has targeted the role of quantities in Natural Language Understanding. This paper takes some key steps towards facilitating reasoning about quantities expressed in natural language. We investigate two different tasks of numerical reasoning. First, we consider Quantity Entailment, a new task formulated to understand the role of quantities in general textual inference tasks. Second, we consider the problem of automatically understanding and solving elementary school math word problems. In order to address these quantitative reasoning problems we first develop a computational approach which we show to successfully recognize and normalize textual expressions of quantities. We then use these capabilities to further develop algorithms to assist reasoning in the context of the aforementioned tasks.


Autism ◽  
2020 ◽  
Vol 24 (5) ◽  
pp. 1232-1245
Author(s):  
Emily F Ferguson ◽  
Allison S Nahmias ◽  
Samantha Crabbe ◽  
Talia Liu ◽  
David S Mandell ◽  
...  

Many children diagnosed with autism spectrum disorder who receive early intervention reap developmental benefits, but little is known about characteristics of early intervention placements in the community that optimize individual growth. The extent to which children hear and use language, in particular, may contribute significantly to developmental outcomes. We analyzed natural language production and exposure to language in preschoolers on the autism spectrum across three classroom compositions: autism only, mixed disability, and inclusion. Autistic children in inclusion classrooms produced more speech, received significantly more verbal input from their peers, and were exposed to a similar amount of teacher talk compared to children in autism only or mixed disability classrooms. These findings shed preliminary light on the linguistic environment of early intervention placements in the community, along with the characteristics of children placed in early intervention settings that may influence their language exposure from peers and teachers. Natural language sampling is a promising method for capturing language exposure in early intervention settings and providing context for understanding developmental outcomes resulting from early intervention.

Lay abstract: Early intervention is important for preschoolers on the autism spectrum, but little is known about early intervention classrooms in the community. This study found that children with better language skills and lower autism severity have more verbal interactions with their classmates, especially in classrooms with typically developing peers (inclusion settings). Findings suggest that natural language sampling is a useful method for characterizing autistic children and their early intervention settings. In addition, natural language sampling may have important implications for understanding individual opportunities for development in community early intervention settings.


Signals ◽  
2021 ◽  
Vol 2 (4) ◽  
pp. 754-770
Author(s):  
Daniela López De Luise

Like many other products of the brain, language is a complex tool that helps individuals communicate with each other. Many studies in computational linguistics aim to describe and understand its structures and the process of content production. At present, a long list of contributions can describe and manage language with different levels of precision and applicability, but generative approaches remain an open need. This paper focuses on laying the groundwork for understanding language production through a combination of entropy and fractals. It is part of a larger work on seven rules intended to help build sentences automatically in the context of dialogs with humans. Within the scope of this paper, a set of dialogs is outlined and pre-processed, and three of the thermodynamic rules of language production are introduced and applied. The communicative implications and a statistical evaluation are also presented. A final analysis of the results suggests that exploring fractal explanations of entropy could provide prospective insight for automatic sentence generation in natural language.


Author(s):  
José Ignacio Serrano

Owing to the growing amount of digital information stored in natural language, systems that automatically process text are crucially important and extremely useful. There is currently a considerable amount of research (Sebastiani, 2002; Crammer et al., 2003) applying a large variety of machine learning algorithms and other Knowledge Discovery in Databases (KDD) methods to Text Categorization (automatically labeling texts according to category), Information Retrieval (retrieving texts similar to a given cue), Information Extraction (identifying pieces of text that carry certain meanings), and Question Answering (automatically answering user questions about a certain topic). The texts or documents used can be stored either in ad hoc databases or on the World Wide Web. Data mining in texts, well known as Text Mining, is a case of KDD with some particular issues. On the one hand, the features are obtained from the words contained in the texts, or are the words themselves, so text mining systems face a huge number of attributes. On the other hand, the features are highly correlated to form meanings, so it is necessary to take the relationships among words into account, which implies considering syntax and semantics as human beings do. KDD techniques require input texts to be represented as a set of attributes. This text-to-representation process is called text (or document) indexing, and the attributes are called indexes. Indexing is therefore a crucial step in text mining, because an indexed representation must capture, with only a set of indexes, most of the information expressed in natural language in the text with minimal loss of semantics.
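The indexing step described above is commonly realized as a bag-of-words weighting such as TF-IDF. The snippet below is a minimal sketch of that idea on a made-up three-document collection; the documents and the simple whitespace tokenizer are assumptions for illustration.

```python
import math
from collections import Counter

docs = ["the cat sat on the mat",
        "the dog chased the cat",
        "stocks rallied on earnings news"]

tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents each term appears.
df = Counter(w for doc in tokenized for w in set(doc))

def tfidf_index(doc):
    """Index one document as {term: tf-idf weight}."""
    tf = Counter(doc)
    return {w: (count / len(doc)) * math.log(n_docs / df[w])
            for w, count in tf.items()}

index = [tfidf_index(doc) for doc in tokenized]
# Terms occurring in every document get weight 0; rare, topical terms
# such as "stocks" get the highest weights.
```

This illustrates the two issues the paragraph raises: the attribute space is as large as the vocabulary, and the flat index discards word order, so syntax and semantics must be reintroduced by other means.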


2021 ◽  
Author(s):  
Marc Serra-Peralta ◽  
Joan Serrà ◽  
Álvaro Corral

Music is a fundamental human construct, and harmony provides the building blocks of musical language. Using the Kunstderfuge corpus of classical music, we analyze the historical evolution of the richness of harmonic vocabulary of 76 classical composers, covering almost 6 centuries. The corpus comprises about 9500 pieces, resulting in more than 5 million tokens of music codewords. The fulfilment of Heaps' law for the relation between the size of the harmonic vocabulary of a composer (in codeword types) and the total length of their works (in codeword tokens), with an exponent around 0.35, allows us to define a relative measure of vocabulary richness that has a transparent interpretation. When coupled with the considered corpus, this measure allows us to quantify harmony richness across centuries, unveiling a clear increasing linear trend. In this way, we are able to rank the composers in terms of richness of vocabulary, in the same way as for other related metrics, such as entropy. We find that the latter is particularly highly correlated with our measure of richness. Our approach is not specific to music and can be applied to other systems built from tokens of different types, such as natural language.
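Heaps' law states that vocabulary size grows as a power of text length, V(N) ≈ K·N^β. The sketch below checks this on a simulated Zipfian token stream rather than the Kunstderfuge corpus, so the fitted exponent will differ from the paper's 0.35; the simulation parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a token stream whose type distribution is Zipfian, then
# measure how the number of distinct types V grows with token count N.
ranks = np.arange(1, 5001)
probs = (1.0 / ranks) / (1.0 / ranks).sum()
tokens = rng.choice(ranks, size=200_000, p=probs)

checkpoints = np.logspace(3, np.log10(len(tokens)), 20).astype(int)
vocab_sizes = [len(set(tokens[:n])) for n in checkpoints]

# Fit log V = beta * log N + c; beta is the Heaps exponent.
beta, c = np.polyfit(np.log(checkpoints), np.log(vocab_sizes), 1)
```

Given such a fit, a composer whose measured vocabulary sits above K·N^β for their token count N is relatively rich in harmonic vocabulary, which is the kind of size-corrected comparison the abstract describes.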


2014 ◽  
Vol 24 (07) ◽  
pp. 1440007 ◽  
Author(s):  
VALERO LAPARRA ◽  
SANDRA JIMÉNEZ ◽  
DEVIS TUIA ◽  
GUSTAU CAMPS-VALLS ◽  
JESUS MALO

This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. In contrast to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA has a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, this inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it makes it possible to interpret the identified features in the input domain, where the data has physical meaning, and to evaluate the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows easy computation of information-theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet–Serret frames (local features) and of generalized curvatures at any point of the space. Fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the UCI repository.
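The core move of PPA — replacing PCA's straight fitting line with a curve obtained from a simple univariate regression — can be sketched in a few lines of NumPy. This is a minimal illustration of the idea on invented 2-D data, not the authors' full (invertible, volume-preserving) construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Curved 2-D data: the second coordinate is a noisy quadratic in the first.
x = rng.uniform(-2, 2, 400)
data = np.column_stack([x, 0.5 * x**2 + 0.1 * rng.standard_normal(400)])
data -= data.mean(axis=0)

# Ordinary PCA step: leading principal direction via SVD.
_, _, Vt = np.linalg.svd(data, full_matrices=False)
proj = data @ Vt[0]    # scores along the first principal direction
resid = data @ Vt[1]   # residual coordinate, orthogonal to it

# PPA-style step: model the residual as a polynomial in the projection,
# i.e. fit a curve rather than a straight line (one univariate regression).
coeffs = np.polyfit(proj, resid, deg=2)
curve_resid = resid - np.polyval(coeffs, proj)

# The curve soaks up the variance a straight line must leave behind:
# np.var(curve_resid) is far smaller than np.var(resid).
```

Because each step is just an ordinary least-squares polynomial fit of one residual coordinate against the current projection, the procedure stays as cheap and robust as the abstract claims, and it can be applied sequentially to peel off one curved "principal" component at a time.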

