Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche

Christophe Coupé; Yoon Oh; Dan Dediu; François Pellegrino

doi:10.1126/sciadv.aaw2594


Science Advances ◽

10.1126/sciadv.aaw2594 ◽

2019 ◽

Vol 5 (9) ◽

pp. eaaw2594 ◽

Cited By ~ 10

Author(s):

Christophe Coupé ◽

Yoon Oh ◽

Dan Dediu ◽

François Pellegrino

Keyword(s):

Quantitative Methods ◽

Niche Construction ◽

Speech Rate ◽

Shannon Information ◽

Construction Process ◽

Natural Languages ◽

Information Rates ◽

Similar Information ◽

Level Information ◽

Encoding Efficiency

Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages’ structural properties and their speakers’ neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture.
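The coupling described in the abstract can be illustrated with a minimal sketch, assuming the information rate is simply the product of information per syllable (bits/syllable) and speech rate (syllables/s). The function name and the two sets of values below are hypothetical illustrations, not figures from the paper's corpus.

```python
def information_rate(bits_per_syllable: float, syllables_per_second: float) -> float:
    """Return an information rate in bits/s as the product of the two properties.

    Assumption: rate = information per syllable * speech rate.
    """
    return bits_per_syllable * syllables_per_second


# Hypothetical values chosen only to show the trade-off; not taken from the paper.
dense_slow = information_rate(8.0, 5.0)  # information-dense syllables, slower speech
light_fast = information_rate(5.0, 8.0)  # lighter syllables, faster speech

print(dense_slow, light_fast)  # 40.0 40.0
```

Under these assumed numbers, two languages that differ substantially in each property individually end up with the same rate, which is the kind of compensation the abstract reports.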


Anthropogenic climate change as a monumental niche construction process: background and philosophical aspects

Biology & Philosophy ◽

10.1007/s10539-020-09754-2 ◽

2020 ◽

Vol 35 (4) ◽

Author(s):

Andra Meneganzin ◽

Telmo Pievani ◽

Stefano Caserini

Keyword(s):

Climate Change ◽

Niche Construction ◽

Anthropogenic Climate Change ◽

Construction Process


Breast Cancer Histopathological Image Classification using Stochastic Dilated Residual Ghost Model

International Journal of Information Retrieval Research ◽

10.4018/ijirr.289655 ◽

2022 ◽

Vol 12 (1) ◽

pp. 0-0

Keyword(s):

Breast Cancer ◽

Data Augmentation ◽

State Of The Art ◽

Classification Model ◽

Histopathological Image ◽

Similar Information ◽

Level Information ◽

Histopathological Image Classification ◽

Percent Accuracy ◽

Accuracy Rates

A new deep learning-based classification model called the Stochastic Dilated Residual Ghost (SDRG) model was proposed in this work for categorizing histopathology images of breast cancer. The SDRG model used the proposed Multiscale Stochastic Dilated Convolution (MSDC) model, a ghost unit, and stochastic upsampling and downsampling units to categorize breast cancer accurately. This study addresses four primary issues. First, stain normalization was used to manage color divergence, and data augmentation with several factors was used to handle overfitting. The second challenge is extracting and enhancing tiny and low-level information such as edge, contour, and color accuracy; this is done by the proposed multiscale stochastic and dilation unit. The third contribution is to remove redundant or similar information from the convolutional neural network using a ghost unit. According to the assessment findings, the SDRG model achieved an overall accuracy of 95.65 percent in categorizing images, with a precision of 99.17 percent, superior to state-of-the-art approaches.


Discourse-Level Information Recall in Early and Late Bilinguals: Evidence From Single-Language and Cross-Linguistic Tasks

Frontiers in Psychology ◽

10.3389/fpsyg.2021.757351 ◽

2021 ◽

Vol 12 ◽

Author(s):

Isabelle Chou ◽

Jiehui Hu ◽

Edinson Muñoz ◽

Adolfo M. García

Keyword(s):

Verbal Memory ◽

Ecological Validity ◽

Speech Rate ◽

Real Life ◽

Control Measures ◽

L2 Acquisition ◽

Shape Information ◽

Information Recall ◽

Level Information ◽

Late Bilinguals

Bilingualism research indicates that verbal memory skills are sensitive to age of second language (L2) acquisition (AoA). However, most tasks employ disconnected, decontextualized stimuli, undermining ecological validity. Here, we assessed whether AoA impacts the ability to recall information from naturalistic discourse in single-language and cross-linguistic tasks. Twenty-four early and 25 late Chinese-English bilinguals listened to real-life L2 newscasts and orally reproduced their information in English (Task 1) and Chinese (Task 2). Both groups were compared in terms of recalled information (presence and correctness of idea units) and key control measures (e.g., attentional skills, speech rate). Across both tasks, information completeness was higher for early than late bilinguals. This occurred irrespective of attentional speed, speech rate, and additional relevant factors. Such results bridge the gap between classical memory paradigms and ecological designs in bilingualism research, illuminating how particular language profiles shape information processing in daily communicative scenarios.


Transformer-Based Methods for Neural Decoding

10.20944/preprints202108.0011.v1 ◽

2021 ◽

Author(s):

Haonan He

Keyword(s):

Quantitative Methods ◽

Data Augmentation ◽

Feature Space ◽

Neural Decoding ◽

Natural Languages ◽

Decoding Algorithms ◽

Complex Interactions ◽

Potential Applications ◽

Previous State ◽

Low Dimensional

Neural decoding from spiking activity is an essential tool for understanding the information encoded in neuronal populations, especially in applications such as brain-computer interfaces (BCIs). Various quantitative methods have been proposed, each showing advantages under different scenarios. From the machine learning perspective, the decoding task is to map high-dimensional spatial and temporal neuronal activity to low-dimensional physical quantities (e.g., velocity, position). Because of the complex interactions and the abundant dynamics among neural circuits, good decoding algorithms are usually capable of capturing flexible spatiotemporal structures embedded in the input feature space. Recently, Transformer-based models have become widely used in processing natural languages and images owing to their superior performance in handling long-range and global dependencies. Hence, in this work we examine the potential applications of Transformers in neural decoding and introduce two Transformer-based models. Besides adapting the Transformer to neuronal data, we also propose a data augmentation method for overcoming the data shortage issue. We test our models on three experimental datasets, and their performance is comparable to that of the previous state-of-the-art (SOTA) RNN-based methods. In addition, Transformer-based models show increased decoding performance when the input sequences are longer, while LSTM-based models deteriorate quickly. Our research suggests that Transformer-based models are important additions to the existing neural decoding solutions, especially for large datasets with long temporal dependencies.


Do we need statistics when we have linguistics?

DELTA Documentação de Estudos em Lingüística Teórica e Aplicada ◽

10.1590/s0102-44502002000200003 ◽

2002 ◽

Vol 18 (2) ◽

pp. 233-271 ◽

Cited By ~ 2

Author(s):

Pascual Cantos Gómez

Keyword(s):

Quantitative Research ◽

Quantitative Methods ◽

Function Analysis ◽

Language Models ◽

Linguistic Features ◽

Chi Square ◽

Natural Languages ◽

Statistical Language Models ◽

And Linguistics ◽

Frequency Counts

Statistics is known to be a quantitative approach to research. However, most of the research done in the fields of language and linguistics is of a different kind, namely qualitative. Succinctly, qualitative analysis differs from quantitative analysis in that in the former no attempt is made to assign frequencies, percentages, and the like to the linguistic features found or identified in the data. In quantitative research, linguistic features are classified and counted, and even more complex statistical models are constructed in order to explain these observed facts. In qualitative research, however, we use the data only for identifying and describing features of language usage and for providing real occurrences/examples of particular phenomena. In this paper, we shall try to show how quantitative methods and statistical techniques can supplement qualitative analyses of language. We shall attempt to present some mathematical and statistical properties of natural languages, and introduce some of the quantitative methods which are of the most value in working empirically with texts and corpora, illustrating the various issues with numerous examples and moving from the most basic descriptive techniques (frequency counts and percentages) to decision-taking techniques (chi-square and z-score) and to more sophisticated statistical language models (Type-Token/Lemma-Token/Lemma-Type formulae, cluster analysis and discriminant function analysis).


Litterfall as a niche construction process in a northern hardwood forest

Ecosphere ◽

10.1890/es14-00442.1 ◽

2015 ◽

Vol 6 (7) ◽

pp. art117 ◽

Cited By ~ 10

Author(s):

Seth W. Bigelow ◽

Charles D. Canham

Keyword(s):

Niche Construction ◽

Hardwood Forest ◽

Northern Hardwood Forest ◽

Construction Process ◽

Northern Hardwood


Rhythm in the Romance Languages

10.1093/acrefore/9780199384655.013.431 ◽

2021 ◽

Author(s):

Pier Marco Bertinetto

Keyword(s):

Speech Rate ◽

High Sensitivity ◽

Research Topic ◽

Romance Languages ◽

Future Research ◽

Common Denominator ◽

Close Inspection ◽

Natural Languages ◽

Speech Rhythm ◽

Comparative Results

Speech rhythm is a popular research topic but still a poorly understood phenomenon. A critical assessment of the algorithmic tools developed in the last two decades to analyze rhythm in natural languages shows that they can at best lead to a topological arrangement of the languages to be compared, with no ambition to actually offer objective and absolute measures. Besides, all available tools are heavily influenced by any source of variability, in particular: speech rate, speech style (most notably, spontaneous vs. read), and even speaker identity. Although this shows their high sensitivity to the input details, it raises severe doubts as to the actual relevance of the comparative results obtained in the study of different languages. Future research will have to learn to overcome these weaknesses. Most importantly, readers should be alerted to the false idol of a common Romance rhythmic footprint. Close inspection of the prosodic characteristics of the main Romance languages indicates that the differences are indeed remarkable and likely to feed diverging rhythmical behaviors. Besides, one should take into account the vast intrafamily variability, up to the tiniest local vernaculars, which often diverge in extraordinary ways from the ‘roof’ language supposed to constitute a sort of common denominator.


Social epistemic actions

Behavioral and Brain Sciences ◽

10.1017/s0140525x19002802 ◽

2020 ◽

Vol 43 ◽

Author(s):

Giovanni Pezzulo ◽

Laura Barca ◽

Domenico Maisto ◽

Francesco Donnarumma

Keyword(s):

Social Interaction ◽

Joint Action ◽

Niche Construction ◽

Cultural Practices ◽

Learning Processes ◽

Epistemic Actions ◽

Target Article ◽

Online Social Interaction

We consider the ways humans engage in social epistemic actions to guide each other's attention, prediction, and learning processes towards salient information, at the timescale of online social interaction and joint action. This parallels the active guidance of others' attention, prediction, and learning processes at the longer timescale of niche construction and cultural practices, as discussed in the target article.


The Successful Stuttering Management Program: A Preliminary Report on Outcomes

Perspectives on Fluency and Fluency Disorders ◽

10.1044/ffd20.1.20 ◽

2010 ◽

Vol 20 (1) ◽

pp. 20-25 ◽

Cited By ~ 2

Author(s):

Jim Tsiamtsiouris ◽

Kim Krieger

Keyword(s):

Preliminary Report ◽

Speech Rate ◽

Management Program ◽

Intensive Program ◽

Adults Who Stutter

The purpose of this study was to test the hypothesis that adults who stutter will exhibit significant improvements after attending a residential, 3-week intensive program that focuses on avoidance reduction and stuttering modification therapy. Preliminary analyses focused on four measures: (a) SSI-3, (b) speech rate, (c) S-24 Scale, and (d) OASES. Results indicated significant improvements on all of the measures.


Quantitative Methods of Data Analysis for the Physical Sciences and Engineering

10.1017/9781139342568 ◽

2018 ◽

Cited By ~ 3

Author(s):

Douglas G. Martinson

Keyword(s):

Data Analysis ◽

Quantitative Methods ◽

Physical Sciences
