Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche

2019 ◽  
Vol 5 (9) ◽  
pp. eaaw2594 ◽  
Author(s):  
Christophe Coupé ◽  
Yoon Oh ◽  
Dan Dediu ◽  
François Pellegrino

Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages’ structural properties and their speakers’ neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture.
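The coupling described above can be illustrated with a short sketch: information rate (bits/s) is information per syllable (bits/syllable) multiplied by speech rate (syllables/s). The function names and the two example languages are hypothetical, and the numbers are illustrative rather than taken from the study. The entropy helper uses the simplest unigram Shannon formula, which is an assumption and not necessarily the estimator the authors used.

```python
import math


def shannon_entropy(probabilities):
    """Unigram Shannon entropy in bits for a probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information_rate(bits_per_syllable, syllables_per_second):
    """Information rate in bits/s: information density times speech rate."""
    return bits_per_syllable * syllables_per_second


# Hypothetical languages: (bits per syllable, syllables per second).
# Illustrative values only, not measurements from the study.
languages = {
    "dense_slow": (7.0, 5.5),
    "sparse_fast": (5.0, 7.8),
}

rates = {
    name: information_rate(bits, speed)
    for name, (bits, speed) in languages.items()
}

# A toy inventory of four equally likely syllables carries 2 bits per syllable.
uniform_entropy = shannon_entropy([0.25, 0.25, 0.25, 0.25])
```

In this toy setup the two languages differ substantially in density and in speed, yet their products land close together, which is the kind of trade-off the abstract reports.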

2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

A new deep learning-based classification model called the Stochastic Dilated Residual Ghost (SDRG) was proposed in this work for categorizing histopathology images of breast cancer. The SDRG model used the proposed Multiscale Stochastic Dilated Convolution (MSDC) model, a ghost unit, and stochastic upsampling and downsampling units to categorize breast cancer accurately. This study addresses four primary issues. First, stain normalization was used to manage color divergence, and data augmentation with several factors was used to handle overfitting. The second challenge is extracting and enhancing tiny and low-level information such as edge, contour, and color accuracy; this is done by the proposed multiscale stochastic and dilation unit. The third contribution is removing redundant or similar information from the convolutional neural network using a ghost unit. According to the assessment findings, the SDRG model achieved an overall accuracy of 95.65 percent in categorizing images, with a precision of 99.17 percent, superior to state-of-the-art approaches.
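As a rough illustration of the general idea behind multiscale dilated convolution, the sketch below applies one kernel at several dilation rates to a 1D signal. It is not the SDRG or MSDC architecture, whose details the abstract does not give; the function names, the kernel, and the dilation rates are all assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1D dilated convolution (cross-correlation convention).

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field grows without adding parameters.
    """
    span = (len(kernel) - 1) * dilation + 1  # input samples covered per output
    out_len = len(signal) - span + 1
    if out_len <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(out_len)
    ]


def multiscale_features(signal, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation rates (hypothetical helper)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


# A linear ramp and a simple difference kernel: each output equals
# signal[i] - signal[i + 2 * dilation], so larger dilations respond to
# change over a wider span of the input.
signal = [float(x) for x in range(10)]
features = multiscale_features(signal, kernel=[1.0, 0.0, -1.0])
```

Larger dilation rates cover a wider context with the same three weights, which is why dilated kernels are commonly used to combine fine and coarse detail.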


2021 ◽  
Vol 12 ◽  
Author(s):  
Isabelle Chou ◽  
Jiehui Hu ◽  
Edinson Muñoz ◽  
Adolfo M. García

Bilingualism research indicates that verbal memory skills are sensitive to age of second language (L2) acquisition (AoA). However, most tasks employ disconnected, decontextualized stimuli, undermining ecological validity. Here, we assessed whether AoA impacts the ability to recall information from naturalistic discourse in single-language and cross-linguistic tasks. Twenty-four early and 25 late Chinese-English bilinguals listened to real-life L2 newscasts and orally reproduced their information in English (Task 1) and Chinese (Task 2). Both groups were compared in terms of recalled information (presence and correctness of idea units) and key control measures (e.g., attentional skills, speech rate). Across both tasks, information completeness was higher for early than late bilinguals. This occurred irrespective of attentional speed, speech rate, and additional relevant factors. Such results bridge the gap between classical memory paradigms and ecological designs in bilingualism research, illuminating how particular language profiles shape information processing in daily communicative scenarios.


Author(s):  
Haonan He

Neural decoding from spiking activity is an essential tool for understanding the information encoded in neuronal populations, especially in applications such as brain-computer interfaces (BCIs). Various quantitative methods have been proposed, each showing advantages under different scenarios. From the machine learning perspective, the decoding task is to map high-dimensional spatiotemporal neuronal activity to low-dimensional physical quantities (e.g., velocity, position). Because of the complex interactions and abundant dynamics among neural circuits, good decoding algorithms can usually capture flexible spatiotemporal structures embedded in the input feature space. Recently, Transformer-based models have been widely used in processing natural language and images because of their superior performance in handling long-range and global dependencies. Hence, in this work we examine the potential applications of Transformers in neural decoding and introduce two Transformer-based models. Besides adapting the Transformer to neuronal data, we also propose a data augmentation method to overcome data shortage. We test our models on three experimental datasets, and their performance is comparable to that of previous state-of-the-art (SOTA) RNN-based methods. In addition, Transformer-based models show improved decoding performance when the input sequences are longer, while LSTM-based models deteriorate quickly. Our research suggests that Transformer-based models are important additions to the existing neural decoding solutions, especially for large datasets with long temporal dependencies.


Author(s):  
Pascual Cantos Gómez

Statistics is known to be a quantitative approach to research. However, most of the research done in the fields of language and linguistics is of a different kind, namely qualitative. Succinctly, qualitative analysis differs from quantitative analysis in that in the former no attempt is made to assign frequencies, percentages, and the like to the linguistic features found or identified in the data. In quantitative research, linguistic features are classified and counted, and even more complex statistical models are constructed in order to explain these observed facts. In qualitative research, however, we use the data only for identifying and describing features of language usage and for providing real occurrences/examples of particular phenomena. In this paper, we shall try to show how quantitative methods and statistical techniques can supplement qualitative analyses of language. We shall attempt to present some mathematical and statistical properties of natural languages, and introduce some of the quantitative methods which are of the most value in working empirically with texts and corpora, illustrating the various issues with numerous examples and moving from the most basic descriptive techniques (frequency counts and percentages) to decision-taking techniques (chi-square and z-score) and to more sophisticated statistical language models (Type-Token/Lemma-Token/Lemma-Type formulae, cluster analysis and discriminant function analysis).
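Three of the techniques named above can be sketched in a few lines: a type-token ratio, a z-score, and a Pearson chi-square statistic for a contingency table. These are the textbook formulas, assumed here for illustration; the paper's own formulations may differ, and the sample sentence and counts are invented.

```python
def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by running words (tokens)."""
    return len(set(tokens)) / len(tokens)


def z_score(value, mean, std_dev):
    """Distance of a value from the mean, in standard deviations."""
    return (value - mean) / std_dev


def chi_square(observed):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Invented toy sentence: 11 tokens, 5 types.
tokens = "the cat saw the dog and the dog saw the cat".split()
ttr = type_token_ratio(tokens)

# Invented counts of one feature (present / absent) in two corpora.
chi2 = chi_square([[10, 20], [20, 10]])

# A frequency of 12 against a reference mean of 10 and standard deviation of 4.
z = z_score(12, 10, 4)
```

The chi-square value would then be compared against a critical value for the table's degrees of freedom (one, for a 2x2 table) to decide whether the difference between corpora is likely to be due to chance.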


Author(s):  
Pier Marco Bertinetto

Speech rhythm is a popular research topic but still a poorly understood phenomenon. A critical assessment of the algorithmic tools developed in the last two decades to analyze rhythm in natural languages shows that they can at best lead to a topological arrangement of the languages to be compared, with no ambition to actually offer objective and absolute measures. Besides, all available tools are heavily influenced by any source of variability, in particular speech rate, speech style (most notably, spontaneous vs. read), and even speaker identity. Although this shows their high sensitivity to the input details, it raises severe doubts as to the actual relevance of the comparative results obtained in the study of different languages. Future research will have to learn to overcome these weaknesses. Most importantly, readers should be alerted to the false idol of a common Romance rhythmic footprint. Close inspection of the prosodic characteristics of the main Romance languages indicates that the differences are indeed remarkable and likely to feed diverging rhythmical behaviors. Besides, one should take into account the vast intrafamily variability, up to the tiniest local vernaculars, which often diverge in extraordinary ways from the ‘roof’ language supposed to constitute a sort of common denominator.


2020 ◽  
Vol 43 ◽  
Author(s):  
Giovanni Pezzulo ◽  
Laura Barca ◽  
Domenico Maisto ◽  
Francesco Donnarumma

We consider the ways humans engage in social epistemic actions to guide each other's attention, prediction, and learning processes towards salient information, at the timescale of online social interaction and joint action. This parallels the active guidance of others' attention, prediction, and learning processes at the longer timescale of niche construction and cultural practices, as discussed in the target article.


2010 ◽  
Vol 20 (1) ◽  
pp. 20-25 ◽  
Author(s):  
Jim Tsiamtsiouris ◽  
Kim Krieger

The purpose of this study was to test the hypothesis that adults who stutter will exhibit significant improvements after attending a residential, 3-week intensive program that focuses on avoidance reduction and stuttering modification therapy. Preliminary analyses focused on four measures: (a) SSI-3, (b) speech rate, (c) S-24 Scale, and (d) OASES. Results indicated significant improvements on all of the measures.

