Discrimination between Modal, Breathy and Pressed Voice for Single Vowels Using Neck-Surface Vibration Signals

2019 ◽  
Vol 9 (7) ◽  
pp. 1505 ◽  
Author(s):  
Zhengdong Lei ◽  
Evan Kennedy ◽  
Laura Fasanella ◽  
Nicole Yee-Key Li-Jessen ◽  
Luc Mongeau

The purpose of this study was to investigate the feasibility of using neck-surface acceleration signals to discriminate between modal, breathy and pressed voice. Voice data for five English single vowels were collected from 31 female native Canadian English speakers using a portable Neck Surface Accelerometer (NSA) and a condenser microphone. First, auditory-perceptual ratings were conducted by five clinically certified Speech Language Pathologists (SLPs) to categorize voice type using the audio recordings. Intra- and inter-rater analyses were used to determine the SLPs’ reliability for the perceptual categorization task. Mixed-type samples were screened out, and congruent samples were kept for the subsequent classification task. Second, features such as spectral harmonics, jitter, shimmer and spectral entropy were extracted from the NSA data. Supervised learning algorithms were used to map feature vectors to voice type categories. A feature wrapper strategy was used to evaluate the contribution of each feature or feature combination to the classification between different voice types. The results showed that the highest classification accuracy on a full set was 82.5%. Classification accuracy for breathy voice was notably higher (by approximately 12%) than for the other two voice types. Shimmer and spectral entropy were the metrics best correlated with classification accuracy.
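As a rough illustration of three of the features named in this abstract, the sketch below computes local jitter, local shimmer and a normalized spectral entropy using common textbook definitions. The abstract does not give the authors' exact formulas, so the function names, the "local" perturbation variant and the normalization to the range 0 to 1 are all assumptions made here for illustration.

```python
import math


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    Applied to glottal cycle periods this is local jitter; applied to
    per-cycle peak amplitudes it is local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def spectral_entropy(power_spectrum):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    if len(power_spectrum) < 2:
        raise ValueError("need at least two frequency bins")
    total = sum(power_spectrum)
    if total <= 0:
        raise ValueError("power spectrum must contain some energy")
    probs = [p / total for p in power_spectrum if p > 0]
    entropy = sum(-p * math.log2(p) for p in probs)
    # Dividing by log2(number of bins) maps a flat spectrum to 1.
    return entropy / math.log2(len(power_spectrum))


# Hypothetical cycle periods (seconds) and peak amplitudes, for illustration only.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [0.80, 0.78, 0.82, 0.79]
jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
flatness = spectral_entropy([1.0, 1.0, 1.0, 1.0])
```

With this normalization a perfectly flat spectrum scores 1 and a spectrum with all energy in one bin scores 0. Feature vectors built from such values would then be passed to a supervised classifier.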

2012 ◽  
Vol 12 ◽  
pp. 191-217 ◽  
Author(s):  
Christie Brien ◽  
Laura L. Sabourin

The processing of homonyms is complex considering homonyms have many lexical properties. For instance, train contains semantic (a locomotive/to instruct) and syntactic (noun/verb) properties, each affecting interpretation. Previous studies find homonym processing influenced by lexical frequency (Duffy et al. 1988) as well as syntactic and semantic context (Folk & Morris 2003; Swinney 1979; Tanenhaus et al. 1979). This cross-modal lexical-decision study investigates second language (L2) effects on homonym processing in the first language (L1). Participants were monolingual English speakers and Canadian English/French bilinguals who acquired L2 French at distinct periods. The early bilinguals revealed no significant differences compared to monolinguals (p = .219) supporting the Reordered Access Model (Duffy et al. 1988). However, the late bilinguals revealed longer reaction times, syntactic priming effects (p < .001), and lexical frequency effects (p < .001), suggesting a heightened sensitivity to surface cues influencing homonym processing in the L1 due to a newly-acquired L2 (Cook 2003).


Author(s):  
Melissa Bettoni ◽  
Priscilla Rizzi

The present study aimed at investigating students’ perceptions about the study of pronunciation and the comprehensibility of their speech. Twenty-four English-speaking Brazilians at the advanced level or higher had audio recordings of their sentences judged by four English speakers of different nationalities representing the three circles in Kachru’s World Englishes Model (1985). Comprehensibility, accentedness, number of mispronunciations at the segmental level (such as palatalization, voicing, devoicing, epenthesis), native language of the judge, and perceptions about the study of pronunciation were tabulated and compared quantitatively and qualitatively. Results indicated positive correlations among better comprehensibility, less accentedness and fewer mispronunciations at the segmental level, and between comprehensibility and the desire for specific knowledge regarding pronunciation.


2018 ◽  
Vol 26 (1) ◽  
pp. 120-128 ◽  
Author(s):  
Andrew Peterson ◽  
Arthur Spirling

Measuring the polarization of legislators and parties is a key step in understanding how politics develops over time. But in parliamentary systems—where ideological positions estimated from roll calls may not be informative—producing valid estimates is extremely challenging. We suggest a new measurement strategy that makes innovative use of the “accuracy” of machine classifiers, i.e., the number of correct predictions made as a proportion of all predictions. In our case, the “labels” are the party identifications of the members of parliament, predicted from their speeches along with some information on debate subjects. Intuitively, when the learner is able to discriminate members in the two main Westminster parties well, we claim we are in a period of “high” polarization. By contrast, when the classifier has low accuracy—and makes a relatively large number of mistakes in terms of allocating members to parties based on the data—we argue parliament is in an era of “low” polarization. This approach is fast and substantively valid, and we demonstrate its merits with simulations, and by comparing the estimates from 78 years of House of Commons speeches with qualitative and quantitative historical accounts of the same. As a headline finding, we note that contemporary British politics is approximately as polarized as it was in the mid-1960s—that is, in the middle of the “postwar consensus”. More broadly, we show that the technical performance of supervised learning algorithms can be directly informative about substantive matters in social science.
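The accuracy measure defined in this abstract (correct predictions as a proportion of all predictions) can be sketched in a few lines. The party labels and session names below are made-up toy data, and `polarization_by_session` is a hypothetical helper name. The classifier that would produce the predictions from speeches is not shown.

```python
def accuracy(true_labels, predicted_labels):
    """Number of correct predictions as a proportion of all predictions."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def polarization_by_session(sessions):
    """Map each session name to classifier accuracy, read as a polarization proxy."""
    return {name: accuracy(true, pred) for name, (true, pred) in sessions.items()}


# Toy data for illustration only: (true party, predicted party) per member.
sessions = {
    "session_a": (["Lab", "Con", "Con", "Lab"], ["Lab", "Con", "Con", "Lab"]),
    "session_b": (["Lab", "Con", "Con", "Lab"], ["Lab", "Con", "Lab", "Con"]),
}
scores = polarization_by_session(sessions)
```

Under the abstract's reading, the session where the classifier separates the two parties more accurately would be treated as the more polarized one.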


2020 ◽  
pp. 1-26
Author(s):  
Michael Friesner ◽  
Laura Kastronic ◽  
Jeffrey Lamontagne

This study compares the effects of city and ethnicity with respect to Quebec English speakers’ participation in two ongoing changes affecting /æ/ in Canadian English: retraction as part of the Canadian Shift and tensing in prenasal environments. Quebec English speakers might be expected to differ in their behavior with regard to these two phenomena as compared to other Canadian English speakers. Based on an analysis of Cartesian distances and a mixed-effects model using spontaneous speech, the authors find that Quebec English speakers are less advanced with respect to the Canadian Shift, especially speakers from Quebec City. For tensing, British-origin speakers from Montreal and Quebec City are found to pattern similarly, participating in the more widespread patterning, while Jewish and Italian speakers are moving in the opposite direction. The authors argue that this move away from characteristically Canadian patterns is an artefact of the interplay between the two phenomena under study, reflective of differential replication of the Canadian Shift in the two environments.


2011 ◽  
Vol 23 (11) ◽  
pp. 3331-3342 ◽  
Author(s):  
Viktor Kharlamov ◽  
Kenneth Campbell ◽  
Nina Kazanina

Speech sounds are not always perceived in accordance with their acoustic–phonetic content. For example, an early and automatic process of perceptual repair, which ensures conformity of speech inputs to the listener's native language phonology, applies to individual input segments that do not exist in the native inventory or to sound sequences that are illicit according to the native phonotactic restrictions on sound co-occurrences. The present study with Russian and Canadian English speakers shows that listeners may perceive phonetically distinct and licit sound sequences as equivalent when the native language system provides robust evidence for mapping multiple phonetic forms onto a single phonological representation. In Russian, due to an optional but productive t-deletion process that affects /stn/ clusters, the surface forms [sn] and [stn] may be phonologically equivalent and map to a single phonological form /stn/. In contrast, [sn] and [stn] clusters are usually phonologically distinct in (Canadian) English. Behavioral data from identification and discrimination tasks indicated that [sn] and [stn] clusters were more confusable for Russian than for English speakers. The EEG experiment employed an oddball paradigm with nonwords [asna] and [astna] used as the standard and deviant stimuli. A reliable mismatch negativity response was elicited approximately 100 msec postchange in the English group but not in the Russian group. These findings point to a perceptual repair mechanism that is engaged automatically at a prelexical level to ensure immediate encoding of speech inputs in phonological terms, which in turn enables efficient access to the meaning of a spoken utterance.


Linguistics ◽  
2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Ksenia Gnevsheva

Multiple studies demonstrate that social and linguistic information is connected in speech perception such that the priming of a social category will affect listeners’ linguistic behavior. At the same time, the degree to which social information is relied upon during speech perception is less well understood. The current study investigates whether priming of a country affects the perceived degree of foreign accent and whether this effect varies across different social groups. Two groups of bilinguals (one dominant in Russian, another dominant in English) listened to audio recordings of monolingual and bilingual (also either dominant in Russian or English) speakers and rated the degree of their foreign accentedness in English and Russian. The recordings were divided by topic: neutral, Russia-related, and Australia-related. Statistical analysis revealed a significant effect of topic: Russia-related clips were rated as more foreign-accented in English by bilinguals dominant in Russian, and Australia-related clips were rated as less foreign-accented in Russian when produced by bilinguals dominant in English. The variation is explained through listeners’ using social information more when the linguistic information is less reliable.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-14 ◽  
Author(s):  
Lei Yan ◽  
Tao Zhen ◽  
Jian-Lei Kong ◽  
Lian-Ming Wang ◽  
Xiao-Lei Zhou

Human gait phase recognition is a significant technology for rehabilitation training robots, human disease diagnosis, artificial prostheses, and so on. The efficient design of the recognition method for gait information is the key issue in current research on gait phase division and eigenvalue extraction. In this paper, a novel voting-weighted integrated neural network (VWI-DNN) is proposed to detect different gait phases from multidimensional acceleration signals. More specifically, it first employs a gait information acquisition system to collect data from different IMU sensors fixed on the human lower limb. Then, with dimensionality reduction and four-phase division preprocessing, key features are selected and merged as unified vectors to learn common and domain knowledge in the time domain. Next, multiple refined DNNs are transferred to design a multistream integrated neural network, which utilizes mixture-granularity information to exploit high-dimensional feature representations. Finally, a voting-weighted function is developed to fuse different submodels into a unified representation for distinguishing small discrepancies among different gait phases. The end-to-end implementation of the VWI-DNN model is fine-tuned by loss optimization with gradient back-propagation. Experimental results demonstrate that the proposed method outperforms the other methods, with classification accuracy and macro-F1 reaching up to 99.5%. Further discussion indicates potential applications in combination with other works.
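The abstract does not spell out the voting-weighted function, so the sketch below shows only a generic weighted soft vote, which is one common way to fuse several submodels: each submodel's class probabilities are scaled by a weight, summed, and the highest fused score wins. The function name, the weights and the probability values are hypothetical, and this should not be read as the VWI-DNN fusion rule itself.

```python
def weighted_vote(submodel_probs, weights):
    """Fuse per-class probabilities from several submodels by a weighted average.

    Returns the fused probability list and the index of the winning class.
    """
    if not submodel_probs or len(submodel_probs) != len(weights):
        raise ValueError("need one weight per submodel")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(submodel_probs[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, submodel_probs)) / total_weight
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=fused.__getitem__)
    return fused, best


# Hypothetical outputs of three submodels over four gait phases.
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
]
fused_equal, phase_equal = weighted_vote(probs, [1.0, 1.0, 1.0])
fused_skewed, phase_skewed = weighted_vote(probs, [3.0, 1.0, 1.0])
```

With equal weights the two agreeing submodels decide the phase. Giving the first submodel three times the weight lets it override them, which shows how the weights control each submodel's influence.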


Author(s):  
Sonja Frazier

This research project was conducted as a pilot study to explore how pitch accent is used by NCES (Native Canadian English Speakers) and MLE (Mandarin Learners of English). Pitch accents are the prominent high or low tones that are predominantly found on content words (N, V, Adv, Adj, etc.) in English. In order to compare how both speech communities use pitch accent in English, participants were given an EI (Elicited Imitation) Task. The EI involved participants hearing and then repeating a sentence. It is also reconstructive in nature, meaning that the participants process the sentence, then reconstruct it with their own grammar, and finally reproduce it. The results showed that Mandarin speakers had more pitch accents than English speakers, adding pitch accents on function words (Art, Pro, Prep, etc.) as well. The results also demonstrated that Mandarin speakers had fewer creaky words (words said in a very low pitch, also known as laryngealization or vocal fry) than the English participants. Implications of this study concern ESL education, such as whether English pitch accent patterns and creak should be taught to English language learners.


Author(s):  
Qingjun Wang ◽  
Yibo Li ◽  
Xueping Liu

Fatigued driving causes increasingly serious harm, but because it has many causes, detecting driver fatigue remains difficult. This paper presents an EEG-based method for detecting driver fatigue. Unlike other research on fatigued driving, it focuses on the features of EEG signals in the fatigue state and uses wavelet entropy as the feature extraction method, comparing wavelet entropy and spectral entropy features and their classification accuracy under the same classifier. An SVM is used as the classifier. The accuracy of driver fatigue state monitoring using wavelet entropy is 90.7%, higher than the 81.3% obtained with spectral entropy as the feature. The results show that the frequency characteristics of EEG can be applied effectively to driver fatigue detection, but that the choice of frequency feature calculation method affects classification accuracy.
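Wavelet entropy is commonly defined as the Shannon entropy of the relative energy in each wavelet decomposition band. The sketch below follows that definition with a hand-written Haar transform. The choice of the Haar wavelet, the number of levels and the toy signal are assumptions, since the abstract does not state which wavelet or decomposition depth the authors used.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    root2 = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in pairs]
    return approx, detail


def wavelet_entropy(signal, levels):
    """Shannon entropy (bits) of relative band energies after `levels` Haar steps."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    # The final approximation band carries the remaining low-frequency energy.
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    if total <= 0:
        raise ValueError("signal must contain some energy")
    return sum(-(e / total) * math.log2(e / total) for e in energies if e > 0)


# Toy signals for illustration only, not EEG data.
impulse_entropy = wavelet_entropy([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 3)
constant_entropy = wavelet_entropy([1.0] * 8, 3)
```

A constant signal puts all its energy in the final approximation band, so its entropy is zero, while an impulse spreads energy across every band and scores higher. In a pipeline like the one described, one such value per EEG segment would be fed to the SVM.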


2020 ◽  
Vol 48 (1) ◽  
pp. 31-71
Author(s):  
Charles Boberg

Previous research has shown that Canadian English displays a unique pattern of nativizing the stressed vowel of foreign words spelled with the letter <a>, like lava, pasta, and spa, known as foreign (a), with more use of /æ/ (the trap vowel) and less use of /ah/ (the palm vowel) than American English. This paper analyzes one hundred examples of foreign (a), produced by sixty-one Canadian and thirty-one American English-speakers, in order to shed more light on this pattern and its current development. Acoustic analysis is used to determine whether each participant assigns each vowel to English /æ/, /ah/, or an intermediate category between /æ/ and /ah/. It reports that the Canadian pattern, though still distinct, is converging with the American pattern, in that Canadians now use slightly more /ah/ than /æ/; that men appear to lead this change but this is because they participate less than women do in the Short Front (Canadian) Vowel Shift; that intermediate vowel assignments are comparatively rare, suggesting that a new low-central vowel phoneme is not emerging; that the Canadian tendency toward American pronunciation is not well aligned with overt attitudes toward the United States and American English; and that the national differences in foreign (a) assignment result not from structural, phonological differences between the dialects so much as from a complex set of sociocultural factors.

