Auditory–Visual Speech Integration in Bipolar Disorder: A Preliminary Study

Languages ◽  
2018 ◽  
Vol 3 (4) ◽  
pp. 38 ◽  
Author(s):  
Arzu Yordamlı ◽  
Doğu Erdener

This study aimed to investigate how individuals with bipolar disorder integrate auditory and visual speech information compared to healthy individuals. Furthermore, we wanted to see whether there were any differences between manic and depressive episode bipolar disorder patients with respect to auditory and visual speech integration. It was hypothesized that the bipolar group’s auditory–visual speech integration would be weaker than that of the control group. Further, it was predicted that those in the manic phase of bipolar disorder would integrate visual speech information more robustly than their depressive phase counterparts. To examine these predictions, a McGurk effect paradigm with an identification task was used with typical auditory–visual (AV) speech stimuli. Additionally, auditory-only (AO) and visual-only (VO, lip-reading) speech perceptions were also tested. The dependent variable for the AV stimuli was the amount of visual speech influence. The dependent variables for AO and VO stimuli were accurate modality-based responses. Results showed that the disordered and control groups did not differ in AV speech integration and AO speech perception. However, there was a striking difference in favour of the healthy group with respect to the VO stimuli. The results suggest the need for further research whereby both behavioural and physiological data are collected simultaneously. This will help us understand the full dynamics of how auditory and visual speech information are integrated in people with bipolar disorder.
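As a rough illustration of how the AV dependent variable (amount of visual speech influence) can be quantified, the sketch below scores McGurk-style identification responses. The response coding and the assumption that auditory /ba/ is dubbed onto visual /ga/ are illustrative only and not taken from the study.

```python
# Minimal sketch of scoring a McGurk identification task.
# Assumption (not from the study): each AV trial pairs auditory /ba/ with
# visual /ga/, and a "da" (fusion) or "ga" (visual) response counts as
# visually influenced.
from collections import Counter

def visual_influence_rate(responses):
    """Proportion of AV trials whose response shows visual influence."""
    visually_influenced = {"da", "ga"}
    counts = Counter(r.lower() for r in responses)
    influenced = sum(counts[r] for r in visually_influenced)
    return influenced / max(len(responses), 1)

# Example: 10 AV trials from one hypothetical participant
trials = ["ba", "da", "da", "ga", "ba", "da", "ba", "da", "ba", "da"]
print(f"Visual influence: {visual_influence_rate(trials):.0%}")  # 60%
```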


2011 ◽  
Vol 23 (1) ◽  
pp. 221-237 ◽  
Author(s):  
Ingo Hertrich ◽  
Susanne Dietrich ◽  
Hermann Ackermann

During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream—prior to its fusion with auditory phonological features [Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. Time course of early audiovisual interactions during speech and non-speech central-auditory processing: An MEG study. Journal of Cognitive Neuroscience, 21, 259–274, 2009]. Using functional magnetic resonance imaging, the present follow-up study aims to further elucidate the topographic distribution of visual–phonological operations and audiovisual (AV) interactions during speech perception. Ambiguous acoustic syllables—disambiguated to /pa/ or /ta/ by the visual channel (speaking face)—served as test materials, concomitant with various control conditions (nonspeech AV signals, visual-only and acoustic-only speech, and nonspeech stimuli). (i) Visual speech yielded an AV-subadditive activation of primary auditory cortex and the anterior superior temporal gyrus (STG), whereas the posterior STG responded both to speech and nonspeech motion. (ii) The inferior frontal and the fusiform gyrus of the right hemisphere showed a strong phonetic/phonological impact (differential effects of visual /pa/ vs. /ta/) upon hemodynamic activation during presentation of speaking faces. Taken together with the previous MEG data, these results point at a dual-pathway model of visual speech information processing: On the one hand, access to the auditory system via the anterior supratemporal “what” path may give rise to direct activation of “auditory objects.” On the other hand, visual speech information seems to be represented in a right-hemisphere visual working memory, providing a potential basis for later interactions with auditory information such as the McGurk effect.


1997 ◽  
Vol 40 (2) ◽  
pp. 432-443 ◽  
Author(s):  
Karen S. Helfer

Research has shown that speaking in a deliberately clear manner can improve the accuracy of auditory speech recognition. Allowing listeners access to visual speech cues also enhances speech understanding. Whether the nature of information provided by speaking clearly and by using visual speech cues is redundant has not been determined. This study examined how speaking mode (clear vs. conversational) and presentation mode (auditory vs. auditory-visual) influenced the perception of words within nonsense sentences. In Experiment 1, 30 young listeners with normal hearing responded to videotaped stimuli presented audiovisually in the presence of background noise at one of three signal-to-noise ratios. In Experiment 2, 9 participants returned for an additional assessment using auditory-only presentation. Results of these experiments showed significant effects of speaking mode (clear speech was easier to understand than was conversational speech) and presentation mode (auditory-visual presentation led to better performance than did auditory-only presentation). The benefit of clear speech was greater for words occurring in the middle of sentences than for words at either the beginning or end of sentences for both auditory-only and auditory-visual presentation, whereas the greatest benefit from supplying visual cues was for words at the end of sentences spoken both clearly and conversationally. The total benefit from speaking clearly and supplying visual cues was equal to the sum of each of these effects. Overall, the results suggest that speaking clearly and providing visual speech information provide complementary (rather than redundant) information.
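To make the additivity claim concrete, the snippet below works through hypothetical numbers; the percentage values are made up purely to illustrate the pattern and are not results from the study.

```python
# Hypothetical illustration of additive benefits (values are made up).
conversational_ao = 0.50   # baseline: conversational speech, auditory-only
clear_benefit     = 0.12   # gain from clear speech alone
visual_benefit    = 0.15   # gain from adding visual cues alone

# "Total benefit equals the sum of each effect" implies the combined
# condition (clear speech, auditory-visual) is roughly baseline + both gains.
clear_av_predicted = conversational_ao + clear_benefit + visual_benefit
print(f"Predicted clear AV score: {clear_av_predicted:.2f}")  # 0.77
```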


Author(s):  
Doğu Erdener

Speech perception has long been taken for granted as an auditory-only process. However, it is now firmly established that speech perception is an auditory-visual process in which visual speech information, in the form of lip and mouth movements, is taken into account. Traditionally, foreign language (L2) instructional methods and materials are auditory-based. This chapter presents a general framework of evidence that visual speech information will facilitate L2 instruction. The author claims that this knowledge will help bridge the gap between psycholinguistics and L2 instruction as an applied field. The chapter also describes how orthography can be used in L2 instruction. While learners from a transparent L1 orthographic background can decipher the phonology of orthographically transparent L2s, overriding the visual speech information, that is not the case for those from orthographically opaque L1s.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Raphaël Thézé ◽  
Mehdi Ali Gadiri ◽  
Louis Albert ◽  
Antoine Provost ◽  
Anne-Lise Giraud ◽  
...  

Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized with computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are judicious, and that they can supplement natural speech with higher control over stimulus timing and content.
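A minimal sketch of how illusion rates per noise and lag condition might be tabulated from trial-level data is shown below; the column names and the pandas-based layout are assumptions for illustration, not the authors' analysis code.

```python
# Sketch: tabulating illusion rates by background noise and audiovisual lag.
# The trial-level data layout (columns and values) is assumed for illustration.
import pandas as pd

trials = pd.DataFrame({
    "noise_db": [0, 0, 0, 10, 10, 10, 10, 10],
    "lag_ms":   [0, 0, 100, 0, 0, 100, 100, 100],
    "heard_v":  [0, 1, 0, 1, 1, 0, 1, 1],   # 1 = illusory /v/ percept reported
})

# Mean of the binary outcome per cell = illusion rate for that condition
rates = trials.groupby(["noise_db", "lag_ms"])["heard_v"].mean()
print(rates)
```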


2014 ◽  
Vol 1079-1080 ◽  
pp. 820-823
Author(s):  
Li Guo Zheng ◽  
Mei Li Zhu ◽  
Qing Qing Wang

This paper proposes a novel lip feature extraction algorithm to improve the efficiency and robustness of lip-reading systems. First, the Lip Gray Energy Image (LGEI) is used to smooth noise and improve the noise resistance of the system. Second, the Discrete Wavelet Transform (DWT) is used to extract salient visual speech information from the lip region by decorrelating spectral information. Last, lip features are obtained by downsampling the data from the second step; this resampling effectively reduces the amount of computation. Experimental results show that the method is highly discriminative, accurate, and computationally efficient, with a precision rate of up to 96%.
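The pipeline described above (LGEI, then DWT, then downsampling) could be sketched as follows using NumPy and PyWavelets; the wavelet choice, decomposition depth, and output size are assumptions, not parameters reported in the paper.

```python
# Sketch of the described pipeline: Lip Gray Energy Image (LGEI) -> 2D DWT ->
# downsampling. Wavelet and target size are illustrative assumptions.
import numpy as np
import pywt

def lip_features(lip_frames, wavelet="haar", keep=(16, 16)):
    """lip_frames: sequence of grayscale lip-region images (H x W arrays)."""
    stack = np.stack([f.astype(np.float64) for f in lip_frames])
    lgei = stack.mean(axis=0)                    # gray energy image smooths noise
    approx, _details = pywt.dwt2(lgei, wavelet)  # keep the low-frequency sub-band
    # Downsample the approximation coefficients to a fixed-size feature map
    step_r = max(approx.shape[0] // keep[0], 1)
    step_c = max(approx.shape[1] // keep[1], 1)
    return approx[::step_r, ::step_c].ravel()

# Example with synthetic frames standing in for a cropped lip-region sequence
frames = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(20)]
print(lip_features(frames).shape)
```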

