New evidence for audiovisual speech scene analysis: Low level interaction between auditory streaming and visual cues in speech perception

2012 ◽  
Vol 131 (4) ◽  
pp. 3269-3269
Author(s):  
Frédéric Berthommier ◽  
Jean-Luc Schwartz
2019 ◽  
Author(s):  
Kristin J. Van Engen ◽  
Avanti Dey ◽  
Mitchell Sommers ◽  
Jonathan E. Peelle

Although listeners use both auditory and visual cues during speech perception, the cognitive and neural bases for their integration remain a matter of debate. One common approach to measuring multisensory integration is to use McGurk tasks, in which discrepant auditory and visual cues produce auditory percepts that differ from those based solely on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences in susceptibility are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we argue that McGurk tasks are ill-suited for studying the kind of multisensory speech perception that occurs in real life: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility on McGurk tasks does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication.
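The susceptibility measure discussed in this abstract is typically operationalized as the proportion of incongruent (McGurk) trials on which a listener reports a fused percept, which can then be related to performance on natural audiovisual speech. As a minimal, hypothetical sketch of that operationalization and of the correlation analysis the abstract alludes to, the Python code below computes per-listener susceptibility from invented trial-level responses; all data and field names are illustrative assumptions, not material from the paper.

```python
# Hypothetical sketch: McGurk susceptibility as the proportion of
# incongruent trials that elicit a fused percept, then its correlation
# with an (assumed) natural audiovisual speech recognition score.
from statistics import correlation  # Python 3.10+

# Invented trial-level data: listener -> responses on incongruent
# auditory "ba" + visual "ga" trials ("da"/"tha" count as fused percepts).
trials = {
    "L01": ["da", "ba", "da", "da", "ba"],
    "L02": ["ba", "ba", "ba", "da", "ba"],
    "L03": ["da", "da", "da", "da", "tha"],
}
FUSED = {"da", "tha"}

susceptibility = {
    listener: sum(r in FUSED for r in resp) / len(resp)
    for listener, resp in trials.items()
}

# Assumed keyword-recognition accuracy on natural audiovisual sentences.
natural_score = {"L01": 0.82, "L02": 0.79, "L03": 0.85}

listeners = sorted(trials)
r = correlation(
    [susceptibility[l] for l in listeners],
    [natural_score[l] for l in listeners],
)
print(f"Pearson r between susceptibility and natural AV score: {r:.2f}")
```

The abstract's argument is that, on real data, this correlation turns out to be weak, which is why the authors question McGurk susceptibility as a proxy for everyday audiovisual integration.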


2015 ◽  
Vol 75 ◽  
pp. 402-410 ◽  
Author(s):  
Philip Jaekl ◽  
Ana Pesquita ◽  
Agnes Alsius ◽  
Kevin Munhall ◽  
Salvador Soto-Faraco

2020 ◽  
Vol 63 (7) ◽  
pp. 2245-2254 ◽  
Author(s):  
Jianrong Wang ◽  
Yumeng Zhu ◽  
Yu Chen ◽  
Abdilbar Mamat ◽  
Mei Yu ◽  
...  

Purpose The primary purpose of this study was to explore the audiovisual speech perception strategies adopted by normal-hearing and deaf people in processing familiar and unfamiliar languages. Our primary hypothesis was that the two groups would adopt different perception strategies because of differences in early sensory experience, the limitations of hearing devices, and gaps in language development, among other factors. Method Thirty normal-hearing adults and 33 prelingually deaf adults participated in the study. They were asked to perform judgment and listening tasks while watching videos of a Uygur–Mandarin bilingual speaker in a familiar language (Standard Chinese) or an unfamiliar language (Modern Uygur), and their eye movements were recorded with eye-tracking technology. Results Task had only a slight influence on the distribution of selective attention, whereas subject group and language had significant influences. Specifically, the normal-hearing and the deaf participants mainly gazed at the speaker's eyes and mouth, respectively; moreover, whereas the normal-hearing participants stared longer at the speaker's mouth when confronted with the unfamiliar language Modern Uygur, the deaf participants did not change their attention allocation pattern between the two languages. Conclusions Normal-hearing and deaf adults adopt different audiovisual speech perception strategies: Normal-hearing adults mainly look at the eyes, and deaf adults mainly look at the mouth. Additionally, language and task can also modulate the speech perception strategy.
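The attention-allocation measure this abstract reports is commonly computed as the proportion of total dwell time that fixations spend inside each area of interest (AOI), such as the speaker's eyes and mouth. The Python sketch below is a minimal, hypothetical illustration of that computation from fixation records; the AOI boxes, field names, and sample fixations are invented, not taken from the study.

```python
# Hypothetical sketch: proportion of dwell time per area of interest (AOI),
# the kind of gaze measure compared across groups and languages above.

# Invented AOIs as (x_min, y_min, x_max, y_max) boxes on the video frame.
AOIS = {"eyes": (300, 120, 500, 200), "mouth": (330, 260, 470, 330)}

# Invented fixation records: (x, y, duration_ms).
fixations = [
    (400, 160, 350),  # on the eyes
    (410, 300, 500),  # on the mouth
    (100, 400, 120),  # elsewhere on the frame
    (390, 290, 450),  # on the mouth
]

def aoi_of(x, y):
    """Return the AOI containing the point, or None if outside all AOIs."""
    for name, (x0, y0, x1, y1) in AOIS.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

total = sum(d for _, _, d in fixations)
dwell = {name: 0 for name in AOIS}
for x, y, d in fixations:
    name = aoi_of(x, y)
    if name is not None:
        dwell[name] += d

for name, ms in dwell.items():
    print(f"{name}: {ms / total:.0%} of dwell time")
```

Comparing these proportions between groups (normal-hearing vs. deaf) and conditions (familiar vs. unfamiliar language) yields the group-by-language pattern the abstract describes.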


2007 ◽  
Vol 11 (4) ◽  
pp. 233-241 ◽  
Author(s):  
Nancy Tye-Murray ◽  
Mitchell Sommers ◽  
Brent Spehar

2019 ◽  
Vol 128 ◽  
pp. 93-100 ◽  
Author(s):  
Masahiro Imafuku ◽  
Masahiko Kawai ◽  
Fusako Niwa ◽  
Yuta Shinya ◽  
Masako Myowa

2012 ◽  
Vol 367 (1591) ◽  
pp. 942-953 ◽  
Author(s):  
Jean-Michel Hupé ◽  
Daniel Pressnitzer

Auditory streaming and visual plaids have been used extensively to study perceptual organization in each modality. Both stimuli can produce bistable alternations between grouped (one object) and split (two objects) interpretations. They also share two peculiar features: (i) at the onset of stimulus presentation, organization starts with a systematic bias towards the grouped interpretation; (ii) this first percept has ‘inertia’; it lasts longer than the subsequent ones. As a result, the probability of forming different objects builds up over time, a hallmark of both behavioural and neurophysiological data on auditory streaming. Here we show that first percept bias and inertia are independent. In plaid perception, inertia is due to a depth ordering ambiguity in the transparent (split) interpretation that makes plaid perception tristable rather than bistable: experimental manipulations removing the depth ambiguity suppressed inertia. However, the first percept bias persisted. We attempted a similar manipulation for auditory streaming by introducing level differences between streams, to bias which stream would appear in the perceptual foreground. Here both inertia and first percept bias persisted. We thus argue that the critical common feature of the onset of perceptual organization is the grouping bias, which may be related to the transition from temporally/spatially local to temporally/spatially global computation.
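The 'inertia' claim above is a statement about percept durations: within a trial, the first percept lasts longer than the ones that follow. As a hedged illustration of how that is typically quantified, the Python sketch below compares the first-percept duration with the mean of subsequent durations across invented trials; the numbers are made up and do not come from the paper.

```python
# Hypothetical sketch: testing first-percept "inertia" by comparing the
# duration of the first percept with the mean of subsequent percepts.
from statistics import mean

# Invented trials: each is a list of successive percept durations (s).
trials = [
    [9.2, 3.1, 4.0, 2.8],
    [7.5, 2.6, 3.3, 3.9, 2.2],
    [11.0, 4.4, 3.8],
]

first = [t[0] for t in trials]
later = [mean(t[1:]) for t in trials]

print(f"mean first-percept duration: {mean(first):.1f} s")
print(f"mean subsequent duration:    {mean(later):.1f} s")
# Inertia predicts first > subsequent. Per the abstract, removing the
# depth ambiguity in plaids abolished this difference while leaving the
# first-percept (grouping) bias intact.
```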

