Weak observer–level correlation and strong stimulus-level correlation between the McGurk effect and audiovisual speech-in-noise: A causal inference explanation

Cortex ◽  
2020 ◽  
Vol 133 ◽  
pp. 371-383
Author(s):  
John F. Magnotti ◽  
Kristen B. Dzeda ◽  
Kira Wegner-Clemens ◽  
Johannes Rennig ◽  
Michael S. Beauchamp

Abstract
The McGurk effect is widely used as a measure of multisensory integration during speech perception. Two observations have raised questions about the relationship between the effect and everyday speech perception. First, there is high variability in the strength of the McGurk effect across different stimuli and observers. Second, there is low correlation across observers between perception of the McGurk effect and measures of everyday speech perception, such as the ability to understand noisy audiovisual speech. Using the framework of the causal inference of multisensory speech (CIMS) model, we explored the relationship between the McGurk effect, syllable perception, and sentence perception in seven experiments with a total of 296 different participants. Perceptual reports revealed a relationship between the efficacy of different McGurk stimuli created from the same talker and perception of the auditory component of the McGurk stimuli presented in isolation, either with or without added noise. The CIMS model explained this high stimulus-level correlation using the principles of noisy sensory encoding followed by optimal cue combination within a representational space that was identical for McGurk and everyday speech. In other experiments, CIMS successfully modeled low observer-level correlation between McGurk and everyday speech. Variability in noisy speech perception was modeled using individual differences in noisy sensory encoding, while variability in McGurk perception involved additional differences in causal inference. Participants with all combinations of high and low sensory encoding noise and high and low causal inference disparity thresholds were identified. Perception of the McGurk effect and everyday speech can be explained by a common theoretical framework that includes causal inference.
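The two stages the abstract names, noisy sensory encoding followed by cue combination gated by a causal-inference disparity threshold, can be illustrated with a toy sketch. This is not the published CIMS model; the one-dimensional cue space, the hard threshold rule, and all parameter names are illustrative assumptions.

```python
def perceive(x_a, x_v, var_a, var_v, disparity_threshold):
    """Toy two-stage percept: noisy auditory/visual encodings x_a and x_v
    (with encoding variances var_a, var_v) are fused only when their
    disparity falls below the observer's causal-inference threshold."""
    disparity = abs(x_a - x_v)
    if disparity < disparity_threshold:
        # Common cause inferred: reliability-weighted (optimal) combination
        w_a = (1 / var_a) / (1 / var_a + 1 / var_v)
        return w_a * x_a + (1 - w_a) * x_v
    # Separate causes inferred: the auditory cue is reported alone
    return x_a
```

Under this sketch, an observer with a low disparity threshold rarely fuses (weak McGurk effect) even if their encoding noise, which drives noisy-speech performance, is the same as a frequent fuser's, mirroring the dissociation the abstract describes.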


2014 ◽  
Vol 5 ◽  
Author(s):  
Laura C. Erickson ◽  
Brandon A. Zielinski ◽  
Jennifer E. V. Zielinski ◽  
Guoying Liu ◽  
Peter E. Turkeltaub ◽  
...  

2013 ◽  
Vol 4 ◽  
Author(s):  
John F. Magnotti ◽  
Wei Ji Ma ◽  
Michael S. Beauchamp

2011 ◽  
Vol 24 (1) ◽  
pp. 67-90 ◽  
Author(s):  
Riikka Möttönen ◽  
Kaisa Tiippana ◽  
Mikko Sams ◽  
Hanna Puharinen

Abstract
Audiovisual speech perception has been considered to operate independent of sound location, since the McGurk effect (altered auditory speech perception caused by conflicting visual speech) has been shown to be unaffected by whether speech sounds are presented in the same or different location as a talking face. Here we show that sound location effects arise with manipulation of spatial attention. Sounds were presented from loudspeakers in five locations: the centre (location of the talking face) and 45°/90° to the left/right. Auditory spatial attention was focused on a location by presenting the majority (90%) of sounds from this location. In Experiment 1, the majority of sounds emanated from the centre, and the McGurk effect was enhanced there. In Experiment 2, the major location was 90° to the left, causing the McGurk effect to be stronger on the left and centre than on the right. Under control conditions, when sounds were presented with equal probability from all locations, the McGurk effect tended to be stronger for sounds emanating from the centre, but this tendency was not reliable. Additionally, reaction times were the shortest for a congruent audiovisual stimulus, and this was the case independent of location. Our main finding is that sound location can modulate audiovisual speech perception, and that spatial attention plays a role in this modulation.


2018 ◽  
Vol 31 (1-2) ◽  
pp. 19-38 ◽  
Author(s):  
John F. Magnotti ◽  
Debshila Basu Mallick ◽  
Michael S. Beauchamp

We report the unexpected finding that slowing video playback decreases perception of the McGurk effect. This reduction is counter-intuitive because the illusion depends on visual speech influencing the perception of auditory speech, and slowing speech should increase the amount of visual information available to observers. We recorded perceptual data from 110 subjects viewing audiovisual syllables (either McGurk or congruent control stimuli) played back at one of three rates: the rate used by the talker during recording (the natural rate), a slow rate (50% of natural), or a fast rate (200% of natural). We replicated previous studies showing dramatic variability in McGurk susceptibility at the natural rate, ranging from 0–100% across subjects and from 26–76% across the eight McGurk stimuli tested. Relative to the natural rate, slowed playback reduced the frequency of McGurk responses by 11% (79% of subjects showed a reduction) and reduced congruent accuracy by 3% (25% of subjects showed a reduction). Fast playback rate had little effect on McGurk responses or congruent accuracy. To determine whether our results are consistent with Bayesian integration, we constructed a Bayes-optimal model that incorporated two assumptions: individuals combine auditory and visual information according to their reliability, and changing playback rate affects sensory reliability. The model reproduced both our findings of large individual differences and the playback rate effect. This work illustrates that surprises remain in the McGurk effect and that Bayesian integration provides a useful framework for understanding audiovisual speech perception.
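The Bayes-optimal assumption described above, cues combined according to their reliability, with playback rate changing sensory reliability, can be sketched in a few lines. This is a minimal illustration, not the authors' fitted model; the variance values are hypothetical.

```python
def fused_estimate(x_a, x_v, var_a, var_v):
    """Bayes-optimal fusion of two Gaussian cues: each cue is weighted
    by its reliability (inverse variance)."""
    rel_a, rel_v = 1 / var_a, 1 / var_v
    return (rel_a * x_a + rel_v * x_v) / (rel_a + rel_v)

# If slowed playback lowers visual reliability (a model assumption,
# not a measured value), the fused percept shifts toward the auditory
# cue, i.e., fewer visually driven McGurk responses:
natural = fused_estimate(0.0, 1.0, var_a=1.0, var_v=1.0)  # equal weights
slowed = fused_estimate(0.0, 1.0, var_a=1.0, var_v=4.0)   # vision downweighted
```

The design choice matching the abstract is that only sensory reliability varies with playback rate; the combination rule itself stays fixed, which is what lets one model reproduce both the individual differences and the rate effect.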


2019 ◽  
Author(s):  
Violet Aurora Brown ◽  
Julia Feld Strand

The McGurk effect is a multisensory phenomenon in which discrepant auditory and visual speech signals typically result in an illusory percept (McGurk & MacDonald, 1976). McGurk stimuli are often used in studies assessing the attentional requirements of audiovisual integration (e.g., Alsius et al., 2005), but no study has directly compared the costs associated with integrating congruent versus incongruent audiovisual speech. Some evidence suggests that the McGurk effect may not be representative of naturalistic audiovisual speech processing—susceptibility to the McGurk effect is not associated with the ability to derive benefit from the addition of the visual signal (Van Engen et al., 2017), and distinct cortical regions are recruited when processing congruent versus incongruent speech (Erickson et al., 2014). In two experiments, one using response times to identify congruent and incongruent syllables and one using a dual-task paradigm, we assessed whether congruent and incongruent audiovisual speech incur different attentional costs. We demonstrated that response times to both the speech task (Experiment 1) and a secondary vibrotactile task (Experiment 2) were indistinguishable for congruent compared to incongruent syllables, but McGurk fusions were responded to more quickly than McGurk non-fusions. These results suggest that despite documented differences in how congruent and incongruent stimuli are processed (Erickson et al., 2014; Van Engen, Xie, & Chandrasekaran, 2017), they do not appear to differ in terms of processing time or effort. However, responses that result in McGurk fusions are processed more quickly than those that result in non-fusions, though attentional cost is comparable for the two response types.

