Converging Evidence From Electrocorticography and BOLD fMRI for a Sharp Functional Boundary in Superior Temporal Gyrus Related to Multisensory Speech Processing

2018 ◽
Author(s):
Muge Ozker ◽
Daniel Yoshor ◽
Michael S. Beauchamp

Abstract: Although humans can understand speech using the auditory modality alone, in noisy environments visual speech information from the talker’s mouth can rescue otherwise unintelligible auditory speech. To investigate the neural substrates of multisensory speech perception, we recorded neural activity from the human superior temporal gyrus (STG) using two very different techniques: either directly, using surface electrodes implanted in five participants with epilepsy (electrocorticography, ECoG), or indirectly, using blood oxygen level dependent functional magnetic resonance imaging (BOLD fMRI) in six healthy control participants. Both ECoG and fMRI participants viewed the same clear and noisy audiovisual speech stimuli and performed the same speech recognition task. Both techniques demonstrated a sharp functional boundary in the STG that corresponded to an anatomical boundary defined by the posterior edge of Heschl’s gyrus. On the anterior side of the boundary, cortex responded more strongly to clear audiovisual speech than to noisy audiovisual speech, suggesting that anterior STG is primarily involved in processing unisensory auditory speech. On the posterior side of the boundary, cortex preferred noisy audiovisual speech or showed no preference, and showed robust responses to auditory-only and visual-only speech, suggesting that posterior STG is specialized for processing multisensory audiovisual speech. For both ECoG and fMRI, the transition between the functionally distinct regions occurred within 10 mm of anterior-to-posterior distance along the STG. We relate this boundary to the multisensory neural code underlying speech perception and propose that it represents an important functional division within the human speech perception network.
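
To make the boundary analysis concrete, here is a minimal sketch (not the authors' code; all data and variable names are hypothetical) of how a per-electrode clear-vs-noisy preference index could be computed and a sharp boundary located along the anterior-posterior STG axis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: each electrode's anterior-posterior position (mm,
# relative to the posterior edge of Heschl's gyrus; negative = anterior)
# and its mean responses to clear vs. noisy audiovisual speech.
pos_mm = np.sort(rng.uniform(-20, 20, size=40))
resp_clear = np.where(pos_mm < 0, 2.0, 1.0) + rng.normal(0, 0.2, 40)
resp_noisy = np.where(pos_mm < 0, 1.0, 1.5) + rng.normal(0, 0.2, 40)

# Preference index per electrode: >0 prefers clear, <0 prefers noisy speech.
pref = (resp_clear - resp_noisy) / (resp_clear + resp_noisy)

def split_cost(cut_mm):
    """Sum of within-group variances when splitting electrodes at cut_mm;
    a sharp functional boundary minimizes this cost."""
    ant, post = pref[pos_mm < cut_mm], pref[pos_mm >= cut_mm]
    if ant.size < 2 or post.size < 2:
        return np.inf
    return ant.var() * ant.size + post.var() * post.size

cuts = np.linspace(-15, 15, 121)
best = cuts[np.argmin([split_cost(c) for c in cuts])]
print(f"estimated boundary: {best:.1f} mm from the posterior edge of Heschl's gyrus")
```

With the simulated preference flip at 0 mm, the recovered cut falls near the true boundary; on real data the same logic would be applied to measured high-gamma or BOLD responses.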


2018 ◽  
Author(s):  
Arafat Angulo-Perkins ◽  
Luis Concha

Abstract: Musicality refers to specific biological traits that allow us to perceive, generate, and enjoy music. These abilities can be studied at different organizational levels (e.g., behavioural, physiological, evolutionary), and all of them indicate that music and speech processing are two different cognitive domains. Previous research has shown evidence of this functional divergence in auditory cortical regions in the superior temporal gyrus (such as the planum polare), which show increased activity upon listening to music as compared to other complex acoustic signals. Here, we examine brain activity underlying vocal music and speech perception while comparing musicians and non-musicians. We designed a stimulation paradigm using the same voice to produce spoken sentences, hummed melodies, and sung sentences; the same sentences were used in the speech and song categories, and the same melodies were used in the musical categories (song and hum). Participants listened to this paradigm while we acquired functional magnetic resonance imaging (fMRI) data. Different analyses demonstrated greater involvement of specific auditory and motor regions during music perception as compared to speech vocalizations. This music-sensitive network includes bilateral activation of the planum polare and planum temporale, as well as a group of regions lateralized to the right hemisphere that includes the supplementary motor area, premotor cortex, and the inferior frontal gyrus. Our results show that the simple act of listening to music generates stronger activation of motor regions, possibly preparing us to move to the beat. Vocal music listening, with and without lyrics, is also accompanied by stronger modulation of specific secondary auditory cortices such as the planum polare, confirming its crucial role in music processing independently of previous musical training. This study provides further evidence that music perception enhances audio-sensorimotor activity, a finding relevant to clinical approaches exploring music-based therapies to improve communicative and motor skills.


2015 ◽  
Vol 122 (2) ◽  
pp. 250-261 ◽  
Author(s):  
Edward F. Chang ◽  
Kunal P. Raygor ◽  
Mitchel S. Berger

Classic models of language organization posited that separate motor and sensory language foci existed in the inferior frontal gyrus (Broca's area) and superior temporal gyrus (Wernicke's area), respectively, and that connections between these sites (arcuate fasciculus) allowed for auditory-motor interaction. These theories have predominated for more than a century, but advances in neuroimaging and stimulation mapping have provided a more detailed description of the functional neuroanatomy of language. New insights have shaped modern network-based models of speech processing composed of parallel and interconnected streams involving both cortical and subcortical areas. Recent models emphasize processing in “dorsal” and “ventral” pathways, mediating phonological and semantic processing, respectively. Phonological processing occurs along a dorsal pathway, from the posterosuperior temporal to the inferior frontal cortices. On the other hand, semantic information is carried in a ventral pathway that runs from the temporal pole to the basal occipitotemporal cortex, with anterior connections. Functional MRI has poor positive predictive value in determining critical language sites and should only be used as an adjunct for preoperative planning. Cortical and subcortical mapping should be used to define functional resection boundaries in eloquent areas and remains the clinical gold standard. In tracing the historical advancements in our understanding of speech processing, the authors hope to not only provide practicing neurosurgeons with additional information that will aid in surgical planning and prevent postoperative morbidity, but also underscore the fact that neurosurgeons are in a unique position to further advance our understanding of the anatomy and functional organization of language.


2018 ◽  
Author(s):  
Yulia Oganian ◽  
Edward F. Chang

Abstract: Listeners use the slow amplitude modulations of speech, known as the envelope, to segment continuous speech into syllables. However, the underlying neural computations are heavily debated. We used high-density intracranial cortical recordings while participants listened to natural and synthesized control speech stimuli to determine how the envelope is represented in the human superior temporal gyrus (STG), a critical auditory brain area for speech processing. We found that the STG does not encode the instantaneous, moment-by-moment amplitude envelope of speech. Rather, a zone of the middle STG detects discrete acoustic onset edges, defined by local maxima in the rate-of-change of the envelope. Acoustic analysis demonstrated that acoustic onset edges reliably cue the information-rich transition between the consonant-onset and vowel-nucleus of syllables. Furthermore, the steepness of the acoustic edge cued whether a syllable was stressed. Synthesized amplitude-modulated tone stimuli showed that steeper edges elicited monotonically greater cortical responses, confirming the encoding of relative but not absolute amplitude. Overall, encoding of the timing and magnitude of acoustic onset edges in STG underlies our perception of the syllabic rhythm of speech.
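
The edge definition above (local maxima in the rate of change of the envelope) is directly computable. Below is a minimal sketch, not the authors' pipeline, using a synthetic waveform and assumed parameter values (sampling rate, filter cutoff, peak thresholds):

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt, find_peaks

fs = 16000                                     # sampling rate (Hz), assumed
t = np.arange(0, 2.0, 1 / fs)
# Synthetic "speech": noise bursts with syllable-like amplitude rises.
amp = np.clip(np.sin(2 * np.pi * 4 * t), 0, None)   # ~4 "syllables" per second
signal = amp * np.random.default_rng(0).normal(size=t.size)

# 1) Broadband amplitude envelope via the Hilbert transform, low-passed <10 Hz.
envelope = np.abs(hilbert(signal))
b, a = butter(3, 10 / (fs / 2), btype="low")
envelope = filtfilt(b, a, envelope)

# 2) Rate of change of the envelope; keep only rising (positive) portions.
env_rate = np.clip(np.gradient(envelope, 1 / fs), 0, None)

# 3) Acoustic onset edges = local maxima of the envelope derivative; edge
#    steepness (peak dE/dt) is the quantity the abstract links to stress.
peaks, props = find_peaks(env_rate, height=0.1 * env_rate.max(),
                          distance=int(0.1 * fs))   # peaks >= 100 ms apart
print(f"{peaks.size} onset edges at t = {np.round(peaks / fs, 2)} s")
print(f"edge steepness: {np.round(props['peak_heights'], 2)}")
```

On this 2-second synthetic signal the detector recovers the eight syllable-like rises, one edge per amplitude burst.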


2018 ◽  
Author(s):  
Anna Dora Manca ◽  
Francesco Di Russo ◽  
Francesco Sigona ◽  
Mirko Grimaldi

How the brain encodes the speech acoustic signal into phonological representations (distinctive features) is a fundamental question for the neurobiology of language. Whether this process is characterized by tonotopic maps in primary or secondary auditory areas, with bilateral or leftward-lateralized activity, remains a long-standing challenge. Magnetoencephalographic and ECoG studies have previously failed to show clear hierarchical and asymmetric signatures of speech processing. We employed high-density electroencephalography to map the Salento Italian vowel system onto cortical sources using the N1 auditory evoked component. We found evidence that the N1 is characterized by hierarchical and asymmetric indexes structuring vowel representations. We identified these with two N1 subcomponents: the typical N1 (N1a), peaking at 125-135 ms and localized bilaterally in the primary auditory cortex with a tangential distribution, and a late phase of the N1 (N1b), peaking at 145-155 ms and localized in the left superior temporal gyrus with a radial distribution. Notably, we showed that the processing of distinctive feature representations begins early in the primary auditory cortex and continues in the superior temporal gyrus along lateral-medial, anterior-posterior, and inferior-superior gradients. It is the dynamic interplay of both auditory cortices and the interaction effects between different distinctive features that generate the categorical representations of vowels.


2018 ◽  
Vol 115 (6) ◽  
pp. E1299-E1308 ◽  
Author(s):  
Sophie Bouton ◽  
Valérian Chambon ◽  
Rémi Tyrand ◽  
Adrian G. Guggisberg ◽  
Margitta Seeck ◽  
...  

Percepts and words can be decoded from distributed neural activity measures. However, the existence of widespread representations might conflict with the more classical notions of hierarchical processing and efficient coding, which are especially relevant in speech processing. Using fMRI and magnetoencephalography during syllable identification, we show that sensory and decisional activity colocalize to a restricted part of the posterior superior temporal gyrus (pSTG). Next, using intracortical recordings, we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine decoding was possible from neuronal activity sampled across different regions of the temporal and frontal lobes, despite weak or absent sensory or decision-related responses. These findings show that speech-sound categorization relies on an efficient readout of focal pSTG neural activity, while more distributed activity patterns, although classifiable by machine learning, instead reflect collateral processes of sensory perception and decision.
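
As a point of reference for the machine decoding described above, here is a minimal cross-validated classification sketch on simulated data (the trial counts, feature sizes, and syllable labels are hypothetical; extracting features from real intracortical recordings is not shown):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials, n_features = 200, 50               # hypothetical trial/feature counts
y = rng.integers(0, 2, n_trials)             # two syllable classes (e.g., /pa/ vs /ta/)
X = rng.normal(size=(n_trials, n_features))  # stand-in for per-trial neural features
X[:, :5] += 0.8 * y[:, None]                 # a few informative "channels"

# Standardize features, then fit a linear decoder with 10-fold cross-validation.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, y, cv=10)
print(f"decoding accuracy: {scores.mean():.2f} +/- {scores.std():.2f} (chance = 0.50)")
```

The study's caution applies here too: above-chance decoding from some set of features does not by itself show that those features are read out by the brain; the focal pSTG result rests on the additional link between early activity and the correctness of the listener's decision.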


2008 ◽  
Vol 20 (3) ◽  
pp. 541-552 ◽  
Author(s):  
Eveline Geiser ◽  
Tino Zaehle ◽  
Lutz Jancke ◽  
Martin Meyer

The present study investigates the neural correlates of rhythm processing in speech perception. German pseudosentences spoken with an exaggerated (isochronous) or a conversational (nonisochronous) rhythm were compared in an auditory functional magnetic resonance imaging experiment. The subjects had to perform either a rhythm task (explicit rhythm processing) or a prosody task (implicit rhythm processing). The study revealed bilateral activation in the supplementary motor area (SMA), extending into the cingulate gyrus, and in the insulae, extending into the right basal ganglia (neostriatum), as well as activity in the right inferior frontal gyrus (IFG), related to performance of the rhythm task. A direct contrast between isochronous and nonisochronous sentences revealed differences in the lateralization of activation for isochronous processing as a function of the explicit and implicit tasks. Explicit processing revealed activation in the right posterior superior temporal gyrus (pSTG), the right supramarginal gyrus, and the right parietal operculum. Implicit processing showed activation in the left supramarginal gyrus, the left pSTG, and the left parietal operculum. The present results indicate a function of the SMA and the insula beyond motor timing and argue for a role of these brain areas in the perception of temporal intervals in the acoustic signal. Second, the data point to a specific task-related function of the right IFG in the processing of accent patterns. Finally, the data support the assumption that the right secondary auditory cortex is involved in the explicit perception of auditory suprasegmental cues and, moreover, that activity in the right secondary auditory cortex can be modulated by top-down processing mechanisms.


2013 ◽  
Vol 25 (12) ◽  
pp. 2179-2188 ◽  
Author(s):  
Katya Krieger-Redwood ◽  
M. Gareth Gaskell ◽  
Shane Lindsay ◽  
Elizabeth Jefferies

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites (PMC, posterior superior temporal gyrus [pSTG], and occipital pole) and, for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to the two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of the occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.


2020 ◽  
Vol 32 (5) ◽  
pp. 877-888
Author(s):  
Maxime Niesen ◽  
Marc Vander Ghinst ◽  
Mathieu Bourguignon ◽  
Vincent Wens ◽  
Julie Bertels ◽  
...  

Discrimination of words from nonspeech sounds is essential in communication. Still, how selective attention influences this early step of speech processing remains elusive. To answer this question, brain activity was recorded with magnetoencephalography in 12 healthy adults while they listened to two sequences of auditory stimuli presented at 2.17 Hz, consisting of successions of one randomized word (tagging frequency = 0.54 Hz) and three acoustically matched nonverbal stimuli. Participants were instructed to focus their attention on the occurrence of a predefined word in the verbal attention condition and on a nonverbal stimulus in the nonverbal attention condition. Steady-state neuromagnetic responses were identified with spectral analysis at the sensor and source levels. Significant sensor responses peaked at 0.54 and 2.17 Hz in both conditions. Sources at 0.54 Hz were reconstructed in the supratemporal auditory cortex, left superior temporal gyrus (STG), left middle temporal gyrus, and left inferior frontal gyrus. Sources at 2.17 Hz were reconstructed in the supratemporal auditory cortex and STG. Crucially, source strength in the left STG at 0.54 Hz was significantly higher in the verbal attention condition than in the nonverbal attention condition. This study demonstrates speech-sensitive responses in primary auditory and speech-related neocortical areas. Critically, it highlights that, during word discrimination, top-down attention modulates activity within the left STG. This area therefore appears to play a crucial role in selective verbal attention during this early step of speech processing.
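
The frequency-tagging logic is worth spelling out: with stimuli at 2.17 Hz and every fourth item a word, word-selective responses appear at 2.17 / 4 ≈ 0.54 Hz in the spectrum. A minimal sketch on simulated sensor data (the sampling rate, recording length, and SNR computation are assumptions, not the paper's exact analysis):

```python
import numpy as np

fs = 1000                                    # MEG sampling rate (Hz), assumed
f_stim, f_word = 2.17, 2.17 / 4              # stimulus rate and word-tagging rate
t = np.arange(0, 120, 1 / fs)                # 120 s recording -> 1/120 Hz resolution
rng = np.random.default_rng(0)
meg = (0.5 * np.sin(2 * np.pi * f_stim * t)    # response to every stimulus
       + 0.3 * np.sin(2 * np.pi * f_word * t)  # word-selective response
       + rng.normal(0, 1.0, t.size))           # sensor noise

# Amplitude spectrum of the full recording.
spectrum = np.abs(np.fft.rfft(meg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)

for f in (f_word, f_stim):
    idx = np.argmin(np.abs(freqs - f))
    # Steady-state response strength as SNR: tagged bin vs. neighboring bins.
    neighbors = np.r_[spectrum[idx - 12:idx - 2], spectrum[idx + 3:idx + 13]]
    print(f"{f:.2f} Hz: SNR = {spectrum[idx] / neighbors.mean():.1f}")
```

Both tagged frequencies stand out sharply against the neighboring noise bins, which is what makes this design sensitive to small attention-driven changes in response strength.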


2013 ◽  
Vol 25 (5) ◽  
pp. 706-718 ◽  
Author(s):  
Sara Guediche ◽  
Caden Salvata ◽  
Sheila E. Blumstein

Listeners' perception of acoustically presented speech is constrained by many different sources of information that arise from other sensory modalities and from more abstract, higher-level language context. An open question is how perceptual processes are influenced by and interact with these other sources of information. In this study, we use fMRI to examine the effect of a prior sentence fragment's meaning on the categorization of two possible target words that differ in an acoustic-phonetic feature of the initial consonant, voice onset time (VOT). Specifically, we manipulate the bias of the sentence context (biased, neutral) and the target type (ambiguous, unambiguous). Our results show that an interaction between these two factors emerged in a cluster in temporal cortex encompassing the left middle temporal gyrus and the superior temporal gyrus. The locus and pattern of these interactions support an interactive view of speech processing and suggest that both the quality of the input and the potential bias of the context interact to modulate neural activation patterns.
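
The key statistic in a 2 x 2 design like this is the interaction contrast. A minimal sketch on simulated per-condition responses (subject count, effect sizes, and condition means are all hypothetical, chosen only to illustrate the contrast):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20                                       # hypothetical number of subjects
# Simulated mean activation per condition, with an interaction built in:
# context matters most when the target is acoustically ambiguous.
biased_ambig = rng.normal(1.6, 0.4, n)
biased_unambig = rng.normal(1.0, 0.4, n)
neutral_ambig = rng.normal(1.1, 0.4, n)
neutral_unambig = rng.normal(1.0, 0.4, n)

# Interaction contrast: (biased ambiguous - biased unambiguous)
#                     - (neutral ambiguous - neutral unambiguous)
interaction = (biased_ambig - biased_unambig) - (neutral_ambig - neutral_unambig)
t, p = stats.ttest_1samp(interaction, 0.0)
print(f"interaction: t({n - 1}) = {t:.2f}, p = {p:.3f}")
```

A reliable nonzero interaction of this form, localized to a temporal-lobe cluster, is what supports the paper's claim that context bias and input quality jointly modulate neural activation rather than acting additively.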

