Cortical Representations of Speech in a Multi-talker Auditory Scene

2017 ◽  
Author(s):  
Krishna C. Puvvada ◽  
Jonathan Z. Simon

Abstract
The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources from peripheral, tonotopically based representations in the auditory nerve into perceptually distinct, auditory-object-based representations in auditory cortex. Here, using magnetoencephalography (MEG) recordings from human subjects, both men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than one in which individual streams are represented separately. In contrast, we also show that higher-order auditory cortical areas represent the attended stream separately from, and with significantly higher fidelity than, unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object than as separated objects. Taken together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of human auditory cortex.

Significance Statement
Using magnetoencephalography (MEG) recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of auditory cortex. We show that the primary-like areas in auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory scene, with both attended and ignored speech streams represented with almost equal fidelity. In contrast, we show that higher-order auditory cortical areas represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects.
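The stimulus-reconstruction approach referred to above is typically implemented as a backward linear model: a ridge-regularized regression from time-lagged multi-channel neural responses back to the stimulus envelope, with reconstruction fidelity measured by correlation. A minimal sketch under stated assumptions — the lag count, regularizer, and synthetic data below are illustrative placeholders, not the authors' actual pipeline:

```python
import numpy as np

def lagged_matrix(responses, n_lags):
    """Stack time-lagged copies of each sensor channel (T x C) into a (T x C*n_lags) design matrix."""
    T, C = responses.shape
    X = np.zeros((T, C * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * C:(lag + 1) * C] = responses[:T - lag]
    return X

def fit_reconstruction(responses, envelope, n_lags=10, ridge=1.0):
    """Ridge-regression decoder mapping neural responses back to the stimulus envelope."""
    X = lagged_matrix(responses, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

def reconstruction_fidelity(responses, envelope, weights, n_lags=10):
    """Pearson correlation between the reconstructed and actual envelopes."""
    pred = lagged_matrix(responses, n_lags) @ weights
    return np.corrcoef(pred, envelope)[0, 1]

# Toy demonstration with synthetic data: 16 'sensors' with the envelope linearly mixed in.
rng = np.random.default_rng(0)
env = rng.standard_normal(1000)
sensors = rng.standard_normal((1000, 16)) + 0.5 * env[:, None]
w = fit_reconstruction(sensors, env)
r = reconstruction_fidelity(sensors, env, w)
```

With real MEG data, the decoder would be trained and evaluated on separate trials; the in-sample correlation here only illustrates the mechanics.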

2021 ◽  
Vol 15 ◽  
Author(s):  
Emina Alickovic ◽  
Elaine Hoi Ning Ng ◽  
Lorenz Fiedler ◽  
Sébastien Santurette ◽  
Hamish Innes-Brown ◽  
...  

Objectives
Previous research using non-invasive (magnetoencephalography, MEG) and invasive (electrocorticography, ECoG) neural recordings has demonstrated the progressive and hierarchical representation and processing of complex multi-talker auditory scenes in the auditory cortex. Early responses (<85 ms) in primary-like areas appear to represent the individual talkers with almost equal fidelity and are independent of attention in normal-hearing (NH) listeners. However, late responses (>85 ms) in higher-order non-primary areas selectively represent the attended talker with significantly higher fidelity than unattended talkers in NH and hearing-impaired (HI) listeners. Motivated by these findings, the objective of this study was to investigate the effect of a noise reduction scheme (NR) in a commercial hearing aid (HA) on the representation of complex multi-talker auditory scenes in distinct hierarchical stages of the auditory cortex, using high-density electroencephalography (EEG).

Design
We addressed this issue by investigating early (<85 ms) and late (>85 ms) EEG responses recorded from 34 HI subjects fitted with HAs. The HA noise reduction (NR) was either on or off while the participants listened to a complex auditory scene. Participants were instructed to attend to one of two simultaneous talkers in the foreground while multi-talker babble noise played in the background (+3 dB SNR). After each trial, a two-choice question about the content of the attended speech was presented.

Results
Using a stimulus reconstruction approach, our results suggest that the attention-related enhancement of the neural representations of the target and masker talkers in the foreground, as well as the suppression of the background noise, is significantly affected by the NR scheme in distinct hierarchical stages. In the early responses, the NR scheme enhanced the representation of the foreground and of the entire acoustic scene, an enhancement driven by better representation of the target speech. In the late responses, the target talker was selectively represented in HI listeners, and use of the NR scheme enhanced the representations of both the target and masker speech in the foreground while suppressing the representation of the background noise. The EEG time window also had a significant effect on the strength of the cortical representations of the target and masker.

Conclusion
Together, our analyses of the early and late responses obtained from HI listeners support the existing view of hierarchical processing in the auditory cortex. Our findings demonstrate the benefits of an NR scheme on the representation of complex multi-talker auditory scenes in different areas of the auditory cortex in HI listeners.
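Setting the background babble at +3 dB SNR amounts to scaling the noise so the target-to-noise power ratio hits the desired level before summing. A small sketch of how such a mixture is commonly constructed — the signals and sampling here are synthetic stand-ins, not the study's stimuli:

```python
import numpy as np

def mix_at_snr(target, noise, snr_db):
    """Scale `noise` so the target-to-noise power ratio equals `snr_db` dB, then mix."""
    p_target = np.mean(target ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_target / (p_noise * 10 ** (snr_db / 10)))
    return target + gain * noise

# Synthetic 'speech' and 'babble', mixed at +3 dB SNR as in the Design section.
rng = np.random.default_rng(1)
speech = rng.standard_normal(48000)
babble = rng.standard_normal(48000)
scene = mix_at_snr(speech, babble, 3.0)
```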


2017 ◽  
Author(s):  
Krishna C Puvvada ◽  
Marisel Villafañe-Delgado ◽  
Christian Brodbeck ◽  
Jonathan Z Simon

Abstract
Speech communication in daily listening environments is complicated by the phenomenon of reverberation, wherein any sound reaching the ear is a mixture of the direct component from the source and multiple reflections off surrounding objects and the environment. The brain plays a central role in comprehending speech accompanied by such distortion, which is frequently further complicated by the presence of additional noise sources in the vicinity. Here, using magnetoencephalography (MEG) recordings from human subjects, we investigate the neural representation of speech in noisy, reverberant listening conditions, as measured by phase-locked MEG responses to the slow temporal modulations of speech. Using systems-theoretic linear methods of stimulus encoding, we observe that the cortex maintains both distorted and distortion-free (cleaned) representations of speech. We also show that, while the neural encoding of speech remains robust to additive noise in the absence of reverberation, it is detrimentally affected by noise when noise and reverberation occur together. Further, using linear methods of stimulus reconstruction, we show that theta-band neural responses are a likely candidate for the distortion-free representation of speech, whereas delta-band responses are more likely to carry non-speech-specific information about the listening environment.
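The linear stimulus-encoding methods mentioned here are commonly formulated as a temporal response function (TRF): a regularized regression from lagged copies of the stimulus envelope to a neural channel. A minimal sketch with synthetic data — the lag count, ridge parameter, and kernel below are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def estimate_trf(envelope, response, n_lags=20, ridge=1.0):
    """Forward (encoding) model: ridge estimate of the temporal response function
    mapping the stimulus envelope to one neural channel."""
    T = len(envelope)
    X = np.zeros((T, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = envelope[:T - lag]
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ response)

# Synthetic check: a 'response' generated by a known 20-tap causal kernel plus noise;
# the estimated TRF should closely match the kernel that produced it.
rng = np.random.default_rng(2)
env = rng.standard_normal(2000)
true_trf = np.exp(-np.arange(20) / 5.0)
resp = np.convolve(env, true_trf)[:2000] + 0.1 * rng.standard_normal(2000)
trf = estimate_trf(env, resp)
```

In practice the same framework extends to multivariate stimuli (e.g., separate clean and reverberant envelopes as competing predictors), which is how distorted and cleaned representations can be compared.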


2011 ◽  
Vol 105 (4) ◽  
pp. 1558-1573 ◽  
Author(s):  
Yu-Ting Mao ◽  
Tian-Miao Hua ◽  
Sarah L. Pallas

Sensory neocortex is capable of considerable plasticity after sensory deprivation or damage to input pathways, especially early in development. Although plasticity can often be restorative, sometimes novel, ectopic inputs invade the affected cortical area. Invading inputs from other sensory modalities may compromise the original function or even take over, imposing a new function and preventing recovery. Using ferrets whose retinal axons were rerouted into auditory thalamus at birth, we were able to examine the effect of varying the degree of ectopic, cross-modal input on reorganization of developing auditory cortex. In particular, we assayed whether the invading visual inputs and the existing auditory inputs competed for or shared postsynaptic targets and whether the convergence of input modalities would induce multisensory processing. We demonstrate that although the cross-modal inputs create new visual neurons in auditory cortex, some auditory processing remains. The degree of damage to auditory input to the medial geniculate nucleus was directly related to the proportion of visual neurons in auditory cortex, suggesting that the visual and residual auditory inputs compete for cortical territory. Visual neurons were not segregated from auditory neurons but shared target space even on individual target cells, substantially increasing the proportion of multisensory neurons. Thus spatial convergence of visual and auditory input modalities may be sufficient to expand multisensory representations. Together these findings argue that early, patterned visual activity does not drive segregation of visual and auditory afferents and suggest that auditory function might be compromised by converging visual inputs. These results indicate possible ways in which multisensory cortical areas may form during development and evolution. 
They also suggest that rehabilitative strategies designed to promote recovery of function after sensory deprivation or damage need to take into account that sensory cortex may become substantially more multisensory after alteration of its input during development.


2020 ◽  
Vol 123 (2) ◽  
pp. 695-706
Author(s):  
Lu Luo ◽  
Na Xu ◽  
Qian Wang ◽  
Liang Li

The central mechanisms underlying binaural unmasking of spectrally overlapping concurrent sounds, which are unresolved in the peripheral auditory system, remain largely unknown. In this study, frequency-following responses (FFRs) to two binaurally presented, independent narrowband noises (NBNs) with overlapping spectra were recorded simultaneously in the inferior colliculus (IC) and auditory cortex (AC) of anesthetized rats. The results showed that for both IC FFRs and AC FFRs, introducing an interaural time difference (ITD) disparity between the two concurrent NBNs enhanced the representation fidelity, reflected by increased coherence between the responses evoked by double-NBN stimulation and the responses evoked by the single NBNs. The ITD disparity effect varied across frequency bands, being more marked for higher frequency bands in the IC and lower frequency bands in the AC. Moreover, the coherence between IC responses and AC responses was also enhanced by the ITD disparity, and this enhancement was most prominent for low-frequency bands when the IC and the AC were on the same side. These results suggest a critical role of the ITD cue in the neural segregation of spectrotemporally overlapping sounds.

NEW & NOTEWORTHY
When two spectrally overlapping narrowband noises are presented at the same time at the same sound-pressure level, they mask each other. Introducing a disparity in interaural time difference between these two narrowband noises improves the accuracy of the neural representation of the individual sounds in both the inferior colliculus and the auditory cortex. The low-frequency signal transformation from the inferior colliculus to the auditory cortex on the same side is also enhanced, demonstrating the effect of binaural unmasking.
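The fidelity measure used here, coherence between two response recordings within a frequency band, can be computed with Welch-averaged spectra. A rough sketch using `scipy.signal.coherence` — the signals, sampling rate, and band edges are synthetic stand-ins, not the study's recordings:

```python
import numpy as np
from scipy.signal import butter, lfilter, coherence

def band_coherence(x, y, fs, band, nperseg=1024):
    """Mean magnitude-squared coherence between two signals within `band` (Hz)."""
    f, cxy = coherence(x, y, fs=fs, nperseg=nperseg)
    mask = (f >= band[0]) & (f <= band[1])
    return cxy[mask].mean()

# Synthetic check: two 'responses' sharing a narrowband (280-320 Hz) component
# are coherent inside that band but not outside it.
rng = np.random.default_rng(3)
fs = 2000
n = fs * 8
b, a = butter(4, [280 / (fs / 2), 320 / (fs / 2)], btype="band")
shared = 5.0 * lfilter(b, a, rng.standard_normal(n))
x = shared + rng.standard_normal(n)  # shared component plus independent noise
y = shared + rng.standard_normal(n)
c_in = band_coherence(x, y, fs, (285, 315))
c_out = band_coherence(x, y, fs, (600, 700))
```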


2007 ◽  
Vol 24 (6) ◽  
pp. 857-874 ◽  
Author(s):  
THOMAS FITZGIBBON ◽  
BRETT A. SZMAJDA ◽  
PAUL R. MARTIN

The thalamic reticular nucleus (TRN) supplies an important inhibitory input to the dorsal thalamus. Previous studies in non-primate mammals have suggested that the visual sector of the TRN has a lateral division, which has connections with first-order (primary) sensory thalamic and cortical areas, and a medial division, which has connections with higher-order (association) thalamic and cortical areas. However, whether the primate TRN is segregated in the same manner remains controversial. Here, we investigated the connections of the TRN in a New World primate, the marmoset (Callithrix jacchus). The topography of labeled cells and terminals was analyzed following iontophoretic injections of tracers into the primary visual cortex (V1) or the dorsal lateral geniculate nucleus (LGNd). The results show that the rostroventral TRN, adjacent to the LGNd, is primarily connected with primary visual areas, while the most caudal parts of the TRN are associated with higher-order visual thalamic areas. A small region of the TRN near the caudal pole of the LGNd (foveal representation) contains connections where first-order (lateral TRN) and higher-order visual areas (medial TRN) overlap. Reciprocal connections between the LGNd and TRN are topographically organized, such that a series of rostrocaudal injections within the LGNd labeled cells and terminals in the TRN in a pattern shaped like rostrocaudally overlapping “fish scales.” We propose that the dorsal areas of the TRN, adjacent to the top of the LGNd, represent the lower visual field (connected with medial LGNd), and the more ventral parts of the TRN contain a map representing the upper visual field (connected with lateral LGNd).


2018 ◽  
Vol 30 (12) ◽  
pp. 3151-3167 ◽  
Author(s):  
Dmitry Krotov ◽  
John Hopfield

Deep neural networks (DNNs) trained in a supervised way suffer from two known problems. First, the minima of the objective function used in learning correspond to data points (also known as rubbish examples or fooling images) that lack semantic similarity with the training data. Second, a clean input can be changed by a small perturbation, often imperceptible to human vision, so that the resulting deformed input is misclassified by the network. These findings emphasize the differences between the ways DNNs and humans classify patterns and raise the question of how to design learning algorithms that mimic human perception more accurately than existing methods. Our article examines these questions within the framework of dense associative memory (DAM) models. These models are defined by an energy function with higher-order (higher than quadratic) interactions between the neurons. We show that in the limit when the power of the interaction vertex in the energy function is sufficiently large, these models have the following three properties. First, the minima of the objective function are free from rubbish images, so that each minimum is a semantically meaningful pattern. Second, artificial patterns poised precisely at the decision boundary look ambiguous to human subjects and share aspects of both classes separated by that boundary. Third, adversarial images constructed by models with a small power of the interaction vertex, which are equivalent to DNNs with rectified linear units, fail to transfer to, and thus fail to fool, models with higher-order interactions. This opens up the possibility of using higher-order models for detecting and stopping malicious adversarial attacks. The results we present suggest that DAMs with higher-order energy functions are more robust to adversarial and rubbish inputs than DNNs with rectified linear units.
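A dense associative memory with a higher-order interaction vertex can be sketched directly from its energy function, E(σ) = −Σ_μ F(ξ_μ · σ) with F(x) = max(x, 0)^n. The toy below — the pattern count, dimension, n = 4, and the greedy single-bit update rule are illustrative assumptions, not the paper's experiments — shows how such an energy supports pattern recall:

```python
import numpy as np

def dam_energy(state, memories, n=4):
    """Dense associative memory energy: E = -sum_mu F(xi_mu . sigma), F(x) = max(x, 0)**n."""
    overlaps = memories @ state
    return -np.sum(np.maximum(overlaps, 0.0) ** n)

def dam_update(state, memories, n=4, sweeps=3):
    """Greedy asynchronous dynamics: set each unit to whichever sign gives lower energy."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(state.size):
            plus, minus = state.copy(), state.copy()
            plus[i], minus[i] = 1.0, -1.0
            state = plus if dam_energy(plus, memories, n) <= dam_energy(minus, memories, n) else minus
    return state

# Toy check: a corrupted stored pattern is restored by descending the energy.
rng = np.random.default_rng(4)
memories = rng.choice([-1.0, 1.0], size=(5, 64))   # 5 stored +/-1 patterns
probe = memories[0].copy()
flipped = rng.choice(64, size=8, replace=False)    # corrupt 8 of 64 bits
probe[flipped] *= -1
recalled = dam_update(probe, memories)
```

With a large power n, the term for the best-matching memory dominates the energy, so the dynamics snap the probe back to that memory with little interference from the other stored patterns.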


2018 ◽  
Vol 104 (5) ◽  
pp. 774-777 ◽  
Author(s):  
Christian Brodbeck ◽  
Alessandro Presacco ◽  
Samira Anderson ◽  
Jonathan Z. Simon

eNeuro ◽  
2016 ◽  
Vol 3 (3) ◽  
pp. ENEURO.0071-16.2016 ◽  
Author(s):  
Yonatan I. Fishman ◽  
Christophe Micheyl ◽  
Mitchell Steinschneider
