Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach

2017 · Vol 372 (1714) · pp. 20160108
Author(s): Radoslaw Martin Cichy, Santani Teng

In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’.
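The third pillar above, integration across imaging methods via representational similarity analysis (RSA), compares measurement modalities in a common space of pairwise stimulus dissimilarities rather than raw signals. A minimal sketch of that idea, using randomly generated stand-ins for MEG and fMRI response patterns (all array shapes and metric choices here are illustrative assumptions, not the authors' pipeline):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical response patterns for 10 stimulus conditions
rng = np.random.default_rng(0)
meg_patterns = rng.normal(size=(10, 306))   # e.g. one pattern per condition across MEG sensors
fmri_patterns = rng.normal(size=(10, 500))  # e.g. one pattern per condition across fMRI voxels

# Representational dissimilarity matrix (RDM): pairwise correlation
# distance between condition patterns, in condensed (upper-triangle) form
rdm_meg = pdist(meg_patterns, metric="correlation")
rdm_fmri = pdist(fmri_patterns, metric="correlation")

# The RDMs live in the same space regardless of measurement modality,
# so the two methods can be related by rank-correlating the RDMs
rho, p = spearmanr(rdm_meg, rdm_fmri)
print(f"RSA correlation between MEG and fMRI RDMs: {rho:.3f}")
```

The same comparison works against model RDMs, e.g. dissimilarities computed from deep-neural-network layer activations, which is how models enter the framework.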

2017
Author(s): Huriye Atilgan, Stephen M. Town, Katherine C. Wood, Gareth P. Jones, Ross K. Maddox, ...

Summary: How and where in the brain audio-visual signals are bound to create multimodal objects remains unknown. One hypothesis is that temporal coherence between dynamic multisensory signals provides a mechanism for binding stimulus features across sensory modalities. Here we report that when the luminance of a visual stimulus is temporally coherent with the amplitude fluctuations of one sound in a mixture, the representation of that sound is enhanced in auditory cortex. Critically, this enhancement extends to include both binding and non-binding features of the sound. We demonstrate that visual information conveyed from visual cortex, via the phase of the local field potential, is combined with auditory information within auditory cortex. These data provide evidence that early cross-sensory binding provides a bottom-up mechanism for the formation of cross-sensory objects, and that one role for multisensory binding in auditory cortex is to support auditory scene analysis.
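The temporal-coherence hypothesis above rests on a simple quantity: how strongly a visual luminance time course co-varies with the amplitude envelope of each sound in a mixture. A minimal sketch of that measurement, with synthetic envelopes standing in for real stimuli (signal lengths, smoothing, and noise levels are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # hypothetical: 5 s of signal sampled at 100 Hz

def slow_envelope():
    # Smooth white noise with a moving average to mimic the slow
    # amplitude fluctuations of a natural sound
    return np.convolve(rng.normal(size=n), np.ones(25) / 25, mode="same")

env_a = slow_envelope()  # amplitude envelope of sound A in the mixture
env_b = slow_envelope()  # amplitude envelope of sound B (independent of A)

# Visual luminance made temporally coherent with sound A (plus sensor noise)
luminance = env_a + 0.1 * rng.normal(size=n)

def coherence(x, y):
    # Pearson correlation as a simple proxy for temporal coherence
    return np.corrcoef(x, y)[0, 1]

print("luminance vs. sound A:", coherence(luminance, env_a))
print("luminance vs. sound B:", coherence(luminance, env_b))
```

In the study's terms, the coherent pairing (luminance with sound A) is the condition under which the auditory-cortical representation of that sound was enhanced; the incoherent pairing serves as the comparison.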


2014 · Vol 78 (3) · pp. 361-378
Author(s): Mona Isabel Spielmann, Erich Schröger, Sonja A. Kotz, Alexandra Bendixen

Author(s): Meghan Goodchild, Stephen McAdams

The study of timbre and orchestration in music research is underdeveloped, with few theories to explain instrumental combinations and orchestral shaping. This chapter will outline connections between the orchestration practices of the nineteenth and early twentieth centuries and perceptual principles based on recent research in auditory scene analysis and timbre perception. Analyses of orchestration treatises and musical scores reveal an implicit understanding of auditory grouping principles by which many orchestral effects and techniques function. We will explore how concurrent grouping cues result in blended combinations of instruments, how sequential grouping into segregated melodies or stratified (foreground and background) layers is influenced by timbral similarities and dissimilarities, and how segmental grouping cues create formal boundaries and expressive gestural shaping through changes in instrumental textures. This exploration will be framed within an examination of historical and contemporary discussion of orchestral effects and techniques.


2017 · Vol 23 (2)
Author(s): Anna Zayaruznaya

The medieval composers of polytextual motets have been charged with rendering multiple texts inaudible by superimposing them. While the limited contemporary evidence provided by Jacobus's comments in the Speculum musicae seems at first sight to suggest that medieval listeners would have had trouble understanding texts declaimed simultaneously, closer scrutiny reveals the opposite: that intelligibility was desirable, and linked to modes of performance. This article explores the ways in which 20th-century performance aesthetics and recording technologies have shaped current ideas about the polytextual motet. Recent studies in cognitive psychology suggest that human ability to perform auditory scene analysis—to focus on a given sound in a complicated auditory environment—is enhanced by directional listening and relatively dry acoustics. But the modern listener often encounters motets on recordings with heavy mixing and reverb. Furthermore, combinations of contrasting vocal timbres, which can help differentiate simultaneously sung texts, are precluded by a blended, uniform sound born jointly of English choir-school culture and modernist preferences propagated under the banner of authenticity. Scholarly accounts of motets that focus on sound over sense are often influenced, directly or indirectly, by such mediated listening.

