Auditory and visual scene analysis: an overview

2017 ◽ Vol 372 (1714) ◽ pp. 20160099
Author(s): Hirohito M. Kondo, Anouk M. van Loon, Jun-Ichiro Kawahara, Brian C. J. Moore

We perceive the world as stable and composed of discrete objects even though auditory and visual inputs are often ambiguous owing to spatial and temporal occluders and changes in the conditions of observation. This raises important questions regarding where and how ‘scene analysis’ is performed in the brain. Recent advances from both auditory and visual research suggest that the brain does not simply process the incoming scene properties. Rather, top-down processes such as attention, expectations and prior knowledge facilitate scene perception. Thus, scene analysis is linked not only with the extraction of stimulus features and formation and selection of perceptual objects, but also with selective attention, perceptual binding and awareness. This special issue covers novel advances in scene-analysis research obtained using a combination of psychophysics, computational modelling, neuroimaging and neurophysiology, and presents new empirical and theoretical approaches. For integrative understanding of scene analysis beyond and across sensory modalities, we provide a collection of 15 articles that enable comparison and integration of recent findings in auditory and visual scene analysis. This article is part of the themed issue ‘Auditory and visual scene analysis’.

2017 ◽ Vol 372 (1714) ◽ pp. 20160105
Author(s): Rosy Southwell, Anna Baumann, Cécile Gal, Nicolas Barascud, Karl Friston, ...

In this series of behavioural and electroencephalography (EEG) experiments, we investigate the extent to which repeating patterns of sounds capture attention. Work in the visual domain has revealed attentional capture by statistically predictable stimuli, consistent with predictive coding accounts which suggest that attention is drawn to sensory regularities. Here, stimuli comprised rapid sequences of tone pips, arranged in regular (REG) or random (RAND) patterns. EEG data demonstrate that the brain rapidly recognizes predictable patterns, manifested as a sharp increase in responses to REG relative to RAND sequences. This increase is reminiscent of the gain increase on neural responses to attended stimuli often seen in the neuroimaging literature, and thus consistent with the hypothesis that predictable sequences draw attention. To study potential attentional capture by auditory regularities, we then used REG and RAND sequences in two different behavioural tasks designed to reveal such capture effects. Overall, the pattern of results suggests that regularity does not capture attention. This article is part of the themed issue ‘Auditory and visual scene analysis’.
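The REG/RAND stimulus design can be sketched in a few lines. This is a minimal, hypothetical generator, not the authors' actual stimulus code: the pool size, cycle length, and function names are illustrative assumptions. The key contrast is that a REG sequence repeats a fixed cycle of tone pips, whereas a RAND sequence draws each pip independently from the same frequency pool.

```python
import random

def make_sequence(n_pips=60, pool_size=20, cycle=10, regular=True, seed=0):
    """Return frequency indices for a rapid tone-pip sequence.

    REG repeats a fixed, randomly chosen cycle of pips; RAND draws each
    pip independently, so both conditions share first-order statistics
    but differ in predictability.
    """
    rng = random.Random(seed)
    pool = list(range(pool_size))
    if regular:
        cycle_pips = rng.sample(pool, cycle)  # one fixed cycle, then repeat
        return [cycle_pips[i % cycle] for i in range(n_pips)]
    return [rng.choice(pool) for _ in range(n_pips)]

reg = make_sequence(regular=True)
rand = make_sequence(regular=False)
# REG is periodic with the cycle length; RAND has no such structure.
print(reg[:10] == reg[10:20])  # True
```

Because REG and RAND draw from the same pool, any differential brain response reflects the regularity itself rather than the pip frequencies used.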


2014 ◽ Vol 369 (1635) ◽ pp. 20120512
Author(s): Rebecca Knight, Caitlin E. Piette, Hector Page, Daniel Walters, Elizabeth Marozzi, ...

How the brain combines information from different sensory modalities and of differing reliability is an important and still-unanswered question. Using the head direction (HD) system as a model, we explored the resolution of conflicts between landmarks and background cues. Sensory cue integration models predict averaging of the two cues, whereas attractor models predict capture of the signal by the dominant cue. We found that a visual landmark mostly captured the HD signal at low conflicts; however, there was an increasing propensity for the cells to integrate the cues thereafter. A large conflict presented to naive rats resulted in greater visual cue capture (less integration) than in experienced rats, revealing an effect of experience. We propose that weighted cue integration in HD cells arises from dynamic plasticity of the feed-forward inputs to the network, causing within-trial spatial redistribution of the visual inputs onto the ring. This suggests that an attractor network can implement decision processes about cue reliability using simple architecture and learning rules, thus providing a potential neural substrate for weighted cue integration.
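The two competing predictions contrasted above can be made concrete with a toy calculation. This is a minimal sketch under assumed weights, not the authors' model: reliability-weighted integration takes a circular average of the two cue directions, whereas attractor-style capture snaps the signal to the dominant cue.

```python
import math

def integrate_cues(theta_a, theta_b, w_a, w_b):
    """Reliability-weighted circular average of two directional cues (radians)."""
    x = w_a * math.cos(theta_a) + w_b * math.cos(theta_b)
    y = w_a * math.sin(theta_a) + w_b * math.sin(theta_b)
    return math.atan2(y, x)

def capture(theta_a, theta_b, w_a, w_b):
    """Winner-take-all: the dominant cue captures the signal outright."""
    return theta_a if w_a >= w_b else theta_b

# A 60-degree conflict between a heavily weighted visual landmark and
# weaker background cues: integration lands between the cues, while
# capture snaps entirely to the landmark.
landmark, background = 0.0, math.radians(60)
print(math.degrees(integrate_cues(landmark, background, 0.8, 0.2)))  # ≈ 10.9
print(math.degrees(capture(landmark, background, 0.8, 0.2)))         # 0.0
```

The paper's finding, that capture dominates at small conflicts and integration grows thereafter, corresponds to behaviour between these two extremes, with effective weights that change with conflict size and experience.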


2017 ◽ Vol 372 (1714) ◽ pp. 20160113
Author(s): Richard Veale, Ziad M. Hafed, Masatoshi Yoshida

Inherent in visual scene analysis is a bottleneck associated with the need to sequentially sample locations with foveating eye movements. The concept of a ‘saliency map’ topographically encoding stimulus conspicuity over the visual scene has proven to be an efficient predictor of eye movements. Our work reviews insights into the neurobiological implementation of visual salience computation. We start by summarizing the role that different visual brain areas play in salience computation, whether at the level of feature analysis for bottom-up salience or at the level of goal-directed priority maps for output behaviour. We then delve into how a subcortical structure, the superior colliculus (SC), participates in salience computation. The SC represents a visual saliency map via a centre-surround inhibition mechanism in the superficial layers, which feeds into priority selection mechanisms in the deeper layers, thereby affecting saccadic and microsaccadic eye movements. Lateral interactions in the local SC circuit are particularly important for controlling active populations of neurons. This, in turn, might help explain long-range effects, such as those of peripheral cues on tiny microsaccades. Finally, we show how a combination of in vitro neurophysiology and large-scale computational modelling is able to clarify how salience computation is implemented in the local circuit of the SC. This article is part of the themed issue ‘Auditory and visual scene analysis’.
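The centre-surround mechanism attributed to the superficial SC layers is commonly modelled as a difference of Gaussians: a narrow excitatory centre minus a broad inhibitory surround. The sketch below is a generic one-dimensional illustration of that computation, with assumed kernel widths, and is not the authors' large-scale SC circuit model.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 1-D Gaussian kernel of the given width."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def centre_surround_salience(feature_map, sigma_c=1.0, sigma_s=4.0, size=21):
    """Difference-of-Gaussians salience: a narrow excitatory centre minus
    a broad inhibitory surround, a standard stand-in for the lateral
    inhibition described for the superficial SC layers."""
    centre = np.convolve(feature_map, gaussian_kernel(size, sigma_c), mode="same")
    surround = np.convolve(feature_map, gaussian_kernel(size, sigma_s), mode="same")
    return np.clip(centre - surround, 0.0, None)

# An isolated bright spot on a uniform field pops out, while the uniform
# region suppresses itself to near zero.
scene = np.full(100, 0.2)
scene[50] = 1.0
salience = centre_surround_salience(scene)
print(np.argmax(salience))  # 50
```

In the full circuit, the resulting salience peak would feed priority selection in the deeper layers and hence bias where the next saccade or microsaccade lands.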


2010 ◽ Vol 2010 ◽ pp. 1-13
Author(s): Elisa Magosso, Andrea Serino, Giuseppe di Pellegrino, Mauro Ursino

Many studies have revealed that attention operates across different sensory modalities to facilitate the selection of relevant information in the multimodal situations of everyday life. Cross-modal links have been observed both when attention is directed voluntarily (endogenous attention) and when it is captured involuntarily (exogenous attention). The neural basis of cross-modal attention presents a significant challenge to cognitive neuroscience. Here, we used a neural network model to elucidate the neural correlates of visual-tactile interactions in exogenous and endogenous attention. The model includes two unimodal (visual and tactile) areas connected with a bimodal area in each hemisphere, and a competition between the two hemispheres. The model is able to explain cross-modal facilitation in both exogenous and endogenous attention, ascribing it to an advantaged activation of the bimodal area on the attended side (via top-down or bottom-up biasing), with concomitant inhibition towards the opposite side. The model suggests that a competitive/cooperative interaction with biased competition may mediate both forms of cross-modal attention.
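The core biased-competition idea can be illustrated with a two-unit rate model. This is a deliberately reduced sketch, not the authors' full visual-tactile network: each unit stands in for one hemisphere's bimodal area, the units mutually inhibit each other, bottom-up input strength models exogenous capture, and an additive bias models endogenous (top-down) attention. All parameter values are assumptions for illustration.

```python
def biased_competition(input_left, input_right, bias_left=0.0, bias_right=0.0,
                       inhibition=0.6, steps=200, dt=0.05):
    """Two mutually inhibiting rate units, one per hemisphere.

    Each unit relaxes towards its rectified net input (stimulus drive plus
    top-down bias minus inhibition from the other side), so a small bias
    is amplified into a clear winner-take-most outcome.
    """
    L = R = 0.0
    for _ in range(steps):
        L += dt * (-L + max(0.0, input_left + bias_left - inhibition * R))
        R += dt * (-R + max(0.0, input_right + bias_right - inhibition * L))
    return L, R

# Equal bilateral stimulation, but endogenous attention biases the left
# side: the left unit wins and partially suppresses the right.
L, R = biased_competition(1.0, 1.0, bias_left=0.3)
print(L > R)  # True
```

With no bias the two units settle at equal activity; either a stronger stimulus (exogenous) or a top-down bias (endogenous) tips the competition the same way, which is the sense in which one mechanism can mediate both forms of cross-modal attention.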


2017 ◽ Vol 372 (1714) ◽ pp. 20160102
Author(s): Iris I. A. Groen, Edward H. Silson, Chris I. Baker

Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue ‘Auditory and visual scene analysis’.


Author(s): Riitta Salmelin, Jan Kujala, Mia Liljeström

When seeking to uncover the brain correlates of language processing, timing and location are of the essence. Magnetoencephalography (MEG) offers them both, with the highest sensitivity to cortical activity. MEG has shown its worth in revealing cortical dynamics of reading, speech perception, and speech production in adults and children, in unimpaired language processing as well as developmental and acquired language disorders. The MEG signals, once recorded, provide an extensive selection of measures for examination of neural processing. Like all other neuroimaging tools, MEG has its own strengths and limitations of which the user should be aware in order to make the best possible use of this powerful method and to generate meaningful and reliable scientific data. This chapter reviews MEG methodology and how MEG has been used to study the cortical dynamics of language.


2021 ◽ Vol 0 (0)
Author(s): Elisabeth Gibert-Sotelo, Isabel Pujol Payet

The interest in morphology and its interaction with the other grammatical components has increased in the last twenty years, with new approaches coming onto the stage that yield more accurate analyses of the processes involved in morphological construal. This special issue is a valuable contribution to this field of study. It gathers a selection of five papers from the Morphology and Syntax workshop (University of Girona, July 2017) which, on the basis of Romance and Latin phenomena, discuss word structure and its decomposition into hierarchies of features. Even though the papers share a compositional view of lexical items, they adopt different formal theoretical approaches to the lexicon-syntax interface, thus showing the benefit of bearing in mind the possibilities that each framework provides. This introductory paper serves as a guide for the readers of this special collection and offers an overview of the topics dealt with in each contribution.


Perception ◽ 1989 ◽ Vol 18 (6) ◽ pp. 739-751
Author(s): Christian Marendaz

Interindividual differences in field dependence–independence (FDI), which emerge in situations of vision–posture conflict when subjects are required to orient their bodies vertically, were investigated. The first aim was to see whether the same interindividual differences are found in judgements of the orientation of forms in focal vision, in which subjects have to deal with conflicting spatial references processed by different sensory modalities. The second aim was to test the idea that the FDI dimension is due to functional habits linked to balancing. Subjects performed Kopfermann's (1930) shape-orientation task in either a stable (experiment 1) or an unstable (experiment 2) postural condition. Results showed that the FDI dimension comes into play in the solution of the Kopfermann shape-orientation task, and that there is an interactive link between FDI and postural balance, consistent with theoretical expectations. More generally, it appears that the ‘choice’ of a spatial reference system is the product of both individual and situational characteristics, and that the ‘vicariance’ (or interchangeability) of the sensory systems dealing with gravitational upright is at the basis of this interaction.
