Boosting bottom-up and top-down visual features for saliency estimation

What activates the human mirror neuron system during observation of artificial movements: Bottom-up visual features or top-down intentions?

Neuropsychologia ◽

10.1016/j.neuropsychologia.2008.01.025 ◽

2008 ◽

Vol 46 (7) ◽

pp. 2033-2042 ◽

Cited By ~ 21

Author(s):

Annerose Engel ◽

Michael Burke ◽

Katja Fiehler ◽

Siegfried Bien ◽

Frank Rösler

Keyword(s):

Mirror Neuron System ◽

Mirror Neuron ◽

Visual Features ◽

Top Down ◽

Bottom Up ◽

Neuron System

Download Full-text

Bi-Directional Co-Attention Network for Image Captioning

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3460474 ◽

2021 ◽

Vol 17 (4) ◽

pp. 1-20

Author(s):

Weitao Jiang ◽

Weixuan Wang ◽

Haifeng Hu

Keyword(s):

A Priori ◽

Attention Mechanism ◽

Superior Performance ◽

Significant Advance ◽

Visual Features ◽

Image Captioning ◽

Top Down ◽

Bottom Up ◽

Attention Network ◽

Benchmark Datasets

Image Captioning, which automatically describes an image with natural language, is regarded as a fundamental challenge in computer vision. In recent years, significant advance has been made in image captioning through improving attention mechanism. However, most existing methods construct attention mechanisms based on singular visual features, such as patch features or object features, which limits the accuracy of generated captions. In this article, we propose a Bidirectional Co-Attention Network (BCAN) that combines multiple visual features to provide information from different aspects. Different features are associated with predicting different words, and there are a priori relations between these multiple visual features. Based on this, we further propose a bottom-up and top-down bi-directional co-attention mechanism to extract discriminative attention information. Furthermore, most existing methods do not exploit an effective multimodal integration strategy, generally using addition or concatenation to combine features. To solve this problem, we adopt the Multivariate Residual Module (MRM) to integrate multimodal attention features. Meanwhile, we further propose a Vertical MRM to integrate features of the same category, and a Horizontal MRM to combine features of the different categories, which can balance the contribution of the bottom-up co-attention and the top-down co-attention. In contrast to the existing methods, the BCAN is able to obtain complementary information from multiple visual features via the bi-directional co-attention strategy, and integrate multimodal information via the improved multivariate residual strategy. We conduct a series of experiments on two benchmark datasets (MSCOCO and Flickr30k), and the results indicate that the proposed BCAN achieves the superior performance.

Download Full-text

Salient Object Detection via Bottom-up and Top-down Visual Features

International Journal of Advancements in Computing Technology ◽

10.4156/ijact.vol5.issue6.92 ◽

2013 ◽

Vol 5 (6) ◽

pp. 785-793

Author(s):

Chao Jia ◽

Lin Yang ◽

Fang Hou ◽

Liangliang Duan

Keyword(s):

Object Detection ◽

Salient Object Detection ◽

Visual Features ◽

Salient Object ◽

Top Down ◽

Bottom Up

Download Full-text

Feature-specific patterns of attention and functional connectivity in human visual cortex

10.1101/869552 ◽

2019 ◽

Author(s):

Kirstie Wailes-Newson ◽

Antony B Morland ◽

Richard J. W. Vernon ◽

Alex R. Wade

Keyword(s):

Visual Stimulus ◽

Critical Factor ◽

Attentional Focus ◽

Visual Features ◽

Top Down ◽

Bottom Up ◽

Orientation Contrast ◽

Visual Areas ◽

Stimulus Features ◽

Outstanding Question

AbstractAttending to different features of a scene can alter the responses of neurons in early- and mid- level visual areas but the nature of this change depends on both the (top down) attentional task and the (bottom up) visual stimulus. One outstanding question is the spatial scale at which cortex is modulated by attention to low-level stimulus features such as shape, contrast and orientation. It is unclear whether the recruitment of neurons to particular tasks occurs at an area level or at the level of intra-areal sub-populations, or whether the critical factor is a change in the way that areas communicate with each other. Here we use functional magnetic resonance imaging (fMRI) and psychophysics, to ask how areas known to be involved in processing different visual features (orientation, contrast and shape) are modulated as participants switch between tasks based on those features while the visual stimulus itself is effectively constant. At a univariate level, we find almost no feature-specific bottom-up or top-down responses in the areas we examine. However, multivariate analyses reveal a complex pattern of voxel-level modulation driven by attentional task. Connectivity analyses also demonstrate flexible and selective patterns of connectivity between early visual areas as a function of attentional focus. Overall, we find that attention alters the sensitivity and connectivity of neuronal subpopulations within individual early visual areas but, surprisingly, not the univariate response amplitudes of the areas themselves.

Download Full-text