scholarly journals A magnocellular contribution to conscious object perception via temporal object segmentation

2014 ◽  
Vol 14 (10) ◽  
pp. 1334-1334
Author(s):  
S. C. Goodhew ◽  
H. L. Boal ◽  
M. Edwards
2020 ◽  
Vol 34 (07) ◽  
pp. 13066-13073 ◽  
Author(s):  
Tianfei Zhou ◽  
Shunzhou Wang ◽  
Yi Zhou ◽  
Yazhou Yao ◽  
Jianwu Li ◽  
...  

In this paper, we present a novel Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation, which provides a new way of leveraging motion information to reinforce spatio-temporal object representation. An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder, which transforms appearance features into motion-attentive representations at each convolutional stage. In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance. This is superior to the typical two-stream architecture, which treats motion and appearance separately in each stream and often suffers from overfitting to appearance information. Additionally, a bridge network is proposed to obtain a compact, discriminative and scale-sensitive representation for multi-level encoder features, which is further fed into a decoder to achieve segmentation results. Extensive experiments on three challenging public benchmarks (i.e., DAVIS-16, FBMS and Youtube-Objects) show that our model achieves compelling performance against the state-of-the-arts. Code is available at: https://github.com/tfzhou/MATNet.


Author(s):  
Mennatullah Siam ◽  
Naren Doraiswamy ◽  
Boris N. Oreshkin ◽  
Hengshuai Yao ◽  
Martin Jagersand

Significant progress has been made recently in developing few-shot object segmentation methods. Learning is shown to be successful in few-shot segmentation settings, using pixel-level, scribbles and bounding box supervision. This paper takes another approach, i.e., only requiring image-level label for few-shot object segmentation. We propose a novel multi-modal interaction module for few-shot object segmentation that utilizes a co-attention mechanism using both visual and word embedding. Our model using image-level labels achieves 4.8% improvement over previously proposed image-level few-shot object segmentation. It also outperforms state-of-the-art methods that use weak bounding box supervision on PASCAL-5^i. Our results show that few-shot segmentation benefits from utilizing word embeddings, and that we are able to perform few-shot segmentation using stacked joint visual semantic processing with weak image-level labels. We further propose a novel setup, Temporal Object Segmentation for Few-shot Learning (TOSFL) for videos. TOSFL can be used on a variety of public video data such as Youtube-VOS, as demonstrated in both instance-level and category-level TOSFL experiments.


Sign in / Sign up

Export Citation Format

Share Document