An Anchor-Free Siamese Network with Multi-Template Update for Object Tracking

Tongtong Yuan; Wenzhu Yang; Qian Li; Yuxia Wang

doi:10.3390/electronics10091067


Electronics ◽

10.3390/electronics10091067 ◽

2021 ◽

Vol 10 (9) ◽

pp. 1067

Author(s):

Tongtong Yuan ◽

Wenzhu Yang ◽

Qian Li ◽

Yuxia Wang

Keyword(s):

Object Tracking ◽

Correlation Energy ◽

Feature Maps ◽

Siamese Network ◽

Template Update ◽

Free Network ◽

Multiple Prediction ◽

Bounding Boxes ◽

High Level ◽

Speed And Accuracy

Siamese trackers are widely used because they balance speed and accuracy. Compared with anchor-based methods, anchor-free approaches can run faster without any drop in precision. Inspired by the Siamese network and the anchor-free idea, an anchor-free Siamese network (AFSN) with multi-template updates for object tracking is proposed. To improve tracking performance, a dual-fusion method is adopted that combines multi-layer features and multiple prediction results, respectively. The low-level feature maps are concatenated with the high-level feature maps to make full use of both spatial and semantic information. To make the results as stable as possible, the final results are obtained by combining multiple prediction results. For template updating, a high-confidence multi-template update mechanism is used: the average peak-to-correlation energy (APCE) determines whether the template should be updated. The anchor-free network performs object tracking in a per-pixel manner, computing the object category and bounding boxes directly. Experimental results indicate that the average overlap and success rate of the proposed algorithm increase by about 5% and 10%, respectively, compared with the SiamRPN++ algorithm on the GOT-10k (Generic Object Tracking Benchmark) dataset.
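
The per-pixel prediction described in this abstract can be illustrated with a minimal sketch. It assumes that each location of the score map carries a foreground score and four distances (left, top, right, bottom) to the box edges, which is a common anchor-free convention and not necessarily the exact head used in AFSN. The function name `decode_best_box`, the `stride` value, and the half-stride centre offset are hypothetical choices made for illustration.

```python
def decode_best_box(scores, offsets, stride=8):
    """Pick the highest-scoring location and decode its box.

    scores[i][j]  : foreground score at row i, column j (assumed layout)
    offsets[i][j] : (l, t, r, b) distances in pixels from that location
                    to the left, top, right and bottom box edges
    stride        : assumed spacing of score-map cells in image pixels
    Returns ((x1, y1, x2, y2), score).
    """
    best = None
    for i, row in enumerate(scores):
        for j, s in enumerate(row):
            if best is None or s > best[0]:
                best = (s, i, j)
    s, i, j = best
    # Map the score-map cell back to an image coordinate (cell centre).
    cx = j * stride + stride // 2
    cy = i * stride + stride // 2
    l, t, r, b = offsets[i][j]
    return (cx - l, cy - t, cx + r, cy + b), s


scores = [[0.1, 0.2], [0.9, 0.3]]
offsets = [[(4, 6, 10, 2)] * 2 for _ in range(2)]
box, score = decode_best_box(scores, offsets)
```

No anchor boxes are enumerated here: the box comes directly from the regressed distances at a single location, which is the property the abstract attributes to the anchor-free design.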


Single-scale Siamese Network Based RGB-D Object Tracking With Adaptive Bounding Boxes

Neurocomputing ◽

10.1016/j.neucom.2021.04.016 ◽

2021 ◽

Author(s):

Feng Xiao ◽

Qiuxia Wu ◽

Han Huang

Keyword(s):

Object Tracking ◽

Siamese Network ◽

Single Scale ◽

Bounding Boxes


Double stage Siamese network object tracking algorithm based on template update

10.1109/eiecs53707.2021.9588075 ◽

2021 ◽

Author(s):

Jingsong Leng ◽

Hua Cai ◽

Weigang Wang ◽

Zhiyong Ma

Keyword(s):

Object Tracking ◽

Tracking Algorithm ◽

Siamese Network ◽

Template Update ◽

Network Object


Complementary Object Tracking Using Average Peak-to-Correlation Energy

10.3233/faia210046 ◽

2021 ◽

Author(s):

Kosuke Honda ◽

Hamido Fujita

Keyword(s):

Neural Networks ◽

Object Tracking ◽

Convolutional Neural Networks ◽

Correlation Energy ◽

Target Object ◽

The Other ◽

Tracking Performance ◽

Correlation Filter ◽

Evaluation Index ◽

Siamese Network

In recent years, template-based methods such as Siamese network trackers and correlation filter (CF)-based trackers have achieved state-of-the-art performance on several benchmarks. Recent Siamese network trackers use deep features extracted from convolutional neural networks to locate the target. However, the tracking performance of these trackers decreases when distractors similar to the object are present or when the target object is deformed. CF-based trackers, on the other hand, use handcrafted features (e.g., HOG features) to spatially locate the target. These two approaches have complementary characteristics due to differences in learning methods, features used, and the size of search regions. We also found that these trackers are complementary in terms of benchmark performance. Therefore, we propose the “Complementary Tracking framework using Average peak-to-correlation energy” (CTA). CTA is a generic object tracking framework that connects CF-trackers and Siamese-trackers in parallel and exploits their complementary features. In CTA, when a tracking failure of the Siamese tracker is detected using the average peak-to-correlation energy (APCE), an evaluation index computed on the response map, the CF-trackers correct the output. In experiments on OTB100, CTA significantly improves performance over the original tracker for several combinations of Siamese-trackers and CF-trackers.
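
As a rough illustration of the APCE index mentioned in this abstract, the sketch below uses the commonly cited definition APCE = |F_max - F_min|^2 / mean((F - F_min)^2) over a response map F. The switching rule in `fuse`, its fixed `threshold`, and both function names are assumptions made for this example; the abstract does not state how CTA sets its failure criterion.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map.

    response is a list of rows of floats. A sharp single peak gives a
    high value; a flat or multi-peak map gives a low value.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0.0:
        return 0.0  # completely flat map: no peak at all
    return (f_max - f_min) ** 2 / energy


def fuse(siamese_box, cf_box, response, threshold):
    """Hypothetical rule: fall back to the CF box when APCE is low."""
    return cf_box if apce(response) < threshold else siamese_box


sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
noisy = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
chosen = fuse("siamese", "cf", noisy, threshold=5.0)
```

With these toy maps the single-peak response scores higher than the four-peak one, so a threshold between the two values would hand the ambiguous frame to the CF tracker.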


SIAMESE NETWORK COMBINED WITH ATTENTION MECHANISM FOR OBJECT TRACKING

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2020-1315-2020 ◽

2020 ◽

Vol XLIII-B2-2020 ◽

pp. 1315-1322

Author(s):

D. Zhang ◽

J. Lv ◽

Z. Cheng ◽

Y. Bai ◽

Y. Cao

Keyword(s):

Deep Learning ◽

Object Tracking ◽

Network Model ◽

Input Image ◽

Attention Mechanism ◽

Tracking Algorithm ◽

Learning Object ◽

Feature Maps ◽

Siamese Network ◽

Network Object

Abstract. With the development of deep-learning object tracking methods in recent years, the fully convolutional Siamese network tracker SiamFC has become a classic deep-learning object tracking algorithm. Because the accuracy of SiamFC's tracking results drops in complex backgrounds, this paper introduces an attention mechanism into SiamFC that applies channel and spatial weighting to the feature maps obtained by convolving the input image. The CNN backbone of the algorithm is also adjusted, yielding the proposed Siamese network combined with an attention mechanism for object tracking. This strengthens the effectiveness of feature extraction and enhances the network's ability to discriminate targets. The algorithm is tested on the OTB2015, VOT2016 and VOT2017 datasets and compared with multiple object tracking algorithms. Experimental results show that the proposed algorithm handles complex backgrounds in object tracking better and has certain advantages over other algorithms.


MFCFSiam: A Correlation-Filter-Guided Siamese Network with Multifeature for Visual Tracking

Wireless Communications and Mobile Computing ◽

10.1155/2020/6681391 ◽

2020 ◽

Vol 2020 ◽

pp. 1-19

Author(s):

Chenpu Li ◽

Qianjian Xing ◽

Zhenguo Ma ◽

Ke Zang

Keyword(s):

Visual Tracking ◽

Correlation Filter ◽

Evaluation Criterion ◽

Visual Object ◽

Semantic Features ◽

Similarity Learning ◽

Feature Maps ◽

Siamese Network ◽

Histograms Of Oriented Gradients ◽

High Level

With the development of deep learning, trackers based on convolutional neural networks (CNNs) have made significant achievements in visual tracking over the years. The fully convolutional Siamese network (SiamFC) is a typical representative of those trackers. SiamFC designs a two-branch CNN architecture and models visual tracking as a general similarity-learning problem. However, the feature maps it uses for visual tracking come only from the last layer of the CNN. Those features contain high-level semantic information but lack sufficiently detailed texture information. As a result, the SiamFC tracker tends to drift when other objects of the same category are present or when the contrast between the target and the background is very low. To address this problem, we design a novel tracking algorithm that combines a correlation filter tracker and the SiamFC tracker into one framework. In this framework, the correlation filter tracker uses Histograms of Oriented Gradients (HOG) and color name (CN) features to guide the SiamFC tracker. The framework also contains an evaluation criterion that we design to evaluate the tracking results of the two trackers. If this criterion finds that the SiamFC tracker has failed, our framework uses the tracking result from the correlation filter tracker to correct SiamFC. In this way, the defects of SiamFC’s high-level semantic features are remedied by the HOG and CN features. Our algorithm thus provides a framework that combines two trackers and makes them complement each other in visual tracking. To the best of our knowledge, our algorithm is also the first to design an evaluation criterion using a correlation filter and zero padding to evaluate the tracking result. Comprehensive experiments are conducted on the Online Tracking Benchmark (OTB), Temple Color (TC128), Benchmark for UAV Tracking (UAV-123), and Visual Object Tracking (VOT) Benchmark. The results show that our algorithm achieves competitive performance compared with the baseline tracker and several other state-of-the-art trackers.


A Deep Hyper Siamese Network for Real-Time Object Tracking

Transactions on Machine Learning and Artificial Intelligence ◽

10.14738/tmlai.81.8020 ◽

2020 ◽

Vol 8 (1) ◽

pp. 35-46

Author(s):

Yongpeng Zhao ◽

Lasheng Yu ◽

Xiaopeng Zheng

Keyword(s):

Object Tracking ◽

Target Object ◽

Visual Object ◽

Feature Maps ◽

Deep Convolutional Neural Networks ◽

Backbone Networks ◽

Feature Representations ◽

Siamese Network ◽

Benchmark Datasets ◽

Siamese Networks

Siamese networks have drawn increasing interest in the field of visual object tracking due to their balance of precision and efficiency. However, Siamese trackers use relatively shallow backbone networks, such as AlexNet, and therefore do not take full advantage of the capabilities of modern deep convolutional neural networks (CNNs). Moreover, the feature representations of the target object in a Siamese tracker are extracted from the last layer of the CNN and mainly capture semantic information, which makes the tracker's precision relatively low and causes it to drift easily in the presence of similar distractors. In this paper, a new nonpadding residual unit (NPRU) is designed and used to stack a 22-layer deep ResNet, referred to as ResNet22. With ResNet22 as the backbone network, we can build a deep Siamese network, which greatly enhances tracking performance. Considering that different levels of the CNN's feature maps represent different aspects of the target object, we aggregate different deep convolutional layers to make use of ResNet22's multilevel feature maps, which form hyperfeature representations of targets. The designed deep hyper Siamese network is named DHSiam. Experimental results show that DHSiam achieves significant improvement on multiple benchmark datasets.


Electronic Structure Benchmark Calculations of Inorganic and Biochemical Carboxylation Reactions

10.26434/chemrxiv.6983264 ◽

2018 ◽

Author(s):

Oscar A. Douglas-Gallardo ◽

David A. Sáez ◽

Stefan Vogt-Geisse ◽

Esteban Vöhringer-Martinez

Keyword(s):

Electronic Structure ◽

Chemical Reactions ◽

Density Functional ◽

Correlation Energy ◽

Electronic Energy ◽

Basis Sets ◽

Electronic Correlation ◽

Correct Description ◽

Carbon Dioxide Co2 ◽

High Level

Carboxylation reactions represent a special class of chemical reactions characterized by the presence of a carbon dioxide (CO2) molecule as a reactive species in the global chemical equation. These reactions are fundamental to CO2 fixation and thus to building up more complex molecules through different technological and biochemical processes. In this context, a correct description of the CO2 electronic structure is crucial for studying the chemical and electronic properties associated with this kind of reaction. Here, a systematic study of the CO2 electronic structure and its contribution to the electronic energies of different carboxylation reactions has been carried out by means of several high-level ab initio post-Hartree-Fock (post-HF) and Density Functional Theory (DFT) calculations for a set of biochemical and inorganic systems. We have found that a correct description of the CO2 electronic correlation energy requires post-CCSD(T) contributions (beyond the gold standard). These high-order excitations are required to properly describe the interactions of the four π-electrons associated with the two degenerate π-molecular orbitals of the CO2 molecule. Likewise, our results show that in some reactions it is possible to obtain accurate reaction electronic energy values with computationally less demanding methods when the error in the electronic correlation energy compensates between reactants and products. Furthermore, the provided post-HF reference values allowed us to validate different DFT exchange-correlation functionals combined with different basis sets for chemical reactions that are relevant in biochemical CO2-fixing enzymes.


Research of single object tracking method based on Siamese Network and Level Set

2020 The 4th International Conference on Video and Image Processing ◽

10.1145/3447450.3447477 ◽

2020 ◽

Author(s):

Tianbo Liu ◽

Li Su ◽

Shuai Yuan ◽

Gong Cheng ◽

Feng Zhang

Keyword(s):

Object Tracking ◽

Level Set ◽

Single Object ◽

Tracking Method ◽

Siamese Network


Attention Modulated Multiple Object Tracking with Motion Enhancement and Dual Correlation

Symmetry ◽

10.3390/sym13020266 ◽

2021 ◽

Vol 13 (2) ◽

pp. 266 ◽

Cited By ~ 1

Author(s):

Yifeng Wang ◽

Zhijiang Zhang ◽

Ning Zhang ◽

Dan Zeng

Keyword(s):

Object Tracking ◽

Multiple Object Tracking ◽

Tracking Accuracy ◽

Feature Maps ◽

Backbone Networks ◽

Multiple Object ◽

The Arts ◽

Training Stage ◽

The One ◽

Baseline State

The one-shot multiple object tracking (MOT) framework has drawn increasing attention in the MOT research community due to its advantage in inference speed. However, the tracking accuracy of current one-shot approaches can be inferior to that of their two-stage counterparts. The reasons are two-fold. First, motion information is often neglected because of the single-image input. Second, detection and re-identification (ReID) are two different tasks with different focuses, so training them jointly can lead to suboptimal performance. To alleviate these limitations, we propose a one-shot network named Motion and Correlation-Multiple Object Tracking (MAC-MOT). MAC-MOT introduces a motion enhance attention module (MEA) and a dual correlation attention module (DCA). MEA computes differences between adjacent feature maps, which enhances motion-related features while suppressing irrelevant information. The DCA module decouples the detection task and the re-identification task to strike a balance and reduce the competition between them. Moreover, symmetry is a core design idea in our proposed framework, reflected in the Siamese-based deep learning backbone networks, the dual-stream image input, and the dual correlation attention module. Our proposed approach is evaluated on the popular multiple object tracking benchmarks MOT16 and MOT17. We demonstrate that the proposed MAC-MOT achieves better performance than the state-of-the-art (SOTA) baselines.


Efficient common objects localization based on deep hybrid Siamese network

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210854 ◽

2021 ◽

pp. 1-10

Author(s):

Mona M. Moussa ◽

Rasha Shoitan ◽

Mohamed S. Abdallah

Keyword(s):

Distance Matrix ◽

Multiple Objects ◽

Siamese Network ◽

Generation Stage ◽

Pascal Voc ◽

The Common ◽

Conventional Methods ◽

Weakly Supervised ◽

Bounding Boxes ◽

Two Stages

Finding the common objects in a set of images is considered one of the recent challenges in different computer vision tasks. Most conventional methods rely on unsupervised or weakly supervised co-localization to find the common objects; however, these methods require producing a huge number of region proposals. This paper tackles this problem by exploiting the benefits of supervised learning to localize the common object in a set of unlabeled images that contain multiple objects or no common objects. Two stages are proposed to localize the common objects: the candidate box generation stage and the matching and clustering stage. In the candidate box generation stage, the objects are localized and enclosed in bounding boxes. The matching and clustering stage is applied to the generated bounding boxes and creates a distance matrix based on a trained Siamese network to reflect the matching percentage. Hierarchical clustering uses the generated distance matrix to find the common objects and create a cluster for each one. The proposed method is trained on the PASCAL VOC 2007 dataset and is assessed through different experiments on the PASCAL VOC 2007 6×2 and Object Discovery datasets. The results reveal that the proposed method outperforms the conventional methods by 8% to 40% in terms of the CorLoc metric.
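
The matching and clustering stage described in this abstract can be sketched in a few lines. The example assumes the pairwise distance matrix has already been produced (here it is hand-written rather than computed by a Siamese network), and it assumes average linkage with a fixed cut distance; the abstract names neither the linkage rule nor a stopping criterion. The function name `cluster_boxes` and the parameter `cut` are hypothetical.

```python
def cluster_boxes(dist, cut):
    """Agglomerative clustering on a symmetric distance matrix.

    dist[i][j] : distance between candidate boxes i and j
    cut        : assumed threshold; merging stops once the closest
                 pair of clusters is farther apart than this value
    Returns a list of clusters, each a sorted list of box indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(linkage(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        d, p, q = min(pairs)
        if d > cut:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy matrix: boxes 0 and 1 match each other, as do boxes 2 and 3.
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
groups = cluster_boxes(dist, cut=0.5)
```

Each resulting group would correspond to one candidate common object; in practice a library routine for hierarchical clustering would replace this quadratic loop.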
