Object Tracking with Adaptive Multicue Incremental Visual Tracker

2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Jiang-tao Wang ◽  
De-bao Chen ◽  
Jing-ai Zhang ◽  
Su-wen Li ◽  
Xing-jun Wang

Subspace-learning-based methods such as the Incremental Visual Tracker (IVT) have been shown to be quite effective for the visual tracking problem. However, IVT may fail to follow the target when the target undergoes drastic pose or illumination changes. In this work, we present a novel tracker that enhances the IVT algorithm by employing a multicue-based adaptive appearance model. First, we integrate cues both in feature space and in geometric space. Second, the integration depends directly on the dynamically changing reliabilities of the visual cues. These two aspects of our method allow the tracker to adapt easily to changes in context and accordingly improve tracking accuracy by resolving ambiguities. Experimental results demonstrate that subspace-based tracking is strongly improved by exploiting multiple cues through the proposed algorithm.
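The reliability-dependent integration described above can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so the weighted sum, the exponential reliability update, and the cue names `intensity` and `edge` below are all assumptions made for illustration, not the authors' formulation.

```python
def fuse_cues(likelihoods, reliabilities):
    """Fuse per-cue candidate likelihoods with normalized reliability weights.

    likelihoods:   dict mapping cue name -> list of likelihoods, one per candidate
    reliabilities: dict mapping cue name -> non-negative reliability score
    Both the cue names and the weighted-sum rule are illustrative assumptions.
    """
    total = sum(reliabilities.values())
    if total <= 0:
        raise ValueError("at least one cue must have positive reliability")
    weights = {cue: r / total for cue, r in reliabilities.items()}
    n_candidates = len(next(iter(likelihoods.values())))
    return [
        sum(weights[cue] * likelihoods[cue][i] for cue in likelihoods)
        for i in range(n_candidates)
    ]


def update_reliability(old, agreement, rate=0.1):
    """Exponentially smooth a cue's reliability toward its latest agreement score."""
    return (1.0 - rate) * old + rate * agreement


# Hypothetical cues scoring three candidate target locations.
likelihoods = {"intensity": [0.2, 0.7, 0.1], "edge": [0.6, 0.3, 0.1]}
reliabilities = {"intensity": 0.9, "edge": 0.1}

fused = fuse_cues(likelihoods, reliabilities)
best = max(range(len(fused)), key=fused.__getitem__)
print(best, fused)
```

Because the weights are renormalized each frame, a cue whose reliability drops (for example under an illumination change) contributes less to the fused score without being removed outright.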

2016 ◽  
Vol 2016 ◽  
pp. 1-13
Author(s):  
Honghong Yang ◽  
Shiru Qu

Object tracking based on sparse representation has produced promising results in recent years. However, trackers built on the sparse representation framework always overemphasize the sparse representation and ignore the correlation of visual information. In addition, sparse coding methods encode each local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. First, multiple complementary features are used to describe the object appearance; the appearance of the tracked target is modeled by instantaneous and stable appearance features simultaneously. A two-stage sparse coding method, which takes both the spatial neighborhood information of the image patch and the computational burden into consideration, is used to compute the reconstructed object appearance. Then, the reliability of each tracker is measured by the tracking likelihood function of the transient and reconstructed appearance models. Finally, the most reliable tracker is obtained within a well-established particle filter framework; the training set and the template library are incrementally updated based on the current tracking results. Experimental results on different challenging video sequences show that the proposed algorithm performs well, with superior tracking accuracy and robustness.


Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 266 ◽  
Author(s):  
Yifeng Wang ◽  
Zhijiang Zhang ◽  
Ning Zhang ◽  
Dan Zeng

The one-shot multiple object tracking (MOT) framework has drawn increasing attention in the MOT research community because of its advantage in inference speed. However, the tracking accuracy of current one-shot approaches can be inferior to that of their two-stage counterparts. The reasons are twofold. First, motion information is often neglected because of the single-image input. Second, detection and re-identification (ReID) are two different tasks with different focuses, so training them jointly can lead to suboptimal performance. To alleviate these limitations, we propose a one-shot network named Motion and Correlation-Multiple Object Tracking (MAC-MOT). MAC-MOT introduces a motion enhance attention module (MEA) and a dual correlation attention module (DCA). MEA computes differences between adjacent feature maps, which enhances motion-related features while suppressing irrelevant information. The DCA module decouples the detection task and the re-identification task to strike a balance and reduce the competition between them. Moreover, symmetry is a core design idea in our proposed framework, reflected in the Siamese-based deep learning backbone networks, the dual-stream image input, and the dual correlation attention module. Our proposed approach is evaluated on the popular multiple object tracking benchmarks MOT16 and MOT17. We demonstrate that the proposed MAC-MOT achieves better performance than the state-of-the-art (SOTA) baselines.
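The idea of differencing adjacent feature maps to emphasize motion can be sketched in a few lines. This is only a toy illustration: the actual MEA module is a learned attention block whose architecture is not given in the abstract, and the `tanh` gate and the function name `motion_gate` are assumptions.

```python
import math


def motion_gate(prev_map, curr_map):
    """Gate the current feature map by how much it changed since the previous one.

    Each value is multiplied by tanh(|current - previous|), so features that did
    not change are suppressed to zero and strongly changing ones pass almost
    unchanged. The gate is a hand-picked stand-in for a learned attention map.
    """
    gated = []
    for prev_row, curr_row in zip(prev_map, curr_map):
        gated.append(
            [c * math.tanh(abs(c - p)) for p, c in zip(prev_row, curr_row)]
        )
    return gated


# Two tiny 2x2 "feature maps" from adjacent frames; only the top-right value moves.
previous = [[1.0, 0.0], [0.5, 0.5]]
current = [[1.0, 3.0], [0.5, 0.5]]

gated = motion_gate(previous, current)
print(gated)
```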


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2894
Author(s):  
Minh-Quan Dao ◽  
Vincent Frémont

Multi-Object Tracking (MOT) is an integral part of any autonomous driving pipeline because it produces trajectories of other moving objects in the scene and predicts their future motion. Thanks to recent advances in 3D object detection enabled by deep learning, track-by-detection has become the dominant paradigm in 3D MOT. In this paradigm, a MOT system essentially consists of an object detector and a data association algorithm that establishes track-to-detection correspondence. While 3D object detection has been actively researched, association algorithms for 3D MOT have settled on bipartite matching formulated as a Linear Assignment Problem (LAP) and solved by the Hungarian algorithm. In this paper, we adapt to the 3D setting a two-stage data association method that was successfully applied to image-based tracking, thus providing an alternative for data association in 3D MOT. Our method outperforms the baseline using one-stage bipartite matching for data association, achieving 0.587 Average Multi-Object Tracking Accuracy (AMOTA) on the NuScenes validation set and 0.365 AMOTA (at level 2) on the Waymo test set.
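To make the one-stage baseline concrete, the sketch below sets up track-to-detection association as a linear assignment problem over Euclidean distances between 3D centers. It solves the problem by brute-force enumeration rather than the Hungarian algorithm, which is only practical for a handful of objects; the distance cost, the gating threshold `max_dist`, and the function name `associate` are assumptions for illustration and are not taken from the paper.

```python
import math
from itertools import permutations


def associate(tracks, detections, max_dist=2.0):
    """Match each track to a distinct detection, minimizing total distance.

    tracks, detections: lists of (x, y, z) centers.
    Returns (track_index, detection_index) pairs whose distance is within
    max_dist. Brute force over assignments; a real system would use the
    Hungarian algorithm for the same objective.
    """
    if len(tracks) > len(detections):
        raise ValueError("this sketch assumes no more tracks than detections")
    best_cost, best_perm = math.inf, ()
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(math.dist(tracks[t], detections[d]) for t, d in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    # Gate the optimal assignment: distant pairs are treated as unmatched.
    return [
        (t, d)
        for t, d in enumerate(best_perm)
        if math.dist(tracks[t], detections[d]) <= max_dist
    ]


tracks = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
detections = [(5.2, 5.1, 0.0), (0.1, -0.1, 0.0), (20.0, 20.0, 0.0)]

matches = associate(tracks, detections)
print(matches)
```

Detections left unmatched (index 2 here) would typically start new tracks, and a two-stage scheme would run a second matching pass over what the first pass left over.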


Author(s):  
Hongyang Yu ◽  
Guorong Li ◽  
Weigang Zhang ◽  
Hongxun Yao ◽  
Qingming Huang

Author(s):  
Tianyang Xu ◽  
Zhenhua Feng ◽  
Xiao-Jun Wu ◽  
Josef Kittler

Discriminative Correlation Filters (DCF) have been shown to achieve impressive performance in visual object tracking. However, existing DCF-based trackers rely heavily on learning regularised appearance models from invariant image feature representations. To further improve the accuracy of DCF and provide a parsimonious model from the attribute perspective, we propose to gauge the relevance of multi-channel features for the purpose of channel selection. This is achieved by assessing the information conveyed by the features of each channel as a group, using an adaptive group elastic net inducing independent sparsity and temporal smoothness on the DCF solution. The robustness and stability of the learned appearance model are significantly enhanced by the proposed method, as the process of channel selection performs implicit spatial regularisation. We use the augmented Lagrangian method to optimise the discriminative filters efficiently. The experimental results obtained on a number of well-known benchmarking datasets demonstrate the effectiveness and stability of the proposed method. Superior performance over state-of-the-art trackers is achieved using less than 10% of the deep feature channels.
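Treating each channel as a group is what lets a sparsity penalty switch whole channels off. A minimal sketch of that mechanism is the group soft-thresholding operator used for plain group-lasso penalties, shown below. It is not the paper's adaptive group elastic net, omits the temporal smoothness term and the augmented Lagrangian solver, and the names `group_soft_threshold` and `lam` are assumptions.

```python
import math


def group_soft_threshold(channels, lam):
    """Shrink each channel's filter weights as a group.

    channels: list of weight lists, one list per feature channel.
    lam:      threshold; channels whose L2 norm is at most lam are zeroed.
    Surviving channels are scaled by (1 - lam / norm), the proximal step for
    the group-lasso penalty lam * sum of per-channel L2 norms.
    """
    shrunk = []
    for weights in channels:
        norm = math.sqrt(sum(w * w for w in weights))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in weights])
    return shrunk


# One informative channel (norm 5) and one weak channel (norm below lam).
channels = [[3.0, 4.0], [0.1, 0.2]]
selected = group_soft_threshold(channels, lam=1.0)
kept = [i for i, ch in enumerate(selected) if any(w != 0.0 for w in ch)]
print(selected, kept)
```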


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2848 ◽  
Author(s):  
Leonel Rosas-Arias ◽  
Jose Portillo-Portillo ◽  
Aldo Hernandez-Suarez ◽  
Jesus Olivares-Mercado ◽  
Gabriel Sanchez-Perez ◽  
...  

The counting of vehicles plays an important role in measuring the behavior patterns of traffic flow in cities, as streets and avenues can easily become crowded. To address this problem, some Intelligent Transport Systems (ITSs) have been implemented to count vehicles using already established video surveillance infrastructure. With this in mind, we present an on-line learning methodology for counting vehicles in video sequences based on Incremental Principal Component Analysis (Incremental PCA). This incremental learning method allows us to identify the maximum variability (i.e., motion detection) between a previous block of frames and the current one by using only the first projected eigenvector. Once the projected image is obtained, we apply dynamic thresholding to perform image binarization. Then, a series of post-processing steps are applied to enhance the binary image containing the objects in motion. Finally, we count the number of vehicles by implementing a virtual detection line in each of the road lanes. These lines determine the instants at which the vehicles pass completely through them. Results show that our proposed methodology is able to count vehicles with 96.6% accuracy at 26 frames per second on average, dealing with both camera jitter and sudden illumination changes caused by the environment and the camera auto exposure.
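The final counting step can be sketched independently of the Incremental PCA stage. The example below assumes the binarized foreground pixels lying on one lane's virtual detection line are already available for each frame, and counts a vehicle each time the line goes from occupied to free (that is, the vehicle has passed completely). The occupancy rule, the `min_fraction` threshold, and both function names are illustrative assumptions, not details from the paper.

```python
def line_is_occupied(line_pixels, min_fraction=0.3):
    """Return True when enough binary pixels on the detection line are foreground."""
    if not line_pixels:
        return False
    return sum(line_pixels) / len(line_pixels) >= min_fraction


def count_vehicles(frames, min_fraction=0.3):
    """Count occupied -> free transitions of the detection line across frames.

    frames: list of per-frame pixel lists (0 = background, 1 = moving object)
            sampled along one lane's virtual detection line.
    """
    count = 0
    previously_occupied = False
    for line_pixels in frames:
        occupied = line_is_occupied(line_pixels, min_fraction)
        if previously_occupied and not occupied:
            count += 1  # the vehicle has just cleared the line
        previously_occupied = occupied
    return count


# Six frames: one vehicle crosses in frames 1-2, another in frame 4.
frames = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
total = count_vehicles(frames)
print(total)
```

Counting on the falling edge rather than the rising edge means a vehicle still on the line in the last frame is not counted until it leaves.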


2017 ◽  
Vol 237 ◽  
pp. 101-113 ◽  
Author(s):  
Zhiqiang Zhao ◽  
Ping Feng ◽  
Tianjiang Wang ◽  
Fang Liu ◽  
Caihong Yuan ◽  
...  

Behaviour ◽  
2009 ◽  
Vol 146 (11) ◽  
pp. 1485-1498 ◽  
Author(s):  
Nancy Kohn ◽  
Robert Jaeger

The use of multiple cues can enhance the detection, recognition, discrimination, and memorability of individuals by receivers. We conducted two experiments, using only males, to test whether territorial red-backed salamanders, Plethodon cinereus, could use only chemical or only visual cues to remember familiar conspecifics. In both experiments, focal males spent significantly more time threatening unfamiliar than familiar male intruders. They also chemoinvestigated the filter paper containing chemical cues of unfamiliar intruders more often than that of familiar intruders. These results suggest that red-backed salamanders can use both chemical and visual cues to recognize familiar individuals, allowing them to distinguish between less threatening neighbours and more threatening intruders in the heterogeneous forest floor habitat, where visual cues alone would not always be available.

