Representation for Action Recognition Using Trajectory-Based Low-Level Local Feature and Mid-Level Motion Feature

2017 ◽  
Vol 2017 ◽  
pp. 1-7
Author(s):  
Xiaoqiang Li ◽  
Dan Wang ◽  
Yin Zhang

Dense trajectories and low-level local features have recently been widely used in action recognition. However, most of these methods ignore the motion parts of an action, which are the key factor in distinguishing different human actions. This paper proposes a new two-layer representation for action recognition that describes a video with both low-level features and a mid-level motion-part model. First, we encode the compensated-flow (w-flow) trajectory-based local features with Fisher Vectors (FV) to retain the low-level characteristics of motion. Then, motion parts are extracted by clustering similar trajectories according to the spatiotemporal distance between them. Finally, the representation of an action video is the concatenation of the low-level descriptor encoding vector and the motion-part encoding vector, which is used as input to LibSVM for action recognition. The experimental results demonstrate improvements on the J-HMDB and YouTube datasets, with accuracies of 67.4% and 87.6%, respectively.
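The pipeline described above (Fisher Vector encoding of local descriptors, concatenation with a motion-part vector, and LibSVM classification) can be sketched as follows. This is a minimal illustration on toy data, not the authors' implementation; the first-order (means-only) Fisher Vector and the synthetic descriptors are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC  # sklearn's SVC is backed by LibSVM

def fisher_vector(descriptors, gmm):
    """First-order Fisher Vector: gradient of the GMM log-likelihood
    with respect to the component means, then power + L2 normalized."""
    n = descriptors.shape[0]
    gamma = gmm.predict_proba(descriptors)                 # (n, K) posteriors
    mu, sigma, w = gmm.means_, np.sqrt(gmm.covariances_), gmm.weights_
    fv = np.stack([
        (gamma[:, k:k + 1] * (descriptors - mu[k]) / sigma[k]).sum(axis=0)
        / (n * np.sqrt(w[k]))
        for k in range(gmm.n_components)
    ]).ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                 # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)               # L2 normalization

rng = np.random.default_rng(0)
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(rng.normal(size=(500, 8)))                         # GMM on pooled local descriptors

videos, labels = [], []
for label in (0, 1):
    for _ in range(10):
        descs = rng.normal(loc=label, size=(60, 8))        # toy trajectory descriptors
        motion_part = rng.normal(loc=label, size=16)       # toy motion-part encoding
        videos.append(np.concatenate([fisher_vector(descs, gmm), motion_part]))
        labels.append(label)

clf = SVC(kernel="linear").fit(videos, labels)             # final action classifier
```

The concatenated vector here has K·D = 32 Fisher dimensions plus 16 motion-part dimensions; the real descriptors and clustering are far higher-dimensional.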

2020 ◽  
Vol 34 (07) ◽  
pp. 12886-12893
Author(s):  
Xiao-Yu Zhang ◽  
Haichao Shi ◽  
Changsheng Li ◽  
Peng Li

Weakly supervised action recognition and localization in untrimmed videos is a challenging problem with extensive applications. The overwhelming amount of irrelevant background content in untrimmed videos severely hampers effective identification of the actions of interest. In this paper, we propose a novel multi-instance multi-label modeling network based on spatio-temporal pre-trimming to recognize actions and locate the corresponding frames in untrimmed videos. Motivated by the fact that the person is the key factor in a human action, we spatially and temporally segment each untrimmed video into person-centric clips using pose estimation and tracking techniques. Given the bag-of-instances structure associated with video-level labels, action recognition is naturally formulated as a multi-instance multi-label learning problem. The network is optimized iteratively with selective coarse-to-fine pre-trimming based on instance-label activation. After convergence, temporal localization is further achieved with a local-global temporal class activation map. Extensive experiments are conducted on two benchmark datasets, i.e., THUMOS14 and ActivityNet1.3, and the experimental results clearly corroborate the efficacy of our method compared with the state of the art.
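The multi-instance multi-label formulation treats each untrimmed video as a bag of person-centric clip instances carrying only video-level labels. A minimal sketch of one common aggregation choice, top-k mean pooling of instance-level class activations into video-level scores, is shown below on synthetic activations; the pooling scheme and the numbers are illustrative assumptions, not the paper's exact network.

```python
import numpy as np

def video_scores(instance_logits, k=3):
    """Top-k mean pooling: average the k strongest per-instance
    activations for each class, then squash to a probability."""
    k = min(k, instance_logits.shape[0])
    topk = np.sort(instance_logits, axis=0)[-k:]           # k highest per class
    return 1.0 / (1.0 + np.exp(-topk.mean(axis=0)))        # sigmoid per class

# toy bag: 8 person-centric clip instances, 4 action classes
rng = np.random.default_rng(1)
logits = rng.normal(size=(8, 4))
logits[2:5, 1] += 5.0            # class 1 fires strongly in a few clips
probs = video_scores(logits)     # video-level multi-label scores
```

Because only a few instances need to fire for a class to score highly at the video level, this kind of pooling is what lets video-level labels supervise instance-level activations.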


Drones ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 87
Author(s):  
Ketan Kotecha ◽  
Deepak Garg ◽  
Balmukund Mishra ◽  
Pratik Narang ◽  
Vipual Kumar Mishra

Visual data collected from drones has opened a new direction for surveillance applications and has recently attracted considerable attention among computer vision researchers. Given the availability and increasing use of drones in both the public and private sectors, they are a critical future technology for solving surveillance problems in remote areas. One of the fundamental challenges in recognizing human actions in crowd-monitoring videos is the precise modeling of an individual's motion features. Most state-of-the-art methods rely heavily on optical flow for motion modeling and representation, and motion modeling through optical flow is time-consuming. This article addresses this issue and provides a novel architecture that eliminates the dependency on optical flow. The proposed architecture uses two sub-modules, FMFM (faster motion feature modeling) and AAR (accurate action recognition), to accurately classify aerial surveillance actions. Another critical issue in aerial surveillance is the scarcity of datasets. Of the few datasets proposed recently, most contain multiple humans performing different actions in the same scene, as in a crowd-monitoring video, and are therefore not directly suitable for training action recognition models. Given this, we propose a novel dataset captured from top-view aerial surveillance with good variety in actors, time of day, and environment. The proposed architecture can be applied across different terrains because it removes the background before applying the action recognition model. It is validated through experiments at varying levels of investigation and achieves a remarkable validation accuracy of 0.90 in aerial action recognition.
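As an illustration of motion modeling without optical flow, the sketch below pools absolute inter-frame differences over a coarse spatial grid. This is only a hedged stand-in for the FMFM module, whose actual design is not described here; the grid pooling and the toy frames are assumptions.

```python
import numpy as np

def motion_feature(frames, grid=4):
    """Motion descriptor from temporal frame differencing: mean absolute
    inter-frame change, pooled over a coarse grid of spatial cells."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0)).mean(axis=0)
    h, w = diffs.shape
    h -= h % grid
    w -= w % grid
    pooled = diffs[:h, :w].reshape(grid, h // grid, grid, w // grid).mean(axis=(1, 3))
    return pooled.ravel()

# toy clips: an empty scene vs. a bright block sweeping across the frame
T, H, W = 10, 32, 32
static = np.zeros((T, H, W))
moving = np.zeros((T, H, W))
for t in range(T):
    moving[t, :, t:t + 4] = 255.0
f_static = motion_feature(static)
f_moving = motion_feature(moving)
```

Frame differencing costs a single subtraction per pixel, which is why dropping optical flow can make motion modeling substantially faster.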


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Weihua Zhang ◽  
Yi Zhang ◽  
Chaobang Gao ◽  
Jiliu Zhou

This paper introduces a method for human action recognition based on optical-flow motion feature extraction. Automatic spatial and temporal alignment are combined to encourage temporal consistency within each action via an enhanced dynamic time warping (DTW) algorithm. In addition, a fast method based on a coarse-to-fine DTW constraint is introduced to improve computational performance without reducing accuracy. The main contributions of this study are (1) a joint spatial-temporal multiresolution optical flow computation method that encodes more informative motion than recently proposed methods, (2) an enhanced DTW method that improves the temporal consistency of motion in action recognition, and (3) a coarse-to-fine DTW constraint on motion feature pyramids that speeds up recognition. Using this method, high recognition accuracy is achieved on action databases such as the Weizmann and KTH databases.
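DTW alignment with a banded constraint can be sketched as follows. The Sakoe-Chiba band used here is a simple stand-in for the paper's coarse-to-fine constraint: it restricts the warping path to a diagonal band, trading a little flexibility for a large reduction in computed cells.

```python
import numpy as np

def dtw_distance(a, b, band=None):
    """DTW between two 1-D sequences; an optional Sakoe-Chiba band
    limits |i - j| and cuts the number of cells computed."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        lo = 1 if band is None else max(1, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

walk = [0, 1, 2, 3, 2, 1, 0]
slow_walk = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]    # same action, half speed
d_same = dtw_distance(walk, slow_walk, band=8)             # warps away the tempo change
d_diff = dtw_distance(walk, [3, 3, 3, 3, 3, 3, 3])         # genuinely different sequence
```

The unconstrained table costs O(nm); the band reduces this to O(n·band), which is the same kind of saving a coarse-to-fine constraint buys once the coarse level has localized the path.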


2013 ◽  
Vol 99 ◽  
pp. 144-153 ◽  
Author(s):  
Xiaoyu Deng ◽  
Xiao Liu ◽  
Mingli Song ◽  
Jun Cheng ◽  
Jiajun Bu ◽  
...  

2013 ◽  
Vol 373-375 ◽  
pp. 1188-1191
Author(s):  
Ju Zhong ◽  
Hua Wen Liu ◽  
Chun Li Lin

Extraction methods for both a shape feature based on Fourier descriptors and a motion feature in the time domain were introduced. These features were fused into a hybrid feature with higher discriminative ability, and the combined representation was used for human action recognition. Experimental results show that the proposed hybrid feature achieves efficient recognition performance on the Weizmann action database.
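A minimal sketch of the shape half of this hybrid feature: Fourier descriptors computed from a closed contour, with the DC term dropped for translation invariance and magnitudes normalized by the first harmonic for scale invariance, then concatenated with a placeholder motion feature. The contour sampling and the two-element motion vector are illustrative assumptions.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=8):
    """Fourier descriptors of a closed contour given as (x, y) samples:
    drop the DC term (translation invariance) and normalize magnitudes
    by the first harmonic (scale invariance)."""
    z = contour[:, 0] + 1j * contour[:, 1]                 # complex boundary signal
    mags = np.abs(np.fft.fft(z))[1:n_coeffs + 1]
    return mags / (mags[0] + 1e-12)

# toy shapes: a circle, and the same circle translated and scaled
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
moved = 3.0 * circle + np.array([5.0, -2.0])
fd1, fd2 = fourier_descriptors(circle), fourier_descriptors(moved)

# fuse the shape descriptor with a (placeholder) temporal motion feature
motion = np.array([0.2, 0.8])
hybrid = np.concatenate([fd1, motion])
```

The two toy contours yield (numerically) identical descriptors, which is the invariance property that makes Fourier descriptors attractive for silhouette-based action features.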


2013 ◽  
Vol E96.D (12) ◽  
pp. 2896-2899
Author(s):  
Wen ZHOU ◽  
Chunheng WANG ◽  
Baihua XIAO ◽  
Zhong ZHANG ◽  
Yunxue SHAO
