Improving Action Recognition via Temporal and Complementary Learning

Nour Eldin Elmadany; Yifeng He; Ling Guan

doi:10.1145/3447686

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3447686 ◽

2021 ◽

Vol 12 (3) ◽

pp. 1-24

Author(s):

Nour Eldin Elmadany ◽

Yifeng He ◽

Ling Guan

Keyword(s):

Action Recognition ◽

Network Performance ◽

Recognition Performance ◽

Information Modeling ◽

Local Network ◽

Temporal Representation ◽

Learning Techniques ◽

Temporal Learning ◽

Relational Network

In this article, we study the problem of video-based action recognition. We improve the action recognition performance by finding an effective temporal and appearance representation. For capturing the temporal representation, we introduce two temporal learning techniques for improving long-term temporal information modeling, specifically Temporal Relational Network and Temporal Second-Order Pooling-based Network. Moreover, we harness the representation using complementary learning techniques, specifically Global-Local Network and Fuse-Inception Network. Performance evaluation on three datasets (UCF101, HMDB-51, and Mini-Kinetics-200) demonstrated the superiority of the proposed framework compared to the 2D Deep ConvNets-based state-of-the-art techniques.
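
The abstract does not say how the Temporal Second-Order Pooling-based Network is built. As a rough illustration of second-order pooling in general, the sketch below averages the outer products of per-frame feature vectors over time. The function name `second_order_pool` and the toy features are hypothetical and not taken from the paper.

```python
from typing import List


def second_order_pool(frames: List[List[float]]) -> List[List[float]]:
    """Average the outer products f_t f_t^T over all T frame feature vectors.

    Generic second-order pooling, assumed here for illustration only;
    it is not the authors' network.
    """
    if not frames:
        raise ValueError("need at least one frame feature vector")
    dim = len(frames[0])
    pooled = [[0.0] * dim for _ in range(dim)]
    for feat in frames:
        if len(feat) != dim:
            raise ValueError("all frame features must share one dimension")
        # Accumulate the outer product of this frame's features with itself.
        for i in range(dim):
            for j in range(dim):
                pooled[i][j] += feat[i] * feat[j]
    count = len(frames)
    return [[value / count for value in row] for row in pooled]


# Hypothetical clip: three frames, each with a 2-dimensional feature vector.
clip = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = second_order_pool(clip)
```

The result is a symmetric D x D matrix that records how feature channels co-occur across the clip, which first-order (mean) pooling would discard.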

Bidirectional LSTM with saliency-aware 3D-CNN features for human action recognition

Journal of Engineering Research ◽

10.36909/jer.v9i3a.8383 ◽

2021 ◽

Vol 9 (3A) ◽

Author(s):

Sheeraz Arif ◽

Jing Wang ◽

Adnan Ahmed Siddiqui ◽

Rashid Hussain ◽

...

Keyword(s):

Neural Network ◽

Action Recognition ◽

Temporal Dynamics ◽

Recognition Performance ◽

Research Work ◽

Video Stream ◽

Video Frame ◽

Convolutional Network ◽

Bidirectional LSTM

Deep convolutional neural networks (DCNNs) and recurrent neural networks (RNNs) have become an important research area in multimedia understanding and have achieved remarkable action recognition performance. However, videos contain rich motion information with varying dimensions. Existing recurrent pipelines fail to capture long-term motion dynamics in videos with varied motion scales and complex actions performed by multiple actors. Considering contextual and salient features is more important than mapping a video frame into a static video representation. This work presents a novel pipeline that analyzes and processes video information using a 3D convolutional (C3D) network and a newly introduced deep bidirectional LSTM. Like the popular two-stream ConvNet, we introduce a two-stream framework with one modification: we replace the optical flow stream with a saliency-aware stream to avoid its computational complexity. First, we generate a saliency-aware video stream by applying the saliency-aware method. Second, a two-stream 3D convolutional network (C3D) takes two different types of streams, i.e., an RGB stream and a saliency-aware video stream, to collect both spatial and semantic temporal features. Next, a deep bidirectional LSTM network learns sequential deep temporal dynamics. Finally, a time-series pooling layer and a softmax layer classify human activity and behavior. The introduced system can learn long-term temporal dependencies and predict complex human actions. Experimental results demonstrate a significant improvement in action recognition accuracy on different benchmark datasets.
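
The final stage of the pipeline described above (pooling over time, then softmax) can be sketched in plain Python. The abstract does not define its time-series pooling layer, so mean pooling is an assumption here, and the per-time-step class scores are hypothetical stand-ins for the bidirectional LSTM outputs.

```python
import math
from typing import List


def mean_pool_over_time(step_scores: List[List[float]]) -> List[float]:
    """Average per-class scores across time steps (assumed pooling choice)."""
    if not step_scores:
        raise ValueError("need at least one time step")
    steps = len(step_scores)
    return [sum(column) / steps for column in zip(*step_scores)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for 3 action classes at 2 time steps.
step_scores = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
pooled = mean_pool_over_time(step_scores)
probs = softmax(pooled)
predicted_class = probs.index(max(probs))
```

Pooling before the softmax gives one prediction per clip instead of one per time step; other pooling choices (max, last step) would fit the same interface.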

Action Recognition Using Dynamic Mode Decomposition for Temporal Representation

2020 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata50022.2020.9377733 ◽

2020 ◽

Author(s):

Kyle Pawlowski ◽

Sumit Chakravarty ◽

Ying Xie ◽

Arjun Kumar Joginipelly

Keyword(s):

Action Recognition ◽

Dynamic Mode Decomposition ◽

Dynamic Mode ◽

Temporal Representation ◽

Mode Decomposition

Spectrum Based Power Management for Congested IoT Networks

Sensors ◽

10.3390/s21082681 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2681

Author(s):

Kedir Mamo Besher ◽

Juan Ivan Nieto-Hipolito ◽

Raymundo Buenrostro-Mariscal ◽

Mohammed Zamshed Ali

Keyword(s):

Power Management ◽

Network Performance ◽

Channel Allocation ◽

Battery Life ◽

Packet Routing ◽

Data Packets ◽

Increasing Demand ◽

IoT Devices ◽

Set Up

With constantly increasing demand in a connected society, Internet of Things (IoT) networks are frequently becoming congested. IoT sensor devices lose more power while transmitting data through congested IoT networks. Currently, in most scenarios, the distributed IoT devices in use have no effective spectrum-based power management and no guarantee of long-term battery life while transmitting data through congested IoT networks. This puts user information at risk and could lead to the loss of important information in communication. In this paper, we studied the extra power consumed due to retransmission of IoT data packets and poor communication channel management in a congested IoT network. We propose a spectrum-based power management solution that scans channel conditions when needed and uses the least congested channel for IoT packet routing. It also effectively measured the power consumed in the idle, connected, paging, and synchronization states of a standard IoT device in a congested IoT network. In our proposed solution, a Freescale Freedom Development Board (FREDEVPLA) is used for managing channel-related parameters. While supervising the congestion level and coordinating channel allocation at the FREDEVPLA level, our system configures the MAC and physical layers of IoT devices so that they achieve outstanding power utilization, based on the operating network in connected mode, compared to the basic IoT standard. A model has been set up and tested using Freescale launchpads. Test data show that the battery life of IoT devices using the proposed spectrum-based power management increases by at least 30% compared with non-spectrum-based power management methods embedded within the IoT devices themselves. Finally, we compared our results with the basic IoT standard, IEEE 802.15.4. Furthermore, the proposed system saves a lot of memory for IoT devices, improves overall IoT network performance, and, above all, decreases the risk of losing data packets in communication. The detailed analysis in this paper also opens up multiple avenues for further research on the future use of channel scanning by the FREDEVPLA board.
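
The channel-selection step described above (scan channel conditions, then route over the least congested channel) can be sketched as follows. The abstract does not specify the congestion metric or the scan interface, so the `scan` dictionary, its values (assumed to be a busy-time fraction per channel), and the function name are hypothetical.

```python
from typing import Dict


def pick_least_congested(scan: Dict[int, float]) -> int:
    """Return the channel with the lowest congestion value.

    `scan` maps a channel number to a congestion estimate (assumed here to be
    the fraction of time the channel was busy). Ties go to the lower channel
    number so the choice is deterministic.
    """
    if not scan:
        raise ValueError("scan results are empty")
    return min(scan, key=lambda channel: (scan[channel], channel))


# Hypothetical scan over a few 2.4 GHz IEEE 802.15.4 channel numbers.
scan_results = {11: 0.70, 15: 0.20, 20: 0.20, 26: 0.90}
best_channel = pick_least_congested(scan_results)
```

Scanning only "when needed", as the abstract puts it, would sit outside this function, for example by triggering a scan when retransmissions on the current channel exceed a threshold.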

Modeling Long-Term Interactions to Enhance Action Recognition

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9412148 ◽

2021 ◽

Author(s):

Alejandro Cartas ◽

Petia Radeva ◽

Mariella Dimiccoli

Keyword(s):

Action Recognition

Segment spatial-temporal representation and cooperative learning of Convolution Neural Networks for multimodal-based action recognition

Neurocomputing ◽

10.1016/j.neucom.2020.12.020 ◽

2020 ◽

Author(s):

Ziliang Ren ◽

Qieshi Zhang ◽

Jun Cheng ◽

Fusheng Hao ◽

Xiangyang Gao

Keyword(s):

Neural Networks ◽

Cooperative Learning ◽

Action Recognition ◽

Convolution Neural Networks ◽

Temporal Representation

A Novel LSTM Model with Interaction Dual Attention for Radar Echo Extrapolation

Remote Sensing ◽

10.3390/rs13020164 ◽

2021 ◽

Vol 13 (2) ◽

pp. 164

Author(s):

Chuyao Luo ◽

Xutao Li ◽

Yongliang Wen ◽

Yunming Ye ◽

Xiaofeng Zhang

Keyword(s):

Short Term Memory ◽

Weather Forecast ◽

Vital Role ◽

Data Sets ◽

Short Term ◽

Learning Techniques ◽

Radar Echo ◽

Hidden States ◽

Better Than

The task of precipitation nowcasting is important in operational weather forecasting, and radar echo map extrapolation plays a vital role in it. Recently, deep learning techniques such as Convolutional Recurrent Neural Network (ConvRNN) models have been designed to solve the task. These models, although they perform much better than conventional optical-flow-based approaches, share a common problem: they underestimate the high echo value parts. This drawback is fatal to precipitation nowcasting, as these parts often correspond to heavy rains that may cause natural disasters. In this paper, we propose a novel interaction dual attention long short-term memory (IDA-LSTM) model to address the drawback. In the method, an interaction framework is developed for the ConvRNN unit to fully exploit the short-term context information by constructing a series of coupled convolutions on the input and hidden states. Moreover, a dual attention mechanism on channels and positions is developed to recall information forgotten over the long term. Comprehensive experiments have been conducted on the CIKM AnalytiCup 2017 data sets, and the results show the effectiveness of IDA-LSTM in addressing the underestimation drawback. The extrapolation performance of IDA-LSTM is superior to that of the state-of-the-art methods.

Learning Long-Term Dependencies for Action Recognition with a Biologically-Inspired Deep Network

2017 IEEE International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2017.84 ◽

2017 ◽

Cited By ~ 11

Author(s):

Yemin Shi ◽

Yonghong Tian ◽

Yaowei Wang ◽

Wei Zeng ◽

Tiejun Huang

Keyword(s):

Action Recognition ◽

Biologically Inspired ◽

Deep Network

Projecting conflict risk following the Shared Socioeconomic pathways: what role for water stress and climate?

10.5194/egusphere-egu21-101 ◽

2021 ◽

Author(s):

Sophie de Bruin ◽

Jannis Hoch ◽

Nina von Uexkull ◽

Halvard Buhaug ◽

Nico Wanders

Keyword(s):

Climate Change ◽

At Risk ◽

Machine Learning Techniques ◽

Mitigation Measures ◽

Related Factors ◽

Socioeconomic Impacts ◽

Learning Techniques ◽

The Future ◽

Conflict Risk

The socioeconomic impacts of changes in climate-related and hydrology-related factors are increasingly acknowledged to affect the onset of violent conflict. Consensus on the general mechanisms linking these factors with conflict is, however, still limited. The absence of a full understanding of the non-linearities between all components and the lack of sufficient data therefore make it hard to address violent conflict risk in the long term. Although it is neither desirable nor feasible to make exact predictions, projections are a viable means to provide insights into potential future conflict risks and their uncertainties. Hence, making different projections is a legitimate way to deal with and understand these uncertainties, since the construction of diverse scenarios delivers insights into possible realizations of the future. Through machine learning techniques, we (re)assess the major drivers of conflict for the current situation in Africa, which are then applied to project the regions-at-risk following different scenarios. The model accurately reproduces observed historic patterns, leading to a high ROC score of 0.91. We show that socio-economic factors are most dominant when projecting conflicts over the African continent. The projections show an overall reduction in conflict risk as a result of increased economic welfare that offsets the adverse impacts of climate change and hydrologic variables. It must be noted, however, that these projections are based on current relations. If the relations between drivers and conflict change in the future, the resulting regions-at-risk may change too. By identifying the most prominent drivers, conflict risk mitigation measures can be tuned more accurately to reduce the direct and indirect consequences of climate change on the population in Africa. As new and improved data become available, the model can be updated for more robust projections of conflict risk in Africa under climate change.
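
The abstract reports a "ROC score" without defining it. Assuming it means the area under the ROC curve (AUC), the sketch below computes that quantity as the fraction of positive/negative pairs the model ranks correctly. The labels and scores are hypothetical, not data from the study.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive example (label 1) scores higher than a
    negative example (label 0); ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = conflict observed, 0 = no conflict observed.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
```

Under this reading, a score of 0.91 would mean that a randomly chosen conflict case receives a higher predicted risk than a randomly chosen non-conflict case about 91% of the time.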

Long-term Cholesterol Risk Prediction using Machine Learning Techniques in ELSA Database

10.5220/0010727200003063 ◽

2021 ◽

Author(s):

Nikos Fazakis ◽

Elias Dritsas ◽

Otilia Kocsis ◽

Nikos Fakotakis ◽

Konstantinos Moustakas

Keyword(s):

Machine Learning ◽

Risk Prediction ◽

Machine Learning Techniques ◽

Learning Techniques

Organization of transport production of permanent way work with the consideration for information modeling

Transport of the Urals ◽

10.20291/1815-9400-2021-3-65-67 ◽

2021 ◽

pp. 65-67

Author(s):

Tatyana Nikolaevna Asalkhanova ◽

Andrey Alexandrovich Oskolkov

Keyword(s):

Traffic Safety ◽

Production Efficiency ◽

Digital Technologies ◽

Digital Transformation ◽

Information Modeling ◽

Railway Transport ◽

Automated Control ◽

Automated Control System ◽

Advanced Development

The JSC «RZD» long-term program up to 2025 envisages a digital transformation of railway transport. It devotes special attention to information and digital technologies for modeling the advanced development of infrastructure in order to handle growing traffic, increase production efficiency, and achieve the expected results in traffic safety and the economics of the industry as a whole. The paper presents the design of an information model for infrastructure control within the framework of introducing the BIM RZD Automated Control System, using the organization of transport production of permanent way work as an example.
