Deep Reinforcement Learning for Query-Conditioned Video Summarization

Yujia Zhang; Michael Kampffmeyer; Xiaoguang Zhao; Min Tan

doi:10.3390/app9040750

Deep Reinforcement Learning for Query-Conditioned Video Summarization

Applied Sciences ◽

10.3390/app9040750 ◽

2019 ◽

Vol 9 (4) ◽

pp. 750 ◽

Cited By ~ 2

Author(s):

Yujia Zhang ◽

Michael Kampffmeyer ◽

Xiaoguang Zhao ◽

Min Tan

Keyword(s):

Reinforcement Learning ◽

Visual Information ◽

Video Summarization ◽

Experimental Results ◽

Learning Approach ◽

Video Content ◽

User Interests ◽

User Query ◽

Mapping Mechanism

Query-conditioned video summarization requires (1) finding a diverse set of video shots/frames that are representative of the whole video, and (2) ensuring that the selected shots/frames are related to a given query. It can thus be tailored to different user interests, leading to a better personalized summary, and it differs from generic video summarization, which focuses only on video content. Our work targets this query-conditioned video summarization task by first proposing a Mapping Network (MapNet) to express how related a shot is to a given query. MapNet helps establish the relation between the two modalities (video and query), which allows visual information to be mapped to the query space. A deep reinforcement learning-based summarization network (SummNet) is then developed to provide personalized summaries by integrating relatedness, representativeness, and diversity rewards. These rewards jointly guide the agent to select the most representative and diverse video shots that are most related to the user query. Experimental results on a query-conditioned video summarization benchmark demonstrate the effectiveness of our proposed method, indicating the usefulness of both the proposed mapping mechanism and the reinforcement learning approach.
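As a rough illustration of how three such rewards might be combined into one scalar signal, here is a minimal Python sketch. It is not the paper's formulation: the cosine-based relatedness term, the pairwise-dissimilarity diversity term, the exponential representativeness term, the equal default weights, and all function names are assumptions made only for this example. Shot and query vectors are assumed to already live in a shared embedding space.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def relatedness_reward(selected, query):
    """Assumed form: mean cosine similarity between selected shots and the query."""
    return sum(cosine(s, query) for s in selected) / len(selected)


def diversity_reward(selected):
    """Assumed form: mean pairwise dissimilarity (1 - cosine) among selected shots."""
    n = len(selected)
    if n < 2:
        return 0.0
    total = sum(
        1.0 - cosine(selected[i], selected[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


def representativeness_reward(all_shots, selected):
    """Assumed form: exp(-mean distance from each shot to its nearest selected shot)."""
    mean_min = sum(
        min(math.dist(x, s) for s in selected) for x in all_shots
    ) / len(all_shots)
    return math.exp(-mean_min)


def total_reward(all_shots, picks, query, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three rewards for the shots indexed by `picks`."""
    selected = [all_shots[i] for i in picks]
    if not selected:
        return 0.0  # an empty summary earns nothing
    w_rel, w_rep, w_div = weights
    return (
        w_rel * relatedness_reward(selected, query)
        + w_rep * representativeness_reward(all_shots, selected)
        + w_div * diversity_reward(selected)
    )
```

In a policy-gradient setup, a scalar like `total_reward` would typically be computed for each sampled selection and used to weight the log-probability of that selection; how SummNet actually does this is not stated in the abstract.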

A Reinforcement Learning Approach to Lift Generation in Flapping MAVs: Experimental Results

Proceedings 2007 IEEE International Conference on Robotics and Automation ◽

10.1109/robot.2007.363076 ◽

2007 ◽

Cited By ~ 7

Author(s):

Mehran Motamed ◽

Joseph Yan

Keyword(s):

Reinforcement Learning ◽

Experimental Results ◽

Learning Approach

A Deep Reinforcement Learning Approach to The Ancient Indian Game - Chowka Bhara

10.36227/techrxiv.16780414 ◽

2021 ◽

Author(s):

Annapurna P Patil ◽

Sanjay Raghavendra ◽

Shruthi Srinarasi ◽

Reshma Ram

Keyword(s):

Artificial Intelligence ◽

Reinforcement Learning ◽

Experimental Results ◽

Learning Approach ◽

Board Game ◽

Q Learning

Reinforcement Learning (RL) is the study of how Artificial Intelligence (AI) agents learn to make their own decisions in an environment to maximize the cumulative reward received. Although there has been notable progress in the application of RL for games, the category of ancient Indian games has remained almost untouched. Chowka Bhara is one such ancient Indian board game. This work aims at developing a Q-Learning-based RL Chowka Bhara player whose strategies and methodologies are obtained from three Strategic Players viz. Fast Player, Random Player, and Balanced Player. It is observed through the experimental results that the Q-Learning Player outperforms all three Strategic Players.
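For readers unfamiliar with Q-Learning, the standard tabular update rule can be sketched as below. This is a generic illustration and not the authors' implementation: the state and action labels, the learning rate `alpha`, and the discount factor `gamma` are placeholder assumptions, and the paper itself may use a learned function approximator rather than a table.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one Q-Learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    # Value of the best follow-up action; 0.0 when next_state is terminal
    # (i.e. next_actions is empty).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and moves are opaque labels here.
q_table = defaultdict(float)
first = q_update(q_table, "s0", "move_a", 1.0, "s1", ["move_a", "move_b"])
```

Training against fixed-strategy opponents, such as the three Strategic Players named in the abstract, would amount to calling an update like this after every move of each simulated game.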

Video Summarization via Label Distributions Dual-Reward

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/331 ◽

2021 ◽

Author(s):

Yongbiao Gao ◽

Ning Xu ◽

Xin Geng

Keyword(s):

Reinforcement Learning ◽

Video Summarization ◽

Experimental Results ◽

Average Score ◽

State Representation ◽

Benchmark Datasets ◽

Video Summaries ◽

Paper Label ◽

Reward Mechanism ◽

Score Distributions

Reinforcement learning maps perceived state representations to actions and has been adopted to solve the video summarization problem. The reward is crucial when dealing with the video summarization task via reinforcement learning, since the reward signal defines the goal of video summarization. However, existing reward mechanisms in reinforcement learning cannot handle the ambiguity that appears frequently in video summarization, i.e., the differing perceptions of the same video by different people. To solve this problem, in this paper label distributions are mapped from the CNN and LSTM-based state representation to capture the subjectiveness of video summaries. The dual reward is designed by measuring the similarity between user score distributions and the generated label distributions. Not only the average score but also the variance of the subjective opinions is considered in summary generation. Experimental results on several benchmark datasets show that our proposed method outperforms other approaches under various settings.
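To make the point about averages versus distributions concrete, the sketch below shows two sets of annotator scores with the same mean but different variance, and one possible way to score how closely a predicted label distribution matches the users' distribution. The five-level integer scale, the KL-divergence-based similarity, and every function name are assumptions for illustration; the abstract does not specify which similarity measure the dual reward uses.

```python
import math
from statistics import mean, pvariance


def score_histogram(scores, levels=5):
    """Turn integer scores in 1..levels into a normalized distribution."""
    counts = [0] * levels
    for s in scores:
        counts[s - 1] += 1
    return [c / len(scores) for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; q is floored at eps to avoid log(0)."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


def distribution_reward(user_dist, predicted_dist):
    """Assumed similarity: exp(-KL), which is 1.0 for identical distributions
    and approaches 0 as they diverge."""
    return math.exp(-kl_divergence(user_dist, predicted_dist))


# Two frames with the same average score but very different agreement.
split_opinions = [1, 1, 5, 5]   # annotators disagree strongly
consensus = [3, 3, 3, 3]        # annotators agree
same_mean = mean(split_opinions) == mean(consensus)
variances = (pvariance(split_opinions), pvariance(consensus))
```

A reward based only on the mean would treat both frames identically, whereas a distribution-level comparison separates them.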

A Deep Reinforcement Learning Approach to The Ancient Indian Game - Chowka Bhara

10.36227/techrxiv.16780414.v1 ◽

2021 ◽

Author(s):

Annapurna P Patil ◽

Sanjay Raghavendra ◽

Shruthi Srinarasi ◽

Reshma Ram

Keyword(s):

Artificial Intelligence ◽

Reinforcement Learning ◽

Experimental Results ◽

Learning Approach ◽

Board Game ◽

Q Learning

Detecting “DeepFakes” in H.264 Video Data Using Compression Ghost Artifacts

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.4.mwsf-116 ◽

2020 ◽

Vol 2020 (4) ◽

pp. 116-1-116-7

Author(s):

Raphael Antonius Frick ◽

Sascha Zmudzinski ◽

Martin Steinebach

Keyword(s):

Image Forensics ◽

Video Data ◽

Experimental Results ◽

Video Sequences ◽

The Internet ◽

Video Content ◽

High Quality ◽

The Public

In recent years, the number of forged videos circulating on the Internet has increased immensely. Software and services for creating such forgeries have become more and more accessible to the public, and the risk of malicious use of forged videos has risen accordingly. This work proposes an approach, based on the Ghost effect known from image forensics, for detecting video forgeries that replace faces in video sequences or change the facial expression of a face. The experimental results show that the proposed approach is able to identify forgery in high-quality encoded video content.
