An audio-video summarization scheme based on audio and video analysis

Author(s):  
M. Furini ◽  
V. Ghini

2012 ◽  
Vol 263-266 ◽  
pp. 2364-2368
Author(s):  
Dong Lin Ma ◽  
Xi Jun Zhang ◽  
Qian Mi

In this paper, a video summarization representation algorithm in the compressed domain is proposed. In particular, Rough Set (RS) theory is introduced into the video analysis to improve efficiency. Firstly, DCT coefficients and DC coefficients are extracted from the video image sequences, and an Information System is constructed from the DC coefficients. Then the Information System is reduced using the attribute reduction theory of RS, and the representation of each video frame is obtained from the reduced DC coefficients. Finally, we obtain the reduced Information System, i.e. the Core of the Information System. Since the Core contains all the information in the video sequence while discarding redundant video frames, it can be viewed as an effective summarization representation. Experimental results indicate that the algorithm can efficiently generate a set of frames representative of the video sequence and enjoys the following advantages: only a subset of video frames is considered during video analysis, which reduces the computational complexity, and the resulting video summarization representation is more principled than previous methods.
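The abstract's pipeline (extract per-frame DC coefficients, build an information system, reduce it to a core of non-redundant frames) can be sketched roughly as follows. This is an illustrative stand-in, not the authors' implementation: the greedy mean-absolute-difference test merely approximates rough-set reduction, and the `threshold` parameter is an assumption.

```python
import numpy as np

def summarize_frames(dc_coeffs, threshold=10.0):
    """Select a reduced 'core' of frames from per-frame DC coefficient vectors.

    dc_coeffs: array of shape (n_frames, n_blocks) holding the DC term of each
    DCT block, as extracted in the compressed domain. A frame is kept only if
    its DC vector differs from every frame already kept by more than
    `threshold` (mean absolute difference), which mimics discarding redundant
    rows from the information system.
    """
    core = []
    for i, frame in enumerate(dc_coeffs):
        if all(np.abs(frame - dc_coeffs[j]).mean() > threshold for j in core):
            core.append(i)
    return core

# Toy example: three near-identical frames followed by a scene change.
frames = np.array([[100.0, 100.0],
                   [101.0,  99.0],
                   [100.0, 101.0],
                   [ 10.0, 200.0]])
print(summarize_frames(frames, threshold=10.0))  # → [0, 3]
```

On this toy input, the two near-duplicates of frame 0 are banished as redundant, and only the first frame of each visually distinct segment survives into the core.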


2008 ◽  
Vol 179 (4S) ◽  
pp. 658-659
Author(s):  
Edan Y Shapiro ◽  
Sero Andonian ◽  
Casey A Seideman ◽  
Marcelo J Sette ◽  
Benjamin R Lee ◽  
...  

Author(s):  
Hrishikesh Bhaumik ◽  
Siddhartha Bhattacharyya ◽  
Susanta Chakraborty

Over the past decade, research in the field of Content-Based Video Retrieval Systems (CBVRS) has attracted much attention, as it encompasses the processing of all the other media types, i.e. text, image, and audio. Video summarization is one of its most important applications, as it potentially enables efficient and faster browsing of large video collections. A concise version of a video is often required due to constraints on viewing time, storage, communication bandwidth, and power. Thus, the task of video summarization is to effectively extract the most important portions of the video without sacrificing its semantic information. The results of video summarization can be used in many CBVRS applications such as semantic indexing, video surveillance, copied-video detection, etc. However, the quality of the summarization task depends on two basic aspects: content coverage and redundancy removal. These two aspects are both important and in tension with each other. This chapter aims to provide an insight into the state-of-the-art approaches used in this booming field of research.


2012 ◽  
Vol 490-495 ◽  
pp. 465-469
Author(s):  
Xiang Wei Li ◽  
Yu Xiu Kang ◽  
Gang Zheng

Based on Rough Set (RS) theory, a novel and effective video summarization representation is proposed for video analysis in the compressed domain. Firstly, DCT coefficients and DC coefficients are extracted from the original video image sequences, and an Information System is constructed from the DC coefficients. Then the Information System is reduced using the attribute reduction theory of RS, and the representation of each video frame is obtained from the reduced DC coefficients. Finally, the reduced Information System is obtained. Since the Core of the Information System contains all the major information in the video sequence while discarding redundant video frames, it can be considered an efficient summarization representation. Compared to conventional algorithms, this algorithm enjoys the following advantages: (1) only a subset of video frames is considered during video analysis, which reduces the computational complexity; (2) the resulting video summarization representation is more principled and efficient than previous methods; (3) by varying the number of reduced frames, the algorithm can extract a hierarchical, dynamic video summarization representation.
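Advantage (3), the hierarchical summarization, can be illustrated by re-running the reduction at several redundancy thresholds. This is again a hedged sketch under stated assumptions: the `core_frames` helper is a greedy stand-in for RS attribute reduction, and the threshold values are hypothetical, not taken from the paper.

```python
import numpy as np

def core_frames(dc_coeffs, threshold):
    """Greedy redundancy removal over per-frame DC vectors (RS-reduction stand-in)."""
    kept = []
    for i, frame in enumerate(dc_coeffs):
        if all(np.abs(frame - dc_coeffs[j]).mean() > threshold for j in kept):
            kept.append(i)
    return kept

def hierarchical_summary(dc_coeffs, thresholds=(5.0, 50.0)):
    """One summary per threshold: larger thresholds discard more frames,
    yielding progressively coarser levels of the hierarchy."""
    return {t: core_frames(dc_coeffs, t) for t in thresholds}

# Toy sequence: a slow drift (frames 0-1), then two scene changes.
frames = np.array([[0.0, 0.0], [3.0, 3.0], [30.0, 30.0], [100.0, 100.0]])
levels = hierarchical_summary(frames)
print(levels)  # fine level keeps [0, 2, 3]; coarse level keeps only [0, 3]
```

Sweeping the threshold thus maps directly onto the paper's idea of controlling the reduced frame number to obtain a multi-level dynamic summary.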


2021 ◽  
Vol 11 (11) ◽  
pp. 5260
Author(s):  
Theodoros Psallidas ◽  
Panagiotis Koromilas ◽  
Theodoros Giannakopoulos ◽  
Evaggelos Spyrou

The exponential growth of user-generated content has increased the need for efficient video summarization schemes. However, most approaches underestimate the power of aural features, and they are designed to work mainly on commercial/professional videos. In this work, we present an approach that uses both aural and visual features in order to create video summaries from user-generated videos. Our approach produces dynamic video summaries, that is, summaries comprising the most “important” parts of the original video, arranged so as to preserve their temporal order. We use supervised knowledge from both of the aforementioned modalities and train a binary classifier that learns to recognize the important parts of videos. Moreover, we present a novel user-generated dataset which contains videos from several categories. Every 1 s segment of each video in our dataset has been annotated by more than three annotators as being important or not. We evaluate our approach using several classification strategies based on audio, video, and fused features. Our experimental results illustrate the potential of our approach.
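A minimal sketch of the fused-feature pipeline described above: per-segment audio and visual features are concatenated (early fusion) and fed to a binary classifier. The feature shapes are assumed, and the nearest-centroid model is a hypothetical stand-in for whatever classifier the authors trained, chosen only to keep the example self-contained.

```python
import numpy as np

def fuse(audio_feats, visual_feats):
    """Early fusion: concatenate per-segment audio and visual feature vectors."""
    return np.concatenate([audio_feats, visual_feats], axis=1)

class CentroidClassifier:
    """Label each 1 s segment 'important' (1) or not (0) by the nearer class centroid."""
    def fit(self, X, y):
        self.c0 = X[y == 0].mean(axis=0)
        self.c1 = X[y == 1].mean(axis=0)
        return self

    def predict(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return (d1 < d0).astype(int)

# Toy data: four 1 s segments, each with 2 audio + 2 visual features.
audio  = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]])
visual = np.array([[0.0, 0.1], [1.0, 0.9], [0.1, 0.0], [0.9, 1.0]])
labels = np.array([0, 1, 0, 1])  # e.g. majority vote of the annotators

X = fuse(audio, visual)
clf = CentroidClassifier().fit(X, labels)
print(clf.predict(X))  # → [0 1 0 1]
```

The dynamic summary would then be assembled by keeping the segments predicted as 1, in their original temporal order.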

