Video Retrieval Berdasarkan Teks dan Gambar (Video Retrieval Based on Text and Image)

Author(s):  
Rahmi Hidayati ◽  
Agus Harjoko

Abstract
Video retrieval is used to search for videos based on a query entered by the user, either text or an image. The system is intended to improve search capability over video collections and to reduce video retrieval time. The aim of this research is to design and build a video retrieval application based on the text and images contained in a video. Text indexing consists of tokenizing, filtering (stopword removal), and stemming; the stemming results are stored in a text index table. Image indexing builds a color histogram and computes the mean and standard deviation of each primary color channel, red, green, and blue (RGB), for every image; the extracted features are stored in an image table. Retrieval accepts a text query, an image query, or both. For a text query, the system looks the query up in the text index table; if it is present, the matching video information is displayed. For an image query, the system computes the six extracted features (mean red, mean green, mean blue, and the standard deviations of red, green, and blue); if these values are found in the image index table, the matching video information is displayed. When both a text query and an image query are given, the system displays video information only if the two queries are related, i.e. they refer to the same film title.

Keywords — video, index, retrieval, text, image
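The six-dimensional image index described above (per-channel mean and standard deviation over RGB) can be sketched in a few lines. This is a minimal illustration, not the authors' code; the input format, a flat list of `(r, g, b)` tuples, is an assumption.

```python
import math

def rgb_features(pixels):
    """Six-feature index vector from the abstract: mean and standard
    deviation of each primary color channel. `pixels` is assumed to be
    a flat list of (r, g, b) tuples in the range 0..255."""
    n = len(pixels)
    means, stds = [], []
    for ch in range(3):                      # 0 = red, 1 = green, 2 = blue
        values = [p[ch] for p in pixels]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        means.append(mean)
        stds.append(math.sqrt(var))
    return means + stds                      # [mR, mG, mB, sR, sG, sB]
```

An image query would be answered by computing this vector for the query image and matching it against the stored rows of the image index table.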

2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Guanhua Wang ◽  
Hua Ji ◽  
Dexin Kong ◽  
Na Zhang

Nowadays, the heterogeneity gap between modalities is the key problem in cross-modal retrieval. To overcome it, the latent correlations between modalities must be mined. At the same time, the semantic information carried by class labels is used to reduce the semantic gap between data of different modalities and to make heterogeneous data interdependent and interoperable. To fully exploit the latent correlations between modalities, we propose a cross-modal retrieval framework based on graph regularization and modality dependence (GRMD). First, considering both latent feature correlation and semantic correlation, different projection matrices are learned for different retrieval tasks, such as image-query-text (I2T) and text-query-image (T2I). Second, the internal structure of the original feature space is used to construct an adjacency graph with semantic-information constraints, which pulls heterogeneous data with matching labels closer to the corresponding semantic information. Experimental results on three widely used datasets demonstrate the effectiveness of our method.
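The graph-regularization idea above can be illustrated in its simplest form: connect samples that share a class label, then penalize the distance between the projected representations of connected samples. This is only a generic sketch of label-based graph regularization, not the GRMD objective itself, whose projection matrices and constraint terms are defined in the paper.

```python
def label_adjacency(labels):
    """Binary adjacency matrix: W[i][j] = 1 when distinct samples i and j
    share a class label (a simple semantic-constraint graph)."""
    n = len(labels)
    return [[1 if i != j and labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]

def graph_regularizer(Z, W):
    """Smoothness term sum_{i,j} W_ij * ||z_i - z_j||^2 that a
    graph-regularized objective would minimize; Z holds the projected
    feature vectors."""
    total = 0.0
    for i in range(len(Z)):
        for j in range(len(Z)):
            if W[i][j]:
                total += sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
    return total
```

Minimizing such a term encourages projections in which same-label samples from different modalities end up close together.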


Author(s):  
Desi Amirullah ◽  
Ali Ridho Barakbah ◽  
Achmad Basuki

The term "songket" comes from the Malay word "sungkit", which means "to hook" or "to gouge". The names and variations of the motifs are derived from plants and animals, which served as sources of inspiration for the many songket patterns. Each pattern carries a philosophy, in the form of a rhyme that refers to the natural source of the pattern, and that philosophy reflects the beliefs and values of Malay culture. In this research, we propose a system that facilitates the understanding of songket and its philosophy as a way to preserve songket culture. The system collects information on songket motif variations in images using feature-extraction methods. For each motif image, we extract the philosophy of the rhyme into impressions, extract color features with a 3D Color Vector Quantization (3D-CVQ) histogram, and extract shape features with Hu moment invariants. We then build an image search based on impressions and an impression search based on images, with search techniques based on color, shape, and aggregation (a combination of color and shape). In the experiments using an impression as the query: (1) by color, the average number of correct results was 7.3 with a total score of 41.9; (2) by shape, the average was 3 with a total score of 16.4; (3) by aggregation, the average was 3 with a total score of 17.4. Using an image as the query: (1) by color, the average precision was 95%; (2) by shape, 43.3%; (3) by aggregation, 73.3%. From these experiments we conclude that, for both impression and image queries, the best search is the color-based one.
Keywords: image search, philosophy, impression, songket, cultural computing, feature extraction, analytical aggregation
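The color-feature step can be illustrated with a basic quantized 3D color histogram: split each RGB channel into a few bins and count pixels per 3D cell. This is a simplified stand-in for the 3D-CVQ method the paper uses (which learns its codebook by vector quantization rather than fixed uniform bins); the bin count and input format are assumptions.

```python
def cvq_histogram(pixels, bins_per_channel=4):
    """Quantize RGB space into bins_per_channel**3 cells and count the
    pixels falling in each cell, normalized to sum to 1.
    `pixels` is assumed to be a list of (r, g, b) tuples in 0..255."""
    step = 256 / bins_per_channel
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = (int(r // step) * bins_per_channel + int(g // step)) \
              * bins_per_channel + int(b // step)
        hist[idx] += 1
    total = len(pixels)
    return [h / total for h in hist]
```

Two motif images can then be compared by any histogram distance over these normalized vectors, which is what makes the color-based search in the experiments possible.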


Technological advancement has brought a revolution in multimedia, and video recording has grown by leaps and bounds. Retrieving a video from a huge database with the existing text-based search is cumbersome: it involves a lot of human effort, and the retrieval efficiency is meager as well. Given these challenges, video retrieval based on video content prevails over the existing conventional methods. Content here means the actual video information, i.e. video features. The performance of Content-Based Video Retrieval (CBVR) depends on feature extraction and the matching of similar features. Because the selection of features in existing algorithms is not effective, their retrieval processing time is high and their efficiency low. We propose combined color and motion features for feature extraction, with the Spatio-Temporal Scale-Invariant Feature Transform used for shot boundary detection. Since the color feature characterizes visual video content and the motion feature characterizes temporal content, the two together are significant for effective video retrieval. The performance of the CBVR system has been evaluated on the TRECVID dataset, and the retrieved videos show the effectiveness of the proposed algorithm.
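Shot boundary detection, the segmentation step mentioned above, is commonly illustrated with a histogram-difference detector: flag a cut wherever consecutive frames' color histograms differ by more than a threshold. This is a deliberately simplified stand-in for the paper's Spatio-Temporal SIFT approach; the threshold value and the L1-normalized histogram input are assumptions.

```python
def detect_cuts(frame_hists, threshold=0.5):
    """Flag a shot boundary at frame i when the (normalized) histogram
    difference between frames i-1 and i exceeds `threshold`.
    Each entry of frame_hists is an L1-normalized color histogram, so
    the per-pair difference lies in [0, 1]."""
    cuts = []
    for i in range(1, len(frame_hists)):
        prev, cur = frame_hists[i - 1], frame_hists[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / 2
        if diff > threshold:
            cuts.append(i)
    return cuts
```

Real detectors add temporal smoothing and adaptive thresholds to distinguish hard cuts from gradual transitions, but the per-pair comparison above is the core of the idea.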


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chen Zhang ◽  
Bin Hu ◽  
Yucong Suo ◽  
Zhiqiang Zou ◽  
Yimu Ji

In this paper, we study the challenge of image-to-video retrieval, which uses the query image to search relevant frames from a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of the key-frame extraction algorithm and the feature aggregation strategy. Specifically, the key-frame extraction algorithm takes advantage of the clustering idea so that redundant information is removed in video data and storage cost is greatly reduced. The feature aggregation strategy adopts average pooling to encode deep local convolutional features followed by coarse-to-fine retrieval, which allows rapid retrieval in the large-scale video database. The results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
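The two components of the framework above, redundancy-reducing key-frame extraction and average-pooling feature aggregation, can be sketched as follows. The greedy distance-based key-frame filter below is a simplification of the paper's clustering-based extractor, and the threshold and feature format are assumptions.

```python
def key_frames(frames, threshold=0.3):
    """Keep a frame only when it differs enough from the last kept key
    frame (a greedy redundancy filter standing in for the paper's
    clustering-based key-frame extraction). Returns kept frame indices."""
    kept = [0]
    for i in range(1, len(frames)):
        last = frames[kept[-1]]
        dist = sum((a - b) ** 2 for a, b in zip(frames[i], last)) ** 0.5
        if dist > threshold:
            kept.append(i)
    return kept

def aggregate_frame(local_features):
    """Average-pool a frame's deep local feature vectors into a single
    global descriptor, as in the feature aggregation strategy."""
    dim = len(local_features[0])
    return [sum(f[d] for f in local_features) / len(local_features)
            for d in range(dim)]
```

Querying then reduces to comparing the query image's descriptor against the pooled descriptors of the kept key frames, coarsely first and finely on the shortlist.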


Content-Based Image Retrieval (CBIR) is an extensively used technique for retrieving images from large image databases. However, users are not satisfied with conventional image retrieval techniques, and with the growth of the web and of transmission networks the number of images available to users keeps increasing, so considerable volumes of digital images are continually produced in many areas. Quick access to images similar to a given query image within such an extensive collection poses great challenges and requires proficient techniques. From the query image to the retrieval of relevant images, CBIR has key phases such as feature extraction, similarity measurement, and retrieval of relevant images; extracting image features is one of the most important steps. Recently, Convolutional Neural Networks (CNNs) have shown good results in computer vision thanks to their ability to extract features from images. AlexNet is a classical deep CNN for image feature extraction. We modify the AlexNet architecture and propose a novel framework that improves its feature extraction and similarity measurement. The proposed approach optimizes AlexNet at the pooling layer: average pooling is replaced by max-avg pooling, and the non-linear Maxout activation function is used after every convolution layer for better feature extraction. This paper introduces a CNN for feature extraction in a CBIR system and uses Euclidean distance together with comprehensive values for better results. The proposed framework scales to large databases. Performance is evaluated using precision, and the proposed work shows better results than existing works.
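The max-avg pooling idea above can be sketched per pooling window. The exact combination rule used in the paper is not given here, so the equal-weight blend below (`alpha = 0.5`) is an assumption for illustration only.

```python
def max_avg_pool(window, alpha=0.5):
    """Blend of max pooling and average pooling over one window of
    activations: alpha * max + (1 - alpha) * mean. With alpha=1 this
    reduces to max pooling, with alpha=0 to average pooling. The blend
    weight is a hypothetical choice, not taken from the paper."""
    return alpha * max(window) + (1 - alpha) * sum(window) / len(window)
```

The appeal of such a blend is that the max term keeps the strongest activation (sharp, discriminative details) while the average term retains the overall response level of the window.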


Author(s):  
Bogdan Ionescu ◽  
Alexandru Marin ◽  
Patrick Lambert ◽  
Didier Coquin ◽  
Constantin Vertan

This article discusses content-based access to video information in large video databases and particularly, to retrieve animated movies. The authors examine temporal segmentation, and propose cut, fade and dissolve detection methods adapted to the constraints of this domain. Further, the authors discuss a fuzzy linguistic approach for deriving automatic symbolic/semantic content annotation in terms of color techniques and action content. The proposed content descriptions are then used with several data mining techniques (SVM, k-means) to automatically retrieve the animation genre and to classify animated movies according to some color techniques. The authors integrate all the previous techniques to constitute a prototype client-server architecture for a 3D virtual environment for interactive video retrieval.


Author(s):  
Lilac Al-Safadi ◽  
Janusz Getta

The advancement of multimedia technologies has enabled electronic processing of information to be recorded in formats that are different from the standard text format. These include image, audio and video formats. The video format is a rich and expressive form of media used in many areas of our everyday life, such as in education, medicine and engineering. The expressiveness of video documents is the main reason for their domination in future information systems. Therefore, effective and efficient access to video information that supports video-based applications has become a critical research area. This has led to the development of, for example, new digitizing and compression tools and technology, video data models and query languages, video data management systems and video analyzers. With applications of a vast amount of stored video data, such as news archives and digital television, video retrieval became, and still is, an active area of research.


2017 ◽  
Vol 10 (1) ◽  
pp. 85-108 ◽  
Author(s):  
Khadidja Belattar ◽  
Sihem Mostefai ◽  
Amer Draa

The use of Computer-Aided Diagnosis in dermatology raises the need to integrate Content-Based Image Retrieval (CBIR) technologies, which can help untrained users as a decision-support system for skin lesion diagnosis. However, classical CBIR systems perform poorly because of the semantic gap. To alleviate this problem, we propose an intelligent Content-Based Dermoscopic Image Retrieval (CBDIR) system with Relevance Feedback (RF) for melanoma diagnosis that offers efficient and accurate image retrieval, with visual feature extraction independent of any specific diagnostic method. After a query image is submitted, the system uses a linear-kernel active SVM combined with a histogram-intersection similarity measure to retrieve the K most similar skin lesion images; the dominant class (melanoma or benign) in this set is returned as the diagnosis of the query image. Extensive experiments on a 1097-image database show that the proposed scheme is more effective than CBDIR without the assistance of RF.
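The histogram-intersection similarity used in the retrieval step above is a standard measure and is short enough to show in full; the normalization convention is the usual one, not a detail taken from the paper.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection similarity: the sum of per-bin minima.
    For L1-normalized histograms the value lies in [0, 1], with 1
    meaning identical histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Ranking a database by this score against the query image's histogram and taking the top K, then letting the dominant class of those K neighbors decide the label, reproduces the classification-by-retrieval step the abstract describes.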


1983 ◽  
Vol 17 (3) ◽  
pp. 323-330 ◽  
Author(s):  
F. S. Hill ◽  
Sheldon Walker ◽  
Fuwen Gao
