On Building a Universal and Compact Visual Vocabulary

2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Jian Hou ◽  
Wei-Xue Liu ◽  
Xu E ◽  
Hamid Reza Karimi

Bag-of-visual-words has been shown to be a powerful image representation and has attained great success in many computer vision and pattern recognition applications. Usually, for a given dataset, researchers choose to build a specific visual vocabulary from that dataset, and the problem of deriving a universal visual vocabulary is rarely addressed. Based on previous work on classification performance with respect to visual vocabulary size, we arrive at the hypothesis that a universal visual vocabulary can be obtained by taking into account the extent of similarity among the keypoints represented by one visual word. We then propose a similarity-threshold-based clustering method to calculate the optimal vocabulary size, where the universal similarity threshold can be obtained empirically. With the optimal vocabulary size, the optimal visual vocabularies of limited size built from three datasets are shown to be exchangeable and therefore universal. This result indicates that a universal and compact visual vocabulary can be built from a dataset that is not too small. Our work narrows the gap between bag-of-visual-words and bag-of-words, where a relatively fixed vocabulary can be used across different text datasets.
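The similarity-threshold idea can be illustrated with a greedy, leader-style clustering pass: a keypoint joins an existing visual word only if it lies within the threshold of that word's center; otherwise it founds a new word, and the number of words that emerges serves as the vocabulary-size estimate. This is a minimal sketch of the idea, not the authors' exact algorithm; the threshold value and the 2-D toy descriptors are illustrative assumptions.

```python
import math
import random

def leader_cluster(points, threshold):
    """Greedy similarity-threshold clustering: each point joins the first
    existing center within `threshold`, otherwise it founds a new cluster.
    The resulting number of clusters estimates the vocabulary size."""
    centers = []
    for p in points:
        for c in centers:
            if math.dist(p, c) <= threshold:
                break  # similar enough to an existing visual word
        else:
            centers.append(p)  # no word is similar enough; open a new one
    return centers

random.seed(0)
# Toy 2-D "descriptors": two tight blobs far apart.
blob_a = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)]
blob_b = [(random.gauss(5, 0.1), random.gauss(5, 0.1)) for _ in range(50)]
centers = leader_cluster(blob_a + blob_b, threshold=1.0)
print(len(centers))  # -> 2: one visual word per blob
```

A tighter threshold splits the blobs into more words, so the empirically chosen threshold directly controls the vocabulary size.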

2018 ◽  
Vol 45 (1) ◽  
pp. 117-135 ◽  
Author(s):  
Amna Sarwar ◽  
Zahid Mehmood ◽  
Tanzila Saba ◽  
Khurram Ashfaq Qazi ◽  
Ahmed Adnan ◽  
...  

Advancements in multimedia technologies have led to the growth of image databases. Retrieving images from such databases using the visual attributes of the images is a challenging task due to the close visual appearance among these attributes, which also introduces the issue of the semantic gap. In this article, we propose a novel method based on the bag-of-words (BoW) model, which performs visual word integration of the local intensity order pattern (LIOP) feature and the local binary pattern variance (LBPV) feature to reduce the semantic gap and enhance the performance of content-based image retrieval (CBIR). The proposed method uses the LIOP and LBPV features to build two smaller visual vocabularies (one from each feature), which are integrated to build a larger visual vocabulary that contains complementary features of both descriptors. For efficient CBIR, the smaller vocabulary improves recall, while the larger vocabulary improves the precision, or accuracy, of the CBIR. A comparative analysis of the proposed method is performed on three image databases, namely WANG-1K, WANG-1.5K, and Holidays. The experimental analysis on these image databases demonstrates its robust performance compared with recent CBIR methods.
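The integration step described above amounts to quantizing each feature type against its own vocabulary and concatenating the resulting histograms into one larger representation. The sketch below assumes tiny stand-in vocabularies and descriptors in place of the real LIOP and LBPV features.

```python
import math

def quantize(descriptors, vocabulary):
    """Map each descriptor to its nearest visual word; return the
    word-count histogram (one bin per word in the vocabulary)."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: math.dist(d, vocabulary[i]))
        hist[nearest] += 1
    return hist

# Hypothetical small vocabularies for two descriptor types (stand-ins
# for the LIOP and LBPV vocabularies in the paper).
vocab_liop = [(0.0, 0.0), (1.0, 1.0)]
vocab_lbpv = [(0.0,), (0.5,), (1.0,)]

liop_desc = [(0.1, 0.1), (0.9, 1.0), (1.1, 0.8)]
lbpv_desc = [(0.45,), (0.05,)]

# "Integration" is the concatenation of the two per-vocabulary
# histograms into one larger, complementary representation.
integrated = quantize(liop_desc, vocab_liop) + quantize(lbpv_desc, vocab_lbpv)
print(integrated)  # -> [1, 2, 1, 1, 0]
```

The integrated histogram has one bin per word of either vocabulary, so its length is the sum of the two vocabulary sizes.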


2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Hui Huang ◽  
Yan Ma

The Bag-of-Words (BoW) model is a well-known image categorization technique. However, in the conventional BoW model, neither the vocabulary size nor the visual words can be determined automatically. To overcome these problems, a hybrid clustering approach that combines improved hierarchical clustering with a K-means algorithm is proposed. We present a cluster validity index for the hierarchical clustering algorithm to adaptively determine when the algorithm should terminate and the optimal number of clusters. Furthermore, we improve the max-min distance method to optimize the initial cluster centers. The optimal number of clusters and the initial cluster centers are fed into K-means, and finally the vocabulary size and visual words are obtained. The proposed approach is extensively evaluated on two visual datasets. The experimental results show that the proposed method outperforms the conventional BoW model in categorization accuracy and demonstrate the feasibility and effectiveness of our approach.
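The max-min distance method mentioned above (the baseline the paper improves on) can be sketched as follows: start from one point, then repeatedly add the point whose minimum distance to the already-chosen centers is largest, spreading the initial K-means centers across the data. The toy points are illustrative assumptions.

```python
import math

def max_min_init(points, k):
    """Max-min distance seeding for K-means: greedily pick centers that
    maximize the minimum distance to the centers chosen so far."""
    centers = [points[0]]  # seed with the first point
    while len(centers) < k:
        next_pt = max(points,
                      key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(next_pt)
    return centers

points = [(0, 0), (0.2, 0), (5, 5), (5.1, 5), (10, 0)]
seeds = max_min_init(points, 3)
print(seeds)  # -> [(0, 0), (10, 0), (5, 5)]
```

Because each new seed is as far as possible from the existing ones, no two initial centers land in the same dense region, which is exactly what random initialization cannot guarantee.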


2021 ◽  
Vol 24 (2) ◽  
pp. 78-86
Author(s):  
Zainab N. Sultani ◽  
Ban N. Dhannoon

Image classification is acknowledged as one of the most critical and challenging tasks in computer vision. The bag of visual words (BoVW) model has proven very efficient for image classification tasks since it can effectively represent distinctive image features in vector space. In this paper, BoVW using the Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) descriptors is adapted for image classification. We propose a novel image classification system using local feature information obtained from both the SIFT and ORB local feature descriptors. As a result, the constructed SO-BoVW model presents highly discriminative features, enhancing classification performance. Experiments on the Caltech-101 and Flowers datasets prove the effectiveness of the proposed method.
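Once the fused SIFT+ORB histograms exist, classification can be as simple as nearest-neighbour matching by cosine similarity. The sketch below is a minimal stand-in for the classifier stage, assuming already-fused toy histograms and hypothetical labels; the paper's own classifier may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two histograms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def classify_1nn(query, train):
    """1-nearest-neighbour over BoVW histograms by cosine similarity."""
    return max(train, key=lambda t: cosine(query, t[0]))[1]

# Hypothetical fused SIFT+ORB histograms with illustrative labels.
train = [([5, 0, 1, 0], "flower"), ([0, 4, 0, 3], "face")]
print(classify_1nn([4, 1, 1, 0], train))  # -> flower
```

Cosine similarity is a common choice for BoVW histograms because it compares word proportions rather than absolute counts, which vary with the number of keypoints per image.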


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Wei-Xue Liu ◽  
Jian Hou ◽  
Hamid Reza Karimi

A codebook is an effective image representation method. Obtained by clustering local image descriptors, a codebook has been shown to be a distinctive image feature and is widely applied in object classification. In almost all existing work on codebooks, building the visual vocabulary follows a basic routine: extracting local image descriptors and clustering them with a user-designated number of clusters. The problem with this routine is that building a codebook for each individual dataset is not efficient. To address this problem, we investigate the influence of vocabulary size on classification performance and vocabulary universality with the kNN classifier. Experimental results indicate that, provided the vocabulary size is large enough, the vocabularies built from different datasets are exchangeable and universal.
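The "basic routine" criticized above reduces to clustering the pooled descriptors with a user-designated number of clusters and keeping the centers as the codebook. A toy Lloyd's K-means sketch of that step, under the assumption of tiny 2-D descriptors:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means with a user-designated k: the clustering step
    of the standard codebook-building routine (toy sketch)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[i].append(p)  # assign to the nearest center
        # Recompute each center as the mean of its group (keep old center
        # if a group went empty).
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

pts = [(0, 0), (0.1, 0.2), (4, 4), (4.2, 3.9)]
codebook = kmeans(pts, k=2)
print(sorted(codebook))  # two centers, one per descriptor blob
```

Repeating this routine per dataset is the inefficiency the paper targets: with a large enough k, one such codebook could serve several datasets.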


2020 ◽  
Vol 7 (2) ◽  
pp. 349
Author(s):  
Budiman Baso ◽  
Nanik Suciati

The variety of motifs in East Nusa Tenggara (NTT) tenun, such as flora, fauna, and geometric patterns, is a unique characteristic that can distinguish the region of origin and the type of the tenun. In this study, a Content-Based Image Retrieval (CBIR) system is implemented for tenun images, so that users can search for tenun images in a database using a query image, based on the visual features contained in the image. Often the query image entered by the user varies in scale, rotation, and lighting, so a feature extraction method is needed that can accommodate these variations. The tenun image retrieval system in this study uses the Bag of Visual Words (BoVW) model built from image keypoints extracted with the Speeded Up Robust Features (SURF) method. The BoVW vocabulary is built using K-Means to produce the visual vocabulary from the keypoints of all training images. The BoVW representation is expected to handle scale and rotation variations in the images, while lighting variations are addressed by improving image quality with Contrast Limited Adaptive Histogram Equalization (CLAHE). The experiment compares the performance of BoVW representations built using SURF features with those built using Maximally Stable Extremal Regions (MSER) for tenun image retrieval. The results show that SURF achieves an average accuracy of 89.86% with a computation time of 9.94 seconds, whereas MSER achieves an average accuracy of 84.04% with a computation time of 1.95 seconds.


2014 ◽  
Vol 8 (5) ◽  
pp. 310-318 ◽  
Author(s):  
Mohammad Mehdi Farhangi ◽  
Mohsen Soryani ◽  
Mahmood Fathy

2011 ◽  
Vol 8 (3) ◽  
pp. 931-951 ◽  
Author(s):  
Xinghao Jiang ◽  
Tanfeng Sun ◽  
Fu Guanglei

Local features have proved effective in image/video semantic analysis. The BoVW (bag of visual words) scheme clusters local features to form a visual vocabulary comprising a number of words, where each word is the center of one feature cluster. The vocabulary is used to recognize image semantics. In this paper, a new scheme to construct a semantic-binding hierarchical visual vocabulary is proposed. Some attributes and relationships of the semantic nodes in the model are discussed. The hierarchical semantic model is used to organize multi-scale semantics into a level-by-level structure. Experiments are performed on the LabelMe dataset; the performance of our scheme is evaluated and compared with the traditional BoVW scheme, and the experimental results demonstrate the efficiency and flexibility of our scheme.
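The level-by-level structure described above can be pictured as a tree in which coarse semantic nodes own the finer visual words beneath them, so that every word is bound to a path of semantics. A minimal sketch, assuming an illustrative two-level hierarchy whose node names and word ids are invented, not the paper's:

```python
# Hypothetical two-level semantic hierarchy: coarse semantic nodes own
# the fine visual words (integer ids) beneath them.
hierarchy = {
    "outdoor": {"tree_bark": [0, 3], "sky_patch": [1]},
    "indoor": {"desk_edge": [2], "screen_glow": [4]},
}

def semantic_path(word_id, tree):
    """Return the level-by-level path (coarse -> fine) that binds a
    visual word id to its semantic nodes, or None if unbound."""
    for coarse, fine_nodes in tree.items():
        for fine, word_ids in fine_nodes.items():
            if word_id in word_ids:
                return [coarse, fine]
    return None

print(semantic_path(3, hierarchy))  # -> ['outdoor', 'tree_bark']
```

Binding each word to such a path is what lets recognition report semantics at multiple scales from a single vocabulary lookup.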


2016 ◽  
Vol 30 (3) ◽  
pp. 403-412 ◽  
Author(s):  
Khaled F. Hussain ◽  
Ghada S. Moussa

A large increase in the number and types of vehicles has occurred due to population growth. This fact brings the need for efficient vehicle classification systems that can be used in traffic surveillance and intelligent transportation systems. In this study, a multi-type vehicle classification system based on Random Neural Networks (RNNs) and Bag-of-Visual-Words (BOVW) is developed. A 10-fold cross-validation technique is used, with a large dataset, to assess the proposed approach. Moreover, the BOVW–RNN classification performance is compared with that of LIVCS, a vehicle classification system based on RNNs. The results reveal that the BOVW–RNN classification system produces more reliable and accurate classification results than LIVCS. The main contribution of this paper is that the developed system can serve as a framework for many vehicle classification systems.

