Combining Convolutional Neural Network and Markov Random Field for Semantic Image Retrieval

2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
Haijiao Xu ◽  
Changqin Huang ◽  
Xiaodi Huang ◽  
Chunyan Xu ◽  
Muxiong Huang

With the rapidly growing number of images on the Internet, efficient and scalable semantic image retrieval is becoming increasingly important. This paper presents a novel approach to semantic image retrieval that combines a Convolutional Neural Network (CNN) with a Markov Random Field (MRF). Image concept detection, that is, automatically recognizing multiple semantic concepts in an unlabeled image, is a key step in semantic image retrieval. Unlike previous work that applies single-concept classifiers one by one, we detect multiple semantic concepts jointly with a multiconcept scene classifier. In other words, our approach treats multiple concepts as a holistic scene for multiconcept scene learning. Specifically, we first train a CNN as a concept classifier comprising two types of classifiers: a single-concept fully connected classifier suited to single-concept detection and a multiconcept scene fully connected classifier suited to holistic scene detection. We then propose an MRF-based late fusion approach that learns the semantic correlation between the single-concept classifier and the multiconcept scene classifier. Finally, the semantic correlation among the subconcepts of images is captured to further improve detection precision. To investigate the feasibility and effectiveness of the proposed approach, we conduct comprehensive experiments on two publicly available image databases. The results show that the proposed approach outperforms several state-of-the-art approaches.
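The late fusion step can be illustrated with a minimal sketch. The abstract does not give the MRF potentials, so everything below is an assumption made for illustration: the function name `fuse_scene_scores`, the `weight` parameter, and the log-linear score that adds the scene classifier's log-probability to a term rewarding agreement with the single-concept outputs. It should not be read as the authors' model.

```python
import math

def fuse_scene_scores(concept_probs, scene_probs, weight=1.0):
    """Rank candidate scenes by a log-linear fusion score (illustrative only).

    concept_probs: dict concept -> probability from a single-concept classifier.
    scene_probs:   dict frozenset(concepts) -> probability from a scene classifier.
    weight:        hypothetical trade-off between the two classifier types.
    """
    eps = 1e-9  # avoids log(0)
    ranked = []
    for scene, q in scene_probs.items():
        score = math.log(q + eps)
        for concept, p in concept_probs.items():
            # A concept in the scene should be likely; one outside it unlikely.
            agree = p if concept in scene else 1.0 - p
            score += weight * math.log(agree + eps)
        ranked.append((score, scene))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [scene for _, scene in ranked]

# Made-up classifier outputs for one image.
concept_probs = {"sky": 0.9, "sea": 0.8, "car": 0.1}
scene_probs = {
    frozenset({"sky", "sea"}): 0.5,
    frozenset({"sky", "car"}): 0.3,
    frozenset({"car"}): 0.2,
}
ranked_scenes = fuse_scene_scores(concept_probs, scene_probs)
```

With these made-up inputs the scene containing "sky" and "sea" ranks first, because both classifier types support it.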

Author(s):  
Kang Han ◽  
Wanggen Wan ◽  
Haiyan Yao ◽  
Li Hou

In this paper, we propose a method called Convolutional Neural Network-Markov Random Field (CNN-MRF) to estimate the crowd count in a still image. We first divide the dense crowd image into overlapping patches, then use a deep convolutional neural network to extract features from each patch, followed by a fully connected neural network that regresses the crowd count of each local patch. Because the local patches overlap, the crowd counts of adjacent patches are highly correlated. We exploit this correlation with a Markov random field to smooth the counting results of the local patches. Experiments show that our approach significantly outperforms state-of-the-art methods on the UCF and Shanghaitech crowd counting datasets.
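The smoothing idea can be sketched as follows. The abstract does not state the MRF potentials, so this sketch assumes a simple quadratic (Gaussian) MRF on a 4-connected patch grid: each smoothed count should stay close to its regressed value while differing little from its neighbours. The function name `smooth_counts` and the parameters `lam` and `iters` are hypothetical.

```python
def smooth_counts(counts, lam=1.0, iters=200):
    """Smooth a 2D grid of per-patch counts with a quadratic MRF (assumed form).

    Minimises sum_i (x_i - y_i)^2 + lam * sum_{i~j} (x_i - x_j)^2 by
    Gauss-Seidel updates, where y is the regressed count and i~j ranges
    over 4-connected neighbouring patches.
    """
    rows, cols = len(counts), len(counts[0])
    x = [row[:] for row in counts]  # start from the raw regressed counts
    for _ in range(iters):
        for r in range(rows):
            for c in range(cols):
                nbrs = [
                    x[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                ]
                # Closed-form minimiser of the energy with respect to x[r][c].
                x[r][c] = (counts[r][c] + lam * sum(nbrs)) / (1.0 + lam * len(nbrs))
    return x

# Made-up patch counts with one outlier in the centre.
raw = [[10.0, 10.0, 10.0],
       [10.0, 40.0, 10.0],
       [10.0, 10.0, 10.0]]
smoothed = smooth_counts(raw, lam=1.0)
```

Under this assumed energy the outlier is pulled toward its neighbours, and a larger `lam` gives stronger smoothing.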


2013 ◽  
Vol 2013 ◽  
pp. 1-16
Author(s):  
Ricardo Omar Chávez ◽  
Hugo Jair Escalante ◽  
Manuel Montes-y-Gómez ◽  
Luis Enrique Sucar

This paper introduces a multimodal approach for reranking image retrieval results based on relevance feedback. We consider the problem of reordering the ranked list of images returned by an image retrieval system so that images relevant to a query are moved to the first positions of the list. We propose a Markov random field (MRF) model that classifies the images in the initial retrieval-result list as relevant or irrelevant; the output of the MRF is used to generate a new ranked list of images. The MRF takes into account (1) the rank information provided by the initial retrieval system, (2) similarities among images in the list, and (3) relevance feedback information. Hence, the problem of image reranking is reduced to that of minimizing an energy function that represents a trade-off between image relevance and interimage similarity. The proposed MRF is multimodal, as it can take advantage of both the visual and the textual information with which images are described. We report experimental results on the IAPR TC12 collection using visual and textual features to represent images. The results show that our method improves the ranking provided by the base retrieval system. Also, the multimodal MRF outperforms unimodal (i.e., either text-based or image-based) MRFs that we have developed in previous work. Furthermore, the proposed MRF outperforms baseline multimodal methods that combine information from unimodal MRFs.
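The energy-minimization view of reranking can be illustrated with a minimal sketch. The abstract does not specify the energy function or the optimizer, so the following are assumptions: a unary cost that makes the "relevant" label more expensive further down the initial list, a pairwise cost that penalizes similar images receiving different labels, and iterated conditional modes (ICM) as the minimizer. The name `rerank` and the parameters `lam` and `iters` are hypothetical.

```python
def rerank(initial_order, similarity, feedback, lam=1.0, iters=10):
    """Rerank a result list by labeling items relevant (1) or irrelevant (0).

    initial_order: items in the order returned by the base retrieval system.
    similarity:    square matrix; similarity[i][j] for list positions i, j.
    feedback:      dict position -> label fixed by the user (relevance feedback).
    """
    n = len(initial_order)
    # Assumed unary term: labeling position i relevant costs more as i grows.
    prior = [i / max(n - 1, 1) for i in range(n)]
    labels = [1 if prior[i] < 0.5 else 0 for i in range(n)]
    for i, lab in feedback.items():
        labels[i] = lab
    for _ in range(iters):
        changed = False
        for i in range(n):
            if i in feedback:
                continue  # user-provided labels stay fixed
            costs = []
            for lab in (0, 1):
                unary = prior[i] if lab == 1 else 1.0 - prior[i]
                # Pairwise term: disagreeing with similar images is penalized.
                pair = sum(similarity[i][j] for j in range(n)
                           if j != i and labels[j] != lab)
                costs.append(unary + lam * pair)
            best = 0 if costs[0] < costs[1] else 1
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    relevant = [initial_order[i] for i in range(n) if labels[i] == 1]
    irrelevant = [initial_order[i] for i in range(n) if labels[i] == 0]
    return relevant + irrelevant  # original order is kept within each group

# Made-up example: "a" resembles "b", and "c" resembles "d".
sim = [[1.0, 0.9, 0.1, 0.1],
       [0.9, 1.0, 0.1, 0.1],
       [0.1, 0.1, 1.0, 0.9],
       [0.1, 0.1, 0.9, 1.0]]
new_order = rerank(["a", "b", "c", "d"], sim, feedback={0: 0, 3: 1})
```

In this made-up example the user marks the top result "a" as irrelevant and "d" as relevant; the similarity term propagates those labels to "b" and "c", so "c" and "d" move to the front.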
