Biomedical Imaging Modality Classification Using Combined Visual Features and Textual Terms

2011 · Vol 2011 · pp. 1-7
Author(s): Xian-Hua Han, Yen-Wei Chen

We describe an approach to automatic modality classification in the medical image retrieval task of the 2010 CLEF cross-language image retrieval campaign (ImageCLEF). This paper focuses on feature extraction from medical images and fuses the extracted visual features with a textual feature for modality classification. To extract visual features from the images, we used histogram descriptors of edge, gray, or color intensity and block-based variation as global features, and a SIFT histogram as a local feature. As the textual feature of the image representation, a binary histogram of predefined vocabulary words from image captions is used. We then combine the different features using normalized kernel functions for SVM classification. Furthermore, for easily misclassified modality pairs such as CT and MR, or PET and NM, a local classifier is used to distinguish samples within the pair and improve performance. The proposed strategy is evaluated on the modality dataset provided by ImageCLEF 2010.
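The fusion step described above — one kernel per feature type, each normalised before being combined for the SVM — can be sketched as follows. This is a minimal NumPy illustration with a linear base kernel and uniform weights; the paper's exact kernel functions and weighting are not specified here, so those choices are assumptions:

```python
import numpy as np

def normalize_kernel(K):
    # Cosine-style normalisation K(x,y) / sqrt(K(x,x) K(y,y)),
    # so every feature type contributes on the same scale (diag = 1).
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def combined_kernel(feature_blocks, weights=None):
    # feature_blocks: list of (n_samples, d_i) arrays, one per feature
    # type (e.g. edge histogram, SIFT histogram, caption-word histogram).
    n = feature_blocks[0].shape[0]
    if weights is None:
        weights = [1.0 / len(feature_blocks)] * len(feature_blocks)
    K = np.zeros((n, n))
    for w, F in zip(weights, feature_blocks):
        K += w * normalize_kernel(F @ F.T)  # linear base kernel (assumed)
    return K  # pass to an SVM with a precomputed-kernel interface
```

With weights summing to one, the combined Gram matrix keeps a unit diagonal, which keeps the SVM's regularisation behaviour comparable across feature mixes.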

Algorithms · 2021 · Vol 14 (8) · pp. 228
Author(s): Ezekiel Mensah Martey, Hang Lei, Xiaoyu Li, Obed Appiah

Image representation plays a vital role in the realisation of a Content-Based Image Retrieval (CBIR) system. Representation is needed because pixel-by-pixel matching for image retrieval is impracticable as a result of the rigid nature of such an approach. In CBIR, therefore, colour, shape, texture, and other visual features are used to represent images for an effective retrieval task. Among these visual features, colour and texture are particularly effective in defining the content of an image. However, combining these features does not necessarily guarantee better retrieval accuracy, owing to image transformations such as rotation, scaling, and translation that an image may have undergone. Moreover, feature vector representations that consume ample memory space affect the running time of the retrieval task. To address these problems, we propose a new colour scheme called the Stack Colour Histogram (SCH), which inherently extracts colour and neighbourhood information into a descriptor for indexing images. SCH performs recurrent mean filtering of the image to be indexed. The recurrent blurring in the proposed method works by repeatedly filtering (transforming) the image: the output of one transformation serves as the input to the next, and in each case a histogram is generated. The histograms are summed bin-by-bin, and the resulting vector is used to index the image. Because the blurring process uses each pixel's neighbourhood information, the proposed SCH captures the inherent textural information of the indexed image. SCH was extensively tested on the Coil100, Outex, Batik, and Corel10K datasets. The Coil100, Outex, and Batik datasets are generally used to assess image texture descriptors, while Corel10K is used for heterogeneous descriptors.
The experimental results show that our proposed descriptor significantly improves retrieval and classification rates when compared with CMTH, MTH, TCM, CTM, and NRFUCTM, which are state-of-the-art descriptors for images with textural features.
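The SCH construction described above — repeated mean filtering, one histogram per blurring stage, summed bin-by-bin and normalised — can be sketched as follows. The 3×3 box filter, four iterations, and 16 bins are illustrative assumptions, not necessarily the paper's settings:

```python
import numpy as np

def mean_filter(img):
    # One 3x3 box-blur pass with edge padding.
    p = np.pad(img, 1, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += p[1 + dy:1 + dy + img.shape[0],
                     1 + dx:1 + dx + img.shape[1]]
    return out / 9.0

def stack_colour_histogram(img, n_iters=4, bins=16):
    # Accumulate one histogram per blurring stage, summed bin-by-bin;
    # each stage's output feeds the next stage's input.
    hist = np.zeros(bins)
    cur = img.astype(float)
    for _ in range(n_iters):
        h, _ = np.histogram(cur, bins=bins, range=(0, 256))
        hist += h
        cur = mean_filter(cur)
    return hist / hist.sum()  # normalised index vector
```

Because blurring mixes each pixel with its neighbours, successive histograms drift in a texture-dependent way, which is how the summed descriptor encodes neighbourhood information without an explicit texture operator.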


Author(s): Xinxun Xu, Muli Yang, Yanhua Yang, Hao Wang

Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a specific cross-modal retrieval task for searching natural images given free-hand sketches under the zero-shot scenario. Most existing methods solve this problem by simultaneously projecting visual features and semantic supervision into a low-dimensional common space for efficient retrieval. However, such low-dimensional projection destroys the completeness of semantic knowledge in the original semantic space, so it cannot transfer useful knowledge well when learning semantic features from different modalities. Moreover, domain information and semantic information are entangled in the visual features, which is not conducive to cross-modal matching, since it hinders the reduction of the domain gap between sketches and images. In this paper, we propose a Progressive Domain-independent Feature Decomposition (PDFD) network for ZS-SBIR. Specifically, with the supervision of the original semantic knowledge, PDFD decomposes visual features into domain features and semantic features, and the semantic features are then projected into the common space as retrieval features for ZS-SBIR. The progressive projection strategy maintains strong semantic supervision. Besides, to guarantee that the retrieval features capture clean and complete semantic information, a cross-reconstruction loss is introduced to encourage that any combination of retrieval features and domain features can reconstruct the visual features. Extensive experiments demonstrate the superiority of our PDFD over state-of-the-art competitors.
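The cross-reconstruction idea — any pairing of semantic features with domain features, across modalities, should reconstruct the corresponding visual features — can be illustrated with a hypothetical linear decoder `W` standing in for the paper's network. Taking the reconstruction target to follow the domain features is also an assumption made for this sketch:

```python
import numpy as np

def cross_reconstruction_loss(sem, dom, vis, W):
    # sem, dom: [sketch, image] lists of (n, d) semantic / domain features.
    # vis: [sketch, image] list of (n, d_v) visual features to reconstruct.
    # W: (2d, d_v) weights of a hypothetical linear decoder.
    loss = 0.0
    for i in (0, 1):      # semantic features from either modality
        for j in (0, 1):  # domain features from either modality
            recon = np.concatenate([sem[i], dom[j]], axis=1) @ W
            loss += np.mean((recon - vis[j]) ** 2)  # target follows domain
    return loss / 4.0     # average over the four cross-combinations
```

The cross terms (i ≠ j) are what force the semantic features to be domain-independent: a sketch's semantic code must still reconstruct a photo when paired with photo-domain features.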


2018 · Vol 11 (1) · pp. 42
Author(s): Ahmad Wahyu Rosyadi, Renest Danardono, Siprianus Septian Manek, Agus Zainal Arifin

One technique in region-based image retrieval (RBIR) is comparing the global feature of an entire image and the local feature of the image's sub-block in the query and database images. The determined sub-block must be able to detect objects of varying sizes and locations, so a sub-block with flexible size and location is needed. We propose a new method for local feature extraction that determines the flexible size and location of the sub-block based on the transition region in region-based image retrieval. Global features of both the query and database images are extracted using invariant moments. Local features in the database and query images are extracted using hue, saturation, and value (HSV) histograms and local binary patterns (LBP). Several steps extract the local feature of the sub-block in the query image: first, preprocessing is conducted to obtain the transition region; then the flexible sub-block is determined based on the transition region; afterward, the local feature of the sub-block is extracted. The result of this application is the retrieved images, ordered by similarity to the query image. Local feature extraction with the proposed method is effective for image retrieval, with precision and recall values of 57%.
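Of the two local descriptors named above, LBP is the more involved; a minimal NumPy sketch of the basic 8-neighbour, radius-1, 256-bin variant follows. These are common defaults, not necessarily the paper's configuration:

```python
import numpy as np

def lbp_histogram(gray):
    # Basic 8-neighbour Local Binary Patterns over the interior pixels:
    # each neighbour >= centre contributes one bit to an 8-bit code.
    c = gray[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    # Normalised 256-bin histogram of the codes is the texture feature.
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()
```

In the proposed pipeline such a histogram would be computed over the flexible sub-block only, then compared against the corresponding sub-block descriptors of the database images.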


Sensors · 2021 · Vol 21 (10) · pp. 3406
Author(s): Jie Jiang, Yin Zou, Lidong Chen, Yujie Fang

Precise localization and pose estimation in indoor environments are commonly employed in a wide range of applications, including robotics, augmented reality, and navigation and positioning services. Such applications can be solved via visual-based localization using a pre-built 3D model. The increase in search space associated with large scenes can be overcome by retrieving images in advance and subsequently estimating the pose. The majority of current deep-learning-based image retrieval methods require labeled data, which increases data annotation costs and complicates data acquisition. In this paper, we propose an unsupervised hierarchical indoor localization framework that integrates an unsupervised variational autoencoder (VAE) network with a visual Structure-from-Motion (SfM) approach in order to extract global and local features. During localization, global features are applied for image retrieval at the level of the scene map to obtain candidate images; local features are subsequently used to estimate the pose from 2D-3D matches between query and candidate images. Only RGB images are used as input to the proposed localization system, which is both convenient and challenging. Experimental results reveal that the proposed method can localize images within 0.16 m and 4° on the 7-Scenes datasets, and 32.8% of queries within 5 m and 20° on the Baidu dataset. Furthermore, our proposed method achieves higher precision than comparable advanced methods.
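The coarse stage of the hierarchy above — rank database images by similarity of their global descriptors to the query's, keep the top-k as candidates for pose estimation — can be sketched as below. Cosine similarity, and the idea that the descriptor is the VAE encoder's latent vector, are assumptions for illustration:

```python
import numpy as np

def retrieve_candidates(query_feat, db_feats, k=5):
    # Rank database images by cosine similarity of global descriptors
    # (assumed here to be the VAE encoder's latent vectors) and return
    # the top-k candidates for the subsequent 2D-3D pose stage.
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q
    order = np.argsort(-sims)[:k]
    return order, sims[order]
```

Restricting pose estimation to these k candidates is what keeps the search space manageable in large scenes.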


Author(s): Nandini H. M., Chethan H. K., Rashmi B. S.

Shot boundary detection (SBD) in videos is one of the most fundamental tasks in content-based video retrieval and analysis. In this respect, an efficient approach to detect abrupt and gradual transitions in videos is presented. The proposed method detects shot boundaries by extracting a block-based mean probability binary weight (MPBW) histogram from normalized Kirsch magnitude frames, as an amalgamation of local and global features. Abrupt transitions are detected by utilizing the distance between consecutive MPBW histograms and employing an adaptive threshold. In a subsequent step, a statistical measure based on the coefficient of mean deviation and variance is applied to the MPBW histograms to detect gradual transitions. Experiments were conducted on the TRECVID 2001 and 2007 datasets to analyse and validate the proposed method. Experimental results show significant improvement of the proposed SBD approach over some state-of-the-art algorithms in terms of recall, precision, and F1-score.
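The abrupt-transition step — measure the distance between consecutive per-frame histograms and cut where it exceeds a threshold adapted to the sequence's own statistics — can be sketched as follows. The L1 distance and the mean-plus-α·std threshold are illustrative stand-ins for the paper's exact distance measure and adaptation rule:

```python
import numpy as np

def detect_abrupt_cuts(hists, alpha=2.0):
    # hists: (n_frames, bins) array of per-frame histograms (MPBW in the
    # paper; any per-frame histogram works for this sketch).
    d = np.abs(np.diff(hists, axis=0)).sum(axis=1)  # L1 frame-to-frame distance
    thresh = d.mean() + alpha * d.std()             # adaptive threshold
    return np.where(d > thresh)[0] + 1              # index of first frame after cut
```

Because the threshold is derived from the distance sequence itself, the same `alpha` works across videos with very different overall motion levels; gradual transitions, which spread their change over many small distances, are left to the second-stage statistical measure.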


Author(s): Gangavarapu Venkata Satya Kumar, Pillutla Gopala Krishna Mohan

In diverse computer applications, the analysis of image content plays a key role. This content might be textual (like text appearing in the images) or visual (like shape, color, texture). These two kinds of content comprise an image's basic features and therefore become the major asset for any implementation. Many state-of-the-art models are based on visual search or annotated text for Content-Based Image Retrieval (CBIR). With the growing demand for multitasking, a new method combining both textual and visual features is needed. This paper develops an intelligent CBIR system for a collection of different benchmark texture datasets. Here, a new descriptor named Information Oriented Angle-based Local Tri-directional Weber Patterns (IOA-LTriWPs) is adopted. The pattern is computed not only from the tri-direction and eight neighborhood pixels but also from four angles [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text]. Once the patterns concerning the tri-direction, eight neighborhood pixels, and four angles are obtained, the best patterns are selected based on maximum mutual information. Moreover, histogram computation of the patterns provides the final feature vector, from which a new weighted feature extraction is performed. As a new contribution, the novel weight function is optimized by the Improved MVO on a random basis (IMVO-RB), such that the precision and recall of the retrieved images are high. Further, the proposed model uses a logarithmic similarity, the Mean Square Logarithmic Error (MSLE), between the features of the query image and the trained images to retrieve the relevant images. Analyses on diverse texture image datasets have validated the accuracy and efficiency of the developed pattern over existing methods.
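The MSLE matching step — score each database image by the mean squared difference of log-transformed features against the query, then rank ascending — can be sketched as follows. Using `log1p` so that zero-valued histogram bins remain well-defined is an assumption of this sketch:

```python
import numpy as np

def msle(query_feat, db_feat):
    # Mean Square Logarithmic Error between two non-negative feature
    # vectors; log1p keeps zero-valued bins well-defined.
    return np.mean((np.log1p(query_feat) - np.log1p(db_feat)) ** 2)

def rank_by_msle(query_feat, db_feats):
    # Smaller MSLE means more similar: return database indices best-first.
    scores = np.array([msle(query_feat, f) for f in db_feats])
    return np.argsort(scores)
```

Relative to plain MSE, the log transform compresses large bin counts, so retrieval is driven by the overall shape of the pattern histogram rather than by a few dominant bins.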

