Object-Based Image Retrieval Using the U-Net-Based Neural Network

2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Sandeep Kumar ◽  
Arpit Jain ◽  
Ambuj Kumar Agarwal ◽  
Shilpa Rani ◽  
Anshu Ghimire

Research communities are focusing increasingly on digital image retrieval as internet and social media use grows. In this paper, a U-Net-based neural network is proposed for segmentation, and Haar DWT and lifting wavelet schemes are used for feature extraction in content-based image retrieval (CBIR). The Haar wavelet is preferred because it is easy to understand, very simple to compute, and the fastest. The U-Net-based convolutional neural network (CNN) gives more accurate results than the existing methodology because deep learning techniques extract both low-level and high-level features from the input image. For evaluation, two benchmark datasets are used; the accuracy of the proposed method is 93.01% on Corel 1K and 88.39% on Corel 5K. Using U-Net for segmentation reduces the dimension of the feature vector and cuts feature extraction time by 5 seconds compared to the existing methods. According to the performance analysis, U-Net improves image retrieval performance in terms of accuracy, precision, and recall on both benchmark datasets.
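
As a rough illustration of Haar-based feature extraction, the sketch below computes one level of the orthonormal 2D Haar DWT and summarizes each sub-band by its mean energy. It is not the authors' pipeline: the function names (`haar_1d`, `haar_2d`, `haar_features`), the sub-band naming, and the choice of mean energy as the feature are all assumptions made for this example, and the U-Net segmentation and lifting scheme are not shown.

```python
import math

def haar_1d(signal):
    """One level of the orthonormal Haar transform on an even-length sequence."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def _transform_columns(block):
    """Apply haar_1d to every column and return (low, high) blocks as row lists."""
    columns = [list(col) for col in zip(*block)]
    pairs = [haar_1d(col) for col in columns]
    low_cols = [p[0] for p in pairs]
    high_cols = [p[1] for p in pairs]
    # Transpose back so the results are stored row by row again.
    return [list(r) for r in zip(*low_cols)], [list(r) for r in zip(*high_cols)]

def haar_2d(image):
    """One-level 2D Haar DWT of an image with even height and width.

    Returns (ll, lh, hl, hh). The naming here is an assumption: the first
    letter refers to the row filter, the second to the column filter.
    """
    lows, highs = [], []
    for row in image:
        approx, detail = haar_1d(row)
        lows.append(approx)
        highs.append(detail)
    ll, lh = _transform_columns(lows)
    hl, hh = _transform_columns(highs)
    return ll, lh, hl, hh

def haar_features(image):
    """Hypothetical 4-dimensional feature: mean energy of each sub-band."""
    features = []
    for band in haar_2d(image):
        count = sum(len(row) for row in band)
        features.append(sum(v * v for row in band for v in row) / count)
    return features
```

Because the transform is orthonormal, the total energy of the four sub-bands equals the energy of the input image, which is a convenient sanity check when experimenting with the sketch.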

Author(s):  
Shikha Bhardwaj ◽  
Gitanjali Pandove ◽  
Pawan Kumar Dahiya

Background: Retrieving a particular image from a vast repository of images requires an efficient system, known as a content-based image retrieval (CBIR) system. Color is an important attribute of an image, and the proposed system consists of a hybrid color descriptor used for color feature extraction. Because deep learning has gained prominence, the performance of this fusion-based color descriptor is also analyzed in the presence of deep learning classifiers. Method: This paper presents a comparative experimental analysis of various color descriptors; the best two are chosen to form an efficient color-based hybrid system denoted combined color moment-color autocorrelogram (Co-CMCAC). To increase the retrieval accuracy of the hybrid system, a cascade forward back propagation neural network (CFBPNN) is used. The classification accuracy obtained with CFBPNN is also compared to that of the Patternnet neural network. Results: The proposed hybrid color descriptor achieves results of 95.4%, 88.2%, 84.4% and 96.05% on the Corel-1K, Corel-5K, Corel-10K and Oxford flower benchmark datasets respectively, which is superior to many state-of-the-art related techniques. Conclusion: This paper presents an experimental and analytical study of different color feature descriptors, namely color moment (CM), color auto-correlogram (CAC), color histogram (CH), color coherence vector (CCV) and dominant color descriptor (DCD). The proposed hybrid color descriptor (Co-CMCAC) is used to extract color features, with a cascade forward back propagation neural network (CFBPNN) as the classifier, on four benchmark datasets: Corel-1K, Corel-5K, Corel-10K and Oxford flower.
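
To make the color moment (CM) half of the descriptor concrete, here is a minimal sketch that computes the first three moments (mean, standard deviation, and the cube root of the third central moment) for each RGB channel. The function name `color_moments`, the pixel representation as a flat list of tuples, and this particular skewness convention are assumptions for illustration; the abstract does not say how the paper computes or fuses the moments, and the autocorrelogram part is not shown.

```python
import math

def color_moments(pixels):
    """Return a 9-dimensional vector: [mean, std, skew] for each of R, G, B.

    `pixels` is a non-empty list of (r, g, b) tuples (a hypothetical input
    format chosen to keep the example dependency-free).
    """
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(variance)
        # Signed cube root of the third central moment, so the result keeps
        # the same units as the pixel values.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, std, skew])
    return features
```

For example, `color_moments([(0, 10, 5), (2, 10, 5)])` gives a red channel with mean 1 and standard deviation 1, while the constant green and blue channels have zero spread and zero skew.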


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection require expert attention and considerable effort. Deep learning techniques extract and select the most important features automatically, but they require an enormous amount of data for effective classification. This work classifies neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy, 88.89%, for the kernel-based infant cry classification system.
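
The spectrogram step can be sketched with a small, dependency-free STFT. The function name `stft_magnitude`, the Hann window, and the frame and hop sizes below are assumptions chosen for readability; the abstract does not state the paper's STFT settings, and a real pipeline would typically use an FFT library rather than this direct DFT.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=8, hop=4):
    """Return a magnitude spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 magnitudes (non-negative frequencies).
    """
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames
```

Stacking the returned frames as columns (often after a log scaling) gives the image that a CNN would take as input; a pure tone shows up as a horizontal band at its frequency bin.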


2018 ◽  
Vol 7 (3.1) ◽  
pp. 13
Author(s):  
Raveendra K ◽  
R Vinoth Kanna

Automatic logo-based document image retrieval is an essential and widely used method in feature extraction applications. In this paper, the architecture of the convolutional neural network (CNN) is explained in detail with pictorial representations to make the complex CNN process easier to understand. The main objective of this paper is to use the CNN effectively in automatic logo-based document image retrieval.


Content-based image retrieval (CBIR) is an extensively used technique for retrieving images from large image databases. However, users are not satisfied with conventional image retrieval techniques. In addition, with the advance of web development and transmission networks, the number of images available to users continues to increase, and digital images are produced continuously and in considerable volume in many areas. Quick access to images similar to a given query image in this extensive collection poses great challenges and requires proficient techniques. From query by image to retrieval of relevant images, CBIR has key phases such as feature extraction, similarity measurement, and retrieval of relevant images; of these, feature extraction is one of the most important. Recently, convolutional neural networks (CNNs) have shown good results in computer vision owing to their ability to extract features from images. AlexNet is a classical deep CNN for image feature extraction. We have modified the AlexNet architecture with a few changes and propose a novel framework to improve its ability for feature extraction and similarity measurement. The proposed approach optimizes AlexNet with respect to the pooling layer. In particular, average pooling is replaced by max-avg pooling, and the non-linear activation function Maxout is used after every convolution layer for better feature extraction. This paper introduces a CNN for feature extraction from images in a CBIR system and also presents Euclidean distance along with the Comprehensive Values for better results. The proposed framework goes beyond image retrieval and extends to large-scale databases. The performance of the proposed work is evaluated using precision. The proposed work shows better results than existing works.
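
The similarity measurement and evaluation phases can be illustrated with a small sketch that ranks database images by Euclidean distance to the query's feature vector and then computes precision over the top results. The function names, the `(label, feature_vector)` database layout, and the use of precision at k are assumptions for this example; the modified AlexNet features and the "Comprehensive Values" mentioned in the abstract are not reproduced here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k):
    """Return the labels of the k database entries closest to `query`.

    `database` is a list of (label, feature_vector) pairs (hypothetical layout).
    """
    ranked = sorted(database, key=lambda item: euclidean(query, item[1]))
    return [label for label, _ in ranked[:k]]

def precision_at_k(retrieved_labels, query_label):
    """Fraction of retrieved items that share the query's class label."""
    hits = sum(1 for label in retrieved_labels if label == query_label)
    return hits / len(retrieved_labels)
```

In a benchmark such as Corel, each image would be used as a query in turn and the per-query precision values averaged, though the exact protocol used in the paper is not given in the abstract.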


2019 ◽  
Vol 9 (19) ◽  
pp. 4036 ◽  
Author(s):  
You ◽  
Wu ◽  
Lee ◽  
Liu

Multi-class classification is a very important technique in engineering applications, e.g., mechanical systems, mechanics and design innovations, applied materials in nanotechnologies, etc. A large amount of research has been done on single-label classification, where objects are associated with a single category. However, in many application domains, an object can belong to two or more categories, and multi-label classification is needed. Traditionally, statistical methods were used; recently, machine learning techniques, in particular neural networks, have been proposed to solve the multi-class classification problem. In this paper, we develop radial basis function (RBF)-based neural network schemes for single-label and multi-label classification, respectively. The number of hidden nodes and the parameters of the basis functions are determined automatically by applying an iterative self-constructing clustering algorithm to the given training dataset, and the biases and weights are derived optimally by least squares. Dimensionality reduction techniques are adopted and integrated to help reduce the overfitting problem associated with RBF networks. Experimental results on benchmark datasets are presented to show the effectiveness of the proposed schemes.
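
The least-squares step for the output layer can be sketched as follows. This example assumes Gaussian basis functions with one shared width and takes the centers as given by hand; the paper's iterative self-constructing clustering, which chooses the hidden nodes and their parameters, is not reproduced, and all function names are hypothetical. The weights and bias are obtained by solving the normal equations with a small Gaussian elimination routine.

```python
import math

def rbf_design(X, centers, width):
    """Hidden-layer outputs for each sample, with a trailing 1.0 for the bias."""
    return [[math.exp(-sum((x - c) ** 2 for x, c in zip(row, ctr)) / (2 * width ** 2))
             for ctr in centers] + [1.0]
            for row in X]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_output_weights(X, y, centers, width):
    """Least-squares weights (last entry is the bias) via the normal equations."""
    phi = rbf_design(X, centers, width)
    m = len(phi[0])
    A = [[sum(r[i] * r[j] for r in phi) for j in range(m)] for i in range(m)]
    b = [sum(r[i] * t for r, t in zip(phi, y)) for i in range(m)]
    return solve(A, b)

def predict(x, centers, width, weights):
    """Network output for a single sample x."""
    phi = rbf_design([x], centers, width)[0]
    return sum(p * w for p, w in zip(phi, weights))
```

For single-label classification, one such output could be fitted per class and the largest output taken as the prediction; for multi-label classification, each output would instead be thresholded independently. Both of these readouts are assumptions, since the abstract does not describe them.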


2013 ◽  
Vol 347-350 ◽  
pp. 3537-3540
Author(s):  
Hai Yun Lin ◽  
Yu Jiao Wang ◽  
Jian Chun Cai

With respect to the classification of current image retrieval technology and its existing issues, the paper puts forward a method for image semantic feature extraction based on artificial intelligence. By fusing fuzzy logic, a genetic algorithm, and an artificial neural network, the new method addresses the difficult problem of image semantic feature extraction and greatly improves the efficiency and accuracy of image retrieval.


2012 ◽  
Vol 2012 ◽  
pp. 1-19 ◽  
Author(s):  
Chih-Fong Tsai

Content-based image retrieval (CBIR) systems require users to query images by their low-level visual content; this not only makes it hard for users to formulate queries, but can also lead to unsatisfactory retrieval results. To address this, image annotation was proposed. The aim of image annotation is to automatically assign keywords to images, so that image retrieval users are able to query images by keywords. Image annotation can be regarded as an image classification problem: images are represented by low-level features, and supervised learning techniques are used to learn the mapping between low-level features and high-level concepts (i.e., class labels). One of the most widely used feature representation methods is bag-of-words (BoW). This paper reviews related works that improve and/or apply BoW for image annotation. Moreover, many recent works (from 2006 to 2012) are compared in terms of the methodology of BoW feature generation and experimental design. In addition, several different issues in using BoW are discussed, and some important issues for future research are identified.
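
The core of a BoW image representation can be sketched in a few lines: each local descriptor is assigned to its nearest codeword, and the image is represented by the normalized histogram of codeword counts. The function names and the tiny two-dimensional descriptors are assumptions for illustration; in practice the codebook is usually learned by clustering local descriptors from training images, a step not shown here.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[i])),
    )

def bow_histogram(descriptors, codebook):
    """Normalized histogram of codeword assignments for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors maps to the all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]
```

The resulting fixed-length histogram is what a supervised classifier would consume to map low-level features to class labels, regardless of how many local descriptors each image produced.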


2021 ◽  
Vol 11 (19) ◽  
pp. 9197
Author(s):  
Muhammad Tahir ◽  
Saeed Anwar

Person re-identification (re-ID) is an essential task in computer vision, particularly in surveillance applications. The aim is to identify a person based on an input image from surveillance photographs in various scenarios. Most person re-ID techniques utilize convolutional neural networks (CNNs); however, vision transformers are replacing pure CNNs for various computer vision tasks such as object recognition and classification. Vision transformers contain information about local regions of the image, and current techniques exploit this advantage to improve the accuracy of the tasks at hand. We propose to use vision transformers in conjunction with vanilla CNN models to investigate the true strength of transformers in person re-identification. We employ three backbones with different combinations of vision transformers on two benchmark datasets. The overall performance of the backbones increased, showing the importance of vision transformers. We provide ablation studies and show the importance of various components of the vision transformers in re-identification tasks.


Author(s):  
Xiaowang Zhang ◽  
Qiang Gao ◽  
Zhiyong Feng

In this paper, we present a neural network (InteractionNN) for sparse predictive analysis, in which hidden features of sparse data can be learned by multilevel feature interaction. To characterize multilevel interaction of features, InteractionNN consists of three modules, namely nonlinear interaction pooling, layer-lossing, and embedding. Nonlinear interaction pooling (NI pooling) is a hierarchical structure that, by shortcut connection, constructs low-level feature interactions from basic dense features to elementary features. Layer-lossing is a feed-forward neural network in which high-level feature interactions can be learned from low-level feature interactions via correlation of all layers with the target. Embedding extracts basic dense features from the sparse features of the data, which helps reduce the computational complexity of the proposed model. Finally, experiments on two benchmark datasets show that InteractionNN performs better than most state-of-the-art models in sparse regression.

