A New Approach for Large-Scale Scene Image Retrieval Based on Improved Parallel k-Means Algorithm in MapReduce Environment

On Hierarchical Content-Based Image Retrieval by Dynamic Indexing and Guided Search

International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/jcini.2010100102 ◽

2010 ◽

Vol 4 (4) ◽

pp. 18-36 ◽

Cited By ~ 1

Author(s):

Jane You ◽

Qin Li ◽

Jinghua Wang

Keyword(s):

Image Retrieval ◽

Data Warehouse ◽

Image Data ◽

Image Feature ◽

Image Indexing ◽

Content Based Image Retrieval ◽

New Approach ◽

Guided Search ◽

Dynamic Image ◽

Face Features

This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an image database to an image data warehouse for dynamic image indexing. It also provides an image data schema for hierarchical image representation and dynamic image indexing, a statistically based feature selection scheme to achieve flexible similarity measures, and a feature component code to facilitate query processing and guide the search for the best matching. A series of case studies are reported, which include a wavelet-based image color hierarchy, classification of satellite images, tropical cyclone pattern recognition, and personal identification using multi-level palmprint and face features. Experimental results confirm that the new approach is feasible for content-based image retrieval.


Scalable Database Indexing and Fast Image Retrieval Based on Deep Learning and Hierarchically Nested Structure Applied to Remote Sensing and Plant Biology

Journal of Imaging ◽

10.3390/jimaging5030033 ◽

2019 ◽

Vol 5 (3) ◽

pp. 33 ◽

Cited By ~ 7

Author(s):

Pouria Sadeghi-Tehran ◽

Plamen Angelov ◽

Nicolas Virlet ◽

Malcolm Hawkesford

Keyword(s):

Remote Sensing ◽

Image Retrieval ◽

Large Scale ◽

Feature Fusion ◽

Search Time ◽

Data Retrieval ◽

Plant Biology ◽

Data Generation ◽

Imaging Data ◽

Nested Structure

Digitalisation has opened a wealth of new data opportunities by revolutionizing how images are captured. Although the cost of data generation is no longer a major concern, data management and processing have become a bottleneck. Any successful visual trait system requires automated data structuring and a data retrieval model to manage, search, and retrieve unstructured and complex image data. This paper investigates a highly scalable and computationally efficient image retrieval system for real-time content-based searching through large-scale image repositories in the domain of remote sensing and plant biology. Images are processed independently without considering any relevant context between sub-sets of images. We utilize a deep Convolutional Neural Network (CNN) model as a feature extractor to derive deep feature representations from the imaging data. In addition, we propose an effective scheme to optimize the data structure so that it can facilitate faster querying at search time, based on a hierarchically nested structure and recursive similarity measurements. A thorough series of tests was carried out on plant identification and high-resolution remote sensing data to evaluate the accuracy and the computational efficiency of the proposed approach against other content-based image retrieval (CBIR) techniques, such as the bag of visual words (BOVW) and multiple feature fusion techniques. The results demonstrate that the proposed scheme is effective and considerably faster than conventional indexing structures.


Application of Optimized Partitioning Around Medoid Algorithm in Image Retrieval

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2021010106 ◽

2021 ◽

Vol 12 (1) ◽

pp. 77-94

Author(s):

Yanxia Jin ◽

Xin Zhang ◽

Yao Jia

Keyword(s):

Image Retrieval ◽

Clustering Algorithm ◽

Particle Swarm ◽

Image Data ◽

Experimental Comparison ◽

Cluster Center ◽

Particle Swarm Algorithm ◽

Query Image ◽

Retrieval Accuracy ◽

Swarm Algorithm

In image retrieval, the major challenge is that the gallery of images is large and irregular, which results in low retrieval accuracy. This paper analyzes the disadvantages of the PAM (partitioning around medoids) clustering algorithm in image data classification, in particular the excessive time it spends searching for the representative object of each cluster. A fireworks particle swarm algorithm is used in the optimization process, and the resulting improved PAM algorithm, PF-PAM, is proposed and applied to image retrieval. First, the feature vectors of the gallery images are extracted for an initial clustering. Next, starting from that clustering result, the fireworks particle swarm algorithm searches for the optimal cluster centers to obtain the final clustering. Finally, given an incoming query image, the related image category is determined and similar images are returned. Experimental comparison with other approaches shows that this method can effectively improve retrieval accuracy.
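The cluster-then-retrieve pipeline described in this abstract can be illustrated with a minimal sketch. It is an assumption-laden stand-in, not the PF-PAM method: the fireworks particle swarm search is replaced by a plain alternating medoid update, feature vectors are toy tuples, and all function names (`k_medoids`, `retrieve`, and so on) are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(features, medoids):
    """Map each medoid index to the indices of the features closest to it."""
    clusters = {m: [] for m in medoids}
    for i, f in enumerate(features):
        nearest = min(medoids, key=lambda m: dist(f, features[m]))
        clusters[nearest].append(i)
    return clusters

def update_medoids(features, clusters):
    """In each cluster, pick the member with the lowest total distance to the rest."""
    updated = []
    for old, members in clusters.items():
        if not members:  # keep the old medoid if its cluster came out empty
            updated.append(old)
            continue
        best = min(members,
                   key=lambda c: sum(dist(features[c], features[j]) for j in members))
        updated.append(best)
    return sorted(updated)

def k_medoids(features, initial_medoids, max_iter=20):
    """Alternate assignment and medoid update until the medoids stop changing.
    NOTE: a plain substitute for the swarm-based search named in the abstract."""
    medoids = sorted(initial_medoids)
    for _ in range(max_iter):
        updated = update_medoids(features, assign(features, medoids))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(features, medoids)

def retrieve(query, features, medoids, clusters, top_k=3):
    """Pick the category whose medoid is nearest to the query, then rank its members."""
    category = min(medoids, key=lambda m: dist(query, features[m]))
    ranked = sorted(clusters[category], key=lambda i: dist(query, features[i]))
    return ranked[:top_k]

# Toy gallery: two well-separated groups of 2-D "feature vectors".
gallery = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, clusters = k_medoids(gallery, initial_medoids=[0, 3])
nearest = retrieve((9, 9), gallery, medoids, clusters, top_k=1)
```

Restricting the ranking step to one cluster is what saves time at query time; the price is that a query near a cluster boundary may miss similar images assigned to a neighbouring cluster.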


Fuzzy Rough C-Mean Based Unsupervised CNN Clustering for Large-Scale Image Data

Applied Sciences ◽

10.3390/app8101869 ◽

2018 ◽

Vol 8 (10) ◽

pp. 1869 ◽

Cited By ~ 3

Author(s):

Saman Riaz ◽

Ali Arshad ◽

Licheng Jiao

Keyword(s):

Deep Learning ◽

Large Scale ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Main Idea ◽

Image Data ◽

Training Image ◽

Stochastic Gradient Descent ◽

Cluster Center ◽

Clustering Method

Deep learning has been well known for several years, and it shows great promise for unsupervised learning of representations combined with clustering algorithms. Variants of Convolutional Neural Networks (CNN) are now state of the art for many recognition and clustering tasks. However, with the continual growth in the number of digital images, there are more and more redundant, irrelevant, and noisy samples, which gradually degrade the running performance of the CNN and reduce its clustering accuracy at the same time. To overcome these issues, we propose an effective clustering method for large-scale image datasets that combines a CNN with a Fuzzy-Rough C-Mean (FRCM) clustering algorithm. The main idea is that a high-level representation, learned by multiple CNN layers with one clustering layer, first produces the initial cluster centers; then, during training, the image clusters and representations are updated jointly. FRCM is used to update the cluster centers in the forward pass, while the parameters of the proposed CNN are updated in the backward pass based on Stochastic Gradient Descent (SGD). The rough-set concepts of lower and boundary approximations deal with uncertainty, vagueness, and incompleteness in cluster definition, and fuzzy sets enable efficient handling of overlapping partitions in a noisy environment. The experimental results show that the proposed FRCM-based unsupervised CNN clustering method performs better than standard K-Mean, Fuzzy C-Mean, FRCM, and other deep-learning-based clustering algorithms on large-scale image data.
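As background for the fuzzy part of this method, the sketch below shows the membership and center updates of standard Fuzzy C-Means, one of the baselines named in the abstract. It is not the FRCM algorithm: the rough-set lower and boundary approximations and the CNN are omitted, and the function names and the fuzzifier default `m=2.0` are assumptions chosen for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means membership of one point in every cluster.
    u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)), with m > 1 the fuzzifier."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with one or more centers: share full membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

def update_center(points, memberships, m=2.0):
    """New center of one cluster: mean of the points weighted by membership^m."""
    weights = [u ** m for u in memberships]
    total = sum(weights)
    dim = len(points[0])
    return tuple(sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim))

# A point at distance 1 from the first center and distance 2 from the second.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

Memberships always sum to one, so a sample lying between two clusters contributes to both centers in proportion to its closeness, which is how fuzzy sets accommodate overlapping partitions.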


Image Retrieval Based on WBCH and Clustering Algorithm

INTERNATIONAL JOURNAL OF MANAGEMENT & INFORMATION TECHNOLOGY ◽

10.24297/ijmit.v5i3.761 ◽

2013 ◽

Vol 5 (3) ◽

pp. 604-613

Author(s):

Asmita Bhaskar Shirsath ◽

M. J. Chouhan ◽

N. J Uke

Keyword(s):

Image Retrieval ◽

Large Scale ◽

Clustering Algorithm ◽

Similarity Measures ◽

Image Database ◽

Content Based Image Retrieval ◽

Shape Information ◽

Image Descriptors ◽

Image Retrieval System ◽

Retrieval Result

Research on content-based image retrieval has gained tremendous momentum during the last decade. Color, texture, and shape information have been the primitive image descriptors in content-based image retrieval systems. To obtain faster retrieval results from a large-scale image database, we propose an image retrieval system in which the image database is first pre-processed with a Wavelet Based Color Histogram (WBCH) and the K-means algorithm. A hierarchical clustering algorithm then indexes the result, and similarity measures are used to retrieve images from the pre-processed database. Experiments show that the proposed method offers a substantial increase in retrieval speed, but its retrieval results still need improvement.
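To make the feature-extraction step more concrete, here is a rough sketch of a wavelet-style color histogram for a single channel. The abstract does not specify the WBCH construction, so everything here is an assumption: a one-level Haar approximation band (scaled to the 2x2 block mean), a 4-bin histogram, and histogram intersection as the similarity measure. The clustering and indexing stages are not shown.

```python
def haar_approximation(channel):
    """One-level 2-D Haar approximation band, scaled to the mean of each 2x2 block.
    `channel` is a list of rows with even height and width (assumed)."""
    rows, cols = len(channel) // 2, len(channel[0]) // 2
    return [[(channel[2 * r][2 * c] + channel[2 * r][2 * c + 1]
              + channel[2 * r + 1][2 * c] + channel[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]

def histogram(band, bins=4, max_value=256):
    """Normalized histogram of the values in a band, assumed to lie in [0, max_value)."""
    counts = [0] * bins
    total = 0
    for row in band:
        for value in row:
            index = min(int(value * bins / max_value), bins - 1)
            counts[index] += 1
            total += 1
    return [c / total for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two toy 4x4 single-channel images: one dark, one bright.
dark = [[10] * 4 for _ in range(4)]
bright = [[250] * 4 for _ in range(4)]
h_dark = histogram(haar_approximation(dark))
h_bright = histogram(haar_approximation(bright))
```

Computing the histogram on the approximation band rather than on the raw pixels shrinks the data by a factor of four per level and smooths out fine texture, which is one plausible reason to pair wavelets with color histograms.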


A novel method based combined color features for large-scale spatial image data retrieval

Proceedings of the Thirty-Seventh Southeastern Symposium on System Theory, 2005. SSST '05. ◽

10.1109/ssst.2005.1460959 ◽

2005 ◽

Author(s):

Jiecai Luo ◽

K.W. Tobin

Keyword(s):

Large Scale ◽

Image Data ◽

Data Retrieval ◽

Spatial Image ◽

Color Features ◽

Novel Method


On Hierarchical Content-Based Image Retrieval by Dynamic Indexing and Guided Search

Developments in Natural Intelligence Research and Knowledge Engineering ◽

10.4018/978-1-4666-1743-8.ch008 ◽

2012 ◽

pp. 108-125

Author(s):

Jane You ◽

Qin Li ◽

Jinghua Wang

Keyword(s):

Image Retrieval ◽

Data Warehouse ◽

Similarity Measures ◽

Image Data ◽

Personal Identification ◽

Image Indexing ◽

Content Based Image Retrieval ◽

New Approach ◽

Guided Search ◽

Dynamic Image

This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an image database to an image data warehouse for dynamic image indexing. It also provides an image data schema for hierarchical image representation and dynamic image indexing, a statistically based feature selection scheme to achieve flexible similarity measures, and a feature component code to facilitate query processing and guide the search for the best matching. A series of case studies are reported, which include a wavelet-based image color hierarchy, classification of satellite images, tropical cyclone pattern recognition, and personal identification using multi-level palmprint and face features. Experimental results confirm that the new approach is feasible for content-based image retrieval.


Fuzzy Logic for Image Retrieval and Image Databases

Intelligent Multimedia Databases and Information Retrieval ◽

10.4018/978-1-61350-126-9.ch013 ◽

2013 ◽

pp. 221-238

Author(s):

Li Yan ◽

Z. M. Ma

Keyword(s):

Fuzzy Logic ◽

Image Retrieval ◽

Fuzzy Sets ◽

Large Scale ◽

Image Data ◽

Content Based Image Retrieval ◽

Data Resource ◽

Database Models ◽

Image Query ◽

Fuzzy Image

Fuzzy set theory has been extensively applied to the representation and processing of imprecise and uncertain data. Image data is becoming an important data resource with rapid growth in the number of large-scale image repositories. However, image data is fuzzy in nature, and imprecision and vagueness may exist in both image descriptions and query specifications. This chapter reviews major work in the literature on image retrieval with fuzzy logic, including fuzzy content-based image retrieval and database support for fuzzy image retrieval. For fuzzy content-based image retrieval, we present how fuzzy sets are applied to the extraction and representation of visual features (colors, shapes, textures), similarity measures and indexing, relevance feedback, and retrieval systems. For fuzzy image database retrieval, we present how fuzzy sets are applied to fuzzy image query processing based on defined database models, and how various fuzzy database models can support image data management.


A Novel Strategy for Retrieving Large Scale Scene Images Based on Emotional Feature Clustering

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420540191 ◽

2019 ◽

Vol 34 (08) ◽

pp. 2054019

Author(s):

Yueshun He ◽

Wei Zhang ◽

Ping Du ◽

Qiaohe Yang

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

Programming Model ◽

Dimensional Space ◽

Image Data ◽

Feature Clustering ◽

Parallel Programming Model ◽

Speedup Ratio ◽

Hadoop Cluster ◽

Novel Strategy

Because of their complicated data structure, images can present rich information, and so they are widely applied in different fields. Although images offer a lot of convenience, handling such data consumes much time and multi-dimensional space. The disadvantage is especially obvious when users need to retrieve images from larger-scale image datasets. Therefore, to retrieve larger-scale image data effectively, a scene image retrieval strategy based on the MapReduce parallel programming model is proposed. First, the strategy investigates how to store large-scale scene images effectively under a Hadoop cluster parallel processing architecture. Second, a distributed feature clustering algorithm, MeanShift, is introduced to cluster the emotional features of scene images. Finally, several experiments are conducted to verify the effectiveness and efficiency of the proposed strategy in terms of retrieval accuracy, speedup ratio, efficiency, and data scalability.
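One way to picture a distributed mean-shift step under the MapReduce model is sketched below. It simulates the map and reduce phases in a single Python process rather than on Hadoop, uses a flat kernel, and makes up its own names (`map_partition`, `reduce_modes`, `mean_shift_round`); the abstract does not describe the actual job layout, so treat this as an illustrative assumption.

```python
import math

def map_partition(points, modes, bandwidth):
    """Mapper for one data partition: for each candidate mode, emit the partial
    sum and count of the partition's points within `bandwidth` (flat kernel)."""
    emitted = []
    for mode_id, mode in enumerate(modes):
        total = [0.0] * len(mode)
        count = 0
        for p in points:
            if math.dist(p, mode) <= bandwidth:
                total = [t + x for t, x in zip(total, p)]
                count += 1
        emitted.append((mode_id, (total, count)))
    return emitted

def reduce_modes(emitted, modes):
    """Reducer: merge partial sums per mode id and move each mode to the mean of
    its neighbours. A mode with no neighbours stays where it is."""
    sums = {}
    for mode_id, (total, count) in emitted:
        acc_total, acc_count = sums.get(mode_id, ([0.0] * len(total), 0))
        sums[mode_id] = ([a + t for a, t in zip(acc_total, total)], acc_count + count)
    new_modes = []
    for mode_id, mode in enumerate(modes):
        total, count = sums.get(mode_id, (None, 0))
        new_modes.append(tuple(t / count for t in total) if count else tuple(mode))
    return new_modes

def mean_shift_round(partitions, modes, bandwidth):
    """One full round: run the mapper on every partition, then reduce."""
    emitted = []
    for part in partitions:
        emitted.extend(map_partition(part, modes, bandwidth))
    return reduce_modes(emitted, modes)

# Two partitions of 2-D feature points and a single candidate mode.
partitions = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (10.0, 10.0)]]
shifted = mean_shift_round(partitions, [(0.5, 0.5)], bandwidth=2.0)
```

Because each mapper emits only a sum and a count per mode, the data shuffled between phases is independent of partition size, which is what lets this kind of step scale with the number of nodes.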


Fast graph clustering in large-scale systems based on spectral coarsening

International Journal of Modern Physics B ◽

10.1142/s0217979221501319 ◽

2021 ◽

pp. 2150131

Author(s):

Dasong Sun

Keyword(s):

Complex Networks ◽

Large Scale ◽

Clustering Algorithm ◽

Graph Clustering ◽

Superior Performance ◽

Computational Time ◽

Single Node ◽

Multiple Datasets ◽

Spectral Algorithm ◽

The Individual

Complex networks depict the relationships between individuals in a population. Analyzing the interactions of individuals within different groups or clusters helps to mine the characteristics of complex networks and to predict potential collaboration between individuals. However, existing algorithms have high complexity and cost much computational time. In this paper, an efficient graph clustering algorithm based on spectral coarsening is proposed to deal with the large time complexity of the traditional spectral algorithm. We first find the subsets of nodes most likely to belong to the same cluster in the original network and merge each of them into a single node. The scale of the network decreases as the network is coarsened. Then, the spectral clustering algorithm is performed on the coarsened network, retaining its advantages while improving time efficiency. Finally, experimental results on multiple datasets demonstrate that the proposed algorithm has superior performance compared with current state-of-the-art methods.
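The merge-then-cluster idea can be illustrated with a small coarsening sketch. Note the substitution: the paper selects nodes to merge with a spectral criterion that the abstract does not detail, so this example uses greedy heavy-edge matching instead, a common generic coarsening heuristic. The function name `coarsen` and the edge-dictionary format are assumptions, and the spectral clustering step on the coarse graph is not shown.

```python
def coarsen(edges, num_nodes):
    """Coarsen a weighted undirected graph by greedy heavy-edge matching.

    edges: dict mapping (u, v) with u < v to a positive weight.
    Returns (mapping, coarse_edges), where mapping[node] is the id of the
    coarse node that absorbed it and coarse_edges uses the same dict format.
    """
    mapping = {}
    next_id = 0
    # Visit edges from heaviest to lightest and merge endpoints that are still free.
    for (u, v), _weight in sorted(edges.items(), key=lambda item: -item[1]):
        if u not in mapping and v not in mapping:
            mapping[u] = mapping[v] = next_id
            next_id += 1
    # Unmatched nodes become singleton coarse nodes.
    for node in range(num_nodes):
        if node not in mapping:
            mapping[node] = next_id
            next_id += 1
    # Rebuild edges between coarse nodes, summing the weights of parallel edges.
    coarse_edges = {}
    for (u, v), weight in edges.items():
        a, b = mapping[u], mapping[v]
        if a != b:
            key = (min(a, b), max(a, b))
            coarse_edges[key] = coarse_edges.get(key, 0) + weight
    return mapping, coarse_edges

# Two tightly linked pairs joined by one weak edge.
graph = {(0, 1): 5, (2, 3): 4, (1, 2): 1}
mapping, coarse = coarsen(graph, num_nodes=4)
```

Each pass of this matching at most halves the node count, so a few passes shrink the matrix that spectral clustering must decompose; the mapping lets cluster labels found on the coarse graph be projected back onto the original nodes.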
