Figure Based Biomedical Document Retrieval System using Structural Image Features

Author(s):  
Harikrishna G. N. Rai ◽  
K Sai Deepak ◽  
P. Radha Krishna

The multi-modal and unstructured nature of documents makes their retrieval from healthcare document repositories a challenging task. Text-based retrieval is the conventional approach to this problem. In this paper, the authors explore an alternative avenue: using embedded figures for the retrieval task. Because the context of a document is usually reflected directly in its figures, text embedded within these figures has been used together with image features for similarity-based figure retrieval. The present work demonstrates that image features describing the structural properties of figures are sufficient for the figure retrieval task. First, the authors analyze the problem of figure retrieval from biomedical literature and identify significant classes of figures. Second, they use edge information to discriminate between the structural properties of each figure category. Finally, they present a methodology that uses a novel feature descriptor, the Fourier Edge Orientation Autocorrelogram (FEOAC), to describe the structural properties of figures and build an effective biomedical document retrieval system. The experimental results demonstrate improved retrieval performance with FEOAC on the figure retrieval task, especially when most of the edge information is retained. Apart from being invariant to scale, rotation, and non-uniform illumination, the proposed feature descriptor is shown to be relatively robust to noisy edges.

With the advent of technology, huge collections of digital images have formed as repositories on the World Wide Web (WWW), and searching such a repository for similar images is difficult. This paper demonstrates retrieval of similar images from the WWW in two ways: first with a combination of color and shape image features, and then, as a novel approach, with a Siamese neural network constructed for the task. A one-shot learning technique is used to test the retrieval performance of the Siamese neural network model. Various experiments are conducted with both methods, and the results are tabulated. The system is evaluated using precision, which is found to be high. A comparative study with existing work is also presented.


2015 ◽  
Vol 6 (3) ◽  
Author(s):  
Resti Ludviani ◽  
Khadijah F. Hayati ◽  
Agus Zainal Arifin ◽  
Diana Purwitasari

Abstract. Selecting appropriate terms for expanding a query is very important in query expansion. Term selection optimization is therefore added to improve query expansion performance in a document retrieval system. This study proposes a new approach named Term Relatedness to Query-Entropy based (TRQE), which optimizes term weighting in query expansion by considering the semantic and statistical aspects of the relevance evaluation of pseudo feedback, in order to improve document retrieval performance. The proposed method has three main modules: relevance feedback, pseudo feedback, and document retrieval. TRQE is implemented in the pseudo feedback module to optimize term weighting in query expansion. The evaluation shows that TRQE retrieves documents with a best result of 100% precision and 22.22% recall. TRQE for weighting optimization in query expansion is shown to improve the relevance of document retrieval.
Keywords: TRQE, query expansion, term weighting, term relatedness to query, relevance feedback
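As a rough illustration of how 100% precision can coexist with 22.22% recall, the sketch below computes both measures for a single hypothetical query. The document IDs and the counts (2 documents retrieved, 9 relevant) are assumptions chosen only to reproduce the ratio 2/9; they are not figures taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 9 relevant documents, of which only 2 are retrieved.
relevant_docs = {f"d{i}" for i in range(1, 10)}
retrieved_docs = ["d1", "d2"]

p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2%} recall={r:.2%}")
```

Every retrieved document is relevant, so precision is perfect, but most relevant documents are never returned, which keeps recall low.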


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4449
Author(s):  
Federico Magliani ◽  
Laura Sani ◽  
Stefano Cagnoni ◽  
Andrea Prati

Most recent computer vision tasks take into account the distribution of image features to obtain more powerful models and better performance. One of the most commonly used techniques for this purpose is the diffusion algorithm, which fuses manifold data and k-Nearest Neighbors (kNN) graphs. In this paper, we describe how we optimized diffusion in an image retrieval task aimed at mobile vision applications, in order to obtain a good trade-off between computation load and performance. From a computational efficiency viewpoint, the high complexity of the exhaustive creation of a full kNN graph for a large database renders such a process unfeasible on mobile devices. From a retrieval performance viewpoint, the diffusion parameters are strongly task-dependent and significantly affect the algorithm's performance. In the method we describe herein, we tackle the first issue by using approximate algorithms in building the kNN tree. The main contribution of this work is the optimization of diffusion parameters using a genetic algorithm (GA), which allows us to guarantee high retrieval performance in spite of such a simplification. The results we have obtained confirm that the global search for the optimal diffusion parameters performed by a genetic algorithm is equivalent to a massive analysis of the diffusion parameter space, for which an exhaustive search would be totally unfeasible. We show that even a grid search could often be less efficient (and effective) than the GA, i.e., that the genetic algorithm most often produces better diffusion settings when equal computing resources are available to the two approaches. Our method has been tested on several publicly available datasets (Oxford5k, ROxford5k, Paris6k, RParis6k, and Oxford105k) and compared to other mainstream approaches.


2019 ◽  
Vol 24 (1) ◽  
pp. 38-48
Author(s):  
Esingbemi Princewill Ebietomere ◽  
Godspower Osaretin Ekuobase

Legal reasoning, the core of legal practice in many countries, is based on “stare decisis”, and its soundness is usually strengthened by the relevant case law consulted. However, accessing and retrieving relevant case law is tiring for legal practitioners and constitutes a serious drain on their productivity. Existing efforts at addressing this problem are conceptual, restrictive or unreliable. Specifically, existing semantic retrieval (SR) systems for case law are in need of exceptional retrieval precision. Ontology promises to meet this need if introduced to the SR system. Consequently, an ontology-based SR system for case law has been built using the systems analysis and design methodology. In particular, component-based software engineering and agile methodologies are employed to implement the system. Finally, the search and retrieval performance of the resultant SR system has been evaluated using the heuristic evaluation method. The retrieval system has been shown to have a search and retrieval performance of about 94% precision, 80% recall and 84% F-measure. Overall, the paper implements the SR system for case law with excellent precision and affirms the superiority of the ontology approach over other semantic approaches to SR systems for document retrieval in the legal domain.


2018 ◽  
Author(s):  
Sebastian Otálora ◽  
Roger Schaer ◽  
Oscar Jimenez-del-Toro ◽  
Manfredo Atzori ◽  
Henning Müller

Clinical practice is becoming increasingly stressful for pathologists due to increasing complexity and time constraints. Histopathology is slowly shifting to digital pathology, creating opportunities for pathologists to improve reading quality or save time using Artificial Intelligence (AI)-based applications. We aim to enhance the practice of pathologists through a retrieval system that allows them to simplify their workflow and limit the need for second opinions, while also learning in the process. In this work, an innovative retrieval system for digital pathology is integrated within a Whole Slide Image (WSI) viewer, allowing users to define regions of interest in images as queries for finding visually similar areas using deep representations. The back-end similarity computation algorithms are based on a multimodal approach that exploits both text information and content-based image features. Shallow and deep representations of the images were evaluated; the latter showed better overall retrieval performance on a set of 112 whole slide images from biopsies. The system was also tested by pathologists, who highlighted its capabilities and suggested possible ways to improve it and make it more usable in clinical practice. The retrieval system developed can enhance the practice of pathologists by enabling them to use their experience and knowledge to properly control artificial intelligence tools for navigating repositories of images for decision support purposes.


2020 ◽  
Vol 4 (3) ◽  
pp. 551-557
Author(s):  
Muhammad Zaky Ramadhan ◽  
Kemas Muslim Lhaksmana

Hadith has several levels of authenticity, among which are weak (dhaif) and fabricated (maudhu) hadiths that may not originate from the prophet Muhammad PBUH and thus should not be considered in concluding an Islamic law (sharia). However, many such hadiths are commonly mistaken for authentic hadiths among ordinary Muslims. To make such hadiths easy to distinguish, this paper proposes a method to check the authenticity of a hadith by comparing it with a collection of fabricated hadiths in Indonesian. The proposed method applies the vector space model and also performs spelling correction using SymSpell, to check whether spelling correction can improve the accuracy of hadith retrieval. This has not been done in previous works, and typos are common in Indonesian-translated hadiths in raw text from the Web and social media. The experimental results show that spell checking improves the mean average precision and recall to 81% (from 73%) and 89% (from 80%), respectively. The improvement in accuracy from spelling correction therefore makes the hadith retrieval system more feasible, and its implementation is encouraged in future works because it can correct typos that are common in raw text on the Internet.
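The vector space model mentioned above can be sketched as TF-IDF weighting followed by cosine-similarity ranking. This is a minimal illustration under stated assumptions, not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names, and the placeholder texts are all hypothetical, and the SymSpell spelling-correction step is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would do more.
    return text.lower().split()


def weigh(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (dict of term -> weight)."""
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def build_index(docs):
    """Compute smoothed IDF values and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [weigh(toks, idf) for toks in tokenized]


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, idf, vectors):
    """Return (score, document index) pairs, best match first."""
    q = weigh(tokenize(query), idf)
    return sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)


# Placeholder collection standing in for the reference texts.
docs = ["the quick brown fox", "a slow green turtle", "quick thinking saves time"]
idf, vectors = build_index(docs)
ranked = search("quick fox", idf, vectors)
print(ranked)
```

In a pipeline like the one described, a spelling-correction pass would run on the query before `search` is called, so that typos do not produce terms missing from the index.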


Author(s):  
Alex Kohn ◽  
François Bry ◽  
Alexander Manta

Studies agree that searchers are often not satisfied with the performance of current enterprise search engines. As a consequence, more scientists worldwide are actively investigating new avenues for searching to improve retrieval performance. This paper contributes YASA (Your Adaptive Search Agent), a fully implemented and thoroughly evaluated ontology-based information retrieval system for the enterprise. A salient feature of YASA is that large parts of the ontology are automatically filled with facts by recycling and transforming existing data. YASA offers context-based personalization, faceted navigation, and semantic search capabilities. YASA has been deployed and evaluated in the pharmaceutical research department of Roche, Penzberg, and the results show that even semantically simple ontologies suffice to considerably improve search performance.


Author(s):  
Bo Wang ◽  
Xiaoting Yu ◽  
Chengeng Huang ◽  
Qinghong Sheng ◽  
Yuanyuan Wang ◽  
...  

The excellent feature extraction ability of deep convolutional neural networks (DCNNs) has been demonstrated in many image processing tasks, in which image classification can achieve high accuracy with only raw input images. However, the specific image features that influence the classification results are not readily determinable, and what lies behind the predictions is unclear. This study proposes a method combining the Sobel and Canny operators and an Inception module for ship classification. The Sobel and Canny operators obtain enhanced edge features from the input images. A convolutional layer is replaced with the Inception module, which can automatically select the proper convolution kernel for ship objects in different image regions. The principle is that the high-level features abstracted by the DCNN and the features obtained by multi-convolution concatenation in the Inception module must ultimately derive from the edge information of the preprocessed input images. This indicates that the classification results are based on the input edge features, which indirectly interprets the classification results to some extent. Experimental results show that the combination of the edge features and the Inception module improves DCNN ship classification performance. The original model with the raw dataset has an average accuracy of 88.72%, while when using enhanced edge features as input, it achieves the best performance of 90.54% among all models. The model that replaces the fifth convolutional layer with the Inception module has the best performance of 89.50%. It performs close to VGG-16 on the raw dataset and is significantly better than other deep neural networks. The results validate the functionality and feasibility of the idea posited.
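To make the edge-feature step concrete, the following sketch applies the standard 3x3 Sobel kernels to a small grayscale image and returns the gradient magnitude. It illustrates only the Sobel operator in plain Python; the Canny step, the Inception module, and the authors' actual preprocessing are not reproduced, and the function name and the toy image are assumptions.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2D list of intensities; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy image (hypothetical): dark left half, bright right half, one vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

The response is non-zero only in the two columns that straddle the vertical edge, which is the kind of enhanced edge map a classifier could take as input instead of the raw image.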

