Fi-Fo Detector: Figure and Formula Detection Using Deformable Networks

2020 ◽  
Vol 10 (18) ◽  
pp. 6460
Author(s):  
Junaid Younas ◽  
Shoaib Ahmed Siddiqui ◽  
Mohsin Munir ◽  
Muhammad Imran Malik ◽  
Faisal Shafait ◽  
...  

We propose a novel hybrid approach that fuses traditional computer vision techniques with deep learning models to detect figures and formulas in document images. The approach first fuses several computer-vision-based image representations (color transform, connected component analysis, and distance transform) into what we term the Fi-Fo image representation. The Fi-Fo image representation is then fed to deep models, which learn a further refined representation for detecting figures and formulas. The approach is evaluated on the publicly available ICDAR-2017 Page Object Detection (POD) dataset and its corrected version. It produces state-of-the-art results for formula and figure detection in document images, with F1-scores of 0.954 and 0.922, respectively. An ablation study shows that the Fi-Fo image representation yields better performance than the raw image representation. The results also establish that the hybrid approach helps deep models learn more discriminative and refined features.
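The idea of stacking several classical representations into one multi-channel input can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not define the color transform, so raw grayscale intensity stands in for it here, and the function names, the binarization threshold, and the 4-connectivity are all assumptions made for illustration.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumption)

def connected_components(binary):
    """Label 4-connected foreground regions; 0 marks background."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def distance_transform(binary):
    """City-block distance to the nearest foreground pixel (multi-source BFS).

    Entries stay None if the image has no foreground at all.
    """
    h, w = len(binary), len(binary[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for r in range(h):
        for c in range(w):
            if binary[r][c]:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist

def fuse(gray, threshold=128):
    """Stack (intensity, component label, distance) per pixel.

    Intensity is a stand-in for the paper's color transform (assumption).
    """
    binary = [[1 if v < threshold else 0 for v in row] for row in gray]  # dark ink
    labels, count = connected_components(binary)
    dist = distance_transform(binary)
    fused = [[(gray[r][c], labels[r][c], dist[r][c])
              for c in range(len(gray[0]))] for r in range(len(gray))]
    return fused, count

# Tiny 3x4 "page": two separate dark strokes on a white background.
page = [
    [255,   0, 255, 255],
    [255,   0, 255,   0],
    [255, 255, 255,   0],
]
fused, n_components = fuse(page)
```

Each pixel of `fused` carries three channels, so the result can be handed to a detector in place of an RGB image; how the real Fi-Fo channels are scaled or encoded is not stated in the abstract.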

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Siquan Yu ◽  
Jiaxin Liu ◽  
Zhi Han ◽  
Yong Li ◽  
Yandong Tang ◽  
...  

Image clustering is a complex procedure that is significantly affected by the choice of image representation. Most existing image clustering methods treat representation learning and clustering separately, which usually causes two problems. On the one hand, image representations are difficult to select, and the learned representations are not necessarily suitable for clustering. On the other hand, these methods inevitably involve a separate clustering step, which may introduce errors and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose an approach to image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). A fully convolutional autoencoder is applied to extract the essential representation and maintain the local structure. To map features into the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover a precise cluster label for each image. A series of real-world image clustering experiments verifies the effectiveness of the proposed algorithm.
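Treating clustering as binary pairwise classification can be sketched as follows. This is an illustrative reading of the DAC idea, not the authors' code: the two fixed thresholds, the function names, and the toy feature vectors are assumptions, and no network is trained here.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def pairwise_targets(features, upper=0.95, lower=0.45):
    """Build binary pair labels from cosine similarity.

    Pairs above `upper` are treated as "same cluster" (1), pairs below
    `lower` as "different cluster" (0); ambiguous pairs are left out.
    Both thresholds are hypothetical values chosen for this example.
    """
    unit = [l2_normalize(f) for f in features]
    targets = {}
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            sim = sum(a * b for a, b in zip(unit[i], unit[j]))
            if sim >= upper:
                targets[(i, j)] = 1
            elif sim <= lower:
                targets[(i, j)] = 0
    return targets

def assign_clusters(features):
    """Cluster label = index of the largest feature component."""
    return [max(range(len(f)), key=f.__getitem__) for f in features]

# Toy label features for four images and two clusters.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.55, 0.45]]
targets = pairwise_targets(features)
clusters = assign_clusters(features)
```

Only the confidently similar and confidently dissimilar pairs would supply a training signal; the fourth image is too ambiguous to pair with any other at these thresholds.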


Coatings ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 658
Author(s):  
Anna Sandak ◽  
Edit Földvári-Nagy ◽  
Faksawat Poohphajai ◽  
Rene Herrera Diaz ◽  
Oihana Gordobil ◽  
...  

Wood, as a biological material, is sensitive to environmental conditions and microorganisms; therefore, wood products require protective measures to extend their service life in outdoor applications. Several modification processes are available for improving wood properties, including commercially available solutions. Among the chemical treatments, acetylation with acetic anhydride is one of the most effective methods of inducing chemical changes in the constitutive polymers at the cell wall level. Acetylation reduces wood shrinkage and swelling, increases durability against biotic agents, improves UV resistance, and reduces surface erosion. However, even though the expected service life of acetylated wood used as external cladding is estimated at 60 years, its appearance changes rapidly during the first years of exposure. Hybrid, or fusion, modification comprises processes in which the positive effect of a single treatment is multiplied by combining it with additional follow-up modifications. This report presents the results of performance tests on wood samples that, in addition to being acetylated, were protected with seven commercially available coatings. Natural weathering was conducted in Northern Italy for 15 months. Samples were collected from the stand every three months and characterized with numerous instruments. Superior performance was observed on samples that combined both treatments, owing to the combined effect of wood acetylation and surface coating. Limited shrinkage and swelling of the bulk substrate due to the chemical treatment substantially reduced stresses in the coating film. Compared with acetylation alone, the hybrid process ensured superior visual performance of the wood surface by preserving its original appearance.


Author(s):  
Lalit B. Damahe ◽  
Nileshsingh V. Thakur

Image representation and compression is an important field of computer vision that contributes to reducing image size and supports other application areas such as image restoration and retrieval. Image representation matters for the storage of image information, and it extends to compression, which may be lossy or lossless. Image compression is applied mainly in medical imaging, traffic monitoring, military applications, multimedia transmission, and smart cell devices, and in almost every domain that requires lower transmission and storage costs, image retrieval in particular. This chapter presents various image representation, compression, and retrieval approaches. Retrieval approaches on personal computers and smart cell devices are discussed. Finally, key issues for image representation, compression, and retrieval are identified on the basis of performance evaluation parameters such as encoding time, decoding time, compression ratio, precision, recall, and elapsed time.
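Two of the evaluation parameters named above have simple definitions, sketched here under stated assumptions: compression ratio is taken as original size divided by compressed size, `zlib` stands in for an arbitrary lossless codec, and the image identifiers in the retrieval example are made up.

```python
import zlib

def compression_ratio(raw: bytes, level: int = 9) -> float:
    """Original size divided by compressed size for a lossless codec."""
    compressed = zlib.compress(raw, level)
    if zlib.decompress(compressed) != raw:  # lossless round trip expected
        raise ValueError("round trip did not reproduce the input")
    return len(raw) / len(compressed)

def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# A flat 64x64 8-bit image compresses very well; real images will not.
flat_image = bytes(64 * 64)
ratio = compression_ratio(flat_image)

# Hypothetical retrieval result: 2 of 4 returned images are relevant,
# and 2 of the 3 relevant images were found.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Encoding, decoding, and elapsed time would be measured around the same calls with a timer such as `time.perf_counter`.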


2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Da Zhang ◽  
Mansur Kabuka

Background: Protein-protein interactions (PPIs) take part in dynamic pathological and biological processes throughout life. A thorough understanding of PPIs is therefore crucial for explaining disease occurrence, achieving optimal drug-target therapeutic effects, and describing protein complex structures. However, compared with the protein sequences available from various species and organisms, the number of known protein-protein interactions is relatively limited. To address this gap, many research efforts have aimed to facilitate the discovery of novel PPIs. Among these methods, PPI prediction techniques that rely only on protein sequence data are more widespread than methods that require extensive biological domain knowledge. Results: In this paper, we propose a multi-modal deep representation learning structure that combines protein physicochemical features with graph topological features from the PPI networks. Specifically, our method not only takes the protein sequence information into account but also learns a topological representation for each protein node in the PPI networks. We construct a stacked auto-encoder architecture together with a continuous bag-of-words (CBOW) model based on generated metapaths to study PPI prediction. We then use supervised deep neural networks to identify the PPIs and classify the protein families. The PPI prediction accuracy for eight species ranged from 96.76% to 99.77%, which indicates that our multi-modal deep representation learning framework achieves superior performance compared to other computational methods. Conclusion: To the best of our knowledge, this is the first multi-modal deep representation learning framework for examining PPI networks.
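How metapath walks can feed a CBOW model is sketched below. The abstract does not say which node types or metapaths the authors use, so the protein/family graph, the `("protein", "family")` metapath, and all function names are hypothetical; only the generation of (context, target) training pairs is shown, not the embedding training itself.

```python
import random

def metapath_walk(graph, node_type, start, metapath, length, rng):
    """Random walk whose node types cycle through `metapath`.

    `start` is expected to have type metapath[0]; the walk stops early
    if no neighbour of the required type exists.
    """
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in graph[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk

def cbow_pairs(walk, window=1):
    """(context nodes, target node) pairs as used to train CBOW."""
    pairs = []
    for i, target in enumerate(walk):
        context = walk[max(0, i - window):i] + walk[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs

# Hypothetical heterogeneous graph: proteins P1..P3 linked to families F1, F2.
graph = {
    "P1": ["F1"], "P2": ["F1", "F2"], "P3": ["F2"],
    "F1": ["P1", "P2"], "F2": ["P2", "P3"],
}
node_type = {n: ("protein" if n.startswith("P") else "family") for n in graph}

rng = random.Random(0)  # seeded so the example is repeatable
walk = metapath_walk(graph, node_type, "P1", ("protein", "family"), 5, rng)
pairs = cbow_pairs(walk, window=1)
```

Each walk plays the role of a sentence, so proteins that share families tend to appear in similar contexts and would end up with similar embeddings.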


Author(s):  
Juan Gutiérrez ◽  
Gabriel Gómez-Perez ◽  
Jesús Malo ◽  
Gustavo Camps-Valls

Support vector machine (SVM) image coding relies on the ability of SVMs to approximate functions. The size and profile of the ε-insensitivity zone of the support vector regression (SVR) in a given image representation determine (a) the number of selected support vectors (the compression ratio), and (b) the nature of the introduced error (the compression distortion). The selection of an appropriate image representation is therefore a key issue for a meaningful design of the ε-insensitivity profile. For example, in image coding applications, taking human perception into account is of paramount importance for obtaining good rate-distortion performance. However, depending on the accuracy of the perception model considered, certain image representations are not suitable for SVR training. In this chapter, we analyze the general procedure for taking human vision models into account in SVR-based image coding. Specifically, we derive the condition for image representation selection and the associated ε-insensitivity profiles.
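The link between the ε-insensitivity profile and the number of support vectors can be made concrete with a small sketch. It uses the standard ε-insensitive loss, max(0, |r| − ε), and counts residuals that fall outside the tube for a fixed approximation; a real SVR would be refitted for each ε, and the residuals and profiles below are invented for illustration.

```python
def eps_insensitive_loss(residual, eps):
    """Standard SVR loss: errors inside the eps-tube cost nothing."""
    return max(0.0, abs(residual) - eps)

def outside_tube(residuals, eps_profile):
    """Count samples whose error exceeds their own eps.

    `eps_profile` holds one eps per sample, so a non-uniform profile can
    tolerate larger errors on some coefficients than on others.
    """
    return sum(1 for r, e in zip(residuals, eps_profile) if abs(r) > e)

# Made-up residuals for four coefficients of some image representation.
residuals = [0.05, -0.2, 0.5, -0.8]

uniform = [0.1] * 4             # same tolerance everywhere
shaped = [0.1, 0.1, 0.6, 0.6]   # hypothetical profile: looser on the last two

n_uniform = outside_tube(residuals, uniform)
n_shaped = outside_tube(residuals, shaped)
losses = [eps_insensitive_loss(r, e) for r, e in zip(residuals, uniform)]
```

Widening ε where errors are assumed to be less visible leaves fewer samples outside the tube, which corresponds to fewer support vectors and a higher compression ratio, at the price of larger errors on those coefficients.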


2020 ◽  
Vol 34 (01) ◽  
pp. 27-34 ◽  
Author(s):  
Lei Chen ◽  
Le Wu ◽  
Richang Hong ◽  
Kun Zhang ◽  
Meng Wang

Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), some researchers have modeled higher-layer collaborative signals with GCNs by treating the user-item interaction behavior as a bipartite graph. These GCN-based recommender models show superior performance compared to traditional works. However, these models are difficult to train with non-linear activations on large user-item graphs. Moreover, most GCN-based models cannot use deeper layers because of the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN-based CF models from two aspects. First, we empirically show that removing non-linearities enhances recommendation performance, which is consistent with the theory of simple graph convolutional networks. Second, we propose a residual network structure specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem of the graph convolution aggregation operation on sparse user-item interaction data. The proposed model is linear, easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.
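A linear graph convolution with a residual prediction can be sketched in a few lines. This is one reading of the abstract, not the published implementation (see the repository linked above for that): it assumes symmetric normalization with self-loops, omits the learnable per-layer weight matrices, and realizes the residual structure as a sum of per-layer inner products. The toy interactions and embeddings are made up.

```python
import math

def normalized_adjacency(n_users, n_items, interactions):
    """D^-1/2 (A + I) D^-1/2 for the user-item bipartite graph.

    Users occupy indices 0..n_users-1, items follow them.
    """
    n = n_users + n_items
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for user, item in interactions:
        a[user][n_users + item] = a[n_users + item][user] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def propagate(s, emb):
    """One linear layer: neighbourhood aggregation with no activation."""
    dim = len(emb[0])
    return [[sum(s[i][k] * emb[k][d] for k in range(len(emb)))
             for d in range(dim)] for i in range(len(s))]

def residual_score(layers, user_idx, item_idx):
    """Prediction accumulated over layers: each layer adds to the last."""
    return sum(sum(x * y for x, y in zip(emb[user_idx], emb[item_idx]))
               for emb in layers)

# Toy data: 2 users, 2 items, three observed interactions.
s = normalized_adjacency(2, 2, [(0, 0), (1, 0), (1, 1)])
e0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0]]  # made-up embeddings
e1 = propagate(s, e0)
e2 = propagate(s, e1)

# Score of user 0 for item 1 (node index 3), which user 0 never interacted with.
score = residual_score([e0, e1, e2], 0, 3)
```

Because every layer contributes its own term to the score, information from shallow layers is not washed out by repeated aggregation, which is the intuition behind using a residual structure against over-smoothing.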

