Fi-Fo Detector: Figure and Formula Detection Using Deformable Networks

Junaid Younas; Shoaib Ahmed Siddiqui; Mohsin Munir; Muhammad Imran Malik; Faisal Shafait; Paul Lukowicz; Sheraz Ahmed

doi:10.3390/app10186460

Applied Sciences ◽

10.3390/app10186460 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6460

Author(s):

Junaid Younas ◽

Shoaib Ahmed Siddiqui ◽

Mohsin Munir ◽

Muhammad Imran Malik ◽

Faisal Shafait ◽

...

Keyword(s):

Computer Vision ◽

Image Representation ◽

Hybrid Approach ◽

Representation Learning ◽

Superior Performance ◽

Document Images ◽

Connected Component ◽

Study Results ◽

Image Representations ◽

Ablation Study

We propose a novel hybrid approach that fuses traditional computer vision techniques with deep learning models to detect figures and formulas in document images. The approach first fuses several computer-vision-based image representations, namely color transform, connected component analysis, and distance transform, into what we term the Fi-Fo image representation. The Fi-Fo image representation is then fed to deep models for further, more refined representation learning to detect figures and formulas in document images. The approach is evaluated on the publicly available ICDAR-2017 Page Object Detection (POD) dataset and its corrected version. It produces state-of-the-art results for formula and figure detection in document images, with F1-scores of 0.954 and 0.922, respectively. Ablation study results reveal that the Fi-Fo image representation achieves superior performance compared with the raw image representation. The results also establish that the hybrid approach helps deep models learn more discriminative and refined features.
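
The fusion step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the color transform is computed or how the channels are ordered, so the binarized page itself stands in for the color channel here, and the function names (`label_components`, `distance_transform`, `fuse_channels`) are hypothetical. The sketch uses plain Python lists, 4-connectivity, and a city-block distance, all of which are assumptions.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions with BFS; background stays 0."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def distance_transform(binary):
    """City-block distance from each foreground pixel to the nearest
    background pixel, computed with the classic two-pass scan."""
    h, w = len(binary), len(binary[0])
    far = h + w  # upper bound; remains if the image has no background
    dist = [[far if binary[r][c] else 0 for c in range(w)] for r in range(h)]
    for r in range(h):  # forward pass: look up and left
        for c in range(w):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    for r in reversed(range(h)):  # backward pass: look down and right
        for c in reversed(range(w)):
            if r < h - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < w - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


def fuse_channels(binary):
    """Stack (placeholder color channel, component label, distance) per pixel.
    Assumption: the binary page stands in for the paper's color transform."""
    labels, _ = label_components(binary)
    dist = distance_transform(binary)
    return [[(binary[r][c], labels[r][c], dist[r][c])
             for c in range(len(binary[0]))] for r in range(len(binary))]


# Toy page: a 3x3 blob and an isolated pixel.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
fused = fuse_channels(page)
```

In a real pipeline the three channels would be rescaled to a common range and passed to the detector as a multi-channel image; that step is omitted here.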

Representation Learning Based on Autoencoder and Deep Adaptive Clustering for Image Clustering

Mathematical Problems in Engineering ◽

10.1155/2021/3742536 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Siquan Yu ◽

Jiaxin Liu ◽

Zhi Han ◽

Yong Li ◽

Yandong Tang ◽

...

Keyword(s):

Local Structure ◽

Image Representation ◽

Representation Learning ◽

Classification Problem ◽

Image Clustering ◽

Clustering Methods ◽

Adaptive Clustering ◽

Complex Procedure ◽

Convolutional Autoencoder ◽

Image Representations

Image clustering is a complex procedure that is significantly affected by the choice of image representation. Most existing image clustering methods treat representation learning and clustering separately, which usually causes two problems. On the one hand, image representations are difficult to select, and the learned representations are not suitable for clustering. On the other hand, these methods inevitably involve a separate clustering step, which may introduce errors and hurt the clustering results. To tackle these problems, we present a new clustering method that efficiently builds an image representation and precisely discovers cluster assignments. For this purpose, the image clustering task is regarded as a binary pairwise classification problem with local structure preservation. Specifically, we propose an approach for image clustering based on a fully convolutional autoencoder and deep adaptive clustering (DAC). A fully convolutional autoencoder is applied to extract the essential representation and maintain the local structure. To map features into the clustering space and obtain a suitable image representation, the DAC algorithm participates in the training of the autoencoder. Our method can learn an image representation that is suitable for clustering and discover a precise cluster label for each image. A series of real-world image clustering experiments verifies the effectiveness of the proposed algorithm.
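
The "binary pairwise classification" view of clustering can be sketched as follows. This is a minimal illustration under stated assumptions, not the method from the paper: the threshold values `upper` and `lower` and the function names are hypothetical, and the feature vectors are hand-written stand-ins for network outputs. Pairs of images whose features are very similar are treated as positive training pairs, clearly dissimilar pairs as negative ones, and ambiguous pairs are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_targets(features, upper=0.95, lower=0.45):
    """Map each confident pair (i, j) to 1 (same cluster) or 0 (different).
    Pairs with similarity strictly between the thresholds are skipped."""
    targets = {}
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            sim = cosine(features[i], features[j])
            if sim >= upper:
                targets[(i, j)] = 1
            elif sim <= lower:
                targets[(i, j)] = 0
    return targets


def cluster_assignments(features):
    """Assign each sample to the index of its largest feature component."""
    return [max(range(len(f)), key=f.__getitem__) for f in features]


# Hypothetical 3-dimensional label features for four images.
features = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.5, 0.5, 0.0],
]
targets = pairwise_targets(features)
clusters = cluster_assignments(features)
```

In an adaptive scheme the two thresholds would typically be moved toward each other during training so that more pairs are selected over time; that schedule is not shown.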

Hybrid Approach for Wood Modification: Characterization and Evaluation of Weathering Resistance of Coatings on Acetylated Wood

Coatings ◽

10.3390/coatings11060658 ◽

2021 ◽

Vol 11 (6) ◽

pp. 658

Author(s):

Anna Sandak ◽

Edit Földvári-Nagy ◽

Faksawat Poohphajai ◽

Rene Herrera Diaz ◽

Oihana Gordobil ◽

...

Keyword(s):

Service Life ◽

Hybrid Approach ◽

Wood Properties ◽

Wood Products ◽

Superior Performance ◽

Surface Erosion ◽

Coating Film ◽

Chemical Treatments ◽

Cellular Wall ◽

Bulk Substrate

Wood, as a biological material, is sensitive to environmental conditions and microorganisms; therefore, wood products require protective measures to extend their service life in outdoor applications. Several modification processes are available for the improvement of wood properties, including commercially available solutions. Among the chemical treatments, acetylation by acetic anhydride is one of the most effective methods to induce chemical changes in the constitutive polymers at the cellular wall level. Acetylation reduces wood shrinkage and swelling, increases its durability against biotic agents, improves UV resistance, and reduces surface erosion. However, even though the expected service life of external cladding made of acetylated wood is estimated to be 60 years, its appearance changes rapidly during the first years of exposure. Hybrid, or fusion, modification includes processes in which the positive effect of a single treatment can be multiplied by merging it with additional follow-up modifications. This report presents the results of performance tests on wood samples that, besides being modified by acetylation, were additionally protected with seven commercially available coatings. Natural weathering was conducted in Northern Italy for 15 months. Samples collected from the stand every three months were characterized with numerous instruments. Superior performance was observed on samples that combined both treatments. This is due to the combined effect of wood acetylation and surface coating. Limited shrinkage and swelling of the bulk substrate due to the chemical treatment substantially reduced stresses in the coating film. Compared to acetylation alone, the hybrid process assured superior visual performance of the wood surface by preserving its original appearance.

Image Representation Learning by Transformation Regression

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9412597 ◽

2021 ◽

Author(s):

Xifeng Guo ◽

Jiyuan Liu ◽

Sihang Zhou ◽

En Zhu ◽

Shihao Dong

Keyword(s):

Image Representation ◽

Representation Learning

Segmentation of malignant tumours in mammogram images: A hybrid approach using convolutional neural network and connected component analysis

Expert Systems ◽

10.1111/exsy.12826 ◽

2021 ◽

Author(s):

Abhijit Roy ◽

Bikesh Kumar Singh ◽

Sumit K. Banchhor ◽

Kesari Verma

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Hybrid Approach ◽

Component Analysis ◽

Connected Component ◽

Malignant Tumours ◽

Connected Component Analysis ◽

Mammogram Images

Image Representation Learning by Deep Appearance and Spatial Coding

Computer Vision -- ACCV 2014 - Lecture Notes in Computer Science ◽

10.1007/978-3-319-16865-4_43 ◽

2015 ◽

pp. 659-672 ◽

Cited By ~ 1

Author(s):

Bingyuan Liu ◽

Jing Liu ◽

Zechao Li ◽

Hanqing Lu

Keyword(s):

Image Representation ◽

Representation Learning ◽

Spatial Coding

A Hybrid Approach for Video Indexing Using Computer Vision and Speech Recognition

Predictive Analytics ◽

10.1201/9781003083177-13 ◽

2020 ◽

pp. 213-225

Author(s):

Saksham Jain ◽

Akshit Pradhan ◽

Vijay Kumar

Keyword(s):

Computer Vision ◽

Speech Recognition ◽

Hybrid Approach ◽

Video Indexing

Review on Image Representation Compression and Retrieval Approaches

Technological Innovations in Knowledge Management and Decision Support - Advances in Knowledge Acquisition, Transfer, and Management ◽

10.4018/978-1-5225-6164-4.ch009 ◽

2019 ◽

pp. 203-231

Author(s):

Lalit B. Damahe ◽

Nileshsingh V. Thakur

Keyword(s):

Computer Vision ◽

Performance Evaluation ◽

Personal Computer ◽

Compression Ratio ◽

Image Representation ◽

Traffic Monitoring ◽

Time Compression ◽

Key Issues ◽

And Storage ◽

Evaluation Parameters

Image representation and compression is an important field of computer vision that contributes to reducing the size of an image and to other application areas such as image restoration and retrieval. Image representation matters for the storage of image information, and it extends to compression, which may be lossy or lossless. Image compression can be applied in many areas, mainly medical imaging, traffic monitoring, military applications, multimedia transmission, and smart cell devices, and in almost all domains that require low transmission and storage cost, specifically image retrieval processing. This chapter presents various image representation, compression, and retrieval approaches. Retrieval approaches on personal computers and smart cell devices are discussed. Finally, the key issues for image representation, compression, and retrieval are identified on the basis of performance evaluation parameters such as encoding time, decoding time, compression ratio, precision, recall, and elapsed time.
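
Three of the evaluation parameters named above have simple closed forms, sketched here for orientation. The definitions follow common usage and are assumptions, since the abstract does not define them: compression ratio is taken as original size divided by compressed size, precision as the fraction of retrieved items that are relevant, and recall as the fraction of relevant items that are retrieved. The function names and sample values are hypothetical.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size (higher means smaller output)."""
    return original_bytes / compressed_bytes


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieval result against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


ratio = compression_ratio(1000, 250)
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```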

Multimodal deep representation learning for protein interaction identification and protein family classification

BMC Bioinformatics ◽

10.1186/s12859-019-3084-y ◽

2019 ◽

Vol 20 (S16) ◽

Cited By ~ 4

Author(s):

Da Zhang ◽

Mansur Kabuka

Keyword(s):

Protein Interactions ◽

Protein Sequence ◽

Representation Learning ◽

Superior Performance ◽

Sequence Information ◽

Protein Protein Interactions ◽

Learning Framework ◽

Topological Features ◽

Ppi Networks ◽

Ppi Prediction

Background: Protein-protein interactions (PPIs) take part in dynamic pathological and biological processes throughout life. A thorough understanding of PPIs is therefore crucial for explaining disease occurrence, achieving the optimal drug-target therapeutic effect, and describing protein complex structures. However, compared to the protein sequences obtainable from various species and organisms, the number of revealed protein-protein interactions is relatively limited. To address this gap, many research efforts have been devoted to facilitating the discovery of novel PPIs. Among these methods, PPI prediction techniques that rely only on protein sequence data are more widespread than methods that require extensive biological domain knowledge. Results: In this paper, we propose a multi-modal deep representation learning structure that incorporates protein physicochemical features with the graph topological features from the PPI networks. Specifically, our method not only takes the protein sequence information into account but also learns the topological representation of each protein node in the PPI networks. We construct a stacked auto-encoder architecture together with a continuous bag-of-words (CBOW) model based on generated metapaths to study the PPI predictions. Following that, we use supervised deep neural networks to identify the PPIs and classify the protein families. The PPI prediction accuracy for eight species ranged from 96.76% to 99.77%, which indicates that our multi-modal deep representation learning framework achieves superior performance compared to other computational methods. Conclusion: To the best of our knowledge, this is the first multi-modal deep representation learning framework for examining the PPI networks.
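
As a rough illustration of how a CBOW model consumes metapaths, the sketch below turns one node sequence into (context, target) training pairs. The abstract does not describe how metapaths are generated or which window size is used, so the walk, the node names, the `window` parameter, and the function name `cbow_pairs` are all hypothetical.

```python
def cbow_pairs(walk, window=1):
    """Build CBOW training pairs from one node sequence: each node is
    predicted from up to `window` neighbours on either side."""
    pairs = []
    for idx, target in enumerate(walk):
        left = walk[max(0, idx - window):idx]
        right = walk[idx + 1:idx + 1 + window]
        pairs.append((left + right, target))
    return pairs


# Hypothetical walk over a PPI network; the node names are placeholders.
walk = ["P1", "P7", "P3", "P9"]
pairs = cbow_pairs(walk, window=1)
```

Each pair would then be fed to an embedding model that averages the context vectors and predicts the target node; the model itself is not shown.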

Perceptual Image Representations for Support Vector Machine Image Coding

Kernel Methods in Bioengineering, Signal and Image Processing ◽

10.4018/978-1-59904-042-4.ch013 ◽

2011 ◽

pp. 303-324

Author(s):

Juan Gutiérrez ◽

Gabriel Gómez-Perez ◽

Jesús Malo ◽

Gustavo Camps-Valls

Keyword(s):

Support Vector Machine ◽

Image Coding ◽

General Procedure ◽

Image Representation ◽

Rate Distortion ◽

Human Perception ◽

Human Vision ◽

Support Vector ◽

Perception Model ◽

Image Representations

Support vector machine (SVM) image coding relies on the ability of SVMs to approximate functions. The size and the profile of the ε-insensitivity zone of the support vector regression (SVR) in a specific image representation determine (a) the number of selected support vectors (the compression ratio), and (b) the nature of the introduced error (the compression distortion). The selection of an appropriate image representation is therefore a key issue for a meaningful design of the ε-insensitivity profile. For example, in image coding applications, taking human perception into account is of paramount importance for obtaining good rate-distortion performance. However, depending on the accuracy of the perception model considered, certain image representations are not suitable for SVR training. In this chapter, we analyze the general procedure for taking human vision models into account in SVR-based image coding. Specifically, we derive the condition for image representation selection and the associated ε-insensitivity profiles.
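
The role of the ε-insensitivity profile can be made concrete with the standard ε-insensitive loss, which ignores residuals smaller than ε. The sketch below is illustrative only: the per-coefficient profile values and the function names are hypothetical, and counting samples outside the tube is used merely as a stand-in for how a wider ε leaves fewer samples that the regression must account for.

```python
def eps_insensitive_loss(target, prediction, eps):
    """Standard epsilon-insensitive loss: zero inside the tube of half-width
    eps, linear in the residual magnitude outside it."""
    return max(0.0, abs(target - prediction) - eps)


def outside_tube(targets, predictions, eps_profile):
    """Indices of samples whose residual exceeds their own epsilon.
    A per-sample profile lets some coefficients tolerate larger errors."""
    return [
        k
        for k, (t, p, e) in enumerate(zip(targets, predictions, eps_profile))
        if abs(t - p) > e
    ]


# Same residual (0.5) everywhere, but a hypothetical non-uniform profile.
targets = [1.0, 1.0, 1.0]
predictions = [0.5, 0.5, 0.5]
eps_profile = [0.25, 0.5, 1.0]
violations = outside_tube(targets, predictions, eps_profile)
```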

Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5330 ◽

2020 ◽

Vol 34 (01) ◽

pp. 27-34 ◽

Cited By ~ 5

Author(s):

Lei Chen ◽

Le Wu ◽

Richang Hong ◽

Kun Zhang ◽

Meng Wang

Keyword(s):

Collaborative Filtering ◽

Representation Learning ◽

Superior Performance ◽

Convolutional Network ◽

Convolutional Networks ◽

Proposed Model ◽

Non Linear ◽

Efficiency And Effectiveness ◽

Residual Graph ◽

Interaction Modeling

Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), some researchers have modeled higher-layer collaborative signals with GCNs by treating the user-item interaction behavior as a bipartite graph. These GCN-based recommender models show superior performance compared to traditional works. However, these models are difficult to train with non-linear activations on large user-item graphs. Besides, most GCN-based models cannot model deeper layers because of the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN-based CF models from two aspects. First, we empirically show that removing non-linearities enhances recommendation performance, which is consistent with the theory of simple graph convolutional networks. Second, we propose a residual network structure specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem of the graph convolution aggregation operation with sparse user-item interaction data. The proposed model is linear; it is easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.
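
The two ideas in this abstract, propagation without non-linear activations and a residual combination across layers, can be sketched on a toy bipartite graph. This is an illustration under assumptions, not the released LR-GCCF code: the symmetric normalization with self-loops, the omission of learned weight matrices, the summing of per-layer inner products as the residual form, and all function names are choices made here for brevity.

```python
import math


def normalized_adjacency(n_users, n_items, interactions):
    """Symmetrically normalized adjacency with self-loops for a bipartite
    user-item graph. Item i is stored as node n_users + i."""
    n = n_users + n_items
    adj = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for user, item in interactions:
        adj[user][n_users + item] = 1.0
        adj[n_users + item][user] = 1.0
    deg = [sum(row) for row in adj]
    return [[adj[r][c] / math.sqrt(deg[r] * deg[c]) for c in range(n)]
            for r in range(n)]


def propagate(norm_adj, emb):
    """One linear layer: next embeddings = norm_adj @ emb, no activation."""
    n, dim = len(emb), len(emb[0])
    return [[sum(norm_adj[r][k] * emb[k][d] for k in range(n))
             for d in range(dim)] for r in range(n)]


def residual_score(layers, user_node, item_node):
    """Sum the user-item inner products of every layer, which equals the
    inner product of the concatenated per-layer embeddings."""
    return sum(
        sum(a * b for a, b in zip(emb[user_node], emb[item_node]))
        for emb in layers
    )


# Two users, two items; user 0 interacted with item 0, user 1 with item 1.
norm_adj = normalized_adjacency(2, 2, [(0, 0), (1, 1)])
emb0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical
layers = [emb0, propagate(norm_adj, emb0)]
seen = residual_score(layers, 0, 2)    # user 0 with item 0
unseen = residual_score(layers, 0, 3)  # user 0 with item 1
```

Because every layer contributes to the score directly, early-layer embeddings are not washed out by later smoothing, which is the intuition behind using a residual structure here.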
