Image Splicing Location Based on Illumination Maps and Cluster Region Proposal Network

Ye Zhu; Xiaoqian Shen; Shikun Liu; Xiaoli Zhang; Gang Yan

doi:10.3390/app11188437

Image Splicing Location Based on Illumination Maps and Cluster Region Proposal Network

Applied Sciences ◽

10.3390/app11188437 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8437

Author(s):

Ye Zhu ◽

Xiaoqian Shen ◽

Shikun Liu ◽

Xiaoli Zhang ◽

Gang Yan

Keyword(s):

State Of The Art ◽

Location Information ◽

Context Information ◽

Post Processing ◽

Image Splicing ◽

Stream Network ◽

Cluster Region ◽

Multiple Feature ◽

Common Operation ◽

Feature Pyramid

Splicing is the most common operation in image forgery, where the tampered background regions are imported from different images. Illumination maps are an inherent attribute of images and provide significant clues when searching for splicing locations. This paper proposes an end-to-end dual-stream network for splicing location, where the illumination stream, which includes Grey-Edge (GE) and Inverse-Intensity Chromaticity (IIC), extracts the inconsistent features, and the image stream extracts the global unnatural tampered features. The dual-stream features in our network are fused through a Multiple Feature Pyramid Network (MFPN), which contains richer context information. Finally, a Cluster Region Proposal Network (C-RPN) with spatial attention and an adaptive cluster anchor is proposed to generate potential tampered regions with greater retention of location information. Extensive experiments on the NIST16 and CASIA standard datasets show that our proposed algorithm is superior to some state-of-the-art algorithms: it locates tampered regions accurately at the pixel level and is robust to post-processing operations such as noise, blur and JPEG recompression.


Adaptive Feature Pyramid Network to Predict Crisp Boundaries via NMS Layer and ODS F-Measure Loss Function

Information ◽

10.3390/info13010032 ◽

2022 ◽

Vol 13 (1) ◽

pp. 32

Author(s):

Gang Sun ◽

Hancheng Yu ◽

Xiangtao Jiang ◽

Mingkui Feng

Keyword(s):

Edge Detection ◽

Loss Function ◽

State Of The Art ◽

Cross Entropy ◽

Post Processing ◽

Multi Scale ◽

Feature Pyramid ◽

Multi Level ◽

Different Levels ◽

F Measure

Edge detection is one of the fundamental computer vision tasks. Recent methods for edge detection based on a convolutional neural network (CNN) typically employ the weighted cross-entropy loss. Their predicted edges are thick and need post-processing before the optimal dataset scale (ODS) F-measure can be calculated for evaluation. To achieve end-to-end training, we propose a non-maximum suppression (NMS) layer to obtain sharp boundaries without the need for post-processing. Because the ODS F-measure can be calculated directly from these sharp boundaries, an ODS F-measure loss function is proposed to train the network. In addition, we propose an adaptive multi-level feature pyramid network (AFPN) to better fuse different levels of features. Furthermore, to enrich the multi-scale features learned by AFPN, we introduce a pyramid context module (PCM) that uses dilated convolution to extract multi-scale features. Experimental results indicate that the proposed AFPN achieves state-of-the-art performance on the BSDS500 dataset (ODS F-score of 0.837) and the NYUDv2 dataset (ODS F-score of 0.780).
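
As a rough illustration of the evaluation metric named in this abstract, the sketch below computes an ODS-style F-measure: true-positive, false-positive and false-negative counts are summed over all images at each threshold, and the best F-score over the thresholds is reported. The function names and the toy counts are hypothetical and do not come from the paper. Real benchmarks also match predicted and ground-truth edge pixels within a distance tolerance, which is omitted here.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ods_f_measure(counts_per_threshold):
    """Pick the single threshold that maximizes the dataset-wide F-measure.

    counts_per_threshold maps a threshold to a list of (tp, fp, fn) tuples,
    one tuple per image (a hypothetical input layout for this sketch).
    Returns (best_threshold, best_f).
    """
    best_t, best_f = None, -1.0
    for t, per_image in sorted(counts_per_threshold.items()):
        # Aggregate counts over the whole dataset before computing F.
        tp = sum(c[0] for c in per_image)
        fp = sum(c[1] for c in per_image)
        fn = sum(c[2] for c in per_image)
        f = f_measure(tp, fp, fn)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Made-up counts for two images at three thresholds.
toy_counts = {
    0.3: [(90, 60, 10), (80, 50, 20)],
    0.5: [(80, 20, 20), (70, 20, 30)],
    0.7: [(50, 5, 50), (40, 5, 60)],
}
best_t, best_f = ods_f_measure(toy_counts)
```

The aggregation step is what distinguishes ODS from a per-image optimum: one threshold is fixed for the entire dataset.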


Asymmetric Adaptive Fusion in a Two-Stream Network for RGB-D Human Detection

Sensors ◽

10.3390/s21030916 ◽

2021 ◽

Vol 21 (3) ◽

pp. 916

Author(s):

Wenli Zhang ◽

Xiang Guo ◽

Jiaqi Wang ◽

Ning Wang ◽

Kaizhen Chen

Keyword(s):

State Of The Art ◽

Contextual Information ◽

Human Detection ◽

Stream Network ◽

Adaptive Fusion ◽

Indoor Scenes ◽

Stable Performance ◽

Feature Pyramid ◽

Low Illumination ◽

Depth Feature

In recent years, human detection in indoor scenes has been widely applied in smart buildings and smart security, but many related challenges can still be difficult to address, such as frequent occlusion, low illumination and multiple poses. This paper proposes an asymmetric adaptive fusion two-stream network (AAFTS-net) for RGB-D human detection. This network can fully extract person-specific depth features and RGB features while reducing the typical complexity of a two-stream network. A depth feature pyramid is constructed by combining contextual information, with the motivation of combining multiscale depth features to improve the adaptability for targets of different sizes. An adaptive channel weighting (ACW) module weights the RGB-D feature channels to achieve efficient feature selection and information complementation. This paper also introduces a novel RGB-D dataset for human detection called RGBD-human, on which we verify the performance of the proposed algorithm. The experimental results show that AAFTS-net outperforms existing state-of-the-art methods and can maintain stable performance under conditions of frequent occlusion, low illumination and multiple poses.


Progressive Feature Polishing Network for Salient Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6892 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12128-12135 ◽

Cited By ~ 1

Author(s):

Bo Wang ◽

Quan Chen ◽

Min Zhou ◽

Zhiqiang Zhang ◽

Xiaogang Jin ◽

...

Keyword(s):

Object Detection ◽

State Of The Art ◽

Hierarchical Structures ◽

Salient Object Detection ◽

Salient Object ◽

Post Processing ◽

Feature Maps ◽

Multiple Feature ◽

Benchmark Datasets ◽

Multi Level

Feature matters for salient object detection. Existing methods mainly focus on designing a sophisticated structure to incorporate multi-level features and filter out cluttered features. We present Progressive Feature Polishing Network (PFPN), a simple yet effective framework to progressively polish the multi-level features to be more accurate and representative. By employing multiple Feature Polishing Modules (FPMs) in a recurrent manner, our approach is able to detect salient objects with fine details without any post-processing. An FPM updates the features of each level in parallel by directly incorporating all higher-level context information. Moreover, it preserves the dimensions and hierarchical structures of the feature maps, which makes it easy to integrate with any CNN-based model. Empirical experiments show that our results improve monotonically as the number of FPMs increases. Without bells and whistles, PFPN outperforms the state-of-the-art methods significantly on five benchmark datasets under various evaluation metrics. Our code is available at: https://github.com/chenquan-cq/PFPN.


CasTabDetectoRS: Cascade Network for Table Detection in Document Images with Recursive Feature Pyramid and Switchable Atrous Convolution

10.20944/preprints202109.0059.v1 ◽

2021 ◽

Author(s):

Khurram Azeem Hashmi ◽

Alain Pagani ◽

Marcus Liwicki ◽

Didier Stricker ◽

Muhammad Zeshan Afzal

Keyword(s):

Relative Error ◽

State Of The Art ◽

Error Reduction ◽

Reliable Information ◽

Document Images ◽

Post Processing ◽

Preliminary Step ◽

Backbone Networks ◽

Previous State ◽

Feature Pyramid

Table detection is a preliminary step in extracting reliable information from tables in scanned document images. We present CasTabDetectoRS, a novel end-to-end trainable table detection framework that operates on Cascade Mask R-CNN, including a Recursive Feature Pyramid network and Switchable Atrous Convolution in the existing backbone architecture. By utilizing a comparatively lightweight backbone of ResNet-50, this paper demonstrates that superior results are attainable without relying on pre- and post-processing methods, heavier backbone networks (ResNet-101, ResNeXt-152), and memory-intensive deformable convolutions. We evaluate the proposed approach on five different publicly available table detection datasets. Our CasTabDetectoRS outperforms the previous state-of-the-art results on four datasets (ICDAR-19, TableBank, UNLV, and Marmot) and accomplishes comparable results on ICDAR-17 POD. Compared with previous state-of-the-art results, we obtain a significant relative error reduction of 56.36%, 20%, 4.5%, and 3.5% on the datasets of ICDAR-19, TableBank, UNLV, and Marmot, respectively. Furthermore, this paper sets a new benchmark by performing exhaustive cross-dataset evaluations to exhibit the generalization capabilities of the proposed method.


CasTabDetectoRS: Cascade Network for Table Detection in Document Images with Recursive Feature Pyramid and Switchable Atrous Convolution

Journal of Imaging ◽

10.3390/jimaging7100214 ◽

2021 ◽

Vol 7 (10) ◽

pp. 214

Author(s):

Khurram Hashmi ◽

Alain Pagani ◽

Marcus Liwicki ◽

Didier Stricker ◽

Muhammad Zeshan Afzal

Keyword(s):

Relative Error ◽

State Of The Art ◽

Error Reduction ◽

Reliable Information ◽

Document Images ◽

Post Processing ◽

Preliminary Step ◽

Backbone Networks ◽

Previous State ◽

Feature Pyramid

Table detection is a preliminary step in extracting reliable information from tables in scanned document images. We present CasTabDetectoRS, a novel end-to-end trainable table detection framework that operates on Cascade Mask R-CNN, including a Recursive Feature Pyramid network and Switchable Atrous Convolution in the existing backbone architecture. By utilizing a comparatively lightweight backbone of ResNet-50, this paper demonstrates that superior results are attainable without relying on pre- and post-processing methods, heavier backbone networks (ResNet-101, ResNeXt-152), and memory-intensive deformable convolutions. We evaluate the proposed approach on five different publicly available table detection datasets. Our CasTabDetectoRS outperforms the previous state-of-the-art results on four datasets (ICDAR-19, TableBank, UNLV, and Marmot) and accomplishes comparable results on ICDAR-17 POD. Compared with previous state-of-the-art results, we obtain a significant relative error reduction of 56.36%, 20%, 4.5%, and 3.5% on the datasets of ICDAR-19, TableBank, UNLV, and Marmot, respectively. Furthermore, this paper sets a new benchmark by performing exhaustive cross-dataset evaluations to exhibit the generalization capabilities of the proposed method.
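
The "relative error reduction" figures quoted in this abstract are conventionally the fraction of the previous error that a new method removes. The following is a minimal sketch under the assumption that error is defined as 1 minus the reported score (for example an F1-score); the abstract does not state the exact definition, and the scores in the example are made-up numbers, not results from the paper.

```python
def relative_error_reduction(prev_score, new_score):
    """Fraction of the previous error removed by the new method.

    Scores are assumed to lie in [0, 1], with error = 1 - score
    (an assumption of this sketch, not a definition from the paper).
    """
    prev_error = 1.0 - prev_score
    new_error = 1.0 - new_score
    if prev_error <= 0.0:
        raise ValueError("previous score leaves no error to reduce")
    return (prev_error - new_error) / prev_error


# Hypothetical scores: going from 0.90 to 0.95 removes about half of the remaining error.
example = relative_error_reduction(0.90, 0.95)
```

Under this definition a small absolute gain on an already high score can correspond to a large relative reduction, which is why the metric is often reported alongside absolute scores.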


Dual-Stream Guided-Learning via a Priori Optimization for Person Re-identification

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3447715 ◽

2021 ◽

Vol 17 (4) ◽

pp. 1-22

Author(s):

Junyi Wu ◽

Yan Huang ◽

Qiang Wu ◽

Zhipeng Gao ◽

Jianqiang Zhao ◽

...

Keyword(s):

Learning Strategy ◽

State Of The Art ◽

A Priori ◽

Background Information ◽

Stream Network ◽

Related Information ◽

Guided Learning ◽

Segmentation Algorithms ◽

Art Methods ◽

Background Clutter

The task of person re-identification (re-ID) is to find the same pedestrian across non-overlapping camera views. Generally, the performance of person re-ID can be affected by background clutter. However, existing segmentation algorithms cannot obtain perfect foreground masks that cleanly cover the background information. In addition, if the background is completely removed, some discriminative ID-related cues (e.g., a backpack or companion) may be lost. In this article, we design a dual-stream network consisting of a Provider Stream (P-Stream) and a Receiver Stream (R-Stream). The R-Stream performs an a priori optimization operation on foreground information. The P-Stream acts as a pusher to guide the R-Stream to concentrate on foreground information and some useful ID-related cues in the background. The proposed dual-stream network can make full use of the a priori optimization and guided-learning strategy to learn encouraging foreground information and some useful ID-related information in the background. Our method achieves Rank-1 accuracy of 95.4% on Market-1501, 89.0% on DukeMTMC-reID, 78.9% on CUHK03 (labeled), and 75.4% on CUHK03 (detected), outperforming state-of-the-art methods.


Advancement in the Application of Finite Element Analysis to the Optimization of Composite Yacht Structures

10.5957/csys-2011-005 ◽

2011 ◽

Author(s):

David Fornaro

Keyword(s):

Finite Element Analysis ◽

Finite Element ◽

Composite Structures ◽

State Of The Art ◽

Load Case ◽

Post Processing ◽

Principal Stresses ◽

Element Analysis ◽

Current State ◽

Case Development

Finite Element Analysis (FEA) is a mature technology that has been in use for several decades as a tool to optimize structures for a wide variety of applications. Its application to composite structures is not new; however, the technology for modeling and analyzing the behavior of composite structures continues to evolve on several fronts. This paper provides a review of the current state of the art with regard to composites FEA, with a particular emphasis on applications to yacht structures. Topics covered are divided into three categories: Pre-processing, Post-processing, and Non-linear Solutions. Pre-processing topics include meshing, ply properties, laminate definitions, element orientations, global ply tracking and load case development. Post-processing topics include principal stresses, failure indices and strength ratios. Non-linear solution topics include progressive ply failure. Examples are included to highlight the application of advanced finite element analysis methodologies to the optimization of composite yacht structures.


Mosaic Super-resolution via Sequential Feature Pyramid Networks

10.36227/techrxiv.11402130 ◽

2019 ◽

Author(s):

Mehrdad Shoeiby ◽

Mohammad Ali Armin ◽

Sadegh Aliakbarian ◽

Saeed Anwar ◽

Lars Petersson

Keyword(s):

State Of The Art ◽

Super Resolution ◽

Autonomous Driving ◽

Single Shot ◽

Current State ◽

Wide Range ◽

Feature Pyramid ◽

Novel Method ◽

Convolutional Lstm ◽

Mosaic Images

Advances in the design of multi-spectral cameras have led to great interest in a wide range of applications, from astronomy to autonomous driving. However, such cameras inherently suffer from a trade-off between the spatial and spectral resolution. In this paper, we propose to address this limitation by introducing a novel method to carry out super-resolution on raw mosaic images, multi-spectral or RGB Bayer, captured by modern real-time single-shot mosaic sensors. To this end, we design a deep super-resolution architecture that benefits from a sequential feature pyramid along the depth of the network. This, in fact, is achieved by utilizing a convolutional LSTM (ConvLSTM) to learn the inter-dependencies between features at different receptive fields. Additionally, by investigating the effect of different attention mechanisms in our framework, we show that a ConvLSTM-inspired module is able to provide superior attention in our context. Our extensive experiments and analyses evidence that our approach yields significant super-resolution quality, outperforming current state-of-the-art mosaic super-resolution methods on both Bayer and multi-spectral images. Additionally, to the best of our knowledge, our method is the first specialized method to super-resolve mosaic images, whether multi-spectral or Bayer.


Joint Representation Learning of Legislator and Legislation for Roll Call Prediction

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/198 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yuqiao Yang ◽

Xiaoqiang Lin ◽

Geng Lin ◽

Zengfeng Huang ◽

Changjian Jiang ◽

...

Keyword(s):

Neural Networks ◽

State Of The Art ◽

Ideal Point ◽

Representation Learning ◽

Context Information ◽

Roll Call ◽

Triplet Loss ◽

Joint Representation ◽

Narrative Description ◽

The Ideal

In this paper, we explore learning representations of legislation and legislators for the prediction of roll call results. The most popular approach to this topic is the ideal point model, which relies on historical voting information for representation learning of legislators. It largely ignores the context information of the legislative data. We, therefore, propose to incorporate context information to learn dense representations for both legislators and legislation. For legislators, we incorporate relations among them via graph convolutional neural networks (GCN) for their representation learning. For legislation, we utilize its narrative description via recurrent neural networks (RNN) for representation learning. In order to align the two kinds of representations in the same vector space, we introduce a triplet loss for the joint training. Experimental results on a self-constructed dataset show the effectiveness of our model for roll call results prediction compared to some state-of-the-art baselines.
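
To make the alignment idea concrete, here is a minimal sketch of a standard margin-based triplet loss with Euclidean distance. It is a generic formulation, not the paper's implementation: the abstract does not say how triplets are formed, so treating a legislator embedding as the anchor and two legislation embeddings as the positive and negative is an assumption of this example, as are the margin value and the toy vectors.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: the positive should be closer to the anchor
    than the negative by at least `margin`; otherwise a penalty is paid."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings living in one shared vector space.
legislator = [0.0, 0.0]          # anchor (assumed role)
bill_supported = [0.0, 1.0]      # positive (assumed role), distance 1 from the anchor
bill_opposed_far = [3.0, 4.0]    # negative, distance 5 from the anchor
bill_opposed_near = [1.0, 0.0]   # negative, distance 1 from the anchor

loss_satisfied = triplet_loss(legislator, bill_supported, bill_opposed_far)
loss_violated = triplet_loss(legislator, bill_supported, bill_opposed_near)
```

The first triplet already satisfies the margin and contributes no loss; the second does not, so its loss is positive and would push the two embeddings apart during training.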


Feature weighting network for aircraft engine defect detection

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691320500125 ◽

2020 ◽

Vol 18 (03) ◽

pp. 2050012

Author(s):

Liqiong Chen ◽

Lian Zou ◽

Cien Fan ◽

Yifeng Liu

Keyword(s):

Defect Detection ◽

State Of The Art ◽

Aircraft Engine ◽

Air Transportation ◽

Feature Weighting ◽

Detection Methods ◽

Detection Accuracy ◽

Practical Applications ◽

Feature Pyramid ◽

New Feature

Automatic aircraft engine defect detection is a challenging but important industrial task that helps ensure safe air transportation and flight. In this paper, we propose a fast and accurate feature weighting network (FWNet) to solve the problem of defect scale variation and improve detection accuracy. The framework is designed based on recent popular convolutional neural networks and feature pyramid. To further boost the representation power of the network, a new feature weighting module (FWM) is proposed to recalibrate the channel-wise attention and increase the weights of valid features. The model was trained and tested on a self-built dataset, which consisted of 1916 images and contained three defect types: ablation, crack and coating missing. Extensive experimental results verify the effectiveness of the proposed FWM and show that the proposed method can accurately detect engine defects of different scales and different locations. Our method obtains 89.4% mAP and runs at 6 FPS, which surpasses other state-of-the-art detection methods and can quickly provide a diagnostic basis for aircraft maintenance inspectors in practical applications.
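
The abstract does not describe the internals of the FWM, so the sketch below shows only the general idea of channel-wise recalibration in the style of squeeze-and-excitation gating: each channel is pooled to a scalar, mapped to a gate in (0, 1), and rescaled by that gate. The function name and the per-channel `scale` and `bias` parameters are hypothetical stand-ins for learned layers and should not be read as the paper's design.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_reweight(feature_map, scale, bias):
    """Rescale each channel of a feature map by a gate derived from its mean.

    feature_map: list of C channels, each an H x W nested list of floats.
    scale, bias: per-channel parameters (hypothetical; learned in practice).
    Returns (reweighted_feature_map, gates).
    """
    out, gates = [], []
    for channel, w, b in zip(feature_map, scale, bias):
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)   # squeeze: global average pooling
        gate = sigmoid(w * pooled + b)       # excitation: channel weight in (0, 1)
        gates.append(gate)
        out.append([[v * gate for v in row] for row in channel])
    return out, gates


# Two toy 2x2 channels; the second gets a larger gate because of its positive bias.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
reweighted, gates = channel_reweight(features, scale=[0.0, 0.0], bias=[0.0, 2.0])
```

With a zero scale and bias the gate is exactly 0.5, so the first channel is halved, while the second channel keeps most of its magnitude.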
