CNN-Based Target Recognition and Identification for Infrared Imaging in Defense Systems

Antoine d’Acremont; Ronan Fablet; Alexandre Baussard; Guillaume Quin

doi:10.3390/s19092040

Sensors ◽

10.3390/s19092040 ◽

2019 ◽

Vol 19 (9) ◽

pp. 2040 ◽

Cited By ~ 7

Author(s):

Antoine d’Acremont ◽

Ronan Fablet ◽

Alexandre Baussard ◽

Guillaume Quin

Keyword(s):

Large Scale ◽

Data Augmentation ◽

Infrared Imaging ◽

State Of The Art ◽

Object Identification ◽

Fine Tuning ◽

Support Vector ◽

Defense Systems ◽

Large Scale Dataset ◽

In The Wild

Convolutional neural networks (CNNs) have rapidly become the state-of-the-art models for image classification applications. They usually require large ground-truth-annotated datasets for training. Here, we address object identification and recognition in the wild for infrared (IR) imaging in defense applications, where no such large-scale dataset is available. With a focus on robustness issues, especially viewpoint invariance, we introduce a compact and fully convolutional CNN architecture with global average pooling. We show that this model, trained on realistic simulation datasets, reaches state-of-the-art performance compared with other CNNs with no data augmentation or fine-tuning steps. We also demonstrate a significant improvement in robustness to viewpoint changes with respect to an operational support vector machine (SVM)-based scheme.
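
The abstract names global average pooling as part of the architecture but gives no layer details. As a minimal sketch of that one operation (not the authors' network), the following pure-Python example shows how averaging each feature map produces one value per channel regardless of spatial size. The function name `global_average_pool` and the toy feature maps are hypothetical.

```python
from statistics import fmean


def global_average_pool(feature_maps):
    """Collapse each 2D feature map (a list of rows) to its mean value.

    The output has one entry per channel, whatever the spatial size.
    """
    return [fmean(v for row in fmap for v in row) for fmap in feature_maps]


# Hypothetical feature maps: 2 channels at 2x2, then 2 channels at 3x3.
small = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
large = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]

print(global_average_pool(small))  # [4.0, 1.0]
print(global_average_pool(large))  # [5.0, 1.0]
```

Both inputs yield a vector of length 2, which is why a fully convolutional network ending in this operation can feed a fixed-size classifier from inputs of different resolutions.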

VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents

Journal of Imaging ◽

10.3390/jimaging7030050 ◽

2021 ◽

Vol 7 (3) ◽

pp. 50

Author(s):

Anselmo Ferreira ◽

Ehsan Nowroozi ◽

Mauro Barni

Keyword(s):

Large Scale ◽

State Of The Art ◽

Child Pornography ◽

Forensic Analysis ◽

Synthetic Image ◽

Image Detection ◽

Face Images ◽

Large Scale Dataset ◽

Scanned Images ◽

Analysis Of The Images

The possibility of carrying out a meaningful forensic analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large-scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to printed and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.

ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition

International Journal of Computer Vision ◽

10.1007/s11263-021-01477-5 ◽

2021 ◽

Author(s):

Anil S. Baslamisli ◽

Partha Das ◽

Hoang-An Le ◽

Sezer Karaoglu ◽

Theo Gevers

Keyword(s):

Neural Network ◽

Large Scale ◽

State Of The Art ◽

Image Decomposition ◽

Natural Environments ◽

Decomposition Algorithms ◽

Ambient Light ◽

Fine Grained ◽

Large Scale Dataset ◽

Direct Illumination

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large-scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.

TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild

Computer Vision – ECCV 2018 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-01246-5_19 ◽

2018 ◽

pp. 310-327 ◽

Cited By ~ 45

Author(s):

Matthias Müller ◽

Adel Bibi ◽

Silvio Giancola ◽

Salman Alsubaihi ◽

Bernard Ghanem

Keyword(s):

Object Tracking ◽

Large Scale ◽

Large Scale Dataset ◽

In The Wild

Tomato pest classification using deep convolutional neural network with transfer learning, fine tuning and scratch learning

Intelligent Decision Technologies ◽

10.3233/idt-200192 ◽

2021 ◽

pp. 1-10

Author(s):

Gayatri Pattnaik ◽

Vimal K. Shrivastava ◽

K. Parvathi

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Data Augmentation ◽

State Of The Art ◽

Deep Convolutional Neural Network ◽

Fine Tuning ◽

Tomato Plants ◽

Random Weights

Pests are a major threat to the economic growth of a country. Application of pesticide is the easiest way to control pest infection. However, excessive use of pesticide is hazardous to the environment. Recent advances in deep learning have paved the way for early detection and improved classification of pests in tomato plants, which will benefit farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models with three configurations: transfer learning, fine-tuning and scratch learning. Training in transfer learning and fine-tuning starts from pre-trained weights, whereas random weights are used in the case of scratch learning. In addition, data augmentation has been explored to improve performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy of 94.87% has been achieved in the transfer learning approach by the DenseNet201 model with data augmentation.
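
The three training configurations can be summarized as plain data. The abstract states only the weight initialization for each one. The sketch below additionally assumes the common convention that transfer learning keeps the pre-trained backbone frozen while fine-tuning updates it; that split, the `TrainingConfig` class, and its field names are assumptions, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    name: str
    init_weights: str          # "pretrained" or "random" (stated in the abstract)
    backbone_trainable: bool   # ASSUMPTION: common convention, not from the abstract


CONFIGS = [
    TrainingConfig("transfer_learning", "pretrained", backbone_trainable=False),
    TrainingConfig("fine_tuning", "pretrained", backbone_trainable=True),
    TrainingConfig("scratch_learning", "random", backbone_trainable=True),
]

for cfg in CONFIGS:
    print(f"{cfg.name}: init={cfg.init_weights}, "
          f"backbone trainable={cfg.backbone_trainable}")
```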

Legal Judgment Prediction Based on Multiclass Information Fusion

Complexity ◽

10.1155/2020/3089189 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Kongfan Zhu ◽

Rundong Guo ◽

Weifeng Hu ◽

Zeqiang Li ◽

Yujun Li

Keyword(s):

Information Fusion ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

External Information ◽

Criminal Cases ◽

Law System ◽

Large Scale Dataset ◽

Assistant Systems ◽

Civil Law System

Legal judgment prediction (LJP), an effective and critical application in legal assistant systems, aims to determine the judgment results according to the information based on the fact determination. In real-world scenarios, to deal with criminal cases, judges not only take advantage of the fact description, but also consider external information, such as basic information about the defendant and the court view. However, most existing works take the fact description as the sole input for LJP and ignore the external information. We propose a Transformer-Hierarchical-Attention-Multi-Extra (THME) Network to make full use of the information based on the fact determination. We conduct experiments on a real-world large-scale dataset of criminal cases in the civil law system. Experimental results show that our method outperforms state-of-the-art LJP methods on all judgment prediction tasks.

TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6282 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7780-7788

Author(s):

Siddhant Garg ◽

Thuy Vu ◽

Alessandro Moschitti

Keyword(s):

Large Scale ◽

Question Answering ◽

Positive Impact ◽

Fine Tuning ◽

Target Domain ◽

Domain Specific ◽

Transfer Step ◽

Industrial Setting ◽

Large Scale Dataset ◽

Effective Use

We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer sentence selection, which is a well-known inference task in Question Answering. We built a large-scale dataset to enable the transfer step, exploiting the Natural Questions dataset. Our approach establishes the state of the art on two well-known benchmarks, WikiQA and TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which largely outperform the highest scores of 83.4% and 87.5% of previous work. We empirically show that TandA generates more stable and robust models, reducing the effort required for selecting optimal hyper-parameters. Additionally, we show that the transfer step of TandA makes the adaptation step more robust to noise. This enables a more effective use of noisy datasets for fine-tuning. Finally, we also confirm the positive impact of TandA in an industrial setting, using domain-specific datasets subject to different types of noise.
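
The reported scores are mean average precision (MAP) values over ranked answer candidates. The sketch below computes MAP using the standard textbook definition. It is not the authors' evaluation script; the function names, the toy rankings, and the choice to skip questions with no correct candidate are assumptions made for illustration.

```python
def average_precision(ranked_labels):
    """AP for one question; labels are 1 (correct) or 0, in ranked order."""
    hits = 0
    precisions = []
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)  # precision at this correct answer
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(questions):
    """Mean of per-question AP.

    ASSUMPTION: questions with no correct candidate are skipped.
    """
    scored = [average_precision(q) for q in questions if any(q)]
    return sum(scored) / len(scored) if scored else 0.0


# Hypothetical rankings for two questions.
example = [
    [1, 0, 0],     # correct answer ranked first: AP = 1.0
    [0, 1, 0, 1],  # correct at ranks 2 and 4: AP = (1/2 + 2/4) / 2 = 0.5
]
print(mean_average_precision(example))  # 0.75
```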

A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities

Sensors ◽

10.3390/s20113305 ◽

2020 ◽

Vol 20 (11) ◽

pp. 3305 ◽

Cited By ~ 1

Author(s):

Huogen Wang ◽

Zhanjie Song ◽

Wanqing Li ◽

Pichao Wang

Keyword(s):

Neural Network ◽

Action Recognition ◽

Canonical Correlation ◽

Large Scale ◽

State Of The Art ◽

Hybrid Network ◽

Support Vector ◽

Multiple Modalities ◽

Large Margin ◽

Percentage Points

The paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by the state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN-based component and an RNN-based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D and Multi-modal & Multi-view & Interactive (M²I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases).

Local Support Vector Machine based on Cooperative Clustering for very large-scale dataset

2012 8th International Conference on Natural Computation ◽

10.1109/icnc.2012.6234598 ◽

2012 ◽

Cited By ~ 1

Author(s):

Chuanhuan Yin ◽

Yingying Zhu ◽

Shaomin Mu ◽

Shengfeng Tian

Keyword(s):

Support Vector Machine ◽

Large Scale ◽

Support Vector ◽

Large Scale Dataset ◽

Local Support ◽

Cooperative Clustering

Detection of Manipulated Face Videos over Social Networks: A Large-Scale Study

Journal of Imaging ◽

10.3390/jimaging7100193 ◽

2021 ◽

Vol 7 (10) ◽

pp. 193

Author(s):

Federico Marcon ◽

Cecilia Pasquini ◽

Giulia Boato

Keyword(s):

Large Scale ◽

State Of The Art ◽

Forensic Analysis ◽

General Purpose ◽

Fine Tuning ◽

Specific Technique ◽

Multimedia Forensics ◽

Shared Data ◽

Social Media Platforms ◽

The Web

The detection of manipulated videos is a highly relevant problem in multimedia forensics and has been widely investigated in recent years. However, a common trait of published studies is that the forensic analysis is typically applied to data prior to their potential dissemination over the web. This work addresses the challenging scenario where manipulated videos are first shared through social media platforms and then subjected to forensic analysis. In this context, a large-scale performance evaluation has been carried out involving general-purpose deep networks and state-of-the-art manipulated data, and studying different effects. Results confirm that a performance drop is observed in every case when unseen shared data are tested by networks trained on non-shared data; however, fine-tuning operations can mitigate this problem. Also, we show that the output of differently trained networks can carry useful forensic information for the identification of the specific technique used for visual manipulation, both for shared and non-shared data.

Edge-assisted Collaborative Image Recognition for Mobile Augmented Reality

ACM Transactions on Sensor Networks ◽

10.1145/3469033 ◽

2022 ◽

Vol 18 (1) ◽

pp. 1-31

Author(s):

Guohao Lan ◽

Zida Liu ◽

Yunfan Zhang ◽

Tim Scargill ◽

Jovan Stojkovic ◽

...

Keyword(s):

Augmented Reality ◽

Image Recognition ◽

Large Scale ◽

Data Augmentation ◽

Recognition Accuracy ◽

Image Distortion ◽

Mobile Augmented Reality ◽

The Real ◽

In The Wild ◽

Mobile AR

Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for “in the wild” mobile AR is still elusive. In this article, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the spatial-temporal correlation among mobile AR users to improve recognition accuracy. Moreover, as it is difficult to collect a large-scale image distortion dataset, we propose a Cycle-Consistent Generative Adversarial Network-based data augmentation method to synthesize realistic image distortion. Our evaluation demonstrates that CollabAR achieves over 85% recognition accuracy for “in the wild” images with severe distortions, while reducing the end-to-end system latency to as low as 18.2 ms.
