Boosting Targeted Black-Box Attacks via Ensemble Substitute Training and Linear Augmentation

Xianfeng Gao; Yu-an Tan; Hongwei Jiang; Quanxin Zhang; Xiaohui Kuang

doi:10.3390/app9112286

Applied Sciences ◽

10.3390/app9112286 ◽

2019 ◽

Vol 9 (11) ◽

pp. 2286 ◽

Cited By ~ 6

Author(s):

Xianfeng Gao ◽

Yu-an Tan ◽

Hongwei Jiang ◽

Quanxin Zhang ◽

Xiaohui Kuang

Keyword(s):

Neural Networks ◽

Image Classification ◽

Deep Neural Networks ◽

Black Box ◽

Decision Boundary ◽

Success Rates ◽

Small Perturbations ◽

Targeted Attacks ◽

Adversarial Examples ◽

Effectiveness And Efficiency

In recent years, Deep Neural Networks (DNNs) have shown unprecedented performance in many areas. However, recent studies have revealed their vulnerability to small perturbations added to source inputs. The methods for generating these perturbations are called adversarial attacks, and they fall into two types, black-box and white-box, according to the adversary's access to the target model. Many researchers have proposed strategies to overcome a black-box attacker's lack of access to the internals of the target DNN. Previous work includes training a local substitute model for the target black-box model via Jacobian-based augmentation and then using the substitute model to craft adversarial examples with white-box methods. In this work, we improve the dataset augmentation so that the substitute models better fit the decision boundary of the target model. Unlike the previous work, which performed only non-targeted attacks, we are the first to generate targeted adversarial examples by training substitute models. Moreover, to boost the targeted attacks, we apply the idea of ensemble attacks to the substitute training. Experiments on MNIST and GTSRB, two common datasets for image classification, demonstrate the effectiveness and efficiency of our approach in boosting targeted black-box attacks: we attack the MNIST and GTSRB classifiers with success rates of 97.7% and 92.8%, respectively.
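The Jacobian-based augmentation attributed to previous work in this abstract can be sketched roughly as follows. This is only an illustration of that earlier idea, not the authors' improved augmentation, whose details the abstract does not give. It assumes a hypothetical softmax-linear substitute F(x) = softmax(xW + b); the names `jacobian_augment`, `W`, `b`, and `lam` are illustrative and do not come from the paper.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def jacobian_augment(X, oracle_labels, W, b, lam=0.1):
    """One augmentation round for an assumed substitute F(x) = softmax(xW + b).

    Each sample is pushed by lam along the sign of dF_y/dx, where y is the
    label the black-box oracle assigned to it. The new points are appended
    to the original set (and would then be labeled by querying the oracle).
    """
    n = X.shape[0]
    P = softmax(X @ W + b)                      # (n, k) substitute probabilities
    p_y = P[np.arange(n), oracle_labels]        # (n,) probability of the oracle label
    W_y = W[:, oracle_labels].T                 # (n, d) weight column of that label
    grad = p_y[:, None] * (W_y - P @ W.T)       # (n, d) analytic dF_y/dx
    X_new = np.clip(X + lam * np.sign(grad), 0.0, 1.0)  # keep a valid pixel range
    return np.vstack([X, X_new])
```

Each round doubles the substitute's training set; in a full pipeline the substitute would be retrained on the oracle-labeled result before the next round.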

A Black-Box Approach to Generate Adversarial Examples Against Deep Neural Networks for High Dimensional Input

2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC) ◽

10.1109/dsc.2019.00078 ◽

2019 ◽

Author(s):

Chengru Song ◽

Changqiao Xu ◽

Shujie Yang ◽

Zan Zhou ◽

Changhui Gong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

High Dimensional ◽

Adversarial Examples

Optimizing for Interpretability in Deep Neural Networks with Tree Regularization

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.12558 ◽

2021 ◽

Vol 72 ◽

pp. 1-37

Author(s):

Mike Wu ◽

Sonali Parbhoo ◽

Michael C. Hughes ◽

Volker Roth ◽

Finale Doshi-Velez

Keyword(s):

Neural Networks ◽

Predictive Power ◽

Deep Neural Networks ◽

Large Body ◽

Black Box ◽

New Family ◽

Deep Model ◽

Real World Applications ◽

Adversarial Examples ◽

Key Barrier

Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to adoption in many real-world applications. There exists a large body of work aiming to help humans understand these black-box functions at varying levels of granularity, for example through distillation, gradients, or adversarial examples. These methods, however, all tackle interpretability as a separate process after training. In this work, we take a different approach and explicitly regularize deep models so that they are well approximated by processes that humans can step through in little time. Specifically, we train several families of deep neural networks to resemble compact, axis-aligned decision trees without significant compromises in accuracy. The resulting axis-aligned decision functions uniquely make tree-regularized models easy for humans to interpret. Moreover, for situations in which a single, global tree is a poor estimator, we introduce a regional tree regularizer that encourages the deep model to resemble a compact, axis-aligned decision tree in predefined, human-interpretable contexts. Using intuitive toy examples, benchmark image datasets, and medical tasks for patients in critical care and with HIV, we demonstrate that this new family of tree regularizers yields models that are easier for humans to simulate than models trained with L1 or L2 penalties, without sacrificing predictive power.

AdverseGen: A Practical Tool for Generating Adversarial Examples to Deep Neural Networks Using Black-Box Approaches

Lecture Notes in Computer Science - Artificial Intelligence XXXVIII ◽

10.1007/978-3-030-91100-3_25 ◽

2021 ◽

pp. 313-326

Author(s):

Keyuan Zhang ◽

Kaiyue Wu ◽

Siyu Chen ◽

Yunce Zhao ◽

Xin Yao

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

Adversarial Examples ◽

Practical Tool

Representation of white- and black-box adversarial examples in deep neural networks and humans: A functional magnetic resonance imaging study

2019 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2019.8851763 ◽

2019 ◽

Author(s):

Chihye Han ◽

Wonjun Yoon ◽

Gihyun Kwon ◽

Daeshik Kim ◽

Seungkyu Nam

Keyword(s):

Magnetic Resonance Imaging ◽

Neural Networks ◽

Magnetic Resonance ◽

Deep Neural Networks ◽

Imaging Study ◽

Black Box ◽

Magnetic Resonance Imaging Study ◽

Functional Magnetic Resonance ◽

Resonance Imaging ◽

Adversarial Examples

The stability of neural networks under condition of adversarial attacks to biomedical image classification

Journal of the Belarusian State University. Mathematics and Informatics ◽

10.33581/2520-6508-2020-3-60-72 ◽

2020 ◽

pp. 60-72

Author(s):

Dmitry M. Voynov ◽

Vassili A. Kovalev

Keyword(s):

Neural Networks ◽

Image Classification ◽

Classification Accuracy ◽

Black Box ◽

Accuracy Score ◽

Biomedical Image ◽

L2 Norm ◽

Adversarial Examples ◽

The Stability ◽

Biomedical Image Classification

Recently, most research and development teams working in the field of deep learning have concentrated on improving classification accuracy and related measures of image classification quality, whereas the problem of adversarial attacks on deep neural networks attracts much less attention. This article is dedicated to an experimental study of the influence of various factors on the stability of convolutional neural networks under adversarial attacks on biomedical image classification. On a very extensive dataset consisting of more than 1.45 million radiological and histological images, we assess the efficiency of attacks performed using the projected gradient descent (PGD), DeepFool and Carlini-Wagner (CW) methods. We analyze the results of both white-box and black-box attacks on commonly used neural architectures such as InceptionV3, Densenet121, ResNet50, MobileNet and Xception. The basic conclusion of this study is that in the field of biomedical image classification the problem of adversarial attacks remains acute: the tested attack methods successfully attack the above-mentioned networks, so that, depending on the specific task, their original classification accuracy falls from 83–97% to an accuracy score of 15%. It was also found that under similar conditions the PGD method is less successful in adversarial attacks than the DeepFool and CW methods. When the original images and adversarial examples are compared using the L2-norm, the DeepFool and CW methods generate adversarial examples of similar maliciousness. In addition, in three out of four black-box attacks, the PGD method demonstrated lower attacking efficiency.
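For readers unfamiliar with PGD, the following is a minimal sketch of an untargeted L-infinity PGD attack. It does not reproduce the study's setup: instead of the convolutional networks listed above, it assumes a toy binary logistic model p = sigmoid(w·x + b) so that the gradient can be written analytically, and the names `pgd_linf`, `eps`, `alpha`, and `steps` are illustrative.

```python
import numpy as np

def pgd_linf(x, y, w, b, eps=0.1, alpha=0.02, steps=20):
    """Untargeted L-inf PGD against an assumed logistic model, y in {0, 1}.

    Repeats: take a signed gradient-ascent step on the cross-entropy loss,
    then project back onto the eps-ball around the original input x.
    """
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
        grad = (p - y) * w                          # d(cross-entropy)/dx
        x_adv = x_adv + alpha * np.sign(grad)       # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # projection onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep a valid pixel range
    return x_adv

# Usage: the L2 norm of the perturbation is one way to compare attacks,
# as the abstract does for DeepFool and CW.
x = np.array([0.5, 0.5, 0.5])
w = np.array([2.0, -1.0, 0.5])
x_adv = pgd_linf(x, y=1.0, w=w, b=0.0)
l2_distance = np.linalg.norm(x_adv - x)
```

The projection step is what distinguishes PGD from simply repeating a gradient-sign step: the perturbation never leaves the chosen budget `eps`, however many iterations are run.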

Simple Iterative Method for Generating Targeted Universal Adversarial Perturbations

Algorithms ◽

10.3390/a13110268 ◽

2020 ◽

Vol 13 (11) ◽

pp. 268 ◽

Cited By ~ 1

Author(s):

Hokuto Hirano ◽

Kazuhiro Takemoto

Keyword(s):

Neural Networks ◽

Iterative Method ◽

Image Classification ◽

Deep Neural Networks ◽

State Of The Art ◽

Specific Class ◽

Targeted Attacks ◽

Fast Gradient ◽

Classification Tasks ◽

Sign Method

Deep neural networks (DNNs) are vulnerable to adversarial attacks. In particular, a single perturbation known as the universal adversarial perturbation (UAP) can foil most classification tasks conducted by DNNs. Thus, different methods for generating UAPs are required to fully evaluate the vulnerability of DNNs. A realistic evaluation should consider targeted attacks, in which the generated UAP causes the DNN to classify an input into a specific class. However, the development of UAPs for targeted attacks has largely fallen behind that of UAPs for non-targeted attacks. Therefore, we propose a simple iterative method to generate UAPs for targeted attacks. Our method combines the simple iterative method for generating non-targeted UAPs with the fast gradient sign method for generating a targeted adversarial perturbation for an input. We applied the proposed method to state-of-the-art DNN models for image classification and proved the existence of almost imperceptible UAPs for targeted attacks; further, we demonstrated that such UAPs can be easily generated.
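The combination described in this abstract, an iterative pass over the data with a targeted fast-gradient-sign step per input, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' algorithm: the classifier is a hypothetical softmax-linear model with weights `W` and bias `b`, the perturbation budget is an L-infinity bound `xi`, and the function name and parameters are invented for this example.

```python
import numpy as np

def softmax(z):
    # Softmax for a single logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def targeted_uap(X, W, b, target, xi=0.2, step=0.02, epochs=10):
    """Build one shared perturbation rho that pushes inputs toward `target`.

    For every input not yet classified as `target`, take a targeted
    fast-gradient-sign step (descending the loss toward the target class),
    add it to rho, and clip rho back into the L-inf ball of radius xi.
    """
    rho = np.zeros(X.shape[1])
    onehot = np.eye(W.shape[1])[target]
    for _ in range(epochs):
        for x in X:
            logits = (x + rho) @ W + b
            if logits.argmax() == target:
                continue                          # already fooled by rho
            grad = W @ (softmax(logits) - onehot)  # d(-log p_target)/dx
            rho = np.clip(rho - step * np.sign(grad), -xi, xi)
    return rho
```

Because `rho` is shared across all inputs, a single vector is returned regardless of dataset size, which is what makes the perturbation universal.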

A Kernel Perspective for the Decision Boundary of Deep Neural Networks

2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI) ◽

10.1109/ictai50040.2020.00105 ◽

2020 ◽

Author(s):

Yifan Zhang ◽

Shizhong Liao

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Decision Boundary

Natural Scene Statistics for Detecting Adversarial Examples in Deep Neural Networks

2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP) ◽

10.1109/mmsp48831.2020.9287056 ◽

2020 ◽

Author(s):

Anouar Kherchouche ◽

Sid Ahmed Fezza ◽

Wassim Hamidouche ◽

Olivier Deforges

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Natural Scene ◽

Natural Scene Statistics ◽

Adversarial Examples

Hybrid deep neural networks to infer state models of black-box systems

Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering ◽

10.1145/3324884.3416559 ◽

2020 ◽

Author(s):

Mohammad Jafar Mashhadi ◽

Hadi Hemmati

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Black Box ◽

State Models

Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, as well as pattern analysis and intrusion detection, they perform poorly on adversarial examples. Adding a small amount of noise to the original data can produce adversarial examples that are misclassified by deep neural networks, even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks is demonstrated. In this approach, the target model is more robust to unknown adversarial examples because it is trained on a variety of adversarial samples. In the experiments, TensorFlow was employed as the deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original data of MNIST and Fashion-MNIST, respectively.
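One way to picture training on a variety of adversarial samples is to augment the training set with adversarial examples of several strengths before fitting the model. The sketch below is only an assumption about what such diversity could look like: the abstract does not say how the samples are generated, so this example uses the fast gradient sign method at several budgets against a toy logistic model, with NumPy rather than the TensorFlow setup the paper reports. All names here are hypothetical.

```python
import numpy as np

def fgsm_logistic(X, y, w, b, eps):
    # One-step gradient-sign attack on an assumed logistic model
    # p = sigmoid(Xw + b) with labels y in {0, 1}.
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = (p - y)[:, None] * w[None, :]      # d(cross-entropy)/dX, shape (n, d)
    return np.clip(X + eps * np.sign(grad), 0.0, 1.0)

def diverse_adversarial_set(X, y, w, b, eps_list=(0.05, 0.1, 0.2)):
    """Stack the clean data with adversarial copies at several strengths.

    The adversarial copies keep the original labels, so a model trained on
    the result sees each sample both clean and perturbed to varying degrees.
    """
    parts_X = [X] + [fgsm_logistic(X, y, w, b, eps) for eps in eps_list]
    parts_y = [y] * (1 + len(eps_list))
    return np.vstack(parts_X), np.concatenate(parts_y)
```

Other sources of diversity, such as different attack algorithms or target classes, would slot into `parts_X` in the same way.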
