A Black-Box Attack Method against Machine-Learning-Based Anomaly Network Flow Detection Models

Security and Communication Networks ◽

10.1155/2021/5578335 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Sensen Guo ◽

Jinxiong Zhao ◽

Xiaoyu Li ◽

Junhong Duan ◽

Dejun Mu ◽

...

Keyword(s):

Machine Learning ◽

Language Processing ◽

Network Flow ◽

Black Box ◽

Target Model ◽

Detection Algorithms ◽

Machine Learning Model ◽

Tremendous Progress ◽

Adversarial Examples ◽

Flow Detection

In recent years, machine learning has made tremendous progress in computer vision, natural language processing, and cybersecurity. However, machine learning models are vulnerable to adversarial examples: minor malicious modifications to an input, which appear unmodified to human observers, can easily mislead the output of a machine-learning-based model. Likewise, attackers can generate adversarial examples to bypass machine-learning-based security defenses and attack systems in real time. In this paper, we propose a black-box attack method against machine-learning-based anomaly network flow detection algorithms. Our attack strategy consists of training another model to substitute for the target machine learning model. Based on full knowledge of the substitute model and the migration (transferability) of adversarial examples, we use the substitute model to craft adversarial examples. Experiments show that our method can attack the target model effectively. We attack several kinds of network flow detection models based on different machine learning methods and find that the adversarial examples crafted by our method bypass the detection of the target model with high probability.
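The substitute-model strategy described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hidden linear detector `target_predict`, the logistic-regression substitute, the four synthetic features, and the sign-based perturbation step are all assumptions chosen to keep the example small. Real network-flow features are usually constrained, so a practical attack could not perturb them this freely.

```python
import math
import random

random.seed(0)

# Hypothetical black-box detector: the attacker may call it but not inspect it.
_HIDDEN_W = [1.5, -2.0, 0.7, 1.1]
_HIDDEN_B = -0.3

def target_predict(x):
    """Return 1 (anomalous) or 0 (benign)."""
    score = sum(w * v for w, v in zip(_HIDDEN_W, x)) + _HIDDEN_B
    return 1 if score > 0 else 0

def train_substitute(samples, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression substitute on labels obtained by querying the target."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def craft_adversarial(x, w, eps):
    """Move each feature against the sign of the substitute's weight (FGSM-style step)."""
    return [xi - eps * (1 if wi > 0 else -1 if wi < 0 else 0) for xi, wi in zip(x, w)]

# 1) Query the target to label synthetic samples, 2) train the substitute,
# 3) craft adversarial examples on the substitute, 4) replay them on the target.
queries = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(300)]
labels = [target_predict(x) for x in queries]
sub_w, sub_b = train_substitute(queries, labels)

flagged = [x for x in queries if target_predict(x) == 1]
adversarial = [craft_adversarial(x, sub_w, eps=0.5) for x in flagged]
bypass_rate = sum(target_predict(x) == 0 for x in adversarial) / len(flagged)
print(f"share of flagged samples that bypass the target: {bypass_rate:.2f}")
```

The attack only works to the extent that the substitute's decision boundary resembles the target's, which is why the examples are always re-checked against the target itself.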


MODES: model-based optimization on distributed embedded systems

Machine Learning ◽

10.1007/s10994-021-06014-6 ◽

2021 ◽

Author(s):

Junjie Shi ◽

Jiang Bian ◽

Jakob Richter ◽

Kuan-Hsun Chen ◽

Jörg Rahnenführer ◽

...

Keyword(s):

Machine Learning ◽

Embedded Systems ◽

Learning Model ◽

Black Box ◽

Distributed Embedded Systems ◽

Data Set ◽

Individual Model ◽

Model Based ◽

Machine Learning Model ◽

Distributed Machine Learning

The predictive performance of a machine learning model depends heavily on its hyper-parameter setting. Hence, hyper-parameter tuning is often indispensable. Normally, such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low-bandwidth connections, the time available for tuning is reduced. Model-Based Optimization (MBO) is one state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning has received little research. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.


Word prediction in computational historical linguistics

Journal of Language Modelling ◽

10.15398/jlm.v8i2.268 ◽

2021 ◽

Vol 8 (2) ◽

Author(s):

Peter Dekker ◽

Willem Zuidema

Keyword(s):

Machine Learning ◽

Language Processing ◽

Historical Linguistics ◽

Data Representation ◽

Target Language ◽

Prediction Methods ◽

Word Prediction ◽

Tree Reconstruction ◽

Source Language ◽

Machine Learning Model

In this paper, we investigate how the prediction paradigm from machine learning and Natural Language Processing (NLP) can be put to use in computational historical linguistics. We propose word prediction as an intermediate task, where the forms of unseen words in some target language are predicted from the forms of the corresponding words in a source language. Word prediction allows us to develop algorithms for phylogenetic tree reconstruction, sound correspondence identification and cognate detection, in ways close to attested methods for linguistic reconstruction. We will discuss different factors, such as data representation and the choice of machine learning model, that have to be taken into account when applying prediction methods in historical linguistics. We present our own implementations and evaluate them on different tasks in historical linguistics.


Textual Adversarial Attacking with Limited Queries

Electronics ◽

10.3390/electronics10212671 ◽

2021 ◽

Vol 10 (21) ◽

pp. 2671

Author(s):

Yu Zhang ◽

Junan Yang ◽

Xiaoshuai Li ◽

Hui Liu ◽

Kun Shao

Keyword(s):

Language Processing ◽

Main Idea ◽

Local Model ◽

Small Perturbations ◽

Target Model ◽

Word Level ◽

Sentence Level ◽

Adversarial Examples ◽

Reducing Costs ◽

The Cost

Recent studies have shown that natural language processing (NLP) models are vulnerable to adversarial examples, which are maliciously crafted by adding small perturbations, imperceptible to the human eye, to benign inputs, leading to false predictions by the target model. Compared to character- and sentence-level textual adversarial attacks, word-level attacks can generate higher-quality adversarial examples, especially in a black-box setting. However, existing attack methods usually require a huge number of queries to successfully deceive the target model, which is costly in a real adversarial scenario. Hence, finding appropriate models is difficult. Therefore, we propose a novel attack method, the main idea of which is to fully utilize the adversarial examples generated by the local model and transfer part of the attack to the local model to complete ahead of time, thereby reducing the cost of attacking the target model. Extensive experiments conducted on three public benchmarks show that our attack method not only improves the success rate but also reduces the cost, outperforming the baselines by a significant margin.


Fenix: A Semantic Search Engine Based on an Ontology and a Model Trained with Machine Learning to Support Research

10.5121/csit.2021.110709 ◽

2021 ◽

Author(s):

Felipe Cujar-Rosero ◽

David Santiago Pinchao Ortiz ◽

Silvio Ricardo Timaran Pereira ◽

Jimmy Mateo Guerrero Restrepo

Keyword(s):

Machine Learning ◽

Virtual Environment ◽

Search Engine ◽

Language Processing ◽

Machine Learning Algorithms ◽

Semantic Search ◽

Research Projects ◽

Machine Learning Model ◽

The University ◽

Semantic Search Engine

This paper presents the final results of a research project that aimed to build a semantic search engine that uses an ontology and a model trained with machine learning to support the semantic search of research projects in the System of Research of the University of Nariño. The construction of FENIX, as this engine is called, followed a methodology with these stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the semantic search engine. The work produced three main results: a) the complete construction of the ontology with classes, object properties (predicates), data properties (attributes) and individuals (instances) in Protégé, SPARQL queries with Apache Jena Fuseki, and the respective coding with Owlready2 using Jupyter Notebook with Python within an Anaconda virtual environment; b) the successful training of the model, using machine learning and specifically natural language processing tools such as spaCy, NLTK, Word2vec and Doc2vec, also in Jupyter Notebook with Python within an Anaconda virtual environment and with Elasticsearch; and c) the creation of FENIX, which manages and unifies the queries for the ontology and for the machine learning model. The tests showed that FENIX was successful in all the searches that were carried out because its results were satisfactory.


Natural language processing and entrustable professional activity text feedback in surgery: A machine learning model of resident autonomy

The American Journal of Surgery ◽

10.1016/j.amjsurg.2020.11.044 ◽

2020 ◽

Author(s):

Christopher C. Stahl ◽

Sarah A. Jung ◽

Alexandra A. Rosser ◽

Aaron S. Kraut ◽

Benjamin H. Schnapp ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Model ◽

Professional Activity ◽

Entrustable Professional Activity ◽

Machine Learning Model


Development of a Machine Learning Model for Knowledge Acquisition, Relationship Extraction and Discovery in Domain Ontology Engineering using Jaccord Relationship Extraction and Neural Network

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6362.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 7809-7817

Keyword(s):

Neural Network ◽

Machine Learning ◽

Knowledge Acquisition ◽

Language Processing ◽

Learning Model ◽

Heterogeneous Data ◽

Relationship Extraction ◽

Machine Learning Model ◽

Proposed Model ◽

Domain Independent

Creating a fast, domain-independent ontology through knowledge acquisition is a key problem in knowledge engineering. Updating and validation are impossible without the intervention of domain experts, which is an expensive and tedious process. An automatic system to model the ontology has therefore become essential. This manuscript presents a machine learning model based on heterogeneous data from multiple domains, including agriculture, health care, food, and banking. The proposed model creates a complete domain-independent process that helps populate the ontology automatically by extracting text from multiple sources using natural language processing and various data extraction techniques. The ontology instances are classified based on the domain. A Jaccord Relationship extraction process and the Neural Network Approval for Automated Theory are used for data retrieval, automated indexing, mapping, knowledge discovery, and rule generation. The results show that the proposed model can construct the ontology automatically and efficiently.


Local Post-hoc Explainable Methods for Adversarial Text Attacks

10.36227/techrxiv.17185568.v1 ◽

2021 ◽

Author(s):

Yidong Chai ◽

Ruicheng Liang ◽

Hongyi Zhu ◽

Sagar Samtani ◽

Meng Wang ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Language Processing ◽

Black Box ◽

Learning Models ◽

Two Phase ◽

Sensitivity Estimation ◽

Execution Phase ◽

Adversarial Examples ◽

Post Hoc

Deep learning models have significantly advanced various natural language processing tasks. However, they are strikingly vulnerable to adversarial text attacks, even in the black-box setting where no model knowledge is accessible to hackers. Such attacks are conducted with a two-phase framework: 1) a sensitivity estimation phase to evaluate each element’s sensitivity to the target model’s prediction, and 2) a perturbation execution phase to craft the adversarial examples based on estimated element sensitivity. This study explored the connections between the local post-hoc explainable methods for deep learning and black-box adversarial text attacks and proposed a novel eXplanation-based method for crafting Adversarial Text Attacks (XATA). XATA leverages local post-hoc explainable methods (e.g., LIME or SHAP) to measure input elements’ sensitivity and adopts the word replacement perturbation strategy to craft adversarial examples. We evaluated the attack performance of the proposed XATA on three commonly used text-based datasets: IMDB Movie Review, Yelp Reviews-Polarity, and Amazon Reviews-Polarity. The proposed XATA outperformed existing baselines in various target models, including LSTM, GRU, CNN, and BERT. Moreover, we found that improved local post-hoc explainable methods (e.g., SHAP) lead to more effective adversarial attacks. These findings showed that when researchers constantly advance the explainability of deep learning models with local post-hoc methods, they also provide hackers with weapons to craft more targeted and dangerous adversarial attacks.
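The two-phase framework in this abstract (sensitivity estimation, then perturbation execution) can be sketched in a few lines. This is an illustrative toy, not XATA: leave-one-out scoring stands in for LIME or SHAP, and the lexicon-based `target_score` model and the `SYNONYMS` table are hypothetical.

```python
# Hypothetical toy sentiment model: the attacker only observes its output score.
_LEXICON = {"great": 2.0, "good": 1.0, "fine": 0.4, "bad": -1.0, "awful": -2.0}

def target_score(words):
    """A score above zero means the review is predicted positive."""
    return sum(_LEXICON.get(w, 0.0) for w in words)

def estimate_sensitivity(words):
    """Phase 1: leave-one-out contribution of each position (stand-in for LIME/SHAP)."""
    base = target_score(words)
    return [base - target_score(words[:i] + words[i + 1:]) for i in range(len(words))]

# Hypothetical replacement candidates for the word-replacement strategy.
SYNONYMS = {"great": ["fine", "decent"], "good": ["fine", "okay"]}

def attack(words, max_edits=2):
    """Phase 2: replace the most sensitive words until the predicted label flips."""
    words = list(words)
    original_positive = target_score(words) > 0
    sensitivity = estimate_sensitivity(words)
    order = sorted(range(len(words)), key=lambda i: abs(sensitivity[i]), reverse=True)
    for i in order[:max_edits]:
        candidates = SYNONYMS.get(words[i], [])
        if not candidates:
            continue

        def score_with(candidate):
            return target_score(words[:i] + [candidate] + words[i + 1:])

        # Choose the candidate that pushes the score furthest away from the original label.
        pick = min if original_positive else max
        words[i] = pick(candidates, key=score_with)
        if (target_score(words) > 0) != original_positive:
            break
    return words

review = ["great", "movie", "but", "bad", "ending"]
adversarial_review = attack(review)
print(adversarial_review, target_score(adversarial_review))
```

Ranking positions before editing keeps the number of changed words small, which is the point of estimating sensitivity first.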


Interpretable machine learning with reject option

at - Automatisierungstechnik ◽

10.1515/auto-2017-0123 ◽

2018 ◽

Vol 66 (4) ◽

pp. 283-290 ◽

Cited By ~ 7

Author(s):

Johannes Brinkrolf ◽

Barbara Hammer

Keyword(s):

Machine Learning ◽

Vector Quantization ◽

Random Forests ◽

Black Box ◽

Learning Models ◽

Process Automation ◽

Reject Option ◽

Interpretable Machine Learning ◽

Adversarial Examples ◽

Machine Learning Models

Classification by means of machine learning models constitutes one relevant technology in process automation and predictive maintenance. However, common techniques such as deep networks or random forests suffer from their black-box characteristics and possible adversarial examples. In this contribution, we give an overview of a popular alternative technology from machine learning, namely modern variants of learning vector quantization (LVQ), which, due to their combined discriminative and generative nature, incorporate interpretability and the possibility of explicit reject options for irregular samples. We give an explicit bound on the minimum changes required for a change of the classification in the case of LVQ networks with reject option, and we demonstrate the efficiency of reject options in two examples.
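A reject option for a prototype-based classifier can be sketched as follows. The abstract does not say which certainty measure the authors use; this example assumes a relative-similarity measure, (d⁻ − d⁺) / (d⁻ + d⁺), where d⁺ is the squared distance to the closest prototype and d⁻ the squared distance to the closest prototype of a different class. The prototypes, labels, and threshold are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Hypothetical, already-trained prototypes: (position, class label).
PROTOTYPES = [((0.0, 0.0), "normal"), ((4.0, 4.0), "fault")]

def classify_with_reject(x, threshold=0.3):
    """Return (label or 'reject', certainty) for sample x."""
    best_pos, best_label = min(PROTOTYPES, key=lambda p: sq_dist(x, p[0]))
    d_plus = sq_dist(x, best_pos)
    d_minus = min(sq_dist(x, pos) for pos, label in PROTOTYPES if label != best_label)
    total = d_plus + d_minus
    # Certainty is near 1 close to a prototype and near 0 at the class border.
    certainty = (d_minus - d_plus) / total if total > 0 else 0.0
    return (best_label if certainty >= threshold else "reject"), certainty

print(classify_with_reject((0.5, 0.5)))  # close to the "normal" prototype
print(classify_with_reject((2.0, 2.1)))  # near the decision border
```

Because the decision depends only on distances to a few labelled prototypes, both the predicted class and the reason for a rejection can be read off directly, which is the interpretability argument the abstract makes.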


Argot: Generating Adversarial Readable Chinese Texts

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/351 ◽

2020 ◽

Author(s):

Zihan Zhang ◽

Mingxuan Liu ◽

Chao Zhang ◽

Yiming Zhang ◽

Zhou Li ◽

...

Keyword(s):

Image Processing ◽

Natural Language Processing ◽

Success Rate ◽

Language Processing ◽

Black Box ◽

Essential Step ◽

Chinese Characteristics ◽

Adversarial Examples ◽

Chinese Texts ◽

Chinese And English

Natural language processing (NLP) models are known to be vulnerable to adversarial examples, similar to image processing models. Studying adversarial texts is an essential step toward improving the robustness of NLP models. However, existing studies mainly focus on analyzing English texts and generating adversarial examples for English texts. There is no work studying the possibility and effect of transferring these methods to another language, e.g., Chinese. In this paper, we analyze the differences between Chinese and English and explore how to adapt the existing English adversarial generation method to Chinese. We propose Argot, a novel black-box solution for generating adversarial Chinese texts, which combines the method for adversarial English samples with several novel methods developed for Chinese characteristics. Argot can effectively and efficiently generate adversarial Chinese texts with good readability. Furthermore, Argot can also automatically generate targeted Chinese adversarial texts, achieving a high success rate while ensuring readability.


Generating adversarial examples without specifying a target model

PeerJ Computer Science ◽

10.7717/peerj-cs.702 ◽

2021 ◽

Vol 7 ◽

pp. e702

Author(s):

Gaoming Yang ◽

Mingwei Li ◽

Xianjing Fang ◽

Ji Zhang ◽

Xingzhu Liang

Keyword(s):

Deep Learning ◽

Success Rate ◽

Black Box ◽

Time Cost ◽

Learning Models ◽

Security Threat ◽

Practical Situation ◽

Data Set ◽

Target Model ◽

Adversarial Examples

Adversarial examples are regarded as a security threat to deep learning models, and there are many ways to generate them. However, most existing methods need to query the target model while generating them. In a more practical situation, an attacker who issues too many queries is easily detected, and this problem is especially pronounced in the black-box setting. To solve this problem, we propose the Attack Without a Target Model (AWTM). Our algorithm does not specify any target model when generating adversarial examples, so it does not need to query the target. Experimental results show that it achieved a maximum attack success rate of 81.78% on the MNIST data set and 87.99% on the CIFAR-10 data set. In addition, it has a low time cost because it is a GAN-based method.
