Robustness Evaluations of Sustainable Machine Learning Models against Data Poisoning Attacks in the Internet of Things

Corey Dunn; Nour Moustafa; Benjamin Turnbull

doi:10.3390/su12166434

Sustainability ◽

10.3390/su12166434 ◽

2020 ◽

Vol 12 (16) ◽

pp. 6434 ◽

Cited By ~ 1

Author(s):

Corey Dunn ◽

Nour Moustafa ◽

Benjamin Turnbull

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Large Scale ◽

Gradient Boosting ◽

The Internet ◽

Learning Models ◽

Detection Rates ◽

Ongoing Research ◽

The Internet Of Things ◽

Machine Learning Models

With the increasing popularity of Internet of Things (IoT) platforms, the cyber security of these platforms is a highly active area of research. One key technology underpinning smart IoT systems is machine learning, which classifies and predicts events from large-scale data in IoT networks. Machine learning is susceptible to cyber attacks, particularly data poisoning attacks that inject false data when training machine learning models. Data poisoning attacks degrade the performance of machine learning models. It is an ongoing research challenge to develop trustworthy machine learning models that are resilient and sustainable against data poisoning attacks in IoT networks. We studied the effects of data poisoning attacks on machine learning models, including the gradient boosting machine, random forest, naive Bayes, and feed-forward deep learning, to determine the extent to which the models can be trusted and considered reliable in real-world IoT settings. In the training phase, a label modification function is developed to manipulate legitimate input classes. The function is applied at data poisoning rates of 5%, 10%, 20%, and 30%, which allows the poisoned models to be compared and their performance degradation to be shown. The machine learning models were evaluated using the ToN_IoT and UNSW NB-15 datasets, as they include a wide variety of recent legitimate and attack vectors. The experimental results revealed that the models’ performance degrades, in terms of accuracy and detection rates, if the number of normal training observations is not significantly larger than the number of poisoned observations. At data poisoning rates of 30% or greater on input data, machine learning performance is significantly degraded.
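The abstract does not give the label modification function itself, so the following is only a minimal sketch of one common way to poison labels at a fixed rate: pick a random subset of training labels and reassign each to a different class. The function name `poison_labels`, the `seed` parameter, and the two-class toy data are assumptions for illustration, not the authors' implementation.

```python
import random


def poison_labels(labels, rate, classes, seed=0):
    """Return a copy of `labels` with round(rate * len(labels)) entries
    reassigned to a different class (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    poisoned = list(labels)
    n_flip = round(rate * len(labels))
    # Choose distinct positions to poison, then move each to another class.
    for i in rng.sample(range(len(labels)), n_flip):
        alternatives = [c for c in classes if c != poisoned[i]]
        poisoned[i] = rng.choice(alternatives)
    return poisoned


# Toy training labels (assumed data): 60 normal and 40 attack records.
labels = ["normal"] * 60 + ["attack"] * 40
for rate in (0.05, 0.10, 0.20, 0.30):
    flipped = poison_labels(labels, rate, classes=("normal", "attack"))
    changed = sum(a != b for a, b in zip(labels, flipped))
    print(f"rate={rate:.0%}: {changed} of {len(labels)} labels changed")
```

Training the same model once on `labels` and once on each poisoned copy, then scoring both on clean test data, gives the kind of side-by-side degradation comparison the abstract describes.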


Application of Traditional Machine Learning Models to Detect Abnormal Traffic in the Internet of Things Networks

10.1007/978-3-030-88081-1_55 ◽

2021 ◽

pp. 735-744

Author(s):

Evgeniya Istratova ◽

Mikhail Grif ◽

Dmitry Dostovalov

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

The Internet ◽

Learning Models ◽

The Internet Of Things ◽

Machine Learning Models


On the Performance of Machine Learning Models for Anomaly-Based Intelligent Intrusion Detection Systems for the Internet of Things

IEEE Internet of Things Journal ◽

10.1109/jiot.2021.3103829 ◽

2021 ◽

pp. 1-1

Author(s):

Ghada Abdelmoumin ◽

Danda B. Rawat ◽

Abdul Rahman

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

The Internet ◽

Learning Models ◽

Detection Systems ◽

The Internet Of Things ◽

Machine Learning Models


Edge Machine Learning for AI-Enabled IoT Devices: A Review

Sensors ◽

10.3390/s20092533 ◽

2020 ◽

Vol 20 (9) ◽

pp. 2533 ◽

Cited By ~ 6

Author(s):

Massimo Merenda ◽

Carlo Porcaro ◽

Demetrio Iero

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Machine Learning Algorithms ◽

The Internet ◽

Learning Models ◽

Iot Devices ◽

High Level ◽

And Behavior ◽

The Internet Of Things ◽

Machine Learning Models

In a few years, the world will be populated by billions of connected devices that will be placed in our homes, cities, vehicles, and industries. Devices with limited resources will interact with the surrounding environment and users. Many of these devices will rely on machine learning models to decode the meaning and behavior behind sensors’ data, to make accurate predictions, and to make decisions. The bottleneck will be the large number of connected things, which could congest the network. Hence, there is a need to incorporate intelligence on end devices using machine learning algorithms. Deploying machine learning on such edge devices reduces network congestion by allowing computations to be performed close to the data sources. The aim of this work is to provide a review of the main techniques that guarantee the execution of machine learning models on low-performance hardware in the Internet of Things paradigm, paving the way to the Internet of Conscious Things. In this work, a detailed review of models, architectures, and requirements for solutions that implement edge machine learning on Internet of Things devices is presented, with the main goal of defining the state of the art and envisioning development requirements. Furthermore, an example of an edge machine learning implementation on a microcontroller, commonly regarded as the machine learning “Hello World”, is provided.


An Intrusion Detection System for the Internet of Things Based on Machine Learning: Review and Challenges

Symmetry ◽

10.3390/sym13061011 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1011

Author(s):

Ahmed Adnan ◽

Abdullah Muhammed ◽

Abdul Azim Abd Ghani ◽

Azizol Abdullah ◽

Fahrul Hakim

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Concept Drift ◽

The Internet ◽

Ongoing Research ◽

Active Research ◽

The Internet Of Things

Intrusion detection systems (IDSs) are an active research topic and are regarded as one of the important applications of machine learning. An IDS is a classifier that predicts the class of input records associated with certain types of attacks. In this article, we present a review of IDSs from the perspective of machine learning. We present the three main challenges of an IDS in general, and of an IDS for the Internet of Things (IoT) in particular, namely concept drift, high dimensionality, and computational complexity. Studies on solving each challenge and the direction of ongoing research are addressed. In addition, we dedicate a separate section to IDS datasets. In particular, three main datasets, namely KDD99, NSL, and Kyoto, are presented. This article concludes that three elements, namely concept drift, high-dimensional awareness, and computational awareness, are symmetric in their effect and need to be addressed in neural network (NN)-based models for an IDS in the IoT.


Design of Machine Learning Prediction System Based on the Internet of Things Framework for Monitoring Fine PM Concentrations

Environments ◽

10.3390/environments8100099 ◽

2021 ◽

Vol 8 (10) ◽

pp. 99

Author(s):

Shun-Yuan Wang ◽

Wen-Bin Lin ◽

Yu-Chieh Shu

Keyword(s):

Machine Learning ◽

Air Pollution ◽

Particulate Matter ◽

Random Forest ◽

Internet Of Things ◽

Random Forest Model ◽

The Internet ◽

Learning Models ◽

Forest Model ◽

The Internet Of Things

In this study, a mobile air pollution sensing unit based on the Internet of Things framework was designed for monitoring the concentration of fine particulate matter in three urban areas. This unit was developed using the NodeMCU-32S microcontroller, PMS5003-G5 (particulate matter sensing module), and Ublox NEO-6M V2 (GPS positioning module). The sensing unit transmits the particulate matter concentration and the coordinates of a polluted location to the backend server through 3G and 4G telecommunication networks for data collection. This system will complement the government’s PM2.5 data acquisition system. Mobile monitoring stations meet the air pollution monitoring needs of areas that require special observation. For example, an AIoT development system installed at intersections with intensive traffic can be used as a reference by government transportation departments or environmental inspection departments for environmental quality monitoring or for diverting traffic flow. Furthermore, the particulate matter distributions in three areas, namely Xinzhuang, Sanchong, and Luzhou Districts, which are all in New Taipei City, Taiwan, were estimated using machine learning models, the data of stationary monitoring stations, and the measurements of the mobile sensing system proposed in this study. Four types of learning models were trained, namely the decision tree, random forest, multilayer perceptron, and radial basis function neural network, and their prediction results were evaluated. The root mean square error was used as the performance indicator, and the learning results indicate that the random forest model outperforms the other models on both the training and testing sets. To examine the generalizability of the learning models, the models were verified against data measured on three days: 15 February, 28 February, and 1 March 2019. A comparison between the model predictions and the measured data indicates that the random forest model provides the most stable and accurate prediction values and can clearly present the distribution of highly polluted areas. The results of these models are visualized in the form of maps by using a web application. The maps allow users to understand the distribution of polluted areas intuitively.
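As a brief illustration of the performance indicator named above, the root mean square error between predicted and measured concentrations can be computed as in this sketch. The function name `rmse` and the sample values are assumptions for illustration; they are not data from the study.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up PM2.5 readings in micrograms per cubic metre (not from the study).
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 31.0]
print(f"RMSE = {rmse(predicted, measured):.3f}")
```

A lower value means the predictions sit closer to the measurements, which is the sense in which one model is said to outperform another here.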


A machine-learning-based hardware-Trojan detection approach for chips in the Internet of Things

International Journal of Distributed Sensor Networks ◽

10.1177/1550147719888098 ◽

2019 ◽

Vol 15 (12) ◽

pp. 155014771988809 ◽

Cited By ~ 4

Author(s):

Chen Dong ◽

Jinghui Chen ◽

Wenzhong Guo ◽

Jian Zou

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Detection Methods ◽

Gradient Boosting ◽

The Internet ◽

Hardware Trojan ◽

Hardware Trojan Detection ◽

Trojan Detection ◽

Extreme Gradient Boosting ◽

The Internet Of Things

With the development of the Internet of Things, smart devices are widely used. Hardware security is one key issue in the security of the Internet of Things. As the integrated circuit is the core component of the hardware, its security must be taken seriously. Pre-silicon detection methods do not require gold chips, are not affected by process noise, and are suitable for the security inspection of very-large-scale integration. Therefore, more and more researchers are paying attention to pre-silicon detection methods. In this study, we propose a machine-learning-based hardware-Trojan detection method at the gate level. First, we put forward new Trojan-net features. After that, we use the scoring mechanism of eXtreme Gradient Boosting to set up a new effective feature set of 49 out of 56 features. Finally, the hardware-Trojan classifier was trained on the new feature set with the eXtreme Gradient Boosting algorithm and then used for detection. The experimental results show that the proposed method obtains 89.84% average Recall, 86.75% average F-measure, and 99.83% average Accuracy, which is the best detection result among existing machine-learning-based hardware-Trojan detection methods.
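The three reported metrics measure different things, and accuracy can sit well above recall when the positive class is rare. The sketch below computes all three from confusion-matrix counts. The function name `detection_metrics` and the counts are hypothetical, chosen only to show the effect; they are not the paper's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (recall, f_measure, accuracy) for the positive (Trojan) class.

    Assumes tp + fn > 0 and tp + fp > 0 so that no division by zero occurs.
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, f_measure, accuracy


# Hypothetical counts: 100 Trojan nets among 10,000 nets in total.
recall, f_measure, accuracy = detection_metrics(tp=90, fp=15, tn=9885, fn=10)
print(f"recall={recall:.4f} f_measure={f_measure:.4f} accuracy={accuracy:.4f}")
```

With these assumed counts, the many correctly classified normal nets dominate accuracy, while recall and F-measure reflect only how well the rare Trojan nets are found.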


Epigenetic Target Prediction with Accurate Machine Learning Models

10.26434/chemrxiv.13522313 ◽

2021 ◽

Author(s):

Norberto Sánchez-Cruz ◽

Jose L. Medina-Franco

Keyword(s):

Machine Learning ◽

Small Molecules ◽

Predictive Models ◽

Large Scale ◽

Target Prediction ◽

Quantitative Measure ◽

Learning Models ◽

Discovery Research ◽

Drug Discovery Research ◽

Machine Learning Models

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.


Detecting Abnormal Behavior of an IoT Device in the Network Based on a Traffic Model

Telecom IT ◽

10.31854/2307-1303-2019-7-3-50-55 ◽

2019 ◽

Vol 7 (3) ◽

pp. 50-55

Author(s):

D. Saharov ◽

D. Kozlov

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Mobile Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Abnormal Behavior ◽

Traffic Model ◽

The Internet ◽

Wide Spread ◽

The Internet Of Things

The article deals with the CoAP protocol, which regulates the transmission and reception of information traffic by terminal devices in IoT networks. The article describes a model for detecting abnormal traffic in 5G/IoT networks using machine learning algorithms, as well as the main methods for solving this problem. The relevance of the article is due to the wide spread of the Internet of Things and the upcoming upgrade of mobile networks to the 5G generation.


Utilizing Blockchain Technology in Social Media Bot Identification

10.36227/techrxiv.12049374 ◽

2020 ◽

Author(s):

Shreya Reddy ◽

Lisa Ewen ◽

Pankti Patel ◽

Prerak Patel ◽

Ankit Kundal ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Gold Standard ◽

The Internet ◽

Learning Models ◽

Current Time ◽

Machine Learning Methods ◽

Blockchain Technology ◽

Modern Age ◽

Machine Learning Models

As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has indicated that machine learning methods are accurate and the gold standard of bot identification on social media. Unfortunately, machine learning models come with drawbacks such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we propose a blockchain framework for bot identification. At the current time, it is unknown how this method will perform, but it serves to demonstrate that a substantial research gap exists in this area.


Open Issues and Security Challenges of Data Communication Channels in Distributed Internet of Things (IoT): A Survey

Circulation in Computer Science ◽

10.22632/ccs-2017-252-63 ◽

2018 ◽

Vol 3 (1) ◽

pp. 22-32 ◽

Cited By ~ 4

Author(s):

Ernest Ezema ◽

Azizol Abdullah ◽

Nor Fazlida Binti Mohd

Keyword(s):

Internet Of Things ◽

Radio Frequency Identification ◽

Industrial Revolution ◽

Security And Privacy ◽

The Internet ◽

Ongoing Research ◽

Advantages And Disadvantages ◽

Exchange Information ◽

The Internet Of Things ◽

Open Issues

The concept of the Internet of Things (IoT) has evolved over time. The introduction of the Internet of Things and Services into the manufacturing environment has ushered in a fourth industrial revolution: Industry 4.0. There is no doubt that the world is undergoing constant transformations that change the trajectory and history of humanity. We can illustrate this with the first and second industrial revolutions and the information revolution. IoT is a paradigm based on the internet that comprises many interconnected technologies, such as RFID (Radio Frequency Identification) and WSAN (Wireless Sensor and Actor Networks), to exchange information. The current needs for better control, monitoring, and management in many areas, and the ongoing research in this field, have led to the creation of multiple systems such as smart-home, smart-city, and smart-grid. IoT services can have a centralized or a distributed architecture. In the centralized approach, central entities acquire, process, and provide information, while in the distributed architecture, entities at the edge of the network exchange information and collaborate with each other in a dynamic way. To understand the two approaches, it is necessary to know their advantages and disadvantages, especially in terms of security and privacy issues. This paper shows that the distributed approach has various challenges that need to be solved, but also various interesting properties and strengths. In this paper we present the main research challenges and the existing solutions in the field of IoT security, identifying open issues, discussing the industrial revolution, and suggesting some hints for future research.
