An LSTM-Based Deep Learning Approach for Classifying Malicious Traffic at the Packet Level

2019 ◽  
Vol 9 (16) ◽  
pp. 3414 ◽  
Author(s):  
Ren-Hung Hwang ◽  
Min-Chun Peng ◽  
Van-Linh Nguyen ◽  
Yu-Lun Chang

Recently, deep learning has been successfully applied to network security assessments and intrusion detection systems (IDSs), with breakthroughs such as using Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to classify malicious traffic. However, these state-of-the-art systems struggle to satisfy real-time analysis requirements because of the major delay introduced by flow-based data preprocessing, i.e., the time required to accumulate packets into flows and then extract features. If malicious traffic can be detected at the packet level, detection time is significantly reduced, which makes online real-time malicious traffic detection based on deep learning very promising. With the goal of accelerating the whole detection process through packet-level classification, which has not been studied in the literature, we propose a novel approach to building a malicious traffic classification system with the primary support of word embedding and the LSTM model. Specifically, we propose a novel word embedding mechanism to extract packet semantics and adopt an LSTM to learn the temporal relations among fields in the packet header and to classify whether an incoming packet is normal or part of malicious traffic. Evaluation results on ISCX2012, USTC-TFC2016, the IoT dataset from Robert Gordon University, and an IoT dataset collected from our own Mirai botnet show that our approach is competitive with prior work that detects malicious traffic at the flow level. As network traffic grows year by year, this first attempt can inspire the research community to exploit the advantages of deep learning to build effective IDSs without significant detection delay.
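The core preprocessing idea above, treating packet-header fields as "words" to be embedded, can be illustrated with a minimal sketch. The fixed two-byte word size, the example header bytes, and the vocabulary scheme are illustrative assumptions, not the paper's exact tokenisation:

```python
# Sketch: split raw packet-header bytes into fixed-size hex "words" and map
# them to integer ids, as one would before an embedding + LSTM layer.
# Word size and vocabulary scheme are assumptions for illustration.

def header_to_tokens(header: bytes, word_size: int = 2) -> list:
    """Split raw header bytes into fixed-size hex 'words'."""
    return [header[i:i + word_size].hex() for i in range(0, len(header), word_size)]

def build_vocab(token_lists) -> dict:
    """Map each distinct token to an integer id (0 reserved for padding)."""
    vocab = {}
    for tokens in token_lists:
        for t in tokens:
            if t not in vocab:
                vocab[t] = len(vocab) + 1
    return vocab

# A fabricated 8-byte header (first bytes of an IPv4 header layout).
tokens = header_to_tokens(bytes([0x45, 0x00, 0x00, 0x3c, 0x1c, 0x46, 0x40, 0x00]))
vocab = build_vocab([tokens])
ids = [vocab[t] for t in tokens]  # integer sequence fed to the embedding layer
```

Because each packet yields its own short token sequence, no flow accumulation is needed before classification, which is the source of the speed-up the authors describe.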

2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Kate Highnam ◽  
Domenic Puzio ◽  
Song Luo ◽  
Nicholas R. Jennings

Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood that they are generated by such algorithms and are therefore potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our architecture is the most consistent in performance in terms of AUC, F1 score, and accuracy when generalising across different dictionary DGA classification tasks, compared to current state-of-the-art deep learning architectures. We validate it using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large enterprise. In four hours of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.
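The parallel-branch idea can be sketched as two independent scorers whose outputs are combined by soft voting. The scoring functions below are crude stand-ins for the trained CNN and LSTM branches, and the equal weights are an assumption; Bilbo's actual combination layer is learned, not hand-weighted:

```python
# Sketch of a parallel CNN+LSTM ensemble for domain scoring: each branch scores
# a domain independently and a weighted average combines them (soft voting).
# Both scorers are heuristic stand-ins for trained networks.

def char_ngram_score(domain: str) -> float:
    """Stand-in for a CNN branch sensitive to local character patterns."""
    digits = sum(c.isdigit() for c in domain)
    return min(1.0, digits / max(len(domain), 1) + 0.1)

def sequence_score(domain: str) -> float:
    """Stand-in for an LSTM branch sensitive to long-range character order."""
    vowels = sum(c in 'aeiou' for c in domain)
    return 1.0 - min(1.0, vowels / max(len(domain), 1) * 2)

def ensemble_score(domain: str, w_cnn: float = 0.5, w_lstm: float = 0.5) -> float:
    """Weighted average of the branch scores; higher means more DGA-like."""
    return w_cnn * char_ngram_score(domain) + w_lstm * sequence_score(domain)
```

The design point the abstract makes is that the two branches capture complementary signals, local n-gram statistics versus sequential structure, which is why running them in parallel generalises better across dictionary-DGA families than either alone.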


2020 ◽  
Vol 13 (1) ◽  
pp. 89
Author(s):  
Manuel Carranza-García ◽  
Jesús Torres-Mateo ◽  
Pedro Lara-Benítez ◽  
Jorge García-Gutiérrez

Object detection using remote sensing data is a key task for the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability in a particular scenario such as autonomous driving. In this work, we assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy and are also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower-resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage.
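The precision comparisons above are grounded in box overlap: a detection counts as correct only if its intersection-over-union (IoU) with a ground-truth box passes a threshold. A minimal IoU sketch (box format and threshold conventions vary by benchmark; `(x1, y1, x2, y2)` corners are assumed here):

```python
# Sketch of intersection-over-union (IoU), the overlap measure underlying
# detection precision metrics. Boxes are (x1, y1, x2, y2) corner tuples.

def iou(box_a, box_b) -> float:
    """IoU of two axis-aligned boxes; 0.0 when they do not overlap."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (clamped to zero when boxes are disjoint).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)
```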


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4045
Author(s):  
Alessandro Sassu ◽  
Jose Francisco Saenz-Cogollo ◽  
Maurizio Agelli

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most recent advances in extracting information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open-source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and abstracts away from the user the complexity of cluster configuration, orchestration of services, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and also provides high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.


2021 ◽  
Vol 297 ◽  
pp. 01059
Author(s):  
Saloua Senhaji ◽  
Mohamed Hamlich ◽  
Mohammed Ouazzani Jamil

Access to safe drinking water is one of the most pressing issues facing many developing countries. Water must meet Environmental Protection Agency (EPA) requirements. The usual method of measuring physico-chemical parameters is to take samples manually and send them to a laboratory to check the water quality. In this paper, we propose a new intelligent design for a real-time water quality monitoring system using deep learning. The system is composed of several sensors that measure water parameters (physico-chemical, bacteriological and organoleptic parameters) and detect the presence of certain undesirable or toxic substances, together with a single-board/mobile computer module, an Internet module and other accessories. Water parameters are automatically read by the single-board computer, a Raspberry Pi 3 Model B, which receives the data from the sensors and sends them to a web server via the Internet module, making the water quality situation observable from anywhere. The data are analysed in real time. Applying deep learning to such problems has been an important research topic: the Long Short-Term Memory (LSTM) network has been shown to be well suited for processing and predicting events with long intervals and delays in time series, since LSTM networks have the ability to retain long-term memory.
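The sensor-to-server path described above can be sketched as packaging one set of readings into a timestamped record for upload. The field names and the three example parameters (pH, turbidity, temperature) are illustrative assumptions, not the paper's exact schema:

```python
# Sketch: bundle one set of water-quality readings into a JSON record of the
# kind a single-board computer would post to a web server. Field names are
# assumptions for illustration.
import json
import time

def package_reading(ph: float, turbidity: float, temperature: float) -> str:
    """Serialize one timestamped set of physico-chemical readings."""
    record = {
        "timestamp": int(time.time()),
        "ph": ph,
        "turbidity_ntu": turbidity,
        "temperature_c": temperature,
    }
    return json.dumps(record)

payload = package_reading(7.2, 1.5, 18.4)  # ready for an HTTP POST
```

On the server side, a sequence of such records becomes the time series the LSTM consumes.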


2018 ◽  
Vol 19 (9) ◽  
pp. 2817 ◽  
Author(s):  
Haixia Long ◽  
Bo Liao ◽  
Xingyu Xu ◽  
Jialiang Yang

Protein hydroxylation is a type of post-translational modification (PTM) that plays critical roles in human diseases. Protein sequences contain many uncharacterized proline and lysine residues, and the question that needs to be answered is: which residues can be hydroxylated, and which cannot? The answer will not only help in understanding the mechanism of hydroxylation but can also benefit the development of new drugs. In this paper, we propose a novel approach for predicting hydroxylation using a hybrid deep learning model integrating a convolutional neural network (CNN) and a long short-term memory network (LSTM). We employed a pseudo amino acid composition (PseAAC) method to construct valid benchmark datasets based on a sliding-window strategy and used the position-specific scoring matrix (PSSM) to represent samples as inputs to the deep learning model. In addition, we compared our method with popular predictors including CNN, iHyd-PseAAC, and iHyd-PseCp. Results of 5-fold cross-validation all demonstrate that our method significantly outperforms the other methods in prediction accuracy.
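The sliding-window sample construction can be sketched as cutting a fixed-length window around each candidate proline (P) or lysine (K) residue, padding the sequence ends so every sample has the same length. The window half-width and the `X` padding symbol are illustrative assumptions, not the paper's chosen values:

```python
# Sketch: build fixed-length windows around each P/K residue of a protein
# sequence, padding the ends with a placeholder so all samples align.
# Half-width 3 and pad symbol 'X' are assumptions for illustration.

def windows(sequence: str, half: int = 3, targets=('P', 'K'), pad='X'):
    """Yield (position, window) for every candidate residue in sequence."""
    padded = pad * half + sequence + pad * half
    for i, aa in enumerate(sequence):
        if aa in targets:
            # padded[i] .. padded[i + 2*half] centres the window on residue i.
            yield i, padded[i:i + 2 * half + 1]

samples = list(windows("MKPLE"))  # one window per P or K residue
```

Each window (optionally encoded via PseAAC or PSSM columns, as the paper does) then becomes one labelled sample for the CNN+LSTM model.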


2020 ◽  
Author(s):  
Jeffrey P Gold ◽  
Christopher Wichman ◽  
Kenneth Bayles ◽  
Ali S Khan ◽  
Christopher Kratochvil ◽  
...  

A data-driven approach to guiding global, regional and local pandemic recovery planning is key to the safety, efficacy and sustainability of all pandemic recovery efforts. The Pandemic Recovery Acceleration Model (PRAM) analytic tool was developed and implemented statewide in Nebraska to allow health officials, public officials, industry leaders and community leaders to capture a real-time snapshot of how the COVID-19 pandemic is affecting their local community, a region or the state, and to use this novel lens to aid key mitigation and recovery decisions. This is done using six commonly available metrics, monitored daily across the state, that describe the pandemic impact: number of new cases, percent positive tests, deaths, occupied hospital beds, occupied intensive care beds and utilized ventilators, all directly related to confirmed COVID-19 patients. Nebraska is separated into six Health Care Coalitions based on geography, public health and medical care systems. The PRAM aggregates the data for each of these geographic regions based on disease prevalence acceleration and health care resource utilization acceleration, producing real-time analysis of the acceleration of change for each metric individually and combined into a single weighted index, the PRAM Recovery Index. These indices are shared daily with state leadership, coalition leaders and public health directors and are tracked over time, aiding real-time regional and statewide decisions on resource allocation and the extent of use of comprehensive non-pharmacologic interventions.
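The "acceleration of change combined into a single weighted index" idea can be sketched numerically. The second-difference acceleration formula, the weights, and the example figures below are all assumptions for illustration; the published PRAM methodology defines its own:

```python
# Sketch: combine per-metric accelerations into one weighted index, in the
# spirit of the PRAM Recovery Index. Acceleration here is the second
# difference of the last three observations; weights are illustrative.

def acceleration(series) -> float:
    """Rate of change of the rate of change over the latest three values."""
    return (series[-1] - series[-2]) - (series[-2] - series[-3])

def recovery_index(metric_series: dict, weights: dict) -> float:
    """Weighted sum of each metric's acceleration."""
    return sum(weights[m] * acceleration(s) for m, s in metric_series.items())

# Fabricated daily figures for two of the six metrics.
metrics = {
    "new_cases": [100, 120, 150],
    "icu_beds": [40, 42, 45],
}
weights = {"new_cases": 0.7, "icu_beds": 0.3}
index = recovery_index(metrics, weights)  # positive = worsening acceleration
```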


Author(s):  
Aakshi Mittal ◽  
Mohit Dua

Detection of spoofing is essential for improving the performance of current Automatic Speaker Verification (ASV) systems; strengthening both the frontend and backend parts can build robust ASV systems. First, this paper discusses a performance comparison of static and static–dynamic Constant Q Cepstral Coefficient (CQCC) frontend features using a Long Short-Term Memory (LSTM) with Time Distributed Wrappers model at the backend. Second, it performs a comparative analysis of ASV systems built using three deep learning models at the backend, LSTM with Time Distributed Wrappers, LSTM, and a Convolutional Neural Network, with static–dynamic CQCC features at the frontend. Third, it discusses the implementation of two spoof detection systems for ASV using the same static–dynamic CQCC features at the frontend and different combinations of deep learning models at the backend. The first is a voting-protocol-based two-level spoof detection system that uses the CNN and LSTM models at the first level and the LSTM with Time Distributed Wrappers model at the second level. The second is a two-level spoof detection system with a user identification and verification protocol, which uses the LSTM model for user identification at the first level and LSTM with Time Distributed Wrappers for verification at the second level. For implementing the proposed work, a variation of the ASVspoof 2019 dataset has been used to introduce all types of spoofing attacks, such as Speech Synthesis (SS), Voice Conversion (VC) and replay, in a single dataset. The results show that, at the frontend, static–dynamic CQCC features outperform static CQCC features and, at the backend, a hybrid combination of deep learning models increases the accuracy of spoof detection systems.
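The two-level voting protocol can be sketched as a majority vote over first-level model labels, confirmed by the second-level model. The label-combination rule below (flag as spoof only when both levels agree) is an assumed interpretation of the protocol, and the model outputs are stand-ins for the trained CNN/LSTM scorers:

```python
# Sketch of a two-level voting protocol for spoof detection: first-level
# models vote on a label, and the second-level model must confirm a "spoof"
# verdict. The exact agreement rule is an assumption for illustration.

def majority_vote(votes):
    """Label with the most first-level votes (pass an odd count to avoid ties)."""
    return max(set(votes), key=votes.count)

def two_level_decision(first_level_votes, second_level_label) -> str:
    """Flag as spoof only when both levels agree the sample is spoofed."""
    first = majority_vote(first_level_votes)
    if first == "spoof" and second_level_label == "spoof":
        return "spoof"
    return "genuine"
```

Requiring agreement across levels trades some recall for precision, which matches the abstract's finding that hybrid model combinations raise overall accuracy.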


2020 ◽  
pp. 158-161
Author(s):  
Chandraprabha S ◽  
Pradeepkumar G ◽  
Dineshkumar Ponnusamy ◽  
Saranya M D ◽  
Satheesh Kumar S ◽  
...  

This paper presents an artificial-intelligence-based system for forecasting real-time LDR data, with applications in indoor lighting, places where an enormous amount of heat is produced, agriculture (to increase crop yield), and solar plants (for solar irradiance tracking). The system uses a sensor that measures light intensity by means of an LDR. The data acquired from the sensor are posted to an Adafruit cloud at two-second intervals using a NodeMCU ESP8266 module, and are also presented on the Adafruit dashboard for observing the sensor variables. A long short-term memory (LSTM) network is used for the deep learning stage: the LSTM module uses the historical data recorded in the Adafruit cloud, paired with the NodeMCU, to obtain the real-time long-term time series of the sensor variable measured as light intensity. The data are extracted from the cloud for analytics, and the deep learning model is then applied to predict future light intensity values.
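The step from logged cloud readings to LSTM training data can be sketched as turning the series into (input window, next value) pairs. The window length and the example lux values are illustrative assumptions:

```python
# Sketch: convert a logged light-intensity series into supervised
# (window, next-value) pairs for training a forecasting LSTM.
# Window length 3 and the readings are assumptions for illustration.

def make_supervised(series, window: int = 3):
    """Pair each sliding input window with its next-step target."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs

readings = [510, 523, 540, 531, 498, 470]  # fabricated LDR values
pairs = make_supervised(readings)
```

Each pair's window is what the LSTM sees as input, and the target is the future light-intensity value it learns to predict.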


2020 ◽  
Vol 196 ◽  
pp. 02007
Author(s):  
Vladimir Mochalov ◽  
Anastasia Mochalova

In this paper, previously obtained results on the recognition of ionograms using deep learning are extended to predict the parameters of the ionosphere. After the ionospheric parameters have been identified on the ionogram using deep learning in real time, we can predict the parameters some time ahead on the basis of the newly obtained data. Examples of predicting the ionosphere parameters using long short-term memory, an artificial recurrent neural network architecture, are given. The place of the ionosphere-parameter prediction block within the system for analyzing ionospheric data using deep learning methods is shown.

