Anomaly Detection in Encrypted Internet Traffic Using Hybrid Deep Learning

Security and Communication Networks ◽

10.1155/2021/5363750 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Taimur Bakhshi ◽

Bogdan Ghita

Keyword(s):

Deep Learning ◽

Anomaly Detection ◽

Cyber Security ◽

Short Term Memory ◽

Hybrid Approach ◽

Training Dataset ◽

Optimal Time ◽

Traffic Classification ◽

Ongoing Research ◽

Encrypted Traffic

An increasing number of Internet application services rely on encrypted traffic to offer adequate consumer privacy. Anomaly detection in encrypted traffic to circumvent and mitigate cyber security threats is, however, an open and ongoing research challenge due to the limitations of existing traffic classification techniques. Deep learning is emerging as a promising paradigm, reducing the manual determination of feature sets needed to increase classification accuracy. The present work develops a deep learning-based model for detection of anomalies in encrypted network traffic. Three publicly available datasets, NSL-KDD, UNSW-NB15, and CIC-IDS-2017, are used to comprehensively analyze encrypted attacks targeting popular protocols. Instead of relying on a single deep learning model, multiple schemes using convolutional (CNN), long short-term memory (LSTM), and recurrent neural networks (RNNs) are investigated. Our results show that a hybrid combination of CNN and gated recurrent unit (GRU) models outperforms the others. The hybrid approach benefits from the low-latency feature derivation of the CNN and an overall improved fit to the training dataset. Additionally, the highly effective generalization offered by the GRU results in optimal time-domain-related feature extraction, making the CNN and GRU hybrid scheme the best model.
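
The data flow of a CNN and GRU hybrid can be sketched in a few lines. This is a toy illustration only, assuming a single convolution kernel and a scalar GRU state; the function names, the parameter dictionary, and all weights are hypothetical and are not taken from the paper, whose architecture and hyperparameters the abstract does not give.

```python
import math

def conv1d_relu(seq, kernel):
    """Valid 1D convolution (cross-correlation) over a sequence, followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(seq) - k + 1):
        s = sum(seq[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU update; p holds hypothetical weights and biases."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_cand

def cnn_gru_score(seq, kernel, p):
    """Convolution extracts local features; the GRU summarizes them over time."""
    h = 0.0
    for x in conv1d_relu(seq, kernel):
        h = gru_step(x, h, p)
    return h

# Illustrative (made-up) per-packet feature values and weights.
params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
          "wr": 0.3, "ur": 0.2, "br": 0.0,
          "wh": 0.8, "uh": 0.4, "bh": 0.0}
score = cnn_gru_score([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], params)
```

In a real model the final state would feed a classification layer, and the weights would be learned from the training dataset rather than set by hand.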

DISTILLER: Encrypted traffic classification via multimodal multitask deep learning

Journal of Network and Computer Applications ◽

10.1016/j.jnca.2021.102985 ◽

2021 ◽

pp. 102985

Author(s):

Giuseppe Aceto ◽

Domenico Ciuonzo ◽

Antonio Montieri ◽

Antonio Pescapé

Keyword(s):

Deep Learning ◽

Traffic Classification ◽

Encrypted Traffic

A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites

Scientific Reports ◽

10.1038/s41598-021-91840-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Niraj Thapa ◽

Meenal Chaudhari ◽

Anthony A. Iannetta ◽

Clarence White ◽

Kaushik Roy ◽

...

Keyword(s):

Deep Learning ◽

Chlamydomonas Reinhardtii ◽

Protein Phosphorylation ◽

Short Term Memory ◽

Phosphorylation Site ◽

Specific Protein ◽

Training Dataset ◽

Phosphorylation Sites ◽

Site Prediction ◽

Model Combining

Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. For the model, named Chlamy-EnPhosSite, the best measured AUC and MCC in independent testing are 0.90 and 0.64, respectively, for a combined dataset of serine (S) and threonine (T) sites, higher than those of other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 phosphorylated sites with a cut-off value of 0.5 and 237,949 phosphorylated sites with a cut-off value of 0.7. These predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC–MS/MS) in a blinded study: approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5, and 76.83% of sites were successfully identified at a more stringent 0.7 cut-off. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites in a protein sequence (e.g., RPS6 S245) which did not appear in the training dataset, highlighting prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii. It has potential to serve as a useful tool to the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.
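
The cut-off figures quoted above can be put in proportion with a short calculation. The sketch below assumes a site is called phosphorylated when its predicted probability is greater than or equal to the cut-off (the abstract does not say whether the comparison is strict), and the recovered-site counts are back-calculated from the rounded percentages, so they are approximations. The function name `call_sites` is hypothetical.

```python
def call_sites(probabilities, cutoff):
    """Return indices of sites whose predicted probability meets the cut-off.

    Assumption: the comparison is inclusive (>=); the abstract does not specify.
    """
    return [i for i, p in enumerate(probabilities) if p >= cutoff]

# Figures taken from the abstract.
TOTAL_SITES = 1_809_304                      # S and T sites in the proteome
predicted = {0.5: 499_411, 0.7: 237_949}     # predicted sites per cut-off

# Fraction of all S/T sites called phosphorylated at each cut-off.
fractions = {cutoff: n / TOTAL_SITES for cutoff, n in predicted.items()}

# Approximate number of the 2,663 experimental sites recovered,
# back-calculated from the reported percentages.
EXPERIMENTAL_SITES = 2_663
recovered = {
    0.5: round(0.8969 * EXPERIMENTAL_SITES),
    0.7: round(0.7683 * EXPERIMENTAL_SITES),
}

for cutoff in (0.5, 0.7):
    print(f"cut-off {cutoff}: {fractions[cutoff]:.1%} of sites predicted, "
          f"about {recovered[cutoff]} experimental sites recovered")
```

Raising the cut-off from 0.5 to 0.7 roughly halves the number of predicted sites while the share of experimental sites recovered falls by about 13 percentage points.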

Sensory Descriptor Analysis of Whisky Lexicons through the Use of Deep Learning

Foods ◽

10.3390/foods10071633 ◽

2021 ◽

Vol 10 (7) ◽

pp. 1633

Author(s):

Chreston Miller ◽

Leah Hamilton ◽

Jacob Lahne

Keyword(s):

Deep Learning ◽

Short Term Memory ◽

Descriptive Analysis ◽

Food Products ◽

Training Dataset ◽

Test Accuracy ◽

Language Structure ◽

Deep Learning Model ◽

Descriptor Analysis ◽

Descriptive Language

This paper is concerned with extracting relevant terms from a text corpus on whisk(e)y. “Relevant” terms are usually contextually defined in their domain of use. Arguably, every domain has a specialized vocabulary used for describing things. For example, the field of Sensory Science, a sub-field of Food Science, investigates human responses to food products and differentiates “descriptive” terms for flavors from “ordinary”, non-descriptive language. Within the field, descriptors are generated through Descriptive Analysis, a method wherein a human panel of experts tastes multiple food products and defines descriptors. This process is both time-consuming and expensive. However, one could leverage existing data to identify and build a flavor language automatically. For example, there are thousands of professional and semi-professional reviews of whisk(e)y published on the internet, providing abundant descriptors interspersed with non-descriptive language. The aim, then, is to be able to automatically identify descriptive terms in unstructured reviews for later use in product flavor characterization. We created two systems to perform this task. The first is an interactive visual tool that can be used to tag examples of descriptive terms from thousands of whisky reviews. This creates a training dataset that we use to perform transfer learning using GloVe word embeddings and a Long Short-Term Memory deep learning model architecture. The result is a model that can accurately identify descriptors within a corpus of whisky review texts with a train/test accuracy of 99% and precision, recall, and F1-scores of 0.99. We tested for overfitting by comparing the training and validation loss for divergence. Our results show that the language structure for descriptive terms can be programmatically learned.
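
For readers unfamiliar with the reported metrics, precision, recall, and F1 for a token-level descriptor tagger can be computed as sketched below. The tokens, gold labels, and predictions are invented for illustration and are not from the paper's dataset; `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 for binary labels (1 = descriptor)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up example review fragment with hand-assigned labels.
tokens = ["smoky", "and", "peaty", "finish", "with", "vanilla"]
gold = [1, 0, 1, 0, 0, 1]        # which tokens are flavor descriptors
predicted = [1, 0, 1, 1, 0, 0]   # hypothetical model output

precision, recall, f1 = precision_recall_f1(gold, predicted)
```

Here the tagger finds two of the three descriptors and raises one false alarm, so all three metrics equal 2/3.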

Mobile Encrypted Traffic Classification Using Deep Learning

2018 Network Traffic Measurement and Analysis Conference (TMA) ◽

10.23919/tma.2018.8506558 ◽

2018 ◽

Cited By ~ 32

Author(s):

Giuseppe Aceto ◽

Domenico Ciuonzo ◽

Antonio Montieri ◽

Antonio Pescapé

Keyword(s):

Deep Learning ◽

Traffic Classification ◽

Encrypted Traffic

Deep-Learning-Based Approach to Anomaly Detection Techniques for Large Acoustic Data in Machine Operation

Sensors ◽

10.3390/s21165446 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5446

Author(s):

Hyojung Ahn ◽

Inchoon Yeo

Keyword(s):

Deep Learning ◽

Anomaly Detection ◽

Short Term Memory ◽

Detection System ◽

Noisy Environments ◽

Acoustic Data ◽

Detection Techniques ◽

Machine Operation ◽

Detection Technology ◽

Environment Noise

As the workforce shrinks, the demand for automatic, labor-saving, anomaly detection technology that can perform maintenance on advanced equipment such as vehicles has been increasing. In a vehicular environment, noise in the cabin, which directly affects users, is considered an important factor in lowering the emotional satisfaction of the driver and/or passengers in the vehicles. In this study, we provide an efficient method that can collect acoustic data, measured using a large number of microphones, in order to detect abnormal operations inside the machine via deep learning in a quick and highly accurate manner. Unlike most current approaches based on Long Short-Term Memory (LSTM) or autoencoders, we propose an anomaly detection (AD) algorithm that can overcome the limitations of noisy measurement and detection system anomalies via noise signals measured inside the mechanical system. These features are utilized to train a variety of anomaly detection models for demonstration in noisy environments with five different errors in machine operation, achieving an accuracy of approximately 90% or more.

Deep learning-based anomaly-onset aware remaining useful life estimation of bearings

PeerJ Computer Science ◽

10.7717/peerj-cs.795 ◽

2021 ◽

Vol 7 ◽

pp. e795

Author(s):

Pooja Vinayak Kamat ◽

Rekha Sugandhi ◽

Satish Kumar

Keyword(s):

Deep Learning ◽

Anomaly Detection ◽

Short Term Memory ◽

Rotating Machinery ◽

Remaining Useful Life ◽

Operating Conditions ◽

Learning Models ◽

Vibration Data ◽

Degradation Data ◽

Useful Life

Remaining Useful Life (RUL) estimation of rotating machinery based on its degradation data is vital for machine supervisors. Deep learning models are effective and popular methods for forecasting when rotating machinery such as bearings may malfunction and ultimately break down. During healthy functioning of the machinery, however, RUL is ill-defined. To address this issue, this study recommends using anomaly monitoring during both RUL estimator training and operation. Essential time-domain data is extracted from the raw bearing vibration data, and deep learning models are used to detect the onset of the anomaly. This onset then acts as a trigger for data-driven RUL estimation. The study employs an unsupervised clustering approach for anomaly trend analysis and a semi-supervised method for anomaly detection and RUL estimation. The novel combined deep learning-based anomaly-onset aware RUL estimation framework showed enhanced results on the benchmarked PRONOSTIA bearings dataset under non-varying operating conditions. The framework, consisting of Autoencoder and Long Short-Term Memory variants, achieved an accuracy of over 90% in anomaly detection and RUL prediction. In the future, the framework can be deployed under varying operational situations using the transfer learning approach.
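
The idea of an anomaly onset acting as a trigger can be illustrated with a much simpler stand-in. The sketch below swaps the paper's deep learning detector for a plain statistical threshold on windowed RMS, a common time-domain vibration feature; the window size, baseline length, factor `k`, and the synthetic signal are all assumptions made for illustration.

```python
import math
import statistics

def rms(window):
    """Root mean square of one window of vibration samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def anomaly_onset(signal, window_size, baseline_windows, k=3.0):
    """Return the index of the first window whose RMS exceeds the healthy
    baseline (mean + k * standard deviation), or None if none does.

    Assumption: the first `baseline_windows` windows represent healthy operation.
    """
    feats = [rms(signal[i:i + window_size])
             for i in range(0, len(signal) - window_size + 1, window_size)]
    base = feats[:baseline_windows]
    threshold = statistics.mean(base) + k * statistics.pstdev(base)
    for idx in range(baseline_windows, len(feats)):
        if feats[idx] > threshold:
            return idx  # RUL estimation would be triggered from this window on
    return None

# Synthetic signal: 20 healthy samples, then 8 samples with larger amplitude.
signal = [1.0, -1.0] * 10 + [5.0, -5.0] * 4
onset = anomaly_onset(signal, window_size=4, baseline_windows=3)
```

Before the onset window the RUL is left undefined, which matches the abstract's point that RUL is ill-defined during healthy operation.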

Robot Communication: Network Traffic Classification Based on Deep Neural Network

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.648374 ◽

2021 ◽

Vol 15 ◽

Author(s):

Mengmeng Ge ◽

Xiangzhan Yu ◽

Likun Liu

Keyword(s):

Neural Network ◽

Network Traffic ◽

Short Term Memory ◽

Traffic Classification ◽

Classification Framework ◽

Network Traffic Classification ◽

Scalar Input ◽

Memory Network ◽

Encrypted Traffic ◽

Robot Communication

With the rapid popularization of robots, the risks brought by robot communication have also attracted the attention of researchers. Current traffic classification methods based on plaintext cannot classify encrypted traffic, and other methods based on statistical analysis require manual extraction of features. This paper proposes (i) a traffic classification framework based on a capsule neural network. This method has a multilayer neural network that can automatically learn the characteristics of the data stream. It uses capsule vectors instead of a single scalar input to effectively classify encrypted network traffic. (ii) For different network structures, a classification network structure combining a convolutional neural network and a long short-term memory network is proposed. This structure can learn both the temporal and spatial characteristics of network traffic. Experimental results show that the network model can classify encrypted traffic and does not require manual feature extraction. Compared with the previous tool, the recognition accuracy rate has increased by 8%.

Hybrid approach with Deep Auto-Encoder and optimized LSTM based Deep Learning approach to detect anomaly in cloud logs

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201707 ◽

2021 ◽

pp. 1-15

Author(s):

Savaridassan Pankajashan ◽

G. Maragatham ◽

T. Kirthiga Devi

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Anomaly Detection ◽

Performance Metrics ◽

Hybrid Approach ◽

Machine Learning Techniques ◽

Support Vector ◽

Paper Machine ◽

Log Data ◽

Isolation Forest

Anomaly-based detection is concerned with recognizing the uncommon: catching unusual activity and finding the action behind that activity. It has a wide scope of critical applications, from bank application security to regular sciences to medical systems to marketing apps. Anomaly-based detection built on various Machine Learning techniques is a type of artificial intelligence system. With the ever-expanding volume and new sorts of information, for example sensor information from an enormous number of IoT devices and network flow data from cloud computing, there is unsurprisingly a growing interest in reaching more conclusions automatically by means of AI and ML applications. But with respect to anomaly detection, many applications of the scheme are simply the passion for detection. In this paper, the Machine Learning (ML) techniques SVM and Isolation Forest classifiers are evaluated and, among Deep Learning (DL) techniques, the proposed DA-LSTM (Deep Auto-Encoder LSTM) model is adopted for preprocessing of log data and anomaly-based detection to obtain better detection performance. An enhanced LSTM (long short-term memory) model, with its parameters optimized using a genetic algorithm (GA), is used to better recognize anomalies in log data that has been filtered by a Deep Auto-Encoder (DA). The deep neural network models convert unstructured log information into training-ready features suitable for log classification in anomaly detection. These models are assessed on two benchmark datasets, the OpenStack logs and the CIDDS-001 intrusion detection OpenStack server dataset. The results show that the DA-LSTM model performs better than other notable ML techniques. We further investigated the performance of the ML and DL models using well-known metrics, specifically F-measure, Accuracy, Recall, and Precision. The experiments show that the Isolation Forest and Support Vector Machine classifiers reach roughly 81% and 79% accuracy, respectively, on the CIDDS-001 OpenStack server dataset, while the proposed DA-LSTM classifier reaches around 99.1% accuracy, an improvement over the familiar ML algorithms. Further, the DA-LSTM results on the OpenStack log datasets show better anomaly detection than other notable machine learning models.

Time Series Analysis for Encrypted Traffic Classification: A Deep Learning Approach

2018 18th International Symposium on Communications and Information Technologies (ISCIT) ◽

10.1109/iscit.2018.8587975 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ly Vu ◽

Hoang V. Thuy ◽

Quang Uy Nguyen ◽

Tran N. Ngoc ◽

Diep N. Nguyen ◽

...

Keyword(s):

Time Series ◽

Deep Learning ◽

Time Series Analysis ◽

Learning Approach ◽

Traffic Classification ◽

Series Analysis ◽

Encrypted Traffic

A Hybrid Deep Learning Framework for Unsupervised Anomaly Detection in Multivariate Spatio-Temporal Data

Applied Sciences ◽

10.3390/app10155191 ◽

2020 ◽

Vol 10 (15) ◽

pp. 5191

Author(s):

Yıldız Karadayı ◽

Mehmet N. Aydin ◽

A. Selçuk Öğrenci

Keyword(s):

Deep Learning ◽

Anomaly Detection ◽

Hurricane Katrina ◽

Time Series Data ◽

Hybrid Approach ◽

Ground Truth ◽

Outbreak Detection ◽

Series Data ◽

Detection Techniques ◽

Spatio Temporal

Multivariate time-series data with a contextual spatial attribute are widely used to find anomalous patterns in application domains such as earth science, hurricane tracking, fraud, and disease outbreak detection. Spatial context is usually expressed in terms of ZIP code or region coordinates such as latitude and longitude. However, traditional anomaly detection techniques cannot handle more than one contextual attribute in a unified way. In this paper, a new hybrid approach based on deep learning is proposed to solve the anomaly detection problem in multivariate spatio-temporal datasets. It works under the assumption that no prior knowledge about the dataset or its anomalies is available. The architecture of the proposed hybrid framework is based on an autoencoder scheme, and it extracts features from spatio-temporal multivariate datasets more efficiently than traditional spatio-temporal anomaly detection techniques. We conducted extensive experiments using 2005 buoy data from the National Data Buoy Center, with Hurricane Katrina as ground truth. Experiments demonstrate that the proposed model achieves more than 10% improvement in accuracy over the compared methods; the model jointly processes the spatial and temporal dimensions of the contextual data to extract features for anomaly detection.
