Secure Cyber Defense: An Analysis of Network Intrusion-Based Dataset CCD-IDSv1 with Machine Learning and Deep Learning Models

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1747
Author(s):  
Niraj Thapa ◽  
Zhipeng Liu ◽  
Addison Shaver ◽  
Albert Esterline ◽  
Balakrishna Gokaraju ◽  
...  

Anomaly detection and multi-attack classification are major concerns for cyber defense. Several publicly available datasets have been used extensively for the evaluation of Intrusion Detection Systems (IDSs). However, most of the publicly available datasets may not contain attack scenarios based on evolving threats. The development of a robust network intrusion dataset is vital for network threat analysis and mitigation. Proactive IDSs are required to tackle ever-growing threats in cyberspace. Machine learning (ML) and deep learning (DL) models have been deployed recently to detect the various types of cyber-attacks. However, current IDSs struggle to attain both a high detection rate and a low false alarm rate. To address these issues, we first develop a Center for Cyber Defense (CCD)-IDSv1 labeled flow-based dataset in an OpenStack environment. Five different attacks with normal usage imitating real-life usage are implemented. The number of network features is increased to overcome the shortcomings of the previous network flow-based datasets such as CIDDS and CIC-IDS2017. Secondly, this paper presents a comparative analysis on the effectiveness of different ML and DL models on our CCD-IDSv1 dataset. In this study, we consider both cyber anomaly detection and multi-attack classification. To improve the performance, we developed two DL-based ensemble models: Ensemble-CNN-10 and Ensemble-CNN-LSTM. Ensemble-CNN-10 combines 10 CNN models developed from 10-fold cross-validation, whereas Ensemble-CNN-LSTM combines base CNN and LSTM models. This paper also presents feature importance for both anomaly detection and multi-attack classification. Overall, the proposed ensemble models performed well in both the 10-fold cross-validation and independent testing on our dataset. Together, these results suggest the robustness and effectiveness of the proposed IDSs based on ML and DL models on the CCD-IDSv1 intrusion detection dataset.
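
As a rough illustration of how models trained on separate cross-validation folds can be combined, the sketch below averages the class probabilities from several models and picks the most likely class for each flow. The abstract does not state the combination rule used by Ensemble-CNN-10, so probability averaging (soft voting), the function name `soft_vote`, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models; return the argmax class per sample.

    prob_sets[m][i][c] is the probability model m assigns to class c for sample i.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    labels = []
    for i in range(n_samples):
        mean = [sum(m[i][c] for m in prob_sets) / n_models for c in range(n_classes)]
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


# Hypothetical outputs of three fold models for two flows and two classes
# (class 0 = normal, class 1 = attack). Real use would have ten models.
fold_probs = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.3, 0.7]],
]
print(soft_vote(fold_probs))  # [0, 1]
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh two marginal ones, which is one common reason to prefer it when the base models share an architecture.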

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4319
Author(s):  
Maria-Elena Mihailescu ◽  
Darius Mihai ◽  
Mihai Carabas ◽  
Mikołaj Komisarek ◽  
Marek Pawlicki ◽  
...  

Cybersecurity is an arms race, with defenders and adversaries attempting to outsmart one another, coming up with new attacks, new ways to defend against those attacks, and then new ways to circumvent those defences. This situation creates a constant need for novel, realistic cybersecurity datasets. This paper examines the effects of using machine-learning-based intrusion detection methods on network traffic coming from a real-life architecture. The main contribution of this work is a dataset coming from a real-world, academic network. Real-life traffic was collected and, after performing a series of attacks, a dataset was assembled. The dataset contains 44 network features and an unbalanced distribution of classes. In this work, the suitability of the dataset for building machine-learning-based models was experimentally evaluated. To investigate the stability of the obtained models, cross-validation was performed, and an array of detection metrics was reported. The gathered dataset is part of an effort to bring security against novel cyberthreats and was completed in the SIMARGL project.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4736
Author(s):  
Sk. Tanzir Mehedi ◽  
Adnan Anwar ◽  
Ziaur Rahman ◽  
Kawsar Ahmed

The Controller Area Network (CAN) bus is an important protocol in real-time In-Vehicle Network (IVN) systems because of its simple, suitable, and robust architecture. IVN devices nevertheless remain insecure and vulnerable, because complex data-intensive architectures greatly increase their exposure to unauthorized networks and the possibility of various types of cyberattacks. Therefore, the detection of cyberattacks in IVN devices has attracted growing interest. With the rapid development of IVNs and evolving threat types, traditional machine learning-based IDSs must be updated to cope with the security requirements of the current environment. Progress in deep learning and deep transfer learning, and their impact in several areas, has made them an effective solution for network intrusion detection. This manuscript proposes a deep transfer learning-based IDS model for IVNs that improves on the performance of several existing models. The contributions include effective attribute selection suited to identifying malicious CAN messages and accurately detecting normal and abnormal activities, the design of a deep transfer learning-based LeNet model, and an evaluation on real-world data. To this end, an extensive experimental performance evaluation has been conducted. The architecture and empirical analyses show that the proposed IDS greatly improves detection accuracy over mainstream machine learning, deep learning, and benchmark deep transfer learning models, and demonstrates better performance for real-time IVN security.


At present, network communication is at high risk of external and internal attacks because of the large number of applications in various fields. An Intrusion Detection System (IDS) monitors network traffic to detect abnormal behavior, serving as a software or hardware security mechanism in the network. As attackers continually change their techniques and find alternative attack methods, IDSs must also evolve by adopting more sophisticated detection methods. The huge growth in data and significant advances in computer hardware have prompted new studies in the deep learning field, including intrusion detection. Deep Learning (DL) is a subfield of Machine Learning (ML) that is based on learning data representations. This research work presents a new deep learning-based model to enable IDS operation in modern networks. The model combines deep learning and machine learning and is capable of accurate analysis of a wide range of network traffic. The approach proposes a non-symmetric deep autoencoder (NDAE) for unsupervised feature learning. Furthermore, a classification model is constructed using stacked NDAEs. Performance is evaluated on a network intrusion detection dataset, specifically the WSN Trace dataset. The contribution of this work is an advanced deep learning algorithm for IDS use that can take immediate measures to stop or minimize malicious actions.


2021 ◽  
Vol 11 (16) ◽  
pp. 7731
Author(s):  
Rao Zeng ◽  
Minghong Liao

DNA methylation is one of the most extensive epigenetic modifications. DNA N6-methyladenine (6mA) plays a key role in many biological regulation processes. An accurate and reliable genome-wide identification of 6mA sites is crucial for systematically understanding its biological functions. Some machine learning tools can identify 6mA sites, but their limited prediction accuracy and lack of robustness limit their usability in epigenetic studies, which implies a great need for new computational methods for this problem. In this paper, we developed a novel computational predictor, namely 6mAPred-MSFF, which is a deep learning framework based on a multi-scale feature fusion mechanism to identify 6mA sites across different species. In the predictor, we integrate the inverted residual block and a multi-scale attention mechanism to build lightweight and deep neural networks. Compared to existing predictors using traditional machine learning, our deep learning framework needs no prior knowledge of 6mA or manually crafted sequence features and captures the characteristics of 6mA sites more fully. In benchmarking comparisons, our deep learning method outperforms the state-of-the-art methods on the 5-fold cross-validation test on the seven datasets of six species, demonstrating that the proposed 6mAPred-MSFF is more effective and generic. Specifically, our proposed 6mAPred-MSFF gives a sensitivity and specificity of 97.88% and 94.64%, respectively, in 5-fold cross-validation on the 6mA-rice-Lv dataset. Our model trained with the rice data predicts well the 6mA sites of the other five species: Arabidopsis thaliana, Fragaria vesca, Rosa chinensis, Homo sapiens, and Drosophila melanogaster with a prediction accuracy 98.51%, 93.02%, and 91.53%, respectively. Moreover, via experimental comparison, we explored the performance impact of training and testing our proposed model under different encoding schemes and feature descriptors.
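
For readers less familiar with the two metrics quoted above, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for the arithmetic; they are not taken from the paper.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true 6mA sites that are detected.
    specificity = TN / (TN + FP): fraction of non-6mA sites correctly rejected.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)  # 0.9 0.8
```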


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4583
Author(s):  
Vibekananda Dutta ◽  
Michał Choraś ◽  
Marek Pawlicki ◽  
Rafał Kozik

Currently, expert systems and applied machine learning algorithms are widely used to automate network intrusion detection. In critical infrastructure applications of communication technologies, the interaction among various industrial control systems and the Internet environment intrinsic to the IoT technology makes them susceptible to cyber-attacks. Given the existence of the enormous network traffic in critical Cyber-Physical Systems (CPSs), traditional methods of machine learning implemented in network anomaly detection are inefficient. Therefore, recently developed machine learning techniques, with the emphasis on deep learning, are finding their successful implementations in the detection and classification of anomalies at both the network and host levels. This paper presents an ensemble method that leverages deep models such as the Deep Neural Network (DNN) and Long Short-Term Memory (LSTM) and a meta-classifier (i.e., logistic regression) following the principle of stacked generalization. To enhance the capabilities of the proposed approach, the method utilizes a two-step process for the apprehension of network anomalies. In the first stage, data pre-processing, a Deep Sparse AutoEncoder (DSAE) is employed for the feature engineering problem. In the second phase, a stacking ensemble learning approach is utilized for classification. The efficiency of the method disclosed in this work is tested on heterogeneous datasets, including data gathered in the IoT environment, namely IoT-23, LITNET-2020, and NetML-2020. The results of the evaluation of the proposed approach are discussed. Statistical significance is tested and compared to the state-of-the-art approaches in network anomaly detection.
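
The stacked-generalization idea described above can be sketched as follows: scores from base models become the input features of a logistic-regression meta-classifier. This is a simplified sketch under stated assumptions, not the authors' implementation. The base-model scores are hypothetical stand-ins for held-out DNN and LSTM outputs, the DSAE feature-engineering stage is omitted, and the helper names `fit_meta` and `predict_meta` are invented for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(features: Sequence[Sequence[float]], labels: Sequence[int],
             lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Fit logistic-regression weights (bias stored last) by batch gradient descent."""
    n_feat = len(features[0])
    w = [0.0] * (n_feat + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for x, y in zip(features, labels):
            # zip stops at the shorter sequence, so the bias w[-1] is added separately.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / len(labels) for wi, g in zip(w, grad)]
    return w


def predict_meta(w: Sequence[float], x: Sequence[float]) -> int:
    """Return 1 (anomaly) if the meta-classifier probability is at least 0.5, else 0."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) >= 0.5)


# Hypothetical held-out anomaly scores: [DNN score, LSTM score] per flow.
meta_X = [[0.9, 0.8], [0.8, 0.3], [0.2, 0.1], [0.3, 0.6], [0.7, 0.9], [0.1, 0.4]]
meta_y = [1, 1, 0, 0, 1, 0]

weights = fit_meta(meta_X, meta_y)
print(predict_meta(weights, [0.95, 0.9]), predict_meta(weights, [0.05, 0.1]))
```

In stacked generalization the meta-classifier is normally trained on base-model predictions for data the base models did not see during training; otherwise it learns to trust overfitted scores.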


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Tamer N. Jarada ◽  
Jon G. Rokne ◽  
Reda Alhajj

Background: Drug repositioning is an emerging approach in pharmaceutical research for identifying novel therapeutic potentials for approved drugs and discovering therapies for untreated diseases. Due to its time and cost efficiency, drug repositioning plays an instrumental role in optimizing the drug development process compared to the traditional de novo drug discovery process. Advances in genomics, together with the enormous growth of large-scale publicly available data and the availability of high-performance computing capabilities, have further motivated the development of computational drug repositioning approaches. More recently, the rise of machine learning techniques, together with the availability of powerful computers, has made computational drug repositioning an area of intense activity. Results: In this study, a novel framework SNF-NN based on deep learning is presented, where novel drug-disease interactions are predicted using drug-related similarity information, disease-related similarity information, and known drug-disease interactions. Heterogeneous similarity information related to drugs and diseases is fed to the proposed framework in order to predict novel drug-disease interactions. SNF-NN uses similarity selection, similarity network fusion, and a highly tuned novel neural network model to predict new drug-disease interactions. The robustness of SNF-NN is evaluated by comparing its performance with nine baseline machine learning methods. The proposed framework outperforms all baseline methods (AUC-ROC = 0.867 and AUC-PR = 0.876) using stratified 10-fold cross-validation. To further demonstrate the reliability and robustness of SNF-NN, two datasets are used to fairly validate the proposed framework's performance against seven recent state-of-the-art methods for drug-disease interaction prediction. SNF-NN achieves remarkable performance in stratified 10-fold cross-validation with AUC-ROC ranging from 0.879 to 0.931 and AUC-PR from 0.856 to 0.903. Moreover, the efficiency of SNF-NN is verified by validating predicted unknown drug-disease interactions against clinical trials and published studies. Conclusion: Computational drug repositioning research can significantly benefit from integrating similarity measures in heterogeneous networks and deep learning models for predicting novel drug-disease interactions. The data and implementation of SNF-NN are available at http://pages.cpsc.ucalgary.ca/ tnjarada/snf-nn.php.
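
As background for the AUC-ROC figures reported above, the sketch below computes AUC-ROC through its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score. The function name and the toy scores are assumptions for illustration and are unrelated to the SNF-NN data.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc_roc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.75
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration; library implementations typically sort the scores instead.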


2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Leila Mohammadpour ◽  
T.C. Ling ◽  
C.S. Liew ◽  
Alihossein Aryanfar

The significant development of Internet applications over the past 10 years has made securing information networks increasingly necessary. An intrusion detection system is a fundamental network infrastructure defense that must be able to adapt to the ever-evolving threat landscape and identify new attacks with a low false alarm rate. Researchers have developed several supervised as well as unsupervised methods from the data mining and machine learning disciplines so that anomalies can be detected reliably. As an aspect of machine learning, deep learning uses a neuron-like structure to learn tasks. One successful deep learning technique is the convolutional neural network (CNN); however, it is presently not well suited to detecting anomalies. CNNs identify expected content within the input flow easily, whereas abnormalities differ only slightly from normal content. This suggests that a particular method is required for identifying such minor changes. CNNs are expected to learn the features that characterize the content of an image (flow) rather than variations that are unrelated to the content. Hence, this study proposes a new CNN architecture, known as the mean convolution layer (CNN-MCL), that was developed for learning the content features of anomalies and then identifying the particular abnormality. The proposed CNN-MCL helps in designing a strong network intrusion detection system that includes an innovative form of convolutional layer that can learn low-level abnormal characteristics. Assessing the proposed model on the CICIDS2017 dataset led to favorable results for real-world application, detecting anomalies with high accuracy and a low false-alarm rate compared with other leading models.

