DDoS Botnets Attacks Detection in Anomaly Traffic: A Comparative Study.

2020 ◽  
Vol 3 (1) ◽  
pp. 64-74
Author(s):  
Ahmed A. Elsherif ◽  
Arwa A. Aldaej

One of the major challenges facing the acceptance and growth of business and governmental sites is the Botnet-based DDoS attack. A flooding DDoS attack strikes a victim machine by sending a vast amount of malicious traffic, causing a significant drop in quality of service (QoS) for IoT devices. Nonetheless, flooding DDoS attacks are not easy to detect and tackle, owing to the large number of attacking machines, the use of source-address spoofing, and the overlap between legitimate and malicious traffic. New kinds of attacks are identified daily, and some remain undiscovered; accordingly, this paper aims to improve the classification of network traffic that hackers attempt to make ambiguous or misleading. Recorded simulated traffic was used for both samples, normal and DDoS attack traffic, with approximately 104,000 cases of each; both datasets, created for this study, serve as the input data for building a classification model to be used as a tool to mitigate the risk of being attacked. The next step is putting the datasets into a format suitable for classification. This is done through preprocessing techniques that convert categorical data into numerical data. A classification process is then applied to the captured datasets to create a classification model, using five classification algorithms: Decision Tree, Support Vector Machine, Naive Bayes, K-Neighbours, and Random Forest. The classification core is implemented in Python and controlled through a user interface. The highest prediction, precision, and accuracy are obtained with the Decision Tree and Random Forest algorithms, which also have the lowest processing time.
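The abstract reports that the classification core is written in Python; a minimal sketch of such a five-classifier comparison using scikit-learn might look as follows. The file name, feature columns, and label encoding here are assumptions for illustration, not the authors' actual pipeline:

```python
# Hypothetical sketch of the five-classifier comparison described above;
# "traffic.csv" and the "label" column (0 = normal, 1 = DDoS) are assumptions.
import time
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score

df = pd.read_csv("traffic.csv")                                # assumed combined dataset
X = OrdinalEncoder().fit_transform(df.drop(columns="label"))   # categorical -> numerical
y = df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Decision Tree": DecisionTreeClassifier(),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "K-Neighbours": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(),
}
for name, clf in models.items():
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f} "
          f"prec={precision_score(y_te, pred):.3f} "
          f"time={time.perf_counter() - start:.2f}s")
```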

2021 ◽  
pp. 71
Author(s):  
Alejandro Coca-Castro ◽  
Maycol A. Zaraza-Aguilera ◽  
Yilsey T. Benavides-Miranda ◽  
Yeimy M. Montilla-Montilla ◽  
Heidy B. Posada-Fandiño ◽  
...  

Building change detection based on remote sensing imagery is a key task for land management and planning, e.g., detection of illegal settlements, updating land records, and disaster response. Under the post-classification comparison approach, this research aimed to evaluate the feasibility of several classification algorithms to identify and capture buildings and their change between two time steps using very-high-resolution images (<1 m/pixel) across rural areas and urban/rural perimeter boundaries. Through an App implemented on the Google Earth Engine (GEE) platform, we selected two study areas in Colombia with different images and input data. In total, eight traditional classification algorithms available in GEE were trained: three unsupervised (K-Means, X-Means, and Cascade K-Means) and five supervised (Random Forest, Support Vector Machine, Naive Bayes, GMO Maximum Entropy, and Minimum Distance). Additionally, a deep neural network, the Feature Pyramid Network (FPN), was added and trained using a pre-trained EfficientNetB3 model. Three evaluation zones per study area were proposed to quantify the performance of the algorithms through the Intersection over Union (IoU) metric. This metric, ranging between 0 and 1, represents the degree of overlap between two regions, with higher IoU values indicating higher agreement. The results indicate that the models configured with the FPN network performed best, followed by the traditional supervised algorithms, with performance differences specific to each study area. For the rural area, the best FPN configuration obtained an IoU, averaged over both time steps, of 0.4, four times higher than the best supervised model, a Support Vector Machine with a linear kernel, with an average IoU of 0.1. For the urban/rural perimeter boundaries, this difference was less marked: an average IoU of 0.53 compared with 0.38 obtained by the best supervised classification model, in this case Random Forest. The results are relevant for institutions tracking the dynamics of building areas from cloud computing platforms and for future assessments of classifiers on similar platforms in other contexts.
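For reference, the IoU metric described above is straightforward to compute for two binary building masks; the following minimal NumPy sketch (not the authors' GEE implementation) illustrates it:

```python
# Minimal sketch of the Intersection over Union (IoU) metric for two
# equally shaped binary masks (1 = building, 0 = background).
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0  # both masks empty -> perfect agreement

pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
print(iou(pred, truth))  # intersection 2 / union 4 = 0.5
```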


2021 ◽  
Vol 12 (11) ◽  
pp. 1886-1891
Author(s):  
Sarthika Dutt et al.

Dysgraphia is a disorder that affects writing skills. Identifying Dysgraphia at an early age of a child's development is a difficult task. It can be identified through the problematic skills associated with the Dysgraphia difficulty. In this study, motor ability, spatial knowledge, copying skill, and visual-spatial response are among the features included for Dysgraphia identification. The features that affect the Dysgraphia disability are analyzed using an Elastic Net (EN) feature selection technique. The significant features are then classified using machine learning techniques. The classification models compared on the Dysgraphia dataset are KNN (K-Nearest Neighbors), Naïve Bayes, Decision Tree, Random Forest, and SVM (Support Vector Machine). Results indicate that the Random Forest classification model performs best for Dysgraphia identification.
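As an illustration of the approach described, a hedged sketch of elastic-net-based feature selection followed by a Random Forest classifier is shown below, using scikit-learn's elastic-net-penalized logistic regression as the selector and a synthetic dataset in place of the Dysgraphia data:

```python
# Hypothetical sketch: Elastic Net feature selection, then Random Forest.
# The dataset is synthetic; the selector hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=30, n_informative=8, random_state=0)

# Elastic Net = mixed L1/L2 penalty; features whose coefficients shrink
# to zero are dropped before the classification stage.
selector = SelectFromModel(
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=0.5, max_iter=5000)
)
model = make_pipeline(selector, RandomForestClassifier(random_state=0))
print(cross_val_score(model, X, y, cv=5).mean())
```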


Agriculture ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 371
Author(s):  
Yu Jin ◽  
Jiawei Guo ◽  
Huichun Ye ◽  
Jinling Zhao ◽  
Wenjiang Huang ◽  
...  

The remote sensing extraction of large areas of arecanut (Areca catechu L.) planting plays an important role in investigating the distribution of the arecanut planting area and in the subsequent adjustment and optimization of regional planting structures. Satellite imagery has previously been used to investigate and monitor the agricultural and forestry vegetation in Hainan. However, the monitoring accuracy is affected by the cloudy and rainy climate of this region, as well as the high level of land fragmentation. In this paper, we used PlanetScope imagery at a 3 m spatial resolution over the Hainan arecanut planting area to investigate the high-precision extraction of the arecanut planting distribution based on feature space optimization. First, spectral and textural feature variables were selected to form the initial feature space, followed by the implementation of the random forest algorithm to optimize the feature space. Arecanut planting area extraction models based on the support vector machine (SVM), BP neural network (BPNN), and random forest (RF) classification algorithms were then constructed. The overall classification accuracies of the SVM, BPNN, and RF models optimized by the RF features were determined as 74.82%, 83.67%, and 88.30%, with Kappa coefficients of 0.680, 0.795, and 0.853, respectively. The RF model with optimized features exhibited the highest overall classification accuracy and Kappa coefficient. The overall accuracy of the SVM, BPNN, and RF models following feature optimization was improved by 3.90%, 7.77%, and 7.45%, respectively, compared with the corresponding unoptimized classification model; the Kappa coefficient also improved. The results demonstrate the ability of PlanetScope satellite imagery to extract the planting distribution of arecanut. Furthermore, RF is shown to effectively optimize the initial feature space, composed of spectral and textural feature variables, further improving the extraction accuracy of the arecanut planting distribution. This work can act as a theoretical and technical reference for the agricultural and forestry industries.
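A minimal sketch of the feature-space optimization step, assuming Random Forest importances are used to rank and retain the strongest spectral/textural variables (the synthetic data and the cutoff of 15 features are illustrative, not the authors' settings):

```python
# Sketch of feature-space optimization via Random Forest importances;
# X stands in for the stacked spectral/textural variables per pixel or object.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=40, n_informative=10, random_state=1)

rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]  # most important first
top_k = order[:15]                                 # illustrative cutoff
X_opt = X[:, top_k]                                # optimized space for SVM/BPNN/RF
print("selected feature indices:", top_k)
```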


2021 ◽  
Author(s):  
Jeremy Watts ◽  
Anahita Khojandi ◽  
Rama Vasudevan ◽  
Fatta B. Nahab ◽  
Ritesh Ramdhani

Abstract Parkinson's disease (PD) medication treatment planning is generally based on subjective data gathered through in-office, physician-patient interactions. The Personal KinetiGraph™ (PKG) has shown promise in enabling objective, continuous remote health monitoring for Parkinson's patients. In this proof-of-concept study, we propose to use objective sensor data from the PKG and apply machine learning to subtype patients based on levodopa regimens and response. We apply k-means clustering to a dataset of within-subject Parkinson's medication changes, clinically assessed by the PKG and Hoehn & Yahr (H&Y) staging. A random forest classification model was then used to predict patients' cluster allocation based on their respective PKG data and demographic information. Clinically relevant clusters were developed based on longitudinal dopaminergic regimens, partitioned by levodopa dose, administration frequency, and total levodopa equivalent daily dose, with the PKG increasing cluster granularity compared to the H&Y staging. A random forest classifier was able to accurately classify subjects of the two most demographically similar clusters with an accuracy of 87.9 ± 1.3%.
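A hedged sketch of the two-stage design, k-means clustering on medication features followed by a Random Forest predicting cluster membership, is given below; all arrays are synthetic stand-ins for the PKG and demographic data:

```python
# Illustrative two-stage pipeline: cluster on medication regimen features,
# then predict cluster labels from PKG/demographic features. Synthetic data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
med_features = rng.normal(size=(120, 3))   # stand-ins: dose, frequency, total LEDD
pkg_features = rng.normal(size=(120, 6))   # stand-ins: PKG scores + demographics

clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(med_features)
rf = RandomForestClassifier(random_state=0)
print(cross_val_score(rf, pkg_features, clusters, cv=5).mean())
```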


2021 ◽  
Author(s):  
Mostafa Sa'eed Yakoot ◽  
Adel Mohamed Salem Ragab ◽  
Omar Mahmoud

Abstract Well integrity has become a crucial field with increased focus, and it is being published on intensively in industry research. It is important to maintain the integrity of each individual well to ensure that wells operate as expected for their designated life (or longer), with all risks kept as low as reasonably practicable, or as specified. Machine learning (ML) and artificial intelligence (AI) models are used intensively in the oil and gas industry nowadays. The ML concept is based on powerful algorithms and a robust database. Developing an efficient classification model for well integrity (WI) anomalies is now feasible because of the enormous number of well failures, well barrier integrity tests, and analyses in the database. Circa 9,000 data points were collected from WI tests performed on 800 wells in the Gulf of Suez, Egypt, over almost 10 years. Moreover, those data have been quality-controlled and quality-assured by experienced engineers. The data contain different forms of WI failures. The contributing parameter set includes a total of 23 barrier elements. The data were structured and fed into 11 different ML algorithms to build an automated, systematic tool for calculating the imposed risk category of any well. A comparison analysis of the deployed models was performed to infer the best predictive model that can be relied on. The 11 models include both supervised and ensemble learning algorithms, such as random forest, support vector machine (SVM), decision tree, and scalable boosting techniques. Of the 11 models, the results showed that extreme gradient boosting (XGB), categorical boosting (CatBoost), and decision tree are the most reliable algorithms. Moreover, novel evaluation metrics for the confusion matrix of each model were introduced to overcome the problem that existing metrics do not consider domain knowledge during model evaluation. The resulting model will help to utilize company resources efficiently and dedicate personnel efforts to high-risk wells, yielding progressive improvements in safety, the environment, and business performance. This paper would be a milestone in the design and creation of a Well Integrity Database Management Program through the combination of integrity engineering and ML.
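As an illustration only, the following sketch compares a few of the named algorithms on a synthetic stand-in for the 23 barrier-element features; the xgboost package is assumed installed, and the risk categories and data are invented:

```python
# Illustrative comparison of three of the 11 algorithms named above;
# the 23 features and 3 risk categories are synthetic, not field data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score
from xgboost import XGBClassifier  # assumes xgboost is installed

X, y = make_classification(n_samples=9000, n_features=23, n_classes=3,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in {"Decision Tree": DecisionTreeClassifier(),
                  "Random Forest": RandomForestClassifier(),
                  "XGBoost": XGBClassifier()}.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(name, accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))  # basis for per-model evaluation metrics
```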


Geosciences ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 265
Author(s):  
Stefan Rauter ◽  
Franz Tschuchnigg

The classification of soils into categories with a similar range of properties is a fundamental geotechnical engineering procedure. At present, this classification is based on various types of cost- and time-intensive laboratory and/or in situ tests. These soil investigations are essential for each individual construction site and have to be performed prior to the design of a project. Since Machine Learning could play a key role in reducing the costs and time needed for a suitable site investigation program, the basic ability of Machine Learning models to classify soils from Cone Penetration Tests (CPT) is evaluated. To find an appropriate classification model, 24 different Machine Learning models, based on three different algorithms, are built and trained on a dataset consisting of 1339 CPTs. The applied algorithms are a Support Vector Machine, an Artificial Neural Network, and a Random Forest. As input features, different combinations of direct cone penetration test data (tip resistance qc, sleeve friction fs, friction ratio Rf, depth d), combined with “defined” (i.e., not directly measured) data (total vertical stress σv, effective vertical stress σ’v, and hydrostatic pore pressure u0), are used. Standard soil classes based on grain size distributions and soil classes based on soil behavior types according to Robertson are applied as targets. The different models are compared with respect to their prediction performance and the required learning time. The best results for all targets were obtained with models using a Random Forest classifier. For the soil classes based on grain size distribution, an accuracy of about 75%, and for soil classes according to Robertson, an accuracy of about 97–99%, was reached.
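A minimal sketch of such a Random Forest soil classifier on direct CPT inputs follows; the CSV file and column names (qc, fs, Rf, d, soil_class) are assumptions for illustration:

```python
# Hypothetical Random Forest soil classifier on direct CPT measurements:
# tip resistance qc, sleeve friction fs, friction ratio Rf, depth d.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

cpt = pd.read_csv("cpt_dataset.csv")      # assumed file of labeled CPT records
X = cpt[["qc", "fs", "Rf", "d"]]          # direct CPT input features
y = cpt["soil_class"]                     # grain-size or Robertson class target
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, rf.predict(X_te)))
```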


Chronic Kidney Disease (CKD) is a worldwide concern that affects roughly 10% of the adult population globally. For most people, an early diagnosis of CKD is often not possible. Therefore, the utilization of modern computer-aided strategies is important to help make the conventional CKD diagnosis framework more effective and precise. In this project, six modern machine learning techniques, namely Multilayer Perceptron Neural Network, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, Decision Tree, and Logistic Regression, were used; then, to enhance the performance of the model, ensemble algorithms such as AdaBoost, Gradient Boosting, Random Forest, Majority Voting, Bagging, and Weighted Average were applied to the Chronic Kidney Disease dataset from the UCI Repository. The model was finely tuned to obtain the best hyperparameters. Performance was evaluated using Accuracy, Precision, Recall, F1-score, Matthews Correlation Coefficient, and the ROC-AUC curve. The experiment was first performed on the individual classifiers and then on the ensemble classifiers. The ensemble classifiers Random Forest and AdaBoost performed better, with 100% Accuracy, Precision, and Recall, compared to the best individual classifier, the Decision Tree algorithm, with 99.16% Accuracy, 98.8% Precision, and 100% Recall.
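A hedged sketch of the majority-voting ensemble described above, built from three of the individual classifiers with scikit-learn, is shown below; the file name and the "class" label column for the UCI CKD dataset are assumptions:

```python
# Illustrative majority-voting ensemble over three base classifiers;
# "chronic_kidney_disease.csv" and its "class" column are assumed names.
import pandas as pd
from sklearn.ensemble import VotingClassifier, RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("chronic_kidney_disease.csv").dropna()  # assumed preprocessed data
X, y = df.drop(columns="class"), df["class"]

ensemble = VotingClassifier([
    ("dt", DecisionTreeClassifier()),
    ("rf", RandomForestClassifier()),
    ("ada", AdaBoostClassifier()),
], voting="hard")                                        # hard = majority voting
print(cross_val_score(ensemble, X, y, cv=5, scoring="accuracy").mean())
```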


2021 ◽  
Vol 23 (08) ◽  
pp. 532-537
Author(s):  
Cherlakola Abhinav Reddy ◽  
Sai Nitesh Gadiraju ◽  
Dr. Samala Nagaraj ◽  
...  

Online media has progressively become integral to the way billions of individuals experience news and events, frequently bypassing journalists, the conventional gatekeepers of breaking news. Real-world events create a corresponding spike of posts (tweets) on Twitter. This places a great deal of significance on the credibility of information found on online media platforms like Twitter. We used various supervised learning techniques, such as Naïve Bayes, Decision Trees, and Support Vector Machines, on the data to separate tweets into genuine and fake news. For our ML models, we used tweet and user features as our predictors. We achieved an accuracy of 88% using the Random Forest classifier and 88% using the Decision Tree. However, we believe that analyzing user accounts would further increase the accuracy of our models.
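For illustration only, a sketch of a Random Forest over hand-crafted tweet and user features follows; the feature names and file are invented stand-ins, not the authors' predictors:

```python
# Hypothetical fake-news tweet classifier over tweet + user features;
# "tweets.csv" and all column names below are invented for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

tweets = pd.read_csv("tweets.csv")                 # assumed labeled dataset
features = ["retweet_count", "favorite_count", "follower_count",
            "account_age_days", "verified"]        # tweet + user predictors
X, y = tweets[features], tweets["is_fake"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, rf.predict(X_te)))
```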

