Metamorphic malware detection using opcode frequency rate and decision tree

2016 ◽  
Vol 10 (3) ◽  
pp. 67-86 ◽  
Author(s):  
Mahmood Fazlali ◽  
Peyman Khodamoradi ◽  
Farhad Mardukhi ◽  
Masoud Nosrati ◽  
Mohammad Mahdi Dehshibi

Malware is defined as any type of malicious code with the potential to harm a computer or network. Modern malware often exhibits mutation characteristics, namely polymorphism and metamorphism, which allow it to generate an enormous number of variants. The rising number of metamorphic malware samples makes analyzing them for signature extraction and database updates difficult. Despite the broad use of signature-based methods in security products, such methods cannot detect new, unseen morphs of malware, because the malware changes its structure, and hence its signature, with each infection. In this paper, a novel method is proposed in which the proportion of opcodes is used to detect new morphs. Decision trees are utilized to classify and detect malware variants based on opcode frequency rates. The proposed method is evaluated on three metrics: speed, efficiency, and accuracy. Experiments showed that speed and time complexity are not limiting factors, because extracting opcode frequencies from a source assembly file is fast. Empirical validation reveals that the proposed method outperforms all of the commercial antivirus programs tested, with a high level of efficiency and accuracy.
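As a minimal sketch of the pipeline this abstract describes (opcode frequency rates fed to a tree learner), the snippet below fits a one-level decision tree (a stump, standing in for a full tree for brevity) on invented toy opcode traces; it is not the authors' implementation, and the opcode list and labels are illustrative assumptions.

```python
from collections import Counter

OPCODES = ["mov", "push", "pop", "call", "jmp", "xor", "add", "cmp"]

def frequency_rate(opcode_seq):
    """Proportion of each tracked opcode in a disassembled opcode sequence."""
    counts = Counter(opcode_seq)
    total = max(len(opcode_seq), 1)
    return [counts[op] / total for op in OPCODES]

def best_stump(X, y):
    """Fit a one-level decision tree by exhaustive threshold search."""
    best = None  # (error, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for left in (0, 1):
                err = sum((left if row[f] <= thr else 1 - left) != label
                          for row, label in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, f, thr, left, 1 - left)
    return best

def predict(stump, x):
    _, f, thr, left, right = stump
    return left if x[f] <= thr else right

# Toy training data: hypothetical opcode traces, label 1 = malware.
traces = [(["mov", "xor", "xor", "jmp", "call", "xor"], 1),
          (["mov", "push", "call", "pop", "add", "cmp"], 0),
          (["xor", "xor", "jmp", "jmp", "mov", "xor"], 1),
          (["push", "mov", "call", "cmp", "pop", "add"], 0)]
X = [frequency_rate(t) for t, _ in traces]
y = [label for _, label in traces]

stump = best_stump(X, y)
pred = predict(stump, frequency_rate(["xor", "jmp", "xor", "mov"]))
```

On this toy data a single frequency threshold already separates the classes, which mirrors the abstract's point that the feature extraction step is cheap.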

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ha Min Son ◽  
Wooho Jeon ◽  
Jinhyun Kim ◽  
Chan Yeong Heo ◽  
Hye Jin Yoon ◽  
...  

Although computer-aided diagnosis (CAD) is used to improve the quality of diagnosis in various medical fields such as mammography and colonography, it is not used in dermatology, where noninvasive screening tests are performed only with the naked eye, and avoidable inaccuracies may exist. This study shows that CAD may also be a viable option in dermatology by presenting a novel method to sequentially combine accurate segmentation and classification models. Given an image of the skin, we decompose the image to normalize and extract high-level features. Using a neural network-based segmentation model to create a segmented map of the image, we then cluster sections of abnormal skin and pass this information to a classification model. We classify each cluster into different common skin diseases using another neural network model. Our segmentation model achieves better performance compared to previous studies, and also achieves a near-perfect sensitivity score in unfavorable conditions. Our classification model is more accurate than a baseline model trained without segmentation, while also being able to classify multiple diseases within a single image. This improved performance may be sufficient to use CAD in the field of dermatology.
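The sequential shape of this pipeline (segmentation map, then clustering of abnormal regions, then per-cluster classification) can be sketched with stand-ins: a hard-coded binary mask in place of the segmentation network, connected-component labeling for clustering, and a toy size-based rule in place of the classification network. None of this is the paper's model; it only illustrates the data flow.

```python
from collections import deque

def connected_components(mask):
    """Cluster abnormal pixels (value 1) into 4-connected regions via BFS."""
    rows, cols = len(mask), len(mask[0])
    seen, clusters = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and (r, c) not in seen:
                q, region = deque([(r, c)]), []
                seen.add((r, c))
                while q:
                    y, x = q.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                           and mask[ny][nx] == 1 and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            q.append((ny, nx))
                clusters.append(region)
    return clusters

def classify(region):
    """Stub classifier: label each cluster by a toy size rule."""
    return "lesion_large" if len(region) >= 4 else "lesion_small"

# Stand-in segmentation output: 1 marks abnormal skin.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
labels = [classify(c) for c in connected_components(mask)]
```

The point of the sketch is that each cluster is classified independently, which is what lets the method report multiple diseases in a single image.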


2021 ◽  
Vol 11 (11) ◽  
pp. 4966
Author(s):  
Ivana Golub Medvešek ◽  
Igor Vujović ◽  
Joško Šoda ◽  
Maja Krčum

Hydrographic survey, or seabed mapping, plays an important role in achieving better maritime safety, especially in coastal waters. Due to advances in survey technologies, it has become important to choose a technology well suited to a specific area. Moreover, different technologies vary in equipment, manufacturers, and characteristics. Therefore, in this paper, a novel method for planning a hydrographic survey, i.e., identifying the appropriate technology, has been developed. The method is based on a reduced elimination matrix, decision tree supervised learning, and multicriteria decision methods. The available technologies were: remotely operated underwater vehicle (ROV), unmanned aerial vehicle (UAV), light detection and ranging (LIDAR), autonomous underwater vehicle (AUV), satellite-derived bathymetry (SDB), and multibeam echosounder (MBES), and the method is applied in a case study of Kaštela Bay. The results show that, considering the specifics of the survey area, UAV is the technology best suited for the hydrographic survey. However, some other technologies, such as SDB, come close and can be considered alternatives for hydrographic surveys.
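The combination of an elimination step with a multicriteria ranking can be illustrated as follows. The criteria, weights, and per-technology scores below are invented for the example (they are not the paper's data): infeasible candidates are eliminated first, and the rest are ranked by a weighted sum.

```python
# Toy attributes per technology: a hard feasibility flag plus normalized
# criterion scores in [0, 1] (higher is better; "cost" is cost-effectiveness).
TECH = {
    "ROV":   {"depth_ok": True,  "cost": 0.40, "coverage": 0.5, "resolution": 0.90},
    "UAV":   {"depth_ok": True,  "cost": 0.90, "coverage": 0.8, "resolution": 0.70},
    "LIDAR": {"depth_ok": True,  "cost": 0.30, "coverage": 0.9, "resolution": 0.60},
    "SDB":   {"depth_ok": True,  "cost": 0.95, "coverage": 0.9, "resolution": 0.40},
    "MBES":  {"depth_ok": False, "cost": 0.50, "coverage": 0.7, "resolution": 0.95},
}
WEIGHTS = {"cost": 0.4, "coverage": 0.3, "resolution": 0.3}

def rank(tech):
    # Elimination step: drop technologies failing a hard constraint.
    feasible = {t: a for t, a in tech.items() if a["depth_ok"]}
    # Weighted-sum multicriteria score over the remaining candidates.
    scores = {t: sum(WEIGHTS[c] * a[c] for c in WEIGHTS)
              for t, a in feasible.items()}
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank(TECH)
```

With these invented numbers the toy ranking happens to echo the abstract's qualitative outcome (UAV first, SDB close behind), but the actual method also incorporates a reduced elimination matrix and decision tree learning not shown here.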


1999 ◽  
Vol 354 (1379) ◽  
pp. 153-159 ◽  
Author(s):  
Anne C. Stone ◽  
Mark Stoneking

The Norris Farms No. 36 cemetery in central Illinois has been the subject of considerable archaeological and genetic research. Both mitochondrial DNA (mtDNA) and nuclear DNA have been examined in this 700-year-old population. DNA preservation at the site was good, with about 70% of the samples producing mtDNA results and approximately 15% yielding nuclear DNA data. All four of the major Amerindian mtDNA haplogroups were found, in addition to a fifth haplogroup. Sequences of the first hypervariable region of the mtDNA control region revealed a high level of diversity in the Norris Farms population and confirmed that the fifth haplogroup associates with Mongolian sequences and hence is probably authentic. Other than a possible reduction in the number of rare mtDNA lineages in many populations, it does not appear as if European contact significantly altered patterns of Amerindian mtDNA variation, despite the large decrease in population size that occurred. For nuclear DNA analysis, a novel method for DNA-based sex identification that uses nucleotide differences between the X and Y copies of the amelogenin gene was developed and applied successfully in approximately 20 individuals. Despite the well-known problems of poor DNA preservation and the ever-present possibility of contamination with modern DNA, genetic analysis of the Norris Farms No. 36 population demonstrates that ancient DNA can be a fruitful source of new insights into prehistoric populations.
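The logic of sex identification from X/Y differences in the amelogenin gene can be sketched as follows. The reference fragments and the positions at which they differ are hypothetical stand-ins, not the actual gene sequences, and real ancient-DNA workflows involve alignment and contamination controls that this toy omits.

```python
# Hypothetical, aligned reference fragments of the X and Y gene copies.
AMELX = "ACCTGGTTAC"
AMELY = "ACCTAGTTGC"
DIFF_SITES = [i for i, (x, y) in enumerate(zip(AMELX, AMELY)) if x != y]

def infer_sex(reads):
    """Male if any aligned read carries the Y-specific base at a diff site."""
    has_y = any(r[i] == AMELY[i] for r in reads for i in DIFF_SITES)
    return "male" if has_y else "female"

# Both an X-type and a Y-type read recovered from one individual.
sex = infer_sex(["ACCTAGTTGC", "ACCTGGTTAC"])
```

The key idea carried over from the abstract is that a single locus with fixed X/Y nucleotide differences suffices, which matters when only short degraded fragments survive.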


2021 ◽  
Vol 12 (9) ◽  
pp. 443-449
Author(s):  
D. S. Khleborodov

Micro-segmentation of local networks is an important element of network security. Its main goal is to reduce the risk of compromising hosts during a cyber-attack. In a micro-segmented network, if one of the hosts has been compromised, the malicious code or attacker is limited in its "horizontal" (lateral) actions to the micro-segment to which the compromised host belongs. Existing methods of micro-segmentation have operational drawbacks that impede their effective practical application. This article presents a new method of micro-segmentation for local wired and wireless networks, based on downloadable and wireless access control lists, which achieves highly granular network access policies by minimizing the micro-segment while retaining good operational characteristics.
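The core policy idea, that each host may talk only to hosts in its own micro-segment and everything else is denied, can be sketched by generating a per-host rule list. The hosts, segment names, and rule strings below are illustrative and do not follow any particular vendor's ACL syntax or the article's exact mechanism.

```python
# Hypothetical segment membership: host IP -> micro-segment.
SEGMENTS = {
    "hr":  ["10.0.1.10", "10.0.1.11"],
    "dev": ["10.0.2.20"],
}

def build_acl(host, segments):
    """Per-host ACL: permit traffic to same-segment peers, deny the rest."""
    seg = next(s for s, hosts in segments.items() if host in hosts)
    rules = [f"permit ip host {host} host {peer}"
             for peer in segments[seg] if peer != host]
    rules.append(f"deny ip host {host} any")  # default deny for lateral movement
    return rules

acl = build_acl("10.0.1.10", SEGMENTS)
```

The trailing default-deny rule is what bounds an attacker's lateral movement to the micro-segment, which is the risk-reduction property the abstract describes.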


Author(s):  
JUAN CARLOS ESTEVA ◽  
ROBERT G. REYNOLDS

The goal of the Partial Metrics Project is the automatic acquisition of planning knowledge from target code modules in a program library. In the current prototype, the system is given a target code module written in Ada as input, and the result is a sequence of generalized transformations that can be used to design a class of related modules. This is accomplished by embedding techniques from artificial intelligence into the traditional structure of a compiler. The compiler performs compilation in reverse, starting with detailed code and producing an abstract description of it. The principal task facing the compiler is to find a decomposition of the target code into a collection of nearly decomposable syntactic components. Here, nearly decomposable means that each code segment is nearly independent, syntactically, of the others. The most independent segments are then the target of the code generalization process. This process can be described as a form of chunking and is implemented here in terms of explanation-based learning. The problem of producing nearly decomposable code components becomes difficult when the target code module is not well structured. The task facing users of the system is therefore to identify, from a library of modules, well-structured code modules that are suitable as input to the system. In this paper we describe the use of inductive learning techniques, namely variations on Quinlan's ID3 system, that are capable of producing a decision tree to conceptually distinguish between well-structured and poorly structured code. To accomplish that task, a set of high-level concepts used by software engineers to characterize structurally understandable code was identified. Next, each of these concepts was operationalized in terms of code complexity metrics that can easily be calculated during the compilation process. These metrics relate to various aspects of program structure, including coupling, cohesion, data structure, control structure, and documentation. Each candidate module was then described in terms of a collection of such metrics. Using a training set of positive and negative examples of well-structured modules, each described in terms of the selected metrics, a decision tree was produced and used to recognize other well-structured modules from their metric properties. This approach was applied to modules from existing software libraries in a variety of domains, such as database, editor, graphics, window, data processing, FFT, and computer vision software. The results achieved by the system were then benchmarked against the performance of experienced programmers at recognizing well-structured code. In a test case involving 120 modules, the system was able to discriminate between poorly and well-structured code 99% of the time, compared to an 80% average for the 52 programmers sampled. The results suggest that such an inductive system can serve as a practical mechanism for effectively identifying reusable code modules in terms of their structural properties.
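An ID3-style learner of the kind the paper builds on can be implemented compactly: pick the attribute with the highest information gain, split, and recurse. The metric names and the tiny training set of "well-structured" examples below are invented for illustration; this is the general ID3 scheme, not the paper's exact system.

```python
import math

def entropy(labels):
    n = len(labels)
    probs = [labels.count(v) / n for v in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def id3(rows, labels, features):
    """Recursively build a decision tree over boolean features."""
    if len(set(labels)) == 1:
        return labels[0]
    if not features:
        return max(set(labels), key=labels.count)
    def gain(f):
        rem = 0.0
        for v in (True, False):
            sub = [l for r, l in zip(rows, labels) if r[f] == v]
            if sub:
                rem += len(sub) / len(labels) * entropy(sub)
        return entropy(labels) - rem
    best = max(features, key=gain)
    node = {"feature": best, "children": {}}
    for v in (True, False):
        sub_rows = [r for r in rows if r[best] == v]
        sub_labels = [l for r, l in zip(rows, labels) if r[best] == v]
        rest = [f for f in features if f != best]
        node["children"][v] = (id3(sub_rows, sub_labels, rest)
                               if sub_rows else max(set(labels), key=labels.count))
    return node

def classify(tree, row):
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["feature"]]]
    return tree

# Hypothetical boolean code metrics per module.
rows = [
    {"low_coupling": True,  "high_cohesion": True,  "documented": True},
    {"low_coupling": True,  "high_cohesion": True,  "documented": False},
    {"low_coupling": False, "high_cohesion": True,  "documented": True},
    {"low_coupling": True,  "high_cohesion": False, "documented": False},
]
labels = ["well", "well", "poor", "poor"]
tree = id3(rows, labels, ["low_coupling", "high_cohesion", "documented"])
verdict = classify(tree, {"low_coupling": True, "high_cohesion": True, "documented": False})
```

As in the paper, the learned tree discriminates using only the metric values, so it can be applied to unseen modules without re-reading the source.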


Author(s):  
B. T. Messmer ◽  
H. Bunke

In this paper we present a fast algorithm for the computation of error-correcting graph isomorphisms. The new algorithm is an extension of a method for exact subgraph isomorphism detection from an input graph to a set of a priori known model graphs, which was previously developed by the authors. Similar to the original algorithm, the new method is based on the idea of creating a decision tree from the model graphs. This decision tree is compiled off-line in a preprocessing step. At run time, it is used to find all error-correcting graph isomorphisms from an input graph to any of the model graphs up to a certain degree of distortion. The main advantage of the new algorithm is that error-correcting graph isomorphism detection is guaranteed to require time that is only polynomial in terms of the size of the input graph. Furthermore, the time complexity is completely independent of the number of model graphs and the number of edges in each model graph. However, the size of the decision tree is exponential in the size of the model graphs and the degree of error. Nevertheless, practical experiments have indicated that the method can be applied to graphs containing up to 16 vertices.
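A crude stand-in for the compiled-index idea (not the authors' decision-tree construction, and with no error correction) is to canonicalize each model graph off-line into a dictionary key; matching an input graph then costs one canonical-form computation plus a lookup, independent of how many model graphs are stored, which mirrors the runtime claim above. The brute-force canonicalization below is exponential in graph size, echoing the space/preprocessing trade-off the abstract notes.

```python
from itertools import permutations

def canonical(adj):
    """Lexicographically smallest adjacency matrix over all vertex orders."""
    n = len(adj)
    return min(
        tuple(adj[p[i]][p[j]] for i in range(n) for j in range(n))
        for p in permutations(range(n))
    )

# Off-line step: compile the model graphs into an index.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path3    = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
index = {canonical(g): name
         for g, name in [(triangle, "triangle"), (path3, "path")]}

# Run time: a relabelled 3-vertex path still hits the "path" entry.
relabelled_path = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # centre renumbered to 0
match = index.get(canonical(relabelled_path))
```

Extending such an index to tolerate distortions, as the error-correcting algorithm does, is precisely what the compiled decision tree provides.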


Author(s):  
Shang Liu ◽  
Xiao Bai

In this chapter, the authors present a new method to improve the performance of the current bag-of-words-based image classification process. After feature extraction, they introduce a pairwise image matching scheme to select discriminative features. Only the label information from the training sets is used to update the feature weights via an iterative matching process. The selected features correspond to the foreground content of the images, and thus highlight high-level category knowledge about them. Visual words are then constructed on these selected features. This novel method can be used as a refinement step in current image classification and retrieval processes. The authors demonstrate the efficiency of their method on three tasks: supervised image classification, semi-supervised image classification, and image retrieval.
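The weight-update idea can be sketched in miniature: features that match across image pairs of the same class get their weights boosted over several rounds, so foreground (class-relevant) features end up selected while background features shared across classes do not. Feature identifiers, the multiplicative update rule, and the data are all illustrative assumptions, not the chapter's actual scheme.

```python
from collections import defaultdict

def update_weights(images, labels, rounds=3, boost=1.5):
    """Boost weights of features matched between same-class image pairs."""
    weights = defaultdict(lambda: 1.0)
    for _ in range(rounds):
        for i in range(len(images)):
            for j in range(i + 1, len(images)):
                if labels[i] == labels[j]:
                    for f in set(images[i]) & set(images[j]):
                        weights[f] *= boost
    return weights

# Toy images as sets of detected feature IDs; "sky"/"tree" are background.
images = [{"wheel", "sky"}, {"wheel", "tree"}, {"leaf", "sky"}, {"leaf", "tree"}]
labels = ["car", "car", "plant", "plant"]
w = update_weights(images, labels)
selected = {f for f, wt in w.items() if wt > 1.0}
```

Note that only same-class pairs drive the update, which is the sense in which the method needs nothing beyond training-set labels.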


Author(s):  
Ajay Kumar Gupta

This chapter presents an overview of spam email as a serious problem in the internet world and builds a spam filter that addresses the weaknesses of previous filters, providing better identification accuracy with less complexity. Since the J48 decision tree is a widely used classification technique, thanks to its simple structure, high classification accuracy, and low time complexity, it is used as the spam mail classifier here. With lower complexity, however, it becomes difficult to achieve high accuracy on large numbers of records. To overcome this problem, particle swarm optimization is used to optimize the Spambase dataset, thereby optimizing the decision tree model as well as reducing time complexity. Once the records have been optimized, the decision tree is used again to check classification accuracy. The chapter also surveys various spam-related issues, the filters in use, related work, and the potential scope of spam filtering.
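The overall shape of the pipeline, a particle swarm searching for a good data/feature configuration with a classifier evaluated on each candidate, can be sketched as follows. Here PSO selects a feature subset and a leave-one-out 1-nearest-neighbour classifier stands in for J48; the features, data, and PSO constants are toy assumptions, not the chapter's setup.

```python
import random

random.seed(0)

# Toy spam data: [caps_ratio, link_count, length, noise]; label 1 = spam.
X = [[0.90, 3, 120, 0.2], [0.80, 2, 80, 0.9], [0.10, 0, 200, 0.3],
     [0.20, 1, 150, 0.8], [0.95, 4, 60, 0.5], [0.15, 0, 90, 0.1]]
y = [1, 1, 0, 0, 1, 0]

def accuracy(mask):
    """Leave-one-out 1-NN accuracy using only features where mask is 1."""
    feats = [i for i, m in enumerate(mask) if m]
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = [(sum((X[i][f] - X[j][f]) ** 2 for f in feats), y[j])
                 for j in range(len(X)) if j != i]
        correct += min(dists)[1] == y[i]
    return correct / len(X)

def pso(n_particles=6, iters=20):
    """Binary-ish PSO: positions in [0,1] are thresholded into feature masks."""
    dim = len(X[0])
    particles = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
    velocity = [[0.0] * dim for _ in range(n_particles)]
    pbest, pfit = [p[:] for p in particles], [-1.0] * n_particles
    best_mask, best_fit = None, -1.0
    for _ in range(iters):
        for k, p in enumerate(particles):
            mask = [1 if v > 0.5 else 0 for v in p]
            fit = accuracy(mask)
            if fit > pfit[k]:
                pfit[k], pbest[k] = fit, p[:]
            if fit > best_fit:
                best_fit, best_mask = fit, mask
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                velocity[k][d] = (0.7 * velocity[k][d]
                                  + 1.4 * r1 * (pbest[k][d] - particles[k][d])
                                  + 1.4 * r2 * (best_mask[d] - particles[k][d]))
                particles[k][d] = min(1.0, max(0.0, particles[k][d] + velocity[k][d]))
    return best_mask, best_fit

mask, fit = pso()
```

Swapping the 1-NN stub for a real decision tree learner gives the chapter's arrangement: the swarm proposes, the tree classifier scores.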


Author(s):  
K. Abumani ◽  
R. Nedunchezhian

Data mining techniques have been widely used for extracting non-trivial information from massive amounts of data. They help in strategic decision-making as well as many other applications. However, alongside its usefulness, data mining also has drawbacks: sensitive information contained in a database may be exposed by data mining tools. Different approaches are used to hide such sensitive information. The work proposed in this article applies a novel method to access the generating transactions of a transactional database with minimum effort. This reduces the time complexity of any hiding algorithm. Theoretical and empirical analysis of the algorithm shows that hiding data with the proposed approach performs association rule hiding more quickly than other algorithms.
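One common way to reach the generating transactions (those containing every item of a sensitive itemset) with minimum effort is an inverted index, so a hiding algorithm only touches the relevant transactions instead of scanning the whole database. This is an illustrative sketch with toy data, not necessarily the article's exact access method.

```python
from collections import defaultdict

transactions = {
    1: {"bread", "milk"},
    2: {"bread", "beer", "milk"},
    3: {"beer", "diapers"},
    4: {"bread", "beer"},
}

# Off-line: inverted index from each item to the transactions containing it.
index = defaultdict(set)
for tid, items in transactions.items():
    for item in items:
        index[item].add(tid)

def generating_transactions(itemset):
    """Transactions containing every item of the (sensitive) itemset."""
    ids = set.intersection(*(index[i] for i in itemset))
    return sorted(ids)

tids = generating_transactions({"bread", "beer"})
```

A hiding algorithm can then modify only these transactions (e.g., removing an item) to push the sensitive rule below its support or confidence threshold.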


2015 ◽  
Vol 25 (10) ◽  
pp. 1550127 ◽  
Author(s):  
Yong Wang ◽  
Peng Lei ◽  
Kwok-Wo Wong

Although chaotic maps possess properties useful for designing S-boxes, such as high nonlinearity and pseudorandomness, the cryptographic performance of chaos-based substitution boxes (S-boxes) cannot reach a very high level, especially in nonlinearity. In this paper, two conditions for improving the nonlinearity of an S-box are first derived from the process of calculating nonlinearity. A novel method combining chaos and optimization operations is then proposed for constructing S-boxes with high nonlinearity. The method has three phases. In the first phase, the S-box is initialized by a chaotic map. In the second, its nonlinearity is enhanced by an optimization method. To avoid falling into local optima, adjustments are made in the final phase. Experimental results show that S-boxes constructed by the proposed method have much higher nonlinearity than those based only on chaotic maps. This confirms that the algorithm is effective in generating S-boxes with high cryptographic performance.
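The first two phases of such a scheme can be sketched on a 4-bit toy: a logistic-map orbit seeds a permutation (the chaotic initialization), and random swap hill-climbing raises its Walsh-spectrum nonlinearity (the optimization). The map constant, seed, and swap strategy are assumptions for illustration; the paper's chaotic map, optimization operations, and local-optima adjustments (phase three) are not reproduced here.

```python
import random

def dot(a, b):
    """Parity of the bitwise AND (GF(2) inner product of bit vectors)."""
    return bin(a & b).count("1") & 1

def nonlinearity(sbox, n=4):
    """Walsh-spectrum nonlinearity: 2^(n-1) - max|W|/2 over all masks."""
    size = 1 << n
    worst = 0
    for b in range(1, size):               # every nonzero output mask
        for a in range(size):              # every input mask
            w = sum(1 - 2 * (dot(a, x) ^ dot(b, sbox[x])) for x in range(size))
            worst = max(worst, abs(w))
    return size // 2 - worst // 2

def chaotic_sbox(seed=0.37, mu=3.99, n=4):
    """Phase 1: derive a permutation of 0..2^n-1 from a logistic-map orbit."""
    xs, x = [], seed
    for _ in range(1 << n):
        x = mu * x * (1 - x)
        xs.append(x)
    return sorted(range(1 << n), key=lambda i: xs[i])

def improve(sbox, steps=100, rng=random.Random(1)):
    """Phase 2: hill-climb by random swaps, keeping non-worsening moves."""
    s, best = sbox[:], nonlinearity(sbox)
    for _ in range(steps):
        i, j = rng.sample(range(len(s)), 2)
        s[i], s[j] = s[j], s[i]
        nl = nonlinearity(s)
        if nl >= best:
            best = nl
        else:
            s[i], s[j] = s[j], s[i]        # revert a worsening swap
    return s, best

initial = chaotic_sbox()
optimized, nl = improve(initial)
```

Since swapping preserves bijectivity, the optimized table is still a valid S-box; an identity (linear) S-box scores zero under this measure, which is the sanity check for the nonlinearity routine.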

