MANNWARE: A Malware Classification Approach with a Few Samples Using a Memory Augmented Neural Network

Kien Tran; Hiroshi Sato; Masao Kubo

doi:10.3390/info11010051

MANNWARE: A Malware Classification Approach with a Few Samples Using a Memory Augmented Neural Network

Information ◽

10.3390/info11010051 ◽

2020 ◽

Vol 11 (1) ◽

pp. 51 ◽

Cited By ~ 1

Author(s):

Kien Tran ◽

Hiroshi Sato ◽

Masao Kubo

Keyword(s):

Neural Network ◽

Language Processing ◽

Fine Tuning ◽

Sources Of Information ◽

Classification Problems ◽

Malware Classification ◽

Feature Spaces ◽

N Gram ◽

The One ◽

Processing Techniques

The ability to stop malware as soon as they start spreading will always play an important role in defending computer systems. It must be a huge benefit for organizations as well as society if intelligent defense systems could themselves detect and prevent new types of malware as soon as they reveal only a tiny amount of samples. An approach introduced in this paper takes advantage of One-shot/Few-shot learning algorithms to solve the malware classification problems using a Memory Augmented Neural Network in combination with the Natural Language Processing techniques such as word2vec, n-gram. We embed the malware’s API calls, which are very valuable sources of information for identifying malware’s behaviors, in the different feature spaces, and then feed them to the one-shot/few-shot learning models. Evaluating the model on the two datasets (FFRI 2017 and APIMDS) shows that the models with different parameters could yield high accuracy on malware classification with only a few samples. For example, on the APIMDS dataset, it was able to guess 78.85% correctly after seeing only nine malware samples and 89.59% after fine-tuning with a few other samples. The results confirmed very good accuracies compared to the other traditional methods, and point to a new area of malware research.

Download Full-text

Self-normalizing learning on biomedical ontologies using a deep Siamese neural network

10.1101/2020.04.23.057117 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fatima Zohra Smaili ◽

Xin Gao ◽

Robert Hoehndorf

Keyword(s):

Neural Network ◽

Natural Language ◽

Language Processing ◽

Research Group ◽

Prediction Method ◽

Entity Recognition ◽

Superior Performance ◽

Biomedical Ontologies ◽

Sources Of Information ◽

Structured Information

AbstractMotivationOntologies are widely used in biomedicine for the annotation and standardization of data. One of the main roles of ontologies is to provide structured background knowledge within a domain as well as a set of labels, synonyms, and definitions for the classes within a domain. The two types of information provided by ontologies have been extensively exploited in natural language processing and machine learning applications. However, they are commonly used separately, and thus it is unknown if joining the two sources of information can further benefit data analysis tasks.ResultsWe developed a novel method that applies named entity recognition and normalization methods on texts to connect the structured information in biomedical ontologies with the information contained in natural language. We apply this normalization both to literature and to the natural language information contained within ontologies themselves. The normalized ontologies and text are then used to generate embeddings, and relations between entities are predicted using a deep Siamese neural network model that takes these embeddings as input. We demonstrate that our novel embedding and prediction method using self-normalized biomedical ontologies significantly outperforms the state-of-the-art methods in embedding ontologies on two benchmark tasks: prediction of interactions between proteins and prediction of gene–disease associations. Our method also allows us to apply ontology-based annotations and axioms to the prediction of toxicological effects of chemicals where our method shows superior performance. Our method is generic and can be applied in scenarios where ontologies consisting of both structured information and natural language labels or synonyms are used.Availabilityhttps://github.com/bio-ontology-research-group/[email protected] and [email protected]

Download Full-text

Malware Classification Based on Shallow Neural Network

Future Internet ◽

10.3390/fi12120219 ◽

2020 ◽

Vol 12 (12) ◽

pp. 219

Author(s):

Pin Yang ◽

Huiyu Zhou ◽

Yue Zhu ◽

Liang Liu ◽

Lei Zhang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Computational Cost ◽

Malicious Code ◽

Classification Model ◽

Evolutionary Trend ◽

Training Time ◽

Binary File ◽

Malware Classification ◽

N Gram

The emergence of a large number of new malicious code poses a serious threat to network security, and most of them are derivative versions of existing malicious code. The classification of malicious code is helpful to analyze the evolutionary trend of malicious code families and trace the source of cybercrime. The existing methods of malware classification emphasize the depth of the neural network, which has the problems of a long training time and large computational cost. In this work, we propose the shallow neural network-based malware classifier (SNNMAC), a malware classification model based on shallow neural networks and static analysis. Our approach bridges the gap between precise but slow methods and fast but less precise methods in existing works. For each sample, we first generate n-grams from their opcode sequences of the binary file with a decompiler. An improved n-gram algorithm based on control transfer instructions is designed to reduce the n-gram dataset. Then, the SNNMAC exploits a shallow neural network, replacing the full connection layer and softmax with the average pooling layer and hierarchical softmax, to learn from the dataset and perform classification. We perform experiments on the Microsoft malware dataset. The evaluation result shows that the SNNMAC outperforms most of the related works with 99.21% classification precision and reduces the training time by more than half when compared with the methods using DNN (Deep Neural Networks).

Download Full-text

Sparse Non-negative Matrix Language Modeling

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00102 ◽

2016 ◽

Vol 4 ◽

pp. 329-342 ◽

Cited By ~ 1

Author(s):

Joris Pelemans ◽

Noam Shazeer ◽

Ciprian Chelba

Keyword(s):

Neural Network ◽

State Of The Art ◽

Language Modeling ◽

Language Models ◽

Close Match ◽

Modeling Techniques ◽

Network Estimation ◽

N Gram ◽

Network Language ◽

The One

We present Sparse Non-negative Matrix (SNM) estimation, a novel probability estimation technique for language modeling that can efficiently incorporate arbitrary features. We evaluate SNM language models on two corpora: the One Billion Word Benchmark and a subset of the LDC English Gigaword corpus. Results show that SNM language models trained with n-gram features are a close match for the well-established Kneser-Ney models. The addition of skip-gram features yields a model that is in the same league as the state-of-the-art recurrent neural network language models, as well as complementary: combining the two modeling techniques yields the best known result on the One Billion Word Benchmark. On the Gigaword corpus further improvements are observed using features that cross sentence boundaries. The computational advantages of SNM estimation over both maximum entropy and neural network estimation are probably its main strength, promising an approach that has large flexibility in combining arbitrary features and yet scales gracefully to large amounts of data.

Download Full-text

Rare Words: A Major Problem for Contextualized Embeddings and How to Fix it by Attentive Mimicking

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6403 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8766-8774 ◽

Cited By ~ 1

Author(s):

Timo Schick ◽

Hinrich Schütze

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Deep Neural Network ◽

Language Model ◽

Language Modeling ◽

Fine Tuning ◽

Language Models ◽

Network Architectures ◽

Semantic Properties

Pretraining deep neural network architectures with a language modeling objective has brought large improvements for many natural language processing tasks. Exemplified by BERT, a recently proposed such architecture, we demonstrate that despite being trained on huge amounts of data, deep language models still struggle to understand rare words. To fix this problem, we adapt Attentive Mimicking, a method that was designed to explicitly learn embeddings for rare words, to deep language models. In order to make this possible, we introduce one-token approximation, a procedure that enables us to use Attentive Mimicking even when the underlying language model uses subword-based tokenization, i.e., it does not assign embeddings to all words. To evaluate our method, we create a novel dataset that tests the ability of language models to capture semantic properties of words without any task-specific fine-tuning. Using this dataset, we show that adding our adapted version of Attentive Mimicking to BERT does substantially improve its understanding of rare words.

Download Full-text

Transformable Convolutional Neural Network for Text Classification

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/625 ◽

2018 ◽

Cited By ~ 5

Author(s):

Liqiang Xiao ◽

Honglun Zhang ◽

Wenqing Chen ◽

Yongkun Wang ◽

Yaohui Jin

Keyword(s):

Neural Network ◽

Neural Networks ◽

Natural Language Processing ◽

Language Processing ◽

Text Classification ◽

State Of The Art ◽

Feature Transformation ◽

Sampling Locations ◽

N Gram ◽

Complex Features

Convolutional neural networks (CNNs) have shown their promising performance for natural language processing tasks, which extract n-grams as features to represent the input. However, n-gram based CNNs are inherently limited to fixed geometric structure and cannot proactively adapt to the transformations of features. In this paper, we propose two modules to provide CNNs with the flexibility for complex features and the adaptability for transformation, namely, transformable convolution and transformable pooling. Our method fuses dynamic and static deviations to redistribute the sampling locations, which can capture both current and global transformations. Our modules can be easily integrated by other models to generate new transformable networks. We test proposed modules on two state-of-the-art models, and the results demonstrate that our modules can effectively adapt to the feature transformation in text classification.

Download Full-text

Evaluation of Impact of Neural Networks in Text Classification

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/07257 ◽

2021 ◽

Vol 23 (07) ◽

pp. 1279-1292

Author(s):

Meghana S ◽

◽

Jagadeesh Sai D ◽

Dr. Krishna Raj P. M ◽

◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Confusion Matrix ◽

British Broadcasting Corporation ◽

Text Data ◽

N Gram ◽

The Impact

One of the most trending and major areas of research in Natural Language Processing (NLP) is the classification of text data. This necessarily means that the category that the text belongs to is determined by the content of the text. Various algorithms such as Recurrent Neural Network along with its variation which is Long Short-Term Memory, Hierarchical Attention Networks and also Convolutional Neural Network have been used to analyse how the context of the text can be determined from the text data which in available in terms of datasets. These algorithms each have a special characteristic of their own. While Recurrent Neural Network maintains the structural sequence of the contexts, the Convolutional Neural Network manages to obtain the n-gram feature and the Hierarchical Attention Network manages the hierarchy of the documents or data. The above said algorithms have been implemented on the British Broadcasting Corporation News datasets. Various parameters such as recall, precision, accuracy etc. have been considered along with standards such as F1-score, confusion matrix etc. to deduce the impact.

Download Full-text

RIscoper: a tool for RNA–RNA interaction extraction from the literature

Bioinformatics ◽

10.1093/bioinformatics/btz044 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3199-3202 ◽

Cited By ~ 11

Author(s):

Yang Zhang ◽

Tianyuan Liu ◽

Liqun Chen ◽

Jinxurong Yang ◽

Jiayi Yin ◽

...

Keyword(s):

Language Processing ◽

Biomedical Literature ◽

Supplementary Information ◽

Computational Studies ◽

Interaction Extraction ◽

Search Tool ◽

Strong Performance ◽

N Gram ◽

Rna Interaction ◽

Processing Techniques

Abstract Motivation Numerous experimental and computational studies in the biomedical literature have provided considerable amounts of data on diverse RNA–RNA interactions (RRIs). However, few text mining systems for RRIs information extraction are available. Results RNA Interactome Scoper (RIscoper) represents the first tool for full-scale RNA interactome scanning and was developed for extracting RRIs from the literature based on the N-gram model. Notably, a reliable RRI corpus was integrated in RIscoper, and more than 13 300 manually curated sentences with RRI information were recruited. RIscoper allows users to upload full texts or abstracts, and provides an online search tool that is connected with PubMed (PMID and keyword input), and these capabilities are useful for biologists. RIscoper has a strong performance (90.4% precision and 93.9% recall), integrates natural language processing techniques and has a reliable RRI corpus. Availability and implementation The standalone software and web server of RIscoper are freely available at www.rna-society.org/riscoper/. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

Multiclass Event Classification from Text

Scientific Programming ◽

10.1155/2021/6660651 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Daler Ali ◽

Malik Muhammad Saad Missen ◽

Mujtaba Husnain

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Sources Of Information ◽

Online Data ◽

Event Classification ◽

Global Issues ◽

Detection Approach ◽

The One ◽

Local Languages

Social media has become one of the most popular sources of information. People communicate with each other and share their ideas, commenting on global issues and events in a multilingual environment. While social media has been popular for several years, recently, it has given an exponential rise in online data volumes because of the increasing popularity of local languages on the web. This allows researchers of the NLP community to exploit the richness of different languages while overcoming the challenges posed by these languages. Urdu is also one of the most used local languages being used on social media. In this paper, we presented the first-ever event detection approach for Urdu language text. Multiclass event classification is performed by popular deep learning (DL) models, i.e.,Convolution Neural Network (CNN), Recurrence Neural Network (RNN), and Deep Neural Network (DNN). The one-hot-encoding, word embedding, and term-frequency inverse document frequency- (TF-IDF-) based feature vectors are used to evaluate the Deep Learning(DL) models. The dataset that is used for experimental work consists of more than 0.15 million (103965) labeled sentences. DNN classifier has achieved a promising accuracy of 84% in extracting and classifying the events in the Urdu language script.

Download Full-text

Fast Neural Network Engine for Natural Science Language Processing: A Drug-Search Case.

10.26434/chemrxiv.12800348 ◽

2020 ◽

Author(s):

Vadim V. Korolev ◽

Artem Mitrofanov ◽

Kirill Karpov ◽

Valery Tkachenko

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Natural Science ◽

Therapeutic Agent ◽

Semantic Relations ◽

Chemical Data ◽

Processing Methods ◽

Modern Natural

The main advantage of modern natural language processing methods is a possibility to turn an amorphous human-readable task into a strict mathematic form. That allows to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use-cases and applied to a case of searching a therapeutic agent for a COVID-19 disease by analyzing PubMed archive.

Download Full-text

Architecture Optimization Model for the Deep Neural Network For Binary Classification Problems

International Journal of Intelligent Computing and Information Sciences ◽

10.21608/ijicis.2020.18509.1008 ◽

2020 ◽

Vol 0 (0) ◽

pp. 0-0

Author(s):

Kingsley Ukaoha ◽

Efosa Igodan

Keyword(s):

Neural Network ◽

Optimization Model ◽

Deep Neural Network ◽

Binary Classification ◽

Classification Problems ◽

Architecture Optimization

Download Full-text