Anomaly Detection With Kernel Preserving Embedding

2021 ◽  
Vol 15 (5) ◽  
pp. 1-18
Author(s):  
Huawen Liu ◽  
Enhui Li ◽  
Xinwang Liu ◽  
Kaile Su ◽  
Shichao Zhang

Similarity representation plays a central role in increasingly popular anomaly detection techniques, which have been successfully applied in a variety of real-world scenarios. Many low-rank representation techniques have been introduced to measure the similarity relations of data; yet they focus only on minimizing reconstruction error and ignore the structural information of the data. In addition, traditional low-rank representation methods often use the nuclear norm as their low-rank constraint, which easily yields a suboptimal solution. To address these problems, in this article we propose a novel anomaly detection method that exploits kernel preserving embedding, together with the double nuclear norm, to explore the similarity relations of data. Based on the similarity relations, a probability transition matrix is derived, and a tailored random walk is then adopted to reveal anomalies. The proposed method not only preserves the manifold structural properties of the data but also alleviates the suboptimal-solution problem. To validate the method, extensive experiments comparing it with eight popular anomaly detection algorithms were conducted on 12 widely used datasets. The experimental results show that our detection method outperformed the state-of-the-art anomaly detection algorithms in most cases.
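The last two steps of the pipeline above (similarity relations, then a transition matrix, then a random walk) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: a plain Gaussian kernel stands in for the similarity that the paper learns with kernel preserving embedding and the double nuclear norm, and a uniform-restart random walk stands in for the "tailored" walk, whose details the abstract does not give. The function names, `bandwidth`, `damping`, and the toy points are all hypothetical.

```python
import math

def gaussian_similarity(points, bandwidth=1.0):
    """Pairwise Gaussian-kernel similarity with a zero diagonal (assumed stand-in)."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                sim[i][j] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    return sim

def transition_matrix(sim):
    """Row-normalise similarities so every row is a probability distribution."""
    n = len(sim)
    rows = []
    for row in sim:
        total = sum(row)
        # Fall back to a uniform row if a point is similar to nothing.
        rows.append([v / total for v in row] if total > 0 else [1.0 / n] * n)
    return rows

def random_walk_scores(P, damping=0.1, iters=200):
    """Visiting probabilities of a random walk with uniform restart (assumed variant)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [
            damping / n + (1.0 - damping) * sum(pi[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]
    return pi

# Toy data: five clustered points and one isolated point (index 5).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (8.0, 8.0)]
scores = random_walk_scores(transition_matrix(gaussian_similarity(points)))

# The walk rarely visits a point that is dissimilar to everything else,
# so the lowest visiting probability marks the most anomalous point.
most_anomalous = min(range(len(points)), key=lambda i: scores[i])
```

In this sketch a low visiting probability is read as a high anomaly score; whether the paper scores anomalies in exactly this way is not stated in the abstract.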

2019 ◽  
Vol 11 (2) ◽  
pp. 192 ◽  
Author(s):  
Yixin Yang ◽  
Jianqi Zhang ◽  
Shangzhen Song ◽  
Delian Liu

Anomaly detection (AD), which aims to distinguish targets with significant spectral differences from the background, has become an important topic in hyperspectral imagery (HSI) processing. In this paper, a novel anomaly detection algorithm based on dictionary-construction-based low-rank representation (LRR) and adaptive weighting is proposed. This algorithm has three main advantages. First, because LRR is consistent with the AD problem, it is employed to mine the lowest-rank representation of hyperspectral data by imposing a low-rank constraint on the representation coefficients. The sparse component contains most of the anomaly information and can be used for anomaly detection. Second, to better separate the sparse anomalies from the background component, a background dictionary construction strategy based on the usage frequency of the dictionary atoms for HSI reconstruction is proposed. The constructed dictionary excludes possible anomalies and contains all background categories, thus spanning a more reasonable background space. Finally, to further enhance the response difference between the background pixels and anomalies, the response output obtained by LRR is multiplied by an adaptive weighting matrix, so that the anomaly pixels are more easily distinguished from the background. Experiments on synthetic and real-world hyperspectral datasets demonstrate the superiority of the proposed method over other anomaly detectors.
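For orientation, the low-rank constraint and sparse component mentioned above are usually written in the standard LRR form shown below. This is the textbook formulation, given here as an assumption about what the abstract refers to; the paper's exact objective, its dictionary construction, and the definition of the weights are not stated in the abstract. Here X holds the pixel spectra as columns, D is the background dictionary, Z is the coefficient matrix, E is the sparse residual, and w_i is a placeholder for the adaptive weight of pixel i.

```latex
\min_{Z,\,E} \; \|Z\|_{*} + \lambda \,\|E\|_{2,1}
\quad \text{s.t.} \quad X = DZ + E,
\qquad
r_i = \bigl\| E_{:,i} \bigr\|_{2},
\qquad
\tilde{r}_i = w_i \, r_i .
```

The nuclear norm pushes the background part DZ toward low rank, the column-wise norm on E keeps the residual concentrated on a few pixels, and a pixel is flagged as anomalous when its weighted response exceeds a chosen threshold.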


Anomaly detection plays a vital role in data preprocessing and in the mining of outstanding points for marketing, network sensors, fraud detection, intrusion detection, and stock market analysis. Recent studies concentrate more on outlier detection for real-time datasets. Current anomaly detection research focuses on developing innovative machine learning methods and on improving computation time. Sentiment mining is the process of discovering how people feel about a particular topic. Although many anomaly detection techniques have been proposed, the literature lacks a comparative performance evaluation on sentiment mining datasets. In this study, three popular unsupervised anomaly detection approaches (density-based, statistical-based, and cluster-based) are evaluated on a movie review sentiment mining dataset. This paper provides a basis for anomaly detection methods in sentiment mining research. The results show that the density-based method (LOF) is best suited to the movie review sentiment dataset.
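The density-based method named above, the Local Outlier Factor (LOF), can be sketched in a few lines. This is a minimal illustration of the standard LOF definition, simplified to use exactly k neighbours and to ignore ties and duplicate points; it is not the study's implementation. The 2-D vectors are hypothetical stand-ins for review feature vectors, and the function name and `k` are assumptions.

```python
import math

def local_outlier_factor(points, k=2):
    """Standard LOF with exactly k neighbours per point (no tie or duplicate handling)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical feature vectors: five similar items and one isolated item (index 5).
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
lof = local_outlier_factor(features, k=2)
outlier_index = max(range(len(features)), key=lambda i: lof[i])
```

Scores close to 1 indicate a point whose local density matches that of its neighbours; scores well above 1 indicate a point in a much sparser region.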


2020 ◽  
Author(s):  
Peter Skelsey

Information from crop disease surveillance programs and outbreak investigations provides real-world data about the drivers of epidemics. In many cases, however, only information on outbreaks is collected and data from surrounding healthy crops are omitted. Use of such data to develop models that can forecast risk/no-risk of disease is therefore problematic, as information relating to the no-risk status of healthy crops is missing. This study explored a novel application of anomaly detection techniques to derive models for forecasting risk of crop disease from data comprising outbreaks only. This was done in two steps. In the training phase, the algorithms were used to learn the envelope of weather conditions most associated with historic crop disease outbreaks. In the testing phase, the algorithms were used for hindcasting of historic outbreak events. Five different anomaly detection algorithms were compared according to their accuracy in forecasting outbreaks: robust covariance, one-class k-means, Gaussian mixture model, kernel density estimator, and one-class support vector machine. A case study of potato late blight survey data from across Great Britain was used for proof-of-concept. The results showed that the Gaussian mixture model had the highest forecast accuracy at 97.0%, followed by one-class k-means at 96.9%. There was added value in combining the algorithms in an ensemble to provide a more accurate and robust forecasting tool that can be tailored to produce region-specific alerts. The techniques used here can easily be applied to outbreak data from other crop pathosystems to derive tools for agricultural decision support.
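The two-step idea above (learn an envelope from outbreak-only data, then test new conditions against it) can be sketched with one of the five model types, a kernel density estimator. This is a minimal sketch under stated assumptions, not the study's code: the weather records are invented, the two features, the `bandwidth`, and the rule of placing the threshold at the lowest training density are all hypothetical choices.

```python
import math
import statistics

# Hypothetical outbreak-only records: (temperature in deg C, relative humidity in %).
outbreaks = [
    (15.0, 92.0), (16.5, 95.0), (14.0, 90.0), (17.0, 93.0),
    (15.5, 96.0), (16.0, 91.0), (14.5, 94.0), (18.0, 97.0),
]

# Standardise each feature so one bandwidth is meaningful for both.
means = [statistics.mean(col) for col in zip(*outbreaks)]
stds = [statistics.stdev(col) for col in zip(*outbreaks)]

def scale(record):
    return tuple((v - m) / s for v, m, s in zip(record, means, stds))

train = [scale(r) for r in outbreaks]

def kde_density(x, bandwidth=0.75):
    """Gaussian kernel density estimate at x, fitted on the outbreak records only."""
    d = len(x)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    total = sum(
        math.exp(-math.dist(x, t) ** 2 / (2.0 * bandwidth ** 2)) for t in train
    )
    return total / (len(train) * norm)

# Training phase: the envelope is every region at least as dense as the
# least typical historic outbreak (an assumed thresholding rule).
threshold = min(kde_density(t) for t in train)

def forecast_risk(record):
    """Testing phase: True if the conditions fall inside the learned envelope."""
    return kde_density(scale(record)) >= threshold

cool_humid = forecast_risk((15.8, 93.5))  # conditions typical of the outbreaks
hot_dry = forecast_risk((28.0, 40.0))     # conditions far from any outbreak
```

Conditions inside the envelope are reported as risk and everything else as no-risk, which is how a one-class model sidesteps the missing healthy-crop data.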

