Dynamic graph embedding for outlier detection on multiple meteorological time series

PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0247119
Author(s):  
Gen Li ◽  
Jason J. Jung

Existing dynamic graph embedding-based outlier detection methods mainly focus on the evolution of graphs and ignore the similarities among them. To overcome this limitation and effectively detect abnormal climatic events from meteorological time series, we propose a dynamic graph embedding model based on graph proximity, called DynGPE. Climatic events are represented as a graph in which each vertex indicates meteorological data and each edge indicates a spurious relationship between two meteorological time series that are not causally related. The graph proximity is described as the distance between two graphs. DynGPE can cluster similar climatic events in the embedding space. Abnormal climatic events are distant from most of the other events and can be detected using outlier detection methods. We conducted experiments by applying three outlier detection methods (i.e., isolation forest, local outlier factor, and box plot) to real meteorological data. The results show that DynGPE outperforms the baseline by 44.3% on average in terms of the F-measure. Isolation forest provides the best performance and stability, outperforming the local outlier factor and box plot methods by 15.4% and 78.9% on average, respectively.
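As a rough illustration of the final detection step described above, the sketch below applies the box plot rule (Tukey fences) to points in an embedding space. It is not the DynGPE implementation: the 2-D `embeddings` are made-up values, and scoring each point by its distance to the coordinate-wise median is an assumption chosen only to turn a multi-dimensional embedding into a single number that the box plot rule can threshold.

```python
import math
import statistics

def boxplot_outliers(embeddings, whisker=1.5):
    """Return indices of embeddings lying beyond the upper Tukey fence.

    Assumption: each embedding is scored by its Euclidean distance to the
    coordinate-wise median of all embeddings (a hypothetical scoring choice).
    """
    dims = range(len(embeddings[0]))
    center = tuple(statistics.median(e[d] for e in embeddings) for d in dims)
    dists = [math.dist(e, center) for e in embeddings]
    # Quartiles of the distance scores; the fence is Q3 + whisker * IQR.
    q1, _, q3 = statistics.quantiles(dists, n=4, method="inclusive")
    upper_fence = q3 + whisker * (q3 - q1)
    return [i for i, d in enumerate(dists) if d > upper_fence]

# Hypothetical embeddings: seven similar events and one distant event.
embeddings = [
    (0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1),
    (0.1, 0.1), (-0.1, -0.1), (0.05, -0.05), (5.0, 5.0),
]
print(boxplot_outliers(embeddings))
```

Only the upper fence is checked because the score is a distance, so small values simply mean "close to the bulk of the events".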

2020 ◽  
Vol 5 (1) ◽  
pp. 1
Author(s):  
Omar Alghushairy ◽  
Raed Alsini ◽  
Terence Soule ◽  
Xiaogang Ma

Outlier detection is a statistical procedure that aims to find suspicious events or items that are different from the normal form of a dataset. It has drawn considerable interest in the field of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset, but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams are still deficient and better algorithms need to be developed that can effectively analyze the high velocity of data streams to detect local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
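To make the distinction between global and local outliers concrete, here is a minimal sketch of the basic LOF computation in plain Python. It follows the usual definitions (k-distance, reachability distance, local reachability density) but simplifies tie handling by always taking exactly k neighbours, and it assumes the data contain no duplicate points. The data set is invented for illustration: the last point lies well inside the overall value range, yet it is far from its dense neighbourhood, so it is a local outlier rather than a global one.

```python
import math

def k_nearest(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k=3):
    """Compute a Local Outlier Factor for every point (simplified tie handling)."""
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for _, j in neigh[i]) / (k * lrd[i]) for i in range(n)]

dense = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.0, 0.1), (0.1, 0.1), (0.2, 0.1)]
sparse = [(5.0, 5.0), (6.0, 5.0), (7.0, 5.0), (5.0, 6.0), (6.0, 6.0), (7.0, 6.0)]
points = dense + sparse + [(0.6, 0.6)]  # last point: local outlier

scores = lof_scores(points, k=3)
for p, s in zip(points, scores):
    print(p, round(s, 2))
```

Points inside either cluster score close to 1 even though the two clusters have very different densities, while the point next to the dense cluster scores well above 1. Note that this sketch recomputes all neighbourhoods from scratch, which is exactly what makes the static formulation awkward for data streams.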


2021 ◽  
Vol 11 (13) ◽  
pp. 5861
Author(s):  
Gen Li ◽  
Tri-Hai Nguyen ◽  
Jason J. Jung

With the large volume of time series data from the Internet of Things in Ambient Intelligence-enabled smart environments, many supervised learning-based anomaly detection methods have been investigated, but they ignore the correlation among the time series. To address this issue, we present a new idea for anomaly detection based on dynamic graph embedding, in which the dynamic graph comprises the multiple time series and their correlation in each time interval. We propose an entropy for measuring a graph’s information injunction with a correlation matrix to define similarity between graphs. A dynamic graph embedding model based on the graph similarity is proposed to cluster the graphs for anomaly detection. We implement the proposed model in vehicular edge computing for traffic incident detection. The experiments are carried out using traffic data produced by the Simulation of Urban Mobility framework. The experimental findings reveal that the proposed method achieves better results than the baselines by 14.5% and 18.1% on average with respect to F1-score and accuracy, respectively.
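The idea of a dynamic graph built from multiple time series and their correlation in each time interval can be sketched as follows. This covers only the graph construction step, not the entropy measure or the embedding model. The function names, the non-overlapping windows, the Pearson coefficient, and the 0.8 threshold are all assumptions made for illustration.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0

def correlation_graphs(series, window, threshold=0.8):
    """Build one graph per non-overlapping window.

    Each graph is a dict mapping a pair of series names to their correlation;
    an edge is kept only when |correlation| >= threshold (assumed cut-off).
    """
    length = min(len(v) for v in series.values())
    graphs = []
    for start in range(0, length - window + 1, window):
        edges = {}
        for a, b in combinations(sorted(series), 2):
            r = pearson(series[a][start:start + window],
                        series[b][start:start + window])
            if abs(r) >= threshold:
                edges[(a, b)] = r
        graphs.append(edges)
    return graphs

# Hypothetical sensor readings: a and b move together in the first
# interval and in opposite directions in the second.
series = {
    "a": [1, 2, 3, 4, 1, 2, 3, 4],
    "b": [2, 4, 6, 8, 8, 6, 4, 2],
    "c": [1, 0, 1, 0, 0, 1, 0, 1],
}
graphs = correlation_graphs(series, window=4)
for t, g in enumerate(graphs):
    print(t, g)
```

In this toy input the edge set is the same in both intervals, but the edge weight flips sign, which is the kind of change a graph similarity measure over correlation matrices would have to capture.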


2021 ◽  
Vol 5 (1) ◽  
pp. 56
Author(s):  
Giulia Moschini ◽  
Régis Houssou ◽  
Jérôme Bovay ◽  
Stephan Robert-Nicoud

This paper addresses unsupervised credit card fraud detection in unbalanced datasets using the ARIMA model. The ARIMA model is fitted to the regular spending behaviour of the customer and is used to detect fraud when deviations or discrepancies appear. Our model is applied to credit card datasets and is compared with four anomaly detection approaches, namely, the K-means, box plot, local outlier factor and isolation forest approaches. The results show that the ARIMA model has better detection power than the benchmark models.
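The fit-then-flag idea can be sketched with a deliberately simplified stand-in: an AR(1) model fitted by least squares, which is the special case ARIMA(1,0,0), rather than a full ARIMA model with order selection. The spending figures, the three-sigma threshold, and the choice to replace a flagged value with its forecast before predicting the next step are all assumptions for illustration, not the authors' procedure.

```python
import statistics

def fit_ar1(history):
    """Least-squares fit of x_t = c + phi * x_{t-1}; returns (c, phi, sigma)."""
    x, y = history[:-1], history[1:]
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    residuals = [b - (c + phi * a) for a, b in zip(x, y)]
    return c, phi, statistics.pstdev(residuals)

def flag_deviations(history, new_values, n_sigma=3.0):
    """Return positions in new_values whose one-step forecast error is too large."""
    c, phi, sigma = fit_ar1(history)
    prev = history[-1]
    flagged = []
    for t, value in enumerate(new_values):
        forecast = c + phi * prev
        if abs(value - forecast) > n_sigma * sigma:
            flagged.append(t)
            prev = forecast  # assumed: do not let a flagged value distort the next forecast
        else:
            prev = value
    return flagged

# Hypothetical daily spending of one customer (regular behaviour),
# followed by new transactions containing one extreme amount.
regular = [50, 52, 49, 51, 50, 53, 48, 50, 51, 49, 52, 50]
incoming = [50, 51, 49, 400, 50, 52]
print(flag_deviations(regular, incoming))
```

Because the model is fitted only to the customer's regular behaviour, no fraud labels are needed, which is what makes the approach unsupervised.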


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 132980-132989
Author(s):  
Siyu Luan ◽  
Zonghua Gu ◽  
Leonid B. Freidovich ◽  
Lili Jiang ◽  
Qingling Zhao

Author(s):  
Shashank Singh and Meenu Garg

It is essential that credit card companies can identify fraudulent credit card transactions so that customers are not charged for items they did not purchase. Such problems can be tackled with data science, and its importance, along with that of machine learning, cannot be overstated. This project aims to illustrate the modelling of a data set using machine learning for credit card fraud detection. The credit card fraud detection problem involves modelling past credit card transactions together with the knowledge of which ones turned out to be fraudulent. This model is then used to recognize whether a new transaction is fraudulent. Our objective is to detect 100% of the fraudulent transactions while minimizing incorrect fraud classifications. Credit card fraud detection is a typical example of classification. In this process, we have focused on analysing and preprocessing data sets as well as on deploying multiple anomaly detection algorithms, such as the Local Outlier Factor and Isolation Forest algorithms, on the PCA-transformed Credit Card Transaction
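For readers unfamiliar with the Isolation Forest algorithm named above, the following is a compact from-scratch sketch in plain Python. It is a teaching version, not a substitute for a library implementation: the rows are synthetic and are merely assumed to be already PCA-transformed features, the split feature is drawn from all features rather than only non-constant ones, and at least two rows are required. The scoring follows the standard formula 2^(-E[h(x)] / c(n)), where h(x) is the path length and c(n) the average path length normaliser.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def average_path_length(n):
    """Normaliser c(n): average path length of an unsuccessful BST search."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, row, depth=0):
    """Depth at which row lands, plus c(size) for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, feature, split, left, right = tree
    child = left if row[feature] < split else right
    return path_length(child, row, depth + 1)

def isolation_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Anomaly score in (0, 1] per row; values near 1 are easy to isolate."""
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = average_path_length(size)
    scores = []
    for row in rows:
        mean_depth = sum(path_length(t, row) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores

# Synthetic stand-in for PCA-transformed transactions: 60 ordinary rows
# around the origin and one row far away from them.
data_rng = random.Random(42)
transactions = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
transactions.append((8.0, 8.0))

scores = isolation_scores(transactions)
print(scores.index(max(scores)))
```

A row that can be separated from the rest in a few random splits gets a short average path and therefore a high score, which is why no distance or density computation is needed.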


2021 ◽  
pp. 1-12
Author(s):  
Chunyan She ◽  
Shaohua Zeng

Outlier detection is a hot issue in data mining and has plenty of real-world applications. LOF (Local Outlier Factor) can capture the abnormal degree of objects in datasets with different density levels, and many extended algorithms have been proposed in recent years. However, LOF needs to search the nearest neighborhood of each object over the whole dataset, which greatly increases the time cost. Most of these extended algorithms consider only the distance between an object and its neighborhood but ignore the local distribution of an object within its neighborhood, resulting in a high false-positive rate. To improve the running speed, a rough clustering based on triple fusion is proposed, which divides a dataset into several subsets so that outlier detection is performed only within each subset. Then, considering the local distribution of an object within its neighborhood, a new local outlier factor is constructed to estimate the abnormal degree of each object. Finally, the experimental results indicate that the proposed algorithm has better performance and lower running time than the others.

