Hydrologic Time Series Anomaly Detection Based on Flink

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Feng Ye ◽  
Zihao Liu ◽  
Qinghua Liu ◽  
Zhijian Wang

The data mining and analysis of time series in critical applications is still worth studying. Currently, in the field of hydrological time series, most outlier detection work focuses on improving specificity. To efficiently detect outliers in massive hydrologic sensor data, an anomaly detection method for hydrological time series based on Flink is proposed. Firstly, a sliding window and the ARIMA model are used to forecast the data stream. Then, a confidence interval is calculated for the prediction result, and values outside the interval are judged to be candidate anomalies. Finally, based on historical batch data, the K-Means++ algorithm is used to cluster the batch data; the state transition probability is calculated, and the quality of the candidate anomalies is evaluated. Taking hydrological sensor data obtained from the Chu River as experimental data, experiments on detection time and outlier detection performance are carried out. The results show that when processing tens of millions of records, the time taken by two slaves is less than that taken by one slave, with a maximum reduction of 17.43%. The sensitivity of the evaluation increases from 72.91% to 92.98%. In terms of delay, the average delay for different numbers of slaves is roughly the same and remains within 20 ms. This shows that, on a big data platform, the proposed algorithm can effectively improve the computational efficiency of hydrologic time series detection over tens of millions of records and significantly improves sensitivity.
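The abstract's first stage, forecast plus confidence interval, can be illustrated with a minimal single-machine sketch. The paper's Flink pipeline and fitted ARIMA model are not given here, so this stand-in forecasts each point with a sliding-window mean and flags values outside a ±z-standard-deviation interval; the window size and z threshold are assumptions for illustration.

```python
import numpy as np

def flag_candidates(series, window=20, z=3.0):
    """Flag points outside a confidence interval built from a sliding window.

    Stand-in for the paper's ARIMA forecast: the forecast is the window
    mean and the interval is +/- z standard deviations of the window.
    """
    series = np.asarray(series, dtype=float)
    flags = np.zeros(len(series), dtype=bool)
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = hist.mean(), hist.std()
        if abs(series[i] - mu) > z * sigma:
            flags[i] = True  # candidate anomaly, to be evaluated downstream
    return flags

rng = np.random.default_rng(0)
data = rng.normal(10.0, 1.0, 200)
data[150] = 25.0  # injected spike
print(flag_candidates(data)[150])  # the spike is flagged as a candidate
```

In the paper, points flagged at this stage are only alternatives; the K-Means++/state-transition step decides their final quality.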

Author(s):  
Cong Gao ◽  
Ping Yang ◽  
Yanping Chen ◽  
Zhongmin Wang ◽  
Yue Wang

With the large-scale deployment of wireless sensor networks, anomaly detection for sensor data is becoming increasingly important in various fields. As a vital form of sensor data, time series exhibit three main types of anomaly: point anomaly, pattern anomaly, and sequence anomaly. In production environments, the analysis of pattern anomalies is the most rewarding. However, the traditional processing model, cloud computing, struggles with large amounts of widely distributed data. This paper presents an edge-cloud collaboration architecture for pattern anomaly detection in time series. A task migration algorithm is developed to alleviate the problem of backlogged detection tasks at edge nodes. In addition, detection tasks related to long-term and short-term correlations in the time series are allocated to the cloud and the edge nodes, respectively. A multi-dimensional feature representation scheme is devised to perform efficient dimension reduction. Two key components of the feature representation, trend identification and feature point extraction, are elaborated. Based on the result of the feature representation, pattern anomaly detection is performed with an improved kernel density estimation method. Finally, extensive experiments are conducted on synthetic and real-world data sets.
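The final detection step is based on kernel density estimation: points whose estimated density under the baseline is low are treated as anomalous. The paper's improved estimator is not specified in the abstract, so the sketch below uses a plain Gaussian KDE with an assumed bandwidth.

```python
import numpy as np

def kde_scores(train, query, bandwidth=0.5):
    """Score 1-D query points by their Gaussian kernel density under `train`.

    Low density suggests a pattern anomaly. Plain KDE; the paper's
    improved estimator is not reproduced here.
    """
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    diffs = (query[:, None] - train[None, :]) / bandwidth
    dens = np.exp(-0.5 * diffs ** 2).sum(axis=1)
    return dens / (len(train) * bandwidth * np.sqrt(2 * np.pi))

normal = np.linspace(0.0, 1.0, 100)   # baseline feature values
scores = kde_scores(normal, np.array([0.5, 5.0]))
print(scores)  # the in-distribution point 0.5 scores far higher than 5.0
```

A threshold on the density score (e.g. a low quantile of baseline scores) would then separate anomalies from normal patterns.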


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xuguang Liu

Aiming at the anomaly detection problem in sensor data, traditional algorithms usually focus only on the continuity of single-source data and ignore the spatiotemporal correlation between multisource data, which reduces detection accuracy to a certain extent. Besides, due to the rapid growth of sensor data, centralized cloud computing platforms cannot meet the real-time detection needs for large-scale abnormal data. To solve this problem, a real-time detection method for abnormal IoT sensor data based on edge computing is proposed. Firstly, sensor data are represented as time series, and the K-nearest neighbor (KNN) algorithm is used to detect outliers and isolated groups in the data stream. Secondly, an improved DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is proposed that considers the spatiotemporal correlation between multisource data. Its parameters can be set according to the sample characteristics within the window, which overcomes the slow convergence caused by using global parameters on large samples, and it makes full use of data correlation to complete anomaly detection. Moreover, this paper proposes a distributed anomaly detection model for sensor data based on edge computing, which performs data processing on computing resources as close to the data source as possible and thereby improves the overall efficiency of data processing. Finally, simulation results show that the proposed method achieves higher computational efficiency and detection accuracy than traditional methods and is feasible in practice.
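The first stage, KNN-based detection of outliers and isolated groups within a window, can be sketched as follows. The abstract does not give the distance metric or k, so a 1-D absolute distance and k=3 are assumed; each point is scored by its mean distance to its k nearest neighbours, and large scores mark outliers.

```python
import numpy as np

def knn_outlier_scores(window, k=3):
    """Score each point in a window by its mean distance to its k nearest
    neighbours; large scores indicate outliers or small isolated groups."""
    window = np.asarray(window, dtype=float)
    dist = np.abs(window[:, None] - window[None, :])
    dist.sort(axis=1)
    return dist[:, 1:k + 1].mean(axis=1)  # column 0 is the zero self-distance

w = np.array([1.0, 1.1, 0.9, 1.05, 9.0])  # one anomalous reading
scores = knn_outlier_scores(w)
print(scores.argmax())  # index 4, the 9.0 reading
```

In the proposed method this per-window scoring would run on the edge node, with the improved DBSCAN stage refining the result using spatiotemporal correlation across sources.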


As per statistics, over 30 million people in India have been diagnosed with diabetes. There is an enormous need to recognize possible fluctuations of blood glucose beforehand with minimal error, thereby enabling proactive decision making. The present work details the algorithms used for glucose prediction and makes a comparative assessment of glucose prediction on Librepro Continuous Glucose Monitoring (CGM) sensor data from Type 1 Diabetes Mellitus (T1DM) subjects. For the development and evaluation of the model, 10 days of observation data from 10 different subjects with T1DM, recorded at 15-minute intervals, are considered. The model's predictive performance is evaluated for one-step-ahead (15-minute prediction horizon), two-step-ahead (30-minute prediction horizon), and three-step-ahead (45-minute prediction horizon) forecasting under a univariate glucose prediction model. A novel hybrid data-driven model that combines linear regression and autoregression is designed and developed for glucose prediction. This model gave satisfactory performance, with a MAPE of 3.22 and an RMSE of 7.38 mg/dl, compared with the more complex ARIMA model, which requires its parameters to be chosen properly beforehand. In this paper, the author proposes an ensemble data-driven method for glucose prediction under time series forecasting.
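The autoregressive half of such a hybrid model fits a linear map from recent readings to the next one. The paper's model order and the exact blending with the linear-regression term are not given, so the sketch below assumes an AR(3) on synthetic 15-minute readings, fit by least squares.

```python
import numpy as np

def fit_ar(series, order=3):
    """Least-squares AR(order) fit; returns intercept + lag coefficients.

    Sketch of the autoregressive component; order 3 (45 minutes of
    history) is an assumption, not the paper's choice.
    """
    series = np.asarray(series, dtype=float)
    # Row j holds [series[j], ..., series[j+order-1]], target series[j+order].
    X = np.column_stack([series[i:len(series) - order + i] for i in range(order)])
    X = np.column_stack([np.ones(len(X)), X])
    y = series[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_next(series, coef, order=3):
    """One-step-ahead (15-minute horizon) forecast from the last `order` readings."""
    x = np.concatenate([[1.0], series[-order:]])
    return float(x @ coef)

glucose = 100 + 10 * np.sin(np.arange(96) / 8.0)  # synthetic 15-min CGM trace
coef = fit_ar(glucose)
next_value = predict_next(glucose, coef)
```

Multi-step horizons (30 and 45 minutes) follow by feeding each prediction back in as the newest reading.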


2021 ◽  
Vol 48 (4) ◽  
pp. 49-52
Author(s):  
Gastón García González ◽  
Pedro Casas ◽  
Alicia Fernández ◽  
Gabriel Gómez

Despite the many attempts and approaches for anomaly detection explored over the years, the automatic detection of rare events in data communication networks remains a complex problem. In this paper we introduce Net-GAN, a novel approach to network anomaly detection in time series, using recurrent neural networks (RNNs) and generative adversarial networks (GANs). Different from the state of the art, which traditionally focuses on univariate measurements, Net-GAN detects anomalies in multivariate time series, exploiting temporal dependencies through RNNs. Net-GAN discovers the underlying distribution of the baseline multivariate data without making any assumptions on its nature, offering a powerful approach to detect anomalies in complex, difficult-to-model network monitoring data. We further exploit the concepts behind generative models to conceive Net-VAE, a complementary approach to Net-GAN for network anomaly detection based on variational auto-encoders (VAEs). We evaluate Net-GAN and Net-VAE in different monitoring scenarios, including anomaly detection in IoT sensor data and intrusion detection in network measurements. Generative models represent a promising approach for network anomaly detection, especially when considering the complexity and ever-growing number of time series to monitor in operational networks.
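The core idea shared by Net-GAN and Net-VAE is learning the baseline multivariate distribution and scoring new samples by how poorly the learned model reproduces them. The architectures themselves are not given in the abstract, so the following is only a conceptual illustration: a linear autoencoder (PCA) stands in for the generative model, and the anomaly score is the reconstruction error of a sample under the learned baseline subspace.

```python
import numpy as np

def fit_linear_ae(X, dim=2):
    """Fit a linear autoencoder (PCA) to baseline multivariate samples.

    Simplified stand-in for a trained GAN/VAE: the baseline structure is
    captured by the top principal components."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:dim]

def recon_error(x, mean, comps):
    """Anomaly score: reconstruction error under the baseline subspace."""
    z = (x - mean) @ comps.T          # encode
    x_hat = mean + z @ comps          # decode
    return float(np.linalg.norm(x - x_hat))

rng = np.random.default_rng(1)
# Baseline traffic features living on a 2-D subspace of R^3.
base = rng.normal(size=(500, 2)) @ np.array([[1.0, 2.0, 0.5],
                                             [0.3, 0.1, 1.5]])
mean, comps = fit_linear_ae(base, dim=2)
normal_err = recon_error(base[0], mean, comps)
anom_err = recon_error(np.array([10.0, -10.0, 10.0]), mean, comps)
print(normal_err < anom_err)  # the off-baseline sample reconstructs poorly
```

A VAE replaces the linear encode/decode with learned neural networks and a stochastic latent, but the detection criterion, high reconstruction (or discrimination) error relative to the baseline, is the same.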


2018 ◽  
Vol 18 (1) ◽  
pp. 20-32 ◽  
Author(s):  
Jong-Min Kim ◽  
Jaiwook Baik

2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai

2011 ◽  
Vol 9 (3) ◽  
pp. 148-156
Author(s):  
Leonardo G. Tampelini ◽  
Clodis Boscarioli ◽  
Sarajane M. Peres ◽  
Silvio C. Sampaio
