Genetic Algorithm for the Mutual Information-Based Feature Selection in Univariate Time Series Data

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 9597-9609
Author(s):  
Umair F. Siddiqi ◽  
Sadiq M. Sait ◽  
Okyay Kaynak


Author(s):  
T. Warren Liao

In this chapter, we present genetic algorithm (GA)-based methods developed for clustering univariate time series of equal or unequal length as an exploratory step of data mining. These methods essentially implement the k-medoids algorithm: each chromosome encodes, in binary, the data objects serving as the k medoids. To compare their performance, both fixed-parameter and adaptive GAs were used. We first employed the synthetic control chart data set to investigate the performance of three fitness functions, two distance measures, and other GA parameters such as population size, crossover rate, and mutation rate. Experiments were also run on two more sets of time series, with and without a known number of clusters: the cylinder-bell-funnel data and the novel battle simulation data. The clustering results are presented and discussed.
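The chromosome encoding described above can be sketched as follows. This is a minimal illustration of a binary k-medoids chromosome and its fitness (the total distance of every series to its nearest medoid), not the chapter's implementation; the function names and toy data are hypothetical, and only random search over chromosomes stands in for the full GA:

```python
import math
import random

def euclidean(a, b):
    # Distance between two equal-length univariate series.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fitness(chromosome, data):
    # chromosome: binary list; a 1 marks a series chosen as a medoid.
    # Fitness is the k-medoids objective: total distance of every
    # series to its nearest medoid (lower is better).
    medoids = [data[i] for i, bit in enumerate(chromosome) if bit == 1]
    if not medoids:
        return float("inf")
    return sum(min(euclidean(s, m) for m in medoids) for s in data)

def random_chromosome(n, k, rng):
    # Exactly k of n bits set, selecting k medoids.
    bits = [0] * n
    for i in rng.sample(range(n), k):
        bits[i] = 1
    return bits

# Toy run: four short series forming two obvious clusters.
rng = random.Random(0)
data = [[0, 0, 0], [0.1, 0, 0], [5, 5, 5], [5, 5.1, 5]]
best = min((random_chromosome(len(data), 2, rng) for _ in range(50)),
           key=lambda c: fitness(c, data))
```

A real GA would evolve such chromosomes with crossover and mutation and rank them by this fitness; the abstract's adaptive variant would additionally tune the crossover and mutation rates during the run.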


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 288
Author(s):  
Kuiyong Song ◽  
Nianbin Wang ◽  
Hongbin Wang

High-dimensional time series classification is a challenging problem. Distance-based similarity measures are one family of methods for time series classification. This paper proposes a metric learning-based univariate time series classification method (ML-UTSC), which uses a Mahalanobis matrix learned by metric learning to compute the local distance between multivariate time series, and combines Dynamic Time Warping (DTW) with nearest-neighbor classification to obtain the final result. In this method, the univariate time series is represented as multivariate time series data with mean, variance, and slope features. Next, a three-dimensional Mahalanobis matrix is learned from these data by metric learning. The time series is divided into segments of equal length so that the Mahalanobis matrix describes the features of the time series data more accurately. Compared with the most effective existing measures, the experimental results show that the proposed algorithm achieves a lower classification error rate on most of the test datasets.
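The representation described above can be illustrated with a short sketch: extract (mean, variance, slope) features from equal-length segments, then compute a Mahalanobis local distance between feature triples. The identity matrix below merely stands in for the learned matrix, and all names are illustrative, not the paper's code; the DTW alignment over these local distances is omitted:

```python
import math

def segment_features(series, seg_len):
    # Represent a univariate series as a multivariate sequence of
    # (mean, variance, slope) feature triples over equal-length segments.
    feats = []
    for i in range(0, len(series) - seg_len + 1, seg_len):
        seg = series[i:i + seg_len]
        n = len(seg)
        mean = sum(seg) / n
        var = sum((x - mean) ** 2 for x in seg) / n
        # Least-squares slope against time indices 0..n-1.
        t_mean = (n - 1) / 2
        denom = sum((t - t_mean) ** 2 for t in range(n))
        slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(seg)) / denom
        feats.append((mean, var, slope))
    return feats

def mahalanobis(u, v, m):
    # Local distance between two feature triples under a 3x3 matrix M:
    # sqrt((u - v)^T M (u - v)).  With the identity this reduces to
    # Euclidean distance; ML-UTSC would learn M by metric learning.
    d = [a - b for a, b in zip(u, v)]
    return math.sqrt(sum(d[i] * m[i][j] * d[j]
                         for i in range(3) for j in range(3)))

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
feats = segment_features([0, 1, 2, 3, 4, 5], seg_len=3)  # two segments
```

A DTW pass would then align the two feature sequences using `mahalanobis` as the per-step cost, and a nearest-neighbor rule over the resulting distances would assign the class.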


2016 ◽  
Vol 26 (4) ◽  
pp. 043102 ◽  
Author(s):  
E. Bianco-Martinez ◽  
N. Rubido ◽  
Ch. G. Antonopoulos ◽  
M. S. Baptista

Author(s):  
Angeliki Papana

In this chapter, tools for univariate time series analysis and forecasting are presented and applied. Time series components, such as trend and seasonality, are introduced and discussed, and the forecasting methods are organized according to which components a series contains. In the literature, linear methods are the most commonly used; however, real time series often include nonlinear components, so linear forecasting may not be the optimal choice, and a basic nonlinear forecasting method is therefore also presented. The value of these methods to logistics service providers and 3PL companies is demonstrated through case studies showing how operational and management costs can be reduced while a given service level is maintained. Short-term forecasts are useful in all activity areas of 3PL companies, i.e., supply, production, distribution, storage, transportation, and customer service.
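The role of seasonality in such forecasts can be illustrated with a seasonal-naive sketch. This is a generic textbook baseline (repeat the last observed seasonal cycle), not the chapter's specific method, and the demand figures below are invented:

```python
def seasonal_naive_forecast(series, season, horizon):
    # Baseline forecast: repeat the last observed seasonal cycle.
    # A standard benchmark for demand series with strong seasonality,
    # e.g. weekly patterns in 3PL shipment volumes.
    last_cycle = series[-season:]
    return [last_cycle[h % season] for h in range(horizon)]

# Two invented weekly cycles of daily demand, season length 4.
demand = [10, 12, 15, 11, 10, 13, 16, 12]
forecast = seasonal_naive_forecast(demand, season=4, horizon=4)
```

Any linear or nonlinear method discussed in the chapter should beat this baseline to justify its extra complexity, which is why such naive forecasts are useful yardsticks in practice.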


2020 ◽  
Author(s):  
İsmail Sezen ◽  
Alper Unal ◽  
Ali Deniz

Atmospheric pollution is one of the primary environmental problems, and high concentration levels are critical for human health and the environment. This calls for studying the causes of unusually high concentration levels that do not conform to the pollutant's expected behavior; however, it is not always easy to decide which levels are unusual, especially when the data are large and have a complex structure. Visual inspection is subjective in most cases, so a proper anomaly detection method should be used. Anomaly detection has been widely applied in diverse research areas, but most methods have been developed for specific application domains. Because of the spatio-temporal complexity of a pollutant, it is also not always advisable to identify anomalies using data from nearby measurement sites. A method is therefore needed that estimates anomalies from univariate time series data.

This work suggests a framework based on STL decomposition and the extended isolation forest (EIF), a machine learning algorithm, to identify anomalies in univariate time series with trend, multi-seasonality, and seasonal variation. The main advantage of the EIF method is that it assigns each observation an anomaly score.

In this study, a multi-seasonal STL decomposition is applied to a univariate PM10 time series to remove the trend and seasonal parts, but STL cannot remove the seasonal variation from the data: the remainder still shows 24-hour and yearly variation. To remove this variation, hourly and annual inter-quartile ranges (IQR) are calculated and the data are standardized by dividing each value by the corresponding IQR. This removes the seasonality in the variance, and the resulting data are processed by EIF, which decides which values are anomalous by an objective criterion.
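The hourly IQR standardization step can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name and residual values are hypothetical, only the hourly (not annual) IQR is shown, and the final EIF scoring of the standardized series is omitted:

```python
from statistics import quantiles

def hourly_iqr_standardize(values, hours):
    # Remove the diurnal variance pattern left in the STL remainder:
    # divide each residual by the inter-quartile range (IQR) of its
    # hour of day.  (The study applies the same idea annually as well.)
    by_hour = {}
    for v, h in zip(values, hours):
        by_hour.setdefault(h, []).append(v)
    iqr = {}
    for h, vs in by_hour.items():
        q1, _, q3 = quantiles(vs, n=4)   # quartiles of this hour's residuals
        iqr[h] = (q3 - q1) or 1.0        # guard against a zero IQR
    return [v / iqr[h] for v, h in zip(values, hours)]

# Invented residuals: hour 0 is ten times more spread out than hour 1.
vals = [-10.0, 0.0, 10.0, 20.0, -1.0, 0.0, 1.0, 2.0]
hrs = [0, 0, 0, 0, 1, 1, 1, 1]
z = hourly_iqr_standardize(vals, hrs)
```

After this step the two hours have identical spread, so an anomaly detector such as EIF scores values on a common scale instead of flagging every observation from the noisier hour.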

