Statistical Comparison of Feature Sets for Time Series Classification of Dynamic System Response

Author(s):  
Amit Banerjee ◽  
Juan C. Quiroz ◽  
Issam Abu-Mahfouz

The use of classification techniques for machine health monitoring and fault diagnosis has been popular in recent years. System response in the form of time series data can be used to identify the type of defect and severity of defect. However, a central issue with time series classification is that of identifying appropriate features for classification. In this paper, we explore a new feature set based on delay differential equations (DDEs). DDEs have been used recently for extracting features for classification but have never been used to classify system responses. The Duffing oscillator, Van der Pol–Duffing (VDP-D) oscillator, Lu oscillator, and Chen oscillator are used as examples for dynamic systems, and the responses are classified into self-similar groups. Responses with the same period should belong to the same group. Misclassification rate is used as an indicator of the efficacy of the feature set. The proposed feature set is compared to a statistical feature set, a power spectral coefficient feature set, and a wavelet coefficient feature set. In the work described in this paper, a density-estimation algorithm called DBSCAN is used as the classification algorithm. The proposed DDE-based feature set is found to be significantly better than the other feature sets for classifying responses generated by the Duffing, Lu, and Chen systems. The wavelet and power spectral coefficient data sets are not found to be significantly better than the statistical feature set for these systems. None of the feature sets tested is discerning enough on the VDP-D system.
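
The grouping step described above can be sketched as follows. This is a minimal pure-Python DBSCAN written for illustration, not the authors' implementation. The toy 2-D feature vectors, the `eps` and `min_pts` values, and the majority-vote definition of misclassification rate (with noise points counted as errors) are all assumptions; the abstract does not specify them.

```python
import math
from collections import Counter

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:     # j is a core point, so keep expanding
                queue.extend(nb)
    return labels

def misclassification_rate(labels, truth):
    """Assumed definition: a point is wrong if it is noise or disagrees
    with the majority true label of its cluster."""
    by_cluster = {}
    for lab, t in zip(labels, truth):
        if lab != -1:
            by_cluster.setdefault(lab, []).append(t)
    majority = {lab: Counter(ts).most_common(1)[0][0] for lab, ts in by_cluster.items()}
    wrong = sum(1 for lab, t in zip(labels, truth) if lab == -1 or majority[lab] != t)
    return wrong / len(labels)

# Hypothetical feature vectors: two tight groups (period 1, period 2) and one outlier.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
            (10.0, 0.0)]
true_periods = [1, 1, 1, 2, 2, 2, 4]
labels = dbscan(features, eps=0.5, min_pts=2)
rate = misclassification_rate(labels, true_periods)
```

In this toy run the two tight groups become clusters 0 and 1, and the isolated point is marked as noise, so it is the only point counted as misclassified.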

Author(s):  
Amit Banerjee ◽  
Juan C. Quiroz ◽  
Issam Abu-Mahfouz

The use of classification techniques for machine health monitoring and fault diagnosis has been popular in recent years. System response in the form of time series data can be used to identify the type and severity of a defect. However, a central issue with time series classification is that of identifying appropriate features for classification. In this paper, we explore a new feature set based on delay differential equations (DDEs). DDEs have been used recently for extracting features for classification but have never been used to classify system responses. The Duffing oscillator and Van der Pol–Duffing (VDP-D) oscillator are used as dynamic systems, and the responses are classified into self-similar groups. Responses with the same period should belong to the same group. Misclassification rate is used as an indicator of the efficacy of the feature set. The proposed feature set is compared to a statistical feature set, a power spectral coefficient feature set, and a wavelet coefficient feature set. In the work described in this paper, a density-estimation algorithm called DBSCAN is used as the classification algorithm. The proposed DDE-based feature set is found to be significantly better than the other feature sets for classifying responses generated by the Duffing system. The wavelet and power spectral coefficient data sets are not found to be significantly better than the statistical feature set for the Duffing system. None of the feature sets tested is discerning enough on the VDP-D system.
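
As a rough illustration of how a system response and a baseline statistical feature set might be produced, the sketch below integrates the forced Duffing equation x'' + delta*x' + alpha*x + beta*x^3 = gamma*cos(omega*t) with a fixed-step RK4 scheme and computes four moments of the response. The parameter values, the integrator, and the choice of mean, variance, skewness, and kurtosis as "statistical features" are assumptions made for the example; the abstract does not list them.

```python
import math

def duffing_response(delta=0.3, alpha=-1.0, beta=1.0, gamma=0.5, omega=1.2,
                     x0=1.0, v0=0.0, dt=0.01, steps=5000):
    """Integrate x'' + delta*x' + alpha*x + beta*x**3 = gamma*cos(omega*t) with RK4.
    All default parameter values are illustrative assumptions."""
    def accel(t, x, v):
        return gamma * math.cos(omega * t) - delta * v - alpha * x - beta * x ** 3

    t, x, v = 0.0, x0, v0
    xs = [x]
    for _ in range(steps):
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = accel(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
        xs.append(x)
    return xs

def statistical_features(x):
    """Mean, population variance, skewness, and (non-excess) kurtosis of a series."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("constant series: skewness and kurtosis are undefined")
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return {"mean": mean, "variance": var, "skewness": skew, "kurtosis": kurt}

response = duffing_response()
features = statistical_features(response)
```

Each simulated response would contribute one such feature vector to the clustering step; the DDE-based, spectral, and wavelet feature sets would replace `statistical_features` with their own extractors.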


2021 ◽  
Vol 12 (2) ◽  
pp. 294
Author(s):  
Agus Widarjono ◽  
M. B. Hendrie Anto ◽  
Faaza Fakhrunnas

This study investigates whether Islamic rural banks perform better than their competitors, conventional rural banks, in Indonesia. To measure Islamic rural banks' financial performance, we assess financial stability using the Z-score and profitability using the return on assets. We use monthly time series data from January 2009 to December 2018. The dynamic regression of the Autoregressive Distributed Lag (ARDL) model is then employed. The results show that the Z-score of Islamic rural banks is higher than that of conventional rural banks. This finding shows that Islamic rural banks are less risky than conventional rural banks. However, the financial stability of Islamic rural banks is more vulnerable to changes in equity, output, and inflation than that of conventional rural banks. Although the Islamic rural banks' profit rate is lower than that of conventional rural banks, it is considered more stable. The profit of Islamic rural banks is affected by size, equity, domestic output, and inflation.
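
The abstract does not define its Z-score. A commonly used bank stability measure is Z = (ROA + equity/assets) / sigma(ROA), and the sketch below assumes that definition. The input numbers are made-up toy values for illustration only, not data from the study.

```python
import statistics

def bank_z_score(roa, equity_to_assets):
    """Assumed definition: (mean ROA + mean equity/assets) / sample std of ROA.
    A higher value means more buffer per unit of earnings volatility."""
    sigma = statistics.stdev(roa)
    if sigma == 0:
        raise ValueError("ROA has zero volatility; Z-score is undefined")
    return (statistics.mean(roa) + statistics.mean(equity_to_assets)) / sigma

# Toy monthly ratios (hypothetical, not from the paper).
stable_roa = [0.02, 0.03, 0.01, 0.02]
volatile_roa = [0.06, -0.02, 0.05, -0.01]
equity_ratio = [0.10, 0.10, 0.10, 0.10]

z_stable = bank_z_score(stable_roa, equity_ratio)
z_volatile = bank_z_score(volatile_roa, equity_ratio)
```

With equal capital and the same average return, the bank with steadier returns gets the higher Z-score, which is the sense in which a higher Z-score is read as lower risk.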


Author(s):  
Elangovan Ramanujam ◽  
S. Padmavathi

Innovations in and the applicability of time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized as shapelet-, interval-, motif-, and whole-series-based techniques. Among these, the bag-of-words technique, an extension of the text mining approach, performs well due to its simplicity and effectiveness. To extend the efficiency of the bag-of-words technique, this paper proposes a discriminative supervised weighting scheme to identify the characteristic and representative patterns of a class for efficient classification. This paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1908
Author(s):  
Chao Ma ◽  
Xiaochuan Shi ◽  
Wei Li ◽  
Weiping Zhu

In the past decade, time series data have been generated from various fields at a rapid pace, which offers a huge opportunity for mining valuable knowledge. As a typical task of time series mining, Time Series Classification (TSC) has attracted considerable attention from both researchers and domain experts due to its broad applications ranging from human activity recognition to smart city governance. Specifically, there is an increasing requirement for performing classification tasks on diverse types of time series data in a timely manner without costly hand-crafted feature engineering. Therefore, in this paper, we propose a framework named Edge4TSC that allows time series to be processed in the edge environment, so that the classification results can be instantly returned to the end-users. Meanwhile, to avoid the costly hand-crafted feature engineering process, deep learning techniques are applied for automatic feature extraction, which shows competitive or even superior performance compared to state-of-the-art TSC solutions. However, because time series present complex patterns, even deep learning models are not capable of achieving satisfactory classification accuracy, which motivated us to explore new time series representation methods to help classifiers further improve the classification accuracy. In the proposed framework Edge4TSC, by building the binary distribution tree, a new time series representation method was designed for addressing the classification accuracy concern in TSC tasks. By conducting comprehensive experiments on six challenging time series datasets in the edge environment, the potential of the proposed framework for its generalization ability and classification accuracy improvement is firmly validated with a number of helpful insights.


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 288
Author(s):  
Kuiyong Song ◽  
Nianbin Wang ◽  
Hongbin Wang

High-dimensional time series classification is a serious problem. A similarity measure based on distance is one of the methods for time series classification. This paper proposes a metric learning-based univariate time series classification method (ML-UTSC), which uses a Mahalanobis matrix obtained by metric learning to calculate the local distance between multivariate time series and combines Dynamic Time Warping (DTW) with nearest neighbor classification to achieve the final classification. In this method, the features of the univariate time series are presented as multivariate time series data with a mean value, variance, and slope. Next, a three-dimensional Mahalanobis matrix is obtained based on metric learning in the data. The time series is divided into segments of equal intervals to enable the Mahalanobis matrix to more accurately describe the features of the time series data. Compared with the most effective measurement method, the related experimental results show that our proposed algorithm has a lower classification error rate in most of the test datasets.
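
The pipeline described above (segment features, a Mahalanobis local distance, DTW, then nearest neighbor) can be sketched as follows. This is an illustrative reading, not the ML-UTSC implementation: the function names are hypothetical, and the 3x3 matrix `M` is set to the identity as a placeholder because the metric learning step that would produce it is not reproduced here.

```python
import math

def segment_features(series, seg_len):
    """Split a univariate series into equal segments of length seg_len (>= 2)
    and describe each segment by (mean, variance, least-squares slope)."""
    feats = []
    t_mean = (seg_len - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(seg_len))
    for s in range(0, len(series) - seg_len + 1, seg_len):
        seg = series[s:s + seg_len]
        mean = sum(seg) / seg_len
        var = sum((v - mean) ** 2 for v in seg) / seg_len
        slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(seg)) / denom
        feats.append((mean, var, slope))
    return feats

def mahalanobis(a, b, M):
    """sqrt((a-b)^T M (a-b)) for a positive semidefinite matrix M."""
    d = [ai - bi for ai, bi in zip(a, b)]
    q = sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(q, 0.0))

def dtw(A, B, M):
    """Classic dynamic time warping cost with a Mahalanobis local distance."""
    n, m = len(A), len(B)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = mahalanobis(A[i - 1], B[j - 1], M)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify_1nn(query, train, M, seg_len):
    """Return the label of the training series with the smallest DTW cost."""
    q = segment_features(query, seg_len)
    return min(train, key=lambda item: dtw(q, segment_features(item[0], seg_len), M))[1]

# Placeholder metric: identity matrix (an assumption; the paper learns M from data).
M = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
train = [([0, 1, 2, 3, 4, 5, 6, 7], "ramp"), ([1, 1, 1, 1, 1, 1, 1, 1], "flat")]
label = classify_1nn([0, 1, 2, 3, 4, 5, 6, 8], train, M, seg_len=4)
```

Replacing the identity with a learned matrix changes only how the three segment features are weighted and correlated inside `mahalanobis`; the DTW recursion and the nearest neighbor rule stay the same.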


2016 ◽  
Vol 26 (09n10) ◽  
pp. 1361-1377
Author(s):  
Daoyuan Li ◽  
Tegawende F. Bissyande ◽  
Jacques Klein ◽  
Yves Le Traon

Time series mining has become essential for extracting knowledge from the abundant data that flows out from many application domains. To overcome storage and processing challenges in time series mining, compression techniques are being used. In this paper, we investigate the loss/gain of performance of time series classification approaches when fed with lossy-compressed data. This extended empirical study is essential for reassuring practitioners, but also for providing more insights on how compression techniques can even be effective in smoothing and reducing noise in time series data. From a knowledge engineering perspective, we show that time series may be compressed by 90% using discrete wavelet transforms and still achieve remarkable classification accuracy, and that residual details left by popular wavelet compression techniques can sometimes even help to achieve higher classification accuracy than the raw time series data, as they better capture essential local features.
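
To make the idea of wavelet-based lossy compression concrete, here is a small sketch that keeps only the largest 10% of Haar wavelet coefficients and reconstructs the series from them. The Haar basis, the keep-the-largest thresholding rule, and the power-of-two length restriction are assumptions chosen for brevity; the abstract only says that discrete wavelet transforms were used.

```python
import math

SQRT2 = math.sqrt(2)

def haar_forward(x):
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(x)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        det = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + det        # approximations first, then details
        n = half
    return out

def haar_inverse(coeffs):
    """Invert haar_forward level by level, coarsest first."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        rec = []
        for a, d in zip(out[:n], out[n:2 * n]):
            rec.append((a + d) / SQRT2)
            rec.append((a - d) / SQRT2)
        out[:2 * n] = rec
        n *= 2
    return out

def compress(x, keep_fraction=0.1):
    """Zero all but the largest-magnitude coefficients (lossy compression)."""
    coeffs = haar_forward(x)
    keep = max(1, int(len(coeffs) * keep_fraction))
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

# A step signal is represented by very few Haar coefficients.
signal = [0.0] * 32 + [1.0] * 32
sparse = compress(signal, keep_fraction=0.1)
reconstructed = haar_inverse(sparse)
```

A classifier would then be trained on `reconstructed` (or directly on the retained coefficients) instead of the raw series; small discarded coefficients often correspond to noise, which is one way compression can act as smoothing.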


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Jitao Zhang ◽  
Weiming Shen ◽  
Liang Gao ◽  
Xinyu Li ◽  
Long Wen

Time series classification is a basic and important approach for time series data mining. Nowadays, more researchers pay attention to shape similarity methods, including Shapelet-based algorithms, because they can extract discriminative subsequences from time series. However, most Shapelet-based algorithms discover Shapelets by searching candidate subsequences in training datasets, which brings two drawbacks: high computational burden and poor generalization ability. To overcome these drawbacks, this paper proposes a novel algorithm named Shapelet Dictionary Learning with SVM-based Ensemble Classifier (SDL-SEC). SDL-SEC modifies the Shapelet algorithm in two aspects: the Shapelet discovery method and the classifier. First, Shapelet Dictionary Learning (SDL) is proposed as a novel Shapelet discovery method that generates Shapelets instead of searching for them. In this way, SDL has the advantages of lower computational cost and higher generalization ability. Then, an SVM-based Ensemble Classifier (SEC) is developed as a novel ensemble classifier and adapted to the SDL algorithm. Unlike the classic SVM, which needs precise parameter tuning and appropriate feature selection, SEC can avoid overfitting caused by a large number of features and parameters. Compared with the baselines on 45 datasets, the proposed SDL-SEC algorithm achieves competitive classification accuracy with lower computational cost.
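
For readers unfamiliar with Shapelets, the basic feature they provide is the smallest distance between a short pattern and any equally long window of a series. The sketch below shows only that distance, under the assumption of a plain Euclidean measure without normalization; it does not implement SDL or SEC, and the function name is hypothetical.

```python
import math

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + k] - shapelet[k]) ** 2 for k in range(m)))
        best = min(best, d)
    return best

bump = [1.0, 2.0, 1.0]                        # hypothetical shapelet
with_bump = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
without_bump = [0.0] * 7
features = [shapelet_distance(s, bump) for s in (with_bump, without_bump)]
```

A series that contains the pattern yields a distance near zero, so the vector of distances to a set of Shapelets can be passed to any classifier, an SVM ensemble included.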


In this paper, we analyze, model, predict, and cluster Global Active Power, i.e., time series data obtained at one-minute intervals from the electricity sensors of a household. We analyze changes in seasonality and trends to model the data. We then compare various forecasting methods such as SARIMA and LSTM to forecast sensor data for the household and combine them to achieve a hybrid model that captures nonlinear variations better than either SARIMA or LSTM used in isolation. Finally, we cluster slices of time series data effectively using a novel clustering algorithm that is a combination of density-based and centroid-based approaches, to discover relevant subtle clusters from sensor data. Our experiments have yielded meaningful insights from the data at both a micro, day-to-day granularity and a macro, weekly-to-monthly granularity.


2019 ◽  
Author(s):  
Joseph R. Mihaljevic ◽  
Amy L. Greer ◽  
Jesse L. Brunner

Mechanistic models are critical for our understanding of both within-host dynamics (i.e., pathogen population growth and immune system processes) and among-host dynamics (i.e., transmission). Rarely, however, have within-host models been synthesized with data to infer processes, validate hypotheses, or generate new theories. In this study we use mechanistic models and empirical, time-series data of viral titer to better understand the growth of ranaviruses within their amphibian hosts and the immune dynamics that limit viral replication. Specifically, we fit a suite of potential models to our data, where each model represents a hypothesis about the interactions between viral growth and immune defense. Through formal model comparison, we find a parsimonious model that captures key features of our time-series data: the viral titer rises and falls through time, likely due to an immune system response, and the initial viral dosage affects both the peak viral titer and the timing of the peak. Importantly, our model makes several predictions, including the existence of long-term viral infections, that can be validated in future studies.


2021 ◽  
Author(s):  
David Howe

Statistical imputation is a field of study that attempts to fill missing data. It is commonly applied to population statistics whose data have no correlation with running time. For a time series, data is typically analyzed using the autocorrelation function (ACF), the Fourier transform to estimate power spectral densities (PSD), the Allan deviation (ADEV), trend extensions, and basically any analysis that depends on uniform time indexes. We explain the rationale for an imputation algorithm that fills gaps in a time series by applying a backward, inverted replica of adjacent live data. To illustrate, four intentional massive gaps that exceed 100% of the original time series are recovered. The L(f) PSD with imputation applied to the gaps is nearly indistinguishable from the original. Also, the confidence of ADEV with imputation falls within 90% of the original ADEV with mixtures of power-law noises. The algorithm in Python is included for those wishing to try it.
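
One plausible reading of "a backward, inverted replica of adjacent live data" is an odd reflection: the samples just before the gap are replayed in reverse order and flipped about the last live value, which keeps both the value and the local slope continuous at the gap edge. The sketch below implements that reading only. It is an assumption made for illustration and is not the algorithm published with the paper; the function name is hypothetical, and it requires at least as much live data before the gap as the gap is long.

```python
def fill_gap_with_inverted_replica(x, start, stop):
    """Return a copy of x with x[start:stop] filled from the live data before it.

    Assumed rule: filled[start + k] = 2 * pivot - x[start - 2 - k],
    where pivot = x[start - 1] is the last live sample before the gap.
    """
    length = stop - start
    if length <= 0:
        raise ValueError("gap must contain at least one sample")
    if start - 1 - length < 0:
        raise ValueError("not enough live data before the gap")
    filled = list(x)
    pivot = x[start - 1]
    for k in range(length):
        mirrored = x[start - 2 - k]            # walk backward from the pivot
        filled[start + k] = 2 * pivot - mirrored  # invert about the pivot value
    return filled

# A ramp with a three-sample gap marked by None.
ramp = [0.0, 1.0, 2.0, 3.0, None, None, None]
repaired = fill_gap_with_inverted_replica(ramp, start=4, stop=7)
```

On the ramp the fill simply continues the trend. On noisy data the same rule reuses the statistics of the neighboring samples, which gives some intuition for why spectral and Allan-deviation estimates could stay close to those of the original.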

