Permutation entropy: Influence of amplitude information on time series classification performance

David Cuesta-Frau

doi:10.3934/mbe.2019342

Slope Entropy: A New Time Series Complexity Estimator Based on Both Symbolic Patterns and Amplitude Information

Entropy ◽

10.3390/e21121167 ◽

2019 ◽

Vol 21 (12) ◽

pp. 1167 ◽

Cited By ~ 3

Author(s):

David Cuesta-Frau

Keyword(s):

Time Series ◽

Classification Performance ◽

Permutation Entropy ◽

Data Series ◽

Time Series Classification ◽

New Methods ◽

Discriminating Power ◽

Desirable Feature ◽

Encoding Method ◽

Robust To Noise

The development of new measures and algorithms to quantify the entropy or related concepts of a data series is a continuous effort that has brought many innovations in recent years. The ultimate goal is usually to find new methods that have a higher discriminating power, are more efficient, are more robust to noise and artifacts, are less dependent on parameters or configurations, or offer any other desirable feature. Among all these methods, Permutation Entropy (PE) is a complexity estimator for a time series that stands out due to its many strengths and very few weaknesses. One of these weaknesses is that PE disregards time series amplitude information. Some PE algorithm modifications have been proposed in order to introduce such information into the calculations. We propose in this paper a new method, Slope Entropy (SlopEn), that also addresses this flaw but in a different way, keeping the symbolic representation of subsequences using a novel encoding method based on the slope generated by two consecutive data samples. By means of a thorough and extensive set of comparative experiments with PE and Sample Entropy (SampEn), we demonstrate that SlopEn is a very promising method with clearly better time series classification performance than those earlier methods.


Embedded Dimension and Time Series Length. Practical Influence on Permutation Entropy and Its Applications

Entropy ◽

10.3390/e21040385 ◽

2019 ◽

Vol 21 (4) ◽

pp. 385 ◽

Cited By ~ 14

Author(s):

David Cuesta-Frau ◽

Juan Pablo Murillo-Escobar ◽

Diana Alexandra Orrego ◽

Edilson Delgado-Trejos

Keyword(s):

Time Series ◽

Classification Performance ◽

Permutation Entropy ◽

Series Length ◽

Stability Point ◽

Long Time ◽

Model Time Series ◽

The Stability ◽

Forbidden Patterns ◽

Short Time

Permutation Entropy (PE) is a time series complexity measure commonly used in a variety of contexts, with medicine being the prime example. In its general form, it requires three input parameters for its calculation: time series length N, embedded dimension m, and embedded delay τ. Inappropriate choices of these parameters may lead to incorrect interpretations. However, there are no specific guidelines for an optimal selection of N, m, or τ, only general recommendations such as N ≫ m!, τ = 1, or m = 3, …, 7. This paper deals specifically with the practical implications of N ≫ m!, since long time series are often not available, or are non-stationary, and other preliminary results suggest that low N values do not necessarily invalidate PE usefulness. Our study analyses the PE variation as a function of the series length N and embedded dimension m in the context of a diverse experimental set, both synthetic (random, spikes, or logistic model time series) and real-world (climatology, seismic, financial, or biomedical time series), and the classification performance achieved with varying N and m. The results seem to indicate that shorter lengths than those suggested by N ≫ m! are sufficient for a stable PE calculation, and even very short time series can be robustly classified based on PE measurements before the stability point is reached. This may be because there are forbidden patterns in chaotic time series, not all the patterns are equally informative, and differences among classes are already apparent at very short lengths.
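
As an illustration of the three parameters named in this abstract, the following is a minimal sketch of a standard PE calculation in Python. It is not taken from the paper: the function names, the default values, and the tie-breaking rule (equal values ordered by position) are assumptions made for the example.

```python
import math
from collections import Counter


def ordinal_pattern_counts(x, m=3, tau=1):
    """Count the ordinal patterns of length m (delay tau) found in sequence x."""
    n_windows = len(x) - (m - 1) * tau
    if n_windows <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    counts = Counter()
    for i in range(n_windows):
        window = [x[i + j * tau] for j in range(m)]
        # The pattern is the index order that sorts the window.
        # Assumption: ties are broken by position (sorted() is stable).
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        counts[pattern] += 1
    return counts


def permutation_entropy(x, m=3, tau=1, normalise=True):
    """Shannon entropy of the ordinal pattern histogram, optionally divided by log(m!)."""
    counts = ordinal_pattern_counts(x, m, tau)
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(m)) if normalise else h
```

With m = 3 there are only 3! = 6 possible patterns, while m = 7 already gives 5040, which is why the N ≫ m! recommendation becomes demanding for short records.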


Permutation Entropy: Enhancing Discriminating Power by Using Relative Frequencies Vector of Ordinal Patterns Instead of Their Shannon Entropy

Entropy ◽

10.3390/e21101013 ◽

2019 ◽

Vol 21 (10) ◽

pp. 1013 ◽

Cited By ~ 4

Author(s):

David Cuesta-Frau ◽

Antonio Molina-Picó ◽

Borja Vargas ◽

Paula González

Keyword(s):

Nonlinear Dynamics ◽

Time Series ◽

Classification Accuracy ◽

Markov Models ◽

Hidden Markov ◽

Permutation Entropy ◽

Time Series Classification ◽

Discriminating Power ◽

Ordinal Patterns ◽

Temperature Records

Many measures to quantify the nonlinear dynamics of a time series are based on estimating the probability of certain features from their relative frequencies. Once a normalised histogram of events is computed, a single result is usually derived. This process can be broadly viewed as a nonlinear mapping from ℝ^n into ℝ, where n is the number of bins in the histogram. However, this mapping might entail a loss of information that could be critical for time series classification purposes. In this respect, the present study assessed such impact using permutation entropy (PE) and a diverse set of time series. We first devised a method of generating synthetic sequences of ordinal patterns using hidden Markov models. This way, it was possible to control the histogram distribution and quantify its influence on classification results. Next, real body temperature records were used to illustrate the same phenomenon. The experimental results confirmed the improved classification accuracy achieved using raw histogram data instead of the final PE values. Thus, this study can provide valuable guidance for improving the discriminating capability not only of PE, but of many similar histogram-based measures.
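
The idea of keeping the whole histogram rather than a single entropy value can be sketched as follows. This is an illustrative Python example, not the authors' code: the function name, the lexicographic ordering of the bins, and the tie-breaking rule are assumptions.

```python
from collections import Counter
from itertools import permutations


def ordinal_histogram(x, m=3, tau=1):
    """Relative frequencies of all m! ordinal patterns, in a fixed bin order."""
    n_windows = len(x) - (m - 1) * tau
    if n_windows <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    counts = Counter(
        # Index order that sorts each window; ties broken by position.
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(n_windows)
    )
    # Assumption: bins follow the lexicographic order of the permutations,
    # so vectors from different series are directly comparable.
    return [counts[p] / n_windows for p in permutations(range(m))]
```

The returned list has m! entries that sum to 1 and could be passed as a feature vector to any classifier, whereas PE would reduce the same list to one number.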


WINkNN: Windowed Intervals’ Number kNN Classifier for Efficient Time-Series Applications

Mathematics ◽

10.3390/math8030413 ◽

2020 ◽

Vol 8 (3) ◽

pp. 413 ◽

Cited By ~ 2

Author(s):

Chris Lytridis ◽

Anna Lekova ◽

Christos Bazinas ◽

Michail Manios ◽

Vassilis G. Kaburlasos

Keyword(s):

Time Series ◽

Ad Hoc ◽

Nearest Neighbor ◽

Classification Performance ◽

Human Robot Interaction ◽

Time Series Classification ◽

K Nearest Neighbor ◽

Time Dimension ◽

Knn Classifier ◽

Benchmark Datasets

Our interest is in time series classification regarding cyber-physical systems (CPSs) with emphasis on human-robot interaction. We propose an extension of the k nearest neighbor (kNN) classifier to time-series classification using intervals’ numbers (INs). More specifically, we partition a time-series into windows of equal length and from each window's data we induce a distribution which is represented by an IN. This preserves the time dimension in the representation. All-order data statistics, represented by an IN, are employed implicitly as features; moreover, parametric non-linearities are introduced in order to tune the geometrical relationship (i.e., the distance) between signals and consequently tune classification performance. In conclusion, we introduce the windowed IN kNN (WINkNN) classifier, whose application is demonstrated comparatively on two benchmark datasets regarding, first, electroencephalography (EEG) signals and, second, audio signals. The WINkNN results are superior in both problems; in addition, no ad-hoc data preprocessing is required. Potential future work is discussed.


Data Augmentation with Suboptimal Warping for Time-Series Classification

Sensors ◽

10.3390/s20010098 ◽

2019 ◽

Vol 20 (1) ◽

pp. 98 ◽

Cited By ~ 3

Author(s):

Krzysztof Kamycki ◽

Tomasz Kapuscinski ◽

Mariusz Oszust

Keyword(s):

Time Series ◽

Data Augmentation ◽

Nearest Neighbor ◽

Multivariate Time Series ◽

Metric Learning ◽

Classification Performance ◽

Training Dataset ◽

Time Series Classification ◽

Extensive Evaluation ◽

The Impact

In this paper, a novel data augmentation method for time-series classification is proposed. In the introduced method, a new time-series is obtained in warped space between suboptimally aligned input examples of different lengths. Specifically, the alignment is carried out constraining the warping path and reducing its flexibility. It is shown that the resultant synthetic time-series can form new class boundaries and enrich the training dataset. In this work, the comparative evaluation of the proposed augmentation method against related techniques on representative multivariate time-series datasets is presented. The performance of methods is examined using the nearest neighbor classifier with the dynamic time warping (NN-DTW), LogDet divergence-based metric learning with triplet constraints (LDMLT), and the recently introduced time-series cluster kernel (NN-TCK). The impact of the augmentation on the classification performance is investigated, taking into account entire datasets and cases with a small number of training examples. The extensive evaluation reveals that the introduced method outperforms related augmentation algorithms in terms of the obtained classification accuracy.


Using the Information Provided by Forbidden Ordinal Patterns in Permutation Entropy to Reinforce Time Series Discrimination Capabilities

Entropy ◽

10.3390/e22050494 ◽

2020 ◽

Vol 22 (5) ◽

pp. 494

Author(s):

David Cuesta-Frau

Keyword(s):

Time Series ◽

Classification Accuracy ◽

Input Parameter ◽

Classification Performance ◽

Permutation Entropy ◽

Data Series ◽

Additional Information ◽

New Methods ◽

Discriminating Power ◽

Ordinal Patterns

Despite its widely tested and proven usefulness, there is still room for improvement in the basic permutation entropy (PE) algorithm, as several subsequent studies have demonstrated in recent years. Some of these new methods try to address the well-known PE weaknesses, such as its focus only on ordinal and not on amplitude information, and the possible detrimental impact of equal values found in subsequences. Other new methods address less specific weaknesses, such as the dependence of PE results on input parameter values, a common problem found in many entropy calculation methods. The lack of discriminating power among classes in some cases is also a generic problem when entropy measures are used for data series classification. This last problem is the one specifically addressed in the present study. Toward that purpose, the classification performance of the standard PE method was first assessed by conducting several time series classification tests over a varied and diverse set of data. Then, this performance was reassessed using a new Shannon Entropy normalisation scheme proposed in this paper: divide the relative frequencies in PE by the number of different ordinal patterns actually found in the time series, instead of by the theoretically expected number. According to the classification accuracy obtained, this last approach exhibited a higher class discriminating power. It was capable of finding significant differences in six out of seven experimental datasets, whereas the standard PE method did so in only four, and it also had better classification accuracy. It can be concluded that, by using the additional information provided by the number of forbidden/found patterns, it is possible to achieve a higher discriminating power than with the classical PE normalisation method. The resulting algorithm is also very similar to that of PE and very easy to implement.
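
The abstract does not give the exact normalisation formula, so the sketch below only shows how the quantity it relies on, the number of found versus forbidden ordinal patterns, could be counted in Python. The function name and the tie-breaking rule are assumptions, and how the count enters the entropy calculation is left to the paper.

```python
import math


def found_and_forbidden_patterns(x, m=3, tau=1):
    """Return (found, forbidden): distinct ordinal patterns present in x and absent from it."""
    n_windows = len(x) - (m - 1) * tau
    if n_windows <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    found = {
        # Index order that sorts each window; ties broken by position.
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(n_windows)
    }
    # m! is the theoretically expected number of patterns.
    return len(found), math.factorial(m) - len(found)
```

For a strictly increasing series only one pattern ever appears, so with m = 3 the call reports one found pattern and five forbidden ones.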


Feature extraction by grammatical evolution for one-class time series classification

Genetic Programming and Evolvable Machines ◽

10.1007/s10710-021-09403-x ◽

2021 ◽

Author(s):

Stefano Mauceri ◽

James Sweeney ◽

Miguel Nicolau ◽

James McDermott

Keyword(s):

Time Series ◽

Classification Problem ◽

Classification Performance ◽

Grammatical Evolution ◽

Data Driven ◽

Time Series Classification ◽

Class Time ◽

Feature Based ◽

Representation Of Time

When dealing with a new time series classification problem, modellers do not know in advance which features could enable the best classification performance. We propose an evolutionary algorithm based on grammatical evolution to attain a data-driven feature-based representation of time series with minimal human intervention. The proposed algorithm can select both the features to extract and the sub-sequences from which to extract them. These choices not only impact classification performance but also allow understanding of the problem at hand. The algorithm is tested on 30 problems, outperforming several benchmarks. Finally, in a case study related to subject authentication, we show how features learned for a given subject are able to generalise to subjects unseen during the extraction phase.


Multiple fault diagnosis for hydraulic systems using Nearest-centroid-with-DBA and Random-Forest-based-time-series-classification

2020 39th Chinese Control Conference (CCC) ◽

10.23919/ccc50068.2020.9189401 ◽

2020 ◽

Author(s):

Zhijie Peng ◽

Ke Zhang ◽

Yi Chai

Keyword(s):

Time Series ◽

Fault Diagnosis ◽

Random Forest ◽

Time Series Classification ◽

Hydraulic Systems ◽

Multiple Fault ◽

Multiple Fault Diagnosis


A Frequent Pattern Based Time Series Classification Framework

JOURNAL OF ELECTRONICS INFORMATION TECHNOLOGY ◽

10.3724/sp.j.1146.2009.00135 ◽

2010 ◽

Vol 32 (2) ◽

pp. 261-266

Author(s):

Li Wan ◽

Jian-xin Liao ◽

Xiao-min Zhu ◽

Ping Ni

Keyword(s):

Time Series ◽

Frequent Pattern ◽

Time Series Classification ◽

Classification Framework


Uncertain Time Series Classification with Shapelet Transform

2020 International Conference on Data Mining Workshops (ICDMW) ◽

10.1109/icdmw51313.2020.00044 ◽

2020 ◽

Author(s):

Michael Franklin Mbouopda ◽

Engelbert Mephu Nguifo

Keyword(s):

Time Series ◽

Time Series Classification ◽

Uncertain Time
