Using the Information Provided by Forbidden Ordinal Patterns in Permutation Entropy to Reinforce Time Series Discrimination Capabilities

Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 494
Author(s):  
David Cuesta-Frau

Despite its widely tested and proven usefulness, there is still room for improvement in the basic permutation entropy (PE) algorithm, as several subsequent studies have demonstrated in recent years. Some of these new methods try to address the well-known PE weaknesses, such as its focus only on ordinal and not on amplitude information, and the possible detrimental impact of equal values found in subsequences. Other new methods address less specific weaknesses, such as the dependence of PE results on input parameter values, a common problem found in many entropy calculation methods. The lack of discriminating power among classes in some cases is also a generic problem when entropy measures are used for data series classification. This last problem is the one specifically addressed in the present study. Toward that purpose, the classification performance of the standard PE method was first assessed by conducting several time series classification tests over a varied and diverse set of data. Then, this performance was reassessed using a new Shannon entropy normalisation scheme proposed in this paper: divide the relative frequencies in PE by the number of different ordinal patterns actually found in the time series, instead of by the theoretically expected number. According to the classification accuracy obtained, this last approach exhibited a higher class discriminating power. It was capable of finding significant differences in six out of seven experimental datasets, whereas the standard PE method did so in only four, and it also achieved better classification accuracy. It can be concluded that, by using the additional information provided by the number of forbidden/found patterns, it is possible to achieve a higher discriminating power than with the classical PE normalisation method. The resulting algorithm is also very similar to that of PE and very easy to implement.
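The normalisation change described above can be sketched in a few lines. This is a minimal illustration, assuming the proposed scheme amounts to normalising the Shannon entropy by the logarithm of the number of ordinal patterns actually observed, rather than by log(m!); the function name and defaults are ours, not the paper's.

```python
import math
from collections import Counter

def permutation_entropy(x, m=3, tau=1, normalise_by_found=True):
    """Normalised permutation entropy. With normalise_by_found=True,
    the Shannon entropy is divided by the log of the number of ordinal
    patterns actually observed; otherwise by log(m!) as in standard PE."""
    counts = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(len(x) - (m - 1) * tau)
    )
    n = sum(counts.values())
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    denom = (math.log(len(counts)) if normalise_by_found
             else math.log(math.factorial(m)))
    return h / denom if denom > 0 else 0.0
```

Since the number of patterns found never exceeds m!, the found-patterns normalisation yields values at least as large as the standard one, and the gap itself carries the forbidden-pattern information.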

Entropy ◽  
2019 ◽  
Vol 21 (12) ◽  
pp. 1167 ◽  
Author(s):  
David Cuesta-Frau

The development of new measures and algorithms to quantify the entropy or related concepts of a data series is a continuous effort that has brought many innovations in this regard in recent years. The ultimate goal is usually to find new methods that offer higher discriminating power, greater efficiency, more robustness to noise and artifacts, less dependence on parameters or configurations, or any other desirable feature. Among all these methods, Permutation Entropy (PE) is a complexity estimator for a time series that stands out due to its many strengths, with very few weaknesses. One of these weaknesses is that PE disregards time series amplitude information. Some PE algorithm modifications have been proposed in order to introduce such information into the calculations. We propose in this paper a new method, Slope Entropy (SlopEn), that also addresses this flaw, but in a different way: it keeps a symbolic representation of subsequences, using a novel encoding based on the slope generated by two consecutive data samples. By means of a thorough and extensive set of comparative experiments with PE and Sample Entropy (SampEn), we demonstrate that SlopEn is a very promising method with clearly better time series classification performance than those previous methods.
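A slope-based symbolic encoding of the kind described can be sketched as follows. This is only an illustration of the idea: each consecutive difference is mapped to one of five symbols via two thresholds, and the Shannon entropy of the resulting symbol words is computed. The threshold values and the five-symbol scheme here are assumptions for the sketch, not the paper's definitive parameterisation.

```python
import math
from collections import Counter

def slope_entropy(x, m=3, gamma=1.0, delta=1e-3):
    """Sketch of a slope-based entropy: classify each consecutive
    difference into one of five symbols using thresholds delta and
    gamma (illustrative defaults), then take the Shannon entropy of
    the (m-1)-symbol word frequencies."""
    def symbol(d):
        if d > gamma:
            return 2       # steep rise
        if d > delta:
            return 1       # mild rise
        if d >= -delta:
            return 0       # approximately flat
        if d >= -gamma:
            return -1      # mild fall
        return -2          # steep fall
    sym = [symbol(x[i + 1] - x[i]) for i in range(len(x) - 1)]
    words = Counter(tuple(sym[i:i + m - 1]) for i in range(len(sym) - m + 2))
    n = sum(words.values())
    return -sum((c / n) * math.log(c / n) for c in words.values())
```

Unlike plain PE, two subsequences with the same ordinal pattern but very different amplitudes can receive different symbols here, which is precisely the information PE discards.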


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 1013 ◽  
Author(s):  
David Cuesta-Frau ◽  
Antonio Molina-Picó ◽  
Borja Vargas ◽  
Paula González

Many measures to quantify the nonlinear dynamics of a time series are based on estimating the probability of certain features from their relative frequencies. Once a normalised histogram of events is computed, a single result is usually derived. This process can be broadly viewed as a nonlinear mapping from ℝ^n into ℝ, where n is the number of bins in the histogram. However, this mapping might entail a loss of information that could be critical for time series classification purposes. In this respect, the present study assessed such impact using permutation entropy (PE) and a diverse set of time series. We first devised a method of generating synthetic sequences of ordinal patterns using hidden Markov models. This way, it was possible to control the histogram distribution and quantify its influence on classification results. Next, real body temperature records were also used to illustrate the same phenomenon. The experimental results confirmed the improved classification accuracy achieved using raw histogram data instead of the final PE values. Thus, this study can provide valuable guidance for the improvement of the discriminating capability not only of PE, but of many similar histogram-based measures.
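The "raw histogram instead of a single entropy value" idea can be sketched directly: keep the full normalised histogram of ordinal patterns as an m!-dimensional feature vector. The helper name below is ours; any classifier could then consume these vectors.

```python
import itertools
from collections import Counter

def ordinal_histogram(x, m=3, tau=1):
    """Normalised histogram of ordinal patterns, returned as an
    m!-dimensional feature vector instead of being collapsed into a
    single entropy value (the lossy R^n -> R mapping)."""
    patterns = list(itertools.permutations(range(m)))
    counts = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(len(x) - (m - 1) * tau)
    )
    n = sum(counts.values())
    return [counts.get(p, 0) / n for p in patterns]
```

Two series with identical entropy can still differ in which bins carry the mass, and that difference survives only in the vector form.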


Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 385 ◽  
Author(s):  
David Cuesta-Frau ◽  
Juan Pablo Murillo-Escobar ◽  
Diana Alexandra Orrego ◽  
Edilson Delgado-Trejos

Permutation Entropy (PE) is a time series complexity measure commonly used in a variety of contexts, with medicine being the prime example. In its general form, it requires three input parameters for its calculation: time series length N, embedded dimension m, and embedded delay τ. Inappropriate choices of these parameters may potentially lead to incorrect interpretations. However, there are no specific guidelines for an optimal selection of N, m, or τ, only general recommendations such as N ≫ m!, τ = 1, or m = 3, …, 7. This paper deals specifically with the study of the practical implications of N ≫ m!, since long time series are often not available, or non-stationary, and other preliminary results suggest that low N values do not necessarily invalidate PE usefulness. Our study analyses the PE variation as a function of the series length N and embedded dimension m in the context of a diverse experimental set, both synthetic (random, spikes, or logistic model time series) and real-world (climatology, seismic, financial, or biomedical time series), and the classification performance achieved with varying N and m. The results seem to indicate that shorter lengths than those suggested by N ≫ m! are sufficient for a stable PE calculation, and even very short time series can be robustly classified based on PE measurements before the stability point is reached. This may be due to the fact that there are forbidden patterns in chaotic time series, not all the patterns are equally informative, and differences among classes are already apparent at very short lengths.
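The forbidden-pattern effect mentioned at the end is easy to observe numerically. The sketch below counts distinct ordinal patterns in a logistic-map series; since x → 4x(1−x) cannot produce two consecutive decreases, the strictly decreasing pattern never occurs for m = 3, so fewer than m! patterns appear however long the series is. The function and variable names are ours.

```python
import math

def observed_pattern_count(x, m=3, tau=1):
    """Number of distinct ordinal patterns actually present in x."""
    return len({
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(len(x) - (m - 1) * tau)
    })

# Logistic map at r = 4: a decrease forces the next value below 3/4,
# from where the map increases, so x_t > x_{t+1} > x_{t+2} is forbidden.
xs, v = [], 0.3
for _ in range(2000):
    v = 4 * v * (1 - v)
    xs.append(v)
```

Because the effective pattern alphabet is smaller than m!, stable estimates can plausibly be reached with fewer samples than the N ≫ m! rule of thumb suggests.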


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 1034 ◽  
Author(s):  
David Cuesta-Frau ◽  
Pradeepa H. Dakappa ◽  
Chakrapani Mahabala ◽  
Arjun R. Gupta

Fever is a readily measurable physiological response that has been used in medicine for centuries. However, the information provided has been greatly limited by a plain thresholding approach, overlooking the additional information carried by temporal variations and by temperature values below such threshold that are also representative of the subject's status. In this paper, we propose to utilise continuous body temperature time series of patients who developed a fever, in order to apply a method capable of diagnosing the specific underlying cause of the fever solely by means of a pattern relative frequency analysis. This analysis was based on a recently proposed measure, Slope Entropy, applied to a variety of records coming from dengue and malaria patients, among other fever diseases. After customising the input parameters, a classification analysis of malaria and dengue records was carried out, quantified by the Matthews Correlation Coefficient. This classification yielded a high accuracy, with more than 90% of the records correctly labelled in some cases, demonstrating the feasibility of the proposed approach. After further study, possibly in combination with other measures such as Sample Entropy, this approach is very promising as an early diagnosis tool based solely on body temperature temporal patterns, which is of great interest in the current COVID-19 pandemic scenario.
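The quality metric used above, the Matthews Correlation Coefficient, has a simple closed form over the binary confusion matrix; a minimal implementation for reference (names ours):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient of a binary confusion matrix:
    +1 is perfect agreement, 0 is chance level, -1 is total disagreement.
    Returns 0 when any marginal is empty (undefined case)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC is often preferred over plain accuracy for two-class medical problems like malaria versus dengue because it stays informative when the classes are imbalanced.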


Entropy ◽  
2019 ◽  
Vol 21 (6) ◽  
pp. 613 ◽  
Author(s):  
Christoph Bandt

The study of order patterns of three equally-spaced values x_t, x_{t+d}, x_{t+2d} in a time series is a powerful tool. The lag d is changed in a wide range so that the differences of the frequencies of order patterns become autocorrelation functions. Similar to a spectrogram in speech analysis, four ordinal autocorrelation functions are used to visualize big data series, as for instance heart and brain activity over many hours. The method applies to real data without preprocessing, and outliers and missing data do not matter. On the theoretical side, we study the properties of order correlation functions and show that the four autocorrelation functions are orthogonal in a certain sense. An analysis of variance of a modified permutation entropy can be performed with four variance components associated with the functions.
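The building block of this approach, the frequencies of the six order patterns of (x_t, x_{t+d}, x_{t+2d}) as a function of the lag d, can be estimated as sketched below. The four ordinal autocorrelation functions of the paper are specific contrasts of such frequencies; we only sketch the frequency estimation, and the function name is ours.

```python
from itertools import permutations

def pattern_frequencies(x, d):
    """Relative frequencies of the six order patterns of the triple
    (x[t], x[t+d], x[t+2d]) at lag d. Sweeping d yields lag-dependent
    curves analogous to autocorrelation functions."""
    counts = {p: 0 for p in permutations(range(3))}
    n = len(x) - 2 * d
    for t in range(n):
        triple = (x[t], x[t + d], x[t + 2 * d])
        counts[tuple(sorted(range(3), key=lambda k: triple[k]))] += 1
    return {p: c / n for p, c in counts.items()}
```

Because only the ordering of the three values matters, a single outlier perturbs at most the few triples containing it, which is why no preprocessing is needed.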


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 1023 ◽  
Author(s):  
Sebastian Berger ◽  
Andrii Kravtsiv ◽  
Gerhard Schneider ◽  
Denis Jordan

Ordinal patterns are the common basis of various techniques used in the study of dynamical systems and nonlinear time series analysis. The present article focusses on the computational problem of turning time series into sequences of ordinal patterns. In a first step, a numerical encoding scheme for ordinal patterns is proposed. Utilising the classical Lehmer code, it enumerates ordinal patterns by consecutive non-negative integers, starting from zero. This compact representation considerably simplifies working with ordinal patterns in the digital domain. Subsequently, three algorithms for the efficient extraction of ordinal patterns from time series are discussed, including previously published approaches that can be adapted to the Lehmer code. The respective strengths and weaknesses of those algorithms are discussed, and further substantiated by benchmark results. One of the algorithms stands out in terms of scalability: its run-time increases linearly with both the pattern order and the sequence length, while its memory footprint is practically negligible. These properties enable the study of high-dimensional pattern spaces at low computational cost. In summary, the tools described herein may improve the efficiency of virtually any ordinal pattern-based analysis method, among them quantitative measures like permutation entropy and symbolic transfer entropy, but also techniques like forbidden pattern identification. Moreover, the concepts presented may allow for putting ideas into practice that up to now had been hindered by computational burden. To enable smooth evaluation, a function library written in the C programming language, as well as language bindings and native implementations for various numerical computation environments are provided in the supplements.
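The core encoding idea, enumerating ordinal patterns by consecutive non-negative integers via the Lehmer code, can be sketched as follows. Whether this matches the paper's exact enumeration order is an assumption; the function name is ours.

```python
from math import factorial

def lehmer_rank(pattern):
    """Map a permutation of 0..m-1 to its Lehmer-code rank, an integer
    in [0, m!). For each position, count how many later elements are
    smaller, and weight that count by the factorial of the remaining
    positions (lexicographic rank)."""
    m = len(pattern)
    rank = 0
    for i in range(m):
        smaller = sum(1 for j in range(i + 1, m) if pattern[j] < pattern[i])
        rank += smaller * factorial(m - 1 - i)
    return rank
```

With patterns packed into 0..m!−1, a pattern histogram becomes a plain integer-indexed array, which is what makes the high-dimensional pattern spaces mentioned above cheap to handle.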


2017 ◽  
Vol 9 (1) ◽  
pp. 168781401668666 ◽  
Author(s):  
Yongjun Shen ◽  
Junfeng Wang ◽  
Shaopu Yang

As a dynamic method for detecting abrupt information, permutation entropy can effectively reflect subtle changes in time series data, and it is also simple and convenient to compute. Based on permutation entropy, some improved methods for detecting weak abrupt information hidden in time series data are presented, such as the permutation entropy spectrum, second permutation entropy, and second permutation entropy spectrum. Through some simulation examples, these new methods are compared with the existing single permutation entropy and approximate entropy, and the results show that they can detect much weaker abrupt information. In particular, the second permutation entropy spectrum remains very robust even when the periodic abrupt information is very weak.
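A time-resolved permutation entropy of this general kind can be sketched as a sliding-window computation; the paper's exact spectrum definitions may differ, so this is only an illustration of the basic construction, with window and step sizes chosen arbitrarily.

```python
import math
from collections import Counter

def window_pe(x, m=3):
    """Standard normalised permutation entropy of one window."""
    counts = Counter(tuple(sorted(range(m), key=lambda k: x[i + k]))
                     for i in range(len(x) - m + 1))
    n = sum(counts.values())
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(math.factorial(m))

def pe_spectrum(x, window=64, step=16, m=3):
    """Permutation entropy over a sliding window: a time-resolved PE
    sequence in which abrupt events show up as localised dips or rises."""
    return [window_pe(x[i:i + window], m)
            for i in range(0, len(x) - window + 1, step)]
```

A single global PE value averages events away; the windowed sequence localises them in time, which is what makes weak abrupt information detectable.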


2018 ◽  
Vol 28 (12) ◽  
pp. 123111 ◽  
Author(s):  
J. H. Martínez ◽  
J. L. Herrera-Diestra ◽  
M. Chavez

Water ◽  
2021 ◽  
Vol 13 (13) ◽  
pp. 1723
Author(s):  
Ana Gonzalez-Nicolas ◽  
Marc Schwientek ◽  
Michael Sinsbeck ◽  
Wolfgang Nowak

Currently, the export regime of a catchment is often characterized by the relationship between compound concentration and discharge in the catchment outlet or, more specifically, by the regression slope in log-concentrations versus log-discharge plots. However, the scattered points in these plots usually do not follow a plain linear regression representation because of different processes (e.g., hysteresis effects). This work proposes a simple stochastic time-series model for simulating compound concentrations in a river based on river discharge. Our model has an explicit transition parameter that can morph the model between chemostatic behavior and chemodynamic behavior. As opposed to the typically used linear regression approach, our model has an additional parameter to account for hysteresis by including correlation over time. We demonstrate the advantages of our model using a high-frequency data series of nitrate concentrations collected with in situ analyzers in a catchment in Germany. Furthermore, we identify event-based optimal scheduling rules for sampling strategies. Overall, our results show that (i) our model is much more robust for estimating the export regime than the usually used regression approach, and (ii) sampling strategies based on extreme events (including both high and low discharge rates) are key to reducing the prediction uncertainty of the catchment behavior. Thus, the results of this study can help characterize the export regime of a catchment and manage water pollution in rivers at lower monitoring costs.
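The baseline the paper argues against, the regression slope of log-concentration versus log-discharge, is straightforward to compute; a minimal ordinary-least-squares sketch (function name ours):

```python
import math

def export_regime_slope(concs, discharges):
    """OLS slope of log C versus log Q. Slopes near zero are read as
    chemostatic behaviour, larger magnitudes as chemodynamic."""
    lc = [math.log(c) for c in concs]
    lq = [math.log(q) for q in discharges]
    n = len(lc)
    mq, mc = sum(lq) / n, sum(lc) / n
    cov = sum((a - mq) * (b - mc) for a, b in zip(lq, lc))
    var = sum((a - mq) ** 2 for a in lq)
    return cov / var
```

This single-slope summary is exactly the representation that cannot express hysteresis, since it treats all (C, Q) points as exchangeable regardless of their order in time, which motivates the time-correlated stochastic model proposed above.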

