The frequency of pattern occurrence in random walks

2015 ◽  
Vol DMTCS Proceedings, 27th... (Proceedings) ◽  
Author(s):  
Sergi Elizalde ◽  
Megan Martinez

In the past decade, the use of ordinal patterns in the analysis of time series and dynamical systems has become an important tool. Ordinal patterns (also known as permutation patterns) are found in time series by taking $n$ data points at evenly spaced time intervals and mapping them to a length-$n$ permutation determined by their relative order. The frequency with which certain patterns occur is a useful statistic for such series. However, the behavior of the frequency of pattern occurrence is unstudied for most models. We look at the frequency of pattern occurrence in random walks in discrete time, and we define a natural equivalence relation on permutations under which equivalent patterns appear with equal frequency, regardless of probability distribution. We characterize these equivalence classes using combinatorial methods.
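
As a rough illustration of the statistic discussed above, the following sketch estimates the frequency of each length-3 ordinal pattern in a simulated Gaussian random walk; the step distribution, pattern length, and series length are arbitrary choices for demonstration, not taken from the paper.

```python
import numpy as np
from collections import Counter
from itertools import permutations

def ordinal_pattern(window):
    """Map a window of values to the permutation given by their relative order."""
    return tuple(np.argsort(np.argsort(window)))

def pattern_frequencies(series, n=3):
    """Estimate the frequency of each length-n ordinal pattern in a series."""
    counts = Counter(ordinal_pattern(series[i:i + n])
                     for i in range(len(series) - n + 1))
    total = sum(counts.values())
    return {p: counts.get(p, 0) / total for p in permutations(range(n))}

rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=100_000))   # discrete-time random walk
for pattern, freq in sorted(pattern_frequencies(walk).items()):
    print(pattern, round(freq, 4))
```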

2017 ◽  
Vol 14 (2) ◽  
pp. 67-80 ◽  
Author(s):  
Cun Ji ◽  
Chao Zhao ◽  
Li Pan ◽  
Shijun Liu ◽  
Chenglei Yang ◽  
...  

Time series classification (TSC) has attracted significant interest over the past decade. A shapelet is a fragment of a time series that can represent class characteristics of that series. A classifier based on shapelets is interpretable, more accurate, and faster. However, the time it takes to find shapelets is enormous. This article proposes a fast shapelet (FS) discovery algorithm based on important data points (IDPs). First, the algorithm identifies IDPs. Next, subsequences containing one or more IDPs are selected as candidate shapelets. Finally, the best shapelets are selected. Results show that the proposed algorithm reduces shapelet discovery time by approximately 14.0% while maintaining the same level of classification accuracy.
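
The article's precise definition of an important data point is not reproduced here, so the sketch below uses local extrema as stand-in IDPs and collects fixed-length subsequences covering at least one of them as candidate shapelets; the names `find_idps` and `candidate_shapelets` and the window length are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def find_idps(series):
    """Stand-in for 'important data points': indices of local maxima and minima.
    (The original algorithm's IDP definition may differ.)"""
    s = np.asarray(series, dtype=float)
    prev, curr, nxt = s[:-2], s[1:-1], s[2:]
    is_extremum = ((curr > prev) & (curr > nxt)) | ((curr < prev) & (curr < nxt))
    return np.where(is_extremum)[0] + 1

def candidate_shapelets(series, length=20):
    """Collect subsequences containing at least one IDP as shapelet candidates."""
    idps = find_idps(series)
    candidates = []
    for start in range(len(series) - length + 1):
        window_idps = idps[(idps >= start) & (idps < start + length)]
        if window_idps.size:                      # keep only windows covering an IDP
            candidates.append(np.asarray(series[start:start + length]))
    return candidates
```

Restricting the candidate pool this way is what reduces discovery time; the best shapelets would then be chosen from these candidates by the usual information-gain criterion.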


2018 ◽  
Vol 2 (3) ◽  
pp. 224-228
Author(s):  
Batol Shiwa Hashimi ◽  
Aissa Boudjella ◽  
Wagma Saboor

The purpose of this investigation is to examine the variation of temperature in Japan over the past 114 years. The historical dataset of monthly average temperatures from 1901 to 2015 was analyzed. The relationship between temperature and time during four intervals, i.e., (1901-1930), (1931-1960), (1961-1990), and (1991-2015), is described using a new analytical model based on the least-squares method of estimation. We fit a polynomial regression trend of degree 4 to the time series to describe the temperature variation. The results show that the average temperature difference between 2015 and 1901 increased by about 0.97 °C. The average monthly difference between the maximum and minimum temperature was approximately 2.11 °C. This approach of modeling temperature using regression significantly simplifies the data analysis. The information in the data, namely the variation of temperature, may be obtained from extracted parameters such as the slope, the y-intercept, and the coefficients of the polynomial function, which are functions of time. More importantly, the parameters that describe the temperature variation trends over 115 years, obtained with a high R-squared, do not vary significantly. This is in agreement with the Earth's average temperature, which has climbed by more than 1 °C.
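
A minimal sketch of the kind of degree-4 polynomial trend fit described above, using NumPy's least-squares polynomial fitting on a synthetic annual series; the data and variable names are placeholders, not the study's Japanese temperature dataset.

```python
import numpy as np

# Placeholder mean temperatures (the study uses Japanese data, 1901-2015).
rng = np.random.default_rng(1)
years = np.arange(1901, 2016)
temp = 13.0 + 0.008 * (years - 1901) + rng.normal(0, 0.4, size=years.size)

# Fit a degree-4 polynomial trend by least squares and report R-squared.
coeffs = np.polyfit(years, temp, deg=4)
trend = np.polyval(coeffs, years)
ss_res = np.sum((temp - trend) ** 2)
ss_tot = np.sum((temp - temp.mean()) ** 2)
print("coefficients:", coeffs)
print("R-squared:", 1 - ss_res / ss_tot)
```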


2006 ◽  
Vol 291 (6) ◽  
pp. H3012-H3022 ◽  
Author(s):  
Kim Erlend Mortensen ◽  
Fred Godtliebsen ◽  
Arthur Revhaug

Statistical analysis of time series is still inadequate within circulation research. With the advent of increasing computational power and real-time recordings from hemodynamic studies, one is increasingly dealing with vast amounts of data in time series. This paper aims to illustrate how statistical analysis using the significant nonstationarities (SiNoS) method may complement traditional repeated-measures ANOVA and linear mixed models. We applied these methods to a dataset of local hepatic and systemic circulatory changes induced by aortoportal shunting and graded liver resection. We found SiNoS analysis more comprehensive than traditional statistical analysis in the following four ways: 1) the method allows better signal-to-noise detection; 2) including all data points from real-time recordings in a statistical analysis permits better detection of significant features in the data; 3) analysis at multiple scales of resolution facilitates a more differentiated observation of the material; and 4) the method affords excellent visual presentation by combining group differences, time trends, and multiscale statistical analysis, allowing the observer to quickly view and evaluate the material. It is our opinion that SiNoS analysis of time series is a very powerful statistical tool that may be used to complement conventional statistical methods.


Data ◽  
2021 ◽  
Vol 6 (6) ◽  
pp. 55
Author(s):  
Giuseppe Ciaburro ◽  
Gino Iannace

To predict the future behavior of a system, we can exploit the information collected in the past, trying to identify recurring structures in what has happened in order to predict what could happen, if the same structures repeat themselves in the future as well. A time series represents a time-ordered sequence of numerical values of a measurable variable observed in the past. The values are sampled at equidistant time intervals, according to an appropriate granular frequency, such as the day, week, or month, and measured in physical units. In machine learning-based algorithms, the information underlying the knowledge is extracted from the data themselves, which are explored and analyzed in search of recurring patterns or to discover hidden causal associations or relationships. The prediction model extracts knowledge through an inductive process: the input is the data and, possibly, a first example of the expected output; the machine then learns the procedure to follow to obtain the same result. This paper reviews the most recent work that has used machine learning-based techniques to extract knowledge from time series data.
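
To make the inductive setup concrete, here is a hedged sketch of a common machine-learning formulation for time series: lagged values become features and a regressor learns to predict the next value. The linear model, window length, and toy signal are illustrative choices, not a method endorsed by the review.

```python
import numpy as np
from sklearn.linear_model import Ridge

def make_supervised(series, window=12):
    """Turn a univariate series into (lagged-window features, next-value targets)."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

rng = np.random.default_rng(2)
t = np.arange(300)
series = np.sin(2 * np.pi * t / 12) + 0.1 * rng.normal(size=t.size)  # toy monthly signal

X, y = make_supervised(series, window=12)
model = Ridge().fit(X[:-24], y[:-24])          # train on all but the last 24 points
preds = model.predict(X[-24:])                 # one-step-ahead forecasts on the hold-out
print("MAE on hold-out:", np.mean(np.abs(preds - y[-24:])))
```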


Extremes ◽  
2020 ◽  
Vol 23 (4) ◽  
pp. 521-545
Author(s):  
Marco Oesting ◽  
Alexander Schnurr

In this paper, we investigate temporal clusters of extremes defined as subsequent exceedances of high thresholds in a stationary time series. Two meaningful features of these clusters are the probability distribution of the cluster size and the ordinal patterns giving the relative positions of the data points within a cluster. Since these patterns take only the ordinal structure of consecutive data points into account, the method is robust under monotone transformations and measurement errors. We verify the existence of the corresponding limit distributions in the framework of regularly varying time series, develop non-parametric estimators and show their asymptotic normality under appropriate mixing conditions. The performance of the estimators is demonstrated in a simulated example and a real data application to discharge data of the river Rhine.
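
A small sketch, under simplifying assumptions, of the two cluster features mentioned above: consecutive exceedances of a high threshold are grouped into clusters, and each cluster's size and ordinal pattern (the ranks of its values) are recorded. The threshold level and the simulated heavy-tailed data are illustrative only.

```python
import numpy as np
from collections import Counter

def exceedance_clusters(series, threshold):
    """Group runs of consecutive values above the threshold into clusters."""
    clusters, current = [], []
    for x in series:
        if x > threshold:
            current.append(x)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters

rng = np.random.default_rng(3)
series = rng.standard_t(df=3, size=50_000)        # heavy-tailed toy series
threshold = np.quantile(series, 0.99)

clusters = exceedance_clusters(series, threshold)
sizes = Counter(len(c) for c in clusters)                                   # cluster-size distribution
patterns = Counter(tuple(np.argsort(np.argsort(c))) for c in clusters if len(c) > 1)
print("cluster sizes:", dict(sizes))
print("most common ordinal patterns:", patterns.most_common(3))
```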


Fractals ◽  
2011 ◽  
Vol 19 (01) ◽  
pp. 29-49 ◽  
Author(s):  
M. H. FATTAHI ◽  
N. TALEBBEYDOKHTI ◽  
G. R. RAKHSHANDEHROO ◽  
A. SHAMSAI ◽  
E. NIKOOEE

In the present paper, the influence of the signal class (fBm/fGn) and the data length of a time series on choosing a robust fractal analysis method has been studied. More than 1000 generated fBm/fGn time series of short, intermediate, and long lengths have been analyzed using common fractal analysis methods. The chosen techniques were power spectral density, detrended fluctuation analysis, rescaled range analysis, box counting, average wavelet coefficients, and the variation method. Numerous graphs indicating the suitability of each method, in terms of bias in calculating the fundamental fractal feature of a time series, the Hurst coefficient, were employed. The results strongly emphasize the crucial influence of the signal class as well as the data length when choosing the appropriate fractal analysis method. Furthermore, as a step forward, a study on the number of data points present in a classified class/length was performed. The effect of the number of data points could not be neglected either. Based on the results, a strategy flowchart for fractal analysis of time series has been proposed. Finally, as an empirical example, the monthly, weekly, and daily scaled flow time series of the Ghar-e-Aghaj River have been analyzed within the framework of the strategy flowchart.
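
For concreteness, here is a compact sketch of one of the listed techniques, detrended fluctuation analysis (DFA), used to estimate the Hurst exponent of a series; the window sizes and order-1 detrending are common but arbitrary choices, not the paper's exact configuration.

```python
import numpy as np

def dfa_hurst(series, scales=(8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent via detrended fluctuation analysis (order-1 detrending)."""
    profile = np.cumsum(series - np.mean(series))   # integrated (profile) series
    fluctuations = []
    for s in scales:
        n_windows = len(profile) // s
        rms = []
        for w in range(n_windows):
            segment = profile[w * s:(w + 1) * s]
            t = np.arange(s)
            fit = np.polyval(np.polyfit(t, segment, 1), t)   # local linear trend
            rms.append(np.sqrt(np.mean((segment - fit) ** 2)))
        fluctuations.append(np.mean(rms))
    # Slope of log F(s) vs. log s gives the scaling (Hurst) exponent.
    return np.polyfit(np.log(scales), np.log(fluctuations), 1)[0]

rng = np.random.default_rng(4)
print(dfa_hurst(rng.normal(size=10_000)))   # white noise (fGn with H close to 0.5)
```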


Author(s):  
Martin Žáček

The goal of this chapter is a description of the time series. This chapter will review techniques that are useful for analyzing time series data, that is, sequences of measurements that follow non-random orders. Unlike the analyses of random samples of observations that are discussed in the context of most other statistics, the analysis of time series is based on the assumption that successive values in the data file represent consecutive measurements taken at equally spaced time intervals. There are two main goals of time series analysis: (a) identifying the nature of the phenomenon represented by the sequence of observations, and (b) forecasting (predicting future values of the time series variable). Both of these goals require that the pattern of observed time series data is identified and more or less formally described. Once the pattern is established, we can interpret and integrate it with other data.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Massimiliano Zanin ◽  
Felipe Olivares

One of the most important aspects of time series is their degree of stochasticity vs. chaoticity. Since the discovery of chaotic maps, many algorithms have been proposed to discriminate between these two alternatives and assess their prevalence in real-world time series. Approaches based on the combination of "permutation patterns" with different metrics provide a more complete picture of a time series' nature, and are especially useful to tackle pathological chaotic maps. Here, we provide a review of such approaches, their theoretical foundations, and their application to discrete time series and real-world problems. We compare their performance using a set of representative noisy chaotic maps, evaluate their applicability through their respective computational cost, and discuss their limitations.
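
As an example of combining permutation patterns with a metric, the sketch below computes normalized permutation entropy, one of the standard quantities used to separate noise from chaos; the embedding dimension and the logistic-map example are illustrative and not drawn from the review itself.

```python
import numpy as np
from collections import Counter
from math import factorial, log

def permutation_entropy(series, order=4):
    """Normalized permutation entropy (Bandt-Pompe patterns of length `order`)."""
    patterns = Counter(
        tuple(np.argsort(series[i:i + order]))
        for i in range(len(series) - order + 1)
    )
    total = sum(patterns.values())
    probs = np.array([c / total for c in patterns.values()])
    return -np.sum(probs * np.log(probs)) / log(factorial(order))

# Logistic map (chaotic) vs. white noise: noise gives entropy near 1, chaos noticeably lower.
x, logistic = 0.4, []
for _ in range(10_000):
    x = 4.0 * x * (1.0 - x)
    logistic.append(x)

rng = np.random.default_rng(5)
print("logistic map:", round(permutation_entropy(np.array(logistic)), 3))
print("white noise :", round(permutation_entropy(rng.normal(size=10_000)), 3))
```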


2018 ◽  
Author(s):  
Alexander M Crowell ◽  
Casey S Greene ◽  
Jennifer J. Loros ◽  
Jay C Dunlap

Motivation: Decreasing costs are making it feasible to perform time series proteomics and genomics experiments with more replicates and higher resolution than ever before. With more replicates and time points, proteome- and genome-wide patterns of expression are more readily discernible. These larger experiments require more batches, exacerbating batch effects and increasing the number of bias trends. In the case of proteomics, where methods frequently result in missing data, this increasing scale is also decreasing the number of peptides observed in all samples. The sources of batch effects and missing data are incompletely understood, necessitating novel techniques. Results: Here we show that by exploiting the structure of time series experiments, it is possible to accurately and reproducibly model and remove batch effects. We implement the Learning and Imputation for Mass-spec Bias Reduction (LIMBR) software, which builds on previous block-based models of batch effects and includes features specific to time series and circadian studies. To aid in the analysis of time series proteomics experiments, which are often plagued by missing data points, we also integrate an imputation system. By building imputation and time-series-tailored bias modeling into one straightforward software package (LIMBR), we expect that the quality and ease of large-scale proteomics and genomics time series experiments will be significantly improved. Contact: [email protected], [email protected]
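
LIMBR's own model is block-based and tailored to circadian time series; the generic sketch below only illustrates the broad idea of imputing missing peptide values and then removing a dominant bias trend via SVD, and should not be read as the LIMBR algorithm. All names and the synthetic matrix are assumptions for demonstration.

```python
import numpy as np

def impute_row_means(data):
    """Fill missing values (NaN) in a peptides x samples matrix with row means.
    (LIMBR's imputation is more sophisticated; this is a placeholder.)"""
    filled = data.copy()
    row_means = np.nanmean(filled, axis=1, keepdims=True)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = np.take(row_means, rows)
    return filled

def remove_bias_trends(data, n_trends=1):
    """Subtract the top singular-vector trend(s): a crude stand-in for block-based bias modeling."""
    centered = data - data.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    bias = (u[:, :n_trends] * s[:n_trends]) @ vt[:n_trends, :]
    return data - bias

rng = np.random.default_rng(6)
matrix = rng.normal(size=(200, 24))               # toy peptides x time-points matrix
matrix[rng.random(matrix.shape) < 0.1] = np.nan   # ~10% missing observations
cleaned = remove_bias_trends(impute_row_means(matrix))
```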


1998 ◽  
Vol 2 ◽  
pp. 141-148
Author(s):  
J. Ulbikas ◽  
A. Čenys ◽  
D. Žemaitytė ◽  
G. Varoneckas

A variety of nonlinear dynamics methods have been used to analyze time series in experimental physiology. The dynamical nature of the experimental data was checked using specific methods. Statistical properties of the heart rate have been investigated. Correlations between cardiovascular function and the statistical properties of both heart rate and stroke volume have been analyzed. The possibility of using heart rate correlation data for monitoring cardiovascular function is discussed.

