Parameter-free motif discovery for time series data

Author(s):  
Pawan Nunthanid ◽  
Vit Niennattrakul ◽  
Chotirat Ann Ratanamahatana
2021 ◽  
Vol 11 (22) ◽  
pp. 10873
Author(s):  
Silvestro R. Poccia ◽  
K. Selçuk Candan ◽  
Maria Luisa Sapino

A common challenge in multimedia data understanding is the unsupervised discovery of recurring patterns, or motifs, in time series data. The discovery of motifs in uni-variate time series is a well-studied problem and, although multi-variate motif discovery is a relatively new area of research, several proposals exist for it as well. Unfortunately, motif search among multiple variates is an expensive process, as the potential number of sub-spaces in which a pattern can occur increases exponentially with the number of variates. Consequently, many multi-variate motif search algorithms make simplifying assumptions, such as searching for motifs across all variates individually, assuming that the motifs are of the same length, or assuming that they occur on a fixed subset of variates. In this paper, we address a relatively broad form of multi-variate motif detection, which seeks frequently occurring patterns (of possibly differing lengths) in sub-spaces of a multi-variate time series. In particular, we aim to leverage contextual information to help select contextually salient patterns and identify the most frequent patterns among them. Based on these goals, we first introduce the contextually salient multi-variate motif (CS-motif) discovery problem and then propose a salient multi-variate motif (SMM) algorithm that, unlike existing methods, is able to seek a broad range of patterns in multi-variate time series.
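
To make the exponential growth mentioned in this abstract concrete, the following minimal sketch enumerates the candidate sub-spaces (non-empty subsets of variates) in which a motif could occur. The function names and the variate labels are hypothetical and chosen only for illustration; this is not the SMM algorithm itself.

```python
from itertools import combinations


def variate_subsets(variates):
    """Yield every non-empty subset of the given variates.

    Each subset is one candidate sub-space in which a motif could occur.
    """
    for size in range(1, len(variates) + 1):
        yield from combinations(variates, size)


def subspace_count(num_variates):
    """Number of non-empty subsets of num_variates variates: 2^d - 1."""
    return 2 ** num_variates - 1


if __name__ == "__main__":
    # Hypothetical variate names, used only for illustration.
    for subset in variate_subsets(["x", "y", "z"]):
        print(subset)
    for d in (3, 10, 20):
        print(d, "variates ->", subspace_count(d), "candidate sub-spaces")
```

With 20 variates an exhaustive search would already have to consider more than a million sub-spaces, which is why the simplifying assumptions listed in the abstract are so common.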


Author(s):  
Vasileios Zois ◽  
Charalampos Chelmis ◽  
Viktor K. Prasanna

Time series data emerge naturally in many fields of applied sciences and engineering, including but not limited to statistics, signal processing, mathematical finance, and weather and power consumption forecasting. Although time series data have been well studied in the past, they still present a challenge to the scientific community. Advanced operations such as classification, segmentation, prediction, anomaly detection, and motif discovery are very useful, especially for machine learning as well as other scientific fields. The advent of Big Data in almost every scientific domain motivates us to provide an in-depth study of the state-of-the-art approaches associated with techniques for efficient querying of time series. This chapter aims at providing a comprehensive review of the existing solutions related to time series representation, processing, indexing, and querying operations.


2005 ◽  
Vol 4 (2) ◽  
pp. 61-82 ◽  
Author(s):  
Jessica Lin ◽  
Eamonn Keogh ◽  
Stefano Lonardi

Data visualization techniques are very important for data analysis, since the human eye has been frequently advocated as the ultimate data-mining tool. However, there has been surprisingly little work on visualizing massive time series data sets. To this end, we developed VizTree, a time series pattern discovery and visualization system based on augmenting suffix trees. VizTree visually summarizes both the global and local structures of time series data at the same time. In addition, it provides novel interactive solutions to many pattern discovery problems, including the discovery of frequently occurring patterns (motif discovery), surprising patterns (anomaly detection), and query by content. VizTree works by transforming the time series into a symbolic representation, and encoding the data in a modified suffix tree in which the frequency and other properties of patterns are mapped onto colors and other visual properties. We demonstrate the utility of our system by comparing it with state-of-the-art batch algorithms on several real and synthetic data sets. Based on the tree structure, we further devise a coefficient that measures the dissimilarity between any two time series. This coefficient is shown to be competitive with the well-known Euclidean distance.
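
The symbolic transformation and frequency counting described in this abstract can be sketched roughly as follows. This is a minimal illustration that assumes a SAX-style discretization with a three-letter alphabet and uses a plain counter in place of the modified suffix tree; all function names, the alphabet size, and the window parameters are assumptions made for the example, not details taken from VizTree.

```python
from collections import Counter
from statistics import mean, pstdev

# Breakpoints that split a standard normal distribution into three
# (approximately) equiprobable regions, the usual SAX choice for an
# alphabet of size 3. Alphabet size is an assumption of this sketch.
BREAKPOINTS = (-0.43, 0.43)
ALPHABET = "abc"


def znorm(values):
    """Z-normalize a window; a constant window maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def paa(values, segments):
    """Piecewise aggregate approximation: the mean of each segment.

    Assumes segments <= len(values), so that no segment is empty.
    """
    n = len(values)
    return [
        mean(values[i * n // segments:(i + 1) * n // segments])
        for i in range(segments)
    ]


def symbolize(value):
    """Map one PAA value to a letter using the breakpoints."""
    for idx, breakpoint in enumerate(BREAKPOINTS):
        if value < breakpoint:
            return ALPHABET[idx]
    return ALPHABET[-1]


def sax_word(window, segments):
    """Convert a numeric window into a symbolic word."""
    return "".join(symbolize(v) for v in paa(znorm(window), segments))


def word_counts(series, window, segments):
    """Count how often each symbolic word occurs over all sliding windows.

    Frequent words are motif candidates; rare words are anomaly candidates.
    """
    counts = Counter()
    for start in range(len(series) - window + 1):
        counts[sax_word(series[start:start + window], segments)] += 1
    return counts


if __name__ == "__main__":
    series = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, 5, 0, 1, 2, 3]
    for word, count in word_counts(series, window=6, segments=3).most_common():
        print(word, count)
```

A suffix tree stores the same counts per branch, which is what allows frequency to be mapped onto visual properties such as branch thickness or color.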


2020 ◽  
Vol 24 (5) ◽  
pp. 1121-1140
Author(s):  
Heraldo Borges ◽  
Murillo Dutra ◽  
Amin Bazaz ◽  
Rafaelli Coutinho ◽  
Fábio Perosi ◽  
...  

Discovering motifs in time series data has been widely explored, and various techniques have been developed to tackle this problem. However, when it comes to spatial-time series, the literature shows a clear gap. This paper addresses that gap by presenting an approach to discover and rank motifs in spatial-time series, called the Combined Series Approach (CSA). CSA is based on partitioning the spatial-time series into blocks. Inside each block, subsequences of the spatial-time series are combined so that a hash-based motif discovery algorithm can be applied. Motifs are validated according to both temporal and spatial constraints. Motifs are then ranked according to their entropy, the number of occurrences, and the proximity of their occurrences. The approach was evaluated using both synthetic and seismic datasets. CSA outperforms traditional methods designed only for time series. CSA was also able to prioritize motifs that were meaningful both in the context of synthetic data and according to seismic specialists.


Author(s):  
Tianyu Li ◽  
Fang-Yan Dong ◽  
Kaoru Hirota

A distance measure is proposed for time series data mining based on symbolic aggregate approximation (SAX) with direction representation. It aims at increasing the tightness of the lower bound to Euclidean distance and decreasing the error rate of time series data mining tasks by adding a time series subsequence direction factor to the original SAX. Experiments on public University of California, Riverside (UCR) time series datasets, which contain time series data of diverse types, lengths, and sizes, demonstrate that the tightness of the proposed distance measure increases by 17.54% on average compared with that of the original SAX, and that classification error rates with SAX with direction representation are reduced by 16.22% compared with those obtained by the original SAX. The proposed approach lowers the classification error rate and could be applied to other time series data mining tasks, such as clustering, query by content, and motif discovery.
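
For orientation, the baseline that this abstract compares against can be sketched as below: the original SAX MINDIST, which lower-bounds the Euclidean distance between two z-normalized series from their symbolic words alone. The direction factor proposed in the paper is not specified in the abstract and is therefore not reproduced here; the alphabet size of 4 and the function names are assumptions made for this example.

```python
import math

# Breakpoints that split a standard normal distribution into four
# (approximately) equiprobable regions; alphabet size 4 is an assumption.
BREAKPOINTS = (-0.67, 0.0, 0.67)
ALPHABET = "abcd"


def symbol_dist(s, t):
    """Distance between two SAX symbols.

    Identical or adjacent symbols are at distance 0; otherwise the
    distance is the gap between the breakpoints separating them.
    """
    i, j = sorted((ALPHABET.index(s), ALPHABET.index(t)))
    if j - i <= 1:
        return 0.0
    return BREAKPOINTS[j - 1] - BREAKPOINTS[i]


def mindist(word_a, word_b, original_length):
    """Original SAX MINDIST between two words of equal length.

    original_length is the length n of the raw series each word encodes;
    the factor sqrt(n / w) rescales from w symbols back to n points.
    """
    if len(word_a) != len(word_b):
        raise ValueError("SAX words must have the same length")
    w = len(word_a)
    total = sum(symbol_dist(s, t) ** 2 for s, t in zip(word_a, word_b))
    return math.sqrt(original_length / w) * math.sqrt(total)


if __name__ == "__main__":
    print(mindist("abcd", "abcd", 16))  # identical words
    print(mindist("aabb", "ddcc", 16))  # far-apart words
```

Because adjacent symbols contribute zero, this baseline bound can be loose; tightening it is the stated goal of adding a direction factor.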


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Rizki Rahma Kusumadewi ◽  
Wahyu Widayat

The exchange rate is one tool for measuring a country's economic conditions. Stable growth in a currency's value indicates that the country has relatively good or stable economic conditions. The purpose of this study is to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The data used in this study are secondary time series data, made up of exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. The time series regression uses an ARCH-GARCH approach; the selected ARCH model indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate, and the money supply (M1), whereas imports and GDP have no significant influence.


2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai
