Parameter-free motif discovery for time series data

SMM: Leveraging Metadata for Contextually Salient Multi-Variate Motif Discovery

Applied Sciences, DOI 10.3390/app112210873, 2021, Vol 11 (22), pp. 10873

Author(s): Silvestro R. Poccia, K. Selçuk Candan, Maria Luisa Sapino

Keyword(s): Time Series, Motif Discovery, Time Series Data, Contextual Information, Multimedia Data, Series Data, Motif Search, Fixed Subset, Simplifying Assumptions, Expensive Process

A common challenge in multimedia data understanding is the unsupervised discovery of recurring patterns, or motifs, in time series data. The discovery of motifs in uni-variate time series is a well-studied problem and, although multi-variate motif discovery is a relatively new area of research, several proposals address it as well. Unfortunately, motif search among multiple variates is an expensive process, as the potential number of sub-spaces in which a pattern can occur increases exponentially with the number of variates. Consequently, many multi-variate motif search algorithms make simplifying assumptions, such as searching for motifs across all variates individually, assuming that the motifs are of the same length, or assuming that they occur on a fixed subset of variates. In this paper, we are interested in addressing a relatively broad form of multi-variate motif detection, which seeks frequently occurring patterns (of possibly differing lengths) in sub-spaces of a multi-variate time series. In particular, we aim to leverage contextual information to help select contextually salient patterns and identify the most frequent patterns among all. Based on these goals, we first introduce the contextually salient multi-variate motif (CS-motif) discovery problem and then propose a salient multi-variate motif (SMM) algorithm that, unlike existing methods, is able to seek a broad range of patterns in multi-variate time series.
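
To make the exponential growth mentioned in the abstract concrete, the following minimal sketch enumerates the candidate sub-spaces of a small multi-variate series. It is only an illustration of the counting argument, not part of the SMM algorithm; the function names and the variate labels are hypothetical.

```python
from itertools import combinations


def variate_subspaces(variates):
    """Yield every non-empty subset of the given variate names."""
    for size in range(1, len(variates) + 1):
        yield from combinations(variates, size)


def subspace_count(num_variates):
    """Number of non-empty subsets of num_variates variates: 2^d - 1."""
    return 2 ** num_variates - 1


if __name__ == "__main__":
    # Hypothetical variate labels, chosen only for the illustration.
    for subspace in variate_subspaces(["x", "y", "z"]):
        print(subspace)
    print(subspace_count(3), subspace_count(20))
```

With three variates there are 7 candidate sub-spaces; with twenty there are already 1,048,575, which is why exhaustive search over sub-spaces quickly becomes impractical.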

An efficient implementation of EMD algorithm for motif discovery in time series data

International Journal of Data Mining Modelling and Management, DOI 10.1504/ijdmmm.2016.077159, 2016, Vol 8 (2), pp. 180, Cited By ~ 1

Author(s): Duong Tuan Anh, Nguyen Van Nhat

Keyword(s): Time Series, Motif Discovery, Time Series Data, Efficient Implementation, Series Data

Querying of Time Series for Big Data Analytics

Advances in Data Mining and Database Management - Handbook of Research on Innovative Database Query Processing Techniques, DOI 10.4018/978-1-4666-8767-7.ch013, 2015, pp. 364-391

Author(s): Vasileios Zois, Charalampos Chelmis, Viktor K. Prasanna

Keyword(s): Applied Sciences, Time Series, Big Data, Motif Discovery, Time Series Data, Series Representation, Big Data Analytics, Mathematical Finance, Series Data, Depth Study

Time series data emerge naturally in many fields of applied sciences and engineering, including but not limited to statistics, signal processing, mathematical finance, and weather and power consumption forecasting. Although time series data have been well studied in the past, they still present a challenge to the scientific community. Advanced operations such as classification, segmentation, prediction, anomaly detection and motif discovery are very useful, especially for machine learning as well as other scientific fields. The advent of Big Data in almost every scientific domain motivates us to provide an in-depth study of the state-of-the-art approaches associated with techniques for efficient querying of time series. This chapter aims at providing a comprehensive review of the existing solutions related to time series representation, processing, indexing and querying operations.

Visualizing and Discovering Non-Trivial Patterns in Large Time Series Databases

Information Visualization, DOI 10.1057/palgrave.ivs.9500089, 2005, Vol 4 (2), pp. 61-82, Cited By ~ 61

Author(s): Jessica Lin, Eamonn Keogh, Stefano Lonardi

Keyword(s): Time Series, Motif Discovery, Time Series Data, Pattern Discovery, Synthetic Data, Series Data, Data Sets, Suffix Trees, Visualization System, Visualization Techniques

Data visualization techniques are very important for data analysis, since the human eye has been frequently advocated as the ultimate data-mining tool. However, there has been surprisingly little work on visualizing massive time series data sets. To this end, we developed VizTree, a time series pattern discovery and visualization system based on augmenting suffix trees. VizTree visually summarizes both the global and local structures of time series data at the same time. In addition, it provides novel interactive solutions to many pattern discovery problems, including the discovery of frequently occurring patterns (motif discovery), surprising patterns (anomaly detection), and query by content. VizTree works by transforming the time series into a symbolic representation, and encoding the data in a modified suffix tree in which the frequency and other properties of patterns are mapped onto colors and other visual properties. We demonstrate the utility of our system by comparing it with state-of-the-art batch algorithms on several real and synthetic data sets. Based on the tree structure, we further devise a coefficient which measures the dissimilarity between any two time series. This coefficient is shown to be competitive with the well-known Euclidean distance.
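
The symbolic transformation described above can be sketched roughly as follows. This is a simplified illustration, not VizTree itself: it assumes a SAX-style discretization with a fixed 4-letter alphabet, a window length divisible by the number of segments, and a plain frequency table standing in for the suffix tree. All function names are hypothetical.

```python
import math
from collections import Counter

# Breakpoints that split the standard normal distribution into
# 4 equiprobable regions (alphabet size 4 is an assumption here).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def znormalize(values):
    """Rescale values to zero mean and unit variance (flat input maps to zeros)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def paa(values, segments):
    """Piecewise aggregate approximation: the mean of each equal-sized segment."""
    size = len(values) // segments  # assumes len(values) is divisible by segments
    return [sum(values[i * size:(i + 1) * size]) / size for i in range(segments)]


def sax_word(values, segments):
    """Map a subsequence to a word over ALPHABET, one letter per segment."""
    letters = []
    for segment_mean in paa(znormalize(values), segments):
        index = sum(1 for b in BREAKPOINTS if segment_mean > b)
        letters.append(ALPHABET[index])
    return "".join(letters)


def word_counts(series, window, segments):
    """Count how often each symbolic word occurs over all sliding windows."""
    counts = Counter()
    for start in range(len(series) - window + 1):
        counts[sax_word(series[start:start + window], segments)] += 1
    return counts
```

Words with high counts are motif candidates and words with very low counts are anomaly candidates; a suffix tree stores the same frequencies while also exposing shared prefixes for visualization.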

Spatial-time motifs discovery

Intelligent Data Analysis, DOI 10.3233/ida-194759, 2020, Vol 24 (5), pp. 1121-1140

Author(s): Heraldo Borges, Murillo Dutra, Amin Bazaz, Rafaelli Coutinho, Fábio Perosi, ...

Keyword(s): Time Series, Literature Review, Motif Discovery, Time Series Data, Synthetic Data, Series Data, Spatial Constraints, Traditional Methods, Temporal And Spatial, Motif Discovery Algorithm

Discovering motifs in time series data has been widely explored, and various techniques have been developed to tackle this problem. However, the literature shows a clear gap when it comes to spatial-time series. This paper addresses that gap by presenting an approach to discover and rank motifs in spatial-time series, called the Combined Series Approach (CSA). CSA is based on partitioning the spatial-time series into blocks. Inside each block, subsequences of spatial-time series are combined so that a hash-based motif discovery algorithm can be applied. Motifs are validated according to both temporal and spatial constraints. Later, motifs are ranked according to their entropy, the number of occurrences, and the proximity of their occurrences. The approach was evaluated using both synthetic and seismic datasets. CSA outperforms traditional methods designed only for time series. CSA was also able to prioritize motifs that were meaningful both in the context of synthetic data and according to seismic specialists.
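
The ranking step can be illustrated with a small sketch. The abstract does not say how CSA combines its criteria, so the code below makes two assumptions: motifs are represented as symbolic words, and they are sorted by Shannon entropy first and by occurrence count second. The proximity criterion is left out, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(word):
    """Shannon entropy, in bits, of the symbol distribution within a word."""
    total = len(word)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(word).values()
    )


def rank_motifs(occurrences):
    """Order motif words by (entropy, occurrence count), highest first.

    occurrences maps each motif word to the number of times it was found.
    The ordering rule is an assumption made for this illustration.
    """
    return sorted(
        occurrences,
        key=lambda word: (shannon_entropy(word), occurrences[word]),
        reverse=True,
    )
```

Under this rule a flat word such as "aaaa" has zero entropy and ranks last even if it occurs often, which reflects the intuition that constant segments are rarely interesting motifs.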

Distance Measure for Symbolic Approximation Representation with Subsequence Direction for Time Series Data Mining

Journal of Advanced Computational Intelligence and Intelligent Informatics, DOI 10.20965/jaciii.2013.p0263, 2013, Vol 17 (2), pp. 263-271, Cited By ~ 2

Author(s): Tianyu Li, Fang-Yan Dong, Kaoru Hirota

Keyword(s): Data Mining, Time Series, Error Rate, Motif Discovery, Time Series Data, Distance Measure, Error Rates, Series Data, Classification Error, Time Series Data Mining

A distance measure is proposed for time series data mining based on symbolic aggregate approximation (SAX) with direction representation. It aims at increasing lower bound tightness to Euclidean distance and decreasing the error rate of time series data mining tasks by adding the time series subsequence direction factor to the original SAX. Experiments on public University of California, Riverside (UCR) time series datasets, which contain time series data of diverse type, length, and size, demonstrate that the tightness of the proposed distance measure increases by 17.54% on average compared with that of the original SAX, and that classification error rates with the direction representation are reduced by 16.22% in comparison with results obtained by the original SAX. The proposed approach lowers the classification error rate and could be applied to other time series data mining tasks, such as clustering, query by content, and motif discovery.
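
For context, the baseline that this paper improves on is the MINDIST lower bound of the original SAX. The sketch below implements that baseline only, for an assumed alphabet of size 4; it does not include the direction factor, whose exact form is not given in the abstract. Function names are hypothetical.

```python
import math

# Breakpoints for an alphabet of size 4 under the standard normal distribution.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]


def symbol_dist(r, c):
    """Distance between two symbols given as 0-based alphabet indices.

    Identical or adjacent symbols are at distance 0; otherwise the distance
    is the gap between the breakpoints that separate the two regions.
    """
    if abs(r - c) <= 1:
        return 0.0
    return BREAKPOINTS[max(r, c) - 1] - BREAKPOINTS[min(r, c)]


def mindist(word_q, word_c, n):
    """Original SAX MINDIST between two equal-length words over 'a'..'d'.

    n is the length of the raw subsequences the words were derived from.
    """
    w = len(word_q)
    total = sum(
        symbol_dist(ord(q) - ord("a"), ord(c) - ord("a")) ** 2
        for q, c in zip(word_q, word_c)
    )
    return math.sqrt(n / w) * math.sqrt(total)
```

Because adjacent symbols contribute zero, MINDIST can be a loose lower bound on the Euclidean distance, which is the gap that tightness improvements such as the one reported above aim to narrow.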

Graphical Exploratory Data Analysis for Categorical Longitudinal and Time Series Data

PsycEXTRA Dataset, DOI 10.1037/e634372013-001, 2013

Author(s): Stephen J. Tueller, Richard A. Van Dorn, Georgiy Bobashev, Barry Eggleston

Keyword(s): Time Series, Data Analysis, Exploratory Data Analysis, Time Series Data, Series Data, Exploratory Data

Faktor-Faktor Yang Mempengaruhi Nilai Tukar Dollar Amerika Serikat Terhadap Rupiah Tahun 2000–2013 [Factors Affecting the Exchange Rate of the United States Dollar Against the Rupiah, 2000–2013]

Jurnal Riset Manajemen Sekolah Tinggi Ilmu Ekonomi Widya Wiwaha Program Magister Manajemen, DOI 10.32477/jrm.v1i2.72, 2017, Vol 1 (2), pp. 177-191

Author(s): Rizki Rahma Kusumadewi, Wahyu Widayat

Keyword(s): Time Series, Exchange Rate, Money Supply, Time Series Data, The United States, Economic Conditions, Series Data, Arch Model, United States Dollar, The Exchange Rate

The exchange rate is one tool for measuring a country’s economic conditions. Stable growth in a currency’s value indicates that the country’s economic conditions are relatively good or stable. This study aims to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The data used are secondary time series data, consisting of exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. A time series regression using ARCH-GARCH, with the ARCH model selected, indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate and the money supply (M1), whereas imports and GDP had no influence.

An Anomaly Detection Method with Exemplar Subsequence for Time Series Data

IEEJ Transactions on Electronics Information and Systems, DOI 10.1541/ieejeiss.136.363, 2016, Vol 136 (3), pp. 363-372

Author(s): Takaaki Nakamura, Makoto Imamura, Masashi Tatedoko, Norio Hirai

Keyword(s): Time Series, Anomaly Detection, Time Series Data, Detection Method, Series Data

Fault Detection based on Information Extraction from Measured Time-series Data in Building Air-conditioning System

IEEJ Transactions on Electronics Information and Systems, DOI 10.1541/ieejeiss.135.651, 2015, Vol 135 (6), pp. 651-659, Cited By ~ 2

Author(s): Masaki Yumoto

Keyword(s): Time Series, Fault Detection, Information Extraction, Air Conditioning, Time Series Data, Series Data, Air Conditioning System, Measured Time
