Wave-SOM

Andrew Blanchard; Christopher Wolter; David S. McNabb; Eitan Gross

doi:10.4018/jkdb.2010040104

Wave-SOM

Computational Knowledge Discovery for Bioinformatics Research ◽

10.4018/978-1-4666-1785-8.ch007 ◽

2013 ◽

pp. 104-126

Author(s):

Andrew Blanchard ◽

Christopher Wolter ◽

David S. McNabb ◽

Eitan Gross

Keyword(s):

Oxidative Stress ◽

Time Series ◽

Ribosomal Proteins ◽

Clustering Algorithm ◽

Time Series Data ◽

Expression Patterns ◽

Data Sets ◽

Self Organizing Map ◽

Two Dimensional ◽

Genes Encoding

In this paper, the authors present a wavelet-based algorithm (Wave-SOM) to help visualize and cluster oscillatory time-series data in two-dimensional gene expression micro-arrays. Using various wavelet transformations, raw data are first de-noised by decomposing the time-series into low and high frequency wavelet coefficients. Following thresholding, the coefficients are fed as an input vector into a two-dimensional Self-Organizing-Map clustering algorithm. Transformed data are then clustered by minimizing the Euclidean (L2) distance between their corresponding fluctuation patterns. A multi-resolution analysis by Wave-SOM of expression data from the yeast Saccharomyces cerevisiae, exposed to oxidative stress and glucose-limited growth, identified 29 genes with correlated expression patterns that were mapped into 5 different nodes. The ordered clustering of yeast genes by Wave-SOM illustrates that the same set of genes (encoding ribosomal proteins) can be regulated by two different environmental stresses, oxidative stress and starvation. The algorithm provides heuristic information regarding the similarity of different genes. Using previously studied expression patterns of yeast cell-cycle and functional genes as test data sets, the authors’ algorithm outperformed five other competing programs.
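The pipeline described in this abstract (wavelet decomposition, thresholding, then SOM clustering by Euclidean distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the soft-threshold rule, the grid size, the learning schedule, and all function names are assumptions chosen for illustration, and series lengths are assumed divisible by 2^levels.

```python
import math
import random

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return [approximation, coarsest detail, ..., finest detail]."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.insert(0, d)
    return [approx] + details

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed thresholding rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def wavelet_features(signal, levels=2, t=0.1):
    """Low-frequency band kept as is; high-frequency bands thresholded."""
    bands = haar_decompose(signal, levels)
    features = list(bands[0])
    for d in bands[1:]:
        features.extend(soft_threshold(d, t))
    return features

def sq_dist(a, b):
    """Squared Euclidean (L2) distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_node(weights, v):
    """Grid coordinate of the SOM node whose weight vector is closest to v."""
    return min(weights, key=lambda node: sq_dist(weights[node], v))

def train_som(vectors, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small two-dimensional SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.01    # shrinking neighbourhood
        for v in rng.sample(vectors, len(vectors)):
            br, bc = best_node(weights, v)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights
```

In use, each expression profile would be passed through `wavelet_features`, the resulting vectors given to `train_som`, and each gene assigned to the node returned by `best_node`.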


Jonckheere–Terpstra–Kendall-based non-parametric analysis of temporal differential gene expression

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab021 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Hitoshi Iuchi ◽

Michiaki Hamada

Keyword(s):

Gene Expression ◽

Time Series ◽

Time Course ◽

Time Series Data ◽

Expression Patterns ◽

Detection Methods ◽

Series Data ◽

Expression Levels ◽

Over Time ◽

Non Parametric

Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.


Bayesian Biclustering by dynamics: A clustering algorithm for SAGD time series data

Computers & Geosciences ◽

10.1016/j.cageo.2019.07.008 ◽

2019 ◽

Vol 133 ◽

pp. 104304 ◽

Cited By ~ 1

Author(s):

Helen Pinto ◽

Ian Gates ◽

Xin Wang

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Data ◽

Series Data


A Clustering Algorithm for Time Series Data

2006 Seventh International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT'06) ◽

10.1109/pdcat.2006.1 ◽

2006 ◽

Cited By ~ 4

Author(s):

Jian Yin ◽

Duanning Zhou ◽

Qiong-qiong Xie

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Data ◽

Series Data


A MPAA-Based Iterative Clustering Algorithm Augmented by Nearest Neighbors Search for Time-Series Data Streams

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science ◽

10.1007/11430919_40 ◽

2005 ◽

pp. 333-342 ◽

Cited By ~ 9

Author(s):

Jessica Lin ◽

Michail Vlachos ◽

Eamonn Keogh ◽

Dimitrios Gunopulos ◽

Jianwei Liu ◽

...

Keyword(s):

Time Series ◽

Data Streams ◽

Clustering Algorithm ◽

Time Series Data ◽

Nearest Neighbors ◽

Series Data


Clustering Methodology for Time Series Mining

Scientific Journal of Riga Technical University Computer Sciences ◽

10.2478/v10143-010-0011-0 ◽

2009 ◽

Vol 40 (1) ◽

pp. 81-86

Author(s):

Pēteris Grabusts ◽

Arkady Borisov

Keyword(s):

Time Series ◽

Time Series Analysis ◽

Clustering Algorithm ◽

Time Series Data ◽

Similarity Measures ◽

Longest Common Subsequence ◽

Series Data ◽

Time Series Clustering ◽

Series Analysis ◽

Time Series Mining

A time series is a sequence of real-valued data representing measurements of a variable at successive time intervals. Time series analysis is a well-established task; in recent years, however, research has explored the use of clustering for time series analysis. The main motivation for representing a time series in the form of clusters is to better capture the main characteristics of the data. The central goal of this paper was to investigate clustering methodology for time series data mining, to explore time series similarity measures, and to use them in the analysis of time series clustering results. More complex similarity measures include the Longest Common Subsequence (LCSS) method. Two tasks were completed. The first was to define time series similarity measures; the LCSS method was found to give better results in detecting time series similarity than the Euclidean distance. The second was to explore the classical k-means clustering algorithm in time series clustering. The experiment showed that the results of time series clustering with the k-means algorithm correspond to the results obtained with the LCSS method, so the clustering results for the specific time series are adequate.
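To make the contrast between the two similarity measures concrete, here is a minimal sketch of an LCSS similarity next to the Euclidean distance. The abstract does not state which LCSS variant the paper uses; the matching tolerance `eps`, the normalisation by the shorter series length, and the function names are assumptions for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lcss_length(a, b, eps):
    """Longest common subsequence length; samples match if within eps."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]

def lcss_similarity(a, b, eps):
    """LCSS length normalised to [0, 1] by the shorter series length."""
    return lcss_length(a, b, eps) / min(len(a), len(b))

# A single outlier dominates the Euclidean distance but costs LCSS one match.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 30.0, 4.0, 5.0]
print(euclidean(base, with_outlier))
print(lcss_similarity(base, with_outlier, eps=0.5))
```

Because LCSS can skip unmatched samples, it also tolerates series of different lengths or with small shifts, which the pointwise Euclidean distance does not.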


Identification and visualisation of differential isoform expression in RNA-seq time series

10.1101/155135 ◽

2017 ◽

Cited By ~ 2

Author(s):

María José Nueda ◽

Jordi Martorell-Marugan ◽

Cristina Martí ◽

Sonia Tarazona ◽

Ana Conesa

Keyword(s):

Time Series ◽

Time Series Data ◽

Expression Patterns ◽

R Package ◽

Series Data ◽

Rna Seq ◽

Sequencing Technologies ◽

Isoform Expression ◽

Time Series Data Analysis ◽

Differential Isoform Expression

As sequencing technologies improve their capacity to detect distinct transcripts of the same gene and to address complex experimental designs such as longitudinal studies, there is a need to develop statistical methods for the analysis of isoform expression changes in time series data. Iso-maSigPro is a new functionality of the R package maSigPro for transcriptomics time series data analysis. Iso-maSigPro identifies genes with a differential isoform usage across time. The package also includes new clustering and visualization functions that allow grouping of genes with similar expression patterns at the isoform level, as well as those genes with a shift in major expressed isoform. The package is freely available under the LGPL license from the Bioconductor web site (http://bioconductor.org).


A new model for learning-based forecasting procedure by combining k-means clustering and time series forecasting algorithms

PeerJ Computer Science ◽

10.7717/peerj-cs.534 ◽

2021 ◽

Vol 7 ◽

pp. e534

Author(s):

Kristoko Dwi Hartomo ◽

Yessica Nataliani

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Data ◽

Mean Squared Error ◽

Time Series Forecasting ◽

Series Data ◽

Improvement Rate ◽

New Model ◽

Average Improvement ◽

Proposed Model

This paper proposes a new model for time series forecasting that combines forecasting with a clustering algorithm. It introduces a new scheme that improves the forecasting results by grouping the time series data with the k-means clustering algorithm and then using the clustering result to produce the forecast. Because forecasting results usually depend on user-defined parameters, a learning-based procedure is proposed to estimate the parameters used for forecasting; the parameter values are computed simultaneously within the algorithm. In experiments comparing it with other forecasting algorithms, the proposed model achieves the smallest mean squared error, 13,007.91, and an average improvement rate of 19.83%.


Hybrid Models for Adaptive Allocation of Electricity for Households

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1029.1292s19 ◽

2019 ◽

Vol 9 (2S) ◽

pp. 369-376

Keyword(s):

Time Series ◽

Hybrid Model ◽

Clustering Algorithm ◽

Time Series Data ◽

Active Power ◽

Sensor Data ◽

Series Data ◽

Adaptive Allocation ◽

Forecasting Methods ◽

Better Than

In this paper, we analyze, model, predict and cluster Global Active Power, i.e., time series data obtained at one-minute intervals from the electricity sensors of a household. We analyze changes in seasonality and trends to model the data. We then compare forecasting methods such as SARIMA and LSTM to forecast sensor data for the household and combine them into a hybrid model that captures nonlinear variations better than either SARIMA or LSTM used in isolation. Finally, we cluster slices of the time series data using a novel clustering algorithm that combines density-based and centroid-based approaches to discover relevant subtle clusters in the sensor data. Our experiments have yielded meaningful insights from the data at both a micro (day-to-day) granularity and a macro (weekly to monthly) granularity.


Hybrid clustering algorithm for time series data — A literature survey

2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC) ◽

10.1109/icbdaci.2017.8070861 ◽

2017 ◽

Author(s):

T. Rajesh ◽

Y. Sravani Devi ◽

K. Venugopal Rao

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Data ◽

Literature Survey ◽

Series Data ◽

Hybrid Clustering
