Temporal expression divergence of network modules

Mapping Intimacies ◽

10.1101/167734 ◽

2017 ◽

Author(s):

Yongjin Park ◽

Tae-Hyuk Kang ◽

Theodore Friedmann ◽

Joel S. Bader

Keyword(s):

Time Series Data ◽

Developmental Stages ◽

Expression Profiles ◽

Stem Cell Differentiation ◽

Series Data ◽

Network Clustering ◽

Temporal Expression ◽

Discriminative Models ◽

Network Modules ◽

Discriminative Functions

Abstract: Here we propose new module-based approaches to identify differentially regulated network sub-modules by combining temporal trajectories of expression profiles with static network skeletons. Starting from modules identified by network clustering of static networks, our analysis refines pre-defined gene sets by partitioning them into smaller homogeneous sets using nonparametric Bayesian methods. For case-control time series data in particular, we developed multi-time-point discriminative models and identified each network module as a mixture or admixture of dynamic discriminative functions. Our results show that the proposed approach outperformed existing gene set enrichment methods in simulation studies. Moreover, we applied the methods to neural stem cell differentiation data and discovered novel modules differentially perturbed in different developmental stages.


Bivalent genes that undergo transcriptional switching identify networks of key regulators of embryonic stem cell differentiation

BMC Genomics ◽

10.1186/s12864-020-07009-8 ◽

2020 ◽

Vol 21 (S10) ◽

Author(s):

Ah-Jung Jeon ◽

Greg Tucker-Kellogg

Keyword(s):

Cell Differentiation ◽

Time Series Data ◽

Es Cells ◽

Expression Patterns ◽

Embryonic Stem ◽

Stem Cell Differentiation ◽

Chromatin State ◽

Series Data ◽

Embryonic Stem Cell Differentiation ◽

Gene Promoters

Abstract Background: Bivalent promoters marked with both H3K27me3 and H3K4me3 histone modifications are characteristic of poised promoters in embryonic stem (ES) cells. The model of poised promoters postulates that bivalent chromatin in ES cells is resolved to monovalency upon differentiation. With the availability of single-cell RNA sequencing (scRNA-seq) data, subsequent switches in transcriptional state at bivalent promoters can be studied more closely. Results: We develop an approach for capturing genes undergoing transcriptional switching by detecting ‘bimodal’ gene expression patterns from scRNA-seq data. We integrate the identification of bimodal genes in ES cell differentiation with analysis of chromatin state, and identify clear cell-state dependent patterns of bimodal, bivalent genes. We show that binarization of bimodal genes can be used to identify differentially expressed genes from fractional ON/OFF proportions. In time series data from differentiating cells, we build a pseudotime approximation and use a hidden Markov model to infer gene activity switching pseudotimes, which we use to infer a regulatory network. We identify pathways of switching during differentiation, novel details of those pathways, and transcription factor coordination with downstream targets. Conclusions: Genes with expression levels too low to be informative in conventional scRNA analysis can be used to infer transcriptional switching networks that connect transcriptional activity to chromatin state. Since chromatin bivalency is a hallmark of gene promoters poised for activity, this approach provides an alternative that complements conventional scRNA-seq analysis while focusing on genes near the ON/OFF boundary of activity. This offers a novel and productive means of inferring regulatory networks from scRNA-seq data.


Real age prediction from the transcriptome with RAPToR

10.1101/2021.09.07.459270 ◽

2021 ◽

Author(s):

Romain Bulteau ◽

Mirko Francesconi

Keyword(s):

Gene Expression ◽

Large Scale ◽

Time Series Data ◽

Expression Profiles ◽

Computational Method ◽

Model Systems ◽

Developmental Expression ◽

Model Organisms ◽

Series Data ◽

Developmental Variation

Abstract: Genome-wide gene expression profiling is a powerful tool for exploratory analyses, providing a high dimensional picture of the state of a biological system. However, uncontrolled variation among samples can obscure and confound the effect of variables of interest. Uncontrolled developmental variation is often a major source of unknown expression variation in developmental systems. Existing methods to sort samples from transcriptomes require many samples to infer developmental trajectories and only provide a relative pseudo-time. Here we present RAPToR (Real Age Prediction from Transcriptome staging on Reference), a simple computational method to estimate the absolute developmental age of even a single sample from its gene expression with up to minutes precision. We achieve this by staging samples on high-resolution reference developmental expression profiles we build from existing time series data. We implemented RAPToR for the most common animal model systems: nematode, fruit fly, zebrafish, and mouse, and demonstrate application for non-model organisms. We show how developmental variation discovered by RAPToR can be exploited to increase power to detect differential expression and to untangle the signal of perturbations of interest even when it is completely confounded with development. We anticipate our RAPToR post-profiling staging strategy will be especially useful in large scale single organism profiling because it eliminates the need for synchronization or for a tedious and potentially difficult step of accurate staging before profiling.
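
The idea of staging a single sample on a reference time series can be illustrated with a small sketch. This is not the RAPToR implementation: it assumes, for illustration only, that the sample is compared with each reference time point by rank (Spearman) correlation and assigned the time of the best match. The names `ranks`, `pearson`, `stage_sample`, and the toy data are all hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 0-based ranks of the values (ties are not averaged in this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for position, index in enumerate(order):
        result[index] = float(position)
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def stage_sample(sample, reference, ref_times):
    """Assign the sample the time of the reference profile it correlates with best.

    sample:    expression values, one per gene
    reference: one expression profile (same gene order) per reference time point
    ref_times: the time attached to each reference profile
    """
    sample_ranks = ranks(sample)
    # Spearman correlation = Pearson correlation of the ranks
    cors = [pearson(sample_ranks, ranks(profile)) for profile in reference]
    best = max(range(len(cors)), key=cors.__getitem__)
    return ref_times[best], cors[best]


# Toy reference: 3 time points (hours), 4 genes each
reference = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 1, 2]]
age, cor = stage_sample([2.1, 1.2, 4.3, 3.1], reference, [0.0, 6.0, 12.0])
print(age, round(cor, 3))
```

In this sketch the resolution of the estimate is limited to the reference time points supplied; a denser reference gives a finer estimate.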


The seasonal development dynamics of the yak hair cycle transcriptome

10.21203/rs.2.10952/v1 ◽

2019 ◽

Author(s):

Pengjia Bao ◽

Jiayu Luo ◽

Yanbin Liu ◽

Min Chu ◽

Qingmiao Ren ◽

...

Keyword(s):

Hormonal Regulation ◽

Regulatory Networks ◽

Molecular Mechanisms ◽

Time Series Data ◽

Expression Profiles ◽

Expression Patterns ◽

Seasonal Development ◽

Series Data ◽

Hair Cycle ◽

Adaptation Mechanism

Abstract Background: Mammalian hair plays an important role in mammals' ability to adapt to changing climatic environments. The seasonal cycling of yak hair helps them adapt to high altitude, but the mechanisms regulating the proliferation and differentiation of hair follicle (HF) cells during development are still unknown. Here, using time series data for whole genome expression profiles and hormone contents, we systematically analyzed the mechanism regulating the periodic expression of hair development in the yak and reviewed how different combinations of genetic pathways regulate HF development and cycling. Results: This study used high-throughput RNA sequencing to provide a detailed description of global gene expression in 15 samples from five developmental time points during the yak hair cycle. A total of 11,666 genes were found to be involved in the hair cycle. According to clustering analysis and the morphological features we observed, we found that these 15 samples could be significantly grouped into three phases, which represent different developmental periods in the hair cycle. A total of 2,316 genes were identified in these three consecutive developmental periods and their expression patterns could be divided into 9 clusters; GO annotation and KEGG pathway enrichment were performed on these differentially expressed genes (DEGs), showing that the three periods have distinctive functions in the seasonal hair cycle. The regulatory network of related signaling factors highlighted the interaction and dynamic expression of key DEGs during the seasonal hair cycle. Through co-expression analysis, we revealed a number of modular hub genes highly associated with hormones that may play unique roles in hormonal regulation of events associated with the hair cycle. Conclusions: Our results revealed the molecular mechanisms and developmental regulatory networks of the seasonal hair cycle in the yak and filled a gap in the current research field. The findings will be valuable in further understanding the alpine adaptation mechanism in the yak, which is important in order to make full use of yak hair resources and promote the economic development of pastoral plateau areas. Keywords: Hair cycle, Seasonal development, Transcriptome, Yak


TimeNexus: A Novel Cytoscape App to Analyze Time-Series Data Using Temporal MultiLayer Networks (tMLNs)

10.21203/rs.3.rs-133258/v1 ◽

2020 ◽

Author(s):

Michaël Pierrelée ◽

Ana Reynders ◽

Fabrice Lopez ◽

Aziz Moqrich ◽

Laurent Tichit ◽

...

Keyword(s):

Cell Cycle ◽

Time Series ◽

Network Structure ◽

Time Series Data ◽

Series Data ◽

Omics Data ◽

Expression Data ◽

Temporal Expression ◽

Multilayer Networks ◽

Multilayer Network

Abstract: Integrating -omics data with biological networks such as protein-protein interaction networks is a popular and useful approach to interpret expression changes of genes in changing conditions, and to identify relevant cellular pathways, active subnetworks or network communities. Yet, most -omics data integration tools are restricted to static networks and therefore cannot easily be used for analyzing time-series data. Determining regulations or exploring the network structure over time requires time-dependent networks which incorporate time as one component in their structure. Here, we present a method to project time-series data on sequential layers of a multilayer network, thus creating a temporal multilayer network (tMLN). We implemented this method as a Cytoscape app we named TimeNexus. TimeNexus makes it easy to create, manage and visualize temporal multilayer networks starting from a combination of node and edge tables carrying the information on the temporal network structure. To allow further analysis of the tMLN, TimeNexus creates and passes on regular Cytoscape networks in the form of static versions of the tMLN in three different ways: i) over the entire set of layers, ii) over two consecutive layers at a time, or iii) on one single layer at a time. We combined TimeNexus with the Cytoscape apps PathLinker and AnatApp/ANAT to extract active subnetworks from tMLNs. To test the usability of our app, we applied TimeNexus together with PathLinker or ANAT on temporal expression data of the yeast cell cycle and were able to identify active subnetworks relevant for different cell cycle phases. We furthermore used TimeNexus on our own temporal expression data from a mouse pain assay inducing hindpaw inflammation and detected active subnetworks relevant for an inflammatory response to injury, including immune response, cell stress response and regulation of apoptosis. TimeNexus is freely available from the Cytoscape app store at https://apps.cytoscape.org/apps/TimeNexus.
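
The projection of time-series data onto sequential layers can be pictured with a minimal sketch. It is not TimeNexus code and does not use the Cytoscape API: it assumes, for illustration, that each layer is a copy of one static network, that each node carries its measurement at that time point as a weight, and that consecutive layers are joined by linking every node to its own copy in the next layer. The function `build_tmln` and the toy inputs are hypothetical.

```python
def build_tmln(static_edges, node_series):
    """Build a toy temporal multilayer network.

    static_edges: list of (u, v) pairs of the static network
    node_series:  dict mapping node -> list of values, one per time point
    Returns (node_weights, edges), where nodes are (name, layer) pairs.
    """
    n_layers = len(next(iter(node_series.values())))

    # One copy of every node per layer, weighted by its value at that time point
    node_weights = {
        (node, layer): values[layer]
        for node, values in node_series.items()
        for layer in range(n_layers)
    }

    edges = []
    for layer in range(n_layers):
        # Intra-layer edges: the static network repeated inside each layer
        edges.extend(((u, layer), (v, layer)) for u, v in static_edges)
    for layer in range(n_layers - 1):
        # Inter-layer edges: each node linked to itself in the next layer
        edges.extend(((node, layer), (node, layer + 1)) for node in node_series)
    return node_weights, edges


weights, edges = build_tmln(
    static_edges=[("A", "B")],
    node_series={"A": [0.1, 0.9, 0.4], "B": [0.3, 0.2, 0.8]},
)
print(len(weights), len(edges))
```

Restricting the returned edges to one layer, or to two consecutive layers, gives the kind of static views that can be handed to ordinary single-network tools.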


Time-Series Growth Prediction Model Based on U-Net and Machine Learning in Arabidopsis

Frontiers in Plant Science ◽

10.3389/fpls.2021.721512 ◽

2021 ◽

Vol 12 ◽

Author(s):

Sungyul Chang ◽

Unseok Lee ◽

Min Jeong Hong ◽

Yeong Deuk Jo ◽

Jin-Baek Kim

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Analysis ◽

Time Series Data ◽

Developmental Stages ◽

Series Data ◽

Future Application ◽

Yield Prediction ◽

Flowering Stage ◽

Series Analysis

Yield prediction for crops is essential information for food security. A high-throughput phenotyping platform (HTPP) generates the data of the complete life cycle of a plant. However, the data are rarely used for yield prediction because of the lack of quality image analysis methods, yield data associated with HTPP, and the time-series analysis method for yield prediction. To overcome these limitations, this study employed multiple deep learning (DL) networks to extract high-quality HTPP data, establish an association between HTPP data and the yield performance of crops, and select essential time intervals using machine learning (ML). The images of Arabidopsis were taken 12 times under environmentally controlled HTPP over 23 days after sowing (DAS). First, the features from images were extracted using DL network U-Net with SE-ResXt101 encoder and divided into early (15–21 DAS) and late (∼21–23 DAS) pre-flowering developmental stages using the physiological characteristics of the Arabidopsis plant. Second, the late pre-flowering stage at 23 DAS can be predicted using the ML algorithm XGBoost, based only on a portion of the early pre-flowering stage (17–21 DAS). This was confirmed using an additional biological experiment (P < 0.01). Finally, the projected area (PA) was used to estimate fresh weight (FW), and the correlation coefficient between FW and predicted FW was calculated as 0.85. This was the first study that analyzed time-series data to predict the FW of related but different developmental stages and predict the PA. The results of this study were informative and enabled the understanding of the FW of Arabidopsis or yield of leafy plants and total biomass consumed in vertical farming. Moreover, this study highlighted the reduction of time-series data for examining interesting traits and future application of time-series analysis in various HTPPs.


TWO-PASS IMPUTATION ALGORITHM FOR MISSING VALUE ESTIMATION IN GENE EXPRESSION TIME SERIES

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720007003053 ◽

2007 ◽

Vol 05 (05) ◽

pp. 1005-1022 ◽

Cited By ~ 20

Author(s):

ELENA TSIPORKOVA ◽

VESELKA BOEVA

Keyword(s):

Gene Expression ◽

Time Series ◽

Missing Values ◽

Time Series Data ◽

Expression Profiles ◽

Series Data ◽

Gene Expression Time Series ◽

Value Estimation ◽

Missing Value Estimation ◽

Expression Time

Gene expression microarray experiments frequently generate datasets with multiple values missing. However, most of the analysis, mining, and classification methods for gene expression data require a complete matrix of gene array values. Therefore, the accurate estimation of missing values in such datasets has been recognized as an important issue, and several imputation algorithms have already been proposed to the biological community. Most of these approaches, however, are not particularly suitable for time series expression profiles. In view of this, we propose a novel imputation algorithm, which is specially suited for the estimation of missing values in gene expression time series data. The algorithm utilizes Dynamic Time Warping (DTW) distance in order to measure the similarity between time expression profiles, and subsequently selects for each gene expression profile with missing values a dedicated set of candidate profiles for estimation. Three different DTW-based imputation (DTWimpute) algorithms have been considered: position-wise, neighborhood-wise, and two-pass imputation. These have initially been prototyped in Perl, and their accuracy has been evaluated on yeast expression time series data using several different parameter settings. The experiments have shown that the two-pass algorithm consistently outperforms, in particular for datasets with a higher level of missing entries, the neighborhood-wise and the position-wise algorithms. The performance of the two-pass DTWimpute algorithm has further been benchmarked against the weighted K-Nearest Neighbors algorithm, which is widely used in the biological community; the former algorithm has appeared superior to the latter one. Motivated by these findings, indicating clearly the added value of the DTW techniques for missing value estimation in time series data, we have built an optimized C++ implementation of the two-pass DTWimpute algorithm. The software also provides for a choice between three different initial rough imputation methods.
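
The DTW distance mentioned in this abstract can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration, not the authors' DTWimpute code: the absolute difference as local cost, the function names `dtw_distance` and `closest_profiles`, and the toy profiles are assumptions, and the sketch only ranks complete candidate profiles rather than performing any imputation.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def closest_profiles(target, candidates, k):
    """Names of the k candidate profiles with the smallest DTW distance to target."""
    ordered = sorted(candidates, key=lambda name: dtw_distance(target, candidates[name]))
    return ordered[:k]


profiles = {
    "gene_a": [1.0, 2.0, 2.0, 3.0],  # same shape as the target, shifted in time
    "gene_b": [3.0, 2.0, 1.0, 0.0],  # opposite trend
}
print(dtw_distance([1.0, 2.0, 3.0], profiles["gene_a"]))
print(closest_profiles([1.0, 2.0, 3.0], profiles, k=1))
```

Because the alignment may stretch or compress the time axis, two profiles with the same shape but a small time shift get a small distance, which is the property that makes DTW attractive for selecting candidate profiles in time series data.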


Introducing the novel Cytoscape app TimeNexus to analyze time-series data using temporal MultiLayer Networks (tMLNs)

Scientific Reports ◽

10.1038/s41598-021-93128-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Michaël Pierrelée ◽

Ana Reynders ◽

Fabrice Lopez ◽

Aziz Moqrich ◽

Laurent Tichit ◽

...

Keyword(s):

Cell Cycle ◽

Time Series ◽

Network Structure ◽

Time Series Data ◽

Series Data ◽

Omics Data ◽

Expression Data ◽

Temporal Expression ◽

Multilayer Networks ◽

Multilayer Network

Abstract: Integrating -omics data with biological networks such as protein–protein interaction networks is a popular and useful approach to interpret expression changes of genes in changing conditions, and to identify relevant cellular pathways, active subnetworks or network communities. Yet, most -omics data integration tools are restricted to static networks and therefore cannot easily be used for analyzing time-series data. Determining regulations or exploring the network structure over time requires time-dependent networks which incorporate time as one component in their structure. Here, we present a method to project time-series data on sequential layers of a multilayer network, thus creating a temporal multilayer network (tMLN). We implemented this method as a Cytoscape app we named TimeNexus. TimeNexus makes it easy to create, manage and visualize temporal multilayer networks starting from a combination of node and edge tables carrying the information on the temporal network structure. To allow further analysis of the tMLN, TimeNexus creates and passes on regular Cytoscape networks in the form of static versions of the tMLN in three different ways: (i) over the entire set of layers, (ii) over two consecutive layers at a time, or (iii) on one single layer at a time. We combined TimeNexus with the Cytoscape apps PathLinker and AnatApp/ANAT to extract active subnetworks from tMLNs. To test the usability of our app, we applied TimeNexus together with PathLinker or ANAT on temporal expression data of the yeast cell cycle and were able to identify active subnetworks relevant for different cell cycle phases. We furthermore used TimeNexus on our own temporal expression data from a mouse pain assay inducing hindpaw inflammation and detected active subnetworks relevant for an inflammatory response to injury, including immune response, cell stress response and regulation of apoptosis. TimeNexus is freely available from the Cytoscape app store at https://apps.cytoscape.org/apps/TimeNexus.


RVAgene: Generative modeling of gene expression time series data

10.1101/2020.11.10.375436 ◽

2020 ◽

Author(s):

Raktim Mitra ◽

Adam L. MacLean

Keyword(s):

Gene Expression ◽

Time Series ◽

Single Cell ◽

Time Series Data ◽

Kidney Injury ◽

Stem Cell Differentiation ◽

Series Data ◽

Latent Space ◽

Gene Expression Time Series ◽

Expression Time

Abstract: Methods to model dynamic changes in gene expression at a genome-wide level are not currently sufficient for large (temporally rich or single-cell) datasets. Variational autoencoders offer means to characterize large datasets and have been used effectively to characterize features of single-cell datasets. Here we extend these methods for use with gene expression time series data. We present RVAgene: a recurrent variational autoencoder to model gene expression dynamics. RVAgene learns to accurately and efficiently reconstruct temporal gene profiles. It also learns a low dimensional representation of the data via a recurrent encoder network that can be used for biological feature discovery, and can generate new gene expression data by sampling from the latent space. We test RVAgene on simulated and real biological datasets, including embryonic stem cell differentiation and kidney injury response dynamics. In all cases, RVAgene accurately reconstructed complex gene expression temporal profiles. Via cross validation, we show that a low-error latent space representation can be learnt using only a fraction of the data. Through clustering and gene ontology term enrichment analysis on the latent space, we demonstrate the potential of RVAgene for unsupervised discovery. In particular, RVAgene identifies new programs of shared gene regulation of Lox family genes in response to kidney injury.


Fast Sequential Clustering in Riemannian Manifolds for Dynamic and Time-Series-Annotated Multilayer Networks

10.36227/techrxiv.12725369.v1 ◽

2020 ◽

Author(s):

Cong Ye ◽

Konstantinos Slavakis ◽

Johan Nakuci ◽

Sarah F. Muldoon ◽

John Medaglia

Keyword(s):

Time Series ◽

Riemannian Manifolds ◽

Time Series Data ◽

Moving Average ◽

Point Clouds ◽

Brain Network ◽

Series Data ◽

Network Clustering ◽

Partial Correlations ◽

Sequential Clustering

This work exploits Riemannian manifolds to build a sequential-clustering framework able to address a wide variety of clustering tasks in dynamic multilayer (brain) networks via the information extracted from their nodal time-series. The discussion follows a bottom-up path, starting from feature extraction from time-series and reaching up to Riemannian manifolds (feature spaces) to address clustering tasks such as state clustering, community detection (a.k.a. network-topology identification), and subnetwork-sequence tracking. Kernel autoregressive-moving-average modeling and kernel (partial) correlations serve as case studies of generating features in the Riemannian manifolds of Grassmann and positive-(semi)definite matrices, respectively. Feature point-clouds form clusters which are viewed as submanifolds according to Riemannian multi-manifold modeling. A novel sequential-clustering scheme of Riemannian features is also established: feature points are first sampled in a non-random way to reveal the underlying geometric information, and then a fast sequential-clustering scheme is brought forth that takes advantage of Riemannian distances and the angular information on tangent spaces. By virtue of the landmark points and the sequential processing of the Riemannian features, the computational complexity of the framework is rendered independent of the length of the available time-series data. The effectiveness and computational efficiency of the proposed framework are validated by extensive numerical tests against several state-of-the-art manifold-learning and brain-network-clustering schemes on synthetic as well as real functional-magnetic-resonance-imaging (fMRI) and electro-encephalogram (EEG) data.


Microarray Time-Series Data Clustering via Multiple Alignment of Gene Expression Profiles

Pattern Recognition in Bioinformatics - Lecture Notes in Computer Science ◽

10.1007/978-3-642-04031-3_33 ◽

2009 ◽

pp. 377-390 ◽

Cited By ~ 3

Author(s):

Numanul Subhani ◽

Alioune Ngom ◽

Luis Rueda ◽

Conrad Burden

Keyword(s):

Gene Expression ◽

Time Series ◽

Data Clustering ◽

Time Series Data ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Multiple Alignment ◽

Series Data
