Dynamic interaction network inference from longitudinal microbiome data

2018 ◽  
Author(s):  
Jose Lugo-Martinez ◽  
Daniel Ruiz-Perez ◽  
Giri Narasimhan ◽  
Ziv Bar-Joseph

Abstract Background: Several studies have focused on the microbiota living in environmental niches including human body sites. In many of these studies researchers collect longitudinal data with the goal of understanding not just the composition of the microbiome but also the interactions between the different taxa. However, analysis of such data is challenging and very few methods have been developed to reconstruct dynamic models from time series microbiome data. Results: Here we present a computational pipeline that enables the integration of data across individuals for the reconstruction of such models. Our pipeline starts by aligning the data collected for all individuals. The aligned profiles are then used to learn a dynamic Bayesian network which represents causal relationships between taxa and clinical variables. Testing our methods on three longitudinal microbiome data sets, we show that our pipeline improves upon prior methods developed for this task. We also discuss the biological insights provided by the models, which include several known and novel interactions. Conclusions: We propose a computational pipeline for analyzing longitudinal microbiome data. Our results provide evidence that microbiome alignments coupled with dynamic Bayesian networks improve predictive performance over previous methods and enhance our ability to infer biological relationships within the microbiome and between taxa and clinical factors.

2020 ◽  
Author(s):  
Venkata Suhas Maringanti ◽  
Vanni Bucci ◽  
Georg K. Gerber

Abstract The microbiome, which is inherently dynamic, plays essential roles in human physiology and its disruption has been implicated in numerous human diseases. Linking dynamic changes in the microbiome to the status of the human host is an important problem, which is complicated by limitations and complexities of the data. Model interpretability is key in the microbiome field, as practitioners seek to derive testable biological hypotheses from data or develop diagnostic tests that can be understood by clinicians. Interpretable structure must take into account domain-specific information key to biologists and clinicians, including evolutionary relationships (phylogeny) and dynamic behavior of the microbiome. A Bayesian model was previously developed in the field, which uses Markov Chain Monte Carlo inference to learn human-interpretable rules for classifying the status of the human host based on microbiome time-series data, but that approach does not scale to the increasingly large microbiome datasets being produced. We present a new fully differentiable model that also learns human-interpretable rules for the same classification task, but in an end-to-end gradient-descent-based framework. We validate the performance of our model on human microbiome data sets and demonstrate that our approach has similar predictive performance to the fully Bayesian method, while running orders of magnitude faster and learning a larger set of rules, thus providing additional biological insight into the effects of diet and environment on the microbiome.


2019 ◽  
Author(s):  
Daniel Ruiz-Perez ◽  
Jose Lugo-Martinez ◽  
Natalia Bourguignon ◽  
Kalai Mathee ◽  
Betiana Lerner ◽  
...  

ABSTRACT A key challenge in the analysis of longitudinal microbiome data is the inference of temporal interactions between microbial taxa, their genes, the metabolites they consume and produce, and host genes. To address these challenges we developed a computational pipeline, PALM, that first aligns multi-omics data and then uses dynamic Bayesian networks (DBNs) to reconstruct a unified model. Our approach overcomes differences in sampling and progression rates, utilizes a biologically-inspired multi-omic framework, reduces the large number of entities and parameters in the DBNs, and validates the learned network. Applying PALM to data collected from inflammatory bowel disease patients, we show that it accurately identifies known and novel interactions. Targeted experimental validations further support a number of the predicted novel metabolite-taxa interactions. Source code and data will be freely available after publication under the MIT Open Source license agreement on our GitHub page. IMPORTANCE While a number of large consortia are collecting and profiling several different types of microbiome and genomic time series data, very few methods exist for joint modeling of multi-omics data sets. We developed a new computational pipeline, PALM, which uses Dynamic Bayesian Networks (DBNs) and is designed to integrate multi-omics data from longitudinal microbiome studies. When used to integrate sequence, expression, and metabolomics data from microbiome samples along with host expression data, the resulting models identify interactions between taxa, their genes and the metabolites they produce and consume, and their impact on host expression. We tested the models both by using them to predict future changes in microbiome levels, and by comparing the learned interactions to known interactions in the literature. Finally, we performed experimental validations for a few of the predicted interactions to demonstrate the ability of the method to identify novel relationships and their impact.


Microbiome ◽  
2019 ◽  
Vol 7 (1) ◽  
Author(s):  
Jose Lugo-Martinez ◽  
Daniel Ruiz-Perez ◽  
Giri Narasimhan ◽  
Ziv Bar-Joseph

mSystems ◽  
2021 ◽  
Vol 6 (2) ◽  
Author(s):  
Daniel Ruiz-Perez ◽  
Jose Lugo-Martinez ◽  
Natalia Bourguignon ◽  
Kalai Mathee ◽  
Betiana Lerner ◽  
...  

ABSTRACT A key challenge in the analysis of longitudinal microbiome data is the inference of temporal interactions between microbial taxa, their genes, the metabolites that they consume and produce, and host genes. To address these challenges, we developed a computational pipeline, a pipeline for the analysis of longitudinal multi-omics data (PALM), that first aligns multi-omics data and then uses dynamic Bayesian networks (DBNs) to reconstruct a unified model. Our approach overcomes differences in sampling and progression rates, utilizes a biologically inspired multi-omic framework, reduces the large number of entities and parameters in the DBNs, and validates the learned network. Applying PALM to data collected from inflammatory bowel disease patients, we show that it accurately identifies known and novel interactions. Targeted experimental validations further support a number of the predicted novel metabolite-taxon interactions. IMPORTANCE While a number of large consortia collect and profile several different types of microbiome and genomic time series data, very few methods exist for joint modeling of multi-omics data sets. We developed a new computational pipeline, PALM, which uses dynamic Bayesian networks (DBNs) and is designed to integrate multi-omics data from longitudinal microbiome studies. When used to integrate sequence, expression, and metabolomics data from microbiome samples along with host expression data, the resulting models identify interactions between taxa, their genes, and the metabolites that they produce and consume, as well as their impact on host expression. We tested the models both by using them to predict future changes in microbiome levels and by comparing the learned interactions to known interactions in the literature. Finally, we performed experimental validations for a few of the predicted interactions to demonstrate the ability of the method to identify novel relationships and their impact.


2020 ◽  
Vol 18 (06) ◽  
pp. 2050037
Author(s):  
Liang Chen ◽  
Shun He ◽  
Yuyao Zhai ◽  
Minghua Deng

16S rRNA gene sequencing and whole microbiome sequencing make it possible to quantitatively and reliably analyze the composition of microbial communities and the relationships among microbial communities, microbes, and hosts. One essential step in the analysis of microbiome compositional data is inferring the direct interaction network among microbial species, bringing to light the potential underlying mechanisms that regulate interactions in their communities. However, standard statistical analysis may produce spurious results due to the compositional nature of microbiome data; therefore, network recovery of microbial communities remains challenging. Here, we propose a novel loss function called codaloss for estimating the direct microbial interaction network under sparsity assumptions. We develop an alternating direction optimization algorithm to obtain a sparse solution of codaloss as the estimator. Compared to other state-of-the-art methods, our model makes fewer assumptions about the microbial networks. Results on simulated and real microbiome data show that our method outperforms other methods in network inference. An implementation of codaloss is available from https://github.com/xuebaliang/Codaloss.
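
The spurious-correlation problem that motivates this line of work can be illustrated with a small sketch. This is not codaloss; it only shows, on made-up abundances for two hypothetical taxa, how rescaling each sample to proportions (the closure operation) induces a correlation that is absent from the absolute abundances. The function names and the data are assumptions chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def closure(sample):
    """Rescale one sample's abundances so that they sum to 1."""
    total = sum(sample)
    return [v / total for v in sample]

# Hypothetical absolute abundances of two taxa across four samples,
# chosen so that the two taxa are exactly uncorrelated.
taxon_a = [10.0, 40.0, 20.0, 30.0]
taxon_b = [30.0, 30.0, 20.0, 20.0]
r_absolute = pearson(taxon_a, taxon_b)

# Relative abundances, as obtained from sequencing-style compositional data.
proportions = [closure([a, b]) for a, b in zip(taxon_a, taxon_b)]
prop_a = [p[0] for p in proportions]
prop_b = [p[1] for p in proportions]
r_relative = pearson(prop_a, prop_b)

print(f"absolute: {r_absolute:.3f}, relative: {r_relative:.3f}")
```

With only two taxa the proportions must sum to 1 in every sample, so their correlation is forced to about -1 even though the underlying abundances are uncorrelated. With more taxa the effect is weaker but still present.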


Author(s):  
Josquin Foulliaron ◽  
Laurent Bouillaut ◽  
Patrice Aknin ◽  
Anne Barros

The maintenance optimization of complex systems is a key question. One important objective is to anticipate the maintenance actions that will be required, in order to optimize logistics and future investments. That is why, over the past few years, predictive maintenance approaches have been an expanding area of research. They rely on the concept of prognosis. Many papers have shown how dynamic Bayesian networks can be relevant to represent multicomponent complex systems and carry out reliability studies. The diagnosis and maintenance group of the French Institute of Science and Technology for Transport, Development and Networks (IFSTTAR) developed a model (VirMaLab: Virtual Maintenance Laboratory) based on dynamic Bayesian networks in order to model a multicomponent system with its degradation dynamics and its diagnosis and maintenance processes. Its main purpose is to model a maintenance policy so that the maintenance parameters can be optimized using dynamic Bayesian networks. A discrete state-space system is considered, periodically observable through a diagnosis process. Such systems are common in the railway and road infrastructure fields. This article presents a prognosis algorithm whose purpose is to compute the remaining useful life of the system and update this estimate each time a new diagnosis is available. Then, a representation of this algorithm is given as a dynamic Bayesian network so that it can be integrated into the Virtual Maintenance Laboratory model to include the set of predictive maintenance policies. Inference computation questions on the considered dynamic Bayesian networks will be discussed. Finally, an application on simulated data will be presented.
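
The idea of computing a remaining useful life (RUL) on a discrete state space and revising it after each diagnosis can be sketched as follows. This is not the VirMaLab model or the article's algorithm; it is a minimal illustration that assumes a three-state Markov degradation chain with an absorbing failed state, and every number (transition matrix, diagnosis likelihoods) is hypothetical.

```python
def step(belief, transition):
    """Propagate a state distribution one time step through the Markov chain."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]

def update(belief, likelihood):
    """Bayes update of the belief given P(diagnosis result | state)."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def expected_rul(belief, transition, failed, tol=1e-12, max_steps=100_000):
    """Expected number of steps before the absorbing failed state is reached.

    Uses E[T] = sum over k >= 0 of P(T > k), accumulating the survival
    probability step by step until it becomes negligible.
    """
    rul = 0.0
    for _ in range(max_steps):
        survival = 1.0 - belief[failed]
        if survival < tol:
            break
        rul += survival
        belief = step(belief, transition)
    return rul

# Hypothetical chain: state 0 = new, 1 = degraded, 2 = failed (absorbing).
P = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 1.0]]

rul_new = expected_rul([1.0, 0.0, 0.0], P, failed=2)

# A diagnosis reporting "degraded", assumed to have likelihoods
# P(report | state) = [0.1, 0.9, 0.0], revises an uncertain belief.
belief = update([0.5, 0.5, 0.0], [0.1, 0.9, 0.0])
rul_after_diagnosis = expected_rul(belief, P, failed=2)

print(round(rul_new, 3), round(rul_after_diagnosis, 3))
```

Because the expected RUL is linear in the belief, each new diagnosis changes the estimate only through the updated state distribution, which is what makes the periodic re-estimation cheap in this simplified setting.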


Author(s):  
Lei Jiang ◽  
Yiliu Liu ◽  
Xiaomin Wang ◽  
Mary Ann Lundteigen

The reliability and availability of the onboard high-speed train control system are important to guarantee operational efficiency and railway safety. Failures occurring in the onboard system may result in serious accidents. When analyzing the effects of failure, it is important to consider the operation of the onboard system. This article presents a systemic approach to evaluating the reliability and availability of the onboard system based on dynamic Bayesian networks, taking into account dynamic failure behaviors, imperfect coverage factors, and temporal effects in the operational phase. Case studies are presented and compared for onboard systems with different redundancy strategies, that is, triple modular redundancy, hot spare double dual, and cold spare double dual. Dynamic fault trees of the three kinds of onboard system are constructed and mapped into dynamic Bayesian networks. Forward and backward inferences are conducted not only to evaluate the reliability and availability but also to identify the vulnerabilities of the onboard systems. A sensitivity analysis is carried out to evaluate the effects of failure rates subject to uncertainties. To improve the reliability and availability, more attention should be paid to the recovery mechanism. Finally, the proposed approach is validated with field data from one railway bureau in China and some industrial impacts are provided.


Author(s):  
Andrey Chukhray ◽  
Olena Havrylenko

The subject of research in the article is the process of intelligent computer training in engineering skills. The aim is to model the process of teaching engineering skills in intelligent computer training programs through dynamic Bayesian networks. Objectives: To propose an approach to modeling the process of teaching engineering skills. To assess the student's competence level by considering both algorithm development skills in engineering tasks and the ability to implement algorithms. To create a dynamic Bayesian network structure for the learning process. To select values for conditional probability tables. To solve the problems of filtering, forecasting, and retrospective analysis. To simulate the developed dynamic Bayesian network using the Genie 2.0 environment. The methods used are probability theory and inference methods in Bayesian networks. The following results are obtained: the development of a dynamic Bayesian network for the educational process based on the solution of engineering problems is presented. Mathematical calculations for probabilistic inference problems such as filtering, forecasting, and smoothing are considered. The solution of the filtering problem makes it possible to assess the current level of the student's competence after obtaining the latest probabilities for the development of the algorithm and for the numerical calculations of the task. The probability distribution of the learning process model is predicted. The number of additional iterations required to achieve the required competence level is estimated. The retrospective analysis yields a smoothed assessment of the competence level, obtained after completion of the previous instance of the task and after the computation of new additional probabilities characterizing the implementation of the two checkpoints. The solution of the described probabilistic inference problems makes it possible to provide correct information about the learning process for intelligent computer training systems. It helps to get proper feedback and to track the student's competence level. The developed probabilistic inference kernel can be used as the basis of the decision-making model for an automated training process. The scientific novelty lies in the fact that dynamic Bayesian networks are applied to a new class of problems related to the simulation of engineering skills training in the process of performing algorithmic tasks.
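
The filtering and forecasting problems named in this abstract can be sketched on a deliberately tiny model. The following is not the network from the article; it assumes a single hidden competence variable with two states and one observed task outcome per step, and all probabilities are hypothetical. Smoothing (the retrospective analysis) is omitted for brevity.

```python
def predict(belief, transition):
    """One-step forecast: push the belief through the transition model."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]

def filter_step(belief, transition, likelihood):
    """Filtering: predict one step ahead, then condition on the new observation."""
    predicted = predict(belief, transition)
    unnorm = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def forecast(belief, transition, steps):
    """Forecasting: repeated prediction with no new evidence."""
    for _ in range(steps):
        belief = predict(belief, transition)
    return belief

# Hidden states: index 0 = skill not yet mastered, index 1 = mastered.
# Hypothetical transition model: P(next state | current state).
TRANSITION = [[0.7, 0.3],
              [0.0, 1.0]]
# Hypothetical observation model: P(task solved correctly | state).
P_CORRECT = [0.2, 0.9]

prior = [0.8, 0.2]
after_correct = filter_step(prior, TRANSITION, P_CORRECT)
two_steps_ahead = forecast(prior, TRANSITION, steps=2)

print([round(p, 3) for p in after_correct])
print([round(p, 3) for p in two_steps_ahead])
```

In this toy setting, estimating how many more iterations a student needs would amount to calling `forecast` with increasing `steps` until the probability of mastery exceeds a chosen threshold.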


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 146 ◽  
Author(s):  
Guanming Wu ◽  
Eric Dawson ◽  
Adrian Duong ◽  
Robin Haw ◽  
Lincoln Stein

High-throughput experiments are routinely performed in modern biological studies. However, extracting meaningful results from massive experimental data sets is a challenging task for biologists. Projecting data onto pathway and network contexts is a powerful way to unravel patterns embedded in seemingly scattered large data sets and assist knowledge discovery related to cancer and other complex diseases. We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes a highly reliable gene functional interaction network and human-curated pathways from Reactome and other pathway databases. This app provides a suite of features to assist biologists in performing pathway- and network-based data analysis in a biologically intuitive and user-friendly way. Biologists can use this app to uncover network and pathway patterns related to their studies, search for gene signatures from gene expression data sets, reveal pathways significantly enriched by genes in a list, and integrate multiple genomic data types into a pathway context using probabilistic graphical models. We believe our app will give researchers substantial power to analyze intrinsically noisy high-throughput experimental data to find biologically relevant information.


2021 ◽  
Author(s):  
Andrew J Kavran ◽  
Aaron Clauset

Abstract Background: Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.
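
The notion of combining a measurement with those of its network neighbors can be sketched with a simple averaging filter. This is one possible smoothing filter for the correlated case only, not necessarily the filter defined in the paper; the node names, values and graph are hypothetical, and the handling of anti-correlated neighbors and of per-module filters is not shown.

```python
def mean_filter(values, neighbors):
    """Replace each node's value with the mean over itself and its neighbors."""
    filtered = {}
    for node, value in values.items():
        group = [value] + [values[n] for n in neighbors.get(node, [])]
        filtered[node] = sum(group) / len(group)
    return filtered

# Hypothetical path graph a - b - c with a noisy spike at node b.
values = {"a": 1.0, "b": 4.0, "c": 1.0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

smoothed = mean_filter(values, neighbors)
print(smoothed)
```

All filtered values are computed from the original measurements rather than from already-filtered ones, so the result does not depend on the order in which nodes are visited.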

