Ensemble-based network aggregation improves the accuracy of gene network reconstruction
Reverse engineering approaches that construct context-specific gene regulatory networks (GRNs) from genome-wide mRNA expression data have led to significant biological findings. However, the reliability and reproducibility of the reconstructed GRNs need to be improved. Here, we propose an ensemble-based network aggregation approach to improve the accuracy of the network topology constructed from mRNA expression data. To evaluate the performance of different approaches, we created dozens of simulated networks and also tested our methods on three Escherichia coli datasets. We demonstrate three novel applications of this approach. First, bootstrapping can be performed on the available samples, turning any network reconstruction approach into an ensemble method. Second, the aggregation approach can be used to combine GRNs from different network inference methods, creating a novel network reconstruction approach that consistently outperforms any constituent method. Third, the approach can be used to effectively integrate GRNs constructed from different studies, producing more accurate networks. We are releasing an implementation of these techniques as an R package, “ENA”, which can run network inference in parallel across multiple servers. To ensure the reproducibility of our results, we have made all of the code and data used in our simulations and analysis available online at https://github.com/QBRC/ENA-Research.
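The bootstrapping and aggregation ideas summarized above can be illustrated with a minimal Python sketch. It is not the ENA package's API. All function names here are hypothetical, absolute Pearson correlation stands in for a real network inference method, and averaging edge ranks is an assumed aggregation rule, since the abstract does not state which rule is used.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def infer_correlation_network(samples):
    """Hypothetical stand-in for a GRN inference method.

    samples: list of samples, each a list of expression values per gene.
    Returns {(gene_i, gene_j): score}, where a higher score means a
    more confident edge.
    """
    n_genes = len(samples[0])
    columns = [[s[g] for s in samples] for g in range(n_genes)]
    return {(i, j): abs(pearson(columns[i], columns[j]))
            for i, j in combinations(range(n_genes), 2)}


def rank_edges(scores):
    """Rank edges by descending score (rank 1 = strongest edge)."""
    ordered = sorted(scores, key=lambda e: (-scores[e], e))
    return {edge: r for r, edge in enumerate(ordered, start=1)}


def aggregate_networks(networks):
    """Assumed aggregation rule: mean rank of each edge across networks.

    All networks must score the same set of edges. A lower mean rank
    means stronger support across the ensemble.
    """
    ranks = [rank_edges(net) for net in networks]
    return {e: sum(r[e] for r in ranks) / len(ranks) for e in ranks[0]}


def bootstrap_ensemble(samples, infer, n_boot=50, seed=0):
    """Resample the samples with replacement, infer one network per
    resample, and aggregate the resulting networks."""
    rng = random.Random(seed)
    networks = []
    for _ in range(n_boot):
        resample = [rng.choice(samples) for _ in samples]
        networks.append(infer(resample))
    return aggregate_networks(networks)
```

Under these assumptions, `bootstrap_ensemble(samples, infer_correlation_network)` wraps a single inference method in an ensemble, while `aggregate_networks` alone could combine networks produced by different methods or from different studies, provided each network scores the same set of edges.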