HiCcompare: a method for joint normalization of Hi-C datasets and differential chromatin interaction detection
AbstractChanges in spatial chromatin interactions are now emerging as a unifying mechanism or-chestrating regulation of gene expression. Evolution of chromatin conformation capture methods into Hi-C sequencing technology now allows an insight into chromatin interactions on a genome-wide scale. However, Hi-C data contains many DNA sequence- and technology-driven biases. These biases prevent effective comparison of chromatin interactions aimed at identifying genomic regions differentially interacting between, disease-normal states or different cell types. Several methods have been developed for normalizing individual Hi-C datasets. However, they fail to account for biases between two or more Hi-C datasets, hindering comparative analysis of chromatin interactions. We developed a simple and effective method HiCcompare for the joint normalization and differential analysis of multiple Hi-C datasets. The method avoids constraining Hi-C data within a rigid statistical model, allowing a data-driven normalization of biases using locally weighted linear regression (loess). The method identifies region-specific chromatin interaction changes complementary to changes due to large-scale genomic rearrangements, such as copy number variants (CNVs). HiCcompare outperforms methods for normalizing individual Hi-C datasets in detecting a priori known chromatin interaction differences in simulated and real-life settings while detecting biologically relevant changes. HiCcompare is freely available as a Bioconductor R package https://bioconductor.org/packages/HiCcompare/.Author SummaryAdvances in chromosome conformation capture sequencing technologies (Hi-C) have sparked interest in studying the 3-dimensional (3D) chromatin interaction structure of the human genome. The 3D structure of the genome is now considered as a primary regulator of gene expression. Changes to the 3D chromatin interactions are now emerging as a hallmark of cancer and other complex diseases. With the growing availability of Hi-C data generated under different conditions (e.g. tumor-normal, cell-type-specific), methods are needed to compare them. However, biases in Hi-C data hinder their comparative analysis. To account for biases, several normalization techniques have been developed for removing biases in individual Hi-C datasets, but very few were designed to account for between-datasets biases. We developed a new method and R package HiCcompare for the joint normalization of multiple Hi-C datasets and differential chromatin interaction detection. Our results show the superiority of our joint normalization methods compared to methods for normalizing individual datasets in detecting true chromatin interaction changes. HiCcompare enables further research into discovering the dynamics of 3D genomic changes.