An Integrative Approach for Fine-Mapping Chromatin Interactions

Mapping Intimacies ◽

10.1101/605576 ◽

2019 ◽

Author(s):

Artur Jaroszewicz ◽

Jason Ernst

Keyword(s):

High Resolution ◽

Binding Sites ◽

Biological Significance ◽

Computational Method ◽

Integrative Approach ◽

Genome Architecture ◽

Open Chromatin ◽

Chromatin Interactions ◽

Genome Wide ◽

Evolutionarily Conserved

Abstract: Chromatin interactions play an important role in genome architecture and regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g., 5-25 kb), which is substantially coarser than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions. To predict the sources of Hi-C-identified interactions at a high resolution (e.g., 100 bp), we developed a computational method that integrates ChIP-seq data of transcription factors and histone marks and DNase-seq data. Our method, χ-SCNN, uses this data to first train a Siamese Convolutional Neural Network (SCNN) to discriminate between called Hi-C interactions and non-interactions. χ-SCNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show these predictions recover original Hi-C peaks after extending them to be coarser. We also show χ-SCNN predictions enrich for evolutionarily conserved bases, eQTLs, and CTCF motifs, supporting their biological significance. χ-SCNN provides an approach for analyzing important aspects of genome architecture and regulation at a higher resolution than previously possible. χ-SCNN software is available on GitHub (https://github.com/ernstlab/X-SCNN).

An integrative approach for fine-mapping chromatin interactions

Bioinformatics ◽

10.1093/bioinformatics/btz843 ◽

2019 ◽

Vol 36 (6) ◽

pp. 1704-1711

Author(s):

Artur Jaroszewicz ◽

Jason Ernst

Keyword(s):

Gene Regulation ◽

High Resolution ◽

Biological Significance ◽

Computational Method ◽

Supplementary Information ◽

Integrative Approach ◽

Genome Architecture ◽

Open Chromatin ◽

Chromatin Interactions ◽

Genome Wide

Abstract

Motivation: Chromatin interactions play an important role in genome architecture and gene regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g. 5-25 kb), which is substantially coarser than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions.

Results: To predict the sources of Hi-C-identified interactions at a high resolution (e.g. 100 bp), we developed a computational method that integrates data from DNase-seq and ChIP-seq of TFs and histone marks. Our method, χ-CNN, uses this data to first train a convolutional neural network (CNN) to discriminate between called Hi-C interactions and non-interactions. χ-CNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show these predictions recover original Hi-C peaks after extending them to be coarser. We also show χ-CNN predictions enrich for evolutionarily conserved bases, eQTLs and CTCF motifs, supporting their biological significance. χ-CNN provides an approach for analyzing important aspects of genome architecture and gene regulation at a higher resolution than previously possible.

Availability and implementation: χ-CNN software is available on GitHub (https://github.com/ernstlab/X-CNN).

Supplementary information: Supplementary data are available at Bioinformatics online.
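
To illustrate the fine-mapping idea in the abstract above (score a pair of coarse Hi-C anchors, then attribute the score to 100 bp bins), here is a minimal sketch. It is not the χ-CNN implementation: the abstract does not say which feature attribution method is used, so occlusion is swapped in as one simple option, and `toy_score`, `occlusion_attribution`, `fine_map`, and all numbers are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# One row of feature signals (e.g. ChIP-seq, DNase-seq) per 100 bp bin.
Window = List[List[float]]


def toy_score(left: Window, right: Window) -> float:
    """Hypothetical stand-in for a trained interaction classifier.

    It multiplies the summed signal of the two anchors, so the score
    rises only when both sides carry signal.
    """
    return sum(map(sum, left)) * sum(map(sum, right))


def occlusion_attribution(
    score: Callable[[Window, Window], float], left: Window, right: Window
) -> List[float]:
    """Score drop caused by zeroing each bin of the left anchor in turn."""
    baseline = score(left, right)
    drops = []
    for i in range(len(left)):
        occluded = [
            [0.0] * len(row) if j == i else row for j, row in enumerate(left)
        ]
        drops.append(baseline - score(occluded, right))
    return drops


def fine_map(
    drops: Sequence[float], window_start: int, bin_size: int = 100
) -> Tuple[int, int]:
    """Genomic interval of the bin whose occlusion lowers the score most."""
    best = max(range(len(drops)), key=lambda i: drops[i])
    start = window_start + best * bin_size
    return (start, start + bin_size)


# Toy anchors: five bins with two feature tracks each (assumed values).
left = [[0.1, 0.0], [0.2, 0.1], [3.0, 2.0], [0.0, 0.1], [0.1, 0.1]]
right = [[1.0, 1.0] for _ in range(5)]
drops = occlusion_attribution(toy_score, left, right)
interval = fine_map(drops, window_start=1_000_000)
```

In this toy case the third bin carries almost all of the left anchor's signal, so `interval` is `(1000200, 1000300)`. A real model would replace `toy_score`, and a gradient-based attribution could replace the occlusion loop.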

Chromatin interaction neural network (ChINN): a machine learning-based method for predicting chromatin interactions from DNA sequences

Genome Biology ◽

10.1186/s13059-021-02453-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Fan Cao ◽

Yu Zhang ◽

Yichao Cai ◽

Sambhavi Animesh ◽

Ying Zhang ◽

...

Keyword(s):

Neural Network ◽

Rna Polymerase Ii ◽

Dna Sequences ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Interaction Prediction ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is limited. We develop a computational method, chromatin interaction neural network (ChINN), to predict chromatin interactions between open chromatin regions using only DNA sequences. ChINN predicts CTCF- and RNA polymerase II-associated and Hi-C chromatin interactions. ChINN shows good across-sample performances and captures various sequence features for chromatin interaction prediction. We apply ChINN to 6 chronic lymphocytic leukemia (CLL) patient samples and a published cohort of 84 CLL open chromatin samples. Our results demonstrate extensive heterogeneity in chromatin interactions among CLL patient samples.
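
Since the method above takes only the DNA sequences of two open chromatin regions as input, a sequence-only model needs a numeric encoding of each region. The sketch below shows one-hot encoding, a common choice for sequence models; it is an assumption that an encoding of this kind is used, and `one_hot` and `encode_pair` are hypothetical helper names, not part of ChINN.

```python
from typing import List, Tuple

BASES = "ACGT"


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as one 4-element row per base (order A, C, G, T).

    Any other symbol, such as N, becomes an all-zero row.
    """
    return [[int(base == b) for b in BASES] for base in seq.upper()]


def encode_pair(
    seq_a: str, seq_b: str
) -> Tuple[List[List[int]], List[List[int]]]:
    """Encode the two candidate interacting regions independently."""
    return one_hot(seq_a), one_hot(seq_b)


anchor_a, anchor_b = encode_pair("ACGT", "ttNa")
```

Each anchor becomes a length-by-4 matrix that a convolutional layer can scan for sequence features.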

Predicting chromatin interactions between open chromatin regions from DNA sequences

10.1101/720748 ◽

2019 ◽

Cited By ~ 3

Author(s):

Fan Cao ◽

Ying Zhang ◽

Yan Ping Loh ◽

Yichao Cai ◽

Melissa J. Fullwood

Keyword(s):

Rna Polymerase Ii ◽

Dna Sequences ◽

State Of The Art ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Rna Seq ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is very limited. Various computational methods have been developed to predict chromatin interactions. Most of these methods rely on large collections of ChIP-Seq/RNA-Seq/DNase-Seq datasets and predict only enhancer-promoter interactions. Some of the ‘state-of-the-art’ methods have poor experimental designs, leading to exaggerated performance and misleading conclusions. Here we developed a computational method, Chromatin Interaction Neural Network (CHINN), to predict chromatin interactions between open chromatin regions by using only DNA sequences of the interacting open chromatin regions. CHINN is able to predict CTCF- and RNA polymerase II-associated chromatin interactions between open chromatin regions. CHINN also shows good across-sample performance and captures various sequence features that are predictive of chromatin interactions. We applied CHINN to 84 chronic lymphocytic leukemia (CLL) samples and detected systematic differences in the chromatin interactome between IGVH-mutated and IGVH-unmutated CLL samples.

Accessible Region Conformation Capture (ARC-C) gives high resolution insights into genome architecture and regulation

Genome Research ◽

10.1101/gr.275669.121 ◽

2021 ◽

pp. gr.275669.121

Author(s):

Ni Huang ◽

Wei Qiang Seow ◽

Alex Appert ◽

Yan Dong ◽

Przemyslaw Stempor ◽

...

Keyword(s):

High Resolution ◽

Regulatory Elements ◽

Genome Architecture ◽

Regulatory Interactions ◽

C Elegans ◽

Chromatin Regulators ◽

Chromatin Interactions ◽

Genome Wide ◽

Accessible Region ◽

Domain Level

Nuclear organization and chromatin interactions are important for genome function, yet determining chromatin connections at high resolution remains a major challenge. To address this, we developed Accessible Region Conformation Capture (ARC-C), which profiles interactions between regulatory elements genome-wide without a capture step. Applying ARC-C to C. elegans, we identify ~15,000 significant interactions between regulatory elements at 500 bp resolution. Of 105 TFs or chromatin regulators tested, we find that the binding sites of 60 are enriched for interacting with each other, making them candidates for mediating interactions. These include cohesin and condensin II. Applying ARC-C to a mutant of transcription factor BLMP-1 detected changes in interactions between its targets. ARC-C simultaneously profiles domain-level architecture, and we observe that C. elegans chromatin domains defined by either active or repressive modifications form topologically associating domains (TADs) which interact with an A/B (active/inactive) compartment-like structure. Furthermore, we discovered that inactive compartment interactions are dependent on H3K9 methylation. ARC-C is a powerful new tool to interrogate genome architecture and regulatory interactions at high resolution.

Chromatin Interaction Neural Network (ChINN): A machine learning-based method for predicting chromatin interactions from DNA sequences

10.1101/2020.12.30.424817 ◽

2020 ◽

Author(s):

Fan Cao ◽

Yu Zhang ◽

Yichao Cai ◽

Sambhavi Animesh ◽

Ying Zhang ◽

...

Keyword(s):

Neural Network ◽

Rna Polymerase Ii ◽

Dna Sequences ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Clinical Patient ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is limited. Various computational methods have been developed to predict chromatin interactions. Most of these methods rely on large collections of ChIP-Seq/RNA-Seq/DNase-Seq datasets and predict only enhancer-promoter interactions. Some of the ‘state-of-the-art’ methods have poor experimental designs, leading to exaggerated performance and misleading conclusions. Here we developed a computational method, Chromatin Interaction Neural Network (ChINN), to predict chromatin interactions between open chromatin regions by using only DNA sequences of the interacting open chromatin regions. ChINN is able to predict CTCF-, RNA polymerase II- and Hi-C-associated chromatin interactions between open chromatin regions. ChINN also shows good across-sample performance and captures various sequence features that are predictive of chromatin interactions. To apply our results to clinical patient data, we used ChINN to predict chromatin interactions in 6 chronic lymphocytic leukemia (CLL) patient samples and in a previously published cohort of open chromatin data from 84 CLL samples. Our results demonstrated extensive heterogeneity in chromatin interactions in patient samples, and one of the sources of this heterogeneity was the different subtypes of CLL.

High-resolution single-cell 3D-models of chromatin ensembles during Drosophila embryogenesis

Nature Communications ◽

10.1038/s41467-020-20490-9 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Qiu Sun ◽

Alan Perez-Rathke ◽

Daniel M. Czajkowsky ◽

Zhifeng Shao ◽

Jie Liang

Keyword(s):

High Resolution ◽

Single Cell ◽

3D Models ◽

Computational Method ◽

Midblastula Transition ◽

Genome Wide ◽

Large Ensembles ◽

Chromatin Folding ◽

Function Relationship ◽

Relationship Of

Abstract: Single-cell chromatin studies provide insights into how chromatin structure relates to functions of individual cells. However, balancing high resolution and genome-wide coverage remains challenging. We describe a computational method for the reconstruction of large 3D-ensembles of single-cell (sc) chromatin conformations from population Hi-C that we apply to study embryogenesis in Drosophila. With minimal assumptions of physical properties and without adjustable parameters, our method generates large ensembles of chromatin conformations via deep-sampling. Our method identifies specific interactions, which constitute 5–6% of Hi-C frequencies, but surprisingly are sufficient to drive chromatin folding, giving rise to the observed Hi-C patterns. Modeled sc-chromatins quantify chromatin heterogeneity, revealing significant changes during embryogenesis. Furthermore, >50% of modeled sc-chromatin maintain topologically associating domains (TADs) in early embryos, when no population TADs are perceptible. Domain boundaries become fixated during development, with strong preference at binding-sites of insulator-complexes upon the midblastula transition. Overall, high-resolution 3D-ensembles of sc-chromatin conformations enable further in-depth interpretation of population Hi-C, improving understanding of the structure-function relationship of genome organization.

HiCRes: a computational method to estimate and predict the resolution of HiC libraries

10.1101/2020.09.22.307967 ◽

2020 ◽

Author(s):

Claire Marchal ◽

Nivedita Singh ◽

Ximena Corso-Díaz ◽

Anand Swaroop

Keyword(s):

Expression Patterns ◽

Three Dimensional ◽

Computational Method ◽

Mathematical Concepts ◽

Regulate Gene Expression ◽

Chromatin Interactions ◽

Genome Wide ◽

A Cell ◽

Cell Type Specific ◽

Human And Mouse

Abstract: Three-dimensional (3D) conformation of the chromatin is crucial to stringently regulate gene expression patterns and DNA replication in a cell-type specific manner. HiC is a key technique for measuring 3D chromatin interactions genome wide. Estimating and predicting the resolution of a library is an essential step in any HiC experimental design. Here, we present the mathematical concepts to estimate the resolution of a library and predict whether deeper sequencing would enhance the resolution. We have developed HiCRes, a docker pipeline, by applying these concepts to human and mouse HiC libraries.
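
The abstract above does not state how library resolution is defined, so the following is only a sketch under an assumed convention that is widely used for Hi-C maps: the resolution is the smallest bin size at which at least 80% of bins contain at least 1,000 contacts. It does not reproduce the HiCRes pipeline, and `bin_coverage`, `estimate_resolution`, and the toy thresholds are hypothetical.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

Contact = Tuple[int, int]  # genomic positions of the two read ends


def bin_coverage(
    contacts: Iterable[Contact], bin_size: int, chrom_length: int
) -> List[int]:
    """Count, for every bin, the contacts with at least one end in it."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for a, b in contacts:
        for idx in {a // bin_size, b // bin_size}:
            counts[idx] += 1
    return counts


def estimate_resolution(
    contacts: Sequence[Contact],
    chrom_length: int,
    bin_sizes: Iterable[int],
    min_contacts: int = 1000,
    min_fraction: float = 0.8,
) -> Optional[int]:
    """Smallest candidate bin size meeting the assumed coverage criterion."""
    for bin_size in sorted(bin_sizes):
        counts = bin_coverage(contacts, bin_size, chrom_length)
        covered = sum(c >= min_contacts for c in counts)
        if covered >= min_fraction * len(counts):
            return bin_size
    return None  # no candidate bin size is supported by this library


# Toy library: 100 evenly spaced contacts on a 1,000 bp "chromosome".
toy_contacts = [(p, (p + 500) % 1000) for p in range(0, 1000, 10)]
toy_resolution = estimate_resolution(
    toy_contacts, 1000, [200, 100, 50], min_contacts=15
)
```

With the lowered toy threshold of 15 contacts per bin, 50 bp bins hold only 10 contacts each while 100 bp bins hold 20, so `toy_resolution` is 100. Rerunning the same function on a subsample of the contacts is one simple way to see how resolution changes with sequencing depth.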

methyl-ATAC-seq measures DNA methylation at accessible chromatin

10.1101/445486 ◽

2018 ◽

Cited By ~ 1

Author(s):

R Spektor ◽

ND Tippens ◽

CA Mimoso ◽

PD Soloway

Keyword(s):

Dna Methylation ◽

Binding Sites ◽

Open Chromatin ◽

Regulatory Sequences ◽

Protein Binding Sites ◽

3 Dimensional ◽

Genome Wide ◽

Gene Regulatory ◽

Nucleosome Location ◽

Accessible Chromatin

Abstract: Chromatin features are characterized by genome-wide assays for nucleosome location, protein binding sites, 3-dimensional interactions, and modifications to histones and DNA. For example, Assay for Transposase Accessible Chromatin sequencing (ATAC-seq) identifies nucleosome-depleted (open) chromatin, which harbors potentially active gene regulatory sequences; and bisulfite sequencing (BS-seq) quantifies DNA methylation. When two distinct chromatin features like these are assayed separately in populations of cells, it is impossible to determine, with certainty, where the features are coincident in the genome by simply overlaying datasets. Here we describe methyl-ATAC-seq (mATAC-seq), which implements modifications to ATAC-seq, including subjecting the output to BS-seq. Merging these assays into a single protocol identifies the locations of open chromatin, and reveals, unambiguously, the DNA methylation state of the underlying DNA. Such combinatorial methods eliminate the need to perform assays independently and infer where features are coincident.

The native cistrome and sequence motif families of the maize ear

PLoS Genetics ◽

10.1371/journal.pgen.1009689 ◽

2021 ◽

Vol 17 (8) ◽

pp. e1009689

Author(s):

Savannah D. Savadel ◽

Thomas Hartwig ◽

Zachary M. Turpin ◽

Daniel L. Vera ◽

Pei-Yau Lung ◽

...

Keyword(s):

High Resolution ◽

Binding Sites ◽

Regulatory Networks ◽

Regulatory Elements ◽

Chromatin Interaction ◽

Sequence Motif ◽

Binding Prediction ◽

Interaction Sites ◽

Genome Wide ◽

Dna Regulatory Elements

Elucidating the transcriptional regulatory networks that underlie growth and development requires robust ways to define the complete set of transcription factor (TF) binding sites. Although TF-binding sites are known to be generally located within accessible chromatin regions (ACRs), pinpointing these DNA regulatory elements globally remains challenging. Current approaches primarily identify binding sites for a single TF (e.g. ChIP-seq), or globally detect ACRs but lack the resolution to consistently define TF-binding sites (e.g. DNase-seq, ATAC-seq). To address this challenge, we developed MNase-defined cistrome-Occupancy Analysis (MOA-seq), a high-resolution (< 30 bp), high-throughput, and genome-wide strategy to globally identify putative TF-binding sites within ACRs. As a proof of concept, we applied MOA-seq to developing maize ears and defined a cistrome of 145,000 MOA footprints (MFs). While a substantial majority (76%) of the known ATAC-seq ACRs intersected with the MFs, only a minority of MFs overlapped with the ATAC peaks, indicating that the majority of MFs were novel and not detected by ATAC-seq. MFs were associated with promoters and significantly enriched for TF-binding and long-range chromatin interaction sites, including for the well-characterized FASCIATED EAR4, KNOTTED1, and TEOSINTE BRANCHED1. Importantly, the MOA-seq strategy improved the spatial resolution of TF-binding prediction and allowed us to identify 215 motif families collectively distributed over more than 100,000 non-overlapping, putatively occupied binding sites across the genome. Our study presents a simple, efficient, and high-resolution approach to identify putative TF footprints and binding motifs genome-wide, to ultimately define a native cistrome atlas.

RedChIP identifies noncoding RNAs associated with genomic sites occupied by Polycomb and CTCF proteins

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2116222119 ◽

2021 ◽

Vol 119 (1) ◽

pp. e2116222119

Author(s):

Alexey A. Gavrilov ◽

Rinat I. Sultanov ◽

Mikhail D. Magnitov ◽

Aleksandra A. Galitsyna ◽

Erdem B. Dashinimaev ◽

...

Keyword(s):

Chromatin Immunoprecipitation ◽

Binding Sites ◽

Wide Spectrum ◽

Noncoding Rnas ◽

Ctcf Binding ◽

Polycomb Repressive Complex 2 ◽

Proximity Ligation ◽

Chromatin Interactions ◽

Genome Wide ◽

Architectural Protein

Nuclear noncoding RNAs (ncRNAs) are key regulators of gene expression and chromatin organization. Progress in studying nuclear ncRNAs depends on the ability to identify the genome-wide spectrum of contacts of ncRNAs with chromatin. To address this question, a panel of RNA–DNA proximity ligation techniques has been developed. However, none of these techniques examines proteins involved in RNA–chromatin interactions. Here, we introduce RedChIP, a technique combining RNA–DNA proximity ligation and chromatin immunoprecipitation for identifying RNA–chromatin interactions mediated by a particular protein. Using antibodies against architectural protein CTCF and the EZH2 subunit of the Polycomb repressive complex 2, we identify a spectrum of cis- and trans-acting ncRNAs enriched at Polycomb- and CTCF-binding sites in human cells, which may be involved in Polycomb-mediated gene repression and CTCF-dependent chromatin looping. By providing a protein-centric view of RNA–DNA interactions, RedChIP represents an important tool for studies of nuclear ncRNAs.
