chromstaR: Tracking combinatorial chromatin state dynamics in space and time

Mapping Intimacies ◽

10.1101/038612 ◽

2016 ◽

Cited By ~ 11

Author(s):

Aaron Taudt ◽

Minh Anh Nguyen ◽

Matthias Heinig ◽

Frank Johannes ◽

Maria Colomé-Tatché

Keyword(s):

Temporal Dynamics ◽

Cell Types ◽

R Package ◽

Developmental Time ◽

Genomic Region ◽

Chromatin State ◽

Post Translational Modifications ◽

Chromatin States ◽

Genome Wide ◽

Experimental Treatments

Abstract Background: Post-translational modifications of histone residue tails are an important component of genome regulation. It is becoming increasingly clear that the combinatorial presence and absence of various modifications define discrete chromatin states which determine the functional properties of a locus. An emerging experimental goal is to track changes in chromatin state maps across different conditions, such as experimental treatments, cell-types or developmental time points. Results: Here we present chromstaR, an algorithm for the computational inference of combinatorial chromatin state dynamics across an arbitrary number of conditions. chromstaR uses a multivariate Hidden Markov Model to determine the number of discrete combinatorial chromatin states using multiple ChIP-seq experiments as input and assigns every genomic region to a state based on the presence/absence of each modification in every condition. We demonstrate the advantages of chromstaR in the context of three common experimental data scenarios. First, we study how different histone modifications combine to form combinatorial chromatin states in a single tissue. Second, we infer genome-wide patterns of combinatorial state differences between two cell types or conditions. Finally, we study the dynamics of combinatorial chromatin states during tissue differentiation involving up to six differentiation points. Our findings reveal a striking sparsity in the combinatorial organization and temporal dynamics of chromatin state maps. Conclusions: chromstaR is a versatile computational tool that facilitates a deeper biological understanding of chromatin organization and dynamics. The algorithm is implemented as an R-package and freely available from http://bioconductor.org/packages/chromstaR/.
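
As a rough illustration of the presence/absence idea in the abstract above, the following sketch labels each genomic bin by the set of marks called present and compares the number of observed combinations with the 2^N possible ones. It is not chromstaR's Hidden Markov Model and does not use the R package: the binary calls are assumed to be given, and the mark names, toy data, and the function name `combinatorial_states` are hypothetical choices made for illustration.

```python
from collections import Counter


def combinatorial_states(calls, marks):
    """Label every bin by the marks that are present in it.

    calls: list of tuples of 0/1 values, one tuple per genomic bin,
           one entry per mark (hypothetical, already-binarized input).
    marks: list of mark names in the same order as the tuple entries.
    """
    states = []
    for row in calls:
        present = [mark for mark, call in zip(marks, row) if call]
        # A bin with no mark present gets the label "empty".
        states.append("+".join(present) if present else "empty")
    return states


# Toy example: three marks, five bins (illustrative values only).
marks = ["H3K4me3", "H3K27me3", "H3K36me3"]
calls = [
    (1, 0, 0),
    (1, 1, 0),
    (0, 0, 0),
    (1, 0, 0),
    (0, 0, 1),
]

states = combinatorial_states(calls, marks)
counts = Counter(states)

possible = 2 ** len(marks)   # every presence/absence combination
observed = len(counts)       # combinations that actually occur
print(f"{observed} of {possible} possible combinatorial states observed")
for state, n in counts.most_common():
    print(state, n)
```

With N marks there are 2^N possible combinations, so counting how many actually occur is one simple way to look at the kind of sparsity the abstract reports.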


Annotation of Chromatin States in 66 Complete Mouse Epigenomes During Development

10.1101/2020.07.23.218552 ◽

2020 ◽

Author(s):

Arjan van der Velde ◽

Kaili Fan ◽

Junko Tsuji ◽

Jill Moore ◽

Michael Purcaro ◽

...

Keyword(s):

Time Course ◽

Target Genes ◽

Cell Types ◽

Developmental Time ◽

Developmental Trajectory ◽

Phase Iii ◽

Chromatin State ◽

Mammalian Development ◽

Chromatin States ◽

Transcription Start Sites

ABSTRACT The morphologically and functionally distinct cell types of a multicellular organism are maintained by epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 10.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions throughout the developmental time course. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by Polycomb Repressive Complex proteins and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


Annotation of chromatin states in 66 complete mouse epigenomes during development

Communications Biology ◽

10.1038/s42003-021-01756-4 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Arjan van der Velde ◽

Kaili Fan ◽

Junko Tsuji ◽

Jill E. Moore ◽

Michael J. Purcaro ◽

...

Keyword(s):

Target Genes ◽

Cell Types ◽

Developmental Trajectory ◽

Phase Iii ◽

Chromatin State ◽

Mammalian Development ◽

Chromatin States ◽

Transcription Start Sites ◽

Repressive Mark ◽

Distinct Cell

Abstract The morphologically and functionally distinct cell types of a multicellular organism are maintained by their unique epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 11.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by polycomb repressive complex proteins, and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


Emergence of neuronal diversity during vertebrate brain development

10.1101/839860 ◽

2019 ◽

Author(s):

Bushra Raj ◽

Jeffrey A. Farrell ◽

Aaron McKenna ◽

Jessica L. Leslie ◽

Alexander F. Schier

Keyword(s):

Brain Development ◽

Single Cell ◽

Time Course ◽

Temporal Dynamics ◽

Cell Types ◽

Developmental Time ◽

Neural Progenitors ◽

Neuronal Diversity ◽

Gene Markers ◽

Vertebrate Brain

ABSTRACT Neurogenesis in the vertebrate brain comprises many steps ranging from the proliferation of progenitors to the differentiation and maturation of neurons. Although these processes are highly regulated, the landscape of transcriptional changes and progenitor identities underlying brain development are poorly characterized. Here, we describe the first developmental single-cell RNA-seq catalog of more than 200,000 zebrafish brain cells encompassing 12 stages from 12 hours post-fertilization (hpf) to 15 days post-fertilization (dpf). We characterize known and novel gene markers for more than 800 clusters across these timepoints. Our results capture the temporal dynamics of multiple neurogenic waves from embryo to larva that expand neuronal diversity from ∼20 cell types at 12 hpf to ∼100 cell types at 15 dpf. We find that most embryonic neural progenitor states are transient and transcriptionally distinct from long-lasting neural progenitors of post-embryonic stages. Furthermore, we reconstruct cell specification trajectories for the retina and hypothalamus, and identify gene expression cascades and novel markers. Our analysis reveals that late-stage retinal neural progenitors transcriptionally overlap cell states observed in the embryo, while hypothalamic neural progenitors become progressively distinct with developmental time. These data provide the first comprehensive single-cell transcriptomic time course for vertebrate brain development and suggest distinct neurogenic regulatory paradigms between different stages and tissues.


Systematic mapping of chromatin state landscapes during mouse development

10.1101/166652 ◽

2017 ◽

Cited By ~ 15

Author(s):

David U. Gorkin ◽

Iros Barozzi ◽

Yanxiao Zhang ◽

Ah Young Lee ◽

Bin Li ◽

...

Keyword(s):

Histone Modifications ◽

Target Genes ◽

Developmental Stages ◽

Developmental Time ◽

Chromatin State ◽

Mouse Development ◽

Disease Etiology ◽

Chromatin States ◽

Mouse Tissues ◽

Core Components

SUMMARY Embryogenesis requires epigenetic information that allows each cell to respond appropriately to developmental cues. Histone modifications are core components of a cell’s epigenome, giving rise to chromatin states that modulate genome function. Here, we systematically profile histone modifications in a diverse panel of mouse tissues at 8 developmental stages from 10.5 days post conception until birth, performing a total of 1,128 ChIP-seq assays across 72 distinct tissue-stages. We combine these histone modification profiles into a unified set of chromatin state annotations, and track their activity across developmental time and space. Through integrative analysis we identify dynamic enhancers, reveal key transcriptional regulators, and characterize the role of chromatin-based repression in developmental gene regulation. We also leverage these data to link enhancers to putative target genes, revealing connections between coding and non-coding sequence variation in disease etiology. Our study provides a compendium of resources for biomedical researchers, and achieves the most comprehensive view of embryonic chromatin states to date.


Integrating regulatory DNA sequence and gene expression to predict genome-wide chromatin accessibility across cellular contexts

10.1101/605717 ◽

2019 ◽

Author(s):

Surag Nair ◽

Daniel S. Kim ◽

Jacob Perricone ◽

Anshul Kundaje

Keyword(s):

Gene Expression ◽

Dna Sequence ◽

Cell Types ◽

Chromatin Accessibility ◽

Genomic Region ◽

Regulatory Sequence ◽

Specific Expression ◽

Genome Wide ◽

Context Specific ◽

Regulatory Dna

Abstract Motivation: Genome-wide profiles of chromatin accessibility and gene expression in diverse cellular contexts are critical to decipher the dynamics of transcriptional regulation. Recently, convolutional neural networks (CNNs) have been used to learn predictive cis-regulatory DNA sequence models of context-specific chromatin accessibility landscapes. However, these context-specific regulatory sequence models cannot generalize predictions across cell types. Results: We introduce multi-modal, residual neural network architectures that integrate cis-regulatory sequence and context-specific expression of trans-regulators to predict genome-wide chromatin accessibility profiles across cellular contexts. We show that the average accessibility of a genomic region across training contexts can be a surprisingly powerful predictor. We leverage this feature and employ novel strategies for training models to enhance genome-wide prediction of shared and context-specific chromatin accessible sites across cell types. We interpret the models to reveal insights into cis and trans regulation of chromatin dynamics across 123 diverse cellular contexts. Availability: The code is available at https://github.com/kundajelab/[email protected]
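
The "average accessibility across training contexts" predictor mentioned in this abstract can be sketched in a few lines. This is only the baseline idea, not the authors' neural network; the context names, accessibility values, and the function name `mean_accessibility_baseline` are hypothetical.

```python
def mean_accessibility_baseline(train):
    """Per-region mean accessibility over all training contexts.

    train: dict mapping a context name to a list of accessibility
           values, one per genomic region (same region order in
           every context; hypothetical toy input).
    Returns one predicted value per region, to be used as-is for
    any held-out context.
    """
    profiles = list(train.values())
    n_regions = len(profiles[0])
    return [
        sum(profile[i] for profile in profiles) / len(profiles)
        for i in range(n_regions)
    ]


# Three training contexts, three regions (illustrative values only).
train = {
    "ctx_a": [1.0, 0.0, 0.5],
    "ctx_b": [1.0, 0.0, 0.0],
    "ctx_c": [1.0, 1.0, 0.25],
}
baseline = mean_accessibility_baseline(train)
print(baseline)
```

A region that is open in every training context (the first one here) gets a high baseline score regardless of the held-out context, which is why such a baseline tends to do well on shared sites and says nothing about context-specific ones.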


Self-organizing maps with variable neighborhoods facilitate learning of chromatin accessibility signal shapes associated with regulatory elements

BMC Bioinformatics ◽

10.1186/s12859-021-03976-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Tara Eicher ◽

Jany Chan ◽

Han Luu ◽

Raghu Machiraju ◽

Ewy A. Mathé

Keyword(s):

Cell Lines ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Chromatin State ◽

Self Organizing Map ◽

Functional Interpretation ◽

Chromatin States ◽

Genome Wide ◽

Hela Cell Lines ◽

Self Organizing

Abstract Background: Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular regulatory element (RE) is prevalent. These shapes can then be used to assign chromatin state using our workflow. Results: We validate the performance of the SOM-VN workflow on 14 different samples of varying quality, namely one assay each of A549 and GM12878 cell lines and two each of H1 and HeLa cell lines, primary B-cells, and brain, heart, and stomach tissue. We show that SOM-VN learns shapes that are (1) non-random, (2) associated with known chromatin states, (3) generalizable across sets of chromosomes, and (4) associated with magnitude and multimodality. We compare the accuracy of SOM-VN chromatin states against the Clustering Aggregation Tool (CAGT), an unsupervised method that learns chromatin accessibility signal shapes but does not associate these shapes with REs, and we show that overall precision and recall are increased when learning shapes using SOM-VN as compared to CAGT. We further compare enhancer state assignments from SOM-VN in signals above a set threshold to enhancer state assignments from Predicting Enhancers from ATAC-seq Data (PEAS), a deep learning method that assigns enhancer chromatin states to peaks. We show that the precision-recall area under the curve for the assignment of enhancer states is comparable to PEAS. Conclusions: Our work shows that the SOM-VN workflow can learn relationships between REs and chromatin accessibility signal shape, which is an important step toward the goal of assigning and comparing enhancer state across multiple experiments and phenotypic states.


Multi-scale annotations of chromatin states in 127 human cell-types

10.1101/2020.12.22.424078 ◽

2020 ◽

Author(s):

Yan Kai ◽

Stephanos Tsoucas ◽

Shengbao Suo ◽

Guo-Cheng Yuan

Keyword(s):

Human Cell ◽

Biological Function ◽

Cell Types ◽

Chromatin State ◽

Cell Type ◽

Chromatin States ◽

Multi Scale ◽

Public Data ◽

Domain State ◽

Cell Type Specific

Abstract Genome-wide profiling of chromatin states has been widely used to characterize the biological function of non-coding genomic sequences in a cell-type specific manner. However, the systematic, comprehensive annotations of chromatin states from experimental data are challenging and require not just extensive biological knowledge but also sophisticated computational modeling. Previously we developed a hierarchical hidden Markov model, named diHMM, to systematically annotate chromatin states at multiple scales based on the combination of histone mark and chromatin regulator binding profiles. Here, we have improved the method by optimizing computational efficiency and using an ensemble-clustering approach to achieve a unified annotation by integrating information from cell-type-specific models. We then applied this improved method to generate a unified multi-scale chromatin state map in 127 human cell types, based on public data generated by the Epigenome Roadmap and ENCODE consortia. We found that cell types with similar origin are typically associated with similar chromatin states, but cultured cell lines have structures distinct from those of primary cells. The contribution of enhancer elements to gene regulation is mediated by the broader context of domain-state organization. Distinct domain-state patterns are associated with various 3D chromatin structures. As such, we have demonstrated the utility of the multi-scale chromatin state map in characterizing the biological function of the human genome.


ALTRE: workflow for defining ALTered Regulatory Elements using chromatin accessibility data

10.1101/080564 ◽

2016 ◽

Author(s):

Elizabeth Baskin ◽

Rick Farouni ◽

Ewy A. Mathe

Keyword(s):

Cell Types ◽

R Package ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Differential Analysis ◽

Genome Wide ◽

Wide Range ◽

R Shiny ◽

Cell Type Specific ◽

Different Cell Types

Abstract Summary: Regulatory elements regulate gene transcription, and their location and accessibility are cell-type specific, particularly for enhancers. Mapping and comparing chromatin accessibility between different cell types may identify mechanisms involved in cellular development and disease progression. To streamline and simplify differential analysis of regulatory elements genome-wide using chromatin accessibility data, such as DNase-seq and ATAC-seq, we developed ALTRE (ALTered Regulatory Elements), an R package and associated R Shiny web app. ALTRE makes such analysis accessible to a wide range of users – from novice to practiced computational biologists. Availability: https://github.com/Mathelab/[email protected]


Mechanisms of Binding Specificity among bHLH Transcription Factors

International Journal of Molecular Sciences ◽

10.3390/ijms22179150 ◽

2021 ◽

Vol 22 (17) ◽

pp. 9150

Author(s):

Xabier de Martin ◽

Reza Sodaei ◽

Gabriel Santpere

Keyword(s):

Transcription Factors ◽

Binding Sites ◽

Nucleosome Occupancy ◽

Genomic Region ◽

Post Translational Modifications ◽

Helix Loop Helix ◽

Dna Motif ◽

Genome Wide ◽

Bhlh Transcription Factors ◽

Specific Genomic Region

The transcriptome of every cell is orchestrated by the complex network of interactions between transcription factors (TFs) and their binding sites on DNA. Disruption of this network can result in many forms of organism malfunction but can also be the substrate of positive natural selection. However, understanding the specific determinants of each of these individual TF-DNA interactions is a challenging task as it requires integrating the multiple possible mechanisms by which a given TF ends up interacting with a specific genomic region. These mechanisms include DNA motif preferences, which can be determined by nucleotide sequence but also by DNA’s shape; post-translational modifications of the TF, such as phosphorylation; and dimerization partners and co-factors, which can mediate multiple forms of direct or indirect cooperative binding. Binding can also be affected by epigenetic modifications of putative target regions, including DNA methylation and nucleosome occupancy. In this review, we describe how all these mechanisms play a role and crosstalk in one specific family of TFs, the basic helix-loop-helix (bHLH) family, whose members share a highly conserved DNA binding domain and a similar preferred DNA motif, the E-box. Here, we compile and discuss a rich catalog of strategies used by bHLH TFs to acquire TF-specific genome-wide landscapes of binding sites.


FourCSeq: Analysis of 4C sequencing data

10.1101/009548 ◽

2014 ◽

Cited By ~ 4

Author(s):

Felix A. Klein ◽

Tibor Pakozdi ◽

Simon Anders ◽

Yad Ghavi-Helm ◽

Eileen E. M. Furlong ◽

...

Keyword(s):

Specific Interaction ◽

Cell Types ◽

R Package ◽

Genomic Region ◽

Specific Interactions ◽

Genomic Distance ◽

Sequencing Data ◽

Experimental Conditions ◽

Chromosome Conformation ◽

Z Scores

Abstract Motivation: Circularized Chromosome Conformation Capture (4C) is a powerful technique for studying the spatial interactions of a specific genomic region called the "viewpoint" with the rest of the genome, both in a single condition or comparing different experimental conditions or cell types. Observed ligation frequencies show a strong, regular dependence on genomic distance from the viewpoint, on top of which specific interaction peaks are superimposed. Here, we address the computational task to find these specific interactions and to detect changes between interaction profiles of different conditions. Results: We model the overall trend of decreasing interaction frequency with genomic distance by fitting a smooth monotonically decreasing function to suitably transformed count data. Based on the fit, z-scores are calculated from the residuals, with high z-scores being interpreted as peaks providing evidence for specific interactions. To compare different conditions, we normalize fragment counts between samples, and call for differential contact frequencies using the statistical method DESeq2 adapted from RNA-Seq analysis. Availability and Implementation: A full end-to-end analysis pipeline is implemented in the R package FourCSeq available at www.bioconductor.org.
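
The fit-then-residual idea in this abstract can be sketched as follows. This is a simplified stand-in, not the FourCSeq implementation: it swaps in a `log1p` transform for the "suitably transformed" counts and a pool-adjacent-violators step fit for the smooth monotone function, and it scales residuals by their plain standard deviation. The fragment counts and the function names are hypothetical.

```python
import math
import statistics


def fit_decreasing(values):
    """Least-squares non-increasing fit (pool-adjacent-violators).

    values must be ordered by increasing distance from the viewpoint.
    """
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while an earlier block has a smaller mean than the
        # block after it, which would violate the decreasing trend.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def residual_z_scores(counts):
    """z-scores of transformed counts around the decreasing trend."""
    transformed = [math.log1p(c) for c in counts]
    fitted = fit_decreasing(transformed)
    residuals = [t - f for t, f in zip(transformed, fitted)]
    sd = statistics.pstdev(residuals)
    if sd == 0:
        return fitted, [0.0] * len(residuals)
    return fitted, [r / sd for r in residuals]


# Toy counts for fragments at increasing distance from the viewpoint;
# the fourth fragment carries an artificial interaction peak.
counts = [400, 220, 150, 900, 60, 40, 30, 20]
fitted, z = residual_z_scores(counts)
peak_index = z.index(max(z))
print("fragment with the highest z-score:", peak_index)
```

Fragments whose transformed count sits well above the distance trend receive high z-scores, which is the sense in which the abstract treats high z-scores as evidence for specific interactions. Comparing conditions, which the abstract does with DESeq2, is outside the scope of this sketch.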
