chromstaR: Tracking combinatorial chromatin state dynamics in space and time

2016
Author(s):
Aaron Taudt
Minh Anh Nguyen
Matthias Heinig
Frank Johannes
Maria Colomé-Tatché

Abstract
Background: Post-translational modifications of histone residue tails are an important component of genome regulation. It is becoming increasingly clear that the combinatorial presence and absence of various modifications define discrete chromatin states which determine the functional properties of a locus. An emerging experimental goal is to track changes in chromatin state maps across different conditions, such as experimental treatments, cell types or developmental time points.
Results: Here we present chromstaR, an algorithm for the computational inference of combinatorial chromatin state dynamics across an arbitrary number of conditions. chromstaR uses a multivariate Hidden Markov Model to determine the number of discrete combinatorial chromatin states using multiple ChIP-seq experiments as input and assigns every genomic region to a state based on the presence/absence of each modification in every condition. We demonstrate the advantages of chromstaR in the context of three common experimental data scenarios. First, we study how different histone modifications combine to form combinatorial chromatin states in a single tissue. Second, we infer genome-wide patterns of combinatorial state differences between two cell types or conditions. Finally, we study the dynamics of combinatorial chromatin states during tissue differentiation involving up to six differentiation points. Our findings reveal a striking sparsity in the combinatorial organization and temporal dynamics of chromatin state maps.
Conclusions: chromstaR is a versatile computational tool that facilitates a deeper biological understanding of chromatin organization and dynamics. The algorithm is implemented as an R package and is freely available from http://bioconductor.org/packages/chromstaR/.
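The idea of a combinatorial state defined by the presence or absence of each modification can be illustrated with a small sketch. This is not chromstaR's Hidden Markov Model and not its R API; it only enumerates the state space for already-binarized calls. The mark names, the toy bins, and the helper `combinatorial_state` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def combinatorial_state(calls, marks):
    """Label a bin by the marks called present, e.g. '[H3K4me3+H3K27me3]'.

    `calls` is a tuple of 0/1 values, one per mark (hypothetical input format).
    """
    present = [mark for mark, call in zip(marks, calls) if call]
    return "[" + "+".join(present) + "]"


# Hypothetical mark set and toy presence/absence calls, one tuple per genomic bin.
marks = ["H3K4me3", "H3K27me3", "H3K36me3"]
bins = [(1, 0, 0), (1, 0, 0), (0, 0, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# With N marks there are 2**N possible combinatorial states.
possible = [combinatorial_state(c, marks) for c in product((0, 1), repeat=len(marks))]

# Count how many bins fall into each state that actually occurs.
observed = Counter(combinatorial_state(c, bins_marks) for c, bins_marks in ((b, marks) for b in bins))

print(f"{len(observed)} of {len(possible)} possible states observed")
for state, n in observed.most_common():
    print(state, n)
```

Comparing the number of observed states with the 2^N possible ones is one simple way to see the kind of sparsity the abstract refers to, under the assumption that the binary calls are already available.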

2020
Author(s):
Arjan van der Velde
Kaili Fan
Junko Tsuji
Jill Moore
Michael Purcaro
...  

Abstract: The morphologically and functionally distinct cell types of a multicellular organism are maintained by epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 10.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions throughout the developmental time course. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by Polycomb Repressive Complex proteins and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


2021
Vol 4 (1)
Author(s):
Arjan van der Velde
Kaili Fan
Junko Tsuji
Jill E. Moore
Michael J. Purcaro
...  

Abstract: The morphologically and functionally distinct cell types of a multicellular organism are maintained by their unique epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 11.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by polycomb repressive complex proteins and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


2019
Author(s):
Bushra Raj
Jeffrey A. Farrell
Aaron McKenna
Jessica L. Leslie
Alexander F. Schier

Abstract: Neurogenesis in the vertebrate brain comprises many steps ranging from the proliferation of progenitors to the differentiation and maturation of neurons. Although these processes are highly regulated, the landscape of transcriptional changes and progenitor identities underlying brain development is poorly characterized. Here, we describe the first developmental single-cell RNA-seq catalog of more than 200,000 zebrafish brain cells encompassing 12 stages from 12 hours post-fertilization (hpf) to 15 days post-fertilization (dpf). We characterize known and novel gene markers for more than 800 clusters across these timepoints. Our results capture the temporal dynamics of multiple neurogenic waves from embryo to larva that expand neuronal diversity from ∼20 cell types at 12 hpf to ∼100 cell types at 15 dpf. We find that most embryonic neural progenitor states are transient and transcriptionally distinct from long-lasting neural progenitors of post-embryonic stages. Furthermore, we reconstruct cell specification trajectories for the retina and hypothalamus, and identify gene expression cascades and novel markers. Our analyses reveal that late-stage retinal neural progenitors transcriptionally overlap cell states observed in the embryo, while hypothalamic neural progenitors become progressively distinct with developmental time. These data provide the first comprehensive single-cell transcriptomic time course for vertebrate brain development and suggest distinct neurogenic regulatory paradigms between different stages and tissues.


2017
Author(s):
David U. Gorkin
Iros Barozzi
Yanxiao Zhang
Ah Young Lee
Bin Li
...  

Summary: Embryogenesis requires epigenetic information that allows each cell to respond appropriately to developmental cues. Histone modifications are core components of a cell’s epigenome, giving rise to chromatin states that modulate genome function. Here, we systematically profile histone modifications in a diverse panel of mouse tissues at 8 developmental stages from 10.5 days post conception until birth, performing a total of 1,128 ChIP-seq assays across 72 distinct tissue-stages. We combine these histone modification profiles into a unified set of chromatin state annotations, and track their activity across developmental time and space. Through integrative analysis we identify dynamic enhancers, reveal key transcriptional regulators, and characterize the role of chromatin-based repression in developmental gene regulation. We also leverage these data to link enhancers to putative target genes, revealing connections between coding and non-coding sequence variation in disease etiology. Our study provides a compendium of resources for biomedical researchers, and achieves the most comprehensive view of embryonic chromatin states to date.


2019
Author(s):
Surag Nair
Daniel S. Kim
Jacob Perricone
Anshul Kundaje

Abstract
Motivation: Genome-wide profiles of chromatin accessibility and gene expression in diverse cellular contexts are critical to decipher the dynamics of transcriptional regulation. Recently, convolutional neural networks (CNNs) have been used to learn predictive cis-regulatory DNA sequence models of context-specific chromatin accessibility landscapes. However, these context-specific regulatory sequence models cannot generalize predictions across cell types.
Results: We introduce multi-modal, residual neural network architectures that integrate cis-regulatory sequence and context-specific expression of trans-regulators to predict genome-wide chromatin accessibility profiles across cellular contexts. We show that the average accessibility of a genomic region across training contexts can be a surprisingly powerful predictor. We leverage this feature and employ novel strategies for training models to enhance genome-wide prediction of shared and context-specific chromatin accessible sites across cell types. We interpret the models to reveal insights into cis and trans regulation of chromatin dynamics across 123 diverse cellular contexts.
Availability: The code is available at https://github.com/kundajelab/[email protected]
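The "average accessibility across training contexts" predictor mentioned in the results can be sketched in a few lines. This is only the baseline feature, not the residual network the authors describe; the context names, the toy values, and the function `mean_accessibility_baseline` are assumptions made for illustration.

```python
from statistics import mean


def mean_accessibility_baseline(train):
    """Return the per-region mean accessibility across training contexts.

    `train` maps a context name to a list of accessibility values, one per
    genomic region, with regions in the same order in every context.
    """
    profiles = list(train.values())
    n_regions = len(profiles[0])
    if any(len(p) != n_regions for p in profiles):
        raise ValueError("all contexts must cover the same regions")
    # zip(*profiles) groups the values of one region across all contexts.
    return [mean(values) for values in zip(*profiles)]


# Toy binary accessibility for three regions in three hypothetical contexts.
train = {
    "context_A": [1.0, 0.0, 1.0],
    "context_B": [1.0, 0.0, 0.0],
    "context_C": [1.0, 1.0, 0.0],
}

baseline = mean_accessibility_baseline(train)
print(baseline)
```

The resulting vector is the same for every held-out context, which is why it can only capture shared accessible sites and says nothing about context-specific ones.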


2021
Vol 22 (1)
Author(s):
Tara Eicher
Jany Chan
Han Luu
Raghu Machiraju
Ewy A. Mathé

Abstract
Background: Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular regulatory element (RE) is prevalent. These shapes can then be used to assign chromatin state using our workflow.
Results: We validate the performance of the SOM-VN workflow on 14 different samples of varying quality, namely one assay each of A549 and GM12878 cell lines and two each of H1 and HeLa cell lines, primary B-cells, and brain, heart, and stomach tissue. We show that SOM-VN learns shapes that are (1) non-random, (2) associated with known chromatin states, (3) generalizable across sets of chromosomes, and (4) associated with magnitude and multimodality. We compare the accuracy of SOM-VN chromatin states against the Clustering Aggregation Tool (CAGT), an unsupervised method that learns chromatin accessibility signal shapes but does not associate these shapes with REs, and we show that overall precision and recall are increased when learning shapes using SOM-VN as compared to CAGT. We further compare enhancer state assignments from SOM-VN in signals above a set threshold to enhancer state assignments from Predicting Enhancers from ATAC-seq Data (PEAS), a deep learning method that assigns enhancer chromatin states to peaks. We show that the precision-recall area under the curve for the assignment of enhancer states is comparable to PEAS.
Conclusions: Our work shows that the SOM-VN workflow can learn relationships between REs and chromatin accessibility signal shape, which is an important step toward the goal of assigning and comparing enhancer state across multiple experiments and phenotypic states.


2020
Author(s):
Yan Kai
Stephanos Tsoucas
Shengbao Suo
Guo-Cheng Yuan

Abstract: Genome-wide profiling of chromatin states has been widely used to characterize the biological function of non-coding genomic sequences in a cell-type-specific manner. However, the systematic, comprehensive annotation of chromatin states from experimental data is challenging and requires not just extensive biological knowledge but also sophisticated computational modeling. Previously we developed a hierarchical hidden Markov model, named diHMM, to systematically annotate chromatin states at multiple scales based on the combination of histone mark and chromatin regulator binding profiles. Here, we have improved the method by optimizing computational efficiency and using an ensemble-clustering approach to achieve a unified annotation by integrating information from cell-type-specific models. We then applied this improved method to generate a unified multi-scale chromatin state map in 127 human cell types, based on public data generated by the Epigenome Roadmap and ENCODE consortia. We found that cell types of similar origin are typically associated with similar chromatin states, but that cultured cell lines have structures distinct from those of primary cells. The contribution of enhancer elements to gene regulation is mediated by the broader context of domain-state organization. Distinct domain-state patterns are associated with various 3D chromatin structures. As such, we have demonstrated the utility of the multi-scale chromatin state map in characterizing the biological function of the human genome.


2016
Author(s):
Elizabeth Baskin
Rick Farouni
Ewy A. Mathe

Abstract
Summary: Regulatory elements regulate gene transcription, and their location and accessibility are cell-type specific, particularly for enhancers. Mapping and comparing chromatin accessibility between different cell types may identify mechanisms involved in cellular development and disease progression. To streamline and simplify differential analysis of regulatory elements genome-wide using chromatin accessibility data, such as DNase-seq and ATAC-seq, we developed ALTRE (ALTered Regulatory Elements), an R package and associated R Shiny web app. ALTRE makes such analysis accessible to a wide range of users, from novice to practiced computational biologists.
Availability: https://github.com/Mathelab/[email protected]


2021
Vol 22 (17)
pp. 9150
Author(s):
Xabier de Martin
Reza Sodaei
Gabriel Santpere

The transcriptome of every cell is orchestrated by the complex network of interactions between transcription factors (TFs) and their binding sites on DNA. Disruption of this network can result in many forms of organism malfunction but can also be the substrate of positive natural selection. However, understanding the specific determinants of each of these individual TF-DNA interactions is a challenging task, as it requires integrating the multiple possible mechanisms by which a given TF ends up interacting with a specific genomic region. These mechanisms include DNA motif preferences, which can be determined by nucleotide sequence but also by DNA’s shape; post-translational modifications of the TF, such as phosphorylation; and dimerization partners and co-factors, which can mediate multiple forms of direct or indirect cooperative binding. Binding can also be affected by epigenetic modifications of putative target regions, including DNA methylation and nucleosome occupancy. In this review, we describe how all these mechanisms play a role and interact in one specific family of TFs, the basic helix-loop-helix (bHLH) family, whose members share a highly conserved DNA-binding domain and a similar preferred DNA motif, the E-box. Here, we compile and discuss a rich catalog of strategies used by bHLH TFs to acquire TF-specific genome-wide landscapes of binding sites.


2014
Author(s):
Felix A. Klein
Tibor Pakozdi
Simon Anders
Yad Ghavi-Helm
Eileen E. M. Furlong
...  

Abstract
Motivation: Circularized Chromosome Conformation Capture (4C) is a powerful technique for studying the spatial interactions of a specific genomic region called the “viewpoint” with the rest of the genome, either in a single condition or when comparing different experimental conditions or cell types. Observed ligation frequencies show a strong, regular dependence on genomic distance from the viewpoint, on top of which specific interaction peaks are superimposed. Here, we address the computational task of finding these specific interactions and detecting changes between interaction profiles of different conditions.
Results: We model the overall trend of decreasing interaction frequency with genomic distance by fitting a smooth, monotonically decreasing function to suitably transformed count data. Based on the fit, z-scores are calculated from the residuals, with high z-scores being interpreted as peaks providing evidence for specific interactions. To compare different conditions, we normalize fragment counts between samples, and call differential contact frequencies using the statistical method DESeq2, adapted from RNA-Seq analysis.
Availability and Implementation: A full end-to-end analysis pipeline is implemented in the R package FourCSeq, available at www.bioconductor.org.
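The fit-then-score idea in the results can be illustrated with a minimal sketch. It is not the FourCSeq implementation: here the monotone trend is a plain least-squares non-increasing step fit (pool-adjacent-violators) rather than a smooth function, the input is assumed to be already transformed, and z-scores are residuals divided by their population standard deviation. All function names and the toy values are assumptions.

```python
from statistics import pstdev


def fit_monotone_decreasing(y):
    """Least-squares non-increasing fit via the pool-adjacent-violators algorithm.

    `y` holds transformed counts ordered by increasing distance from the viewpoint.
    """
    blocks = []  # each block is [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge while the previous block's mean is below the current one,
        # which would violate the non-increasing constraint.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def z_scores(y, fitted):
    """Scale residuals by their standard deviation (an assumed, simple choice)."""
    residuals = [obs - fit_val for obs, fit_val in zip(y, fitted)]
    sd = pstdev(residuals)
    if sd == 0:
        return [0.0] * len(residuals)
    return [r / sd for r in residuals]


# Toy transformed counts for seven fragments at increasing distance.
y = [10.0, 8.0, 9.0, 5.0, 4.0, 7.0, 2.0]
fit = fit_monotone_decreasing(y)
z = z_scores(y, fit)

# The fragment with the largest z-score is the strongest candidate peak.
peak_index = max(range(len(z)), key=lambda i: z[i])
print(fit)
print(peak_index)
```

In this toy profile the sixth fragment rises well above the decreasing trend and receives the highest z-score; a real analysis would also need the count transformation and a principled variance estimate, which the sketch leaves out.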

