Chromatin-informed inference of transcriptional programs in gynecologic and basal breast cancers

2018 ◽  
Author(s):  
Hatice U. Osmanbeyoglu ◽  
Fumiko Shimizu ◽  
Angela Rynne-Vidal ◽  
Petar Jelinic ◽  
Samuel C. Mok ◽  
...  

ABSTRACT Epigenomic data on transcription factor occupancy and chromatin accessibility can elucidate the developmental origin of cancer cells and reveal the enhancer landscape of key oncogenic transcriptional regulators. However, in many cancers, epigenomic analyses have been limited, and computational methods to infer regulatory networks in tumors typically use expression data alone, or rely on transcription factor (TF) motifs in annotated promoter regions. Here, we develop a novel machine learning strategy called PSIONIC (patient-specific inference of networks informed by chromatin) to combine cell line chromatin accessibility data with large tumor expression data sets and model the effect of enhancers on transcriptional programs in multiple cancers. We generated a new ATAC-seq data set profiling chromatin accessibility in gynecologic and basal breast cancer cell lines and applied PSIONIC to 723 RNA-seq experiments from ovarian, uterine, and basal breast tumors as well as 96 cell line RNA-seq profiles. Our computational framework enables us to share information across tumors to learn patient-specific inferred TF activities, revealing regulatory differences between and within tumor types. Many of the identified TF regulators were significantly associated with survival outcome in basal breast, uterine serous and endometrioid carcinomas. Moreover, PSIONIC-predicted activity for MTF1 in cell line models correlated with sensitivity to MTF1 inhibition. Therefore, computationally dissecting the role of TFs in gynecologic cancers may ultimately advance personalized therapy.

2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Hatice U. Osmanbeyoglu ◽  
Fumiko Shimizu ◽  
Angela Rynne-Vidal ◽  
Direna Alonso-Curbelo ◽  
Hsuan-An Chen ◽  
...  

Abstract Chromatin accessibility data can elucidate the developmental origin of cancer cells and reveal the enhancer landscape of key oncogenic transcriptional regulators. We develop a computational strategy called PSIONIC (patient-specific inference of networks informed by chromatin) to combine chromatin accessibility data with large tumor expression data and model the effect of enhancers on transcriptional programs in multiple cancers. We generate a new ATAC-seq data set profiling chromatin accessibility in gynecologic and basal breast cancer cell lines and apply PSIONIC to 723 patient and 96 cell line RNA-seq profiles from ovarian, uterine, and basal breast cancers. Our computational framework enables us to share information across tumors to learn patient-specific TF activities, revealing regulatory differences between and within tumor types. PSIONIC-predicted activity for MTF1 in cell line models correlates with sensitivity to MTF1 inhibition, showing the potential of our approach for personalized therapy. Many identified TFs are significantly associated with survival outcome. To validate PSIONIC-derived prognostic TFs, we perform immunohistochemical analyses in 31 uterine serous tumors for ETV6 and 45 basal breast tumors for MITF and confirm that the corresponding protein expression patterns are also significantly associated with prognosis.


2018 ◽  
Author(s):  
Ivan Berest ◽  
Christian Arnold ◽  
Armando Reyes-Palomares ◽  
Giovanni Palla ◽  
Kasper Dindler Rasmussen ◽  
...  

Transcription factor (TF) activity is an important read-out of cellular signalling pathways and is therefore useful for assessing regulatory differences across conditions. However, current technologies lack the ability to simultaneously assess activity changes for multiple TFs and, in particular, to determine whether a specific TF acts globally as a transcriptional repressor or activator. To this end, we introduce a widely applicable genome-wide method, diffTF, to assess differential TF activity and to classify TFs as activator or repressor (available at https://git.embl.de/grp-zaugg/diffTF). This is done by integrating any type of genome-wide chromatin accessibility data with RNA-Seq data and in-silico predicted TF binding sites. We corroborated the classification of TFs into repressors and activators by three independent analyses based on enrichments of active/repressive chromatin states, correlation of TF activity with gene expression, and activator- and repressor-specific chromatin footprints. To show the power of diffTF, we present two case studies. First, we applied diffTF to a large ATAC-Seq/RNA-Seq dataset comparing mutated and unmutated chronic lymphocytic leukemia samples, where we identified dozens of known (40%) and potentially novel (60%) TFs that are differentially active. We were also able to classify almost half of them as either repressor or activator. Second, we applied diffTF to a small ATAC-Seq/RNA-Seq data set comparing two cell types along the hematopoietic differentiation trajectory (multipotent progenitors, MPP, versus granulocyte-macrophage progenitors, GMP). Here we identified the known drivers of differentiation and found that the majority of the differentially active TFs are transcriptional activators. Overall, diffTF was able to recover the known TFs in both case studies, additionally identified TFs that have been less well characterized in the given condition, and provided a classification of the TFs into transcriptional activators and repressors.
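The activator/repressor idea can be pictured with a toy sketch. This is not diffTF's implementation; it only illustrates the intuition that a TF whose expression rises together with accessibility at its predicted binding sites behaves like an activator, while one whose expression moves in the opposite direction behaves like a repressor. The function names, the 0.5 threshold, and all values are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(vx * vy)


def classify_tf(tf_expression, site_accessibility, threshold=0.5):
    """Label a TF from the sign and strength of the correlation.

    threshold is an arbitrary illustrative cut-off, not a published value.
    """
    r = pearson(tf_expression, site_accessibility)
    if r >= threshold:
        return "activator"
    if r <= -threshold:
        return "repressor"
    return "undetermined"


# Made-up per-sample values: (TF expression, mean accessibility at its sites).
samples = {
    "TF_A": ([1.0, 2.0, 3.0, 4.0], [0.9, 2.1, 2.9, 4.2]),
    "TF_B": ([1.0, 2.0, 3.0, 4.0], [4.0, 3.1, 2.2, 0.8]),
    "TF_C": ([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 2.0, 1.0]),
}
calls = {name: classify_tf(expr, acc) for name, (expr, acc) in samples.items()}
print(calls)
```

A real analysis would also need a background distribution and multiple-testing control, which this sketch leaves out.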


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Frederique Murielle Ruf-Zamojski ◽  
Michel A Zamojski ◽  
German Nudelman ◽  
Yongchao Ge ◽  
Natalia Mendelev ◽  
...  

Abstract The pituitary gland is a critical regulator of the neuroendocrine system. To further our understanding of the classification, cellular heterogeneity, and regulatory landscape of pituitary cell types, we performed and computationally integrated single cell (SC)/single nucleus (SN) resolution experiments capturing RNA expression, chromatin accessibility, and DNA methylation state from mouse dissociated whole pituitaries. Both SC and SN transcriptome analysis and promoter accessibility identified the five classical hormone-producing cell types (somatotropes, gonadotropes (GT), lactotropes, thyrotropes, and corticotropes). GT cells distinctively expressed transcripts for Cga, Fshb, Lhb, Nr5a1, and Gnrhr in SC RNA-seq and SN RNA-seq. This was matched in SN ATAC-seq with GTs specifically showing open chromatin at the promoter regions for the same genes. Similarly, the other classically defined anterior pituitary cells displayed transcript expression and chromatin accessibility patterns characteristic of their own cell type. This integrated analysis identified additional cell types, such as a stem cell cluster expressing transcripts for Sox2, Sox9, Mia, and Rbpms, and a broadly accessible chromatin state. In addition, we performed bulk ATAC-seq in the LβT2b gonadotrope-like cell line. While the Fshb promoter region was closed in the cell line, we identified a region upstream of Fshb that became accessible by the synergistic actions of GnRH and activin A, and that corresponded to a conserved region identified by a polycystic ovary syndrome (PCOS) single nucleotide polymorphism (SNP). Although this locus appears closed in deep sequencing bulk ATAC-seq of dissociated mouse pituitary cells, SN ATAC-seq of the same preparation showed that this site was specifically open in mouse GT, but closed in 14 other pituitary cell type clusters.
This discrepancy highlighted the detection limit of a bulk ATAC-seq experiment in a subpopulation, as GT represented ~5% of this dissociated anterior pituitary sample. These results identified this locus as a candidate for explaining the dual dependence of Fshb expression on GnRH and activin/TGFβ signaling, and potential new evidence for upstream regulation of Fshb. The pituitary epigenetic landscape provides a resource for improved cell type identification and for the investigation of the regulatory mechanisms driving cell-to-cell heterogeneity. Additional authors not listed due to abstract submission restrictions: N. Seenarine, M. Amper, N. Jain (ISMMS).


2016 ◽  
Author(s):  
David Felix Lamparter ◽  
Daniel Marbach ◽  
Rico Rueedi ◽  
Sven Bergmann ◽  
Zoltan Kutalik

To better understand genome regulation, it is important to uncover the role of transcription factors in the process of chromatin structure establishment and maintenance. Here we present a data-driven approach to systematically characterize transcription factors that are relevant for this process. Our method uses a linear mixed modeling approach to combine data sets of transcription factor binding motif enrichments in open chromatin and gene expression across the same set of cell lines. Applying this approach to the ENCODE data set, we confirm known transcription factors and implicate numerous novel ones in the establishment or maintenance of open chromatin.


2021 ◽  
Vol 15 (Supplement_1) ◽  
pp. S062-S062
Author(s):  
A Lewis ◽  
B Pan-Castillo ◽  
G Berti ◽  
C Felice ◽  
H Gordon ◽  
...  

Abstract Background Histone-deacetylase (HDAC) enzymes are a broad class of ubiquitously expressed enzymes that modulate histone acetylation, chromatin accessibility and gene expression. In models of inflammatory bowel disease (IBD), HDAC inhibitors, such as valproic acid (VPA), are proven anti-inflammatory agents and evidence suggests that they also inhibit fibrosis in non-intestinal organs. However, the role of HDAC enzymes in stricturing Crohn’s disease (SCD) has not been characterised; this is key to understanding the molecular mechanism and developing novel therapies. Methods To evaluate HDAC expression in the intestine of SCD patients, we performed unbiased single-cell RNA sequencing (sc-RNA-seq) of over 10,000 cells isolated from full-thickness surgical resection specimens of non-SCD (NSCD; n=2) and SCD intestine (n=3). Approximately 1,000 fibroblasts were identified for further analysis, including a distinct cluster of myofibroblasts. Changes in gene expression were compared between myofibroblasts and other resident intestinal fibroblasts using the sc-RNA-seq analysis pipeline in Partek. Changes in HDAC expression and markers of HDAC activity (H3K27ac) were confirmed by immunohistochemistry in FFPE tissue from patient-matched NSCD and SCD intestine (n=14 pairs). The function of HDACs in intestinal fibroblasts in the CCD-18co cell line and primary CD myofibroblast cultures (n=16 cultures) was assessed using VPA, a class I HDAC inhibitor. Cells were analysed using a variety of molecular techniques including ATAC-seq, gene expression arrays, qPCR, western blot and immunofluorescent protein analysis. Results Class I HDAC (HDAC1, p= 2.11E-11; HDAC2, p= 4.28E-11; HDAC3, p= 1.60E-07; and HDAC8, p= 2.67E-03) expression was increased in myofibroblasts compared to other intestinal fibroblast subtypes. IHC also showed an increase in the percentage of stromal HDAC2-positive cells, coupled with a decrease in the percentage of H3K27ac-positive cells, in the mucosa overlying SCD intestine relative to matched NSCD areas. In the CCD-18co cell line and primary myofibroblast cultures, VPA reduced chromatin accessibility at Collagen-I gene promoters and suppressed their transcription. VPA also inhibited TGFB-induced up-regulation of Collagen-I, in part by inhibiting TGFB1I1/SMAD4 signalling. TGFB1I1 was identified as a mesenchymal-specific target of VPA, and siRNA knockdown of TGFB1I1 was sufficient to suppress TGFB-induced up-regulation of Collagen-I. Conclusion In SCD patients, class I HDAC expression is increased in myofibroblasts. Class I HDAC inhibitors impair TGFB signalling and inhibit Collagen-I expression. Selective targeting of TGFB1I1 offers the opportunity to increase treatment specificity by selectively targeting mesenchymal cells.


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 3879-3879
Author(s):  
Vivek Behera ◽  
Perry Evans ◽  
Carolyne J Face ◽  
Laavanya Sankaranarayanan ◽  
Gerd A. Blobel

Abstract Erythroid transcription factors (TFs) control gene expression programs, lineage decisions, and disease outcomes. How transcription factors contact DNA has been studied extensively in vitro, but in vivo binding characteristics are less well understood as they are influenced in a reciprocal manner by chromatin accessibility and neighboring transcription factors. Here, we present a comparative analysis approach that takes advantage of non-coding sequence variation between functionally equivalent erythroid cell lines to conduct an in-depth analysis of erythroid TF binding profiles and chromatin features. Specifically, we analyzed ChIP-seq datasets to identify millions of genetic non-coding variants between the mouse erythroleukemia cell line (MEL), a GATA1-inducible erythroid progenitor cell line (G1E-ER4), and primary murine erythroblast cells. We found that while these cell lines are highly positively correlated in chromatin features, larger differences in TF binding intensity are correlated with higher degrees of genetic variation between cell lines. We next examined discriminatory genetic variants between the cell lines that are located in ChIP-seq peaks of the erythroid transcription factor GATA1. Hundreds of such variants fall within GATA1 motifs. Differential GATA1 binding intensities associated with the variants revealed nucleotide positions that contribute most to in vivo GATA1 chromatin occupancy and identified which alternative nucleotides are most likely to disrupt binding. Notably, this additional information about GATA1's in vivo nucleotide binding preferences improved prediction of GATA1 binding sites genome-wide. We applied similar approaches to determine the bp-resolution in vivo binding preferences of TAL1/SCL and CTCF. We additionally identified thousands of discriminatory genetic variants within GATA1 sites that fall outside canonical GATA elements but within binding sites of other known TFs. 
Association of these variants with differential GATA1 binding intensities revealed that the hematopoietic transcription factors TAL1/SCL and KLF1 positively regulate GATA1 chromatin occupancy. Strikingly, we identified a number of motifs not previously implicated in cooperating with GATA1 that positively impact GATA1 chromatin binding. Notably, we also defined motifs associated with negative regulation of GATA1 chromatin occupancy. Applying a similar analysis to TAL1/SCL and CTCF revealed additional motifs involved in regulating the chromatin occupancy of these TFs. Finally, we associated discriminatory genetic variation between erythroid cell lines with large changes in sub-kb-scale DNase hypersensitivity. We found that single base pair substitutions within or near a number of erythroid TF motifs, including that for the RUNX family of nuclear factors, are strongly associated with changes in chromatin accessibility. Our findings use novel methods in comparative ChIP-seq and DNase-seq analysis to reveal new insights about the genetic basis for erythroid TF chromatin occupancy and chromatin accessibility. Disclosures No relevant conflicts of interest to declare.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Hamid R. Eghbalnia ◽  
William W. Wilfinger ◽  
Karol Mackey ◽  
Piotr Chomczynski

Abstract RNA-Seq expression analysis currently relies primarily upon exon expression data. The recognized role of introns during translation, and the presence of substantial RNA-Seq counts attributable to introns, provide the rationale for the simultaneous consideration of both exon and intron data. We describe here a method for the coordinated analysis of exon and intron data by investigating their relationship within individual genes and across samples, while taking into account changes in both variability and expression level. This coordinated analysis of exon and intron data offers strong evidence for significant differences that distinguish the profiles of the exon-only expression data from the combined exon and intron data. One advantage of our proposed method, called matched change characterization for exons and introns (MEI), is its straightforward applicability to existing archived data using small modifications to standard RNA-Seq pipelines. Using MEI, we demonstrate that when data are examined for changes in variability across control and case conditions, novel differential changes can be detected. Notably, when MEI criteria were employed in the analysis of an archived data set involving polyarthritic subjects, the number of differentially expressed genes was expanded by sevenfold. More importantly, the observed changes in exon and intron variability with statistically significant false discovery rates could be traced to specific immune pathway gene networks. The application of MEI analysis provides a strategy for incorporating the significance of exon and intron variability and further developing the role of using both exons and intron sequencing counts in studies of gene regulatory processes.
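One way to picture "changes in variability across control and case conditions" is to compare, for a single gene, the coefficient of variation of its exon counts and of its intron counts in each condition. The sketch below is a simplified illustration of that idea and not the MEI procedure itself; the function names and the counts are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(counts):
    """Sample standard deviation divided by the mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; CV is undefined")
    return stdev(counts) / m


def variability_shift(control, case):
    """Ratio of case CV to control CV; values far from 1 mean variability changed."""
    cv_control = coefficient_of_variation(control)
    if cv_control == 0:
        raise ValueError("control counts are constant; ratio is undefined")
    return coefficient_of_variation(case) / cv_control


# Made-up read counts for one gene across four control and four case samples.
gene = {
    "exon": {"control": [100, 110, 90, 105], "case": [100, 112, 88, 104]},
    "intron": {"control": [20, 22, 18, 21], "case": [10, 35, 5, 30]},
}
shifts = {
    feature: variability_shift(d["control"], d["case"])
    for feature, d in gene.items()
}
print(shifts)
```

In this invented example the mean intron count is nearly unchanged between conditions while its spread grows several-fold, so a test based on mean expression alone would miss the difference that a variability comparison picks up.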


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 2042-2042
Author(s):  
Ainhoa Hernandez Gonzalez ◽  
Anna Esteve-Codina ◽  
Cristina Carrato ◽  
Ana Munoz ◽  
Estela Pineda ◽  
...  

Background: Malignant gliomas are genetically heterogeneous diseases. The development of sequencing techniques, such as RNA sequencing (RNA-Seq), has identified many gene rearrangements encoding novel oncogenic fusions. Gene fusion discovery can potentially lead to the development of novel treatments; however, studies of gene fusions in glioma remain limited. Methods: The GLIOCAT project studied 139 samples from patients with newly diagnosed glioblastoma who had received the standard first-line treatment from 2004 to 2015, to identify gene fusion events from glioblastoma transcriptome data (RNA-Seq). The molecular subtype could be studied in 124 cases. RNA-Seq reads were mapped against the reference human genome with STAR-fusion version 0.7.0, specifically, with FusionInspector validate ( http://star-fusion.github.io ). Two other platforms, FusionHub ( https://fusionhub.persistent.co.in ) and Oncofuse ( www.unav.es/genetica/oncofuse.html ), were applied to eliminate false positives or fusions previously described in healthy tissue and to predict the oncogenic potential of each fusion. Results: A total of 61 patients showed 103 different fusions, a median of two fusions per sample. The majority of gene fusions were intrachromosomal, and the most frequently involved chromosome was 12, followed by 7. In addition, fusions were more common in patients with MGMT promoter methylation, TCGA classical subtype and 18 IGS subtype. There were no differences in age, sex, type of surgery or long survivors ( > 30 months). Ten fusions had already been described in cancer, including three in gliomas (FRS2-KIF5A, EGFR-SEPT14 and FGFR3-TACC3). Of the detected fusions, 22 included an oncogene or proto-oncogene. Conclusions: In our study, we report the landscape of gene fusions from a large data set of glioblastomas analyzed by RNA-seq. The majority of the fusions were private fusions. A minority of these recur at a low frequency, but as many as a quarter of them included an oncogene or proto-oncogene. RNA-seq of GBM patient samples is an important tool for the identification of patient-specific fusions that could drive personalized therapy. Furthermore, we plan to validate these gene fusions.
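The intrachromosomal/interchromosomal distinction used in the Results can be sketched in a few lines. This is only an illustration, not the GLIOCAT pipeline: a fusion is called intrachromosomal when both partner genes lie on the same chromosome. The three fusions are the glioma fusions named in the abstract; the chromosome assignments are the standard human locations of those genes, and the helper names are hypothetical.

```python
from collections import Counter

# Chromosome of each partner gene (standard human gene locations).
GENE_CHROMOSOME = {
    "FRS2": "12", "KIF5A": "12",
    "EGFR": "7", "SEPT14": "7",
    "FGFR3": "4", "TACC3": "4",
}


def partners(fusion):
    """Split a 'GENE1-GENE2' label into its two partner genes.

    Assumes gene symbols contain no hyphen, which holds for this example.
    """
    left, right = fusion.split("-")
    return left, right


def is_intrachromosomal(fusion, chrom_of=GENE_CHROMOSOME):
    """True when both partner genes map to the same chromosome."""
    left, right = partners(fusion)
    return chrom_of[left] == chrom_of[right]


fusions = ["FRS2-KIF5A", "EGFR-SEPT14", "FGFR3-TACC3"]
intra = [f for f in fusions if is_intrachromosomal(f)]

# How often each chromosome is involved, counting both partners.
chrom_counts = Counter(
    GENE_CHROMOSOME[gene] for f in fusions for gene in partners(f)
)
print(intra)
print(chrom_counts.most_common())
```

All three named fusions join genes on the same chromosome, in line with the abstract's observation that most detected fusions were intrachromosomal.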


2013 ◽  
Vol 42 (5) ◽  
pp. 2820-2832 ◽  
Author(s):  
Nicolas Philippe ◽  
Elias Bou Samra ◽  
Anthony Boureux ◽  
Alban Mancheron ◽  
Florence Rufflé ◽  
...  

Abstract Recent sequencing technologies that allow massive parallel production of short reads are the method of choice for transcriptome analysis. Particularly, digital gene expression (DGE) technologies produce a large dynamic range of expression data by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads. Here, we applied an integrated bioinformatics approach that combines DGE tags, RNA-Seq, tiling array expression data and species-comparison to explore new transcriptional regions and their specific biological features, particularly tissue expression or conservation. We analysed tags from a large DGE data set (designated as ‘TranscriRef’). We then annotated 750 000 tags that were uniquely mapped to the human genome according to Ensembl. We retained transcripts originating from both DNA strands and categorized tags corresponding to protein-coding genes, antisense, intronic- or intergenic-transcribed regions and computed their overlap with annotated non-coding transcripts. Using this bioinformatics approach, we identified ∼34 000 novel transcribed regions located outside the boundaries of known protein-coding genes. As demonstrated using sequencing data from human pluripotent stem cells for biological validation, the method could be easily applied for the selection of tissue-specific candidate transcripts. DigitagCT is available at http://cractools.gforge.inria.fr/softwares/digitagct.

