Chromatin interaction analysis with updated ChIA-PET Tool (V3)

Mapping Intimacies ◽

10.1101/627257 ◽

2019 ◽

Author(s):

Guoliang Li ◽

Tongkai Sun ◽

Huidan Chang ◽

Liuyang Cai ◽

Ping Hong ◽

...

Keyword(s):

Data Analysis ◽

Target Genes ◽

Sequence Data ◽

Interaction Analysis ◽

Regulatory Elements ◽

Chromatin Interaction ◽

Data Set ◽

Chromatin Interactions ◽

A Genome ◽

Public Data

Abstract: Understanding chromatin interactions is important since they create chromosome conformation and link the cis- and trans-regulatory elements to their target genes for transcriptional regulation. Chromatin Interaction Analysis with Paired-End Tag (ChIA-PET) sequencing is a genome-wide high-throughput technology that detects chromatin interactions associated with a specific protein of interest. We previously developed ChIA-PET Tool in 2010 for ChIA-PET data analysis. Here we present the updated version, ChIA-PET Tool V3, a computational package to process the next-generation sequence data generated from ChIA-PET experiments. It processes short-read and long-read ChIA-PET data with multithreading and reports result statistics in an HTML file. In this paper, we provide a detailed demonstration of the design of ChIA-PET Tool V3 and of how to install it and use it to analyze a specific ChIA-PET data set. Other ChIA-PET data analysis tools have since been developed, including ChiaSig, MICC, Mango, and ChIA-PET2. We compared our tool with these tools using the same public data set on the same machine. Most of the peaks detected by ChIA-PET Tool V3 overlap with those from the other tools. Significant chromatin interactions from ChIA-PET Tool V3 show higher enrichment in the APA plot. ChIA-PET Tool V3 is open source and is available on GitHub (https://github.com/GuoliangLi-HZAU/ChIA-PET_Tool_V3/).


Chromatin Interaction Analysis with Updated ChIA-PET Tool (V3)

Genes ◽

10.3390/genes10070554 ◽

2019 ◽

Vol 10 (7) ◽

pp. 554 ◽

Cited By ~ 4

Author(s):

Li ◽

Sun ◽

Chang ◽

Cai ◽

Hong ◽

...

Keyword(s):

Rna Polymerase Ii ◽

Target Genes ◽

Sequence Data ◽

Interaction Analysis ◽

Regulatory Elements ◽

Chromatin Interaction ◽

Data Set ◽

Chromatin Interactions ◽

A Genome ◽

Public Data

Understanding chromatin interactions is important because they create chromosome conformation and link the cis- and trans-regulatory elements to their target genes for transcriptional regulation. Chromatin Interaction Analysis with Paired-End Tag (ChIA-PET) sequencing is a genome-wide high-throughput technology that detects chromatin interactions associated with a specific protein of interest. We developed ChIA-PET Tool for ChIA-PET data analysis in 2010. Here, we present the updated version of ChIA-PET Tool (V3) as a computational package to process the next-generation sequence data generated from ChIA-PET experiments. It processes short-read and long-read ChIA-PET data with multithreading and generates statistics of results in an HTML file. In this paper, we provide a detailed demonstration of the design of ChIA-PET Tool V3 and of how to install it and use it to analyze RNA polymerase II (RNAPII) ChIA-PET data from human K562 cells. We compared our tool with existing tools, including ChiaSig, MICC, Mango, and ChIA-PET2, using the same public data set on the same computer. Most peaks detected by ChIA-PET Tool V3 overlap with those of the other tools. Significant chromatin interactions from ChIA-PET Tool V3 show higher enrichment in aggregate peak analysis (APA) plots. ChIA-PET Tool V3 is publicly available on GitHub.
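The peak-overlap comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the comparison procedure used in the paper, which the abstract does not describe. The function names, the half-open `(chrom, start, end)` interval convention, and the toy coordinates are all assumptions made for illustration.

```python
# Illustrative sketch only: fraction of one tool's peaks that overlap
# any peak from another tool. Intervals are assumed to be half-open
# (chrom, start, end) tuples; this convention is an assumption.

def overlaps(a, b):
    """Return True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_overlapping(peaks, reference):
    """Fraction of `peaks` that overlap at least one interval in `reference`."""
    if not peaks:
        return 0.0
    # Group reference peaks by chromosome to avoid cross-chromosome checks.
    by_chrom = {}
    for ref in reference:
        by_chrom.setdefault(ref[0], []).append(ref)
    shared = sum(
        any(overlaps(p, ref) for ref in by_chrom.get(p[0], []))
        for p in peaks
    )
    return shared / len(peaks)

# Hypothetical toy data, not taken from the paper.
tool_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
tool_b = [("chr1", 150, 250), ("chr2", 80, 120)]
frac = fraction_overlapping(tool_a, tool_b)
```

With half-open coordinates, the `chr2` peaks above touch at position 80 but do not overlap, so only one of the three `tool_a` peaks counts as shared.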


The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions

10.1101/112268 ◽

2017 ◽

Cited By ~ 34

Author(s):

Yanli Wang ◽

Bo Zhang ◽

Lijun Zhang ◽

Lin An ◽

Jie Xu ◽

...

Keyword(s):

Genome Organization ◽

Target Genes ◽

Genome Structure ◽

Regulatory Elements ◽

Genomic Region ◽

Genome Browser ◽

Chromatin Interaction ◽

Interaction Data ◽

3D Genome ◽

Chromatin Interactions

ABSTRACT: The recent advent of 3C-based technologies such as Hi-C and ChIA-PET provides an opportunity to explore chromatin interactions and 3D genome organization at an unprecedented scale and resolution. However, it remains a challenge to visualize chromatin interaction data due to its size and complexity. Here, we introduce the 3D Genome Browser (http://3dgenome.org), which allows users to conveniently explore both publicly available and their own chromatin interaction data. Users can also seamlessly integrate other “omics” data sets, such as ChIP-Seq and RNA-Seq for the same genomic region, to gain a complete view of both the regulatory landscape and the 3D genome structure for any given gene. Finally, our browser provides multiple methods to link distal cis-regulatory elements with their potential target genes, including virtual 4C, ChIA-PET, Capture Hi-C and cross-cell-type correlation of proximal and distal DNA hypersensitive sites, and therefore represents a valuable resource for the study of gene regulation in mammalian genomes.


SHI7 Is a Self-Learning Pipeline for Multipurpose Short-Read DNA Quality Control

mSystems ◽

10.1128/msystems.00202-17 ◽

2018 ◽

Vol 3 (3) ◽

Cited By ~ 15

Author(s):

Gabriel A. Al-Ghalith ◽

Benjamin Hillmann ◽

Kaiwei Ang ◽

Robin Shields-Cutler ◽

Dan Knights

Keyword(s):

Quality Control ◽

Dna Sequences ◽

Sequence Data ◽

Background Knowledge ◽

Sequencing Technology ◽

Data Set ◽

Short Read ◽

Dna Quality ◽

Public Data ◽

User Friendly

ABSTRACT: Next-generation sequencing technology is of great importance for many biological disciplines; however, due to technical and biological limitations, the short DNA sequences produced by modern sequencers require numerous quality control (QC) measures to reduce errors, remove technical contaminants, or merge paired-end reads together into longer or higher-quality contigs. Many tools for each step exist, but choosing the appropriate methods and usage parameters can be challenging because the parameterization of each step depends on the particularities of the sequencing technology used, the type of samples being analyzed, and the stochasticity of the instrumentation and sample preparation. Furthermore, end users may not know all of the relevant information about how their data were generated, such as the expected overlap for paired-end sequences or type of adaptors used to make informed choices. This increasing complexity and nuance demand a pipeline that combines existing steps together in a user-friendly way and, when possible, learns reasonable quality parameters from the data automatically. We propose a user-friendly quality control pipeline called SHI7 (canonically pronounced “shizen”), which aims to simplify quality control of short-read data for the end user by predicting presence and/or type of common sequencing adaptors, what quality scores to trim, whether the data set is shotgun or amplicon sequencing, whether reads are paired end or single end, and whether pairs are stitchable, including the expected amount of pair overlap. We hope that SHI7 will make it easier for all researchers, expert and novice alike, to follow reasonable practices for short-read data quality control.

IMPORTANCE: Quality control of high-throughput DNA sequencing data is an important but sometimes laborious task requiring background knowledge of the sequencing protocol used (such as adaptor type, sequencing technology, insert size/stitchability, paired-endedness, etc.). Quality control protocols typically require applying this background knowledge to selecting and executing numerous quality control steps with the appropriate parameters, which is especially difficult when working with public data or data from collaborators who use different protocols. We have created a streamlined quality control pipeline intended to substantially simplify the process of DNA quality control from raw machine output files to actionable sequence data. In contrast to other methods, our proposed pipeline is easy to install and use and attempts to learn the necessary parameters from the data automatically with a single command.


Genome-Wide, Integrative Analysis Implicates Exosome-Derived MicroRNA Dysregulation in Schizophrenia

Schizophrenia Bulletin ◽

10.1093/schbul/sby191 ◽

2019 ◽

Vol 45 (6) ◽

pp. 1257-1266 ◽

Cited By ~ 12

Author(s):

Yang Du ◽

Yun Yu ◽

Yang Hu ◽

Xiao-Wan Li ◽

Ze-Xu Wei ◽

...

Keyword(s):

Target Genes ◽

Sequence Data ◽

Protein Glycosylation ◽

Mirna Sequence ◽

First Episode ◽

Data Set ◽

Protein Levels ◽

Genome Wide ◽

Exosomal Mirnas ◽

Mirna Expression Profiling

Abstract: Genetic variants conferring risk for schizophrenia (SCZ) have been extensively studied, but the role of posttranscriptional mechanisms in SCZ is not well studied. Here we performed the first genome-wide microRNA (miRNA) expression profiling in serum-derived exosomes from 49 first-episode, drug-free SCZ patients and 46 controls and identified miRNAs and co-regulated modules that were perturbed in SCZ. Putative targets of these SCZ-affected miRNAs were enriched strongly for genes that have been implicated in protein glycosylation and were also related to neurotransmitter receptor and dendrite (spine) development. We validated several differentially expressed blood exosomal miRNAs in 100 SCZ patients as compared with 100 controls by quantitative reverse transcription-polymerase chain reaction. The potential regulatory relationships between several SCZ-affected miRNAs and their putative target genes were also validated. These include hsa-miR-206, which is the most upregulated miRNA in the blood exosomes of SCZ patients and was previously reported to regulate expression of brain-derived neurotrophic factor, whose mRNA and protein levels we showed to be reduced in the blood of SCZ patients. In addition, we found 11 miRNAs in blood exosomes from the miRNA sequence data that can be used to classify samples from SCZ patients and control subjects with close to 90% accuracy in the training samples and approximately 75% accuracy in the testing samples. Our findings support a role for exosomal miRNA dysregulation in SCZ pathophysiology and provide a rich data set and framework for future analyses of miRNAs in the disease. Our data also suggest that blood exosomal miRNAs are promising biomarkers for SCZ.


Noncoding de novo mutations contribute to autism spectrum disorder via chromatin interactions

10.1101/2019.12.15.877324 ◽

2019 ◽

Author(s):

Il Bin Kim ◽

Taeyeop Lee ◽

Junehawk Lee ◽

Jonghun Kim ◽

Hyunseong Lee ◽

...

Keyword(s):

Gene Expression ◽

Autism Spectrum Disorder ◽

Stem Cells ◽

Target Genes ◽

De Novo ◽

Regulatory Elements ◽

Autism Spectrum ◽

Spectrum Disorder ◽

De Novo Mutations ◽

Chromatin Interactions

Three-dimensional chromatin structures regulate gene expression across the genome. The significance of de novo mutations (DNMs) affecting chromatin interactions in autism spectrum disorder (ASD) remains poorly understood. We generated 931 whole-genome sequences for Korean simplex families to detect DNMs and identified target genes dysregulated by noncoding DNMs via long-range chromatin interactions between regulatory elements. Notably, noncoding DNMs that affect chromatin interactions exhibited transcriptional dysregulation implicated in ASD risks. Correspondingly, target genes were significantly involved in histone modification, prenatal brain development, and pregnancy. Both noncoding and coding DNMs collectively contributed to low IQ in ASD. Indeed, noncoding DNMs resulted in alterations, via chromatin interactions, in target gene expression in primitive neural stem cells derived from human induced pluripotent stem cells from an ASD subject. The emerging neurodevelopmental genes, not previously implicated in ASD, include CTNNA2, GRB10, IKZF1, PDE3B, and BACE1. Our results were reproducible in 517 probands from the MSSNG cohort. This work demonstrates that noncoding DNMs contribute to ASD via chromatin interactions.


Candidate cancer driver mutations in superenhancers and long-range chromatin interaction networks

10.1101/236802 ◽

2017 ◽

Cited By ~ 4

Author(s):

Lina Wadi ◽

Liis Uusküla-Reimand ◽

Keren Isaev ◽

Shimin Shuai ◽

Vincent Huang ◽

...

Keyword(s):

Gene Expression ◽

Long Range ◽

Target Genes ◽

Tumor Biology ◽

Regulatory Elements ◽

Chromatin Interaction ◽

Driver Mutations ◽

Regulatory Regions ◽

Protein Coding ◽

Primary Tumors

Abstract: A comprehensive catalogue of the mutations that drive tumorigenesis and progression is essential to understanding tumor biology and developing therapies. Protein-coding driver mutations have been well-characterized by large exome-sequencing studies; however, many tumors have no mutations in protein-coding driver genes. Non-coding mutations are thought to explain many of these cases; however, few non-coding drivers besides the TERT promoter are known. To fill this gap, we analyzed 150,000 cis-regulatory regions in 1,844 whole cancer genomes from the ICGC-TCGA PCAWG project. Using our new method, ActiveDriverWGS, we found 41 frequently mutated regulatory elements (FMREs) enriched in non-coding SNVs and indels (FDR<0.05) characterized by aging-associated mutation signatures and frequent structural variants. Most FMREs are distal from genes, reported here for the first time and also recovered by additional driver discovery methods. FMREs were enriched in super-enhancers, H3K27ac enhancer marks of primary tumors and long-range chromatin interactions, suggesting that the mutations drive cancer by distally controlling gene expression through three-dimensional genome organization. In support of this hypothesis, the chromatin interaction network of FMREs and target genes revealed associations of mutations and differential gene expression of known and novel cancer genes (e.g., CNNB1IP1, RCC1), activation of immune response pathways and altered enhancer marks. Thus distal genomic regions may include additional, infrequently mutated drivers that act on target genes via chromatin loops. Our study is an important step towards finding such regulatory regions and deciphering the somatic mutation landscape of the non-coding genome.


Genome-Wide cis-Regulatory Element Based Discovery of Auxin-Responsive Genes in Higher Plant

Genes ◽

10.3390/genes13010024 ◽

2021 ◽

Vol 13 (1) ◽

pp. 24

Author(s):

Jianfei Wu ◽

Fan Gao ◽

Tongtong Li ◽

Haixia Guo ◽

Li Zhang ◽

...

Keyword(s):

Target Genes ◽

Regulatory Element ◽

Regulatory Elements ◽

Auxin Response ◽

Higher Plant ◽

Response Factors ◽

Auxin Response Factors ◽

Genome Wide ◽

A Genome ◽

Almost All

Auxin has a profound impact on plant physiology and participates in almost all aspects of plant development. Auxin exerts profound pleiotropic effects on plant growth and differentiation by regulating the expression of auxin response genes. The classical auxin reaction is usually mediated by auxin response factors (ARFs), which bind to the auxin response element (AuxRE) in the promoter region of the target gene. Experiments have generated only a limited number of plant genes with well-characterized functions. It is still unknown how many genes respond to exogenous auxin treatment. An economical and effective method was proposed for the genome-wide discovery of genes responsive to auxin in a model plant, Arabidopsis thaliana (A. thaliana). Our method relies on cis-regulatory-element-based targeted gene finding across different promoters in a genome. We first exploit and analyze auxin-specific cis-regulatory elements for the transcription of the target genes, and then identify putative auxin-responsive genes whose promoters contain the elements in the collection of over 25,800 promoters in the A. thaliana genome. Evaluating our result by comparing with a published database and the literature, we found that this method has an accuracy rate of 65.2% (309/474) for predicting candidate genes responsive to auxin. Chromosome distribution and annotation of the putative auxin-responsive genes predicted here were also mined. The results can markedly decrease the number of identified but merely potential auxin target genes and also provide useful clues for improving the annotation of genes that lack functional information.
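The idea of cis-regulatory-element-based gene finding can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not list the elements used, so the sketch assumes the widely cited AuxRE core `TGTCTC` as the only motif. The function names and the toy promoter sequences are hypothetical. The last line simply re-derives the reported accuracy from the counts given in the abstract.

```python
import re

# Assumption: TGTCTC (the commonly cited AuxRE core) stands in for the
# paper's element collection, which the abstract does not specify.
AUXRE = "TGTCTC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_motif(promoter: str, motif: str = AUXRE) -> int:
    """Count motif occurrences on both strands, including overlapping hits."""
    seq = promoter.upper()
    # A set avoids double counting when the motif is its own reverse complement.
    targets = {motif, reverse_complement(motif)}
    return sum(len(re.findall(f"(?={t})", seq)) for t in targets)

def candidate_genes(promoters: dict[str, str], min_hits: int = 1) -> list[str]:
    """Return sorted gene IDs whose promoter has at least `min_hits` motif hits."""
    return sorted(g for g, s in promoters.items() if count_motif(s) >= min_hits)

# Hypothetical toy promoters, not real A. thaliana sequences.
promoters = {
    "geneA": "AATGTCTCGG",  # forward-strand hit (TGTCTC)
    "geneB": "CCGAGACATT",  # reverse-strand hit (GAGACA)
    "geneC": "ACGTACGTAC",  # no hit
}
hits = candidate_genes(promoters)

# Accuracy as reported in the abstract: 309 confirmed out of 474 predicted.
accuracy = round(100 * 309 / 474, 1)
```

Scanning both strands matters because a regulatory element can sit in either orientation relative to the gene. A real analysis would likely also weight motif position and copy number, which this sketch ignores.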


ChIAMM: A Mixture Model for Statistical Analysis of Long-Range Chromatin Interactions From ChIA-PET Experiments

Frontiers in Genetics ◽

10.3389/fgene.2020.616160 ◽

2020 ◽

Vol 11 ◽

Author(s):

Yibeltal Arega ◽

Hao Jiang ◽

Shuangqi Wang ◽

Jingwen Zhang ◽

Xiaohui Niu ◽

...

Keyword(s):

Mixture Model ◽

Interaction Analysis ◽

Gc Content ◽

Specific Protein ◽

Chromatin Interaction ◽

New Approach ◽

Chromatin Interactions ◽

Genome Wide ◽

Systematic Biases ◽

Local Enrichment

Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is an important experimental method for detecting specific protein-mediated chromatin loops genome-wide at high resolution. Here, we proposed a new statistical approach with a mixture model, chromatin interaction analysis using mixture model (ChIAMM), to detect significant chromatin interactions from ChIA-PET data. The statistical model is cast into a Bayesian framework to consider more systematic biases: the genomic distance, local enrichment, mappability, and GC content. Using different ChIA-PET datasets, we evaluated the performance of ChIAMM and compared it with the existing methods, including ChIA-PET Tool, ChiaSig, Mango, ChIA-PET2, and ChIAPoP. The result showed that the new approach performed better than most top existing methods in detecting significant chromatin interactions in ChIA-PET experiments.


The transcriptional program controlled by the stem cell leukemia gene Scl/Tal1 during early embryonic hematopoietic development

Blood ◽

10.1182/blood-2009-01-200048 ◽

2009 ◽

Vol 113 (22) ◽

pp. 5456-5465 ◽

Cited By ~ 81

Author(s):

Nicola K. Wilson ◽

Diego Miranda-Saavedra ◽

Sarah Kinston ◽

Nicolas Bonadies ◽

Samuel D. Foster ◽

...

Keyword(s):

Stem Cell ◽

Regulatory Networks ◽

Transcriptional Control ◽

Target Genes ◽

Fetal Liver ◽

Bioinformatic Analysis ◽

Regulatory Elements ◽

Hematopoietic Stem ◽

A Genome

The basic helix-loop-helix transcription factor Scl/Tal1 controls the development and subsequent differentiation of hematopoietic stem cells (HSCs). However, because few Scl target genes have been validated to date, the underlying mechanisms have remained largely unknown. In this study, we have used ChIP-Seq technology (coupling chromatin immunoprecipitation with deep sequencing) to generate a genome-wide catalog of Scl-binding events in a stem/progenitor cell line, followed by validation using primary fetal liver cells and comprehensive transgenic mouse assays. Transgenic analysis provided in vivo validation of multiple new direct Scl target genes and allowed us to reconstruct an in vivo validated network consisting of 17 factors and their respective regulatory elements. By coupling ChIP-Seq in model cell lines with in vivo transgenic validation and sophisticated bioinformatic analysis, we have identified a widely applicable strategy for the reconstruction of stem cell regulatory networks in which biologic material is otherwise limiting. Moreover, in addition to revealing multiple previously unrecognized links to known HSC regulators, as well as novel links to genes not previously implicated in HSC function, comprehensive transgenic analysis of regulatory elements provided substantial new insights into the transcriptional control of several important hematopoietic regulators, including Cbfa2t3h/Eto2, Cebpe, Nfe2, Zfpm1/Fog1, Erg, Mafk, Gfi1b, and Myb.


Discovery of directional chromatin-associated regulatory motifs affecting human gene transcription

10.1101/290825 ◽

2018 ◽

Author(s):

Naoki Osato

Keyword(s):

Transcriptional Regulation ◽

Dna Binding ◽

Target Genes ◽

Enrichment Analysis ◽

Expression Level ◽

Chromatin Interaction ◽

Binding Motifs ◽

Chromatin Interactions ◽

Dna Binding Motifs ◽

Associated Functions

Abstract. Background: Chromatin interactions are essential in enhancer-promoter interactions (EPIs) and transcriptional regulation. CTCF and cohesin proteins located at chromatin interaction anchors and other DNA-binding proteins such as YY1, ZNF143, and SMARCA4 are involved in chromatin interactions. However, there is still no good overall understanding of proteins associated with chromatin interactions and insulator functions. Results: Here, I describe a systematic and comprehensive approach for discovering DNA-binding motifs of transcription factors (TFs) that affect EPIs and gene expression. This analysis identified 96 biased orientations [64 forward-reverse (FR) and 52 reverse-forward (RF)] of motifs that significantly affected the expression level of putative transcriptional target genes in monocytes, T cells, HMEC, and NPC and included CTCF, cohesin (RAD21 and SMC3), YY1, and ZNF143; some TFs have more than one motif in databases; thus, the total number is smaller than the sum of FRs and RFs. KLF4, ERG, RFX, RFX2, HIF1, SP1, STAT3, and AP1 were associated with chromatin interactions. Many other TFs were also known to have chromatin-associated functions. The predicted biased orientations of motifs were compared with chromatin interaction data. Correlations in expression level of nearby genes separated by the motif sites were then examined among 53 tissues. Conclusion: One hundred FR and RF orientations associated with chromatin interactions and functions were discovered. Most TFs showed weak directional biases at chromatin interaction anchors and were difficult to identify using enrichment analysis of motifs. These findings contribute to the understanding of chromatin-associated motifs involved in transcriptional regulation, chromatin interactions/regulation, and histone modifications.
