Integration of phased Hi-C and molecular phenotype data to study genetic and epigenetic effects on chromatin looping

2018 ◽  
Author(s):  
William W. Greenwald ◽  
He Li ◽  
Paola Benaglio ◽  
David Jakubosky ◽  
Hiroko Matsui ◽  
...  

Summary: While genetic variation at chromatin loops is relevant for human disease, the relationships between loop strength, genetics, gene expression, and epigenetics are unclear. Here, we quantitatively interrogate this relationship using Hi-C and molecular phenotype data across cell types and haplotypes. We find that chromatin loops consistently form across multiple cell types and quantitatively vary in strength, instead of exclusively forming within only one cell type. We show that large haplotype loop imbalance is primarily associated with imprinting and copy number variation, rather than genetically driven traits such as allele-specific expression. Finally, across cell types and haplotypes, we show that subtle changes in chromatin loop strength are associated with large differences in other molecular phenotypes, with a 2-fold change in looping corresponding to a 100-fold change in gene expression. Our study suggests that regulatory genetic variation could mediate its effects on gene expression through subtle modification of chromatin loop strength.
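As a rough worked illustration of the fold-change figures quoted above: the abstract does not state a functional form, so the power-law relationship between loop strength $L$ and expression $E$ below is purely an assumption made to show the arithmetic, and the exponent $k$ is a hypothetical symbol introduced here.

```latex
\[
E \propto L^{k}
\quad\Longrightarrow\quad
\frac{E_2}{E_1} = \left(\frac{L_2}{L_1}\right)^{k},
\qquad
100 = 2^{k}
\;\Longrightarrow\;
k = \log_{2} 100 \approx 6.64
\]
```

Under that assumption, each doubling of loop strength would correspond to roughly 6.6 doublings of expression, which is one way to read "subtle changes in looping, large changes in expression".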

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Stéphane Deschamps ◽  
John A. Crow ◽  
Nadia Chaidir ◽  
Brooke Peterson-Burch ◽  
Sunil Kumar ◽  
...  

Abstract. Background: Three-dimensional chromatin loop structures connect regulatory elements to their target genes in regions known as anchors. In complex plant genomes, such as maize, it has been proposed that loops span heterochromatic regions marked by higher repeat content, but little is known about their spatial organization and genome-wide occurrence in relation to transcriptional activity. Results: Here, ultra-deep Hi-C sequencing of maize B73 leaf tissue was combined with gene expression and open chromatin sequencing for chromatin loop discovery and correlation with hierarchical topologically-associating domains (TADs) and transcriptional activity. A majority of all anchors are shared between multiple loops from previous public maize high-resolution interactome datasets, suggesting a highly dynamic environment, with a conserved set of anchors involved in multiple interaction networks. Chromatin loop interiors are marked by higher repeat contents than the anchors flanking them. A small fraction of high-resolution interaction anchors, fully embedded in larger chromatin loops, co-locate with active genes and putative protein-binding sites. Combinatorial analyses indicate that all anchors studied here co-locate with at least 81.5% of expressed genes and 74% of open chromatin regions. Approximately 38% of all Hi-C chromatin loops are fully embedded within hierarchical TAD-like domains, while the remaining ones share anchors with domain boundaries or with distinct domains. Those various loop types exhibit specific patterns of overlap for open chromatin regions and expressed genes, but no apparent pattern of gene expression. In addition, up to 63% of all unique variants derived from a prior public maize eQTL dataset overlap with Hi-C loop anchors. Anchor annotation suggests that < 7% of all loops detected here are potentially devoid of any genes or regulatory elements. The overall organization of chromatin loop anchors in the maize genome suggests a loop modeling system hypothesized to resemble phase separation of repeat-rich regions. Conclusions: Sets of conserved chromatin loop anchors mapping to hierarchical domains contain core structural components of the gene expression machinery in maize. The data presented here will be a useful reference to further investigate their function in regard to the formation of transcriptional complexes and the regulation of transcriptional activity in the maize genome.


2017 ◽  
Author(s):  
Hilary K. Finucane ◽  
Yakir A. Reshef ◽  
Verneri Anttila ◽  
Kamil Slowikowski ◽  
Alexander Gusev ◽  
...  

ABSTRACT: Genetics can provide a systematic approach to discovering the tissues and cell types relevant for a complex disease or trait. Identifying these tissues and cell types is critical for following up on non-coding allelic function, developing ex-vivo models, and identifying therapeutic targets. Here, we analyze gene expression data from several sources, including the GTEx and PsychENCODE consortia, together with genome-wide association study (GWAS) summary statistics for 48 diseases and traits with an average sample size of 169,331, to identify disease-relevant tissues and cell types. We develop and apply an approach that uses stratified LD score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We detect tissue-specific enrichments at FDR < 5% for 34 diseases and traits across a broad range of tissues that recapitulate known biology. In our analysis of traits with observed central nervous system enrichment, we detect an enrichment of neurons over other brain cell types for several brain-related traits, enrichment of inhibitory over excitatory neurons for bipolar disorder but excitatory over inhibitory neurons for schizophrenia and body mass index, and enrichments in the cortex for schizophrenia and in the striatum for migraine. In our analysis of traits with observed immunological enrichment, we identify enrichments of T cells for asthma and eczema, B cells for primary biliary cirrhosis, and myeloid cells for Alzheimer's disease, which we validated with independent chromatin data. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signal.


2020 ◽  
Author(s):  
Devanshi Patel ◽  
Xiaoling Zhang ◽  
John J. Farrell ◽  
Jaeyoon Chung ◽  
Thor D. Stein ◽  
...  

ABSTRACT: Because regulation of gene expression is heritable and context-dependent, we investigated AD-related gene expression patterns in cell types in blood and brain. Cis-expression quantitative trait locus (eQTL) mapping was performed genome-wide in blood from 5,257 Framingham Heart Study (FHS) participants and in brain donated by 475 Religious Orders Study/Memory & Aging Project (ROSMAP) participants. The association of gene expression with genotypes for all cis SNPs within 1Mb of genes was evaluated using linear regression models for unrelated subjects and linear mixed models for related subjects. Cell type-specific eQTL (ct-eQTL) models included an interaction term for expression of “proxy” genes that discriminate particular cell types. Ct-eQTL analysis identified 11,649 and 2,533 additional significant gene-SNP eQTL pairs in brain and blood, respectively, that were not detected in generic eQTL analysis. Of note, 386 unique target eGenes of significant eQTLs shared between blood and brain were enriched in apoptosis and Wnt signaling pathways. Five of these shared genes are established AD loci. The potential importance and relevance to AD of significant results in myeloid cell types is supported by the observation that a large portion of genome-wide significant (GWS) ct-eQTLs map within 1Mb of established AD loci and 58% (23/40) of the most significant eGenes in these eQTLs have previously been implicated in AD. This study identified cell-type specific expression patterns for established and potentially novel AD genes, found additional evidence for the role of myeloid cells in AD risk, and discovered potential novel blood and brain AD biomarkers that highlight the importance of cell-type specific analysis.
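The interaction-term idea described above can be sketched on simulated data. This is a minimal illustration, not the authors' pipeline: the function name `fit_ct_eqtl`, the simulated effect sizes, and the use of plain ordinary least squares (rather than the mixed models the study used for related subjects) are all assumptions made for the example.

```python
import numpy as np

def fit_ct_eqtl(expression, genotype, proxy):
    """Ordinary least squares fit of
    expression ~ intercept + genotype + proxy + genotype * proxy.
    The genotype * proxy coefficient plays the role of the cell-type interaction term."""
    design = np.column_stack([
        np.ones_like(genotype, dtype=float),  # intercept
        genotype,                             # SNP dosage (0, 1, 2)
        proxy,                                # expression of a hypothetical cell-type proxy gene
        genotype * proxy,                     # interaction term
    ])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return dict(zip(["intercept", "genotype", "proxy", "interaction"], coef))

# Simulated data; every effect size below is made up for illustration.
rng = np.random.default_rng(0)
n = 2000
genotype = rng.binomial(2, 0.3, size=n).astype(float)
proxy = rng.normal(size=n)
expression = (1.0 + 0.2 * genotype + 0.8 * proxy
              + 0.5 * genotype * proxy + rng.normal(size=n))

fit = fit_ct_eqtl(expression, genotype, proxy)
print(fit)
```

An interaction coefficient well away from zero would suggest that the SNP's effect on expression changes with the proxy gene's level, which is the kind of signal a generic eQTL model without the interaction term would not separate out.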


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Xin Shao ◽  
Ning Lv ◽  
Jie Liao ◽  
Jinbo Long ◽  
Rui Xue ◽  
...  

Abstract. Background: Cancer is a heterogeneous disease with many genetic variations. Lines of evidence have shown that copy number variations (CNVs) of certain genes are involved in the development and progression of many cancers through alterations of their gene expression levels in individual or several cancer types. However, it is not clear whether this correlation is a general phenomenon across multiple cancer types. Methods: In this study we applied a bioinformatics approach integrating CNV and differential gene expression mathematically across 1025 cell lines and 9159 patient samples to detect their potential relationship. Results: Our results showed there is a close correlation between CNV and differential gene expression, and the copy number displayed a positive linear influence on gene expression for the majority of genes, indicating that genetic variation generated a direct effect on gene transcriptional level. Another independent dataset was used to revalidate the relationship between copy number and expression level. Further analysis shows that genes with a general positive linear influence on gene expression are clustered in certain disease-related pathways, which suggests the involvement of CNV in the pathophysiology of diseases. Conclusions: This study shows the close correlation between CNV and differential gene expression, revealing the qualitative relationship between genetic variation and its downstream effect, especially for oncogenes and tumor suppressor genes. It is of critical importance to elucidate the relationship between copy number variation and gene expression for prevention, diagnosis and treatment of cancer.


Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 235 ◽  
Author(s):  
Hannah Swahn ◽  
Ann Harris

The cystic fibrosis transmembrane conductance regulator (CFTR) gene is an attractive target for gene editing approaches, which may yield novel therapeutic approaches for genetic diseases such as cystic fibrosis (CF). However, for gene editing to be effective, aspects of the three-dimensional (3D) structure and cis-regulatory elements governing the dynamic expression of CFTR need to be considered. In this review, we focus on the higher order chromatin organization required for normal CFTR locus function, together with the complex mechanisms controlling expression of the gene in different cell types impaired by CF pathology. Across all cells, the CFTR locus is organized into an invariant topologically associated domain (TAD) established by the architectural proteins CCCTC-binding factor (CTCF) and cohesin complex. Additional insulator elements within the TAD also recruit these factors. Although the CFTR promoter is required for basal levels of expression, cis-regulatory elements (CREs) in intergenic and intronic regions are crucial for cell-specific and temporal coordination of CFTR transcription. These CREs are recruited to the promoter through chromatin looping mechanisms and enhance cell-type-specific expression. These features of the CFTR locus should be considered when designing gene-editing approaches, since failure to recognize their importance may disrupt gene expression and reduce the efficacy of therapies.


2017 ◽  
Author(s):  
D. Leland Taylor ◽  
David A. Knowles ◽  
Laura J. Scott ◽  
Andrea H. Ramirez ◽  
Francesco Paolo Casale ◽  
...  

Abstract: From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE) - genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate interaction quantitative trait loci (iQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.


2021 ◽  
Author(s):  
Jianbo Li ◽  
Ligang Wang ◽  
Dawei Yu ◽  
Junfeng Hao ◽  
Longchao Zhang ◽  
...  

Thoracolumbar vertebra (TLV) and rib primordium (RP) development is a common evolutionary feature across vertebrates, although whole-organism analysis of TLV and RP gene expression dynamics has been lacking. Here we investigated the single-cell transcriptomic landscape of thoracic vertebra (TV), lumbar vertebra (LV), and RP cells from a pig embryo at 27 days post-fertilization (dpf) and identified six cell types with distinct gene-expression signatures. In-depth dissection of the gene-expression dynamics and RNA velocity revealed a coupled process of osteogenesis and angiogenesis during TLV and rib development. Further analysis of cell-type-specific and strand-specific expression uncovered extremely high levels of HOXA10 3'-UTR sequence specific to osteoblast of LV cells, which may function as anti-HOXA10-antisense by counteracting the HOXA10-antisense effect to determine TLV transition. Thus, this work provides a valuable resource for understanding embryonic osteogenesis and angiogenesis underlying vertebrate TLV and RP development at cell-type-specific resolution, and offers a comprehensive view of the transcriptional profile of animal embryo development.


2018 ◽  
Author(s):  
Ken Jean-Baptiste ◽  
José L. McFaline-Figueroa ◽  
Cristina M. Alexandre ◽  
Michael W. Dorrity ◽  
Lauren Saunders ◽  
...  

ABSTRACT: Single-cell RNA-seq can yield high-resolution cell-type-specific expression signatures that reveal new cell types and the developmental trajectories of cell lineages. Here, we apply this approach to A. thaliana root cells to capture gene expression in 3,121 root cells. We analyze these data with Monocle 3, which orders single cell transcriptomes in an unsupervised manner and uses machine learning to reconstruct single-cell developmental trajectories along pseudotime. We identify hundreds of genes with cell-type-specific expression, with pseudotime analysis of several cell lineages revealing both known and novel genes that are expressed along a developmental trajectory. We identify transcription factor motifs that are enriched in early and late cells, together with the corresponding candidate transcription factors that likely drive the observed expression patterns. We assess and interpret changes in total RNA expression along developmental trajectories and show that trajectory branch points mark developmental decisions. Finally, by applying heat stress to whole seedlings, we address the longstanding question of possible heterogeneity among cell types in the response to an abiotic stress. Although the response of canonical heat shock genes dominates expression across cell types, subtle but significant differences in other genes can be detected among cell types. Taken together, our results demonstrate that single-cell transcriptomics holds promise for studying plant development and plant physiology with unprecedented resolution.


2019 ◽  
Author(s):  
Dylan R. Farnsworth ◽  
Lauren Saunders ◽  
Adam C. Miller

ABSTRACT: The ability to define cell types and how they change during organogenesis is central to our understanding of animal development and human disease. Despite the crucial nature of this knowledge, we have yet to fully characterize all distinct cell types and the gene expression differences that generate cell types during development. To address this knowledge gap, we produced an Atlas using single-cell RNA-sequencing methods to investigate gene expression from the pharyngula to early larval stages in developing zebrafish. Our single-cell transcriptome Atlas encompasses transcriptional profiles from 44,102 cells across four days of development using duplicate experiments that confirmed high reproducibility. We annotated 220 identified clusters and highlighted several strategies for interrogating changes in gene expression associated with the development of zebrafish embryos at single-cell resolution. Furthermore, we highlight the power of this analysis to assign new cell-type or developmental stage-specific expression information to many genes, including those that are currently known only by sequence and/or that lack expression information altogether. The resulting Atlas is a resource for biologists to generate hypotheses for genetic (mutant) or functional analysis, to launch an effort to define the diversity of cell types during zebrafish organogenesis, and to examine the transcriptional profiles that produce each cell type over developmental time.


2017 ◽  
Author(s):  
C Calabrese ◽  
K Lehmann ◽  
L Urban ◽  
F Liu ◽  
S Erkek ◽  
...  

Abstract: Cancer is characterised by somatic genetic variation, but the effect of the majority of non-coding somatic variants and the interface with the germline genome are still unknown. We analysed the whole genome and RNA-Seq data from 1,188 human cancer patients as provided by the Pan-cancer Analysis of Whole Genomes (PCAWG) project to map cis expression quantitative trait loci of somatic and germline variation and to uncover the causes of allele-specific expression patterns in human cancers. The availability of the first large-scale dataset with both whole genome and gene expression data enabled us to uncover the effects of the non-coding variation on cancer. In addition to confirming known regulatory effects, we identified novel associations between somatic variation and expression dysregulation, in particular in distal regulatory elements. Finally, we uncovered links between somatic mutational signatures and gene expression changes, including TERT and LMO2, and we explained the inherited risk factors in APOBEC-related mutational processes. This work represents the first large-scale assessment of the effects of both germline and somatic genetic variation on gene expression in cancer and creates a valuable resource cataloguing these effects.

