GSDB: a database of 3D chromosome and genome structures reconstructed from Hi-C data

2019 ◽  
Author(s):  
Oluwatosin Oluwadare ◽  
Max Highsmith ◽  
Jianlin Cheng

ABSTRACT Advances in the study of chromosome conformation capture (3C) technologies, such as the Hi-C technique, which captures chromosomal interactions on a genome-wide scale, have led to the development of methods that reconstruct three-dimensional (3D) chromosome and genome structures from Hi-C data. The 3D genome structure is important because it plays a role in a variety of biological activities such as DNA replication, gene regulation, genome interaction, and gene expression. In recent years, numerous Hi-C datasets have been generated, and a number of genome structure reconstruction algorithms have been developed. However, until now, there has been no freely available repository for 3D chromosome structures. In this work, we outline the construction of a novel Genome Structure Database (GSDB), a comprehensive repository that contains 3D structures for Hi-C datasets constructed by a variety of 3D structure reconstruction tools. GSDB contains over 50,000 structures constructed by 12 state-of-the-art chromosome and genome structure prediction methods for publicly used Hi-C datasets with varying resolution. The database is useful for the community to study the function of the genome from a 3D perspective. GSDB is accessible at http://sysbio.rnet.missouri.edu/3dgenome/GSDB

2019 ◽  
Author(s):  
Oluwatosin Oluwadare

Sixteen years after the sequencing of the human genome in the Human Genome Project (HGP), and 17 years after the introduction of Chromosome Conformation Capture (3C) technologies, three-dimensional (3D) inference and big data remain problematic in the field of genomics, and specifically in the field of 3C data analysis. Three-dimensional inference involves the reconstruction of a genome's 3D structure or, in some cases, an ensemble of structures from contact interaction frequencies extracted from a variant of the 3C technology called Hi-C. Further questions remain about chromosome topology and structure; enhancer-promoter interactions; the location of genes, gene clusters, and transcription factors; the relationship between gene expression and epigenetics; and chromosome visualization at a higher scale, among others. In this dissertation, four major contributions are described. The first is 3DMax, a tool for chromosome and genome 3D structure prediction from Hi-C data using an optimization algorithm. The second is GSDB, a comprehensive and common repository that contains 3D structures for Hi-C datasets from novel 3D structure reconstruction tools developed over the years. The third is ClusterTAD, a method for extracting topologically associating domains (TADs) from Hi-C data using an unsupervised learning algorithm. Finally, we introduce GenomeFlow, a comprehensive graphical tool that facilitates the entire process of modeling and analysis of 3D genome organization. It is worth noting that GenomeFlow and GSDB are the first of their kind in the 3D chromosome and genome research field. All the methods are available as software tools that are freely available to the scientific community.
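As a rough illustration of what 3D inference from contact data involves, the Python sketch below fits 3D coordinates to a set of target pairwise distances by plain gradient descent. It is not 3DMax or any tool named above; the function names, the toy target matrix, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import math
import random


def stress(coords, target):
    """Sum of squared differences between model and target distances."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if target[i][j] is None:  # no restraint for this pair
                continue
            total += (math.dist(coords[i], coords[j]) - target[i][j]) ** 2
    return total


def reconstruct(target, n_steps=2000, lr=0.01, seed=0):
    """Fit 3D coordinates so that pairwise distances approach `target`.

    `target[i][j]` is a desired distance, or None when the pair is unrestrained.
    With n_steps=0 the random starting coordinates are returned unchanged.
    """
    rng = random.Random(seed)
    n = len(target)
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    for _ in range(n_steps):
        grads = [[0.0, 0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if target[i][j] is None:
                    continue
                diff = [coords[i][k] - coords[j][k] for k in range(3)]
                dist = math.dist(coords[i], coords[j]) or 1e-9  # avoid division by zero
                # Gradient of (dist - target)^2 with respect to coords[i].
                scale = 2.0 * (dist - target[i][j]) / dist
                for k in range(3):
                    grads[i][k] += scale * diff[k]
                    grads[j][k] -= scale * diff[k]
        for i in range(n):
            for k in range(3):
                coords[i][k] -= lr * grads[i][k]
    return coords


# Toy target: four loci that should all sit one unit apart (hypothetical values).
toy_target = [
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
fitted = reconstruct(toy_target)
```

Comparing `stress(fitted, toy_target)` with the stress of the starting coordinates, `stress(reconstruct(toy_target, n_steps=0), toy_target)`, gives a quick check that the fit moved toward the restraints. Real Hi-C reconstruction works with far larger, noisier matrices and typically needs a more careful objective and optimizer.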


2017 ◽  
Author(s):  
Guangxiang Zhu ◽  
Wenxuan Deng ◽  
Hailin Hu ◽  
Rui Ma ◽  
Sai Zhang ◽  
...  

Abstract Decoding the spatial organization of chromosomes has crucial implications for studying eukaryotic gene regulation. Recently, chromosomal conformation capture based technologies, such as Hi-C, have been widely used to uncover the interaction frequencies of genomic loci in a high-throughput and genome-wide manner and to provide new insights into the folding of the three-dimensional (3D) genome structure. In this paper, we develop a novel manifold learning framework, called GEM (Genomic organization reconstructor based on conformational Energy and Manifold learning), to elucidate the underlying 3D spatial organization of chromosomes from Hi-C data. Unlike previous chromatin structure reconstruction methods, which explicitly assume specific relationships between Hi-C interaction frequencies and spatial distances between distal genomic loci, GEM is able to reconstruct an ensemble of chromatin conformations by directly embedding the neighboring affinities from Hi-C space into 3D Euclidean space. It does so with a manifold learning strategy that considers both the fitness of the Hi-C data and the biophysical feasibility of the modeled structures, the latter measured by the conformational energy derived from our current biophysical knowledge about the 3D polymer model. Extensive validation tests on both simulated interaction frequency data and experimental Hi-C data of yeast and human demonstrated that GEM not only greatly outperformed other state-of-the-art modeling methods but also reconstructed accurate chromatin structures that agreed well with the hold-out or independent Hi-C data and with sparse geometric restraints derived from previous fluorescence in situ hybridization (FISH) studies. 
In addition, as GEM can generate accurate spatial organizations of chromosomes by integrating both experimentally derived spatial contacts and conformational energy, we for the first time extended our modeling method to recover long-range genomic interactions that are missing from the original Hi-C data. All these results indicated that GEM can provide physically and physiologically valid 3D representations of the organization of chromosomes and thus serve as an effective and useful genome structure reconstructor.


2021 ◽  
Author(s):  
Brandon Collins ◽  
Philip N. Brown ◽  
Oluwatosin Oluwadare

Background: With the advent of Next Generation Sequencing and the Hi-C experiment, high quality genome-wide contact data is becoming increasingly available. This data represents an empirical measure of how a genome interacts inside the nucleus. Genome conformation is of particular interest as it has been experimentally shown to be a driving force for many genomic functions from regulation to transcription. Thus, the Three-Dimensional Genome Reconstruction Problem seeks to take Hi-C data and produce the complete physical genome structure as it appears in the nucleus for genomic analysis. Results: We propose and develop a novel method to solve the Chromosome and Genome Reconstruction problem based on the Bat Algorithm, which we call ChromeBat. We demonstrate on real Hi-C data that ChromeBat is capable of state-of-the-art performance. Additionally, the domain of Genome Reconstruction has been criticized for lacking algorithmic diversity, and the bio-inspired nature of ChromeBat contributes algorithmic diversity to the problem domain. Conclusions: ChromeBat is an effective approach for solving the Genome Reconstruction Problem. The source code and usage guide can be found here: https://github.com/OluwadareLab/ChromeBat.


2022 ◽  
Vol 13 (1) ◽  
Author(s):  
Ruiting Wang ◽  
Fengling Chen ◽  
Qian Chen ◽  
Xin Wan ◽  
Minglei Shi ◽  
...  

Abstract The genome exists as an organized, three-dimensional (3D) dynamic architecture, and each cell type has a unique 3D genome organization that determines its cell identity. An unresolved question is how cell type-specific 3D genome structures are established during development. Here, we analyzed 3D genome structures in muscle cells from mice lacking the muscle lineage transcription factor (TF), MyoD, versus wild-type mice. We show that MyoD functions as a “genome organizer” that specifies 3D genome architecture unique to muscle cell development, and that H3K27ac is insufficient for the establishment of MyoD-induced chromatin loops in muscle cells. Moreover, we present evidence that other cell lineage-specific TFs might also exert functional roles in orchestrating lineage-specific 3D genome organization during development.


2020 ◽  
Vol 48 (W1) ◽  
pp. W170-W176
Author(s):  
Michal Wlasnowolski ◽  
Michal Sadowski ◽  
Tymon Czarnota ◽  
Karolina Jodkowska ◽  
Przemyslaw Szalaj ◽  
...  

Abstract Structural variants (SVs) that alter DNA sequence emerge as a driving force involved in the reorganisation of DNA spatial folding, thus affecting gene transcription. In this work, we describe an improved version of our integrated web service for structural modeling of three-dimensional genome (3D-GNOME), which now incorporates all types of SVs to model changes to the reference 3D conformation of chromatin. In 3D-GNOME 2.0, the default reference 3D genome structure is generated using ChIA-PET data from the GM12878 cell line and SVs data are sourced from the population-scale catalogue of SVs identified by the 1000 Genomes Consortium. However, users may also submit their own structural data to set a customized reference genome structure, and/or a custom input list of SVs. 3D-GNOME 2.0 provides novel tools to inspect, visualize and compare 3D models for regions that differ in terms of their linear genomic sequence. Contact diagrams are displayed to compare the reference 3D structure with the one altered by SVs. In our opinion, 3D-GNOME 2.0 is a unique online tool for modeling and analyzing conformational changes to the human genome induced by SVs across populations. It can be freely accessed at https://3dgnome.cent.uw.edu.pl/.


2017 ◽  
Author(s):  
Haochen Li ◽  
Reza Kalhor ◽  
Bing Li ◽  
Trent Su ◽  
Arnold J. Berk ◽  
...  

Abstract Viruses have evolved a variety of mechanisms to interact with host cells for their adaptive benefits, including subverting host immune responses and hijacking host DNA replication/transcription machineries [1–3]. Although interactions between viral and host proteins have been studied extensively, little is known about how the viral genome may interact with the host genome and how such interactions could affect the activities of both the virus and the host cell. Since the three-dimensional organization of a genome can have significant impact on genomic activities such as transcription and replication, we hypothesize that such structure-based regulation of genomic functions also applies to viral genomes depending on their association with host genomic regions and their spatial locations inside the nucleus. Here, we used Tethered Chromosome Conformation Capture (TCC) to investigate viral-host genome interactions between the adenovirus and human lung fibroblast cells. We found viral-host genome interactions were enriched in certain active chromatin regions and chromatin domains marked by H3K27me3. The contacts by viral DNA seem to impact the structure and function of the host genome, leading to remodeling of the fibroblast epigenome. Our study represents the first comprehensive analysis of viral-host interactions at the genome structure level, revealing unexpectedly specific virus-host genome interactions. The non-random nature of such interactions indicates a deliberate but poorly understood mechanism for targeting of host DNA by foreign genomes.


2021 ◽  
Author(s):  
Van Hovenga ◽  
Oluwatosin Oluwadare ◽  
Jugal Kalita

Chromosome conformation capture (3C) is a method of measuring chromosome topology in terms of loci interaction. The Hi-C method is a derivative of 3C that allows for genome-wide quantification of chromosome interaction. From such interaction data, it is possible to infer the three-dimensional (3D) structure of the underlying chromosome. In this paper, we use a node embedding algorithm and a graph neural network to predict the 3D coordinates of each genomic locus from the corresponding Hi-C contact data. Unlike other chromosome structure prediction methods, our method can generalize a single model across Hi-C resolutions, multiple restriction enzymes, and multiple cell populations while maintaining reconstruction accuracy. We derive these results using three separate Hi-C data sets from the GM12878, GM06990, and K562 cell lines. We also compare the reconstruction accuracy of our method to four other existing methods and show that our method yields superior performance. Our algorithm outperforms the state-of-the-art methods in the accuracy of prediction and introduces a novel method for 3D structure prediction from Hi-C data.


Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1757
Author(s):  
Brandon Collins ◽  
Oluwatosin Oluwadare ◽  
Philip Brown

With the advent of Next Generation Sequencing and the Hi-C experiment, high quality genome-wide contact data are becoming increasingly available. These data represent an empirical measure of how a genome interacts inside the nucleus. Genome conformation is of particular interest as it has been experimentally shown to be a driving force for many genomic functions from regulation to transcription. Thus, the Three-Dimensional Genome Reconstruction Problem (3D-GRP) seeks to take Hi-C data and produce a complete physical genome structure as it appears in the nucleus for genomic analysis. We propose and develop a novel method to solve the Chromosome and Genome Reconstruction problem based on the Bat Algorithm (BA), which we call ChromeBat. We demonstrate on real Hi-C data that ChromeBat is capable of state-of-the-art performance. Additionally, the domain of Genome Reconstruction has been criticized for lacking algorithmic diversity, and the bio-inspired nature of ChromeBat contributes algorithmic diversity to the problem domain. ChromeBat is an effective approach for solving the Genome Reconstruction Problem.
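For readers unfamiliar with the Bat Algorithm, the Python sketch below shows a generic, textbook-style version of the metaheuristic minimizing a simple test function. It is not ChromeBat's implementation: the function names, the parameter defaults (`f_min`, `f_max`, `alpha`, `gamma`, `r0`), the search bounds, and the sphere objective are all assumptions made for illustration. In a reconstruction setting the objective would presumably score how well candidate coordinates agree with Hi-C-derived restraints, but that mapping is not shown here.

```python
import math
import random


def bat_algorithm(objective, dim, n_bats=20, n_iter=200, f_min=0.0, f_max=2.0,
                  alpha=0.9, gamma=0.9, r0=0.5, lower=-5.0, upper=5.0, seed=0):
    """Minimize `objective` over [lower, upper]^dim with a basic Bat Algorithm."""
    rng = random.Random(seed)

    def clamp(value):
        return max(lower, min(upper, value))

    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [objective(p) for p in pos]
    loudness = [1.0] * n_bats      # A_i: shrinks as a bat improves
    pulse_rate = [0.0] * n_bats    # r_i: grows toward r0 as a bat improves
    best_idx = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = pos[best_idx][:], fit[best_idx]

    for t in range(1, n_iter + 1):
        mean_loudness = sum(loudness) / n_bats
        for i in range(n_bats):
            # Global move: random frequency scales the pull relative to the best bat.
            freq = f_min + (f_max - f_min) * rng.random()
            for k in range(dim):
                vel[i][k] += (pos[i][k] - best[k]) * freq
            cand = [clamp(pos[i][k] + vel[i][k]) for k in range(dim)]
            # Local move: random walk around the current best solution.
            if rng.random() > pulse_rate[i]:
                cand = [clamp(best[k] + mean_loudness * rng.uniform(-1.0, 1.0))
                        for k in range(dim)]
            cand_fit = objective(cand)
            # Accept an improving candidate with probability equal to the loudness.
            if cand_fit <= fit[i] and rng.random() < loudness[i]:
                pos[i], fit[i] = cand, cand_fit
                loudness[i] *= alpha
                pulse_rate[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit <= best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_fit = bat_algorithm(sphere, dim=3)
```

Because the best solution is only replaced by a candidate that scores at least as well, `best_fit` never gets worse than the best of the initial random population.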


2017 ◽  
Author(s):  
Tuan Anh Trieu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Different cell types of an organism have the same DNA sequence, but they can function differently because differences in their 3D organization allow them to express different genes and have different cellular functions. Understanding the 3D organization of the genome is key to understanding the functions of the cell. Chromosome conformation capture techniques such as Hi-C and TCC, which capture interactions between proximal chromosome fragments, have allowed the study of 3D genome organization at high resolution and high throughput. My work focuses on developing computational methods to reconstruct 3D genome structures from Hi-C data. I present three methods to reconstruct 3D genome and chromosome structures. The first method builds 3D genome models from soft constraints of contacts and non-contacts. It uses the concept of contact and non-contact to reconstruct 3D models without translating interaction frequencies into physical distances. This translation is commonly used by other methods even though it makes a strong assumption about the relationship between interaction frequencies and physical distances. On a synthetic dataset, where the relationship was known, my method performed comparably with other methods that assume the relationship. This shows the potential of my method for real Hi-C datasets, where the relationship is unknown. The limitation of the method is that it has parameters requiring manual adjustment. I developed the second method to reconstruct 3D genome models. This method uses a commonly used function to translate interaction frequencies into physical distances to build 3D models. I proposed a novel way to derive soft constraints to handle inconsistency in the data and to make the method robust. Building 3D models at high resolution is a more challenging problem, as the number of constraints is small and the feasible space is larger. 
I introduced a third method to build 3D chromosome models at high resolution. The method reconstructs models at low resolution and then uses them to guide the reconstruction of models at high resolution. The last part of my work is the development of a comprehensive tool with an intuitive graphical user interface to analyze Hi-C data and to reconstruct and analyze 3D models.
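The abstract does not name the function used to translate interaction frequencies into distances. An inverse power law of the form d = 1 / f^alpha is a common choice in this literature, and the Python sketch below assumes it. The function name, the `alpha` default, and the toy contact matrix are hypothetical and are not taken from the dissertation.

```python
def frequencies_to_distances(contacts, alpha=1.0):
    """Convert a square contact-frequency matrix into target distances.

    Assumes the inverse power law d_ij = 1 / (f_ij ** alpha). Pairs with zero
    frequency get None, because no distance can be inferred for them this way.
    """
    n = len(contacts)
    distances = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                distances[i][j] = 0.0
            elif contacts[i][j] > 0:
                distances[i][j] = 1.0 / (contacts[i][j] ** alpha)
    return distances


# Toy symmetric contact matrix for four loci (made-up counts).
toy_contacts = [
    [0, 8, 2, 0],
    [8, 0, 4, 1],
    [2, 4, 0, 8],
    [0, 1, 8, 0],
]
toy_distances = frequencies_to_distances(toy_contacts, alpha=1.0)
```

Frequently interacting loci map to short distances and rarely interacting loci to long ones. The exponent `alpha` encodes exactly the strong assumption the abstract warns about, which is why the first method described above avoids the conversion altogether.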


2018 ◽  
Author(s):  
David J Winter ◽  
Austen RD Ganley ◽  
Carolyn A Young ◽  
Ivan Liachko ◽  
Christopher L Schardl ◽  
...  

Abstract Structural features of genomes, including the three-dimensional arrangement of DNA in the nucleus, are increasingly seen as key contributors to the regulation of gene expression. However, studies on how genome structure and nuclear organization influence transcription have so far been limited to a handful of model species. This narrow focus limits our ability to draw general conclusions about the ways in which three-dimensional structures are encoded, and to integrate information from three-dimensional data to address a broader gamut of biological questions. Here, we generate a complete and gapless genome sequence for the filamentous fungus, Epichloë festucae. Coupling it with RNAseq and HiC data, we investigate how the structure of the genome contributes to the suite of transcriptional changes that an Epichloë species needs to maintain symbiotic relationships with its grass host. Our results reveal a unique “patchwork” genome, in which repeat-rich blocks of DNA with discrete boundaries are interspersed by gene-rich sequences. In contrast to other species, the three-dimensional structure of the genome is anchored by these repeat blocks, which act to isolate transcription in neighbouring gene-rich regions. Genes that are differentially expressed in planta are enriched near the boundaries of these repeat-rich blocks, suggesting that their three-dimensional orientation partly encodes and regulates the symbiotic relationship formed by this organism.

