Evaluation and comparison of methods for recapitulation of 3D spatial chromatin structures

2017 ◽  
Vol 20 (4) ◽  
pp. 1205-1214
Author(s):  
Jincheol Park ◽  
Shili Lin

Abstract. How chromosomes fold and how distal genomic elements interact with one another at a genomic scale have been actively investigated over the past decade, following the seminal work describing the Chromosome Conformation Capture (3C) assay. Essentially, 3C-based technologies produce two-dimensional (2D) contact maps that capture interactions between genomic fragments. Accordingly, a plethora of analytical methods have been proposed that take a 2D contact map as input to recapitulate the underlying whole-genome three-dimensional (3D) structure of the chromatin. However, their performance in terms of several factors, including data resolution and ability to handle contact map features, has not been sufficiently evaluated. This task is taken up in this article, in which we consider several recent and/or well-regarded methods, both optimization-based and model-based, for their ability to produce 3D structures from contact maps generated from a population of cells. These methods are evaluated and compared using both simulated and real data, according to several criteria. For simulated data sets, the focus is on accurate recapitulation of the entire structure, given the existence of a gold standard. For real data sets, comparison with distances measured by Fluorescence in situ Hybridization and consistency with several genomic features of known biological function are examined.

2017 ◽  
Author(s):  
Oana Ursu ◽  
Nathan Boley ◽  
Maryna Taranova ◽  
Y.X. Rachel Wang ◽  
Galip Gurkan Yardimci ◽  
...  

Abstract. Motivation: The three-dimensional organization of chromatin plays a critical role in gene regulation and disease. High-throughput chromosome conformation capture experiments such as Hi-C are used to obtain genome-wide maps of 3D chromatin contacts. However, robust estimation of data quality and systematic comparison of these contact maps is challenging due to the multi-scale, hierarchical structure of chromatin contacts and the resulting properties of experimental noise in the data. Measuring concordance of contact maps is important for assessing reproducibility of replicate experiments and for modeling variation between different cellular contexts. Results: We introduce a concordance measure called GenomeDISCO (DIfferences between Smoothed COntact maps) for assessing the similarity of a pair of contact maps obtained from chromosome conformation capture experiments. The key idea is to smooth contact maps using random walks on the contact map graph, before estimating concordance. We use simulated datasets to benchmark GenomeDISCO’s sensitivity to different types of noise that affect chromatin contact maps. When applied to a large collection of Hi-C datasets, GenomeDISCO accurately distinguishes biological replicates from samples obtained from different cell types. GenomeDISCO also generalizes to other chromosome conformation capture assays, such as HiChIP. Availability: Software implementing GenomeDISCO is available at https://github.com/kundajelab/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
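
The idea of smoothing contact maps with random walks before comparing them can be illustrated with a minimal sketch. It is an assumption-laden toy, not the published GenomeDISCO score: the function names, the number of walk steps, and the mean-absolute-difference summary are all hypothetical choices made for illustration.

```python
def row_normalize(contacts):
    """Turn a contact matrix into a random-walk transition matrix (rows sum to 1)."""
    normalized = []
    for row in contacts:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def smooth(contacts, steps):
    """Return the `steps`-step random-walk matrix P**steps (steps >= 1)."""
    transition = row_normalize(contacts)
    result = transition
    for _ in range(steps - 1):
        result = matmul(result, transition)
    return result


def smoothed_difference(map_1, map_2, steps=3):
    """Mean absolute entrywise difference between the two smoothed maps.

    Assumption: this summary statistic is illustrative only and is not
    the scoring formula used by the GenomeDISCO software.
    """
    s1, s2 = smooth(map_1, steps), smooth(map_2, steps)
    n = len(map_1)
    total = sum(abs(s1[i][j] - s2[i][j]) for i in range(n) for j in range(n))
    return total / (n * n)


# Hypothetical 3-bin contact maps (symmetric counts).
map_a = [[0, 5, 1], [5, 0, 4], [1, 4, 0]]
map_b = [[0, 1, 5], [1, 0, 4], [5, 4, 0]]

print(smoothed_difference(map_a, map_a))  # identical maps give 0.0
print(smoothed_difference(map_a, map_b))  # different maps give a positive value
```

A lower value means the smoothed maps agree more closely; with more walk steps, sparse or noisy entries are averaged over longer paths in the contact graph.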


2021 ◽  
Author(s):  
Van Hovenga ◽  
Oluwatosin Oluwadare ◽  
Jugal Kalita

Chromosome conformation capture (3C) is a method of measuring chromosome topology in terms of loci interaction. The Hi-C method is a derivative of 3C that allows for genome-wide quantification of chromosome interaction. From such interaction data, it is possible to infer the three-dimensional (3D) structure of the underlying chromosome. In this paper, we use a node embedding algorithm and a graph neural network to predict the 3D coordinates of each genomic locus from the corresponding Hi-C contact data. Unlike other chromosome structure prediction methods, our method can generalize a single model across Hi-C resolutions, multiple restriction enzymes, and multiple cell populations while maintaining reconstruction accuracy. We derive these results using three separate Hi-C data sets from the GM12878, GM06990, and K562 cell lines. We also compare the reconstruction accuracy of our method to four other existing methods and show that our method yields superior performance. Our algorithm outperforms the state-of-the-art methods in the accuracy of prediction and introduces a novel method for 3D structure prediction from Hi-C data.


Sensors ◽  
2018 ◽  
Vol 18 (11) ◽  
pp. 3949 ◽  
Author(s):  
Wei Li ◽  
Mingli Dong ◽  
Naiguang Lu ◽  
Xiaoping Lou ◽  
Peng Sun

An extended robot–world and hand–eye calibration method is proposed in this paper to evaluate the transformation relationship between the camera and robot device. This approach could be performed for mobile or medical robotics applications, where precise, expensive, or unsterile calibration objects, or enough movement space, cannot be made available at the work site. Firstly, a mathematical model is established to formulate the robot-gripper-to-camera rigid transformation and robot-base-to-world rigid transformation using the Kronecker product. Subsequently, a sparse bundle adjustment is introduced for the optimization of robot–world and hand–eye calibration, as well as reconstruction results. Finally, a validation experiment including two kinds of real data sets is designed to demonstrate the effectiveness and accuracy of the proposed approach. The translation relative error of rigid transformation is less than 8/10,000 by a Denso robot in a movement range of 1.3 m × 1.3 m × 1.2 m. The distance measurement mean error after three-dimensional reconstruction is 0.13 mm.
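
The Kronecker-product formulation mentioned above can be sketched for the rotation part of the problem. The sketch assumes the common notation AX = ZB from the calibration literature, which the abstract does not state, and uses the identity vec(AXB) = (Bᵀ ⊗ A) vec(X) to turn R_A R_X = R_Z R_B into a linear system in the unknown rotations. All matrices and helper names are hypothetical, and the paper's full model and bundle adjustment are not reproduced.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def kron(a, b):
    """Kronecker product a ⊗ b."""
    return [[x * y for x in a_row for y in b_row] for a_row in a for b_row in b]


def vec(m):
    """Stack the columns of m into one vector (column-major order)."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical ground-truth unknowns and one measured motion (90-degree rotations,
# chosen so that all arithmetic is exact in integers).
R_X = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # assumed hand-eye rotation (about z)
R_Z = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]   # assumed robot-world rotation (about x)
R_B = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]   # assumed measured rotation (about y)
R_A = matmul(matmul(R_Z, R_B), transpose(R_X))  # built so that R_A R_X = R_Z R_B

# Linear form: (I ⊗ R_A) vec(R_X) - (R_Bᵀ ⊗ I) vec(R_Z) = 0
left = matvec(kron(IDENTITY, R_A), vec(R_X))
right = matvec(kron(transpose(R_B), IDENTITY), vec(R_Z))
residual = [p - q for p, q in zip(left, right)]
print(residual)  # all zeros for a consistent pose pair
```

In practice several pose pairs would be stacked into one homogeneous system and solved in a least-squares sense; that step needs a linear-algebra library and is left out here.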


2009 ◽  
Vol 2009 ◽  
pp. 1-11 ◽  
Author(s):  
Mahendran Shitan ◽  
Shelton Peiris

Spatial modelling has its applications in many fields like geology, agriculture, meteorology, geography, and so forth. In time series a class of models known as Generalised Autoregressive (GAR) has been introduced by Peiris (2003) that includes an index parameter δ. It has been shown that the inclusion of this additional parameter aids in modelling and forecasting many real data sets. This paper studies the properties of a new class of spatial autoregressive process of order 1 with an index. We will call this a Generalised Separable Spatial Autoregressive (GENSSAR) Model. The spectral density function (SDF), the autocovariance function (ACVF), and the autocorrelation function (ACF) are derived. The theoretical ACF and SDF plots are presented as three-dimensional figures.


2005 ◽  
Vol 30 (4) ◽  
pp. 369-396 ◽  
Author(s):  
Eisuke Segawa

Multi-indicator growth models were formulated as special three-level hierarchical generalized linear models to analyze growth of a trait latent variable measured by ordinal items. Items are nested within a time-point, and time-points are nested within subject. These models are special because they include factor analytic structure. They can analyze not only data with item- and time-level missing observations, but also data with time points freely specified over subjects. Furthermore, features useful for longitudinal analyses, an “autoregressive error degree one” structure for the trait residuals and estimated time-scores, were included. The approach is Bayesian with Markov chain Monte Carlo, and the models are implemented in WinBUGS. The models are illustrated with two simulated data sets and one real data set with planned missing items within a scale.


2019 ◽  
Author(s):  
Oluwatosin Oluwadare

Sixteen years after the Human Genome Project (HGP) sequenced the human genome, and 17 years after the introduction of Chromosome Conformation Capture (3C) technologies, three-dimensional (3D) inference and big data remain problematic in the field of genomics, and specifically in the field of 3C data analysis. Three-dimensional inference involves the reconstruction of a genome's 3D structure or, in some cases, an ensemble of structures from contact interaction frequencies extracted from a variant of the 3C technology called Hi-C. Further questions remain about chromosome topology and structure; enhancer-promoter interactions; the location of genes, gene clusters, and transcription factors; the relationship between gene expression and epigenetics; and chromosome visualization at a higher scale, among others. In this dissertation, four major contributions are described: first, 3DMax, a tool for chromosome and genome 3D structure prediction from Hi-C data using an optimization algorithm; second, GSDB, a comprehensive and common repository that contains 3D structures for Hi-C datasets from novel 3D structure reconstruction tools developed over the years; third, ClusterTAD, a method for topologically associating domain (TAD) extraction from Hi-C data using an unsupervised learning algorithm; and finally, GenomeFlow, a comprehensive graphical tool to facilitate the entire process of modeling and analysis of 3D genome organization. It is worth noting that GenomeFlow and GSDB are the first of their kind in the 3D chromosome and genome research field. All the methods are freely available to the scientific community as software tools.


2021 ◽  
Vol 17 (3) ◽  
pp. e1008865
Author(s):  
Yang Li ◽  
Chengxin Zhang ◽  
Eric W. Bell ◽  
Wei Zheng ◽  
Xiaogen Zhou ◽  
...  

The topology of protein folds can be specified by inter-residue contact-maps, and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing the 3D structures of proteins that lack homologous templates in the PDB library.
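
The "top-L" and "top-L/5 long-range contact precision" figures quoted above can be made concrete with a small sketch. It assumes the usual CASP convention that a long-range pair has sequence separation of at least 24 residues and that L is the protein length; the function name, arguments, and toy data are hypothetical and are not taken from the TripletRes code.

```python
def top_l_long_range_precision(scores, true_contacts, length, divisor=1, min_sep=24):
    """Precision of the top L/divisor long-range predictions.

    scores:        dict mapping residue pairs (i, j), i < j, to predicted confidence
    true_contacts: set of residue pairs (i, j) that are in contact in the native structure
    length:        sequence length L
    divisor:       1 for top-L, 5 for top-L/5, and so on
    min_sep:       minimum |i - j| for a long-range pair (assumed CASP convention: 24)
    """
    long_range = [(pair, score) for pair, score in scores.items()
                  if pair[1] - pair[0] >= min_sep]
    long_range.sort(key=lambda item: item[1], reverse=True)
    top = long_range[:max(1, length // divisor)]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    # Assumption: divide by the number of predictions actually selected.
    return hits / len(top)


# Hypothetical predictions for a 30-residue protein.
predicted = {
    (0, 24): 0.9, (1, 26): 0.8, (2, 28): 0.7, (3, 29): 0.6,
    (0, 25): 0.5, (4, 29): 0.4, (1, 27): 0.3,
    (0, 5): 0.99,  # short-range pair, ignored by the long-range filter
}
native = {(0, 24), (2, 28), (0, 25), (0, 5)}

print(top_l_long_range_precision(predicted, native, length=30, divisor=5))  # → 0.5
```

With `divisor=5` and L = 30, the six most confident long-range pairs are scored, three of which are native contacts.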


2020 ◽  
Vol 12 (12) ◽  
pp. 2016 ◽  
Author(s):  
Tao Zhang ◽  
Puzhao Zhang ◽  
Weilin Zhong ◽  
Zhen Yang ◽  
Fan Yang

The traditional local binary pattern (LBP, hereinafter also called the two-dimensional local binary pattern, 2D-LBP) is unable to depict the spectral characteristics of a hyperspectral image (HSI). To address this deficiency, this paper develops a joint spectral-spatial 2D-LBP feature (J2D-LBP) by averaging three different 2D-LBP features in a three-dimensional hyperspectral data cube. Subsequently, J2D-LBP is added into the Gabor filter-based deep network (GFDN), and a novel classification method, JL-GFDN, is proposed. Different from the original GFDN framework, JL-GFDN further fuses the spectral and spatial features together for HSI classification. Three real data sets are adopted to evaluate the effectiveness of JL-GFDN, and the experimental results verify that (i) JL-GFDN has a better classification accuracy than the original GFDN; (ii) J2D-LBP is more effective in HSI classification in comparison with the traditional 2D-LBP.
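
For readers unfamiliar with the traditional 2D-LBP that the paper builds on, a minimal sketch of the basic 8-neighbour code for a single pixel is given here. The neighbour ordering and the `>=` comparison are common conventions chosen as assumptions; the paper's joint J2D-LBP feature and the GFDN network are not reproduced.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (assumed ordering).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic 3x3 local binary pattern code of the pixel at (row, col).

    Each neighbour that is at least as bright as the centre sets one bit.
    The pixel must not lie on the image border.
    """
    centre = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOUR_OFFSETS):
        if image[row + d_row][col + d_col] >= centre:
            code |= 1 << bit
    return code


# Hypothetical 3x3 grey-level patch.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 6, 3],
]

print(lbp_code(patch, 1, 1))  # → 42 (bits 1, 3 and 5 are set)
```

A histogram of such codes over a window is the usual texture descriptor; applying it band by band captures spatial texture only, which is the limitation the abstract describes.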


2019 ◽  
Vol 35 (14) ◽  
pp. i145-i153 ◽  
Author(s):  
Abbas Roayaei Ardakany ◽  
Ferhat Ay ◽  
Stefano Lonardi

Abstract. Motivation: High-throughput conformation capture experiments, such as Hi-C, provide genome-wide maps of chromatin interactions, enabling life scientists to investigate the role of the three-dimensional structure of genomes in gene regulation and other essential cellular functions. A fundamental problem in the analysis of Hi-C data is how to compare two contact maps derived from Hi-C experiments. Detecting similarities and differences between contact maps is critical in evaluating the reproducibility of replicate experiments and for identifying differential genomic regions with biological significance. Due to the complexity of chromatin conformations and the presence of technology-driven and sequence-specific biases, the comparative analysis of Hi-C data is analytically and computationally challenging. Results: We present a novel method called Selfish for the comparative analysis of Hi-C data that takes advantage of the structural self-similarity in contact maps. We define a novel self-similarity measure to design algorithms for (i) measuring reproducibility for Hi-C replicate experiments and (ii) finding differential chromatin interactions between two contact maps. Extensive experimental results on simulated and real data show that Selfish is more accurate and robust than state-of-the-art methods. Availability and implementation: https://github.com/ucrbioinfo/Selfish


Author(s):  
M Perzyk ◽  
R Biernacki ◽  
J Kozlowski

Determination of the most significant manufacturing process parameters using collected past data can be very helpful in solving important industrial problems, such as the detection of root causes of deteriorating product quality, the selection of the most efficient parameters to control the process, and the prediction of breakdowns of machines, equipment, etc. A methodology of determination of relative significances of process variables and possible interactions between them, based on interrogations of generalized regression models, is proposed and tested. The performance of several types of data mining tool, such as artificial neural networks, support vector machines, regression trees, classification trees, and a naïve Bayesian classifier, is compared. Also, some simple non-parametric statistical methods, based on an analysis of variance (ANOVA) and contingency tables, are evaluated for comparison purposes. The tests were performed using simulated data sets, with assumed hidden relationships, as well as on real data collected in the foundry industry. It was found that the performance of significance and interaction factors obtained from regression models, and, in particular, neural networks, is satisfactory, while the other methods appeared to be less accurate and/or less reliable.

