Evaluating molecular modeling tools for thermal stability using an independently generated dataset

2019 ◽  
Author(s):  
Peishan Huang ◽  
Simon K. S. Chu ◽  
Henrique N. Frizzo ◽  
Morgan P. Connolly ◽  
Ryan W. Caster ◽  
...  

ABSTRACT Engineering proteins to enhance thermal stability is a widely used approach for creating industrially relevant biocatalysts. Computational tools that guide these engineering efforts remain an active area of research, with new data sets and algorithms continuing to be developed. To aid these efforts, we report an expansion of our previously published data set of mutants of a β-glucosidase to include measures of TM and ΔΔG, complementing the previously reported measures of T50 and kinetic constants (kcat and KM). For a set of 51 mutants, we found that T50 and TM are moderately correlated, with a Pearson correlation coefficient (PCC) of 0.58, indicating that the two methods capture different physical features. The performance of five computational tools in predicting stability is also evaluated on the 51-mutant data set; none is found to be a strong predictor of the observed changes in T50, TM, or ΔΔG. Furthermore, the ability of the five algorithms to predict the production of isolatable soluble protein is examined, revealing that Rosetta ΔΔG, ELASPIC, and DeepDDG are capable of predicting whether a mutant can be produced and isolated as a soluble protein. These results further highlight the need for new algorithms for predicting modest, yet important, changes in thermal stability, as well as a new utility for current algorithms: prescreening designs for the production of soluble mutants.
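
As a rough illustration of the statistic quoted above, the following sketch computes a Pearson correlation coefficient between two paired measurement series. The function name `pearson` and the sample values are assumptions made for illustration; they are not the study's code or data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical T50 and TM values (degrees C) for five mutants; not study data.
    t50 = [39.0, 40.5, 38.2, 41.0, 39.8]
    tm = [44.1, 45.0, 44.6, 46.2, 44.0]
    print(round(pearson(t50, tm), 2))
```

A coefficient well below 1, such as the 0.58 reported, means that one measure explains only part of the variation in the other.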

2011 ◽  
Vol 61 (2) ◽  
pp. 225-238 ◽  
Author(s):  
Wen Bo Liao ◽  
Zhi Ping Mi ◽  
Cai Quan Zhou ◽  
Ling Jin ◽  
Xian Han ◽  
...  

Abstract Comparative studies of relative testes size in animals show that promiscuous species have relatively larger testes than monogamous species. Sperm competition favours the evolution of larger ejaculates in many animals, and hence larger testes. In this context, we present data on relative testis mass for 17 Chinese species, including 3 polyandrous species. We analyzed relative testis mass within the Chinese data set and in combination with published data sets on Japanese and African frogs. We found that polyandrous foam-nesting species have relatively large testes, suggesting that sperm competition was an important factor affecting the evolution of relative testes size. For 4 polyandrous species, testes mass is positively correlated with the intensity (males/mating) but not with the risk (frequency of polyandrous matings) of sperm competition.


2017 ◽  
Vol 3 (5) ◽  
pp. e192 ◽  
Author(s):  
Corina Anastasaki ◽  
Stephanie M. Morris ◽  
Feng Gao ◽  
David H. Gutmann

Objective: To ascertain the relationship between the germline NF1 gene mutation and glioma development in patients with neurofibromatosis type 1 (NF1). Methods: The relationship between the type and location of the germline NF1 mutation and the presence of a glioma was analyzed in 37 participants with NF1 from one institution (Washington University School of Medicine [WUSM]) with a clinical diagnosis of NF1. Odds ratios (ORs) were calculated using both unadjusted and weighted analyses of this data set in combination with 4 previously published data sets. Results: While no statistical significance was observed between the location and type of the NF1 mutation and glioma in the WUSM cohort, power calculations revealed that a sample size of 307 participants would be required to determine the predictive value of the position or type of the NF1 gene mutation. Combining our data set with 4 previously published data sets (n = 310), children with glioma were found to be more likely to harbor 5′-end gene mutations (OR = 2; p = 0.006). Moreover, while not clinically predictive due to insufficient sensitivity and specificity, this association with glioma was stronger for participants with 5′-end truncating (OR = 2.32; p = 0.005) or 5′-end nonsense (OR = 3.93; p = 0.005) mutations relative to those without glioma. Conclusions: Individuals with NF1 and glioma are more likely to harbor nonsense mutations in the 5′ end of the NF1 gene, suggesting that the NF1 mutation may be one predictive factor for glioma in this at-risk population.
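
For readers unfamiliar with the statistic, an unadjusted odds ratio can be sketched from a 2x2 table as below. The function name, argument names, and counts are hypothetical; the abstract does not give the underlying table, and the weighted analysis it mentions is not reproduced here.

```python
def odds_ratio(cases_with, cases_without, controls_with, controls_without):
    """Unadjusted odds ratio from a 2x2 table.

    'with' / 'without' refer to the feature of interest (for example a
    5'-end mutation); cases have the outcome (for example glioma) and
    controls do not.
    """
    if cases_without == 0 or controls_with == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    # (odds of the feature among cases) / (odds of the feature among controls)
    return (cases_with * controls_without) / (cases_without * controls_with)


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only; not data from the study.
    print(odds_ratio(cases_with=20, cases_without=10,
                     controls_with=30, controls_without=30))
```

An odds ratio above 1 indicates that the feature is more common among cases than among controls; whether it is clinically predictive depends on sensitivity and specificity, as the abstract notes.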


2020 ◽  
Vol 295 (27) ◽  
pp. 8999-9011 ◽  
Author(s):  
Alina Glaub ◽  
Christopher Huptas ◽  
Klaus Neuhaus ◽  
Zachary Ardern

Ribosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of Escherichia coli K12 grown in lysogeny broth medium and particularly focused on the size-selection step. We show that for conventional expression analysis, a size range between 22 and 30 nucleotides is sufficient to obtain protein-coding fragments, which has the advantage of removing many unwanted rRNA and tRNA reads. More specific analyses may require longer reads and a corresponding improvement in rRNA/tRNA depletion. There is no consensus about the appropriate sequencing depth for RIBO-Seq experiments in prokaryotes, and studies vary significantly in total read number. Our analysis suggests that 20 million reads that are not mapping to rRNA/tRNA are required for global detection of translated annotated genes. We also highlight the influence of drug-induced ribosome stalling, which causes bias at translation start sites. The resulting accumulation of reads at the start site may be especially useful for detecting weakly expressed genes. As different methods suit different questions, it may not be possible to produce a “one-size-fits-all” ribosome profiling data set. Therefore, experiments should be carefully designed in light of the scientific questions of interest. We propose some basic characteristics that should be reported with any new RIBO-Seq data sets. Careful attention to the factors discussed should improve prokaryotic gene detection and the comparability of ribosome profiling data sets.
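
The size-selection step described above can be sketched as a simple read-length filter. This is a minimal illustration that assumes inclusive bounds of 22 and 30 nucleotides and reads held as plain strings; the function name `size_select` is hypothetical and does not come from the study's pipeline.

```python
def size_select(reads, min_len=22, max_len=30):
    """Keep reads whose length falls within [min_len, max_len] nucleotides."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return [read for read in reads if min_len <= len(read) <= max_len]


if __name__ == "__main__":
    # Hypothetical reads of 21, 22, 30 and 31 nt; only the middle two are kept.
    reads = ["A" * 21, "C" * 22, "G" * 30, "T" * 31]
    kept = size_select(reads)
    print([len(read) for read in kept])
```

In practice such a filter would run on aligned or adapter-trimmed sequencing files rather than on in-memory strings, and the retained count would be compared with the sequencing depth discussed in the abstract.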


Research on online interactions during learning, which aims to better understand users' practices and to provide them with quality-oriented features, resources and services, is attracting a large community. As a result, interest in sharing educational data sets that capture users' interactions with e-learning systems has become a prominent topic. However, current systems that aggregate social and usage data about their users suffer from a series of weaknesses. In particular, they lack a common information model that would allow interaction data to be exchanged at a large scale. To tackle this issue, we propose in this paper a generic model able to federate heterogeneous context metadata and to facilitate their sharing and reuse. This framework has been successfully applied to several data sets provided by the research community, and thus gives access to a large data set that could help researchers increase the efficiency of existing learning analytics techniques and promote research and development of new algorithms and services on top of these data.


2016 ◽  
Vol 34 (2_suppl) ◽  
pp. 152-152
Author(s):  
Karthikeyan Perumal ◽  
Mahadev Potharaju

152 Background: To characterize intra-fraction and inter-fraction prostate motion, as tracked by X-ray images of implanted gold fiducials, during stereotactic radiotherapy with CyberKnife. Published data have analysed linear and angular intra-fraction and inter-fraction prostate motion among patients. We sought to quantify the same within each patient. Methods: Twenty-five patients with localized prostate cancer treated with CyberKnife radiosurgery between January 2013 and August 2015 were studied retrospectively. A data set comprises the deviations derived from X-ray images obtained between two consecutive couch motions. Results: Included in the analysis were 3926 data sets. A total of 210 non-coplanar fields were used per fraction. The mean total treatment time for all fields per fraction was 36.13 minutes. The detected and corrected movements overall were in a range of ± 10.1 mm in the linear directions (Right: mean 1.1±0.4 mm; Left: mean 1.0±0.6 mm; Superior: mean 0.7±0.3 mm; Inferior: mean 1.6±0.6 mm; Anterior: mean 1.6±0.7 mm; Posterior: mean 0.5±0.3 mm, with maximum (max) movement range of Right max 9.9±6.4 mm, Left max 7.1±3.4 mm, Superior max 8.6±5.4 mm, Inferior max 10.1±8.5 mm, Anterior max 9.2±6.5 mm, Posterior max 8.4±2.9 mm), and angular movements were in a range of ± 6.7 deg in all directions (Right Angle: mean 0.6±0.3 deg; Left Angle: mean 0.6±0.3 deg; Head Up (H-U): mean 1.3±0.6 deg; Head Down (H-D): mean 1.4±0.6 deg; Counter-Clockwise movement (CCW): mean 0.7±0.3 deg; Clockwise movement (CW): mean 0.5±0.3 deg, with max rotation range of Right angle max 2.4±2 deg, Left angle max 2.7±2 deg, H-U max 10.2±3.5 deg, H-D max 6.7±4.8 deg, CCW max 4±2.9 deg, CW max 2.8±2.4 deg). Inter-fraction prostate motion changed unpredictably in each patient. However, a unique observation is that a predictable pattern exists for intra-fraction prostate motion within a patient. Changes in linear or angular intra-fraction prostate motion in any direction are not erratic. Conclusions: Linear and rotational intra-fraction prostate motion in any direction has a predictable pattern, and any change is gradual and not erratic. The motion shows a secular trend during the course of treatment.


2018 ◽  
Vol 18 (2) ◽  
pp. 599-611 ◽  
Author(s):  
Marinella Passarella ◽  
Evan B. Goldstein ◽  
Sandro De Muro ◽  
Giovanni Coco

Abstract. We use genetic programming (GP), a type of machine learning (ML) approach, to predict the total and infragravity swash excursion using previously published data sets that have been used extensively in swash prediction studies. Three previously published works with a range of new conditions are added to this data set to extend the range of measured swash conditions. Using this newly compiled data set we demonstrate that an ML approach can reduce the prediction errors compared to well-established parameterizations and therefore it may improve coastal hazards assessment (e.g. coastal inundation). Predictors obtained using GP can also be physically sound and replicate the functionality and dependencies of previously published formulas. Overall, we show that ML techniques are capable of both improving predictability (compared to classical regression approaches) and providing physical insight into coastal processes.


2017 ◽  
Vol 3 (1) ◽  
Author(s):  
Dora Matzke ◽  
Alexander Ly ◽  
Ravi Selker ◽  
Wouter D. Weeda ◽  
Benjamin Scheibehenne ◽  
...  

Whenever parameter estimates are uncertain or observations are contaminated by measurement error, the Pearson correlation coefficient can severely underestimate the true strength of an association. Various approaches exist for inferring the correlation in the presence of estimation uncertainty and measurement error, but none are routinely applied in psychological research. Here we focus on a Bayesian hierarchical model proposed by Behseta, Berdyyeva, Olson, and Kass (2009) that allows researchers to infer the underlying correlation between error-contaminated observations. We show that this approach may also be applied to obtain the underlying correlation between uncertain parameter estimates as well as the correlation between uncertain parameter estimates and noisy observations. We illustrate the Bayesian modeling of correlations with two empirical data sets; in each data set, we first infer the posterior distribution of the underlying correlation and then compute Bayes factors to quantify the evidence that the data provide for the presence of an association.
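
The underestimation described above can be illustrated with the classical attenuation relation, in which the observed correlation equals the true correlation scaled by the square root of the product of the two measures' reliabilities. The sketch below uses that textbook formula, not the Bayesian hierarchical model the abstract discusses; the function names and the reliability values are assumptions for illustration.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation under classical measurement error."""
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must lie in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Classical correction: invert the attenuation relation."""
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must lie in (0, 1]")
    return observed_r / math.sqrt(reliability_x * reliability_y)


if __name__ == "__main__":
    # Hypothetical values: a true correlation of 0.8 measured with two
    # instruments that each have reliability 0.5.
    observed = attenuated_correlation(0.8, 0.5, 0.5)
    print(observed)
    print(disattenuated_correlation(observed, 0.5, 0.5))
```

The point correction gives a single number with no uncertainty attached, whereas the hierarchical approach in the abstract yields a posterior distribution for the underlying correlation.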


2014 ◽  
Vol 204 (2) ◽  
pp. 108-114 ◽  
Author(s):  
Elliott Rees ◽  
James T. R. Walters ◽  
Lyudmila Georgieva ◽  
Anthony R. Isles ◽  
Kimberly D. Chambert ◽  
...  

Background: A number of copy number variants (CNVs) have been suggested as susceptibility factors for schizophrenia. For some of these the data remain equivocal, and the frequency in individuals with schizophrenia is uncertain. Aims: To determine the contribution of CNVs at 15 schizophrenia-associated loci (a) using a large new data-set of patients with schizophrenia (n = 6882) and controls (n = 6316), and (b) combining our results with those from previous studies. Method: We used Illumina microarrays to analyse our data. Analyses were restricted to 520 766 probes common to all arrays used in the different data-sets. Results: We found higher rates in participants with schizophrenia than in controls for 13 of the 15 previously implicated CNVs. Six were nominally significantly associated (P<0.05) in this new data-set: deletions at 1q21.1, NRXN1, 15q11.2 and 22q11.2 and duplications at 16p11.2 and the Angelman/Prader–Willi Syndrome (AS/PWS) region. All eight AS/PWS duplications in patients were of maternal origin. When combined with published data, 11 of the 15 loci showed highly significant evidence for association with schizophrenia (P<4.1×10⁻⁴). Conclusions: We strengthen the support for the majority of the previously implicated CNVs in schizophrenia. About 2.5% of patients with schizophrenia and 0.9% of controls carry a large, detectable CNV at one of these loci. Routine CNV screening may be clinically appropriate given the high rate of known deleterious mutations in the disorder and the comorbidity associated with these heritable mutations.


2020 ◽  
Vol 76 (11) ◽  
pp. 1134-1144 ◽  
Author(s):  
Helen M. Ginn

Drug and fragment screening at X-ray crystallography beamlines has been a huge success. However, it is inevitable that more high-profile biological drug targets will be identified for which high-quality, highly homogenous crystal systems cannot be found. With increasing heterogeneity in crystal systems, the application of current multi-data-set methods becomes ever less sensitive to bound ligands. In order to ease the bottleneck of finding a well behaved crystal system, pre-clustering of data sets can be carried out using cluster4x after data collection to separate data sets into smaller partitions in order to restore the sensitivity of multi-data-set methods. Here, the software cluster4x is introduced for this purpose and validated against published data sets using PanDDA, showing an improved total signal from existing ligands and identifying new hits in both highly heterogenous and less heterogenous multi-data sets. cluster4x provides the researcher with an interactive graphical user interface with which to explore multi-data set experiments.


2012 ◽  
Vol 33 (1) ◽  
pp. 150-154 ◽  
Author(s):  
Uwe Fritz ◽  
Mario Vargas-Ramírez ◽  
Pavel Široký

We re-examine the phylogenetic position of Pelusios williamsi by merging new sequences with an earlier published data set of all Pelusios species, except the possibly extinct P. seychellensis, and the nine previously identified lineages of the closely allied genus Pelomedusa (2054 bp mtDNA, 2025 bp nDNA). Furthermore, we include new sequences of Pelusios broadleyi, P. castanoides, P. gabonensis and P. marani. Individual and combined analyses of the mitochondrial and nuclear data sets indicate that P. williamsi is sister to P. castanoides, as predicted by morphology. This provides evidence for the misidentification of GenBank sequences allegedly representing P. williamsi. Such mislabelled GenBank sequences contribute to continued confusion, because only the original submitter can revise their identification; an impractical procedure impeding the rectification of obvious mistakes. We recommend implementing another option for revising taxonomic identifications, paralleling the century-old best practice of natural history museums for new determinations of specimens. Within P. broadleyi, P. gabonensis and P. marani, there is only shallow genetic divergence, while some phylogeographic structuring is present in the wide-ranging species P. castaneus and P. castanoides.

