Optimization of in silico tools for predicting genetic variants: individualizing for genes with molecular sub-regional stratification

2019 ◽  
Vol 21 (5) ◽  
pp. 1776-1786 ◽  
Author(s):  
Bin Tang ◽  
Bin Li ◽  
Liang-Di Gao ◽  
Na He ◽  
Xiao-Rong Liu ◽  
...  

Abstract: Genes are unique in functional role and differ in their sensitivity to genetic defects, which makes pathogenicity prediction difficult. This study attempted to improve the performance of existing in silico algorithms and to find a common solution based on an individualization strategy. We initiated the individualization with epilepsy-related SCN1A variants by sub-regional stratification. SCN1A missense variants related to epilepsy were retrieved from mutation databases, and benign missense variants were collected from the ExAC database. Predictions were performed using 10 traditional tools with stepwise optimizations. Model predictive ability was evaluated using five-fold cross-validation on variants of SCN1A, SCN2A, and KCNQ2. Additional validation was performed on SCN1A variants from damage-confirmed/familial epilepsy. The performance of commonly used predictors was unsatisfactory for SCN1A, with accuracy below 80%, and varied dramatically across the functional domains of Nav1.1. Multistep individualized optimizations, including cutoff resetting, domain-based stratification, and combination of prediction algorithms, significantly increased predictive performance. Similar improvements were obtained for variants in SCN2A and KCNQ2. The predictive performance of recently developed ensemble tools, such as Mendelian Clinically Applicable Pathogenicity (M-CAP), Combined Annotation-Dependent Depletion (CADD), and Eigen, was also improved dramatically by applying the strategy with molecular sub-regional stratification. The prediction scores of SCN1A variants showed linear correlations with the degree of functional defects and the severity of clinical phenotypes. This study highlights the need for individualized optimization with molecular sub-regional stratification for each gene in practice.
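The idea of resetting a tool's cutoff separately for each functional domain can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how cutoffs were chosen, so maximizing Youden's J is an assumption, and the domain names, the `(domain, score, label)` layout, and all scores are made up.

```python
from collections import defaultdict

def best_cutoff(scores, labels):
    """Return the cutoff that maximizes Youden's J = sensitivity + specificity - 1.

    A variant is called pathogenic when score >= cutoff.
    labels: 1 = pathogenic, 0 = benign.
    Choosing the cutoff by Youden's J is an assumption of this sketch.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    best_j, best_c = -1.0, None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_j, best_c = j, c
    return best_c

def cutoffs_by_domain(variants):
    """variants: iterable of (domain, score, label) tuples (hypothetical layout)."""
    grouped = defaultdict(lambda: ([], []))
    for domain, score, label in variants:
        grouped[domain][0].append(score)
        grouped[domain][1].append(label)
    return {d: best_cutoff(s, y) for d, (s, y) in grouped.items()}

# Made-up scores for two hypothetical domains, for illustration only.
toy = [
    ("pore", 0.9, 1), ("pore", 0.7, 1), ("pore", 0.6, 0), ("pore", 0.2, 0),
    ("linker", 0.5, 1), ("linker", 0.4, 1), ("linker", 0.3, 0), ("linker", 0.1, 0),
]
cutoffs = cutoffs_by_domain(toy)
print(cutoffs)
```

In this toy data a single global cutoff cannot separate both domains perfectly, whereas one cutoff per domain can, which is the intuition behind stratifying before resetting thresholds.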

2018 ◽  
Author(s):  
Karthik A. Jagadeesh ◽  
Joseph M. Paggi ◽  
James S. Ye ◽  
Peter D. Stenson ◽  
David N. Cooper ◽  
...  

Abstract: There are over 15,000 known variants that cause human inherited disease by disrupting RNA splicing. While several in silico methods such as CADD, EIGEN and LINSIGHT are commonly used to predict the pathogenicity of noncoding variants, we introduce S-CAP, a tool developed specifically for splicing that better distinguishes pathogenic splicing-relevant variants from benign ones. S-CAP is a novel splicing pathogenicity predictor that reduces the number of splicing-relevant variants of uncertain significance in patient exomes by 41%, a nearly 3-fold improvement over existing noncoding pathogenicity measures, while correctly classifying known pathogenic splicing-relevant variants with a clinical-grade 95% sensitivity.


2011 ◽  
Vol 6 (2) ◽  
pp. 185-198
Author(s):  
Alejandro J. Brea-Fernandez ◽  
Marta Ferro ◽  
Ceres Fernandez-Rozadilla ◽  
Ana Blanco ◽  
Laura Fachal ◽  
...  

Andrology ◽  
2021 ◽  
Author(s):  
Miriam Cerván‐Martín ◽  
Lara Bossini‐Castillo ◽  
Rocío Rivera‐Egea ◽  
Nicolás Garrido ◽  
Saturnino Luján ◽  
...  

2021 ◽  
Vol 31 (2) ◽  
pp. 148-158
Author(s):  
A. Yu. Voronkova ◽  
Yu. L. Melyanovskaya ◽  
N. V. Petrova ◽  
T. A. Adyan ◽  
E. K. Zhekaite ◽  
...  

The variety of clinical manifestations of cystic fibrosis is driven by the diversity of the CFTR gene nucleotide sequence. Descriptions of the clinical manifestations in patients with newly identified genetic variants are of particular interest. The aim of this study was to describe clinical manifestations of the disease with the newly identified genetic variants. Methods. Data from the Registry of patients with cystic fibrosis in the Russian Federation (2018) were used. The data review included three steps: the search for frequent mutations, Sanger sequencing, and the search for extensive rearrangements by MLPA. 38 pathogenic variants were identified that were not previously described in the international CFTR2 database. We selected and analyzed full case histories of 15 patients with 10 of those 38 pathogenic variants: p.Tyr84*, G1047S, 3321delG, c.583delC, CFTRdele13,14del18, CFTRdele19-22, c.2619+1G>A, c.743+2T>A, p.Glu1433Gly, and CFTRdel4-8del10-11. Results. The nonsense variant p.Tyr84* was found in 5 patients (0.08 %). Of the two missense variants, c.3139G>A was found in 2 siblings (0.03 %) and c.4298A>G in 1 patient. The other variants were detected in a single patient (0.02 %) each. They included two frameshift deletions, 3321delG and c.583delC; two splicing variants, c.2619+1G>A and c.743+2T>A; and three extended rearrangements, CFTRdele19-22, CFTRdele13,14del18, and CFTRdel4-8del10-11. The last two variants each comprise 2 rearrangements on one allele, which caused a severe course in two young children. 8 of the 10 variants were accompanied by pancreatic insufficiency (PI). Among patients with p.Tyr84*, one had ABPA, one had liver transplantation, and all had Pseudomonas aeruginosa infection. Nasal polyps were diagnosed in 2 patients with p.Tyr84*, 1 with G1047S, 1 with CFTRdel4-8del10-11, and 1 patient with 3321delG, who also had osteoporosis and cystic fibrosis-related diabetes (CFRD). Two patients with PI carrying the 3321delG and CFTRdel4-8del10-11 variants, and 1 patient with PI carrying the p.Glu1433Gly variant, had severe protein-energy malnutrition (PEM). Conclusion. Clinical manifestations of previously undescribed CFTR genetic variants were described. 5 of the 10 genetic variants should be attributed to class I, and 3 of the 10 to class VII, of the functional classification of pathogenic CFTR gene variants associated with transcription and translation disruptions. The class of the identified missense variants c.3139G>A and c.4298A>G has not been established and requires further functional, cultural, and molecular genetic studies.


Author(s):  
Eva–Maria Walz ◽  
Marlon Maranan ◽  
Roderick van der Linden ◽  
Andreas H. Fink ◽  
Peter Knippertz

Abstract: Current numerical weather prediction models show limited skill in predicting low-latitude precipitation. To aid future improvements, be it with better dynamical or statistical models, we propose a well-defined benchmark forecast. We use arguably the best currently available high-resolution, gauge-calibrated, gridded precipitation product, the Integrated Multi-Satellite Retrievals for GPM (Global Precipitation Measurement) (IMERG) “final run”, in a ±15-day window around the date of interest to build an empirical climatological ensemble forecast. This window size is an optimal compromise between statistical robustness and flexibility to represent seasonal changes. We refer to this benchmark as Extended Probabilistic Climatology (EPC) and compute it on a 0.1°×0.1° grid for 40°S–40°N and the period 2001–2019. In order to reduce and standardize information, a mixed Bernoulli-Gamma distribution is fitted to the empirical EPC, which hardly affects predictive performance. The EPC is then compared to 1-day ensemble predictions from the European Centre for Medium-Range Weather Forecasts (ECMWF) using standard verification scores. With respect to rainfall amount, ECMWF performs only slightly better than EPC over most of the low latitudes and worse over high-mountain and dry oceanic areas as well as over tropical Africa, where the lack of skill is also evident in independent station data. For rainfall occurrence, EPC is superior over most oceanic, coastal, and mountain regions, although the better potential predictive ability of ECMWF indicates that this is mostly due to calibration problems. To encourage the use of the new benchmark, we provide the data, scripts, and an interactive webtool to the scientific community.
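The construction of a climatological ensemble from a ±15-day window can be sketched for a single grid point as below. This is an illustrative sketch under stated assumptions, not the authors' code: the `obs` dictionary layout, the function names, the 365-day year (leap days ignored), and the synthetic data are all hypothetical, and the Bernoulli-Gamma fit mentioned in the abstract is not reproduced.

```python
def epc_ensemble(obs, day_of_year, half_window=15, days_in_year=365):
    """Collect every observation whose day of year lies within
    +/- half_window days of day_of_year, across all years.

    obs: dict mapping (year, day_of_year) -> daily precipitation amount.
    The window wraps around the turn of the year; leap days are ignored
    (a simplifying assumption of this sketch).
    """
    members = []
    for (year, doy), value in sorted(obs.items()):
        diff = abs(doy - day_of_year)
        if min(diff, days_in_year - diff) <= half_window:
            members.append(value)
    return members

def prob_of_rain(members, threshold=0.0):
    """Empirical probability that precipitation exceeds threshold."""
    return sum(1 for m in members if m > threshold) / len(members)

# Synthetic two-year record: rain only on days 100..130 of each year.
obs = {
    (year, doy): (1.0 if 100 <= doy <= 130 else 0.0)
    for year in (2001, 2002)
    for doy in range(1, 366)
}

wet = epc_ensemble(obs, 115)  # window covers days 100..130
dry = epc_ensemble(obs, 1)    # window covers days 351..365 and 1..16
print(len(wet), prob_of_rain(wet))
print(len(dry), prob_of_rain(dry))
```

Each ensemble has 31 days times the number of years as members, so a longer record directly yields a larger and more robust ensemble.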


2018 ◽  
Author(s):  
Gabrielle Wheway ◽  
Liliya Nazlamova ◽  
Nervine Meshad ◽  
Samantha Hunt ◽  
Nicola Jackson ◽  
...  

Abstract: At least six different proteins of the spliceosome, including PRPF3, PRPF4, PRPF6, PRPF8, PRPF31 and SNRNP200, are mutated in autosomal dominant retinitis pigmentosa (adRP). These proteins have recently been shown to localise to the base of the connecting cilium of the retinal photoreceptor cells, elucidating this form of RP as a retinal ciliopathy. In the case of loss-of-function variants in these genes, pathogenicity can easily be ascribed. In the case of missense variants, this is more challenging. Furthermore, the exact molecular mechanism of disease in this form of RP remains poorly understood. In this paper we take advantage of the recently published cryo-EM-resolved structure of the entire human spliceosome to predict the effect of a novel missense variant in one component of the spliceosome, PRPF31, found in a patient attending the genetics eye clinic at Bristol Eye Hospital. Monoallelic variants in PRPF31 are a common cause of adRP with incomplete penetrance. We use in vitro studies to confirm the pathogenicity of this novel variant, PRPF31 c.341T>A, p.Ile114Asn. This work demonstrates how in silico modelling of the structural effects of missense variants on cryo-EM-resolved protein complexes can contribute to predicting the pathogenicity of novel variants, in combination with in vitro and clinical studies. It is currently a considerable challenge to assign pathogenic status to missense variants in these proteins.


Circulation ◽  
2015 ◽  
Vol 131 (suppl_1) ◽  
Author(s):  
Camille Lassale ◽  
Yvonne Van der Schouw ◽  
Joline Beulens ◽  
Guy Fagherazzi ◽  
Nina Roswall ◽  
...  

Introduction: Diet quality indexes and lifestyle indexes (which also include other lifestyle characteristics such as smoking and obesity) have recently received increased attention in disease prevention. Hypothesis: We aimed to investigate the comparative predictive performance of a comprehensive list of dietary and lifestyle indexes in relation to cardiovascular disease (CVD) mortality. Methods: We applied these indexes to men and women from 10 European countries participating in the European Prospective Investigation into Cancer and Nutrition (EPIC) study and examined their association with 10-year CVD mortality risk. We computed 10 dietary indexes and 2 diet and lifestyle indexes and calculated quartiles. Cox proportional hazard models stratified by age and study centre, adjusted for age, BMI, energy intake, smoking status, physical activity and educational level, were fitted to estimate hazard ratios (HR) and 95% CI. Harrell’s C-statistic, a discrimination measure of predictive performance, was calculated for each model. Results: After 10 years of follow-up, 3761 CVD deaths were observed among 451 256 participants. All dietary indexes, except one, were significantly associated with CVD mortality, with HR ranging from 0.75 to 0.84 for the fully adjusted model when comparing the top vs bottom quartile (Table 1). Stronger effect sizes were observed for the diet and lifestyle indexes, particularly the Healthy Lifestyle Index (HLI). Discrimination of the full models was high and did not vary between scores. We found no heterogeneity in HRs across countries for most scores, except a modest heterogeneity for Mediterranean diet scores (I² = 48%) and HLI (75%); however, heterogeneity across countries of the C-statistics was high for all scores (I² = 87%).
Conclusion: Our results show that diet quality as a whole, or a cluster of lifestyle behaviours including diet, are consistently associated with a reduction of 10-year CVD mortality risk and that models comprising only age, sex and lifestyle risk factors could serve as predictors of CVD mortality.
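Harrell's C-statistic, the discrimination measure named in the Methods, can be illustrated with a small pairwise computation. This is a minimal sketch of the standard definition for right-censored data, assuming no tied event times; the function name and the four-subject data set are made up, and the study's actual software is not stated in the abstract.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    times:  follow-up time per subject
    events: 1 if the event (e.g. death) was observed, 0 if censored
    risk:   predicted risk score (higher = expected earlier event)

    A pair (i, j) is comparable when subject i had the event and a strictly
    shorter follow-up time than subject j. Ties in risk count as half
    concordant. Tied event times are simply skipped in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Made-up data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]

c_index = harrell_c(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of predicted risk against observed survival.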


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Vera G. Pshennikova ◽  
Nikolay A. Barashkov ◽  
Georgii P. Romanov ◽  
Fedor M. Teryutin ◽  
Aisen V. Solov’ev ◽  
...  

In silico predictive software allows assessing the effect of amino acid substitutions on the structure or function of a protein without conducting functional studies. The accuracy of in silico pathogenicity prediction tools has not been previously assessed for variants associated with autosomal recessive deafness 1A (DFNB1A). Here, we identify in silico tools with the most accurate clinical significance predictions for missense variants of the GJB2 (Cx26), GJB6 (Cx30), and GJB3 (Cx31) connexin genes associated with DFNB1A. To evaluate the accuracy of selected in silico tools (SIFT, FATHMM, MutationAssessor, PolyPhen-2, CONDEL, MutationTaster, MutPred, Align GVGD, and PROVEAN), we tested nine missense variants with previously confirmed clinical significance in a large cohort of deaf patients and control groups from the Sakha Republic (Eastern Siberia, Russia): Cx26: p.Val27Ile, p.Met34Thr, p.Val37Ile, p.Leu90Pro, p.Glu114Gly, p.Thr123Asn, and p.Val153Ile; Cx30: p.Glu101Lys; Cx31: p.Ala194Thr. We compared the performance of the in silico tools (accuracy, sensitivity, and specificity) on these missense variants in the GJB2 (Cx26), GJB6 (Cx30), and GJB3 (Cx31) genes. The correlation coefficient (r) and the area under the Receiver Operating Characteristic (ROC) curve (AUC) were used as alternative quality indicators of the tested programs. The resulting ROC curves demonstrated that the largest AUC was provided by three programs: SIFT (AUC = 0.833, p = 0.046), PROVEAN (AUC = 0.833, p = 0.046), and MutationAssessor (AUC = 0.833, p = 0.002). The most accurate predictions were given by two of the tested programs: SIFT and PROVEAN (Ac = 89%, Se = 67%, Sp = 100%, r = 0.75, AUC = 0.833). The results of this study may be applicable to the analysis of novel missense variants of the GJB2 (Cx26), GJB6 (Cx30), and GJB3 (Cx31) connexin genes.
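The accuracy, sensitivity, and specificity figures quoted for SIFT and PROVEAN can be reproduced from a confusion matrix. The sketch below is illustrative: the counts (2 true positives, 1 false negative, 6 true negatives, 0 false positives) are inferred from the reported percentages for nine variants and are not stated in the abstract, and the function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic / benign
    tn/fp: benign variants predicted benign / pathogenic
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Counts inferred from Ac = 89%, Se = 67%, Sp = 100% over nine variants;
# this split is an assumption, not a value given in the abstract.
metrics = classification_metrics(tp=2, fn=1, tn=6, fp=0)
percentages = {name: round(value * 100) for name, value in metrics.items()}
print(percentages)
```

With only nine variants, a single misclassification moves accuracy by about 11 percentage points, which is worth keeping in mind when comparing tools on such a small set.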


2019 ◽  
Vol 104 (6) ◽  
pp. e64.2-e64
Author(s):  
H-Y Shi ◽  
X Huang ◽  
Q Li ◽  
Y-E Wu ◽  
MW Khan ◽  
...  

Background: To evaluate the predictive ability of the existing formula for estimating free ceftriaxone levels in children, and to optimize the formula by adding disease and maturation factors. Methods: Fifty children receiving ceftriaxone were evaluated, and the predictive performance of the different equations was assessed by mean absolute error (MAE), mean prediction error (MPE) and linear regression of predicted vs. actual free levels. Results: The average free ceftriaxone concentration was 2.11 ± 9.51 µg/ml. The predicted free concentration was 1.15 ± 4.39 µg/ml with the in vivo binding equation, which increased to 1.58 ± 7.73 µg/ml and 2.01 ± 9.53 µg/ml when adjusted for age (disease adapted equation) and for age and albumin (disease maturation equation), respectively. The average MAE values were 0.48 (in vivo binding equation), 0.34 (disease adapted equation) and 0.41 (disease maturation equation). The average MPE values were -0.41 (in vivo binding equation), 0.14 (disease adapted equation) and 0.09 (disease maturation equation). The linear regression equations and coefficients of determination were y = 1.8647x + 1.0731 (R² = 0.7398), y = 1.1455x + 0.8414 (R² = 0.8674), and y = 0.9664x (R² = 0.8641) for the in vivo binding, disease adapted and disease maturation equations, respectively. Conclusion: Compared to the in vivo binding equation, the disease adapted and disease maturation equations showed lower MAE and smaller absolute MPE values, and the latter showed the MPE closest to zero. In addition, the slope of the disease maturation equation was closer to 1 than those of the other two. Therefore, the optimized disease maturation equation should be used to estimate free ceftriaxone levels in children. Disclosure(s): Nothing to disclose.
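The two error measures used to compare the equations can be sketched as follows. This is a minimal illustration: the abstract does not give its exact definitions, so taking MPE as the mean signed difference (predicted minus actual) is an assumption, and the concentrations are made-up values, not study data.

```python
def mean_absolute_error(predicted, actual):
    """Average magnitude of the prediction error (always >= 0)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mean_prediction_error(predicted, actual):
    """Average signed error, predicted minus actual (assumed definition).

    Negative values indicate under-prediction on average, positive values
    over-prediction; values near zero indicate low bias.
    """
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)

# Made-up free concentrations in µg/ml, for illustration only.
actual = [1.0, 2.0, 4.0]
predicted = [0.5, 2.5, 3.0]

mae = mean_absolute_error(predicted, actual)
mpe = mean_prediction_error(predicted, actual)
print(round(mae, 4), round(mpe, 4))
```

MAE captures precision while MPE captures bias, so an equation can have the MPE closest to zero without having the lowest MAE, as the reported values for the disease maturation equation show.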

