mUSP: a high-accuracy map of the in situ crosstalk of ubiquitylation and SUMOylation proteome predicted via the feature enhancement approach

Author(s):  
Hao-Dong Xu ◽  
Ru-Ping Liang ◽  
You-Gan Wang ◽  
Jian-Ding Qiu

Abstract Reversible post-translational modification (PTM) orchestrates various biological processes by changing the properties of proteins. Since many proteins are multiply modified by PTMs, identification of PTM crosstalk sites has emerged as an intriguing topic and attracted much attention. In this study, we systematically deciphered the in situ crosstalk of ubiquitylation and SUMOylation that co-occurs on the same lysine residue. We first collected 3363 ubiquitylation-SUMOylation (UBS) crosstalk sites on 1302 proteins and then investigated the prime sequence motifs, the local evolutionary degree and the distribution of structural annotations at the residue and sequence levels between the UBS crosstalk and the single modification sites. Given the properties of UBS crosstalk sites, we developed the mUSP classifier to predict UBS crosstalk sites by integrating different types of features with two-step feature optimization by a recursive feature elimination approach. Using various cross-validations, the mUSP model achieved an average area under the curve (AUC) value of 0.8416, indicating its promising accuracy and robustness. By comparison, mUSP performs significantly better, with AUC improvements of 38.41% and 51.48% over the cross-results of the previous single predictor. mUSP was implemented as a web server available at http://bioinfo.ncu.edu.cn/mUSP/index.html to facilitate the query of our high-accuracy UBS crosstalk results for experimental design and validation.

2021 ◽  
Vol 22 (5) ◽  
pp. 2704
Author(s):  
Andi Nur Nilamyani ◽  
Firda Nurul Auliah ◽  
Mohammad Ali Moni ◽  
Watshara Shoombuatong ◽  
Md Mehedi Hasan ◽  
...  

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role before biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the successive random forest (RF) probability scores generated by the different, single encoding-employing RF models. The resultant PredNTS predictor achieved an area under the curve (AUC) of 0.910 using five-fold cross-validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application with the curated datasets of PredNTS is publicly available.
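
As an illustration of one of the encodings named above, the following is a minimal sketch of a CKSAAP-style feature vector, assuming the common definition in which each feature is the frequency of a residue pair separated by k positions within a sequence window. The function name `cksaap`, the `k_max` parameter, and the toy window are hypothetical and are not taken from PredNTS.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(window, k_max=2):
    """Return k-spaced amino acid pair frequencies for a peptide window.

    Keys are (k, pair); values are counts normalised by the number of
    pair positions available for that spacing (assumed convention).
    """
    features = {}
    for k in range(k_max + 1):
        # One counter per ordered residue pair (20 x 20 = 400 per spacing).
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        n_pairs = len(window) - k - 1
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip gaps or non-standard residues
                counts[pair] += 1
        for pair, count in counts.items():
            features[(k, pair)] = count / n_pairs if n_pairs > 0 else 0.0
    return features


# Toy window, for illustration only.
feats = cksaap("AYAYA", k_max=1)
print(len(feats), feats[(0, "AY")], feats[(1, "AA")])
```

Each spacing contributes 400 features, so a vector built this way grows quickly with `k_max`, which is one reason a feature selection step such as recursive feature elimination is typically applied afterwards.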


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Nicholas Nuechterlein ◽  
Beibin Li ◽  
Abdullah Feroze ◽  
Eric C Holland ◽  
Linda Shapiro ◽  
...  

Abstract Background Combined whole-exome sequencing (WES) and somatic copy number alteration (SCNA) information can separate isocitrate dehydrogenase (IDH)1/2-wildtype glioblastoma into two prognostic molecular subtypes, which cannot be distinguished by epigenetic or clinical features. The potential for radiographic features to discriminate between these molecular subtypes has yet to be established. Methods Radiologic features (n = 35 340) were extracted from 46 multisequence, pre-operative magnetic resonance imaging (MRI) scans of IDH1/2-wildtype glioblastoma patients from The Cancer Imaging Archive (TCIA), all of whom have corresponding WES/SCNA data. We developed a novel feature selection method that leverages the structure of extracted MRI features to mitigate the dimensionality challenge posed by the disparity between a large number of features and the limited patients in our cohort. Six traditional machine learning classifiers were trained to distinguish molecular subtypes using our feature selection method, which was compared to least absolute shrinkage and selection operator (LASSO) feature selection, recursive feature elimination, and variance thresholding. Results We were able to classify glioblastomas into two prognostic subgroups with a cross-validated area under the curve score of 0.80 (±0.03) using ridge logistic regression on the 15-dimensional principal component analysis (PCA) embedding of the features selected by our novel feature selection method. An interrogation of the selected features suggested that features describing contours in the T2 signal abnormality region on the T2-weighted fluid-attenuated inversion recovery (FLAIR) MRI sequence may best distinguish these two groups from one another. Conclusions We successfully trained a machine learning model that allows for relevant targeted feature extraction from standard MRI to accurately predict molecularly-defined risk-stratifying IDH1/2-wildtype glioblastoma patient groups.


2009 ◽  
Vol 390 (2) ◽  
pp. 137-144 ◽  
Author(s):  
Yingmiao Liu ◽  
Chien-Tsun Kuan ◽  
Jing Mi ◽  
Xiuwu Zhang ◽  
Bryan M. Clary ◽  
...  

Abstract Epidermal growth factor receptor variant III (EGFRvIII) is a glycoprotein uniquely expressed in glioblastoma, but not in normal brain tissues. To develop targeted therapies for brain tumors, we selected RNA aptamers against the histidine-tagged EGFRvIII ectodomain, using an Escherichia coli system for protein expression and purification. Representative aptamer E21 has a dissociation constant (Kd) of 33 × 10⁻⁹ M, and exhibits high affinity and specificity for EGFRvIII in ELISA and surface plasmon resonance assays. However, selected aptamers cannot bind the same protein expressed from eukaryotic cells because glycosylation, a post-translational modification present only in eukaryotic systems, significantly alters the structure of the target protein. By transfecting EGFRvIII aptamers into cells, we find that membrane-bound, glycosylated EGFRvIII is reduced and the percentage of cells undergoing apoptosis is increased. We postulate that transfected aptamers can interact with newly synthesized EGFRvIII, disrupt proper glycosylation, and reduce the amount of mature EGFRvIII reaching the cell surface. Our work establishes the feasibility of disrupting protein post-translational modifications in situ with aptamers. This finding is useful for elucidating the function of proteins of interest with various modifications, as well as dissecting signal transduction pathways.


2021 ◽  
Vol 24 ◽  
Author(s):  
Anna Torres-Giménez ◽  
Alba Roca-Lecumberri ◽  
Bàrbara Sureda ◽  
Susana Andrés-Perpiña ◽  
Bruma Palacios-Hernández ◽  
...  

Abstract The aim of the present study was to validate the Spanish Postpartum Bonding Questionnaire (PBQ) against external criteria of bonding disorder, as well as to establish its test-retest reliability. One hundred fifty-six postpartum women consecutively recruited from a perinatal mental health outpatient unit completed the PBQ at 4–6 weeks postpartum. Four weeks later, all mothers completed the PBQ again and were interviewed using the Birmingham Interview for Maternal Mental Health to establish the presence of a bonding disorder. Receiver operating characteristic curve analysis revealed an area under the curve (AUC) value for the PBQ total score of 0.93, 95% CI [0.88, 0.98], with the optimal cut-off of 13 for detecting bonding disorders (sensitivity: 92%, specificity: 87%). Optimal cut-off scores for each scale were also obtained. The test-retest reliability coefficients were moderate to good. Our data confirm the validity of the PBQ for detecting bonding disorders in the Spanish population.
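
For readers unfamiliar with how a cut-off is read off a receiver operating characteristic analysis, the sketch below shows one common approach: scanning candidate cut-offs and keeping the one that maximises Youden's J (sensitivity + specificity - 1). The abstract does not state which criterion the authors used, so Youden's J is an assumption here, and the scores and labels are invented toy values, not study data.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, cutoff)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Invented questionnaire totals and interview outcomes (1 = disorder present).
toy_scores = [3, 5, 8, 10, 12, 14, 15, 18, 20, 25]
toy_labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff, sensitivity, specificity = best_cutoff(toy_scores, toy_labels)
print(cutoff, sensitivity, round(specificity, 3))
```

Other criteria, such as fixing a minimum sensitivity for a screening instrument, would generally select a different cut-off from the same curve.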


1999 ◽  
Vol 112 (12) ◽  
pp. 2019-2032 ◽  
Author(s):  
A.D. Minet ◽  
B.P. Rubin ◽  
R.P. Tucker ◽  
S. Baumgartner ◽  
R. Chiquet-Ehrismann

The Drosophila gene ten-m is the first pair-rule gene not encoding a transcription factor, but an extracellular protein. We have characterized a highly conserved chicken homologue that we call teneurin-1. The C-terminal part harbors 26 repetitive sequence motifs termed YD-repeats. The YD-repeats are most similar to the core of the rhs elements of Escherichia coli. Related repeats in toxin A of Clostridium difficile are known to bind specific carbohydrates. We show that recombinantly expressed proteins containing the YD-repeats of teneurin-1 bind to heparin. Furthermore, heparin lyase treatment of extracts of cells expressing recombinant YD-repeat protein releases this protein from high molecular mass aggregates. In situ hybridization and immunostaining reveal teneurin-1 expression in neurons of the developing visual system of chicken and Drosophila. This phylogenetic conservation of neuronal expression from flies to birds implies fundamental roles for teneurin-1 in neurogenesis. This is supported by the neurite outgrowth occurring on substrates made of recombinant YD-repeat proteins, which can be inhibited by heparin. Database searches resulted in the identification of ESTs encoding at least three further members of the teneurin family of proteins. Furthermore, the human teneurin-1 gene could be identified on chromosome Xq24/25, a region implicated in an X-linked mental retardation syndrome.


Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4406 ◽  
Author(s):  
Rafael Sola-Guirado ◽  
Sergio Bayano-Tejero ◽  
Antonio Rodríguez-Lizana ◽  
Jesús Gil-Ribes ◽  
Antonio Miranda-Fuentes

Canopy characterization has become important when trying to optimize any kind of agricultural operation in high-growing crops, such as olive. Many sensors and techniques have reported satisfactory results in these approaches and in this work a 2D laser scanner was explored for measuring canopy trees in real-time conditions. The sensor was tested in both laboratory and field conditions to check its accuracy, its cone width, and its ability to characterize olive canopies in situ. The sensor was mounted on a mast and tested in laboratory conditions to check: (i) its accuracy at different measurement distances; (ii) its measurement cone width with different reflectivity targets; and (iii) the influence of the target’s density on its accuracy. The field tests involved both isolated and hedgerow orchards, in which the measurements were taken manually and with the sensor. The canopy volume was estimated with a methodology consisting of revolving or extruding the canopy contour. The sensor showed high accuracy in the laboratory test, except for the measurements performed at 1.0 m distance, with 60 mm error (6%). Otherwise, error remained below 20 mm (1% relative error). The cone width depended on the target reflectivity. The accuracy decreased with the target density.
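
The revolving step mentioned above can be illustrated with a short sketch. It assumes the canopy contour is available as (height, radius) points measured from the trunk axis and treats the canopy as a stack of conical frustums rotated about that axis; the abstract does not describe the authors' exact discretisation, so the function name, the input layout, and the toy profile are all hypothetical.

```python
import math


def volume_of_revolution(profile):
    """Volume (m^3) obtained by revolving a contour about the vertical axis.

    profile: list of (height_m, radius_m) points sorted by height.
    Consecutive points are joined by straight segments, so each slice
    is a conical frustum with volume pi*h/3*(r0^2 + r0*r1 + r1^2).
    """
    volume = 0.0
    for (z0, r0), (z1, r1) in zip(profile, profile[1:]):
        height = z1 - z0
        volume += math.pi * height / 3.0 * (r0 * r0 + r0 * r1 + r1 * r1)
    return volume


# Toy contour: canopy widens to 1.5 m, stays constant, then closes at the top.
toy_profile = [(0.0, 0.0), (1.0, 1.5), (2.0, 1.5), (3.0, 0.0)]
canopy_volume = volume_of_revolution(toy_profile)
print(round(canopy_volume, 3))
```

For a hedgerow, the analogous extrusion would multiply the cross-sectional contour area by the row length instead of rotating it.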


Author(s):  
Dongying Ma ◽  
Ivo M. B. Francischetti ◽  
Jose M. C. Ribeiro ◽  
John F. Andersen

Secreted protein components of hookworm species include a number of representatives of the cysteine-rich/antigen 5/pathogenesis-related 1 (CAP) protein family known as Ancylostoma-secreted proteins (ASPs). Some of these have been considered as candidate antigens for the development of vaccines against hookworms. The functions of most CAP superfamily members are poorly understood, but one form, the hookworm platelet inhibitor (HPI), has been isolated as a putative antagonist of the platelet integrins αIIbβ3 and α2β1. Here, the crystal structure of HPI is described and its structural features are examined in relation to its possible function. The HPI structure is similar to those of other ASPs and shows incomplete conservation of the sequence motifs CAP1 and CAP2 that are considered to be diagnostic of CAP superfamily members. The asymmetric unit of the HPI crystal contains a dimer with an extensive interaction interface, but chromatographic measurements indicate that it is primarily monomeric in solution. In the dimeric structure, the putative active-site cleft areas from both monomers are united into a single negatively charged depression. A potential Lys-Gly-Asp disintegrin-like motif was identified in the sequence of HPI, but is not positioned at the apex of a tight turn, making it unlikely that it interacts with the integrin. Recombinant HPI produced in Escherichia coli was found not to inhibit the adhesion of human platelets to collagen or fibrinogen, despite having a native structure as shown by X-ray diffraction. This result corroborates previous analyses of recombinant HPI and suggests that it might require post-translational modification or have a different biological function.


Information ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 35
Author(s):  
Jibouni Ayoub ◽  
Dounia Lotfi ◽  
Ahmed Hammouch

The analysis of social networks has attracted a lot of attention during the last two decades. These networks are dynamic: links appear and disappear. Link prediction is the problem of inferring links that will appear in the future from the current state of the network. We use information from nodes and edges and calculate the similarity between users. The more similar two users are, the higher the probability that they will connect in the future. The similarity metrics play an important role in the link prediction field. Due to their simplicity and flexibility, many authors have proposed several metrics such as Jaccard, AA, and Katz and evaluated them using the area under the curve (AUC). In this paper, we propose a new parameterized method to enhance the AUC value of the link prediction metrics by combining them with the mean received resources (MRRs). Experiments show that the proposed method improves the performance of the state-of-the-art metrics. Moreover, we used machine learning algorithms to classify links and confirm the efficiency of the proposed combination.
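
The baseline metrics named above can be sketched in a few lines. The example below computes the Jaccard and Adamic-Adar (AA) similarity scores on a toy graph and evaluates them with the pairwise AUC commonly used in link prediction (the probability that a held-out link scores higher than a non-existent one, with ties counted as one half). It does not implement the paper's MRR-based combination, whose details are not given in the abstract; the graph, function names, and link lists are hypothetical.

```python
import math


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Shared neighbours divided by all neighbours of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over common neighbours (each has degree >= 2)."""
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def link_auc(score, adj, held_out, non_links):
    """Fraction of (held-out, non-link) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for m in held_out:
        for n in non_links:
            s_m, s_n = score(adj, *m), score(adj, *n)
            wins += 1.0 if s_m > s_n else 0.5 if s_m == s_n else 0.0
    return wins / (len(held_out) * len(non_links))


# Toy observed network; ("b", "d") is treated as a link hidden for testing.
adj = build_adjacency([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")])
held_out = [("b", "d")]
non_links = [("a", "e"), ("a", "d")]

auc_jaccard = link_auc(jaccard, adj, held_out, non_links)
auc_aa = link_auc(adamic_adar, adj, held_out, non_links)
print(auc_jaccard, auc_aa)
```

An AUC of 0.5 corresponds to random ranking, so improvements from combining metrics are usually reported relative to that baseline and to each metric used alone.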


2021 ◽  
Author(s):  
Baixing Chen ◽  
Shaoshuo Li ◽  
Zhaoqi Lu ◽  
Mingling Huang ◽  
Shi Lin ◽  
...  

Abstract Background: Staphylococcus aureus (S. aureus) is the most common pathogen that causes osteomyelitis (OM). However, the pathogenesis of OM is not clear and involves many factors such as environment, genetics and immune dysregulation. This study aims to explore the key genes involved in the pathogenesis and development of OM following S. aureus infection. Methods: After obtaining the datasets GSE6269 and GSE16129, we performed weighted gene co-expression network analysis (WGCNA) to find modules of highly correlated genes and used the recursive feature elimination (RFE) method to narrow the range of feature genes. To determine the effect of the feature genes, we constructed a random forest (RF) model with them and validated the predictive validity of the RF model using independent data from GSE11908. A protein-protein interaction (PPI) network was used to identify essential proteins that contribute to OM development. Results: The data comprised 12,401 genes from 77 samples: 48 S. aureus patients who developed OM and 29 who did not. WGCNA grouped the genes into 31 significant modules, and the brown module was significantly related to OM. Biological functions of the brown module were mainly enriched in the inflammatory response, metabolic, cancer and viral pathways, protein binding and RNA binding. After screening, 19 genes, including CYP2E1, BBS10, ARPC5L, GAPVD1, PURA, RBMS1, BTN2A2, EXOSC8, METTL8, FYCO1, KHK, PRPF38B, CD72, C2CD5, ABHD6, CD200, FAM53C, HCP5 and ELP1, were defined as feature genes for constructing the RF model. On validation with the external data, the average area under the curve was 85%, and the accuracy of the RF model was 85.7%. The protein functions of the modules were enriched in the catalytic component of the RNA exosome complex and the regulation of actin polymerization. Conclusions: This study aimed to identify related genes involved in the occurrence and development of OM. We constructed the RF model with 19 genes, which effectively classifies patients as OM or non-OM. Despite its limitations, the study adds to our understanding of the pathogenesis of OM and therefore has significant implications for potential therapeutic targets and for the prediction of OM.


2021 ◽  
Author(s):  
Kun-Lin Wu ◽  
Che-Yi Chou ◽  
An-Lun Li ◽  
Chien-Lung Chen ◽  
Jen-chieh Tsai ◽  
...  

Abstract Encapsulating peritoneal sclerosis (EPS) is a catastrophic complication of chronic peritoneal dialysis (PD). Late diagnosis is associated with high mortality. With the advancement of new diagnostic technologies, such as microRNA (miRNA), we attempted to develop a noninvasive test to assist in the diagnosis of EPS. The eight-hour PD effluents were collected from 71 non-EPS and 56 EPS patients. The screening set included 28 samples (20 of non-EPS vs. 8 of EPS). After analyzing the ratio values of two miRNA expression levels from the high-throughput real-time PCR-array of 377 miRNAs, eight candidate miRNAs were selected. The prediction model was conducted using 127 samples (71 of non-EPS vs 56 of EPS) to produce an area under the curve (AUC) value of the miRNA classifier. Candidate miRNAs were also verified by single real-time PCR. The ratios of the five miRNAs with the top five ROC values were selected to calculate the combined AUC by multiple logistic regression. The AUC value to detect EPS with the five miRNA ratios was 0.8929 with an accuracy of 78.7%. The accuracy of the EPS diagnosis was further optimized to 94.1% after considering clinical characteristics (AUC value 0.9931). A signature-based model of clinical characteristics and miRNA expression in PD effluents can efficiently assist in the diagnosis of EPS, thus preventing the catastrophic prognosis.

