GP4: an integrated Gram-Positive Protein Prediction Pipeline for subcellular localization mimicking bacterial sorting

Author(s):  
Stefano Grasso ◽  
Tjeerd van Rij ◽  
Jan Maarten van Dijl

Abstract Subcellular localization is a critical aspect of protein function and the potential application of proteins either as drugs or drug targets, or in industrial and domestic applications. However, the experimental determination of protein localization is time-consuming and expensive. Therefore, various localization predictors have been developed for particular groups of species. Intriguingly, despite their major representation amongst biotechnological cell factories and pathogens, a meta-predictor based on sorting signals and specific for Gram-positive bacteria was still lacking. Here we present GP4, a protein subcellular localization meta-predictor mainly for Firmicutes, but also Actinobacteria, based on the combination of multiple tools, each specific for different sorting signals and compartments. Novel elements include improved cell-wall protein prediction, including differentiation of the type of interaction, prediction of non-canonical secretion pathway target proteins, separate prediction of lipoproteins and better user experience in terms of parsability and interpretability of the results. GP4 aims at mimicking protein sorting as it would happen in a bacterial cell. As GP4 is not homology-based, it has a broad applicability and does not depend on annotated databases with homologous proteins. Non-canonical usage may include little-studied or novel species, synthetic and engineered organisms, and even re-use of the prediction data to develop custom prediction algorithms. Our benchmark analysis highlights the improved performance of GP4 compared to other widely used subcellular protein localization predictors. A webserver running GP4 is available at http://gp4.hpc.rug.nl/

2020 ◽  
Vol 19 (7) ◽  
pp. 1076-1087 ◽  
Author(s):  
Georg H. H. Borner

Protein subcellular localization is an essential and highly regulated determinant of protein function. Major advances in mass spectrometry and imaging have allowed the development of powerful spatial proteomics approaches for determining protein localization at the whole cell scale. Here, a brief overview of current methods is presented, followed by a detailed discussion of organellar mapping through proteomic profiling. This relatively simple yet flexible approach is rapidly gaining popularity, because of its ability to capture the localizations of thousands of proteins in a single experiment. It can be used to generate high-resolution cell maps, and as a tool for monitoring protein localization dynamics. This review highlights the strengths and limitations of the approach and provides guidance to designing and interpreting profiling experiments.


eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Daniel N Itzhak ◽  
Stefka Tyanova ◽  
Jürgen Cox ◽  
Georg HH Borner

Subcellular localization critically influences protein function, and cells control protein localization to regulate biological processes. We have developed and applied Dynamic Organellar Maps, a proteomic method that allows global mapping of protein translocation events. We initially used maps statically to generate a database with localization and absolute copy number information for over 8700 proteins from HeLa cells, approaching comprehensive coverage. All major organelles were resolved, with exceptional prediction accuracy (estimated at >92%). Combining spatial and abundance information yielded an unprecedented quantitative view of HeLa cell anatomy and organellar composition, at the protein level. We subsequently demonstrated the dynamic capabilities of the approach by capturing translocation events following EGF stimulation, which we integrated into a quantitative model. Dynamic Organellar Maps enable the proteome-wide analysis of physiological protein movements, without requiring any reagents specific to the investigated process, and will thus be widely applicable in cell biology.


2021 ◽  
Author(s):  
Brianna Jayanthi ◽  
Bhagyashree Bachhav ◽  
Zengyi Wan ◽  
Santiago Martinez Legaspi ◽  
Laura Segatori

Abstract Mammalian cells process information through coordinated spatiotemporal regulation of proteins. Engineering cellular networks thus relies on efficient tools for regulating protein levels in specific subcellular compartments. To address the need to manipulate the extent and dynamics of protein localization, we developed a platform technology for target-specific control of protein destination. This platform is based on bifunctional molecules comprising a target-specific nanobody and universal sequences determining target subcellular localization or degradation rate. We demonstrate that nanobody-mediated localization depends on the expression level of the target and the nanobody, and the extent of target subcellular localization can be regulated by combining multiple target-specific nanobodies with distinct localization or degradation sequences. We also show that this platform for nanobody-mediated target localization and degradation can be regulated transcriptionally and integrated within orthogonal genetic circuits to achieve the desired temporal control over spatial regulation of target proteins. The platform reported in this study provides an innovative tool to control protein subcellular localization which will be useful to investigate protein function and regulate large synthetic gene circuits.


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Hafida Bouziane ◽  
Abdallah Chouarfia

Abstract To date, many proteins generated by large-scale genome sequencing projects are still uncharacterized and subject to intensive investigation by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for elucidating protein function. However, predicting it remains a challenging task, especially for multi-site proteins, which shuttle between cell compartments to perform their biological functions, and for proteins that lack significant homology to proteins of known subcellular location. Owing to their low cost and reasonable accuracy, machine learning-based methods have gained much attention in this context, helped by the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem using different protein sequence features pertaining to subcellular localization; however, the overwhelming majority of them focus on single localization and cover very few cellular locations. Prediction has traditionally been based on sorting signals, amino acid composition, and homology. To improve prediction quality, the focus is currently on knowledge extracted from annotation databases, such as protein–protein interactions and Gene Ontology (GO) functional domain annotation, which has recently become widely adopted and essential information for learning systems. To deal with this problem, in the present study we treated SCL prediction as a multi-label learning problem and tried to label both single-site and multi-site unannotated bacterial protein sequences by mining protein homology relationships using both GO terms of protein homologs and PSI-BLAST profiles. Experiments using 5-fold cross-validation on the benchmark datasets showed a significant improvement in the results obtained by the proposed consensus multi-label prediction model, which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.


2020 ◽  
Vol 49 (D1) ◽  
pp. D803-D808
Author(s):  
Wing Yin Venus Lau ◽  
Gemma R Hoad ◽  
Vivian Jin ◽  
Geoffrey L Winsor ◽  
Ashmeet Madyan ◽  
...  

Abstract Protein subcellular localization (SCL) is important for understanding protein function and for genome annotation, and aids identification of potential cell surface diagnostic markers, drug targets, or vaccine components. PSORTdb comprises ePSORTdb, a manually curated database of experimentally verified protein SCLs, and cPSORTdb, a pre-computed database of PSORTb-predicted SCLs for NCBI’s RefSeq deduced bacterial and archaeal proteomes. We now report PSORTdb 4.0 (http://db.psort.org/). It features a website refresh, in particular a more user-friendly database search. It also addresses the need to uniquely identify proteins from NCBI genomes now that GI numbers have been retired. It further expands both ePSORTdb and cPSORTdb, including additional data about novel secondary localizations, such as proteins found in bacterial outer membrane vesicles. Protein predictions in cPSORTdb have increased along with the number of available microbial genomes, from approximately 13 million when PSORTdb 3.0 was released, to over 66 million currently. Now, analyses of both complete and draft genomes are included. This expanded database will be of wide use to researchers developing SCL predictors or studying diverse microbes, including medically, agriculturally and industrially important species that have either classic or atypical cell envelope structures or vesicles.


Biomolecules ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 264
Author(s):  
Kaisa Liimatainen ◽  
Riku Huttunen ◽  
Leena Latonen ◽  
Pekka Ruusuvuori

Identifying the localization of proteins and their specific subpopulations associated with certain cellular compartments is crucial for understanding protein function and interactions with other macromolecules. Fluorescence microscopy is a powerful method to assess protein localizations, with increasing demand for automated high-throughput analysis methods to supplement the technical advancements in high-throughput imaging. Here, we study the applicability of deep neural network-based artificial intelligence in classification of protein localization in 13 cellular subcompartments. We use deep learning based on a convolutional neural network and a fully convolutional network with similar architectures for the classification task, aiming not only at accurate classification but, importantly, also at a comparison of the networks. Our results show that both types of convolutional neural networks perform well in protein localization classification tasks for major cellular organelles. Yet, in this study, the fully convolutional network outperforms the convolutional neural network in classification of images with multiple simultaneous protein localizations. We find that the fully convolutional network, using output visualizing the identified localizations, is a very useful tool for systematic protein localization assessment.


Cells ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1747
Author(s):  
Roya Yousefi ◽  
Kristina Jevdokimenko ◽  
Verena Kluever ◽  
David Pacheu-Grau ◽  
Eugenio F. Fornasiero

Protein homeostasis is an equilibrium of paramount importance that maintains cellular performance by preserving an efficient proteome. This equilibrium avoids the accumulation of potentially toxic proteins, which could lead to cellular stress and death. While the regulators of proteostasis are the machineries controlling protein production, folding and degradation, several other factors can influence this process. Here, we have considered two factors influencing protein turnover: the subcellular localization of a protein and its functional state. For this purpose, we used an imaging approach based on the pulse-labeling of 17 representative SNAP-tag constructs for measuring protein lifetimes. With this approach, we obtained precise measurements of protein turnover rates in several subcellular compartments. We also tested a selection of mutants modulating the function of three extensively studied proteins, the Ca2+ sensor calmodulin, the small GTPase Rab5a and the brain creatine kinase (CKB). Finally, we followed up on the increased lifetime observed for the constitutively active Rab5a (Q79L), and we found that its stabilization correlates with enlarged endosomes and increased interaction with membranes. Overall, our data reveal that both changes in protein localization and functional state are key modulators of protein turnover, and protein lifetime fluctuations can be considered to infer changes in cellular behavior.


2020 ◽  
Vol 11 ◽  
Author(s):  
Kenichiro Imai ◽  
Kenta Nakai

At the time of translation, nascent proteins are thought to be sorted into their final subcellular localization sites, based on parts of their amino acid sequences (i.e., sorting or targeting signals). Thus, it is interesting to computationally recognize these signals from the amino acid sequence of any given protein and to predict its final subcellular localization with such information, supplemented with additional information (e.g., k-mer frequency). This field has a long history and many prediction tools have been released. Even in this era of proteomic atlases at the single-cell level, researchers continue to develop new algorithms, aiming, for example, at assessing the impact of disease-causing mutations or cell type-specific alternative splicing. In this article, we overview the entire field and discuss its future direction.


2014 ◽  
Vol 70 (a1) ◽  
pp. C432-C432
Author(s):  
George Minasov ◽  
Salvatore Nocadello ◽  
Ekaterina Filippova ◽  
Andrei Halavaty ◽  
Wayne Anderson

The Center for Structural Genomics for Infectious Diseases (CSGID) applies structural genomics approaches to biomedically important proteins from human pathogens. It also provides the infectious disease community with a high-throughput pipeline for structure determination that carries out all steps of the process, from target selection through structure deposition. Target proteins include drug targets, essential enzymes, virulence factors and vaccine candidates. The CSGID has deposited over 680 structures in the Protein Data Bank. The proteins that are exposed on the surface of Gram-positive bacterial pathogens (including Staphylococcus aureus, Bacillus anthracis, Listeria monocytogenes, Streptococcus species and Clostridium species) have been one focus area for the CSGID. So far, the structures of more than 55 of these proteins have been determined. The surface proteins are important in the interactions between the pathogen and its host, but many of them are as yet functionally uncharacterized. Among the examples that will be presented is the Bacillus anthracis SpoIID protein. SpoIID is part of a coordinated cell wall degradation machine that is essential for sporulation and the morphological changes involved. It represents a new family of lytic transglycosylases that degrade the glycan strands of the peptidoglycan cell wall. The two active site clefts in the dimeric enzyme include residues from both subunits, suggesting that the dimer is required for activity. This project has been funded in whole or in part with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contracts No. HHSN272200700058C and HHSN272201200026C.


2020 ◽  
Vol 48 (W1) ◽  
pp. W72-W76 ◽  
Author(s):  
Vadim M Gumerov ◽  
Igor B Zhulin

Abstract Key steps in a computational study of protein function involve analysis of (i) relationships between homologous proteins, (ii) protein domain architecture and (iii) the gene neighborhoods in which the corresponding proteins are encoded. Each of these steps requires a separate computational task and set of tools. Currently, in order to relate protein features and gene neighborhood information to phylogeny, researchers need to prepare all the necessary data and combine them by hand, which is time-consuming and error-prone. Here, we present a new platform, TREND (tree-based exploration of neighborhoods and domains), which can perform all the necessary steps in an automated fashion and put the derived information into phylogenomic context, thus making evolution-based protein function analysis more efficient. A rich set of adjustable components allows users to run the computational steps specific to their task. TREND is freely available at http://trend.zhulinlab.org.

