GP4: an integrated Gram-Positive Protein Prediction Pipeline for subcellular localization mimicking bacterial sorting

Briefings in Bioinformatics, 2020. DOI: 10.1093/bib/bbaa302

Author(s): Stefano Grasso, Tjeerd van Rij, Jan Maarten van Dijl

Keyword(s): Subcellular Localization, Protein Function, Drug Targets, Protein Localization, Homologous Proteins, Gram Positive, Cell Factories, Sorting Signals, Protein Prediction, Improved Performance

Subcellular localization is a critical aspect of protein function and the potential application of proteins either as drugs or drug targets, or in industrial and domestic applications. However, the experimental determination of protein localization is time consuming and expensive. Therefore, various localization predictors have been developed for particular groups of species. Intriguingly, despite their major representation amongst biotechnological cell factories and pathogens, a meta-predictor based on sorting signals and specific for Gram-positive bacteria was still lacking. Here we present GP4, a protein subcellular localization meta-predictor mainly for Firmicutes, but also Actinobacteria, based on the combination of multiple tools, each specific for different sorting signals and compartments. Novel elements include improved cell-wall protein prediction, including differentiation of the type of interaction, prediction of non-canonical secretion pathway target proteins, separate prediction of lipoproteins and better user experience in terms of parsability and interpretability of the results. GP4 aims at mimicking protein sorting as it would happen in a bacterial cell. As GP4 is not homology based, it has a broad applicability and does not depend on annotated databases with homologous proteins. Non-canonical usage may include little-studied or novel species, synthetic and engineered organisms, and even re-use of the prediction data to develop custom prediction algorithms. Our benchmark analysis highlights the improved performance of GP4 compared to other widely used subcellular protein localization predictors. A webserver running GP4 is available at http://gp4.hpc.rug.nl/
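
The idea of combining several signal-specific tools into one localization call can be illustrated with a small decision cascade. The sketch below is not GP4's actual logic: the `SignalCalls` fields, the `assign_localization` function, the order of the checks and the category names are all assumptions made for illustration, standing in for the outputs of hypothetical upstream detectors.

```python
from dataclasses import dataclass


@dataclass
class SignalCalls:
    """Hypothetical yes/no calls from upstream sorting-signal detectors."""
    signal_peptide: bool = False    # any secretion signal peptide detected
    lipobox: bool = False           # lipoprotein motif at the cleavage site
    cell_wall_anchor: bool = False  # e.g. an LPXTG-like sortase motif
    tm_helices: int = 0             # transmembrane helices in the mature protein


def assign_localization(calls: SignalCalls) -> str:
    """Walk through the signals in a fixed order and return one compartment."""
    # Without a secretion signal the protein stays inside the cell,
    # unless it is predicted to span the membrane.
    if not calls.signal_peptide:
        return "membrane" if calls.tm_helices > 0 else "cytoplasm"
    # Exported proteins: check the more specific retention signals first.
    if calls.lipobox:
        return "lipoprotein"
    if calls.cell_wall_anchor:
        return "cell wall"
    if calls.tm_helices > 0:
        return "membrane"
    # Exported with no retention signal: assumed to be released.
    return "extracellular"


example = assign_localization(SignalCalls(signal_peptide=True, cell_wall_anchor=True))
```

A rule cascade like this needs no homologous sequences, which is the property the abstract emphasizes; a real pipeline would also have to handle conflicting or low-confidence calls from the individual tools.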

Organellar Maps Through Proteomic Profiling – A Conceptual Guide

Molecular & Cellular Proteomics, 2020, Vol 19 (7), pp. 1076-1087. DOI: 10.1074/mcp.r120.001971. Cited by ~4

Author(s): Georg H. H. Borner

Keyword(s): Mass Spectrometry, High Resolution, Subcellular Localization, Protein Function, Protein Localization, Proteomic Profiling, Protein Subcellular Localization, Single Experiment, Flexible Approach, Resolution Cell

Protein subcellular localization is an essential and highly regulated determinant of protein function. Major advances in mass spectrometry and imaging have allowed the development of powerful spatial proteomics approaches for determining protein localization at the whole cell scale. Here, a brief overview of current methods is presented, followed by a detailed discussion of organellar mapping through proteomic profiling. This relatively simple yet flexible approach is rapidly gaining popularity, because of its ability to capture the localizations of thousands of proteins in a single experiment. It can be used to generate high-resolution cell maps, and as a tool for monitoring protein localization dynamics. This review highlights the strengths and limitations of the approach and provides guidance to designing and interpreting profiling experiments.

Global, quantitative and dynamic mapping of protein subcellular localization

eLife, 2016, Vol 5. DOI: 10.7554/elife.16950. Cited by ~204

Author(s): Daniel N Itzhak, Stefka Tyanova, Jürgen Cox, Georg HH Borner

Keyword(s): Subcellular Localization, Protein Function, Cell Biology, Dynamic Capabilities, Protein Translocation, Protein Localization, Quantitative Model, Dynamic Mapping, Global Mapping, Control Protein

Subcellular localization critically influences protein function, and cells control protein localization to regulate biological processes. We have developed and applied Dynamic Organellar Maps, a proteomic method that allows global mapping of protein translocation events. We initially used maps statically to generate a database with localization and absolute copy number information for over 8700 proteins from HeLa cells, approaching comprehensive coverage. All major organelles were resolved, with exceptional prediction accuracy (estimated at >92%). Combining spatial and abundance information yielded an unprecedented quantitative view of HeLa cell anatomy and organellar composition, at the protein level. We subsequently demonstrated the dynamic capabilities of the approach by capturing translocation events following EGF stimulation, which we integrated into a quantitative model. Dynamic Organellar Maps enable the proteome-wide analysis of physiological protein movements, without requiring any reagents specific to the investigated process, and will thus be widely applicable in cell biology.

A platform for post-translational spatiotemporal control of cellular proteins

Synthetic Biology, 2021. DOI: 10.1093/synbio/ysab002

Author(s): Brianna Jayanthi, Bhagyashree Bachhav, Zengyi Wan, Santiago Martinez Legaspi, Laura Segatori

Keyword(s): Subcellular Localization, Protein Function, Mammalian Cells, Protein Localization, Genetic Circuits, Bifunctional Molecules, Protein Levels, Synthetic Gene Circuits, Spatiotemporal Regulation, Innovative Tool

Mammalian cells process information through coordinated spatiotemporal regulation of proteins. Engineering cellular networks thus relies on efficient tools for regulating protein levels in specific subcellular compartments. To address the need to manipulate the extent and dynamics of protein localization, we developed a platform technology for target-specific control of protein destination. This platform is based on bifunctional molecules comprising a target-specific nanobody and universal sequences determining target subcellular localization or degradation rate. We demonstrate that nanobody-mediated localization depends on the expression level of the target and the nanobody, and the extent of target subcellular localization can be regulated by combining multiple target-specific nanobodies with distinct localization or degradation sequences. We also show that this platform for nanobody-mediated target localization and degradation can be regulated transcriptionally and integrated within orthogonal genetic circuits to achieve the desired temporal control over spatial regulation of target proteins. The platform reported in this study provides an innovative tool to control protein subcellular localization which will be useful to investigate protein function and regulate large synthetic gene circuits.

Use of Chou’s 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment

Journal of Integrative Bioinformatics, 2020, Vol 0 (0). DOI: 10.1515/jib-2019-0091

Author(s): Hafida Bouziane, Abdallah Chouarfia

Keyword(s): Gene Ontology, Subcellular Localization, Protein Interactions, Protein Function, Large Scale, Biological Databases, Gram Positive, Essential Information, Gram Negative, Bacterial Proteins

To date, many proteins generated by large-scale genome sequencing projects are still uncharacterized and subject to intensive investigation by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for elucidating protein function. However, predicting it remains a challenging task, especially for multi-site proteins, which shuttle between cell compartments to perform their biological functions, and for proteins that lack significant homology to proteins of known subcellular location. Owing to their low cost and reasonable accuracy, machine learning-based methods have gained much attention in this context, helped by the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem using different protein sequence features pertaining to subcellular localization; however, the overwhelming majority of them focus on a single localization and cover very few cellular locations. Prediction has mainly relied on sorting signals, amino acid composition, and homology. To improve prediction quality, the focus is currently on knowledge extracted from annotation databases, such as protein–protein interactions and Gene Ontology (GO) functional domain annotation, which has recently become widely adopted and essential information for learning systems. In the present study, we treated SCL prediction as a multi-label learning problem and attempted to label both single-site and multi-site unannotated bacterial protein sequences by mining protein homology relationships using both the GO terms of protein homologs and PSI-BLAST profiles. Experiments using 5-fold cross-validation on the benchmark datasets showed a significant improvement in the results obtained by the proposed consensus multi-label prediction model, which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.

PSORTdb 4.0: expanded and redesigned bacterial and archaeal protein subcellular localization database incorporating new secondary localizations

Nucleic Acids Research, 2020, Vol 49 (D1), pp. D803-D808. DOI: 10.1093/nar/gkaa1095

Author(s): Wing Yin Venus Lau, Gemma R Hoad, Vivian Jin, Geoffrey L Winsor, Ashmeet Madyan, ...

Keyword(s): Subcellular Localization, Protein Function, Drug Targets, Cell Envelope, Membrane Vesicles, Outer Membrane Vesicles, Protein Subcellular Localization, Atypical Cell, Important Species, Archaeal Protein

Protein subcellular localization (SCL) is important for understanding protein function and for genome annotation, and it aids identification of potential cell surface diagnostic markers, drug targets, or vaccine components. PSORTdb comprises ePSORTdb, a manually curated database of experimentally verified protein SCLs, and cPSORTdb, a pre-computed database of PSORTb-predicted SCLs for NCBI’s RefSeq deduced bacterial and archaeal proteomes. We now report PSORTdb 4.0 (http://db.psort.org/). It features a website refresh, in particular a more user-friendly database search. It also addresses the need to uniquely identify proteins from NCBI genomes now that GI numbers have been retired. It further expands both ePSORTdb and cPSORTdb, including additional data about novel secondary localizations, such as proteins found in bacterial outer membrane vesicles. Protein predictions in cPSORTdb have increased along with the number of available microbial genomes, from approximately 13 million when PSORTdb 3.0 was released, to over 66 million currently. Analyses of both complete and draft genomes are now included. This expanded database will be of wide use to researchers developing SCL predictors or studying diverse microbes, including medically, agriculturally and industrially important species that have either classic or atypical cell envelope structures or vesicles.

Convolutional Neural Network-Based Artificial Intelligence for Classification of Protein Localization Patterns

Biomolecules, 2021, Vol 11 (2), pp. 264. DOI: 10.3390/biom11020264

Author(s): Kaisa Liimatainen, Riku Huttunen, Leena Latonen, Pekka Ruusuvuori

Keyword(s): Neural Network, Artificial Intelligence, Convolutional Neural Network, High Throughput, Protein Function, Protein Localization, Convolutional Network, High Throughput Analysis, Fully Convolutional Network

Identifying the localization of proteins and their specific subpopulations associated with certain cellular compartments is crucial for understanding protein function and interactions with other macromolecules. Fluorescence microscopy is a powerful method to assess protein localization, and there is increasing demand for automated high-throughput analysis methods to complement the technical advances in high-throughput imaging. Here, we study the applicability of deep neural network-based artificial intelligence to the classification of protein localization in 13 cellular subcompartments. We use deep learning based on a convolutional neural network and a fully convolutional network with similar architectures for the classification task, aiming at accurate classification but, importantly, also at a comparison of the networks. Our results show that both types of convolutional neural networks perform well in protein localization classification tasks for major cellular organelles. Yet, in this study, the fully convolutional network outperforms the convolutional neural network in classification of images with multiple simultaneous protein localizations. We find that the fully convolutional network, whose output visualizes the identified localizations, is a very useful tool for systematic protein localization assessment.

Influence of Subcellular Localization and Functional State on Protein Turnover

Cells, 2021, Vol 10 (7), pp. 1747. DOI: 10.3390/cells10071747

Author(s): Roya Yousefi, Kristina Jevdokimenko, Verena Kluever, David Pacheu-Grau, Eugenio F. Fornasiero

Keyword(s): Subcellular Localization, Functional State, Protein Turnover, Protein Localization, Small GTPase, Turnover Rates, Cellular Behavior, Two Factors, Brain Creatine Kinase, Selection Of

Protein homeostasis is an equilibrium of paramount importance that maintains cellular performance by preserving an efficient proteome. This equilibrium avoids the accumulation of potentially toxic proteins, which could lead to cellular stress and death. While the regulators of proteostasis are the machineries controlling protein production, folding and degradation, several other factors can influence this process. Here, we have considered two factors influencing protein turnover: the subcellular localization of a protein and its functional state. For this purpose, we used an imaging approach based on the pulse-labeling of 17 representative SNAP-tag constructs for measuring protein lifetimes. With this approach, we obtained precise measurements of protein turnover rates in several subcellular compartments. We also tested a selection of mutants modulating the function of three extensively studied proteins, the Ca2+ sensor calmodulin, the small GTPase Rab5a and the brain creatine kinase (CKB). Finally, we followed up on the increased lifetime observed for the constitutively active Rab5a (Q79L), and we found that its stabilization correlates with enlarged endosomes and increased interaction with membranes. Overall, our data reveal that both changes in protein localization and functional state are key modulators of protein turnover, and protein lifetime fluctuations can be considered to infer changes in cellular behavior.

Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences

Frontiers in Genetics, 2020, Vol 11. DOI: 10.3389/fgene.2020.607812

Author(s): Kenichiro Imai, Kenta Nakai

Keyword(s): Amino Acid, Subcellular Localization, Amino Acid Sequences, Additional Information, Sorting Signals, Specific Alternative, Cell Type Specific, Future Direction, The Impact, New Algorithms

At the time of translation, nascent proteins are thought to be sorted to their final subcellular localization sites based on parts of their amino acid sequences (i.e., sorting or targeting signals). Thus, it is of interest to computationally recognize these signals in the amino acid sequence of any given protein and to predict its final subcellular localization from that information, supplemented with additional information (e.g., k-mer frequency). This field has a long history, and many prediction tools have been released. Even in this era of proteomic atlases at the single-cell level, researchers continue to develop new algorithms, aiming, for example, at assessing the impact of disease-causing mutations or cell type-specific alternative splicing. In this article, we overview the entire field and discuss its future direction.
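
As a concrete picture of the k-mer frequency feature mentioned above, the following minimal sketch counts overlapping k-mers in a protein sequence and normalizes them to relative frequencies. The function name `kmer_frequencies` and the toy sequence are illustrative assumptions, not taken from any tool discussed in the review.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int = 2) -> dict[str, float]:
    """Return the relative frequency of each overlapping k-mer in a sequence."""
    seq = sequence.upper()
    # No k-mers exist if k is not positive or the sequence is shorter than k.
    if k <= 0 or len(seq) < k:
        return {}
    # Slide a window of width k along the sequence, one residue at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy example: the five dipeptides of a six-residue sequence.
freqs = kmer_frequencies("MKKLLA", k=2)
```

Such a frequency vector could be concatenated with signal-based features before being passed to a classifier; for amino acid sequences the number of possible k-mers grows as 20^k, so small values of k are the practical choice.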

Structures of the surface exposed proteins of Gram positive bacteria

Acta Crystallographica Section A Foundations and Advances, 2014, Vol 70 (a1), pp. C432-C432. DOI: 10.1107/s2053273314095679

Author(s): George Minasov, Salvatore Nocadello, Ekaterina Filippova, Andrei Halavaty, Wayne Anderson

Keyword(s): Cell Wall, Infectious Diseases, Bacillus Anthracis, Structural Genomics, Drug Targets, Morphological Changes, Target Selection, Surface Proteins, Human Pathogens, Gram Positive

The Center for Structural Genomics for Infectious Diseases (CSGID) applies structural genomics approaches to biomedically important proteins from human pathogens. It also provides the infectious disease community with a high throughput pipeline for structure determination that carries out all steps of the process, from target selection through structure deposition. Target proteins include drug targets, essential enzymes, virulence factors and vaccine candidates. The CSGID has deposited over 680 structures in the Protein Data Bank. The proteins that are exposed on the surface of Gram positive bacterial pathogens (including Staphylococcus aureus, Bacillus anthracis, Listeria monocytogenes, Streptococcus species and Clostridium species) have been one focus area for the CSGID. So far, the structures of more than 55 of these proteins have been determined. The surface proteins are important in the interactions between the pathogen and its host, but many of them are as yet functionally uncharacterized. Among the examples that will be presented is the Bacillus anthracis SpoIID protein. SpoIID is part of a coordinated cell wall degradation machine that is essential for sporulation and the morphological changes involved. It represents a new family of lytic transglycosylases that degrade the glycan strands of the peptidoglycan cell wall. The two active site clefts in the dimeric enzyme include residues from both subunits, suggesting that the dimer is required for activity. This project has been funded in whole or in part with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contracts No. HHSN272200700058C and HHSN272201200026C.

TREND: a platform for exploring protein function in prokaryotes based on phylogenetic, domain architecture and gene neighborhood analyses

Nucleic Acids Research, 2020, Vol 48 (W1), pp. W72-W76. DOI: 10.1093/nar/gkaa243. Cited by ~3

Author(s): Vadim M Gumerov, Igor B Zhulin

Keyword(s): Protein Function, Computational Study, Function Analysis, Domain Architecture, Protein Domain, Homologous Proteins, Gene Neighborhood, Protein Domain Architecture, Key Steps

Key steps in a computational study of protein function involve analysis of (i) relationships between homologous proteins, (ii) protein domain architecture and (iii) the gene neighborhoods in which the corresponding proteins are encoded. Each of these steps requires a separate computational task and set of tools. Currently, in order to relate protein features and gene neighborhood information to phylogeny, researchers need to prepare all the necessary data and combine them by hand, which is time-consuming and error-prone. Here, we present a new platform, TREND (tree-based exploration of neighborhoods and domains), which can perform all the necessary steps in an automated fashion and put the derived information into phylogenomic context, thus making evolution-based protein function analysis more efficient. A rich set of adjustable components allows users to run the computational steps specific to their task. TREND is freely available at http://trend.zhulinlab.org.
