Can bioinformatics help in the identification of moonlighting proteins?

Sergio Hernández; Alejandra Calvo; Gabriela Ferragut; Luís Franco; Antoni Hermoso; Isaac Amela; Antonio Gómez; Enrique Querol; Juan Cedano

doi:10.1042/bst20140241

Biochemical Society Transactions ◽

10.1042/bst20140241 ◽

2014 ◽

Vol 42 (6) ◽

pp. 1692-1697 ◽

Cited By ~ 2

Author(s):

Sergio Hernández ◽

Alejandra Calvo ◽

Gabriela Ferragut ◽

Luís Franco ◽

Antoni Hermoso ◽

...

Keyword(s):

Correlation Analysis ◽

Structural Information ◽

Evolutionary Relationship ◽

Evolutionary Process ◽

Main Function ◽

Remote Homology ◽

Functional Sites ◽

Protein Protein Interaction ◽

Moonlighting Proteins ◽

Moonlighting Functions

Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein–protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place.
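
The abstract does not say how mutation correlation between amino acids is scored. As a minimal sketch, assuming mutual information between columns of a multiple sequence alignment as the score (a common choice, but not necessarily the authors'), co-varying positions could be flagged as follows. The toy alignment, the 0.5-bit threshold and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def column(alignment, i):
    """Return the residues found at column i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


def correlated_pairs(alignment, threshold=0.5):
    """List column pairs whose mutual information reaches the threshold."""
    length = len(alignment[0])
    pairs = []
    for i, j in combinations(range(length), 2):
        mi = mutual_information(column(alignment, i), column(alignment, j))
        if mi >= threshold:
            pairs.append((i, j, round(mi, 3)))
    return pairs


# Hypothetical toy alignment: columns 0 and 3 always change together.
toy_alignment = [
    "AKLD",
    "AKLD",
    "GKLE",
    "GRLE",
]
print(correlated_pairs(toy_alignment))
```

A real analysis would need many more sequences and a correction for phylogenetic background, which is one reason the approach applies only in restricted situations.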

Moonlighting in Mitosis: Analysis of the Mitotic Functions of Transcription and Splicing Factors

Cells ◽

10.3390/cells9061554 ◽

2020 ◽

Vol 9 (6) ◽

pp. 1554 ◽

Cited By ~ 2

Author(s):

Maria Patrizia Somma ◽

Evgeniya N. Andreyeva ◽

Gera A. Pavlova ◽

Claudia Pellacani ◽

Elisabetta Bucciarelli ◽

...

Keyword(s):

Evolutionary Process ◽

Splicing Factors ◽

Insufficient Data ◽

Primary Role ◽

Interphase Nuclei ◽

Protein Databases ◽

Moonlighting Proteins ◽

Bona Fide ◽

Cellular Compartments ◽

Moonlighting Functions

Moonlighting proteins can perform one or more additional functions besides their primary role. It has been posited that a protein can acquire a moonlighting function through a gradual evolutionary process, which is favored when the primary and secondary functions are exerted in different cellular compartments. Transcription factors (TFs) and splicing factors (SFs) control processes that occur in interphase nuclei and are strongly reduced during cell division, and are therefore in a favorable situation to evolve moonlighting mitotic functions. However, recently published moonlighting protein databases, which comprise almost 400 proteins, do not include TFs and SFs with secondary mitotic functions. We searched the literature and found several TFs and SFs with bona fide moonlighting mitotic functions, namely they localize to specific mitotic structure(s), interact with proteins enriched in the same structure(s), and are required for proper morphology and functioning of the structure(s). In addition, we describe TFs and SFs that localize to mitotic structures but cannot be classified as moonlighting proteins due to insufficient data on their biochemical interactions and mitotic roles. Nevertheless, we hypothesize that most TFs and SFs with specific mitotic localizations have either minor or redundant moonlighting functions, or are evolving towards the acquisition of these functions.

A Comparative Genomic and Phylogenetic Analysis of the Origin and Evolution of the CCN Gene Family

BioMed Research International ◽

10.1155/2019/8620878 ◽

2019 ◽

Vol 2019 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Kuan Hu ◽

Yiming Tao ◽

Juanni Li ◽

Zhuang Liu ◽

Xinyan Zhu ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Gene Family ◽

Evolutionary Relationship ◽

Evolutionary Process ◽

Structural Features ◽

Comparative Genomic ◽

Sequence Alignments ◽

Multiple Sequence ◽

Origin And Evolution ◽

Functional Sites

CCN gene family members have recently been identified as multifunctional regulators involved in diverse biological functions, especially in vascular and skeletal development. In the present study, a comparative genomic and phylogenetic analysis was performed to show the similarities and differences in structure and function of CCNs from different organisms and to reveal their potential evolutionary relationship. First, CCN homologs from different metazoan species were identified. Multiple sequence alignment, MEME analysis, and functional site prediction then showed that structural features are highly conserved among metazoan CCNs. A phylogenetic tree was then constructed, which showed that CCNs underwent extensive lineage-specific duplication and expansion during the evolutionary process. In addition, comparative analysis of the genomic organization and chromosomal neighborhood of the CCN genes indicated a clear orthologous relationship among the counterparts in these species. Finally, based on these results, a potential evolutionary scenario was proposed to summarize the origin and evolution of the CCN gene family.

CATH functional families predict functional sites in proteins

Bioinformatics ◽

10.1093/bioinformatics/btaa937 ◽

2020 ◽

Author(s):

Sayoni Das ◽

Harry M Scholes ◽

Neeladri Sen ◽

Christine Orengo

Keyword(s):

Functional Characterization ◽

Functional Site ◽

Training Data ◽

Supplementary Information ◽

Conserved Residues ◽

Functional Sites ◽

Protein Protein Interaction ◽

Evolutionary Features ◽

Functional Families

Motivation: Identification of functional sites in proteins is essential for functional characterization, variant interpretation and drug design. Several methods are available for predicting either a generic functional site, or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein–protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams). Results: FunSite’s prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed other publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found FunSite’s performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyze which structural and evolutionary features are most predictive for functional sites. Availability and implementation: https://github.com/UCL/cath-funsite-predictor. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

Trifunctional cross-linker for mapping protein-protein interaction networks and comparing protein conformational states

eLife ◽

10.7554/elife.12509 ◽

2016 ◽

Vol 5 ◽

Cited By ~ 56

Author(s):

Dan Tan ◽

Qiang Li ◽

Mei-Jun Zhang ◽

Chao Liu ◽

Chengying Ma ◽

...

Keyword(s):

Protein Interactions ◽

Structural Information ◽

Affinity Purification ◽

Protein Protein Interactions ◽

C Elegans ◽

Protein Protein Interaction ◽

Lysine Residues ◽

Spacer Arm ◽

Cross Linker ◽

Protein Nucleic Acid

To improve chemical cross-linking of proteins coupled with mass spectrometry (CXMS), we developed a lysine-targeted enrichable cross-linker containing a biotin tag for affinity purification, a chemical cleavage site to separate cross-linked peptides away from biotin after enrichment, and a spacer arm that can be labeled with stable isotopes for quantitation. By locating the flexible proteins on the surface of 70S ribosome, we show that this trifunctional cross-linker is effective at attaining structural information not easily attainable by crystallography and electron microscopy. From a crude Rrp46 immunoprecipitate, it helped identify two direct binding partners of Rrp46 and 15 protein-protein interactions (PPIs) among the co-immunoprecipitated exosome subunits. Applying it to E. coli and C. elegans lysates, we identified 3130 and 893 inter-linked lysine pairs, representing 677 and 121 PPIs. Using a quantitative CXMS workflow we demonstrate that it can reveal changes in the reactivity of lysine residues due to protein-nucleic acid interaction.

Developing a machine learning model to identify protein–protein interaction hotspots to facilitate drug discovery

PeerJ ◽

10.7717/peerj.10381 ◽

2020 ◽

Vol 8 ◽

pp. e10381

Author(s):

Rohit Nandakumar ◽

Valentin Dinu

Keyword(s):

Machine Learning ◽

Amino Acid ◽

Drug Discovery ◽

Structural Information ◽

Learning Model ◽

Protein Protein Interaction ◽

Drug Molecules ◽

Machine Learning Model ◽

Disease Associations ◽

History Of

Throughout the history of drug discovery, an enzymatic-based approach for identifying new drug molecules has been primarily utilized. Recently, protein–protein interfaces that can be disrupted by small molecules have been identified as potentially viable targets for certain diseases, such as cancer and human immunodeficiency virus infection. Existing studies computationally identify hotspots on these interfaces, with most models attaining accuracies of ~70%. Many studies do not effectively integrate information relating to amino acid chains and other structural information relating to the complex. Herein, (1) a machine learning model has been created and (2) its ability to integrate multiple features, such as those associated with amino-acid chains, has been evaluated to enhance the ability to predict protein–protein interface hotspots. Virtual drug screening analysis of a set of hotspots determined on the EphB2-ephrinB2 complex has also been performed. The model achieves an AUROC of 0.842, a sensitivity/recall of 0.833, and a specificity of 0.850. Virtual screening of a set of hotspots identified by the machine learning model developed in this study has identified potential medications to treat diseases caused by the overexpression of the EphB2-ephrinB2 complex, including prostate, gastric and colorectal cancers and melanoma, which are linked to EphB2 mutations. The efficacy of this model has been demonstrated through its ability to predict drug-disease associations previously identified in the literature, including cimetidine, idarubicin, and pralatrexate for these conditions. In addition, nadolol, a beta blocker, has also been identified in this study to bind to the EphB2-ephrinB2 complex, and the possibility of this drug treating multiple cancers is still relatively unexplored.
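
For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, specificity and AUROC are computed. The confusion-matrix counts are hypothetical and chosen only because they reproduce the reported 0.833 and 0.850; they are not the study's data, and the scores passed to `auroc` are likewise invented for illustration.

```python
def sensitivity(tp, fn):
    """True-positive rate: fraction of real hotspots that were predicted."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: fraction of non-hotspots correctly rejected."""
    return tn / (tn + fp)


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts, NOT taken from the paper.
tp, fn, tn, fp = 5, 1, 17, 3
print(round(sensitivity(tp, fn), 3), round(specificity(tn, fp), 3))

# Hypothetical classifier scores for hotspot and non-hotspot residues.
print(round(auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]), 3))
```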

Functional Importance of Hydrophobic Patches on the Ebola Virus VP35 IFN-Inhibitory Domain

Viruses ◽

10.3390/v13112316 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2316

Author(s):

Nodoka Kasajima ◽

Keita Matsuno ◽

Hiroko Miyamoto ◽

Masahiro Kajihara ◽

Manabu Igarashi ◽

...

Keyword(s):

Amino Acid ◽

Ebola Virus ◽

Structural Information ◽

Site Directed Mutagenesis ◽

Genome Replication ◽

Type I ◽

Multifunctional Protein ◽

Functional Sites ◽

Inhibitory Function ◽

Inhibitory Domain

Viral protein 35 (VP35) of Ebola virus (EBOV) is a multifunctional protein that mainly acts as a viral polymerase cofactor and an interferon antagonist. VP35 interacts with the viral nucleoprotein (NP) and double-stranded RNA for viral RNA transcription/replication and inhibition of type I interferon (IFN) production, respectively. The C-terminal portion of VP35, which is termed the IFN-inhibitory domain (IID), is important for both functions. To further identify critical regions in this domain, we analyzed the physical properties of the surface of VP35 IID, focusing on hydrophobic patches, which are expected to be functional sites that are involved in interactions with other molecules. Based on the known structural information of VP35 IID, three hydrophobic patches were identified on its surface and their biological importance was investigated using minigenome and IFN-β promoter-reporter assays. Site-directed mutagenesis revealed that some of the amino acid substitutions that were predicted to disrupt the hydrophobicity of the patches significantly decreased the efficiency of viral genome replication/transcription due to reduced interaction with NP, suggesting that the hydrophobic patches might be critical for the formation of a replication complex through the interaction with NP. It was also found that the hydrophobic patches were involved in the IFN-inhibitory function of VP35. These results highlight the importance of hydrophobic patches on the surface of EBOV VP35 IID and also indicate that patch analysis is useful for the identification of amino acid residues that directly contribute to protein functions.

Structure of the stress-related LHCSR1 complex determined by an integrated computational strategy

10.1101/2021.10.06.463383 ◽

2021 ◽

Author(s):

Ingrid Guarnetti Prandi ◽

Vladislav Sláma ◽

Cristina Pecorilla ◽

Lorenzo Cupellini ◽

Benedetta Mennucci

Keyword(s):

Structural Model ◽

Light Harvesting ◽

Structural Information ◽

Protein Complexes ◽

Complex Stress ◽

Main Function ◽

Light Harvesting Complexes ◽

Light Harvesting Complex ◽

Pigment Protein Complexes ◽

Computational Strategy

Light-harvesting complexes (LHCs) are pigment-protein complexes whose main function is to capture sunlight and transfer the energy to reaction centers of photosystems. In response to varying light conditions, LH complexes also play photoregulation and photoprotection roles. In algae and mosses, a sub-family of LHCs, Light-Harvesting complex stress related (LHCSR), is responsible for photoprotective quenching. Despite their functional and evolutionary importance, no direct structural information on LHCSRs is available that can explain their unique properties. In this work we propose a structural model of LHCSR1 from the moss P. patens, obtained through an integrated computational strategy that combines homology modeling, molecular dynamics, and multiscale quantum chemical calculations. The model is validated by reproducing the spectral properties of LHCSR1. Our model reveals the structural specificity of LHCSR1, as compared with the CP29 LH complex, and lays the basis for understanding photoprotective quenching in mosses.

Roles of the GA-mediated SPL Gene Family and miR156 in the Floral Development of Chinese Chestnut (Castanea mollissima)

International Journal of Molecular Sciences ◽

10.3390/ijms20071577 ◽

2019 ◽

Vol 20 (7) ◽

pp. 1577 ◽

Cited By ~ 4

Author(s):

Guosong Chen ◽

Jingtong Li ◽

Yang Liu ◽

Qing Zhang ◽

Yuerong Gao ◽

...

Keyword(s):

Target Genes ◽

Floral Development ◽

Binding Protein ◽

Expression Patterns ◽

Evolutionary Relationship ◽

Deciduous Tree ◽

Main Function ◽

Squamosa Promoter Binding Protein ◽

Spl Genes ◽

Castanea Mollissima

Chestnut (Castanea mollissima) is a deciduous tree species with major economic and ecological value that is widely used in the study of floral development in woody plants due to its monoecious and out-of-proportion characteristics. Squamosa promoter-binding protein-like (SPL) is a plant-specific transcription factor that plays an important role in floral development. In this study, a total of 18 SPL genes were identified in the chestnut genome, of which 10 have regions complementary to CmmiR156. An analysis of the phylogenetic tree of the squamosa promoter-binding protein (SBP) domains of the SPL genes of Arabidopsis thaliana, Populus trichocarpa, and C. mollissima divided these SPL genes into eight groups. The evolutionary relationship between poplar and chestnut in the same group was similar. A structural analysis of the protein-coding regions (CDSs) showed that the domains have the main function of SBP domains and that other domains also play an important role in determining gene function. The expression patterns of CmmiR156 and CmSPLs in different floral organs of chestnut were analyzed by real-time quantitative PCR. Some CmSPLs with similar structural patterns showed similar expression patterns, indicating that the gene structures determine the synergy of the gene functions. The application of gibberellin (GA) and its inhibitor (paclobutrazol, PP333) to chestnut trees revealed that these exert a significant effect on the number and length of the male and female chestnut flowers. GA treatment significantly increased CmmiR156 expression and thus significantly decreased the expression of its target genes, CmSPL6, CmSPL9 and CmSPL16, during floral bud development. This finding indicates that GA might indirectly affect the expression of some of the SPL target genes through miR156. In addition, RNA ligase-mediated rapid amplification of the 5′ cDNA ends (RLM-RACE) experiments revealed that CmmiR156 cleaves CmSPL9 and CmSPL16 at the 10th and 12th bases of the complementary region. These results lay an important foundation for further study of the biological function of CmSPLs in the floral development of C. mollissima.

Molecular Cloning and Sequence Analysis of the Gene Encoding Interferon Alpha of the Giant Panda (Ailuropoda melanoleuca)

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.195-196.370 ◽

2012 ◽

Vol 195-196 ◽

pp. 370-379

Author(s):

Yue Yi ◽

Cheng Dong Wang ◽

Zhi Wen Xu ◽

De Sheng Li ◽

Ling Zhu ◽

...

Keyword(s):

Interferon Alpha ◽

Giant Panda ◽

Mononuclear Cells ◽

Evolutionary Relationship ◽

Functional Study ◽

Antigenic Determinants ◽

Ailuropoda Melanoleuca ◽

Peripheral Blood Mononuclear ◽

Functional Sites ◽

Cell Functions

Interferon-alpha (IFN-α) is a cytokine with antiviral, immunomodulatory, and antiproliferative effects on cell functions. In this report, the cDNA for Ailuropoda melanoleuca interferon alpha was cloned from ConA-stimulated giant panda peripheral blood mononuclear cells (PBMCs) by RT-PCR. Bioinformatics analysis was performed to predict the characteristics of this gene. Sequencing revealed that the fragment was composed of 495 nucleotides, was intronless, and encoded a mature polypeptide of 164 amino acids with a molecular mass of 18.15 kDa. Analysis of the functional sites and antigenic determinants showed that this protein has 27 functional sites and 9 antigenic determinants and possesses typical characteristics of the interferon alpha, beta and delta family. Comparison with 10 corresponding IFN-α sequences revealed that the GpIFN-α gene has a close evolutionary relationship with mammalian IFN-α. A phylogenetic tree based on nucleotide sequences showed that giant panda, ferret, dog and cat clustered together and evolved into a distinct phylogenetic lineage. In conclusion, these data and results provide a basis for further functional study of Ailuropoda melanoleuca IFN-α.

X-Ray Cross-Correlation Analysis of Disordered Ensembles of Particles: Potentials and Limitations

Advances in Condensed Matter Physics ◽

10.1155/2013/959835 ◽

2013 ◽

Vol 2013 ◽

pp. 1-15 ◽

Cited By ~ 15

Author(s):

R. P. Kurta ◽

M. Altarelli ◽

I. A. Vartanyants

Keyword(s):

Correlation Analysis ◽

Disordered Systems ◽

Cross Correlation ◽

Structural Information ◽

Three Dimensional ◽

X Ray ◽

X Ray Scattering ◽

Cross Correlation Analysis ◽

2D And 3D ◽

Ray Scattering

Angular X-ray cross-correlation analysis (XCCA) is an approach to study the structure of disordered systems using the results of X-ray scattering experiments. In this paper we summarize recent theoretical developments related to the Fourier analysis of the cross-correlation functions. Results of our simulations demonstrate the application of XCCA to two- and three-dimensional (2D and 3D) disordered ensembles of particles. We show that the structure of a single particle can be recovered using X-ray data collected from a 2D disordered system of identical particles. We also demonstrate that valuable structural information about the local structure of 3D systems, inaccessible from a standard small-angle X-ray scattering experiment, can be resolved using XCCA.
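
To make the idea of Fourier-analysing an angular cross-correlation function concrete, here is a minimal sketch under strong simplifying assumptions: a single detector ring (both correlated intensities taken at the same scattering vector magnitude), a synthetic noise-free intensity with four-fold symmetry, and a plain discrete Fourier sum. The function names and the test pattern are hypothetical and are not taken from the paper.

```python
import cmath
import math


def angular_cross_correlation(intensity):
    """Average of I(phi) * I(phi + delta) over phi, for every shift delta."""
    n = len(intensity)
    return [
        sum(intensity[k] * intensity[(k + shift) % n] for k in range(n)) / n
        for shift in range(n)
    ]


def fourier_component(values, order):
    """Discrete angular Fourier coefficient of the given order."""
    n = len(values)
    return sum(
        v * cmath.exp(-1j * order * 2 * math.pi * k / n)
        for k, v in enumerate(values)
    ) / n


# Hypothetical ring intensity I(phi) = 1 + 0.5 * cos(4 * phi).
n_points = 360
ring = [
    1.0 + 0.5 * math.cos(4 * 2 * math.pi * k / n_points)
    for k in range(n_points)
]

ccf = angular_cross_correlation(ring)
spectrum = {order: abs(fourier_component(ccf, order)) for order in range(1, 9)}
dominant = max(spectrum, key=spectrum.get)
print(dominant, round(spectrum[dominant], 4))
```

In this toy case the only non-zero component of the correlation function sits at the order matching the symmetry of the pattern, and its magnitude equals the squared magnitude of the corresponding Fourier coefficient of the intensity.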
