Partitioning a PPI Network into Overlapping Modules Constrained by High-Density and Periphery Tracking

2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Md. Altaf-Ul-Amin ◽  
Masayoshi Wada ◽  
Shigehiko Kanaya

This paper presents an algorithm called DPClusO for partitioning simple graphs into overlapping modules, that is, clusters constrained by density and periphery tracking. The major advantages of DPClusO over the related, previously published algorithm DPClus are a shorter running time and guaranteed coverage, that is, each node is assigned to at least one module. DPClusO is a general-purpose clustering algorithm, useful for finding overlapping cohesive groups in a simple graph in any type of application. This work shows that the modules DPClusO generates from several yeast PPI networks under a high-density constraint match more known complexes than those of some other recently published complex-finding algorithms. Furthermore, the biological significance of the high-density modules is demonstrated by comparing their P values in the context of Gene Ontology (GO) terms with those of randomly generated modules having the same size, distribution, and zero density. As a consequence, it was also learned that a PPI network is a combination of mainly high-density and star-like modules.
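To make the density constraint concrete, the following is a minimal sketch of how the density of a candidate module can be computed in a simple graph, assuming the standard definition (edges present divided by edges possible). The function name `module_density`, the protein names, and the toy edge list are hypothetical. This is not the DPClusO implementation, which also tracks the periphery of a growing cluster.

```python
from itertools import combinations

def module_density(nodes, edges):
    """Density of the subgraph induced by `nodes`: present edges / possible edges."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # a single node has no possible edges
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    present = sum(
        1 for pair in combinations(nodes, 2) if frozenset(pair) in edge_set
    )
    return 2.0 * present / (n * (n - 1))

# Toy PPI network (hypothetical protein names).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F")]

triangle = module_density({"A", "B", "C"}, edges)   # fully connected module
star = module_density({"D", "C", "E", "F"}, edges)  # hub D with three leaves
```

Under this definition a clique has density 1.0, while a star with n nodes has density 2/n, which is why star-like modules fall below a high-density threshold as they grow.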

2020 ◽  
Vol 25 (1) ◽  
Author(s):  
Xue Jiang ◽  
Zhijie Xu ◽  
Yuanyuan Du ◽  
Hongyu Chen

Abstract Background Immunoglobulin A nephropathy (IgAN) is the most common primary glomerulopathy worldwide. However, the molecular events underlying IgAN remain to be fully elucidated. This study aimed to identify novel biomarkers of IgAN through bioinformatics analysis and to elucidate the possible molecular mechanism. Methods Based on the microarray datasets GSE93798 and GSE37460 downloaded from the Gene Expression Omnibus database, the differentially expressed genes (DEGs) between IgAN samples and normal controls were identified. Using the DEGs, we further performed a series of functional enrichment analyses. Protein–protein interaction (PPI) networks of the DEGs were constructed using the STRING online search tool and were visualized using Cytoscape. Next, hub genes and the most important module among the DEGs were identified, and the Biological Networks Gene Ontology tool (BiNGO) was used to elucidate the molecular mechanism of IgAN. Results In total, 148 DEGs were identified, comprising 53 upregulated genes and 95 downregulated genes. Gene Ontology (GO) analysis indicated that the DEGs for IgAN were mainly enriched in extracellular exosome, extracellular region and extracellular space, fibroblast growth factor stimulus, inflammatory response, and innate immunity. Module analysis showed that genes in the most significant module of the PPI network were mainly associated with innate immune response, integrin-mediated signaling pathway and inflammatory response. The top 10 hub genes identified in the PPI network distinguished the IgAN and control groups well in monocyte and tissue samples. We finally identified the integrin subunit beta 2 (ITGB2) and Fc fragment of IgE receptor Ig (FCER1G) genes, which may play important roles in the development of IgAN. Conclusions We identified key genes along with the pathways that were most closely related to IgAN initiation and progression. Our results provide a more detailed molecular mechanism for the development of IgAN and novel candidate gene targets of IgAN.


2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Jie Zhao ◽  
Xiujuan Lei

Abstract Background Protein complexes are the cornerstones of many biological processes; they assemble into various types of molecular machinery that perform a vast array of biological functions. In fact, a protein may belong to multiple protein complexes. Most existing protein complex detection algorithms cannot reflect overlapping protein complexes. To solve this problem, a novel algorithm for identifying overlapping protein complexes is proposed. Results In this paper, a new clustering algorithm based on an overlay network chain in quotient space, denoted ONCQS, is proposed to detect overlapping protein complexes in weighted PPI networks. In the quotient space, a multilevel overlay network is constructed by using maximal complete subgraphs to mine overlapping protein complexes. GO annotation data are used to weight the PPI network. The overlay network chain in quotient space is calculated according to the compatibility relation. The protein complexes are contained in the last level of the overlay network. Experiments were carried out on four PPI databases, and ONCQS was compared with five other state-of-the-art methods in the identification of protein complexes. Conclusions We applied ONCQS to four PPI databases (DIP, Gavin, Krogan and MIPS); the results show that it is superior to five existing algorithms (MCODE, MCL, CORE, ClusterONE and COACH) in detecting overlapping protein complexes.
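Since the abstract builds on maximal complete subgraphs (maximal cliques), a small sketch may help show why they naturally yield overlapping groups. The code below enumerates maximal cliques with the basic Bron-Kerbosch recursion on a toy network; the protein names and helper names are hypothetical, and this is only the clique-enumeration step, not the ONCQS algorithm itself.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

# Two triangles that share the protein P3 (hypothetical names).
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
         ("P3", "P4"), ("P3", "P5"), ("P4", "P5")]
cliques = maximal_cliques(build_adjacency(edges))
shared = frozenset.intersection(*cliques)  # proteins present in every clique
```

Here the two maximal cliques both contain P3, so a clique-based method can place one protein in more than one candidate complex.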


2014 ◽  
Vol 22 (03) ◽  
pp. 339-351 ◽  
Author(s):  
JIAWEI LUO ◽  
NAN ZHANG

Essential proteins are important for the survival and development of organisms. Many centrality algorithms based on network topology have been proposed to detect essential proteins, with good results. However, most of them focus only on network topology and ignore the false positive (FP) interactions in protein–protein interaction (PPI) networks. In this paper, gene ontology (GO) information is used to measure the reliability of the edges in a PPI network, and we propose a novel algorithm for identifying essential proteins, named EGC. The EGC algorithm integrates the topological characteristics of the PPI network with GO information. To validate its performance, we use EGC and nine other methods (DC, BC, CC, SC, EC, LAC, NC, PEC and CoEWC) to identify the essential proteins in two different yeast PPI networks: DIP and MIPS. The results show that EGC outperforms the other nine methods, which suggests that adding GO information can help in predicting essential proteins.
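The idea of combining topology with GO-based edge reliability can be illustrated with a small sketch. It assumes, purely for illustration, that edge reliability is the Jaccard similarity of the two proteins' GO term sets and that a protein's score is the sum of its incident edge weights; the abstract does not give the actual EGC formula, and the protein names and annotations below are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def go_weighted_degree(edges, go_terms):
    """Sum of GO-similarity weights over each protein's incident edges."""
    scores = {}
    for u, v in edges:
        w = jaccard(go_terms.get(u, ()), go_terms.get(v, ()))
        scores[u] = scores.get(u, 0.0) + w
        scores[v] = scores.get(v, 0.0) + w
    return scores

# Hypothetical annotations; the GO IDs serve only as placeholders.
go_terms = {
    "P1": {"GO:0006412", "GO:0003735"},
    "P2": {"GO:0006412", "GO:0003735"},
    "P3": {"GO:0006412"},
    "P4": {"GO:0016301"},
}
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]

scores = go_weighted_degree(edges, go_terms)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The edge P1-P4 contributes nothing because the two proteins share no GO terms, so a likely false positive interaction does not inflate P1's score the way it would under plain degree centrality.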


2017 ◽  
Author(s):  
Danila Vella ◽  
Simone Marini ◽  
Francesca Vitali ◽  
Riccardo Bellazzi

The increasing amount of -omics data has led to the development of models to interpret and analyse them. A common approach consists of representing data as PPI networks. These models can be very complex, and informatics tools are needed to analyse them. In this abstract, we present MTopGO, a module detection algorithm specific to PPI networks that exploits both the network topological information and the Gene Ontology (GO) knowledge about network proteins. The MTopGO output consists of a network partition in which each cluster is labelled with a specific GO term describing its biological nature. In a single step, MTopGO performs a double PPI network analysis: from a topological perspective, through the identification of a meaningful network partition, and from a biological perspective, through the selection of significant GO terms describing the biological role of network proteins.
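Labelling a cluster with a GO term usually rests on an enrichment statistic. The abstract does not say which one MTopGO uses, so the sketch below assumes the widely used hypergeometric test; the function name and the toy counts are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, cluster_size, hits):
    """P(X >= hits) when `cluster_size` proteins are drawn without replacement
    from `population` proteins, `annotated` of which carry the GO term."""
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Toy numbers: 10 proteins in the network, 3 annotated with the term,
# and a cluster of 3 proteins.
p_enriched = enrichment_p_value(10, 3, 3, 3)  # all 3 cluster members annotated
p_trivial = enrichment_p_value(10, 3, 3, 0)   # "at least zero hits" is certain
```

A small P value means the term is over-represented in the cluster relative to chance, which is the usual basis for choosing a descriptive GO label.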


2020 ◽  
Author(s):  
Ping Kong ◽  
Wei Liu

Abstract Background: Escherichia coli has been at the center of microbial research for decades, making it a standard microorganism for studying molecular mechanisms. Molecular complexes, operons and functional modules are important molecular functional domains of Escherichia coli. Most previous studies focused on detecting E. coli protein complexes with experimental methods, whereas studies that predict protein complexes, and especially functional modules, of E. coli from large-scale proteomic data are relatively few. Identifying protein complexes and functional modules of E. coli is crucial to reveal principles of cellular organization, processes and functions. Results: In this study, the protein complexes and functional modules of two high-quality binary interaction datasets of E. coli are each predicted by an efficient edge clustering algorithm (ELPA) for complex biological networks. According to the gold standard protein complexes and function annotations provided by the EcoCyc dataset, the experimental results show that most topological modules predicted in the two datasets match very well with the real protein complexes, cellular processes and biological functions. Analysis of the corresponding complexes and functional modules shows that all predicted protein complexes are fully covered by one or more functional modules. Furthermore, we compared the results of ELPA with those of a well-known node clustering algorithm (MCL) on the same PPI network of E. coli, and found that ELPA outperforms MCL in terms of matching with gold standard complexes. Conclusions: As a consequence, we surmise that the topological modules of a PPI network detected by ELPA fit well with real protein complexes and functional units. In most predicted topological modules, the protein complexes and corresponding functional modules are highly overlapping. ELPA is an effective tool to predict protein complexes and functional modules in PPI networks of E. coli.
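Comparing predicted modules against gold standard complexes requires a matching criterion. The abstract does not state which one was used, so the sketch below assumes a commonly used overlap score, |A ∩ B|² / (|A| · |B|), with a hypothetical threshold of 0.2; the function names and toy sets are illustrative only.

```python
def overlap_score(predicted, reference):
    """Overlap score |A & B|^2 / (|A| * |B|) between two protein sets."""
    predicted, reference = set(predicted), set(reference)
    if not predicted or not reference:
        return 0.0
    common = len(predicted & reference)
    return common * common / (len(predicted) * len(reference))

def matched_fraction(predicted_modules, reference_complexes, threshold=0.2):
    """Fraction of predicted modules matching at least one reference complex."""
    if not predicted_modules:
        return 0.0
    hits = sum(
        1
        for module in predicted_modules
        if any(overlap_score(module, c) >= threshold for c in reference_complexes)
    )
    return hits / len(predicted_modules)

# Toy data (hypothetical protein identifiers).
predicted = [{"a", "b", "c", "d"}, {"x", "y"}]
reference = [{"a", "b", "c"}, {"p", "q", "r"}]

best = overlap_score(predicted[0], reference[0])
fraction = matched_fraction(predicted, reference)
```

The score is 1.0 only for identical sets and penalises both missing and extra proteins, so two algorithms can be compared by the fraction of their modules that clear the threshold.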


2020 ◽  
Vol 21 (S16) ◽  
Author(s):  
Xiaoshi Zhong ◽  
Jagath C. Rajapakse

Abstract Background Protein–protein interaction (PPI) prediction is an important task towards the understanding of many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous PPI prediction studies do not consider the missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, namely missing PPI prediction and spurious PPI prediction, and propose a method that employs graph embeddings to learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to achieve the two tasks. Our method leverages information from both term–term relations among GO terms and term-protein annotations between GO terms and proteins, and preserves both local and global structural information of the GO annotation graph. Results We compare our method with methods based on information content (IC) and with one method based on word embeddings, in experiments on three PPI datasets from the STRING database. Experimental results demonstrate that our method is more effective than the compared methods. Conclusion Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for our defined missing and spurious PPI tasks.
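Once each protein has an embedding vector, the two tasks can be framed as scoring protein pairs. The sketch below assumes cosine similarity as the scoring function and uses made-up two-dimensional vectors and thresholds; the abstract does not specify how the embedded vectors are scored, so this is only one plausible reading.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors, 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical protein embeddings and one observed interaction.
embeddings = {"P1": (1.0, 0.0), "P2": (1.0, 0.0), "P3": (0.0, 1.0)}
observed = {("P1", "P3")}
candidates = [("P1", "P2"), ("P1", "P3")]

scores = {
    pair: cosine(embeddings[pair[0]], embeddings[pair[1]]) for pair in candidates
}

# Assumed thresholds: similar but unobserved pairs are candidate missing PPIs,
# dissimilar but observed pairs are candidate spurious PPIs.
missing = [p for p in candidates if p not in observed and scores[p] >= 0.9]
spurious = [p for p in candidates if p in observed and scores[p] <= 0.1]
```

In practice the thresholds would be tuned on held-out interactions rather than fixed by hand.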


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Kalifa Manjang ◽  
Shailesh Tripathi ◽  
Olli Yli-Harja ◽  
Matthias Dehmer ◽  
Frank Emmert-Streib

Abstract Gene ontology (GO) is an eminent knowledge base frequently used for providing biological interpretations for the analysis of genes or gene sets from biological, medical and clinical problems. Unfortunately, the interpretation of such results is challenging due to the large number of GO terms, their hierarchical and connected organization as directed acyclic graphs (DAGs) and the lack of tools that allow this structural information to be exploited explicitly. For this reason, we developed a package. The main features of the package are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs including visualization capabilities and (IV) prioritizing of GO-terms. The underlying idea of the package is to exploit a graph-theoretical perspective of GO as manifested by its DAG-structure and the hierarchy levels it contains for cumulating semantic information. That means all these features enhance the utilization of structural information of GO and complement existing analysis tools. Overall, the package provides exploratory as well as confirmatory tools for complementing any kind of analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our package is freely available from CRAN.
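The notion of hierarchy levels in a DAG can be made concrete with a short sketch. It assumes one possible convention, namely that a term's level is the length of the longest path from the root, and uses a hypothetical five-term DAG; it does not reproduce the package's own functions or its definition of levels.

```python
from functools import lru_cache

# Hypothetical DAG: each term maps to its list of parent terms.
parents = {
    "root": [],
    "t1": ["root"],
    "t2": ["root"],
    "t3": ["t1"],
    "t4": ["t2", "t3"],  # a term with two parents, as allowed in a DAG
}

@lru_cache(maxsize=None)
def level(term):
    """Level of a term: 0 for the root, else 1 + the deepest parent's level."""
    term_parents = parents[term]
    if not term_parents:
        return 0
    return 1 + max(level(p) for p in term_parents)

levels = {term: level(term) for term in parents}

# Structure-based ranking: deeper (more specific) terms first.
ranked = sorted(levels, key=lambda t: (-levels[t], t))
```

Because a term may have several parents, the choice between longest and shortest path changes its level (t4 would sit at level 2 under the shortest-path convention), so any ranking built on levels depends on which convention is adopted.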


2018 ◽  
Vol 14 (1) ◽  
pp. 4-10
Author(s):  
Fang Jing ◽  
Shao-Wu Zhang ◽  
Shihua Zhang

Background: Biological network alignment has been widely studied in the context of protein-protein interaction (PPI) networks, metabolic networks and others in bioinformatics. The topological structure of networks and genomic sequence are generally used by existing methods for achieving this task. Objective and Method: Here we briefly survey the methods generally used for this task and introduce a variant with incorporation of functional annotations based on similarity in Gene Ontology (GO). Making full use of GO information is beneficial to provide insights into precise biological network alignment. Results and Conclusion: We analyze the effect of incorporation of GO information to network alignment. Finally, we make a brief summary and discuss future directions about this topic.


2021 ◽  
Vol 104 (3) ◽  
pp. 003685042110180
Author(s):  
Xiao Lin ◽  
Meng Zhou ◽  
Zehong Xu ◽  
Yusheng Chen ◽  
Fan Lin

In this study, we aimed to screen out genes associated with a high risk of postoperative recurrence of lung adenocarcinoma and to investigate the possible mechanisms by which these genes are involved in recurrence. We identified hub genes and verified their expression levels and prognostic roles. The datasets GSE40791, GSE31210, and GSE30219 were obtained from the Gene Expression Omnibus database. Enrichment analyses of gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were performed for the screened candidate genes using the DAVID database. Then, we performed protein–protein interaction (PPI) network analysis through the STRING database. Hub genes were screened out using Cytoscape software, and their expression levels were determined with the GEPIA database. Finally, we assessed the relationship between hub gene expression levels and survival time. Forty-five candidate genes related to a high risk of lung adenocarcinoma recurrence were screened out. Gene ontology analysis showed that these genes were enriched in the mitotic spindle assembly checkpoint, mitotic sister chromosome segregation, G2/M-phase transition of the mitotic cell cycle, ATP binding, and other terms. KEGG analysis showed that these genes were involved predominantly in the cell cycle, p53 signaling pathway, and oocyte meiosis. We screened out the top ten hub genes related to high expression in lung adenocarcinoma from the PPI network. High expression levels of eight genes (TOP2A, HMMR, MELK, MAD2L1, BUB1B, BUB1, RRM2, and CCNA2) were related to short recurrence-free survival, and these genes can be used as biomarkers for a high risk of lung adenocarcinoma recurrence. This study screened out eight genes associated with a high risk of lung adenocarcinoma recurrence, which might provide novel insights into the recurrence mechanisms of lung adenocarcinoma as well as into the selection of targets in the treatment of the disease.

