Identification of potential biomarkers with colorectal cancer based on bioinformatics analysis and machine learning

2021 ◽  
Vol 18 (6) ◽  
pp. 8997-9015
Author(s):  
Ahmed Hammad ◽  
Mohamed Elshaer ◽  
Xiuwen Tang ◽  
...  

Colorectal cancer (CRC) is one of the most common malignancies worldwide. Biomarker discovery is critical to improving CRC diagnosis, and machine learning offers a new platform for studying the etiology of CRC for this purpose. Therefore, the current study aimed to perform integrated bioinformatics and machine learning analyses to explore novel biomarkers for CRC prognosis. In this study, we acquired gene expression microarray data from the Gene Expression Omnibus (GEO) database. The GSE103512 microarray expression dataset was downloaded and integrated. Subsequently, differentially expressed genes (DEGs) were identified and functionally analyzed via Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Furthermore, protein–protein interaction (PPI) network analysis was conducted using the STRING database and Cytoscape software to identify hub genes, which were then subjected to Support Vector Machine (SVM), receiver operating characteristic (ROC) curve and survival analyses to explore their diagnostic values. Meanwhile, TCGA transcriptomics data in the Gene Expression Profiling Interactive Analysis (GEPIA) database and the pathology data presented in the Human Protein Atlas (HPA) database were used to verify our transcriptomic analyses. A total of 105 DEGs were identified in this study. Functional enrichment analysis showed that these genes were significantly enriched in biological processes related to cancer progression. Thereafter, PPI network analysis identified a total of 10 significant hub genes. The ROC curve was used to predict the potential application of biomarkers in CRC diagnosis, with an area under the ROC curve (AUC) of these genes exceeding 0.92, suggesting that this risk classifier can discriminate between CRC patients and normal controls. Moreover, the prognostic values of these hub genes were confirmed by survival analyses using different CRC patient cohorts. Our results demonstrated that these 10 differentially expressed hub genes could be used as potential biomarkers for CRC diagnosis.
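
As a rough illustration of what an AUC such as the one reported above measures, the sketch below computes it as the fraction of tumor/normal sample pairs in which the tumor sample has the higher score, with ties counting as half. The function name `roc_auc` and the expression values are hypothetical and are not taken from the study; the authors' actual classifier and software are not described here.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive sample outscores
    a random negative sample (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for a single gene (illustration only).
expression = [8.1, 7.9, 7.4, 6.8, 5.2, 5.0, 6.9, 4.7]
is_tumor = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(expression, is_tumor)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.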

2021 ◽  
Author(s):  
Pejman Morovat ◽  
Saman Morovat ◽  
Arash M. Ashrafi ◽  
Shahram Teimourian

Abstract Hepatocellular carcinoma (HCC) is one of the most prevalent cancers worldwide, which has a high mortality rate and poor treatment outcomes with yet unknown molecular basis. It seems that gene expression plays a pivotal role in the pathogenesis of the disease. Circular RNAs (circRNAs) can interact with microRNAs (miRNAs) to regulate gene expression in various malignancies by acting as competitive endogenous RNAs (ceRNAs). However, the potential pathogenesis roles of the ceRNA network among circRNA/miRNA/mRNA in HCC are unclear. In this study, first, the HCC circRNA expression data were obtained from three Gene Expression Omnibus microarray datasets (GSE164803, GSE94508, GSE97332), and the differentially expressed circRNAs (DECs) were identified using R limma package. Also, the liver hepatocellular carcinoma (LIHC) miRNA and mRNA sequence data were retrieved from TCGA, and differentially expressed miRNAs (DEMIs) and mRNAs (DEGs) were determined using the R DESeq2 package. Second, CSCD website was used to uncover the binding sites of miRNAs on DECs. The DECs' potential target miRNAs were revealed by conducting an intersection between predicted miRNAs from CSCD and downregulated DEMIs. Third, some related genes were uncovered by intersecting targeted genes predicted by miRWalk and targetscan online tools with upregulated DEGs. The ceRNA network was then built using the Cytoscape software. The functional enrichment and the overall survival time of these potential targeted genes were analyzed, and a PPI network was constructed in the STRING database. Network visualization was performed by Cytoscape, and ten hub genes were detected using the CytoHubba plugin tool. Four DECs (hsa_circ_0000520, hsa_circ_0008616, hsa_circ_0070934, hsa_circ_0004315) were obtained and six miRNAs (hsa-miR-542-5p, hsa-miR-326, hsa-miR-511-5p, hsa-miR-195-5p, hsa-miR-214-3p, and hsa-miR-424-5p) which are regulated by the above DECs were identified. 
Then 543 overlapping genes regulated by the six miRNAs mentioned above were predicted. Functional enrichment analysis showed that these genes are mostly associated with cancer regulation functions. Ten hub genes (TTK, AURKB, KIF20A, KIF23, CEP55, CDC6, DTL, NCAPG, CENPF, PLK4) were screened from the PPI network of the 204 survival-related genes. KIF20A, NCAPG, TTK, PLK4, and CDC6 were selected for having the most significant p-values. In the end, a circRNA-miRNA-mRNA regulatory axis was established for the five final selected hub genes. This study implies the potential pathogenesis of the obtained network and proposes that the two DECs (hsa_circ_0070934 and hsa_circ_0004315) may be important prognostic factors for HCC.
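
The intersection step described in this abstract (predicted miRNAs kept only if they are also downregulated DEMIs) can be sketched with plain set operations. The helper name `candidate_mirnas` and the identifiers `circ_A`, `miR-1`, and so on are placeholders invented for illustration; they do not correspond to the circRNAs or miRNAs reported by the authors.

```python
def candidate_mirnas(predicted_by_circ, downregulated_demis):
    """For each circRNA, keep only the predicted miRNAs that are also
    present in the list of downregulated differentially expressed miRNAs."""
    down = set(downregulated_demis)
    return {
        circ: sorted(set(mirnas) & down)
        for circ, mirnas in predicted_by_circ.items()
    }


# Placeholder identifiers (illustration only, not study data).
predicted = {
    "circ_A": ["miR-1", "miR-2", "miR-3"],
    "circ_B": ["miR-3", "miR-4"],
}
downregulated = ["miR-2", "miR-3", "miR-5"]

targets = candidate_mirnas(predicted, downregulated)
# Flatten into circRNA -> miRNA edges for a network tool such as Cytoscape.
edges = [(circ, mir) for circ, mirs in targets.items() for mir in mirs]
```

The same pattern would apply to the third step, where predicted target genes are intersected with upregulated DEGs.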


2020 ◽  
Vol 40 (9) ◽  
Author(s):  
Lin Liao ◽  
Pinhu Liao

Abstract Background: Acute respiratory distress syndrome (ARDS) is caused by uncontrolled inflammation, and the activation of alveolar macrophages (AM) is involved in pathophysiologic procedures. The present study aimed to identify key AM genes and pathways and try to provide potential targets for prognosis and early intervention in ARDS. Methods: The mRNA expression profile of GSE89953 was obtained from the Gene Expression Omnibus database. The LIMMA package in R software was used to identify differentially expressed genes (DEGs), and the clusterProfiler package was used for functional enrichment and pathway analyses. A protein–protein interaction network of DEGs was constructed to identify hub genes via the STRING database and Cytoscape software. Hub gene expression was validated using differentially expressed proteins (DEPs) obtained from the ProteomeXchange datasets to screen potential biomarkers. Results: A total of 166 DEGs (101 up-regulated and 65 down-regulated) were identified. The up-regulated DEGs were mainly enriched in regulation of the ERK1 and ERK2 cascade, response to interferon-gamma, cell chemotaxis, and migration in biological processes. In the KEGG pathway analysis, up-regulated DEGs were mainly involved in rheumatoid arthritis, cytokine–cytokine receptor interactions, phagosome, and the chemokine signaling pathway. The 12 hub genes identified included GZMA, MPO, PRF1, CXCL8, ELANE, GZMB, SELL, APOE, SPP1, JUN, CD247, and CCL2. Conclusion: SPP1 was consistently differentially expressed in both DEGs and DEPs. SPP1 could be a potential biomarker for ARDS.


2020 ◽  
Author(s):  
Rongrong Xiao ◽  
Ping Wang ◽  
Tian Xia ◽  
Chun-Yi Li ◽  
Ting Jiang ◽  
...  

Abstract Background Tumor microenvironment plays important roles in the development of cancer. The aim of our study was to examine the expression of genes in colorectal cancer and to evaluate the association between the expression level of these genes and clinical features. Methods We combined The Cancer Genome Atlas (TCGA) datasets to identify differentially expressed genes in colon cancer. Using these differentially expressed genes, we constructed a protein-protein interaction (PPI) network and conducted functional enrichment analysis. Genes with degree beyond 10 in the PPI network were regarded as hub genes. Then, we verified the expression of these molecules in Oncomine datasets and conducted Kaplan-Meier curve analysis, log-rank tests and functional enrichment analysis on these hub genes. Finally, we analyzed the relationship between clinicopathological features and the key gene. Results There were 719 differentially expressed genes identified to be associated with colon cancer microenvironment. We screened out 10 hub genes by construction of the PPI network. The functions of these hub genes were enriched in cytokine-cytokine receptor interaction, alcoholism and systemic lupus erythematosus, which provided further insight into the roles of these genes in the tumor microenvironment. GNG4, with the highest degree in the PPI network, was highly expressed in metastasis (P = 9.5e-05), N1 (P = 0.0025) and N2 (P = 0.037). Its expression was also related to stage, differing significantly between stages I and IV, II and III, II and IV, and III and IV (P = 0.0015, 0.029, 3.9e-05, 0.00074, 0.01, respectively). Conclusions We found that GNG4 can be regarded as a prognostic biomarker in colon cancer.
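
The degree-based hub criterion stated in this abstract can be sketched as follows. The function name `hub_genes`, the reading of "beyond 10" as strictly greater than 10, and the toy edge list are all assumptions made for illustration; the study's actual PPI network is not reproduced here.

```python
from collections import defaultdict


def hub_genes(edges, min_degree=10):
    """Return genes whose number of distinct interaction partners is
    strictly greater than min_degree (assumed reading of 'beyond 10')."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(g for g, partners in neighbors.items() if len(partners) > min_degree)


# Toy network (illustration only): G0 has 11 partners, H0 has exactly 10.
toy_edges = [("G0", f"G{i}") for i in range(1, 12)]
toy_edges += [("H0", f"H{i}") for i in range(1, 11)]
hubs = hub_genes(toy_edges)
```

Storing partners in a set means duplicated edges from an exported interaction table do not inflate a gene's degree.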


2021 ◽  
Author(s):  
Feifei Liu ◽  
Yu Wang ◽  
Wenxue Li ◽  
Diancheng Li ◽  
Yuwei Xin ◽  
...  

Abstract Background: Colorectal cancer (CRC) is one of the most common malignancies of the digestive system, and its progression and prognosis are affected by a complicated network of genes and pathways. The aim of this study was to identify potential hub genes associated with the progression and prognosis of CRC. Methods: We obtained gene expression profiles from the GEO database to search for differentially expressed genes (DEGs) between CRC tissues and normal tissues. Subsequently, we conducted a functional enrichment analysis, generated a protein–protein interaction (PPI) network to identify the hub genes, and validated the expression of the hub genes. The Kaplan–Meier plotter survival analysis tool was used to evaluate the prognostic value of hub gene expression in CRC patients. Results: A total of 370 samples, comprising CRC and normal tissues, were included in this study. A total of 283 DEGs, including 62 upregulated genes and 221 downregulated genes between CRC and normal tissues, were selected. We finally identified 6 hub genes: INSL5, MT1M, GCG, SPP1, HSD11B2, and MAOB. In the TCGA-COAD database, the mRNA expression of INSL5, MT1M, HSD11B2 and MAOB was lower in tumor than in normal tissue, whereas the mRNA expression of SPP1 was higher in tumor than in normal tissue. In the HPA database, the expression of INSL5, GCG, HSD11B2 and MAOB was lower in tumor than in normal tissues, whereas the expression of SPP1 was higher in tumor than in normal tissues. Survival analysis revealed that INSL5, GCG, SPP1 and MT1M may serve as prognostic biomarkers in CRC. Conclusions: We screened out six hub genes to predict the occurrence and prognosis of patients with CRC using bioinformatics methods, which may provide new targets and ideas for diagnosis, prognosis and individualized treatment for CRC.
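
For readers unfamiliar with the survival curves behind a Kaplan–Meier analysis, the sketch below implements the standard product-limit estimator in plain Python. It is not the Kaplan–Meier plotter tool used in the study; the function name `kaplan_meier` and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Hypothetical follow-up data (illustration only).
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

In a prognostic analysis, one such curve would be estimated for the high-expression group and one for the low-expression group, and the two compared with a log-rank test.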


2019 ◽  
Vol 48 (5) ◽  
pp. 030006051988726
Author(s):  
Yuting Zhang ◽  
Bo Shen ◽  
Liya Zhuge ◽  
Yong Xie

Objective We aimed to identify differentially expressed genes (DEG) in patients with inflammatory bowel disease (IBD). Methods RNA-seq data were obtained from the Array Express database. DEG were identified using the edgeR package. A co-expression network was constructed and key modules with the highest correlation with IBD inflammatory sites were identified for analysis. The Cytoscape MCODE plugin was used to identify key sub-modules of the protein–protein interaction (PPI) network. The genes in the sub-modules were considered hub genes, and functional enrichment analysis was performed. Furthermore, we constructed a drug–gene interaction network. Finally, we visualized the hub gene expression pattern between the colon and ileum of IBD using the ggpubr package and analyzed it using the Wilcoxon test. Results DEG were identified between the colon and ileum of IBD patients. Based on the co-expression network, the green module had the highest correlation with IBD inflammatory sites. In total, 379 DEG in the green module were identified for the PPI network. Nineteen hub genes were differentially expressed between the colon and ileum. The drug–gene network identified these hub genes as potential drug targets. Conclusion Nineteen DEG were identified between the colon and ileum of IBD patients.


Author(s):  
Chengzhang Li ◽  
Jiucheng Xu

Background: Hepatocellular carcinoma (HCC) is a major threat to public health. However, few effective therapeutic strategies exist. We aimed to identify potential therapeutic target genes of HCC by analyzing three gene expression profiles. Methods: The gene expression profiles were analyzed with GEO2R, an interactive web tool for gene differential expression analysis, to identify common differentially expressed genes (DEGs). Functional enrichment analyses were then conducted, followed by construction of a protein-protein interaction (PPI) network with the common DEGs. The PPI network was employed to identify hub genes, and the expression level of the hub genes was validated by mining the Oncomine database. Survival analysis was carried out to assess the prognostic value of the hub genes in HCC patients. Results: A total of 51 common up-regulated DEGs and 201 down-regulated DEGs were obtained after gene differential expression analysis of the profiles. Functional enrichment analyses indicated that these common DEGs are linked to a series of cancer events. We finally identified 10 hub genes, six of which (OIP5, ASPM, NUSAP1, UBE2C, CCNA2, and KIF20A) are reported as novel HCC hub genes. Mining the Oncomine database validated that the hub genes have a significantly higher level of expression in HCC samples compared with normal samples (t-test, p < 0.05). Survival analysis indicated that overexpression of the hub genes is associated with a significant reduction (p < 0.05) in survival time in HCC patients. Conclusions: We identified six novel HCC hub genes that might be therapeutic targets for the development of drugs for some HCC patients.


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257343
Author(s):  
Shaoshuo Li ◽  
Baixing Chen ◽  
Hao Chen ◽  
Zhen Hua ◽  
Yang Shao ◽  
...  

Objectives Smoking is a significant independent risk factor for postmenopausal osteoporosis, leading to genome variations in postmenopausal smokers. This study investigates potential biomarkers and molecular mechanisms of smoking-related postmenopausal osteoporosis (SRPO). Materials and methods The GSE13850 microarray dataset was downloaded from Gene Expression Omnibus (GEO). Gene modules associated with SRPO were identified using weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) analysis, and pathway and functional enrichment analyses. Feature genes were selected using two machine learning methods: support vector machine-recursive feature elimination (SVM-RFE) and random forest (RF). The diagnostic efficiency of the selected genes was assessed by gene expression analysis and receiver operating characteristic curve. Results Eight highly conserved modules were detected in the WGCNA network, and the genes in the module that was strongly correlated with SRPO were used for constructing the PPI network. A total of 113 hub genes were identified in the core network using topological network analysis. Enrichment analysis results showed that hub genes were closely associated with the regulation of RNA transcription and translation, ATPase activity, and immune-related signaling. Six genes (HNRNPC, PFDN2, PSMC5, RPS16, TCEB2, and UBE2V2) were selected as genetic biomarkers for SRPO by integrating the feature selection of SVM-RFE and RF. Conclusion The present study identified potential genetic biomarkers and provided a novel insight into the underlying molecular mechanism of SRPO.


2021 ◽  
Vol 11 ◽  
Author(s):  
Jiahuan Luo ◽  
Li Zhu ◽  
Ning Zhou ◽  
Yuanyuan Zhang ◽  
Lirong Zhang ◽  
...  

Background: Many studies on circular RNAs (circRNAs) have recently been published. However, the function of circRNAs in recurrent implantation failure (RIF) is unknown and remains to be explored. This study aims to determine the regulatory mechanisms of circRNAs in RIF. Methods: Microarray data of RIF circRNA (GSE147442), microRNA (miRNA; GSE71332), and messenger RNA (mRNA; GSE103465) were downloaded from the Gene Expression Omnibus (GEO) database to identify differentially expressed circRNA, miRNA, and mRNA. The circRNA–miRNA–mRNA network was constructed with Cytoscape 3.8.0 software, then the protein–protein interaction (PPI) network was constructed with the STRING database, and the hub genes were identified with the cytoHubba plug-in. The circRNA–miRNA–hub gene regulatory subnetwork was formed to understand the regulatory axis of hub genes in RIF. Finally, the Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the hub genes were performed with the clusterProfiler package in RStudio, and the Reactome Functional Interaction (FI) plug-in was used for Reactome analysis to comprehensively analyze the mechanism of hub genes in RIF. Results: A total of eight upregulated differentially expressed circRNAs (DECs), five downregulated DECs, 56 downregulated differentially expressed miRNAs (DEmiRs), 104 upregulated DEmiRs, 429 upregulated differentially expressed genes (DEGs), and 1,067 downregulated DEGs were identified regarding RIF. The miRNA response elements of 13 DECs were then predicted. Seven overlapping miRNAs were obtained by intersecting the predicted miRNAs and DEmiRs. Then, 56 overlapping mRNAs were obtained by intersecting the predicted target mRNAs of seven miRNAs with 1,496 DEGs. The circRNA–miRNA–mRNA network and PPI network were constructed through six circRNAs, seven miRNAs, and 56 mRNAs; and four hub genes (YWHAZ, JAK2, MYH9, and RAP2C) were identified. 
The circRNA–miRNA–hub gene regulatory subnetwork with nine regulatory axes was formed in RIF. Functional enrichment analysis and Reactome analysis showed that these four hub genes were closely related to the biological functions and pathways of RIF. Conclusion: The results of this study provide further understanding of the potential pathogenesis from the perspective of the circRNA-related competitive endogenous RNA network in RIF.


2021 ◽  
Author(s):  
Churen Zhang ◽  
Ruoran Sun

Abstract Background. Among the diseases of the oral mucosa, oral lichen planus (OLP) is characterized by chronic autoimmunity/autoinflammation. However, the etiology and pathogenesis of OLP remain poorly understood. The aim of this research was to identify the differentially expressed genes and their potentially interacting miRNAs in OLP to provide a possible explanation of the pathogenesis of OLP and to suggest therapeutic biomarkers. Methods. The OLP microarray dataset (GSE52130) was downloaded from the Gene Expression Omnibus (GEO) database. R software was used to identify differentially expressed genes (DEGs) between the OLP samples and normal oral mucosa. Functional enrichment analyses of DEGs, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses, were conducted. Protein–protein interaction (PPI) network analysis was performed in the STRING database. CytoHubba in the Cytoscape software was applied to determine the top 10 hub genes, whose related miRNAs were identified through the RNA Interactome Database. Results. Overall, 627 DEGs were identified in OLP samples, including 351 highly expressed genes and 276 lowly expressed genes. GO analysis indicated that epidermal differentiation was the most enriched term. For the KEGG pathway, the DEGs in OLP samples were mostly involved in Staphylococcus aureus infection, Estrogen signaling pathway, Serotonergic synapse and Histidine metabolism. The top 10 hub genes, including LOR, LCE3D, LCE3E, LCE1B, LCE2B, SPRR2E, SPRR2G, LCE2A, RPTN and CDSN, were identified from the PPI network. The miRNA hsa-miR-98-5p was regarded as the miRNA most likely to be involved in OLP.


2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Cheng Zhang ◽  
Bingye Zhang ◽  
Di Meng ◽  
Chunlin Ge

Abstract Background The incidence of cholangiocarcinoma (CCA) has risen in recent years, and it has become a significant health burden worldwide. However, the mechanisms underlying tumorigenesis and progression of this disease remain largely unknown. An increasing number of studies have demonstrated crucial biological functions of epigenetic modifications, especially DNA methylation, in CCA. The present study aimed to identify and analyze methylation-regulated differentially expressed genes (MeDEGs) involved in CCA tumorigenesis and progression by bioinformatics analysis. Methods The gene expression profiling dataset (GSE119336) and gene methylation profiling dataset (GSE38860) were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) and differentially methylated genes (DMGs) were identified using the limma packages of R and GEO2R, respectively. The MeDEGs were obtained by overlapping the DEGs and DMGs. Functional enrichment analyses of these genes were then carried out. Protein–protein interaction (PPI) networks were constructed using STRING and visualized in Cytoscape to determine hub genes. Finally, the results were verified based on The Cancer Genome Atlas (TCGA) database. Results We identified 98 hypermethylated, downregulated genes and 93 hypomethylated, upregulated genes after overlapping the DEGs and DMGs. These genes were mainly enriched in the biological processes of the cell cycle, nuclear division, xenobiotic metabolism, drug catabolism, and negative regulation of proteolysis. The top nine hub genes of the PPI network were F2, AHSG, RRM2, AURKB, CCNA2, TOP2A, BIRC5, PLK1, and ASPM. Moreover, the expression and methylation status of the hub genes were significantly altered in TCGA. 
Conclusions Our study identified novel methylation-regulated differentially expressed genes (MeDEGs) and explored their related pathways and functions in CCA, which may provide novel insights into a further understanding of methylation-mediated regulatory mechanisms in CCA.
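
The overlap step used to obtain MeDEGs (hypermethylated genes that are downregulated, and hypomethylated genes that are upregulated) can be sketched as a direction-aware intersection. The function name `methylation_regulated`, the dictionary layout, and the gene labels are placeholders chosen for illustration and do not reflect the study's data.

```python
def methylation_regulated(degs, dmgs):
    """Overlap expression and methylation calls by direction.

    degs -- mapping gene -> "up" or "down" (expression change)
    dmgs -- mapping gene -> "hyper" or "hypo" (methylation change)
    Returns (hypermethylated & downregulated, hypomethylated & upregulated).
    """
    hyper_down = sorted(
        g for g, direction in degs.items()
        if direction == "down" and dmgs.get(g) == "hyper"
    )
    hypo_up = sorted(
        g for g, direction in degs.items()
        if direction == "up" and dmgs.get(g) == "hypo"
    )
    return hyper_down, hypo_up


# Placeholder gene labels (illustration only, not study data).
degs = {"GENE_A": "down", "GENE_B": "up", "GENE_C": "up", "GENE_D": "down"}
dmgs = {"GENE_A": "hyper", "GENE_B": "hypo", "GENE_C": "hyper", "GENE_E": "hypo"}
hyper_down, hypo_up = methylation_regulated(degs, dmgs)
```

Genes whose methylation and expression change in the same direction (such as `GENE_C` here) are excluded, since they do not fit the expected inverse relationship.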

