KEGG-PATH: Kyoto encyclopedia of genes and genomes-based pathway analysis using a path analysis model

Junli Du; Zhifa Yuan; Ziwei Ma; Jiuzhou Song; Xiaoli Xie; Yulin Chen

doi:10.1039/c4mb00287c

78 Effects of in vitro production on the epigenome and transcriptome of bovine embryos determined through a multi-omics data integration approach

Reproduction Fertility and Development ◽

10.1071/rdv33n2ab78 ◽

2021 ◽

Vol 33 (2) ◽

pp. 147

Author(s):

M. Rabaglino ◽

J. B.-M. Secher ◽

P. Hyttel ◽

H. Kadarmideen

Keyword(s):

Survival Rates ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Gene Expression Omnibus ◽

Functional Enrichment ◽

Receptor Interaction ◽

P Value ◽

Kegg Pathways ◽

The Past ◽

Before And After

In cattle, ovarian superovulation followed by in vivo embryo collection and transfer (MOET), and the in vitro production (IVP) of embryos are used all over the world to improve animal genetics. Application of MOET has resulted in the production of billions of healthy animals during the past 40 years, and IVP has evolved and given rise to significant numbers of calves during the past 10 years. Nevertheless, the use of MOET and IVP can affect the embryo epigenome, and therefore its transcriptome, before and after elongation, as shown by different studies. The integration of publicly available epigenome-transcriptome datasets generated by these studies could lead to a robust characterisation of the impacts of the application of MOET and IVP. The goal of this study was to integrate all publicly available data about MOET and IVP embryos to determine temporally differentially methylated regions (DMRs) and differentially expressed genes (DEGs) from blastocyst to elongation between IVP and MOET embryos. Datasets were downloaded from the Gene Expression Omnibus (GEO) database. Accession numbers were (1) for epigenomics: GSE69173, GSE97517, and GSE101895, plus one provided dataset from O’Doherty et al. (2018 BMC Genomics, 19, 438; https://doi.org/10.1186/s12864-018-4818-3), all hybridized to the EDMA platform GPL18384; (2) for transcriptomics: GSE12327, GSE21030, GSE24596, GSE24936, GSE27817, and GSE40101, all hybridized to the Affymetrix platform GPL2112. Both types of data were analysed with the limma package for R software, and functional enrichment analysis was done with the DAVID database. For DMRs, comparisons between IVP and MOET were made from spherical blastocysts (n=16 per group) on Day 7, to embryos on Day 15, specifically in the trophectoderm (TE) or embryonic disc (ED) regions (n=4 per region and per group). For DEGs, comparisons between IVP and MOET were made from spherical blastocysts (n=9 per group) to elongated blastocysts on Day 13 and embryos undergoing gastrulation on Day 16 (n=6 per group). Considering a P-value <0.05 and fold-change >2, there were 16 672 (TE) and 26 264 (ED) DMRs and 2236 DEGs that temporally differed between IVP and MOET. Most of the identified DMRs were found in intronic regions (around 36%) rather than exonic regions (8%). However, DMRs that were more methylated in IVP than in MOET embryos contained exons of genes that enriched the Wnt signalling Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway in the ED, and the focal adhesion and ECM-receptor interaction KEGG pathways (P<0.05) in the TE. Accordingly, DEGs with lower expression in elongated embryos (Day 13 and Day 16) in IVP than in MOET were mainly associated with these three pathways. In conclusion, this multi-omics analysis demonstrates that even when embryos are produced under different conditions and experiments, the main changes imposed by IVP affected genes involved in embryonic development and adhesion to the endometrium, which could explain the lower survival rates in IVP compared with MOET.
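The selection rule quoted in this abstract (P-value < 0.05 and fold-change > 2) can be illustrated with a minimal sketch. The study itself used limma in R; the Python function below, its name `is_significant`, and the assumption that the fold-change cutoff applies symmetrically to increases and decreases (that is, |log2 fold-change| > 1) are illustrative choices, not details taken from the abstract.

```python
import math

def is_significant(p_value, log2_fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Return True when a feature passes both thresholds.

    Assumption: the fold-change cutoff is applied to the absolute
    log2 fold-change, so a 2-fold decrease counts like a 2-fold increase.
    """
    passes_p = p_value < p_cutoff
    passes_fc = abs(log2_fold_change) > math.log2(fc_cutoff)
    return passes_p and passes_fc

# Hypothetical (p-value, log2 fold-change) pairs, not data from the study.
candidates = {"region_a": (0.01, 1.5), "region_b": (0.01, 0.5), "region_c": (0.20, 3.0)}
selected = [name for name, (p, lfc) in candidates.items() if is_significant(p, lfc)]
```

Applying both cutoffs together, rather than the P-value alone, keeps features whose difference is both statistically supported and large enough to matter.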

VirusCircBase: a database of virus circular RNAs

Briefings in Bioinformatics ◽

10.1093/bib/bbaa052 ◽

2020 ◽

Cited By ~ 4

Author(s):

Zena Cai ◽

Yunshi Fan ◽

Zheng Zhang ◽

Congyu Lu ◽

Zhaozhong Zhu ◽

...

Keyword(s):

Influenza A ◽

Enrichment Analysis ◽

Computational Prediction ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Circular Rnas ◽

Specific Cell ◽

Sequencing Data ◽

Kegg Pathways ◽

Dsdna Viruses

Circular RNAs (circRNAs) are covalently closed long noncoding RNAs critical in diverse cellular activities and multiple human diseases. Several cancer-related viral circRNAs have been identified in double-stranded DNA (dsDNA) viruses, yet no systematic study of viral circRNAs has been reported. Herein, we have performed a systematic survey of 11 924 circRNAs from 23 viral species by computational prediction of viral circRNAs from viral-infection-related RNA sequencing data. Besides the dsDNA viruses, our study has also revealed many circRNAs in single-stranded RNA viruses and retro-transcribing viruses, such as the Zika virus, the Influenza A virus, the Zaire ebolavirus, and the Human immunodeficiency virus 1. Most viral circRNAs had reverse complementary sequences or repeated sequences in the sequences flanking the back-splice sites. Most viral circRNAs were expressed only in a specific cell line or tissue in a specific species. Functional enrichment analysis indicated that the viral circRNAs from dsDNA viruses were involved in KEGG pathways associated with cancer. All viral circRNAs presented in the current study were stored and organized in VirusCircBase, which is freely available at http://www.computationalbiology.cn/ViruscircBase/home.html and is the first virus circRNA database. VirusCircBase forms the fundamental atlas for the further exploration and investigation of viral circRNAs in the context of public health.

Genome-wide ChIPseq analysis of AhR, COUP-TF, and HNF4 enrichment in TCDD-treated mouse liver

10.1101/2021.06.18.448955 ◽

2021 ◽

Author(s):

Giovan N. Cholico ◽

Rance Nault ◽

Timothy R. Zacharewski

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Aryl Hydrocarbon Receptor ◽

Time Course ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Hepatocyte Differentiation ◽

Aryl Hydrocarbon ◽

Tcdd Treatment

The aryl hydrocarbon receptor (AhR) is a ligand-activated transcription factor known for mediating the toxicity of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) and related compounds. Although the canonical mechanism of AhR activation involves heterodimerization with the aryl hydrocarbon receptor nuclear translocator, other transcriptional regulators that interact with AhR have been identified. Enrichment analysis of motifs in AhR-bound genomic regions implicated co-operation with COUP transcription factor (COUP-TF) and hepatocyte nuclear factor 4 (HNF4). The present study investigated AhR, HNF4α and COUP-TFII genomic binding and effects on gene expression associated with liver-specific function and cell differentiation in response to TCDD. Hepatic ChIPseq data from male C57BL/6 mice at 2 hrs after oral gavage with 30 μg/kg TCDD were integrated with bulk RNA-sequencing (RNAseq) time-course (2 - 72 hrs) and dose-response (0.01 - 30 μg/kg) datasets to assess putative AhR, HNF4α and COUP-TFII interactions associated with differential gene expression. TCDD treatment resulted in the genomic enrichment of 23,701, 11,688, and 9,547 binding regions for AhR, COUP-TFII and HNF4α, respectively, throughout the genome. Functional enrichment analysis of differentially expressed genes (DEGs) identified differential binding enrichment for AhR, COUP-TFII, and HNF4α to regions within liver-specific genes, suggesting intersections associated with the loss of liver-specific functions and hepatocyte differentiation. Analysis found that the repression of liver-specific, HNF4α target and hepatocyte differentiation genes involved increased AhR and HNF4α binding with decreased COUP-TFII binding. Collectively, these results suggested that the TCDD-elicited loss of liver-specific functions and markers of hepatocyte differentiation involved interactions between AhR, COUP-TFII and HNF4α.

Prognostic Value of a Novel Signature With Nine Hepatitis C Virus-Induced Genes in Hepatic Cancer by Mining GEO and TCGA Databases

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.648279 ◽

2021 ◽

Vol 9 ◽

Author(s):

Jianming Wei ◽

Bo Wang ◽

Xibo Gao ◽

Daqing Sun

Keyword(s):

Hepatitis C ◽

High Risk ◽

Risk Group ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Hepatic Cancer ◽

Prognostic Signature ◽

Kegg Pathways ◽

Induced Genes

Background: Hepatitis C virus-induced genes (HCVIGs) play a critical role in regulating tumor development in hepatic cancer. The role of HCVIGs in hepatic cancer remains unknown. This study aimed to construct a prognostic signature and assess the value of the risk model for predicting the prognosis of hepatic cancer. Methods: Differentially expressed HCVIGs were identified in hepatic cancer data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases using the limma package for R. The protein–protein interaction (PPI) network was constructed using the Cytoscape software. Functional enrichment analysis was performed using the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Univariate and multivariate Cox proportional hazard regression analyses were applied to screen for prognostic HCVIGs. The signature of HCVIGs was constructed. Gene Set Enrichment Analysis (GSEA) compared the low-risk and high-risk groups. Finally, the International Cancer Genome Consortium (ICGC) database was used to validate this prognostic signature. Polymerase chain reaction (PCR) was performed to validate the expression of nine HCVIGs in the hepatic cancer cell lines. Results: A total of 143 differentially expressed HCVIGs were identified in the TCGA hepatic cancer dataset. Functional enrichment analysis showed that DNA replication was associated with the development of hepatic cancer. The risk score signature was constructed based on the expression of ZIC2, SLC7A11, PSRC1, TMEM106C, TRAIP, DTYMK, FAM72D, TRIP13, and CENPM. In this study, the risk score was an independent prognostic factor in the multivariate Cox regression analysis [hazard ratio (HR) = 1.433, 95% CI = 1.280–1.605, P < 0.001]. The overall survival curve revealed that the high-risk group had a poor prognosis. The Kaplan–Meier Plotter online database showed that the survival time of hepatic cancer patients with overexpression of HCVIGs in this signature was significantly shorter. The prognostic signature-associated GO and KEGG pathways were significantly enriched in the risk group. This prognostic signature was validated using external data from the ICGC databases. The expression of nine prognostic genes was validated in HepG2 and LO-2. Conclusion: This study evaluates a potential prognostic signature and provides a way to explore the mechanism of HCVIGs in hepatic cancer.
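As a reading aid for the reported hazard ratio (HR = 1.433, 95% CI = 1.280–1.605), the sketch below recovers the underlying Cox log-hazard coefficient and its standard error from the confidence limits. It assumes the interval is a Wald interval, symmetric on the log scale (exp(β ± 1.96·SE)), which is the usual convention but is not stated in the abstract; the function name `hr_summary` is hypothetical.

```python
import math

def hr_summary(hr_lower, hr_upper, z=1.96):
    """Recover (beta, standard error, hazard ratio) from a CI on the hazard ratio.

    Assumption: the CI was built as exp(beta +/- z * se), so it is
    symmetric around beta on the natural-log scale.
    """
    log_lo, log_hi = math.log(hr_lower), math.log(hr_upper)
    beta = (log_lo + log_hi) / 2.0          # midpoint on the log scale
    se = (log_hi - log_lo) / (2.0 * z)      # half-width divided by z
    return beta, se, math.exp(beta)

# Confidence limits as reported in the abstract.
beta, se, hr = hr_summary(1.280, 1.605)
```

Under that assumption the recovered point estimate agrees closely with the reported HR, which means each one-unit increase in the risk score multiplies the estimated hazard by roughly 1.43.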

Enrichment analysis on regulatory subspaces: a novel direction for the superior description of cellular responses to SARS-CoV-2

10.1101/2021.12.15.472466 ◽

2021 ◽

Author(s):

Pedro P. Rodrigues ◽

Rafael S. Costa ◽

Rui Henriques

Keyword(s):

Machine Learning ◽

Cell Lines ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Learning Approaches ◽

Transcriptional Responses ◽

Kegg Pathways ◽

Regulatory Processes

Statement: The enrichment analysis of discriminative cell transcriptional responses to SARS-CoV-2 infection using biclustering produces a broader set of superiorly enriched GO terms and KEGG pathways against alternative state-of-the-art machine learning approaches, unraveling novel knowledge. Motivation and methods: The comprehensive understanding of the impacts of the SARS-CoV-2 virus on infected cells is still incomplete. This work identifies and analyses the main cell regulatory processes affected and induced by SARS-CoV-2, using transcriptomic data from several infectable cell lines available in public databases and in vivo samples. We propose a new class of statistical models to handle three major challenges, namely the scarcity of observations, the high dimensionality of the data, and the complexity of the interactions between genes. Additionally, we analyse the function of these genes and their interactions within cells to compare them to ones affected by IAV (H1N1), RSV and HPIV3 in the target cell lines. Results: Gathered results show that, although clustering and predictive algorithms aid classic functional enrichment analysis, recent pattern-based biclustering algorithms significantly improve the number and quality of the detected biological processes. Additionally, a comparative analysis of these processes is performed to identify potential pathophysiological characteristics of COVID-19. These are further compared to those identified by other authors for the same virus as well as related ones such as SARS-CoV-1. This approach is particularly relevant due to a lack of other works utilizing more complex machine learning tools within this context.

Multiple-to-multiple path analysis model

PLoS ONE ◽

10.1371/journal.pone.0247722 ◽

2021 ◽

Vol 16 (3) ◽

pp. e0247722

Author(s):

Yujie Du ◽

Junli Du ◽

Xi Liu ◽

Zhifa Yuan

Keyword(s):

Path Analysis ◽

Regulation Mechanism ◽

Analysis Model ◽

Multiple Path ◽

Independent Variables ◽

Determination Coefficient ◽

Dependent Variables ◽

Regulation Mechanisms ◽

Complex Regulation ◽

Path Analysis Model

The one-to-multiple path analysis model describes the regulation mechanism of multiple independent variables acting on one dependent variable by partitioning the correlation coefficient and the determination coefficient. How, then, can more complex regulation mechanisms of multiple independent variables acting on multiple dependent variables be analysed? By analogy, and building on multiple-to-multiple linear regression analysis, this paper proposes a multiple-to-multiple path analysis model, which describes more complex regulation mechanisms among multiple independent variables and multiple dependent variables by partitioning the generalized determination coefficient. Unlike the one-to-multiple model, the multiple-to-multiple model generates three additional types of paths because the correlation among the dependent variables is taken into account. The decision coefficient of each independent variable is then constructed for the system of dependent variables, and its hypothesis-testing statistics are given. Finally, a research example on wheat breeding rules in an arid area demonstrates that multiple-to-multiple path analysis, which uses more correlation information, gives better results.
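The one-to-multiple case can be sketched with the textbook decomposition: on standardized variables the direct path coefficients p solve R_xx p = r_xy, each correlation splits as r_iy = p_i + Σ_{j≠i} r_ij p_j, and the determination coefficient is R² = Σ_i p_i r_iy. The code below is a minimal illustration of that classical decomposition only; it is not the authors' multiple-to-multiple model, and the function name `path_analysis` is hypothetical.

```python
import numpy as np

def path_analysis(X, y):
    """Classical one-to-multiple path analysis.

    Returns direct path coefficients, the matrix of indirect effects
    (entry [i, j] is the effect of x_i on y routed through x_j),
    the predictor-response correlations, and R^2.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Full correlation matrix of predictors plus response.
    R = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    R_xx, r_xy = R[:-1, :-1], R[:-1, -1]
    # Direct paths are the standardized regression coefficients.
    p = np.linalg.solve(R_xx, r_xy)
    # Indirect effect of x_i via x_j is r_ij * p_j for i != j.
    indirect = R_xx * p[np.newaxis, :]
    np.fill_diagonal(indirect, 0.0)
    # Determination coefficient: R^2 = sum_i p_i * r_iy.
    r2 = float(p @ r_xy)
    return p, indirect, r_xy, r2
```

For any data set, adding each direct coefficient to the row sum of its indirect effects reproduces the corresponding correlation with y, which is a convenient check on the partition.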

Systems Biology Approaches Reveal a Multi-stress Responsive WRKY Transcription Factor and Stress Associated Gene Co-expression Networks in Chickpea

Current Bioinformatics ◽

10.2174/1574893614666190204152500 ◽

2019 ◽

Vol 14 (7) ◽

pp. 591-601 ◽

Cited By ~ 1

Author(s):

Aravind K. Konda ◽

Parasappa R. Sabale ◽

Khela R. Soren ◽

Shanmugavadivel P. Subramaniam ◽

Pallavi Singh ◽

...

Keyword(s):

Expression Pattern ◽

Gene Networks ◽

Molecular Mechanisms ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Wrky Transcription Factor ◽

Functional Enrichment ◽

Differentially Expressed ◽

Callose Deposition ◽

Significant Probability

Background: Chickpea is a nutritionally rich premier pulse crop, but its production suffers setbacks from various stresses, so understanding the underlying molecular mechanisms is of foremost importance. Objective: The investigation was carried out to identify the differentially expressed WRKY TFs in chickpea in response to herbicide stress and decipher their interacting partners. Methods: For this purpose, transcriptome-wide identification of WRKY TFs in chickpea was done. The behavior of the differentially expressed TFs was compared across other stress conditions. Orthology-based cofunctional gene networks were derived from Arabidopsis. Gene ontology and functional enrichment analysis was performed using Blast2GO and STRING software. A Gene Coexpression Network (GCN) was constructed in chickpea using publicly available transcriptome data. The expression pattern of the identified gene network was studied in chickpea-Fusarium interactions. Results: A unique WRKY TF (Ca_08086) was found to be significantly (q value = 0.02) upregulated not only under herbicide stress but also in other stresses. A co-functional network of 14 genes, namely Ca_08086, Ca_19657, Ca_01317, Ca_20172, Ca_12226, Ca_15326, Ca_04218, Ca_07256, Ca_14620, Ca_12474, Ca_11595, Ca_15291, Ca_11762 and Ca_03543, was identified. The GCN revealed 95 hub genes based on the significant probability scores. Functional annotation indicated a role in callose deposition and response to chitin. Interestingly, a contrasting expression pattern of the 14 network genes was observed in wilt-resistant and susceptible chickpea genotypes infected with Fusarium. Conclusion: This is the first report of identification of a multi-stress responsive WRKY TF and its associated GCN in chickpea.

Structural variations in papaya genomes

BMC Genomics ◽

10.1186/s12864-021-07665-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Zhenyang Liao ◽

Xunxiao Zhang ◽

Shengcheng Zhang ◽

Zhicong Lin ◽

Xingtan Zhang ◽

...

Keyword(s):

Agronomic Traits ◽

Enrichment Analysis ◽

Copy Number Variant ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Structural Variations ◽

Reproduction Traits ◽

Depth Study ◽

Environmental Adaptability ◽

The Impact

Background: Structural variations (SVs) are a type of mutation that has not been widely detected in plant genomes, although studies in animals have shown their role in the process of domestication. An in-depth study of SVs will help us to further understand the impact of SVs on the phenotype and environmental adaptability during papaya domestication and provide genomic resources for the development of molecular markers. Results: We detected a total of 8083 SVs, including 5260 deletions, 552 tandem duplications and 2271 insertions, with deletions being predominant, indicating the universality of deletion in the evolution of the papaya genome. The distribution of these SVs is non-random in each chromosome. A total of 1794 genes overlap with SVs, of which 1350 genes are expressed in at least one tissue. The weighted correlation network analysis (WGCNA) of these expressed genes reveals co-expression relationships between SVs-genes and different tissues, and functional enrichment analysis shows their role in biological growth and environmental responses. We also identified some domesticated SVs genes related to environmental adaptability, sexual reproduction, and important agronomic traits during the domestication of papaya. Analysis of artificially selected copy number variant genes (CNV-genes) also revealed genes associated with plant growth and environmental stress. Conclusions: SVs played an indispensable role in the process of papaya domestication, especially in the reproduction traits of hermaphrodite plants. The detection of genome-wide SVs and CNV-genes between cultivated gynodioecious populations and wild dioecious populations provides a reference for further understanding of the evolution process from male to hermaphrodite in papaya.

CCTs as new biomarkers for the prognosis of head and neck squamous cancer

Open Medicine ◽

10.1515/med-2020-0114 ◽

2020 ◽

Vol 15 (1) ◽

pp. 672-688

Author(s):

Yanbo Dong ◽

Siyu Lu ◽

Zhenxiao Wang ◽

Liangfa Liu

Keyword(s):

Head And Neck ◽

Survival Data ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Advanced Tumor Stage ◽

Expression Levels ◽

T Complex ◽

Advanced Tumor ◽

Squamous Cancer

The chaperonin-containing T-complex protein 1 (CCT) subunits participate in diverse diseases. However, little is known about their expression and prognostic values in human head and neck squamous cancer (HNSC). This article aims to evaluate the effects of CCT subunits regarding their prognostic values for HNSC. We mined the transcriptional and survival data of CCTs in HNSC patients from online databases. A protein–protein interaction network was constructed and a functional enrichment analysis of target genes was performed. We observed that the mRNA expression levels of CCT1/2/3/4/5/6/7/8 were higher in HNSC tissues than in normal tissues. Survival analysis revealed that the high mRNA transcriptional levels of CCT3/4/5/6/7/8 were associated with a low overall survival. The expression levels of CCT4/7 were correlated with advanced tumor stage. And the overexpression of CCT4 was associated with higher N stage of patients. Validation of CCTs’ differential expression and prognostic values was achieved by the Human Protein Atlas and GEO datasets. Mechanistic exploration of CCT subunits by the functional enrichment analysis suggests that these genes may influence the HNSC prognosis by regulating PI3K-Akt and other pathways. This study implies that CCT3/4/6/7/8 are promising biomarkers for the prognosis of HNSC.

Investigation and Functional Enrichment Analysis of the Human Host Interaction Network with Common Gram-Negative Respiratory Pathogens Predicts Possible Association with Lung Adenocarcinoma

Pathophysiology ◽

10.3390/pathophysiology28010003 ◽

2021 ◽

Vol 28 (1) ◽

pp. 20-33

Author(s):

Lydia-Eirini Giannakou ◽

Athanasios-Stefanos Giannopoulos ◽

Chrissi Hatzoglou ◽

Konstantinos I. Gourgoulianis ◽

Erasmia Rouka ◽

...

Keyword(s):

Lung Adenocarcinoma ◽

Interaction Network ◽

Enrichment Analysis ◽

Functional Enrichment Analysis ◽

Functional Enrichment ◽

Cell Junctions ◽

Respiratory Pathogens ◽

Gram Negative ◽

Human Proteins ◽

Apoptotic Pathways

Haemophilus influenzae (Hi), Moraxella catarrhalis (MorCa) and Pseudomonas aeruginosa (Psa) are three of the most common gram-negative bacteria responsible for human respiratory diseases. In this study, we aimed to identify, using the functional enrichment analysis (FEA), the human gene interaction network with the aforementioned bacteria in order to elucidate the full spectrum of induced pathogenicity. The Human Pathogen Interaction Database (HPIDB 3.0) was used to identify the human proteins that interact with the three pathogens. FEA was performed via the ToppFun tool of the ToppGene Suite and the GeneCodis database so as to identify enriched gene ontologies (GO) of biological processes (BP), cellular components (CC) and diseases. In total, 11 human proteins were found to interact with the bacterial pathogens. FEA of BP GOs revealed associations with mitochondrial membrane permeability relative to apoptotic pathways. FEA of CC GOs revealed associations with focal adhesion, cell junctions and exosomes. The most significantly enriched annotations in diseases and pathways were lung adenocarcinoma and cell cycle, respectively. Our results suggest that the Hi, MorCa and Psa pathogens could be related to the pathogenesis and/or progression of lung adenocarcinoma via the targeting of the epithelial cellular junctions and the subsequent deregulation of the cell adhesion and apoptotic pathways. These hypotheses should be experimentally validated.
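Functional enrichment analysis of a small protein list, as described here, is commonly framed as an over-representation test: given N background genes of which K carry an annotation, how surprising is it that k of the n listed genes carry it? The sketch below computes the one-sided hypergeometric tail probability for that question. It is a generic illustration; whether ToppFun or GeneCodis use exactly this statistic, and which multiple-testing correction they apply, is not stated in the abstract, and the function name `enrichment_p_value` is hypothetical.

```python
from math import comb

def enrichment_p_value(N, K, n, k):
    """One-sided over-representation p-value, P(X >= k).

    X follows a hypergeometric distribution: n genes drawn without
    replacement from N background genes, K of which are annotated.
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical numbers: 11 listed proteins, 4 of them annotated,
# against a background of 20000 genes with 200 annotated.
p = enrichment_p_value(N=20000, K=200, n=11, k=4)
```

With many annotation terms tested at once, the raw p-values would normally be adjusted for multiple comparisons before being reported.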
