Fast gene set enrichment analysis

Revealing Biological Pathways Implicated in Lung Cancer from TCGA Gene Expression Data Using Gene Set Enrichment Analysis

Cancer Informatics ◽

10.4137/cin.s13882 ◽

2014 ◽

Vol 13s1 ◽

pp. CIN.S13882 ◽

Cited By ~ 4

Author(s):

Binghuang Cai ◽

Xia Jiang

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Gene Expression Data ◽

Lung Squamous Cell Carcinoma ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Expression Data ◽

Gene Set Enrichment ◽

Gene Set ◽

Pathway Gene

Analyzing biological system abnormalities in cancer patients based on measures of biological entities, such as gene expression levels, is an important and challenging problem. This paper applies existing methods, Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis, to pathway abnormality analysis in lung cancer using microarray gene expression data. Gene expression data from studies of Lung Squamous Cell Carcinoma (LUSC) in The Cancer Genome Atlas project, and pathway gene set data from the Kyoto Encyclopedia of Genes and Genomes were used to analyze the relationship between pathways and phenotypes. Results, in the form of pathway rankings, indicate that some pathways may behave abnormally in LUSC. For example, both the cell cycle and viral carcinogenesis pathways ranked very high in LUSC. Furthermore, some pathways that are known to be associated with cancer, such as the p53 and the PI3K-Akt signal transduction pathways, were found to rank high in LUSC. Other pathways, such as bladder cancer and thyroid cancer pathways, were also ranked high in LUSC.
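The pathway rankings described above rest on an enrichment statistic computed for each gene set. The following is a minimal sketch of that statistic, assuming the commonly used weighted running-sum form of the GSEA enrichment score. The abstract does not state the exact settings used, and the function name, the exponent `p`, and the toy gene labels are illustrative assumptions.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set (illustrative sketch).

    ranked_genes: gene identifiers ordered by their association with the phenotype
    scores:       ranking metric for each gene, in the same order
    gene_set:     set of gene identifiers belonging to the pathway
    p:            weight exponent (assumed default of 1.0)
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hit = sum(in_set)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Hits step up in proportion to |score|^p; misses step down uniformly.
    hit_weights = [abs(s) ** p if h else 0.0 for s, h in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all genes in the set have a zero ranking score")

    running = 0.0
    best = 0.0
    for weight, is_hit in zip(hit_weights, in_set):
        if is_hit:
            running += weight / total_hit_weight
        else:
            running -= 1.0 / n_miss
        # Keep the largest deviation from zero, preserving its sign.
        if abs(running) > abs(best):
            best = running
    return best


# Toy example with hypothetical gene labels and scores.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top_set_score = enrichment_score(genes, metric, {"g1", "g2"})
bottom_set_score = enrichment_score(genes, metric, {"g5", "g6"})
```

A set concentrated at the top of the ranking gets a positive score and one concentrated at the bottom gets a negative score. Ranking pathways would then require normalizing these scores and assessing significance, which this sketch omits.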


Application of bi-clustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

Data in Brief ◽

10.1016/j.dib.2017.10.060 ◽

2017 ◽

Vol 15 ◽

pp. 933-940 ◽

Cited By ~ 2

Author(s):

Andrew Williams ◽

Sabina Halappanavar

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Expression Data ◽

Gene Set Enrichment ◽

Gene Set ◽

Analysis Methods


Newborn infant skin gene expression: Remarkable differences versus adults

PLoS ONE ◽

10.1371/journal.pone.0258554 ◽

2021 ◽

Vol 16 (10) ◽

pp. e0258554

Author(s):

Marty O. Visscher ◽

Ping Hu ◽

Andrew N. Carr ◽

Charles C. Bascom ◽

Robert J. Isfort ◽

...

Keyword(s):

Ultraviolet Radiation ◽

Immune Function ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Biological Processes ◽

Hair Cycle ◽

Gene Set Enrichment ◽

Gene Set ◽

Adult Skin ◽

Skin Samples

At birth, human infants are poised to survive in harsh, hostile conditions. An understanding of the state of newborn skin development and maturation is key to the maintenance of health, optimum response to injury, healing and disease. The observational study collected full-thickness newborn skin samples from 27 infants at surgery and compared them to skin samples from 43 adult sites protected from ultraviolet radiation exposure, as the standard for stable, mature skin. Transcriptomics profiling and gene set enrichment analysis were performed. Statistical analysis established over 25,000 differentially regulated probe sets, representing 10,647 distinct genes, in infant skin compared to adult skin. Gene set enrichment analysis showed a significant increase in 143 biological processes (adjusted p < 0.01) in infant skin versus adult skin samples, including extracellular matrix (ECM) organization, cell adhesion, collagen fibril organization and fatty acid metabolic process. ECM organization and ECM structure organization were the biological processes in infant skin with the lowest adjusted p-values. Genes involving epidermal development, immune function, cell differentiation, and hair cycle were overexpressed in adults, representing 101 significantly enriched biological processes (adjusted p < 0.01). The processes with the highest significant difference were skin and epidermal development, e.g., keratinocyte differentiation, keratinization and cornification, intermediate filament cytoskeleton organization, and hair cycle. Enriched Gene Ontology (GO) biological processes also involved immune function, including antigen processing and presentation. Our comparison with ultraviolet radiation-protected adult skin provides essential insight into infant skin and its ability to support the newborn’s preparedness to survive and flourish, despite the infant’s new environment laden with microbes, high oxygen tension and potential irritants. This fundamental knowledge is expected to guide strategies to protect and preserve the features of unperturbed, young skin.
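The abstract reports processes at an adjusted p < 0.01 threshold without naming the adjustment method. As a hedged illustration only, the sketch below assumes a Benjamini-Hochberg false discovery rate adjustment, a common choice for enrichment results. The function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study may have used a different adjustment method.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical biological processes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.03]
```

Processes whose adjusted value falls below the chosen threshold are the ones counted as significantly enriched.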


Differential gene expression and gene-set enrichment analysis in Caco-2 monolayers during a 30-day timeline with Dexamethasone exposure

Tissue Barriers ◽

10.1080/21688370.2019.1651597 ◽

2019 ◽

Vol 7 (3) ◽

pp. e1651597 ◽

Cited By ~ 3

Author(s):

J.M. Robinson ◽

S. Turkington ◽

S.A. Abey ◽

N. Kenea ◽

W.A. Henderson

Keyword(s):

Gene Expression ◽

Differential Gene Expression ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Differential Gene


Gene Expression Signature in Human Neuroblastoma with TERT Overexpression Can Be Identified by Gene Set Enrichment Analysis and Epigenetically Targeted in an Orthotopic Mouse Xenograft Model

Journal of the American College of Surgeons ◽

10.1016/j.jamcollsurg.2020.07.565 ◽

2020 ◽

Vol 231 (4) ◽

pp. S199

Author(s):

Min Huang ◽

Lauren Wood ◽

Jasmine C. Zeki ◽

Modupeola Diyaolu ◽

Miao Gong ◽

...

Keyword(s):

Gene Expression ◽

Xenograft Model ◽

Gene Expression Signature ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Human Neuroblastoma ◽

Mouse Xenograft ◽

Gene Set ◽

Mouse Xenograft Model


Tracking Difference in Gene Expression in a Time-Course Experiment Using Gene Set Enrichment Analysis

PLoS ONE ◽

10.1371/journal.pone.0107629 ◽

2014 ◽

Vol 9 (9) ◽

pp. e107629 ◽

Cited By ~ 2

Author(s):

Pui Shan Wong ◽

Michihiro Tanaka ◽

Yoshihiko Sunaga ◽

Masayoshi Tanaka ◽

Takeaki Taniguchi ◽

...

Keyword(s):

Gene Expression ◽

Time Course ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Time Course Experiment


Towards a gold standard for benchmarking gene set enrichment analysis

10.1101/674267 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ludwig Geistlinger ◽

Gergely Csaba ◽

Mara Santarelli ◽

Marcel Ramos ◽

Lucas Schiffer ◽

...

Keyword(s):

Ad Hoc ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Data Sets ◽

Expression Data ◽

Rna Seq ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets ◽

Enrichment Methods

Background: Although gene set enrichment analysis has become an integral part of high-throughput gene expression data analysis, the assessment of enrichment methods remains rudimentary and ad hoc. In the absence of suitable gold standards, evaluations are commonly restricted to selected data sets and biological reasoning on the relevance of resulting enriched gene sets. However, this is typically incomplete and biased towards the goals of individual investigations. Results: We present a general framework for standardized and structured benchmarking of enrichment methods based on defined criteria for applicability, gene set prioritization, and detection of relevant processes. This framework incorporates a curated compendium of 75 expression data sets investigating 42 different human diseases. The compendium features microarray and RNA-seq measurements, and each dataset is associated with a precompiled GO/KEGG relevance ranking for the corresponding disease under investigation. We perform a comprehensive assessment of 10 major enrichment methods on the benchmark compendium, identifying significant differences in (i) runtime and applicability to RNA-seq data, (ii) fraction of enriched gene sets depending on the type of null hypothesis tested, and (iii) recovery of the a priori defined relevance rankings. Based on these findings, we make practical recommendations on (i) how methods originally developed for microarray data can efficiently be applied to RNA-seq data, (ii) how to interpret results depending on the type of gene set test conducted, and (iii) which methods are best suited to effectively prioritize gene sets with high relevance for the phenotype investigated. Conclusion: We carried out a systematic assessment of existing enrichment methods, and identified best performing methods, but also general shortcomings in how gene set analysis is currently conducted. We provide a directly executable benchmark system for straightforward assessment of additional enrichment methods. Availability: http://bioconductor.org/packages/GSEABenchmarkeR


Construction and Validation of a Reliable Six-Gene Prognostic Signature Based on the TP53 Alteration for Hepatocellular Carcinoma

Frontiers in Oncology ◽

10.3389/fonc.2021.618976 ◽

2021 ◽

Vol 11 ◽

Author(s):

Junyu Huo ◽

Liqun Wu ◽

Yunjin Zang

Keyword(s):

Gene Expression ◽

Hepatocellular Carcinoma ◽

Regression Analysis ◽

Cox Regression ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Prognostic Signature ◽

Gene Set Enrichment ◽

Cox Regression Analysis ◽

Gene Set

Background: The high mutation rate of TP53 in hepatocellular carcinoma (HCC) makes it an attractive potential therapeutic target. However, the mechanism by which TP53 mutation affects the prognosis of HCC is not fully understood. Material and Approach: This study downloaded a gene expression profile and clinical-related information from The Cancer Genome Atlas (TCGA) database and the International Cancer Genome Consortium (ICGC) database. We used Gene Set Enrichment Analysis (GSEA) to determine the difference in gene expression patterns between HCC samples with wild-type TP53 (n=258) and mutant TP53 (n=116) in the TCGA cohort. We screened prognosis-related genes by univariate Cox regression analysis and Kaplan–Meier (KM) survival analysis. We constructed a six-gene prognostic signature in the TCGA training group (n=184) by Lasso and multivariate Cox regression analysis. To assess the predictive capability and applicability of the signature in HCC, we conducted internal validation, external validation, integrated analysis and subgroup analysis. Results: A prognostic signature consisting of six genes (EIF2S1, SEC61A1, CDC42EP2, SRM, GRM8, and TBCD) showed good performance in predicting the prognosis of HCC. The area under the curve (AUC) values of the ROC curve of 1-, 2-, and 3-year survival of the model were all greater than 0.7 in each independent cohort (internal testing cohort, n = 181; TCGA cohort, n = 365; ICGC cohort, n = 229; whole cohort, n = 594; subgroup, n = 9). Importantly, by gene set variation analysis (GSVA) and the single sample gene set enrichment analysis (ssGSEA) method, we found three possible causes that may lead to poor prognosis of HCC: high proliferative activity, low metabolic activity and immunosuppression. Conclusion: Our study provides a reliable method for the prognostic risk assessment of HCC and has great potential for clinical transformation.
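A signature fitted by multivariate Cox regression is usually applied as a linear risk score, that is, a coefficient-weighted sum of each patient's gene expression values. The sketch below illustrates that general pattern only. The abstract does not report the coefficients or the cutoff rule, so the coefficient and expression values and the median split used here are hypothetical placeholders, not results from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Coefficient-weighted sum of expression values for one patient."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores_by_patient):
    """Label each patient 'high' or 'low' risk relative to the median score.

    Assumption: a median cutoff; the study may have used another rule.
    """
    cutoff = median(scores_by_patient.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores_by_patient.items()
    }


# PLACEHOLDER coefficients for two of the six signature genes (not from the paper).
toy_coefficients = {"EIF2S1": 0.5, "SRM": -0.25}
# PLACEHOLDER expression values for three hypothetical patients.
toy_expression = {
    "patient_1": {"EIF2S1": 2.0, "SRM": 4.0},
    "patient_2": {"EIF2S1": 4.0, "SRM": 4.0},
    "patient_3": {"EIF2S1": 6.0, "SRM": 4.0},
}
toy_scores = {p: risk_score(e, toy_coefficients) for p, e in toy_expression.items()}
toy_groups = split_by_median(toy_scores)
```

The resulting groups would then be compared with survival analysis, such as the Kaplan–Meier and ROC evaluations the abstract mentions.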


NGSEA: network-based gene set enrichment analysis for interpreting gene expression phenotypes with functional gene sets

10.1101/636498 ◽

2019 ◽

Cited By ~ 1

Author(s):

Heonjong Han ◽

Sangyoung Lee ◽

Insuk Lee

Keyword(s):

Gene Expression ◽

Enrichment Analysis ◽

Functional Gene ◽

Gene Set Enrichment Analysis ◽

Clinical Samples ◽

Functional Genes ◽

Biological Processes ◽

Gene Set Enrichment ◽

Gene Sets ◽

Anti Cancer

Gene set enrichment analysis (GSEA) is a popular tool to identify underlying biological processes in clinical samples using their gene expression phenotypes. GSEA measures the enrichment of annotated gene sets that represent biological processes for differentially expressed genes (DEGs) in clinical samples. GSEA may be suboptimal for functional gene sets, however, because DEGs from the expression dataset may not be functional genes per se but dysregulated genes perturbed by bona fide functional genes. To overcome this shortcoming, we developed network-based GSEA (NGSEA), which measures the enrichment score of functional gene sets using the expression difference of not only individual genes but also their neighbors in the functional network. We found that NGSEA outperformed GSEA in identifying pathway gene sets for matched gene expression phenotypes. We also observed that NGSEA substantially improved the ability to retrieve known anti-cancer drugs from patient-derived gene expression data using drug-target gene sets compared with another method, Connectivity Map. We also repurposed FDA-approved drugs using NGSEA and experimentally validated budesonide as a chemical with anti-cancer effects for colorectal cancer. We, therefore, expect that NGSEA will facilitate both pathway interpretation of gene expression phenotypes and anti-cancer drug repositioning. NGSEA is freely available at www.inetbio.org/ngsea.
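The core idea above is that a gene's score should reflect both its own expression difference and that of its network neighbors. The sketch below is one simple way to express that idea, not the published NGSEA formula. It assumes a gene's own absolute score is added to the unweighted mean absolute score of its neighbors. The function name, the combination rule, and the toy network are all hypothetical.

```python
def neighbor_aware_scores(scores, network):
    """Combine each gene's score with the mean score of its network neighbors.

    scores:  dict mapping gene -> expression difference (e.g. a log ratio)
    network: dict mapping gene -> iterable of neighboring genes
    Assumption: unweighted edges and a simple sum of own and neighbor terms.
    """
    combined = {}
    for gene, score in scores.items():
        # Only neighbors with a measured score contribute.
        neighbors = [n for n in network.get(gene, ()) if n in scores]
        if neighbors:
            neighbor_mean = sum(abs(scores[n]) for n in neighbors) / len(neighbors)
        else:
            neighbor_mean = 0.0
        combined[gene] = abs(score) + neighbor_mean
    return combined


# Toy data: gene "a" barely changes, but both of its neighbors change strongly.
toy_scores = {"a": 0.1, "b": 2.0, "c": -1.0}
toy_network = {"a": ["b", "c"]}
toy_combined = neighbor_aware_scores(toy_scores, toy_network)
```

In this toy case gene "a" rises in the ranking because its neighbors are strongly perturbed, which is the behavior the abstract motivates for functional genes that are not themselves differentially expressed. The combined scores could then be fed into an ordinary enrichment calculation.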


XGSEA: CROSS-species Gene Set Enrichment Analysis via domain adaptation

10.1101/2020.07.21.213645 ◽

2020 ◽

Author(s):

Menglan Cai ◽

Canh Hao Nguyen ◽

Hiroshi Mamitsuka ◽

Limin Li

Keyword(s):

Gene Expression ◽

Domain Adaptation ◽

Gene Knockout ◽

Enrichment Analysis ◽

Real Data ◽

Gene Set Enrichment Analysis ◽

Data Sets ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets

Gene set enrichment analysis (GSEA) has been widely used to identify gene sets with statistically significant differences between cases and controls against a large gene set. GSEA needs both phenotype labels and gene expression. However, gene expression is assessed more often for model organisms than for minor species. More importantly, gene expression often cannot be measured under specific conditions in humans, such as non-approved treatment or gene knockout, because of the high health risk of direct experiments, and is then often substituted by mouse data. Thus, predicting the enrichment significance (on a phenotype) of a given gene set of one species (the target, say human) by using gene expression measured under the same phenotype in another species (the source, say mouse) is a vital and challenging problem, which we call the CROSS-species Gene Set Enrichment Problem (XGSEP). For XGSEP, we propose XGSEA (Cross-species Gene Set Enrichment Analysis), with three steps: 1) running GSEA for a source species to obtain enrichment scores and p-values of source gene sets; 2) representing the relation between source and target gene sets by domain adaptation; and 3) using regression to predict p-values of target gene sets, based on the representation in 2). We extensively validated XGSEA by using four real data sets under various settings, showing that XGSEA significantly outperformed three baseline methods. A case study of identifying important human pathways for T cell dysfunction and reprogramming from mouse ATAC-Seq data further confirmed the reliability of XGSEA. Source code is available through https://github.com/LiminLi-xjtu/XGSEA

Author summary: Gene set enrichment analysis (GSEA) is a powerful tool for the differential analysis of gene sets given a ranked gene list. GSEA requires complete data, that is, gene expression with phenotype labels. However, gene expression often cannot be measured under specific conditions in humans, such as non-approved treatment or gene knockout, because of the high risk of direct experiments, and is then often substituted by mouse data. The unavailability of gene expression thus leads to a more challenging problem, the CROSS-species Gene Set Enrichment Problem (XGSEP), in which the enrichment significance (on a phenotype) of a given gene set of one species (the target, say human) is predicted by using gene expression measured under the same phenotype in another species (the source, say mouse). In this work, we propose XGSEA (Cross-species Gene Set Enrichment Analysis) for XGSEP, with three steps: 1) GSEA; 2) domain adaptation; and 3) regression. Results on four real data sets and a case study indicate that XGSEA significantly outperformed three baseline methods and confirm the reliability of XGSEA.
