Lung Cancer Explorer (LCE): an open web portal to explore gene expression and clinical associations in lung cancer

Mapping Intimacies ◽

10.1101/271056 ◽

2018 ◽

Author(s):

Ling Cai ◽

ShinYi Lin ◽

Yunyun Zhou ◽

Lin Yang ◽

Bo Ci ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Clinical Data ◽

Whole Genome ◽

Expression Data ◽

Web Resource ◽

Differential Gene ◽

Clinical Associations ◽

Meta Analyses ◽

Individual Dataset

Abstract: We constructed a lung cancer-specific database housing expression data and clinical data from over 6,700 patients in 56 studies. Expression data from 23 “whole-genome” based platforms were carefully processed and quality controlled, whereas clinical data were standardized and rigorously curated. Empowered by this lung cancer database, we created an open access web resource – the Lung Cancer Explorer (LCE), which enables researchers and clinicians to explore these data and perform analyses. Users can perform meta-analyses on LCE to gain a quick overview of the results on tumor vs normal differential gene expression and expression-survival association. Individual dataset-based survival analysis, comparative analysis, and correlation analysis are also provided with flexible options to allow for customized analyses from the user.


Imaging Biomarkers and Gene Expression Data Correlation Framework for Lung Cancer Radiogenomics Analysis Based on Deep Learning

IEEE Access ◽

10.1109/access.2021.3071466 ◽

2021 ◽

pp. 1-1

Author(s):

Dong Sui ◽

Maozu Guo ◽

Xiaoxuan Ma ◽

Julian Baptiste ◽

Lei Zhang

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Deep Learning ◽

Gene Expression Data ◽

Imaging Biomarkers ◽

Expression Data ◽

Data Correlation


A lepidic gene signature predicts patient prognosis and sensitivity to immunotherapy in lung adenocarcinoma

Genome Medicine ◽

10.1186/s13073-021-01010-w ◽

2022 ◽

Vol 14 (1) ◽

Author(s):

Thinh T. Nguyen ◽

Hyun-Sung Lee ◽

Bryan M. Burt ◽

Jia Wu ◽

Jianjun Zhang ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Lung Adenocarcinoma ◽

Gene Expression Analysis ◽

Immune Cell ◽

Expression Patterns ◽

Immune Cell Infiltration ◽

Histological Subtypes ◽

Genomic Features ◽

Differential Gene

Abstract

Background: Lung adenocarcinoma, the most common type of lung cancer, has a high level of morphologic heterogeneity and is composed of tumor cells of multiple histological subtypes. It has been reported that immune cell infiltration significantly impacts clinical outcomes of patients with lung adenocarcinoma. However, it is unclear whether histologic subtyping can reflect the tumor immune microenvironment, and whether histologic subtyping can be applied for therapeutic stratification of the current standard of care.

Methods: We inferred immune cell infiltration levels using a histological subtype-specific gene expression dataset. From differential gene expression analysis between different histological subtypes, we developed two gene signatures to computationally determine the relative abundance of lepidic and solid components (denoted as the L-score and S-score, respectively) in lung adenocarcinoma samples. These signatures enabled us to investigate the relationship between histological composition and clinical outcomes in lung adenocarcinoma using previously published datasets.

Results: We found dramatic immunological differences among histological subtypes. Differential gene expression analysis showed that the lepidic and solid subtypes could be differentiated based on their gene expression patterns while the other subtypes shared similar gene expression patterns. Our results indicated that higher L-scores were associated with prolonged survival, and higher S-scores were associated with shortened survival. L-scores and S-scores were also correlated with global genomic features such as tumor mutation burdens and driver genomic events. Interestingly, we observed significantly decreased L-scores and increased S-scores in lung adenocarcinoma samples with EGFR gene amplification but not in samples with EGFR gene mutations. In lung cancer cell lines, we observed significant correlations between L-scores and cell sensitivity to a number of targeted drugs including EGFR inhibitors. Moreover, lung cancer patients with higher L-scores were more likely to benefit from immune checkpoint blockade therapy.

Conclusions: Our findings provided further insights into evaluating histology composition in lung adenocarcinoma. The established signatures reflected that lepidic and solid subtypes in lung adenocarcinoma would be associated with prognosis, genomic features, and responses to targeted therapy and immunotherapy. The signatures therefore suggested potential clinical translation in predicting patient survival and treatment responses. In addition, our framework can be applied to other types of cancer with heterogeneous histological subtypes.


An Integrative Systems Biology and Experimental Approach Identifies Convergence of Epithelial Plasticity, Metabolism, and Autophagy to Promote Chemoresistance

Journal of Clinical Medicine ◽

10.3390/jcm8020205 ◽

2019 ◽

Vol 8 (2) ◽

pp. 205 ◽

Cited By ~ 9

Author(s):

Shengnan Xu ◽

Kathryn Ware ◽

Yuantong Ding ◽

So Kim ◽

Maya Sheth ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Systems Biology ◽

Gene Expression Data ◽

Transforming Growth Factor ◽

Epithelial Mesenchymal Transition ◽

Phylogenetic Analyses ◽

Host Immune System ◽

Expression Data ◽

Mesenchymal Transition

The evolution of therapeutic resistance is a major cause of death for cancer patients. The development of therapy resistance is shaped by the ecological dynamics within the tumor microenvironment and the selective pressure of the host immune system. These selective forces often lead to evolutionary convergence on pathways or hallmarks that drive progression. Thus, a deeper understanding of the evolutionary convergences that occur could reveal vulnerabilities to treat therapy-resistant cancer. To this end, we combined phylogenetic clustering, systems biology analyses, and molecular experimentation to identify convergences in gene expression data onto common signaling pathways. We applied these methods to derive new insights about the networks at play during transforming growth factor-β (TGF-β)-mediated epithelial–mesenchymal transition in lung cancer. Phylogenetic analyses of gene expression data from TGF-β-treated cells revealed convergence of cells toward amine metabolic pathways and autophagy during TGF-β treatment. Knockdown of the autophagy regulator, ATG16L1, re-sensitized lung cancer cells to cancer therapies following TGF-β-induced resistance, implicating autophagy as a TGF-β-mediated chemoresistance mechanism. In addition, high ATG16L expression was found to be a poor prognostic marker in multiple cancer types. These analyses reveal the usefulness of combining evolutionary and systems biology methods with experimental validation to illuminate new therapeutic vulnerabilities for cancer.


Abstract 3946: Androgen receptor drives differential gene expression in KRAS-mediated non-small cell lung cancer

10.1158/1538-7445.am2018-3946 ◽

2018 ◽

Author(s):

Albert Roy Wang ◽

Hope Beyer ◽

Sean Brennan ◽

Shannon Stiles ◽

Dylan Wiese ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Androgen Receptor ◽

Small Cell Lung Cancer ◽

Differential Gene Expression ◽

Cell Lung Cancer ◽

Small Cell ◽

Small Cell Lung ◽

Differential Gene


Revealing Biological Pathways Implicated in Lung Cancer from TCGA Gene Expression Data Using Gene Set Enrichment Analysis

Cancer Informatics ◽

10.4137/cin.s13882 ◽

2014 ◽

Vol 13s1 ◽

pp. CIN.S13882 ◽

Cited By ~ 4

Author(s):

Binghuang Cai ◽

Xia Jiang

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Gene Expression Data ◽

Lung Squamous Cell Carcinoma ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Expression Data ◽

Gene Set Enrichment ◽

Gene Set ◽

Pathway Gene

Analyzing biological system abnormalities in cancer patients based on measures of biological entities, such as gene expression levels, is an important and challenging problem. This paper applies existing methods, Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis, to pathway abnormality analysis in lung cancer using microarray gene expression data. Gene expression data from studies of Lung Squamous Cell Carcinoma (LUSC) in The Cancer Genome Atlas project, and pathway gene set data from the Kyoto Encyclopedia of Genes and Genomes were used to analyze the relationship between pathways and phenotypes. Results, in the form of pathway rankings, indicate that some pathways may behave abnormally in LUSC. For example, both the cell cycle and viral carcinogenesis pathways ranked very high in LUSC. Furthermore, some pathways that are known to be associated with cancer, such as the p53 and the PI3K-Akt signal transduction pathways, were found to rank high in LUSC. Other pathways, such as bladder cancer and thyroid cancer pathways, were also ranked high in LUSC.


Analysis on Differential Gene Expression Data for Prediction of New Biological Features in Permanent Atrial Fibrillation

PLoS ONE ◽

10.1371/journal.pone.0076166 ◽

2013 ◽

Vol 8 (10) ◽

pp. e76166 ◽

Cited By ~ 7

Author(s):

Feng Ou ◽

Nini Rao ◽

Xudong Jiang ◽

Mengyao Qian ◽

Wei Feng ◽

...

Keyword(s):

Gene Expression ◽

Atrial Fibrillation ◽

Differential Gene Expression ◽

Gene Expression Data ◽

Expression Data ◽

Biological Features ◽

Permanent Atrial Fibrillation ◽

Differential Gene


SFARI Genes and where to find them; classification modelling to identify genes associated with Autism Spectrum Disorder from RNA-seq data

10.1101/2021.01.29.428754 ◽

2021 ◽

Author(s):

Magdalena Navarro ◽

T Ian Simpson

Keyword(s):

Gene Expression ◽

Autism Spectrum Disorder ◽

Differential Gene Expression ◽

Gene Expression Data ◽

Gene List ◽

Autism Spectrum ◽

Spectrum Disorder ◽

Expression Data ◽

Link Type ◽

Differential Gene

Abstract

Motivation: Autism spectrum disorder (ASD) has a strong, yet heterogeneous, genetic component. Among the various methods that are being developed to help reveal the underlying molecular aetiology of the disease, one that is gaining popularity is the combination of gene expression and clinical genetic data. For ASD, the SFARI-gene database comprises lists of curated genes in which presumed causative mutations have been identified in patients. In order to predict novel candidate SFARI-genes we built classification models combining differential gene expression data for ASD patients and unaffected individuals with a gene’s status in the SFARI-gene list.

Results: SFARI-genes were not found to be significantly associated with differential gene expression patterns, nor were they enriched in gene co-expression network modules that had a strong correlation with ASD diagnosis. However, network analysis and machine learning models that incorporate information from the whole gene co-expression network were able to predict novel candidate genes that share features of existing SFARI genes and have support for roles in ASD in the literature. We found a statistically significant bias related to the absolute level of gene expression for existing SFARI genes and their scores. It is essential that this bias be taken into account when studies interpret ASD gene expression data at gene, module and whole-network levels.

Availability: Source code is available from GitHub (https://doi.org/10.5281/zenodo.4463693) and the accompanying data from The University of Edinburgh DataStore (https://doi.org/10.7488/ds/2980).


A Collection of Multimodal Melanoma Data

International Journal of Research and Review ◽

10.52403/ijrr.20211034 ◽

2021 ◽

Vol 8 (10) ◽

pp. 257-262

Author(s):

Aigli Korfiati ◽

Giorgos Livanos ◽

Christos Konstandinou

Keyword(s):

Gene Expression ◽

Clinical Data ◽

Data Availability ◽

Expression Data ◽

Rna Seq ◽

Crucial Importance ◽

Computer Aided ◽

Keywords Melanoma ◽

Computational Resources ◽

Aided Diagnosis

Computer-aided diagnosis, prognosis and therapy systems have been of great interest for a number of years. The availability of large volumes of data and of powerful computational resources has allowed artificial intelligence approaches to emerge in melanoma-related studies. However, for such approaches to achieve good predictive performance, data availability is of crucial importance. Melanoma-related imaging, biological and clinical data can be found only partially, scattered across various repositories. Thus, in this work, we assemble in a web-accessible database, named ebioMelDB, the widest collection of clinical and dermoscopy images accompanied by patient clinical data and the widest collection of RNA-Seq gene expression data accompanied by patient clinical data. The database organization allows users to select the data that are appropriate for their application of interest (diagnosis, prognosis and therapy). Keywords: melanoma database, integrated data, dermoscopy, imaging, RNA-Seq, clinical data.


Chip-seq and gene expression data for the identification of functional sub-pathways: a proof of concept in lung cancer

10.1101/2020.06.15.151712 ◽

2020 ◽

Author(s):

Xanthoula Atsalaki ◽

Lefteris Koumakis ◽

George Potamias ◽

Manolis Tsiknakis

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Systems Biology ◽

Gene Expression Data ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Molecular Mechanisms ◽

Integrative Approach ◽

Expression Data ◽

Gene Regulatory

Abstract: High-throughput technologies, such as chromatin immunoprecipitation (ChIP) with massively parallel sequencing (ChIP-seq), have enabled cost- and time-efficient generation of immense amounts of genome data. The advent of advanced sequencing techniques allowed biologists and bioinformaticians to investigate biological aspects of cell function and understand or reveal unexplored disease etiologies. Systems biology attempts to formulate the molecular mechanisms in mathematical models and one of the most important areas is the gene regulatory networks (GRNs), a collection of DNA segments that somehow interact with each other. GRNs incorporate valuable information about molecular targets that can be correlated to a specific phenotype. In our study we highlight the need to develop new explorative tools and approaches for the integration of different types of -omics data such as ChIP-seq and GRNs using pathway analysis methodologies. We present an integrative approach for ChIP-seq and gene expression data on GRNs. Using public microarray expression samples for lung cancer and healthy subjects along with the KEGG human gene regulatory networks, we identified ways to disrupt functional sub-pathways in lung cancer with the aid of CTCF ChIP-seq data, as a proof of concept. We expect that such a systems biology pipeline could assist researchers to identify correlations and causality of transcription factors over functional or disrupted biological sub-pathways.


Maximizing the Reusability of Public Gene Expression Data by Predicting Missing Metadata

10.1101/792382 ◽

2019 ◽

Author(s):

Pei-Yau Lung ◽

Xiaodong Pang ◽

Yan Li ◽

Jinfeng Zhang

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Gene Expression Data ◽

Missing Values ◽

Expression Data ◽

New Approach ◽

Machine Learning Methods ◽

Differential Gene ◽

Missing Variables ◽

Better Than

Abstract: Reusability is part of the FAIR data principle, which aims to make data Findable, Accessible, Interoperable, and Reusable. One of the current efforts to increase the reusability of public genomics data has been to focus on the inclusion of quality metadata associated with the data. When necessary metadata are missing, most researchers will consider the data useless. In this study, we develop a framework to predict the missing metadata of gene expression datasets to maximize their reusability. We propose a new metric called Proportion of Cases Accurately Predicted (PCAP), which is optimized in our specifically-designed machine learning pipeline. The new approach performed better than pipelines using commonly used metrics such as F1-score in terms of maximizing the reusability of data with missing values. We also found that different variables might need to be predicted using different machine learning methods and/or different data processing protocols. Using differential gene expression analysis as an example, we show that when missing variables are accurately predicted, the corresponding gene expression data can be reliably used in downstream analyses.
