Lung Cancer Explorer (LCE): an open web portal to explore gene expression and clinical associations in lung cancer

2018 ◽  
Author(s):  
Ling Cai ◽  
ShinYi Lin ◽  
Yunyun Zhou ◽  
Lin Yang ◽  
Bo Ci ◽  
...  

Abstract We constructed a lung cancer-specific database housing expression data and clinical data from over 6,700 patients in 56 studies. Expression data from 23 “whole-genome” based platforms were carefully processed and quality controlled, whereas clinical data were standardized and rigorously curated. Empowered by this lung cancer database, we created an open access web resource – the Lung Cancer Explorer (LCE), which enables researchers and clinicians to explore these data and perform analyses. Users can perform meta-analyses on LCE to gain a quick overview of the results on tumor vs normal differential gene expression and expression-survival association. Individual dataset-based survival analysis, comparative analysis, and correlation analysis are also provided with flexible options to allow for customized analyses from the user.
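
The abstract does not say how LCE combines per-study results in its meta-analyses. As a minimal sketch of one common approach, assuming each study reports a log hazard ratio and its standard error for a gene, inverse-variance fixed-effect pooling looks like this. The function name and the numbers are hypothetical and are not taken from LCE.

```python
import math


def pool_fixed_effect(log_hrs, std_errs):
    """Inverse-variance fixed-effect pooling of per-study log hazard ratios.

    This is a textbook method used for illustration; it is an assumption,
    not a description of what LCE actually implements.
    """
    if len(log_hrs) != len(std_errs) or not log_hrs:
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical per-study estimates for a single gene.
log_hrs = [0.40, 0.10, 0.25]
std_errs = [0.20, 0.10, 0.20]

pooled, pooled_se = pool_fixed_effect(log_hrs, std_errs)
low = math.exp(pooled - 1.96 * pooled_se)
high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled HR = {math.exp(pooled):.3f} (95% CI {low:.3f} to {high:.3f})")
```

A random-effects model would be the usual alternative when the studies are heterogeneous, as is likely across 56 datasets and 23 platforms.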

2022 ◽  
Vol 14 (1) ◽  
Author(s):  
Thinh T. Nguyen ◽  
Hyun-Sung Lee ◽  
Bryan M. Burt ◽  
Jia Wu ◽  
Jianjun Zhang ◽  
...  

Abstract Background Lung adenocarcinoma, the most common type of lung cancer, has a high level of morphologic heterogeneity and is composed of tumor cells of multiple histological subtypes. It has been reported that immune cell infiltration significantly impacts clinical outcomes of patients with lung adenocarcinoma. However, it is unclear whether histologic subtyping can reflect the tumor immune microenvironment, and whether histologic subtyping can be applied for therapeutic stratification of the current standard of care. Methods We inferred immune cell infiltration levels using a histological subtype-specific gene expression dataset. From differential gene expression analysis between different histological subtypes, we developed two gene signatures to computationally determine the relative abundance of lepidic and solid components (denoted as the L-score and S-score, respectively) in lung adenocarcinoma samples. These signatures enabled us to investigate the relationship between histological composition and clinical outcomes in lung adenocarcinoma using previously published datasets. Results We found dramatic immunological differences among histological subtypes. Differential gene expression analysis showed that the lepidic and solid subtypes could be differentiated based on their gene expression patterns while the other subtypes shared similar gene expression patterns. Our results indicated that higher L-scores were associated with prolonged survival, and higher S-scores were associated with shortened survival. L-scores and S-scores were also correlated with global genomic features such as tumor mutation burdens and driver genomic events. Interestingly, we observed significantly decreased L-scores and increased S-scores in lung adenocarcinoma samples with EGFR gene amplification but not in samples with EGFR gene mutations. 
In lung cancer cell lines, we observed significant correlations between L-scores and cell sensitivity to a number of targeted drugs including EGFR inhibitors. Moreover, lung cancer patients with higher L-scores were more likely to benefit from immune checkpoint blockade therapy. Conclusions Our findings provided further insights into evaluating histology composition in lung adenocarcinoma. The established signatures reflected that lepidic and solid subtypes in lung adenocarcinoma would be associated with prognosis, genomic features, and responses to targeted therapy and immunotherapy. The signatures therefore suggested potential clinical translation in predicting patient survival and treatment responses. In addition, our framework can be applied to other types of cancer with heterogeneous histological subtypes.


2019 ◽  
Vol 8 (2) ◽  
pp. 205 ◽  
Author(s):  
Shengnan Xu ◽  
Kathryn Ware ◽  
Yuantong Ding ◽  
So Kim ◽  
Maya Sheth ◽  
...  

The evolution of therapeutic resistance is a major cause of death for cancer patients. The development of therapy resistance is shaped by the ecological dynamics within the tumor microenvironment and the selective pressure of the host immune system. These selective forces often lead to evolutionary convergence on pathways or hallmarks that drive progression. Thus, a deeper understanding of the evolutionary convergences that occur could reveal vulnerabilities to treat therapy-resistant cancer. To this end, we combined phylogenetic clustering, systems biology analyses, and molecular experimentation to identify convergences in gene expression data onto common signaling pathways. We applied these methods to derive new insights about the networks at play during transforming growth factor-β (TGF-β)-mediated epithelial–mesenchymal transition in lung cancer. Phylogenetic analyses of gene expression data from TGF-β-treated cells revealed convergence of cells toward amine metabolic pathways and autophagy during TGF-β treatment. Knockdown of the autophagy regulator ATG16L1 re-sensitized lung cancer cells to cancer therapies following TGF-β-induced resistance, implicating autophagy as a TGF-β-mediated chemoresistance mechanism. In addition, high ATG16L expression was found to be a poor prognostic marker in multiple cancer types. These analyses reveal the usefulness of combining evolutionary and systems biology methods with experimental validation to illuminate new therapeutic vulnerabilities for cancer.


2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13882 ◽  
Author(s):  
Binghuang Cai ◽  
Xia Jiang

Analyzing biological system abnormalities in cancer patients based on measures of biological entities, such as gene expression levels, is an important and challenging problem. This paper applies existing methods, Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis, to pathway abnormality analysis in lung cancer using microarray gene expression data. Gene expression data from studies of Lung Squamous Cell Carcinoma (LUSC) in The Cancer Genome Atlas project, and pathway gene set data from the Kyoto Encyclopedia of Genes and Genomes were used to analyze the relationship between pathways and phenotypes. Results, in the form of pathway rankings, indicate that some pathways may behave abnormally in LUSC. For example, both the cell cycle and viral carcinogenesis pathways ranked very high in LUSC. Furthermore, some pathways that are known to be associated with cancer, such as the p53 and the PI3K-Akt signal transduction pathways, were found to rank high in LUSC. Other pathways, such as bladder cancer and thyroid cancer pathways, were also ranked high in LUSC.
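
To make the pathway-ranking idea concrete, here is a minimal sketch of the running-sum statistic behind Gene Set Enrichment Analysis, in its simplest unweighted form. It assumes genes are already ranked by their association with the phenotype. The gene labels are placeholders, and the full method also weights hits by correlation strength and assesses significance by permutation, neither of which is shown.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style enrichment score.

    Walk down the ranked list, stepping up on genes in the set and down on
    genes outside it; the score is the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder gene labels, ordered from most up-regulated to most down-regulated.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]

top_score = enrichment_score(ranked, {"g1", "g2"})     # set clustered at the top
bottom_score = enrichment_score(ranked, {"g5", "g6"})  # set clustered at the bottom
print(top_score, bottom_score)
```

A pathway whose genes cluster at either end of the ranking gets a score near +1 or -1, and ranking pathways by this kind of score yields pathway rankings like those the abstract describes.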


2021 ◽  
Author(s):  
Magdalena Navarro ◽  
T Ian Simpson

Abstract Motivation Autism spectrum disorder (ASD) has a strong, yet heterogeneous, genetic component. Among the various methods that are being developed to help reveal the underlying molecular aetiology of the disease, one that is gaining popularity is the combination of gene expression and clinical genetic data. For ASD, the SFARI-gene database comprises lists of curated genes in which presumed causative mutations have been identified in patients. In order to predict novel candidate SFARI-genes we built classification models combining differential gene expression data for ASD patients and unaffected individuals with a gene’s status in the SFARI-gene list. Results SFARI-genes were not found to be significantly associated with differential gene expression patterns, nor were they enriched in gene co-expression network modules that had a strong correlation with ASD diagnosis. However, network analysis and machine learning models that incorporate information from the whole gene co-expression network were able to predict novel candidate genes that share features of existing SFARI genes and have support for roles in ASD in the literature. We found a statistically significant bias related to the absolute level of gene expression for existing SFARI genes and their scores. It is essential that this bias be taken into account when studies interpret ASD gene expression data at gene, module and whole-network levels. Availability Source code is available from GitHub (https://doi.org/10.5281/zenodo.4463693) and the accompanying data from The University of Edinburgh DataStore (https://doi.org/10.7488/ds/2980).


2021 ◽  
Vol 8 (10) ◽  
pp. 257-262
Author(s):  
Aigli Korfiati ◽  
Giorgos Livanos ◽  
Christos Konstandinou

Computer-aided diagnosis, prognosis and therapy systems have been of great interest for a number of years. The availability of large volumes of data and of powerful computational resources has allowed artificial intelligence approaches to emerge in melanoma-related studies. However, for such approaches to achieve good predictive performance, data availability is of crucial importance. Melanoma-related imaging, biological and clinical data are only partially available and are scattered across various repositories. Thus, in this work, we assemble in a web-accessible database, named ebioMelDB, the widest collection of clinical and dermoscopy images accompanied by patient clinical data and the widest collection of RNA-Seq gene expression data accompanied by patient clinical data. The database organization allows users to select the data that are appropriate for their application of interest (diagnosis, prognosis and therapy). Keywords: melanoma database, integrated data, dermoscopy, imaging, RNA-Seq, clinical data.


2020 ◽  
Author(s):  
Xanthoula Atsalaki ◽  
Lefteris Koumakis ◽  
George Potamias ◽  
Manolis Tsiknakis

Abstract High-throughput technologies, such as chromatin immunoprecipitation (ChIP) with massively parallel sequencing (ChIP-seq), have enabled cost- and time-efficient generation of immense amounts of genome data. The advent of advanced sequencing techniques has allowed biologists and bioinformaticians to investigate biological aspects of cell function and to understand or reveal unexplored disease etiologies. Systems biology attempts to formulate molecular mechanisms as mathematical models, and one of its most important areas is gene regulatory networks (GRNs), collections of DNA segments that interact with each other. GRNs incorporate valuable information about molecular targets that can be correlated with a specific phenotype. In our study we highlight the need to develop new explorative tools and approaches for the integration of different types of -omics data, such as ChIP-seq and GRNs, using pathway analysis methodologies. We present an integrative approach for ChIP-seq and gene expression data on GRNs. Using public microarray expression samples for lung cancer and healthy subjects along with the KEGG human gene regulatory networks, we identified ways to disrupt functional sub-pathways in lung cancer with the aid of CTCF ChIP-seq data, as a proof of concept. We expect that such a systems biology pipeline could assist researchers in identifying correlations and causality of transcription factors over functional or disrupted biological sub-pathways.


2019 ◽  
Author(s):  
Pei-Yau Lung ◽  
Xiaodong Pang ◽  
Yan Li ◽  
Jinfeng Zhang

Abstract Reusability is part of the FAIR data principle, which aims to make data Findable, Accessible, Interoperable, and Reusable. One of the current efforts to increase the reusability of public genomics data has been to focus on the inclusion of quality metadata associated with the data. When necessary metadata are missing, most researchers will consider the data useless. In this study, we develop a framework to predict the missing metadata of gene expression datasets to maximize their reusability. We propose a new metric called Proportion of Cases Accurately Predicted (PCAP), which is optimized in our specifically-designed machine learning pipeline. The new approach performed better than pipelines using commonly used metrics such as F1-score in terms of maximizing the reusability of data with missing values. We also found that different variables might need to be predicted using different machine learning methods and/or different data processing protocols. Using differential gene expression analysis as an example, we show that when missing variables are accurately predicted, the corresponding gene expression data can be reliably used in downstream analyses.
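
The abstract names PCAP but does not define it. The sketch below shows one plausible reading, which is an assumption and not the authors' stated definition: rank cases by the classifier's confidence and report the largest fraction of cases, taken from the most confident down, whose accuracy still meets a target. The function name, the target value, and the data are all hypothetical.

```python
def pcap(confidences, correct, target_accuracy=0.95):
    """Largest fraction of cases that can be kept, in order of decreasing
    confidence, while accuracy on the kept cases stays at or above the target.

    Assumed interpretation of 'Proportion of Cases Accurately Predicted';
    the source abstract does not spell out the formula.
    """
    if len(confidences) != len(correct) or not correct:
        raise ValueError("need one confidence per prediction")
    order = sorted(range(len(correct)), key=lambda i: confidences[i], reverse=True)
    best_k = 0
    n_right = 0
    for k, i in enumerate(order, start=1):
        n_right += 1 if correct[i] else 0
        if n_right / k >= target_accuracy:
            best_k = k
    return best_k / len(correct)


# Hypothetical predictions for one metadata variable.
confidences = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
correct = [True, True, True, False, True, False]

share = pcap(confidences, correct, target_accuracy=0.9)
print(share)
```

Under this reading, the metric rewards a model for being reliably right on as many samples as possible. An averaged score such as F1 instead treats every prediction alike, whatever its confidence.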

