SEGreg: a database for human specifically expressed genes and their regulations in cancer and normal tissue

2018 ◽  
Vol 20 (4) ◽  
pp. 1322-1328 ◽  
Author(s):  
Qin Tang ◽  
Qiong Zhang ◽  
Yao Lv ◽  
Ya-Ru Miao ◽  
An-Yuan Guo

Abstract Human specifically expressed genes (SEGs) usually serve as potential biomarkers for disease diagnosis and treatment. However, the regulation underlying their specific expression remains to be revealed. In this study, we constructed the SEG regulation database (SEGreg; available at http://bioinfo.life.hust.edu.cn/SEGreg) to show SEGs and their transcription factor (TF) and microRNA (miRNA) regulations under different physiological conditions, which include normal tissue, cancer tissue and cell line. In total, SEGreg collected 6387, 1451, 4506 and 5320 SEGs from expression profiles of 34 cancer types and 55 tissues of The Cancer Genome Atlas, Cancer Cell Line Encyclopedia, Human Body Map and Genotype-Tissue Expression databases/projects, respectively. The miRNAs and TFs expressed in each cancer or tissue were identified from miRNA and gene expression profiles, and their targets were collected from several public resources. The regulatory networks of all SEGs were then constructed and integrated into SEGreg. Through a user-friendly interface, users can browse and search SEGreg by gene name, data source, tissue, cancer type and regulators. In summary, SEGreg is a specialized resource to explore SEGs and their regulations, which provides clues to reveal the mechanisms of carcinogenesis and biological processes.

2019 ◽  
Author(s):  
Haiwei Wang ◽  
Xinrui Wang ◽  
Liangpu Xu ◽  
Ji Zhang ◽  
Hua Cao

Abstract Background: For a specific cancer type, the transcriptional profile is determined by the combination of innate transcriptional features of the original normal tissue and the acquired transcriptional characteristics mediated by genomic and epigenetic aberrations during tumor development. However, the classification of innate normal tissue specific genes and acquired tumor specific genes has not been studied in a pan-cancer manner. Methods: The innate and acquired gene expression profiles in each tumor type were studied using The Cancer Genome Atlas (TCGA) RNA-seq dataset. The prognostic effects of the tumor acquired genes were determined using the “survival” package in R. The methylation of the tumor acquired genes was delineated using TCGA HumanMethylation450 microarray data. Results: 90% of liver hepatocellular carcinoma (LIHC) specific genes are derived from innate normal liver specific genes. In contrast, 90.3% of kidney clear cell carcinoma (KIRC) specific genes and 90.9% of lung squamous cell carcinoma (LUSC) specific genes are acquired during tumor development. The innate normal tissue specific genes are down-regulated in tumor tissues, while the tumor acquired specific genes are up-regulated in tumor tissues. The innate normal tissue specific genes and the tumor acquired specific genes are both associated with overall survival in some tumor types. DNA hypermethylation of normal tissue specific genes contributes to the inhibition of their expression in cancer cells, and the tumor acquired specific genes are activated by DNA hypomethylation and genomic aberrations. Conclusions: Our results describe the specific transcriptional features across cancer types and suggest that the tumor acquired specific genes are potential targets for anti-cancer therapy.
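The split between innate and acquired genes described in this abstract can be illustrated with plain set operations. The sketch below is only one reading of the abstract: it assumes a tumor specific gene counts as "innate" when it is also specific to the matching normal tissue and as "acquired" otherwise. The function name and the gene identifiers are hypothetical and do not come from the paper.

```python
def split_innate_acquired(tumor_specific, normal_specific):
    """Partition tumor specific genes by whether the normal tissue shares them.

    Assumption: "innate" means also specific to the matching normal tissue;
    "acquired" means specific to the tumor only.
    """
    innate = tumor_specific & normal_specific
    acquired = tumor_specific - normal_specific
    return innate, acquired


# Hypothetical gene sets, not real data.
tumor_specific = {"G1", "G2", "G3", "G4"}
normal_specific = {"G1", "G2", "G3", "G9"}

innate, acquired = split_innate_acquired(tumor_specific, normal_specific)
innate_fraction = len(innate) / len(tumor_specific)
```

With these toy sets, three of the four tumor specific genes are innate, which is the kind of proportion the abstract reports per cancer type.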


2020 ◽  
Author(s):  
Wei Ma ◽  
Dandan Li ◽  
Changjian Zhang ◽  
Ming Xiong ◽  
Yuanyuan Qiao

Abstract Purpose: We sought a new gene signature by combining the tumor-derived expression profile with the adjacent normal-derived expression profile in order to find more robust cancer biomarkers. Methods: The log2-transformed ratio of tumor tissue to adjacent normal tissue expression (Log2TN), tumor-derived expression, and normal-derived expression were each used in univariate Cox regression in The Cancer Genome Atlas (TCGA) lung squamous cell carcinoma (LUSC) cohort. Then, we used factor analysis and least absolute shrinkage and selection operator Cox (LASSO-Cox) regression to select gene signatures in TCGA LUSC for Log2TN, tumor, and adjacent normal respectively. Results: By comparing Log2TN with tumor and adjacent normal in LUSC, we found that genes derived from Log2TN are more robust (p = 0.006 and p = 0.001) and have lower p-values (p < 0.001). The gene signature selected from Log2TN shows the best generalization in the three GEO datasets even though only tumor-derived expression profiles were available in the three datasets. Enrichment analysis showed that the tumor cells mainly focus on proliferation while losing metabolic function. Conclusions: These results indicate that (1) Log2TN could yield more robust genes and gene signatures than the tumor-derived expression profiles used traditionally; (2) the adjacent normal tissue may also play an important role in the progress and outcome of the tumor. Implications for Cancer Survivors: By combining the tumor-derived expression profile and the adjacent normal-derived expression profile, we could find more robust gene signatures than with traditional methods. Using these robust gene signatures, robust cancer biomarkers could be constructed, which will greatly help to improve cancer prognosis.
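The Log2TN quantity used in this abstract is the log2 ratio of tumor expression to adjacent normal expression for each gene. A minimal sketch follows. The pseudocount is an assumption added here to avoid taking the log of zero, since the abstract does not say how zero values were handled, and the gene names and values are hypothetical.

```python
import math


def log2_tn(tumor, normal, pseudocount=1.0):
    """Per-gene log2 ratio of tumor to adjacent normal expression.

    The pseudocount is an assumption of this sketch, not a detail from the paper.
    Only genes present in both profiles are returned.
    """
    return {
        gene: math.log2((tumor[gene] + pseudocount) / (normal[gene] + pseudocount))
        for gene in tumor.keys() & normal.keys()
    }


# Hypothetical expression values for one patient, not real data.
tumor = {"GENE_A": 31.0, "GENE_B": 3.0, "GENE_C": 7.0}
normal = {"GENE_A": 7.0, "GENE_B": 15.0, "GENE_C": 7.0}

ratios = log2_tn(tumor, normal)
```

Positive values mark genes that are higher in the tumor, negative values mark genes that are higher in the adjacent normal tissue, and zero means no difference.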


2021 ◽  
Author(s):  
H. Robert Frost

Abstract The genetic alterations that underlie cancer development are highly tissue-specific with the majority of driving alterations occurring in only a few cancer types and with alterations common to multiple cancer types often showing a tissue-specific functional impact. This tissue-specificity means that the biology of normal tissues carries important information regarding the pathophysiology of the associated cancers, information that can be leveraged to improve the power and accuracy of cancer genomic analyses. Research exploring the use of normal tissue data for the analysis of cancer genomics has primarily focused on the paired analysis of tumor and adjacent normal samples. Efforts to leverage the general characteristics of normal tissue for cancer analysis have received less attention with most investigations focusing on understanding the tissue-specific factors that lead to individual genomic alterations or dysregulated pathways within a single cancer type. To address this gap and support scenarios where adjacent normal tissue samples are not available, we explored the genome-wide association between the transcriptomes of 21 solid human cancers and their associated normal tissues as profiled in healthy individuals. While the average gene expression profiles of normal and cancerous tissue may appear distinct, with normal tissues more similar to other normal tissues than to the associated cancer types, when transformed into relative expression values, i.e., the ratio of expression in one tissue or cancer relative to the mean in other tissues or cancers, the close association between gene activity in normal tissues and related cancers is revealed. As we demonstrate through an analysis of tumor data from The Cancer Genome Atlas and normal tissue data from the Human Protein Atlas, this association between tissue-specific and cancer-specific expression values can be leveraged to improve the prognostic modeling of cancer, the comparative analysis of different cancer types, and the analysis of cancer and normal tissue pairs.
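The relative expression transform described in this abstract divides a gene's expression in one tissue by its mean expression in the other tissues. The sketch below assumes a single gene with one value per tissue. The function name, the handling of a zero mean, and the values are illustrative assumptions rather than details from the paper.

```python
def relative_expression(expr):
    """Return, per tissue, the value divided by the mean of the other tissues.

    expr maps tissue name to the expression of one gene in that tissue.
    Returning infinity for a zero mean is a choice made for this sketch.
    """
    if len(expr) < 2:
        raise ValueError("need at least two tissues to form a relative value")
    rel = {}
    for tissue, value in expr.items():
        others = [v for t, v in expr.items() if t != tissue]
        mean_others = sum(others) / len(others)
        rel[tissue] = value / mean_others if mean_others > 0 else float("inf")
    return rel


# Hypothetical values for one gene, not real data.
gene_expr = {"liver": 90.0, "lung": 10.0, "kidney": 20.0}
rel = relative_expression(gene_expr)
```

A value well above 1 flags a gene as specific to that tissue, and the same transform can be applied across cancer types so that tissue and cancer profiles become comparable.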


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249424
Author(s):  
Stepan Nersisyan ◽  
Alexei Galatenko ◽  
Vladimir Galatenko ◽  
Maxim Shkurnikov ◽  
Alexander Tonevitsky

Analysis of regulatory networks is a powerful framework for identification and quantification of intracellular interactions. We introduce miRGTF-net, a novel tool for construction of miRNA-gene-TF networks. We consider multiple transcriptional and post-transcriptional interaction types, including regulation of gene and miRNA expression by transcription factors, gene silencing by miRNAs, and co-expression of host genes with their intronic miRNAs. The underlying algorithm uses information on experimentally validated interactions as well as integrative miRNA/mRNA expression profiles in a given set of samples. The latter ensures simultaneous tissue-specificity and biological validity of interactions. We applied miRGTF-net to paired miRNA/mRNA-sequencing data of breast cancer samples from The Cancer Genome Atlas (TCGA). Together with topological analysis of the constructed network we showed that considered players can form reliable prognostic gene signatures for ER-positive breast cancer. A number of signatures demonstrated remarkably high accuracy on transcriptomic data obtained by both microarrays and RNA sequencing from several independent patient cohorts. Furthermore, an essential part of prognostic genes were identified as direct targets of transcription factor E2F1. The putative interplay between estrogen receptor alpha and E2F1 was suggested as a potential recurrence factor in patients treated with tamoxifen. Source codes of miRGTF-net are available at GitHub (https://github.com/s-a-nersisyan/miRGTF-net).


2021 ◽  
Vol 11 ◽  
Author(s):  
Chen Xue ◽  
Yalei Zhao ◽  
Ganglei Li ◽  
Lanjuan Li

The ALYREF protein acts as a crucial epigenetic regulator in several cancers. However, the specific expression levels and functional roles of ALYREF in cancers are largely unknown, including for hepatocellular carcinoma (HCC). In a pan-cancer tissue analysis that included HCC, we assessed the expression of ALYREF compared to normal tissues using The Cancer Genome Atlas database. Associations between ALYREF gene expression and the clinical characteristics of HCC patient samples were assessed using the UALCAN database. Kaplan-Meier plots were generated to assess HCC patient prognosis, and the TIMER database was used to explore associations between ALYREF expression and immune-cell infiltrations. The same methods were used to assess eIF4A3 expression in HCC patient samples. In addition, ALYREF- and eIF4A3-related differentially expressed genes (DEGs) were determined using LinkedOmics, associated protein functionalities were predicted for positively associated DEGs, and both the TargetScan and miRDB databases were used to predict potential upstream miRNAs for control of ALYREF and eIF4A3 expression. We found that ALYREF gene expression was dysregulated in several cancers and was significantly elevated in HCC patient tissue samples and HCC cell lines. The overexpression of ALYREF was significantly related to both advanced tumor-node-metastasis stages and poor HCC prognosis. Furthermore, we found that eIF4A3 expression was significantly correlated with ALYREF expression, and that upregulated eIF4A3 was significantly associated with poor HCC patient outcomes. In the protein-protein interaction network, we identified eight hub genes based on the positively associated DEGs in common between ALYREF and eIF4A3, and the high expression levels of these hub genes were positively associated with patient clinical outcomes. In addition, we identified miR-4666a-5p and miR-6124 as potential regulators of ALYREF and eIF4A3 expression. These findings suggest that increased ALYREF expression may function as a novel biomarker for both HCC diagnosis and prognosis predictions.


2016 ◽  
Author(s):  
Roni Rasnic ◽  
Nathan Linial ◽  
Michal Linial

ABSTRACT The primary function of microRNAs (miRNAs) is to maintain cell homeostasis. In cancerous tissues, miRNA expression undergoes drastic alterations. In this study, we used miRNA expression profiles from The Cancer Genome Atlas (TCGA) of 24 cancer types and 3 healthy tissues, collected from >8500 samples. We seek to classify the cancer’s origin and tissue identification using the expression from 1046 reported miRNAs. Despite an apparent uniform appearance of miRNAs among cancerous samples, we recover indispensable information from lowly expressed miRNAs regarding the cancer/tissue types. Multiclass support vector machine classification yields an average recall of 58% in identifying the correct tissue and tumor types. Data discretization has led to substantial improvement, reaching an average recall of 91% (95% median). We propose a straightforward protocol as a crucial step in classifying tumors of unknown primary origin. Our counter-intuitive conclusion is that in almost all cancer types, highly expressed miRNAs mask the significant signal that lowly expressed miRNAs provide.
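The abstract does not state which discretization scheme was used, so the following is only an illustrative sketch of the general idea: expression values are mapped to a small number of ordered bins so that a few very highly expressed miRNAs no longer dominate by magnitude. The thresholds, the function name and the miRNA labels are all hypothetical.

```python
from bisect import bisect_right


def discretize(profile, thresholds=(1.0, 10.0, 100.0)):
    """Map each expression value to a bin index.

    The bin index is the number of thresholds that are less than or equal
    to the value. The thresholds are placeholders, not values from the paper.
    """
    return {mirna: bisect_right(thresholds, value) for mirna, value in profile.items()}


# Hypothetical expression values for one sample, not real data.
sample = {"miR-a": 0.0, "miR-b": 4.0, "miR-c": 250.0, "miR-d": 12000.0}
binned = discretize(sample)
```

After binning, miR-c and miR-d fall into the same top bin despite a nearly fifty-fold difference in raw value, while the distinction between absent and lowly expressed miRNAs is preserved for a downstream classifier.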


2019 ◽  
Author(s):  
Xiao Ma ◽  
Shuangshuang Cen ◽  
Luming Wang ◽  
Chao Zhang ◽  
Limin Wu ◽  
...  

Abstract Background: The gonad is the major factor affecting animal reproduction. The regulatory mechanism of protein-coding gene expression involved in reproduction remains to be elucidated. Increasing evidence has shown that ncRNAs play key regulatory roles in gene expression in many life processes. The roles of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) in reproduction have been investigated in some species. However, the regulation patterns of miRNAs and lncRNAs in the sex-biased expression of protein-coding genes remain to be elucidated. In this study, we performed an integrated analysis of miRNA, messenger RNA (mRNA), and lncRNA expression profiles to explore their regulatory patterns in the female ovary and male testis of the soft-shelled turtle, Pelodiscus sinensis. Results: We identified 10 796 mature miRNAs, 44 678 mRNAs, and 58 923 lncRNAs in the testis and ovary. A total of 16 817 target genes were identified for miRNAs. Of these, 11 319 mRNAs, 10 495 lncRNAs, and 633 miRNAs were differentially expressed. The predicted target genes of these differentially expressed (DE) miRNAs and lncRNAs included genes related to reproduction regulation. Furthermore, we found that 5 408 DElncRNAs and 186 DEmiRNAs showed sex-specific expression. Of these, 3 miRNAs and 917 lncRNAs were testis specific and 186 DEmiRNAs and 4 491 DElncRNAs were ovary specific. We constructed competing endogenous lncRNA-miRNA-mRNA networks using bioinformatics, including 273 DEmRNAs, 5 730 DEmiRNAs, and 2 945 DElncRNAs. The target genes of the differentially expressed miRNAs and lncRNAs included Wt1, Creb3l2, Gata4, Wnt2, Nr5a1, Hsd17, Igf2r, H2afz, Lin52, Trim71, Zar1, and Jazf1, etc. Conclusions: In animals, miRNAs and lncRNAs regulate the reproduction process, including the regulation of oocyte maturation and spermatogenesis. Considering their importance, the identified miRNAs, lncRNAs, and their targets in P. sinensis might be useful for genome editing to produce higher quality aquaculture animals. A thorough understanding of ncRNA-based cellular regulatory networks will aid in the improvement of P. sinensis reproduction traits for aquaculture.


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. e23162-e23162
Author(s):  
Konstantin Volyanskyy ◽  
Minghao Zhong ◽  
Payal Keswarpu ◽  
John T Fallon ◽  
Michael Paul Fanucchi ◽  
...  

e23162 Background: Cancer is characterized by a variety of heterogeneous genomic and transcriptomic patterns involving highly complex biological signaling pathways. The problem of identifying the factors driving tumor progression becomes even more challenging due to intricate interaction mechanisms between these pathways. Using novel approaches in machine learning, we demonstrate the ability to quantitatively describe characteristic signaling patterns in cancer based on transcriptomic data. Methods: We used RNASeq data from 20531 genes in 174 samples of GBM from The Cancer Genome Atlas including 5 major histological subtypes – Classical, G-CIMP, Mesenchymal, Neural, and Proneural, and developed a predictive computational framework for molecular subtype differentiation from normal tissue relying on variance-based gene selection and the random forest algorithm. Results: We obtained a few key findings – (1) genes from cell signaling pathways alone differentiate each subtype from normal tissue with 100% accuracy; (2) predictive genes are specific to each subtype; (3) inferred pathway interactions are also specific to each subtype; (4) typically most of the predictive genes involved in signaling are down-regulated in tumor compared to normal tissue (MAPT, PRKCG, PDE2A, RYR2, ATP1B1, GRN1, GNAO1); however, in each subtype we observed a smaller subset of predictive genes which are highly up-regulated in tumor (ID3, FN1, JAG1, F2R, COL4A1, EDAR, CDK2, CDK4, MFNG, BIRC5, CCNB2). We detected and quantitatively evaluated characteristic signaling pathway involvement across the GBM subtypes for MAPK, RAP1, RAS, Notch, PI3K-Akt, mTOR, FoxO, Jak-STAT, Wnt, cAMP, and Calcium Signaling, providing a unique approximation for each subtype signaling profile. Conclusions: In this study, we identified gene expression profiles and associated signaling pathways for distinguishing GBM subtypes from normal tissue. We observed and described a dense, complex picture of interacting signaling pathways. The detected interactions may provide clinical insights and could be used to identify potential therapeutic targets; however, more research is needed to confirm this.
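The variance-based gene selection step mentioned in the Methods can be sketched as ranking genes by their variance across samples and keeping the top few. This sketch covers only that filtering step and leaves out the random forest classifier. The cutoff, the function name and the toy matrix are assumptions made for illustration and are not taken from the abstract.

```python
from statistics import pvariance


def top_variance_genes(expr, k):
    """Return the k genes with the highest variance across samples.

    expr maps gene name to a list of expression values, one per sample.
    """
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)
    return ranked[:k]


# Hypothetical expression matrix with four samples, not real data.
expr = {
    "GENE_A": [1.0, 1.0, 1.0, 1.0],    # constant, variance 0
    "GENE_B": [0.0, 10.0, 0.0, 10.0],  # variance 25
    "GENE_C": [4.0, 6.0, 4.0, 6.0],    # variance 1
}

selected = top_variance_genes(expr, 2)
```

Genes that barely change across samples, like GENE_A here, carry little information for separating subtypes from normal tissue, which is the usual motivation for filtering on variance before training a classifier.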


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Milad Mostavi ◽  
Yu-Chiao Chiu ◽  
Yidong Chen ◽  
Yufei Huang

Abstract Background The state-of-the-art deep learning based cancer type prediction can only predict cancer types whose samples are available during training, where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. Results We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from The Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer related functions between primary and metastatic tumors. Conclusion This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognosis, and our understanding of cancer.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Jiani Wu ◽  
Dongqiang Zeng ◽  
Shimeng Zhi ◽  
Zilan Ye ◽  
Wenjun Qiu ◽  
...  

Abstract Background Tumor-derived exosomes (TEXs) are involved in tumor progression and the immune modulation process and mediate intercellular communication in the tumor microenvironment. Although exosomes are considered promising liquid biomarkers for disease diagnosis, it is difficult to discriminate TEXs and to develop TEX-based predictive biomarkers. Methods In this study, the gene expression profiles and clinical information were collected from The Cancer Genome Atlas (TCGA) database, IMvigor210 cohorts, and six independent Gene Expression Omnibus datasets. A TEXs-associated signature named TEXscore was established to predict overall survival in multiple cancer types and in patients undergoing immune checkpoint blockade therapies. Results Based on exosome-associated genes, we first constructed a tumor-derived exosome signature named TEXscore using a principal component analysis algorithm. In single-cell RNA-sequencing data analysis, ascending TEXscore was associated with disease progression and poor clinical outcomes. In the TCGA Pan-Cancer cohort, TEXscore was elevated in tumor samples rather than in normal tissues, thereby serving as a reliable biomarker to distinguish cancer from non-cancer sources. Moreover, high TEXscore was associated with shorter overall survival across 12 cancer types. TEXscore showed great potential in predicting immunotherapy response in melanoma, urothelial cancer, and renal cancer. The immunosuppressive microenvironment characterized by macrophages, cancer-associated fibroblasts, and myeloid-derived suppressor cells was associated with high TEXscore in the TCGA and immunotherapy cohorts. Besides, TEXscore-associated miRNAs and gene mutations were also identified. Further experimental research will facilitate the extending of TEXscore in tumor-associated exosomes. Conclusions TEXscore capturing tumor-derived exosome features might be a robust biomarker for prognosis and treatment responses in independent cohorts.

