Cancer as a tissue anomaly: classifying tumor transcriptomes based only on healthy data

2018 ◽  
Author(s):  
Thomas P. Quinn ◽  
Thin Nguyen ◽  
Samuel C. Lee ◽  
Svetha Venkatesh

Abstract: Since the turn of the century, researchers have sought to diagnose cancer based on gene expression signatures measured from the blood or biopsy as biomarkers. This task, known as classification, is typically solved using a suite of algorithms that learn a mathematical rule capable of discriminating one group (e.g., cases) from another (e.g., controls). However, discriminatory methods can only identify cancerous samples that resemble those that the algorithm already saw during training. As such, we argue that discriminatory methods are fundamentally ill-suited for the classification of cancer: because the possibility space of cancer is definitively large, the existence of a one-of-a-kind gene expression signature becomes very likely. Instead, we propose using an established surveillance method that detects anomalous samples based on their deviation from a learned normal steady-state structure. By transferring this method to transcriptomic data, we can create an anomaly detector for tissue transcriptomes, a “tissue detector”, that is capable of identifying cancer without ever seeing a single cancer example. Using models trained on normal GTEx samples, we show that our “tissue detector” can accurately classify TCGA samples as normal or cancerous and that its performance is further improved by including more normal samples in the training set. We conclude this report by emphasizing the conceptual advantages of anomaly detection and by highlighting future directions for this field of study.
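The idea of flagging samples by their deviation from a learned normal profile can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a simple per-gene z-score detector, and the function names, the 95% quantile threshold, and the synthetic "expression" values are all assumptions made for illustration. The point it shows is that only healthy samples are used for fitting.

```python
import random
import statistics


def fit_normal_profile(normal_samples):
    """Learn a per-gene mean and standard deviation from healthy samples only."""
    genes = list(zip(*normal_samples))  # transpose rows (samples) into columns (genes)
    means = [statistics.fmean(g) for g in genes]
    stds = [statistics.pstdev(g) or 1.0 for g in genes]  # avoid division by zero
    return means, stds


def anomaly_score(sample, means, stds):
    """Mean squared z-score: distance of a sample from the healthy profile."""
    return statistics.fmean(
        ((x - m) / s) ** 2 for x, m, s in zip(sample, means, stds)
    )


def fit_threshold(normal_samples, means, stds, quantile=0.95):
    """Pick the score below which roughly `quantile` of healthy samples fall."""
    scores = sorted(anomaly_score(s, means, stds) for s in normal_samples)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_anomalous(sample, means, stds, threshold):
    return anomaly_score(sample, means, stds) > threshold


# Synthetic toy data (hypothetical, not GTEx or TCGA): 200 "healthy" samples, 20 genes.
rng = random.Random(0)
n_genes = 20
normal_train = [[rng.gauss(5.0, 1.0) for _ in range(n_genes)] for _ in range(200)]

means, stds = fit_normal_profile(normal_train)
threshold = fit_threshold(normal_train, means, stds)

# A sample whose expression is shifted far from the healthy profile.
shifted_sample = [rng.gauss(9.0, 1.0) for _ in range(n_genes)]
print(is_anomalous(shifted_sample, means, stds, threshold))
```

No example of the shifted class is needed at training time; the threshold is set entirely from the scores of the healthy samples, which is the conceptual difference from a discriminatory classifier.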

2018 ◽  
Vol 36 (15_suppl) ◽  
pp. 10553-10553
Author(s):  
Qinghua Xu ◽  
Qifeng WANG ◽  
Jinying Chen ◽  
Chengshu Chen ◽  
Yifeng Sun ◽  
...  

2008 ◽  
Vol 25 (1) ◽  
pp. 1-16 ◽  
Author(s):  
Orsolya Galamb ◽  
Balázs Györffy ◽  
Ferenc Sipos ◽  
Sándor Spisák ◽  
Anna Mária Németh ◽  
...  

Gene expression analysis of colon biopsies using high-density oligonucleotide microarrays can contribute to the understanding of local pathophysiological alterations and to the functional classification of adenomas (15 samples), colorectal carcinomas (CRC) (15) and inflammatory bowel diseases (IBD) (14). Total RNA was extracted from frozen colonic biopsies, amplified and biotinylated. Genome-wide gene expression profiles were evaluated using HGU133plus2 microarrays and verified by RT-PCR. We applied two independent methods for data normalization and used PAM for feature selection. Leave-one-out stepwise discriminant analysis was performed. Top validated genes included collagen IVα1, lipocalin-2, calumenin and aquaporin-8 in CRC; CD44, met proto-oncogene, chemokine ligand-12, ADAM-like decysin-1 and ATP-binding cassette-A8 in adenoma; and lipocalin-2, ubiquitin D and IFITM2 in IBD. The best differentiating markers between ulcerative colitis and Crohn's disease were cyclin-G2, tripartite motif-containing-31, TNFR shedding aminopeptidase regulator-1 and AMICA. The discriminant analysis correctly classified 96.2% of the samples overall using 7 discriminatory genes (indoleamine-pyrrole-2,3-dioxygenase, ectodermal-neural cortex, TIMP3, fucosyltransferase-8, collectin sub-family member 12, carboxypeptidase D, and transglutaminase-2). Using routine biopsy samples, we successfully performed whole genomic microarray analysis to identify discriminative signatures. Our results provide further insight into the pathophysiological background of colonic diseases and establish a data warehouse that can be mined further.
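The leave-one-out evaluation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a plain nearest-centroid classifier is swapped in for the stepwise discriminant analysis used in the study, and the two-feature toy data and the labels `group_a` and `group_b` are hypothetical.

```python
import math
import statistics


def centroid(rows):
    """Per-feature mean of a list of samples."""
    return [statistics.fmean(col) for col in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose centroid is closest (Euclidean)."""
    by_label = {}
    for row, label in zip(train_x, train_y):
        by_label.setdefault(label, []).append(row)
    centroids = {label: centroid(rows) for label, rows in by_label.items()}
    return min(centroids, key=lambda label: math.dist(centroids[label], sample))


def leave_one_out_accuracy(x, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


# Hypothetical toy data: two well-separated groups measured on two features.
x = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [5.0, 5.2], [4.8, 5.1], [5.2, 4.9]]
y = ["group_a", "group_a", "group_a", "group_b", "group_b", "group_b"]

accuracy = leave_one_out_accuracy(x, y)
print(f"leave-one-out accuracy: {accuracy:.1%}")
```

Because each sample is predicted by a model that never saw it, the resulting accuracy is a less optimistic estimate than scoring on the training data, which matters when only a few dozen biopsies are available.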

