Multiplatform Biomarker Identification using a Data-driven Approach Enables Single-sample Classification

bioRxiv ◽

10.1101/581686 ◽

2019 ◽

Author(s):

Ling Zhang ◽

Ishwor Thapa ◽

Christian Haas ◽

Dhundy Bastola

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Blood Platelets ◽

Gene Expression Profiles ◽

Housekeeping Genes ◽

Single Sample ◽

Data Driven ◽

Superior Performance ◽

Sample Classification

Abstract: High-throughput gene expression profiles have allowed the discovery of potential biomarkers that enable early diagnosis, prognosis and individualized treatment. However, it remains a challenge to identify a set of reliable and reproducible biomarkers across gene expression platforms and laboratories for single-sample diagnosis and prognosis. We address this need with our Data-Driven Reference (DDR) approach, which employs stably expressed housekeeping genes as references to eliminate platform-specific biases and non-biological variability. Our method identifies biomarkers with “built-in” features that can be interpreted consistently regardless of profiling technology, which enables platform-independent classification of single samples. Validation with RNA-seq data from blood platelets shows that DDR achieves superior performance in classifying six different tumor types as well as molecular target statuses (such as MET- or HER2-positive, and mutant KRAS, EGFR or PIK3CA) with smaller sets of biomarkers. We demonstrate on three microarray datasets that our method can identify robust biomarkers for subgrouping medulloblastoma samples despite data perturbation due to different microarray platforms. In addition to identifying the majority of subgroup-specific biomarkers in the nanoString CodeSet, our method detected some potential new biomarkers for subgrouping medulloblastoma. Our results show that the DDR method contributes significantly to single-sample classification of disease and sheds light on personalized medicine.

Multiplatform biomarker identification using a data-driven approach enables single-sample classification

BMC Bioinformatics ◽

10.1186/s12859-019-3140-7 ◽

2019 ◽

Vol 20 (1) ◽

Author(s):

Ling Zhang ◽

Ishwor Thapa ◽

Christian Haas ◽

Dhundy Bastola

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Blood Platelets ◽

Gene Expression Profiles ◽

Housekeeping Genes ◽

Disease Classification ◽

Single Sample ◽

Data Driven ◽

Superior Performance

Abstract

Background: High-throughput gene expression profiles have allowed the discovery of potential biomarkers that enable early diagnosis, prognosis and individualized treatment. However, it remains a challenge to identify a set of reliable and reproducible biomarkers across gene expression platforms and laboratories for single-sample diagnosis and prognosis. We address this need with our Data-Driven Reference (DDR) approach, which employs stably expressed housekeeping genes as references to eliminate platform-specific biases and non-biological variability.

Results: Our method identifies biomarkers with “built-in” features that can be interpreted consistently regardless of profiling technology, which enables platform-independent classification of single samples. Validation with RNA-seq data from blood platelets shows that DDR achieves superior performance in classifying six different tumor types as well as molecular target statuses (such as MET- or HER2-positive, and mutant KRAS, EGFR or PIK3CA) with smaller sets of biomarkers. We demonstrate on three microarray datasets that our method can identify robust biomarkers for subgrouping medulloblastoma samples despite data perturbation due to different microarray platforms. In addition to identifying the majority of subgroup-specific biomarkers in the nanoString CodeSet, our method detected some potential new biomarkers for subgrouping medulloblastoma.

Conclusions: In this study, we present a simple yet powerful data-driven method that contributes significantly to the identification of robust cross-platform gene signatures for single-patient disease classification, facilitating precision medicine. In addition, our method provides a new strategy for transcriptome analysis.
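The abstract above does not give the DDR algorithm itself, so the following is only a minimal sketch of the general idea it describes: choose stably expressed genes from the data as references, then express every other gene relative to those references within a single sample. The gene names, expression values, the coefficient-of-variation stability criterion and both function names are assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumes positive values)."""
    return pstdev(values) / mean(values)


def pick_reference_genes(training, n_refs=2):
    """Return the n_refs genes whose expression varies least across training samples.

    `training` maps a gene name to its expression values across samples.
    Using the coefficient of variation as the stability score is an assumption.
    """
    ranked = sorted(training, key=lambda g: coefficient_of_variation(training[g]))
    return ranked[:n_refs]


def reference_scaled_profile(sample, refs):
    """Log2 ratio of each non-reference gene to the geometric mean of the references.

    Only values from this one sample are used, so no cohort-level
    normalization is needed at classification time.
    """
    ref_level = mean(math.log2(sample[g]) for g in refs)
    return {g: math.log2(v) - ref_level for g, v in sample.items() if g not in refs}


# Hypothetical training data: gene -> expression across four samples.
training = {
    "REF_A": [100.0, 102.0, 98.0, 101.0],
    "REF_B": [50.0, 52.0, 49.0, 51.0],
    "MARKER_X": [10.0, 200.0, 40.0, 400.0],
    "MARKER_Y": [300.0, 20.0, 150.0, 5.0],
}
refs = pick_reference_genes(training, n_refs=2)

# One new sample, and the same sample as it might appear on a platform
# whose signal is uniformly 8 times larger (a deliberately simple bias model).
sample = {"REF_A": 100.0, "REF_B": 50.0, "MARKER_X": 400.0, "MARKER_Y": 25.0}
sample_other_platform = {g: 8.0 * v for g, v in sample.items()}

profile = reference_scaled_profile(sample, refs)
profile_other = reference_scaled_profile(sample_other_platform, refs)
```

Because a uniform scaling factor shifts the markers and the references by the same amount on the log scale, the two profiles agree, which is the sense in which reference-relative features can be read the same way across platforms. Real platform differences are not purely multiplicative, so this shows the principle only.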

Molecular classification of selective oestrogen receptor modulators on the basis of gene expression profiles of breast cancer cells expressing oestrogen receptor α

British Journal of Cancer ◽

10.1038/sj.bjc.6600477 ◽

2002 ◽

Vol 87 (4) ◽

pp. 449-456 ◽

Cited By ~ 16

Author(s):

A S Levenson ◽

I L Kliakhandler ◽

K M Svoboda ◽

K M Pease ◽

S A Kaiser ◽

...

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Oestrogen Receptor ◽

Breast Cancer Cells ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Selective Oestrogen Receptor Modulators ◽

Receptor Modulators

Assessing the validity of blood-based gene expression profiles for the classification of schizophrenia and bipolar disorder: A preliminary report

American Journal of Medical Genetics Part B Neuropsychiatric Genetics ◽

10.1002/ajmg.b.30161 ◽

2005 ◽

Vol 133B (1) ◽

pp. 1-5 ◽

Cited By ~ 156

Author(s):

Ming T. Tsuang ◽

Nadine Nossova ◽

Tom Yager ◽

Min-Min Tsuang ◽

Shi-Chin Guo ◽

...

Keyword(s):

Gene Expression ◽

Bipolar Disorder ◽

Preliminary Report ◽

Expression Profiles ◽

Gene Expression Profiles

Identification and Validation of Novel Reference Genes in Acute Lymphoblastic Leukemia for Droplet Digital PCR

Genes ◽

10.3390/genes10050376 ◽

2019 ◽

Vol 10 (5) ◽

pp. 376 ◽

Cited By ~ 3

Author(s):

Vanessa Villegas-Ruíz ◽

Karina Olmos-Valdez ◽

Kattia Alejandra Castro-López ◽

Victoria Estefanía Saucedo-Tepanecatl ◽

Josselen Carina Ramírez-Chiquito ◽

...

Keyword(s):

Gene Expression ◽

Acute Lymphoblastic Leukemia ◽

Reference Genes ◽

Expression Profiles ◽

Lymphoblastic Leukemia ◽

Gene Expression Profiles ◽

Housekeeping Genes ◽

Digital PCR ◽

Droplet Digital PCR ◽

Cellular Processes

Droplet digital PCR is the most robust method for absolute nucleic acid quantification. However, RNA is a very versatile molecule and its abundance is tissue-dependent. RNA quantification is dependent on a reference control to estimate the abundance. Additionally, in cancer, many cellular processes are deregulated which consequently affects the gene expression profiles. In this work, we performed microarray data mining of different childhood cancers and healthy controls. We selected four genes that showed no gene expression variations (PSMB6, PGGT1B, UBQLN2 and UQCR2) and four classical reference genes (ACTB, GAPDH, RPL4 and RPS18). Gene expression was validated in 40 acute lymphoblastic leukemia samples by means of droplet digital PCR. We observed that PSMB6, PGGT1B, UBQLN2 and UQCR2 were expressed ~100 times less than ACTB, GAPDH, RPL4 and RPS18. However, we observed excellent correlations among the new reference genes (p < 0.0001). We propose that PSMB6, PGGT1B, UBQLN2 and UQCR2 are housekeeping genes with low expression in childhood cancer.
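The two comparisons reported above, a roughly 100-fold difference in abundance and a correlation among candidate reference genes, can be illustrated with a small calculation. This is a sketch under stated assumptions: the abstract does not say which correlation statistic was used (Pearson is assumed here), and the gene labels and copy numbers are invented for illustration rather than taken from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fold_difference(high, low):
    """Ratio of mean abundance between a high- and a low-expressed gene."""
    return mean(high) / mean(low)


# Hypothetical copy numbers for four samples (not data from the study).
classical_ref = [12000.0, 9500.0, 15000.0, 11000.0]
candidate_ref_1 = [120.0, 95.0, 150.0, 110.0]
candidate_ref_2 = [60.0, 50.0, 80.0, 52.0]

fold = fold_difference(classical_ref, candidate_ref_1)
r_candidates = pearson_r(candidate_ref_1, candidate_ref_2)
```

A high correlation between two candidate genes across samples suggests that they track the same sample-level factors, such as RNA input, which is one reason co-varying genes are considered for use as references.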

Classification of Gene Expression Profiles: Comparison of K-means and Expectation Maximization Algorithms

10.1109/his.2008.92 ◽

2008 ◽

Cited By ~ 4

Author(s):

Cristina Rubio-Escudero ◽

Francisco Martínez-Álvarez ◽

Rocío Romero-Zaliz ◽

Igor Zwir

Keyword(s):

Gene Expression ◽

Expectation Maximization ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Expectation Maximization Algorithms ◽

Maximization Algorithms

Molecular Classification of Human Diffuse Gliomas by Multidimensional Scaling Analysis of Gene Expression Profiles Parallels Morphology-Based Classification, Correlates with Survival, and Reveals Clinically-Relevant Novel Glioma Subsets

Brain Pathology ◽

10.1111/j.1750-3639.2002.tb00427.x ◽

2002 ◽

Vol 12 (1) ◽

pp. 108-116 ◽

Cited By ~ 64

Author(s):

Gregory N. Fuller ◽

Kenneth R. Hess ◽

Chang Hun Rhee ◽

W. K. Alfred Yung ◽

Raymond A. Sawaya ◽

...

Keyword(s):

Gene Expression ◽

Multidimensional Scaling ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Scaling Analysis ◽

Multidimensional Scaling Analysis ◽

Diffuse Gliomas

Molecular classification of human endometrial cancer based on gene expression profiles from specialized microarrays

International Journal of Gynecology & Obstetrics ◽

10.1016/j.ijgo.2010.03.020 ◽

2010 ◽

Vol 110 (2) ◽

pp. 125-129 ◽

Cited By ~ 6

Author(s):

YuanYang Yao ◽

YongHua Chen ◽

Yue Wang ◽

XiaoPing Li ◽

JianLiu Wang ◽

...

Keyword(s):

Gene Expression ◽

Endometrial Cancer ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Molecular Classification

Single-sample classification of breast cancer tumors using data-driven reference genes

2016 Indian Control Conference (ICC) ◽

10.1109/indiancc.2016.7441098 ◽

2016 ◽

Author(s):

Burook Misganaw ◽

M. Vidyasagar

Keyword(s):

Breast Cancer ◽

Reference Genes ◽

Single Sample ◽

Data Driven ◽

Sample Classification ◽

Using Data

Organ-Specific Molecular Classification of Primary Lung, Colon, and Ovarian Adenocarcinomas Using Gene Expression Profiles

American Journal of Pathology ◽

10.1016/s0002-9440(10)62509-6 ◽

2001 ◽

Vol 159 (4) ◽

pp. 1231-1238 ◽

Cited By ~ 117

Author(s):

Thomas J. Giordano ◽

Kerby A. Shedden ◽

Donald R. Schwartz ◽

Rork Kuick ◽

Jeremy M.G. Taylor ◽

...

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Organ Specific ◽

Ovarian Adenocarcinomas

PCP: a program for supervised classification of gene expression profiles

Bioinformatics ◽

10.1093/bioinformatics/bti760 ◽

2005 ◽

Vol 22 (2) ◽

pp. 245-247 ◽

Cited By ~ 19

Author(s):

Ljubomir J. Buturović

Keyword(s):

Gene Expression ◽

Supervised Classification ◽

Expression Profiles ◽

Gene Expression Profiles
