Klasifikasi Data Microarray Menggunakan Discrete Wavelet Transform dan Extreme Learning Machine

Khadijah Khadijah; Sri Hartati

doi:10.22146/ijccs.6638

Klasifikasi Data Microarray Menggunakan Discrete Wavelet Transform dan Extreme Learning Machine

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.6638 ◽

2015 ◽

Vol 9 (1) ◽

pp. 33

Author(s):

Khadijah Khadijah ◽

Sri Hartati

Keyword(s):

Cancer Diagnosis ◽

Microarray Data ◽

Discrete Wavelet ◽

Microarray Gene Expression ◽

Testing Data ◽

Leukemia Dataset ◽

Minimum Sensitivity ◽

Learning Machine ◽

Approximation Coefficients ◽

Low Sensitivity

Microarray data are used as an alternative for cancer diagnosis because diagnosis based on morphological structure is difficult: different classes of cancer often show only subtle morphological differences. The aim of this research is to build a classifier for microarray data. The classification process begins by reducing the dimensionality of the microarray data with the discrete wavelet transform (DWT): each sample is decomposed to a chosen decomposition level, and the approximation coefficients at that level are taken as the features of the sample. These features are then the input to the classifier, an extreme learning machine (ELM) implemented on a radial basis function network (RBFN). The datasets used are multiclass microarray datasets: GCM (16,063 genes, 14 classes) and Subtypes-Leukemia (12,600 genes, 7 classes). Testing is done by randomly dividing the data into training and testing sets ten times, with the same proportion of training and testing data each time. The classifier built for the GCM dataset does not yet perform well, with an accuracy of about 75% ± 6.25% and a mean minimum sensitivity of 15% ± 19.95%. The low minimum sensitivity indicates that sensitivity is uneven across classes and that some classes still have low sensitivity. The classifier for the Subtypes-Leukemia dataset, which has fewer classes than GCM, performs better, with an accuracy of 87.68% ± 2.88% and a mean minimum sensitivity of 51.90% ± 20.29%.

Keywords: microarray, gene expression, DWT, ELM, RBFN
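The pipeline described in the abstract above (DWT approximation coefficients as features, an ELM-trained RBF network as classifier, accuracy and minimum sensitivity as metrics) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Haar wavelet, RBF centers sampled from the training samples, the width range, and every function name are assumptions, since the abstract does not specify them.

```python
import numpy as np

def haar_approximation(x, level):
    """Haar DWT approximation coefficients of a 1-D signal at `level` (assumed wavelet)."""
    a = np.asarray(x, dtype=float)
    for _ in range(level):
        if a.size % 2:                       # pad odd lengths by repeating the last value
            a = np.append(a, a[-1])
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)  # low-pass filter, then downsample by 2
    return a

def rbf_hidden(X, centers, widths):
    """Gaussian RBF hidden-layer outputs, shape (n_samples, n_hidden)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))

def elm_rbf_fit(X, y, n_hidden, rng):
    """ELM training: random hidden parameters, output weights by least squares."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    # Assumption: centers are drawn (without replacement) from the training samples,
    # so n_hidden must not exceed the number of samples.
    centers = X[rng.choice(len(X), size=n_hidden, replace=False)]
    widths = rng.uniform(0.5, 2.0, size=n_hidden)       # assumed width range
    H = rbf_hidden(X, centers, widths)
    T = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    beta = np.linalg.pinv(H) @ T                        # Moore-Penrose solution
    return centers, widths, beta, classes

def elm_rbf_predict(X, model):
    centers, widths, beta, classes = model
    scores = rbf_hidden(np.asarray(X, dtype=float), centers, widths) @ beta
    return classes[np.argmax(scores, axis=1)]

def minimum_sensitivity(y_true, y_pred):
    """Smallest per-class sensitivity (recall) over all classes in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(min(np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)))
```

Each decomposition level roughly halves the feature count, so a sample with 16,063 gene values would shrink to about 1,004 features at level 4 under this sketch. The only trained parameters are the output weights `beta`, which is what makes ELM training a single least-squares solve.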

Multicategory classification using an Extreme Learning Machine for microarray gene expression cancer diagnosis

2010 INTERNATIONAL CONFERENCE ON COMMUNICATION CONTROL AND COMPUTING TECHNOLOGIES ◽

10.1109/icccct.2010.5670741 ◽

2010 ◽

Cited By ~ 4

Author(s):

S. Santhosh Baboo ◽

S. Sasikala

Keyword(s):

Gene Expression ◽

Extreme Learning Machine ◽

Cancer Diagnosis ◽

Microarray Gene Expression ◽

Multicategory Classification ◽

Microarray Gene ◽

Learning Machine

Multicategory Classification Using An Extreme Learning Machine for Microarray Gene Expression Cancer Diagnosis

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2007.1012 ◽

2007 ◽

Vol 4 (3) ◽

pp. 485-495 ◽

Cited By ~ 152

Author(s):

Runxuan Zhang ◽

G.-B. Huang ◽

N. Sundararajan ◽

P. Saratchandran

Keyword(s):

Gene Expression ◽

Extreme Learning Machine ◽

Cancer Diagnosis ◽

Microarray Gene Expression ◽

Multicategory Classification ◽

Microarray Gene ◽

Learning Machine

Extracellular Vesicles in Tumor Diagnosis: A Mini-Review

Current Molecular Medicine ◽

10.2174/1573405616666201209103154 ◽

2020 ◽

Vol 20 ◽

Author(s):

Si Yu ◽

Menglin Huang ◽

Jingyu Wang ◽

Yongchang Zheng ◽

Haifeng Xu

Keyword(s):

Clinical Diagnosis ◽

Sensitivity And Specificity ◽

Cancer Diagnosis ◽

Cancer Biomarkers ◽

Related Substances ◽

Physiological Processes ◽

Cancer Pathogenesis ◽

Limited Application ◽

Low Sensitivity ◽

Shed Light

Wide exploration of noninvasive tumor/cancer biomarkers has shed light on clinical diagnosis. However, many under-investigated biomarkers have shown limited application potential due to low sensitivity and specificity, while extracellular vesicles (EVs) have gradually been recognized as promising candidates. EVs are small vesicles that transport bioactive cargos between cells in multiple physiological processes and also in tumor/cancer pathogenesis. This review aims to present recent studies of EVs covering their structure, classification and physiological functions, as well as their changes in tumor initiation and progression. Furthermore, we focus on advances of EVs and/or EV-related substances in cancer diagnosis, and summarize ongoing studies of promising candidates for future investigation.

SEMIPARAMETRIC CLUSTERING METHOD FOR MICROARRAY DATA ANALYSIS

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972000800345x ◽

2008 ◽

Vol 06 (02) ◽

pp. 261-282 ◽

Cited By ~ 2

Author(s):

AO YUAN ◽

WENQING HE

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Mixture Distribution ◽

Information Criterion ◽

Optimal Number ◽

Microarray Data Analysis ◽

Parametric Methods ◽

Clustering Methods ◽

Microarray Gene Expression ◽

Data Set

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data generating mechanism, the parametric methods perform well, but not so when there is nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay the price for efficiency loss. In an attempt to utilize the known mixture form to increase efficiency, and to free assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach possesses the form of parametric mixture, with no assumptions to the subdistributions. The subdistributions are estimated nonparametrically, with constraints just being imposed on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.

Microarray gene expression profiling in colorectal (HCT116) and hepatocellular (HepG2) carcinoma cell lines treated with Melicope ptelefolia leaf extract reveals transcriptome profiles exhibiting anticancer activity

PeerJ ◽

10.7717/peerj.5203 ◽

2018 ◽

Vol 6 ◽

pp. e5203 ◽

Cited By ~ 6

Author(s):

Mohammad Faujul Kabir ◽

Johari Mohd Ali ◽

Onn Haji Hashim

Keyword(s):

Gene Expression ◽

Cell Cycle ◽

Dna Replication ◽

Cell Lines ◽

Microarray Data ◽

Expression Profiling ◽

Cell Cycle Progression ◽

Anticancer Activities ◽

Microarray Gene Expression ◽

Cycle Progression

Background: We have previously reported anticancer activities of Melicope ptelefolia (MP) leaf extracts on four different cancer cell lines. However, the underlying mechanisms of action have yet to be deciphered. In the present study, the anticancer activity of MP hexane extract (MP-HX) on colorectal (HCT116) and hepatocellular carcinoma (HepG2) cell lines was characterized through microarray gene expression profiling. Methods: HCT116 and HepG2 cells were treated with MP-HX for 24 hr. Total RNA was extracted from the cells and used for transcriptome profiling using the Applied Biosystems GeneChip™ Human Gene 2.0 ST Array. Gene expression data were analysed using the Applied Biosystems Expression Console and Transcriptome Analysis Console software. Pathway enrichment analysis was performed using Ingenuity Pathway Analysis (IPA) software. The microarray data were validated by profiling the expression of 17 genes through quantitative reverse transcription PCR (RT-qPCR). Results: MP-HX induced differential expression of 1,290 and 1,325 genes in HCT116 and HepG2 cells, respectively (microarray data fold change, MA_FC ≥ ±2.0). The direction of gene expression change for the 17 genes assayed through RT-qPCR agreed with the microarray data. In both cell lines, MP-HX modulated the expression of many genes in directions that support antiproliferative activity. IPA software analyses revealed that MP-HX modulated canonical pathways, networks and biological processes that are associated with cell cycle, DNA replication, cellular growth and cell proliferation. In both cell lines, upregulation of genes which promote apoptosis, cell cycle arrest and growth inhibition was observed, while genes that are typically overexpressed in diverse human cancers or that promote cell cycle progression, DNA replication and cellular proliferation were downregulated. The genes upregulated by MP-HX include pro-apoptotic genes (DDIT3, BBC3, JUN), cell cycle arresting genes (CDKN1A, CDKN2B), growth arrest/repair genes (TP53, GADD45A) and a metastasis suppression gene (NDRG1). MP-HX downregulated the expression of genes that could promote anti-apoptotic effects, cell cycle progression, and tumor development and progression, including BIRC5, CCNA2, CCNB1, CCNB2, CCNE2, CDK1/2/6, GINS2, HELLS, MCM2/10, PLK1, RRM2 and SKP2. It is interesting to note that all six top-ranked genes proposed to be cancer-associated (PLK1, MCM2, MCM3, MCM7, MCM10 and SKP2) were downregulated by MP-HX in both cell lines. Discussion: The present study showed that the anticancer activities of MP-HX are exerted through its actions on genes regulating apoptosis, cell proliferation, DNA replication and cell cycle progression. These findings further project the potential use of MP as a nutraceutical agent for cancer therapeutics.

Intraoperative Frozen Section Performance for Thyroid Cancer Diagnosis

10.21203/rs.3.rs-570487/v1 ◽

2021 ◽

Author(s):

Iuri Martin Goemann ◽

Francisco Paixão ◽

Alceu Migliavaca ◽

José Ricardo Guimarães ◽

Rafael Selbach Scheffel ◽

...

Keyword(s):

Thyroid Cancer ◽

Cancer Diagnosis ◽

Thyroid Nodules ◽

Frozen Section ◽

Needle Aspiration ◽

Intraoperative Frozen Section ◽

Indeterminate Nodules ◽

Intraoperative Management ◽

Low Sensitivity ◽

Sensitivity Specificity

Purpose: A primary medical concern with thyroid nodules is excluding thyroid cancer, which is present in approximately 5% of all thyroid nodules. Fine-needle aspiration biopsy (FNAB) has a paramount role in distinguishing benign from malignant thyroid nodules due to its availability and diagnostic performance. Nevertheless, intraoperative frozen section (iFS) is still advocated as a valuable tool for surgery planning, especially for indeterminate nodules. Methods: We compared the performance of FNAB and iFS in thyroid cancer diagnosis among nodules in Bethesda Categories (BC) I to VI. The performance of the FNAB and iFS tests was calculated using final histopathology results as the gold standard. Results: In total, 316 patients were included in the analysis. Both FNAB and iFS data were available for 272 patients (86.1%). The overall malignancy rate was 30.4% (n=96). The FNAB sensitivity, specificity, and accuracy for benign (BC II) and malignant (BC V and VI) nodules were 89.5%, 97.1%, and 94.1%, respectively. For all nodules evaluated, the iFS sensitivity, specificity, and accuracy were 80.9%, 100%, and 94.9%, respectively. For indeterminate nodules and follicular lesions (BC III and IV), the iFS sensitivity, specificity, and accuracy were 25%, 100%, and 88.7%, respectively. For BC I nodules, iFS had an accuracy of 95.2%. Conclusion: Our results do not support routine iFS for indeterminate nodules or follicular neoplasms (BC III and IV) due to its low sensitivity. In these categories, iFS is not sufficiently accurate to guide the intraoperative management of thyroidectomies. iFS for BC I nodules could be a reasonable option and should be specifically investigated.
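The sensitivity, specificity, and accuracy figures quoted above follow the standard definitions against a gold standard (here, final histopathology). A minimal sketch of that arithmetic is given below; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying confusion-matrix counts.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # malignant cases correctly flagged
    specificity = tn / (tn + fp)                # benign cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # all correct calls over all cases
    return sensitivity, specificity, accuracy

# Hypothetical counts, NOT taken from the study
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

By these definitions, a specificity of 100% means the test produced no false positives, while a low sensitivity means many malignant nodules were missed. Accuracy can stay high despite low sensitivity when benign cases dominate the sample.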

A Quest for New Cancer Diagnosis, Prognosis and Prediction Biomarkers and Their Use in Biosensors Development

Technology in Cancer Research & Treatment ◽

10.1177/1533033820957033 ◽

2020 ◽

Vol 19 ◽

pp. 153303382095703

Author(s):

Eda G. Ramirez-Valles ◽

Alicia Rodríguez-Pulido ◽

Marcelo Barraza-Salas ◽

Isaac Martínez-Velis ◽

Iván Meneses-Morales ◽

...

Keyword(s):

Sensitivity And Specificity ◽

Cancer Diagnosis ◽

Tissue Analysis ◽

Breast Cancers ◽

Processing Times ◽

Operation Costs ◽

Prognosis And Prediction ◽

Relevant Topic ◽

Polymerase Chain ◽

Low Sensitivity

Traditional techniques for cancer diagnosis, such as nuclear magnetic resonance, ultrasound and tissue analysis, require sophisticated devices and highly trained personnel, which are characterized by elevated operation costs. The use of biomarkers has emerged as an alternative for cancer diagnosis, prognosis and prediction because their measurement in tissues or fluids, such as blood, urine or saliva, is characterized by shorter processing times. However, the biomarkers used currently, and the techniques used for their measurement, including ELISA, western-blot, polymerase chain reaction (PCR) or immunohistochemistry, possess low sensitivity and specificity. Therefore, the search for new proteomic, genomic or immunological biomarkers and the development of new noninvasive, easier and cheaper techniques that meet the sensitivity and specificity criteria for the diagnosis, prognosis and prediction of this disease has become a relevant topic. The purpose of this review is to provide an overview about the search for new cancer biomarkers, including the strategies that must be followed to identify them, as well as presenting the latest advances in the development of biosensors that possess a high potential for cancer diagnosis, prognosis and prediction, mainly focusing on their relevance in lung, prostate and breast cancers.

SYSTEMATIC VARIATION NORMALIZATION IN MICROARRAY DATA TO GET GENE EXPRESSION COMPARISON UNBIASED

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001028 ◽

2005 ◽

Vol 03 (02) ◽

pp. 225-241 ◽

Cited By ~ 13

Author(s):

JEFF W. CHOU ◽

RICHARD S. PAULES ◽

PIERRE R. BUSHEL

Keyword(s):

Gene Expression ◽

Linear Regression ◽

Microarray Data ◽

Expression Patterns ◽

Microarray Gene Expression Data ◽

Systematic Variation ◽

Data Sets ◽

Microarray Gene Expression ◽

Pixel Intensity ◽

Non Linear

Normalization removes or minimizes the biases of systematic variation that exists in experimental data sets. This study presents a systematic variation normalization (SVN) procedure for removing systematic variation in two channel microarray gene expression data. Based on an analysis of how systematic variation contributes to variability in microarray data sets, our normalization procedure includes background subtraction determined from the distribution of pixel intensity values from each data acquisition channel and log conversion, linear or non-linear regression, restoration or transformation, and multiarray normalization. In the case when a non-linear regression is required, an empirical polynomial approximation approach is used. Either the high terminated points or their averaged values in the distributions of the pixel intensity values observed in control channels may be used for rescaling multiarray datasets. These pre-processing steps remove systematic variation in the data attributable to variability in microarray slides, assay-batches, the array process, or experimenters. Biologically meaningful comparisons of gene expression patterns between control and test channels or among multiple arrays are therefore unbiased using normalized but not unnormalized datasets.
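To make the sequence of steps concrete (background subtraction, log conversion, then a linear or polynomial regression to remove intensity-dependent bias), here is a minimal sketch for one two-channel array. It is not the SVN procedure itself: the regression of the log ratio on the mean log intensity is a swapped-in, commonly used formulation, the background levels are taken as given inputs rather than estimated from the pixel intensity distribution, and all names are hypothetical.

```python
import numpy as np

def normalize_two_channel(test, control, bg_test=0.0, bg_control=0.0, degree=1):
    """Return log2 ratios with the fitted intensity-dependent trend removed."""
    # Background subtraction; clip at 1 so the logarithm stays defined
    t = np.clip(np.asarray(test, dtype=float) - bg_test, 1.0, None)
    c = np.clip(np.asarray(control, dtype=float) - bg_control, 1.0, None)
    # Log conversion
    log_t, log_c = np.log2(t), np.log2(c)
    ratio = log_t - log_c               # log ratio per spot
    intensity = 0.5 * (log_t + log_c)   # mean log intensity per spot
    # degree=1 is a linear fit; a higher degree gives a polynomial approximation
    coeffs = np.polyfit(intensity, ratio, degree)
    return ratio - np.polyval(coeffs, intensity)
```

Under this sketch, a uniform bias between channels (for example, one channel reading twice as bright everywhere) is absorbed by the fit, so the normalized log ratios come out near zero.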

SPECTRAL CLUSTERING ON GENE EXPRESSION PROFILE TO IDENTIFY CANCER TYPES OR SUBTYPES

Jurnal Teknologi ◽

10.11113/jt.v76.4036 ◽

2015 ◽

Vol 76 (1) ◽

Author(s):

Ang Jun Chin ◽

Andri Mirzal ◽

Habibollah Haron

Keyword(s):

Gene Expression ◽

Gene Expression Profile ◽

Expression Profile ◽

Microarray Data ◽

Spectral Clustering ◽

Data Sets ◽

Clustering Methods ◽

Microarray Gene Expression ◽

Cancer Types ◽

Microarray Gene

Gene expression profiling is well known for its broad applications and achievements in disease discovery and analysis, especially in cancer research. Spectral clustering is robust to irrelevant features, which makes it appropriate for gene expression analysis. However, previous works offer only limited performance comparisons with other clustering methods, and only a few microarray data sets were analyzed in each study. In this study, we demonstrate the use of spectral clustering in identifying cancer types or subtypes from microarray gene expression profiles. Spectral clustering was applied to eleven microarray data sets and its clustering performance was compared with the results in the literature. Overall, spectral clustering slightly outperformed the corresponding results in the literature. Spectral clustering can also offer more stable clustering performance, as it has a smaller standard deviation. Moreover, spectral clustering outperformed the corresponding methods in the literature on six of the eleven data sets. It can therefore be stated that spectral clustering is a promising method for identifying cancer types or subtypes from microarray gene expression data sets.

gene-CBR: A CASE-BASED REASONING TOOL FOR CANCER DIAGNOSIS USING MICROARRAY DATA SETS

Computational Intelligence ◽

10.1111/j.1467-8640.2006.00287.x ◽

2006 ◽

Vol 22 (3-4) ◽

pp. 254-268 ◽

Cited By ~ 57

Author(s):

Fernando Díaz ◽

Florentino Fdez-Riverola ◽

Juan M. Corchado

Keyword(s):

Cancer Diagnosis ◽

Microarray Data ◽

Data Sets ◽

Case Based
