Klasifikasi Data Microarray Menggunakan Discrete Wavelet Transform dan Extreme Learning Machine (Microarray Data Classification Using Discrete Wavelet Transform and Extreme Learning Machine)

Author(s):  
Khadijah Khadijah ◽  
Sri Hartati

Abstract: Microarray data are used as an alternative for cancer diagnosis because diagnosis based on morphological structure is difficult: different types of cancer often differ only subtly in morphology. The aim of this research is to build a classifier for microarray data. Classification begins by reducing the dimension of the microarray data with the discrete wavelet transform (DWT): each sample is decomposed to a chosen level, and the approximation coefficients at that level are taken as the sample's features. These features are then the input to the classifier, an extreme learning machine (ELM) implemented on a radial basis function network (RBFN). The datasets are multiclass microarray data: GCM (16,063 genes, 14 classes) and Subtypes-Leukemia (12,600 genes, 7 classes). Testing was done by randomly splitting the data into training and test sets ten times, with the same proportion each time. The classifier for the GCM dataset does not yet perform well: accuracy is about 75% ± 6.25%, and the mean minimum sensitivity is only 15% ± 19.95%, which indicates that sensitivity is uneven across classes and remains low for several of them. The classifier for the Subtypes-Leukemia dataset, which has fewer classes than GCM, performs better, with an accuracy of 87.68% ± 2.88% and a mean minimum sensitivity of 51.90% ± 20.29%.
Keywords: microarray, gene expression, DWT, ELM, RBFN
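
The feature-reduction step and the minimum-sensitivity measure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the Haar wavelet, the padding rule for odd lengths, the function names and the sample values are all hypothetical, since the abstract does not say which wavelet or decomposition level the authors used. The ELM/RBFN classifier itself is not shown.

```python
import math


def haar_approximation(signal, level):
    """Return the Haar approximation coefficients after `level` decompositions.

    Assumption: Haar is used only because it is the simplest wavelet to
    write out; the abstract does not name the wavelet. An odd-length
    signal is padded by repeating its last value (also an assumption).
    """
    coeffs = list(signal)
    for _ in range(level):
        if len(coeffs) < 2:
            break  # nothing left to decompose
        if len(coeffs) % 2 == 1:
            coeffs.append(coeffs[-1])
        # Each approximation coefficient is the scaled sum of a pair of neighbours.
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs


def minimum_sensitivity(y_true, y_pred):
    """Smallest per-class recall over all classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return min(recalls)


# Hypothetical expression profile of 8 genes, reduced to 2 features at level 2.
sample = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
features = haar_approximation(sample, level=2)

# Hypothetical labels: class "b" is recognised only half of the time.
min_sens = minimum_sensitivity(["a", "a", "b", "b"], ["a", "a", "b", "a"])
```

Each decomposition level roughly halves the number of features, so under this padding rule a 16,063-gene profile would shrink to 1,004 coefficients at an (arbitrarily chosen) level of 4. A low minimum sensitivity alongside a reasonable accuracy is what signals that a few classes are being classified poorly.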

2020 ◽  
Vol 20 ◽  
Author(s):  
Si Yu ◽  
Menglin Huang ◽  
Jingyu Wang ◽  
Yongchang Zheng ◽  
Haifeng Xu

Wide exploration of noninvasive tumor/cancer biomarkers has shed light on clinical diagnosis. However, many under-investigated biomarkers have shown limited potential for application because of low sensitivity and specificity, while extracellular vesicles (EVs) have gradually been recognized as promising candidates. EVs are small vesicles that transport bioactive cargos between cells in multiple physiological processes and also in tumor/cancer pathogenesis. This review summarizes recent studies of EVs covering their structure, classification and physiological functions, as well as their changes in tumor initiation and progression. Furthermore, we focus on advances in the use of EVs and/or EV-related substances in cancer diagnosis, and summarize ongoing studies of promising candidates for future investigation.


2008 ◽  
Vol 06 (02) ◽  
pp. 261-282 ◽  
Author(s):  
AO YUAN ◽  
WENQING HE

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data-generating mechanism, the parametric methods perform well, but not when there is a nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay for it with a loss of efficiency. In an attempt to use the known mixture form to increase efficiency, while dropping assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach has the form of a parametric mixture but makes no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
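
As a rough illustration of how an information criterion can guide the choice of the number of clusters, the sketch below uses the standard BIC, not the modified criterion proposed in the paper, whose form is not given in this abstract. The function names and the log-likelihood values are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * log-likelihood plus a penalty on model size."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_cluster_count(fits, n_obs):
    """Return the cluster count with the smallest BIC.

    `fits` maps a candidate cluster count k to a tuple
    (maximised log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))


# Hypothetical fits for k = 1, 2, 3 clusters on 200 observations.
fits = {1: (-520.0, 2), 2: (-450.0, 5), 3: (-447.0, 8)}
chosen_k = best_cluster_count(fits, n_obs=200)
```

In this made-up example the third cluster improves the likelihood only slightly, so the penalty term outweighs the gain and two clusters are selected.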


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5203 ◽  
Author(s):  
Mohammad Faujul Kabir ◽  
Johari Mohd Ali ◽  
Onn Haji Hashim

Background: We have previously reported anticancer activities of Melicope ptelefolia (MP) leaf extracts on four different cancer cell lines. However, the underlying mechanisms of action have yet to be deciphered. In the present study, the anticancer activity of MP hexane extract (MP-HX) on colorectal (HCT116) and hepatocellular carcinoma (HepG2) cell lines was characterized through microarray gene expression profiling. Methods: HCT116 and HepG2 cells were treated with MP-HX for 24 hr. Total RNA was extracted from the cells and used for transcriptome profiling using the Applied Biosystems GeneChip™ Human Gene 2.0 ST Array. Gene expression data were analysed using Applied Biosystems Expression Console and Transcriptome Analysis Console software. Pathway enrichment analyses were performed using Ingenuity Pathway Analysis (IPA) software. The microarray data were validated by profiling the expression of 17 genes through quantitative reverse transcription PCR (RT-qPCR). Results: MP-HX induced differential expression of 1,290 and 1,325 genes in HCT116 and HepG2 cells, respectively (microarray data fold change, MA_FC ≥ ±2.0). The direction of gene expression change for the 17 genes assayed through RT-qPCR agreed with the microarray data. In both cell lines, MP-HX modulated the expression of many genes in directions that support antiproliferative activity. IPA software analyses revealed that MP-HX modulated canonical pathways, networks and biological processes associated with cell cycle, DNA replication, cellular growth and cell proliferation. In both cell lines, upregulation of genes that promote apoptosis, cell cycle arrest and growth inhibition was observed, while genes that are typically overexpressed in diverse human cancers, or that promote cell cycle progression, DNA replication and cellular proliferation, were downregulated. The genes upregulated by MP-HX include pro-apoptotic genes (DDIT3, BBC3, JUN) and genes involved in cell cycle arrest (CDKN1A, CDKN2B), growth arrest/repair (TP53, GADD45A) and metastasis suppression (NDRG1). MP-HX downregulated the expression of genes that could promote anti-apoptotic effects, cell cycle progression, and tumor development and progression, including BIRC5, CCNA2, CCNB1, CCNB2, CCNE2, CDK1/2/6, GINS2, HELLS, MCM2/10, PLK1, RRM2 and SKP2. Notably, all six top-ranked genes proposed to be cancer-associated (PLK1, MCM2, MCM3, MCM7, MCM10 and SKP2) were downregulated by MP-HX in both cell lines. Discussion: The present study showed that the anticancer activities of MP-HX are exerted through its actions on genes regulating apoptosis, cell proliferation, DNA replication and cell cycle progression. These findings further support the potential use of MP as a nutraceutical agent for cancer therapeutics.


2021 ◽  
Author(s):  
Iuri Martin Goemann ◽  
Francisco Paixão ◽  
Alceu Migliavaca ◽  
José Ricardo Guimarães ◽  
Rafael Selbach Scheffel ◽  
...  

Abstract Purpose: A primary medical concern with thyroid nodules is excluding thyroid cancer, which is present in approximately 5% of all thyroid nodules. Fine-needle aspiration biopsy (FNAB) has a paramount role in distinguishing benign from malignant thyroid nodules because of its availability and diagnostic performance. Nevertheless, intraoperative frozen section (iFS) is still advocated as a valuable tool for surgery planning, especially for indeterminate nodules. Methods: We compared the performance of FNAB and iFS in thyroid cancer diagnosis among nodules in Bethesda Categories (BC) I to VI. The performance of the FNAB and iFS tests was calculated using final histopathology results as the gold standard. Results: In total, 316 patients were included in the analysis. Both FNAB and iFS data were available for 272 patients (86.1%). The overall malignancy rate was 30.4% (n=96). The FNAB sensitivity, specificity, and accuracy for benign (BC II) and malignant (BC V and VI) nodules were 89.5%, 97.1%, and 94.1%, respectively. For all nodules evaluated, the iFS sensitivity, specificity, and accuracy were 80.9%, 100%, and 94.9%, respectively. For indeterminate nodules and follicular lesions (BC III and IV), the iFS sensitivity, specificity, and accuracy were 25%, 100%, and 88.7%, respectively. For BC I nodules, iFS had an accuracy of 95.2%. Conclusion: Our results do not support routine iFS for indeterminate nodules or follicular neoplasms (BC III and IV) because of its low sensitivity. In these categories, iFS is not sufficiently accurate to guide the intraoperative management of thyroidectomies. iFS for BC I nodules could be a reasonable option and should be specifically investigated.
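
For readers less familiar with these measures, the sketch below shows how sensitivity, specificity and accuracy follow from confusion-matrix counts against a gold standard. The function name and the counts are hypothetical and are not taken from the study, which reports only percentages in this abstract.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: malignant cases the test called malignant / benign.
    tn/fp: benign cases the test called benign / malignant.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (NOT the study's data): 20 malignant, 80 benign nodules.
sens, spec, acc = diagnostic_metrics(tp=5, fn=15, tn=80, fp=0)
```

With these made-up counts the test misses three quarters of the malignant cases (sensitivity 0.25) yet still reaches an accuracy of 0.85, because benign cases dominate. This is the same pattern that makes a high accuracy compatible with a low sensitivity in a subgroup where malignancy is uncommon.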


2020 ◽  
Vol 19 ◽  
pp. 153303382095703
Author(s):  
Eda G. Ramirez-Valles ◽  
Alicia Rodríguez-Pulido ◽  
Marcelo Barraza-Salas ◽  
Isaac Martínez-Velis ◽  
Iván Meneses-Morales ◽  
...  

Traditional techniques for cancer diagnosis, such as nuclear magnetic resonance, ultrasound and tissue analysis, require sophisticated devices and highly trained personnel, and therefore entail high operating costs. The use of biomarkers has emerged as an alternative for cancer diagnosis, prognosis and prediction because their measurement in tissues or fluids, such as blood, urine or saliva, requires shorter processing times. However, the biomarkers currently in use, and the techniques used to measure them, including ELISA, western blot, polymerase chain reaction (PCR) and immunohistochemistry, have low sensitivity and specificity. Therefore, the search for new proteomic, genomic or immunological biomarkers, and the development of new noninvasive, easier and cheaper techniques that meet the sensitivity and specificity criteria for the diagnosis, prognosis and prediction of this disease, has become a relevant topic. The purpose of this review is to provide an overview of the search for new cancer biomarkers, including the strategies that must be followed to identify them, and to present the latest advances in the development of biosensors with high potential for cancer diagnosis, prognosis and prediction, focusing mainly on their relevance in lung, prostate and breast cancers.


2005 ◽  
Vol 03 (02) ◽  
pp. 225-241 ◽  
Author(s):  
JEFF W. CHOU ◽  
RICHARD S. PAULES ◽  
PIERRE R. BUSHEL

Normalization removes or minimizes the biases of systematic variation that exists in experimental data sets. This study presents a systematic variation normalization (SVN) procedure for removing systematic variation in two-channel microarray gene expression data. Based on an analysis of how systematic variation contributes to variability in microarray data sets, our normalization procedure includes background subtraction determined from the distribution of pixel intensity values from each data acquisition channel and log conversion, linear or non-linear regression, restoration or transformation, and multiarray normalization. When a non-linear regression is required, an empirical polynomial approximation approach is used. Either the high terminated points or their averaged values in the distributions of the pixel intensity values observed in control channels may be used for rescaling multiarray datasets. These pre-processing steps remove systematic variation in the data attributable to variability in microarray slides, assay batches, the array process, or experimenters. Biologically meaningful comparisons of gene expression patterns between control and test channels, or among multiple arrays, are therefore unbiased when made on normalized rather than unnormalized datasets.
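
The first stages of such a pipeline (background subtraction, log conversion, and a linear regression between channels) can be sketched as below. This is a simplified illustration, not the SVN procedure itself: the background levels are passed in as plain numbers instead of being estimated from the pixel-intensity distribution, the floor of 1 before taking logs is an assumption, and all names and intensities are hypothetical.

```python
import math


def log_channel(intensities, background):
    """Subtract a per-channel background and convert to log2.

    Assumption: values are floored at 1 so the logarithm stays defined.
    """
    return [math.log2(max(x - background, 1.0)) for x in intensities]


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalize_two_channel(control, test, bg_control, bg_test):
    """Return test-vs-control log values with the fitted linear trend removed."""
    log_c = log_channel(control, bg_control)
    log_t = log_channel(test, bg_test)
    slope, intercept = linear_fit(log_c, log_t)
    # The residual from the fitted line is the log signal left after
    # removing the linear systematic difference between channels.
    return [t - (slope * c + intercept) for c, t in zip(log_c, log_t)]


# Hypothetical spots: the test channel is uniformly twice as bright as the
# control channel after background removal, i.e. a purely systematic bias.
control = [116.0, 164.0, 356.0, 1124.0]
test = [82.0, 178.0, 562.0, 2098.0]
residuals = normalize_two_channel(control, test, bg_control=100.0, bg_test=50.0)
```

Because the made-up test channel differs from the control only by a constant factor, every residual is zero after normalization: the systematic offset is removed and no gene appears differentially expressed.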


2015 ◽  
Vol 76 (1) ◽  
Author(s):  
Ang Jun Chin ◽  
Andri Mirzal ◽  
Habibollah Haron

Gene expression profiling is well known for its broad applications and achievements in disease discovery and analysis, especially in cancer research. Spectral clustering is robust to irrelevant features, which makes it appropriate for gene expression analysis. However, in previous works the performance comparison with other clustering methods is limited, and only a few microarray data sets were analyzed in each study. In this study, we demonstrate the use of spectral clustering in identifying cancer types or subtypes from microarray gene expression profiles. Spectral clustering was applied to eleven microarray data sets and its clustering performance was compared with the results in the literature. Overall, spectral clustering slightly outperformed the corresponding results in the literature. Spectral clustering also offers more stable clustering performance, as it has a smaller standard deviation. Moreover, spectral clustering outperformed the corresponding methods in the literature on six of the eleven data sets. Spectral clustering is therefore a promising method for identifying cancer types or subtypes in microarray gene expression data sets.


2006 ◽  
Vol 22 (3-4) ◽  
pp. 254-268 ◽  
Author(s):  
Fernando Díaz ◽  
Florentino Fdez-Riverola ◽  
Juan M. Corchado
