Integrated Use of Statistical-Based Approaches and Computational Intelligence Techniques for Tumors Classification Using Microarray

2015 ◽  
Vol 2015 ◽  
pp. 1-8
Author(s):  
Chia-Ding Hou ◽  
Yuehjen E. Shao

With the recent development of biotechnologies, cDNA microarray chips are increasingly applied in cancer research. Microarray experiments can lead to a more thorough grasp of the molecular variations among tumors because they allow the expression levels of thousands of genes in cells to be monitored simultaneously. Accordingly, how to successfully discriminate tumor classes using gene expression data is an urgent research issue and plays an important role in the study of carcinogenesis. To reduce the high dimensionality of the gene data and effectively classify tumors, this study proposes several hybrid discrimination procedures that combine statistical-based techniques with computational intelligence approaches. A real microarray data set was used to demonstrate the performance of the proposed approaches. The results of cross-validation experiments reveal that the proposed two-stage hybrid models are more efficient in discriminating the acute leukemia classes than the established single-stage models.

Author(s):  
Mei-Ling Ting Lee ◽  
Robert J Gray ◽  
Harry Björkbacka ◽  
Mason W Freeman

Gene expression data from microarray experiments have been studied using several statistical models. Significance Analysis of Microarrays (SAM), for example, has proved to be useful in analyzing microarray data. In the spirit of the SAM procedures, we develop permutation-based rank tests that generalize the Wilcoxon rank-sum test for two-group comparisons of replicated microarray data. For microarray experiments with a randomized block design, we also consider a generalized signed-rank test. The statistical analysis software is written in R and is freely available as a package.
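
As a rough illustration of the idea behind a permutation-based rank test, the sketch below applies the ordinary two-sample Wilcoxon rank-sum statistic to a single gene and estimates a two-sided p-value by shuffling group labels. It is not the authors' generalized test or their R package; the function names (`midranks`, `rank_sum`, `permutation_pvalue`) and the defaults `n_perm` and `seed` are assumptions made for this example.

```python
import random


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum(group_a, group_b):
    """Wilcoxon rank-sum statistic: sum of group_a's ranks in the pooled sample."""
    ranks = midranks(list(group_a) + list(group_b))
    return sum(ranks[:len(group_a)])


def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the rank-sum statistic (illustrative)."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2  # mean rank sum under the null
    observed = abs(rank_sum(group_a, group_b) - expected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the two groups
        stat = abs(rank_sum(pooled[:n_a], pooled[n_a:]) - expected)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

In a microarray setting this would be run once per gene, with `group_a` and `group_b` holding that gene's replicated expression values in the two conditions.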


2005 ◽  
Vol 44 (03) ◽  
pp. 418-422 ◽  
Author(s):  
C. Ittrich

Summary Objectives: In two-channel microarray experiments the measured gene expression levels are affected by many sources of systematic variation. Normalization refers to the process of removing such systematic sources of variation, to make measured intensities within and between slides comparable. Some commonly used normalization methods removing intensity-dependent dye bias and adjusting differences in variability between slides will be reviewed with the main focus on intensity-dependent normalization methods. Methods: This article describes different intensity-dependent within-slide normalization methods for the log ratios of red and green channel intensities but also refers to single channel normalization methods incorporating all single channels of the slides at once. Results: The described procedures provide a useful approach to remove systematic sources of variation like intensity-dependent dye bias and variability between slides in cDNA microarray experiments. This is illustrated by an experimental data set. Conclusions: Several reasonable normalization procedures for two-channel microarray data have recently been proposed. Deciding on which method would perform well for a concrete experiment is difficult. Designed spike-in experiments or dilution series with known differences for some selected genes would be helpful to assess the different methods, but may be impractical for most laboratories due to the high costs.
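
To make the notion of intensity-dependent within-slide normalization concrete, here is a minimal sketch. It computes the log ratio M = log2(R/G) and the average log intensity A = 0.5 * log2(R * G) for each spot, estimates the trend of M as a function of A, and subtracts that trend. In practice a loess fit is a common choice for the trend; a running median is swapped in here only to keep the example dependency-free. The function names and the `window` parameter are assumptions for illustration and do not come from the article.

```python
import math
from statistics import median


def ma_values(red, green):
    """Return per-spot log ratios M and average log intensities A."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a


def normalize_intensity_dependent(red, green, window=5):
    """Subtract a running-median estimate of the M-versus-A trend from M."""
    m, a = ma_values(red, green)
    order = sorted(range(len(a)), key=lambda i: a[i])  # spots sorted by intensity
    half = window // 2
    normalized = [0.0] * len(m)
    for pos, idx in enumerate(order):
        lo = max(0, pos - half)
        hi = min(len(order), pos + half + 1)
        # Local trend: median log ratio of spots with similar intensity.
        trend = median(m[order[k]] for k in range(lo, hi))
        normalized[idx] = m[idx] - trend
    return normalized
```

With a constant dye bias (red always twice green, say), every normalized log ratio comes out as zero, which is the behaviour one would expect from any method of this family.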


Author(s):  
O. V. KALE ◽  
B. F. MOMIN

Microarray technology has created a revolution in the field of biological research. Association rules can not only group similarly expressed genes but also discern relationships among genes. We propose a new row-enumeration rule mining method to mine high-confidence rules from microarray data. It is a support-free algorithm that directly uses the confidence measure to effectively prune the search space. Experiments on a leukemia microarray data set show that the proposed algorithm outperforms support-based rule mining with respect to scalability and rule extraction.


2009 ◽  
Vol 07 (01) ◽  
pp. 75-91 ◽  
Author(s):  
SUNG-GON YI ◽  
YOON-JEONG JOO ◽  
TAESUNG PARK

Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. In time-course microarray experiments, in which gene expression is monitored over time, we are interested in clustering genes that show similar temporal profiles and in identifying genes that show a pre-specified candidate profile. Unfortunately, many traditional clustering methods used for analyzing microarray data do not effectively detect temporal profiles in time-course microarray data. We propose a rank-based clustering analysis for time-course microarray data. Our clustering method consists of two steps: the first step discretizes the expression data into groups and then transforms them into rank data, and the second step performs the rank-based clustering analysis. Our testing procedure uses bootstrap samples to select the genes that show patterns similar to the candidate profiles. A simulation study is performed to evaluate the performance of the proposed rank-based method. The results are illustrated with the breast cancer data and the Arabidopsis cold stress data.
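
The two-step idea (discretize, convert to ranks, then cluster on the ranks) can be sketched as follows. The abstract does not specify how the discretization or the clustering is carried out, so this example makes two loud assumptions: time points whose sorted values differ by at most `tol` are treated as one group and share a rank, and genes are clustered simply by having identical rank profiles. The names `rank_profile`, `cluster_by_rank` and `tol` are hypothetical.

```python
from collections import defaultdict


def rank_profile(values, tol=0.5):
    """Discretize one gene's time course into groups and return dense ranks.

    Assumption: neighbouring sorted values within `tol` of each other
    fall in the same group and therefore receive the same rank.
    """
    if not values:
        return ()
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    rank = 1
    ranks[order[0]] = rank
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > tol:
            rank += 1  # a gap larger than tol starts a new group
        ranks[cur] = rank
    return tuple(ranks)


def cluster_by_rank(profiles, tol=0.5):
    """Group genes whose time courses share the same rank profile."""
    clusters = defaultdict(list)
    for gene, values in profiles.items():
        clusters[rank_profile(values, tol)].append(gene)
    return dict(clusters)
```

Because only the ordering of the discretized levels matters, two genes with very different expression scales but the same rise-and-fall pattern end up in the same cluster.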


2005 ◽  
Vol 2005 (2) ◽  
pp. 215-225 ◽  
Author(s):  
David J. Hand ◽  
Nicholas A. Heard

The vast potential of the genomic insight offered by microarray technologies has led to their widespread use since they were introduced a decade ago. Application areas include gene function discovery, disease diagnosis, and inferring regulatory networks. Microarray experiments enable large-scale, high-throughput investigations of gene activity and have thus provided the data analyst with a distinctive, high-dimensional field of study. Many questions in this field relate to finding subgroups of data profiles which are very similar. A popular type of exploratory tool for finding subgroups is cluster analysis, and many different flavors of algorithms have been used and indeed tailored for microarray data. Cluster analysis, however, implies a partitioning of the entire data set, and this does not always match the objective. Sometimes pattern discovery or bump hunting tools are more appropriate. This paper reviews these various tools for finding interesting subgroups.


Author(s):  
Naveen Trivedi ◽  
Suvendu Kanungo

Background: Bi-clustering plays a vital role in analyzing gene expression data from microarray technology. The technique clusters the rows and columns of the expression data simultaneously, determining the expression levels of a set of genes under a subset of conditions or samples. The information obtained is collected as sub-matrices of the microarray data in which subsets of genes show coherent expression patterns with respect to subsets of conditions. These sub-matrices are called bi-clusters, and the overall process is called bi-clustering. In this paper, we propose a new hybrid meta-heuristic, ABC-MWOA-CC, which is based on the artificial bee colony (ABC) algorithm, a modified whale optimization algorithm (MWOA), and the Cheng and Church (CC) algorithm, to optimize the extracted bi-clusters. To validate this algorithm, we also examine the statistical and biological relevance of the extracted genes with respect to various conditions. Most bi-clustering techniques, however, do not address the biological significance of the genes belonging to the extracted bi-clusters. Objective: The major aim of the proposed work is to design and develop a novel hybrid multi-objective bi-clustering approach for microarray data that produces a desired number of valid bi-clusters. The extracted bi-clusters are then optimized to obtain an optimal solution. Method: In the proposed approach, a hybrid multi-objective bi-clustering algorithm based on ABC together with MWOA groups the data into the desired number of bi-clusters. The ABC with MWOA multi-objective optimization algorithm is then applied to optimize the solutions using a variety of fitness functions.
Results: The multi-objective functions employed in the fitness calculation, namely Volume Mean (VM), Mean of Genes (GM), Mean of Conditions (CM), and Mean of MSR (MMSR), improve the performance of the CC bi-clustering algorithm on real-life data such as the yeast Saccharomyces cerevisiae cell cycle gene expression data sets. Conclusion: The effectiveness of the ABC-MWOA-CC algorithm is demonstrated by comparing it with the well-known traditional ABC-CC, OPSM, and CC algorithms in terms of VM, GM, CM, and MMSR.
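
For readers unfamiliar with the MSR referred to above, the sketch below computes the mean squared residue that the Cheng and Church algorithm uses to score a bi-cluster: each entry's residue is the entry minus its row mean and column mean plus the overall mean of the sub-matrix, and the score is the average squared residue. This covers only that score, not the ABC-MWOA-CC hybrid; the function name and argument layout are assumptions for illustration.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix picked out by `rows` and `cols`.

    A score of 0 means the selected genes vary perfectly coherently
    (additively) across the selected conditions.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)
```

A lower score indicates a more coherent bi-cluster, which is why optimization-based approaches typically trade MSR off against bi-cluster size.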


2004 ◽  
Vol 02 (02) ◽  
pp. 273-288 ◽  
Author(s):  
HIDEO BANNAI ◽  
SHUNSUKE INENAGA ◽  
AYUMI SHINOHARA ◽  
MASAYUKI TAKEDA ◽  
SATORU MIYANO

We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern in time linear in the total length of the upstream sequences. We implement our algorithm and apply it to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs which have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
A. Wong ◽  
Z. Q. Lin ◽  
L. Wang ◽  
A. G. Chung ◽  
B. Shen ◽  
...  

A critical step in effective care and treatment planning for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause for the coronavirus disease 2019 (COVID-19) pandemic, is the assessment of the severity of disease progression. Chest x-rays (CXRs) are often used to assess SARS-CoV-2 severity, with two important assessment metrics being extent of lung involvement and degree of opacity. In this proof-of-concept study, we assess the feasibility of computer-aided scoring of CXRs of SARS-CoV-2 lung disease severity using a deep learning system. Data consisted of 396 CXRs from SARS-CoV-2 positive patient cases. Geographic extent and opacity extent were scored by two board-certified expert chest radiologists (with 20+ years of experience) and a 2nd-year radiology resident. The deep neural networks used in this study, which we name COVID-Net S, are based on a COVID-Net network architecture. 100 versions of the network were independently learned (50 to perform geographic extent scoring and 50 to perform opacity extent scoring) using random subsets of CXRs from the study, and we evaluated the networks using stratified Monte Carlo cross-validation experiments. The COVID-Net S deep neural networks yielded R² of 0.664 ± 0.032 and 0.635 ± 0.044 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively, in stratified Monte Carlo cross-validation experiments. The best performing COVID-Net S networks achieved R² of 0.739 and 0.741 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively. The results are promising and suggest that the use of deep neural networks on CXRs could be an effective tool for computer-aided assessment of SARS-CoV-2 lung disease severity, although additional studies are needed before adoption for routine clinical use.
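
The evaluation protocol named above can be sketched in a few lines: Monte Carlo cross-validation repeatedly draws random train/test splits (here stratified so each split keeps roughly the same mix of strata), and R² compares predicted scores with reference scores. How the study defined its strata and split sizes is not stated in the abstract, so `strata`, `test_fraction`, `n_splits` and `seed` below are assumptions chosen for illustration, and no model is trained.

```python
import random
from collections import defaultdict


def r_squared(actual, predicted):
    """Coefficient of determination between reference and predicted scores."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def stratified_monte_carlo_splits(strata, n_splits=50, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random stratified splits."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for index, label in enumerate(strata):
        by_stratum[label].append(index)
    for _ in range(n_splits):
        train, test = [], []
        for indices in by_stratum.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Hold out the same fraction of every stratum (at least one item).
            n_test = max(1, round(len(shuffled) * test_fraction))
            test.extend(shuffled[:n_test])
            train.extend(shuffled[n_test:])
        yield sorted(train), sorted(test)
```

Reporting the mean and standard deviation of R² over the splits gives figures of the "value ± spread" form quoted in the abstract.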


2021 ◽  
Vol 11 (10) ◽  
pp. 4494
Author(s):  
Qicai Wu ◽  
Haiwen Yuan ◽  
Haibin Yuan

The case-based reasoning (CBR) method can effectively predict the future health condition of a system from past and present operating data records, so it can be applied within the prognostic and health management (PHM) framework, a type of data-driven problem-solving. A CBR model for practical application in the Ground Special Vehicle (GSV) PHM framework is in great demand. Because many CBR algorithms rely on overly complicated weight optimization methods, which makes it difficult to establish effective knowledge and reasoning models for engineering practice, this paper introduces an application built on a CBR model that includes case representation, case retrieval, case reuse, and a simulated annealing algorithm. The purpose is to solve two problems: normal/abnormal determination and prediction of the degree of health performance. Based on the proposed CBR model, optimization methods for attribute weights are described. State classification accuracy rate and root mean square error are adopted to set up the objective functions. Following the reasoning steps, attribute weights are trained and put into case retrieval; after that, different rules of case reuse are established for these two kinds of problems. To validate the model performance of the application, a cross-validation test is carried out on a historical data set. A comparative analysis of the even weight allocation CBR (EW-CBR) method, the correlation coefficient weight allocation CBR (CW-CBR) method, and the SA weight allocation CBR (SA-CBR) method is carried out. Cross-validation results show that the proposed method achieves better results than the EW-CBR and CW-CBR models. The developed PHM framework has been applied in practice for over three years, and the proposed CBR model is an effective approach toward the best PHM framework solutions in practical applications.


2021 ◽  
Vol 11 (1) ◽  
pp. 450
Author(s):  
Jinfu Liu ◽  
Mingliang Bai ◽  
Na Jiang ◽  
Ran Cheng ◽  
Xianling Li ◽  
...  

Multi-classifiers are widely applied in many practical problems. However, the features that can significantly discriminate a certain class from the others are often deleted in the feature selection process of multi-classifiers, which seriously decreases the generalization ability. This paper refers to this phenomenon as interclass interference in multi-class problems and analyzes its cause in detail. The paper then summarizes three interclass interference suppression methods, based respectively on all features, one-class classifiers, and binary classifiers, and compares their effects on interclass interference via 10-fold cross-validation experiments on 14 UCI datasets. Experiments show that the method based on binary classifiers suppresses interclass interference efficiently and obtains the best classification accuracy among the three methods. Further experiments compare the suppression effect of two methods based on binary classifiers: the one-versus-one method and the one-versus-all method. Results show that the one-versus-one method achieves a better suppression effect on interclass interference and better classification accuracy. By proposing the concept of interclass interference and studying its suppression methods, this paper significantly improves the generalization ability of multi-classifiers.
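
The one-versus-one scheme mentioned above can be illustrated with a small sketch: one binary classifier is built for every pair of classes, and a new sample is assigned to the class that wins the most pairwise votes. The binary classifier used here is a nearest-centroid rule, chosen only because it is short; the paper's actual classifiers and feature selection are not reproduced, and all function names are assumptions for this example.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_one_vs_one(samples, labels):
    """Build one nearest-centroid binary model per pair of classes."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return {
        (a, b): (centroid(by_class[a]), centroid(by_class[b]))
        for a, b in combinations(sorted(by_class), 2)
    }


def predict_one_vs_one(models, sample):
    """Let every pairwise model vote and return the most-voted class."""
    votes = Counter()
    for (a, b), (centroid_a, centroid_b) in models.items():
        closer_to_a = (squared_distance(sample, centroid_a)
                       <= squared_distance(sample, centroid_b))
        votes[a if closer_to_a else b] += 1
    return votes.most_common(1)[0][0]
```

With k classes this trains k(k-1)/2 pairwise models, and each of them could in principle use its own feature subset, which is the property that makes pairwise schemes relevant to the interference problem described above.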

