Multiple Suboptimal Solutions for Prediction Rules in Gene Expression Data

2013, Vol 2013, pp. 1-14
Author(s): Osamu Komori, Mari Pritchard, Shinto Eguchi

This paper discusses mathematical and statistical aspects of analysis methods applied to microarray gene expression data. We focus on pattern recognition to extract informative features embedded in the data for the prediction of phenotypes. Severe difficulties have been pointed out that arise from the imbalance between the number of observed genes and the number of observed subjects. We reanalyze published microarray gene expression data and detect many other gene sets with almost the same performance. We conclude that, at the current stage, it is not possible to extract only the informative genes with high performance from among all observed genes. We investigate why this difficulty persists even though analysis methods and learning algorithms are being actively proposed in statistical machine learning. We focus on the mutual coherence, that is, the absolute value of the Pearson correlation between two genes, and describe the distributions of the correlation for the selected set of genes and for the total set. We show that the problem of finding informative genes in high-dimensional data is ill-posed and that the difficulty is closely related to the mutual coherence.
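The quantity the abstract calls mutual coherence can be illustrated with a minimal sketch. It is not the authors' code: the gene names, the toy expression values, and the helper functions `pearson`, `abs_correlations`, and `mutual_coherence` are all hypothetical, and the sketch assumes mutual coherence is taken as the largest absolute pairwise Pearson correlation within a gene set.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def abs_correlations(expr):
    """Map each gene pair to its absolute Pearson correlation across subjects."""
    return {
        (g, h): abs(pearson(expr[g], expr[h]))
        for g, h in combinations(sorted(expr), 2)
    }


def mutual_coherence(expr):
    """Largest absolute pairwise correlation among the given genes (assumed definition)."""
    return max(abs_correlations(expr).values())


# Hypothetical toy data: 4 genes measured on 5 subjects.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.0],  # a rescaled copy of gene_a
    "gene_c": [5.0, 3.0, 4.0, 1.0, 2.0],
    "gene_d": [2.0, 1.0, 4.0, 3.0, 5.0],
}

# A hypothetical "selected" subset versus the total set of genes.
selected = {g: expr[g] for g in ("gene_a", "gene_c", "gene_d")}

coherence_all = mutual_coherence(expr)
coherence_selected = mutual_coherence(selected)
print(f"total set: {coherence_all:.2f}, selected set: {coherence_selected:.2f}")
```

In this toy example gene_b carries the same information as gene_a, so swapping one for the other in a selected set would leave any correlation-based prediction rule unchanged. This is one simple way in which several gene sets can perform almost equally well.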

2007, Vol 19 (02), pp. 71-78
Author(s): Cheng-Long Chuang, Chung-Ming Chen, Grace S. Shieh, Joe-Air Jiang

A neuro-fuzzy inference system called GeneCFE-ANFIS, which recognizes the expression patterns of genes in microarray gene expression (MGE) data, is proposed to infer gene interactions. In this study, three primary features are used to extract the genes' expression patterns and serve as inputs to the neuro-fuzzy inference system. The proposed algorithm learns expression patterns from known genetic interactions, such as interactions confirmed by qRT-PCR experiments or collected through text mining of previously published literature, and then predicts other gene interactions according to the learned patterns. The proposed neuro-fuzzy inference system was applied to a public yeast MGE dataset. Two simulations were conducted to evaluate the performance of the proposed algorithm; they were checked against 112 pairs of qRT-PCR-confirmed gene interactions and 77 transcription factor (TF) pairs collected from the literature, respectively.


Author(s): Qiang Zhao, Jianguo Sun

Statistical analysis of microarray gene expression data has recently attracted a great deal of attention. One problem of interest is to relate genes to the survival outcomes of patients, with the purpose of building regression models that predict future patients' survival from their gene expression data. For this, several authors have discussed the use of the proportional hazards (Cox) model after reducing the dimension of the gene expression data. This paper presents a new approach to Cox survival analysis of microarray gene expression data that focuses on the models' predictive ability. The method modifies correlation principal component regression (Sun, 1995) to handle the censoring problem of survival data. Results based on simulated data and on a publicly available data set on diffuse large B-cell lymphoma show that the proposed method works well in terms of the models' robustness and predictive ability in comparison with some existing partial least squares approaches. The new approach is also simpler and easier to implement.
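The way a Cox model handles censored patients can be sketched with the standard log partial likelihood. This is a generic illustration, not the method proposed in the paper: the function name `cox_partial_loglik` and the toy data are hypothetical, tied event times are assumed absent, and the single covariate stands in for whatever low-dimensional score a dimension-reduction step might produce.

```python
import math


def cox_partial_loglik(beta, X, time, event):
    """Log partial likelihood of a Cox model (assumes no tied event times).

    beta  : list of coefficients, one per covariate
    X     : list of covariate rows, one per patient
    time  : observed follow-up times
    event : 1 if the event was observed, 0 if the patient was censored
    """
    # Linear predictor (log relative hazard) for each patient.
    risk = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        denom = sum(math.exp(risk[j]) for j, t_j in enumerate(time) if t_j >= t_i)
        loglik += risk[i] - math.log(denom)
    return loglik


# Hypothetical data: one reduced-dimension score per patient.
X = [[0.5], [-1.0], [1.5], [0.0]]
time = [5.0, 8.0, 2.0, 6.0]
event = [1, 0, 1, 1]  # the second patient is censored at time 8

ll_null = cox_partial_loglik([0.0], X, time, event)
ll_pos = cox_partial_loglik([1.0], X, time, event)
print(f"beta=0: {ll_null:.3f}, beta=1: {ll_pos:.3f}")
```

In this toy data, patients with higher scores experience the event earlier, so a positive coefficient gives a larger log partial likelihood than the null model. The censored patient never adds a term of their own but still appears in the risk sets of earlier events.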

