scholarly journals A spline function approach for detecting differentially expressed genes in microarray data analysis

2004 ◽  
Vol 20 (17) ◽  
pp. 2954-2963 ◽  
Author(s):  
W. He
2011 ◽  
Vol 58 (4) ◽  
Author(s):  
Marcin T Schmidt ◽  
Luiza Handschuh ◽  
Joanna Zyprych ◽  
Alicja Szabelska ◽  
Agnieszka K Olejnik-Schmidt ◽  
...  

Two-color DNA microarrays are commonly used for the analysis of global gene expression. They provide information on relative abundance of thousands of mRNAs. However, the generated data need to be normalized to minimize systematic variations so that biologically significant differences can be more easily identified. A large number of normalization procedures have been proposed and many softwares for microarray data analysis are available. Here, we have applied two normalization methods (median and loess) from two packages of microarray data analysis softwares. They were examined using a sample data set. We found that the number of genes identified as differentially expressed varied significantly depending on the method applied. The obtained results, i.e. lists of differentially expressed genes, were consistent only when we used median normalization methods. Loess normalization implemented in the two software packages provided less coherent and for some probes even contradictory results. In general, our results provide an additional piece of evidence that the normalization method can profoundly influence final results of DNA microarray-based analysis. The impact of the normalization method depends greatly on the algorithm employed. Consequently, the normalization procedure must be carefully considered and optimized for each individual data set.


2008 ◽  
Vol 06 (02) ◽  
pp. 261-282 ◽  
Author(s):  
AO YUAN ◽  
WENQING HE

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data generating mechanism, the parametric methods perform well, but not so when there is nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay the price for efficiency loss. In an attempt to utilize the known mixture form to increase efficiency, and to free assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach possesses the form of parametric mixture, with no assumptions to the subdistributions. The subdistributions are estimated nonparametrically, with constraints just being imposed on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.


Sign in / Sign up

Export Citation Format

Share Document