Robust Fuzzy Cluster Ensemble on Cancer Gene Expression Data

Analyzing Large Gene Expression Data Sets

Computational Text Analysis ◽

10.1093/oso/9780198567400.003.0014 ◽

2006 ◽

Author(s):

Soumya Raychaudhuri

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Data Sets ◽

Expression Data ◽

Clustering Methods ◽

Biologically Relevant ◽

Large Gene ◽

Functional Coherence

The most interesting and challenging gene expression data sets to analyze are large multidimensional data sets that contain expression values for many genes across multiple conditions. In these data sets scientific text can be particularly useful, since a myriad of genes are examined under vastly different conditions, each of which may induce or repress expression of the same gene for different reasons. There is an enormous complexity to the data that we are examining: each gene is associated with dozens if not hundreds of expression values as well as multiple documents built up from vocabularies consisting of thousands of words. In Section 2.4 we reviewed common gene expression strategies, most of which revolve around defining groups of genes based on common profiles. A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present computational methods that leverage the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in gene expression data analysis offers an opportunity to incorporate background functional information about the genes when defining expression clusters. In Chapter 5 we saw how literature-based approaches could help in the analysis of single condition experiments. Here we will apply the strategies introduced in Chapter 6 to assess the coherence of groups of genes to enhance gene expression analysis approaches. The methods proposed here could, in fact, be applied to any multivariate genomics data type. The key concepts discussed in this chapter are listed in the frame box. We begin with a discussion of gene groups and their role in expression analysis; we briefly discuss strategies to assign keywords to groups and strategies to assess their functional coherence.
We apply functional coherence measures to gene expression analysis; as an example, we focus on a yeast expression data set. We first demonstrate how functional coherence can be used to focus on the key biologically relevant gene groups derived by clustering methods such as self-organizing maps and k-means clustering.

Mining and integrating reliable decision rules for imbalanced cancer gene expression data sets

Tsinghua Science & Technology ◽

10.1109/tst.2012.6374368 ◽

2012 ◽

Vol 17 (6) ◽

pp. 666-673 ◽

Cited By ~ 15

Author(s):

Hualong Yu ◽

Jun Ni ◽

Yuanyuan Dan ◽

Sen Xu

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Decision Rules ◽

Data Sets ◽

Cancer Gene ◽

Expression Data

Cancer Gene Expression Data Analysis Using Rough Based Symmetrical Clustering

Bioinformatics ◽

10.4018/978-1-4666-3604-0.ch085 ◽

2013 ◽

pp. 1626-1641

Author(s):

Anasua Sarkar ◽

Ujjwal Maulik

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Gene Expression Data ◽

Rough Set ◽

Clustering Algorithm ◽

Data Sets ◽

Cancer Gene ◽

Expression Data ◽

Gene Expression Data Analysis ◽

Cancer Subtypes

Identification of cancer subtypes is the central goal of cancer gene expression data analysis. Modified symmetry-based clustering is an unsupervised learning technique for detecting symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of cancer tissues (samples), the authors propose in this chapter a rough-set-based hybrid approach for the modified symmetry-based clustering algorithm. A natural basis for analyzing gene expression data using the symmetry-based algorithm is to group together genes with similar symmetrical patterns of microarray expression. Rough-set theory helps achieve faster convergence and an initial automatic optimal classification, thereby addressing the problem of not knowing the number of clusters in gene expression measurement data. For rough-set-theoretic decision rule generation, each cluster is classified using heuristically searched optimal reducts to overcome the overlapping cluster problem. The rough modified symmetry-based clustering algorithm is compared with another newly implemented rough-improved symmetry-based clustering algorithm and the existing K-Means algorithm on five benchmark cancer gene expression data sets to demonstrate its superiority in terms of validity. Statistical analyses are also performed to establish the significance of this rough modified symmetry-based clustering approach.

Parameterless Clustering Techniques for Gene Expression Analysis

Advanced Data Mining Technologies in Bioinformatics ◽

10.4018/978-1-59140-863-5.ch009 ◽

2011 ◽

pp. 155-173

Author(s):

Vincent S. Tseng ◽

Ching-Pin Kao

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Analysis ◽

Gene Expression Analysis ◽

In Silico Analysis ◽

Data Sets ◽

Expression Data ◽

Clustering Methods ◽

High Quality ◽

Or Gene

In recent years, clustering analysis has become a valuable tool for in-silico analysis of microarray or gene expression data. Although a number of clustering methods have been proposed, they have difficulty meeting the requirements of automation, high quality, and high efficiency at the same time. In this chapter, we discuss parameterless clustering techniques for gene expression analysis. We introduce two novel, parameterless, and efficient clustering methods that are suited to the analysis of gene expression data. The unique feature of our methods is that they incorporate validation techniques into the clustering process so that high-quality results can be obtained. Through experimental evaluation, these methods are shown to greatly outperform other clustering methods in terms of clustering quality, efficiency, and automation on both synthetic and real data sets.

Cancer Gene Expression Data Analysis Using Rough Based Symmetrical Clustering

Handbook of Research on Computational Intelligence for Engineering, Science, and Business ◽

10.4018/978-1-4666-2518-1.ch027 ◽

2013 ◽

pp. 699-715 ◽

Cited By ~ 4

Author(s):

Anasua Sarkar ◽

Ujjwal Maulik

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Gene Expression Data ◽

Rough Set ◽

Clustering Algorithm ◽

Data Sets ◽

Cancer Gene ◽

Expression Data ◽

Gene Expression Data Analysis ◽

Cancer Subtypes

Identification of cancer subtypes is the central goal of cancer gene expression data analysis. Modified symmetry-based clustering is an unsupervised learning technique for detecting symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of cancer tissues (samples), the authors propose in this chapter a rough-set-based hybrid approach for the modified symmetry-based clustering algorithm. A natural basis for analyzing gene expression data using the symmetry-based algorithm is to group together genes with similar symmetrical patterns of microarray expression. Rough-set theory helps achieve faster convergence and an initial automatic optimal classification, thereby addressing the problem of not knowing the number of clusters in gene expression measurement data. For rough-set-theoretic decision rule generation, each cluster is classified using heuristically searched optimal reducts to overcome the overlapping cluster problem. The rough modified symmetry-based clustering algorithm is compared with another newly implemented rough-improved symmetry-based clustering algorithm and the existing K-Means algorithm on five benchmark cancer gene expression data sets to demonstrate its superiority in terms of validity. Statistical analyses are also performed to establish the significance of this rough modified symmetry-based clustering approach.

Mining Rules for the Automatic Selection Process of Clustering Methods Applied to Cancer Gene Expression Data

Artificial Neural Networks – ICANN 2009 - Lecture Notes in Computer Science ◽

10.1007/978-3-642-04277-5_3 ◽

2009 ◽

pp. 20-29 ◽

Cited By ~ 9

Author(s):

André C. A. Nascimento ◽

Ricardo B. C. Prudêncio ◽

Marcilio C. P. de Souto ◽

Ivan G. Costa

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Selection Process ◽

Cancer Gene ◽

Expression Data ◽

Clustering Methods ◽

Automatic Selection

Intrinsic bias in breast cancer gene expression data sets

BMC Cancer ◽

10.1186/1471-2407-9-214 ◽

2009 ◽

Vol 9 (1) ◽

Cited By ~ 2

Author(s):

Jonathan D Mosley ◽

Ruth A Keri

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Gene Expression Data ◽

Data Sets ◽

Cancer Gene ◽

Expression Data ◽

Breast Cancer Gene

Using Supervised Complexity Measures in the Analysis of Cancer Gene Expression Data Sets

Advances in Bioinformatics and Computational Biology - Lecture Notes in Computer Science ◽

10.1007/978-3-642-03223-3_5 ◽

2009 ◽

pp. 48-59 ◽

Cited By ~ 3

Author(s):

Ivan G. Costa ◽

Ana C. Lorena ◽

Liciana R. M. P. y Peres ◽

Marcilio C. P. de Souto

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Data Sets ◽

Cancer Gene ◽

Expression Data ◽

Complexity Measures

Ensemble Feature Selection from Cancer Gene Expression Data using Mutual Information and Recursive Feature Elimination

2020 Third International Conference on Advances in Electronics, Computers and Communications (ICAECC) ◽

10.1109/icaecc50550.2020.9339518 ◽

2020 ◽

Author(s):

Nimrita Koul ◽

Sunilkumar S Manvi

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Mutual Information ◽

Gene Expression Data ◽

Recursive Feature Elimination ◽

Cancer Gene ◽

Expression Data

A New Two-steps Gene Expression Data Clustering Method

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery ◽

10.1109/fskd.2009.481 ◽

2009 ◽

Author(s):

Yanjie Zhang ◽

Veronique Prinet ◽

Shuanhu Wu

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Data Clustering ◽

Expression Data ◽

Clustering Method ◽

Gene Expression Data Clustering
