Relevant Attribute Discovery in High Dimensional Data Based on Rough Sets and Unsupervised Classification: Application to Leukemia Gene Expressions

Author(s):  
Julio J. Valdés ◽  
Alan J. Barton
2012 ◽  
Vol 30 (4) ◽  
pp. 505
Author(s):  
Nilton Correia da Silva ◽  
Osmar Abílio de Carvalho Júnior ◽  
Antonio Nuno de Castro Santa Rosa ◽  
Renato Fontes Guimarães ◽  
Roberto Arnaldo Trancoso Gomes

Abstract. Self-organizing feature maps (SOFM) are a type of artificial neural network that converts complex, nonlinear, high-dimensional data into simple geometric relationships of low dimensionality. The method can also be used to classify remote sensing images, because it compresses high-dimensional data while preserving the most important topological and metric relationships of the primary data. This paper aims to develop an effective methodology for using self-organizing maps in change detection. In this study, SOFM is used for unsupervised classification of remote sensing data, considering the following attributes: spatial (x and y), spectral and temporal. The method is tested and simulated in the western region of Bahia, which has recently seen a significant increase in mechanized agriculture. Tests were performed with the SOFM parameters in order to fine-tune the change detection map. The SOFM enables a better selection of cells and a corresponding adjustment of their weight vectors, which show the process of ordering and hierarchical clustering of the data. This information is essential for identifying changes over time. All algorithms were implemented in C++.

Keywords: unsupervised classification; land cover; multitemporal analysis; remote sensing


Author(s):  
Md Rezaul Karim ◽  
Oya Beyan ◽  
Achille Zappa ◽  
Ivan G Costa ◽  
Dietrich Rebholz-Schuhmann ◽  
...  

Abstract. Clustering is central to much data-driven bioinformatics research and serves as a powerful computational method. In particular, clustering helps in analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts and images. Further, clustering is used to gain insights into biological processes at the genomics level: for example, clustering of gene expressions provides insights into the natural structure inherent in the data, gene functions, cellular processes, subtypes of cells and gene regulation. Accordingly, clustering approaches, including hierarchical, centroid-based, distribution-based, density-based and self-organizing maps, have long been studied and used in classical machine learning settings. In contrast, deep learning (DL)-based representation and feature learning for clustering have not been reviewed and employed extensively. Since the quality of clustering depends not only on the distribution of data points but also on the learned representation, deep neural networks can be an effective means of mapping a high-dimensional data space into a lower-dimensional feature space, leading to improved clustering results. In this paper, we review state-of-the-art DL-based approaches for cluster analysis that are based on representation learning, which we hope will be useful, particularly for bioinformatics research. Further, we explore in detail the training procedures of DL-based clustering algorithms, point out different clustering quality metrics and evaluate several DL-based approaches on three bioinformatics use cases: bioimaging, cancer genomics and biomedical text mining. We believe this review and the evaluation results will provide valuable insights and serve as a starting point for researchers wanting to apply DL-based unsupervised methods to emerging bioinformatics research problems.


2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU

Author(s):  
Punit Rathore ◽  
James C. Bezdek ◽  
Dheeraj Kumar ◽  
Sutharshan Rajasegarar ◽  
Marimuthu Palaniswami
