Relevant Attribute Discovery in High Dimensional Data Based on Rough Sets and Unsupervised Classification: Application to Leukemia Gene Expressions

Relevant Attribute Discovery in High Dimensional Data: Application to Breast Cancer Gene Expressions

Rough Sets and Knowledge Technology - Lecture Notes in Computer Science ◽

10.1007/11795131_70 ◽

2006 ◽

pp. 482-489 ◽

Cited By ~ 11

Author(s):

Julio J. Valdés ◽

Alan J. Barton

Keyword(s):

Breast Cancer ◽

High Dimensional Data ◽

High Dimensional ◽

Cancer Gene ◽

Gene Expressions ◽

Data Application ◽

Relevant Attribute ◽

Breast Cancer Gene


CHANGE DETECTION SOFTWARE USING SELF-ORGANIZING FEATURE MAPS

Brazilian Journal of Geophysics ◽

10.22564/rbgf.v30i4.237 ◽

2012 ◽

Vol 30 (4) ◽

pp. 505

Author(s):

Nilton Correia da Silva ◽

Osmar Abílio de Carvalho Júnior ◽

Antonio Nuno de Castro Santa Rosa ◽

Renato Fontes Guimarães ◽

Roberto Arnaldo Trancoso Gomes

Keyword(s):

Remote Sensing ◽

Change Detection ◽

High Dimensional Data ◽

Unsupervised Classification ◽

Fine Tuning ◽

Primary Data ◽

High Dimensional ◽

Feature Maps ◽

Self Organizing

ABSTRACT. Self-organizing feature maps (SOFM) are a type of artificial neural network that converts complex, nonlinear, high-dimensional data into simple geometric relationships of low dimensionality. The method can also be used to classify remote sensing images, because it compresses high-dimensional data while preserving the most important topological and metric relationships of the primary data. This paper aims to develop an effective methodology for using self-organizing maps in change detection. In this study, SOFM is used for unsupervised classification of remote sensing data, considering the following attributes: spatial (x and y), spectral and temporal. The method is tested in the western region of Bahia, which has recently seen a significant increase in mechanized, monoculture agriculture. Tests were performed on the SOFM parameters in order to fine-tune the change detection map. The SOFM provides a better selection of cells and of the corresponding weight vectors, which show the ordering and hierarchical clustering of the data. This information is essential for identifying changes over time. All algorithms were implemented in C++.

Keywords: unsupervised classification; land cover; multitemporal analysis; remote sensing
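
As a rough illustration of the idea in this abstract, the sketch below trains a very small self-organizing map with NumPy and flags a pixel as changed when its best-matching cell differs between two dates. It is not the authors' C++ implementation: the grid size, the learning-rate and neighbourhood schedules, the evenly spaced initialisation, the synthetic four-band pixels, and the "different cell means change" rule are all assumptions made for this example.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the cell whose weight vector is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    row, col = np.unravel_index(np.argmin(dists), dists.shape)
    return int(row), int(col)


def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a small SOM; returns weights of shape (rows, cols, n_features)."""
    n, _ = data.shape
    # Assumed initialisation: cells spaced evenly between the per-feature
    # minimum and maximum of the data.
    t = np.linspace(0.0, 1.0, rows * cols).reshape(rows, cols, 1)
    lo, hi = data.min(axis=0), data.max(axis=0)
    weights = lo + t * (hi - lo)
    # Grid coordinates of every cell, used by the neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    rng = np.random.default_rng(0)
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, data[i])
            d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighbourhood
            # Pull the winning cell and its grid neighbours toward the sample.
            weights += lr * h[..., None] * (data[i] - weights)
            step += 1
    return weights


# Hypothetical data: 100 pixels with 4 spectral bands at two dates.
rng = np.random.default_rng(1)
n_pixels, n_changed = 100, 50
before = 0.2 + 0.02 * rng.standard_normal((n_pixels, 4))
after = before.copy()
after[:n_changed] = 0.8 + 0.02 * rng.standard_normal((n_changed, 4))

weights = train_som(np.vstack([before, after]))
changed = np.array([
    best_matching_unit(weights, b) != best_matching_unit(weights, a)
    for b, a in zip(before, after)
])
print("pixels flagged as changed:", int(changed.sum()))
```

In a real multitemporal study the input vectors would also carry the spatial and temporal attributes mentioned in the abstract, and the map parameters would need the kind of tuning the authors describe.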


Deep learning-based clustering approaches for bioinformatics

Briefings in Bioinformatics ◽

10.1093/bib/bbz170 ◽

2020 ◽

Cited By ~ 7

Author(s):

Md Rezaul Karim ◽

Oya Beyan ◽

Achille Zappa ◽

Ivan G Costa ◽

Dietrich Rebholz-Schuhmann ◽

...

Keyword(s):

Deep Learning ◽

Cancer Genomics ◽

Clustering Algorithms ◽

Effective Means ◽

High Dimensional Data ◽

Computational Method ◽

High Dimensional ◽

Gene Expressions ◽

Starting Point ◽

Clustering Quality

Abstract: Clustering is central to much data-driven bioinformatics research and serves as a powerful computational method. In particular, clustering helps in analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts and images. Further, clustering is used to gain insights into biological processes at the genomics level; for example, clustering gene expressions provides insights into the natural structure inherent in the data, gene functions, cellular processes, subtypes of cells and gene regulation. Clustering approaches, including hierarchical, centroid-based, distribution-based, density-based and self-organizing maps, have long been studied and used in classical machine learning settings. In contrast, deep learning (DL)-based representation and feature learning for clustering have not been reviewed or employed extensively. Since the quality of clustering depends not only on the distribution of data points but also on the learned representation, deep neural networks can be an effective means of mapping a high-dimensional data space into a lower-dimensional feature space, leading to improved clustering results. In this paper, we review state-of-the-art DL-based approaches for cluster analysis that are based on representation learning, which we hope will be useful, particularly for bioinformatics research. Further, we explore in detail the training procedures of DL-based clustering algorithms, point out different clustering quality metrics and evaluate several DL-based approaches on three bioinformatics use cases: bioimaging, cancer genomics and biomedical text mining. We believe this review and the evaluation results will provide valuable insights and serve as a starting point for researchers wanting to apply DL-based unsupervised methods to emerging bioinformatics research problems.
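
The pipeline this abstract describes (map the data to a lower-dimensional feature space, cluster there, then score the result) can be sketched in a few lines. The example below is only a stand-in under stated assumptions: it uses a linear PCA projection in place of a learned deep representation, a plain k-means with farthest-first initialisation, purity as the quality metric, and synthetic 50-dimensional data. None of these choices is taken from the review itself.

```python
import numpy as np


def pca_embed(x, n_components=2):
    """Project x onto its top principal components (linear stand-in for an encoder)."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[:n_components].T


def kmeans(z, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [z[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen center.
        d = np.min(
            np.linalg.norm(z[:, None, :] - np.array(centers)[None, :, :], axis=-1),
            axis=1,
        )
        centers.append(z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=-1)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def purity(true_labels, cluster_labels):
    """Fraction of points that carry the majority true label of their cluster."""
    total = 0
    for c in np.unique(cluster_labels):
        total += np.bincount(true_labels[cluster_labels == c]).max()
    return total / len(true_labels)


# Hypothetical data: three groups of 60 samples in 50 dimensions.
rng = np.random.default_rng(0)
group_centers = np.zeros((3, 50))
group_centers[1, 0] = 10.0
group_centers[2, 1] = 10.0
true_labels = np.repeat(np.arange(3), 60)
x = group_centers[true_labels] + 0.5 * rng.standard_normal((180, 50))

z = pca_embed(x, n_components=2)        # lower-dimensional feature space
cluster_labels, _ = kmeans(z, k=3)      # cluster in the embedded space
score = purity(true_labels, cluster_labels)
print(f"purity = {score:.2f}")
```

Purity needs ground-truth labels, so it only applies to benchmark settings; swapping `pca_embed` for a trained autoencoder's encoder would bring the sketch closer to the DL-based approaches the review covers.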


Large Sample Covariance Matrices and High-Dimensional Data Analysis

10.1017/cbo9781107588080 ◽

2015 ◽

Cited By ~ 26

Author(s):

Jianfeng Yao ◽

Shurong Zheng ◽

Zhidong Bai

Keyword(s):

Data Analysis ◽

High Dimensional Data ◽

Covariance Matrices ◽

High Dimensional ◽

Large Sample ◽

Sample Covariance Matrices ◽

Sample Covariance ◽

High Dimensional Data Analysis


Fractal-Based Methods as a Technique for Estimating the Intrinsic Dimensionality of High-Dimensional Data: A Survey

Informatica ◽

10.15388/informatica.2016.84 ◽

2016 ◽

Vol 27 (2) ◽

pp. 257-281 ◽

Cited By ~ 5

Author(s):

Rasa Karbauskaitė ◽

Gintautas Dzemyda

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Intrinsic Dimensionality


A Fast Clustering Algorithm for Large-scale and High Dimensional Data

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2009.00859 ◽

2009 ◽

Vol 35 (7) ◽

pp. 859-866

Author(s):

Ming LIU ◽

Xiao-Long WANG ◽

Yuan-Chao LIU

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional


Improved negative selection algorithm for network anomaly detection on high-dimensional data

Journal of Computer Applications ◽

10.3724/sp.j.1087.2009.00805 ◽

2009 ◽

Vol 29 (3) ◽

pp. 805-807 ◽

Cited By ~ 1

Author(s):

Wen-zhong GUO ◽

Guo-long CHEN ◽

Qing-liang CHEN

Keyword(s):

Anomaly Detection ◽

Negative Selection ◽

High Dimensional Data ◽

High Dimensional ◽

Selection Algorithm ◽

Negative Selection Algorithm ◽

Network Anomaly Detection


An Advanced Mining Services in Predicting and Ranking User Vitality across Dynamic and High Dimensional Data Sets

SSRN Electronic Journal ◽

10.2139/ssrn.3395242 ◽

2019 ◽

Author(s):

Ch. Durga Bhavani ◽

Dr. A. Daveedu Raju ◽

Dr. V. Surya Narayana

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Data Sets


Outlier Detection in High Dimensional Data Based on the Anti-Hub and Regression Technique

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2017.8219 ◽

2017 ◽

Vol V (VIII) ◽

pp. 1543-1551

Author(s):

Golla Hemalatha

Keyword(s):

Outlier Detection ◽

High Dimensional Data ◽

Regression Technique ◽

High Dimensional


Approximate Cluster Heat Maps of Large High-Dimensional Data

2018 24th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr.2018.8545519 ◽

2018 ◽

Cited By ~ 1

Author(s):

Punit Rathore ◽

James C. Bezdek ◽

Dheeraj Kumar ◽

Sutharshan Rajasegarar ◽

Marimuthu Palaniswami

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Heat Maps
