Unsupervised feature extraction and band subset selection techniques based on relative entropy criteria for hyperspectral data analysis

Author(s):  
Emmanuel Arzuaga-Cruz ◽  
Luis O. Jimenez-Rodriguez ◽  
Miguel Velez-Reyes
2019 ◽  
Author(s):  
Y-h. Taguchi

Abstract: Multiomics data analysis is the central issue of genomics science. Despite this, there are no well-defined methods that can integrate multiomics data sets, which are formatted as matrices of different sizes. In this paper, I propose the use of tensor decomposition based unsupervised feature extraction as a data mining tool for multiomics data sets. It successfully integrates miRNA expression, mRNA expression, and proteome data, which were used as the demonstration example for DIABLO, a recently proposed advanced method for the integrated analysis of multiomics data sets.


2012 ◽  
Vol 3 (3) ◽  
pp. 269-298 ◽  
Author(s):  
Prashanth Reddy Marpu ◽  
Mattia Pedergnana ◽  
Mauro Dalla Mura ◽  
Stijn Peeters ◽  
Jon Atli Benediktsson ◽  
...  

Author(s):  
D. Akbari

In this paper, an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of the hyperspectral data: (1) unsupervised feature extraction methods, including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction methods, including decision boundary feature extraction (DBFE), discriminant analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) a genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and the watershed segmentation algorithm. The proposed approach is evaluated on the Pavia University hyperspectral data set. Experimental results show that the proposed approach using GA achieves an overall accuracy approximately 8% higher than the original MSF-based algorithm.
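As a rough illustration of the unsupervised dimension-reduction step, the sketch below computes the first principal component of a small set of pixel spectra by power iteration and projects the pixels onto it. It is a minimal sketch under stated assumptions, not the method of the paper: the function names, the toy data, and the choice of power iteration are all hypothetical, and only the first component is extracted.

```python
import math

def center(rows):
    """Subtract the per-band mean from every pixel spectrum."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (bands x bands) of centered spectra."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(rows, iters=200):
    """Approximate the leading eigenvector of the covariance by power iteration.

    Assumption: the all-ones start vector is not orthogonal to the
    leading eigenvector, which holds for the toy data used here.
    """
    cov = covariance(center(rows))
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(rows, component):
    """Score of each centered pixel along the given component."""
    return [sum(a * b for a, b in zip(r, component)) for r in center(rows)]

# Hypothetical toy "image": 4 pixels, 3 spectral bands.
pixels = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
pc1 = first_principal_component(pixels)
scores = project(pixels, pc1)
```

In a pipeline like the one described, the reduced scores (usually several components, not one) would replace the original bands as input to the classifier.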


2021 ◽  
Author(s):  
Y-h. Taguchi ◽  
Turki Turki

Motivation: Feature selection in multi-omics data analysis remains challenging because omics data include 10^2 to 10^5 features. How to weight an individual omics dataset is unclear, and the choice greatly affects the outcome of feature selection. In this study, a recently proposed kernel tensor decomposition (KTD)-based unsupervised feature extraction (FE) was extended to integrate multi-omics datasets measured over common samples in a weight-free manner. Results: KTD-based unsupervised FE was reformatted as the collection of kernelized tensors sharing common samples and was applied to synthetic, as well as real, datasets. The proposed advanced KTD-based unsupervised FE performed comparably with the previously proposed KTD, as well as TD-based unsupervised FE, with reduced memory and central processing unit time. This advanced KTD method, specifically designed for multi-omics analysis, attributes P-values to features, which other multi-omics-oriented methods rarely do. Availability: Sample R code is available at https://github.com/tagtag/MultiR/
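The idea of kernelized representations sharing common samples can be sketched as follows: each omics matrix, whatever its number of features, is turned into a sample-by-sample kernel matrix, so all datasets end up the same size. This is only a minimal sketch under assumptions; the linear kernel, the function name `linear_kernel`, and the toy matrices are hypothetical choices, and no tensor decomposition is performed here.

```python
def linear_kernel(matrix):
    """Return the N x N Gram matrix of an N-samples x F-features matrix.

    Assumption: a plain linear kernel (inner product between samples);
    the abstract does not say which kernel is used.
    """
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in matrix]
            for row_i in matrix]

# Hypothetical toy data: 3 common samples, different feature counts.
mrna = [[1, 0, 2, 1],
        [0, 1, 1, 0],
        [2, 1, 0, 1]]      # 3 samples x 4 features
mirna = [[1, 2],
         [0, 1],
         [3, 0]]           # 3 samples x 2 features

# Each kernel is 3 x 3 regardless of the original feature count,
# so the kernels can be collected along a third "omics" axis.
kernels = [linear_kernel(mrna), linear_kernel(mirna)]
```

Because every kernel has the same sample-by-sample shape, the collection can be handled jointly without padding or explicitly weighting the individual feature spaces.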


2021 ◽  
Author(s):  
Makoto Kashima ◽  
Nobuyoshi Kumagai ◽  
Hiromi Hirata ◽  
Y-h. Taguchi

RNA-Seq data analysis of non-model organisms is often difficult because of the lack of a well-annotated genome. In model organisms, after short reads are mapped to the genome, it is possible to focus the analysis on well-annotated regions. In non-model organisms, however, contigs must be generated by de novo assembly. This can result in a large number of transcripts, making it difficult to remove redundancy. A large number of transcripts can also make it difficult to recognize differentially expressed transcripts (DETs) between two or more experimental conditions, because P-values must be corrected for multiple comparisons, and the effect of the correction grows as the number of transcripts increases. After heavy correction, even sufficiently small P-values often fail to reach significance. In this study, we applied a recently proposed tensor decomposition (TD)-based unsupervised feature extraction (FE) to RNA-seq data obtained for a non-model organism, a planarian; we successfully obtained a limited number of transcripts whose expression was altered between normal and defective samples as well as over the course of time. TD-based unsupervised FE is expected to be an effective tool that can identify a limited number of DETs, even when only a poorly annotated genome is available.
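The point that multiple-comparison correction becomes harsher as the number of transcripts grows can be illustrated with a small sketch. The abstract does not name a correction procedure; the Benjamini-Hochberg adjustment used below is an assumption chosen for illustration, and the helper `lone_hit` and its numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def lone_hit(m, p=0.001):
    """Hypothetical scenario: one small P-value among m - 1 null results."""
    return [p] + [1.0] * (m - 1)

# The same raw P-value of 0.001 is adjusted to about 0.01 among 10
# transcripts but to about 0.1 among 100 transcripts.
few = benjamini_hochberg(lone_hit(10))[0]
many = benjamini_hochberg(lone_hit(100))[0]
```

With a 0.05 threshold, the transcript is called significant in the small set but not in the large one, which is the difficulty that selecting a limited number of candidate transcripts is meant to ease.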

