Unsupervised Subspace Extraction via Deep Kernelized Clustering

2021 ◽  
Vol 16 (1) ◽  
pp. 1-15
Author(s):  
Gyoung S. Na ◽  
Hyunju Chang

Feature extraction has been widely studied to find informative latent features and reduce the dimensionality of data. In particular, due to the difficulty in obtaining labeled data, unsupervised feature extraction has received much attention in data mining. However, widely used unsupervised feature extraction methods require side information about data or rigid assumptions on the latent feature space. Furthermore, most feature extraction methods require a predefined dimensionality of the latent feature space, which must be manually tuned as a hyperparameter. In this article, we propose a new unsupervised feature extraction method called Unsupervised Subspace Extractor (USE), which does not require any side information or rigid assumptions on data. Furthermore, USE can find a subspace generated by a nonlinear combination of the input features and automatically determine the optimal dimensionality of the subspace for the given nonlinear combination. The feature extraction process of USE is well justified mathematically, and we also empirically demonstrate the effectiveness of USE on several benchmark datasets.

Computation ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. 39 ◽  
Author(s):  
Laura Sani ◽  
Riccardo Pecori ◽  
Monica Mordonini ◽  
Stefano Cagnoni

The so-called Relevance Index (RI) metrics are a set of recently-introduced indicators based on information theory principles that can be used to analyze complex systems by detecting the main interacting structures within them. Such structures can be described as subsets of the variables which describe the system status that are strongly statistically correlated with one another and mostly independent of the rest of the system. The goal of the work described in this paper is to apply the same principles to pattern recognition and check whether the RI metrics can also identify, in a high-dimensional feature space, attribute subsets from which it is possible to build new features which can be effectively used for classification. Preliminary results indicating that this is possible have been obtained using the RI metrics in a supervised way, i.e., by separately applying such metrics to homogeneous datasets comprising data instances which all belong to the same class, and iterating the procedure over all possible classes taken into consideration. In this work, we checked whether this would also be possible in a totally unsupervised way, i.e., by considering all data available at the same time, independently of the class to which they belong, under the hypothesis that the peculiarities of the variable sets that the RI metrics can identify correspond to the peculiarities by which data belonging to a certain class are distinguishable from data belonging to different classes. The results we obtained in experiments made with some publicly available real-world datasets show that, especially when coupled to tree-based classifiers, the performance of an RI metrics-based unsupervised feature extraction method can be comparable to or better than other classical supervised or unsupervised feature selection or extraction methods.


2020 ◽  
Vol 2 (2) ◽  
pp. 100-108
Author(s):  
Zaurarista Dyarbirru ◽  
Syahroni Hidayat

Voice is the sound emitted by living things. With the development of Automatic Speech Recognition (ASR) technology, voice can be used to make tasks easier for humans. In ASR, feature extraction plays an important role in the recognition process. The feature extraction methods commonly applied to ASR are MFCC and wavelet, each of which has advantages and disadvantages. Therefore, this study combines the wavelet feature extraction method with MFCC to maximize the existing advantages. The proposed method is called Wavelet-MFCC. Voice recognition method that does not use recommendations. System performance is determined using the Word Recognition Rate (WRR), validated with 5-fold cross-validation. The research dataset consists of voice recordings of the digits 0-9 in English. The results show that the digit speech recognition system gives its highest average value of 63%, for digit 4, using the Daubechies DB3 wavelet and the dyadic wavelet transform. Comparing the wavelet decomposition methods used, the dyadic wavelet transform performs better than the wavelet packet.
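
The evaluation protocol described above (WRR under 5-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes WRR is the percentage of correctly recognized words, uses contiguous folds without shuffling, and the names `k_fold_indices`, `word_recognition_rate`, `cross_validated_wrr`, and the `train_and_predict` callback are all hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def word_recognition_rate(reference, predicted):
    """Assumed definition: percentage of words recognized correctly."""
    correct = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * correct / len(reference)


def cross_validated_wrr(samples, labels, train_and_predict, k=5):
    """Average WRR over k folds.

    train_and_predict(train_x, train_y, test_x) is a hypothetical callback
    that trains a recognizer and returns one predicted label per test item.
    """
    rates = []
    for test_idx in k_fold_indices(len(samples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        predictions = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        rates.append(
            word_recognition_rate([labels[i] for i in test_idx], predictions)
        )
    return sum(rates) / len(rates)
```

In practice the recordings would usually be shuffled (or stratified by digit) before splitting, so that every fold contains examples of each digit.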


Author(s):  
Wei Huang ◽  
Xiaohui Wang ◽  
Jianzhong Li ◽  
Zhong Jin

Representation-based classification has received much attention in the field of face recognition. Collaborative representation-based classification (CRC) has shown robustness and high performance. In this paper, we propose a new feature extraction method based on collaborative representation. First, we obtain the coefficients of all face samples by collaborative representation. Then we define the inter-class and intra-class reconstruction errors for each sample. After that, the Fisher criterion is used to obtain the discriminative features. Finally, CRC is executed to obtain the identification results in the new feature space. Unlike other feature extraction methods, the proposed method integrates the classification criterion into the feature extraction, so the resulting feature space fits the classifier better. Experimental results on several face databases show that the proposed method is more effective than other state-of-the-art face recognition methods.
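
For orientation, the baseline CRC step that the abstract builds on can be sketched as below. This is only the plain classifier, not the proposed Fisher-criterion feature extraction: the query is coded over all training samples with a ridge penalty, and the class whose samples reconstruct it with the smallest error wins. The function names, the regularization value `lam`, and the use of the unnormalized residual are assumptions made for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def crc_classify(train, labels, query, lam=0.01):
    """Classify `query` by class-wise reconstruction error (baseline CRC)."""
    n = len(train)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Ridge-regularized coding over ALL training samples (rows of `train`):
    # alpha = (X X^T + lam I)^{-1} X q
    gram = [[dot(train[i], train[j]) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, [dot(train[i], query) for i in range(n)])

    def residual(c):
        # Reconstruct the query using only the coefficients of class c.
        recon = [0.0] * len(query)
        for i in range(n):
            if labels[i] == c:
                for d in range(len(query)):
                    recon[d] += alpha[i] * train[i][d]
        return math.sqrt(sum((q - r) ** 2 for q, r in zip(query, recon)))

    return min(sorted(set(labels)), key=residual)
```

A toy call such as `crc_classify([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]], [0, 0, 1, 1], [1.0, 0.05])` assigns the query to class 0, because the class-0 samples alone reconstruct it almost exactly.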


Author(s):  
Htwe Pa Pa Win ◽  
Phyo Thu Thu Khine ◽  
Khin Nwe Ni Tun

This paper proposes a new feature extraction method for off-line recognition of Myanmar printed documents. One of the most important factors in achieving high recognition performance in an Optical Character Recognition (OCR) system is the selection of the feature extraction method. Existing OCR systems use various feature extraction methods because of the diversity of the scripts’ natures. One major contribution of this paper is the design of logically rigorous coding-based features. To show the effectiveness of the proposed method, this paper assumes that the documents have been successfully segmented into characters and extracts features from these isolated Myanmar characters. These features are extracted using structural analysis of the Myanmar script. Experiments were carried out using the Support Vector Machine (SVM) classifier, and the results are compared with the previously proposed feature extraction method.


2010 ◽  
Vol 36 ◽  
pp. 68-74
Author(s):  
Chuan Jun Liao ◽  
Shuang Fu Suo ◽  
Wei Feng Huang

Acoustic emission (AE) techniques are put forward in this paper to monitor rub-impacts between the rotating and stationary rings of mechanical seals. By analyzing feature extraction methods for the typical rub-impact AE signal, the method combining the wavelet scalogram and the power spectrum is found to be useful and can be used to capture the feature information contained in rub-impact AE signals of mechanical seal end faces. Both simulations and experimental research show that the method is effective and can be used successfully to identify the typical features of different types of rub-impacts of mechanical seal end faces.
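
As a rough illustration of the power-spectrum half of that combination, the sketch below computes a one-sided power spectrum with a naive discrete Fourier transform and reports the dominant frequency bin. It does not cover the wavelet scalogram and is not the authors' processing chain; the function names and the 1/N scaling are assumptions, and a real AE signal would call for an FFT routine rather than this O(N²) loop.

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum via a naive DFT (bins 0 .. N/2)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)  # assumed 1/N scaling
    return spectrum


def dominant_bin(signal):
    """Index of the strongest non-DC frequency bin."""
    ps = power_spectrum(signal)
    return max(range(1, len(ps)), key=lambda k: ps[k])
```

For a synthetic test tone with exactly five cycles in a 64-sample window, `dominant_bin` picks bin 5; bin `k` corresponds to the frequency `k * fs / N` for sampling rate `fs`.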


Author(s):  
Bhuvaneswari Chandran ◽  
P. Aruna ◽  
D. Loganathan

The purpose of the chapter is to present a novel method for classifying lung diseases from computed tomography images, to assist physicians in diagnosis. The method is based on a new approach which combines a proposed M2 feature extraction method and a novel hybrid genetic approach with different types of classifiers. The feature extraction methods performed in this work are moment invariants, the proposed multiscale filter method, and the proposed M2 feature extraction method. The essential features produced by feature extraction are selected by the novel hybrid genetic algorithm for feature selection. Classification is performed by support vector machine, multilayer perceptron neural network, and Bayes Net classifiers. The results indicate that the proposed technique is an efficient and robust method. The combination of the proposed M2 feature extraction with the proposed hybrid GA and the SVM classifier achieves the maximum classification accuracy.


2020 ◽  
Vol 10 (3) ◽  
pp. 944 ◽  
Author(s):  
Ying Feng ◽  
Jianwen Wu

As a key component to ensure the safe operation of the power grid, mechanical defect diagnosis technology of gas-insulated switchgear (GIS) during operation is often neglected. At present, GIS mechanical fault detection based on vibration information has not been developed. The main reason is that the excitation current is considerable but uncontrollable in the actual operation of GIS. It is difficult to eliminate the influence of excitation on the vibration amplitude and form an effective vibration feature description technology. Therefore, this paper proposes a unified feature-extraction method for GIS vibration information that reduces the influence of current amplitude for mechanical fault diagnosis. Starting from the GIS mechanical analysis, the periodicity of vibration excitation and the influence of amplitude are discussed. Then, combined with the non-linear characteristics of GIS systems and non-linear vibration theory, the multiplier frequency energy ratio (MFER) is designed to extract vibration-unified features of GIS for diagnosing the mechanical fault under different current levels. The diagnosis results of the experimental data with different feature-extraction methods show the applicability and superiority of the proposed method in the GIS’s mechanical fault-detection field based on vibration information.


Author(s):  
NOJUN KWAK

In many pattern recognition problems, it is desirable to reduce the number of input features by extracting important features related to the problems. By focusing on only the problem-relevant features, the dimension of features can be greatly reduced, which can result in better generalization performance with less computational complexity. In this paper, we propose a feature extraction method for handling classification problems. The proposed algorithm searches for a set of linear combinations of the original features whose mutual information with the output class is maximized. The mutual information between the extracted features and the output class is calculated using probability density estimation based on the Parzen window method. A greedy algorithm using the gradient descent method is used to determine the new features. The computational load is proportional to the square of the number of samples. The proposed method was applied to several classification problems and showed performance better than or comparable to that of conventional feature extraction methods.
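
The mutual-information estimate at the heart of this approach can be sketched for a single scalar feature as below. It is a simplified illustration rather than the paper's algorithm: it assumes a Gaussian Parzen window of fixed width `h`, estimates I(F;C) = H(C) - H(C|F) by averaging the class-posterior entropy over the samples, and leaves out the search over linear combinations by gradient descent. The function name and the default `h` are hypothetical.

```python
import math


def parzen_mutual_information(features, labels, h=0.5):
    """Estimate I(F; C) in nats for a scalar feature and discrete labels."""
    n = len(features)
    classes = sorted(set(labels))

    # Class entropy H(C) from empirical priors.
    priors = [labels.count(c) / n for c in classes]
    h_c = -sum(p * math.log(p) for p in priors)

    # Conditional entropy H(C|F), averaged over the samples.
    # The double loop is where the O(n^2) cost comes from.
    h_c_given_f = 0.0
    for f in features:
        weights = {c: 0.0 for c in classes}
        for g, c in zip(features, labels):
            # Gaussian window centred on each training sample.
            weights[c] += math.exp(-((f - g) ** 2) / (2 * h * h))
        total = sum(weights.values())
        for c in classes:
            p = weights[c] / total  # Parzen estimate of p(c | f)
            if p > 0:
                h_c_given_f -= p * math.log(p) / n
    return h_c - h_c_given_f
```

With two balanced classes the estimate is bounded by log 2 (about 0.693 nats): a feature that separates the classes cleanly approaches that bound, while a feature with identical values in both classes yields an estimate of essentially zero.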

