Sub-Graph Regularization on Kernel Regression for Robust Semi-Supervised Dimensionality Reduction

Entropy ◽  
2019 ◽  
Vol 21 (11) ◽  
pp. 1125
Author(s):  
Jiao Liu ◽  
Mingbo Zhao ◽  
Weijian Kong

Dimensionality reduction has always been a major problem in handling high-dimensional datasets. Because they exploit labeled data, supervised dimensionality reduction methods such as Linear Discriminant Analysis tend to achieve better classification performance than unsupervised methods. However, supervised methods need sufficient labeled data in order to achieve satisfying results. Therefore, semi-supervised learning (SSL) methods can be a practical choice compared with relying on labeled data alone. In this paper, we develop a novel SSL method by extending anchor graph regularization (AGR) for dimensionality reduction. In detail, AGR is an accelerated semi-supervised learning method that propagates class labels to unlabeled data. However, it cannot handle new incoming samples. We therefore improve AGR by adding kernel regression to its basic objective function. As a result, the proposed method can not only estimate the class labels of unlabeled data but also achieve dimensionality reduction. Extensive simulations on several benchmark datasets are conducted, and the simulation results verify the effectiveness of the proposed method.

2014 ◽  
Vol 2014 ◽  
pp. 1-14 ◽  
Author(s):  
Jianwei Zheng ◽  
Hangke Zhang ◽  
Carlo Cattani ◽  
Wanliang Wang

Dimensionality reduction is an important issue for numerous applications, including biomedical image analysis and living system analysis. Neighbor embedding methods, which represent both global and local structure and can deal with multiple manifolds, such as the elastic embedding techniques, can go beyond traditional dimensionality reduction methods and find better optima. Nevertheless, existing neighbor embedding algorithms cannot be directly applied to classification because they suffer from several problems: (1) high computational complexity, (2) nonparametric mappings, and (3) lack of class label information. We propose a supervised neighbor embedding method called discriminative elastic embedding (DEE), which integrates a linear projection matrix and class labels into the final objective function. In addition, we present the Laplacian search direction for fast convergence. DEE is evaluated in three aspects: embedding visualization, training efficiency, and classification performance. Experimental results on several benchmark databases show that the proposed DEE is a supervised dimensionality reduction approach that not only has strong pattern-revealing capability but also brings computational advantages over standard gradient-based methods.


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 988 ◽  
Author(s):  
Fazakis ◽  
Kanas ◽  
Aridas ◽  
Karlos ◽  
Kotsiantis

One of the major aspects affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. For a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that improves the information contained in a given dataset. In this context, a variety of learning methods have been studied in the literature, aiming to efficiently utilize the vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by individually applying active learning or semi-supervised learning methods. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to efficiently utilize the available unlabeled data. The effective and robust metrics of entropy and the probability distribution over the unlabeled set are used to select the most suitable unlabeled examples for augmenting the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the base approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
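
The entropy-based selection mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the two selection criteria are combined, so the split below is an assumption made for illustration only: low-entropy (confident) predictions are treated as self-training candidates, and high-entropy (uncertain) predictions as active-learning queries. The function names `shannon_entropy` and `split_by_entropy` are hypothetical and do not come from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_entropy(unlabeled_probs, n_confident, n_uncertain):
    """Rank unlabeled examples by the entropy of their predicted probabilities.

    Assumption (not from the paper): the lowest-entropy examples become
    pseudo-label candidates for self-training, and the highest-entropy
    examples become queries for a human annotator (active learning).
    """
    ranked = sorted(
        range(len(unlabeled_probs)),
        key=lambda i: shannon_entropy(unlabeled_probs[i]),
    )
    confident = ranked[:n_confident]
    # Guard against n_uncertain == 0, since ranked[-0:] would return everything.
    uncertain = ranked[len(ranked) - n_uncertain:] if n_uncertain > 0 else []
    return confident, uncertain


# Toy class-probability predictions for four unlabeled examples.
probs = [
    [0.50, 0.50],  # maximally uncertain
    [0.95, 0.05],  # very confident
    [0.70, 0.30],
    [0.85, 0.15],
]
confident, uncertain = split_by_entropy(probs, n_confident=2, n_uncertain=1)
print("pseudo-label candidates:", confident)
print("query for annotation:", uncertain)
```

In a full self-training loop, the confident examples would be added to the labeled set with their predicted labels, the uncertain ones with annotator-provided labels, and the classifier retrained; those steps are omitted here.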


Author(s):  
Mohamed Nadjib Boufenara ◽  
Mahmoud Boufaida ◽  
Mohamed Lamine Berkane

With the exponential growth of biological data, labeling this kind of data becomes difficult and costly. Although unlabeled data are comparatively more plentiful than labeled ones, most supervised learning methods are not designed to use unlabeled data. Semi-supervised learning methods are motivated by the availability of large unlabeled datasets, in contrast to the small number of labeled examples. However, incorporating unlabeled data into learning does not guarantee an improvement in classification performance. This paper introduces an approach based on a semi-supervised learning model, namely self-training with a deep learning algorithm, to predict missing classes from labeled and unlabeled data. To assess the performance of the proposed approach, two datasets are used with four performance measures: precision, recall, F-measure, and area under the ROC curve (AUC).


2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Hui Xu ◽  
Yongguo Yang ◽  
Xin Wang ◽  
Mingming Liu ◽  
Hongxia Xie ◽  
...  

Traditional supervised multiple kernel learning (MKL) for dimensionality reduction is generally an extension of kernel discriminant analysis (KDA), which has some restrictive assumptions. In addition, such methods are generally based on the graph embedding framework. A more general multiple kernel-based dimensionality reduction algorithm, called multiple kernel marginal Fisher analysis (MKL-MFA), is presented for supervised nonlinear dimensionality reduction combined with a ratio-trace optimization problem. MKL-MFA aims at relaxing the restrictive assumption that the data of each class follow a Gaussian distribution and at finding an appropriate convex combination of several base kernels. To improve the efficiency of multiple kernel dimensionality reduction, the spectral regression framework is incorporated into the optimization model. Furthermore, the optimal weights of the predefined base kernels can be obtained by solving a different convex optimization problem. Experimental results on benchmark datasets demonstrate that MKL-MFA outperforms the state-of-the-art supervised multiple kernel dimensionality reduction methods.


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Zhibo Guo ◽  
Ying Zhang

It is very difficult to process and analyze high-dimensional data directly. Therefore, it is necessary to learn a latent subspace of high-dimensional data through effective dimensionality reduction algorithms that preserve the intrinsic structure of the data and discard the less useful information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are two popular dimensionality reduction methods for high-dimensional sensor data preprocessing. LDA contains two basic methods, namely, classic linear discriminant analysis and FS linear discriminant analysis. In this paper, a new method, called similar distribution discriminant analysis (SDDA), is proposed based on the similarity of samples’ distribution. Furthermore, a method for solving for the optimal discriminant vectors is given. These discriminant vectors are orthogonal and nearly statistically uncorrelated. SDDA overcomes the disadvantages of PCA and LDA and extracts more effective features. The recognition performance of SDDA exceeds that of PCA and LDA by a large margin. Experiments on the Yale face database, the FERET face database, and the UCI multiple features dataset demonstrate that the proposed method is effective. The results reveal that SDDA obtains better performance than the compared dimensionality reduction methods.


Author(s):  
Xuelong Li ◽  
Mulin Chen ◽  
Feiping Nie ◽  
Qi Wang

Linear Discriminant Analysis (LDA) is a popular technique for supervised dimensionality reduction, and its performance is satisfying when dealing with Gaussian distributed data. However, its neglect of local data structure makes LDA inapplicable to many real-world situations. Some works therefore focus on discriminant analysis between neighboring points, which can be easily affected by noise in the original data space. In this paper, we propose a new supervised dimensionality reduction method, Locality Adaptive Discriminant Analysis (LADA), to learn a representative subspace of the data. Compared to LDA and its variants, the proposed method has three salient advantages: (1) it finds the principal projection directions without imposing any assumption on the data distribution; (2) it is able to exploit the local manifold structure of data in the desired subspace; (3) it exploits the points’ neighbor relationships automatically without introducing any additional parameter to be tuned. Performance on synthetic datasets and real-world benchmark datasets demonstrates the superiority of the proposed method.


Author(s):  
A. Paul ◽  
K. Vogt ◽  
F. Rottensteiner ◽  
J. Ostermann ◽  
C. Heipke

In this paper we deal with the problem of measuring the similarity between training and test datasets in the context of transfer learning (TL) for image classification. TL tries to transfer knowledge from a source domain, where labelled training samples are abundant but the data may follow a different distribution, to a target domain, where labelled training samples are scarce or even unavailable, assuming that the domains are related. Thus, the requirements w.r.t. the availability of labelled training samples in the target domain are reduced. In particular, if no labelled target data are available, it is inherently difficult to find a robust measure of relatedness between the source and target domains. This is of crucial importance for the performance of TL, because knowledge transfer between unrelated data may lead to negative transfer, i.e. to a decrease of classification performance after transfer. In this paper, we address the problem of measuring the relatedness between source and target datasets and investigate three different strategies to predict and, consequently, to avoid negative transfer. The first strategy is based on circular validation. The second strategy relies on the Maximum Mean Discrepancy (MMD) similarity metric, whereas the third one is an extension of MMD which incorporates the knowledge about the class labels in the source domain. Our method is evaluated using two different benchmark datasets. The experiments highlight the strengths and weaknesses of the investigated methods. We also show that it is possible to reduce the amount of negative transfer using these strategies for a TL method and to generate a consistent performance improvement over the whole dataset.
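
As a rough illustration of the Maximum Mean Discrepancy mentioned in this abstract, the sketch below computes the standard biased estimate of squared MMD between a source and a target sample. The RBF kernel, the `gamma` value, and the function names are assumptions chosen for illustration; the abstract does not specify the kernel or estimator the authors use, and the class-aware extension of MMD is not reproduced here.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2); the kernel is an assumption."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two samples.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    Values near zero suggest similar distributions; larger values suggest
    the domains differ more.
    """
    return (
        mean_kernel(source, source, gamma)
        + mean_kernel(target, target, gamma)
        - 2.0 * mean_kernel(source, target, gamma)
    )


# Toy 2-D feature vectors standing in for source and target domain samples.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near_target = [[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]]
far_target = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

print("MMD^2 (similar domains):  ", mmd_squared(source, near_target))
print("MMD^2 (dissimilar domains):", mmd_squared(source, far_target))
```

A TL pipeline could compare such a score against a threshold before transferring, but any threshold would have to be chosen empirically and is not given in the abstract.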


2016 ◽  
Vol 2016 ◽  
pp. 1-12
Author(s):  
Ningbo Hao ◽  
Jie Yang ◽  
Haibin Liao ◽  
Wenhua Dai

Various methods for feature extraction and dimensionality reduction have been proposed in recent decades, including supervised and unsupervised methods and linear and nonlinear methods. Despite the different motivations of these methods, we present in this paper a general formulation known as factor analysis to unify them within a common framework. In factor analysis, an object can be seen as being composed of content and style factors, and the objective of feature extraction and dimensionality reduction is to obtain the content factor without the style factor. There are two vital steps in the factor analysis framework: one is the design of the factor-separating objective function, including the design of the partition and weight matrix, and the other is the design of the space mapping function. In this paper, the classical Linear Discriminant Analysis (LDA) and Locality Preserving Projection (LPP) algorithms are improved based on the factor analysis framework, and LDA based on factor analysis (FA-LDA) and LPP based on factor analysis (FA-LPP) are proposed. Experimental results show the superiority of our proposed approach in classification performance compared to the classical LDA and LPP algorithms.

