Sub-Graph Regularization on Kernel Regression for Robust Semi-Supervised Dimensionality Reduction

Jiao Liu; Mingbo Zhao; Weijian Kong

doi:10.3390/e21111125

Sub-Graph Regularization on Kernel Regression for Robust Semi-Supervised Dimensionality Reduction

Entropy ◽

10.3390/e21111125 ◽

2019 ◽

Vol 21 (11) ◽

pp. 1125

Author(s):

Jiao Liu ◽

Mingbo Zhao ◽

Weijian Kong

Keyword(s):

Dimensionality Reduction ◽

Supervised Learning ◽

Kernel Regression ◽

Classification Performance ◽

Unlabeled Data ◽

Graph Regularization ◽

Linear Discriminant ◽

Benchmark Datasets ◽

Reduction Methods ◽

Class Labels

Dimensionality reduction has always been a major problem in handling high-dimensional datasets. Because they exploit labeled data, supervised dimensionality reduction methods such as Linear Discriminant Analysis tend to achieve better classification performance than unsupervised methods. However, supervised methods need sufficient labeled data to achieve satisfactory results. Semi-supervised learning (SSL) methods, which also exploit unlabeled data, can therefore be a practical alternative to relying on labeled data alone. In this paper, we develop a novel SSL method by extending anchor graph regularization (AGR) to dimensionality reduction. AGR is an accelerated semi-supervised learning method that propagates class labels to unlabeled data, but it cannot handle new incoming samples. We therefore improve AGR by adding kernel regression to its basic objective function. As a result, the proposed method can not only estimate the class labels of unlabeled data but also achieve dimensionality reduction. Extensive simulations on several benchmark datasets are conducted, and the results verify the effectiveness of the proposed method.

Dimensionality Reduction by Supervised Neighbor Embedding Using Laplacian Search

Computational and Mathematical Methods in Medicine ◽

10.1155/2014/594379 ◽

2014 ◽

Vol 2014 ◽

pp. 1-14 ◽

Cited By ~ 2

Author(s):

Jianwei Zheng ◽

Hangke Zhang ◽

Carlo Cattani ◽

Wanliang Wang

Keyword(s):

Dimensionality Reduction ◽

System Analysis ◽

Classification Performance ◽

Living System ◽

Linear Projection ◽

Reduction Methods ◽

Gradient Based ◽

Class Labels ◽

Global And Local ◽

Training Efficiency

Dimensionality reduction is an important issue for numerous applications, including biomedical image analysis and living system analysis. Neighbor embedding methods such as elastic embedding, which represent both global and local structure and can deal with multiple manifolds, can go beyond traditional dimensionality reduction methods and find better optima. Nevertheless, existing neighbor embedding algorithms cannot be directly applied to classification because they suffer from several problems: (1) high computational complexity, (2) nonparametric mappings, and (3) a lack of class label information. We propose a supervised neighbor embedding method called discriminative elastic embedding (DEE), which integrates a linear projection matrix and class labels into the final objective function. In addition, we present the Laplacian search direction for fast convergence. DEE is evaluated in three aspects: embedding visualization, training efficiency, and classification performance. Experimental results on several benchmark databases show that the proposed DEE is a supervised dimensionality reduction approach that not only has strong pattern-revealing capability but also brings computational advantages over standard gradient-based methods.

Combination of Active Learning and Semi-Supervised Learning under a Self-Training Scheme

Entropy ◽

10.3390/e21100988 ◽

2019 ◽

Vol 21 (10) ◽

pp. 988 ◽

Cited By ~ 4

Author(s):

Fazakis ◽

Kanas ◽

Aridas ◽

Karlos ◽

Kotsiantis

Keyword(s):

Active Learning ◽

Supervised Learning ◽

Unlabeled Data ◽

Classification Algorithms ◽

Training Phase ◽

Learning Methods ◽

Training Scheme ◽

Wide Range ◽

Benchmark Datasets ◽

Scientific Fields

One of the major aspects affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. In a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that improves the information contained in a given dataset. In this context, a variety of learning methods have been studied in the literature, aiming to efficiently utilize the vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by applying active learning or semi-supervised learning methods individually. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to efficiently utilize the available unlabeled data. Two effective and robust metrics, the entropy and the distribution of probabilities of the unlabeled set, are used to select the most suitable unlabeled examples for augmenting the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the base approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
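
The entropy-based selection mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example probabilities, and the rule of pseudo-labeling the lowest-entropy examples while sending the highest-entropy ones to an annotator are assumptions made here for illustration only.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_entropy(prob_rows, n_confident, n_query):
    """Rank unlabeled examples by the entropy of their predicted probabilities.

    Returns (confident, query): the indices of the n_confident lowest-entropy
    rows (assumed here to be pseudo-labeling candidates) and of the n_query
    highest-entropy rows (assumed here to be sent to a human annotator).
    """
    order = sorted(range(len(prob_rows)),
                   key=lambda i: shannon_entropy(prob_rows[i]))
    confident = order[:n_confident]
    query = order[len(order) - n_query:] if n_query > 0 else []
    return confident, query


# Hypothetical class-probability outputs for four unlabeled examples.
probs = [
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
confident, query = split_by_entropy(probs, n_confident=1, n_query=1)
```

In this toy input the first row is the most peaked distribution and the second is nearly uniform, so they end up in the confident and query sets respectively.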

Randomized independent component analysis and linear discriminant analysis dimensionality reduction methods for hyperspectral image classification

Journal of Applied Remote Sensing ◽

10.1117/1.jrs.14.036507 ◽

2020 ◽

Vol 14 (03) ◽

pp. 1

Author(s):

Chippy Jayaprakash ◽

Bharath Bhushan Damodaran ◽

Sowmya Viswanathan ◽

Kutti Padannayil Soman

Keyword(s):

Discriminant Analysis ◽

Independent Component Analysis ◽

Dimensionality Reduction ◽

Linear Discriminant Analysis ◽

Image Classification ◽

Hyperspectral Image ◽

Independent Component ◽

Hyperspectral Image Classification ◽

Linear Discriminant ◽

Reduction Methods

A Prediction Approach Based on Self-Training and Deep Learning for Biological Data

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2020100104 ◽

2020 ◽

Vol 10 (4) ◽

pp. 50-64

Author(s):

Mohamed Nadjib Boufenara ◽

Mahmoud Boufaida ◽

Mohamed Lamine Berkane

Keyword(s):

Deep Learning ◽

Supervised Learning ◽

Learning Algorithm ◽

Classification Performance ◽

Biological Data ◽

Unlabeled Data ◽

Learning Methods ◽

Deep Learning Algorithm ◽

Prediction Approach ◽

F Measure

With the exponential growth of biological data, labeling this kind of data becomes difficult and costly. Although unlabeled data are far more plentiful than labeled data, most supervised learning methods are not designed to use unlabeled data. Semi-supervised learning methods are motivated by the availability of large unlabeled datasets alongside only a small number of labeled examples. However, incorporating unlabeled data into learning does not guarantee an improvement in classification performance. This paper introduces an approach based on a semi-supervised learning model, namely self-training with a deep learning algorithm, to predict missing classes from labeled and unlabeled data. To assess the performance of the proposed approach, two datasets are used with four performance measures: precision, recall, F-measure, and area under the ROC curve (AUC).
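
The four performance measures named in this abstract have standard definitions, sketched below in plain Python for a binary problem. The labels, predictions, and scores are made-up illustration data, and the pairwise (rank-based) AUC computation is one common formulation, not necessarily the one used in the paper.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-measure (F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth, hard predictions, and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```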

Multiple Kernel Dimensionality Reduction via Ratio-Trace and Marginal Fisher Analysis

Mathematical Problems in Engineering ◽

10.1155/2019/6941475 ◽

2019 ◽

Vol 2019 ◽

pp. 1-8

Author(s):

Hui Xu ◽

Yongguo Yang ◽

Xin Wang ◽

Mingming Liu ◽

Hongxia Xie ◽

...

Keyword(s):

Dimensionality Reduction ◽

Convex Combination ◽

Multiple Kernel Learning ◽

Nonlinear Dimensionality Reduction ◽

Kernel Discriminant Analysis ◽

Restrictive Assumption ◽

Optimal Weights ◽

Multiple Kernel ◽

Benchmark Datasets ◽

Reduction Methods

Traditional supervised multiple kernel learning (MKL) for dimensionality reduction is generally an extension of kernel discriminant analysis (KDA), which has some restrictive assumptions. In addition, such methods are generally based on the graph embedding framework. A more general multiple kernel-based dimensionality reduction algorithm, called multiple kernel marginal Fisher analysis (MKL-MFA), is presented for supervised nonlinear dimensionality reduction combined with a ratio-trace optimization problem. MKL-MFA aims to relax the restrictive assumption that the data of each class follow a Gaussian distribution and to find an appropriate convex combination of several base kernels. To improve the efficiency of multiple kernel dimensionality reduction, the spectral regression framework is incorporated into the optimization model. Furthermore, the optimal weights of the predefined base kernels can be obtained by solving a different convex optimization problem. Experimental results on benchmark datasets demonstrate that MKL-MFA outperforms state-of-the-art supervised multiple kernel dimensionality reduction methods.

Exploration of Data Dimensionality Reduction Methods for Improving Classification Performance of Voluntary Movements

IFMBE Proceedings - The International Conference on Health Informatics ◽

10.1007/978-3-319-03005-0_32 ◽

2014 ◽

pp. 126-129

Author(s):

Yanjuan Geng ◽

Xing Kuang ◽

Mingxing Zhu ◽

Yi Zhang ◽

Guanglin Li ◽

...

Keyword(s):

Dimensionality Reduction ◽

Classification Performance ◽

Voluntary Movements ◽

Data Dimensionality Reduction ◽

Reduction Methods

A Similar Distribution Discriminant Analysis with Orthogonal and Nearly Statistically Uncorrelated Characteristics

Mathematical Problems in Engineering ◽

10.1155/2019/3145973 ◽

2019 ◽

Vol 2019 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Zhibo Guo ◽

Ying Zhang

Keyword(s):

Discriminant Analysis ◽

Dimensionality Reduction ◽

Linear Discriminant Analysis ◽

High Dimensional Data ◽

Sensor Data ◽

High Dimensional ◽

Similar Distribution ◽

Face Database ◽

Linear Discriminant ◽

Reduction Methods

It is very difficult to process and analyze high-dimensional data directly. Therefore, it is necessary to learn a latent subspace of high-dimensional data through effective dimensionality reduction algorithms that preserve the intrinsic structure of the data and discard the less useful information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are two popular dimensionality reduction methods for preprocessing high-dimensional sensor data. LDA comprises two basic methods, namely, classic linear discriminant analysis and FS linear discriminant analysis. In this paper, a new method, called similar distribution discriminant analysis (SDDA), is proposed based on the similarity of the samples’ distributions. Furthermore, a method for solving for the optimal discriminant vectors is given. These discriminant vectors are orthogonal and nearly statistically uncorrelated. SDDA overcomes the disadvantages of PCA and LDA, and the extracted features are more effective. The recognition performance of SDDA substantially exceeds that of PCA and LDA. Experiments on the Yale face database, the FERET face database, and the UCI multiple features dataset demonstrate that the proposed method is effective. The results reveal that SDDA obtains better performance than the compared dimensionality reduction methods.

Locality Adaptive Discriminant Analysis

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/306 ◽

2017 ◽

Cited By ~ 14

Author(s):

Xuelong Li ◽

Mulin Chen ◽

Feiping Nie ◽

Qi Wang

Keyword(s):

Discriminant Analysis ◽

Dimensionality Reduction ◽

Real World ◽

Original Data ◽

Additional Parameter ◽

Distributed Data ◽

Local Data ◽

Linear Discriminant ◽

Benchmark Datasets ◽

Neighbor Relationship

Linear Discriminant Analysis (LDA) is a popular technique for supervised dimensionality reduction, and its performance is satisfactory when dealing with Gaussian distributed data. However, because it neglects local data structure, LDA is inapplicable to many real-world situations. Some works therefore focus on discriminant analysis between neighboring points, which can easily be affected by noise in the original data space. In this paper, we propose a new supervised dimensionality reduction method, Locality Adaptive Discriminant Analysis (LADA), to learn a representative subspace of the data. Compared to LDA and its variants, the proposed method has three salient advantages: (1) it finds the principal projection directions without imposing any assumption on the data distribution; (2) it is able to exploit the local manifold structure of data in the desired subspace; (3) it exploits the points’ neighbor relationships automatically without introducing any additional parameter to be tuned. Results on synthetic datasets and real-world benchmark datasets demonstrate the superiority of the proposed method.
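
For reference, the classical two-class LDA baseline that this abstract contrasts LADA with can be sketched in a few lines of plain Python. This shows only the textbook Fisher direction w = S_w^{-1}(m_1 - m_0) for 2-D points; it is not the LADA algorithm, and the helper names and toy data are assumptions made for illustration.

```python
def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter2(points, m):
    """2x2 scatter matrix: sum over points of (x - m)(x - m)^T."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fisher_direction(class0, class1):
    """Fisher discriminant direction w = S_w^{-1} (m1 - m0) for 2-D data."""
    m0, m1 = mean2(class0), mean2(class1)
    s0, s1 = scatter2(class0, m0), scatter2(class1, m1)
    # Within-class scatter S_w = S_0 + S_1 (symmetric 2x2 matrix [[a, b], [b, d]]).
    a = s0[0][0] + s1[0][0]
    b = s0[0][1] + s1[0][1]
    d = s0[1][1] + s1[1][1]
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form inverse of a 2x2 matrix applied to the mean difference.
    return [(d * dx - b * dy) / det, (-b * dx + a * dy) / det]


def project(points, w):
    """Project 2-D points onto the 1-D subspace spanned by w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Toy data: class1 is class0 shifted by (+1, -1), so the classes overlap
# along each coordinate axis but separate along the Fisher direction.
class0 = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.5), (2.0, 2.5)]
class1 = [(2.0, 1.0), (3.0, 2.0), (4.0, 3.5), (3.0, 1.5)]

w = fisher_direction(class0, class1)
z0, z1 = project(class0, w), project(class1, w)
```

Because the direction depends only on class means and pooled scatter, it carries the global, Gaussian-style assumption the abstract refers to; local neighborhood structure plays no role in it.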

A COMPARISON OF TWO STRATEGIES FOR AVOIDING NEGATIVE TRANSFER IN DOMAIN ADAPTATION BASED ON LOGISTIC REGRESSION

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-845-2018 ◽

2018 ◽

Vol XLII-2 ◽

pp. 845-852 ◽

Cited By ~ 1

Author(s):

A. Paul ◽

K. Vogt ◽

F. Rottensteiner ◽

J. Ostermann ◽

C. Heipke

Keyword(s):

Negative Transfer ◽

Domain Adaptation ◽

Classification Performance ◽

Target Domain ◽

Maximum Mean Discrepancy ◽

Source Domain ◽

Training Samples ◽

Benchmark Datasets ◽

Consistent Performance ◽

Class Labels

In this paper we deal with the problem of measuring the similarity between training and test datasets in the context of transfer learning (TL) for image classification. TL tries to transfer knowledge from a source domain, where labelled training samples are abundant but the data may follow a different distribution, to a target domain, where labelled training samples are scarce or even unavailable, assuming that the domains are related. Thus, the requirements w.r.t. the availability of labelled training samples in the target domain are reduced. In particular, if no labelled target data are available, it is inherently difficult to find a robust measure of relatedness between the source and target domains. This is of crucial importance for the performance of TL, because knowledge transfer between unrelated data may lead to negative transfer, i.e. to a decrease of classification performance after transfer. We address the problem of measuring the relatedness between source and target datasets and investigate three different strategies to predict and, consequently, to avoid negative transfer. The first strategy is based on circular validation. The second strategy relies on the Maximum Mean Discrepancy (MMD) similarity metric, whereas the third one is an extension of MMD which incorporates the knowledge about the class labels in the source domain. Our method is evaluated using two different benchmark datasets. The experiments highlight the strengths and weaknesses of the investigated methods. We also show that it is possible to reduce the amount of negative transfer using these strategies for a TL method and to generate a consistent performance improvement over the whole dataset.
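
The Maximum Mean Discrepancy used by the second strategy can be illustrated with the standard biased estimator of the squared MMD. The sketch below is plain Python under stated assumptions: the RBF kernel, the bandwidth parameter `gamma`, and the toy samples are chosen here for illustration and are not taken from the paper. Larger values indicate that the two samples are less related.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Hypothetical source sample and two candidate target samples:
# one slightly shifted (related) and one far away (unrelated).
source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
near_target = [(x + 0.1, y + 0.1) for x, y in source]
far_target = [(x + 3.0, y + 3.0) for x, y in source]

mmd_same = mmd_squared(source, source)
mmd_near = mmd_squared(source, near_target)
mmd_far = mmd_squared(source, far_target)
```

Only unlabeled feature vectors from both domains are needed here, which is why a measure of this kind remains usable when no labelled target data are available.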

A Unified Factors Analysis Framework for Discriminative Feature Extraction and Object Recognition

Mathematical Problems in Engineering ◽

10.1155/2016/9347838 ◽

2016 ◽

Vol 2016 ◽

pp. 1-12

Author(s):

Ningbo Hao ◽

Jie Yang ◽

Haibin Liao ◽

Wenhua Dai

Keyword(s):

Factor Analysis ◽

Feature Extraction ◽

Dimensionality Reduction ◽

Mapping Function ◽

Classification Performance ◽

Analysis Framework ◽

Linear Discriminant ◽

Discriminative Feature ◽

Common Framework ◽

Content Factor

Various methods for feature extraction and dimensionality reduction have been proposed in recent decades, including supervised and unsupervised methods and linear and nonlinear methods. Despite the different motivations of these methods, we present in this paper a general formulation known as factor analysis to unify them within a common framework. In factor analysis, an object can be seen as being composed of content and style factors, and the objective of feature extraction and dimensionality reduction is to obtain the content factor without the style factor. There are two vital steps in the factor analysis framework: one is the design of the factor-separating objective function, including the design of the partition and weight matrix, and the other is the design of the space mapping function. In this paper, the classical Linear Discriminant Analysis (LDA) and Locality Preserving Projection (LPP) algorithms are improved based on the factor analysis framework, and LDA based on factor analysis (FA-LDA) and LPP based on factor analysis (FA-LPP) are proposed. Experimental results show the superiority of our proposed approach in classification performance compared to the classical LDA and LPP algorithms.
