scholarly journals Multi-Nyström Method Based on Multiple Kernel Learning for Large Scale Imbalanced Classification

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ling Wang ◽  
Hongqiao Wang ◽  
Guangyuan Fu

Extensions of kernel methods for the class imbalance problems have been extensively studied. Although they work well in coping with nonlinear problems, the high computation and memory costs severely limit their application to real-world imbalanced tasks. The Nyström method is an effective technique to scale kernel methods. However, the standard Nyström method needs to sample a sufficiently large number of landmark points to ensure an accurate approximation, which seriously affects its efficiency. In this study, we propose a multi-Nyström method based on mixtures of Nyström approximations to avoid the explosion of subkernel matrix, whereas the optimization to mixture weights is embedded into the model training process by multiple kernel learning (MKL) algorithms to yield more accurate low-rank approximation. Moreover, we select subsets of landmark points according to the imbalance distribution to reduce the model’s sensitivity to skewness. We also provide a kernel stability analysis of our method and show that the model solution error is bounded by weighted approximate errors, which can help us improve the learning process. Extensive experiments on several large scale datasets show that our method can achieve a higher classification accuracy and a dramatical speedup of MKL algorithms.

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Wenjia Niu ◽  
Kewen Xia ◽  
Baokai Zu ◽  
Jianchuan Bai

Unlike Support Vector Machine (SVM), Multiple Kernel Learning (MKL) allows datasets to be free to choose the useful kernels based on their distribution characteristics rather than a precise one. It has been shown in the literature that MKL holds superior recognition accuracy compared with SVM, however, at the expense of time consuming computations. This creates analytical and computational difficulties in solving MKL algorithms. To overcome this issue, we first develop a novel kernel approximation approach for MKL and then propose an efficient Low-Rank MKL (LR-MKL) algorithm by using the Low-Rank Representation (LRR). It is well-acknowledged that LRR can reduce dimension while retaining the data features under a global low-rank constraint. Furthermore, we redesign the binary-class MKL as the multiclass MKL based on pairwise strategy. Finally, the recognition effect and efficiency of LR-MKL are verified on the datasets Yale, ORL, LSVT, and Digit. Experimental results show that the proposed LR-MKL algorithm is an efficient kernel weights allocation method in MKL and boosts the performance of MKL largely.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Xiaoyi Guo ◽  
Wei Zhou ◽  
Yan Yu ◽  
Yijie Ding ◽  
Jijun Tang ◽  
...  

All drugs usually have side effects, which endanger the health of patients. To identify potential side effects of drugs, biological and pharmacological experiments are done but are expensive and time-consuming. So, computation-based methods have been developed to accurately and quickly predict side effects. To predict potential associations between drugs and side effects, we propose a novel method called the Triple Matrix Factorization- (TMF-) based model. TMF is built by the biprojection matrix and latent feature of kernels, which is based on Low Rank Approximation (LRA). LRA could construct a lower rank matrix to approximate the original matrix, which not only retains the characteristics of the original matrix but also reduces the storage space and computational complexity of the data. To fuse multivariate information, multiple kernel matrices are constructed and integrated via Kernel Target Alignment-based Multiple Kernel Learning (KTA-MKL) in drug and side effect space, respectively. Compared with other methods, our model achieves better performance on three benchmark datasets. The values of the Area Under the Precision-Recall curve (AUPR) are 0.677, 0.685, and 0.680 on three datasets, respectively.


2014 ◽  
Vol 143 ◽  
pp. 68-79 ◽  
Author(s):  
Alain Rakotomamonjy ◽  
Sukalpa Chanda

2018 ◽  
Vol 10 (10) ◽  
pp. 1639 ◽  
Author(s):  
Tianming Zhan ◽  
Le Sun ◽  
Yang Xu ◽  
Guowei Yang ◽  
Yan Zhang ◽  
...  

High dimensional image classification is a fundamental technique for information retrieval from hyperspectral remote sensing data. However, data quality is readily affected by the atmosphere and noise in the imaging process, which makes it difficult to achieve good classification performance. In this paper, multiple kernel learning-based low rank representation at superpixel level (Sp_MKL_LRR) is proposed to improve the classification accuracy for hyperspectral images. Superpixels are generated first from the hyperspectral image to reduce noise effect and form homogeneous regions. An optimal superpixel kernel parameter is then selected by the kernel matrix using a multiple kernel learning framework. Finally, a kernel low rank representation is applied to classify the hyperspectral image. The proposed method offers two advantages. (1) The global correlation constraint is exploited by the low rank representation, while the local neighborhood information is extracted as the superpixel kernel adaptively learns the high-dimensional manifold features of the samples in each class; (2) It can meet the challenges of multiscale feature learning and adaptive parameter determination in the conventional kernel methods. Experimental results on several hyperspectral image datasets demonstrate that the proposed method outperforms several state-of-the-art classifiers tested in terms of overall accuracy, average accuracy, and kappa statistic.


2012 ◽  
Vol 24 (7) ◽  
pp. 1853-1881 ◽  
Author(s):  
Hideitsu Hino ◽  
Nima Reyhani ◽  
Noboru Murata

Kernel methods are known to be effective for nonlinear multivariate analysis. One of the main issues in the practical use of kernel methods is the selection of kernel. There have been a lot of studies on kernel selection and kernel learning. Multiple kernel learning (MKL) is one of the promising kernel optimization approaches. Kernel methods are applied to various classifiers including Fisher discriminant analysis (FDA). FDA gives the Bayes optimal classification axis if the data distribution of each class in the feature space is a gaussian with a shared covariance structure. Based on this fact, an MKL framework based on the notion of gaussianity is proposed. As a concrete implementation, an empirical characteristic function is adopted to measure gaussianity in the feature space associated with a convex combination of kernel functions, and two MKL algorithms are derived. From experimental results on some data sets, we show that the proposed kernel learning followed by FDA offers strong classification power.


2017 ◽  
Vol 250 ◽  
pp. 1-15 ◽  
Author(s):  
Liang Lan ◽  
Kai Zhang ◽  
Hancheng Ge ◽  
Wei Cheng ◽  
Jun Liu ◽  
...  

2021 ◽  
Vol 7 ◽  
pp. e363
Author(s):  
Nisar Wani ◽  
Khalid Raza

High throughput multi-omics data generation coupled with heterogeneous genomic data fusion are defining new ways to build computational inference models. These models are scalable and can support very large genome sizes with the added advantage of exploiting additional biological knowledge from the integration framework. However, the limitation with such an arrangement is the huge computational cost involved when learning from very large datasets in a sequential execution environment. To overcome this issue, we present a multiple kernel learning (MKL) based gene regulatory network (GRN) inference approach wherein multiple heterogeneous datasets are fused using MKL paradigm. We formulate the GRN learning problem as a supervised classification problem, whereby genes regulated by a specific transcription factor are separated from other non-regulated genes. A parallel execution architecture is devised to learn a large scale GRN by decomposing the initial classification problem into a number of subproblems that run as multiple processes on a multi-processor machine. We evaluate the approach in terms of increased speedup and inference potential using genomic data from Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The results thus obtained demonstrate that the proposed method exhibits better classification accuracy and enhanced speedup compared to other state-of-the-art methods while learning large scale GRNs from multiple and heterogeneous datasets.


Sign in / Sign up

Export Citation Format

Share Document