High-Dimensional Separability for One- and Few-Shot Learning

Alexander N. Gorban; Bogdan Grechuk; Evgeny M. Mirkes; Sergey V. Stasenko; Ivan Y. Tyukin

doi:10.3390/e23081090

High-Dimensional Separability for One- and Few-Shot Learning

Entropy ◽

10.3390/e23081090 ◽

2021 ◽

Vol 23 (8) ◽

pp. 1090

Author(s):

Alexander N. Gorban ◽

Bogdan Grechuk ◽

Evgeny M. Mirkes ◽

Sergey V. Stasenko ◽

Ivan Y. Tyukin

Keyword(s):

Domain Adaptation ◽

Cluster Structure ◽

Principal Component ◽

Component Analysis ◽

High Dimensional ◽

Separation Theorems ◽

Mathematical Basis ◽

Fine Grained ◽

New Family ◽

Fine Grained Structure

This work is driven by a practical question: corrections of Artificial Intelligence (AI) errors. These corrections should be quick and non-iterative. To solve this problem without modification of a legacy AI system, we propose special `external’ devices, correctors. Elementary correctors consist of two parts, a classifier that separates the situations with high risk of error from the situations in which the legacy AI system works well and a new decision that should be recommended for situations with potential errors. Input signals for the correctors can be the inputs of the legacy AI system, its internal signals, and outputs. If the intrinsic dimensionality of data is high enough then the classifiers for correction of small number of errors can be very simple. According to the blessing of dimensionality effects, even simple and robust Fisher’s discriminants can be used for one-shot learning of AI correctors. Stochastic separation theorems provide the mathematical basis for this one-short learning. However, as the number of correctors needed grows, the cluster structure of data becomes important and a new family of stochastic separation theorems is required. We refuse the classical hypothesis of the regularity of the data distribution and assume that the data can have a rich fine-grained structure with many clusters and corresponding peaks in the probability density. New stochastic separation theorems for data with fine-grained structure are formulated and proved. On the basis of these theorems, the multi-correctors for granular data are proposed. The advantages of the multi-corrector technology were demonstrated by examples of correcting errors and learning new classes of objects by a deep convolutional neural network on the CIFAR-10 dataset. The key problems of the non-classical high-dimensional data analysis are reviewed together with the basic preprocessing steps including the correlation transformation, supervised Principal Component Analysis (PCA), semi-supervised PCA, transfer component analysis, and new domain adaptation PCA.

Download Full-text

High-Dimensional Separability for One- And Few-Shot Learning

10.20944/preprints202106.0718.v1 ◽

2021 ◽

Author(s):

Alexander N Gorban ◽

Bogdan Grechuk ◽

Evgeny M Mirkes ◽

Sergey V Stasenko ◽

Ivan Y Tyukin

Keyword(s):

Normal Operation ◽

High Dimensional ◽

Separation Theorems ◽

Grand Challenge ◽

Compact Embedding ◽

Linear Discriminant ◽

Fine Grained ◽

The Hierarchical Structure ◽

Fine Grained Structure ◽

The One

This work is driven by a practical question, corrections of Artificial Intelligence (AI) errors. Systematic re-training of a large AI system is hardly possible. To solve this problem, special external devices, correctors, are developed. They should provide quick and non-iterative system fix without modification of a legacy AI system. A common universal part of the AI corrector is a classifier that should separate undesired and erroneous behavior from normal operation. Training of such classifiers is a grand challenge at the heart of the one- and few-shot learning methods. Effectiveness of one- and few-short methods is based on either significant dimensionality reductions or the blessing of dimensionality effects. Stochastic separability is a blessing of dimensionality phenomenon that allows one-and few-shot error correction: in high-dimensional datasets under broad assumptions each point can be separated from the rest of the set by simple and robust linear discriminant. The hierarchical structure of data universe is introduced where each data cluster has a granular internal structure, etc. New stochastic separation theorems for the data distributions with fine-grained structure are formulated and proved. Separation theorems in infinite-dimensional limits are proven under assumptions of compact embedding of patterns into data space. New multi-correctors of AI systems are presented and illustrated with examples of predicting errors and learning new classes of objects by a deep convolutional neural network.

Download Full-text

Using principal component analysis for neural network high-dimensional potential energy surface

The Journal of Chemical Physics ◽

10.1063/5.0009264 ◽

2020 ◽

Vol 152 (23) ◽

pp. 234103

Author(s):

Bastien Casier ◽

Stéphane Carniato ◽

Tsveta Miteva ◽

Nathalie Capron ◽

Nicolas Sisourat

Keyword(s):

Neural Network ◽

Principal Component Analysis ◽

Potential Energy ◽

Potential Energy Surface ◽

Energy Surface ◽

Principal Component ◽

Component Analysis ◽

High Dimensional

Download Full-text

High-Dimensional Data Dimension Reduction Based on KECA

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.303-306.1101 ◽

2013 ◽

Vol 303-306 ◽

pp. 1101-1104 ◽

Cited By ~ 2

Author(s):

Yong De Hu ◽

Jing Chang Pan ◽

Xin Tan

Keyword(s):

Principal Component Analysis ◽

Dimension Reduction ◽

High Dimensional Data ◽

Principal Component ◽

Good Method ◽

Component Analysis ◽

Renyi Entropy ◽

Rényi Entropy ◽

Kernel Principal Component Analysis ◽

High Dimensional

Kernel entropy component analysis (KECA) reveals the original data’s structure by kernel matrix. This structure is related to the Renyi entropy of the data. KECA maintains the invariance of the original data’s structure by keeping the data’s Renyi entropy unchanged. This paper described the original data by several components on the purpose of dimension reduction. Then the KECA was applied in celestial spectra reduction and was compared with Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA) by experiments. Experimental results show that the KECA is a good method in high-dimensional data reduction.

Download Full-text

Principal Component Analysis (PCA) for high-dimensional data. PCA is dead. Long live PCA

Perspectives on Big Data Analysis - Contemporary Mathematics ◽

10.1090/conm/622/12430 ◽

2014 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Fan Yang ◽

Kjell Doksum ◽

Kam-Wah Tsui

Keyword(s):

Principal Component Analysis ◽

High Dimensional Data ◽

Principal Component ◽

Component Analysis ◽

High Dimensional

Download Full-text

Detecting Shilling Attacks with Automatic Features from Multiple Views

Security and Communication Networks ◽

10.1155/2019/6523183 ◽

2019 ◽

Vol 2019 ◽

pp. 1-13 ◽

Cited By ~ 2

Author(s):

Yaojun Hao ◽

Fuzhi Zhang ◽

Jian Wang ◽

Qingshan Zhao ◽

Jianfang Cao

Keyword(s):

Principal Component Analysis ◽

Recommender Systems ◽

Detection Method ◽

Principal Component ◽

Component Analysis ◽

Experimental Results ◽

Detection Methods ◽

Detection Accuracy ◽

Multiple Views ◽

Fine Grained

Due to the openness of the recommender systems, the attackers are likely to inject a large number of fake profiles to bias the prediction of such systems. The traditional detection methods mainly rely on the artificial features, which are often extracted from one kind of user-generated information. In these methods, fine-grained interactions between users and items cannot be captured comprehensively, leading to the degradation of detection accuracy under various types of attacks. In this paper, we propose an ensemble detection method based on the automatic features extracted from multiple views. Firstly, to collaboratively discover the shilling profiles, the users’ behaviors are analyzed from multiple views including ratings, item popularity, and user-user graph. Secondly, based on the data preprocessed from multiple views, the stacked denoising autoencoders are used to automatically extract user features with different corruption rates. Moreover, the features extracted from multiple views are effectively combined based on principal component analysis. Finally, according to the features extracted with different corruption rates, the weak classifiers are generated and then integrated to detect attacks. The experimental results on the MovieLens, Netflix, and Amazon datasets indicate that the proposed method can effectively detect various attacks.

Download Full-text

Enhanced Supervised Principal Component Analysis for Cancer Classification

Iraqi Journal of Science ◽

10.24996/ijs.2021.62.4.28 ◽

2021 ◽

pp. 1321-1333

Author(s):

Ghadeer JM Mahdi ◽

Bayda A. Kalaf ◽

Mundher A. Khaleel

Keyword(s):

Principal Component Analysis ◽

Gradient Descent ◽

Dimensional Space ◽

Principal Component ◽

Component Analysis ◽

Large Datasets ◽

Stochastic Gradient Descent ◽

High Dimensional ◽

Excellent Method ◽

Blue Cell

In this paper, a new hybridization of supervised principal component analysis (SPCA) and stochastic gradient descent techniques is proposed, and called as SGD-SPCA, for real large datasets that have a small number of samples in high dimensional space. SGD-SPCA is proposed to become an important tool that can be used to diagnose and treat cancer accurately. When we have large datasets that require many parameters, SGD-SPCA is an excellent method, and it can easily update the parameters when a new observation shows up. Two cancer datasets are used, the first is for Leukemia and the second is for small round blue cell tumors. Also, simulation datasets are used to compare principal component analysis (PCA), SPCA, and SGD-SPCA. The results show that SGD-SPCA is more efficient than other existing methods.

Download Full-text

A Literature Survey on High-Dimensional Sparse Principal Component Analysis

International Journal of Database Theory and Application ◽

10.14257/ijdta.2015.8.6.06 ◽

2015 ◽

Vol 8 (6) ◽

pp. 57-74

Author(s):

Shen Ning-min ◽

Li Jing

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Literature Survey ◽

High Dimensional ◽

Sparse Principal Component Analysis

Download Full-text

Objective-sensitive principal component analysis for high-dimensional inverse problems

Computational Geosciences ◽

10.1007/s10596-021-10081-y ◽

2021 ◽

Author(s):

Maksim Elizarev ◽

Andrei Mukhin ◽

Aleksey Khlyupin

Keyword(s):

Principal Component Analysis ◽

Inverse Problems ◽

Principal Component ◽

Component Analysis ◽

High Dimensional

Download Full-text

High Dimensional Bayesian Optimization Assisted by Principal Component Analysis

Parallel Problem Solving from Nature – PPSN XVI - Lecture Notes in Computer Science ◽

10.1007/978-3-030-58112-1_12 ◽

2020 ◽

pp. 169-183

Author(s):

Elena Raponi ◽

Hao Wang ◽

Mariusz Bujny ◽

Simonetta Boria ◽

Carola Doerr

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Bayesian Optimization ◽

High Dimensional

Download Full-text

Multilevel Functional Principal Component Analysis for High-Dimensional Data

Journal of Computational and Graphical Statistics ◽

10.1198/jcgs.2011.10122 ◽

2011 ◽

Vol 20 (4) ◽

pp. 852-873 ◽

Cited By ~ 33

Author(s):

Vadim Zipunnikov ◽

Brian Caffo ◽

David M. Yousem ◽

Christos Davatzikos ◽

Brian S. Schwartz ◽

...

Keyword(s):

Principal Component Analysis ◽

High Dimensional Data ◽

Principal Component ◽

Component Analysis ◽

Functional Principal Component Analysis ◽

High Dimensional ◽

Functional Principal Component

Download Full-text