Array Variate Random Variables with Multiway Kronecker Delta Covariance Matrix Structure

2011 ◽  
Vol 2 (1) ◽  
Author(s):  
Deniz Akdemir ◽  
Arjun K. Gupta

Standard statistical methods applied to matrix random variables often fail to describe the underlying structure in multiway data sets. After a review of the essential background material, this paper introduces the notion of an array variate random variable. A normal array variate random variable is defined, and a method for estimating the parameters of the array variate normal distribution is given. We introduce a technique called slicing for estimating the covariance matrix of high-dimensional data. Finally, principal component analysis and classification techniques are developed for array variate observations and high-dimensional data.

Author(s):  
Andrew J. Connolly ◽  
Jacob T. VanderPlas ◽  
Alexander Gray ◽  
...  

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the question of the complexity as well as size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques that are related to concepts discussed when describing Gaussian distributions, density estimation, and the concepts of information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in this chapter, and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques which address some of the weaknesses of PCA.
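The central technique introduced in this chapter, principal component analysis, can be sketched in a few lines of NumPy: center the data, take the singular value decomposition, and keep the leading directions. This is a minimal illustrative implementation, not the book's own code; the toy data and variable names are assumptions.

```python
import numpy as np

def pca(X, n_components):
    """Project data onto the directions of maximal variance.

    X: (n_samples, n_features) data matrix.
    Returns (scores, components, explained_variance of kept components).
    """
    # Center each feature; PCA directions are eigenvectors of the covariance.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T
    explained_variance = (s ** 2) / (X.shape[0] - 1)
    return scores, components, explained_variance[:n_components]

# Toy data: the second feature is twice the first plus small noise,
# so a single component captures nearly all of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + 0.01 * rng.normal(size=200)])
scores, comps, var = pca(X, 1)
```

On this toy set the recovered direction is close to (1, 2)/√5, the axis along which the two correlated features vary together.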


2006 ◽  
Vol 06 (01) ◽  
pp. L17-L28 ◽  
Author(s):  
JOSÉ MANUEL LÓPEZ-ALONSO ◽  
JAVIER ALDA

Principal Component Analysis (PCA) has been applied to the characterization of 1/f noise. The application of PCA to 1/f noise requires the definition of a stochastic multidimensional variable. The components of this variable describe the temporal evolution of the phenomenon sampled at regular time intervals. In this paper we analyze the conditions on the number of observations and the dimension of the multidimensional random variable that are necessary to apply the PCA method in a sound manner. We have tested the obtained conditions on simulated and experimental data sets obtained from imaging optical systems. The results can be extended to other fields where this kind of noise is relevant.


Energies ◽  
2020 ◽  
Vol 13 (14) ◽  
pp. 3520 ◽  
Author(s):  
Hang Li ◽  
Zhe Zhang ◽  
Xianggen Yin

Because the penetration level of renewable energy sources has increased rapidly in recent years, uncertainty in power system operation is gradually increasing. As an efficient tool for power system analysis under uncertainty, probabilistic power flow (PPF) is becoming increasingly important. The point-estimate method (PEM) is a well-known PPF algorithm. However, two significant defects limit the practical use of this method. One is that the PEM struggles to estimate high-order moments accurately; this defect makes it difficult for the PEM to describe the distribution of non-Gaussian output random variables (ORVs). The other is that the calculation burden is strongly related to the scale of the input random variables (IRVs), which makes the PEM difficult to use in large-scale power systems. A novel approach based on principal component analysis (PCA) and high-dimensional model representation (HDMR) is proposed here to overcome the defects of the traditional PEM. PCA is applied to decrease the dimension scale of the IRVs and eliminate correlations. HDMR is applied to estimate the moments of the ORVs. Because HDMR considers the cooperative effects of the IRVs, it achieves a significantly smaller estimation error, particularly for high-order moments. Case studies show that the proposed method achieves better accuracy and efficiency than the traditional PEM.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yuxian Huang ◽  
Geng Yang ◽  
Yahong Xu ◽  
Hao Zhou

In the big data era, massive, high-dimensional data is produced constantly, increasing the difficulty of analyzing and protecting it. In this paper, in order to realize dimensionality reduction and privacy protection of data, principal component analysis (PCA) and differential privacy (DP) are combined to handle these data. Moreover, a support vector machine (SVM) is used to measure the utility of the processed data. Specifically, we introduce differential privacy mechanisms at different stages of the PCA-SVM algorithm, obtaining the algorithms DPPCA-SVM and PCADP-SVM. Both algorithms satisfy (ε, 0)-DP while achieving fast classification. In addition, we evaluate the performance of the two algorithms in terms of noise expectation and classification accuracy, from the perspectives of both theoretical proof and experimental verification. To verify the performance of DPPCA-SVM, we also compare it with other algorithms. Results show that DPPCA-SVM provides excellent utility for different data sets while guaranteeing stricter privacy.
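The idea of injecting a DP mechanism at the PCA stage can be sketched as follows: perturb the empirical covariance with Laplace noise before eigendecomposition, then keep the top eigenvectors. This is only an illustrative sketch in the spirit of a DPPCA-style algorithm, not the authors' exact mechanism; the noise calibration here is per-entry and hand-wavy, and the function name and row-norm assumption are mine.

```python
import numpy as np

def dp_pca_components(X, n_components, epsilon, rng=None):
    """Illustrative sketch: PCA with Laplace noise added to the covariance.

    Assumes each row of X has L2 norm at most 1, so changing one row
    moves each covariance entry by at most 2/n. The per-entry noise
    scale below is NOT a rigorously calibrated (epsilon, 0)-DP budget;
    it only shows where the perturbation enters the pipeline.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, d = X.shape
    cov = (X.T @ X) / n
    # Laplace noise scaled to (sensitivity / epsilon); symmetrize so the
    # perturbed matrix remains a valid symmetric input to eigh.
    noise = rng.laplace(scale=2.0 / (n * epsilon), size=(d, d))
    noisy_cov = cov + (noise + noise.T) / 2
    eigvals, eigvecs = np.linalg.eigh(noisy_cov)
    # eigh returns eigenpairs in ascending order; take the top components.
    return eigvecs[:, ::-1][:, :n_components]
```

Data projected onto these noisy components would then be fed to the SVM classifier, which is how the abstract's utility comparison is framed.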


2019 ◽  
Vol 13 ◽  
pp. 174830261986744
Author(s):  
Ran Zhang ◽  
Bin Ye ◽  
Peng Liu

Nowadays, datasets containing a very large number of variables or features are routinely generated in many fields. Dimension reduction techniques are usually performed prior to statistically analyzing these datasets in order to avoid the effects of the curse of dimensionality. Principal component analysis is one of the most important techniques for dimension reduction and data visualization. However, datasets with missing values, which arise in almost every field, produce biased estimates and are difficult to handle, especially in high-dimension, low-sample-size settings. By exploiting a Lasso estimator of the population covariance matrix, we propose to regularize principal component analysis to reduce the dimensionality of datasets with missing data. The Lasso estimator of the covariance matrix is computationally tractable, obtained by solving a convex optimization problem. To illustrate the effectiveness of our method for dimension reduction, the principal component directions are evaluated by the metrics of Frobenius norm and cosine distance. Its performance is compared with that of other incomplete-data handling methods such as mean substitution and multiple imputation. Simulation results also show that our method is superior to other incomplete-data handling methods in the context of discriminant analysis of real-world high-dimensional datasets.
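Two pieces of this evaluation pipeline are simple enough to sketch: the mean-substitution baseline the abstract compares against, and the sign-invariant cosine distance used to score recovered principal directions. These are illustrative helpers under my own names, not the paper's implementation (the Lasso covariance estimator itself requires a convex solver and is omitted).

```python
import numpy as np

def mean_impute(X):
    """Baseline incomplete-data handling: replace each missing entry
    (encoded as NaN) with the mean of its column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    mask = np.isnan(X)
    # For each missing cell, substitute the mean of that cell's column.
    X[mask] = np.take(col_means, np.where(mask)[1])
    return X

def direction_cosine_distance(v, w):
    """Cosine distance between two principal directions, made
    sign-invariant because eigenvectors are defined only up to sign."""
    c = abs(v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
    return 1.0 - c
```

A distance of 0 means the estimated direction matches the true one exactly (up to sign), which is how closeness of the principal component directions would be quantified alongside the Frobenius-norm metric.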


Author(s):  
Vitaly E. Bulgakov

Motivated by the previously developed multilevel aggregation method for solving structural analysis problems, a novel two-level aggregation approach for efficient iterative solution of Principal Component Analysis (PCA) problems is proposed. The coarse aggregation model of the original covariance matrix is used in the iterative solution of the eigenvalue problem by the power iteration method. The method is tested on several data sets consisting of a large number of text documents.
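The inner solver named in this abstract, power iteration on a covariance matrix, can be sketched directly; the two-level aggregation (the paper's actual contribution) is omitted here, so this is only the baseline eigensolver, with my own function signature.

```python
import numpy as np

def power_iteration(C, n_iter=200, tol=1e-10, rng=None):
    """Leading eigenpair of a symmetric PSD matrix by power iteration.

    Repeatedly multiplies a random start vector by C and renormalizes;
    the iterate converges to the dominant eigenvector at a rate set by
    the ratio of the two largest eigenvalues.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    v = rng.normal(size=C.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = C @ v
        w_norm = np.linalg.norm(w)
        if w_norm == 0:
            break  # v is in the null space; give up
        w /= w_norm
        if np.linalg.norm(w - v) < tol:
            v = w
            break
        v = w
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    return v @ C @ v, v
```

In a PCA setting, C would be the (possibly aggregated) covariance matrix, and the returned vector is the first principal direction.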


Author(s):  
Petr Praus

In this chapter the principles and applications of principal component analysis (PCA) applied to hydrological data are presented. Four case studies demonstrate the ability of PCA to extract information about a wastewater treatment process and drinking water quality in a city network, and to find similarities in data sets of groundwater quality results and water-related images. In the first case study, the composition of raw and cleaned wastewater was characterised and its temporal changes were displayed. In the second case study, drinking water samples were divided into clusters consistent with their sampling localities. In case study III, similar samples of groundwater were recognised by calculating cosine similarity and the Euclidean and Manhattan distances. In case study IV, 32 water-related images were transformed into a large image matrix whose dimensionality was reduced by PCA. The images were then clustered using the PCA scatter plots.
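The three similarity measures used in case study III are standard and compact to state; a minimal sketch (my own helper names, operating on NumPy vectors such as PCA score rows):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1 = same direction, 0 = orthogonal."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean(a, b):
    """Straight-line (L2) distance between a and b."""
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return float(np.abs(a - b).sum())
```

Cosine similarity compares direction only, so two water samples with proportional concentration profiles score 1 even if their absolute levels differ, while the Euclidean and Manhattan distances are sensitive to magnitude as well.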


2020 ◽  
Vol 152 (23) ◽  
pp. 234103
Author(s):  
Bastien Casier ◽  
Stéphane Carniato ◽  
Tsveta Miteva ◽  
Nathalie Capron ◽  
Nicolas Sisourat

2013 ◽  
Vol 303-306 ◽  
pp. 1101-1104 ◽  
Author(s):  
Yong De Hu ◽  
Jing Chang Pan ◽  
Xin Tan

Kernel entropy component analysis (KECA) reveals the structure of the original data via the kernel matrix. This structure is related to the Renyi entropy of the data. KECA preserves the structure of the original data by keeping its Renyi entropy unchanged. This paper describes the original data by several components for the purpose of dimension reduction. KECA is then applied to celestial spectra reduction and compared experimentally with Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA). Experimental results show that KECA is a good method for high-dimensional data reduction.
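The entropy-preserving selection that distinguishes KECA from kernel PCA can be sketched from a precomputed kernel matrix: instead of keeping the eigenpairs with the largest eigenvalues, keep those contributing most to the Renyi entropy estimate, i.e. the largest values of λᵢ(1ᵀeᵢ)². This follows the standard KECA formulation rather than this paper's specific code, and the function name is mine.

```python
import numpy as np

def keca(K, n_components):
    """Sketch of kernel entropy component analysis.

    K: (n, n) precomputed symmetric PSD kernel matrix.
    Selects eigen-directions by their contribution to the Renyi
    entropy estimate, lambda_i * (sum of eigenvector entries)^2,
    then returns kernel-PCA-style projections sqrt(lambda_i) * e_i.
    """
    eigvals, eigvecs = np.linalg.eigh(K)
    # Entropy contribution of each eigenpair; note this can rank
    # differently from eigenvalue magnitude alone, which is the
    # point of KECA versus KPCA.
    contrib = eigvals * (eigvecs.sum(axis=0) ** 2)
    idx = np.argsort(contrib)[::-1][:n_components]
    # Clamp tiny negative eigenvalues from numerical error before sqrt.
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))
```

For celestial spectra, K would typically be an RBF kernel over the spectra, and the returned columns are the low-dimensional coordinates fed to the downstream comparison with PCA and KPCA.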

