Array Variate Random Variables with Multiway Kronecker Delta Covariance Matrix Structure

2011 ◽  
Vol 2 (1) ◽  
Author(s):  
Deniz Akdemir ◽  
Arjun K. Gupta

Standard statistical methods applied to matrix random variables often fail to describe the underlying structure in multiway data sets. After a review of the essential background material, this paper introduces the notion of an array variate random variable. A normal array variate random variable is defined, and a method for estimating the parameters of the array variate normal distribution is given. We introduce a technique called slicing for estimating the covariance matrix of high-dimensional data. Finally, principal component analysis and classification techniques are developed for array variate observations and high-dimensional data.

Author(s):  
Andrew J. Connolly ◽  
Jacob T. VanderPlas ◽  
Alexander Gray ◽  
...  

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the question of the complexity as well as size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques that are related to concepts discussed when describing Gaussian distributions, density estimation, and the concepts of information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in this chapter, and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques which address some of the weaknesses of PCA.
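The central technique introduced in this chapter, principal component analysis, can be sketched in a few lines of NumPy: center the data, take the singular value decomposition, and keep the leading directions. This is a minimal illustrative implementation, not the book's own code; the toy data and variable names are assumptions.

```python
import numpy as np

def pca(X, n_components):
    """Project data onto the directions of maximal variance.

    X: (n_samples, n_features) data matrix.
    Returns (scores, components, explained_variance of kept components).
    """
    # Center each feature; PCA directions are eigenvectors of the covariance.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T
    explained_variance = (s ** 2) / (X.shape[0] - 1)
    return scores, components, explained_variance[:n_components]

# Toy data: the second feature is twice the first plus small noise,
# so a single component captures nearly all of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + 0.01 * rng.normal(size=200)])
scores, comps, var = pca(X, 1)
```

On this toy set the recovered direction is close to (1, 2)/√5, the axis along which the two correlated features vary together.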


2006 ◽  
Vol 06 (01) ◽  
pp. L17-L28 ◽  
Author(s):  
JOSÉ MANUEL LÓPEZ-ALONSO ◽  
JAVIER ALDA

Principal Component Analysis (PCA) has been applied to the characterization of 1/f noise. The application of PCA to 1/f noise requires the definition of a stochastic multidimensional variable. The components of this variable describe the temporal evolution of the phenomenon sampled at regular time intervals. In this paper we analyze the conditions on the number of observations and the dimension of the multidimensional random variable that are necessary to apply the PCA method in a sound manner. We have tested the obtained conditions on simulated and experimental data sets obtained from imaging optical systems. The results can be extended to other fields where this kind of noise is relevant.


Energies ◽  
2020 ◽  
Vol 13 (14) ◽  
pp. 3520 ◽  
Author(s):  
Hang Li ◽  
Zhe Zhang ◽  
Xianggen Yin

Because the penetration level of renewable energy sources has increased rapidly in recent years, uncertainty in power system operation is gradually increasing. As an efficient tool for power system analysis under uncertainty, probabilistic power flow (PPF) is becoming increasingly important. The point-estimate method (PEM) is a well-known PPF algorithm. However, two significant defects limit the practical use of this method. One is that the PEM struggles to estimate high-order moments accurately; this defect makes it difficult for the PEM to describe the distribution of non-Gaussian output random variables (ORVs). The other is that the calculation burden is strongly related to the scale of the input random variables (IRVs), which makes the PEM difficult to use in large-scale power systems. A novel approach based on principal component analysis (PCA) and high-dimensional model representation (HDMR) is proposed here to overcome the defects of the traditional PEM. PCA is applied to decrease the dimension scale of the IRVs and eliminate correlations. HDMR is applied to estimate the moments of the ORVs. Because HDMR considers the cooperative effects of the IRVs, it achieves a significantly smaller estimation error, particularly for high-order moments. Case studies show that the proposed method achieves better accuracy and efficiency than the traditional PEM.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yuxian Huang ◽  
Geng Yang ◽  
Yahong Xu ◽  
Hao Zhou

In the big data era, massive, high-dimensional data is produced constantly, increasing the difficulty of analyzing and protecting it. In this paper, in order to realize dimensionality reduction and privacy protection of data, principal component analysis (PCA) and differential privacy (DP) are combined to handle these data. Moreover, a support vector machine (SVM) is used to measure the utility of the processed data. Specifically, we introduce differential privacy mechanisms at different stages of the PCA-SVM algorithm, obtaining the algorithms DPPCA-SVM and PCADP-SVM. Both algorithms satisfy (ε, 0)-DP while achieving fast classification. In addition, we evaluate the performance of the two algorithms in terms of noise expectation and classification accuracy, from the perspectives of both theoretical proof and experimental verification. To verify the performance of DPPCA-SVM, we also compare it with other algorithms. Results show that DPPCA-SVM provides excellent utility for different data sets while guaranteeing stricter privacy.
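The idea of injecting a DP mechanism at the PCA stage can be sketched as follows: perturb the empirical covariance with Laplace noise before eigendecomposition, then keep the top eigenvectors. This is only an illustrative sketch in the spirit of a DPPCA-style algorithm, not the authors' exact mechanism; the noise calibration here is per-entry and hand-wavy, and the function name and row-norm assumption are mine.

```python
import numpy as np

def dp_pca_components(X, n_components, epsilon, rng=None):
    """Illustrative sketch: PCA with Laplace noise added to the covariance.

    Assumes each row of X has L2 norm at most 1, so changing one row
    moves each covariance entry by at most 2/n. The per-entry noise
    scale below is NOT a rigorously calibrated (epsilon, 0)-DP budget;
    it only shows where the perturbation enters the pipeline.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, d = X.shape
    cov = (X.T @ X) / n
    # Laplace noise scaled to (sensitivity / epsilon); symmetrize so the
    # perturbed matrix remains a valid symmetric input to eigh.
    noise = rng.laplace(scale=2.0 / (n * epsilon), size=(d, d))
    noisy_cov = cov + (noise + noise.T) / 2
    eigvals, eigvecs = np.linalg.eigh(noisy_cov)
    # eigh returns eigenpairs in ascending order; take the top components.
    return eigvecs[:, ::-1][:, :n_components]
```

Data projected onto these noisy components would then be fed to the SVM classifier, which is how the abstract's utility comparison is framed.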


2019 ◽  
Vol 13 ◽  
pp. 174830261986744
Author(s):  
Ran Zhang ◽  
Bin Ye ◽  
Peng Liu

Nowadays, datasets containing a very large number of variables or features are routinely generated in many fields. Dimension reduction techniques are usually performed prior to statistically analyzing these datasets in order to avoid the effects of the curse of dimensionality. Principal component analysis is one of the most important techniques for dimension reduction and data visualization. However, datasets with missing values, which arise in almost every field, produce biased estimates and are difficult to handle, especially in high-dimension, low-sample-size settings. By exploiting a Lasso estimator of the population covariance matrix, we propose to regularize principal component analysis to reduce the dimensionality of datasets with missing data. The Lasso estimator of the covariance matrix is computationally tractable, obtained by solving a convex optimization problem. To illustrate the effectiveness of our method for dimension reduction, the principal component directions are evaluated by the metrics of Frobenius norm and cosine distance. Its performance is compared with that of other incomplete-data handling methods such as mean substitution and multiple imputation. Simulation results also show that our method is superior to other incomplete-data handling methods in the context of discriminant analysis of real-world high-dimensional datasets.
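Two pieces of this evaluation pipeline are simple enough to sketch: the mean-substitution baseline the abstract compares against, and the sign-invariant cosine distance used to score recovered principal directions. These are illustrative helpers under my own names, not the paper's implementation (the Lasso covariance estimator itself requires a convex solver and is omitted).

```python
import numpy as np

def mean_impute(X):
    """Baseline incomplete-data handling: replace each missing entry
    (encoded as NaN) with the mean of its column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    mask = np.isnan(X)
    # For each missing cell, substitute the mean of that cell's column.
    X[mask] = np.take(col_means, np.where(mask)[1])
    return X

def direction_cosine_distance(v, w):
    """Cosine distance between two principal directions, made
    sign-invariant because eigenvectors are defined only up to sign."""
    c = abs(v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
    return 1.0 - c
```

A distance of 0 means the estimated direction matches the true one exactly (up to sign), which is how closeness of the principal component directions would be quantified alongside the Frobenius-norm metric.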


Author(s):  
Vitaly E. Bulgakov

Motivated by the previously developed multilevel aggregation method for solving structural analysis problems, a novel two-level aggregation approach for efficient iterative solution of Principal Component Analysis (PCA) problems is proposed. The coarse aggregation model of the original covariance matrix is used in the iterative solution of the eigenvalue problem by the power iteration method. The method is tested on several data sets consisting of a large number of text documents.
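The inner solver named in this abstract, power iteration on a covariance matrix, can be sketched directly; the two-level aggregation (the paper's actual contribution) is omitted here, so this is only the baseline eigensolver, with my own function signature.

```python
import numpy as np

def power_iteration(C, n_iter=200, tol=1e-10, rng=None):
    """Leading eigenpair of a symmetric PSD matrix by power iteration.

    Repeatedly multiplies a random start vector by C and renormalizes;
    the iterate converges to the dominant eigenvector at a rate set by
    the ratio of the two largest eigenvalues.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    v = rng.normal(size=C.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = C @ v
        w_norm = np.linalg.norm(w)
        if w_norm == 0:
            break  # v is in the null space; give up
        w /= w_norm
        if np.linalg.norm(w - v) < tol:
            v = w
            break
        v = w
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    return v @ C @ v, v
```

In a PCA setting, C would be the (possibly aggregated) covariance matrix, and the returned vector is the first principal direction.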


Author(s):  
Petr Praus

In this chapter the principles and applications of principal component analysis (PCA) applied to hydrological data are presented. Four case studies demonstrate the ability of PCA to extract information about a wastewater treatment process and drinking water quality in a city network, and to find similarities in data sets of groundwater quality results and water-related images. In the first case study, the composition of raw and cleaned wastewater was characterised and its temporal changes were displayed. In the second case study, drinking water samples were divided into clusters consistent with their sampling localities. In case study III, similar samples of groundwater were recognised by calculating cosine similarity and the Euclidean and Manhattan distances. In case study IV, 32 water-related images were transformed into a large image matrix whose dimensionality was reduced by PCA. The images were then clustered using the PCA scatter plots.
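The three similarity measures used in case study III are standard and compact to state; a minimal sketch (my own helper names, operating on NumPy vectors such as PCA score rows):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1 = same direction, 0 = orthogonal."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean(a, b):
    """Straight-line (L2) distance between a and b."""
    return float(np.linalg.norm(a - b))

def manhattan(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return float(np.abs(a - b).sum())
```

Cosine similarity compares direction only, so two water samples with proportional concentration profiles score 1 even if their absolute levels differ, while the Euclidean and Manhattan distances are sensitive to magnitude as well.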


2020 ◽  
Vol 152 (23) ◽  
pp. 234103
Author(s):  
Bastien Casier ◽  
Stéphane Carniato ◽  
Tsveta Miteva ◽  
Nathalie Capron ◽  
Nicolas Sisourat

2013 ◽  
Vol 303-306 ◽  
pp. 1101-1104 ◽  
Author(s):  
Yong De Hu ◽  
Jing Chang Pan ◽  
Xin Tan

Kernel entropy component analysis (KECA) reveals the structure of the original data via the kernel matrix. This structure is related to the Renyi entropy of the data. KECA preserves the structure of the original data by keeping its Renyi entropy unchanged. This paper describes the original data by several components for the purpose of dimension reduction. KECA is then applied to celestial spectra reduction and compared experimentally with Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA). Experimental results show that KECA is a good method for high-dimensional data reduction.
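The entropy-preserving selection that distinguishes KECA from kernel PCA can be sketched from a precomputed kernel matrix: instead of keeping the eigenpairs with the largest eigenvalues, keep those contributing most to the Renyi entropy estimate, i.e. the largest values of λᵢ(1ᵀeᵢ)². This follows the standard KECA formulation rather than this paper's specific code, and the function name is mine.

```python
import numpy as np

def keca(K, n_components):
    """Sketch of kernel entropy component analysis.

    K: (n, n) precomputed symmetric PSD kernel matrix.
    Selects eigen-directions by their contribution to the Renyi
    entropy estimate, lambda_i * (sum of eigenvector entries)^2,
    then returns kernel-PCA-style projections sqrt(lambda_i) * e_i.
    """
    eigvals, eigvecs = np.linalg.eigh(K)
    # Entropy contribution of each eigenpair; note this can rank
    # differently from eigenvalue magnitude alone, which is the
    # point of KECA versus KPCA.
    contrib = eigvals * (eigvecs.sum(axis=0) ** 2)
    idx = np.argsort(contrib)[::-1][:n_components]
    # Clamp tiny negative eigenvalues from numerical error before sqrt.
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))
```

For celestial spectra, K would typically be an RBF kernel over the spectra, and the returned columns are the low-dimensional coordinates fed to the downstream comparison with PCA and KPCA.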

