Gaussian Covariance Faithful Markov Trees

2011 ◽  
Vol 2011 ◽  
pp. 1-10 ◽  
Author(s):  
Dhafer Malouche ◽  
Bala Rajaratnam

Graphical models are useful for characterizing conditional and marginal independence structures in high-dimensional distributions. An important class of graphical models is covariance graph models, where the nodes of a graph represent different components of a random vector, and the absence of an edge between any pair of variables implies marginal independence. Covariance graph models also represent more complex conditional independence relationships between subsets of variables. When the covariance graph captures or reflects all the conditional independence statements present in the probability distribution, the latter is said to be faithful to its covariance graph, though in general this is not guaranteed. Faithfulness, however, is crucial, for instance, in model selection procedures that proceed by testing conditional independences. Hence, an analysis of the faithfulness assumption is important for understanding the ability of the graph, a discrete object, to fully capture the salient features of the probability distribution it aims to describe. In this paper, we demonstrate that multivariate Gaussian distributions that have trees as covariance graphs are necessarily faithful.
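As a minimal illustration of these definitions (a hypothetical 3-node example, not taken from the paper), consider a Gaussian whose covariance graph is the path X1 – X2 – X3: the missing edge (1,3) corresponds to a zero covariance entry, and hence to marginal independence, while the precision matrix shows that X1 and X3 remain dependent given X2:

```python
import numpy as np

# Hypothetical 3-node example: covariance graph is the path X1 -- X2 -- X3.
# A missing edge (1,3) in a covariance graph <=> Cov(X1, X3) = 0,
# which for a Gaussian means X1 and X3 are marginally independent.
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])

# Check that Sigma is a valid (positive definite) covariance matrix.
assert np.linalg.eigvalsh(Sigma).min() > 0

# Marginal independence is read off the covariance matrix directly.
print(Sigma[0, 2])          # 0.0  -> X1 _||_ X3 (marginally)

# Conditional (in)dependence given the rest is read off the precision matrix.
K = np.linalg.inv(Sigma)
print(round(K[0, 2], 3))    # non-zero -> X1 and X3 are dependent given X2
```

Faithfulness concerns whether *all* conditional independence statements of the distribution are captured by the graph; the paper's result guarantees this whenever the covariance graph is a tree, as in the path above.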

2016 ◽  
Vol 53 (3) ◽  
pp. 733-746 ◽  
Author(s):  
Adrien Hitz ◽  
Robin Evans

Abstract: The problem of inferring the distribution of a random vector given that its norm is large requires modeling a homogeneous limiting density. We suggest an approach based on graphical models which is suitable for high-dimensional vectors. We introduce the notion of one-component regular variation to describe a function that is regularly varying in its first component. We extend the representation and Karamata's theorems to one-component regularly varying functions, probability distributions, and densities, and explain why these results are fundamental in multivariate extreme-value theory. We then generalize the Hammersley–Clifford theorem to relate asymptotic conditional independence to a factorization of the limiting density, and use it to model multivariate tails.
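For reference, the classical univariate notions that the one-component versions generalize are the following standard definitions (not specific to this paper):

```latex
A function $f\colon(0,\infty)\to(0,\infty)$ is \emph{regularly varying} (at infinity)
with index $\rho$ if
\[
  \lim_{t\to\infty} \frac{f(tx)}{f(t)} = x^{\rho} \qquad \text{for all } x > 0,
\]
and \emph{slowly varying} when $\rho = 0$. Karamata's theorem then states that,
for $\rho > -1$,
\[
  \int_{x_0}^{x} f(t)\,dt \;\sim\; \frac{x\, f(x)}{\rho + 1} \qquad (x \to \infty).
\]
```

One-component regular variation restricts this scaling behavior to the first component of a multivariate function, which is what allows the representation and Karamata results to be carried over.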


2017 ◽  
Vol 59 (3) ◽  
pp. 289-310 ◽  
Author(s):  
Yong He ◽  
Xinsheng Zhang ◽  
Jiadong Ji ◽  
Bin Liu

2020 ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites, or proteins) interconnected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (the 'high-dimensional problem'). Shrinkage methods address this issue by learning a regularized GGM. However, it is an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlations in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as the 'un-shrunk' partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. We apply the 'un-shrunk' method to two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the 'high-dimensional problem'. Despite its advantages, shrinkage introduces a non-linear bias in the partial correlations. Ignoring these shrinkage-induced effects can obscure the interpretation of the network and impede the validation of previously reported results.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites, or proteins) interconnected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (the 'high-dimensional problem'). Shrinkage methods address this issue by learning a regularized GGM. However, it remains an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlations in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as 'un-shrinking' the partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. This is demonstrated on two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the 'high-dimensional problem'. Despite its advantages, shrinkage introduces a non-linear bias in the partial correlations. Ignoring these shrinkage-induced effects can obscure the interpretation of the network and impede the validation of previously reported results.
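The non-linear dependence of shrunk partial correlations on the shrinkage value can be seen in a small simulation. This is an illustrative sketch only (synthetic p > n data and an identity shrinkage target are assumptions; it is not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# 'High-dimensional' toy data: more variables (p) than samples (n),
# so the sample covariance S is singular and shrinkage is required.
n, p = 10, 20
X = rng.standard_normal((n, p))
S = np.cov(X, rowvar=False)

def partial_corr(lmbda, target=np.eye(p)):
    """Partial correlations from the shrunk covariance (1-l)*S + l*T."""
    P = np.linalg.inv((1 - lmbda) * S + lmbda * target)
    d = np.sqrt(np.diag(P))
    # Off-diagonal entries are the partial correlations; diagonal is -1.
    return -P / np.outer(d, d)

# Partial correlation between variables 0 and 1 as lambda varies:
r = [partial_corr(l)[0, 1] for l in (0.25, 0.50, 0.75)]
# If the shrinkage bias were linear in lambda, r[1] would be the midpoint
# of r[0] and r[2]; it is not, so the bias is non-linear.
print(r[1], 0.5 * (r[0] + r[2]))
```

Because the bias varies non-linearly with the shrinkage value, two networks estimated with different shrinkage values are not directly comparable, which is the motivation for the un-shrinking correction.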


Author(s):  
Andrew J. Connolly ◽  
Jacob T. VanderPlas ◽  
Alexander Gray ◽  
...  

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the complexity as well as the size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques related to concepts discussed in the treatment of Gaussian distributions, density estimation, and information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in the chapter and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques which address some of the weaknesses of PCA.
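A minimal sketch of PCA via the singular value decomposition of centered data (the synthetic data set and noise level below are illustrative assumptions, not the book's examples):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "high-dimensional" data: 100 samples living mostly in a 2-D subspace
# of a 10-D space, plus small isotropic noise.
Z = rng.standard_normal((100, 2))
A = rng.standard_normal((2, 10))
X = Z @ A + 0.05 * rng.standard_normal((100, 10))

# PCA: center the data, then take the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of variance explained by each principal component.
explained = s**2 / np.sum(s**2)
print(explained[:3])  # the first two components dominate

# Project onto the first two principal components (dimensionality reduction).
X2 = Xc @ Vt[:2].T
print(X2.shape)  # (100, 2)
```

The explained-variance fractions show how few components carry most of the information, which is exactly the question the chapter sets out to answer for real astronomical data.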


2020 ◽  
pp. 584-618
Author(s):  
Dariusz Jacek Jakóbczak

The Probabilistic Features Combination (PFC) method enables interpolation and modeling of high-dimensional data using combinations of features and different coefficient functions γ: polynomial, sinusoidal, cosinusoidal, tangent, cotangent, logarithmic, exponential, arcsine, arccosine, arctangent, arccotangent, or power functions. The function used to compute γ is chosen individually for each data-modeling task and is treated as an N-dimensional probability distribution function: γ depends on the initial requirements and the features' specifications. The PFC method supports data interpolation for tasks such as handwriting or signature identification and image retrieval via a discrete set of feature vectors in an N-dimensional feature space. PFC thus combines two important problems, interpolation and modeling, in the context of image retrieval or writer identification. A main feature of the PFC method is that it generalizes linear interpolation in multidimensional feature spaces to other functions treated as N-dimensional probability distribution functions.
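As a loose illustration of the idea of replacing plain linear interpolation with a chosen coefficient function γ, here is a minimal sketch; the sinusoidal γ below is an assumed example for demonstration, not the paper's exact formula:

```python
import math

def interpolate(p0, p1, alpha, gamma=lambda a: math.sin(a * math.pi / 2)):
    """Blend two feature vectors using a chosen coefficient function gamma.

    gamma maps alpha in [0, 1] to a blending weight in [0, 1]; with
    gamma(a) = a this reduces to ordinary linear interpolation.
    """
    g = gamma(alpha)
    return tuple((1 - g) * x0 + g * x1 for x0, x1 in zip(p0, p1))

# Interpolate halfway between two 2-D feature points with a sinusoidal gamma.
print(interpolate((0.0, 0.0), (1.0, 2.0), 0.5))
```

Swapping in a different γ (power, logarithmic, tangent, and so on) changes how the interpolated curve bends between the stored feature vectors, which is the degree of freedom the PFC method exploits.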

