Gaussian Covariance Faithful Markov Trees

2011 ◽  
Vol 2011 ◽  
pp. 1-10 ◽  
Author(s):  
Dhafer Malouche ◽  
Bala Rajaratnam

Graphical models are useful for characterizing conditional and marginal independence structures in high-dimensional distributions. An important class of graphical models is covariance graph models, where the nodes of a graph represent different components of a random vector, and the absence of an edge between any pair of variables implies marginal independence. Covariance graph models also represent more complex conditional independence relationships between subsets of variables. When the covariance graph captures or reflects all the conditional independence statements present in the probability distribution, the latter is said to be faithful to its covariance graph, though in general this is not guaranteed. Faithfulness, however, is crucial, for instance, in model selection procedures that proceed by testing conditional independences. Hence, an analysis of the faithfulness assumption is important for understanding the ability of the graph, a discrete object, to fully capture the salient features of the probability distribution it aims to describe. In this paper, we demonstrate that multivariate Gaussian distributions that have trees as covariance graphs are necessarily faithful.
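As a minimal illustration of these definitions (a hypothetical 3-node example, not taken from the paper), consider a Gaussian whose covariance graph is the path X1 – X2 – X3: the missing edge (1,3) corresponds to a zero covariance entry, and hence to marginal independence, while the precision matrix shows that X1 and X3 remain dependent given X2:

```python
import numpy as np

# Hypothetical 3-node example: covariance graph is the path X1 -- X2 -- X3.
# A missing edge (1,3) in a covariance graph <=> Cov(X1, X3) = 0,
# which for a Gaussian means X1 and X3 are marginally independent.
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])

# Check that Sigma is a valid (positive definite) covariance matrix.
assert np.linalg.eigvalsh(Sigma).min() > 0

# Marginal independence is read off the covariance matrix directly.
print(Sigma[0, 2])          # 0.0  -> X1 _||_ X3 (marginally)

# Conditional (in)dependence given the rest is read off the precision matrix.
K = np.linalg.inv(Sigma)
print(round(K[0, 2], 3))    # non-zero -> X1 and X3 are dependent given X2
```

Faithfulness concerns whether *all* conditional independence statements of the distribution are captured by the graph; the paper's result guarantees this whenever the covariance graph is a tree, as in the path above.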

2016 ◽  
Vol 53 (3) ◽  
pp. 733-746 ◽  
Author(s):  
Adrien Hitz ◽  
Robin Evans

Abstract: The problem of inferring the distribution of a random vector given that its norm is large requires modeling a homogeneous limiting density. We suggest an approach based on graphical models which is suitable for high-dimensional vectors. We introduce the notion of one-component regular variation to describe a function that is regularly varying in its first component. We extend the representation and Karamata's theorems to one-component regularly varying functions, probability distributions, and densities, and explain why these results are fundamental in multivariate extreme-value theory. We then generalize the Hammersley–Clifford theorem to relate asymptotic conditional independence to a factorization of the limiting density, and use it to model multivariate tails.
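For reference, the classical univariate notions that the one-component versions generalize are the following standard definitions (not specific to this paper):

```latex
A function $f\colon(0,\infty)\to(0,\infty)$ is \emph{regularly varying} (at infinity)
with index $\rho$ if
\[
  \lim_{t\to\infty} \frac{f(tx)}{f(t)} = x^{\rho} \qquad \text{for all } x > 0,
\]
and \emph{slowly varying} when $\rho = 0$. Karamata's theorem then states that,
for $\rho > -1$,
\[
  \int_{x_0}^{x} f(t)\,dt \;\sim\; \frac{x\, f(x)}{\rho + 1} \qquad (x \to \infty).
\]
```

One-component regular variation restricts this scaling behavior to the first component of a multivariate function, which is what allows the representation and Karamata results to be carried over.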


2017 ◽  
Vol 59 (3) ◽  
pp. 289-310 ◽  
Author(s):  
Yong He ◽  
Xinsheng Zhang ◽  
Jiadong Ji ◽  
Bin Liu

2020 ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites, or proteins) interconnected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (the 'high-dimensional problem'). Shrinkage methods address this issue by learning a regularized GGM. However, it is an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlations in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as the 'un-shrunk' partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. We apply the 'un-shrunk' method to two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the 'high-dimensional problem'. Despite its advantages, shrinkage introduces a non-linear bias in the partial correlations. Ignoring these shrinkage-induced effects can obscure the interpretation of the network and impede the validation of previously reported results.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites, or proteins) interconnected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (the 'high-dimensional problem'). Shrinkage methods address this issue by learning a regularized GGM. However, it remains an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlations in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as 'un-shrinking' the partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. This is demonstrated on two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the 'high-dimensional problem'. Despite its advantages, shrinkage introduces a non-linear bias in the partial correlations. Ignoring these shrinkage-induced effects can obscure the interpretation of the network and impede the validation of previously reported results.
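The non-linear dependence of shrunk partial correlations on the shrinkage value can be seen in a small simulation. This is an illustrative sketch only (synthetic p > n data and an identity shrinkage target are assumptions; it is not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# 'High-dimensional' toy data: more variables (p) than samples (n),
# so the sample covariance S is singular and shrinkage is required.
n, p = 10, 20
X = rng.standard_normal((n, p))
S = np.cov(X, rowvar=False)

def partial_corr(lmbda, target=np.eye(p)):
    """Partial correlations from the shrunk covariance (1-l)*S + l*T."""
    P = np.linalg.inv((1 - lmbda) * S + lmbda * target)
    d = np.sqrt(np.diag(P))
    # Off-diagonal entries are the partial correlations; diagonal is -1.
    return -P / np.outer(d, d)

# Partial correlation between variables 0 and 1 as lambda varies:
r = [partial_corr(l)[0, 1] for l in (0.25, 0.50, 0.75)]
# If the shrinkage bias were linear in lambda, r[1] would be the midpoint
# of r[0] and r[2]; it is not, so the bias is non-linear.
print(r[1], 0.5 * (r[0] + r[2]))
```

Because the bias varies non-linearly with the shrinkage value, two networks estimated with different shrinkage values are not directly comparable, which is the motivation for the un-shrinking correction.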


Author(s):  
Andrew J. Connolly ◽  
Jacob T. VanderPlas ◽  
Alexander Gray ◽  
...  

With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the complexity as well as the size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques related to concepts discussed in the treatment of Gaussian distributions, density estimation, and information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in the chapter and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques which address some of the weaknesses of PCA.
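A minimal sketch of PCA via the singular value decomposition of centered data (the synthetic data set and noise level below are illustrative assumptions, not the book's examples):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "high-dimensional" data: 100 samples living mostly in a 2-D subspace
# of a 10-D space, plus small isotropic noise.
Z = rng.standard_normal((100, 2))
A = rng.standard_normal((2, 10))
X = Z @ A + 0.05 * rng.standard_normal((100, 10))

# PCA: center the data, then take the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of variance explained by each principal component.
explained = s**2 / np.sum(s**2)
print(explained[:3])  # the first two components dominate

# Project onto the first two principal components (dimensionality reduction).
X2 = Xc @ Vt[:2].T
print(X2.shape)  # (100, 2)
```

The explained-variance fractions show how few components carry most of the information, which is exactly the question the chapter sets out to answer for real astronomical data.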


2020 ◽  
pp. 584-618
Author(s):  
Dariusz Jacek Jakóbczak

The Probabilistic Features Combination (PFC) method enables interpolation and modeling of high-dimensional data using combinations of features and different coefficient functions γ: polynomial, sinusoidal, cosinusoidal, tangent, cotangent, logarithmic, exponential, arcsine, arccosine, arctangent, arccotangent, or power functions. The function used to compute γ is chosen individually for each data-modeling task and is treated as an N-dimensional probability distribution function: γ depends on the initial requirements and the features' specifications. The PFC method supports data interpolation for tasks such as handwriting or signature identification and image retrieval via a discrete set of feature vectors in an N-dimensional feature space. PFC thus combines two important problems, interpolation and modeling, in the context of image retrieval or writer identification. A main feature of the PFC method is that it generalizes linear interpolation in multidimensional feature spaces to other functions treated as N-dimensional probability distribution functions.
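As a loose illustration of the idea of replacing plain linear interpolation with a chosen coefficient function γ, here is a minimal sketch; the sinusoidal γ below is an assumed example for demonstration, not the paper's exact formula:

```python
import math

def interpolate(p0, p1, alpha, gamma=lambda a: math.sin(a * math.pi / 2)):
    """Blend two feature vectors using a chosen coefficient function gamma.

    gamma maps alpha in [0, 1] to a blending weight in [0, 1]; with
    gamma(a) = a this reduces to ordinary linear interpolation.
    """
    g = gamma(alpha)
    return tuple((1 - g) * x0 + g * x1 for x0, x1 in zip(p0, p1))

# Interpolate halfway between two 2-D feature points with a sinusoidal gamma.
print(interpolate((0.0, 0.0), (1.0, 2.0), 0.5))
```

Swapping in a different γ (power, logarithmic, tangent, and so on) changes how the interpolated curve bends between the stored feature vectors, which is the degree of freedom the PFC method exploits.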

