Stochastic Complexity for Mixture of Exponential Families in Variational Bayes

Author(s):  
Kazuho Watanabe ◽  
Sumio Watanabe
Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1568
Author(s):  
Shaul K. Bar-Lev

Let $\mathcal{F}=\{F_\theta:\theta\in\Theta\subset\mathbb{R}\}$ be a family of probability distributions indexed by a parameter $\theta$, and let $X_1,\dots,X_n$ be i.i.d. random variables with $\mathcal{L}(X_1)=F_\theta\in\mathcal{F}$. Then $\mathcal{F}$ is said to be reproducible if, for all $\theta\in\Theta$ and $n\in\mathbb{N}$, there exist a sequence $(\alpha_n)_{n\ge 1}$ and a mapping $g_n:\Theta\to\Theta$, $\theta\mapsto g_n(\theta)$, such that $\mathcal{L}\left(\alpha_n\sum_{i=1}^{n}X_i\right)=F_{g_n(\theta)}\in\mathcal{F}$. In this paper, we prove that a natural exponential family $\mathcal{F}$ is reproducible iff it possesses a variance function which is a power function of its mean. This result generalizes that of Bar-Lev and Enis (1986, The Annals of Statistics), who proved a similar but partial statement under the assumption that $\mathcal{F}$ is steep and under rather restrictive constraints on the forms of $\alpha_n$ and $g_n(\theta)$. We show that such restrictions are not required. In addition, we examine various aspects of reproducibility, both theoretically and practically, and discuss the relationship between reproducibility, convolution and infinite divisibility. We suggest new avenues for characterizing other classes of families of distributions with respect to their reproducibility and convolution properties.
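As a rough illustration of why a power variance function fits the reproducibility definition above, the following sketch tracks only the coefficient of a variance function $V(m)=a\,m^{p}$. It relies on two standard facts about natural exponential families: the $n$-fold convolution has variance function $nV(m/n)$, and scaling by $c>0$ gives $c^{2}V(m/c)$. The function names and the closed form $\alpha_n=n^{(p-1)/(2-p)}$ are derived here for illustration and are not taken from the paper; the case $p=2$, where this closed form is undefined, is not treated.

```python
import math


def reproducing_scale(n: int, p: float) -> float:
    """Hypothetical helper: the scale alpha_n that restores V(m) = a * m**p
    after summing n i.i.d. draws (derived for this sketch, p != 2)."""
    if p == 2:
        raise ValueError("closed form n**((p-1)/(2-p)) is undefined at p = 2")
    return n ** ((p - 1) / (2 - p))


def coefficient_after_sum_and_scale(a: float, p: float, n: int, alpha: float) -> float:
    """Coefficient of m**p in the variance function of alpha * (X_1 + ... + X_n).

    Summing maps V(m) to n * V(m / n), which multiplies a by n**(1 - p).
    Scaling by alpha maps V(m) to alpha**2 * V(m / alpha), which multiplies
    the coefficient by alpha**(2 - p).
    """
    return a * n ** (1 - p) * alpha ** (2 - p)


# p = 0 (normal), p = 1 (Poisson), p = 3 (inverse Gaussian), plus one fractional power.
for p in (0.0, 1.0, 1.5, 3.0):
    for n in (2, 5, 10):
        alpha = reproducing_scale(n, p)
        assert math.isclose(coefficient_after_sum_and_scale(1.0, p, n, alpha), 1.0)
```

The familiar special cases fall out of the same formula: $\alpha_n=n^{-1/2}$ for the normal family with unit variance, $\alpha_n=1$ for the Poisson family, and $\alpha_n=n^{-2}$ for the inverse Gaussian family with fixed shape parameter. The check concerns the variance function only and is not a proof of the characterization stated in the abstract.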


2021 ◽  
pp. 001316442199253
Author(s):  
Robert C. Foster

This article presents some equivalent forms of the common Kuder–Richardson Formula 21 and 20 estimators for nondichotomous data belonging to certain other exponential families, such as Poisson count data, exponential data, or geometric counts of trials until failure. Using the generalized framework of Foster (2020), an equation for the reliability for a subset of the natural exponential families with quadratic variance function is derived for known population parameters, and both formulas are shown to be different plug-in estimators of this quantity. The equivalent Kuder–Richardson Formulas 20 and 21 are given for six different natural exponential families, and these match earlier derivations in the case of binomial and Poisson data. Simulations show performance exceeding that of Cronbach’s alpha in terms of root mean square error when the formula matching the correct exponential family is used, and a discussion of Jensen’s inequality suggests explanations for peculiarities of the bias and standard error of the simulations across the different exponential families.
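For orientation, the classical Kuder–Richardson formulas for dichotomous (0/1) items can be sketched as below. This is the textbook binary case only, not the generalized estimators of the article; the function names, the toy score matrix, and the choice of the population variance (dividing by the number of respondents) are assumptions made for this illustration.

```python
from statistics import mean, pvariance


def kr20(scores):
    """Classical KR-20 for a respondents-by-items matrix of 0/1 scores."""
    k = len(scores[0])                      # number of items
    totals = [sum(row) for row in scores]   # total score per respondent
    var_total = pvariance(totals)           # population variance (assumed convention)
    item_p = [mean([row[i] for row in scores]) for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in item_p)
    return k / (k - 1) * (1 - sum_pq / var_total)


def kr21(scores):
    """Classical KR-21: uses only the mean and variance of the total scores."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    m = mean(totals)
    var_total = pvariance(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * var_total))


# Toy data (hypothetical): 5 respondents, 4 items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(toy), kr21(toy))
```

With the same variance convention in both functions, KR-21 never exceeds KR-20, with equality when all items have the same difficulty; this follows from Jensen's inequality applied to the item proportions.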


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 384
Author(s):  
Rocío Hernández-Sanjaime ◽  
Martín González ◽  
Antonio Peñalver ◽  
Jose J. López-Espín

The presence of unaccounted heterogeneity in simultaneous equation models (SEMs) is frequently problematic in many real-life applications. Under the usual assumption of homogeneity, the model can be seriously misspecified, and it can potentially induce an important bias in the parameter estimates. This paper focuses on SEMs in which data are heterogeneous and tend to form clustering structures in the endogenous-variable dataset. Because the identification of different clusters is not straightforward, a two-step strategy that first forms groups among the endogenous observations and then uses the standard simultaneous equation scheme is provided. Methodologically, the proposed approach is based on a variational Bayes learning algorithm and does not need to be executed for varying numbers of groups in order to identify the one that adequately fits the data. We describe the statistical theory, evaluate the performance of the suggested algorithm by using simulated data, and apply the two-step method to a macroeconomic problem.


Biometrika ◽  
1996 ◽  
Vol 83 (1) ◽  
pp. 248-248
Author(s):  
PETER MCCULLAGH

2021 ◽  
pp. 002224372110329
Author(s):  
Nicolas Padilla ◽  
Eva Ascarza

The success of Customer Relationship Management (CRM) programs ultimately depends on the firm's ability to identify and leverage differences across customers, a very difficult task when firms attempt to manage new customers, for whom only the first purchase has been observed. For those customers, the lack of repeated observations poses a structural challenge to inferring unobserved differences across them. This is what we call the “cold start” problem of CRM, whereby companies have difficulties leveraging existing data when they attempt to make inferences about customers at the beginning of their relationship. We propose a solution to the cold start problem by developing a probabilistic machine learning modeling framework that leverages the information collected at the moment of acquisition. The main aspect of the model is that it flexibly captures latent dimensions that govern the behaviors observed at acquisition as well as future propensities to buy and to respond to marketing actions using deep exponential families. The model can be integrated with a variety of demand specifications and is flexible enough to capture a wide range of heterogeneity structures. We validate our approach in a retail context and empirically demonstrate the model's ability to identify high-value customers as well as those most sensitive to marketing actions, right after their first purchase.


1996 ◽  
Vol 48 (3) ◽  
pp. 573-576 ◽  
Author(s):  
T. T. Nguyen ◽  
A. K. Gupta ◽  
Y. Wang
