Triply Stochastic Variational Inference for Non-linear Beta Process Factor Analysis

Author(s):  
Kai Fan ◽  
Yizhe Zhang ◽  
Ricardo Henao ◽  
Katherine Heller
2020 ◽  
Author(s):  
Pere Ferrando ◽  
Urbano Lorenzo-Seva

Unit-weight sum scores (UWSSs) are routinely used as estimates of factor scores on the basis of solutions obtained with the non-linear exploratory factor analysis (EFA) model for ordered-categorical responses. Theoretically, this practice results in a loss of information and accuracy, and is expected to lead to biased estimates. However, the practical relevance of these limitations is far from clear. In this article we adopt an empirical view, and propose indices and procedures (some of them new) for assessing the appropriateness of UWSSs in non-linear EFA applications. A new automated approach for obtaining UWSSs that maximize fidelity and correlational accuracy is proposed. The appropriateness of UWSSs under different conditions and the behavior of the present proposal in comparison with other more common approaches are assessed with a simulation study. A tutorial for interested practitioners is presented using an illustrative example based on a well-known personality questionnaire. All the procedures proposed in the article have been implemented in a well-known noncommercial EFA program.
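As a minimal illustration of the quantities involved (the loadings, sample size, and one-factor structure below are made-up, not from the article), this sketch simulates a linear one-factor model, forms a unit-weight sum score, and checks its correlational accuracy against the true factor scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a one-factor model: theta is the latent factor score and each
# item loads on it with an assumed strength (illustrative values only).
n, p = 500, 6
theta = rng.standard_normal(n)
lambdas = np.array([0.8, 0.7, 0.6, 0.5, 0.4, 0.3])
items = np.outer(theta, lambdas) + rng.standard_normal((n, p)) * np.sqrt(1 - lambdas**2)

# Unit-weight sum score: add the items with equal (unit) weights,
# ignoring the fact that the loadings differ.
uwss = items.sum(axis=1)

# Correlational accuracy: how well the sum score recovers the factor.
r = np.corrcoef(uwss, theta)[0, 1]
print(round(r, 3))
```

With these loadings the sum score tracks the factor well but not perfectly, which is the gap the article's indices are designed to quantify for the harder non-linear, ordered-categorical case.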


Author(s):  
Yuto Yamaguchi ◽  
Kohei Hayashi

How can we decompose a data tensor if the indices are partially missing? Tensor decomposition is a fundamental tool for analyzing tensor data. Suppose, for example, we have a 3rd-order tensor X where each element Xijk takes 1 if user i posts word j at location k on Twitter. Standard tensor decomposition expects all indices to be observed, but in some tweets the location k can be missing. In this paper, we study a tensor decomposition problem where the indices (i, j, or k) of some observed elements are partially missing. To address this problem, we propose a probabilistic tensor decomposition model that handles missing indices as latent variables. To infer them, we derive an algorithm based on stochastic variational inference, which makes it possible to leverage the information from the incomplete data scalably. Experiments on both synthetic and real datasets show that the proposed method achieves higher accuracy in the tensor completion task than baselines that cannot handle missing indices.
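The core idea of treating a missing index as a latent variable can be sketched in a few lines. This toy example assumes the CP factor matrices are already known (in the paper they are inferred jointly via stochastic variational inference, which this sketch does not implement): an observation whose third index is missing is scored against every candidate index, giving a posterior over the missing k.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy rank-R CP model for a 3rd-order tensor: X_ijk ~ sum_r A[i,r]*B[j,r]*C[k,r].
# A, B, C are assumed known here for illustration.
I, J, K, R = 4, 5, 3, 2
A, B, C = rng.random((I, R)), rng.random((J, R)), rng.random((K, R))

def cp_value(i, j, k):
    return float(np.sum(A[i] * B[j] * C[k]))

# For an observation (i, j, ?) with value x but missing index k, treat k as
# a latent variable: score each candidate k by how well the model
# reconstructs x under a Gaussian likelihood, then normalize.
def posterior_over_k(i, j, x, sigma=0.1):
    log_lik = np.array([-0.5 * ((x - cp_value(i, j, k)) / sigma) ** 2
                        for k in range(K)])
    w = np.exp(log_lik - log_lik.max())  # stabilized softmax
    return w / w.sum()

# If x was generated at k=1, the posterior should peak there.
i, j, true_k = 2, 3, 1
x = cp_value(i, j, true_k)
post = posterior_over_k(i, j, x)
print(int(np.argmax(post)) == true_k)
```

The stochastic variational inference algorithm in the paper amortizes this kind of marginalization over many partially indexed observations while updating the factors at the same time.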


2019 ◽  
Author(s):  
Adam Gayoso ◽  
Romain Lopez ◽  
Zoë Steier ◽  
Jeffrey Regier ◽  
Aaron Streets ◽  
...  

Cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) combines unbiased single-cell transcriptome measurements with surface protein quantification comparable to flow cytometry, the gold standard for cell type identification. However, current analysis pipelines cannot address the two primary challenges of CITE-seq data: combining both modalities in a shared latent space that harnesses the power of the paired measurements, and handling the technical artifacts of the protein measurement, which is obscured by non-negligible background noise. Here we present Total Variational Inference (totalVI), a fully probabilistic end-to-end framework for normalizing and analyzing CITE-seq data, based on a hierarchical Bayesian model. In totalVI, the mRNA and protein measurements for each cell are generated from a low-dimensional latent random variable unique to that cell, representing its cellular state. totalVI uses deep neural networks to specify conditional distributions. By leveraging advances in stochastic variational inference, it scales easily to millions of cells. Explicit modeling of nuisance factors enables totalVI to produce denoised data in both domains, as well as a batch-corrected latent representation of cells for downstream analysis tasks.
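The generative idea described above (one low-dimensional latent state per cell generating both modalities, with additive background noise on the protein side) can be sketched as follows. This is not the actual totalVI model: it uses linear decoders where totalVI uses deep neural networks, and all dimensions and noise parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each cell has a low-dimensional latent state z that generates both
# mRNA and protein counts (a toy stand-in for totalVI's generative model).
n_cells, latent_dim, n_genes, n_proteins = 100, 5, 50, 10

z = rng.standard_normal((n_cells, latent_dim))   # cellular state
W_rna = rng.random((latent_dim, n_genes))        # assumed linear decoder
W_prot = rng.random((latent_dim, n_proteins))    # (totalVI uses deep nets)

rna_rate = np.exp((z @ W_rna) * 0.3)
protein_fg = np.exp((z @ W_prot) * 0.3)                        # foreground signal
protein_bg = rng.gamma(2.0, 1.0, size=(n_cells, n_proteins))   # background noise

rna_counts = rng.poisson(rna_rate)
protein_counts = rng.poisson(protein_fg + protein_bg)  # signal + background
print(rna_counts.shape, protein_counts.shape)
```

Explicitly generating the protein counts as foreground plus background is what lets a model of this shape denoise the protein measurements, as the abstract describes.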


2020 ◽  
Author(s):  
Max Reason ◽  
Yang Claire Yang ◽  
Allison Aiello ◽  
Dan Belsky ◽  
Patrick Curran ◽  
...  

Currently, studies of cognition and cognitive decline in the United States are limited by the use of samples that only provide data for respondents during one stage of the adult life course. By using an Integrative Data Analysis (IDA) framework, it is possible to pool multiple nationally representative samples together in order to create a unified dataset that includes respondents over the entire adult life course. This study applies an IDA framework to two independent public health datasets to create a commensurate measure of cognition using Moderated Non-Linear Factor Analysis (MNLFA). The overall goal is to demonstrate the process of using MNLFA for the study of cognition in a pooled dataset.
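The pooling step behind MNLFA can be illustrated with a hypothetical parameterization (the function name, effect sizes, and sample sizes below are invented for illustration): factor-model parameters such as an item intercept are written as functions of covariates, including a study indicator, so the same item remains comparable across the pooled samples.

```python
import numpy as np

# Hypothetical MNLFA-style moderation: an item intercept that shifts
# as a function of a study-membership covariate.
def item_intercept(base_tau, dif_effect, study_indicator):
    # study_indicator is 0 for dataset A, 1 for dataset B
    return base_tau + dif_effect * study_indicator

# Pool two datasets and attach a study indicator to each respondent.
study_a = np.zeros(3)  # three respondents from study A
study_b = np.ones(2)   # two respondents from study B
pooled = np.concatenate([study_a, study_b])

# Moderated intercepts for one item across the pooled sample:
# the intercept shifts by the DIF effect for study-B respondents.
taus = item_intercept(1.0, 0.4, pooled)
print(taus)
```

Estimating such covariate-moderated parameters, rather than assuming they are identical across studies, is what yields a commensurate cognition measure in the pooled dataset.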


METRON ◽  
2020 ◽  
Vol 78 (3) ◽  
pp. 329-352
Author(s):  
Erick da Conceição Amorim ◽  
Vinícius Diniz Mayrink
