Triply Stochastic Variational Inference for Non-linear Beta Process Factor Analysis

Author(s):  
Kai Fan ◽  
Yizhe Zhang ◽  
Ricardo Henao ◽  
Katherine Heller
2020 ◽  
Author(s):  
Pere Ferrando ◽  
Urbano Lorenzo-Seva

Unit-weight sum scores (UWSSs) are routinely used as estimates of factor scores on the basis of solutions obtained with the non-linear exploratory factor analysis (EFA) model for ordered-categorical responses. Theoretically, this practice results in a loss of information and accuracy, and is expected to lead to biased estimates. However, the practical relevance of these limitations is far from clear. In this article we adopt an empirical view, and propose indices and procedures (some of them new) for assessing the appropriateness of UWSSs in non-linear EFA applications. A new automated approach for obtaining UWSSs that maximize fidelity and correlational accuracy is proposed. The appropriateness of UWSSs under different conditions and the behavior of the present proposal in comparison with other more common approaches are assessed with a simulation study. A tutorial for interested practitioners is presented using an illustrative example based on a well-known personality questionnaire. All the procedures proposed in the article have been implemented in a well-known noncommercial EFA program.
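As a minimal illustration of the quantities involved (the loadings, sample size, and one-factor structure below are made-up, not from the article), this sketch simulates a linear one-factor model, forms a unit-weight sum score, and checks its correlational accuracy against the true factor scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a one-factor model: theta is the latent factor score and each
# item loads on it with an assumed strength (illustrative values only).
n, p = 500, 6
theta = rng.standard_normal(n)
lambdas = np.array([0.8, 0.7, 0.6, 0.5, 0.4, 0.3])
items = np.outer(theta, lambdas) + rng.standard_normal((n, p)) * np.sqrt(1 - lambdas**2)

# Unit-weight sum score: add the items with equal (unit) weights,
# ignoring the fact that the loadings differ.
uwss = items.sum(axis=1)

# Correlational accuracy: how well the sum score recovers the factor.
r = np.corrcoef(uwss, theta)[0, 1]
print(round(r, 3))
```

With these loadings the sum score tracks the factor well but not perfectly, which is the gap the article's indices are designed to quantify for the harder non-linear, ordered-categorical case.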


Author(s):  
Yuto Yamaguchi ◽  
Kohei Hayashi

How can we decompose a data tensor if the indices are partially missing? Tensor decomposition is a fundamental tool for analyzing tensor data. Suppose, for example, we have a 3rd-order tensor X where each element Xijk takes 1 if user i posts word j at location k on Twitter. Standard tensor decomposition expects all indices to be observed, but in some tweets the location k can be missing. In this paper, we study a tensor decomposition problem where the indices (i, j, or k) of some observed elements are partially missing. To address this problem, we propose a probabilistic tensor decomposition model that handles missing indices as latent variables. To infer them, we derive an algorithm based on stochastic variational inference, which makes it possible to leverage the information from the incomplete data scalably. Experiments on both synthetic and real datasets show that the proposed method achieves higher accuracy in the tensor completion task than baselines that cannot handle missing indices.
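The core idea of treating a missing index as a latent variable can be sketched in a few lines. This toy example assumes the CP factor matrices are already known (in the paper they are inferred jointly via stochastic variational inference, which this sketch does not implement): an observation whose third index is missing is scored against every candidate index, giving a posterior over the missing k.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy rank-R CP model for a 3rd-order tensor: X_ijk ~ sum_r A[i,r]*B[j,r]*C[k,r].
# A, B, C are assumed known here for illustration.
I, J, K, R = 4, 5, 3, 2
A, B, C = rng.random((I, R)), rng.random((J, R)), rng.random((K, R))

def cp_value(i, j, k):
    return float(np.sum(A[i] * B[j] * C[k]))

# For an observation (i, j, ?) with value x but missing index k, treat k as
# a latent variable: score each candidate k by how well the model
# reconstructs x under a Gaussian likelihood, then normalize.
def posterior_over_k(i, j, x, sigma=0.1):
    log_lik = np.array([-0.5 * ((x - cp_value(i, j, k)) / sigma) ** 2
                        for k in range(K)])
    w = np.exp(log_lik - log_lik.max())  # stabilized softmax
    return w / w.sum()

# If x was generated at k=1, the posterior should peak there.
i, j, true_k = 2, 3, 1
x = cp_value(i, j, true_k)
post = posterior_over_k(i, j, x)
print(int(np.argmax(post)) == true_k)
```

The stochastic variational inference algorithm in the paper amortizes this kind of marginalization over many partially indexed observations while updating the factors at the same time.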


2019 ◽  
Author(s):  
Adam Gayoso ◽  
Romain Lopez ◽  
Zoë Steier ◽  
Jeffrey Regier ◽  
Aaron Streets ◽  
...  

Cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) combines unbiased single-cell transcriptome measurements with surface protein quantification comparable to flow cytometry, the gold standard for cell type identification. However, current analysis pipelines cannot address the two primary challenges of CITE-seq data: combining both modalities in a shared latent space that harnesses the power of the paired measurements, and handling the technical artifacts of the protein measurement, which is obscured by non-negligible background noise. Here we present Total Variational Inference (totalVI), a fully probabilistic end-to-end framework for normalizing and analyzing CITE-seq data, based on a hierarchical Bayesian model. In totalVI, the mRNA and protein measurements for each cell are generated from a low-dimensional latent random variable unique to that cell, representing its cellular state. totalVI uses deep neural networks to specify conditional distributions. By leveraging advances in stochastic variational inference, it scales easily to millions of cells. Explicit modeling of nuisance factors enables totalVI to produce denoised data in both domains, as well as a batch-corrected latent representation of cells for downstream analysis tasks.
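The generative idea described above (one low-dimensional latent state per cell generating both modalities, with additive background noise on the protein side) can be sketched as follows. This is not the actual totalVI model: it uses linear decoders where totalVI uses deep neural networks, and all dimensions and noise parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each cell has a low-dimensional latent state z that generates both
# mRNA and protein counts (a toy stand-in for totalVI's generative model).
n_cells, latent_dim, n_genes, n_proteins = 100, 5, 50, 10

z = rng.standard_normal((n_cells, latent_dim))   # cellular state
W_rna = rng.random((latent_dim, n_genes))        # assumed linear decoder
W_prot = rng.random((latent_dim, n_proteins))    # (totalVI uses deep nets)

rna_rate = np.exp((z @ W_rna) * 0.3)
protein_fg = np.exp((z @ W_prot) * 0.3)                        # foreground signal
protein_bg = rng.gamma(2.0, 1.0, size=(n_cells, n_proteins))   # background noise

rna_counts = rng.poisson(rna_rate)
protein_counts = rng.poisson(protein_fg + protein_bg)  # signal + background
print(rna_counts.shape, protein_counts.shape)
```

Explicitly generating the protein counts as foreground plus background is what lets a model of this shape denoise the protein measurements, as the abstract describes.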


2020 ◽  
Author(s):  
Max Reason ◽  
Yang Claire Yang ◽  
Allison Aiello ◽  
Dan Belsky ◽  
Patrick Curran ◽  
...  

Currently, studies of cognition and cognitive decline in the United States are limited by the use of samples that only provide data for respondents during one stage of the adult life course. By using an Integrative Data Analysis (IDA) framework, it is possible to pool multiple nationally representative samples together in order to create a unified dataset that includes respondents over the entire adult life course. This study applies an IDA framework to two independent public health datasets to create a commensurate measure of cognition using Moderated Non-Linear Factor Analysis (MNLFA). The overall goal is to demonstrate the process of using MNLFA for the study of cognition in a pooled dataset.
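The pooling step behind MNLFA can be illustrated with a hypothetical parameterization (the function name, effect sizes, and sample sizes below are invented for illustration): factor-model parameters such as an item intercept are written as functions of covariates, including a study indicator, so the same item remains comparable across the pooled samples.

```python
import numpy as np

# Hypothetical MNLFA-style moderation: an item intercept that shifts
# as a function of a study-membership covariate.
def item_intercept(base_tau, dif_effect, study_indicator):
    # study_indicator is 0 for dataset A, 1 for dataset B
    return base_tau + dif_effect * study_indicator

# Pool two datasets and attach a study indicator to each respondent.
study_a = np.zeros(3)  # three respondents from study A
study_b = np.ones(2)   # two respondents from study B
pooled = np.concatenate([study_a, study_b])

# Moderated intercepts for one item across the pooled sample:
# the intercept shifts by the DIF effect for study-B respondents.
taus = item_intercept(1.0, 0.4, pooled)
print(taus)
```

Estimating such covariate-moderated parameters, rather than assuming they are identical across studies, is what yields a commensurate cognition measure in the pooled dataset.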


METRON ◽  
2020 ◽  
Vol 78 (3) ◽  
pp. 329-352
Author(s):  
Erick da Conceição Amorim ◽  
Vinícius Diniz Mayrink
