Mechanistic hierarchical population model identifies latent causes of cell-to-cell variability
All biological systems exhibit cell-to-cell variability, and this variability often has functional implications. To gain a thorough understanding of biological processes, the latent causes and underlying mechanisms of this variability must be elucidated. Cell populations comprising multiple distinct subpopulations are commonplace in biology, yet no current methods allow the sources of variability between and within individual subpopulations to be identified. This limits the analysis of single-cell data, such as those provided by flow cytometry and microscopy. In this study, we present a data-driven modeling framework for the analysis of populations comprising heterogeneous subpopulations. Our approach combines mixture modeling with frameworks for distribution approximation, facilitating the integration of multiple single-cell datasets and the detection of causal differences between and within subpopulations. The computational efficiency of our framework allows hundreds of competing hypotheses to be compared, enabling an unprecedented depth of analysis. We demonstrate the ability of our method to capture multiple levels of heterogeneity in analyses of simulated data and of data from highly heterogeneous sensory neurons involved in pain initiation. Our approach identified the sources of cell-to-cell variability and revealed mechanisms that underlie the modulation of nerve growth factor-induced Erk1/2 signaling by extracellular scaffolds.
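The idea of scoring competing subpopulation hypotheses against single-cell data can be illustrated with a minimal sketch. It is not the framework itself: it assumes univariate normal subpopulations with hand-fixed means and standard deviations, whereas in the approach described here the subpopulation distributions would come from distribution-approximation frameworks. The function names (`mixture_loglik`, `bic`), the simulated data, and the use of the Bayesian information criterion as the comparison score are all assumptions made for illustration.

```python
import numpy as np

def normal_logpdf(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

def mixture_loglik(x, weights, means, stds):
    """Log-likelihood of single-cell measurements x under a normal mixture."""
    x = np.asarray(x, dtype=float)[:, None]  # shape (n_cells, 1)
    log_terms = np.log(weights) + normal_logpdf(x, np.asarray(means), np.asarray(stds))
    # log-sum-exp over subpopulations for numerical stability
    m = log_terms.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_terms - m).sum(axis=1))))

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a preferred model."""
    return n_params * np.log(n_obs) - 2.0 * loglik

# Simulated population with two subpopulations (illustrative values only).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 1.0, 700)])

# Hypothesis 1: a single homogeneous population (2 parameters: mean, std).
ll_one = mixture_loglik(data, [1.0], [data.mean()], [data.std()])
# Hypothesis 2: two subpopulations (5 parameters: 1 weight, 2 means, 2 stds).
ll_two = mixture_loglik(data, [0.3, 0.7], [0.0, 4.0], [1.0, 1.0])

bic_one = bic(ll_one, 2, data.size)
bic_two = bic(ll_two, 5, data.size)
print("two-subpopulation model preferred:", bic_two < bic_one)
```

Because each hypothesis reduces to one likelihood evaluation and one score, many candidate models can in principle be ranked the same way; the parameters here are fixed rather than estimated, so the sketch shows only the comparison step.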