Covariate-Profile Similarity Weighting and Bagging Studies with the Study Strap: Multi-Study Learning for Human Neurochemical Sensing
Prediction settings with multiple studies have become increasingly common. Ensembling models trained on individual studies has been shown to improve replicability in new studies. Motivated by a groundbreaking new technology in human neuroscience, we introduce two generalizations of multi-study ensemble predictions. First, while existing methods weight ensemble elements by cross-study prediction performance, we extend weighting schemes to also incorporate covariate similarity between training data and target validation studies. Second, we introduce a hierarchical resampling scheme to generate pseudo-study replicates (“study straps”) and ensemble classifiers trained on these rather than on the original studies themselves. We demonstrate analytically that existing methods are special cases. Through a tuning parameter, our approach forms a continuum between merging all training data and training with existing multi-study ensembles. Leveraging this continuum helps accommodate different levels of between-study heterogeneity. Our methods are motivated by the application of voltammetry in humans. This technique records electrical brain measurements and converts the signals into neurotransmitter concentration estimates using a prediction model. Using this model in practice presents a cross-study challenge, for which we show marked improvements after applying our methods. We verify our methods in simulations and provide the studyStrap R package.
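To make the hierarchical resampling idea concrete, the following is a minimal Python sketch of how one pseudo-study might be generated. It is an illustration under stated assumptions, not the studyStrap package's implementation: the function name `study_strap`, the `bag_size` argument (standing in for the tuning parameter), and the proportional row-sampling rule are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def study_strap(study_sizes, bag_size, rng):
    """Draw one pseudo-study; returns {study index: sampled row indices}.

    Hypothetical sketch: `bag_size` plays the role of the tuning parameter,
    and the proportional sampling rule below is an assumption.
    """
    k = len(study_sizes)
    # Level 1: draw `bag_size` studies with replacement and count the draws.
    counts = Counter(rng.randrange(k) for _ in range(bag_size))
    pseudo_study = {}
    for i, c_i in counts.items():
        n_i = study_sizes[i]
        # Level 2: sample rows from study i in proportion to its share of the bag.
        m_i = max(1, round(n_i * c_i / bag_size))
        pseudo_study[i] = rng.sample(range(n_i), m_i)
    return pseudo_study


rng = random.Random(0)
sizes = [100, 80, 120]  # made-up numbers of observations per training study
single = study_strap(sizes, bag_size=1, rng=rng)    # one whole original study
mixed = study_strap(sizes, bag_size=50, rng=rng)    # a blend of several studies
```

Under these assumptions, `bag_size=1` returns a single original study in full, which corresponds to ensembling over the studies themselves, while a large `bag_size` draws roughly equal fractions of every study, so each pseudo-study resembles a subsample of the merged data. An ensemble would then train one model per pseudo-study and combine their predictions.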