Predicting emotional state using behavioural markers derived from passively sensed data (Preprint)
BACKGROUND Mental health disorders affect multiple aspects of patients' lives, including mood, cognition, and behaviour. The advent of eHealth and mHealth technologies enables rich sets of information to be collected from individuals in a non-invasive way, presenting a promising opportunity for the construction of behavioural markers of mental health. Importantly, combining such data with self-reported information about psychological symptoms may provide a more comprehensive and contextualised view of a patient's mental state than questionnaire data alone. However, in the real world, this kind of data is usually noisy and incomplete, with significant numbers of missing observations. Realising the clinical potential of mHealth tools therefore depends critically upon the development of methods that can cope with such data. OBJECTIVE Here, we present a machine learning-based approach for emotional valence (mood) analysis using passively collected data from mobile phones and wearable devices. METHODS Passively sensed behaviour and self-reported emotional state data from an international cohort of N=943 individuals (psychiatric outpatients recruited from community clinics) were available for analysis. All study participants had at least 30 days' worth of observations of naturally occurring behaviour, which included information about physical activity, geolocation, sleep, and smartphone app usage. These regularly sampled, but frequently missing and heterogeneous, time series data were analysed using a semi-supervised Hidden Markov Model (HMM) for data averaging and feature extraction, which was then combined with a classifier to provide emotional valence predictions. We examined the performance of a variety of classical machine learning methods as well as recurrent neural networks.
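To illustrate how an HMM can tolerate missing observations while still yielding features for a downstream classifier, the following is a minimal sketch rather than the model used in the study. It assumes a single behavioural signal with Gaussian emissions, fixed (already estimated) parameters, and plain unsupervised smoothing; the semi-supervised training step is not shown. A missing day is encoded as `None` and treated as uninformative, so the state estimate for that day comes only from its neighbours through the transition matrix. All function names, parameter values, and the example series are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def emission_likelihoods(obs, means, stds):
    """Per-state likelihood of one observation.

    A missing value (None) carries no information, so every state
    receives likelihood 1 and only the transition model drives the update.
    """
    if obs is None:
        return [1.0] * len(means)
    return [gaussian_pdf(obs, m, s) for m, s in zip(means, stds)]

def state_posteriors(observations, start, trans, means, stds):
    """Scaled forward-backward pass; returns P(state | all observations) per step."""
    n = len(start)
    T = len(observations)
    lik = [emission_likelihoods(o, means, stds) for o in observations]

    # Forward pass, normalising at each step to avoid numerical underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            a = [start[k] * lik[0][k] for k in range(n)]
        else:
            a = [lik[t][k] * sum(alpha[t - 1][j] * trans[j][k] for j in range(n))
                 for k in range(n)]
        c = sum(a)
        alpha.append([v / c for v in a])
        scale.append(c)

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for j in range(n):
            beta[t][j] = sum(trans[j][k] * lik[t + 1][k] * beta[t + 1][k]
                             for k in range(n)) / scale[t + 1]

    # Combine both passes into smoothed state posteriors.
    gamma = []
    for t in range(T):
        g = [alpha[t][k] * beta[t][k] for k in range(n)]
        s = sum(g)
        gamma.append([v / s for v in g])
    return gamma

# Hypothetical two-state example: state 0 = low activity, state 1 = high activity.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
means, stds = [2.0, 8.0], [1.5, 1.5]
series = [2.1, None, 7.9, 8.3, None, 1.8]  # None marks a missing day

posteriors = state_posteriors(series, start, trans, means, stds)
for day, probs in enumerate(posteriors):
    print(day, [round(p, 3) for p in probs])
```

Each row of `posteriors` is a probability vector over latent states and could be passed to a classifier as a feature vector, alongside or instead of the raw measurements.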
RESULTS The best-performing models achieved greater than 0.80 area under the receiver operating characteristic curve (AUC-ROC) and 0.75 area under the precision-recall curve (AUC-PRC) when predicting self-reported emotional valence from behaviour in held-out test data. Models that took into account the posterior probabilities of latent states identified by the HMM analysis outperformed those that did not, suggesting that the underlying behavioural patterns identified were meaningful with respect to individuals' overall emotional state. CONCLUSIONS These findings demonstrate the feasibility of designing machine learning models that predict emotional state from mobile sensing data and can cope with heterogeneous data containing large numbers of missing observations. Such models may represent a valuable tool for clinicians monitoring the mood states of their patients.
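For readers less familiar with the two reported metrics, the sketch below shows one common way to compute them from held-out labels and predicted scores. It is illustrative only: the abstract does not state which estimator was used for the precision-recall area, so average precision (a step-wise approximation) is assumed here, and the labels and scores are made-up values, not study data.

```python
def auc_roc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win, which
    makes this equal to the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example.

    Assumes scores are distinct; tied scores would need to be grouped.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos

# Made-up example: 1 = positive valence, 0 = negative valence.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(round(auc_roc(labels, scores), 4))
print(round(average_precision(labels, scores), 4))
```

AUC-ROC is insensitive to class balance, whereas the precision-recall area depends on the proportion of positives, which is one reason to report both when mood classes are unevenly represented.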