Stimulation of the vagus nerve reduces learning in a go/no-go reinforcement learning task

2019
Author(s):
Anne Kühnel
Vanessa Teckentrup
Monja P. Neuser
Quentin J. M. Huys
Caroline Burrasch
...  

Abstract When facing decisions to approach rewards or to avoid punishments, we often figuratively go with our gut, and the impact of metabolic states such as hunger on motivation is well documented. However, whether and how vagal feedback signals from the gut influence instrumental actions is unknown. Here, we investigated the effect of non-invasive transcutaneous vagus nerve stimulation (tVNS) vs. sham (randomized cross-over design) on approach and avoidance behavior using an established go/no-go reinforcement learning paradigm (Guitart-Masip et al., 2012) in 39 healthy participants after an overnight fast. First, mixed-effects logistic regression analysis of choice accuracy showed that tVNS acutely impaired decision-making, p = .045. Computational reinforcement learning models traced this impairment to a tVNS-induced reduction in the learning rate (Δα = −0.092, p_boot = .002), particularly after punishment (Δα_Pun = −0.081, p_boot = .012 vs. Δα_Rew = −0.031, p = .22). However, tVNS had no effect on go biases, Pavlovian response biases or response times. Hence, tVNS appeared to influence learning rather than action execution. These results highlight a novel role of vagal afferent input in modulating reinforcement learning by tuning the learning rate according to homeostatic needs.
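To make the modeling concrete: the valence-specific learning rates reported above correspond to a Rescorla-Wagner update in which the step size depends on whether the outcome was rewarding or punishing. The sketch below is illustrative only; the parameter names and values are assumptions, not the authors' implementation.

```python
# Minimal sketch of a Rescorla-Wagner update with valence-specific
# learning rates, as used in go/no-go RL models of this kind.
# alpha_rew and alpha_pun are illustrative values, not fitted estimates.
def update_value(q, outcome, alpha_rew=0.3, alpha_pun=0.2):
    """Move the action value q toward the observed outcome.

    outcome: +1 for reward, -1 for punishment, 0 for neutral.
    The learning rate depends on the valence of the outcome.
    """
    alpha = alpha_rew if outcome > 0 else alpha_pun
    return q + alpha * (outcome - q)

q = 0.0
for outcome in (1, 1, -1, 0, 1):
    q = update_value(q, outcome)
    print(f"outcome={outcome:+d}  q={q:.3f}")
```

A tVNS-induced reduction in alpha_pun, as reported above, would slow the downward revision of q after punishments while leaving reward updates largely intact.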

2021
Vol 17 (7)
pp. e1008524
Author(s):
Liyu Xia
Sarah L. Master
Maria K. Eckstein
Beth Baribault
Ronald E. Dahl
...  

In the real world, many relationships between events are uncertain and probabilistic. Uncertainty is also likely to be a more common feature of daily experience for youth because they have less experience to draw from than adults. Some studies suggest probabilistic learning may be inefficient in youths compared to adults, while others suggest it may be more efficient in mid-adolescence. Here we used a probabilistic reinforcement learning task to test how youths aged 8-17 (N = 187) and adults aged 18-30 (N = 110) learn about stable probabilistic contingencies. Performance increased with age through the early twenties, then stabilized. Using hierarchical Bayesian methods to fit computational reinforcement learning models, we show that all participants’ performance was better explained by models in which negative outcomes had minimal to no impact on learning. The performance increase over age was driven by (1) an increase in learning rate (i.e., a decrease in integration time scale) and (2) a decrease in noisy/exploratory choices. In mid-adolescence (ages 13-15), salivary testosterone and learning rate were positively related. We discuss our findings in the context of other studies and hypotheses about adolescent brain development.
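The favored model class can be made concrete with a small simulation: an agent whose learning rate for negative outcomes is near zero still learns stable contingencies well. This is a hedged sketch under assumed parameter values, not the authors' fitted model.

```python
import numpy as np

# Sketch of an asymmetric RL agent on a two-armed task with stable
# reward probabilities; alpha_neg is set near zero to mirror the
# "negative outcomes have minimal impact" model class. Values assumed.
def simulate_agent(p_reward=(0.8, 0.2), alpha_pos=0.4, alpha_neg=0.05,
                   beta=5.0, n_trials=200, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    chose_best = 0
    for _ in range(n_trials):
        p_choose = np.exp(beta * q) / np.exp(beta * q).sum()  # softmax
        choice = rng.choice(2, p=p_choose)
        reward = float(rng.random() < p_reward[choice])
        alpha = alpha_pos if reward else alpha_neg
        q[choice] += alpha * (reward - q[choice])
        chose_best += (choice == 0)
    return chose_best / n_trials

print(f"fraction of best-option choices: {simulate_agent():.2f}")
```

Raising alpha_pos (shortening the integration time scale) or reducing choice noise reproduces, qualitatively, the two age effects described above.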


2019
Author(s):
Alexandra O. Cohen
Kate Nussenbaum
Hayley Dorfman
Samuel J. Gershman
Catherine A. Hartley

Beliefs about the controllability of positive or negative events in the environment can shape learning throughout the lifespan. Previous research has shown that adults’ learning is modulated by beliefs about the causal structure of the environment, such that they update their value estimates to a lesser extent when outcomes can be attributed to hidden causes. The present study examined whether external causes similarly influence outcome attributions and learning across development. Ninety participants, ages 7 to 25 years, completed a reinforcement learning task in which they chose between two options with fixed reward probabilities. Choices were made in three distinct environments in which different hidden agents occasionally intervened to generate positive, negative, or random outcomes. Participants’ beliefs about hidden-agent intervention aligned well with the true probabilities of positive, negative, or random outcome manipulation in each of the three environments. Computational modeling of the learning data revealed that while the choices made by both adults (ages 18-25) and adolescents (ages 13-17) were best fit by Bayesian reinforcement learning models that incorporate beliefs about hidden-agent intervention, those of children (ages 7-12) were best fit by a one-learning-rate model that updates value estimates based on choice outcomes alone. Together, these results suggest that although children demonstrate explicit awareness of the causal structure of the task environment, they do not implicitly use beliefs about that structure to guide reinforcement learning in the same manner as adolescents and adults.
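The contrast between the two model classes can be illustrated compactly: the Bayesian variant scales each value update by the inferred probability that the outcome was self-generated rather than caused by a hidden agent, whereas the one-learning-rate model ignores that attribution. The function below is an assumed, simplified form, not the authors' model.

```python
# Sketch: down-weight the prediction error by the probability that the
# outcome reflects the chosen option itself, not a hidden-agent
# intervention. The one-learning-rate model is the special case
# p_intervened = 0 on every trial.
def attributed_update(q, outcome, p_intervened, alpha=0.3):
    p_self = 1.0 - p_intervened
    return q + alpha * p_self * (outcome - q)

q = 0.5
# The same negative outcome, attributed differently:
print(attributed_update(q, 0.0, p_intervened=0.8))  # blamed on the agent
print(attributed_update(q, 0.0, p_intervened=0.1))  # blamed on the choice
```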


2014
Vol 26 (3)
pp. 447-458
Author(s):
Ernest Mas-Herrero
Josep Marco-Pallarés

In decision-making processes, the relevance of the information yielded by outcomes varies across time and situations. It increases when previous predictions are not accurate and in contexts with high environmental uncertainty. Previous fMRI studies have shown an important role of medial pFC in coding both reward prediction errors and the impact of this information on future decisions. However, it is unclear whether these two processes are dissociated in time or occur simultaneously, which would suggest a common underlying mechanism. In the present work, we studied how two electrophysiological responses associated with outcome processing (the feedback-related negativity ERP and frontocentral theta oscillatory activity) are modulated by the reward prediction error and the learning rate. Twenty-six participants performed two learning tasks differing in the degree of predictability of the outcomes: a reversal learning task and a probabilistic learning task with multiple blocks of novel cue-outcome associations. We implemented a reinforcement learning model to obtain the single-trial reward prediction error and the learning rate for each participant and task. Our results indicated that midfrontal theta activity and the feedback-related negativity increased linearly with the unsigned prediction error. In addition, variations in frontal theta oscillatory activity predicted the learning rate across tasks and participants. These results support the existence of a common brain mechanism for the computation of the unsigned prediction error and the learning rate.
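For clarity, the two single-trial quantities related to the EEG measures are simple functions of a Rescorla-Wagner update; the sketch below uses assumed parameter values, not the fitted model.

```python
# Sketch of the single-trial quantities discussed above: the signed
# reward prediction error from a Rescorla-Wagner update and its unsigned
# magnitude, which the study relates to FRN amplitude and theta power.
def trial_quantities(q, outcome, alpha=0.2):
    signed_pe = outcome - q        # reward prediction error
    q_new = q + alpha * signed_pe  # value update
    return q_new, signed_pe, abs(signed_pe)

q = 0.5
for outcome in (1.0, 0.0, 1.0):
    q, spe, upe = trial_quantities(q, outcome)
    print(f"q={q:.3f}  signed PE={spe:+.3f}  |PE|={upe:.3f}")
```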


2020
Author(s):
Jonathan W. Kanen
Qiang Luo
Mojtaba R. Kandroodi
Rudolf N. Cardinal
Trevor W. Robbins
...  

Abstract The non-selective serotonin 2A (5-HT2A) receptor agonist lysergic acid diethylamide (LSD) holds promise as a treatment for some psychiatric disorders. Psychedelic drugs such as LSD have been suggested to have therapeutic actions through their effects on learning. The behavioural effects of LSD in humans, however, remain largely unexplored. Here we examined how LSD affects probabilistic reversal learning in healthy humans. Conventional measures assessing sensitivity to immediate feedback (“win-stay” and “lose-shift” probabilities) were unaffected, whereas LSD increased the impact of the strength of initial learning on perseveration. Computational modelling revealed that the most pronounced effect of LSD was enhancement of the reward learning rate. The punishment learning rate was also elevated. Increased reinforcement learning rates suggest LSD induced a state of heightened plasticity. These results indicate a potential mechanism through which revision of maladaptive associations could occur.
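The "conventional measures" mentioned above are model-agnostic statistics computed directly from the choice sequence; a minimal sketch, assuming binary choice and reward arrays, is given below (not the authors' code).

```python
import numpy as np

# Win-stay probability: repeating a choice after it was rewarded.
# Lose-shift probability: switching after it was unrewarded.
def win_stay_lose_shift(choices, rewards):
    choices, rewards = np.asarray(choices), np.asarray(rewards)
    stay = choices[1:] == choices[:-1]
    win, lose = rewards[:-1] == 1, rewards[:-1] == 0
    win_stay = stay[win].mean() if win.any() else np.nan
    lose_shift = (~stay[lose]).mean() if lose.any() else np.nan
    return win_stay, lose_shift

print(win_stay_lose_shift([0, 0, 1, 1, 0, 0], [1, 0, 1, 1, 0, 1]))
```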


2020
Vol 15 (6)
pp. 695-707
Author(s):
Lei Zhang
Lukas Lengersdorff
Nace Mikus
Jan Gläscher
Claus Lamm

Abstract Recent years have witnessed a dramatic increase in the use of reinforcement learning (RL) models in social, cognitive and affective neuroscience. This approach, in combination with neuroimaging techniques such as functional magnetic resonance imaging, enables quantitative investigations into latent mechanistic processes. However, increased use of relatively complex computational approaches has led to potential misconceptions and imprecise interpretations. Here, we present a comprehensive framework for the examination of (social) decision-making with the simple Rescorla-Wagner RL model. We discuss common pitfalls in its application and provide practical suggestions. First, with simulation, we unpack the functional role of the learning rate and pinpoint what could easily go wrong when interpreting differences in the learning rate. Then, we discuss the inevitable collinearity between outcome and prediction error in RL models and provide suggestions on how to justify whether the observed neural activation is related to the prediction error rather than outcome valence. Finally, we suggest that posterior predictive checks are a crucial step after model comparison, and we advocate employing hierarchical modeling for parameter estimation. We aim to provide simple and scalable explanations and practical guidelines for employing RL models to assist both beginners and advanced users in better implementing and interpreting their model-based analyses.
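The simple Rescorla-Wagner model at the center of this framework fits in a few lines; the simulation below (illustrative values, not from the paper) shows the learning-rate trade-off the authors unpack: a high alpha chases recent outcomes, a low alpha averages over a long history.

```python
import numpy as np

# Rescorla-Wagner value updating on a fixed 70%-reward schedule,
# simulated at a low and a high learning rate.
def rescorla_wagner(rewards, alpha):
    v, trace = 0.0, []
    for r in rewards:
        v += alpha * (r - v)  # prediction-error update
        trace.append(v)
    return np.array(trace)

rng = np.random.default_rng(1)
rewards = (rng.random(200) < 0.7).astype(float)
for alpha in (0.1, 0.7):
    v = rescorla_wagner(rewards, alpha)
    print(f"alpha={alpha}: mean value={v[50:].mean():.2f}, "
          f"trial-to-trial jitter={np.std(np.diff(v)):.3f}")
```

Both agents converge near 0.7 on average, but the high-alpha agent's estimate fluctuates far more from trial to trial, which is exactly why group differences in alpha are easy to misread without simulation.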


2017
Vol 48 (2)
pp. 327-336
Author(s):
O. T. Ousdal
Q. J. Huys
A. M. Milde
A. R. Craven
L. Ersland
...  

Background: Disturbances in Pavlovian valuation systems are reported to follow traumatic stress exposure. However, motivated decisions are also guided by instrumental mechanisms, and to date the effect of traumatic stress on these instrumental systems remains poorly investigated. Here, we examine whether a single episode of severe traumatic stress influences flexible instrumental decisions through an impact on a Pavlovian system.
Methods: Twenty-six survivors of the 2011 Norwegian terror attack and 30 matched control subjects performed an instrumental learning task in which Pavlovian and instrumental associations promoted congruent or conflicting responses. We used reinforcement learning models to infer how traumatic stress affected learning and decision-making. Based on the importance of the dorsal anterior cingulate cortex (dACC) for cognitive control, we also investigated whether individual concentrations of Glx (glutamate + glutamine) in the dACC predicted the Pavlovian bias of choice.
Results: Survivors of traumatic stress expressed a greater Pavlovian interference with instrumental action selection and had significantly lower levels of Glx in the dACC. Across subjects, the degree of Pavlovian interference was negatively associated with dACC Glx concentrations.
Conclusions: Experiencing traumatic stress appears to render instrumental decisions less flexible by increasing the susceptibility to Pavlovian influences. An observed association between prefrontal glutamatergic levels and this Pavlovian bias provides novel insight into the neurochemical basis of decision-making, and suggests a mechanism by which traumatic stress can impair flexible instrumental behaviours.
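In models of this task family, Pavlovian interference is typically captured by a bias parameter that couples stimulus values to the tendency to act; a larger bias means aversive cues suppress "go" responses more strongly. The form below is an assumed, standard parameterization, not necessarily the authors' exact model.

```python
# Sketch: decision weights for "go" vs "no-go" with a Pavlovian bias pi.
# state_value is the Pavlovian stimulus value (positive = appetitive,
# negative = aversive); go_bias is a constant tendency to act.
def action_weights(q_go, q_nogo, state_value, pi=0.4, go_bias=0.1):
    w_go = q_go + go_bias + pi * state_value
    w_nogo = q_nogo
    return w_go, w_nogo

# An aversive cue can suppress "go" even when its instrumental value is
# higher; a larger fitted pi (as reported for trauma survivors here)
# makes this interference stronger.
print(action_weights(q_go=0.6, q_nogo=0.5, state_value=-1.0, pi=0.4))
```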


2021
Author(s):
Sebastian Bruch
Patrick McClure
Jingfeng Zhou
Geoffrey Schoenbaum
Francisco Pereira

Deep Reinforcement Learning (Deep RL) agents have in recent years emerged as successful models of animal behavior in a variety of complex learning tasks, as exemplified by Song et al. [2017]. As agents are typically trained to mimic an animal subject, the emphasis in past studies on behavior as a means of evaluating the fitness of models to experimental data is only natural. But the true power of Deep RL agents lies in their ability to learn neural computations and codes that generate a particular behavior, factors that are also of great relevance and interest to computational neuroscience. On that basis, we believe that model evaluation should include an examination of neural representations and validation against neural recordings from animal subjects. In this paper, we introduce a procedure to test hypotheses about the relationship between the internal representations of Deep RL agents and those in animal neural recordings. Taking a sequential learning task as a running example, we apply our method and show that the geometry of representations learnt by artificial agents is similar to that of the biological subjects, and that such similarities are driven by shared information in some latent space. Our method is applicable to any Deep RL agent that learns a Markov Decision Process, and as such it enables researchers to assess the suitability of more advanced deep learning modules, to map hierarchies of representations to different parts of a circuit in the brain, and to shed light on their function. To demonstrate that point, we conduct an ablation study and deduce that, in the sequential task under consideration, temporal information plays a key role in molding a correct representation of the task.
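One standard way to compare representational geometry, in the spirit of the procedure described, is representational similarity analysis: compute a dissimilarity matrix over task conditions for each system and correlate the two. The sketch below uses random stand-in data and is a generic RSA comparison, not necessarily the authors' exact method.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
agent_acts = rng.standard_normal((20, 64))    # 20 conditions x 64 units
neural_acts = rng.standard_normal((20, 128))  # 20 conditions x 128 neurons

# Representational dissimilarity matrices (condensed form), one per system.
rdm_agent = pdist(agent_acts, metric="correlation")
rdm_neural = pdist(neural_acts, metric="correlation")

# Rank-correlate the two geometries.
rho, p = spearmanr(rdm_agent, rdm_neural)
print(f"RDM similarity: Spearman rho={rho:.3f}, p={p:.3f}")
```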


2018
Author(s):
Samuel D. McDougle
Peter A. Butcher
Darius Parvin
Faisal Mushtaq
Yael Niv
...  

Abstract Decisions must be implemented through actions, and actions are prone to error. As such, when an expected outcome is not obtained, an individual should not only be sensitive to whether the choice itself was suboptimal, but also whether the action required to indicate that choice was executed successfully. The intelligent assignment of credit to action execution versus action selection has clear ecological utility for the learner. To explore this scenario, we used a modified version of a classic reinforcement learning task in which feedback indicated if negative prediction errors were, or were not, associated with execution errors. Using fMRI, we asked if prediction error computations in the human striatum, a key substrate in reinforcement learning and decision making, are modulated when a failure in action execution results in the negative outcome. Participants were more tolerant of non-rewarded outcomes when these resulted from execution errors versus when execution was successful but the reward was withheld. Consistent with this behavior, a model-driven analysis of neural activity revealed an attenuation of the signal associated with negative reward prediction error in the striatum following execution failures. These results converge with other lines of evidence suggesting that prediction errors in the mesostriatal dopamine system integrate high-level information during the evaluation of instantaneous reward outcomes.
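The credit-assignment idea can be written as a gate on the prediction error: when a non-reward is attributable to a motor failure, the negative update is attenuated. The parameterization below is an assumption for illustration, not the fitted model.

```python
# Sketch: attenuate negative prediction errors that follow execution
# errors, mirroring the reduced striatal signal reported above.
def gated_update(q, reward, execution_error, alpha=0.3, gate=0.25):
    pe = reward - q
    if execution_error and pe < 0:
        pe *= gate  # discount non-reward caused by the action, not the choice
    return q + alpha * pe

q = 0.8
print(gated_update(q, 0.0, execution_error=True))   # mild devaluation
print(gated_update(q, 0.0, execution_error=False))  # full devaluation
```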


2020
Author(s):
Joana Carvalheiro
Vasco A. Conceição
Ana Mesquita
Ana Seara-Cardoso

Abstract Acute stress is ubiquitous in everyday life, but the extent to which acute stress affects how people learn from the outcomes of their choices is still poorly understood. Here, we investigate how acute stress impacts reward and punishment learning in men using a reinforcement-learning task. Sixty-two male participants performed the task whilst under stress and control conditions. We observed that acute stress impaired participants’ choice performance towards monetary gains, but not losses. To unravel the mechanism(s) underlying such impairment, we fitted a reinforcement-learning model to participants’ trial-by-trial choices. Computational modeling indicated that under acute stress participants learned more slowly from positive prediction errors (i.e., when outcomes were better than expected), consistent with stress-induced dopamine disruptions. Such mechanistic understanding of how acute stress impairs reward learning is particularly important given the pervasiveness of stress in our daily life and the impact that stress can have on our wellbeing and mental health.
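The mechanism identified by the modeling can be sketched with learning rates that depend on the sign of the prediction error; the values below are assumptions chosen to illustrate the reported stress effect, not fitted estimates.

```python
# Sketch: separate learning rates for positive and negative prediction
# errors; acute stress is modeled as a lower alpha_pos.
def update(q, outcome, alpha_pos, alpha_neg):
    pe = outcome - q
    alpha = alpha_pos if pe > 0 else alpha_neg
    return q + alpha * pe

q0, outcome = 0.2, 1.0  # an outcome better than expected
print(update(q0, outcome, alpha_pos=0.40, alpha_neg=0.30))  # control
print(update(q0, outcome, alpha_pos=0.15, alpha_neg=0.30))  # under stress
```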

