Context-dependent multiplexing by individual VTA dopamine neurons

2018 ◽  
Author(s):  
Kremer Yves ◽  
Flakowski Jérôme ◽  
Rohner Clément ◽  
Lüscher Christian

Abstract Dopamine (DA) neurons of the ventral tegmental area (VTA) track external cues and rewards to generate a reward prediction error (RPE) signal during Pavlovian conditioning. Here we explored how RPE is implemented for a self-paced, operant task in freely moving mice. The animal could trigger a reward-predicting cue by remaining in a specific location of an operant box for a brief time before moving to a spout for reward collection. In vivo single-unit recordings revealed phasic responses to the cue and reward in correct trials, whereas on failed trials activity paused, reflecting the positive and negative error signals of a reward prediction. In addition, a majority of VTA DA neurons also encoded parameters of the goal-directed action (e.g. movement velocity, acceleration, distance to goal and licking) through changes in tonic firing rate. Such multiplexing by individual neurons was apparent only while the mouse was engaged in the task. We conclude that a multiplexed internal representation modulates VTA DA neuron activity during the task, indicating a multimodal prediction error that shapes behavioral adaptation of a self-paced, goal-directed action.
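The positive and negative error signals described above can be sketched with a minimal Rescorla-Wagner-style learner. The learning rate, reward size, and trial structure below are illustrative assumptions, not the authors' model:

```python
ALPHA = 0.2  # learning rate (illustrative)

def simulate(n_trials, omit_last=False):
    """Cue -> reward trials; return per-trial RPEs at cue and outcome."""
    V = 0.0                      # learned value of the reward-predicting cue
    cue_rpe, outcome_rpe = [], []
    for t in range(n_trials):
        cue_rpe.append(V)        # phasic cue response grows with learning
        reward = 0.0 if (omit_last and t == n_trials - 1) else 1.0
        delta = reward - V       # RPE at the outcome
        outcome_rpe.append(delta)
        V += ALPHA * delta       # Rescorla-Wagner update
    return cue_rpe, outcome_rpe
```

After training, a delivered reward evokes a near-zero RPE, while omitting it (a failed trial) yields a negative RPE, the pause in firing.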

2020 ◽  
Author(s):  
Clément Prévost-Solié ◽  
Benoit Girard ◽  
Beatrice Righetti ◽  
Malika Tapparel ◽  
Camilla Bellone

Abstract Social interactions motivate behavior in many species, facilitating learning, foraging and cooperative behavior. However, how the brain encodes the reinforcing properties of social interactions remains elusive. Here, using in vivo recording in freely moving mice, we show that Dopamine (DA) neurons of the Ventral Tegmental Area (VTA) increase their activity during active interactions with unfamiliar conspecifics. Using a social instrumental task, we then show that VTA DA neuron activity signals a social reward prediction error and drives social reinforcement learning. Our findings thus identify VTA DA neurons as a neural substrate for a social learning signal driving motivated behavior.
One Sentence Summary: DA neurons are a substrate for social reward learning through the social reward prediction error.


2020 ◽  
Author(s):  
Pramod Kaushik ◽  
Jérémie Naudé ◽  
Surampudi Bapi Raju ◽  
Frédéric Alexandre

Abstract Classical conditioning is a fundamental learning mechanism in which the Ventral Striatum is generally thought to be the source of inhibition to Ventral Tegmental Area (VTA) Dopamine neurons when a reward is expected. However, recent evidence points to a new candidate: VTA GABA neurons encoding expectation for computing the reward prediction error in the VTA. In this system-level computational model, the VTA GABA signal is hypothesised to be a combination of magnitude and timing, computed in the Pedunculopontine Nucleus and the Ventral Striatum, respectively. This dissociation enables the model to explain recent results wherein Ventral Striatum lesions affected the temporal expectation of the reward while its magnitude remained intact. The model also exhibits other features of classical conditioning, namely progressively decreasing firing for early rewards closer to the actual reward, twin peaks of VTA dopamine during training, and cancellation of US dopamine after training.
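The magnitude/timing dissociation can be sketched as a toy decomposition. The signal shapes, the multiplicative gating, and the lesion model are illustrative assumptions, not the paper's equations:

```python
def timing_kernel(t, reward_time, lesioned=False):
    """1 near the expected reward time; flat (uninformative) if lesioned."""
    if lesioned:
        return 1.0  # expectation no longer temporally specific
    return 1.0 if abs(t - reward_time) <= 1 else 0.0

def gaba_expectation(t, magnitude, reward_time, vs_lesion=False):
    """Expectation = magnitude (Pedunculopontine) gated by timing (Ventral Striatum)."""
    return magnitude * timing_kernel(t, reward_time, lesioned=vs_lesion)

def rpe(t, reward, magnitude, reward_time, vs_lesion=False):
    """Dopamine RPE: delivered reward minus the GABA-carried expectation."""
    return reward - gaba_expectation(t, magnitude, reward_time, vs_lesion)
```

In this sketch a Ventral Striatum "lesion" removes only the timing term, so an early reward is no longer surprising even though the expected magnitude survives, echoing the dissociation the model explains.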


2018 ◽  
Author(s):  
Rachel S. Lee ◽  
Marcelo G. Mattar ◽  
Nathan F. Parker ◽  
Ilana B. Witten ◽  
Nathaniel D. Daw

Abstract Although midbrain dopamine (DA) neurons have been thought to primarily encode reward prediction error (RPE), recent studies have also found movement-related DAergic signals. For example, we recently reported that DA neurons in mice projecting to dorsomedial striatum are modulated by choices contralateral to the recording side. Here we introduce, and ultimately reject, a candidate resolution for the puzzling RPE-versus-movement dichotomy, by showing how seemingly movement-related activity might be explained by an action-specific RPE. By considering both choice and RPE on a trial-by-trial basis, we find that DA signals are modulated by contralateral choice in a manner that is distinct from RPE, implying that choice encoding is better explained by movement direction. This fundamental separation between RPE and movement encoding may help shed light on the diversity of functions and dysfunctions of the DA system.
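The trial-by-trial logic of separating choice and RPE contributions can be sketched as follows. The additive signal model, the gains, and the factorial design are illustrative assumptions, not the authors' dataset or analysis code:

```python
CHOICE_GAIN = 0.8   # modulation by contralateral choice (illustrative)
RPE_GAIN = 1.5      # modulation by reward prediction error (illustrative)

def da_signal(contra_choice, rpe):
    """Simulated trial-wise dopamine response (arbitrary units)."""
    return CHOICE_GAIN * contra_choice + RPE_GAIN * rpe

def choice_effect(rpe_levels):
    """Choice modulation at matched RPE: contra minus ipsi, averaged."""
    diffs = [da_signal(1, r) - da_signal(0, r) for r in rpe_levels]
    return sum(diffs) / len(diffs)
```

Because the contra-minus-ipsi contrast is computed at matched RPE levels, it recovers the choice modulation independently of the RPE term, which is the sense in which the two codes can be dissociated trial by trial.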


2019 ◽  
Author(s):  
Melissa J. Sharpe ◽  
Hannah M. Batchelor ◽  
Lauren E. Mueller ◽  
Chun Yun Chang ◽  
Etienne J.P. Maes ◽  
...  

Abstract Dopamine neurons fire transiently in response to unexpected rewards. These neural correlates are proposed to signal the reward prediction error described in model-free reinforcement learning algorithms. This error term represents the unpredicted or ‘excess’ value of the rewarding event. In model-free reinforcement learning, this value is then stored as part of the learned value of any antecedent cues, contexts or events, making them intrinsically valuable, independent of the specific rewarding event that caused the prediction error. In support of equivalence between dopamine transients and this model-free error term, proponents cite causal optogenetic studies showing that artificially induced dopamine transients cause lasting changes in behavior. Yet none of these studies directly demonstrates the presence of cached value under conditions appropriate for associative learning. To address this gap in our knowledge, we conducted three studies in which we optogenetically activated dopamine neurons while rats were learning associative relationships, both with and without reward. In each experiment, the antecedent cues failed to acquire value and instead entered into value-independent associative relationships with the other cues or rewards. These results show that dopamine transients, constrained within appropriate learning situations, support valueless associative learning.


2016 ◽  
Vol 18 (1) ◽  
pp. 23-32 ◽  

Reward prediction errors consist of the differences between received and predicted rewards. They are crucial for basic forms of learning about rewards and make us strive for more rewards—an evolutionarily beneficial trait. Most dopamine neurons in the midbrain of humans, monkeys, and rodents signal a reward prediction error; they are activated by more reward than predicted (positive prediction error), remain at baseline activity for fully predicted rewards, and show depressed activity with less reward than predicted (negative prediction error). The dopamine signal increases nonlinearly with reward value and codes formal economic utility. Drugs of addiction generate, hijack, and amplify the dopamine reward signal and induce exaggerated, uncontrolled dopamine effects on neuronal plasticity. The striatum, amygdala, and frontal cortex also show reward prediction error coding, but only in subpopulations of neurons. Thus, the important concept of reward prediction errors is implemented in neuronal hardware.
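The review's definitions map onto a few lines of code: the prediction error is received minus predicted reward, and the nonlinear value coding can be modeled with a concave utility function. The power-law form and its exponent are illustrative assumptions:

```python
def utility(reward, curvature=0.5):
    """Concave utility: the dopamine signal grows nonlinearly,
    with diminishing sensitivity, as reward value increases."""
    return reward ** curvature

def prediction_error(received, predicted):
    """Reward prediction error in utility units."""
    return utility(received) - utility(predicted)

# positive error (more than predicted)  -> activation above baseline
# zero error (fully predicted reward)   -> baseline activity
# negative error (less than predicted)  -> depressed activity
```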


2021 ◽  
Author(s):  
Linda Requie ◽  
Marta Gómez-Gonzalo ◽  
Francesca Managò ◽  
Mauro Congiu ◽  
Marcello Melone ◽  
...  

Abstract The plasticity of glutamatergic transmission in the Ventral Tegmental Area (VTA) represents a fundamental mechanism in the modulation of dopamine neuron burst firing and phasic dopamine release at VTA target regions. These processes encode basic behavioral responses, including locomotor activity, learning and motivated behaviors. Here we describe a hitherto unidentified mechanism of long-lasting potentiation of glutamatergic synapses on DA neurons. We found that VTA astrocytes respond to dopamine neuron bursts with Ca2+ elevations that require activation of endocannabinoid CB1 and dopamine D2 receptors colocalized at the same astrocytic process. Astrocytes, in turn, release glutamate that, through presynaptic metabotropic glutamate receptor activation coupled with neuronal nitric oxide production, induces long-lasting potentiation of excitatory synapses on adjacent dopamine neurons. Consistent with this finding, selective activation of VTA astrocytes increases dopamine neuron bursts in vivo and induces locomotor hyperactivity. Astrocytes therefore play a key role in the modulation of VTA dopamine neuron activity.


2014 ◽  
Vol 26 (3) ◽  
pp. 635-644 ◽  
Author(s):  
Olav E. Krigolson ◽  
Cameron D. Hassall ◽  
Todd C. Handy

Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors—discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833–1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129–141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate not only that rewards elicit a neural response akin to a prediction error but also that this signal rapidly diminishes and propagates to the time of choice presentation with learning. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component that has a similar timing and topography as the feedback error-related negativity and that increased in amplitude with learning.
The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further support that the computations that underlie human learning and decision-making follow reinforcement learning principles.
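The amplitude shift such a model captures, a feedback-time prediction error that shrinks with learning while a choice-time value signal grows, can be sketched with a simple Rescorla-Wagner learner. The learning rate and reward coding are illustrative assumptions, not the authors' model:

```python
ALPHA = 0.3  # learning rate (illustrative)

def learn(n_trials, reward=1.0):
    """Track the signal at choice presentation and the RPE at feedback."""
    V = 0.0
    choice_signal, feedback_rpe = [], []
    for _ in range(n_trials):
        choice_signal.append(V)   # value evoked when the choice appears
        delta = reward - V        # prediction error at reward delivery
        feedback_rpe.append(delta)
        V += ALPHA * delta
    return choice_signal, feedback_rpe
```

With learning, `feedback_rpe` decays toward zero while `choice_signal` rises, mirroring the shrinking feedback error-related negativity and the growing reward positivity at choice presentation.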


2007 ◽  
Vol 97 (4) ◽  
pp. 3036-3045 ◽  
Author(s):  
Signe Bray ◽  
John O'Doherty

Attractive faces can be considered to be a form of visual reward. Previous imaging studies have reported activity in reward structures including orbitofrontal cortex and nucleus accumbens during presentation of attractive faces. Given that these stimuli appear to act as rewards, we set out to explore whether it was possible to establish conditioning in human subjects by pairing presentation of arbitrary affectively neutral stimuli with subsequent presentation of attractive and unattractive faces. Furthermore, we scanned human subjects with functional magnetic resonance imaging (fMRI) while they underwent this conditioning procedure to determine whether a reward-prediction error signal is engaged during learning with attractive faces as is known to be the case for learning with other types of reward such as juice and money. Subjects showed changes in behavioral ratings to the conditioned stimuli (CS) when comparing post- to preconditioning evaluations, notably for those CSs paired with attractive female faces. We used a simple Rescorla-Wagner learning model to generate a reward-prediction error signal and entered this into a regression analysis with the fMRI data. We found significant prediction error-related activity in the ventral striatum during conditioning with attractive compared with unattractive faces. These findings suggest that an arbitrary stimulus can acquire conditioned value by being paired with pleasant visual stimuli just as with other types of reward such as money or juice. This learning process elicits a reward-prediction error signal in a main target structure of dopamine neurons: the ventral striatum. The findings we describe here may provide insights into the neural mechanisms tapped into by advertisers seeking to influence behavioral preferences by repeatedly exposing consumers to simple associations between products and rewarding visual stimuli such as pretty faces.
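The modeling step described above, using a Rescorla-Wagner learner to turn the trial sequence into a prediction-error regressor, can be sketched as follows. The learning rate and the outcome coding (attractive face = 1, unattractive face = 0) are illustrative assumptions:

```python
ALPHA = 0.25  # learning rate (illustrative)

def rw_prediction_errors(cs_sequence, outcomes):
    """Return the trial-by-trial prediction-error time series that
    would be entered as a regressor in the fMRI analysis."""
    V = {}          # learned value per conditioned stimulus (CS)
    regressor = []
    for cs, r in zip(cs_sequence, outcomes):
        delta = r - V.get(cs, 0.0)          # Rescorla-Wagner error
        regressor.append(delta)
        V[cs] = V.get(cs, 0.0) + ALPHA * delta
    return regressor
```

For a CS consistently paired with an attractive face, the errors start large and decay geometrically as its value is learned; a CS paired with the neutral outcome generates no error, which is what lets the regressor isolate learning-related activity.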

