Occasion setters determine responses of putative dopamine neurons to discriminative stimuli

2019 ◽  
Author(s):  
Luca Aquili ◽  
Eric M. Bowman ◽  
Robert Schmidt

Abstract
Midbrain dopamine (DA) neurons are involved in the processing of rewards and reward-predicting stimuli, possibly analogous to reinforcement learning reward prediction errors. Here we studied the activity of putative DA neurons (n=41) recorded in the ventral tegmental area of rats (n=6) performing a behavioural task involving occasion setting. In this task an occasion setter (OS) indicated that the relationship between a discriminative stimulus (DS) and reinforcement was in effect, so that reinforcement of bar pressing occurred only after the OS (tone or houselight) was followed by the DS (houselight or tone). We found that responses of putative DA cells to the DS were enhanced when preceded by the OS, as were behavioural responses to obtain rewards. Surprisingly though, we did not find a population response of putative DA neurons to the OS, contrary to predictions of standard temporal-difference models of DA neurons. However, despite the absence of a population response, putative DA neurons exhibited heterogeneous responses at the single-unit level, with some units increasing and others decreasing their activity in response to the OS. Similarly, putative non-DA cells did not respond to the DS at the population level, but showed heterogeneous responses at the single-unit level. The heterogeneity in the responses of putative DA cells may reflect how DA neurons encode context and point to local differences in DA signalling.

2017 ◽  
Author(s):  
Matthew P.H. Gardner ◽  
Geoffrey Schoenbaum ◽  
Samuel J. Gershman

Abstract
Midbrain dopamine neurons are commonly thought to report a reward prediction error, as hypothesized by reinforcement learning theory. While this theory has been highly successful, several lines of evidence suggest that dopamine activity also encodes sensory prediction errors unrelated to reward. Here we develop a new theory of dopamine function that embraces a broader conceptualization of prediction errors. By signaling errors in both sensory and reward predictions, dopamine supports a form of reinforcement learning that lies between model-based and model-free algorithms. This account remains consistent with current canon regarding the correspondence between dopamine transients and reward prediction errors, while also accounting for new data suggesting a role for these signals in phenomena such as sensory preconditioning and identity unblocking, which ostensibly draw upon knowledge beyond reward predictions.


2017 ◽  
Vol 29 (12) ◽  
pp. 3311-3326 ◽  
Author(s):  
Samuel J. Gershman

The hypothesis that the phasic dopamine response reports a reward prediction error has become deeply entrenched. However, dopamine neurons exhibit several notable deviations from this hypothesis. A coherent explanation for these deviations can be obtained by analyzing the dopamine response in terms of Bayesian reinforcement learning. The key idea is that prediction errors are modulated by probabilistic beliefs about the relationship between cues and outcomes, updated through Bayesian inference. This account can explain dopamine responses to inferred value in sensory preconditioning, the effects of cue preexposure (latent inhibition), and adaptive coding of prediction errors when rewards vary across orders of magnitude. We further postulate that orbitofrontal cortex transforms the stimulus representation through recurrent dynamics, such that a simple error-driven learning rule operating on the transformed representation can implement the Bayesian reinforcement learning update.


eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Brian F Sadacca ◽  
Joshua L Jones ◽  
Geoffrey Schoenbaum

Midbrain dopamine neurons have been proposed to signal reward prediction errors as defined in temporal difference (TD) learning algorithms. While these models have been extremely powerful in interpreting dopamine activity, they typically do not use value derived through inference in computing errors. This is important because much real-world behavior – and thus many opportunities for error-driven learning – is based on such predictions. Here, we show that error-signaling rat dopamine neurons respond to the inferred, model-based value of cues that have not been paired with reward and do so in the same framework as they track the putative cached value of cues previously paired with reward. This suggests that dopamine neurons access a wider variety of information than contemplated by standard TD models and that, while their firing conforms to predictions of TD models in some cases, they may not be restricted to signaling errors from TD predictions.
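The "cached value" and TD-error language recurring in these abstracts refers to a simple update rule. As a minimal sketch (the state names, learning rate, and trial structure below are illustrative assumptions, not taken from any of the studies listed), a TD(0) learner computes the prediction error delta = r + gamma*V(s') - V(s) that these papers compare to phasic dopamine:

```python
# Minimal TD(0) sketch: value learning over a cue -> delay -> reward trial.
# All names and parameters here are illustrative, not from the studies above.
gamma = 0.9   # discount factor
alpha = 0.1   # learning rate
V = {"cue": 0.0, "delay": 0.0, "reward": 0.0}

# One trial: cue -> delay (no reward), then delay -> reward (r = 1).
episode = [("cue", "delay", 0.0), ("delay", "reward", 1.0)]

for _ in range(200):  # repeated cue-reward pairings
    for s, s_next, r in episode:
        delta = r + gamma * V[s_next] - V[s]  # reward prediction error
        V[s] += alpha * delta

# With training, the error at reward delivery shrinks and value propagates
# back to the cue: V["delay"] approaches 1.0 and V["cue"] approaches 0.9.
```

Note that in this caching scheme the error is driven only by directly experienced transitions, which is why responses to *inferred* value, as reported in the abstract above, fall outside the standard model.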


2018 ◽  
Vol 285 (1891) ◽  
pp. 20181645 ◽  
Author(s):  
Matthew P. H. Gardner ◽  
Geoffrey Schoenbaum ◽  
Samuel J. Gershman

Midbrain dopamine neurons are commonly thought to report a reward prediction error (RPE), as hypothesized by reinforcement learning (RL) theory. While this theory has been highly successful, several lines of evidence suggest that dopamine activity also encodes sensory prediction errors unrelated to reward. Here, we develop a new theory of dopamine function that embraces a broader conceptualization of prediction errors. By signalling errors in both sensory and reward predictions, dopamine supports a form of RL that lies between model-based and model-free algorithms. This account remains consistent with current canon regarding the correspondence between dopamine transients and RPEs, while also accounting for new data suggesting a role for these signals in phenomena such as sensory preconditioning and identity unblocking, which ostensibly draw upon knowledge beyond reward predictions.


2019 ◽  
Author(s):  
HyungGoo R. Kim ◽  
Athar N. Malik ◽  
John G. Mikhael ◽  
Pol Bech ◽  
Iku Tsutsui-Kimura ◽  
...  

Abstract
Rapid phasic activity of midbrain dopamine neurons is thought to signal reward prediction errors (RPEs), resembling temporal difference errors used in machine learning. Recent studies describing slowly increasing dopamine signals have instead proposed that they represent state values and arise independently from somatic spiking activity. Here, we developed novel experimental paradigms using virtual reality that disambiguate RPEs from values. We examined dopamine circuit activity at various stages including somatic spiking, axonal calcium signals, and striatal dopamine concentrations. Our results demonstrate that ramping dopamine signals are consistent with RPEs rather than value, and this ramping is observed at all the stages examined. We further show that ramping dopamine signals can be driven by a dynamic stimulus that indicates a gradual approach to a reward. We provide a unified computational understanding of rapid phasic and slowly ramping dopamine signals: dopamine neurons perform a derivative-like computation over values on a moment-by-moment basis.
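The "derivative-like computation" claim in the abstract above can be illustrated with a short sketch. Under the assumption (illustrative, not fit to any data from the study) that value V(t) rises convexly as reward approaches, the moment-by-moment TD error delta(t) = r(t) + gamma*V(t+1) - V(t) approximates a discounted time derivative of V, and itself ramps upward:

```python
# Sketch: a convexly increasing value trace yields a ramping TD error.
# The quadratic value function below is an illustrative assumption.
gamma = 0.99
T = 10
V = [(t / T) ** 2 for t in range(T + 1)]  # value rising toward reward at t = T

# r(t) = 0 at every step before reward, so delta is the discounted difference.
delta = [gamma * V[t + 1] - V[t] for t in range(T)]

# delta(t) grows as reward nears, echoing slowly ramping dopamine signals,
# even though each delta is just a local, moment-by-moment computation.
```

A linearly increasing V would instead produce a nearly flat delta, which is why the shape of the ramp matters for distinguishing the value account from the RPE account.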


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Harry J. Stewardson ◽  
Thomas D. Sambrook

Abstract
Reinforcement learning in humans and other animals is driven by reward prediction errors: deviations between the amount of reward or punishment initially expected and that which is obtained. Temporal difference methods of reinforcement learning generate this reward prediction error at the earliest time at which a revision in reward or punishment likelihood is signalled, for example by a conditioned stimulus. Midbrain dopamine neurons, believed to compute reward prediction errors, generate this signal in response to both conditioned and unconditioned stimuli, as predicted by temporal difference learning. Electroencephalographic recordings of human participants have suggested that a component named the feedback-related negativity (FRN) is generated when this signal is carried to the cortex. If this is so, the FRN should be expected to respond equivalently to conditioned and unconditioned stimuli. However, very few studies have attempted to measure the FRN’s response to unconditioned stimuli. The present study attempted to elicit the FRN in response to a primary aversive stimulus (electric shock) using a design that varied reward prediction error while holding physical intensity constant. The FRN was strongly elicited, but earlier and more transiently than typically seen, suggesting that it may incorporate processes other than the midbrain dopamine system.


2017 ◽  
Author(s):  
Samuel J. Gershman

Abstract
The hypothesis that the phasic dopamine response reports a reward prediction error has become deeply entrenched. However, dopamine neurons exhibit several notable deviations from this hypothesis. A coherent explanation for these deviations can be obtained by analyzing the dopamine response in terms of Bayesian reinforcement learning. The key idea is that prediction errors are modulated by probabilistic beliefs about the relationship between cues and outcomes, updated through Bayesian inference. This account can explain dopamine responses to inferred value in sensory preconditioning, the effects of cue pre-exposure (latent inhibition) and adaptive coding of prediction errors when rewards vary across orders of magnitude. We further postulate that orbitofrontal cortex transforms the stimulus representation through recurrent dynamics, such that a simple error-driven learning rule operating on the transformed representation can implement the Bayesian reinforcement learning update.


eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Hideyuki Matsumoto ◽  
Ju Tian ◽  
Naoshige Uchida ◽  
Mitsuko Watabe-Uchida

Dopamine is thought to regulate learning from appetitive and aversive events. Here we examined how optogenetically identified dopamine neurons in the lateral ventral tegmental area of mice respond to aversive events in different conditions. In low reward contexts, most dopamine neurons were exclusively inhibited by aversive events, and expectation reduced dopamine neurons’ responses to reward and punishment. When a single odor predicted both reward and punishment, dopamine neurons’ responses to that odor reflected the integrated value of both outcomes. Thus, in low reward contexts, dopamine neurons signal value prediction errors (VPEs) integrating information about both reward and aversion in a common currency. In contrast, in high reward contexts, dopamine neurons acquired a short-latency excitation to aversive events that masked their VPE signaling. Our results demonstrate the importance of considering context when examining representations in dopamine neurons, and uncover different modes of dopamine signaling, each of which may be adaptive for different environments.


2017 ◽  
Vol 129 ◽  
pp. 265-272 ◽  
Author(s):  
Chad C. Williams ◽  
Cameron D. Hassall ◽  
Robert Trska ◽  
Clay B. Holroyd ◽  
Olave E. Krigolson
