Timing in reward and decision processes

2014 ◽  
Vol 369 (1637) ◽  
pp. 20120468 ◽  
Author(s):  
Maria A. Bermudez ◽  
Wolfram Schultz

Sensitivity to time, including the time of reward, guides the behaviour of all organisms. Recent research suggests that all major reward structures of the brain process the time of reward occurrence, including midbrain dopamine neurons, striatum, frontal cortex and amygdala. Neuronal reward responses in dopamine neurons, striatum and frontal cortex show temporal discounting of reward value. The prediction error signal of dopamine neurons includes the predicted time of rewards. Neurons in the striatum, frontal cortex and amygdala show responses to reward delivery and activities anticipating rewards that are sensitive to the predicted time of reward and the instantaneous reward probability. Together these data suggest that internal timing processes have several well characterized effects on neuronal reward processing.
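The temporal discounting described above can be illustrated with two standard discounting models. This is a generic sketch with assumed parameters, not a function fitted to the neuronal data in the article:

```python
# Generic sketch of temporal discounting (assumed parameters, not values
# fitted to neuronal data): the subjective value of a reward falls as its
# delay grows, here under exponential and hyperbolic discounting.

def exponential_value(reward, delay_s, gamma=0.8):
    """Value of `reward` delivered after `delay_s` seconds; gamma per second."""
    return reward * gamma ** delay_s

def hyperbolic_value(reward, delay_s, k=1.0):
    """Hyperbolic alternative often used to fit behavioral discounting."""
    return reward / (1.0 + k * delay_s)

for delay in (0, 2, 4, 8):
    print(delay, round(exponential_value(1.0, delay), 3),
          round(hyperbolic_value(1.0, delay), 3))
```

Both forms decline monotonically with delay; they differ in how steeply value drops at short versus long delays, which is what behavioral experiments discriminate.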

2021 ◽  
Author(s):  
Gaurav Bedse ◽  
Nathan D. Winters ◽  
Anastasia Astafyev ◽  
Toni A. Patrick ◽  
Vikrant R. Mahajan ◽  
...  

Alcohol use disorder (AUD) is associated with substantial morbidity, mortality, and societal cost, and pharmacological treatment options for AUD are limited. The endogenous cannabinoid (eCB) signaling system is critically involved in reward processing, and alcohol intake is positively correlated with release of the eCB ligand 2-arachidonoylglycerol (2-AG) within reward neurocircuitry. Here we show that genetic and pharmacological inhibition of diacylglycerol lipase (DAGL), the rate-limiting enzyme in the synthesis of 2-AG, reduces alcohol consumption in a variety of preclinical models, ranging from a voluntary free-access model to aversion-resistant drinking and dependence-like drinking induced via chronic intermittent ethanol vapor exposure in mice. DAGL inhibition also prevented ethanol-induced suppression of GABAergic transmission onto midbrain dopamine neurons, providing mechanistic insight into how DAGL inhibition could affect alcohol reward. Lastly, DAGL inhibition during either chronic alcohol consumption or protracted withdrawal was devoid of anxiogenic and depressive-like behavioral effects. These data suggest that reducing 2-AG signaling via inhibition of DAGL could represent a novel approach to reducing alcohol consumption across the spectrum of AUD severity.


2019 ◽  
Author(s):  
Takanori Matsubara ◽  
Takayuki Yanagida ◽  
Noriaki Kawaguchi ◽  
Takashi Nakano ◽  
Junichiro Yoshimoto ◽  
...  

Scintillators emit visible luminescence when irradiated with X-rays. Given the unlimited tissue penetration of X-rays, the employment of scintillators could enable remote optogenetic control of neural functions at any depth of the brain. Here we show that a yellow-emitting inorganic scintillator, Ce-doped Gd3(Al,Ga)5O12 (Ce:GAGG), could effectively activate red-shifted excitatory and inhibitory opsins, ChRmine and GtACR1, respectively. Using injectable Ce:GAGG microparticles, we successfully activated and inhibited midbrain dopamine neurons in freely moving mice by X-ray irradiation, producing bidirectional modulation of place preference behavior. Ce:GAGG microparticles were non-cytotoxic and biocompatible, allowing for chronic implantation. Pulsed X-ray irradiation at a clinical dose level was sufficient to elicit behavioral changes without reducing the number of radiosensitive cells in the brain and bone marrow. Thus, scintillator-mediated optogenetics enables less invasive, wireless control of cellular functions at any tissue depth in living animals, expanding X-ray applications to functional studies of biology and medicine.


2004 ◽  
Vol 92 (4) ◽  
pp. 2520-2529 ◽  
Author(s):  
Yoriko Takikawa ◽  
Reiko Kawagoe ◽  
Okihide Hikosaka

Dopamine (DA) neurons respond to sensory stimuli that predict reward. To understand how DA neurons acquire such ability, we trained monkeys on a one-direction-rewarded version of the memory-guided saccade task (1DR) only when we recorded from single DA neurons. In 1DR, position-reward mapping was changed across blocks of trials. In the early stage of training of 1DR, DA neurons responded to reward delivery; in the later stages, they responded predominantly, and differentially, to the visual cue that predicted reward or no reward (the reward predictor). We found that such a shift of activity from reward to reward predictor also occurred within a block of trials after position-reward mapping was altered. A main effect of long-term training was to accelerate the within-block reward-to-predictor shift of DA neuronal responses. The within-block shift appeared first in the intermediate stage, but was slow, and DA neurons often responded to the cue that indicated reward in the preceding block. In the advanced stage, the reward-to-predictor shift occurred quickly such that the DA neurons' responses to visual cues faithfully matched the current position-reward mapping. Changes in the DA neuronal responses co-varied with the reward-predictive differentiation of saccade latency in both short-term (within-block) and long-term adaptation. DA neurons' response to the fixation point also underwent long-term changes until it occurred predominantly in the first trial within a block. This might trigger a switch between the learned sets. These results suggest that midbrain DA neurons play an essential role in adapting oculomotor behavior to frequent switches in position-reward mapping.


2019 ◽  
Author(s):  
Ivan Trujillo-Pisanty ◽  
Kent Conover ◽  
Pavel Solis ◽  
Daniel Palacios ◽  
Peter Shizgal

The neurobiological study of reward was launched by the discovery of intracranial self-stimulation (ICSS). Subsequent investigation of this phenomenon provided the initial link between reward-seeking behavior and dopaminergic neurotransmission. We re-evaluated this relationship by psychophysical, pharmacological, optogenetic, and computational means. In rats working for direct, optical activation of midbrain dopamine neurons, we varied the strength and opportunity cost of the stimulation and measured time allocation, the proportion of trial time devoted to reward pursuit. We found that the dependence of time allocation on the strength and cost of stimulation was formally similar to that observed when electrical stimulation of the medial forebrain bundle served as the reward. When the stimulation is strong and cheap, the rats devote almost all their time to reward pursuit; time allocation falls off as stimulation strength is decreased and/or its opportunity cost is increased. A 3D plot of time allocation versus stimulation strength and cost produces a surface resembling the corner of a plateau (the “reward mountain”). We show that dopamine-transporter blockade shifts the mountain along both the strength and cost axes in rats working for optical activation of midbrain dopamine neurons. In contrast, the same drug shifted the mountain uniquely along the opportunity-cost axis when rats worked for electrical MFB stimulation in a prior study. Dopamine neurons are an obligatory stage in the dominant model of ICSS, which positions them at a key nexus in the final common path for reward seeking. This model fails to provide a cogent account for the differential effect of dopamine transporter blockade on the reward mountain.
Instead, we propose that midbrain dopamine neurons and neurons with non-dopaminergic MFB axons constitute parallel limbs of brain-reward circuitry that ultimately converge on the final common path for the evaluation and pursuit of rewards.

Author summary: To succeed in the struggle for survival and reproductive success, animals must make wise choices about which goals to pursue and how much to pay to attain them. How does the brain make such decisions and adjust behavior accordingly? An animal model that has long served to address this question entails delivery of rewarding brain stimulation. When the probe is positioned appropriately in the brain, rats will work indefatigably to trigger such stimulation. Dopamine neurons play a crucial role in this phenomenon. The dominant model of the brain circuitry responsible for the reward-seeking behavior treats these cells as a gateway through which the reward-generating brain signals must pass. Here, we challenge this idea on the basis of an experiment in which the dopamine neurons were activated selectively and directly. Mathematical modeling of the results argues for a new view of the structure of brain reward circuitry. On this view, the pathway(s) in which the dopamine neurons are embedded is one of a set of parallel channels that process reward signals in the brain. To achieve a full understanding of how goals are evaluated, selected and pursued, the full set of channels must be identified and investigated.
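The shape of the "reward mountain" can be caricatured with a toy function. The saturating reward-growth term and matching-style allocation rule below are my assumptions for illustration, not the model the authors fitted:

```python
# Toy "reward mountain" (illustrative assumptions, not the fitted model):
# reward intensity saturates with stimulation strength, payoff falls with
# opportunity cost, and time allocation follows a matching-style rule
# against a fixed value of alternative activities (u_other).

def time_allocation(strength, cost, k=1.0, u_other=0.1):
    intensity = strength ** 2 / (strength ** 2 + k ** 2)  # saturating growth
    payoff = intensity / cost                             # value per unit cost
    return payoff / (payoff + u_other)                    # share of trial time

print(round(time_allocation(10, 0.1), 2))   # strong, cheap: near-total pursuit
print(round(time_allocation(0.2, 10), 2))   # weak, costly: little pursuit
```

Plotted over a grid of strength and cost values, this surface rises steeply to a plateau at high strength and low cost, which is the qualitative picture the abstract describes.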


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Takanori Matsubara ◽  
Takayuki Yanagida ◽  
Noriaki Kawaguchi ◽  
Takashi Nakano ◽  
Junichiro Yoshimoto ◽  
...  

Scintillators emit visible luminescence when irradiated with X-rays. Given the unlimited tissue penetration of X-rays, the employment of scintillators could enable remote optogenetic control of neural functions at any depth of the brain. Here we show that a yellow-emitting inorganic scintillator, Ce-doped Gd3(Al,Ga)5O12 (Ce:GAGG), can effectively activate red-shifted excitatory and inhibitory opsins, ChRmine and GtACR1, respectively. Using injectable Ce:GAGG microparticles, we successfully activated and inhibited midbrain dopamine neurons in freely moving mice by X-ray irradiation, producing bidirectional modulation of place preference behavior. Ce:GAGG microparticles are non-cytotoxic and biocompatible, allowing for chronic implantation. Pulsed X-ray irradiation at a clinical dose level is sufficient to elicit behavioral changes without reducing the number of radiosensitive cells in the brain and bone marrow. Thus, scintillator-mediated optogenetics enables minimally invasive, wireless control of cellular functions at any tissue depth in living animals, expanding X-ray applications to functional studies of biology and medicine.


2021 ◽  
Author(s):  
Aqilah McCane ◽  
Meredyth Wegener ◽  
Mojdeh Faraji ◽  
Maria Rivera Garcia ◽  
Kathryn Wallin-Miller ◽  
...  

The neuronal underpinnings of learning cause-and-effect associations in the adolescent brain remain poorly understood. Two fundamental forms of associative learning are Pavlovian (classical) conditioning, where a stimulus is followed by an outcome, and operant (instrumental) conditioning, where the outcome is contingent on action execution. Both forms of learning, when associated with a rewarding outcome, rely on midbrain dopamine neurons in the ventral tegmental area (VTA) and substantia nigra (SN). We find that in adolescent male rats, reward-guided associative learning is encoded differently by midbrain dopamine neurons in each conditioning paradigm. Whereas simultaneously recorded VTA and SN adult neurons show a similar phasic response to reward delivery during both forms of conditioning, adolescent neurons display a muted reward response during operant conditioning but a profoundly larger reward response during Pavlovian conditioning, suggesting that adolescent neurons assign a different value to reward when it is not gated by action. The learning rates of adolescents and adults during both forms of conditioning were similar, further supporting the notion that differences in reward response in each paradigm are due to differences in motivation and are independent of state- versus action-value learning. Static characteristics of dopamine neurons, such as cell number and size, were similar in the VTA and SN, but there were age differences in baseline firing rate, stimulated release, and correlated spike activity, suggesting that the differences in reward responsiveness of adolescent dopamine neurons are not due to differences in the intrinsic properties of these neurons but to engagement of different networks.


2020 ◽  
Author(s):  
Ryunosuke Amo ◽  
Akihiro Yamanaka ◽  
Kenji F. Tanaka ◽  
Naoshige Uchida ◽  
Mitsuko Watabe-Uchida

It has been proposed that the activity of dopamine neurons approximates temporal difference (TD) prediction error, a teaching signal developed in reinforcement learning, a field of machine learning. However, whether this similarity holds true during learning remains elusive. In particular, some TD learning models predict that the error signal gradually shifts backward in time from reward delivery to a reward-predictive cue, but previous experiments failed to observe such a gradual shift in dopamine activity. Here we demonstrate conditions in which such a shift can be detected experimentally. These shared dynamics of TD error and dopamine activity narrow the gap between machine learning theory and biological brains, tightening a long-sought link.
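The gradual backward shift referred to above can be reproduced in a toy simulation. This is my sketch of tabular TD(0) with a complete serial compound stimulus representation (one state per timestep between cue and reward), not the paper's model or analysis code:

```python
# Toy TD(0) with a complete serial compound representation (my sketch):
# states 1..T tile the cue-to-reward interval; reward of 1 arrives on
# leaving state T. The pre-cue state keeps value 0 because cue onset is
# unpredictable, so the cue itself always carries a prediction error.
T, alpha, gamma = 10, 0.1, 0.95
V = [0.0] * (T + 2)          # V[0]: pre-cue state, V[T + 1]: terminal state
delta_history = []

for trial in range(2000):
    delta = [0.0] * (T + 1)
    for t in range(T + 1):
        r = 1.0 if t == T else 0.0              # reward at the final step
        delta[t] = r + gamma * V[t + 1] - V[t]
        if t > 0:                               # don't learn a pre-cue value
            V[t] += alpha * delta[t]
    delta_history.append(delta)

def peak(d):
    """Timestep of the largest TD error within one trial."""
    return d.index(max(d))

# Early: error peaks at reward time; mid-training: at intermediate
# timesteps; late: at the cue -- the gradual backward shift.
print(peak(delta_history[5]), peak(delta_history[30]), peak(delta_history[-1]))
```

Whether real dopamine activity traverses the intermediate timesteps in this way is exactly the empirical question the abstract addresses.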


2006 ◽  
Vol 18 (7) ◽  
pp. 1637-1677 ◽  
Author(s):  
Nathaniel D. Daw ◽  
Aaron C. Courville ◽  
David S. Touretzky

Although the responses of dopamine neurons in the primate midbrain are well characterized as carrying a temporal difference (TD) error signal for reward prediction, existing theories do not offer a credible account of how the brain keeps track of past sensory events that may be relevant to predicting future reward. Empirically, these shortcomings of previous theories are particularly evident in their account of experiments in which animals were exposed to variation in the timing of events. The original theories mispredicted the results of such experiments due to their use of a representational device called a tapped delay line. Here we propose that a richer understanding of history representation and a better account of these experiments can be given by considering TD algorithms for a formal setting that incorporates two features not originally considered in theories of the dopaminergic response: partial observability (a distinction between the animal's sensory experience and the true underlying state of the world) and semi-Markov dynamics (an explicit account of variation in the intervals between events). The new theory situates the dopaminergic system in a richer functional and anatomical context, since it assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data. It also accounts for a range of experimental data, including the experiments involving programmed temporal variability and other previously unmodeled dopaminergic response phenomena, which we suggest are related to subjective noise in animals' interval timing. Finally, it offers new experimental predictions and a rich theoretical framework for designing future experiments.
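The semi-Markov ingredient above, namely that the discount applied between events depends on the variable interval separating them, can be sketched as a toy update. This is my illustration of that single idea; the authors' full model also includes partial observability and statistical inference, which are omitted here, and the interval set and parameters are assumptions:

```python
import random

# Toy semi-Markov TD update (my sketch of one ingredient of the theory):
# the discount applied to the outcome depends on the dwell time tau,
# which varies from trial to trial, rather than on a fixed step count.
gamma, alpha = 0.98, 0.1
v_cue = 0.0                                # learned value at cue onset

random.seed(0)
for trial in range(5000):
    tau = random.choice([5, 10, 15])       # variable cue-to-reward interval
    delta = gamma ** tau * 1.0 - v_cue     # reward of 1 after tau timesteps
    v_cue += alpha * delta

print(round(v_cue, 3))   # hovers near the average of gamma**tau over intervals
```

Because the discount is applied per elapsed interval, the learned cue value reflects the distribution of intervals, not just their count, which is what lets such models handle programmed temporal variability.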


2001 ◽  
Vol 18 (6) ◽  
pp. 649-663 ◽  
Author(s):  
Åsa Wallén ◽  
Diogo S. Castro ◽  
Rolf H. Zetterström ◽  
Mattias Karlén ◽  
Lars Olson ◽  
...  

2001 ◽  
Vol 13 (4) ◽  
pp. 841-862 ◽  
Author(s):  
Roland E. Suri ◽  
Wolfram Schultz

Anticipatory neural activity preceding behaviorally important events has been reported in cortex, striatum, and midbrain dopamine neurons. Whereas dopamine neurons are phasically activated by reward-predictive stimuli, anticipatory activity of cortical and striatal neurons is increased during delay periods before important events. Characteristics of dopamine neuron activity resemble those of the prediction error signal of the temporal difference (TD) model of Pavlovian learning (Sutton & Barto, 1990). This study demonstrates that the prediction signal of the TD model reproduces characteristics of cortical and striatal anticipatory neural activity. This finding suggests that tonic anticipatory activities may reflect prediction signals that are involved in the processing of dopamine neuron activity.
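The distinction drawn above between the TD model's prediction signal and its error signal can be made concrete with a toy trained model. This is my sketch with assumed parameters, not the study's implementation: the prediction V(t) ramps up through the delay (resembling anticipatory cortical/striatal activity), while the error delta(t) is phasic at the cue (resembling dopamine responses):

```python
# Toy trained TD model of one cue -> delay -> reward trial (my sketch):
# V(t) holds the converged value predictions; reward of 1 arrives after
# the last delay timestep, so V(t) = gamma ** (T - 1 - t) ramps upward.
gamma, T = 0.9, 10
V = [gamma ** (T - 1 - t) for t in range(T)]   # sustained, ramping prediction

delta = [V[0]]                                 # cue onset: prediction jumps from ~0
for t in range(T - 1):
    delta.append(gamma * V[t + 1] - V[t])      # within the delay: errors ~0
delta.append(1.0 - V[T - 1])                   # reward arrival: fully predicted, ~0

print([round(v, 2) for v in V])                # tonic ramp toward reward
print([round(d, 2) for d in delta])            # single phasic burst at the cue
```

The same machinery thus yields two signatures: a tonic, ramping prediction during the delay and a phasic error confined to the unpredicted cue, mirroring the correspondence the abstract proposes.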

