Phasic Activation of Ventral Tegmental, but not Substantia Nigra, Dopamine Neurons Promotes Model-Based Pavlovian Reward Learning

2017
Author(s):  
R. Keiflin ◽  
H.J. Pribut ◽  
N.B. Shah ◽  
P.H. Janak

ABSTRACT
Dopamine (DA) neurons in the ventral tegmental area (VTA) and substantia nigra (SNc) encode reward prediction errors (RPEs) and are proposed to mediate error-driven learning. However, the learning strategy engaged by DA-RPEs remains controversial. Model-free associations imbue cues/actions with pure value, independently of representations of their associated outcome. In contrast, model-based associations support detailed representations of anticipated outcomes. Here we show that although both VTA and SNc DA neuron activation reinforces instrumental responding, only VTA DA neuron activation during consumption of expected sucrose reward restores error-driven learning and promotes formation of a new cue→sucrose association. Critically, expression of VTA DA-dependent Pavlovian associations is abolished following sucrose devaluation, a signature of model-based learning. These findings reveal that activation of VTA- or SNc-DA neurons engages largely dissociable learning processes, with VTA-DA neurons capable of participating in model-based predictive learning, while the role of SNc-DA neurons appears limited to reinforcement of instrumental responses.
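The RPE signal this abstract attributes to midbrain DA neurons is conventionally formalized as a temporal-difference error, delta = r + gamma*V(s') - V(s). The sketch below is a generic illustration of that update; the state labels and parameter values are assumptions for the example, not taken from the study.

```python
import numpy as np

def td_update(V, s, s_next, r, alpha=0.1, gamma=0.95):
    """One temporal-difference update; returns the RPE (delta)."""
    # delta is positive when the outcome is better than predicted,
    # negative when worse -- the putative DA teaching signal.
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

V = np.zeros(2)                   # values: state 0 = cue, state 1 = outcome
rpe = td_update(V, 0, 1, r=1.0)   # unexpected reward -> large positive RPE
```

After repeated cue→reward pairings the cue's value rises and the RPE shrinks, which is the error-driven learning the study probes by stimulating DA neurons at reward delivery.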

2018
Vol 80 (1)
pp. 219-241
Author(s):  
Stephanie C. Gantz ◽  
Christopher P. Ford ◽  
Hitoshi Morikawa ◽  
John T. Williams

2019
Author(s):  
Allison Letkiewicz ◽  
Amy L. Cochran ◽  
Josh M. Cisler

Trauma and trauma-related disorders are characterized by altered learning styles. Two learning processes that have been delineated using computational modeling are model-free and model-based reinforcement learning (RL), characterized by trial and error and goal-driven, rule-based learning, respectively. Prior research suggests that model-free RL is disrupted among individuals with a history of assaultive trauma and may contribute to altered fear responding. Currently, it is unclear whether model-based RL, which involves building abstract and nuanced representations of stimulus-outcome relationships to prospectively predict action-related outcomes, is also impaired among individuals who have experienced trauma. The present study sought to test the hypothesis of impaired model-based RL among adolescent females exposed to assaultive trauma. Participants (n=60) completed a three-arm bandit RL task during fMRI acquisition. Two computational models compared the degree to which each participant’s task behavior fit the use of a model-free versus model-based RL strategy. Overall, a greater portion of participants’ behavior was better captured by the model-based than model-free RL model. Although assaultive trauma did not predict learning strategy use, greater sexual abuse severity predicted less use of model-based compared to model-free RL. Additionally, severe sexual abuse predicted less left frontoparietal network encoding of model-based RL updates, which was not accounted for by PTSD. Given the significant impact that sexual trauma has on mental health and other aspects of functioning, it is plausible that altered model-based RL is an important route through which clinical impairment emerges.
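The model-free versus model-based distinction described above can be made concrete on a bandit task like the one used in the study. The sketch below is an illustrative contrast under assumed parameters, not the study's fitted computational models: the model-free learner caches action values directly from prediction errors, while the model-based learner maintains an explicit estimate of each arm's outcome probability and derives values from that internal model.

```python
import numpy as np

# Model-free: update a cached action value from the reward prediction error.
def model_free_update(Q, a, r, alpha=0.2):
    Q[a] += alpha * (r - Q[a])
    return Q

# Model-based: estimate each arm's win probability (a simple world model,
# Laplace-smoothed) and compute values prospectively from it.
def model_based_values(wins, pulls, reward=1.0):
    p_win = (wins + 1) / (pulls + 2)
    return p_win * reward

Q = np.zeros(3)
Q = model_free_update(Q, a=1, r=1.0)        # arm 1 paid off -> Q[1] rises

wins = np.array([0.0, 3.0, 1.0])            # observed outcomes per arm
pulls = np.array([2.0, 4.0, 2.0])
V = model_based_values(wins, pulls)          # values derived from the model
```

Fitting both learners to choice data and comparing their fit, as the study does, quantifies how much of a participant's behavior each strategy explains.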


2020
Author(s):  
Dongjae Kim ◽  
Jaeseung Jeong ◽  
Sang Wan Lee

Abstract
The goal of learning is to maximize future rewards by minimizing prediction errors. Evidence has shown that the brain achieves this by combining model-based and model-free learning. However, prediction error minimization is challenged by a bias-variance tradeoff, which imposes constraints on each strategy's performance. We provide new theoretical insight into how this tradeoff can be resolved through the adaptive control of model-based and model-free learning. The theory predicts that baseline correction of the prediction error reduces the lower bound of the bias-variance error by factoring out irreducible noise. Using a Markov decision task with context changes, we found behavioral evidence of adaptive control. Model-based behavioral analyses show that the prediction error baseline signals context changes to improve adaptability. Critically, the neural results support this view, demonstrating multiplexed representations of the prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free learning.
One-sentence summary: A theoretical, behavioral, computational, and neural account of how the brain resolves the bias-variance tradeoff during reinforcement learning.
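The baseline-correction idea in this abstract can be illustrated with a toy example, assuming a simple running-average baseline (the parameters and the constant-offset scenario below are illustrative, not the paper's model): subtracting a slowly tracked baseline from raw prediction errors factors out a constant component, such as an offset introduced by a context change.

```python
def baseline_corrected_errors(errors, beta=0.1):
    """Subtract a running-average baseline from each raw prediction error."""
    baseline, corrected = 0.0, []
    for e in errors:
        corrected.append(e - baseline)     # error relative to current baseline
        baseline += beta * (e - baseline)  # baseline slowly tracks the errors
    return corrected, baseline

# A context change adds a constant +2 offset to every raw error; the
# baseline absorbs it, so corrected errors decay back toward zero.
raw = [2.0] * 50
corrected, b = baseline_corrected_errors(raw)
```

In this scenario the first corrected error equals the raw offset, while later corrected errors approach zero as the baseline converges to the offset, which is the sense in which the baseline "factors out" a systematic component of the error.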


Author(s):  
Tomohiro Yamaguchi ◽  
Shota Nagahama ◽  
Yoshihiro Ichikawa ◽  
Yoshimichi Honma ◽  
Keiki Takadama

This chapter describes solving multi-objective reinforcement learning (MORL) problems in which there are multiple conflicting objectives with unknown weights. Previous model-free MORL methods require a large number of calculations to collect a Pareto optimal set for each V/Q-value vector. In contrast, model-based MORL can reduce this calculation cost compared with model-free MORL. However, the previous model-based MORL method applies only to deterministic environments. To address these limitations, this chapter proposes a novel model-based MORL method based on a reward occurrence probability (ROP) vector with unknown weights. Experimental results are reported for stochastic learning environments with up to 10 states, 3 actions, and 3 reward rules. The results show that the proposed method collects all Pareto optimal policies, with a total learning time of about 214 seconds (10 states, 3 actions, 3 rewards). As future research directions, ways to speed up the method and how to use non-optimal policies are discussed.
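The Pareto optimal set central to this chapter can be sketched with a generic dominance filter over policy value vectors (one value per objective). The value vectors below are made-up illustrations, and this is the standard Pareto-dominance definition rather than the chapter's ROP-based algorithm.

```python
def pareto_front(vectors):
    """Keep vectors not dominated by any other.

    A vector w dominates v if w >= v on every objective and w > v on at
    least one objective.
    """
    front = []
    for i, v in enumerate(vectors):
        dominated = any(
            all(w[k] >= v[k] for k in range(len(v))) and
            any(w[k] > v[k] for k in range(len(v)))
            for j, w in enumerate(vectors) if j != i
        )
        if not dominated:
            front.append(v)
    return front

# Hypothetical two-objective policy values; (0.5, 0.5) and (1.0, 0.1)
# are dominated and drop out of the front.
policies = [(1.0, 0.2), (0.8, 0.9), (0.5, 0.5), (1.0, 0.1)]
front = pareto_front(policies)
```

With unknown objective weights, every policy on this front is potentially optimal for some weighting, which is why MORL methods aim to collect the full set rather than a single policy.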


2007
Vol 98 (6)
pp. 3388-3396
Author(s):  
J. Russel Keath ◽  
Michael P. Iacoviello ◽  
Lindy E. Barrett ◽  
Huibert D. Mansvelder ◽  
Daniel S. McGehee

Midbrain dopamine (DA) neurons are found in two nuclei, the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA). The SNc dopaminergic projections to the dorsal striatum are involved in voluntary movement and habit learning, whereas the VTA projections to the ventral striatum contribute to reward and motivation. Nicotine induces profound DA release from VTA dopamine neurons but substantially less from the SNc. Nicotinic acetylcholine receptor (nAChR) expression differs between these nuclei, but it is unknown whether there are differences in nAChR expression on the afferent projections to these nuclei. Here we have compared the nicotinic modulation of excitatory and inhibitory synaptic inputs to VTA and SNc dopamine neurons. Although nicotine enhances both the excitatory and inhibitory drive to SNc DA cells with response magnitudes similar to those seen in the VTA, the prevalence of these responses in SNc is much lower. We also found that a mixture of nAChR subtypes underlies the synaptic modulation in SNc, further distinguishing this nucleus from the VTA, where α7 nAChRs enhance glutamate inputs and non-α7 receptors enhance GABA inputs. Finally, we compared the nicotine sensitivity of DA neurons in these two nuclei and found larger response magnitudes in VTA relative to SNc. Thus the observed differences in nicotine-induced DA release from VTA and SNc are likely due to differences in nAChR expression on the afferent inputs as well as on the DA neurons themselves. This may explain why nicotine has a greater effect on behaviors associated with the VTA than the SNc.


2012
Vol 33 (3)
pp. 429-435
Author(s):  
Adam C. Munhall ◽  
Yan-Na Wu ◽  
John K. Belknap ◽  
Charles K. Meshul ◽  
Steven W. Johnson
