Dopamine selectively remediates ‘model-based’ reward learning: a computational approach

Brain ◽  
2015 ◽  
Vol 139 (2) ◽  
pp. 355-364 ◽  
Author(s):  
Madeleine E. Sharp ◽  
Karin Foerde ◽  
Nathaniel D. Daw ◽  
Daphna Shohamy

2021 ◽  
Author(s):  
G. Elliott Wimmer ◽  
Yunzhe Liu ◽  
Daniel McNamee ◽  
Raymond Dolan

Theories of neural replay propose that it supports a range of different functions, most prominently planning and memory maintenance. Here, we test the hypothesis that distinct replay signatures relate to planning and memory maintenance. Our reward learning task required human participants to utilize structure knowledge for 'model-based' evaluation, while maintaining knowledge for two independent and randomly alternating task environments. Using magnetoencephalography (MEG) and multivariate analysis, we found neural evidence for compressed forward replay during planning and backward replay following reward feedback. Prospective replay strength was enhanced for the current environment when the benefits of a model-based planning strategy were higher. Following reward receipt, backward replay for the alternative, distal environment was enhanced as a function of decreasing recency of experience for that environment. Consistent with a memory maintenance role, stronger maintenance-related replay was associated with a modulation of subsequent choices. These findings identify distinct replay signatures consistent with key theoretical proposals on planning and memory maintenance functions, with their relative strength modulated by on-going computational and task demands.


2019 ◽  
Author(s):  
Carolina Feher da Silva ◽  
Todd A. Hare

Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviours. Data from two-stage decision tasks have seemingly shown that human behaviour is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that have little, if any, simple model-free influence. We also demonstrate that behaviour in the two-stage task may falsely appear to be driven by a combination of simple model-free and model-based learning if purely model-based agents form inaccurate models of the task because of misconceptions. Furthermore, we report evidence that many participants do misconceive the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behaviour in the two-stage task and connections between reward learning, habit formation, and compulsivity.
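For readers unfamiliar with the two-stage task, the following is a minimal sketch of why the two learning processes are usually thought to be separable in it. It is not the model fitted in the study above. The 0.7 common-transition probability, the learning rate, the starting values, and all function names are illustrative assumptions, and each second-stage state is simplified to a single value.

```python
# Hypothetical two-stage task: first-stage action a leads to second-stage
# state s with probability P[a][s]. The 0.7 / 0.3 split is an assumption.
P = {0: {0: 0.7, 1: 0.3},
     1: {0: 0.3, 1: 0.7}}
ALPHA = 0.5  # assumed learning rate


def model_based_values(q2):
    """First-stage values computed from the transition model and stage-2 values."""
    return {a: sum(P[a][s] * q2[s] for s in P[a]) for a in P}


def model_free_update(q1, action, reward):
    """Simple model-free update: credit the chosen first-stage action directly."""
    updated = dict(q1)
    updated[action] += ALPHA * (reward - updated[action])
    return updated


# One trial: choose action 0, land in state 1 (a rare transition), get reward 1.
q1 = {0: 0.5, 1: 0.5}   # cached first-stage values (model-free)
q2 = {0: 0.5, 1: 0.5}   # second-stage state values (simplified to one per state)
action, state, reward = 0, 1, 1.0

q2[state] += ALPHA * (reward - q2[state])   # both learners update stage 2
mf = model_free_update(q1, action, reward)  # model-free: action 0 looks better
mb = model_based_values(q2)                 # model-based: action 1 looks better

mf_choice = max(mf, key=mf.get)
mb_choice = max(mb, key=mb.get)
print("model-free prefers action", mf_choice)
print("model-based prefers action", mb_choice)
```

Under these assumptions, a rewarded rare transition makes the model-free learner repeat its choice, while the model-based learner switches to the action that more often reaches the rewarded state. The abstract's point is that this neat separation can break down when participants hold an inaccurate model of the task.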


2007 ◽  
Vol 1104 (1) ◽  
pp. 35-53 ◽  
Author(s):  
J. P. O'DOHERTY ◽  
A. HAMPTON ◽  
H. KIM

2016 ◽  
Author(s):  
I Momennejad ◽  
EM Russek ◽  
JH Cheong ◽  
MM Botvinick ◽  
ND Daw ◽  
...  

Theories of reward learning in neuroscience have focused on two families of algorithms, thought to capture deliberative vs. habitual choice. “Model-based” algorithms compute the value of candidate actions from scratch, whereas “model-free” algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation (SR), which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. SR’s reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task’s sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioral studies with humans. These results suggest that the SR is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
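The differential sensitivity described above can be illustrated with a small sketch. It assumes the standard textbook form of the SR, in which a matrix M of discounted expected future state occupancies is cached and values are read out as V = M r. The six-state chain, the reward numbers, the discount of 0.5, and all function names are hypothetical choices for illustration, not the design or model used in the study.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(edges, n):
    """Deterministic transitions: edges maps a source state to its successor."""
    T = [[0.0] * n for _ in range(n)]
    for src, dst in edges.items():
        T[src][dst] = 1.0
    return T


def successor_matrix(T, gamma, horizon):
    """Truncated SR: M = sum over t = 0..horizon of gamma^t * T^t."""
    n = len(T)
    M = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    discount = 1.0
    for _ in range(horizon + 1):
        for i in range(n):
            for j in range(n):
                M[i][j] += discount * power[i][j]
        power = matmul(power, T)
        discount *= gamma
    return M


def values(M, r):
    """State values read out from the cached SR: V = M r."""
    return [sum(m * x for m, x in zip(row, r)) for row in M]


GAMMA, N = 0.5, 6
# Two hypothetical chains: 0 -> 2 -> 4 and 1 -> 3 -> 5 (states 4 and 5 are terminal).
T_old = transition_matrix({0: 2, 1: 3, 2: 4, 3: 5}, N)
M_cached = successor_matrix(T_old, GAMMA, horizon=5)

r_old = [0, 0, 0, 0, 10, 1]
v_before = values(M_cached, r_old)          # start state 0 is worth more

# Reward revaluation: rewards swap, and the cached SR still gives correct values.
r_new = [0, 0, 0, 0, 1, 10]
v_reward_reval = values(M_cached, r_new)    # start state 1 is now worth more

# Transition revaluation: middle states swap successors, but the cached SR is stale.
T_new = transition_matrix({0: 2, 1: 3, 2: 5, 3: 4}, N)
v_sr_stale = values(M_cached, r_old)                               # unchanged
v_model_based = values(successor_matrix(T_new, GAMMA, 5), r_old)   # recomputed

print(v_before[:2], v_reward_reval[:2], v_sr_stale[:2], v_model_based[:2])
```

In this toy setting the cached matrix adapts immediately to a change in rewards, but it keeps recommending the old start state after the transitions change, until M itself is relearned. A learner that recomputes values from the new transition structure does not make this error, which is the kind of behavioral signature the abstract describes.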

