Dopamine selectively remediates ‘model-based’ reward learning: a computational approach

Theories of neural replay propose that it supports a range of different functions, most prominently planning and memory maintenance. Here, we test the hypothesis that distinct replay signatures relate to planning and memory maintenance. Our reward learning task required human participants to utilize structure knowledge for 'model-based' evaluation, while maintaining knowledge for two independent and randomly alternating task environments. Using magnetoencephalography (MEG) and multivariate analysis, we found neural evidence for compressed forward replay during planning and backward replay following reward feedback. Prospective replay strength was enhanced for the current environment when the benefits of a model-based planning strategy were higher. Following reward receipt, backward replay for the alternative, distal environment was enhanced as a function of decreasing recency of experience for that environment. Consistent with a memory maintenance role, stronger maintenance-related replay was associated with a modulation of subsequent choices. These findings identify distinct replay signatures consistent with key theoretical proposals on planning and memory maintenance functions, with their relative strength modulated by on-going computational and task demands.

Humans are primarily model-based learners in the two-stage task

10.1101/682922 ◽

2019 ◽

Cited By ~ 7

Author(s):

Carolina Feher da Silva ◽

Todd A. Hare

Keyword(s):

Simple Model ◽

Habit Formation ◽

Learning Processes ◽

Reward Learning ◽

Learning Models ◽

Two Stage ◽

Model Based ◽

Model Free ◽

Task Instructions ◽

Versus Model

Abstract: Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviours. Data from two-stage decision tasks have seemingly shown that human behaviour is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that have little, if any, simple model-free influence. We also demonstrate that behaviour in the two-stage task may falsely appear to be driven by a combination of simple model-free and model-based learning if purely model-based agents form inaccurate models of the task because of misconceptions. Furthermore, we report evidence that many participants do misconceive the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behaviour in the two-stage task and connections between reward learning, habit formation, and compulsivity.
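
The contrast between simple model-free and model-based learning in a two-stage task can be illustrated with a minimal sketch. It is not the authors' model: the 0.7/0.3 transition probabilities, the learning rate `alpha`, the function names, and the use of a single value per second-stage state are all assumptions made for illustration. The sketch follows one trial in which a rare transition is rewarded, which is where the two learners are expected to disagree.

```python
import numpy as np

# Assumed transition model (not given in the abstract):
# P[a, s] = probability that first-stage action a leads to second-stage state s.
P = np.array([[0.7, 0.3],
              [0.3, 0.7]])
alpha = 0.5  # hypothetical learning rate


def model_free_update(q_mf, action, reward):
    """Simple model-free rule: reinforce the chosen first-stage action directly."""
    q = q_mf.copy()
    q[action] += alpha * (reward - q[action])
    return q


def model_based_values(state_values):
    """Model-based rule: expected second-stage value under the transition model."""
    return P @ state_values


# One trial: choose action 0, reach state 1 through a rare transition, get rewarded.
q_mf = model_free_update(np.zeros(2), action=0, reward=1.0)

state_values = np.zeros(2)
state_values[1] += alpha * (1.0 - state_values[1])  # state 1 paid off
q_mb = model_based_values(state_values)

print("model-free first-stage values: ", q_mf)
print("model-based first-stage values:", q_mb)
```

Under these assumptions the model-free learner raises the value of the action it just took, while the model-based learner raises the value of the other action, because that action is the more likely route to the rewarded state. An agent whose internal `P` is wrong would produce a different pattern, which is the kind of confound the abstract describes.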

A Novel Indoor Path Loss Model based on Path Loss Exponent (PLE) Computational Approach

Journal of Advanced Research in Dynamical and Control Systems ◽

10.5373/jardcs/v12sp7/20202217 ◽

2020 ◽

Vol 12 (SP7) ◽

pp. 1170-1178

Author(s):

Vijay Rayar

Keyword(s):

Path Loss ◽

Computational Approach ◽

Loss Model ◽

Path Loss Model ◽

Path Loss Exponent ◽

Model Based

Model-Based fMRI and Its Application to Reward Learning and Decision Making

Annals of the New York Academy of Sciences ◽

10.1196/annals.1390.022 ◽

2007 ◽

Vol 1104 (1) ◽

pp. 35-53 ◽

Cited By ~ 267

Author(s):

J. P. O'DOHERTY ◽

A. HAMPTON ◽

H. KIM

Keyword(s):

Decision Making ◽

Reward Learning ◽

Model Based

Mechanobiology of soft skeletal tissue differentiation - a computational approach of a fiber-reinforced poroelastic model based on homogeneous and isotropic simplifications

Biomechanics and Modeling in Mechanobiology ◽

10.1007/s10237-003-0030-7 ◽

2003 ◽

Vol 2 (2) ◽

pp. 83-96 ◽

Cited By ~ 29

Author(s):

E. G. Loboa ◽

T. A. L. Wren ◽

G. S. Beaupré ◽

D. R. Carter

Keyword(s):

Computational Approach ◽

Tissue Differentiation ◽

Fiber Reinforced ◽

Skeletal Tissue ◽

Model Based ◽

Poroelastic Model

Model-based and model-free Pavlovian reward learning: Revaluation, revision, and revelation

Cognitive Affective & Behavioral Neuroscience ◽

10.3758/s13415-014-0277-8 ◽

2014 ◽

Vol 14 (2) ◽

pp. 473-492 ◽

Cited By ~ 165

Author(s):

Peter Dayan ◽

Kent C. Berridge

Keyword(s):

Reward Learning ◽

Model Based ◽

Model Free

A predictive model based on a 3-D computational approach for film cooling effectiveness over a flat plate using GMDH-type neural networks

Heat and Mass Transfer ◽

10.1007/s00231-013-1239-3 ◽

2013 ◽

Vol 50 (1) ◽

pp. 139-149 ◽

Cited By ~ 9

Author(s):

M. Naghashnejad ◽

N. Amanifard ◽

H. M. Deylami

Keyword(s):

Neural Networks ◽

Flat Plate ◽

Predictive Model ◽

Film Cooling ◽

Computational Approach ◽

Cooling Effectiveness ◽

Film Cooling Effectiveness ◽

Model Based

The successor representation in human reinforcement learning

10.1101/083824 ◽

2016 ◽

Cited By ~ 9

Author(s):

I Momennejad ◽

EM Russek ◽

JH Cheong ◽

MM Botvinick ◽

ND Daw ◽

...

Keyword(s):

Reinforcement Learning ◽

Choice Model ◽

Differential Sensitivity ◽

Reward Learning ◽

Model Based ◽

Model Free ◽

Sequence Of Events ◽

Behavioral Studies ◽

Future Events ◽

Unique Signature

Abstract: Theories of reward learning in neuroscience have focused on two families of algorithms, thought to capture deliberative vs. habitual choice. “Model-based” algorithms compute the value of candidate actions from scratch, whereas “model-free” algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation (SR), which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. SR’s reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task’s sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioral studies with humans. These results suggest that the SR is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
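
The differential sensitivity described in this abstract can be sketched numerically. The example below is an illustration under stated assumptions, not the task or model used in the study: the four-state chain, the discount factor `gamma`, and the helper `successor_matrix` are hypothetical, and the SR is computed in closed form as M = (I - gamma T)^-1 rather than learned from experience. State values are then V = M r.

```python
import numpy as np


def successor_matrix(T, gamma):
    """Closed-form SR for a fixed transition matrix T: M = (I - gamma * T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)


gamma = 0.9  # hypothetical discount factor

# Hypothetical environment: 0 -> 1 -> 2, with states 2 and 3 terminal (zero rows).
T_old = np.array([[0., 1., 0., 0.],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 0.],
                  [0., 0., 0., 0.]])
r_old = np.array([0., 0., 1., 0.])

M_cached = successor_matrix(T_old, gamma)  # stored predictions about future states
v_start = (M_cached @ r_old)[0]

# Reward change: the cached SR adapts at once, because V = M r uses the new r.
r_new = np.array([0., 0., 2., 0.])
v_reward_change = (M_cached @ r_new)[0]

# Transition change: state 1 now leads to the unrewarded terminal state 3.
T_new = T_old.copy()
T_new[1] = [0., 0., 0., 1.]
v_stale = (M_cached @ r_old)[0]                       # cached SR, not yet relearned
v_replanned = (successor_matrix(T_new, gamma) @ r_old)[0]  # full recomputation

print(v_start, v_reward_change, v_stale, v_replanned)
```

In this sketch the start-state value doubles as soon as the reward doubles, but after the transition change the cached SR still reports the old value until `M_cached` is relearned, whereas recomputing from the new transitions gives zero. That asymmetry is the signature the abstract attributes to the SR.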
