Causal evidence supporting the proposal that dopamine transients function as a temporal difference prediction error

2019 ◽  
Author(s):  
Etienne JP Maes ◽  
Melissa J Sharpe ◽  
Matthew P.H. Gardner ◽  
Chun Yun Chang ◽  
Geoffrey Schoenbaum ◽  
...  

Reward-evoked dopamine is well established as a prediction error. However, the central tenet of temporal difference accounts – that similar transients evoked by reward-predictive cues also function as errors – remains untested. To address this, we used two phenomena, second-order conditioning and blocking, to examine the role of dopamine in prediction error versus reward prediction. We show that optogenetically shunting dopamine activity at the start of a reward-predicting cue prevents second-order conditioning without affecting blocking. These results support temporal difference accounts by providing causal evidence that cue-evoked dopamine transients function as prediction errors.
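The temporal difference account being tested can be illustrated with a minimal TD(0) simulation (an illustrative sketch with arbitrary parameters, not the authors' task or model): over training, the prediction-error transient migrates from reward delivery to cue onset, producing the cue-evoked signal that the experiment shunts.

```python
import numpy as np

# Minimal TD(0) sketch of Pavlovian conditioning (illustrative parameters,
# not the authors' task or model). States within a trial: 0 = cue onset,
# 1 = reward delivery. The cue itself arrives unpredictably, so the
# pre-cue prediction is fixed at 0.
gamma, alpha, n_trials = 0.95, 0.2, 100
V = np.zeros(2)                      # learned values of the in-trial states
cue_deltas, reward_deltas = [], []

for _ in range(n_trials):
    # TD error at cue onset: the unpredicted cue reveals its learned value
    cue_deltas.append(gamma * V[0] - 0.0)
    # TD error moving from cue to reward delivery (no reward yet, r = 0)
    V[0] += alpha * (0.0 + gamma * V[1] - V[0])
    # TD error at reward delivery (r = 1, terminal state has value 0)
    delta_r = 1.0 + gamma * 0.0 - V[1]
    V[1] += alpha * delta_r
    reward_deltas.append(delta_r)

# Early in training the error sits at reward delivery; late in training it
# has transferred to cue onset, the transient the study shunts.
```

In this toy model the cue-onset error is exactly what a second-order cue would learn from, which is why removing it should abolish second-order conditioning while leaving value-based blocking intact.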

2021 ◽  
Vol 15 ◽  
Author(s):  
Arthur Prével ◽  
Ruth M. Krebs

In a new environment, humans and animals can detect and learn that cues predict meaningful outcomes, and use this information to adapt their responses. This process is termed Pavlovian conditioning. Pavlovian conditioning is also observed for stimuli that predict outcome-associated cues; this second type of conditioning is termed higher-order Pavlovian conditioning. In this review, we focus on higher-order conditioning studies with simultaneous and backward conditioned stimuli. We examine how the results from these experiments pose a challenge to models of Pavlovian conditioning like the Temporal Difference (TD) models, in which learning is mainly driven by reward prediction errors. Contrasting with this view, the results suggest that humans and animals can form complex representations of the (temporal) structure of the task, and use this information to guide behavior, which seems consistent with model-based reinforcement learning. Future investigations involving these procedures could yield important new insights into the mechanisms that underlie Pavlovian conditioning.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Harry J. Stewardson ◽  
Thomas D. Sambrook

Reinforcement learning in humans and other animals is driven by reward prediction errors: deviations between the amount of reward or punishment initially expected and that which is obtained. Temporal difference methods of reinforcement learning generate this reward prediction error at the earliest time at which a revision in reward or punishment likelihood is signalled, for example by a conditioned stimulus. Midbrain dopamine neurons, believed to compute reward prediction errors, generate this signal in response to both conditioned and unconditioned stimuli, as predicted by temporal difference learning. Electroencephalographic recordings of human participants have suggested that a component named the feedback-related negativity (FRN) is generated when this signal is carried to the cortex. If this is so, the FRN should be expected to respond equivalently to conditioned and unconditioned stimuli. However, very few studies have attempted to measure the FRN’s response to unconditioned stimuli. The present study attempted to elicit the FRN in response to a primary aversive stimulus (electric shock) using a design that varied reward prediction error while holding physical intensity constant. The FRN was strongly elicited, but earlier and more transiently than typically seen, suggesting that it may incorporate processes other than the midbrain dopamine system.


2014 ◽  
Vol 26 (3) ◽  
pp. 467-471 ◽  
Author(s):  
Samuel J. Gershman

Temporal difference learning models of dopamine assert that phasic levels of dopamine encode a reward prediction error. However, this hypothesis has been challenged by recent observations of gradually ramping striatal dopamine levels as a goal is approached. This note describes conditions under which temporal difference learning models predict dopamine ramping. The key idea is representational: a quadratic transformation of proximity to the goal implies approximately linear ramping, as observed experimentally.
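The note's key idea can be checked numerically in a few lines (a sketch with assumed parameters, a unit proximity range and gamma = 1, not the note's exact formulation): with a quadratic value transform, the per-step TD error grows linearly as the goal is approached.

```python
import numpy as np

# Sketch of the representational idea: value as a quadratic transform of
# goal proximity yields a ~linear TD-error ramp. Parameters are assumptions.
gamma = 1.0
x = np.linspace(0.0, 1.0, 101)     # proximity to goal: 0 = start, 1 = goal
V = x ** 2                         # quadratic transform of proximity
delta = gamma * V[1:] - V[:-1]     # per-step TD error (no reward until goal)

# delta at proximity x is (x + dx)**2 - x**2 = 2*x*dx + dx**2, i.e. linear
# in proximity, so the modeled dopamine signal ramps linearly to the goal.
```

The algebra in the final comment is the whole argument: the difference of adjacent quadratic values is affine in proximity, matching the experimentally observed ramps.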


2020 ◽  
Author(s):  
Kate Ergo ◽  
Luna De Vilder ◽  
Esther De Loof ◽  
Tom Verguts

Recent years have witnessed a steady increase in the number of studies investigating the role of reward prediction errors (RPEs) in declarative learning. Specifically, in several experimental paradigms RPEs drive declarative learning, with larger and more positive RPEs enhancing declarative learning. However, it is unknown whether this RPE must derive from the participant’s own response, or whether instead any RPE is sufficient to obtain the learning effect. To test this, we generated RPEs in the same experimental paradigm where we combined an agency and a non-agency condition. We observed no interaction between RPE and agency, suggesting that any RPE (irrespective of its source) can drive declarative learning. This result holds implications for declarative learning theory.


Author(s):  
Michiel Van Elk ◽  
Harold Bekkering

We characterize theories of conceptual representation as embodied, disembodied, or hybrid according to their stance on a number of different dimensions: the nature of concepts, the relation between language and concepts, the function of concepts, the acquisition of concepts, the representation of concepts, and the role of context. We propose to extend an embodied view of concepts, by taking into account the importance of multimodal associations and predictive processing. We argue that concepts are dynamically acquired and updated, based on recurrent processing of prediction error signals in a hierarchically structured network. Concepts are thus used as prior models to generate multimodal expectations, thereby reducing surprise and enabling greater precision in the perception of exemplars. This view places embodied theories of concepts in a novel predictive processing framework, by highlighting the importance of concepts for prediction, learning and shaping categories on the basis of prediction errors.


2018 ◽  
Vol 72 (6) ◽  
pp. 1453-1465 ◽  
Author(s):  
Arthur Prével ◽  
Vinca Rivière ◽  
Jean-Claude Darcheville ◽  
Gonzalo P Urcelay ◽  
Ralph R Miller

Prével and colleagues reported excitatory learning with a backward conditioned stimulus (CS) in a conditioned reinforcement preparation. Their results add to existing evidence of backward CSs sometimes being excitatory and were viewed as challenging the view that learning is driven by prediction error reduction, which assumes that only predictive (i.e., forward) relationships are learned. The results instead were consistent with the assumptions of both Miller’s Temporal Coding Hypothesis and Wagner’s Sometimes Opponent Processes (SOP) model. The present experiment extended the conditioned reinforcement preparation developed by Prével et al. to a backward second-order conditioning preparation, with the aim of discriminating between these two accounts. We tested whether a second-order CS can serve as an effective conditioned reinforcer, even when the first-order CS with which it was paired is a backward CS that elicits no responding. Evidence of conditioned reinforcement was found, despite no conditioned response (CR) being elicited by the first-order backward CS. The evidence of second-order conditioning in the absence of excitatory conditioning to the first-order CS is interpreted as a challenge to SOP. In contrast, the present results are consistent with the Temporal Coding Hypothesis and constitute a conceptual replication in humans of previous reports of excitatory second-order conditioning in rodents with a backward CS. The proposal is made that learning is driven by “discrepancy” with prior experience as opposed to “prediction error.”


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Maya G. Mosner ◽  
R. Edward McLaurin ◽  
Jessica L. Kinard ◽  
Shabnam Hakimi ◽  
Jacob Parelman ◽  
...  

Few studies have explored neural mechanisms of reward learning in ASD despite behavioral evidence of impaired predictive abilities in this population. To investigate the neural correlates of reward prediction errors in ASD, 16 adults with ASD and 14 typically developing controls performed a prediction error task during fMRI scanning. Results revealed greater activation in the ASD group in the left paracingulate gyrus during signed prediction errors and the left insula and right frontal pole during thresholded unsigned prediction errors. Findings support atypical neural processing of reward prediction errors in ASD in frontostriatal regions critical for prediction coding and reward learning. Results provide a neural basis for impairments in reward learning that may contribute to traits common in ASD (e.g., intolerance of unpredictability).


Author(s):  
Joseph W. Barter ◽  
Suellen Li ◽  
Dongye Lu ◽  
Ryan A. Bartholomew ◽  
Mark A. Rossi ◽  
...  

2014 ◽  
Vol 26 (3) ◽  
pp. 447-458 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Josep Marco-Pallarés

In decision-making processes, the relevance of the information yielded by outcomes varies across time and situations. It increases when previous predictions are not accurate and in contexts with high environmental uncertainty. Previous fMRI studies have shown an important role of medial pFC in coding both reward prediction errors and the impact of this information on future decisions. However, it is unclear whether these two processes are dissociated in time or occur simultaneously, suggesting that a common mechanism is engaged. In the present work, we studied the modulation of two electrophysiological responses associated with outcome processing—the feedback-related negativity ERP and frontocentral theta oscillatory activity—with the reward prediction error and the learning rate. Twenty-six participants performed two learning tasks differing in the degree of predictability of the outcomes: a reversal learning task and a probabilistic learning task with multiple blocks of novel cue–outcome associations. We implemented a reinforcement learning model to obtain the single-trial reward prediction error and the learning rate for each participant and task. Our results indicated that midfrontal theta activity and feedback-related negativity increased linearly with the unsigned prediction error. In addition, variations of frontal theta oscillatory activity predicted the learning rate across tasks and participants. These results support the existence of a common brain mechanism for the computation of unsigned prediction error and learning rate.
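The single-trial quantities described here can be sketched with a simple delta rule (a hedged illustration; the authors' fitted model and parameter estimates differ, and the outcome probability and learning rate below are assumed values): each trial yields a signed prediction error, its absolute value is the unsigned RPE, and the learning rate scales the value update.

```python
import numpy as np

# Delta-rule sketch of the kind of reinforcement learning model used to
# derive single-trial RPEs (not the paper's actual fitted model).
rng = np.random.default_rng(0)
p_reward, alpha = 0.8, 0.3         # assumed outcome probability, learning rate
V, rpes = 0.0, []

for _ in range(300):
    outcome = float(rng.random() < p_reward)
    delta = outcome - V            # signed reward prediction error
    rpes.append(abs(delta))        # unsigned RPE, one value per trial
    V += alpha * delta             # learning rate scales the value update
```

In model fitting, alpha would be estimated per participant and task (and can itself vary trial by trial in more elaborate models), which is what allows learning rate and unsigned RPE to be related to theta activity on a single-trial basis.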

