The contribution of striatal pseudo-reward prediction errors to value-based decision-making

Mapping Intimacies ◽

10.1101/097873 ◽

2017 ◽

Author(s):

Ernest Mas-Herrero ◽

Guillaume Sescousse ◽

Roshan Cools ◽

Josep Marco-Pallarés

Keyword(s):

Decision Making ◽

Choice Behavior ◽

Prediction Errors ◽

Life Outcomes ◽

Stimulus Response ◽

Hierarchical Reinforcement Learning ◽

Reward Prediction ◽

Fmri Study ◽

Reward Contingencies ◽

The Brain

Abstract: Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. In such scenarios, parallel learning systems are important to reduce the complexity of the learning problem, as proposed in the framework of hierarchical reinforcement learning (HRL). One of the key features of HRL is the computation of pseudo-reward prediction errors (PRPEs) which allow the reinforcement of actions that led to a sub-goal before the final goal itself is achieved. Here we wanted to test the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might generate a bias in choice behavior when reward contingencies are not well-known or uncertain. Second, we also hypothesized that this bias might be related to the strength of PRPE striatal representations. In order to test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioural study: n = 19). Our results show that overall participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.

Effects of affective arousal on choice behavior, reward prediction errors, and feedback-related negativities in human reward-based decision making

Frontiers in Psychology ◽

10.3389/fpsyg.2015.00592 ◽

2015 ◽

Vol 6 ◽

Cited By ~ 4

Author(s):

Hong-Hsiang Liu ◽

Ming H. Hsieh ◽

Yung-Fong Hsu ◽

Wen-Sung Lai

Keyword(s):

Decision Making ◽

Choice Behavior ◽

Prediction Errors ◽

Affective Arousal ◽

Reward Prediction

Emotion prediction errors guide socially adaptive behavior

10.31234/osf.io/azeyk ◽

2021 ◽

Author(s):

Joseph Heffner ◽

Jae-Young Son ◽

Oriel FeldmanHall

Keyword(s):

Decision Making ◽

Real Time ◽

Adaptive Behavior ◽

Emotional Response ◽

New Method ◽

Prediction Errors ◽

Emotional Experiences ◽

Past Work ◽

Reward Prediction ◽

Expected Outcomes

People make decisions based on deviations from expected outcomes, known as prediction errors. Past work has focused on reward prediction errors, largely ignoring violations of expected emotional experiences (emotion prediction errors). We leverage a new method to measure real-time fluctuations in emotion as people decide to punish or forgive others. Across four studies (N=1,016), we reveal that emotion and reward prediction errors have distinguishable contributions to choice, such that emotion prediction errors exert the strongest impact during decision-making. We additionally find that a choice to punish or forgive can be decoded in less than a second from an evolving emotional response, suggesting emotions swiftly influence choice. Finally, individuals reporting significant levels of depression exhibit selective impairments in using emotion, but not reward, prediction errors. Evidence for emotion prediction errors potently guiding social behaviors challenges standard decision-making models that have focused solely on reward.

Attenuated directed exploration during reinforcement learning in gambling disorder

10.1101/823583 ◽

2019 ◽

Cited By ~ 3

Author(s):

A. Wiehler ◽

K. Chakroun ◽

J. Peters

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Gambling Disorder ◽

Brain Activity ◽

Clinical Status ◽

Classical Problem ◽

Behavioral Flexibility ◽

Network Connectivity ◽

Prediction Errors ◽

Reward Contingencies

Abstract: Gambling disorder is a behavioral addiction associated with impairments in decision-making and reduced behavioral flexibility. Decision-making in volatile environments requires a flexible trade-off between exploitation of options with high expected values and exploration of novel options to adapt to changing reward contingencies. This classical problem is known as the exploration-exploitation dilemma. We hypothesized gambling disorder to be associated with a specific reduction in directed (uncertainty-based) exploration compared to healthy controls, accompanied by changes in brain activity in a fronto-parietal exploration-related network. Twenty-three frequent gamblers and nineteen matched controls performed a classical four-armed bandit task during functional magnetic resonance imaging. Computational modeling revealed that choice behavior in both groups contained signatures of directed exploration, random exploration and perseveration. Gamblers showed a specific reduction in directed exploration, while random exploration and perseveration were similar between groups. Neuroimaging revealed no evidence for group differences in neural representations of expected value and reward prediction errors. Likewise, our hypothesis of attenuated fronto-parietal exploration effects in gambling disorder was not supported. However, during directed exploration, gamblers showed reduced parietal and substantia nigra / ventral tegmental area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of clinical status, suggesting alterations in network dynamics in gambling disorder. In sum, we show that reduced flexibility during reinforcement learning in volatile environments in gamblers is attributable to a reduction in directed exploration rather than an increase in perseveration. Neuroimaging findings suggest that patterns of network connectivity might be more diagnostic of gambling disorder than univariate value and prediction error effects. We provide a computational account of flexibility impairments in gamblers during reinforcement learning that might arise as a consequence of dopaminergic dysregulation in this disorder.

Reward activity in ventral pallidum tracks satiety-sensitive preference and drives choice behavior

Science Advances ◽

10.1126/sciadv.abc9321 ◽

2020 ◽

Vol 6 (45) ◽

pp. eabc9321

Author(s):

David J. Ottenheimer ◽

Karen Wang ◽

Xiao Tong ◽

Kurt M. Fraser ◽

Jocelyn M. Richard ◽

...

Keyword(s):

Decision Making ◽

Nervous System ◽

Prediction Error ◽

Choice Behavior ◽

Physiological State ◽

Ventral Pallidum ◽

Optogenetic Stimulation ◽

Reward Prediction ◽

Behavioral Preference ◽

Stimulation Of

A key function of the nervous system is producing adaptive behavior across changing conditions, like physiological state. Although states like thirst and hunger are known to impact decision-making, the neurobiology of this phenomenon has been studied minimally. Here, we tracked evolving preference for sucrose and water as rats proceeded from a thirsty to sated state. As rats shifted from water choices to sucrose choices across the session, the activity of a majority of neurons in the ventral pallidum, a region crucial for reward-related behaviors, closely matched the evolving behavioral preference. The timing of this signal followed the pattern of a reward prediction error, occurring at the cue or the reward depending on when reward identity was revealed. Additionally, optogenetic stimulation of ventral pallidum neurons at the time of reward was able to reverse behavioral preference. Our results suggest that ventral pallidum neurons guide reward-related decisions across changing physiological states.

Positive reward prediction errors during decision-making strengthen memory encoding

Nature Human Behaviour ◽

10.1038/s41562-019-0597-3 ◽

2019 ◽

Vol 3 (7) ◽

pp. 719-732 ◽

Cited By ~ 24

Author(s):

Anthony I. Jang ◽

Matthew R. Nassar ◽

Daniel G. Dillon ◽

Michael J. Frank

Keyword(s):

Decision Making ◽

Prediction Errors ◽

Memory Encoding ◽

Reward Prediction

Short-term mindfulness practice attenuates reward prediction errors signals in the brain

Scientific Reports ◽

10.1038/s41598-019-43474-2 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 3

Author(s):

Ulrich Kirk ◽

Giuseppe Pagnoni ◽

Sébastien Hétu ◽

Read Montague

Keyword(s):

Mindfulness Practice ◽

Prediction Errors ◽

Short Term ◽

Reward Prediction ◽

The Brain

The contribution of striatal pseudo-reward prediction errors to value-based decision-making

NeuroImage ◽

10.1016/j.neuroimage.2019.02.052 ◽

2019 ◽

Vol 193 ◽

pp. 67-74 ◽

Cited By ~ 2

Author(s):

Ernest Mas-Herrero ◽

Guillaume Sescousse ◽

Roshan Cools ◽

Josep Marco-Pallarés

Keyword(s):

Decision Making ◽

Prediction Errors ◽

Reward Prediction

Increased fronto-striatal reward prediction errors moderate decision making in obsessive–compulsive disorder

Psychological Medicine ◽

10.1017/s0033291716003305 ◽

2017 ◽

Vol 47 (7) ◽

pp. 1246-1258 ◽

Cited By ~ 19

Author(s):

T. U. Hauser ◽

R. Iannaccone ◽

R. J. Dolan ◽

J. Ball ◽

J. Hättenschwiler ◽

...

Keyword(s):

Decision Making ◽

Obsessive Compulsive Disorder ◽

Late Onset ◽

Anterior Cingulate ◽

Prediction Errors ◽

Learning Mechanisms ◽

Obsessive Compulsive ◽

Compulsive Disorder ◽

Reward Prediction ◽

Reinforcement Learning Models

Background: Obsessive–compulsive disorder (OCD) has been linked to functional abnormalities in fronto-striatal networks as well as impairments in decision making and learning. Little is known about the neurocognitive mechanisms causing these decision-making and learning deficits in OCD, and how they relate to dysfunction in fronto-striatal networks. Method: We investigated neural mechanisms of decision making in OCD patients, including early and late onset of disorder, in terms of reward prediction errors (RPEs) using functional magnetic resonance imaging. RPEs index a mismatch between expected and received outcomes, encoded by the dopaminergic system, and are known to drive learning and decision making in humans and animals. We used reinforcement learning models and RPE signals to infer the learning mechanisms and to compare behavioural parameters and neural RPE responses of the OCD patients with those of healthy matched controls. Results: Patients with OCD showed significantly increased RPE responses in the anterior cingulate cortex (ACC) and the putamen compared with controls. OCD patients also had a significantly lower perseveration parameter than controls. Conclusions: Enhanced RPE signals in the ACC and putamen extend previous findings of fronto-striatal deficits in OCD. These abnormally strong RPEs suggest a hyper-responsive learning network in patients with OCD, which might explain their indecisiveness and intolerance of uncertainty.

Beyond Reward Prediction Errors: Human Striatum Updates Rule Values During Learning

10.1101/115253 ◽

2017 ◽

Cited By ~ 1

Author(s):

Ian Ballard ◽

Eric M. Miller ◽

Steven T. Piantadosi ◽

Noah Goodman ◽

Samuel M. McClure

Keyword(s):

Prediction Error ◽

Rule Learning ◽

Prediction Errors ◽

Rule Generation ◽

Functional Magnetic Resonance ◽

Resonance Imaging ◽

Stimulus Response ◽

Reward Prediction ◽

The World ◽

Explicit Rule

Abstract: Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning (RL) or by using explicit reasoning. We tested if the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with RL prediction error. We also find that striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.

From Retrospective to Prospective: Integrated Value Representation in Frontal Cortex for Predictive Choice Behavior

10.1101/2021.12.27.474215 ◽

2021 ◽

Author(s):

Kosuke Hamaguchi ◽

Hiromi Takahashi-Aoki ◽

Dai Watanabe

Keyword(s):

Decision Making ◽

Frontal Cortex ◽

Choice Behavior ◽

State Transition ◽

Neural Circuit ◽

The Past ◽

Preparatory Activity ◽

Action Value ◽

The Brain ◽

Different Sources

Animals must flexibly estimate the value of their actions to successfully adapt in a changing environment. The brain is thought to estimate action-value from two different sources, namely the action-outcome history (retrospective value) and the knowledge of the environment (prospective value). How these two different estimates of action-value are reconciled to make a choice is not well understood. Here we show that as a mouse learns the state-transition structure of a decision-making task, retrospective and prospective values become jointly encoded in the preparatory activity of neurons in the frontal cortex. Suppressing this preparatory activity in expert mice returned their behavior to a naive state. These results reveal the neural circuit that integrates knowledge about the past and future to support predictive decision-making.
