The contribution of striatal pseudo-reward prediction errors to value-based decision-making

2017 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Guillaume Sescousse ◽  
Roshan Cools ◽  
Josep Marco-Pallarés

Abstract: Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. In such scenarios, parallel learning systems are important to reduce the complexity of the learning problem, as proposed in the framework of hierarchical reinforcement learning (HRL). One of the key features of HRL is the computation of pseudo-reward prediction errors (PRPEs), which allow the reinforcement of actions that led to a sub-goal before the final goal itself is achieved. First, we tested the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might bias choice behavior when reward contingencies are uncertain or not well known. Second, we hypothesized that this bias might be related to the strength of striatal PRPE representations. To test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioural study: n = 19). Our results show that, overall, participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.
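As a rough illustration of the distinction drawn in this abstract, the sketch below computes a pseudo-reward prediction error at the sub-goal level and a reward prediction error at the final-goal level with a simple delta rule. The function names, the learning rate, and all numeric values are illustrative assumptions, not the authors' model or data.

```python
def prediction_error(received: float, expected: float) -> float:
    """Delta-rule prediction error: outcome minus expectation."""
    return received - expected


def update_value(expected: float, error: float, learning_rate: float = 0.1) -> float:
    """Move the expectation a fraction of the way toward the outcome."""
    return expected + learning_rate * error


# Hypothetical trial: the sub-goal is reached (pseudo-reward = 1) but no money is won.
expected_pseudo_reward = 0.5  # assumed learned expectation of reaching the sub-goal
expected_reward = 0.5         # assumed learned expectation of monetary reward

prpe = prediction_error(received=1.0, expected=expected_pseudo_reward)  # sub-goal level
rpe = prediction_error(received=0.0, expected=expected_reward)          # final-goal level

# Each level updates its own expectation from its own error signal.
expected_pseudo_reward = update_value(expected_pseudo_reward, prpe)
expected_reward = update_value(expected_reward, rpe)
```

In this toy trial the two error signals have opposite signs, which is the kind of dissociation a paradigm would need in order to separate PRPE from RPE responses.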

2021 ◽  
Author(s):  
Joseph Heffner ◽  
Jae-Young Son ◽  
Oriel FeldmanHall

People make decisions based on deviations from expected outcomes, known as prediction errors. Past work has focused on reward prediction errors, largely ignoring violations of expected emotional experiences, that is, emotion prediction errors. We leverage a new method to measure real-time fluctuations in emotion as people decide to punish or forgive others. Across four studies (N=1,016), we reveal that emotion and reward prediction errors have distinguishable contributions to choice, such that emotion prediction errors exert the strongest impact during decision-making. We additionally find that a choice to punish or forgive can be decoded in less than a second from an evolving emotional response, suggesting emotions swiftly influence choice. Finally, individuals reporting significant levels of depression exhibit selective impairments in using emotion, but not reward, prediction errors. Evidence for emotion prediction errors potently guiding social behaviors challenges standard decision-making models that have focused solely on reward.


2019 ◽  
Author(s):  
A. Wiehler ◽  
K. Chakroun ◽  
J. Peters

Abstract: Gambling disorder is a behavioral addiction associated with impairments in decision-making and reduced behavioral flexibility. Decision-making in volatile environments requires a flexible trade-off between exploitation of options with high expected values and exploration of novel options to adapt to changing reward contingencies. This classical problem is known as the exploration-exploitation dilemma. We hypothesized gambling disorder to be associated with a specific reduction in directed (uncertainty-based) exploration compared to healthy controls, accompanied by changes in brain activity in a fronto-parietal exploration-related network. Twenty-three frequent gamblers and nineteen matched controls performed a classical four-armed bandit task during functional magnetic resonance imaging. Computational modeling revealed that choice behavior in both groups contained signatures of directed exploration, random exploration and perseveration. Gamblers showed a specific reduction in directed exploration, while random exploration and perseveration were similar between groups. Neuroimaging revealed no evidence for group differences in neural representations of expected value and reward prediction errors. Likewise, our hypothesis of attenuated fronto-parietal exploration effects in gambling disorder was not supported. However, during directed exploration, gamblers showed reduced parietal and substantia nigra / ventral tegmental area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of clinical status, suggesting alterations in network dynamics in gambling disorder. In sum, we show that reduced flexibility during reinforcement learning in volatile environments in gamblers is attributable to a reduction in directed exploration rather than an increase in perseveration. Neuroimaging findings suggest that patterns of network connectivity might be more diagnostic of gambling disorder than univariate value and prediction error effects. We provide a computational account of flexibility impairments in gamblers during reinforcement learning that might arise as a consequence of dopaminergic dysregulation in this disorder.
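One common way to formalize the three choice components named in this abstract is a softmax rule with an uncertainty bonus and a repetition bonus. The sketch below is a minimal version under that assumption; the parameter names (`beta`, `phi`, `rho`), their values, and the bandit state are hypothetical and are not taken from the study.

```python
import math


def choice_probabilities(values, uncertainties, previous_choice,
                         beta=1.0, phi=0.5, rho=0.3):
    """Softmax over expected value plus exploration and perseveration bonuses.

    beta: inverse temperature; lower values mean more random exploration.
    phi:  weight on uncertainty (directed exploration bonus).
    rho:  bonus for repeating the previous choice (perseveration).
    """
    scores = []
    for arm, (value, uncertainty) in enumerate(zip(values, uncertainties)):
        repeat_bonus = rho if arm == previous_choice else 0.0
        scores.append(beta * (value + phi * uncertainty + repeat_bonus))
    max_score = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]


# Hypothetical four-armed bandit state: arm 3 is rarely sampled, so it is uncertain.
probs = choice_probabilities(
    values=[0.6, 0.5, 0.4, 0.5],
    uncertainties=[0.1, 0.1, 0.1, 0.9],
    previous_choice=0,
)
# Same state with the directed exploration bonus switched off.
no_directed = choice_probabilities(
    values=[0.6, 0.5, 0.4, 0.5],
    uncertainties=[0.1, 0.1, 0.1, 0.9],
    previous_choice=0,
    phi=0.0,
)
```

Setting `phi` to zero removes the extra probability given to the uncertain arm while leaving the `beta` and `rho` terms in place, which mirrors the reported pattern of reduced directed exploration with intact random exploration and perseveration.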


2020 ◽  
Vol 6 (45) ◽  
pp. eabc9321
Author(s):  
David J. Ottenheimer ◽  
Karen Wang ◽  
Xiao Tong ◽  
Kurt M. Fraser ◽  
Jocelyn M. Richard ◽  
...  

A key function of the nervous system is producing adaptive behavior across changing conditions, like physiological state. Although states like thirst and hunger are known to impact decision-making, the neurobiology of this phenomenon has been studied minimally. Here, we tracked evolving preference for sucrose and water as rats proceeded from a thirsty to sated state. As rats shifted from water choices to sucrose choices across the session, the activity of a majority of neurons in the ventral pallidum, a region crucial for reward-related behaviors, closely matched the evolving behavioral preference. The timing of this signal followed the pattern of a reward prediction error, occurring at the cue or the reward depending on when reward identity was revealed. Additionally, optogenetic stimulation of ventral pallidum neurons at the time of reward was able to reverse behavioral preference. Our results suggest that ventral pallidum neurons guide reward-related decisions across changing physiological states.


2019 ◽  
Vol 3 (7) ◽  
pp. 719-732 ◽  
Author(s):  
Anthony I. Jang ◽  
Matthew R. Nassar ◽  
Daniel G. Dillon ◽  
Michael J. Frank

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Ulrich Kirk ◽  
Giuseppe Pagnoni ◽  
Sébastien Hétu ◽  
Read Montague

NeuroImage ◽  
2019 ◽  
Vol 193 ◽  
pp. 67-74 ◽  
Author(s):  
Ernest Mas-Herrero ◽  
Guillaume Sescousse ◽  
Roshan Cools ◽  
Josep Marco-Pallarés

2017 ◽  
Vol 47 (7) ◽  
pp. 1246-1258 ◽  
Author(s):  
T. U. Hauser ◽  
R. Iannaccone ◽  
R. J. Dolan ◽  
J. Ball ◽  
J. Hättenschwiler ◽  
...  

Background: Obsessive–compulsive disorder (OCD) has been linked to functional abnormalities in fronto-striatal networks as well as impairments in decision making and learning. Little is known about the neurocognitive mechanisms causing these decision-making and learning deficits in OCD, and how they relate to dysfunction in fronto-striatal networks. Method: We investigated neural mechanisms of decision making in OCD patients, including early and late onset of disorder, in terms of reward prediction errors (RPEs) using functional magnetic resonance imaging. RPEs index a mismatch between expected and received outcomes, encoded by the dopaminergic system, and are known to drive learning and decision making in humans and animals. We used reinforcement learning models and RPE signals to infer the learning mechanisms and to compare behavioural parameters and neural RPE responses of the OCD patients with those of healthy matched controls. Results: Patients with OCD showed significantly increased RPE responses in the anterior cingulate cortex (ACC) and the putamen compared with controls. OCD patients also had a significantly lower perseveration parameter than controls. Conclusions: Enhanced RPE signals in the ACC and putamen extend previous findings of fronto-striatal deficits in OCD. These abnormally strong RPEs suggest a hyper-responsive learning network in patients with OCD, which might explain their indecisiveness and intolerance of uncertainty.


2017 ◽  
Author(s):  
Ian Ballard ◽  
Eric M. Miller ◽  
Steven T. Piantadosi ◽  
Noah Goodman ◽  
Samuel M. McClure

Abstract: Humans naturally group the world into coherent categories defined by membership rules. Rules can be learned implicitly by building stimulus-response associations using reinforcement learning (RL) or by using explicit reasoning. We tested if the striatum, in which activation reliably scales with reward prediction error, would track prediction errors in a task that required explicit rule generation. Using functional magnetic resonance imaging during a categorization task, we show that striatal responses to feedback scale with a “surprise” signal derived from a Bayesian rule-learning model and are inconsistent with RL prediction error. We also find that striatum and caudal inferior frontal sulcus (cIFS) are involved in updating the likelihood of discriminative rules. We conclude that the striatum, in cooperation with the cIFS, is involved in updating the values assigned to categorization rules when people learn using explicit reasoning.
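To make the idea of a model-derived surprise signal concrete, the sketch below updates a belief over a few candidate rules with Bayes' rule and takes the negative log probability of the observed feedback as surprise. The abstract does not specify how surprise was defined, so this Shannon-surprise definition, the three rules, and the 0.9/0.1 likelihoods are all assumptions made for illustration.

```python
import math


def bayes_update(prior, likelihoods):
    """Posterior over candidate rules given each rule's likelihood of the feedback."""
    evidence = sum(p * l for p, l in zip(prior, likelihoods))
    posterior = [p * l / evidence for p, l in zip(prior, likelihoods)]
    return posterior, evidence


# Three hypothetical candidate rules with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]
# Probability each rule assigns to the feedback actually observed
# (0.9 if the rule predicted it, 0.1 otherwise; the noise level is an assumption).
likelihoods = [0.9, 0.1, 0.1]

posterior, evidence = bayes_update(prior, likelihoods)
surprise = -math.log(evidence)  # Shannon surprise of the observed feedback
```

Unlike a delta-rule prediction error, this quantity depends on the whole distribution over rules rather than on a single cached value per stimulus.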


2021 ◽  
Author(s):  
Kosuke Hamaguchi ◽  
Hiromi Takahashi-Aoki ◽  
Dai Watanabe

Animals must flexibly estimate the value of their actions to successfully adapt in a changing environment. The brain is thought to estimate action-value from two different sources, namely the action-outcome history (retrospective value) and the knowledge of the environment (prospective value). How these two different estimates of action-value are reconciled to make a choice is not well understood. Here we show that as a mouse learns the state-transition structure of a decision-making task, retrospective and prospective values become jointly encoded in the preparatory activity of neurons in the frontal cortex. Suppressing this preparatory activity in expert mice returned their behavior to a naive state. These results reveal the neural circuit that integrates knowledge about the past and future to support predictive decision-making.
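The two sources of action value described in this abstract can be sketched as two separate estimators: one that averages past outcomes, and one that uses a model of state transitions. The functions, learning rate, outcome history, and transition probabilities below are hypothetical; how the brain combines the two estimates is the open question the study addresses, so no combination rule is assumed here.

```python
def retrospective_value(outcome_history, learning_rate=0.2):
    """Value estimated from past outcomes of an action via incremental averaging."""
    value = 0.0
    for outcome in outcome_history:
        value += learning_rate * (outcome - value)
    return value


def prospective_value(transition_probs, state_values):
    """Value estimated from a model: expected value of the states the action leads to."""
    return sum(p * v for p, v in zip(transition_probs, state_values))


# Hypothetical action: rewarded on the last three of five trials.
q_retro = retrospective_value([0, 0, 1, 1, 1])
# Hypothetical model: the action leads to a rewarded state with probability 0.8.
q_pro = prospective_value(transition_probs=[0.8, 0.2], state_values=[1.0, 0.0])
```

The retrospective estimate lags behind a change in contingencies, while the prospective estimate reflects it as soon as the transition structure is known.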

