Attenuated directed exploration during reinforcement learning in gambling disorder

Mapping Intimacies ◽

10.1101/823583 ◽

2019 ◽

Cited By ~ 3

Author(s):

A. Wiehler ◽

K. Chakroun ◽

J. Peters

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Gambling Disorder ◽

Brain Activity ◽

Clinical Status ◽

Classical Problem ◽

Behavioral Flexibility ◽

Network Connectivity ◽

Prediction Errors ◽

Reward Contingencies

Abstract: Gambling disorder is a behavioral addiction associated with impairments in decision-making and reduced behavioral flexibility. Decision-making in volatile environments requires a flexible trade-off between exploitation of options with high expected values and exploration of novel options to adapt to changing reward contingencies. This classical problem is known as the exploration-exploitation dilemma. We hypothesized gambling disorder to be associated with a specific reduction in directed (uncertainty-based) exploration compared to healthy controls, accompanied by changes in brain activity in a fronto-parietal exploration-related network. Twenty-three frequent gamblers and nineteen matched controls performed a classical four-armed bandit task during functional magnetic resonance imaging. Computational modeling revealed that choice behavior in both groups contained signatures of directed exploration, random exploration and perseveration. Gamblers showed a specific reduction in directed exploration, while random exploration and perseveration were similar between groups. Neuroimaging revealed no evidence for group differences in neural representations of expected value and reward prediction errors. Likewise, our hypothesis of attenuated fronto-parietal exploration effects in gambling disorder was not supported. However, during directed exploration, gamblers showed reduced parietal and substantia nigra / ventral tegmental area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of clinical status, suggesting alterations in network dynamics in gambling disorder. In sum, we show that reduced flexibility during reinforcement learning in volatile environments in gamblers is attributable to a reduction in directed exploration rather than an increase in perseveration. Neuroimaging findings suggest that patterns of network connectivity might be more diagnostic of gambling disorder than univariate value and prediction error effects. We provide a computational account of flexibility impairments in gamblers during reinforcement learning that might arise as a consequence of dopaminergic dysregulation in this disorder.
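
The abstract above names three components of choice behavior (directed exploration, random exploration, perseveration) without giving the model equations. The sketch below is one common way such components are combined in a softmax choice rule for a bandit task; it is an illustrative assumption, not the authors' model. The function name and the parameters `beta` (inverse temperature, governing random exploration), `phi` (uncertainty bonus, governing directed exploration) and `rho` (perseveration bonus) are hypothetical.

```python
import math


def choice_probabilities(values, uncertainties, prev_choice, beta, phi, rho):
    """Softmax over expected value plus an uncertainty bonus and a stay bonus.

    Hypothetical sketch: values and uncertainties are per-arm estimates,
    prev_choice is the index of the previously chosen arm (or None).
    """
    utilities = []
    for arm, (value, sigma) in enumerate(zip(values, uncertainties)):
        explore_bonus = phi * sigma                       # directed exploration
        stay_bonus = rho if arm == prev_choice else 0.0   # perseveration
        utilities.append(beta * (value + explore_bonus + stay_bonus))
    # Subtract the maximum before exponentiating for numerical stability.
    max_u = max(utilities)
    exps = [math.exp(u - max_u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Four arms with equal expected value; arm 1 is the most uncertain.
probs = choice_probabilities([50, 50, 50, 50], [1, 8, 1, 1], None,
                             beta=0.2, phi=1.0, rho=0.0)
```

Under this sketch, a reduction in directed exploration corresponds to a smaller `phi`: with `phi = 0` the uncertain arm is no longer favored, while `beta` and `rho` are left unchanged.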

Neural Signatures of Prediction Errors in a Decision-Making Task Are Modulated by Action Execution Failures

10.1101/474361 ◽

2018 ◽

Author(s):

Samuel D. McDougle ◽

Peter A. Butcher ◽

Darius Parvin ◽

Faisal Mushtaq ◽

Yael Niv ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Prediction Error ◽

Learning Task ◽

Prediction Errors ◽

Action Execution ◽

Model Driven ◽

Assignment Of Credit ◽

Level Information ◽

High Level

Abstract: Decisions must be implemented through actions, and actions are prone to error. As such, when an expected outcome is not obtained, an individual should not only be sensitive to whether the choice itself was suboptimal, but also whether the action required to indicate that choice was executed successfully. The intelligent assignment of credit to action execution versus action selection has clear ecological utility for the learner. To explore this scenario, we used a modified version of a classic reinforcement learning task in which feedback indicated if negative prediction errors were, or were not, associated with execution errors. Using fMRI, we asked if prediction error computations in the human striatum, a key substrate in reinforcement learning and decision making, are modulated when a failure in action execution results in the negative outcome. Participants were more tolerant of non-rewarded outcomes when these resulted from execution errors versus when execution was successful but the reward was withheld. Consistent with this behavior, a model-driven analysis of neural activity revealed an attenuation of the signal associated with negative reward prediction error in the striatum following execution failures. These results converge with other lines of evidence suggesting that prediction errors in the mesostriatal dopamine system integrate high-level information during the evaluation of instantaneous reward outcomes.

Pupil responses as indicators of value-based decision-making

10.1101/302166 ◽

2018 ◽

Cited By ~ 5

Author(s):

Joanne C. Van Slooten ◽

Sara Jahfari ◽

Tomas Knapen ◽

Jan Theeuwes

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Computational Model ◽

Cognitive Processes ◽

Learning Task ◽

Brain Regions ◽

Computational Approach ◽

Prediction Errors ◽

Reward Prediction ◽

Exciting Possibility

Abstract: Pupil responses have been used to track cognitive processes during decision-making. Studies have shown that in these cases the pupil reflects the joint activation of many cortical and subcortical brain regions, including those traditionally implicated in value-based learning. However, how the pupil tracks value-based decisions and reinforcement learning is unknown. We combined a reinforcement learning task with a computational model to study pupil responses during value-based decisions and decision evaluations. We found that the pupil closely tracks reinforcement learning both across trials and participants. Prior to choice, the pupil dilated as a function of trial-by-trial fluctuations in value beliefs. After feedback, early dilation scaled with value uncertainty, whereas later constriction scaled with reward prediction errors. Our computational approach systematically implicates the pupil in value-based decisions and the subsequent processing of violated value beliefs. These dissociable influences provide an exciting possibility to non-invasively study ongoing reinforcement learning in the pupil.

Dorsal Striatal–midbrain Connectivity in Humans Predicts How Reinforcements Are Used to Guide Decisions

Journal of Cognitive Neuroscience ◽

10.1162/jocn.2009.21092 ◽

2009 ◽

Vol 21 (7) ◽

pp. 1332-1345 ◽

Cited By ~ 67

Author(s):

Thorsten Kahnt ◽

Soyoung Q Park ◽

Michael X Cohen ◽

Anne Beck ◽

Andreas Heinz ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Functional Connectivity ◽

Ventral Striatum ◽

Critical Role ◽

Prediction Errors ◽

Midbrain Neurons ◽

Reward Prediction ◽

Future Behavior ◽

The Impact

It has been suggested that the target areas of dopaminergic midbrain neurons, the dorsal (DS) and ventral striatum (VS), are differently involved in reinforcement learning especially as actor and critic. Whereas the critic learns to predict rewards, the actor maintains action values to guide future decisions. The different midbrain connections to the DS and the VS seem to play a critical role in this functional distinction. Here, subjects performed a dynamic, reward-based decision-making task during fMRI acquisition. A computational model of reinforcement learning was used to estimate the different effects of positive and negative reinforcements on future decisions for each subject individually. We found that activity in both the DS and the VS correlated with reward prediction errors. Using functional connectivity, we show that the DS and the VS are differentially connected to different midbrain regions (possibly corresponding to the substantia nigra [SN] and the ventral tegmental area [VTA], respectively). However, only functional connectivity between the DS and the putative SN predicted the impact of different reinforcement types on future behavior. These results suggest that connections between the putative SN and the DS are critical for modulating action values in the DS according to both positive and negative reinforcements to guide future decision making.
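
The abstract above refers to a reinforcement learning model that estimates different effects of positive and negative reinforcements, but does not spell out the update rule. One standard way to express this idea is a delta rule with separate learning rates for positive and negative prediction errors. The sketch below assumes that form; the function name and the parameters `alpha_pos` and `alpha_neg` are hypothetical and not taken from the paper.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """Delta-rule update with sign-dependent learning rates (illustrative).

    Returns the updated value and the reward prediction error.
    """
    delta = reward - value                       # reward prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta, delta


# A learner that weights positive outcomes more than negative ones.
after_win, pe_win = update_value(0.5, 1.0, alpha_pos=0.4, alpha_neg=0.1)
after_loss, pe_loss = update_value(0.5, 0.0, alpha_pos=0.4, alpha_neg=0.1)
```

Fitting `alpha_pos` and `alpha_neg` per subject would give an individual measure of how strongly each reinforcement type shapes later choices, which is the kind of quantity the abstract relates to connectivity.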

Credit assignment in movement-dependent reinforcement learning

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1523669113 ◽

2016 ◽

Vol 113 (24) ◽

pp. 6797-6802 ◽

Cited By ~ 23

Author(s):

Samuel D. McDougle ◽

Matthew J. Boggess ◽

Matthew J. Crossley ◽

Darius Parvin ◽

Richard B. Ivry ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Assignment Problem ◽

Visual Cues ◽

Prediction Errors ◽

Credit Assignment ◽

Implicit Representation ◽

Movement Error ◽

Key Press ◽

Reinforcement Model

When a person fails to obtain an expected reward from an object in the environment, they face a credit assignment problem: Did the absence of reward reflect an extrinsic property of the environment or an intrinsic error in motor execution? To explore this problem, we modified a popular decision-making task used in studies of reinforcement learning, the two-armed bandit task. We compared a version in which choices were indicated by key presses, the standard response in such tasks, to a version in which the choices were indicated by reaching movements, which affords execution failures. In the key press condition, participants exhibited a strong risk aversion bias; strikingly, this bias reversed in the reaching condition. This result can be explained by a reinforcement model wherein movement errors influence decision-making, either by gating reward prediction errors or by modifying an implicit representation of motor competence. Two further experiments support the gating hypothesis. First, we used a condition in which we provided visual cues indicative of movement errors but informed the participants that trial outcomes were independent of their actual movements. The main result was replicated, indicating that the gating process is independent of participants’ explicit sense of control. Second, individuals with cerebellar degeneration failed to modulate their behavior between the key press and reach conditions, providing converging evidence of an implicit influence of movement error signals on reinforcement learning. These results provide a mechanistically tractable solution to the credit assignment problem.
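
The gating hypothesis described above can be illustrated with a small value update in which a negative reward prediction error is down-weighted when the missed reward follows a movement error. This is a minimal sketch under assumed parameter values, not the authors' fitted model; `gated_update`, `alpha` and `gate` are hypothetical names.

```python
def gated_update(value, reward, execution_error, alpha=0.3, gate=0.2):
    """Delta-rule update whose negative prediction errors are gated.

    When the outcome follows an execution error, a negative prediction
    error is scaled by `gate` (between 0 and 1), so the chosen option
    loses less value than after a correctly executed, unrewarded choice.
    """
    delta = reward - value                        # reward prediction error
    weight = gate if (execution_error and delta < 0) else 1.0
    return value + alpha * weight * delta


# Same unrewarded outcome, attributed to the movement vs. to the option.
after_miss = gated_update(1.0, 0.0, execution_error=True)
after_withheld = gated_update(1.0, 0.0, execution_error=False)
```

In this sketch the option keeps more of its value after a missed reach than after a withheld reward, which is one way a learner could become more willing to choose an option whose failures are mostly execution errors.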

The contribution of striatal pseudo-reward prediction errors to value-based decision-making

10.1101/097873 ◽

2017 ◽

Author(s):

Ernest Mas-Herrero ◽

Guillaume Sescousse ◽

Roshan Cools ◽

Josep Marco-Pallarés

Keyword(s):

Decision Making ◽

Choice Behavior ◽

Prediction Errors ◽

Life Outcomes ◽

Stimulus Response ◽

Hierarchical Reinforcement Learning ◽

Reward Prediction ◽

Fmri Study ◽

Reward Contingencies ◽

The Brain

Abstract: Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. In such scenarios, parallel learning systems are important to reduce the complexity of the learning problem, as proposed in the framework of hierarchical reinforcement learning (HRL). One of the key features of HRL is the computation of pseudo-reward prediction errors (PRPEs) which allow the reinforcement of actions that led to a sub-goal before the final goal itself is achieved. Here we wanted to test the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might generate a bias in choice behavior when reward contingencies are not well-known or uncertain. Second, we also hypothesized that this bias might be related to the strength of PRPE striatal representations. In order to test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioural study: n = 19). Our results show that overall participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.

Reduced model-based decision-making in gambling disorder

Scientific Reports ◽

10.1038/s41598-019-56161-z ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 6

Author(s):

Florent Wyckmans ◽

A. Ross Otto ◽

Miriam Sebold ◽

Nathaniel Daw ◽

Antoine Bechara ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Gambling Disorder ◽

Reaction Times ◽

Learning System ◽

Behavioral Learning ◽

Model Based ◽

Model Free ◽

Effective Interventions ◽

Compulsive Behaviors

Abstract: Compulsive behaviors (e.g., addiction) can be viewed as an aberrant decision process where inflexible reactions automatically evoked by stimuli (habit) take control over decision making to the detriment of a more flexible (goal-oriented) behavioral learning system. These behaviors are thought to arise from learning algorithms known as “model-based” and “model-free” reinforcement learning. Gambling disorder, a form of addiction without the confound of neurotoxic effects of drugs, showed impaired goal-directed control but the way in which problem gamblers (PG) orchestrate model-based and model-free strategies has not been evaluated. Forty-nine PG and 33 healthy participants (CP) completed a two-step sequential choice task for which model-based and model-free learning have distinct and identifiable trial-by-trial learning signatures. The influence of common psychopathological comorbidities on those two forms of learning was investigated. PG showed impaired model-based learning, particularly after unrewarded outcomes. In addition, PG exhibited faster reaction times than CP following unrewarded decisions. Troubled mood, higher impulsivity (i.e., positive and negative urgency) and current and chronic stress reported via questionnaires did not account for those results. These findings demonstrate specific reinforcement learning and decision-making deficits in behavioral addiction that advance our understanding and may be important dimensions for designing effective interventions.

Neural basis of decision making guided by emotional outcomes

Journal of Neurophysiology ◽

10.1152/jn.00564.2014 ◽

2015 ◽

Vol 113 (9) ◽

pp. 3056-3068 ◽

Cited By ~ 13

Author(s):

Kentaro Katahira ◽

Yoshi-Taka Matsuda ◽

Tomomi Fujimura ◽

Kenichi Ueno ◽

Takeshi Asamizuya ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Prediction Error ◽

Brain Regions ◽

Parahippocampal Gyrus ◽

Prediction Errors ◽

Neural Basis ◽

Decision Outcomes ◽

Gain Loss ◽

Emotional Events

Emotional events resulting from a choice influence an individual's subsequent decision making. Although the relationship between emotion and decision making has been widely discussed, previous studies have mainly investigated decision outcomes that can easily be mapped to reward and punishment, including monetary gain/loss, gustatory stimuli, and pain. These studies regard emotion as a modulator of decision making that can be made rationally in the absence of emotions. In our daily lives, however, we often encounter various emotional events that affect decisions by themselves, and mapping the events to a reward or punishment is often not straightforward. In this study, we investigated the neural substrates of how such emotional decision outcomes affect subsequent decision making. By using functional magnetic resonance imaging (fMRI), we measured brain activities of humans during a stochastic decision-making task in which various emotional pictures were presented as decision outcomes. We found that pleasant pictures differentially activated the midbrain, fusiform gyrus, and parahippocampal gyrus, whereas unpleasant pictures differentially activated the ventral striatum, compared with neutral pictures. We assumed that the emotional decision outcomes affect the subsequent decision by updating the value of the options, a process modeled by reinforcement learning models, and that the brain regions representing the prediction error that drives the reinforcement learning are involved in guiding subsequent decisions. We found that some regions of the striatum and the insula were separately correlated with the prediction error for either pleasant pictures or unpleasant pictures, whereas the precuneus was correlated with prediction errors for both pleasant and unpleasant pictures.

A Research Domain Criteria (RDoC) approach to Gambling Disorder: focus on preference-based decision-making and response inhibition

10.30435/aba.01.2019.06 ◽

2018 ◽

Vol 01 (01) ◽

Author(s):

A. Marras ◽

N. Makris

Keyword(s):

Decision Making ◽

Response Inhibition ◽

Gambling Disorder ◽

Research Domain Criteria ◽

Research Domain

Epidemiology of Blood Culture Utilization in a Cohort of Critically Ill Children

Journal of Pediatric Intensive Care ◽

10.1055/s-0038-1676993 ◽

2019 ◽

Vol 08 (03) ◽

pp. 144-147

Author(s):

Christine Anh-Thu Tran ◽

Jenna Verena Zschaebitz ◽

Michael Campbell Spaeder

Keyword(s):

Decision Making ◽

Blood Culture ◽

Systemic Inflammatory Response Syndrome ◽

Inflammatory Response ◽

Critically Ill ◽

Clinical Status ◽

Blood Cultures ◽

Systemic Inflammatory Response ◽

Critically Ill Children ◽

Culture Acquisition

Abstract: Blood culture acquisition is integral in the assessment of patients with sepsis, though there exists a lack of clarity relating to clinical states that warrant acquisition. We investigated the clinical status of critically ill children in the timeframe proximate to acquisition of blood cultures. The associated rates of systemic inflammatory response syndrome (72%) and sepsis (57%) with blood culture acquisition were relatively low, suggesting a potential overutilization of blood cultures. Efforts are needed to improve decision making at the time that acquisition of blood cultures is under consideration and promote percutaneous blood draws over indwelling lines.

Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation

2020 IEEE Intelligent Vehicles Symposium (IV) ◽

10.1109/iv47402.2020.9304614 ◽

2020 ◽

Author(s):

Carl-Johan Hoel ◽

Krister Wolff ◽

Leo Laine

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Autonomous Driving ◽

Uncertainty Estimation ◽

Tactical Decision
