Disentangling the systems contributing to changes in learning during adolescence

2019 ◽  
Author(s):  
Sarah L. Master ◽  
Maria K. Eckstein ◽  
Neta Gotlieb ◽  
Ronald Dahl ◽  
Linda Wilbrecht ◽  
...  

Abstract
Multiple neurocognitive systems contribute simultaneously to learning. For example, dopamine and basal ganglia (BG) systems are thought to support reinforcement learning (RL) by incrementally updating the value of choices, while the prefrontal cortex (PFC) contributes different computations, such as actively maintaining precise information in working memory (WM). It is commonly thought that WM and PFC show more protracted development than RL and BG systems, yet their contributions are rarely assessed in tandem. Here, we used a simple learning task to test how RL and WM contribute to changes in learning across adolescence. We tested 187 subjects aged 8 to 17 and 53 adults (aged 25-30). Participants learned stimulus-action associations from feedback; the learning load was varied to be within or to exceed WM capacity. Participants aged 8-12 learned more slowly than participants aged 13-17, and were more sensitive to load. We used computational modeling to estimate subjects’ use of WM and RL processes. Surprisingly, we found more robust changes in RL than in WM during development. The RL learning rate increased significantly with age across adolescence, and WM parameters showed more subtle changes, many of them early in adolescence. These results underscore the importance of changes in RL processes for the developmental science of learning.
Highlights
- Subjects combine reinforcement learning (RL) and working memory (WM) to learn
- Computational modeling shows RL learning rates grew with age during adolescence
- When load was beyond WM capacity, weaker RL compensated less in younger adolescents
- WM parameters showed subtler and more puberty-related changes
- WM reliance, maintenance, and capacity had separable developmental trajectories
- Underscores importance of RL processes in developmental changes in learning
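The RL-plus-WM mixture that such models formalize can be sketched in a few lines: an incremental delta-rule learner combined with a fast but decaying working-memory store, with choices drawn from a weighted mixture of the two policies. This is a minimal illustration, not the authors' fitted model; the parameter names (`wm_weight`, `wm_decay`) and values are assumptions, and published RLWM-style models add further components such as capacity limits and undirected noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(v, beta):
    """Softmax choice probabilities with inverse temperature beta."""
    e = np.exp(beta * (v - v.max()))
    return e / e.sum()

def simulate_rl_wm(n_stimuli=3, n_actions=3, n_trials=90,
                   alpha=0.3, beta=8.0, wm_weight=0.8, wm_decay=0.1):
    """Simulate learning as a weighted mixture of slow incremental RL
    and a fast, decaying working-memory store (illustrative sketch)."""
    correct = rng.integers(n_actions, size=n_stimuli)      # hidden S-A mapping
    q = np.full((n_stimuli, n_actions), 1.0 / n_actions)   # slow RL values
    wm = np.full((n_stimuli, n_actions), 1.0 / n_actions)  # fast WM store
    acc = []
    for _ in range(n_trials):
        s = rng.integers(n_stimuli)
        # Policy mixes the two systems, weighted toward WM.
        p = wm_weight * softmax(wm[s], beta) + (1 - wm_weight) * softmax(q[s], beta)
        a = rng.choice(n_actions, p=p)
        r = 1.0 if a == correct[s] else 0.0
        q[s, a] += alpha * (r - q[s, a])                 # incremental delta rule
        wm = (1 - wm_decay) * wm + wm_decay / n_actions  # WM decays toward uniform
        wm[s, a] = r                                     # WM stores the last outcome
        acc.append(r)
    return float(np.mean(acc[-30:]))                     # late-learning accuracy
```

Because WM stores the last outcome perfectly but decays, it dominates early learning at low load, while the delta-rule values accumulate more slowly, matching the division of labor described in the abstract.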

2020 ◽  
Author(s):  
Dahlia Mukherjee ◽  
Alexandre Leo Stephen Filipowicz ◽  
Khoi D. Vo ◽  
Theodore Satterthwaite ◽  
Joe Kable

Depression has been associated with impaired reward and punishment processing, but the specific nature of these deficits is less understood and still widely debated. We analyzed reinforcement-based decision-making in individuals diagnosed with major depressive disorder (MDD) to identify the specific decision mechanisms contributing to poorer performance. Individuals with MDD (n = 64) and matched healthy controls (n = 64) performed a probabilistic reversal learning task in which they used feedback to identify which of two stimuli had the highest probability of reward (reward condition) or lowest probability of punishment (punishment condition). Learning differences were characterized using a hierarchical Bayesian reinforcement learning model. While both groups showed reinforcement learning-like behavior, depressed individuals made fewer optimal choices and adjusted more slowly to reversals in both the reward and punishment conditions. Our computational modeling analysis found that depressed individuals showed lower learning rates and, to a lesser extent, lower value sensitivity in both the reward and punishment conditions. Learning rates also predicted depression more accurately than simple performance metrics. These results demonstrate that depression is characterized by a hyposensitivity to positive outcomes, which influences the rate at which depressed individuals learn from feedback, but not a hypersensitivity to negative outcomes as has previously been suggested. Additionally, we demonstrate that computational modeling provides a more precise characterization of the dynamics contributing to these learning deficits, and offers stronger insights into the mechanistic processes affected by depression.
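The two parameters the modeling analysis highlights, the learning rate and value sensitivity (inverse temperature), can be illustrated with a minimal delta-rule simulation of a probabilistic reversal task: lowering the learning rate reproduces the slower adjustment to reversals described above. This is a hedged sketch with illustrative parameter values, not the hierarchical Bayesian model used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_reversal(alpha, beta=5.0, n_trials=120, reversal=60, p_reward=0.8):
    """Two-option probabilistic reversal task with a delta-rule learner.
    `alpha` is the learning rate, `beta` the value sensitivity
    (inverse temperature). Parameter values are illustrative, not fitted."""
    q = np.zeros(2)
    best = 0
    n_correct = 0
    for t in range(n_trials):
        if t == reversal:
            best = 1 - best                                     # contingencies reverse
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))        # softmax over two options
        a = int(rng.random() < p1)
        r = float(rng.random() < (p_reward if a == best else 1.0 - p_reward))
        q[a] += alpha * (r - q[a])                              # prediction-error update
        n_correct += (a == best)
    return n_correct / n_trials

def mean_accuracy(alpha, n_sims=200):
    """Average optimal-choice rate over repeated simulated sessions."""
    return float(np.mean([simulate_reversal(alpha) for _ in range(n_sims)]))
```

Comparing `mean_accuracy(0.4)` against `mean_accuracy(0.1)` shows the qualitative pattern reported for depressed participants: a lower learning rate yields fewer optimal choices and slower recovery after the reversal.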


2018 ◽  
Vol 30 (10) ◽  
pp. 1422-1432 ◽  
Author(s):  
Anne G. E. Collins

Learning to make rewarding choices in response to stimuli depends on a slow but steady process, reinforcement learning, and a fast and flexible, but capacity-limited, process, working memory. Using both systems in parallel, with their contributions weighted based on performance, should allow us to leverage the best of each system: rapid early learning, supplemented by long-term robust acquisition. However, this assumes that using one process does not interfere with the other. We use computational modeling to investigate the interactions between the two processes in a behavioral experiment and show that working memory interferes with reinforcement learning. Previous research showed that neural representations of reward prediction errors, a key marker of reinforcement learning, were blunted when working memory was used for learning. We thus predicted that arbitrating in favor of working memory to learn faster in simple problems would weaken the reinforcement learning process. We tested this by measuring performance in a delayed testing phase where the use of working memory was impossible, and thus participant choices depended on reinforcement learning. Counterintuitively, but confirming our predictions, we observed that associations learned most easily were retained less well than associations learned more slowly: using working memory to learn quickly came at the cost of long-term retention. Computational modeling confirmed that this could only be accounted for by working memory interference in reinforcement learning computations. These results further our understanding of how multiple systems contribute in parallel to human learning and may have important applications for education and computational psychiatry.


2018 ◽  
Author(s):  
Nura Sidarus ◽  
Stefano Palminteri ◽  
Valérian Chambon

Abstract
Value-based decision-making involves trading off the cost associated with an action against its expected reward. Research has shown that both physical and mental effort constitute such subjective costs, biasing choices away from effortful actions and discounting the value of obtained rewards. Facing conflicts between competing action alternatives is considered aversive, as recruiting cognitive control to overcome conflict is effortful. Yet, it remains unclear whether conflict is also perceived as a cost in value-based decisions. The present study investigated this question by embedding irrelevant distractors (flanker arrows) within a reversal-learning task, with intermixed free and instructed trials. Results showed that participants learned to adapt their choices to maximize rewards, but were nevertheless biased to follow the suggestions of irrelevant distractors. Thus, the perceived cost of being in conflict with an external suggestion could sometimes trump internal value representations. By adapting computational models of reinforcement learning, we assessed the influence of conflict at both the decision and learning stages. Modelling the decision showed that conflict was avoided when evidence for either action alternative was weak, demonstrating that the cost of conflict was traded off against expected rewards. During the learning phase, we found that learning rates were reduced in instructed, relative to free, choices. Learning rates were further reduced by conflict between an instruction and subjective action values, whereas learning was not robustly influenced by conflict between one’s actions and external distractors. Our results show that the subjective cost of conflict factors into value-based decision-making, and highlight that different types of conflict may have different effects on learning about action outcomes.


2022 ◽  
Author(s):  
Chenxu Hao ◽  
Lilian E. Cabrera-Haro ◽  
Ziyong Lin ◽  
Patricia Reuter-Lorenz ◽  
Richard L. Lewis

To understand how acquired value impacts how we perceive and process stimuli, psychologists have developed the Value Learning Task (VLT; e.g., Raymond & O’Brien, 2009). The task consists of a series of trials in which participants attempt to maximize accumulated winnings as they make choices from a pair of presented images associated with probabilistic win, loss, or no-change outcomes. Despite the task having a symmetric outcome structure for win and loss pairs, people learn win associations better than loss associations (Lin, Cabrera-Haro, & Reuter-Lorenz, 2020). This asymmetry could lead to differences when the stimuli are probed in subsequent tasks, compromising inferences about how acquired value affects downstream processing. We investigate the nature of the asymmetry using a standard error-driven reinforcement learning model with a softmax choice rule. Despite having no special role for valence, the model yields the asymmetry observed in human behavior, whether the model parameters are set to maximize empirical fit, or task payoff. The asymmetry arises from an interaction between a neutral initial value estimate and a choice policy that exploits while exploring, leading to more poorly discriminated value estimates for loss stimuli. We also show how differences in estimated individual learning rates help to explain individual differences in the observed win-loss asymmetries, and how the final value estimates produced by the model provide a simple account of a post-learning explicit value categorization task.
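The mechanism the model identifies (neutral initial value estimates combined with a softmax policy that exploits while exploring) can be reproduced in a short simulation. This sketch assumes a simplified pair structure, 80% vs. 20% outcome probability per pair, and illustrative parameter values; it is not the authors' fitted model.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_pair(p_good, p_bad, outcome, n_trials=100, alpha=0.2, beta=5.0):
    """One pair of images with probabilistic outcomes (+1 for win pairs,
    -1 for loss pairs); option 0 is optimal by construction. Returns the
    fraction of optimal choices. Parameter values are illustrative."""
    q = np.zeros(2)                    # neutral initial value estimates
    probs = np.array([p_good, p_bad])
    correct = 0
    for _ in range(n_trials):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))   # softmax choice
        a = int(rng.random() < p1)
        r = outcome if rng.random() < probs[a] else 0.0
        q[a] += alpha * (r - q[a])                          # delta-rule update
        correct += (a == 0)
    return correct / n_trials

# Win pair: 80% vs. 20% chance of +1; optimal = pick the 80% option.
win_acc = float(np.mean([simulate_pair(0.8, 0.2, +1.0) for _ in range(300)]))
# Loss pair: 20% vs. 80% chance of -1; optimal = pick the 20%-loss option.
loss_acc = float(np.mean([simulate_pair(0.2, 0.8, -1.0) for _ in range(300)]))
```

The asymmetry falls out of the dynamics: for win pairs, exploitation concentrates sampling on the better option and sharpens its estimate, while for loss pairs both true values lie below the neutral starting point, so the avoided option always looks deceptively attractive and value estimates stay poorly discriminated.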


2019 ◽  
Author(s):  
Donna M. Werling ◽  
Sirisha Pochareddy ◽  
Jinmyung Choi ◽  
Joon-Yong An ◽  
Brooke Sheppard ◽  
...  

Summary
Variation in gene expression underlies neurotypical development, while genomic variants contribute to neuropsychiatric disorders. BrainVar is a unique resource of paired whole-genome sequencing and bulk-tissue RNA-sequencing from the human dorsolateral prefrontal cortex of 176 neurotypical individuals across prenatal and postnatal development, providing the opportunity to assay genomic and transcriptomic variation in tandem. Leveraging this resource, we identified rare premature stop codons with commensurate reduced and allele-specific expression of corresponding genes, and common variants that alter gene expression (expression quantitative trait loci, eQTLs). Categorizing eQTLs by prenatal and postnatal effect, we found that genes affected by temporally-specific eQTLs, compared to constitutive eQTLs, are enriched for haploinsufficiency, protein-protein interactions, and neuropsychiatric disorder risk loci. Expression levels of over 12,000 genes rise or fall in a concerted late-fetal transition, with the transitional genes enriched for cell type specific genes and neuropsychiatric disorder loci, underscoring the importance of cataloguing developmental trajectories in understanding cortical physiology and pathology.
Highlights
- Whole-genome and RNA-sequencing across human prefrontal cortex development in BrainVar
- Gene-specific developmental trajectories characterize the late-fetal transition
- Identification of constitutive, prenatal-specific, postnatal-specific, and rare eQTLs
- Integrated analysis reveals genetic and developmental influences on CNS traits and disorders


2021 ◽  
Author(s):  
Miles Wischnewski ◽  
Kathleen E. Mantell ◽  
Alexander Opitz

Abstract
Altering cortical activity using transcranial direct current stimulation (tDCS) has been shown to improve working memory (WM) performance. Due to large inter-experimental variability in the tDCS montage configuration and strength of induced electric fields, results have been mixed. Here, we present a novel meta-analytic method relating behavioral effect sizes to electric field strength to identify the brain regions underlying the largest tDCS-induced WM improvement. Simulations of 69 studies targeting the left prefrontal cortex showed that tDCS electric field strength in the lower dorsolateral prefrontal cortex (Brodmann area 45/47) relates most strongly to improved WM performance. This region explained 7.8% of variance, equaling a medium effect. A similar region was identified when correlating WM performance and electric field strength in right prefrontal tDCS studies (n = 18). The maximum electric field strengths of five previously used tDCS configurations were outside this location. We thus propose a new tDCS montage which maximizes the tDCS electric field strength in that brain region. Our findings can benefit future tDCS studies that aim to affect WM function.
Highlights
- We summarize the effect of 87 tDCS studies on working memory performance
- We introduce a new meta-analytic method correlating tDCS electric fields and performance
- tDCS-induced electric fields in the lower DLPFC correlate significantly with improved working memory
- The lower DLPFC was not maximally targeted by most tDCS montages, and we provide an optimized montage


Author(s):  
Antonius Wiehler ◽  
Jan Peters

Abstract
Gambling disorder is associated with deficits in classical feedback-based learning tasks, but the computational mechanisms underlying such learning impairments are still poorly understood. Here, we examined this question using a combination of computational modeling and functional magnetic resonance imaging (fMRI) in gambling disorder participants (n=23) and matched controls (n=19). Participants performed a simple reinforcement learning task with two pairs of stimuli (80% vs. 20% reinforcement rates per pair). As predicted, gamblers made significantly fewer selections of the optimal stimulus, while overall response times (RTs) were not significantly different between groups. We then used comprehensive modeling with reinforcement learning drift diffusion models (RLDDMs), in combination with hierarchical Bayesian parameter estimation, to shed light on the computational underpinnings of this performance impairment. In both groups, an RLDDM in which both non-decision time and response threshold (boundary separation) changed over the course of the experiment accounted for the data best. The model showed good parameter recovery, and posterior predictive checks revealed that in both groups, the model reproduced the evolution of both accuracy and RTs over time. Examination of the group-wise posterior distributions revealed that the learning impairment in gamblers was attributable to both reduced learning rates and a more rapid reduction in boundary separation over time, compared to controls. Furthermore, gamblers also showed substantially shorter non-decision times. Model-based imaging analyses then revealed that value representations in gamblers in the ventromedial prefrontal cortex were attenuated compared to controls, and these effects were partly associated with model-based learning rates. Exploratory analyses revealed that a more anterior ventromedial prefrontal cortex cluster showed attenuations in value representations in proportion to gambling disorder severity in gamblers. Taken together, our findings reveal computational mechanisms underlying reinforcement learning impairments in gambling disorder, and confirm the ventromedial prefrontal cortex as a critical neural hub in this disorder.
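A single RLDDM trial can be sketched with Euler simulation: the drift rate is proportional to the learned value difference, evidence diffuses from a midpoint start to one of two boundaries, and a non-decision time is added to the first-passage time. This is a minimal illustration with assumed parameter names and values; the winning model in the study additionally let non-decision time and boundary separation change over the course of the experiment.

```python
import numpy as np

rng = np.random.default_rng(3)

def rlddm_trial(q_diff, v_scale=2.0, bound=1.5, t0=0.3, dt=0.001, noise=1.0):
    """Simulate one RLDDM trial via Euler steps. `q_diff` is the learned
    value difference (optimal minus suboptimal stimulus), `bound` the
    boundary separation, `t0` the non-decision time. Returns
    (choice, reaction_time); choice 1 = upper boundary = optimal stimulus.
    All parameter values are illustrative."""
    drift = v_scale * q_diff
    half = bound / 2.0
    x, t = 0.0, 0.0                 # start midway between the boundaries
    while abs(x) < half:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x > 0 else 0), t0 + t

# With a clear value difference, most trials end at the upper boundary.
trials = [rlddm_trial(0.6) for _ in range(200)]
```

Within this framing, the group differences reported above map onto parameters: lower learning rates shrink `q_diff` (and thus the drift), a faster collapse of `bound` trades accuracy for speed, and a shorter `t0` shifts the whole RT distribution.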


2021 ◽  
Author(s):  
Bianca Westhoff ◽  
Neeltje E. Blankenstein ◽  
Elisabeth Schreuders ◽  
Eveline A. Crone ◽  
Anna C. K. van Duijvenvoorde

Abstract
Learning which of our behaviors benefit others contributes to social bonding and being liked by others. An important period for the development of (pro)social behavior is adolescence, in which peers become more salient and relationships intensify. It is, however, unknown how learning to benefit others develops across adolescence and what the underlying cognitive and neural mechanisms are. In this functional neuroimaging study, we assessed learning for self and others (i.e., prosocial learning) and the concurrent neural tracking of prediction errors across adolescence (ages 9-21, N=74). Participants performed a two-choice probabilistic reinforcement learning task in which outcomes resulted in monetary consequences for themselves, an unknown other, or no one. Participants of all ages were able to learn for themselves and others, but learning for others showed a more protracted developmental trajectory. Prediction errors for self were observed in the ventral striatum and showed no age-related differences. However, prediction error coding for others was specifically observed in the ventromedial prefrontal cortex and showed age-related increases. These results reveal insights into the computational mechanisms of learning for others across adolescence, and highlight that learning for self and for others follows different age-related patterns.


2012 ◽  
Vol 22 (6) ◽  
pp. 1247-1255 ◽  
Author(s):  
Wouter van den Bos ◽  
Michael X. Cohen ◽  
Thorsten Kahnt ◽  
Eveline A. Crone

2021 ◽  
Author(s):  
Gabriele Chierchia ◽  
Magdaléna Soukupová ◽  
Emma J. Kilford ◽  
Cait Griffin ◽  
Jovita Tung Leung ◽  
...  

Confirmation bias, the widespread tendency to favour evidence that confirms rather than disconfirms one’s prior beliefs and choices, has been shown to play a role in the way decisions are shaped by rewards and punishments, a phenomenon known as confirmatory reinforcement learning. Given that exploratory tendencies change during adolescence, we investigated whether confirmatory learning also changes during this period. In an instrumental learning task, participants aged 11-33 years attempted to maximize monetary rewards by repeatedly sampling different pairs of novel options, which varied in their reward/punishment probabilities. Our results showed an age-related increase in accuracy as long as learning contingencies remained stable across trials, but less so when they reversed halfway through the trials. Across participants, there was a greater tendency to stay with an option that had delivered a reward on the immediately preceding trial than to switch away from an option that had just delivered a punishment, and this behavioural asymmetry also increased with age. Younger participants spent more time assessing the outcomes of their choices than did older participants, suggesting that their learning inefficiencies were not due to reduced attention. At a computational level, these decision patterns were best described by a model that assumes that people learn very little from disconfirmatory evidence and that they vary in the extent to which they learn from confirmatory evidence. Such confirmatory learning rates also increased with age. Overall, these findings are consistent with the hypothesis that the discrepancy between confirmatory and disconfirmatory learning increases with age during adolescence.
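The asymmetric update rule such confirmatory models assume can be written compactly: prediction errors that confirm the choice just made are scaled by a larger learning rate than disconfirming ones. The parameter names and values below are illustrative of the model class, not the fitted values reported in the study.

```python
import numpy as np

def confirmatory_update(q, action, reward, alpha_conf=0.4, alpha_disc=0.05):
    """Asymmetric delta-rule update for a two-option task. A positive
    prediction error on the chosen option confirms the choice and is
    learned from at a higher rate; a negative one disconfirms it and is
    largely discounted. Parameter values are illustrative."""
    q = q.copy()
    pe = reward - q[action]
    alpha = alpha_conf if pe > 0 else alpha_disc   # confirmatory asymmetry
    q[action] += alpha * pe
    return q
```

For example, starting from neutral values, a reward of +1 after choosing option 0 moves its value by 0.4, while a punishment of -1 moves it by only 0.05, reproducing the stay-after-reward versus switch-after-punishment asymmetry described above.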

