Learning the payoffs and costs of actions

2018 ◽  
Author(s):  
Moritz Möller ◽  
Rafal Bogacz

Abstract: A set of sub-cortical nuclei called basal ganglia is critical for learning the values of actions. The basal ganglia include two pathways, which have been associated with approach and avoid behavior respectively, and are differentially modulated by dopamine projections from the midbrain. According to the influential opponent actor learning model, these pathways represent learned estimates of the positive and negative consequences (payoffs and costs) of actions. The level of dopamine release controls to what extent payoffs and costs enter the overall evaluation of actions. How the knowledge about payoff and cost is acquired is still an open question, even though many theories describe learning from feedback in the basal ganglia. We examine whether a set of plasticity rules proposed to model reinforcement learning in the pathways of the basal ganglia is suitable to extract payoffs and costs from a reward prediction error signal. First, we determine the result of such learning, both analytically and via simulations, for different reward schedules that feature payoffs and costs. Then, we combine the plasticity rules with a decision rule to examine the emerging effect of dopaminergic modulation on the willingness to work for reward. We find that the plasticity rules are suitable to infer the mean payoffs and costs of actions, if those occur at different moments in time. Successful learning requires differential effects of positive and negative reward prediction errors on the two pathways, and a weak decay of synaptic weights over trials. We also confirm that dopaminergic modulation produces effects on the willingness to work for reward similar to those observed in classical experiments.

Author summary: The basal ganglia are structures underneath the surface of the vertebrate brain, associated with error driven learning.
Much is known about the anatomical and biological features of the basal ganglia; scientists now try to understand the algorithms implemented by these structures. Numerous models aspire to capture the learning functionality, but many of them only cover some specific aspect of the algorithm. Instead of further adding to that pool of partial models, we unify two existing ones: one that captures what the basal ganglia learn, and one that describes the learning mechanism itself. The first model suggests that the basal ganglia keep track of both positive and negative consequences of frequent opportunities, and weigh these by the motivational state in decisions. It explains how payoff and cost are represented, but not how those representations arise. The other model consists of biologically plausible plasticity rules, which describe how learning takes place, but not how the brain makes use of what is learned. We show that the two theories are compatible. Together, they form a model of learning and decision making that integrates the motivational state as well as the learned payoffs and costs of opportunities.
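
The learning scheme summarized in this abstract can be sketched in a few lines. The following Python is a minimal illustration, not the authors' implementation: the function names, the halved weight difference used in the prediction error, the scaling factor `eps`, and all parameter values are assumptions chosen for the example. It shows a reward prediction error updating two weights (`G` and `N`) asymmetrically, with a weak decay, when cost and payoff arrive at different moments of a trial.

```python
def scaled_error(delta, eps):
    """Pass positive errors unchanged; scale negative errors by eps (assumed form)."""
    return delta if delta > 0 else eps * delta


def update(G, N, reward, alpha=0.1, eps=0.9, decay=0.005):
    """One plasticity step for the two pathway weights (hypothetical rule).

    G, N   : weights standing in for the two pathways
    reward : signed outcome at this moment (negative for a cost)
    alpha  : learning rate (assumed value)
    eps    : relative effect of the "non-preferred" error sign (assumed value)
    decay  : weak weight decay per step (assumed value)
    """
    # Reward prediction error relative to the net value (assumed definition).
    delta = reward - 0.5 * (G - N)
    # Positive errors mainly raise G; negative errors mainly raise N.
    G_new = G + alpha * scaled_error(delta, eps) - decay * G
    N_new = N + alpha * scaled_error(-delta, eps) - decay * N
    # Note: this sketch does not enforce non-negative weights.
    return G_new, N_new


def simulate(payoff, cost, trials=2000):
    """Each trial delivers the cost first and the payoff at a later moment."""
    G, N = 0.0, 0.0
    for _ in range(trials):
        G, N = update(G, N, -cost)    # moment 1: effort is paid
        G, N = update(G, N, payoff)   # moment 2: reward is received
    return G, N


if __name__ == "__main__":
    G, N = simulate(payoff=2.0, cost=1.0)
    print(f"G = {G:.2f}, N = {N:.2f}")
```

With these assumed parameters, `simulate(payoff=2.0, cost=1.0)` leaves `G` close to the payoff and `N` close to the cost; other parameter choices need not behave this way.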

2017 ◽  
Author(s):  
Rafal Bogacz

Abstract: This paper proposes how the neural circuits in vertebrates select actions on the basis of past experience and the current motivational state. According to the presented theory, the basal ganglia evaluate the utility of considered actions by combining the positive consequences (e.g. nutrition) scaled by the motivational state (e.g. hunger) with the negative consequences (e.g. effort). The theory suggests how the basal ganglia compute utility by combining the positive and negative consequences encoded in the synaptic weights of striatal Go and No-Go neurons, and the motivational state carried by neuromodulators including dopamine. Furthermore, the theory suggests how the striatal neurons learn separately about consequences of actions, and how the dopaminergic neurons themselves learn what level of activity they need to produce to optimize behaviour. The theory accounts for the effects of dopaminergic modulation on behaviour, patterns of synaptic plasticity in striatum, and responses of dopaminergic neurons in diverse situations.
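
One simple reading of the utility computation described in this abstract is sketched below in Python. It is an assumption-laden illustration rather than the paper's model: the function names, the use of plain subtraction to combine the two terms, and the example numbers are all hypothetical.

```python
def utility(payoff, cost, motivation):
    """Positive consequences scaled by motivational state, minus negative ones.

    payoff     : learned positive consequence (e.g. nutrition)
    cost       : learned negative consequence (e.g. effort)
    motivation : motivational state (e.g. hunger), assumed to lie in [0, 1]
    """
    return motivation * payoff - cost


def worth_doing(payoff, cost, motivation):
    """Assumed decision rule: act only if the utility is positive."""
    return utility(payoff, cost, motivation) > 0


if __name__ == "__main__":
    # Same action, two motivational states (illustrative numbers only).
    print(worth_doing(payoff=10.0, cost=4.0, motivation=0.5))
    print(worth_doing(payoff=10.0, cost=4.0, motivation=0.2))
```

In this sketch the same action is selected at the higher motivational state and rejected at the lower one, because only the payoff term is scaled by motivation.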


2019 ◽  
Vol 42 (1) ◽  
pp. 459-483 ◽  
Author(s):  
Andreas Klaus ◽  
Joaquim Alves da Silva ◽  
Rui M. Costa

Deciding what to do and when to move is vital to our survival. Clinical and fundamental studies have identified basal ganglia circuits as critical for this process. The main input nucleus of the basal ganglia, the striatum, receives inputs from frontal, sensory, and motor cortices and interconnected thalamic areas that provide information about potential goals, context, and actions and directly or indirectly modulates basal ganglia outputs. The striatum also receives dopaminergic inputs that can signal reward prediction errors and also behavioral transitions and movement initiation. Here we review studies and models of how direct and indirect pathways can modulate basal ganglia outputs to facilitate movement initiation, and we discuss the role of cortical and dopaminergic inputs to the striatum in determining what to do and if and when to do it. Complex but exciting scenarios emerge that shed new light on how basal ganglia circuits modulate self-paced movement initiation.


2020 ◽  
Author(s):  
Kate Ergo ◽  
Luna De Vilder ◽  
Esther De Loof ◽  
Tom Verguts

Recent years have witnessed a steady increase in the number of studies investigating the role of reward prediction errors (RPEs) in declarative learning. Specifically, in several experimental paradigms RPEs drive declarative learning, with larger and more positive RPEs enhancing declarative learning. However, it is unknown whether this RPE must derive from the participant’s own response, or whether instead any RPE is sufficient to obtain the learning effect. To test this, we generated RPEs in the same experimental paradigm where we combined an agency and a non-agency condition. We observed no interaction between RPE and agency, suggesting that any RPE (irrespective of its source) can drive declarative learning. This result holds implications for declarative learning theory.


2021 ◽  
Author(s):  
Joseph Heffner ◽  
Jae-Young Son ◽  
Oriel FeldmanHall

People make decisions based on deviations from expected outcomes, known as prediction errors. Past work has focused on reward prediction errors, largely ignoring violations of expected emotional experiences—emotion prediction errors. We leverage a new method to measure real-time fluctuations in emotion as people decide to punish or forgive others. Across four studies (N=1,016), we reveal that emotion and reward prediction errors have distinguishable contributions to choice, such that emotion prediction errors exert the strongest impact during decision-making. We additionally find that a choice to punish or forgive can be decoded in less than a second from an evolving emotional response, suggesting emotions swiftly influence choice. Finally, individuals reporting significant levels of depression exhibit selective impairments in using emotion—but not reward—prediction errors. Evidence for emotion prediction errors potently guiding social behaviors challenges standard decision-making models that have focused solely on reward.


SAGE Open ◽  
2019 ◽  
Vol 9 (1) ◽  
pp. 215824401983591
Author(s):  
Yariv Feniger ◽  
Anastasia Gorodzeisky ◽  
Michal Krumer-Nevo

In recent years, education–occupation mismatch has become an important area of social research. However, little is known about its impact on the intergenerational transmission of educational attainment. This study investigates the possible negative consequences of a specific aspect of parental education–occupation mismatch, also known as overeducation, for high school students. Drawing from a sample of high school students in an Israeli city with a high incidence of overeducation, our analysis suggests that parental education–occupation mismatch does not affect student expectations for progressing to higher education. The results did reveal, however, that maternal education–occupation mismatch is related to school truancy among boys and girls, and that paternal education–occupation mismatch contributes to lower odds of enrollment in advanced science courses, especially among boys.


2017 ◽  
Vol 129 ◽  
pp. 265-272 ◽  
Author(s):  
Chad C. Williams ◽  
Cameron D. Hassall ◽  
Robert Trska ◽  
Clay B. Holroyd ◽  
Olave E. Krigolson

2020 ◽  
Vol 22 (8) ◽  
pp. 849-859
Author(s):  
Julian Macoveanu ◽  
Hanne L. Kjærstad ◽  
Henry W. Chase ◽  
Sophia Frangou ◽  
Gitte M. Knudsen ◽  
...  
