Reward Magnitude Coding in Primate Amygdala Neurons

2010 ◽  
Vol 104 (6) ◽  
pp. 3424-3432 ◽  
Author(s):  
Maria A. Bermudez ◽  
Wolfram Schultz

Animals assess the values of rewards to learn and choose the best possible outcomes. We studied how single neurons in the primate amygdala coded reward magnitude, an important variable determining the value of rewards. A single, Pavlovian-conditioned visual stimulus predicted fruit juice to be delivered with one of three equiprobable volumes ( P = 1/3). A population of amygdala neurons showed increased activity after reward delivery, and almost one half of these responses covaried with reward magnitude in a monotonically increasing or decreasing fashion. A subset of the reward responding neurons were tested with two different probability distributions of reward magnitude; the reward responses in almost one half of them adapted to the predicted distribution and thus showed reference-dependent coding. These data suggest parametric reward value coding in the amygdala as a characteristic component of its function in reinforcement learning and economic decision making.
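The reference-dependent coding described above, in which reward responses rescale to the predicted distribution of magnitudes, can be illustrated with a toy range-adaptation sketch. The juice volumes and the linear response rule below are illustrative assumptions, not the recorded neurons' measured tuning:

```python
def adapted_response(magnitude, predicted_magnitudes, r_max=1.0):
    """Toy monotonic response that rescales to the predicted reward
    distribution: firing spans the same dynamic range whether the
    expected rewards are small or large (reference-dependent coding)."""
    lo, hi = min(predicted_magnitudes), max(predicted_magnitudes)
    return r_max * (magnitude - lo) / (hi - lo)

# Hypothetical sets of three equiprobable juice volumes (P = 1/3 each)
small_dist = [0.1, 0.2, 0.3]
large_dist = [0.3, 0.6, 0.9]

# The same physical reward (0.3 ml) drives a maximal response under the
# small predicted distribution but a minimal one under the large one.
r_small = adapted_response(0.3, small_dist)
r_large = adapted_response(0.3, large_dist)
```

Under this scheme, identical rewards evoke different responses depending on the expected distribution, which is the signature of reference-dependent coding reported in the abstract.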

2018 ◽  
Author(s):  
Greg Jensen ◽  
Yelda Alkan ◽  
Vincent P Ferrera ◽  
Herbert S Terrace

The observation that monkeys appear to make transitive inferences has been taken as evidence of their ability to form and manipulate mental representations. However, alternative explanations have been proposed, arguing that transitive inference (TI) performance is based on expected or experienced reward value. To test the contribution of reward value to monkeys’ behavior in TI paradigms, we performed two experiments in which we manipulated the amount of reward associated with each item in an ordered list. In these experiments, monkeys were presented with pairs of items drawn from the list, and rewards were delivered if subjects selected the item with the earlier list rank. When reward magnitude was biased to favor later list items, correct responding was reduced. However, monkeys eventually learned to make correct rule-based choices despite countervailing incentives. The results demonstrate that monkeys’ performance in TI paradigms is not driven solely by expected reward, but that they are able to make appropriate inferences in the face of discordant reward associations.
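The pairing rule in these TI experiments can be made concrete with a short sketch. The list items and reward magnitudes below are hypothetical examples, not the authors' stimuli:

```python
import itertools

# Hypothetical 5-item ordered list; correct responses select the
# item with the earlier list rank.
list_order = ["A", "B", "C", "D", "E"]
rank = {item: i for i, item in enumerate(list_order)}

# Biased condition: the reward earned for a correct choice depends on
# the chosen item, with larger magnitudes attached to later list items.
reward_magnitude = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def trial(pair, choice):
    """Reward is delivered only for the rule-correct (earlier) item,
    scaled by that item's associated magnitude."""
    correct = min(pair, key=lambda i: rank[i])
    return reward_magnitude[choice] if choice == correct else 0

pairs = list(itertools.combinations(list_order, 2))
# e.g. on pair (B, D) the rule-correct pick B earns 2, while the
# higher-magnitude item D earns nothing if chosen.
assert trial(("B", "D"), "B") == 2
assert trial(("B", "D"), "D") == 0
```

The tension the experiments exploit is visible here: the item with the larger associated magnitude is often the rule-incorrect one, so a purely value-driven chooser and a rule-based chooser make different predictions.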


2010 ◽  
Vol 30 (39) ◽  
pp. 13095-13104 ◽  
Author(s):  
G. Sescousse ◽  
J. Redoute ◽  
J.-C. Dreher

Neuron ◽  
2013 ◽  
Vol 80 (6) ◽  
pp. 1519-1531 ◽  
Author(s):  
Peter H. Rudebeck ◽  
Andrew R. Mitz ◽  
Ravi V. Chacko ◽  
Elisabeth A. Murray


2018 ◽  
Author(s):  
Dennis Hernaus ◽  
Michael J. Frank ◽  
Elliot C. Brown ◽  
Jaime K. Brown ◽  
James M. Gold ◽  
...  

ABSTRACT

Background: Motivational deficits in people with schizophrenia (PSZ) are associated with an inability to integrate the magnitude and probability of previous outcomes. The mechanisms that underlie probability-magnitude integration deficits, however, are poorly understood. We hypothesized that increased reliance on “value-less” stimulus-response associations, in lieu of expected value (EV)-based learning, could drive probability-magnitude integration deficits in PSZ with motivational deficits.

Methods: Healthy volunteers (n = 38) and PSZ (n = 49) completed a reinforcement learning paradigm consisting of four stimulus pairs. Reward magnitude (3/2/1/0 points) and probability (90%/80%/20%/10%) together determined each stimulus’ EV. Following a learning phase, new and familiar stimulus pairings were presented, and participants were asked to select the stimulus with the highest reward value.

Results: PSZ with high motivational deficits made increasingly less optimal choices as the difference in reward value (probability × magnitude) between two competing stimuli increased. In a previously validated computational hybrid model, PSZ relied less on EV (“Q-learning”) and more on stimulus-response learning (“actor-critic”), and this reliance correlated with SANS motivational deficit severity. PSZ specifically failed to represent reward magnitude, consistent with model demonstrations showing that response tendencies in the actor-critic are driven preferentially by reward probability.

Conclusions: Probability-magnitude integration deficits in PSZ with motivational deficits arise from underutilization of EV in favor of reliance on value-less stimulus-response associations. Consistent with previous work and confirmed by our computational hybrid framework, these deficits were driven specifically by a failure to represent reward magnitude. This work reconfirms the importance of decreased Q-learning and increased actor-critic-type learning as an explanatory framework for a range of EV deficits in PSZ.
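The contrast the hybrid framework draws, between EV-based Q-learning and "value-less" stimulus-response learning, can be sketched as follows. The sign-driven actor update is a deliberate simplification chosen to illustrate magnitude-insensitivity; it is not the authors' hybrid model, and all parameter values are illustrative:

```python
import random

def run(prob, mag, n=4000, alpha=0.05, seed=1):
    """Learn about one stimulus delivering reward `mag` with probability `prob`."""
    random.seed(seed)
    q = 0.0             # Q-learner: tracks expected value directly
    v, pref = 0.0, 0.0  # critic value and actor preference
    for _ in range(n):
        r = mag if random.random() < prob else 0.0
        q += alpha * (r - q)   # Q-learning update -> converges to EV
        delta = r - v          # critic's reward-prediction error
        v += alpha * delta
        # Illustrative actor: pushed by the *sign* of the prediction
        # error, so its preference tracks how often outcomes beat
        # expectation (probability), not by how much (magnitude).
        pref += alpha * (1.0 if delta > 0 else -1.0)
    return q, pref

q_big, pref_big = run(prob=0.9, mag=3.0)    # EV = 2.7
q_small, pref_small = run(prob=0.9, mag=1.0)  # EV = 0.9
# Q values separate the two stimuli by magnitude (~2.7 vs ~0.9),
# while the sign-driven actor preferences are essentially identical.
```

A learner weighted toward the actor-critic component therefore treats equally probable rewards as equivalent regardless of magnitude, which is the failure mode the abstract attributes to PSZ with motivational deficits.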


2017 ◽  
Author(s):  
Joshua I. Glaser ◽  
Matthew G. Perich ◽  
Pavan Ramkumar ◽  
Lee E. Miller ◽  
Konrad P. Kording

Our bodies and the environment constrain our movements. For example, when our arm is fully outstretched, we cannot extend it further. More generally, the distribution of possible movements is conditioned on the state of our bodies in the environment, which is constantly changing. However, little is known about how the brain represents such distributions, and uses them in movement planning. Here, we recorded from dorsal premotor cortex (PMd) and primary motor cortex (M1) while monkeys reached to randomly placed targets. The hand’s position within the workspace created probability distributions of possible upcoming targets, which affected movement trajectories and latencies. PMd, but not M1, neurons had increased activity when the monkey’s hand position made it likely the upcoming movement would be in the neurons’ preferred directions. Across the population, PMd activity represented probability distributions of individual upcoming reaches, which depended on rapidly changing information about the body’s state in the environment.


2017 ◽  
Author(s):  
Bowen John Fung ◽  
Carsten Murawski ◽  
Stefan Bode

Human time perception can be influenced by contextual factors, such as the presence of reward. Yet, the exact nature of the relationship between time perception and reward has not been conclusively characterized. We implemented a novel experimental paradigm to measure estimations of time across a range of suprasecond intervals, during the anticipation and after the consumption of fruit juice, a physiologically relevant primary reward. We show that average time estimations were systematically affected by the consumption of reward, but not by the anticipation of reward. Compared with baseline estimations of time, reward consumption was associated with subsequent overproductions of time, and this effect increased for larger magnitudes of reward. Additional experiments demonstrated that the effect of consumption did not extend to a secondary reward (money), a tasteless, noncaloric primary reward (water), or a sweet, noncaloric reward (aspartame). However, a tasteless caloric reward (maltodextrin) did induce overproductions of time, although this effect did not scale with reward magnitude. These results suggest that the consumption of caloric primary rewards can alter time perception, which may be a psychophysiological mechanism by which organisms regulate homeostatic balance.


2021 ◽  
Author(s):  
Andrew T Marshall ◽  
Sean B. Ostlund

The Pavlovian-instrumental transfer (PIT) paradigm is widely used to assay the motivational influence of reward-paired cues, which is reflected by their ability to stimulate instrumental reward-seeking behavior. Leading models of incentive learning assume that motivational value is assigned to cues based on the total amount of reward they signal (i.e., their state value). Based on recent findings, we lay out the alternative hypothesis that cue-elicited reward predictions may actually suppress the motivation to seek out new rewards through instrumental behavior, in order to facilitate efficient retrieval of a reward that is already expected, before it is lost or stolen. According to this view, cue-motivated reward seeking should be inversely related to the magnitude of an expected reward, since there is more to lose by failing to secure a large reward than a small one. We investigated the influence of expected reward magnitude on PIT expression. Hungry rats were initially trained to lever press for food pellets before undergoing Pavlovian conditioning, in which two distinct auditory cues signaled food pellet delivery at cue offset. Reward magnitude was varied across cues and groups. While all groups had at least one cue that signaled three food pellets, the alternate cue signaled either one (Group 1/3), three (Group 3/3), or nine food pellets (Group 3/9). PIT testing revealed that the motivational influence of reward-predictive cues on lever pressing varied inversely with expected reward magnitude, with the 1-pellet cue augmenting performance and the 3- and 9-pellet cues suppressing performance, particularly near the expected time of reward delivery. This pattern was mirrored by opposing changes in food-port entry behavior, which varied positively with expected reward magnitude. We discuss how these findings may relate to cognitive control over cue-motivated behavior.

