Robust Pavlovian-to-Instrumental and Pavlovian-to-Metacognitive Transfers in human reinforcement learning

Mapping Intimacies ◽

10.1101/593368 ◽

2019 ◽

Cited By ~ 1

Author(s):

Chih-Chung Ting ◽

Stefano Palminteri ◽

Jan B. Engelmann ◽

Maël Lebreton

Keyword(s):

Reinforcement Learning ◽

Response Times ◽

Instrumental Learning ◽

Reaction Times ◽

Learning Models ◽

Learning Tasks ◽

Single Mechanism ◽

Choice Reaction Times

Abstract In simple instrumental-learning tasks, humans learn to seek gains and to avoid losses equally well. Yet, two effects of valence are observed. First, decisions in loss-contexts are slower, which is consistent with the Pavlovian-instrumental transfer (PIT) hypothesis. Second, loss contexts decrease individuals’ confidence in their choices – a bias akin to a Pavlovian-to-metacognitive transfer (PMT). Whether these two effects are two manifestations of a single mechanism or whether they can be partially dissociated is unknown. Here, across six experiments, we attempted to disrupt the PIT effects by manipulating the mapping between decisions and actions and imposing constraints on response times (RTs). Our goal was to assess the presence of the metacognitive bias in the absence of the RT bias. We observed both PIT and PMT despite our disruption attempts, establishing that the effects of valence on motor and metacognitive responses are very robust and replicable. Nonetheless, within- and between-individual inferences reveal that the confidence bias resists the disruption of the RT bias. Therefore, although concomitant in most cases, PMT and PIT seem to be – partly – dissociable. These results highlight new important mechanistic constraints that should be incorporated in learning models to jointly explain choice, reaction times and confidence.

Robust valence-induced biases on motor response and confidence in human reinforcement learning

Cognitive Affective & Behavioral Neuroscience ◽

10.3758/s13415-020-00826-0 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1184-1199 ◽

Cited By ~ 1

Author(s):

Chih-Chung Ting ◽

Stefano Palminteri ◽

Jan B. Engelmann ◽

Maël Lebreton

Keyword(s):

Reinforcement Learning ◽

Motor Response ◽

Response Times ◽

Instrumental Learning ◽

Reaction Times ◽

Learning Models ◽

Learning Tasks ◽

Single Mechanism ◽

Choice Reaction Times

Abstract In simple instrumental-learning tasks, humans learn to seek gains and to avoid losses equally well. Yet, two effects of valence are observed. First, decisions in loss-contexts are slower. Second, loss contexts decrease individuals’ confidence in their choices. Whether these two effects are two manifestations of a single mechanism or whether they can be partially dissociated is unknown. Across six experiments, we attempted to disrupt the valence-induced motor bias effects by manipulating the mapping between decisions and actions and imposing constraints on response times (RTs). Our goal was to assess the presence of the valence-induced confidence bias in the absence of the RT bias. We observed both motor and confidence biases despite our disruption attempts, establishing that the effects of valence on motor and metacognitive responses are very robust and replicable. Nonetheless, within- and between-individual inferences reveal that the confidence bias resists the disruption of the RT bias. Therefore, although concomitant in most cases, valence-induced motor and confidence biases seem to be partly dissociable. These results highlight new important mechanistic constraints that should be incorporated in learning models to jointly explain choice, reaction times and confidence.

A new model of decision processing in instrumental learning tasks

eLife ◽

10.7554/elife.63055 ◽

2021 ◽

Vol 10 ◽

Author(s):

Steven Miletić ◽

Russell J Boag ◽

Anne C Trutti ◽

Niek Stevenson ◽

Birte U Forstmann ◽

...

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Cognitive Modeling ◽

Response Times ◽

Instrumental Learning ◽

Binary Choice ◽

Stimulus Response ◽

Fundamental Limitations ◽

Learning Tasks ◽

Speed Accuracy

Learning and decision-making are interactive processes, yet cognitive modeling of error-driven learning and decision-making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision-making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options that not only addresses this problem but captures stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversal effects. The RL-ARD avoids fundamental limitations imposed by the DDM on addressing effects of absolute values of choices, as well as extensions beyond binary choice, and provides a computationally tractable basis for wider applications.

A new model of decision processing in instrumental learning tasks

10.1101/2020.09.12.294512 ◽

2020 ◽

Author(s):

Steven Miletić ◽

Russell J. Boag ◽

Anne C. Trutti ◽

Birte U. Forstmann ◽

Andrew Heathcote

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Response Times ◽

Instrumental Learning ◽

Binary Choice ◽

Stimulus Response ◽

Fundamental Limitations ◽

Learning Tasks ◽

Speed Accuracy ◽

Diffusion Decision Model

Abstract Learning and decision making are interactive processes, yet cognitive modelling of error-driven learning and decision making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options that not only addresses this problem but captures stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversal effects. The RL-ARD avoids fundamental limitations imposed by the DDM on addressing effects of absolute values of choices, as well as extensions beyond binary choice, and provides a computationally tractable basis for wider applications.

Joint modeling of reaction times and choice improves parameter identifiability in reinforcement learning models

Journal of Neuroscience Methods ◽

10.1016/j.jneumeth.2019.01.006 ◽

2019 ◽

Vol 317 ◽

pp. 37-44 ◽

Cited By ~ 11

Author(s):

Ian C. Ballard ◽

Samuel M. McClure

Keyword(s):

Reinforcement Learning ◽

Reaction Times ◽

Joint Modeling ◽

Learning Models ◽

Parameter Identifiability ◽

Reinforcement Learning Models

Joint Modeling of Reaction Times and Choice Improves Parameter Identifiability in Reinforcement Learning Models

10.1101/306720 ◽

2018 ◽

Author(s):

Ian C. Ballard ◽

Samuel M. McClure

Keyword(s):

Reinforcement Learning ◽

Model Fitting ◽

Reaction Times ◽

Learning Rate ◽

List Type ◽

Learning Models ◽

Parameter Identifiability ◽

Bayesian Priors ◽

Reinforcement Learning Models ◽

Parameters Of Reinforcement

Abstract

Background: Reinforcement learning models provide excellent descriptions of learning in multiple species across a variety of tasks. Many researchers are interested in relating parameters of reinforcement learning models to neural measures, psychological variables or experimental manipulations. We demonstrate that parameter identification is difficult because a range of parameter values provide approximately equal quality fits to data. This identification problem has a large impact on power: we show that a researcher who wants to detect a medium sized correlation (r = .3) with 80% power between a variable and learning rate must collect 60% more subjects than specified by a typical power analysis in order to account for the noise introduced by model fitting.

New Method: We derive a Bayesian optimal model fitting technique that takes advantage of information contained in choices and reaction times to constrain parameter estimates.

Results: We show using simulation and empirical data that this method substantially improves the ability to recover learning rates.

Comparison with Existing Methods: We compare this method against the use of Bayesian priors. We show in simulations that the combined use of Bayesian priors and reaction times confers the highest parameter identifiability. However, in real data where the priors may have been misspecified, the use of Bayesian priors interferes with the ability of reaction time data to improve parameter identifiability.

Conclusions: We present a simple technique that takes advantage of readily available data to substantially improve the quality of inferences that can be drawn from parameters of reinforcement learning models.

Highlights:

– Parameters of reinforcement learning models are particularly difficult to estimate

– Incorporating reaction times into model fitting improves parameter identifiability

– Bayesian weighting of choice and reaction times improves the power of analyses assessing learning rate

Subjective Expectancy and Choice Reaction Times

Quarterly Journal of Experimental Psychology ◽

10.1080/17470216408416371 ◽

1964 ◽

Vol 16 (3) ◽

pp. 216-223 ◽

Cited By ~ 15

Author(s):

G. H. Mowbray

Keyword(s):

Response Times ◽

Reaction Times ◽

Stimulus Interval ◽

Probability Of Occurrence ◽

Inter Stimulus Interval ◽

Two Factors ◽

Choice Reaction Times

Previous findings suggested that selective response times might be affected both by the inter-stimulus interval and by the probability of occurrence of the stimulus for reaction. These two factors have been tested independently and have been found to influence reaction times in a fashion that an expectancy hypothesis would predict.

Faster Choice-Reaction Times to Positive than to Negative Facial Expressions

Journal of Psychophysiology ◽

10.1027//0269-8803.17.3.113 ◽

2003 ◽

Vol 17 (3) ◽

pp. 113-123 ◽

Cited By ~ 76

Author(s):

Jukka M. Leppänen ◽

Mirja Tenhunen ◽

Jari K. Hietanen

Keyword(s):

Facial Expressions ◽

Cognitive Processing ◽

Response Selection ◽

Reaction Times ◽

Happy Face ◽

Response Onset ◽

Response Execution ◽

Onset Response ◽

Positive Stimuli ◽

Choice Reaction Times

Abstract Several studies have shown faster choice-reaction times to positive than to negative facial expressions. The present study examined whether this effect is exclusively due to faster cognitive processing of positive stimuli (i.e., processes leading up to, and including, response selection), or whether it also involves faster motor execution of the selected response. In two experiments, response selection (onset of the lateralized readiness potential, LRP) and response execution (LRP onset-response onset) times for positive (happy) and negative (disgusted/angry) faces were examined. Shorter response selection times for positive than for negative faces were found in both experiments but there was no difference in response execution times. Together, these results suggest that the happy-face advantage occurs primarily at premotoric processing stages. Implications that the happy-face advantage may reflect an interaction between emotional and cognitive factors are discussed.

Supplemental Material for Reconciling Reinforcement Learning Models With Behavioral Extinction and Renewal: Implications for Addiction, Relapse, and Problem Gambling

Psychological Review ◽

10.1037/0033-295x.114.3.784.supp ◽

2007 ◽

Cited By ~ 1

Keyword(s):

Reinforcement Learning ◽

Problem Gambling ◽

Learning Models ◽

Behavioral Extinction ◽

Reinforcement Learning Models

Bayes factors for reinforcement-learning models of the Iowa gambling task.

Decision ◽

10.1037/dec0000040 ◽

2016 ◽

Vol 3 (2) ◽

pp. 115-131 ◽

Cited By ~ 14

Author(s):

Helen Steingroever ◽

Ruud Wetzels ◽

Eric-Jan Wagenmakers

Keyword(s):

Reinforcement Learning ◽

Iowa Gambling Task ◽

Bayes Factors ◽

Gambling Task ◽

Learning Models ◽

Reinforcement Learning Models

Effects of Working Memory Capacity on the Speed and Accuracy of Learning in Reinforcement Learning Models

PsycEXTRA Dataset ◽

10.1037/e528942014-552 ◽

2014 ◽

Author(s):

Adnane Ez-Zizi ◽

Simon Farrell ◽

David Leslie

Keyword(s):

Working Memory ◽

Reinforcement Learning ◽

Working Memory Capacity ◽

Memory Capacity ◽

Learning Models ◽

Reinforcement Learning Models ◽

Speed And Accuracy
